[14:14] Executing 'bin/ensemble', gives the error "ImportError: No module named txzookeeper.utils"
[14:18] kiranmurari, how did you install ensemble (ppa.. bzr.. etc) ? looks like your missing a dependency.
[14:19] hazmat, downloaded the code from lp:ensemble and executed setup.py install
[14:20] hazmat, followed the getting started document https://ensemble.ubuntu.com/docs/getting-started.html
[14:20] kiranmurari, so that won't install any of the c extension dependencies, and it will install older versions of pure python deps from pypi
[14:20] * hazmat looks
[14:21] kiranmurari, are you using a virtualenv or the system python (/usr/bin/python)?
[14:22] hazmat, i'm using a virtual environment
[14:22] kiranmurari, awesome... that's a little easier to fix then
[14:23] hazmat, this has been reported as bug 788950 on lp
[14:23] <_mup_> Bug #788950: Getting Started Documention is missing dependancies < https://launchpad.net/bugs/788950 >
[14:23] _mup_, i was referring to the same one
[14:23] kiranmurari, so we have a daily ppa now.. which is probably easiest for getting started... but your most of the way through a source install.. to finish the source install you'll need to cd .. & bzr branch lp:txzookeeper && cd txzookeeper && python setup.py develop
[14:24] kiranmurari, with the virtualenv python
[14:24] kiranmurari, fwiw mup is a bot.. bugs info and commit messages.. along with logs at irclogs.ubuntu.com
[14:24] kiranmurari, thanks for filing the bug
[14:25] hazmat, the bug was already in place ;)
[14:25] ugh... time to fix it then
[14:25] Good mornings!
[14:25] niemeyer, g'morning
[14:25] hazmat, thx for the pointer. let me try your suggestion
[14:28] Is there a Ubuntu docs page for txzookeeper like the ensemble docs page
[14:29] kiranmurari, there's an older version of txzookeeper on pypi that got install into the virtualenv ... which is what caused the error
[14:32] kiranmurari, not at the moment re separate txzookeeper docs, its more of a library atm
[14:41] hazmat, the error is resolved. Time to move and ahead and start testing ensemble...
[14:41] kiranmurari, awesome
[15:09] <_mup_> ensemble/update-install-docs r240 committed by kapil.thangavelu@canonical.com
[15:09] <_mup_> update installation docs
[15:12] <_mup_> ensemble/update-install-docs r241 committed by kapil.thangavelu@canonical.com
[15:12] <_mup_> add python-yaml to source install package deps
[16:53] I'll get some lunch.. biab
=== deryck is now known as deryck[lunch]
[17:16] hazmat, just thinking about the review in https://code.launchpad.net/~jimbaker/ensemble/expose-watch-exposed-flag/+merge/63066. niemeyer refers to line 52 of the diff, in which exists_d is unpacked, but not used
[17:17] hazmat, so the question i have is, is there a meaningful use of exists_d there, or should we simply unpack just the watch_d?
[17:18] hazmat, this watcher nested function follows exactly the same pattern as in watch_resolved
[17:19] jimbaker, waiting on exists_d would work although its extraneous.. or discarding the value
[17:20] the discard does entail lost bg activity.. but its not clear how the data associated is useful in this context
[17:21] hazmat, yeah, i think the standard pattern of _, watch_d = self._client.exists_and_watch(...) would be appropriate in this case, as you mention it is not clear how it can be usefully worked with
[17:22] the point of the watcher is to be called upon a watch event after all
[17:24] jimbaker, indeed.. but all watches are associated to retrieving data
[17:25] it does make sense to go ahead and wait till that data is retrieved b4 proceeding with the watch
[17:25] even if its not used
[17:25] hazmat, so just use it as an additional sync point, eg yield exists_d ?
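Editor's sketch of the pattern discussed at 17:21 to 17:25: unpack both results of exists_and_watch, wait on the existence result as a sync point, then wait on the watch. This is a minimal model that assumes asyncio futures in place of Twisted deferreds. FakeClient, watch_flag and demo are hypothetical names, and the fake only mimics the two-result shape of the exists_and_watch call mentioned in the log; it is not txzookeeper.

```python
import asyncio


class FakeClient:
    """Hypothetical in-memory stand-in for a ZooKeeper client (not txzookeeper)."""

    def __init__(self):
        self.nodes = set()
        self._watches = {}

    def exists_and_watch(self, path):
        # Return two futures: current existence, and a one-shot watch.
        loop = asyncio.get_running_loop()
        exists_f, watch_f = loop.create_future(), loop.create_future()
        exists_f.set_result(path in self.nodes)
        self._watches.setdefault(path, []).append(watch_f)
        return exists_f, watch_f

    def create(self, path):
        self.nodes.add(path)
        self._fire(path, "created")

    def delete(self, path):
        self.nodes.discard(path)
        self._fire(path, "deleted")

    def _fire(self, path, event):
        # One-shot semantics: every pending watch fires once and is dropped.
        for watch_f in self._watches.pop(path, []):
            watch_f.set_result(event)


async def watch_flag(client, path, callback, rounds):
    exists_f, watch_f = client.exists_and_watch(path)
    exists = await exists_f          # the extra sync point ("yield exists_d")
    callback(bool(exists))           # first invocation gets a boolean
    for _ in range(rounds):
        event = await watch_f
        # Re-arm the watch before running the callback for this event.
        exists_f, watch_f = client.exists_and_watch(path)
        await exists_f
        callback(event)              # later invocations get the change event


async def demo():
    client, seen = FakeClient(), []
    task = asyncio.ensure_future(watch_flag(client, "/exposed", seen.append, 2))
    await asyncio.sleep(0)           # let the watcher set its first watch
    client.create("/exposed")
    await asyncio.sleep(0)           # let the watcher handle the event and re-arm
    client.delete("/exposed")
    await task
    return seen
```

Running `asyncio.run(demo())` collects what the callback was handed on each invocation, which makes it easy to see that the callback receives a boolean once and bare events afterwards.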
[17:32] hazmat, as expected, that would simply just work (and verified by tests - incidentally i started splitting test_service into separate test suite classes with this branch)
[17:38] hazmat, so maybe something like this would be better - it reorders the watch setup with the callback such that any StopWatcher can be trapped before attempting to reestablish the watch - http://paste.ubuntu.com/620074/
[17:39] and adds in the additional sync point of yield exists_d
[17:44] hazmat, forget about that ordering change - it will occasionally fail on the slow callback tests when looped
[17:44] * hazmat catches up
[17:44] was on the phone with verizon support.. my dsl has been going down like 8 times a day
[17:46] jimbaker, yes re additional sync point.. that's a little surprising re slow callback failure.. what's the failure out of the test with the ordering change?
[17:47] http://paste.ubuntu.com/620083/ - i changed both watch_exposed_flag, watch_resolved for this experiment
[17:48] hazmat, ^^^
[17:50] jimbaker, so the problem with the ordering change is that it introduces a gap while the callback is executing that changes may occur un-noticed..
[17:50] so the change isn't viable
[17:51] hazmat: I believe it never makes sense to set up a watch without taking the data in consideration
[17:51] Sorry, that was
[17:52] jimbaker, hazmat: I believe it never makes sense to set up a watch without taking the current state in consideration
[17:52] What's the point of waiting for a change if you don't know what the current value is?
[17:52] hazmat, yeah, not too surprising re callback ordering, just doesn't look obvious from the code
[17:53] niemeyer, in these cases, its the callback responsibility to fetch the current state, as the current state is unknown when the watch fires
[17:53] niemeyer, potentially we could/should restructure these,so the callback takes the current state
[17:53] hazmat, that's certainly my understanding of how we've been using watches
[17:53] hazmat: It still doesn't make sense..
[17:54] hazmat: You're asking zookeeper to notify you when something changes from an arbitrary point in time..
[17:54] hazmat: when you don't know what the value was before that
[17:54] hazmat: Why is it important to know the state on moment X if you've lost the changes on X-1?
[17:55] hazmat: The callback receives the current state..
[17:55] niemeyer, yes, effectively..its the callback responsibility to fetch current state and effect any behavior in response to the change
[17:55] hazmat: It doesn't matter what the callback does
[17:55] hazmat: It's bogus to ask zookeeper to tell you about a change on something you didn't know the previous value
[17:55] niemeyer, in these cases the callback recieves a change event, although the first time it recieves a node state
[17:58] niemeyer, i think that's fair.. esp. towards constructing a more state based api.. instead of a notification api.. and removes state retrieval from the callback responsibility
[17:59] i can incorporate that into the state protocol work
[17:59] hazmat: That's not the point.. the state may still be retrieved by the callback
[18:00] hazmat: The point is that if the logic is _detecting changes_ with a txzookeeper watch, and the current state is discarded, there's a bug.
[18:00] niemeyer, than what's the problem? if we know the callback operates against current state, and is responsible for fetching it, what's the functional problem with the current setup?
[18:00] hmm
[18:01] hazmat: The problem is what I described above.
[18:01] hazmat: Every single time you set up a watch and you discard the current state there's a bug.
[18:01] the bug being?
[18:01] hazmat: Because you don't know what you're watching
[18:01] hazmat: Are you waiting for the node to be changed? What about the changes you've discarded by ignoring the current state?
[18:02] we know exactly what we're watching.. that discard change handling.. is encompassed by invoking the callback which retrieves against current state.. ie. its not discarded
[18:02] hazmat: Are you waiting for the node to be removed? It already was!
[18:03] hazmat: No, necessarily, you can't know what you're watching if you have _ignored_ the _current_ state. :-)
=== deryck[lunch] is now known as deryck
[18:04] niemeyer, its not ignored
[18:04] niemeyer, the callback is invoked for the change
[18:04] hazmat: It is.. the exists_d parameter is not used.
[18:04] hazmat: and it reflects the current state.
[18:05] hazmat: You don't know if you're watching for it to be removed or added.
[18:05] niemeyer, yes the watch api get result is ignored, but the callback is invoked after the watch is set.. and it retrieves the current state
[18:05] hazmat: Does that make sense?>
[18:05] hazmat: It doesn't matter..
[18:05] hazmat: It's a pretty basic issue.
[18:06] hazmat: Imagine..
[18:06] callback() => get data, the file exists, cool
[18:06] file gets removed
[18:06] set watch in case it changes
[18:06] callback() never called again
[18:07] that's not the case
[18:07] the callback is invoked after the removal
[18:07] we set watch, invoke callback, attach callback to new watch
[18:07] hazmat: How can you tell?
[18:07] hazmat: You are _ignoring the current state_
[18:08] hazmat: You are asking zookeeper to tell you when something changes, but 10000 revisions may have gone by
[18:08] the watch setter is ignoring the callback, but all of the watch apis explictly state that its the responsibility of the callback to fetch current state
[18:08] hazmat: Dude..
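Editor's sketch of the two orderings being argued over at 18:06 and 18:07. It is a toy, synchronous model built on a hypothetical Node class with one-shot watches; it is not ZooKeeper or txzookeeper, and `between` is only a hook that simulates another client acting in the gap. Under those assumptions, reading first and watching second loses a removal that lands in the gap, while watching first and reading second does not.

```python
class Node:
    """Hypothetical in-memory node with one-shot watches (illustration only)."""

    def __init__(self):
        self.exists = True
        self.watchers = []

    def watch(self, fn):
        self.watchers.append(fn)

    def remove(self):
        self.exists = False
        # One-shot semantics: fire every pending watcher once, then forget it.
        fired, self.watchers = self.watchers, []
        for fn in fired:
            fn("deleted")


def read_then_watch(node, between):
    """Read state first, set the watch second: a change in the gap is lost."""
    log = [("state", node.exists)]
    between()  # another client acts in the gap
    node.watch(lambda event: log.append(("event", event)))
    return log


def watch_then_read(node, between):
    """Set the watch first, read state second: the gap is covered."""
    log = []
    node.watch(lambda event: log.append(("event", event)))
    between()  # another client acts in the gap
    log.append(("state", node.exists))
    return log
```

With `n = Node()`, calling `read_then_watch(n, n.remove)` records only the stale "exists" reading and never hears about the removal, whereas `watch_then_read(n, n.remove)` records both the event and the fresh state.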
[18:09] hazmat: Why do you need a watch?
[18:09] * hazmat sighs
[18:09] hazmat: Yeah, I know..
[18:09] hazmat: Why?
[18:09] niemeyer, to get notified of change
[18:09] hazmat: Bingo..
[18:09] hazmat: So, what happens if one clock cycle before the watch gets set in zookeeper, the node got removed?
[18:10] hazmat: Does it matter what the callback looked at?
[18:12] niemeyer, the watch gets set before the callback is invoked.. so if the node was removed, the callback will see the removal
[18:14] hazmat: The callback is invoked with the change event
[18:14] hazmat:
[18:14] callback_d = maybeDeferred(callback, bool(exists))
[18:14] niemeyer, the callback must fetch the current state, per its signature
[18:14] the doc strings for all the watch apis state this explicitly
[18:14] hazmat: What is that line above doing?
[18:15] niemeyer, that's the first invocation of the callback, executing with a boolean of existence
[18:15] hazmat: What is that documentation saying:
[18:15] 15 change event. The watcher always recieve an initial
[18:15] 16 boolean value invocation denoting the existence of the
[18:15] 26 + exposed flag. Subsequent invocations will be with
[18:15] 27 + change events.
[18:15] + yield callback(change_event)
[18:16] hazmat: I'll step out and get some coffee.. we can chat on Skype later if you still don't think there's a problem.
[18:17] hazmat, niemeyer - definitely would like to be part of the skype call when it happens
[18:18] niemeyer, jimbaker lets
[18:19] especially since i have two branches pending on this :)
[18:20] but i'm glad that the expose-watch-exposed-flag branch is motivating what's a very important discussion
[18:21] niemeyer, if we're quoting.. we might as well finish the quote.. "Its important that clients do not rely on the event as reflective
[18:21] of the current state. It is only a reflection of some change
[18:21] happening, the callback should fetch the current value via
[18:21] the API, if needed."
[18:22] hazmat: Yeah, sure.. "Look.. here is the change event, and the current state, but.. DON'T TRUST IT! It's just to make sure you've read that paragraph!"
[18:23] niemeyer, the current state is never passed to the callback
[18:23] hazmat: + yield callback(change_event)
[18:23] niemeyer, a change event != current state
[18:23] hazmat: The change event is useful for..?
[18:24] niemeyer, to have notice that the current state needs processing
[18:25] hazmat: That's what the callback is for. The change event tells what the change is, but it can't be trusted because the next change event won't reflect the period while the callback was running.
[18:25] hazmat: This API is broken. Let's fix it please.
[18:25] niemeyer, the next change event will reflect the period b4 the callback was invoked
[18:25] it does seem to me that zk watches in general provide better guarantees with respect to ordering (and this is where niemeyer's point is especially relevant), and the current approach loses that
[18:26] niemeyer, as i said at the beginning a more state based api might reflect better
[18:26] just knowing that a change has happened is so much weaker
[18:26] hazmat: No, because it is _ignoring the current state_!
[18:26] niemeyer, the watch is set before the callback is invoked.
[18:27] We're going in loops..
[18:27] so the current state is accounted for the next change
[18:27] Let's hang on mumble please
[18:27] the only issue i see is the callback may recieve a more recent state (for which a watch notification/change event is pending)
[18:27] and will be reinvoked subsequently
[18:28] which would be resolved with a more state based callback api to the watch api
[18:31] https://code.launchpad.net/~jimbaker/ensemble/expose-watch-exposed-flag/+merge/63066
[19:04] bcsaller, mumble standup?
[19:24] mumble pretty much died out for me
[19:27] hazmat: Btw, I see your point and agree with you that if we guarantee that the exists_and_watch hits the server before the next exists, the watch from the first will enable the callback to be activated at least once.
[19:28] hazmat: More clearly then, as a summary from our conversation, the main issues are parameters being not-trustable, and events being dispatched multiple times for the same zk state version.
[19:29] hazmat: If the change_event is trusted, then state may effectively be lost.
[19:29] hazmat: Is that a fair summary we can agree on?
[19:54] niemeyer, yes re non trustable parameters.. events being dispatched for the same state zk version.. is a little ambigious.. better to say that multiple callbacks invocations see the same zk state version
[19:55] and yes that
[19:55] that's a good summary
[19:56] hazmat: Cool, good to be on the same page
[20:05] hazmat: One thing I pondered about during that discussion is whether we guarantee that two sequential calls to the txzk API will necessarily happen in the same order within zk if we don't yield from the first one
[20:06] hazmat: It looks like so, but I couldn't tell for sure without looking at the internals of zk
[20:06] s/couldn't/can't
[20:06] niemeyer, the calls are submitted to the zk client, which has ack'd, so my understanding is it should be sequentially ordered
[20:07] ah
[20:07] niemeyer, actually i take that back
[20:08] the work is submitted in order, if the responses are returned in the same order is not guaranteed afaik.. say you did a sync, and a get
[20:09] if you don't wait on the sync, its not clear to me that the sync will always happen b4 the get
[20:10] hazmat: We'll have to investigate that.. without this guarantee, we have more serious issues in the watch-before-callback pattern
[20:10] hazmat: It feels like it should be guaranteed, though
[20:11] hazmat: The issue goes like this:
[20:11] niemeyer, it does.. i'd have to ask the zk devs to verify though
[20:12] hazmat: If we have two exists_and_watch(path) in sequence, if the former doesn't guarantee the watch being set before the following operation, we're in trouble
[20:14] niemeyer, its confirmed responses are returned in order
[20:15] hazmat: Cool
[20:15] hazmat: So we're safe
[22:45] <_mup_> ensemble/expose-watch-exposed-flag r243 committed by jim.baker@canonical.com
[22:45] <_mup_> watch_exposed_flag provides the callback the current state, simplifying its API
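Editor's sketch related to the r243 commit message, which says watch_exposed_flag now gives the callback the current state. The actual change is not shown in this log, so the following is only a guess at what a state-based callback could look like. The duplicate-version check is an added assumption, aimed at the 19:54 point that multiple callback invocations may see the same zk state version. make_state_callback and read_state are hypothetical names.

```python
def make_state_callback(read_state, callback):
    """Wrap `callback` so it is handed the current state, not the raw event.

    `read_state` is a hypothetical function returning (version, value).
    A notification for a version that was already delivered is skipped.
    """
    last = {"version": None}

    def on_event(_event):
        # Ignore the event payload; always consult the current state.
        version, value = read_state()
        if version == last["version"]:
            return False  # nothing new since the previous delivery
        last["version"] = version
        callback(value)
        return True

    return on_event
```

The wrapper treats the event purely as a wake-up signal, so a callback written against it never has to decide whether the event payload can be trusted.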