[00:18] <_mup_> ensemble/verify-version r323 committed by jim.baker@canonical.com
[00:18] <_mup_> Merged trunk
[00:22] <_mup_> ensemble/verify-version r324 committed by jim.baker@canonical.com
[00:22] <_mup_> Change protocol version to 0; explicity verify version key in test of InternalTopology.reset
[00:30] SpamapS: no dice unless im pushing to the wrong place. http://paste.ubuntu.com/667795/
[00:53] <_mup_> ensemble/expose-cleanup r312 committed by jim.baker@canonical.com
[00:53] <_mup_> More initial code
[01:31] niemeyer, do we have a eureka kanban?
[01:31] hazmat: Not yet, but if the milestone is in place I can set it up right away
[01:31] niemeyer, the milestone is in place and most of the bugs are moved over
[01:31] hazmat: Woot
[01:31] hazmat: let me handle that
[01:31] niemeyer, much more fun programming lp than doing it in the browser
[01:32] niemeyer, i ended up writing a mongodb sync script last night after abandoning cli integration
[01:32] hazmat: Wow
[01:32] hazmat: That sounds awesome
[01:32] niemeyer, and used that just now as a basis for doing all the mods (queries).. its gratitious i could have done it through the lp api, but the turnaround was much faster this way
[01:33] niemeyer, fwiw the mongodb sync code.. very much still in progress, i haven't gotten around to syncing people or teams yet, but its been useful by itself.. i need to parallelize the lp connections to take real advantage of the gevent concurrency options http://pastie.org/private/lnrfrdufyffnpbi7m69q
[01:36] right now the gevent integration is gratitutous
[01:39] interesting http://aws.amazon.com/govcloud-us/
=== niemeyer changed the topic of #ubuntu-ensemble to: http://j.mp/ensemble-eureka | http://ensemble.ubuntu.com/docs
[02:50] <_mup_> ensemble/expose-cleanup r313 committed by jim.baker@canonical.com
[02:50] <_mup_> Implemented reasonably robust spike
[02:51] mocking tomorrow...
=== otubo is now known as otubo[AFK]
[11:14] <_mup_> ensemble/trunk r318 committed by gustavo@niemeyer.net
[11:14] <_mup_> Merge missed revision from kirkland's byobu-tmux branch, left
[11:14] <_mup_> out due to an error on my part when merging. [trivial]
[11:23] niemeyer: morning! missed you before :)
[11:23] fwereade: Hey!
[11:23] niemeyer: thanks for all the reviews
[11:23] fwereade: No problem!
[11:24] niemeyer: I think I'll be doing significant post-review changes in new stacked branches in future, to avoid monstrous diffs like the hide-instances one
[11:24] niemeyer: sensible?
[11:24] fwereade: That sounds very sensible
[11:25] niemeyer: in that case: https://code.launchpad.net/~fwereade/ensemble/provider-base-launch-machine/+merge/71850 :)
[11:26] niemeyer: really just to check that that was what you wanted me to do
[11:26] fwereade: Looking at it right now, coincidently
[11:26] niemeyer: it could be taken much further but I'm not yet convinced that's a good idea
[11:26] niemeyer: cool :)
[11:29] fwereade: The bootstrap parameter doesn't look bad per se
[11:29] fwereade: But there's something weird in it as a whole
[11:29] niemeyer: it's not *bad* but it feels a bit ...off-kilter
[11:29] fwereade: Why do we have bootstrap() and start_machine(bootstrap=True)?
[11:29] (rhetorical)
[11:30] fwereade: It sounds like we're about to have an eureka moment :)
[11:31] niemeyer: because (1) bootstrap, in the context of start_machine, is shorthand for saying "I'd like this machine to play another couple of roles"; and (2) because LaunchMachine has an extra resposibility, which it shouldn't, that also uses the bootstrap parameter to complete the bootstrap operation (ie saving provider-state)
[11:31] niemeyer: I'd definitely like to fix it but it feels like a distinct bug/branch to me
[11:33] fwereade: We need to distinguish the two operations in this branch, even if it's a matter of naming
[11:35] niemeyer: how about (1) moving the state-saving into bootstrap where it should be, and renaming the start_machine parameter "master"?
[11:35] niemeyer: (there should have been a (2) somewhere in there)
[11:35] fwereade: Moving state-saving is unrelated to this, I think
[11:36] niemeyer: well... it's a lot of the justification for the param name being bootsrtap
[11:36] niemeyer: but that makes sense
[11:36] niemeyer: rename to "master", open a bug to move the state-saving?
[11:37] fwereade: Alright, I think I know what to do..
[11:37] niemeyer: cool :)
[11:37] Maybe
[11:38] fwereade: Renaming to master is hiding the intention rather than making it more obvious
[11:39] fwereade: Hmm, or maybe not
[11:39] * niemeyer thinks
[11:39] niemeyer: well, one problem is that I think we lack a canonical term for that machine
[11:40] niemeyer: sometimes we call it the bootstrap node, sometimes the zookeeper
[11:40] fwereade: True
[11:40] fwereade: But that's not entirely a coincidence
[11:40] fwereade: The machine itself is not special
[11:40] niemeyer: and I guess that when we have multiple zookeepers the precise nature of the machine(s) will change again
[11:40] niemeyer: quite so :)
[11:41] fwereade: It just runs a service that we care about early on
[11:41] niemeyer: that was why I was thinking that it's actually part of machine_data (where machine_data refers to a putative sensible replacement for the data bag)
[11:42] niemeyer: machine-id is always needed; a machine implicitly always runs a machine agent; it may also run a zookeeper, and/or a provisioning agent
[11:42] fwereade: That said, I retract my pedantism :)
[11:42] fwereade: master=True sounds like a straightforward solution right now
[11:42] niemeyer: however I'm loath to dirty up machine_data further right now
[11:42] niemeyer: cool, cheers :)
[11:43] fwereade: We can improve the situation once we do more about turning zookeeper and the provisioning agents into actual formulas
[11:44] niemeyer: ...I'm not sure we could deploy them as formulas without already having them in place, could we?
[11:44] niemeyer: oh, additional ones
[11:44] niemeyer: (right?)
[11:45] fwereade: There's a chicken & egg problem we need to solve, but I think it's doable
[11:45] niemeyer: I can't quite see it yet, but it will be very nice if we can figure it out
[11:47] niemeyer: anyway, can I go ahead and merge that one back into provider-base with just your review?
[11:47] niemeyer: or should we wait for someone else to review both separately?
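Editor's sketch of the "master" idea settled on above (11:42): every machine runs a machine agent, and a master machine additionally runs zookeeper and a provisioning agent. This is only an illustration under stated assumptions, not Ensemble's provider code; the `roles_for_machine` helper, the role strings, and the dictionary returned by this stand-in `start_machine` are all hypothetical.

```python
def roles_for_machine(master=False):
    """Return the services a machine is expected to run.

    Hypothetical helper: per the discussion, every machine implicitly
    runs a machine agent, and the "master" machine also runs zookeeper
    and a provisioning agent.
    """
    roles = ["machine-agent"]
    if master:
        roles.extend(["zookeeper", "provisioning-agent"])
    return roles


def start_machine(machine_id, master=False):
    """Describe the machine that would be launched (no provider calls).

    Hypothetical stand-in for a provider's start_machine: the machine id
    is always required, while master only selects extra roles. Saving
    provider state is deliberately left out, since the log proposes
    moving that into bootstrap() in a separate bug/branch.
    """
    if not machine_id:
        raise ValueError("machine_id is required")
    return {"machine-id": machine_id, "roles": roles_for_machine(master)}
```

With this shape, bootstrap() would be the only caller passing master=True, so the flag names the machine's role rather than the operation in progress.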
[11:47] fwereade: E.g. 1) deploy zk alone; 2) Run machine agent against zk; 3) Deploy a unit within machine 0 with a second zk; 4) Sync second zk with first one; 5) Kill first zk..
[11:48] fwereade: I think it's find to merge this as a review point for the first one
[11:48] niemeyer: hm, maybe :)
[11:48] niemeyer: ok, cool
[11:48] niemeyer: cheers
[11:48] fwereade: Thanks for separating this out, btw.. it made a whole lot easier to review it
[11:48] fwereade: We should talk to others in our meeting about this workflow
[11:59] niemeyer: cheers :)
[11:59] niemeyer: I'll try to remember
[12:08] kim0: ping
[12:09] niemeyer: Hey
[12:09] kim0: Hey man
[12:09] kim0: You probably haven't had time yet, but when you have a moment, can you please fix the "Ensemble security and firewall enhancements" post?
[12:18] niemeyer: done .. thanks for clearing that up
[12:29] <_mup_> Bug #827994 was filed: MachineProvider interface inconsistent (list/*args) < https://launchpad.net/bugs/827994 >
[12:42] kim0: np!
[12:46] niemeyer: couple of clarifications on https://code.launchpad.net/~fwereade/ensemble/cobbler-zk-connect/+merge/71734 ...
[12:46] niemeyer: error style as follows?
[12:47] Some general thing that went wrong: Some specific explanatory message: Some message from the underlying exception
[12:48] eg
[12:50] ...er, sorry don't have the exact example I'm looking for
[12:51] niemeyer: anyway, the other question was: remove launch_time from ProviderMachine as well? I don't think anything else uses it...
[12:52] niemeyer: anyway, error message example
[12:53] Ensemble environment is not accessible: Machine i-foobar may not be ready: Connection timed out
[13:02] fwereade: Sorry, was grabbing some coffee
[13:02] fwereade: re. the message,
[13:02] niemeyer: np
[13:03] fwereade: More than two levels feels a bit weird to me
[13:03] fwereade: But you get the idea
[13:03] niemeyer: likewise, because that was my intent with the ones that aren't capitalised
[13:03] fwereade: +1 on dropping launch_time if we have no users (woohay!)
[13:04] niemeyer: although I guess it's maybe not ideal having a common prefix on EnvironmentPending ("Ensemble environment is not available:")
[13:04] fwereade: I think I provided an example of this message in the review, btw
[13:04] Ensemble environment is not accessible: Machine i-foobar may not be ready: Connection timed out
[13:04] This message, I mean
[13:05] niemeyer: yep, seems reasonable, I think I misread exactly what you were after
[13:05] niemeyer: and, yay, code to delete :D
[13:06] "Can't connect to machine %s (perhaps still initializing): %s"
[13:06] fwereade: This will work regardless of context
[13:07] fwereade: The message in your quote mentions the environment not being accessible, which has assumes things
[13:07] s/has//
[13:07] niemeyer: I should point out that ProviderInteractionError uses That: Nested: Style
[13:08] fwereade: Hmm.. what's the outside message prefix?
[13:08] fwereade: (wondering if we can drop it)
[13:08] niemeyer: return "ProviderError: Interaction with machine provider failed: %r" % self.error
=== otubo[AFK] is now known as otubo
[13:09] niemeyer: ...and we expect PIE to wrap a range of other possible errors, I think, which may themselves Do: That
[13:09] fwereade: That's not very nice
[13:09] fwereade: That was kind of ok in the original context
[13:09] niemeyer: I think we can certainly lose the ProviderError: prefix
[13:10] fwereade: ProviderInteractionError was used for wrapping
[13:10] fwereade: Agreed
[13:10] fwereade: Ugh.. and it uses repr
[13:10] fwereade: This feels like an overlook
[13:10] niemeyer: OK, I'll see what I can do with that
[13:11] fwereade: I think we can remove this entirely
[13:11] fwereade: The wrapping, that is
[13:11] fwereade: We shouldn't be wrapping our own error messages
[13:11] fwereade: Unless we actually have something interesting to say, of course
[13:11] fwereade: Which is not the case.
[13:11] niemeyer: I think PIEs tend to have been tested with assertIn("blah", str(error))
[13:12] niemeyer: I'll look for assertIns and make them assertEqualss, which should then make the nasty messages stick out
[13:12] fwereade: If we wrap a message from e.g. S3 into a PIE, we should add a prefix at the wrapping spot, if at all
[13:12] fwereade: This may be more work than you're looking for
[13:12] fwereade: I'd just fix the messages themselves and fix whatever breaks
[13:13] niemeyer: I'll start a stacked branch and at least do a search, see how heavyweight it looks
[13:13] fwereade: Cool, thanks
[13:13] niemeyer: killing launch_machine is pretty low-risk/low-noise, though, I'll do that inline first
[13:15] niemeyer: er, launch_time
[13:15] fwereade: Superb
[14:07] hi, just wanted a quick heads up
[14:07] is ensemble still working only with ec2?
[14:07] or is it possible to use it with vms, like Vagrant , for example
[14:16] pindonga: Hey
[14:16] There's good work in progress to make it work with physical machines
[14:16] pindonga: and local deployments too
[14:19] fwereade, just a quick note - take a look at bzr log. our standard is to indicate in the log text which branch was merged in, along with reviewers and the bug fixed
[14:19] when merging branches into trunk
[14:20] jimbaker: whoops, sorry
[14:21] jimbaker: I'll try to remember
[14:23] fwereade, no worries
[14:26] niemeyer, everyone: I assert that (1) MachinesNotFound is a ProviderInteractionError, not just a ProviderError; and (2) anywhere we "except ProviderInteractionError" is still wrong, because many provider interactions raise ProviderError (for bad args, for example)
[14:26] opinions?
[14:27] fwereade: 2 sounds sensible.. 1 feels strange
[14:28] fwereade: Provider*Interaction*Error was, again, supposed to wrap interaction errors with the provider that we didn't expect
[14:28] niemeyer: if I fix 2, I'm happy
[14:29] fwereade: MachinesNotFound may not be an error in the provider, but on our own data about it
[14:29] niemeyer: yep, makes sense
[14:29] fwereade: It's a ProviderError, but a well understood one
[14:29] fwereade: I hope we kill ProviderInteractionError entirely at some point
[14:30] niemeyer: cool
[14:30] niemeyer: I'll be making sure the existing except PIEs also catch PEs in the errors branch I think
[14:31] niemeyer: feels like a really bad idea to let that linger
[14:31] fwereade: Sounds good
[14:32] niemeyer: cheers, I'll just let it all bed in for a few minutes
[14:41] <_mup_> ensemble/expose-cleanup r314 committed by jim.baker@canonical.com
[14:41] <_mup_> Cleanup
[14:54] Folks, have been working since early morning and stayed around until late yesterday.. I'm taking a longer mid-day break today for resting a bit.
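Editor's sketch of the error-handling points agreed above: MachinesNotFound stays a plain ProviderError, handlers catch ProviderError rather than only ProviderInteractionError, and messages use a single self-contained level instead of nested prefixes or repr. The class names come from the log; the constructor, the MachinesNotFound message text, and the two helper functions are assumptions for illustration, not Ensemble's actual code.

```python
class ProviderError(Exception):
    """Base class for errors raised while using a machine provider."""


class ProviderInteractionError(ProviderError):
    """Unexpected failure while talking to the provider (the "PIE")."""


class MachinesNotFound(ProviderError):
    """A well understood ProviderError: our data names unknown machines."""

    def __init__(self, instance_ids):
        self.instance_ids = list(instance_ids)
        # Hypothetical wording; the log does not give the real message.
        super().__init__(
            "Cannot find machines: %s" % ", ".join(self.instance_ids))


def connection_error_message(instance_id, cause):
    """Format the context-free message quoted at 13:06 in the log."""
    return ("Can't connect to machine %s (perhaps still initializing): %s"
            % (instance_id, cause))


def describe_failure(operation):
    """Run operation and return its provider error text, or None.

    Catching ProviderError also covers ProviderInteractionError and
    MachinesNotFound, which is point (2) from the 14:26 message.
    """
    try:
        operation()
    except ProviderError as error:
        return str(error)  # str, not repr, so no quoting leaks into output
    return None
```

Tests that compare the whole message with assertEqual, rather than assertIn on a fragment, would make any leftover nested prefixes stand out, as suggested at 13:12.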
[14:56] <_mup_> ensemble/formula-state-with-url r314 committed by kapil.thangavelu@canonical.com
[14:56] <_mup_> lazy init of storage url in formula serialization, default none url on state, restore passthrough error, and update failure/mock test that verifies for additional url param
[14:58] niemeyer, cheers
[15:07] niemeyer, anything I could try out (re: local deployments)
[15:12] pindonga, not at the moment, its actively being developed for our next internal release milestone mid september.
[15:13] hazmat, ok, I was asking as I'm in need of automating our infrastructure, so I wanted to follow the path of least effort :)
[15:13] it will be in the ppa ensemble roughly around then
[15:13] I want to be able to use ensemble in the future
[15:13] but since I need this now, I guess puppet+something else will have to do
[15:13] pindonga, atm ensemble really only works against ec2
[15:13] yeah
[15:14] thx anyway
[15:14] pindonga, we're striking out in a couple of different directions to address this, orchestra/cobbler integration for physical machines, lxc/local development, and out of the box openstack support. i'd say its probably about a month till we get these landed.
[15:15] cool
[15:15] I'll take a look again in a month then :)_
[15:20] fwereade: ping
[15:21] RoAkSoAx: pong
[15:21] RoAkSoAx: how's it going?
[15:21] fwereade: pretty goo, and you? heard you got stuck in chicago on the weekend?
[15:22] RoAkSoAx: yeah, it was a bit of a hassle :)
[15:22] RoAkSoAx: all good now though
[15:24] RoAkSoAx: are you in a position to do some reverification?
[15:24] RoAkSoAx: we're only 3 merges away from trunk now :)
[15:24] fwereade: yay!! and will do later today
[15:24] fwereade: i'm finishing some stuff up with cobbler/arm and once that's done i'm free
[15:24] RoAkSoAx: awesome!
[15:25] RoAkSoAx: how's that going?
[15:25] fwereade: pretty good, just patching cobbler, and have to write another small fix and it iwll be good to go
[15:25] fwereade: btw.. what branch of yours should I be using?
[15:25] RoAkSoAx: I'll send you a new branch, I honestly can't remember the state of the old ones and I don't fancy merging everything through
[15:26] RoAkSoAx: can you wait an hour or so? I'll make sure I've done it before I stop for the day :)
[15:28] fwereade: sure, take your time
[15:30] fwereade, so the base of provider-base-machine is provider-base-launch-machine?
[15:31] hazmat: I branched PBLM from PB, so niemeyer could review that separately and check it matched his thinking from the PBLM review; then I merged it back into PB
[15:32] ic, so the base is still trunk
[15:32] hazmat: gaah: s/from the PBLM review/from the PB review/
[15:32] hazmat: yep
[16:01] hey, is it team meeting time?
[16:03] fwereade, i though that was tomorrow
[16:04] s/thought
[16:04] hazmat: that would be quite convenient as it happens :)
[16:04] hazmat: thanks for the review btw
[16:06] fwereade, np.. great stuff
[16:06] hazmat: I try :)
[16:11] niemeyer, bcsaller, jimbaker, fwereade btw. you guys probably saw the bug spam, but i closed out the dublin milestone, any dublin bugs which didn't have an assignee, got unassigned, and the rest moved on to the eureka milestone.
[16:12] hazmat: thanks
[16:12] the eureka milestone has hard deadline, as its release time, so we're trying to only keep things on the milestone which we expect/need to get done for the close of the oneiric cycle
[16:12] jimbaker: curses, I included the bug and reviewers, and forgot the branch :/
[16:12] fwereade, np.. was fun to script it with the launchpad api
[16:18] <_mup_> Bug #828147 was filed: Ensemble branch option needs to allow for distro pkg, ppa, and source branch install < https://launchpad.net/bugs/828147 >
[16:27] <_mup_> Bug #828152 was filed: default formula config values not available to hooks < https://launchpad.net/bugs/828152 >
[16:28] later all
[16:35] fwereade, cheers
[16:39] jimbaker, do you mind if i put bug 828147 on your plate for the eureka milestone?
[16:39] <_mup_> Bug #828147: Ensemble branch option needs to allow for distro pkg, ppa, and source branch install < https://launchpad.net/bugs/828147 >
[16:39] its something unrelated to feature dev, that we need for the release
[16:39] hazmat, sure, looks like i would learn something fun
[16:39] in terms of working with launchpad
[16:59] jimbaker, sadly the lp integration there is pretty minimal
[16:59] jimbaker, are you going to be enabling a runner that will do functional tests?
[16:59] jimbaker, i'm looking at some of the ftest outstanding bugs, and wondering if they should be on the eureka milestone
[17:08] <_mup_> ensemble/machine-agent-uses-formula-url r313 committed by kapil.thangavelu@canonical.com
[17:08] <_mup_> merge predecessor
[17:13] <_mup_> Bug #828189 was filed: machine agent should use formula url for unit deployment < https://launchpad.net/bugs/828189 >
[17:15] hazmat: dublin is over, right?
[17:17] hazmat, ftest should be on the eureka milestone
[17:17] as well as part of some sort of CI, presumably using jenkins
[17:18] buildbot would also be ok, but jenkins has much better reporting
[17:23] SpamapS, it is
[17:23] jimbaker, jenkins is also a bit easier to setup imo
[17:23] If only canonistack's S3 worked, it would already be running there. ;)
[17:24] Actually I'm not sure canonistack is going to work.. 1.5GB isn't really enough for all those logs and graphs. ;)
[17:24] hazmat, sounds good - i haven't done either setup, but i was very impressed w/ working w/ jenkins/hudson in the past
[17:24] * SpamapS is thinking maybe that should be his Ensemble Audition candidate. :)
[17:25] SpamapS, sounds like a good use of ensemble
[17:25] SpamapS: can canonistack point to real S3?
[17:25] to self monitor
[17:26] at least temporarily
[17:26] <_mup_> ensemble/verify-version r325 committed by jim.baker@canonical.com
[17:26] <_mup_> Merged trunk & resolved conflict
[17:26] or just external swift
[17:26] jcastro is your branch for bug 806638 ready for merging/review?
[17:26] <_mup_> Bug #806638: Docs need updating to mention what it expects as a value for instance type < https://launchpad.net/bugs/806638 >
[17:27] jcastro, actually that looks more like a trivial
[17:28] m_3: no because the auth details would be different
[17:28] jcastro, i'll cowboy that in
[17:28] jimbaker, bcsaller1, niemeyer can i get +1 on this trivial, http://bazaar.launchpad.net/~jorge/ensemble/docfix-instance-type/revision/266
[17:29] gotcha... wasn't sure if we could configure separate S3_URL from ec2 endpoint for ensemble
=== bcsaller1 is now known as bcsaller
[17:29] m_3: been there, tried that. ;)
[17:29] it's really about the image store though
[17:29] hazmat,taking a look
[17:30] hazmat, +1 on the trivial
[17:30] hazmat: lgtm too
[17:33] hazmat: ah yeah I had forgotten about that, I was learning the docs and took an opportunity to do a quick fix.
[17:36] <_mup_> ensemble/trunk r320 committed by jim.baker@canonical.com
[17:36] <_mup_> merged verify-version [r=niemeyer,hazmat][f=825398]
[17:36] <_mup_> Store ensemble protocol version in /topology znode to ensure all
[17:36] <_mup_> topology using ops failfast with IncompatibleVersion if mismatch.
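Editor's sketch of the fail-fast check described in the r320 commit message above: the protocol version is stored alongside the topology, and any operation that reads the topology raises IncompatibleVersion on a mismatch. The value 0 comes from the r324 commit message earlier in the log; the plain dict standing in for the /topology znode contents, the "version" key name, and the `verify_version` helper are assumptions, not the merged implementation.

```python
PROTOCOL_VERSION = 0  # "Change protocol version to 0" (r324 commit message)


class IncompatibleVersion(Exception):
    """Raised when stored topology data uses another protocol version."""

    def __init__(self, found, expected):
        self.found = found
        self.expected = expected
        super().__init__(
            "Incompatible protocol version: found %r, expected %r"
            % (found, expected))


def verify_version(topology):
    """Return topology unchanged if its version matches, else fail fast.

    topology is a plain dict here (an assumption); a missing "version"
    key is treated as a mismatch rather than silently accepted.
    """
    found = topology.get("version")
    if found != PROTOCOL_VERSION:
        raise IncompatibleVersion(found, PROTOCOL_VERSION)
    return topology
```

Calling such a check at the top of every topology-reading operation is what makes a client running different code stop immediately instead of misreading the stored state.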
[17:38] <_mup_> ensemble/trunk r321 committed by kapil.thangavelu@canonical.com
[17:38] <_mup_> [trivial] clarify valid values ec2 provider option default-instance-type [r=bcsaller,jimbaker][a=jcastro]
[17:43] <_mup_> ensemble/trunk-merge r276 committed by kapil.thangavelu@canonical.com
[17:43] <_mup_> merge trunk
[17:43] <_mup_> ensemble/security-policy-with-topology r326 committed by kapil.thangavelu@canonical.com
[17:43] <_mup_> merge trunk and resolve conflict
[17:50] <_mup_> ensemble/security-agents-with-identity r314 committed by kapil.thangavelu@canonical.com
[17:50] <_mup_> resolve conflict and merge
[17:53] <_mup_> ensemble/states-with-principals r326 committed by kapil.thangavelu@canonical.com
[17:53] <_mup_> remove merge conflict files that got added.
=== otubo is now known as otubo[AFK]
[18:18] <_mup_> ensemble/expose-cleanup r315 committed by jim.baker@canonical.com
[18:18] <_mup_> Merged trunk & resolved conflict
[18:32] <_mup_> ensemble/expose-cleanup r316 committed by jim.baker@canonical.com
[18:32] <_mup_> Fixes due to provider refactoring
=== otubo[AFK] is now known as otubo
[19:57] * niemeyer waves
[19:59] hazmat: I started going in the direction we discussed for gozk last night, and suddenly a feeling of approach-rightness stroke me..
[20:00] hazmat: I have to say it's really weird the way zk works by default
[20:00] Can't imagine people writing reliable software would want this
[20:00] In the specific sense of watch vs. CONNECTING/ASSOCIATING events
[20:01] It's a bit like TCP popping up a notice to the stream saying "Hey, an ip packet got duplicated, but I dropped it, alright?"
[20:08] niemeyer, indeed its noise to most apps
[20:09] niemeyer, my short term memory must be really bad, i don't recall discussing gozk last night
[20:11] niemeyer, bcsaller, jimbaker also reminder we've got our weekly team meeting tomorrow
[20:11] hazmat: We didn't.. I just felt in the mood to fix it after the previous talk we had a while ago
[20:12] ah.. right, previous discussions about gozk and session events, gotcha
[20:14] <_mup_> Bug #828326 was filed: need to be able to retrieve a service config or schema from the cli < https://launchpad.net/bugs/828326 >
[20:26] <_mup_> ensemble/expose-cleanup r317 committed by jim.baker@canonical.com
[20:26] <_mup_> Partial fix of mock expectations in shutdown tests
[21:31] <_mup_> ensemble/expose-cleanup r318 committed by jim.baker@canonical.com
[21:31] <_mup_> Mocking on describe_instances going through state transitions
[21:54] <_mup_> Bug #828378 was filed: handle ec2 instance quotas < https://launchpad.net/bugs/828378 >
[21:56] bcsaller, niemeyer so does an lxc sf mini-sprint still sound good?
[21:57] hazmat: yes, did something change?
[21:58] bcsaller, just wanted to confirm
=== otubo is now known as otubo[AFK]
[22:02] hazmat: Absolutely
[22:03] hazmat: Makes a lot of sense
[22:04] niemeyer, great
[23:16] <_mup_> Bug #828411 was filed: relation status shows "up" before relation hooks complete execution < https://launchpad.net/bugs/828411 >
[23:40] m_3, bug 828411 is interesting - certainly not at a granularity that we currently track as you note
[23:40] <_mup_> Bug #828411: relation status shows "up" before relation hooks complete execution < https://launchpad.net/bugs/828411 >
[23:42] jimbaker: yeah, main use-case is testing at the moment
[23:42] m_3, the only thing that really gives us a fairly comprehensive trace of the system state are the logs
[23:43] but this should be simple and it's definitely something you'd expect the framework to tell you
[23:43] maybe we can push more info into ZK, not certain
[23:43] especially when it comes to more automation
[23:44] m_3, the info being reported by status is actually is what is driving hook execution
[23:45] might surface more with user-defined hooks or something
[23:45] right now it's just a nice-to-have
[23:45] m_3, sure, definitely will think about it. i just wonder if we can extract more value from logs
[23:45] but seems like it would be important for state consistency going forward
[23:46] logically it's two different states
[23:46] (haven't taken the time to figure out a good solution... just starting the conversation with the bug)
[23:47] m_3, it's possible that the shared scheduler mentioned in UnitRelationLifecycle would benefit from this
[23:48] in terms of sharing more state through ZK, which status could pick up
[23:48] the shared scheduler being something not implemented
[23:49] to my knowledge, the unit lifecycle is not recorded in ZK, but only in memory in the unit agent
[23:49] hmmm... I'll have to look
[23:50] i don't know dev plans for this. maybe to be addressed in the go rewrite, maybe earlier
[23:53] kinda almost sits at the same level of user-defined hooks... user-defined events... user-defined states
[23:53] that's probably what it should get lumped with
[23:53] not sure
[23:57] m_3, currently they are separate things, including what i would understand user-defined things to be
[23:58] m_3, but where they could intersect is on the scheduling
[23:59] also it's possible that user-defined could be meaningful on a relation, so that it would expand beyond settings