#ubuntu-ensemble 2011-08-15
<niemeyer> hazmat: Yo!
<hazmat> niemeyer, pong
<_mup_> Bug #826498 was filed: virtualbox machine provider for osx local dev <Ensemble:New> < https://launchpad.net/bugs/826498 >
<hazmat> g'morning
<hazmat> i guess i'm on point
<niemeyer> Hello everyone!
<jimbaker> niemeyer, hi
<niemeyer> jimbaker: Hey man
 * kim0 pushing an ensemble mongodb cluster screencast in an hour
<niemeyer> kim0: Where?
<kim0> I'm still uploading :)
<kim0> will hit the usual places 
<kim0> figuring out if replication was actually working was way harder than deploying with ensemble hehe :)
<niemeyer> kim0: Ah, cool
<niemeyer> robbiew: ping
<robbiew> niemeyer: pong
<niemeyer> robbiew: Hey there
<niemeyer> robbiew: Had a good trip back home? ;-)
<niemeyer> robbiew: Quick pvt question
<hazmat> niemeyer, it looks like the kanban is stale again
<hazmat> niemeyer, also i've been brainstorming on our workflow and tooling, i think we might be able to use a bzr plugin to some good effect to solve some of our workflow issues
<niemeyer> hazmat: Use http://people.canonical.com/~niemeyer/dublin.html, I didn't stop it given the number of recent issues we've had
<niemeyer> We should really put Eureka in place, though
<fwereade> hey all
<niemeyer> fwereade!
<hazmat> niemeyer, sounds good re eureka
<niemeyer> fwereade: Made it home safely?
<fwereade> niemeyer: yeah, all good :)
<niemeyer> Great to hear
<fwereade> niemeyer: great to be ;)
<hazmat> fwereade, great.. that travel sounded hard
<fwereade> hazmat: well, worse things happen at sea ;)
<fwereade> niemeyer: while travelling I got lp:~fwereade/ensemble/hide-instances basically reverted
<fwereade> I need to do some live verification now I have a connection again
<fwereade> but if it works -- and the UI is no worse than before -- will that be ok to merge?
<niemeyer> fwereade: Ok.. please take a careful time with this branch and reevaluate whether the issues there are all fixed
<niemeyer> fwereade: As Kapil pointed out, the UI is worse than before
<niemeyer> fwereade: "Machine not found" after a bootstrap is senseless to a user
<fwereade> niemeyer: I think that's been dealt with, but I'll reverify everything
<niemeyer> fwereade: Sounds good
<fwereade> niemeyer: and mark it "needs review" again when I'm done
<niemeyer> fwereade: Please try it in a few real interactions
<niemeyer> fwereade: Try e.g. ensemble status several times right after ensemble bootstrap
<fwereade> niemeyer: I have been, it was a surprising real interaction that led me to back stuff out
<niemeyer> fwereade: That's cool then.. I'm all for less code as you know :)
<fwereade> 2011-08-15 20:51:29,795 ERROR Ensemble environment is not accessible: machine i-d1e8bdb0 has no address assigned yet
<fwereade> tolerable?
<fwereade> human-readable intro, machine specification for those who need to know
<fwereade> (that's the only change; I'm pretty sure we can do better, but that'll come in another branch when I generalise it to cobbler)
<niemeyer> fwereade: That's beautiful IMO
<fwereade> niemeyer: cool -- I'll check everything else, but that was the specific really bad bit (it certainly beats "NoneType has no attribute 'groups'") ;)
<niemeyer> fwereade: LOL.. I can agree with that
<fwereade> niemeyer: ok, it's back on "needs review", but I can't seem to ask for another from you... it thinks you've already approved it
<niemeyer> fwereade: Sounds great, thanks!
<fwereade> niemeyer: a pleasure :)
<fwereade> right, that's definitely it for me today
<fwereade> nn all :)
<niemeyer> jimbaker: ping
<niemeyer> jimbaker: ping?
<SpamapS> adam_g: reading your openstack deploy stuff with glee.. looks pretty cool
<niemeyer> SpamapS: So, I've heard we are getting ftests automated?  Do you know something about that? :-)
<adam_g> SpamapS: cool, thanks
<adam_g> niemeyer: if by ftests you mean formula tests, james came up with a cool way of testing and aggregating results back to jenkins via a "tester" formula.
<niemeyer> adam_g: It was actually about "functional tests"
<niemeyer> adam_g: We have some interesting logic in the tree already that runs a real deployment against EC2 and performs checks against it
<niemeyer> adam_g: It's pretty ineffective at the moment, though, because we don't really run the tests
<niemeyer> adam_g: It'd be awesome to have these tests being run on every commit to trunk, to ensure ensemble works for real at all times
<niemeyer> adam_g: and then enhance those tests
<adam_g> niemeyer: ah, gotcha. 
<niemeyer> Also makes a lot of sense in that final runway towards 11.10
<jimbaker> niemeyer, hi
<niemeyer> jimbaker: Hi.. we have to talk about tasks for the next couple of months, but I can't do that right now unfortunately.  Let's talk tomorrow.
<jimbaker> niemeyer, sounds good
<niemeyer> I'll step out for now..
<_mup_> ensemble/pythonpath-fix-bug-816264 r305 committed by kapil.thangavelu@canonical.com
<_mup_> update injection of ENSEMBLE_PYTHON_PATH per review
<_mup_> ensemble/trunk r312 committed by kapil.thangavelu@canonical.com
<_mup_> merge python-path-fix-bug-816264 [r=niemeyer,jimbaker][f=816264]
<_mup_> Avoid setting PYTHONPATH when executing hooks, as it can have
<_mup_> side-effects on hook execution (per the bug report, mod-wsgi pkg
<_mup_> install was a reproducible error on natty).
<_mup_> Instead utilize ENSEMBLE_PYTHONPATH environment variable and some shim
<_mup_> code in each hook CLI-API to support development scenarios where the
<_mup_> PYTHONPATH is needed for the CLI-API.
<hazmat> that's odd using --fixes=lp:bug_num when committing a branch merge adds a link to the bug against trunk
<hazmat> i thought it was supposed to just close the bug..
<hazmat> hmm. ic, i was using it incorrectly.. it has to be used on the branch to be merged, and then it creates the bug-branch link
#ubuntu-ensemble 2011-08-16
<hazmat> niemeyer, using the lp api in bzr doesn't look like it will work, too many roundtrips to lp
<hazmat> its too slow
<hazmat> it would have to be some sort of secondary cache for querying
<hazmat> with bg updates
<niemeyer> hazmat: What were you looking to do there?
<hazmat> niemeyer, my intent was to do a bzr plugin that gave us the lp integration we want.. i also started playing with colo/pipeline.. stuff i basically wrote a spec for it.. and then realized i needed to backtrack to not dictate bzr workflow... so i went ahead and started a cli implementation that can answer questions, like show me my pending branches that need work, show me things i can review, show things that are missing for the kanban and why
<hazmat> i've got a few of them already done, although its a library that i'm driving from the interpreter rather than a cli but i wanted to explore the concept
<hazmat> even with a cache though, there's too much link traversal/roundtrips to answer some questions
<hazmat> its a bit slow... it might just need to be a web app
<hazmat> so i can use the bg updated cache for live views
<niemeyer> hazmat: Yeah, that's a fairly common complaint about that API
<niemeyer> hazmat: Even for our store API, it took a few roundtrips before they accepted the traditional way of doing things would be inefficient
<niemeyer> hazmat: There's some level of agreement that something must be done to improve the conventions, though
<niemeyer> REST FTW..
<niemeyer> not
<hazmat> :-)
<hazmat> it would be nice if an app could keep a local cache and then just get a stream/feed of changes against a higher level entity like a project or milestone, to do its own cache invalidation.. the other issue for interactive use is that it requires a good connection... it would be nice if the bzr integration could do its read functionality solely against its local cache
<hazmat> its really not clear what would make it nicer, as is the api is very functional, but i almost wish for incremental data dumps against a top level entity
<hazmat> project most likely given that's the common organization level in lp
<_mup_> Bug #827071 was filed: Improved tmux byobu key bindings <Ensemble:New> < https://launchpad.net/bugs/827071 >
<_mup_> Bug #827073 was filed: Fixed number of machines placement policy <Ensemble:New> < https://launchpad.net/bugs/827073 >
<_mup_> ensemble/verify-version r314 committed by jim.baker@canonical.com
<_mup_> Addressed review, but mocking is still not working for new scheme
<_mup_> ensemble/verify-version r315 committed by jim.baker@canonical.com
<_mup_> Try a mocker like solution for the interim
<_mup_> ensemble/verify-version r316 committed by jim.baker@canonical.com
<_mup_> Try a mocker like solution for the interim
<_mup_> ensemble/verify-version r316 committed by jim.baker@canonical.com
<_mup_> Cleanup
<_mup_> ensemble/verify-version r317 committed by jim.baker@canonical.com
<_mup_> PEP8
<_mup_> ensemble/verify-version r318 committed by jim.baker@canonical.com
<_mup_> Restored blank line
<_mup_> ensemble/verify-version r319 committed by jim.baker@canonical.com
<_mup_> Merged trunk
<_mup_> ensemble/formula-state-with-url r312 committed by kapil.thangavelu@canonical.com
<_mup_> merge trunk
<kirkland> hazmat: any action I need to take on that one?
<kirkland> hazmat: i see it's assigned to me
<hazmat> kirkland, no just needed to get it onto the kanban view
<hazmat> kirkland, we've been primarily using that as our view into the review queue, and branches without associated bugs (and milestones) don't show up
<kirkland> hazmat: coolio;  should i file a bug per branch in the future?
<hazmat> kirkland, yes, that would be great, also helps to assign it to the current milestone
<hazmat> s/helps/needs to be
<kirkland> hazmat: sweet, will do
<hazmat> its getting a bit cumbersome, i've been exploring alternate options based on lp extensions, but it looks like we'll need either a 3rd party system, or an alternate web front end onto launchpad to get things simpler for the project workflow
<hazmat> but till then, that's the mechanism..i was just testing out the lp extension which mentioned all the merge proposals that weren't showing up in the kanban
<hazmat> it takes too long sadly to make it a useful cli tool by itself
<fwereade> so, I was wondering about _run_operation in ec2.MachineProvider
<fwereade> it turns all errors raised by the operation into ProviderInteractionErrors
<fwereade> but it's only used on .connect
<fwereade> and I feel it should either wrap everything, or wrap nothing, with a moderate preference for wrapping nothing -- individual operations can turn appropriate exceptions into PIEs if they want, but errors in operations don't necessarily imply errors interacting with the provider
<fwereade> they could be, say, attempts to get groups() from an unchecked regex match result
<fwereade> niemeyer: morning!
<niemeyer> fwereade: Hey man
<fwereade> niemeyer: I was just talking to nobody in here :)
<fwereade> niemeyer: quick recap
<fwereade> niemeyer: ec2.MachineProvider._run_operation turns exceptions from the operation it runs into ProviderInteractionErrors
<fwereade> niemeyer: only used for .connect
<fwereade> niemeyer: I don't think it should exist at all, because the errors aren't necessarily related to provider interaction
<fwereade> niemeyer: individual ops can wrap errors in PIEs if they really want to
<fwereade> niemeyer: concur?
<niemeyer> fwereade: Agreed
<fwereade> niemeyer: sweet
<niemeyer> fwereade: This was an easy way to make sure that all operations we cared about would be properly wrapped
<niemeyer> fwereade: As long as you're sure the operations are indeed taking care of translating the errors into something reasonable, it's not necessary
<fwereade> niemeyer: it's just the one operation that uses it anyway
<fwereade> niemeyer: the other possibility is that everything should use it (or a variant)
<fwereade> niemeyer: but that doesn't feel right to me
<niemeyer> fwereade: This was used in strategic operations we believed were leaky in terms of which errors they could raise
<niemeyer> fwereade: We didn't want it to blow in random ways
<niemeyer> fwereade: Which is why it's fine as long as you indeed check that we're taking care of possibilities there
<fwereade> niemeyer: cool, got you
<fwereade> niemeyer: we've got some work to do on general provider interactions then
<fwereade> niemeyer: EC2Errors, S3Errors and xmlrpc Faults are currently passed through unless they correspond to specific error states we already know about
<niemeyer> fwereade: This is exception handling "FTW".. in Go we'd just say "if <thing errored> { <do something local to adapt>; <pop error up, perhaps wrapped> }.  That doesn't tend to feel nice when doing exception handling _everywhere_
<niemeyer> fwereade: Well, there you go.. they shouldn't be passed through
<fwereade> niemeyer: writing a bug
<niemeyer> fwereade: Generic/common code should not worry about EC2Error, or S3Error, etc
<fwereade> niemeyer: agreed
<niemeyer> fwereade: That's why we wrap, and why we had that method there
<fwereade> niemeyer: sure, so it feels like we might want provider-specific wrappers
<fwereade> niemeyer: any ec2 MachineProvider call should be wrapped in something that converts EC2Errors and S3Errors but lets everything else through
<niemeyer> fwereade: Indeed, or no wrapper at all if you can ensure that all errors are indeed being taken care of or reraised in a way outside logic can catch properly
<fwereade> niemeyer: yep, makes sense
<niemeyer> fwereade: Ideally these errors wouldn't exist.. because the user can't really react to "Instance does not exist".. or "S3 auth error".. without any kind of context
<fwereade> niemeyer: potentially easy to forget to do if it's not automagic but we can worry about that if we do turn out to be forgetting it
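A minimal sketch of the per-operation wrapping being discussed, assuming hypothetical EC2Error/S3Error exception classes from the provider libraries; only known provider faults are converted to ProviderInteractionError and everything else propagates untouched:

```python
# Sketch only: names other than the pattern itself are assumptions, not the
# actual ensemble code.
from functools import wraps


class ProviderInteractionError(Exception):
    """Raised when an unexpected provider-level failure occurs."""


def wrap_provider_errors(*error_types):
    """Convert only provider-specific errors; let everything else propagate."""
    def decorator(operation):
        @wraps(operation)
        def wrapper(*args, **kwargs):
            try:
                return operation(*args, **kwargs)
            except error_types as error:
                # Known provider faults become a uniform, catchable error.
                raise ProviderInteractionError(
                    "Unexpected %s: %s" % (type(error).__name__, error))
        return wrapper
    return decorator


# Hypothetical usage on an individual operation, where EC2Error and S3Error
# stand in for the provider library's exception types:
#
# @wrap_provider_errors(EC2Error, S3Error)
# def shutdown_machine(self, machine):
#     ...
```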
<_mup_> Bug #827308 was filed: ProviderInteractionError wrapping not consistent <Ensemble:New> < https://launchpad.net/bugs/827308 >
<fwereade> niemeyer: btw... how often should the kanban board update?
<niemeyer> fwereade: I believe the one maintained by IS is broken again
<niemeyer> Please use this one for now: people.canonical.com/~niemeyer/dublin.html
* niemeyer changed the topic of #ubuntu-ensemble to: http://people.canonical.com/~niemeyer/dublin.html | http://ensemble.ubuntu.com/docs
<fwereade> niemeyer: cheers
* niemeyer changed the topic of #ubuntu-ensemble to: http://j.mp/ensemble-dublin | http://ensemble.ubuntu.com/docs
<fwereade> niemeyer: https://code.launchpad.net/~fwereade/ensemble/find-all-zookeepers/+merge/71668 may qualify as trivial
<niemeyer> fwereade: I'm planning to go over all your branches once I get rid of a summary for last week that I'm writing, but let me see that one
<fwereade> niemeyer: cheers
<niemeyer> fwereade: Are the call sites ready to take more than one machine?
<fwereade> niemeyer: only client is connect, which currently just grabs the first one
<fwereade> niemeyer: I was proposing to grab the first one with an assigned dns_name, if it exists
<fwereade> niemeyer: cleverer choosing algorithm can wait until we actually bring up multiple zookeepers, I think
<fwereade> niemeyer... oh no it's not
<fwereade> niemeyer: I missed bootstrap
<fwereade> niemeyer: not sure what the semantics should be there
<fwereade> niemeyer: it bothers me slightly that start_machine returns a list of one machine; the fact that bootstrap will return a list of pre-existing bootstrapped machines is not necessarily material, because I don't think anyone uses that result
<fwereade> niemeyer: if we were to decide that start_machine should return a naked machine, it would matter
<niemeyer> fwereade: I don't understand what you mean?
<fwereade> niemeyer: sorry :)
<niemeyer> fwereade: If we didn't use the result of bootstrap, how would we guess later the machine we bootstrapped?
<fwereade> niemeyer: bootstrap() returns the result of get_zookeeper_machines(), or the result of start_machine()
<niemeyer> fwereade: Sounds sane?
<fwereade> niemeyer: isn't that internal?
<niemeyer> fwereade: What's internal?
<fwereade> niemeyer: bootstrap uses the result of start_machine internally to write the state
<fwereade> niemeyer: or passes through the result of get_zookeeper_machines
<fwereade> niemeyer: well, actually, LaunchMachine itself does the writing if it's constructed in bootstrap mode, that's definitely a bug
<niemeyer> fwereade: Yes, you're describing what happens.. I don't understand what you think the problem is
<fwereade> niemeyer: but it's asymptomatic until we have multiple zookeepers anyway
<fwereade> niemeyer: I don't think there's a problem, I was trying to explain why the change isn't a problem, even though bootstrap also uses the result of get_zookeeper_machines
<niemeyer> fwereade: Yes, you're saying it returns a single machine at all times
<niemeyer> fwereade: Which is what it already did
<niemeyer> fwereade: My original question (which remains) is whether all call sites are ready to deal with multiple machines
<niemeyer> fwereade: Because that's what the branch does
<fwereade> niemeyer: let me start again
<fwereade> niemeyer: the call sites are in connect, and bootstrap
<fwereade> niemeyer: both expect lists
<fwereade> niemeyer: neither are perfectly specialised to deal with multiple zookeepers yet, but I don't believe the differences are material because the changes necessary for working with multiple zookeepers will be more wide-ranging than that
<fwereade> niemeyer: this branch is just preparing a small part of the necessary ground
<niemeyer> fwereade: The question is simpler than that
<niemeyer> fwereade: Will the call sites break because you're passing a list?
<niemeyer> fwereade: with more than one element
<fwereade> niemeyer: no
<niemeyer> fwereade: Phew.. ;_)
<fwereade> niemeyer: :)
<fwereade> niemeyer: on general principles, shall I add a test for bootstrap with multiple zookeepers, to make it clear?
<niemeyer> fwereade: Nah
<niemeyer> fwereade: Whoever adds logic to actually deal with multiple machines will go through that
<fwereade> niemeyer: ok, cool 
<niemeyer> fwereade: I just wanted to make sure we were not consciously adding a bomb
<fwereade> niemeyer: just a change, it'd need to be mixed with other stuff to become a bomb ;)
<niemeyer> fwereade: So, sure.. you can merge it.. it's not clear why that's being done, but fine
<fwereade> niemeyer: cheers -- just because I'd rather write upcoming code to deal with get_zookeeper_machine*s* as if there really were machine*s*, rather than encoding time-sensitive assumptions in more places than they need to exist
<niemeyer> fwereade: Sounds good
<niemeyer> SpamapS: ping
<niemeyer> fwereade: How much time do you think we still need for the "Orchestra works" bits of the problem to be in trunk?
<fwereade> niemeyer: all rather depends on the merge queue; just to ensure it's as speedy as possible, I think I'll get a minimal orchestra connect in very soon, and work on nicening up the UI separately
<fwereade> niemeyer: the orchestra connect should basically be null, it's just a matter of writing tests
<niemeyer> fwereade: Ok, but by now you have an idea of the pipeline "cost".. how long do you think before your buffer empties?
<fwereade> niemeyer: it's totally dependent on review speed and verdicts -- I wouldn't MP if I didn't think they were good, but I'm not always right ;)
<fwereade> niemeyer: this week sounds very plausible though
<niemeyer> fwereade: Thanks.. that gives me an idea of order of magnitude
<niemeyer> fwereade: No need for spot on accuracy in this case
<fwereade> niemeyer: cool
<fwereade> niemeyer: you know, I could have sworn I was at some sort of sprint last week :p
<niemeyer> fwereade: Oh man, did I miss you?
<fwereade> niemeyer: don't worry, the opportunity for a snarky comment makes up for it :)
<niemeyer> kim0: ping
<kim0> niemeyer: pong
<niemeyer> kim0: Can you please tweak the blog post to reflect the points made in the thread? 
<jcastro> ok guys, here's the report for the last week so far: http://pad.ubuntu.com/ensemble-report
<kim0> niemeyer: sure yeah
<niemeyer> kim0: Thanks
<niemeyer> jcastro: Anders and Dustin also worked on the Orchestra provider
<niemeyer> jcastro: and Clint also worked with Jim on testing the expose/unexpose IIRC
<hazmat> g'morning
<niemeyer> jcastro: Kapil also worked on the local development/LXC
<niemeyer> jcastro: That's why I avoid giving names to the specific tasks.. :)
 * jcastro just adds more names instead
<niemeyer> jcastro: I add all names on the sprint/project.. one boat. :)
<niemeyer> jcastro: Sometimes I forget, though..
<hazmat> fwereade, my understanding is the only reason not to return multiple machines for zk, is that sshclient can't really use them atm, but it maintains the same interface as txzk.. its probably a better idea to just specialize for that in sshclient than in the providers
<fwereade> hazmat: that's the idea -- in my current branch, ZookeeperConnect has a very simple _pick_machine method, which may become more complex in the future
<jimbaker> hazmat, there should be support in sshclient for this scenario... but it only takes the first one in the list
<fwereade> hazmat: in practice, it works the same as it already does, but I'm trying to avoid assuming that get_zookeeper_machine*s* returns a one-element list
<fwereade> jimbaker: there are going to be lots of changes when we have multiple zookeepers... with this, there will be one fewer
<fwereade> jimbaker: it's not much but it's a start
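A rough sketch of the _pick_machine idea described above, preferring the first machine that already has a dns_name; the names and behaviour are assumptions drawn from the conversation, not the merged code:

```python
def _pick_machine(machines):
    """Choose one zookeeper machine to connect to from a list."""
    for machine in machines:
        # Prefer a machine that is already addressable.
        if getattr(machine, "dns_name", None):
            return machine
    # Nothing has an address yet; fall back to the first entry so the caller
    # can retry once the provider assigns one.
    return machines[0] if machines else None
```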
<jimbaker> fwereade, ok, i should definitely review your branch. i happened to be looking at test_sshclient to better understand the power of mocker
<fwereade> jimbaker: oh, sorry, I misread sshclient... yeah, I think that's a very nice idea, I hadn't even thought of that
<fwereade> jimbaker: but the sshclient code scares me :)
<jimbaker> fwereade, the sshclient code is simple. you should see test_sshclient ;)
 * fwereade cowers
<_mup_> ensemble/verify-version r320 committed by jim.baker@canonical.com
<_mup_> At least don't do something stupid/unpythonic with action at a distance module namespace changes
<jimbaker> the specific code is going away once i figure out how to get mocker patch to do what i want, but at least it's been cleaned up
<fwereade> about sshclient...
<fwereade> why do we raise txzookeeper ConnectionTimeoutExceptions?
<fwereade> shouldn't we have an ensemble ConnectionTimeout(NoConnection)?
<jimbaker> fwereade, i believe this is to keep the interface compatible with txzookeeper client
<niemeyer> fwereade, jimbaker: I'm in a call right now, but I'm in debt to both of you, and would like to sort it out this afternoon.
<jimbaker> niemeyer, sounds good
<niemeyer> fwereade: I'll clean the queue today, so expect good feedback tomorrow
<niemeyer> jimbaker: Can we have a call after lunch?  In.. 1.5h?
<jimbaker> niemeyer, cool, i was wondering which time zone's afternoon :)
<jimbaker> so that time works for me
<jimbaker> niemeyer, i still haven't figured out how to use mocker to perform the equivalent of just setting the module name, but i will keep working on that
<niemeyer> jimbaker: Setting the module name?
<niemeyer> jimbaker: Don't know what that'd mean
<jimbaker> niemeyer, i'm trying to do something like the following: http://paste.ubuntu.com/667409/
<jimbaker> it's a very simple thing: in this test, i want to change ensemble.state.topology.VERSION to another number (say by incrementing it)
<jimbaker> niemeyer, so action at a distance for code that is relying on this VERSION, specifically InternalTopology.parse
<jimbaker> niemeyer, now when i had first constructed this test, VERSION was defined in a completely separate file, ensemble.common. for whatever reason, perhaps due to importing, this was working. likely accidentally working
<SpamapS> niemeyer: pong, good morning :)
<niemeyer> jimbaker: Yeah, I get it.. I've sent a mail about it yesterday
<fwereade> niemeyer: cool, thanks (I'm around for a little while longer today but not *that* long :))
<niemeyer> jimbaker: Use the patch method in ensemble.lib.testing
<niemeyer> SpamapS: morning!
<jimbaker> niemeyer, cool, that's what i have been trying, as well as seeing how it's used
<jimbaker> niemeyer, sorry, now i see the difference - this is a separate patch from mocker
<niemeyer> jimbaker: That's not what I get from your message, but let's talk later.. have to pay attention here
<jimbaker> given this is exactly equivalent to what i want to do, just something we have coded up, that should do what i want
<fwereade> jimbaker: ah, I see, thanks
<SpamapS> niemeyer: so I was thinking about this, and wondering if there's a way we can run the ftests from the .deb ..
<SpamapS> niemeyer: That way we can ask people to run them in support situations
<jimbaker> niemeyer, cool, the correct patch method works
<_mup_> ensemble/verify-version r321 committed by jim.baker@canonical.com
<_mup_> Use TestCase.patch for action at a distance change to module namespace
<niemeyer> SpamapS: Probably not..
<niemeyer> SpamapS: Well.. maybe yes, depending on what you mean by that
<niemeyer> SpamapS: You mean running ftests when debs are built, or running ftests with an ensemble that was deb-installed?
<niemeyer> jimbaker: Super
<SpamapS> niemeyer: the latter
<niemeyer> SpamapS: That's doable
<jimbaker> niemeyer, i see my source of confusion - you mentioned in your email "ensemble.state.testing" which doesn't exist, but i error corrected to the tests on ensemble.state, which use self.mocker.patch extensively (as well as one use of self.patch)
<jimbaker> now it's all clear
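The patching pattern under discussion, sketched with a stand-in module so it runs on its own; ensemble's own TestCase.patch helper is assumed to behave like mock.patch.object here (temporarily rebind a module-level constant such as ensemble.state.topology.VERSION, restore it after the test):

```python
import types
import unittest
from unittest import mock

# Stand-in for ensemble.state.topology, purely for illustration.
topology = types.SimpleNamespace(VERSION=1)


def parse_version(data_version):
    """Toy parser: accept only data written with the current VERSION."""
    if data_version != topology.VERSION:
        raise ValueError("incompatible topology version %s" % data_version)
    return data_version


class VerifyVersionTest(unittest.TestCase):

    def test_old_data_rejected_after_version_bump(self):
        # Bump VERSION for this test only; it is restored when the block exits.
        with mock.patch.object(topology, "VERSION", 2):
            self.assertRaises(ValueError, parse_version, 1)
        self.assertEqual(topology.VERSION, 1)  # automatically restored


if __name__ == "__main__":
    unittest.main()
```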
<niemeyer> SpamapS: But I think it makes most sense to do against trunk
<niemeyer> SpamapS: Ideally ftests would be run more frequently than deb-builds
<jimbaker> at least it got me to look at test_sshclient again, just in time to discuss w/ fwereade
<niemeyer> jimbaker: Cool, sorry for the confusion
<fwereade> serendipitous :)
<jimbaker> fwereade, indeed :)
<niemeyer> Alright, I'm off the call, and will get some lunch and get cranking
<SpamapS> niemeyer: I'm more thinking of the integration w/ the OS.. agreed that at some point you guys will want to see if trunk fixes somebody's problems. :)
<jimbaker> niemeyer, no worries
<niemeyer> SpamapS: That's not the point
<SpamapS> niemeyer: the results of ftests with the version somebody is reporting a bug on are interesting to see if they've mucked up their environment.
<niemeyer> SpamapS: If a revno from bzr is ftested, it's ftested, no matter if it's within a deb or within a branch
<niemeyer> SpamapS: For quality assurance, it's beneficial to have ftests running even before they get to a deb
<niemeyer> SpamapS: Even with interim revisions
<SpamapS> niemeyer: right, you're looking at it from a "testing the code" perspective. I'm looking at it from a "testing the environment" perspective.
<niemeyer> SpamapS: ftests don't test an environment..
<niemeyer> SpamapS: They test that ensemble works
<_mup_> ensemble/verify-version r322 committed by jim.baker@canonical.com
<_mup_> Merged trunk
<SpamapS> Yeah, though they're more sensitive to environment breakage than unit tests.
<niemeyer> SpamapS: Yes, they are functional tests
<SpamapS> Like I could run through all the unit tests, and still have a broken system because I'm pointed at a DNS resolver that is doing something wonky with my requests to amazon.
<SpamapS> But the ftests should expose that. I think.
<niemeyer> Yep
<niemeyer> and that's why we want to run it with every revision from trunk, rather than just packaged revisions
<SpamapS> all good. I *also* want to have it in the arsenal of things to suggest that a suer try when they're having trouble
<SpamapS> user too
<SpamapS> suer's are even more scary than users :)
<niemeyer> SpamapS: LOL
<niemeyer> SpamapS: I wouldn't do that myself
<niemeyer> SpamapS: For the same reason we don't ask users to run unittests
<niemeyer> SpamapS: But if you want to do it.. well..
<SpamapS> with other projects I've asked users to run unit tests before.. and found out all kinds of things.. like broken paths.. missing files.. weird compilers.. 
<SpamapS> Anyway it was just a thought that passed through my head as I'm planning to setup the jenkins tests
<niemeyer> SpamapS: Sounds good
<niemeyer> Lunch for reals
<jamespage> SpamapS: what are you planning in terms of automated testing?  I spent some time last week thinking about how we test the OpenStack formulas so it would be good to sync up
<jamespage> make sure we are not overlapping/share knowledge etc..
<m_3> jamespage: +1 flyonthewall for that
<jamespage> m_3: coolio - I need to write up what I did last week and circulate - it works but its not that elegant
<SpamapS> jamespage: Yeah, Adam said you had been doing some stuff
<m_3> jamespage: understand... I've had similar "compromises" in this area
<SpamapS> jamespage: m_3 has some thoughts as well. :) I think formulas embedding a test that can be scooped up and run in an automated fashion would be brilliant
<jamespage> SpamapS, m_3: thats pretty much what I did - its basic ATM but it can be run from Jenkins to generate graphs etc..
<m_3> are you mocking or actually relating?
<jamespage> also puts writing the tests for a formula IN the formula which feels nice
<m_3> yes!
<jamespage> m_3: actually relating
<m_3> right
<jamespage> m_3, SpamapS: I implemented it as a single formula-tester service
<SpamapS> jamespage: so I have a strong desire to test the orchestra -> openstack -> all other formulas chain .. over and over..
<jamespage> and tester-* hooks each of the services which get related after the environment is deployed
<jamespage> at which point the tests get executed and collated back to the formula-tester
<jamespage> which jenkins pulls all of the results from
<SpamapS> jamespage: hah, sweet!
<SpamapS> jamespage: so the test relation tells jenkins what to relate the formula to?
<jamespage> its a little 'grey' in some areas as it was hard to tell when relations had finished relating
<jamespage> kinda
<m_3> jamespage: nice... how is it specifying what the relation is expecting/providing?
<m_3> jamespage: or is it just discovery?
<SpamapS> I do think we need an 'ensemble watch x=y' command where you can say "just wait here until state x changes." .. if it changes to y.. then exit 0, other wise, 1
<jamespage> SpamapS: that would be great
<SpamapS> we can fake it w/ status for now
<jamespage> m_3, SpamapS: so I have this new formula - http://bazaar.launchpad.net/~james-page/+junk/formula-tester/files
<jamespage> which provides the 'testing' interface
<jamespage> So a formula that wants to be tested optionally requires the 'testing' interface - branch of rabbitmq as an example - http://bazaar.launchpad.net/~james-page/+junk/rabbitmq-server/files
<jamespage> All the relation does is set up SSH keys and exchange identification to allow the formula-tester to execute the tests and collate the results on each of the service units its hooked up with
<m_3> nice, and stays out of the way unless the tester relation is joined
<jamespage> I then hacked together a wrapper script - http://bazaar.launchpad.net/~james-page/+junk/test-formula/files
<jamespage> to bootstrap, build the environment, collate the test results and kill everything
<jamespage> it parses the yaml of ensemble status plus other hacks to determine where it's up to
<jamespage> Then there is a openstack-formula-test branch - http://bazaar.launchpad.net/~james-page/+junk/openstack-formula-test/files
<jamespage> which has the hooks and config for deploying the openstack formulas and testing them
<jamespage> freeze/rehydrate environment in ensemble would really help with this
<jamespage> as it has lots of little snoozes at the moment
<jamespage> (adam_g and I were not that brave)
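A sketch of the "parse ensemble status and snooze" loop jamespage describes; the YAML layout of `ensemble status` output assumed here (services/units/state keys) is illustrative, not a documented format:

```python
import subprocess
import time

import yaml


def wait_for_unit_state(service, wanted="started", timeout=900, interval=15):
    """Poll `ensemble status` until every unit of `service` reaches `wanted`."""
    deadline = time.time() + timeout
    while time.time() < deadline:
        output = subprocess.check_output(["ensemble", "status"])
        status = yaml.safe_load(output)
        units = status.get("services", {}).get(service, {}).get("units", {})
        states = [unit.get("state") for unit in units.values()]
        if units and all(state == wanted for state in states):
            return True
        time.sleep(interval)  # the "little snoozes"
    raise RuntimeError(
        "%s did not reach %r within %ss" % (service, wanted, timeout))
```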
<m_3> jamespage: brilliant man
<SpamapS> jamespage: quite interesting... I'd actually create a user for running the tests, not rely on the ubuntu user. That way you can drop the user when the tester relation is broken.
<jamespage> yeah - as I said it lacks polish
<SpamapS> jamespage: But I do think its a good framework for getting tests written. What is the XML format?
<jamespage> I'm going to get it up and running somewhere visible so folks can see
<jamespage> so the tests are written as python unittest
<jamespage> subunit + subunit2junitxml is then used to massage the output into JUnit XML format which Jenkins likes
<SpamapS> jamespage: so it kind of makes sense for all of that to happen on the jenkins side
<jamespage> probably
<jamespage> I'm not convinced that python unittest is quite the right solution.
<m_3> jenkins service could actually even _be_ the testing service
<jamespage> m_3 - hey - neat idea!
<SpamapS> s/could/should/ ;)
<m_3> but it might make sense to keep it separate for stability
<m_3> I can see the tester getting jammed at times
<m_3> awesome...
 * m_3 grokking ramifications
<SpamapS> This can all be quite fluid.
<m_3> and contained :)
<m_3> it sort of is a mocker too
<m_3> so all of those patterns apply
<SpamapS> Formulas are their own mocks in some way.
<SpamapS> Want to test if mediawiki's mysql works, before relating it to the database? deploy it.. relate it.. done.
<SpamapS> Since the formula defines that mysql will be on 3306 and have x,y,z .. you couldn't mock without overriding the mysql client library in mediawiki
<m_3> and it keeps in the spirit of ensemble with freeform tester scripts/hooks in each formula
<m_3> not forcing a whole lot of framework
<m_3> SpamapS: right, it doesn't carry over directly
<jamespage> m_3, SpamapS: so one thing we did not have a good answer for was how you test a service's relations
<m_3> what are the goals of a formula unit-test?
<jamespage> m_3: good question
<m_3> implements the service's relation APIs?
<SpamapS> jamespage: relations should test themselves really... after receiving user/pass/host/port for mysql, you should try it in the relation-changed hook.. and error out if it doesn't work.
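A hedged sketch of such a self-testing relation-changed hook in Python; the relation-get invocation and the host/port setting names are assumptions for illustration rather than any particular formula's interface:

```python
#!/usr/bin/env python
import socket
import subprocess
import sys


def relation_get(key):
    # relation-get prints the named setting published by the remote unit.
    return subprocess.check_output(["relation-get", key]).strip().decode()


def main():
    host = relation_get("host")
    port = int(relation_get("port") or 3306)
    if not host:
        # Settings not published yet; a later relation-changed will retry.
        sys.exit(0)
    try:
        socket.create_connection((host, port), timeout=10).close()
    except OSError as err:
        print("cannot reach %s:%s: %s" % (host, port, err), file=sys.stderr)
        sys.exit(1)  # non-zero exit marks the hook as failed


if __name__ == "__main__":
    main()
```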
<hazmat> i was thinking that a testing framework for formulas would work in two parts, a per hook test script which would validate what the hook had done, executed on the units, and a setup script that would execute with the admin cli
<hazmat> effectively each formula would have a tests/hook-name-test and tests/hook-name-setup
<jamespage> SpamapS: guess so; so scope of the formula unit test should be how it validates its working once its up and has its minimal set of relations running
<hazmat> the setup could add units/relations etc, anything needed to test, its a bit minimal in terms of number of test scenarios that could be run though against an individual hook
<SpamapS> jamespage: indeed. poke the web service and see if you get a login screen instead of "It works!" .. make sure rabbitmq is running and listening on port XYZ .. those sorts of things, right?
<jamespage> thats what we have done so far
<jamespage> they are pretty easy to implement unittest style 
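A minimal example of that unittest style: a single check that the deployed unit is listening where the formula says it should be (the target host and port here are placeholders supplied by whatever drives the test, e.g. the tester formula):

```python
import os
import socket
import unittest

TARGET_HOST = os.environ.get("TEST_TARGET_HOST", "localhost")
RABBITMQ_PORT = 5672


class RabbitMQServiceTest(unittest.TestCase):

    def test_broker_port_open(self):
        # A plain TCP connect is enough to prove the broker is up and bound.
        with socket.create_connection((TARGET_HOST, RABBITMQ_PORT), timeout=10):
            pass


if __name__ == "__main__":
    unittest.main()
```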
<SpamapS> sounds like we should plan to chat about formula testing in Orlando
<m_3> worth hangouts before then too?
<m_3> (this is a big deal)
<SpamapS> at this point i just want to get a bzr triggered test against OpenStack setup.
<SpamapS> If I can also make it test against Orchestra.. that would be fabulous. :)
<SpamapS> ooo oo ooo bug 386596 passed QA! :)
<_mup_> Bug #386596: pushing to a packaging branch can't create a new package <codehosting-ssh> <escalated> <lp-code> <not-pie-critical> <package-branches> <principia> <qa-ok> <stakeholder> <Launchpad itself:Fix Committed by abentley> < https://launchpad.net/bugs/386596 >
<m_3> ah... nice
<hazmat> bcsaller, got time for a chat on lxc work?
<bcsaller> hazmat: absolutely, can you give me 10 minutes
<hazmat> bcsaller, sure
<fwereade> nn all, take care
<bcsaller> hazmat: did the picture of the planning board at the sprint come out? can we share that on U1?
<hazmat> bcsaller, sure, i've to grab it off my camera
<jimbaker> niemeyer, ready to talk?
<hazmat> bcsaller, i'm chilling in a g+ hangout
<jimbaker> bcsaller, i was taking a look at bug 823586. although the stream.flush() in RelationGetCli.format_shell does guarantee that the stream is in fact flushed before stderr is written to, i don't believe this guarantee can be respected by HookProtocol in outReceived and errReceived
<_mup_> Bug #823586: intermittent failure in test_relation_get_format_shell_bad_vars <Ensemble:In Progress by jimbaker> < https://launchpad.net/bugs/823586 >
<hazmat> bcsaller, g+ fail
<bcsaller> jimbaker: ahh, that could be why it still randomly fails. I think I'd remove the stderr output in this case then, but I suppose we could just loosen up the tests placement expectations
<bcsaller> hazmat: I'll start a new one
<jimbaker> bcsaller, i think relaxing the test is the best course of action. we can still be best effort on the flush
<jimbaker> bcsaller, anyway i will put a fix in place and push it for review
<bcsaller> jimbaker: thanks a lot
<jimbaker> bcsaller, the other thing to be aware of is that any log tests must wait for the process to end (in this case, that would be yield exe.ended)
<bcsaller> hazmat: skype?
<hazmat> bcsaller, installing it now
<bcsaller> jimbaker: good point
<jimbaker> bcsaller, that was a change i recently made, r286 (robust-hook-exit)
<_mup_> ensemble/fix-shell-bad-vars-test r314 committed by jim.baker@canonical.com
<_mup_> Relax stream ordering requirement in test, but do wait on streams being closed before testing
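The ordering point above, illustrated with a generic twisted.trial test rather than ensemble's own HookProtocol machinery: wait for the child process to finish, then assert on stream contents only, never on cross-stream ordering:

```python
from twisted.internet import utils
from twisted.trial import unittest


class StreamOrderingTest(unittest.TestCase):

    def test_assert_after_process_ends(self):
        # Run a tiny script that writes to stdout and stderr in sequence.
        d = utils.getProcessOutputAndValue(
            "/bin/sh", ["-c", "echo out; echo err >&2"])

        def check(result):
            out, err, code = result
            # Only assert once the process has ended, and only on content;
            # the relative ordering between the two streams is not guaranteed.
            self.assertIn(b"out", out)
            self.assertIn(b"err", err)
            self.assertEqual(code, 0)

        return d.addCallback(check)
```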
 * robbiew goes to pack for linuxcon flight today :/
<jimbaker> bcsaller, this looks like a trivial to me, so if you want to review it first and concur, we can do it as such
<niemeyer> jimbaker: Here
<jimbaker> bcsaller, lp:~jimbaker/ensemble/fix-shell-bad-vars-test
<jimbaker> niemeyer, skype?
<niemeyer> jimbaker: G+?
<jimbaker> niemeyer, g+ always seems heavyweight for this sort of thing, but sure
<niemeyer> jimbaker: Luckily we don't have to send packets manually
<jimbaker> niemeyer, ;), i just mean the setup. let's see here...
<niemeyer> jimbaker: Setup?  You mean clicking on that button in a web page?
<niemeyer> jimbaker: What's your gmail account?
<niemeyer> I'll send the invite
<jimbaker> niemeyer, still not quite like a skype call. anyway, just waiting here
<jimbaker> niemeyer, james.edward.baker@gmail.com
<niemeyer> jimbaker: Danke
<niemeyer> jimbaker: Done
<hazmat> jimbaker, bcsaller niemeyer, i'm going to go ahead and move over all open bugs in the dublin milestone to eureka milestone, any objections?
<hazmat> i'd like to go ahead and close out the dublin milestone
<niemeyer> hazmat: It'd be better to do a more realistic selection..
<niemeyer> hazmat: Or just move everything out, and then include selectively 
<niemeyer> hazmat: We have a month, pretty much
<niemeyer> hazmat: Would be great to do the move, though
<niemeyer> hazmat: I can do the kanban once you've opened the new one
<hazmat> niemeyer, https://blueprints.launchpad.net/ensemble/+spec/ensemble-local-development
<hazmat> niemeyer, the new one is open
<hazmat> oops
<hazmat> bcsaller, https://blueprints.launchpad.net/ensemble/+spec/ensemble-local-development
<bcsaller> niemeyer: we have a debate still about if we should be using cloud-init for local containers or not. There are strong arguments for doing it that way as it more closely mirrors what ec2 does, but in the case of local container creation it adds latency and defers to boot that which could already be cached on the local machine. do you want to have a talk about this now or another time?
<niemeyer> bcsaller: We can talk now
<niemeyer> bcsaller: Here or G+?
<bcsaller> g+
<niemeyer> bcsaller: Ok.. waiting for invite
<SpamapS> shouldn't this be moved out of drafts? https://ensemble.ubuntu.com/docs/drafts/service-config.html
<hazmat> SpamapS, it should indeed
<SpamapS> Also we should do redirects on 404 in drafts to /docs/$filename
<SpamapS> Maybe some mod_rewrite magic so we can only do it when files of the same name exist there.
<niemeyer> Alright!  Launchpad formulas (aka source packages) on demand are live!
<niemeyer> A toast to the Launchpad guys
<SpamapS> hip hip
<niemeyer> woohaa
 * SpamapS wishes the bot would say HOORAY when it sees that :)
<niemeyer> SpamapS: Hah, that'd be nice :-)
<SpamapS> adam_g: hey, can you try pushing your +junk formulas to /principia/oneiric/xxxx/trunk ?
<niemeyer> Alright.. let's see that queue
<niemeyer> kirkland: Was the version of byobu-tmux that is in review the same you tested in real interactions?
<vciaglia> hi *.*!
<SpamapS> vciaglia: welcome!
<vciaglia> hi SpamapS!
<SpamapS> vciaglia: I'm Clint from the mailing list btw. Great job on the initial squid formula. :)
<vciaglia> SpamapS, nice to meet you!
<_mup_> ensemble/formula-state-with-url r313 committed by kapil.thangavelu@canonical.com
<_mup_> ec2 storage.get_url better test verification of hmac signature
<vciaglia> SpamapS, i'm just hacking around to create a good proftpd formula. It's very interesting because we could work with TLS too.
<SpamapS> vciaglia: cool. Do you have a use case for these or just wanting to play?
<vciaglia> SpamapS, currently just playing. I'm identifying some services i could work on.
<SpamapS> vciaglia: great. One of the things that Ensemble is really good at is relating services together.. so if you can think of something that proftpd needs to use (like, for instance, a mysql server for auth) thats a great place to start.
<vciaglia> SpamapS, i saw formulas for wordpress, wikimedia, joomla. To relate services together i thought of Magento but unfortunately it isn't in the Ubuntu Repository.
<vciaglia> is it a problem or we can create a simple bash script to download and install from sources?
<SpamapS> vciaglia: the only requirement is that you verify it somehow.. so gpg or md5sum works.
<SpamapS> vciaglia: You can also embed a tarball in the formula itself.
<SpamapS> vciaglia: curious, why isn't Magento in Ubuntu?
<vciaglia> SpamapS, dunno :D
<SpamapS> Wow a 13MB download.. :)
<vciaglia> eheh
<niemeyer> Anyone able to review a trivial: https://code.launchpad.net/~jimbaker/ensemble/fix-shell-bad-vars-test/+merge/71749
<niemeyer> kirkland: ping
<kirkland> niemeyer: pong
<niemeyer> kirkland: Hey
<kirkland> niemeyer: howdy!
<niemeyer> kirkland: The branch byobu-tmux.. is it exactly what you tested last time?
<niemeyer> kirkland: Just want to make sure what we merge passed through a real use round
<kirkland> niemeyer: i did test what I proposed for merging;  I do have two branches, though;  let me be explicit here on which one
<kirkland> niemeyer: one minute
<niemeyer> Cool
<kirkland> niemeyer: yes, https://code.launchpad.net/~kirkland/ensemble/byobu-tmux is the correct branch, and yes, tested last week on 11-Aug-2011
<niemeyer> bcsaller, hazmat, jimbaker: ping
<niemeyer> kirkland: Awesome, merging, thanks!
<kirkland> niemeyer: cheers, thanks
<hazmat> niemeyer, pong
<niemeyer> hazmat: Easy one: https://code.launchpad.net/~fwereade/ensemble/spike-catchup/+merge/71269
<_mup_> ensemble/trunk r314 committed by gustavo@niemeyer.net
<_mup_> Merged byobu-tmux branch from Dustin [r=fwereade,niemeyer]
<_mup_> This adds supports for the byobu tmux configuration on debug-hooks.
<vciaglia> SpamapS, maybe i could help with phpmyadmin formula (related to apache). It's already assigned but i don't see any progress.
<hazmat> niemeyer, done, easy indeed, looks great
<niemeyer> hazmat: Thanks!
<vciaglia> SpamapS, related to apache and mysql
<SpamapS> vciaglia: I think at this point you're welcome to take it on.. the person who started work on it seems to have gotten stuck.
<vciaglia> SpamapS, great! Ok, tomorrow i'll start working on
<hazmat> jimbaker, re mocker.replace from yesterday.. ensembletestcase.patch is perhaps what you were looking for
<jimbaker> hazmat, yeah, that's definitely the one - i have updated the branch to use that
<_mup_> ensemble/fix-shell-bad-vars-test r315 committed by jim.baker@canonical.com
<_mup_> Merged trunk
<niemeyer> Lights went off.. BOOM outside.. lights went on.. lights went off again.. BOOM outside again.. lights do not go on anymore.
<niemeyer> I suspect this might take a while
<_mup_> ensemble/trunk r315 committed by jim.baker@canonical.com
<_mup_> merged fix-shell-bad-vars-test [r=niemeyer][f=823586]
<_mup_> [trivial] Remove testing of ordering of flushes to different streams
<_mup_> in intermittent failing test, such ordering cannot be guaranteed when
<_mup_> collected for logging by HookProtocol.
<niemeyer> Okay.. fwereade's branches have at least one review each
<niemeyer> Will clean the rest of the queue tomorrow morning
<niemeyer> I'll step out for the moment..
#ubuntu-ensemble 2011-08-17
<_mup_> ensemble/verify-version r323 committed by jim.baker@canonical.com
<_mup_> Merged trunk
<_mup_> ensemble/verify-version r324 committed by jim.baker@canonical.com
<_mup_> Change protocol version to 0; explicitly verify version key in test of InternalTopology.reset
<adam_g> SpamapS: no dice unless im pushing to the wrong place. http://paste.ubuntu.com/667795/
<_mup_> ensemble/expose-cleanup r312 committed by jim.baker@canonical.com
<_mup_> More initial code
<hazmat> niemeyer, do we have a eureka kanban?
<niemeyer> hazmat: Not yet, but if the milestone is in place I can set it up right away
<hazmat> niemeyer, the milestone is in place and most of the bugs are moved over
<niemeyer> hazmat: Woot
<niemeyer> hazmat: let me handle that
<hazmat> niemeyer, much more fun programming lp than doing it in the browser
<hazmat> niemeyer, i ended up writing a mongodb sync script last night after abandoning  cli integration
<niemeyer> hazmat: Wow
<niemeyer> hazmat: That sounds awesome
<hazmat> niemeyer, and used that just now as a basis for doing all the mods (queries).. its gratuitous, i could have done it through the lp api, but the turnaround was much faster this way
<hazmat> niemeyer, fwiw the mongodb sync code.. very much still in progress, i haven't gotten around to syncing people or teams yet, but its been useful by itself.. i need to parallelize the lp connections to take real advantage of the gevent concurrency options http://pastie.org/private/lnrfrdufyffnpbi7m69q
<hazmat> right now the gevent integration is gratuitous
<hazmat> interesting http://aws.amazon.com/govcloud-us/
* niemeyer changed the topic of #ubuntu-ensemble to: http://j.mp/ensemble-eureka | http://ensemble.ubuntu.com/docs
<_mup_> ensemble/expose-cleanup r313 committed by jim.baker@canonical.com
<_mup_> Implemented reasonably robust spike
<jimbaker> mocking tomorrow...
<_mup_> ensemble/trunk r318 committed by gustavo@niemeyer.net
<_mup_> Merge missed revision from kirkland's byobu-tmux branch, left
<_mup_> out due to an error on my part when merging. [trivial]
<fwereade> niemeyer: morning! missed you before :)
<niemeyer> fwereade: Hey!
<fwereade> niemeyer: thanks for all the reviews
<niemeyer> fwereade: No problem!
<fwereade> niemeyer: I think I'll be doing significant post-review changes in new stacked branches in future, to avoid monstrous diffs like the hide-instances one
<fwereade> niemeyer: sensible?
<niemeyer> fwereade: That sounds very sensible
<fwereade> niemeyer: in that case: https://code.launchpad.net/~fwereade/ensemble/provider-base-launch-machine/+merge/71850 :)
<fwereade> niemeyer: really just to check that that was what you wanted me to do
<niemeyer> fwereade: Looking at it right now, coincidentally
<fwereade> niemeyer: it could be taken much further but I'm not yet convinced that's a good idea
<fwereade> niemeyer: cool :)
<niemeyer> fwereade: The bootstrap parameter doesn't look bad per se
<niemeyer> fwereade: But there's something weird in it as a whole
<fwereade> niemeyer: it's not *bad* but it feels a bit ...off-kilter
<niemeyer> fwereade: Why do we have bootstrap() and start_machine(bootstrap=True)?
<niemeyer> (rhetorical)
<niemeyer> fwereade: It sounds like we're about to have an eureka moment :)
<fwereade> niemeyer: because (1) bootstrap, in the context of start_machine, is shorthand for saying "I'd like this machine to play another couple of roles"; and (2) because LaunchMachine has an extra responsibility, which it shouldn't, that also uses the bootstrap parameter to complete the bootstrap operation (ie saving provider-state)
<fwereade> niemeyer: I'd definitely like to fix it but it feels like a distinct bug/branch to me
<niemeyer> fwereade: We need to distinguish the two operations in this branch, even if it's a matter of naming
<fwereade> niemeyer: how about (1) moving the state-saving into bootstrap where it should be, and renaming the start_machine parameter "master"?
<fwereade> niemeyer: (there should have been a (2) somewhere in there)
<niemeyer> fwereade: Moving state-saving is unrelated to this, I think
<fwereade> niemeyer: well... it's a lot of the justification for the param name being bootstrap
<fwereade> niemeyer: but that makes sense
<fwereade> niemeyer: rename to "master", open a bug to move the state-saving?
<niemeyer> fwereade: Alright, I think I know what to do..
<fwereade> niemeyer: cool :)
<niemeyer> Maybe
<niemeyer> fwereade: Renaming to master is hiding the intention rather than making it more obvious
<niemeyer> fwereade: Hmm, or maybe not
 * niemeyer thinks
<fwereade> niemeyer: well, one problem is that I think we lack a canonical term for that machine
<fwereade> niemeyer: sometimes we call it the bootstrap node, sometimes the zookeeper
<niemeyer> fwereade: True
<niemeyer> fwereade: But that's not entirely a coincidence
<niemeyer> fwereade: The machine itself is not special
<fwereade> niemeyer: and I guess that when we have multiple zookeepers the precise nature of the machine(s) will change again
<fwereade> niemeyer: quite so :)
<niemeyer> fwereade: It just runs a service that we care about early on
<fwereade> niemeyer: that was why I was thinking that it's actually part of machine_data (where machine_data refers to a putative sensible replacement for the data bag)
<fwereade> niemeyer: machine-id is always needed; a machine implicitly always runs a machine agent; it may also run a zookeeper, and/or a provisioning agent
<niemeyer> fwereade: That said, I retract my pedantry :)
<niemeyer> fwereade: master=True sounds like a straightforward solution right now
<fwereade> niemeyer: however I'm loath to dirty up machine_data further right now
<fwereade> niemeyer: cool, cheers :)
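A sketch of where this lands: start_machine grows a master flag and bootstrap itself records the provider state; the class, method bodies, and state keys here are placeholders, not the real provider-base code:

```python
class MachineProviderBase:
    """Illustrative base provider; real ensemble providers differ."""

    def __init__(self):
        self._state = {}

    def start_machine(self, machine_id, master=False):
        """Launch one machine; master=True adds zookeeper/provisioning roles."""
        raise NotImplementedError("subclasses launch real machines")

    def get_zookeeper_machines(self):
        """Return previously bootstrapped machines (assumed empty here)."""
        return []

    def save_state(self, state):
        """Persist provider state (a real provider writes to S3, cobbler, ...)."""
        self._state.update(state)

    def bootstrap(self):
        """Start the first master machine and record provider state here,
        rather than inside the launch operation itself."""
        machines = self.get_zookeeper_machines()
        if machines:
            return machines  # environment already bootstrapped
        machines = self.start_machine(machine_id=0, master=True)
        self.save_state(
            {"zookeeper-instances": [m.instance_id for m in machines]})
        return machines
```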
<niemeyer> fwereade: We can improve the situation once we do more about turning zookeeper and the provisioning agents into actual formulas
<fwereade> niemeyer: ...I'm not sure we could deploy them as formulas without already having them in place, could we?
<fwereade> niemeyer: oh, additional ones
<fwereade> niemeyer: (right?)
<niemeyer> fwereade: There's a chicken & egg problem we need to solve, but I think it's doable
<fwereade> niemeyer: I can't quite see it yet, but it will be very nice if we can figure it out
<fwereade> niemeyer: anyway, can I go ahead and merge that one back into provider-base with just your review?
<fwereade> niemeyer: or should we wait for someone else to review both separately?
<niemeyer> fwereade: E.g. 1) deploy zk alone; 2) Run machine agent against zk; 3) Deploy a unit within machine 0 with a second zk; 4) Sync second zk with first one; 5) Kill first zk..
<niemeyer> fwereade: I think it's fine to merge this as a review point for the first one
<fwereade> niemeyer: hm, maybe :)
<fwereade> niemeyer: ok, cool
<fwereade> niemeyer: cheers
<niemeyer> fwereade: Thanks for separating this out, btw.. it made it a whole lot easier to review
<niemeyer> fwereade: We should talk to others in our meeting about this workflow
<fwereade> niemeyer: cheers :)
<fwereade> niemeyer: I'll try to remember
<niemeyer> kim0: ping
<kim0> niemeyer: Hey
<niemeyer> kim0: Hey man
<niemeyer> kim0: You probably haven't had time yet, but when you have a moment, can you please fix the "Ensemble security and firewall enhancements" post?
<kim0> niemeyer: done .. thanks for clearing that up
<_mup_> Bug #827994 was filed: MachineProvider interface inconsistent (list/*args) <Ensemble:New> < https://launchpad.net/bugs/827994 >
<niemeyer> kim0: np!
<fwereade> niemeyer: couple of clarifications on https://code.launchpad.net/~fwereade/ensemble/cobbler-zk-connect/+merge/71734 ...
<fwereade> niemeyer: error style as follows?
<fwereade> Some general thing that went wrong: Some specific explanatory message: Some message from the underlying exception
<fwereade> eg
<fwereade> ...er, sorry don't have the exact example I'm looking for
<fwereade> niemeyer: anyway, the other question was: remove launch_time from ProviderMachine as well? I don't think anything else uses it...
<fwereade> niemeyer: anyway, error message example
<fwereade> Ensemble environment is not accessible: Machine i-foobar may not be ready: Connection timed out
<niemeyer> fwereade: Sorry, was grabbing some coffee
<niemeyer> fwereade: re. the message,
<fwereade> niemeyer: np
<niemeyer> fwereade: More than two levels feels a bit weird to me
<niemeyer> fwereade: But you get the idea
<fwereade> niemeyer: likewise, because that was my intent with the ones that aren't capitalised
<niemeyer> fwereade: +1 on dropping launch_time if we have no users (woohay!)
<fwereade> niemeyer: although I guess it's maybe not ideal having a common prefix on EnvironmentPending ("Ensemble environment is not available:")
<niemeyer> fwereade: I think I provided an example of this message in the review, btw
<niemeyer> <fwereade> Ensemble environment is not accessible: Machine i-foobar may not be ready: Connection timed out
<niemeyer> This message, I mean
<fwereade> niemeyer: yep, seems reasonable, I think I misread exactly what you were after
<fwereade> niemeyer: and, yay, code to delete :D
<niemeyer> "Can't connect to machine %s (perhaps still initializing): %s"
<niemeyer> fwereade: This will work regardless of context
<niemeyer> fwereade: The message in your quote mentions the environment not being accessible, which has assumes things 
<niemeyer> s/has//
<fwereade> niemeyer: I should point out that ProviderInteractionError uses That: Nested: Style
<niemeyer> fwereade: Hmm.. what's the outside message prefix?
<niemeyer> fwereade: (wondering if we can drop it)
<fwereade> niemeyer:         return "ProviderError: Interaction with machine provider failed: %r" % self.error
<fwereade> niemeyer: ...and we expect PIE to wrap a range of other possible errors, I think, which may themselves Do: That
<niemeyer> fwereade: That's not very nice
<niemeyer> fwereade: That was kind of ok in the original context
<fwereade> niemeyer: I think we can certainly lose the ProviderError: prefix
<niemeyer> fwereade: ProviderInteractionError was used for wrapping
<niemeyer> fwereade: Agreed
<niemeyer> fwereade: Ugh.. and it uses repr
<niemeyer> fwereade: This feels like an overlook
<fwereade> niemeyer: OK, I'll see what I can do with that
<niemeyer> fwereade: I think we can remove this entirely
<niemeyer> fwereade: The wrapping, that is
<niemeyer> fwereade: We shouldn't be wrapping our own error messages
<niemeyer> fwereade: Unless we actually have something interesting to say, of course
<niemeyer> fwereade: Which is not the case.
<fwereade> niemeyer: I think PIEs tend to have been tested with assertIn("blah", str(error))
<fwereade> niemeyer: I'll look for assertIns and make them assertEquals, which should then make the nasty messages stick out
<niemeyer> fwereade: If we wrap a message from e.g. S3 into a PIE, we should add a prefix at the wrapping spot, if at all
<niemeyer> fwereade: This may be more work than you're looking for
<niemeyer> fwereade: I'd just fix the messages themselves and fix whatever breaks
<fwereade> niemeyer: I'll start a stacked branch and at least do a search, see how heavyweight it looks
<niemeyer> fwereade: Cool, thanks
<fwereade> niemeyer: killing launch_machine is pretty low-risk/low-noise, though, I'll do that inline first
<fwereade> niemeyer: er, launch_time
<niemeyer> fwereade: Superb
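A sketch of the error style converged on here: at most two levels, a readable summary plus the underlying detail, with no class-name prefix and no repr of the wrapped error; the class definitions are illustrative only, not the ensemble originals:

```python
class ProviderError(Exception):
    """Base class for errors talking to a machine provider."""


class ProviderInteractionError(ProviderError):
    """Unexpected provider failure, wrapped with a readable prefix."""

    def __init__(self, error):
        self.error = error
        super().__init__(
            "Interaction with machine provider failed: %s" % error)


class EnvironmentPending(ProviderError):
    """The environment exists but is not ready to be used yet."""


# Example of the two-level message form from the discussion:
# raise EnvironmentPending(
#     "Can't connect to machine %s (perhaps still initializing): %s"
#     % (instance_id, reason))
```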
<pindonga> hi, just wanted a quick heads up
<pindonga> is ensemble still working only with ec2?
<pindonga> or is it possible to use it with vms, like Vagrant , for example
<niemeyer> pindonga: Hey
<niemeyer> There's good work in progress to make it work with physical machines
<niemeyer> pindonga: and local deployments too
<jimbaker> fwereade, just a quick note - take a look at bzr log. our standard is to indicate in the log text which branch was merged in, along with reviewers and the bug fixed
<jimbaker> when merging branches into trunk
<fwereade> jimbaker: whoops, sorry
<fwereade> jimbaker: I'll try to remember
<jimbaker> fwereade, no worries
<fwereade> niemeyer, everyone: I assert that (1) MachinesNotFound is a ProviderInteractionError, not just a ProviderError; and (2) anywhere we "except ProviderInteractionError" is still wrong, because many provider interactions raise ProviderError (for bad args, for example)
<fwereade> opinions?
<niemeyer> fwereade: 2 sounds sensible.. 1 feels strange
<niemeyer> fwereade: Provider*Interaction*Error was, again, supposed to wrap interaction errors with the provider that we didn't expect 
<fwereade> niemeyer: if I fix 2, I'm happy
<niemeyer> fwereade: MachinesNotFound may not be an error in the provider, but on our own data about it
<fwereade> niemeyer: yep, makes sense
<niemeyer> fwereade: It's a ProviderError, but a well understood one
<niemeyer> fwereade: I hope we kill ProviderInteractionError entirely at some point
<fwereade> niemeyer: cool
<fwereade> niemeyer: I'll be making sure the existing except PIEs also catch PEs in the errors branch I think
<fwereade> niemeyer: feels like a really bad idea to let that linger
<niemeyer> fwereade: Sounds good
<fwereade> niemeyer: cheers, I'll just let it all bed in for a few minutes
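A rough sketch of the classification settled on above, with MachinesNotFound treated as a well-understood ProviderError rather than an interaction error, and call sites catching ProviderError so the narrower except clauses stop missing other provider failures (only the class names come from the conversation; the call site is made up):

    import logging

    log = logging.getLogger("ensemble")


    class ProviderError(Exception):
        """Any error arising from a machine provider operation."""


    class ProviderInteractionError(ProviderError):
        """Unexpected failure while talking to the provider itself."""


    class MachinesNotFound(ProviderError):
        """We asked about machines the provider does not know about.

        The provider answered fine; our own data about it was stale, so
        this is a plain ProviderError rather than an interaction error.
        """


    def shutdown_machines(provider, machines):
        # Hypothetical call site: catching ProviderError covers both the
        # wrapped interaction failures and well-understood cases such as
        # MachinesNotFound or bad arguments.
        try:
            provider.shutdown_machines(machines)
        except ProviderError as e:
            log.error("Cannot shut down machines: %s", e)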
<_mup_> ensemble/expose-cleanup r314 committed by jim.baker@canonical.com
<_mup_> Cleanup
<niemeyer> Folks, have been working since early morning and stayed around until late yesterday.. I'm taking a longer mid-day break today for resting a bit.
<_mup_> ensemble/formula-state-with-url r314 committed by kapil.thangavelu@canonical.com
<_mup_> lazy init of storage url in formula serialization, default none url on state, restore passthrough error, and update failure/mock test that verifies for additional url param
<hazmat> niemeyer, cheers
<pindonga> niemeyer, anything I could try out (re: local deployments)
<hazmat> pindonga, not at the moment, its actively being developed for our next internal release milestone mid september.
<pindonga> hazmat, ok, I was asking as I'm in need of automating our infrastructure, so I wanted to follow the path of least effort :)
<hazmat> it will be in the ppa ensemble roughly around then
<pindonga> I want to be able to use ensemble in the future
<pindonga> but since I need this now, I guess puppet+something else will have to do
<hazmat> pindonga, atm ensemble really only works against ec2
<pindonga> yeah
<pindonga> thx anyway
<hazmat> pindonga, we're striking out in a couple of different directions to address this, orchestra/cobbler integration for physical machines, lxc/local development, and out of the box openstack support. i'd say its probably about a month till we get these landed.
<pindonga> cool
<pindonga> I'll take a look again in a month then :)
<RoAkSoAx> fwereade: ping
<fwereade> RoAkSoAx: pong
<fwereade> RoAkSoAx: how's it going?
<RoAkSoAx> fwereade: pretty good, and you? heard you got stuck in chicago on the weekend?
<fwereade> RoAkSoAx: yeah, it was a bit of a hassle :)
<fwereade> RoAkSoAx: all good now though
<fwereade> RoAkSoAx: are you in a position to do some reverification?
<fwereade> RoAkSoAx: we're only 3 merges away from trunk now :)
<RoAkSoAx> fwereade: yay!! and will do later today
<RoAkSoAx> fwereade: i'm finishing some stuff up with cobbler/arm and once that's done i'm free
<fwereade> RoAkSoAx: awesome!
<fwereade> RoAkSoAx: how's that going?
<RoAkSoAx> fwereade: pretty good, just patching cobbler, and have to write another small fix and it will be good to go
<RoAkSoAx> fwereade: btw.. what branch of yours should I be using?
<fwereade> RoAkSoAx: I'll send you a new branch, I honestly can't remember the state of the old ones and I don't fancy merging everything through
<fwereade> RoAkSoAx: can you wait an hour or so? I'll make sure I've done it before I stop for the day :)
<RoAkSoAx> fwereade: sure, take your time
<hazmat> fwereade, so the base of provider-base-machine is provider-base-launch-machine?
<fwereade> hazmat: I branched PBLM from PB, so niemeyer could review that separately and check it matched his thinking from the PBLM review; then I merged it back into PB
<hazmat> ic, so the base is still trunk
<fwereade> hazmat: gaah: s/from the PBLM review/from the PB review/
<fwereade> hazmat: yep
<fwereade> hey, is it team meeting time?
<hazmat> fwereade, i though that was tomorrow
<hazmat> s/thought
<fwereade> hazmat: that would be quite convenient as it happens :)
<fwereade> hazmat: thanks for the review btw 
<hazmat> fwereade, np.. great stuff
<fwereade> hazmat: I try :)
<hazmat> niemeyer, bcsaller, jimbaker, fwereade btw. you guys probably saw the bug spam, but i closed out the dublin milestone, any dublin bugs which didn't have an assignee, got unassigned, and the rest moved on to the eureka milestone.
<fwereade> hazmat: thanks 
<hazmat> the eureka milestone has a hard deadline, as it's release time, so we're trying to only keep things on the milestone which we expect/need to get done for the close of the oneiric cycle
<fwereade> jimbaker: curses, I included the bug and reviewers, and forgot the branch :/
<hazmat> fwereade, np.. was fun to script it with the launchpad api
<_mup_> Bug #828147 was filed: Ensemble branch option needs to allow for distro pkg, ppa, and source branch install <Ensemble:New> < https://launchpad.net/bugs/828147 >
<_mup_> Bug #828152 was filed: default formula config values not available to hooks <Ensemble:New> < https://launchpad.net/bugs/828152 >
<fwereade> later all
<hazmat> fwereade, cheers
<hazmat> jimbaker, do you mind if i put bug 828147  on your plate for the eureka milestone?
<_mup_> Bug #828147: Ensemble branch option needs to allow for distro pkg, ppa, and source branch install <Ensemble:New> < https://launchpad.net/bugs/828147 >
<hazmat> its something unrelated to feature dev, that we need for the release
<jimbaker> hazmat, sure, looks like i would learn something fun
<jimbaker> in terms of working with launchpad
<hazmat> jimbaker, sadly the lp integration there is pretty minimal
<hazmat> jimbaker, are you going to be enabling a runner that will do functional tests?
<hazmat> jimbaker, i'm looking at some of the ftest outstanding bugs, and wondering if they should be on the eureka milestone
<_mup_> ensemble/machine-agent-uses-formula-url r313 committed by kapil.thangavelu@canonical.com
<_mup_> merge predecessor
<_mup_> Bug #828189 was filed: machine agent should use formula url for unit deployment <Ensemble:In Progress by hazmat> < https://launchpad.net/bugs/828189 >
<SpamapS> hazmat: dublin is over, right?
<jimbaker> hazmat, ftest should be on the eureka milestone
<jimbaker> as well as part of some sort of CI, presumably using jenkins
<jimbaker> buildbot would also be ok, but jenkins has much better reporting
<hazmat> SpamapS, it is
<hazmat> jimbaker, jenkins is also a bit easier to setup imo
<SpamapS> If only canonistack's S3 worked, it would already be running there. ;)
<SpamapS> Actually I'm not sure canonistack is going to work.. 1.5GB isn't really enough for all those logs and graphs. ;)
<jimbaker> hazmat, sounds good - i haven't done either setup, but i was very impressed w/ working w/ jenkins/hudson in the past
 * SpamapS is thinking maybe that should be his Ensemble Audition candidate. :)
<jimbaker> SpamapS, sounds like a good use of ensemble
<m_3> SpamapS: can canonistack point to real S3?
<jimbaker> to self monitor
<m_3> at least temporarily
<_mup_> ensemble/verify-version r325 committed by jim.baker@canonical.com
<_mup_> Merged trunk & resolved conflict
<m_3> or just external swift
<hazmat> jcastro is your branch for bug 806638 ready for merging/review?
<_mup_> Bug #806638: Docs need updating to mention what it expects as a value for instance type <Ensemble:In Progress by jorge> < https://launchpad.net/bugs/806638 >
<hazmat> jcastro, actually that looks more like a trivial
<SpamapS> m_3: no because the auth details would be different
<hazmat> jcastro, i'll cowboy that in
<hazmat> jimbaker, bcsaller1, niemeyer  can i get +1 on this trivial, http://bazaar.launchpad.net/~jorge/ensemble/docfix-instance-type/revision/266
<m_3> gotcha... wasn't sure if we could configure separate S3_URL from ec2 endpoint for ensemble
<SpamapS> m_3: been there, tried that. ;)
<m_3> it's really about the image store though
<jimbaker> hazmat,taking a look
<jimbaker> hazmat, +1 on the trivial
<bcsaller> hazmat: lgtm too
<jcastro> hazmat: ah yeah I had forgotten about that, I was learning the docs and took an opportunity to do a quick fix.
<_mup_> ensemble/trunk r320 committed by jim.baker@canonical.com
<_mup_> merged verify-version [r=niemeyer,hazmat][f=825398]
<_mup_> Store ensemble protocol version in /topology znode to ensure all
<_mup_> topology using ops failfast with IncompatibleVersion if mismatch.
<_mup_> ensemble/trunk r321 committed by kapil.thangavelu@canonical.com
<_mup_> [trivial] clarify valid values ec2 provider option default-instance-type [r=bcsaller,jimbaker][a=jcastro]
<_mup_> ensemble/trunk-merge r276 committed by kapil.thangavelu@canonical.com
<_mup_> merge trunk
<_mup_> ensemble/security-policy-with-topology r326 committed by kapil.thangavelu@canonical.com
<_mup_> merge trunk and resolve conflict
<_mup_> ensemble/security-agents-with-identity r314 committed by kapil.thangavelu@canonical.com
<_mup_> resolve conflict and merge
<_mup_> ensemble/states-with-principals r326 committed by kapil.thangavelu@canonical.com
<_mup_> remove merge conflict files that got added.
<_mup_> ensemble/expose-cleanup r315 committed by jim.baker@canonical.com
<_mup_> Merged trunk & resolved conflict
<_mup_> ensemble/expose-cleanup r316 committed by jim.baker@canonical.com
<_mup_> Fixes due to provider refactoring
 * niemeyer waves
<niemeyer> hazmat: I started going in the direction we discussed for gozk last night, and suddenly a feeling of approach-rightness struck me..
<niemeyer> hazmat: I have to say it's really weird the way zk works by default
<niemeyer> Can't imagine people writing reliable software would want this
<niemeyer> In the specific sense of watch vs. CONNECTING/ASSOCIATING events
<niemeyer> It's a bit like TCP popping up a notice to the stream saying "Hey, an ip packet got duplicated, but I dropped it, alright?"
<hazmat> niemeyer, indeed its noise to most apps
<hazmat> niemeyer, my short term memory must be really bad, i don't recall discussing gozk last night
<hazmat> niemeyer, bcsaller, jimbaker also reminder we've got our weekly team meeting tomorrow
<niemeyer> hazmat: We didn't.. I just felt in the mood to fix it after the previous talk we had a while ago
<hazmat> ah.. right, previous discussions about gozk and session events, gotcha
<_mup_> Bug #828326 was filed: need to be able to retrieve a service config or schema from the cli <Ensemble:New> < https://launchpad.net/bugs/828326 >
<_mup_> ensemble/expose-cleanup r317 committed by jim.baker@canonical.com
<_mup_> Partial fix of mock expectations in shutdown tests
<_mup_> ensemble/expose-cleanup r318 committed by jim.baker@canonical.com
<_mup_> Mocking on describe_instances going through state transitions
<_mup_> Bug #828378 was filed: handle ec2 instance quotas <Ensemble:New> < https://launchpad.net/bugs/828378 >
<hazmat> bcsaller, niemeyer so does an lxc sf mini-sprint still sound good?
<bcsaller> hazmat: yes, did something change?
<hazmat> bcsaller, just wanted to confirm
<niemeyer> hazmat: Absolutely
<niemeyer> hazmat: Makes a lot of sense
<hazmat> niemeyer, great
<_mup_> Bug #828411 was filed: relation status shows "up" before relation hooks complete execution <Ensemble:New> < https://launchpad.net/bugs/828411 >
<jimbaker> m_3, bug 828411 is interesting - certainly not at a granularity that we currently track as you note
<_mup_> Bug #828411: relation status shows "up" before relation hooks complete execution <Ensemble:New> < https://launchpad.net/bugs/828411 >
<m_3> jimbaker: yeah, main use-case is testing at the moment
<jimbaker> m_3, the only thing that really gives us a fairly comprehensive trace of the system state are the logs
<m_3> but this should be simple and it's definitely something you'd expect the framework to tell you
<jimbaker> maybe we can push more info into ZK, not certain
<m_3> especially when it comes to more automation
<jimbaker> m_3, the info being reported by status is actually what is driving hook execution
<m_3> might surface more with user-defined hooks or something
<m_3> right now it's just a nice-to-have
<jimbaker> m_3, sure, definitely will think about it. i just wonder if we can extract more value from logs
<m_3> but seems like it would be important for state consistency going forward
<m_3> logically it's two different states
<m_3> (haven't taken the time to figure out a good solution... just starting the conversation with the bug)
<jimbaker> m_3, it's possible that the shared scheduler mentioned in UnitRelationLifecycle would benefit from this
<jimbaker> in terms of sharing more state through ZK,  which status could pick up
<jimbaker> the shared scheduler being something not implemented
<jimbaker> to my knowledge, the unit lifecycle is not recorded in ZK, but only in memory in the unit agent
<m_3> hmmm... I'll have to look
<jimbaker> i don't know dev plans for this. maybe to be addressed in the go rewrite, maybe earlier
<m_3> kinda almost sits at the same level of user-defined hooks... user-defined events... user-defined states
<m_3> that's probably what it should get lumped with
<m_3> not sure
<jimbaker> m_3, currently they are separate things, including what i would understand user-defined things to be
<jimbaker> m_3, but where they could intersect is on the scheduling
<jimbaker> also it's possible that user-defined could be meaningful on a relation, so that it would expand beyond settings
#ubuntu-ensemble 2011-08-18
<niemeyer> hazmat: http://paste.ubuntu.com/668723/
<jimbaker> m_3, but my initial feeling is  that the current model has an elegant simplicity to it
<niemeyer> Dinner
<jimbaker> niemeyer, enjoy what must be a late dinner!
<m_3> jimbaker: understand... it might be a design decision to keep the states/events at a certain level
<niemeyer> hazmat: ping
<hazmat> niemeyer, pong
<niemeyer> hazmat: Hey man
<niemeyer> hazmat: Just finishing some docs for gozk
<hazmat> niemeyer, hola.. long dinner with an out of town friend
<niemeyer> hazmat: It's in a pretty good shape I think
<niemeyer> hazmat: Nice
<hazmat> niemeyer, awesome.. 
 * hazmat checks out the paste
<niemeyer> hazmat: I have a better test, actually..
 * niemeyer pastes
<niemeyer> hazmat: http://paste.ubuntu.com/668835/
<hazmat> niemeyer, interesting
<hazmat> looks like you did the diversion of session events off watches, nice
<niemeyer> hazmat: Yeah, and there's a non-obvious handy aspect there.. I'm just finishing the docs and will paste it
<hazmat> niemeyer, one caveat, perhaps obvious, if the reinit is used to reestablish sessions the extant watches against the session are somewhat hosed because the zk c bindings are tracking watches against the handle, and on reinit the handle is hosed though the session is restored
<hazmat> s/handle is hosed/handle is new
<niemeyer> hazmat: Not sure I get how that's an issue
<jimbaker> nice simple test
<hazmat> niemeyer, its not an issue, just something for a document
<hazmat> niemeyer, you were saying there's a handy non-obvious property there?
<niemeyer> hazmat: But why does it even have to be in the doc?  It feels like an internal implementation detail
<niemeyer> hazmat: Yeah.. events stringify properly through String()
<niemeyer> hazmat: and os.Error is actually an interface { String() string }
<niemeyer> hazmat: Which means it becomes comfortable to handle problematic situations
<niemeyer> hazmat: if !event.Ok { return event }
<hazmat> niemeyer, well reconnect/reinit a session, with watches associated to a session, and old extant watches effectively being dead is always obvious.
<niemeyer> hazmat: The event itself is the error
<hazmat> isn't, that is
<niemeyer> hazmat: That's exactly what the test I pasted does.. old watches do not die unless the session expires
<niemeyer> hazmat: If people get an expired session, the watch will fire
<hazmat> niemeyer, in that case it's the same session and same handle
<hazmat> if you ReInit with an open watch.. things are different, even though its the same session
<niemeyer> hazmat: Sure.. if people obtain a new zk value, it's a new zk value..
<hazmat> niemeyer, very cool though.. good to have this 
<hazmat> niemeyer, its a c library implementation detail/deficiency, just the association of callbacks to the handle, the watch callbacks should be session associated across reinit boundaries imo
<niemeyer> hazmat: I don't feel too strong about this..
<niemeyer> hazmat: ReInit is a bizarre detail
<niemeyer> hazmat: In practice, zk does the heavy lifting internally
<hazmat> niemeyer, this looks pretty good
<hazmat> niemeyer, on a separate, unrelated note, i found a gevent / twisted integration, basically a new reactor for twisted that uses gevent and automatically associates twisted protocols to i/o greenlets.. allows intermixing the gevent and twisted style code.. http://wiki.inportb.com/wiki/Projects:Python:Geventreactor
<niemeyer> hazmat: Thanks, really glad you like it.. I hope we can sort out some of our issues
<niemeyer> hazmat: http://paste.ubuntu.com/668840/
<niemeyer> hazmat: This is part of the docs
<niemeyer> hazmat: Holy crap
<niemeyer> hazmat: This is sick :)
<hazmat> super awesome :-)
<hazmat> niemeyer, the implementation is pretty clean, i haven't run it against anything yet, i'm curious how well it will handle inlinecallbacks.. but definitely interesting
<hazmat> niemeyer, so re docs.. look good.. one point of concern, is that there are numerous session channel events which are effectively transient and mean nothing to the app, it might be worth noting that; they're really there to allow apps to respond to connectivity changes for self-throttling.. but on a size limited buffer they're effectively no-ops to most apps.
<niemeyer> hazmat: This takes two second class citizens in the Python world and puts them together.. if you want to create a bomb, this is pretty effective.
<hazmat> i guess its captured with the panic if not read from session channel event, but it doesn't really distinguish that most of these events are frivolous to most apps
<hazmat> niemeyer, lol
<hazmat> niemeyer, if we pool our super powers.. we get cows :-)
<niemeyer> hazmat: That milk is radioactive acid
<hazmat> niemeyer, i guess it does, its just implicit in that non important things aren't delivered to watchers, which implies they are delivered to the session channel
<niemeyer> hazmat: Good point re. docs, will add
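Transposed to Python purely for illustration (the real change is the Go code in the pastes above; nothing here is actual Ensemble or gozk code): session/connection events are diverted onto a side channel, and only genuine node events reach a watch callback, so the transient CONNECTING/ASSOCIATING noise never confuses application watches.

    SESSION_EVENT = -1   # placeholder for the binding's session-event type id

    class WatchDispatcher(object):
        """Toy dispatcher: node events go to watchers, session noise aside."""

        def __init__(self):
            self.session_events = []   # stands in for the "session channel"

        def wrap(self, callback):
            def watcher(handle, event_type, state, path):
                if event_type == SESSION_EVENT:
                    # Connection-state changes are not node changes; keep
                    # them off the watch and record them for apps that want
                    # to self-throttle on connectivity.
                    self.session_events.append((state, path))
                    return
                callback(event_type, state, path)
            return watcher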
<_mup_> ensemble/expose-cleanup r319 committed by jim.baker@canonical.com
<_mup_> Mock completely current tests against shutdown
<niemeyer> hazmat: Indeed
<hazmat> otoh.. twisted has pluggable reactors.. this is just another pluggable reactor..  a libev one.. the addition of greenlets to it.. is definitely radioactive, but intriguing
<niemeyer> hazmat: It's intriguing for sure, but I'm not interested in it in the way you are.. We really can't even think of using something like this for Ensemble
<jimbaker> why would inlineCallbacks be impacted by the choice of reactor implementation? other than a slow reactor is going to trampoline slowly?
<niemeyer> hazmat: We have to walk towards a platform that enables us to produce a very solid framework
<hazmat> jimbaker, it shouldn't be.. it should just work
<niemeyer> hazmat: This is going into the exact opposite direction
<jimbaker> hazmat, exactly. there's no magic there
<niemeyer> hazmat: Made docs more clear
<hazmat> niemeyer, cool
<hazmat> i'm going to head to bed... have a good night folks
<niemeyer> hazmat: Night man.. will just finish this and will head to bed too
<TREllis> anyone tried playing with openstack & ensemble yet?
<TREllis> just getting a "Connection was refused by other side: 111: Connection refused" upon bootstrapping atm
<hazmat> morning
<hazmat> TREllis, we found a few issues with it in one of our dependencies, txaws, its definitely a priority to have it working smoothly b4 oneiric
<TREllis> hazmat: ah, I saw a patch Clint put up for txaws and applied it manually to check, but I still get problems, perhaps my config is dodgy? lemme pastebin it
<TREllis> hazmat: http://paste.ubuntu.com/669198/ my environments.yaml
<TREllis> hazmat: since I replaced the IP with a hostname and the txaws patch I get: http://paste.ubuntu.com/669200/
<TREllis> actually, hostname or IP, doesn't make a difference
<hazmat> TREllis, yeah.. i think that's about where we got on it
<hazmat> TREllis, we're going to need to do some additional debugging 
<TREllis> hazmat: ok thanks, added as a comment to the bug
<niemeyer> Good morning software lovers
<m_3> morning!
<heckj> morning west coaster!
<kim0> heh any idea how long does our cloudfoundry formula actually take to deploy
<niemeyer> kim0: No..?
<hazmat> kim0, there's an open bug about it.. there's some env var that the ruby needs that is unknown
<hazmat> kim0, it will hang otherwise, as i recall.. i thought it was just rabbitmq, but i think that's been resolved
<hazmat> i haven't seen any confirmation that its working though
<hazmat> but the symptom was it just hung on install
<kim0> botchagalupe is saying it's 25mins min! wanted to know if we beat that
<kim0> I know it's mostly packaging, not really an ensemble thing though
<kim0> 25mins still sounds too much to me
<m_3> kim0: agree, that sounds kind of strange... even if they have to entirely bootstrap a gem env
<niemeyer> kim0,m_3: It'd be nice to have more details about what's going on there
<niemeyer> It's certainly quite unexpected.. there's no reason for it to take 25 mins
<m_3> kim0: you have links to the latest or should we wait for negronjl 
<kim0> m_3: haven't looked, but we could dig it up from negronjl's 
<kim0> m_3: it seems to have a bug now though
<m_3> kim0: ok, I'll schedule some time to branch and play with it this afternoon
<kim0> m_3: cool!
<kim0> I'd love to show it to the world .. finishing in 5 mins or so hopefully :)
<m_3> kim0: it's gotten a lot of press lately
<kim0> yeah definitely hot
<niemeyer> kim0: Exactly.. it'd be nice to expose that
<kim0> if the bug is fixed today sometime .. I'll screencast it first thing tomorrow morning 
<niemeyer> kim0: That's awesome, thanks a lot
<_mup_> ensemble/expose-cleanup r320 committed by jim.baker@canonical.com
<_mup_> Mock out sleep
<hazmat> fwereade, niemeyer, bcsaller, jimbaker.. g+ meeting?
<bcsaller> looking for invite
<fwereade> hazmat: invite away
<niemeyer> Yep!
<jimbaker> meeting sounds good
<hazmat> invites out
<hazmat> jimbaker, you see invite
<hazmat> ?
<jimbaker> hazmat, i don't
<niemeyer> That's not working..
<niemeyer> I'll restart..
<jimbaker> i refreshed g+ a few times (which i don't expect to do), nothing yet
<_mup_> Bug #828885 was filed: 'relation-broken' hook not firing when relation is set to 'error' state <Ensemble:New> < https://launchpad.net/bugs/828885 >
<niemeyer> if !event.Ok { return event }
<niemeyer> c.Assert(event, Matches, "ZooKeeper connected; path created: /path")
<negronjl> jcastro: ping
<jcastro> negronjl: howdy
<negronjl> jcastro:  When you get a minute, I'd like to catch up with you re: NoSQL
<negronjl> is kim0 also attending ?
<jcastro> I have time now
<negronjl> G+ ?
<jcastro> sure
<negronjl> give me a sec
<negronjl> jcastro: invite sent
<niemeyer> bcsaller, hazmat: I'm available for our call
<bcsaller> niemeyer, hazmat: ready when you are
<hazmat> niemeyer, bcsaller invitations out
<jcastro> lynxman: heya, can you update the ensemble in macports to be a more recent snapshot?
<jcastro> we're doing a talk at scale out camp next wednesday and it'd be nice to have something more up to date
<_mup_> ensemble/expose-cleanup r321 committed by jim.baker@canonical.com
<_mup_> Refactoring of complex mocking
<_mup_> ensemble/expose-cleanup r322 committed by jim.baker@canonical.com
<_mup_> Finished previous refactoring
<_mup_> ensemble/expose-cleanup r323 committed by jim.baker@canonical.com
<_mup_> Verify deletion of security group in mocking
 * niemeyer breaks for coffee
<_mup_> ensemble/expose-cleanup r324 committed by jim.baker@canonical.com
<_mup_> Explicit MATCH for mock params, remove debugging
<_mup_> ensemble/expose-cleanup r325 committed by jim.baker@canonical.com
<_mup_> PEP8 & PyFlakes
<_mup_> ensemble/expose-cleanup r326 committed by jim.baker@canonical.com
<_mup_> docstrings, comments
<_mup_> ensemble/machine-agent-uses-formula-url r314 committed by kapil.thangavelu@canonical.com
<_mup_> additional tests for local file urls if the file is not present.
<_mup_> ensemble/machine-agent-uses-formula-url r315 committed by kapil.thangavelu@canonical.com
<_mup_> replace mocker any with value match functions
<fwereade> niemeyer, hazmat: EC2LaunchMachine *theoretically* handles stuff in machine_data like image_release_name, and various other params that get passed through to get_current_ami
<fwereade> niemeyer, hazmat: ...but I don't see any mechanism by which they could actually be injected at the moment
<fwereade> niemeyer, hazmat: (ok, yes, they get put in machine_data... but nothing *does* put them in machine_data AFAICT, so it all seems to be dead code)
<fwereade> niemeyer, hazmat: off the top of your heads, am I missing something?
<niemeyer> fwereade: They're probably not being used
<niemeyer> fwereade: get_current_ami was built as an experiment, outside of ensemble
<fwereade> niemeyer: then I can kill them? :)
<fwereade> niemeyer: get_current_ami itself is actually fine, it has (I think) sensible defaults for when you don't specify a default image id in the config
<niemeyer> fwereade: If they're working correctly, doesn't feel like a good idea..  we have missing functionality in that area, and this is a utility we can make good use of
<fwereade> niemeyer: get_current_ami is used elsewhere, and is useful so far as I know
<fwereade> niemeyer: tbh it's not really a hassle to keep them in, it'll just be an unused parameter somewhere
<fwereade> niemeyer: but on the other hand it wouldn't be a hassle to add them when we needed it
<niemeyer> fwereade: That's fine.. as long as they're not causing problems and not getting on your way to implement something, I suggest keeping things as they are
<fwereade> niemeyer: we'd need code changes to actually make use of them now
<niemeyer> fwereade: We need to support firing different releases, which touches that area
<fwereade> niemeyer: ok, I'll add an unused .image_options or something to the machine_data replacement, and make sure it works for when we get around to using it
<niemeyer> fwereade: Ah, I see.. you _are_ in fact touching on that area, and it's getting on your way
<niemeyer> fwereade: Sounds fine to remove it then
<fwereade> niemeyer: cool :)
<fwereade> niemeyer: I promise it won't be any harder to add in the future than it would be right now ;)
<hazmat> fwereade, the only injection mechanism atm is via config, we don't have pass through parameters to get machine options from start machine
<hazmat> hmm.. or do we
<fwereade> hazmat: we do, if we stick the right magic in machine_data
<niemeyer> hazmat: Injecting machine details through config is really a hack, though
<fwereade> hazmat: but we don't have any mechanism for actually adding them to machine_data, so it's kinda moot
<niemeyer> hazmat: Sounds fine to kill it for the moment
<hazmat> niemeyer, agreed most of the time folks are asking for it exposed on a per command basis, not an environment default
<niemeyer> Right
<hazmat> niemeyer, actually things like region are useful for determining ami and are environment settings properly
<niemeyer> hazmat: Sure.. that should continue to exist
<niemeyer> fwereade: ^
<fwereade> niemeyer, hazmat: region remains necessary, and default-image-id remains usable
<niemeyer> fwereade: default-image-id is also a hack
<fwereade> all we lose is the ability to stick things like image_release_name into machine_data, which we don't do anyway, and should do in a different way when we do come to do it
<fwereade> niemeyer: agreed, but it's a hack that people may well be currently using
<niemeyer> fwereade: We can support it for now, but it's dying soon
<fwereade> niemeyer: sounds good
<niemeyer> fwereade: They may be using it, but there are features that we're adding very soon that render it unusable
<hazmat> niemeyer, to be replaced with?
<fwereade> niemeyer: but I think it deserves a targeted death, not just a drive-by
<niemeyer> hazmat: To be replaced with release-level selection
<niemeyer> hazmat: The way to customize a base image with Ensemble is through formulas, not through custom machine ids
<fwereade> niemeyer: I'd been assuming something along those lines
<niemeyer> hazmat: People should definitely not be publishing formulas which require others to tweak their setup to use custom images
<fwereade> niemeyer: so, say, mongodb will want a 64-bit arch
<hazmat> yeah.. 64-bit image selection ended up being the primary use for default-image-id.. but it should be automatic just based on size
<fwereade> niemeyer: and we'll need to have an ensemble format to define these properties anyway, because we'll need to specify them differently on ec2 and on orchestra
<hazmat> the only ec2 oddball is tiny with 64bit
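A toy version of the automatic selection hazmat mentions, purely illustrative (the instance-type names are real EC2 types of the era, but the mapping itself is an assumption, with t1.micro as the 64-bit-capable oddball noted above):

    # Hypothetical helper: choose an AMI architecture from the instance size.
    SIXTY_FOUR_BIT_TYPES = set(["m1.large", "m1.xlarge", "t1.micro"])

    def ami_arch_for(instance_type):
        if instance_type in SIXTY_FOUR_BIT_TYPES:
            return "amd64"
        return "i386"

    # e.g. ami_arch_for("m1.small") -> "i386", ami_arch_for("m1.large") -> "amd64"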
<fwereade> niemeyer: so the time to thread them through is when we've actually defined what can be specified
<niemeyer> fwereade: We'll probably start introducing these bits pretty soon
<SpamapS> hrm.. seeing something very strange when trying to bootstrap on canonistack..
<niemeyer> fwereade: This is part of the store work
<fwereade> niemeyer: sounds good to me
<SpamapS> when doing a bootstrap w/ EC2.. it never seems to check for the existence of the s3 bucket..
<SpamapS> but with canonistack, it tries to find it, gets a 404, and fails
<fwereade> niemeyer: in that case, maybe it would be more sensible to retain the capability, just change how it's done
<fwereade> niemeyer: it's just one extra field/parameter after all
<niemeyer> SpamapS: I have vague memories of S3 failing in non-obvious ways in some case, with 403.. there's a chance they're behaving differently there
<hazmat> SpamapS, i suspect its the error parsing against the file not found
<niemeyer> fwereade: Sounds good as well
<hazmat> SpamapS, like we expect to find a certain value in the error denoting a file not found
<SpamapS> niemeyer: whats confusing is that no such request is even made to ec2
<hazmat> and having looked at s3store.py i doubt its compatible on error message content
<SpamapS> actually I can't say that for sure..
<SpamapS> I just realized the S3 URL is HTTP, not HTTPS
<niemeyer> SpamapS: That'd be a serious bug.. it should certainly check
<niemeyer> SpamapS: I mean, EC2 not checking the bucket
<niemeyer> Ensemble not checking the bucket on EC2, that is
<hazmat> SpamapS, specifically line 73 of providers/ec2/files.py i'd stick a pdb in there to verify
<hazmat> we check the bucket access on s3  prior to starting any machines on ec2
<hazmat> its the error that comes back if the file doesn't exist that seems to be not getting trapped correctly into something ensemble recognizes i suspect
<SpamapS> ahh so its looking for 'NoSuchBucket' when a 404 should be sufficient
<niemeyer> SpamapS: Man..
<niemeyer> SpamapS: This is starting to feel like a can of worms
<niemeyer> It's unfortunate that these interactions are entirely different :-(
<SpamapS> well there is no published standard for AWS
<SpamapS> so it takes somebody reporting that something is different for an implementation to get fixed
<SpamapS> its worth noting that nova-objectstore is *not* recommended except for testing/single node setups.
<hazmat> SpamapS, i doubt swift does much better  here
<niemeyer> SpamapS: When one mimics an implementation, there's only one way to make it look the same..
<niemeyer> SpamapS: http://paste.ubuntu.com/669591/
<niemeyer> SpamapS: I'd expect people to do that kind of experimentation when implementing it
<hazmat> SpamapS, looking at the swift s3 it definitely doesn't..
<hazmat> SpamapS, both implementations do however maintain the same http error codes as s3
<hazmat> just not the same error content responses
<SpamapS> which makes sense
<SpamapS> honestly, code > content ;)
<hazmat> SpamapS, yeah.. i think switching this to just verifying error codes should be sufficient
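Roughly what "just verifying error codes" looks like on the Ensemble side, as a sketch rather than the actual patch (it assumes the failure arrives as either txaws' parsed S3 error or a bare twisted.web.error.Error; the attribute name on the S3 side is illustrative):

    from twisted.web.error import Error as WebError

    def is_missing(failure):
        """True if a failed S3-style GET just means 'no such object/bucket'.

        Real S3 returns a parsed error body with a NoSuchKey/NoSuchBucket
        code; nova-objectstore and swift return a bare 404 with an empty
        body, so fall back to checking the HTTP status code itself.
        """
        error = failure.value
        if isinstance(error, WebError) and error.status == "404":
            return True
        return getattr(error, "error_code", None) in ("NoSuchKey", "NoSuchBucket")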
<SpamapS> it may be worth reporting to swift and nova-objectstore that they should conform to AWS
<SpamapS> but we probably should work w/ them before they fix that
<SpamapS> big question, why am I not getting a real traceback?
<SpamapS> hazmat: that failure is not coming from line 73
<niemeyer> SpamapS: Agreed, we should try to make it work regardless
<SpamapS> or anywhere in providers.ec2.files :-P
<niemeyer> SpamapS: I'm just bitching
<SpamapS> niemeyer: yeah, not much we can do unless we want to try and champion AWS compatibility as a standard ;)
<SpamapS> anyway, have to run.. ttyl guys
<hazmat> SpamapS, re tracebacks.. twisted tends to make that hard.. what traceback do you get?
<niemeyer> SpamapS: Cheers
<hazmat> SpamapS, ttyl
<lynxman> jcastro: sure, it always take a bit to update though
<lynxman> jcastro: as a matter of fact, the port I submitted for Ensemble is still not in the public repo :(
<lynxman> jcastro: http://www.macports.org/ports.php?by=name&substr=ensemble
<niemeyer> hazmat: I'm attempting to force a session expiration without much success..
<niemeyer> hazmat: Have you succeeded in testing this on txzk?>
<hazmat> niemeyer, hmm.. expiration specifically..
<hazmat> niemeyer, i could do closes easily, by connecting another client with the same session, and closing it one, so it closes the other
 * hazmat checks for expiration
<hazmat> niemeyer, yeah.. that generates an expiration
<jcastro> lynxman: oh man that sucks. :-/
<hazmat> niemeyer, http://bazaar.launchpad.net/~ensemble/txzookeeper/trunk/view/head:/txzookeeper/tests/test_session.py#L151
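The trick itself, sketched with the old zkpython-style bindings (the call signatures here are from memory and should be treated as an assumption; the linked txzookeeper test is the authoritative version): open a second connection claiming the same session id and password, then close it, and the server expires the session out from under the original client.

    import zookeeper

    def expire_session(servers, handle):
        # Grab the live session's credentials from the original handle...
        session_id, password = zookeeper.client_id(handle)
        # ...open a doppelganger connection on the same session...
        evil_twin = zookeeper.init(servers, None, 10000, (session_id, password))
        # ...and close it, which kills the session for the first client too,
        # so its watches fire with a session-expired event.
        zookeeper.close(evil_twin)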
<niemeyer> hazmat: Yeah, I recall you mentioned the trick
<niemeyer> hazmat: I was trying to check a real scenario
<lynxman> jcastro: yeah, they're up their elbows on work with the release of Lin :/
<niemeyer> hazmat: Reduced the timeout, killed the server
<lynxman> jcastro: Lion I mean :)
<niemeyer> hazmat: Restarted the server.. session seems reestablished fine
<niemeyer> hazmat: It's working too well.. ;)
<jcastro> lynxman: ok I'll say "in progress" or something
<hazmat> niemeyer, yeah.. so timeout specified on init is not the actual session timeout
<hazmat> niemeyer, the timeout is negotiated between client and server
<niemeyer> hazmat: Oh, crap.. you're right.. it's even called recvTimeout
<niemeyer> Oh well.. I'll see if the trick works
<lynxman> jcastro: sounds cool :)
<_mup_> ensemble/expose-cleanup r327 committed by jim.baker@canonical.com
<_mup_> Add mocking around shutdown taking too long
<hazmat> niemeyer, can you confirm that i didn't miss anything from our security discussion in austin.. https://pastebin.canonical.com/51455/
<adam_g> anyone bootstrapped an oneiric AMI lately? or at least since monday?
<adam_g>   * softwareproperties/ppa.py:
<adam_g>     - show PPA description and confirm before adding
<adam_g>       (security-o-catch-all spec)
 * hazmat tries it out
<hazmat> hmm
#ubuntu-ensemble 2011-08-19
<hazmat> hmm.. it looks like trunk is broken
<hazmat> oh.. i had ensemble set to use a different branch
<niemeyer> hazmat: Sounds right re. the catch up
<adam_g> actually python-software-properties changed
<hazmat> jimbaker, is unexpose not implemented?
<adam_g> and needs to be fixed, and so will cloud-init
<niemeyer> adam_g: What happened?
<adam_g> niemeyer: add-apt-repository now requires user confirmation when adding a PPA
<adam_g> just submitted a patch to software-properties to have a '-y' option similar to apt-get
<niemeyer> adam_g: Oh no
<adam_g> either way it breaks lots of formulas. :\
<niemeyer> Man..
<niemeyer> hazmat: I understood it was ready as well
<niemeyer> adam_g: Sent a note about the conceptual problem there
<niemeyer> adam_g: We can fix with -y.. but we need to increase awareness about the importance of not introducing interactivity randomly
<adam_g> niemeyer: agreed
<jimbaker> hazmat, unexpose is implemented
<jimbaker> hazmat, i would be very curious if you have found any issues of course
<niemeyer> hazmat: The full diff on the gozk update: http://paste.ubuntu.com/669664/
<hazmat> jimbaker, i unexposed a service (current trunk) and was still able to access it
<jimbaker> hazmat,did you delete the environment security group first?
<hazmat> jimbaker, ah
<hazmat> jimbaker, probably not
<jimbaker> sorry, in the release notes :)
<jimbaker> my expose-cleanup branch will take care of that cleanup
<jimbaker> but i considered it a helpful feature in the transition ;)
<hazmat> jimbaker, cool, thanks for clarifying
<jimbaker> hazmat, no worries
<SpamapS> hazmat: on plane.. inflight wifi is .. amazing. :)
<hazmat> SpamapS, nice
<SpamapS> hazmat: anyway re the traceback... I am trying to find where the 404 is coming from
<jimbaker> SpamapS, i remember flying back from europe with internet service a few years back. too bad that program was scrapped because of expense
<SpamapS> jimbaker: lufthansa is rolling it out on 100% of flights now
<jimbaker> but ground stations are cheaper than satellites, so domestic only for the immediate future
<hazmat> SpamapS, where you heading?
<SpamapS> Seattle.. just showing my boy something different than the SW US
<jimbaker> SpamapS, really, there's another satellite player now? iirc, it was boeing that scrapped it. maybe someone did pick it up
<SpamapS> He's only ever been to CA, NM, and TX ... WA is *quite* a different place. :)
<jimbaker> and mid aug is the best time of year to visit WA
<SpamapS> jimbaker: Not sure, they've been squawking about it that they're rolling it out.
<SpamapS> jimbaker: yeah, nice and green.. mid 70's :)
<jimbaker> sounds like our mountains
<jimbaker> really, we should not have any more sprints in central tx in august
<SpamapS> haha
<SpamapS> I thought it was nice.. for keeping us in our rooms working
<jimbaker> SpamapS, that's one theory, butts in seats and all. i find my brain works better if the body has moved however
<SpamapS> oh I see the problem with the tracebacks. :-/
<hazmat> SpamapS, i'm curious.. sometimes we get tracebacks in twisted and sometimes not.. 
<hazmat> mostly depends on if the code has yielded to the reactor but its not always clear
<SpamapS> This seems to be because the error isn't raised until inside the select loop
<hazmat> yeah
<SpamapS> I'm guessing its coming from txaws.. I can't seem to find where its raised in the provider
<SpamapS> Its really not clear at all where the 404 from Amazon is made "ok" .. 
<SpamapS> Yeah this twisted error handling is *maddening*
<SpamapS> the error that is happening is not an S3Error, but a twisted.web.error.Error ...
<hazmat> SpamapS, it should be that line  i pointed out earlier
<SpamapS> hazmat: that line doesn't seem to be reached
<hazmat> hmm
<hazmat> i guess i should bite the bullet and do a nova install
<SpamapS> just get on canonistack
<SpamapS> its company wide
<SpamapS> and dead dumb ass simple
<hazmat> requires something i don't have... though i should get one
<hazmat> actually swift is much easier to setup
<SpamapS> err.. what? a SSO login?
<SpamapS> thats all you need
<hazmat> SpamapS, i need to setup a shell account i thought?
<SpamapS> nope
<SpamapS> maybe if you don't want to use public IPs to ssh into the instances
<SpamapS> hazmat: good luck.. time to shut down electronics
<hazmat> SpamapS, thanks.. i'll see if i can get this running
<hazmat> getting set up with canonistack was really easy
<hazmat> hmm.. so after fixing up the error trapping.. it still looks like a 404 on creating a bucket
<hazmat> got it
<hazmat> SpamapS, so getting past the error trapping code, it looks like the problem is a missing trailing slash
<hazmat> on bucket names in txaws
<hazmat> at least that's the only delta between boto and txaws when i try to do bucket ops with them
<hazmat> woot! 
<hazmat> hmm.. not quite
<hazmat> well at least past all the s3 errors
<hazmat> now onto the ec2 errors
<SpamapS> hazmat: :)
<SpamapS> hazmat: so txaws needs some testing against openstack. :)
<hazmat> SpamapS, definitely, the delta was tiny.. finding the delta ;-)
<hazmat> but clearly some additional testing.. 
<hazmat> its rather funny.. boto has such minimal testing.. but lots of prod use.. txaws lots of testing.. little prod use..
<hazmat> hmm.. that's a little questionable
<hazmat> i didn't have my access key/secret key setup correctly, but i could still create buckets..
<hazmat> i don't think there's any validation in nova-objectstore
<hazmat> woot! bootstrap success
<hazmat> ugh.. 12 roundtrips for bootstrap...
<hazmat> SpamapS, so i can't actually test this, since i need shell access for the ssh tunnel to zk
<hazmat> but it bootstraps at least
<hazmat> and shutsdown ;-)
 * hazmat heads to bed
<_mup_> ensemble/stack-crack r322 committed by kapil.thangavelu@canonical.com
<_mup_> openstack compatibility fixes for ec2 provider.
<SpamapS> hazmat: you shouldn't need shell access to talk to nova
<SpamapS> hazmat: and you can allocate/attach public ips to ssh in
 * SpamapS tries out branch
<SpamapS> hazmat: argh! where is your branch?
<kim0> huh .. ensemble upgrade causes syntax errors ? http://paste.ubuntu.com/669870/
<TeTeT> kim0: agreed, see the same
<kim0> Any idea if the CF formula has been made to work
<_mup_> Bug #829397 was filed: Link a service to a type of hardware <Ensemble:New> < https://launchpad.net/bugs/829397 >
<_mup_> Bug #829402 was filed: Deploy 2 services on the same hardware <Ensemble:New> < https://launchpad.net/bugs/829402 >
<_mup_> Bug #829412 was filed: Deploy a service on a service <Ensemble:New> < https://launchpad.net/bugs/829412 >
<_mup_> Bug #829414 was filed: Fail over services <Ensemble:New> < https://launchpad.net/bugs/829414 >
<_mup_> Bug #829420 was filed: Declare and consume external services <Ensemble:New> < https://launchpad.net/bugs/829420 >
<m_3> kim0: CF unknown still... looking at it now
<kim0> m_3: thanks :)
<kim0> I'm doing hpcc instead 
<kim0> horribly complex language
<m_3> kim0: ok, I'll go back to adding/testing nfs mounts into our standard formulas... fun fun :)
<kim0> yeah all fun :)
<botchagalupe> Newbie questions... Can formulas be written in Ruby?
<hazmat> botchagalupe, definitely
<botchagalupe> very cool...
<hazmat> botchagalupe, formulas can be written in any language, from c, shell, haskell, ruby, etc.
<botchagalupe> So far it looks pretty cool... Weird coming from a chef background though.  Just looked at it over the last hour.  Need to learn more...
<hazmat> botchagalupe, ensemble will call the hooks at the right time.. which are just executables to ensemble, and the hooks can interact with ensemble via some command line tools (relation-set, open-port, etc) that are provided.
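For instance, a hook doesn't have to be shell; a toy hook written in Python could shell out to the same tools (relation-set and open-port are the real tool names mentioned above; the hook name and values are made up):

    #!/usr/bin/env python
    # Hypothetical db-relation-joined hook written in Python.
    import subprocess

    def run(*cmd):
        subprocess.check_call(cmd)

    # Publish our connection details to the units on the other side of the
    # relation, and ask Ensemble to manage the firewall for our port.
    run("relation-set", "host=10.0.0.5", "port=5432")
    run("open-port", "5432/tcp")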
<botchagalupe> Are there any good examples of running it outside of EC2?  e.g., openstack...
<hazmat> botchagalupe, not at the moment, we're still working on openstack compatiblity ( i was just working on it late last night.), and cobbler/orchestra (physical machine) integration, but that likely won't be finished till the end of the month.
<hazmat> well.. sooner.. but as far as having blog posts and docs.
<botchagalupe> I look forward to it :) 
<hazmat> kim0, that upgrade error looks like some python 2.7isms that fwereade introduced while refactoring the provider storage... previously it was 2.6 compatible... its probably worth a bug report.
<kim0> hazmat: ok filing it
<_mup_> Bug #829531 was filed: broken python 2.6 compatbility <Ensemble:Confirmed> < https://launchpad.net/bugs/829531 >
<hazmat> SpamapS, open stack compatible branches..  lp:~hazmat/ensemble/stack-crack  and lp:~hazmat/txaws/fix-s3-port-and-bucket-op
<niemeyer> Hey all!
<niemeyer> botchagalupe: Hey!  Good to have you here..
<niemeyer> hazmat: Thanks for pushing that man
<highvoltage> hello niemeyer 
<niemeyer> hazmat: Have you checked if openstack returned anything at all in that put?
<niemeyer> hazmat: Was mostly curious if it was a bit off, or entirely off
<niemeyer> highvoltage: Hey!
<hazmat> niemeyer, well bootstrap works, i need to see if assigning the elastic ip address will change the address as reported by describe instances, if so then i should be able to actually use ensemble against the instance, else it will need an ssh account into this particular private openstack installation
<hazmat> niemeyer, it was just a few bits off.. the error capture in ensemble needed to be more generic, and the bucket operations needed a trailing slash
<niemeyer> hazmat: Hmm.. how's EIP involved there?
<hazmat> niemeyer, the actual diff was only like 10 lines
<hazmat> niemeyer, only against this private installation of openstack
<niemeyer> hazmat: I meant openstack itself.. was it returning anything at all, or just 404ing
<niemeyer> hazmat: Yeah, but I don't get how's it involved even then
<hazmat> niemeyer, so two different topics.. openstack was returning 404s without ec2 error information, which means the error transformation in txaws  wasn't working, and the error capture in ensemble wasn't working either, updating the error capture in ensemble to catch twisted.web.error.Error and checking status against 404 solved that.. there was an additional compatibility issue which required bucket operations to have a trailing slash
<niemeyer> hazmat: I got that yesterday.. the question is:
<niemeyer> hazmat: OpenStack is obviously not returning the same message than AWS.. what is it returning instead?
<hazmat> niemeyer, empty contents on a 404
<niemeyer> hazmat: Ok.. :(
<hazmat> niemeyer, on the EIP topic.. the problem is that this particular openstack installation is private, so we launch a bootstrap instance of ensemble, and then we can't actually use the ensemble commands against that, because we use an ssh tunnel to the public ip addrs of the node.. which isn't routable
<hazmat> niemeyer, the openstack implementation, both swift and nova-objectstore are very simple if we want to send patches upstream re this
<niemeyer> hazmat: Sure, I totally get that we can fix it ourselves.. ;-)
<hazmat> finite time sucks :-)
<botchagalupe> niemeyer: Good looking tool... Gonna give it some kicks this weekend to podcast about Monday...
<niemeyer> botchagalupe: Neat!
<niemeyer> botchagalupe: We have a lot happening right now.. if you want to include details about what else is going on, just let us know
<niemeyer> hazmat: Re. the EIP.. I see.. so our setup does not actually expose the machines unless an EIP is assigned
<hazmat> niemeyer, exactly
<niemeyer> hazmat: Can we proxy the client itself through SSH?
<hazmat> niemeyer, yes. that requires a shell account that i don't have.. i'm curious if openstack maintains compatibility to the point of readjusting describe instance output when an eip is assigned to an instance, that will obviate the need for shell credentials to this private openstack instance.
<niemeyer> hazmat: Are you sure?  Have you tried to route through people or chinstrap, for instance?
<hazmat> niemeyer, i haven't setup that shell account
<hazmat> niemeyer, just finding some new errors as well with the ec2 group stuff on subsequent bootstraps
<niemeyer> hazmat: Hmm, good stuff
 * hazmat grabs some caffeine.. bbiam
<botchagalupe> niemeyer Please send me what you have  john at dtosolutions com 
<niemeyer> botchagalupe: I don't have anything readily packed to mail you..
<niemeyer> botchagalupe: Right _now_ we're working on the formula store, physical deployments, and local development.. have just deployed EC2 firewall management for formulas and dynamic service configuration.
<jimbaker> fwereade, i took more of a look at the cobbler-zk-connect branch
<fwereade> jimbaker: heyhey
<jimbaker> 1. test_wait_for_initialize lacks an inlineCallbacks decorator, so it's not testing what it say it's testing :)
<niemeyer> Nice catch
<jimbaker> 2. the poke i mentioned is TestCase.poke_zk
<jimbaker> note that to use it, you need to follow the convention of setting self.client
<jimbaker> fwereade, in general, you don't want to be using sleeps in tests. they have a nasty habit of eventually failing
<fwereade> jimbaker: cool, tyvm for the pointers :)
<jimbaker> fwereade, in this particular case, poke_zk will definitely work, and make the test deterministic. which is what we want
<fwereade> jimbaker: I'll look up poke_zk
<fwereade> jimbaker: sweet
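A hedged illustration of the convention jimbaker describes, with stand-ins for Ensemble's own TestCase and poke_zk (only those names, the self.client convention, and the need for the @inlineCallbacks decorator come from the conversation; the rest is illustrative):

    from twisted.internet import defer
    from twisted.trial import unittest


    class PokeExampleTest(unittest.TestCase):

        def poke_zk(self):
            # In Ensemble's real TestCase this round-trips a request on
            # self.client so queued ZooKeeper watch callbacks get a chance
            # to run; stubbed out here.
            return defer.succeed(None)

        @defer.inlineCallbacks         # without this the generator body never runs
        def test_wait_for_initialize(self):
            self.client = object()     # the convention: set the ZK client here
            yield self.poke_zk()       # deterministic, unlike time.sleep()
            self.assertTrue(self.client is not None)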
 * niemeyer => lunch
<hazmat> hmm.. it looks like  openstack has a compatibility issue here regarding describe group for the environment group
<hazmat> niemeyer, i'm starting to realize compatibility for openstack might be a larger task, but its also unclear as there are lots of bugs fixed that are marked fix committed, but not released.
<adam_g> hazmat: i believe those bugs aren't marked as fixed released until the next milestone is released, but packages in ppa:nova-core/trunk are built on every commit
<_mup_> Bug #829609 was filed: EC2 compatibility describe security group returns erroneous value for group ip permissions <Ensemble:New> <OpenStack Compute (nova):New> < https://launchpad.net/bugs/829609 >
<hazmat> adam_g, understood, its just not clear what version canonistack is running
<kim0> woohoo just pushed a lexisnexis vid â http://cloud.ubuntu.com/2011/08/crunching-bigdata-with-hpcc-and-ensemble/
 * kim0 prepares to start the weekend 
<hazmat> smoser, just to verify.. cloud-init is running on the canonistack images?
 * hazmat lunches
<niemeyer> hazmat: What was the other issue you found? (sry, just back from lunch now)
<hazmat> niemeyer, see the bug report
<hazmat> niemeyer, i committed a work around to the txzookeeper branch
<hazmat> er. txaws that is
<hazmat> now i have a new error to decipher.. it doesn't appear my keys made it onto the machine from cloud-init for ssh access
<hazmat> http://pastebin.ubuntu.com/670216/
<hazmat> looking at the console-output it looks like cloud-init runs though .. http://pastebin.ubuntu.com/670217/
<niemeyer> hazmat: Hmm.. the metadata service must be hosed
<hazmat> niemeyer, true, that likely is hosed.. i think that's only recently been done, and perhaps differently, that was an issue i had early looking at rackspace support
<hazmat> there wasn't any way to identify the machine api identifier from within the machine
<fwereade> niemeyer: quick confirm
<niemeyer> fwereade: Sure
<fwereade> niemeyer: if a foolish developer had ended up with a 3000-line diff, would it be appreciated if he reconstructed the end result as a pipeline, even if the individual steps themselves each ended up seeming a bit forced/redundant?
<fwereade> niemeyer: a lot of the problem is decent-sized chunks of code moving from one file to another, but sadly the structure isn't really apparent from the diff
<fwereade> niemeyer: however, I think it could be clearer if it were broken up into steps like
<fwereade> 1) add new way to do X
<fwereade> 2) remove old way to do X, use new way
<fwereade> etc...
<_mup_> Bug #829642 was filed: expose relation lifecycle state to 'ensemble status' <Ensemble:New> < https://launchpad.net/bugs/829642 >
<niemeyer> fwereade: Yeah, that sounds a lot more reasonable, and less painful for both sides
<fwereade> niemeyer: cool, they'll land sometime on monday then
<niemeyer> fwereade: Sounds good.. even though there's some work involved, I'm willing to bet that the overall time for the changes to land will be reduced
<fwereade> niemeyer: I swear it was a 1kline diff, and then I went and made everything neat and consistent :/
<fwereade> niemeyer: anyway, thanks :)
<fwereade> happy weekends everyone :)
<niemeyer> fwereade: I can believe that
<niemeyer> fwereade: Have a great one!
<fwereade> niemeyer: and you :)
<niemeyer> fwereade: Thanks
<hazmat> fwereade, cheers
<hazmat> hmm.. afaics it's working; the console output diff between ec2 and openstack looks sensible, except cloud-init waiting on the metadata service
<hazmat> machine and provisioning agents running normally
<hazmat> woot it works
<hazmat> doh
<niemeyer> hazmat: Woah!
 * niemeyer dances around the chair
<hazmat> niemeyer, well i got status output and agents are running
<hazmat> niemeyer, new error on deploy, but getting closer i think
 * niemeyer sits down
 * hazmat grabs some more caffeine
<RoAkSoAx> fwereade: \o/
<RoAkSoAx> fwereade: any updates on the merges in trunk?
<_mup_> ensemble/fix-pyflakes r322 committed by jim.baker@canonical.com
<_mup_> Remove dict comprehension, pyflakes doesn't understand it yet
<_mup_> ensemble/fix-pyflakes r323 committed by jim.baker@canonical.com
<_mup_> Remove remaining dict comprehension
<hazmat> jimbaker, later version of pyflakes seems to understand it for me.. if your pyflakes is using a 2.6 python.. then it could be an issue
<hazmat> jimbaker, definitely valid to do.. but the fix is really python 2.6 compatibility
<hazmat> jimbaker, nevermind.. i hadn't realized but the latest pyflakes package seems to be broken
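One example of the kind of 2.7ism being discussed, and the specific construct those fix-pyflakes commits removed: a dict comprehension is a SyntaxError on Python 2.6 (the values below are just for illustration):

    pairs = [("web", 80), ("db", 5432)]

    # Python 2.7+ only; a SyntaxError under Python 2.6:
    ports = {name: port for name, port in pairs}

    # Equivalent spelling that also works on 2.6:
    ports = dict((name, port) for name, port in pairs)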
<niemeyer> jimbaker: ping
<niemeyer> bcsaller: ping
<bcsaller> niemeyer: whats up?
<niemeyer> bcsaller: You, looking for a vict^Wcandidate for a review
<niemeyer> bcsaller: s/You/Yo/
<bcsaller> sure
<niemeyer> bcsaller: https://code.launchpad.net/~hazmat/ensemble/formula-state-with-url/+merge/71291
<bcsaller> on it
<niemeyer> bcsaller: Cheers!
<niemeyer> bcsaller: https://code.launchpad.net/~hazmat/ensemble/machine-agent-uses-formula-url/+merge/71923
<niemeyer> bcsaller: Oh, sorry, nm
<niemeyer> bcsaller: William has already looked at that latter one
<hazmat> niemeyer, so with dynamic port opening, one question i had is how do we go about solving placement onto multiple machines when reusing machines
<hazmat> we need static analysis to determine port conflicts for placement afaics
<hazmat> s/analysis/metadata
<hazmat> something along the lines of describing a security group, port-ranges, protocols, etc
<hazmat> directly in a formula
<heckj> I have a sort of random ensemble question: when you create a relation, is that a bidirectional concept, or unidirectional? i.e. do both pieces know about each other when you make the relationship for purposes of setting up configs, etc?
<hazmat> heckj, bi-directional
<niemeyer> hazmat: ROTFL
<hazmat> heckj, each side is informed when units of the other side join, depart or change their relation settings
<niemeyer> hazmat: Didn't we cover that issue at least 3 times? :-)
<heckj> hazmat: thanks!
<hazmat> niemeyer, yeah.. we probably did, but i'm looking at doing a more flexible placement alg to respect max, min machines.. and i don't recall what solution we came up with
<hazmat> actually i know we did several times
<niemeyer> hazmat: I don't understand how that changes the outcome we reached in Austin
<hazmat> niemeyer, i don't recall we discussed this in austin, we discussed network setups in austin
<hazmat> for lxc bridging
<niemeyer> hazmat: We certainly discussed port conflicts and how we'd deal with them in the short term and in the long term
<hazmat> niemeyer, in the short term we  said we wouldn't, and the long term?
<niemeyer> hazmat: We have all the data we need to do anything we please..
<hazmat> niemeyer, okay.. so i'm deploying a new formula, i can inspect which ports are open/used on a machine, but i can't tell which ones the new formula needs.. so i lack knowledge of what its going to be using in advance of deploying it.
<hazmat> if i knew in advance i could select a machine with non-conflicting port usage
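Purely to illustrate the placement check hazmat is arguing for (none of these names exist in Ensemble, and the declared-ports metadata is exactly the piece under debate):

    def pick_machine(machines, wanted_ports):
        """Return an existing machine whose ports don't clash, else None.

        machines: dict of machine id -> set of ports already in use.
        wanted_ports: ports the new service unit declares (statically)
        that it will need.
        """
        wanted = set(wanted_ports)
        for machine_id, used in sorted(machines.items()):
            if not used & wanted:
                return machine_id
        return None   # no reuse possible; provision a new machine instead

    # e.g. a unit declaring port 80 can't share with machine 1, but can with 2:
    #   pick_machine({1: set([80, 443]), 2: set([5432])}, [80])  ->  2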
<niemeyer> hazmat: open-port communicates to Ensemble what port that is.. we don't need to tell in advance
<niemeyer> hazmat: Ensemble will happily take the open port and move on with it
<niemeyer> Woohay
<hazmat> niemeyer, and in the case of a port usage conflict between two formulas?
<hazmat> s/formulas/service units
<niemeyer> hazmat: Yes, as we debated, the same port can't be used twice on the same address.. it's a limit of IP
<niemeyer> hazmat: If people try to force two services on the same machine to be *exposed*, it will fail
<niemeyer> hazmat: If they have the same port..
<niemeyer> hazmat: If they're not exposed, that's fine..
<hazmat> niemeyer, yes.. but if i knew in advance i could avoid conflicts when doing machine placement; with dynamic ports we just allow conflicts in the short term.. but what's the long term solution here?
<hazmat> niemeyer, doesn't matter if they're exposed or not
<niemeyer> hazmat: Of course it matters
<bcsaller1> hazmat: you mean they bind it even if its not addressable outside the firewall, right?
<hazmat> niemeyer, if i have two unexposed services trying to use port 80.. its a conflict regardless of the expose
<niemeyer> hazmat: It's not.. each service has its own network space
<hazmat> bcsaller1 exactly.. i have my web apps behind a load balancer for example
<hazmat> niemeyer, ah assuming lxc  and bridges
<niemeyer> hazmat: Yes, assuming the feature we've been talking about :-)
<hazmat> ah.. right so this is where we get to ipv6, ic
<hazmat> each service gets its own ipv6 address, we route ipv4/ipv6 internally, and expose still can't deal with port conflicts, which we can't detect/avoid
<bcsaller> hazmat: prove it ;)
<RoAkSoAx> nijaba: ping?
<niemeyer> hazmat: Yes..
<hazmat> bcsaller, its runtime dynamic, placement is prior to instantiation.. what's to prove
<bcsaller> I just haven't seen the ipv6-> ipv4 routing work this way yet
<niemeyer> hazmat: In practice, that's a lot of ifs..
<bcsaller> not saying it can't, just haven't seen how it plays out yet
<hazmat> bcsaller, yeah.. there's a pile of magic dust somewhere
<bcsaller> and I think of IBM all of a sudden
<niemeyer> bcsaller: Why?
<hazmat> i think of nasa.. millions for a space pen that works.. russians use a pencil 
<bcsaller> niemeyer: they did commercials with self healing servers and magic pixie dust you sprinkle around the machine room
<niemeyer> Nice :)
<niemeyer> hazmat: Exactly.. let's design a pencil
<hazmat> niemeyer, a pencil is static metadata imo
<bcsaller> niemeyer: http://www.youtube.com/watch?v=3nbEeU2dRBg
<niemeyer> hazmat: A pencil to me is something that is already working fine today
<niemeyer> hazmat: Rather than going after a different fancy pen
<hazmat> niemeyer, we can rip out significant parts of the code base and simplify them. it's development either way.. the point is a pencil is simple
<niemeyer> hazmat: You're trying to design the pen that works without gravity..
<niemeyer> hazmat: Very easy to write once you have it
<niemeyer> hazmat: The pencil is ready
<hazmat> niemeyer, so i think we've taken the analogies as far as they go.. the question is what's the problem with static metadata? besides the fact we've already implemented something with known problems
<niemeyer> hazmat: I thought the analogy was clear.. static metadata doesn't exist
<niemeyer> hazmat: How do you allow a service to offer another port to a different service?
<niemeyer> hazmat: How many ports do we put in the static metadata?
<niemeyer> hazmat: What if another port is to be opened?
<hazmat> niemeyer, the formula declares what it enables via metadata.. allowing for port ranges etc, perhaps associated to a name
<niemeyer> hazmat: Yeah.. what if the range is too small for the number of services someone wants to connect to?
<niemeyer> hazmat: What if the service could actually work dynamically?
<niemeyer> hazmat: And pick a port that is actually open in the current machine rather than forcing a given one?
<hazmat> niemeyer, the metadata is only for listen ports a formula offers
<niemeyer> hazmat: Since it doesn't really care
<niemeyer> hazmat: That's what I'm talking about too
<hazmat> it can reserve a range, if it wants.. like less than 1% of services are truly dynamic that way
<niemeyer> hazmat: All services are dynamic that way.. a single formula can manage multiple services for multiple clients
<hazmat> i'd rather design for the rule than the exception, if i get a pencil ;-)
<niemeyer> hazmat: Multiple processes
<niemeyer> hazmat: We have the pencil.. services are dynamic by nature.. open-port is dynamic by nature
<niemeyer> hazmat: it works, today..
<hazmat> niemeyer, right.. i can have a formula managing wsgi-app servers, but i can also pick a range of 100, and reserve that block for the processes i'll create
<niemeyer> hazmat: Until botchagalupe1 wants to use it for 101 services in his data center
<niemeyer> hazmat: Then, even the static allocation doesn't solve the problem you mentioned..
<niemeyer> hazmat: Which is interesting
<niemeyer> hazmat: Scenario:
<hazmat> niemeyer, so you're saying a service has a port per relation
<niemeyer> hazmat: 1) User deploys frontend nginx and backend app server in the same machine
<niemeyer> hazmat: 2) Both use port 80
<niemeyer> hazmat: 3) nginx is the only one exposed..
<niemeyer> That's a perfectly valid scenario
<niemeyer> hazmat: 4) User decides to expose the app server for part of the traffic
<niemeyer> hazmat: Boom..
<niemeyer> hazmat: Static allocation didn't help
<hazmat> in the static metadata case, we prevent the units from co-existing on the same machine
<niemeyer> hazmat: Why?
<hazmat> when placing them.. to avoid conflicts
<niemeyer> hazmat: The scenario above works..
<niemeyer> hazmat: 1-3 is perfectly fine
<hazmat> say i end up with varnish or haproxy on the same instance for a different service and i want to expose it.. 
<hazmat> same problem
<niemeyer> hazmat: Yep.. that's my point.. it's an inherent problem.. it exists with open-port or with dynamic allocation
<hazmat> in the static scenario we prevent by not placing it on a machine with conflicting port metadata
<niemeyer> hazmat: We need to solve it in a different way
<niemeyer> hazmat: Again, 1-3 is perfectly fine
<hazmat> 1) is not the case, they won't be deployed on the same machine with static metadata
<niemeyer> hazmat: There's no reason to prevent people from doing it
<hazmat> hmm
<hazmat> it is rather limiting to get true density
<hazmat> with static metadata
<niemeyer> hazmat: My suggestion is that we address this problem within the realm of placement semantics
<niemeyer> hazmat: In more realistic stacks (!) admins will be fine-tuning aggregation 
<hazmat> niemeyer, that's the problem/pov that i'm looking at this from.. placement has no data about the thing it's about to deploy.. just about the current ports of each machine.
<hazmat> niemeyer, you mean moving units?
<niemeyer> hazmat: No, I mean more fine-tuned aggregation
<hazmat> or just doing manual machine selection placement
<niemeyer> hazmat: Not manual machine selection per se
<niemeyer> hazmat: Machines have no names.. don't develop love for them.. ;)
<hazmat> niemeyer, absolutely.. they're so unreliable ;-)
<niemeyer> LOL
<_mup_> Bug #829734 was filed: PyFlakes cannot check Ensemble source <Ensemble:New> < https://launchpad.net/bugs/829734 >
<_mup_> ensemble/fix-pyflakes r322 committed by jim.baker@canonical.com
<_mup_> Remove dict comprehension usage to support PyFlakes
<jimbaker`> bcsaller, hazmat - i have a trivial in lp:~jimbaker/ensemble/fix-pyflakes that allows pyflakes to work again for the entire source tree
<hazmat> jimbaker`, awesome, there's a bug for py 2.6 compatibility that it can link to as well
<hazmat> afaics
<jimbaker`> hazmat, yeah, that's probably the source of the 2.6 bug
<hazmat> dict comprehensions were the only 2.7 feature we were using
<jimbaker`> hazmat, they're nice, but just not yet unfortunately
<jimbaker`> i'll mention this to fwereade so we can avoid it for the time being
<_mup_> ensemble/stack-crack r323 committed by kapil.thangavelu@canonical.com
<_mup_> allow config of an ec2 keypair used for launching machines
<jimbaker`> hazmat, so if that trivial looks good, i will commit and mark those bugs as fix released
<jimbaker`> (to trunk)
<jcastro_> negronjl: about how long does it take the mongo formula to deploy?
<jcastro_> like, if I ssh in and I type mongo and it doesn't find it, then I've obviously ssh'ed in too early? :)
<jcastro_> also, rs.status() returns "{ "errmsg" : "not running with --replSet", "ok" : 0 }"
<hazmat> jcastro_, if ensemble status says started it should be running
<jcastro_> aha, it takes about a minute
<negronjl> jcastro:  you got it .... about a minute or so
<jcastro_> negronjl: ok, the second db.ubuntu.find() shows the same results as the first one, how do I know that's on other nodes?
<jcastro_> or do you just know because that's what rs.status() already showed?
<negronjl> jcastro:  you don't really know ( without a bunch of digging ) what's on which node 
<jcastro_> right, I see, that's the point. :)
<hazmat> jimbaker`, also this has a fix for the cli help.. ignoring the plugin implementation http://pastebin.ubuntu.com/670338/
<hazmat> jimbaker`, sans it, the default help is bloated out by config-set
<hazmat> on ./bin/ensemble -h
<jimbaker`> hazmat, you mean lines 46-50 of the paste?
<jimbaker`> sure, we should pull that in
<jimbaker`> can also use the docstring cleanup too
<hazmat> jimbaker`, well pretty much all the changes to commands in that diff are docstring cleanup
<hazmat> the stuff in __init__ and tests can be ignored
<hazmat> jimbaker`, fix-pyflakes looks good +1
<jimbaker`> hazmat, thanks!
<hazmat> hmm.. looks like the nova objectstore namespace is flat
<hazmat> odd, the code looks like it should work, it's storing against the hexdigest of the name
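For reference, storing objects under the hex digest of their name is roughly the following; sha1 is used here only as a stand-in, since the log doesn't say which hash nova's objectstore actually applies.

    # Stand-in illustration of hexdigest-derived storage keys (hash unconfirmed).
    import hashlib

    def storage_key(name):
        return hashlib.sha1(name.encode("utf-8")).hexdigest()

    print(storage_key("control-bucket/provider-state"))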
<_mup_> ensemble/trunk r322 committed by jim.baker@canonical.com
<_mup_> merge fix-pyflakes [r=hazmat][f=829531,829734]
<_mup_> [trivial] Remove use of dict comprehensions to preserve Python 2.6
<_mup_> compatibility and enable PyFlakes to work with Ensemble source.
<niemeyer> hazmat: Any chance of a second review here: https://code.launchpad.net/~fwereade/ensemble/cobbler-shutdown/+merge/71391
<niemeyer> With that one handled, we'll have a clean Friday! :-)
<niemeyer> sidnei: People are begging for your Ensemble talk at Python Brasil :)
<_mup_> Bug #828885 was filed: 'relation-departed' hook not firing when relation is set to 'error' state <Ensemble:New> < https://launchpad.net/bugs/828885 >
<hazmat> niemeyer, sure
<hazmat> ugh.. its big
<niemeyer> hazmat: Why was this reopened ^?
<hazmat> niemeyer, just had a talk with mark about it .. it's not really about relation-broken being invoked, it's more that if a service unit is in an error state, should the other side know about it
<hazmat> take a broken service out of a rotation
<hazmat> i guess we try not to associate relation state to overall service status
<niemeyer> hazmat: That's what I understood from the original description
<niemeyer> hazmat: As I've mentioned in the bug, I don't think killing a service like this is the right thing to do
<niemeyer> hazmat: A _hook_ has failed, not a connection
<niemeyer> hazmat: In other words, we take a slightly bad situation, and make the worst out of it by actually killing the service
<hazmat> niemeyer, yeah.. fair enough, i forget if resolved handles that
<hazmat> niemeyer, its not about killing the service though
<hazmat> niemeyer, its about informing the other end of the relation that something is wrong
<hazmat> other relations of the service continue to operate normally
<niemeyer> hazmat: It definitely is.. that's what relation-departed does
<niemeyer> hazmat: The relation wasn't departed
<niemeyer> hazmat: There's an erroneous situation due to a human bug
<hazmat> niemeyer, relation-depart is just saying a unit has been removed..
<hazmat> it can re-appear later with a join
<niemeyer> hazmat: Exactly, and it has not
<niemeyer> hazmat: Imagine the situation.. blog up.. small bug in relation-changed
<niemeyer> hazmat: "Oh, hey! There's a typo in your script! BOOM! Kill database connection."
<hazmat> niemeyer, but conversely do we allow for the other scenario to be true.. a web app server and proxy, the web app server is dead, its rel hook errors, and the proxy continues to serve traffic to it
<niemeyer> hazmat: Yes, that sounds like the most likely way to have things working
<hazmat> m_3, anything to add?
<niemeyer> hazmat: We can't assume it's fine to take services down at will without user consent
<niemeyer> hazmat: The user desire was to have that relation up..
<m_3> the web app <-> proxy relationship you described is a good example
<niemeyer> hazmat: There was an error because of improper handling of state; that can't be taken to imply "impossible to serve"
<m_3> the one I was seeing was at spinup
<hazmat> niemeyer, indeed, i remember now why it was done this way
<niemeyer> m_3, hazmat: Note that this is different from a machine going off
<niemeyer> m_3, hazmat: Or network connectivity being disrupted, etc
<m_3> spin up 20 units of a related service
<m_3> a third of them failed, but the primary service still had configured state for the failed units
<m_3> that cleanup is what I'm targeting
<niemeyer> m_3: Define "failed"
<m_3> test case was a relation-changed hook that just "exit 1"
<m_3> the one where a third were failing was NFS clients trying to mount
<niemeyer> m_3: We can try to be smart about this in the future, and take down relations if there is more than one unit in it, for instance
<niemeyer> m_3: That situation is not a good default, though
<niemeyer> m_3: Note how your exit 1 does not imply in any way that the software running the service was broken
<m_3> understand... we can choose to not implement... just wanted to surface the issue
<m_3> so bringing clients up slowly works fine
<niemeyer> m_3: It implies relation-changed was unable to run correctly for whatever reason
<m_3> rewriting clients to retry a couple of times works
<niemeyer> m_3: Right, but do you understand where I'm coming from?
<m_3> yes, totally
<m_3> turning a machine off in a physical infrastructure is a good example
<m_3> haproxy and varnish are written to be tolerant against this eventuality
<m_3> would be nice if we could provide this though
<niemeyer> m_3: Hmm.. it sounds like we're still talking about different things
<niemeyer> m_3: Ensemble _will_ handle disconnections, and _will_ take the relation down
<m_3> sorry if I'm not explaining this well
<niemeyer> m_3: you're explaining it well, but I feel like we're making disjoint points
<m_3> it leaves the relation in an "error" state for the units where relation-changed hook exited poorly
<m_3> that's not taking the relation down
<m_3> there's no way for the "server" to know that anything wrong has happened
<niemeyer> m_3: This is not a disconnection.. an error in a relation-changed script doesn't imply in any way that the service is down
<m_3> it could do a relation-list and check on things... if something got fired
<m_3> hmmm... yes, I've been focusing on relation-changed during startup
<niemeyer> m_3: But if you turn the network down on the service, or if say, the kernel wedges.. Ensemble will take the relation down.
<m_3> for services that often don't start until relation-changed (not in start)
<niemeyer> m_3: Even in those cases, we can't tell whether the service is necessarily down or not
<_mup_> ensemble/stack-crack r324 committed by kapil.thangavelu@canonical.com
<_mup_> don't use namespaced storage keys, use a flat namespace
<niemeyer> m_3: Since we don't know what happened
<niemeyer> m_3: In a situation where that was a critical service, the most likely scenario to have it working is to allow the relation to stay up while the admin sorts it out
 * m_3 wheels turning
<m_3> how does ensemble respond to a kernel wedge (your example above)
<niemeyer> m_3: That situation puts the machine agent and the unit agent unresponsive, which will eventually cause a timeout that will force all of its relations down
<hazmat> m_3, it will get disconnected from zookeeper and then the opposite end of the relation will see a 'relation-depart' hook exec
<m_3> right... so "framework" or "infrastructure"-wise... that change is registere
<m_3> d
<hazmat> m_3, more than framework.. the opposite relation endpoints see the disconnection
<m_3> but it tries to stay ignorant of service semantics
<niemeyer> m_3: For now..
<m_3> right, I can clean up when that happens
<niemeyer> m_3: We want to go there, eventually
<m_3> ok, this really goes to all of the feature requests about relation-status
<m_3> thanks for the discussion guys!
<niemeyer> m_3: np!
<niemeyer> m_3: I think there's more we need to talk about in this area
<_mup_> ensemble/stack-crack r325 committed by kapil.thangavelu@canonical.com
<_mup_> allow txaws branch usage from an ensemble env
<niemeyer> m_3, hazmat: I'm personally concerned about even that scenario, for instance, when the unit agent goes off
<m_3> niemeyer: I'll write up my use cases that need relation state info
<hazmat> niemeyer, how so?
<niemeyer> hazmat: We need to find a way to restart the unit agent without killing relations
<hazmat> niemeyer, we can do that now, we just need to reconnect to the same session
<niemeyer> hazmat: In the next incarnation, the logic that puts the ephemeral nodes in place must take into account they might already be there
<niemeyer> hazmat: Kind of
<niemeyer> hazmat: We don't expect to find previous state, I believe
<hazmat> niemeyer, let's be clear it's not killing a relation, it's a transient depart and join for the same unit
<hazmat> niemeyer, we do find the same relation state
<hazmat> the unit's relation state is the same across a depart/join... even if the client is disconnected, the relation settings are persistent
<hazmat> there's a separate ephemeral node for active presence
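A minimal sketch of the reattach idea discussed here: on agent restart, only create the ephemeral presence node if the previous session didn't leave it behind. kazoo is used purely for brevity (Ensemble's agents use a Twisted ZooKeeper client), and the node path is invented.

    # Illustration: re-establish a unit's presence node without blindly
    # recreating it, which would fail if the old ephemeral node still exists.
    from kazoo.client import KazooClient
    from kazoo.exceptions import NodeExistsError

    PRESENCE_PATH = "/presence/unit-my-service-0"  # hypothetical path

    def ensure_presence(client):
        if client.exists(PRESENCE_PATH):
            return  # the previous session's node is still there; reuse it
        try:
            client.create(PRESENCE_PATH, b"", ephemeral=True, makepath=True)
        except NodeExistsError:
            pass  # raced with a still-live session; also fine

    if __name__ == "__main__":
        zk = KazooClient(hosts="127.0.0.1:2181")
        zk.start()
        ensure_presence(zk)
        zk.stop()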
<m_3> 1.) formula tests need to know when hooks execute, 2.) relations that depend on another relation's state, and 3.) various kinds of relation failures
<niemeyer> hazmat: That's how a relation is killed!
<niemeyer> hazmat: Formulas take state down on depart
<hazmat> niemeyer, that's how a service unit's participation in a relation is killed and resurrected
<niemeyer> hazmat: Yes.. and correct formulas will clean state/block firewall/etc on depart!
<hazmat> the relation itself is a semantic notion between services, its only killed when the user removes the relation
<hazmat> niemeyer, and they will open it back up when it comes back
<niemeyer> hazmat: The way that the formula knows a relation has been removed is through the relation-joined/departed! :-)
<niemeyer> hazmat: A bit shocked to be stating this :)
<hazmat> :-)
<hazmat> niemeyer, to a formula a relation has been removed upon execution of relation-broken
<hazmat> and created upon first execution of any join
<niemeyer> hazmat: No, relation-broken means it has been taken down by itself
<niemeyer> hazmat: relation-departed means "The remote end left.. clean up after yourself."
<hazmat> right, but if i have 5 other units in a relation, and one goes away, i don't say the relation is removed
<niemeyer> hazmat: The relation between the two units has been _dropped_...
<m_3> I'm confused about the difference between a relation taken down and a related unit taken down
<niemeyer> hazmat: State may be removed.. etc
<hazmat> niemeyer, the state is service level typically, unit level state about remote ends is access, and that can be granted/restored
<niemeyer> m_3: A relation is established between services.. that's the ideal model the admin has stated he wanted
<hazmat> in general though it should be possible that a unit transiently departs a relation and comes back to find things working with the same access and state
<niemeyer> m_3: Service units join and depart the relation based on realistic behavior
<m_3> right, but all of my examples above retain the relation and just drop units
<niemeyer> hazmat: Agreed on the first point, disagreed strongly on the second one.
<hazmat> niemeyer, for example consider a network split.. it's a transient disconnect and reconnect.. the relation isn't dead, that's between the services, the disconnected unit's participation in the relation is temporarily removed
<niemeyer> hazmat: """
<niemeyer> <relation name>-relation-departed - Runs upon each time a remote service unit leaves a relation. This could happen because the service unit has been removed, its service has been destroyed, or the relation between this service and the remote service has been removed.
<niemeyer> An example usage is that HAProxy needs to be aware of web servers when they are no longer available. It can remove each web server from its configuration as the corresponding service unit departs the relation.
<niemeyer> """
<niemeyer> hazmat: This is our documentation.
<niemeyer> hazmat: It's been designed that way.. relation-departed runs, connection should be _down_..
<hazmat> hmm.. that's unfortunate, if a service has been destroyed that should be under relation-broken
<niemeyer> hazmat: Nope
<niemeyer> hazmat: """
<niemeyer> <relation name>-relation-broken - Runs when a relation which had at least one other relation hook run for it (successfully or not) is now unavailable. The service unit can then clean up any established state.
<niemeyer> An example might be cleaning up the configuration changes which were performed when HAProxy was asked to load-balance for another service unit.
<niemeyer> """
<niemeyer> hazmat: That's how it's been designed
<niemeyer> Which is why I bring my original point back: we need to ensure that restarts keep the relation up
<hazmat> well i have some doubts that it's implemented that way ... broken is always the final step of cleanup when destroying a relation
<niemeyer> hazmat: If it's not that way, it's a serious bug we should fix.. I certainly reviewed it against that assumption
<niemeyer> hazmat: We wrote that document jointly as well
<hazmat> niemeyer, i think that doc needs changing... depart is called when a unit is removed
<hazmat> niemeyer, i think  some editing and updating got done on it post implementation
<niemeyer> hazmat: "This could happen because the service unit has been removed"
<niemeyer> hazmat: ?
<hazmat> it can happen for any number of reasons
<niemeyer> hazmat: Yes, they seem listed there.. what's wrong specifically?
<hazmat> network split, explicit removal of unit, etc.. the only significance is that the remote end isn't there
<hazmat> one of them that is
<hazmat> relation level cleanup.. removing a database, etc. should happen in relation-broken
<hazmat> only unit level cleanup should happen in depart
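To make that split concrete, a hedged sketch of how a database-side formula might divide the two hooks: per-unit cleanup in -departed, relation-wide cleanup only in -broken. The helper names, SQL, and unit/relation identifiers are all invented; only the hook names come from the discussion.

    # Hypothetical database-side hooks illustrating the split described above.
    def revoke_access(remote_unit):
        print("REVOKE ... FROM %s" % remote_unit)   # stand-in for real SQL

    def drop_relation_database(relation_id):
        print("DROP DATABASE db_%s" % relation_id)  # stand-in for real SQL

    def relation_departed(remote_unit):
        # Unit-level: one remote unit left (possibly transiently);
        # withdraw just that unit's access and nothing else.
        revoke_access(remote_unit)

    def relation_broken(relation_id):
        # Relation-level: the user removed the relation itself, so the
        # state created for it as a whole can finally be cleaned up.
        drop_relation_database(relation_id)

    if __name__ == "__main__":
        relation_departed("wordpress/3")
        relation_broken("rel-42")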
<niemeyer> hazmat: We'll have to talk on monday about this..
<m_3> is there any difference between the events fired for timeouts -vs- those fired for remove-relation calls?
<niemeyer> hazmat: That's not how it's been designed, and is certainly not what we talked about when we planned it
<hazmat> niemeyer, if i do a remove-unit, the remote end will get a depart
<hazmat> that doesn't mean blow up the database
<niemeyer> hazmat: It means remove the access from the other end
<niemeyer> hazmat: Nothing should mean "blow up the database", ever
<hazmat> niemeyer, right but not the five other units that are still in the relation
<niemeyer> hazmat: Yes.. remove the access from the unit that has departed
<hazmat> but if i see broken, the relation is finished.. it won't ever come back
<niemeyer> hazmat: Not at all
<hazmat> and i can do service level relation cleanup
<hazmat> niemeyer, it will be a new relation if it does
<niemeyer> hazmat: If network connectivity terminates, it should get relation-broken
<hazmat> niemeyer, who gets it and why?
<niemeyer> hazmat: Again, the docs explain
<hazmat> niemeyer, if they see a network split from a single related unit, they get a depart
 * hazmat goes to read
<hazmat> niemeyer, don't see it
<hazmat> a relation is never broken till the user severs it
<niemeyer> hazmat: Who gets it:
<niemeyer> """
<niemeyer> Runs when a relation which had at least one other relation hook run for it (successfully or not) is now unavailable. The service unit can then clean up any established state.
<niemeyer> """
<niemeyer> and why too, in fact..
<hazmat> like i said the docs need cleanup.. we can discuss design considerations on monday if need be.. but afaics the semantics are correct
<hazmat> relation-broken is effectively a relation-destroyed hook
<hazmat> m_3, no there isn't
<niemeyer> hazmat: Regardless, the original point remains..
<niemeyer> hazmat: relation-joined should be sustained across restarts
<hazmat> niemeyer, you mean it shouldn't be executed across an agent restart?
<niemeyer> hazmat: Right.. the relation should remain up
<hazmat> niemeyer, like i said originally if we can reattach the session that's trivial as is
<niemeyer> hazmat: I didn't say otherwise.. I pointed out the behavior of relation-joined, pointed out it doesn't work, and pointed out what we should watch out for next
<niemeyer> hazmat: You seem to agree now, so that's a good base to move on
<hazmat> niemeyer, indeed we do need to check for the ephemeral nodes before blindly recreating them
<hazmat> which would fail currently
<niemeyer> hazmat: Phew.. woohay agreement
<hazmat> niemeyer, i never disagreed with that, the conversation went sideways to something different
<niemeyer> Exactly
<niemeyer> hazmat: You disagreed with the behavior of joined, but it doesn't really matter now.
<niemeyer> hazmat: re. broken.. reading the code.. it sounds like the behavior you described is actually more useful indeed
<hazmat> niemeyer, agreed
<niemeyer> Double agreement! Score! :-)
<hazmat> :-) the docs need updating
<hazmat> just in time for the weekend, i should head out on that note ;-)
<m_3> later man... thanks for the help
<hazmat> more openstack to do.. needed to adjust to deploy a txaws branch for ensemble
<hazmat> m_3, cheers
 * hazmat grabs some caffeine
<niemeyer> hazmat: Not entirely surprised about that debate on broken
<niemeyer> hazmat: Looking through my mail, we've had very little debate on it
<hazmat> niemeyer, i think we discussed it in brazil sprint and voice meetings
<niemeyer> hazmat: Hmm
<niemeyer> hazmat: I'm still not sure about it
<hazmat> niemeyer, looks like we had  a long discussion on list oct 2010 re
<niemeyer> hazmat: relation-broken seems to be called on stop()
<hazmat> hmm
<niemeyer> hazmat: Which would put its behavior closer to the documented
<hazmat> niemeyer, where do you see that?
<hazmat> i'm looking at unit/lifecycle
<niemeyer> Me too
<niemeyer>                 yield workflow.transition_state("down")
<hazmat> on stop we do a rel down transition
<hazmat> niemeyer, right that doesn't execute broken
<hazmat> niemeyer, it actually doesn't execute anything on a relation
<niemeyer> hazmat: Ah, there's down_departed
<hazmat> ah. you're looking at the workflow
<hazmat> niemeyer, those are for when the relation is broken while the relation was down
<hazmat> we still execute the relation-broken hook to give a final chance of cleanup 
<m_3> sorry... relation broken while down?
<hazmat> m_3, if the relation is in a down/error state, we still execute the relation-broken hook on a unit if the relation between the services is removed
<m_3> ah, gotcha
<niemeyer> hazmat: There's some name clashing in the code.. we call depart when we mean break in a few cases
<hazmat> niemeyer, depart is always break
<niemeyer> hazmat: Except when it's not.. :-)
<niemeyer> hazmat: relation-departed
<hazmat> niemeyer, ah.. right.. yeah. there's a name indirection there
<hazmat> niemeyer, yeah.. ic what you mean
<niemeyer> hazmat: It's all good, though.. you are right, we need to fix docs for broken
<niemeyer> hazmat: I wonder if we can simplify the logic around that workflow significantly in the future, with a more direct state machine
<niemeyer> hazmat: self._current_state.. self.relation_joined().. self.relation_changed().. etc
<hazmat> niemeyer, you mean fold the lifecycle and workflows together?
<niemeyer> hazmat: Yeah
<hazmat> yeah.. possibly. it was useful for some contexts like resolved where having the separate decision points was very useful
<hazmat> to distinguish things like retrying hooks vs. not, but that could be encapsulated differently
<niemeyer> hazmat: Still.. we could probably come up with a way to encode the changes into functions themselves
<hazmat> or when we decided to execute change after join always
<nijaba> RoAkSoAx: pong (late)
<niemeyer> hazmat: Anyway.. random wish to make it simpler really.. maybe not possible, don't know..
<hazmat> niemeyer, yeah.. it does feel like a redundant layer through most of the workflow
<hazmat> workflow.py that is
<niemeyer> Right
<hazmat> niemeyer, yeah.. i thought about just having functions attached as transition actions directly on the state machine
<hazmat> that was actually one of the original designs, but per discussion we wanted to keep things to as pure of a state machine as possible
<hazmat> i just went with something as static and simple as possible in the workflow def 
<hazmat> but the extra layer there hasn't really proved useful.. 
<hazmat> its always effectively a one liner to the lifecycle method from the workflow
<niemeyer> hazmat: Yeah.. I mean really having two layers.. e.g.
<niemeyer> def relation_joined():
<niemeyer>     ... do stuff
<niemeyer> def start():
<niemeyer>     ... call start hook ...
<niemeyer>     self._state = "started"
<niemeyer> etc
<hazmat> there's global state to manage on some of these though
<niemeyer> Then, another class
<niemeyer> err = hooks.install()
<niemeyer> if err is None:
<niemeyer>     hooks.start()
<niemeyer> etc
<niemeyer> This feels easier to grasp/manipulate somehow
<hazmat> the lifecycle methods should correspond directly to those hooks.*
<hazmat> we could hook them up directly to the workflow def
<niemeyer> hazmat: Yeah, I know it's not too far.. we just have a few "padding layers" there
<niemeyer> hazmat: But I think we also need some separation in a few cases.. we don't have that external driver that says what to do
<hazmat> yeah.. it should be easy to drop all the action methods on workflow, and have the transition action directly invoke the lifecycle method
<niemeyer> hazmat: Feels a bit like inverting responsibility
<niemeyer> hazmat: Right, that's what I'm trying to get to if I see what you mean
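Roughly the shape being sketched above, as a self-contained toy: transitions call the hook-running methods directly instead of going through a separate workflow layer. The state names and the stubbed hook runner are placeholders, not Ensemble's actual workflow or lifecycle classes.

    # Toy "folded" state machine; hook execution is stubbed out with prints.
    class UnitStateMachine(object):
        def __init__(self):
            self.state = "installed"

        def _run_hook(self, name):
            print("running hook: %s" % name)  # stand-in for real hook execution
            return True                       # pretend the hook succeeded

        def start(self):
            self.state = "started" if self._run_hook("start") else "start_error"

        def relation_joined(self, relation, remote_unit):
            if self._run_hook("%s-relation-joined" % relation):
                # per the earlier decision, -changed always follows -joined
                self._run_hook("%s-relation-changed" % relation)

        def stop(self):
            if self._run_hook("stop"):
                self.state = "stopped"

    unit = UnitStateMachine()
    unit.start()
    unit.relation_joined("db", "mysql/0")
    unit.stop()
    print("final state: %s" % unit.state)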
<hazmat> anyways.. i should get back to openstack.. i need to sign off soon
<hazmat> niemeyer, i do
<niemeyer> hazmat: Awesome, have a good weekend.. I should be off in a bit too
<hazmat> niemeyer, have a good weekend
<niemeyer> Cheers!
<m_3> great weekend guys... thanks
<hazmat> nice.. txaws running from branch.. 
 * hazmat crosses fingers on openstack deploy
<hazmat> sweet, deploy working!
 * hazmat does a dance
<niemeyer> hazmat: WOOT!
<niemeyer> hazmat: Man.. that requires beer
<niemeyer> I'll step out immediately to get some :-)
<niemeyer> A good weekend to all!
<_mup_> Bug #829829 was filed: test_service_unit_removed eventually fails <Ensemble:New> < https://launchpad.net/bugs/829829 >
#ubuntu-ensemble 2011-08-20
<sidnei> niemeyer, uhm. looks like i dug my own hole there. now i must type really fast. ;)
<sidnei> im sure if i just go back and read all the channel logs i'll learn EVERYTHING. or maybe read the source. :)
<_mup_> Bug #829880 was filed: object store doesn't like key with '/'  <Ensemble:Triaged by hazmat> <OpenStack Compute (nova):New> < https://launchpad.net/bugs/829880 >
<hazmat> sidnei, if you script the download of irclogs, i'd be interested in the script ;-)
<niemeyer> Hey there
<sidnei> hey niemeyer
<niemeyer> sidnei: Yo!
<niemeyer> sidnei: Ready for the talk? :-)
<sidnei> niemeyer, no ;)
<niemeyer> sidnei: Hehe :)
<niemeyer> sidnei: It's no big deal.. just play with it and you'll easily get the idea
<niemeyer> sidnei: and we'll be here :)
<sidnei> niemeyer, sure. i'm thinking of starting with a background of 'why ensemble', which should fill at least half the time. then get to the 'how', and hopefully i won't run out of time. :)
<sidnei> niemeyer, because most people won't have heard of it anyway
<niemeyer> sidnei: Sounds good.. I have the slides from FISL too, in case you want to get a head start
<sidnei> niemeyer, definitely helpful
<niemeyer> sidnei: http://labix.org/presentations/ensemble-fisl12/
<niemeyer> sidnei: The demo there is a short (3m) video.. I can send you at some point too if you want
<sidnei> niemeyer, that's fine, i'm thinking of saving the demo for a lightning talk.
<niemeyer> sidnei: In my experience the demo has been game-changing in presentations
<niemeyer> sidnei: I could babble for hours and it wouldn't be as effective as showing a couple of in-practice minutes
<sidnei> niemeyer, right. i'm thinking of redoing the demo so it's more readable on a projector, at fisl i could barely read what was on screen.
<niemeyer> sidnei: Ohh, sweet!
<niemeyer> sidnei: We'd love to have that as well! 8)
<sidnei> i'm sure you do :)
<sidnei> niemeyer, so changing subject but not quite, how do you see a django formula working? i mean, it's different than most services in the sense that installing a package (python-django) does not bring up a service, but instead you need a django app which is generally a vcs checkout.
<niemeyer> sidnei: That's not a big deal per se, but let me try to understand what you're really referring to
<niemeyer> sidnei: Do you mean a generic all-encompassing django formula that can be used for any django app at all, or
<niemeyer> sidnei: .. bundling of a specific django app from a vcs?
<sidnei> niemeyer, i think there's room for both. where the former probably means making the vcs location configurable in the formula.
<niemeyer> sidnei: Agreed, there's room for both, but they're different paths
<niemeyer> sidnei: I'd actually like to experiment with the first option at some point
<niemeyer> sidnei: The second one is easy
<niemeyer> sidnei: For the first one, I'd like to try building something a bit smarter, that would enable communicating to the framework that e.g. a given database was made available
<niemeyer> sidnei: Effectively working more like a PaaS
<sidnei> right
<niemeyer> sidnei: Imagine building a thin layer that enabled the Python app to query "Hey, what apps are up?", etc
<niemeyer> sidnei: We could then offer a few different relations.. e.g.
<niemeyer> sidnei: mongo-relation-joined, postgres-relation-joined, etc
<niemeyer> sidnei: These would hook directly into Django and configure the respective database
<sidnei> niemeyer, so hooking into django.settings, ok.
<niemeyer> sidnei: Right.. PaaS FTW
<niemeyer> sidnei: The vcs then is a config option..
<niemeyer> sidnei: Which means exactly the same formula can be used to deploy pretty much any django app straight from vcs
<sidnei> niemeyer, yup, i can see that working. would be simple to export DJANGO_SETTINGS_MODULE to a module generated by things provided by ensemble, which would override the django app defaults.
<niemeyer> sidnei: Right!
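A sketch of that generated-settings idea: a hypothetical postgres-relation-changed hook writes a small module that imports the app's own defaults and overlays DATABASES from relation data, and the app server is then started with DJANGO_SETTINGS_MODULE pointing at it. Module names, paths, and relation keys are invented; only relation-get is assumed as the hook command.

    # Hypothetical hook body generating a Django settings override module.
    import subprocess

    def relation_get(key):
        return subprocess.check_output(["relation-get", key]).decode().strip()

    def write_settings(path="/srv/myapp/ensemble_settings.py"):  # hypothetical path
        keys = ("database", "host", "port", "user", "password")
        values = dict((k, relation_get(k)) for k in keys)  # plain dict(): 2.6-friendly
        lines = [
            "# Generated by the formula; do not edit.",
            "from myapp.settings import *  # the app's own defaults",
            "DATABASES = {",
            "    'default': {",
            "        'ENGINE': 'django.db.backends.postgresql_psycopg2',",
            "        'NAME': '%(database)s'," % values,
            "        'HOST': '%(host)s'," % values,
            "        'PORT': '%(port)s'," % values,
            "        'USER': '%(user)s'," % values,
            "        'PASSWORD': '%(password)s'," % values,
            "    }",
            "}",
        ]
        open(path, "w").write("\n".join(lines) + "\n")

    if __name__ == "__main__":
        write_settings()
        # The app server would then be launched with (hypothetical wiring):
        #   DJANGO_SETTINGS_MODULE=ensemble_settings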
<sidnei> niemeyer, ok. i think i might give that a try. what about the second option, with bundling the app.
<niemeyer> sidnei: That'd be awesome.. please send some feedback to the list later if possible.  This feels like a big area we should start exploring for several platforms
<niemeyer> sidnei: The second option is trivial.. you bundle the app itself within a formula and hardcode things up
<sidnei> niemeyer, so the formula would be pushed to ensemble with a tarball included
<niemeyer> sidnei: Not necessarily.. you can still reference the vcs, for instance
<niemeyer> sidnei: Except it would be hardcoded inside it
<niemeyer> sidnei: But the tarball would be doable too
<sidnei> niemeyer, ok. and how scalable is the deployment of formulas? if i had 1000s of new instances firing up simultaneously, would ensemble fall over from too many concurrent requests?
<sidnei> niemeyer, i was looking at murder https://github.com/lg/murder and it seems simpler than i thought it was, maybe a murder formula would be interesting too.
<niemeyer> sidnei: Nope
<niemeyer> sidnei: I mean, no, it won't fall
<niemeyer> sidnei: We never tested such a workload, but the communication is relatively simple
<sidnei> niemeyer, awesome. 
<sidnei> niemeyer, i think a murder formula might be interesting anyway, even if just for the sake of showing how to use murder without capistrano. 
<niemeyer> sidnei: I don't know what murder is.. looking
<niemeyer> sidnei: Ah, is it the twitter thing?
<niemeyer> Yeah..
<niemeyer> sidnei: Yeah, certainly interesting..
<sidnei> niemeyer, the way its deployed seems fairly trivial, it just calls python scripts to fire up a torrent server and multiple peers
<sidnei> where it == capistrano
<sidnei> https://github.com/lg/murder/blob/master/lib/murder/murder.rb
<sidnei> niemeyer, i wonder if it would make more sense as a central service provided by ensemble though
<niemeyer> sidnei: Probably not
<niemeyer> sidnei: We're going in the opposite direction, actually
<niemeyer> sidnei: Ensemble services themselves will eventually be formulas
<hazmat> murder is a deploy system, afaicr
<hazmat> i was thinking about doing a wsgi formula
 * hazmat tries some jfdi
<niemeyer> hazmat: Hey
<hazmat> niemeyer, hola.. spent the morning trying to track down a touchpad
<niemeyer> hazmat: I understand it as file-copying
<hazmat> i saw the announcement last night before i went to bed
<hazmat> rushed to stores this morning, just missed them
<niemeyer> Aw
<niemeyer> LOL
<hazmat> $100 for a tablet computer is pretty nice
<niemeyer> I just googled for "murder"
<hazmat> doh ;-)
<hazmat> niemeyer, it uses zk
<hazmat> its twitter's deploy system afaicr
<hazmat> in python
<niemeyer> hazmat: Yeah, I've listened to part of the talk.. it sounded like a generic file-copying thing, but maybe I forget the details
<niemeyer> hazmat: I imagined that it might be put in a relation, etc
<hazmat> niemeyer, basically bittorrent + zk 
<niemeyer> hazmat: Ensemble won't get in the way of the copying/deploying bit
<hazmat> niemeyer, yeah.. it's not really germane to a service, it's pretty custom imo
<niemeyer> hazmat: E.g.: """
<niemeyer> Murder (which by the way is the name for a flock of crows) is a combination of scripts written in Python and Ruby to easily deploy large binaries throughout your company's datacenter(s).
<niemeyer> """
<hazmat> a formula interested in doing app deploy, just does a set on the version in the config file
<hazmat> for upgrade
<niemeyer> Yeah
<niemeyer> hazmat: Still, how would one put the first version in place
<hazmat> niemeyer, deploy --with-config
<hazmat> niemeyer, just finished the config.yaml ;-)
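A hedged illustration of that flow: the formula exposes a version option, the first deploy passes a value for it via --with-config, and an upgrade is just setting the option to a new revision. The option name and file layout are invented, and any CLI detail beyond what's quoted in the conversation is an assumption.

    # Illustration only: rendering the hypothetical config pieces with PyYAML.
    import yaml

    formula_config = {                 # what the formula's config.yaml might declare
        "options": {
            "version": {
                "type": "string",
                "default": "1.0.0",
                "description": "App revision/tag to check out and deploy.",
            }
        }
    }

    deploy_values = {"my-app": {"version": "1.2.3"}}  # passed at deploy time

    print(yaml.safe_dump(formula_config, default_flow_style=False))
    print(yaml.safe_dump(deploy_values, default_flow_style=False))
    # First deploy (per the log): ensemble deploy --with-config <values file>
    # Later upgrade (hypothetical): set the version option to a new tag and let
    # the formula react to the config change.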
<niemeyer> hazmat: I imagine, e.g., a few formulas like murder-server, murder-service, etc
<hazmat> niemeyer, i just don't think its that useful
<niemeyer> Agreed, it's less exciting
<smoser> hazmat, cloud-init runs fine to my knowledge on canonistack
<hazmat> smoser, yeah.. it does.. i was having some issues logging in with keys that were put in place by cloud-init.. but cloud-init works fine
<hazmat> smoser, there's a few minor things i noticed.. the metadata server/hostname set seems to hang
<smoser> "metadata server/hostname set seems to hang"
<smoser> ?
<hazmat> smoser, the login issue disappeared once i started using a keypair when launching instances, haven't bothered to track it down.. it was probably a red herring
<smoser> hostname is https://bugs.launchpad.net/nova/+bug/820962
<_mup_> Bug #820962: Generating hostname from display name incorrect <OpenStack Compute (nova):Triaged> < https://launchpad.net/bugs/820962 >
<hazmat> smoser, ah.. yeah that's it
<hazmat> smoser, do you know if current openstack has a metadata server of some type?
<smoser> yes, it does.
<smoser> it's not perfect (bug 823520, bug 827569) but it is present and generally working
<_mup_> Bug #823520: EC2 instance-type metadata returns SQLAlchemy object string <OpenStack Compute (nova):Fix Committed by dan-prince> < https://launchpad.net/bugs/823520 >
<_mup_> Bug #827569: ec2metadata service does not include 2011-01-01 <OpenStack Compute (nova):Confirmed> < https://launchpad.net/bugs/827569 >
<hazmat> thanks, i'll look into that then, there's a little bit of metadata that we'd like for ensemble from the instances
<smoser> you're probably running natty ?
<hazmat> smoser, yeah.. mostly natty for ec2.. oneiric for lxc 
<smoser> so user data issue , natty?
<hazmat> smoser, honestly atm it all appears to just magically work
<hazmat> i'm able to deploy services on canonistack 
<hazmat> i'm just not sure how in all cases ;-)
<hazmat> off to enjoy a beautiful day.. have a good weekend folks
<smoser> verified that oneiric image on canonistack works with no --key (launched and accessed via ssh_import_id in cloud-config)
<smoser> and verified natty does too
<smoser> but... there is a bug in the loader kernels, it would appear, where sometimes they don't boot all the way
<smoser> it is the loader kernel that is at fault likely
<smoser> a reboot will usually fix it
<sidnei> niemeyer, yes, it's pretty generic file copying, peer-to-peer due to bittorrent.
#ubuntu-ensemble 2011-08-21
<akshay> are there any free cloud services with which I can test ubuntu ensemble?
