#ubuntu-ensemble 2011-05-30
<kim0> Morning everyone
<SpamapS> hrm.. laying around w/ ensemble.. it seems like hooks on two machines in a service are being triggered for eachothers' events
<SpamapS> s/laying/playing/ :p
<niemeyer> Good morning!
<SpamapS> niemeyer: hello there, I have a very interesting situation that I'm trying to debug..
<niemeyer> SpamapS: Cool, let's see
<SpamapS> which may be interrupted by my 20 month old waking up :-P
<SpamapS> niemeyer: basically, I am seeing the data from demo-wiki/2's relationship when I run 'relation-get' on demo-wiki/1
<niemeyer> SpamapS: Hmm
<SpamapS> niemeyer: its as if the modified state is getting propagated to all service units
<SpamapS> niemeyer: I think I have a reproducible test case
<niemeyer> SpamapS: Ah, that's awesome
<SpamapS> btw, debug-hooks is *amazing*
<niemeyer> SpamapS: Is there a chance I could get access to the environment's zk machine?
<niemeyer> SpamapS: Just to have a look at the state
<niemeyer> SpamapS: It's nice isn't it
<niemeyer> SpamapS: Feels like gdb for formulas :)
<SpamapS> totally
<SpamapS> niemeyer: I think this is easily reproducible w/ the example formulas
<SpamapS> trying that right now
<niemeyer> Cool
<niemeyer> Let me know the steps and I'll run through it
<SpamapS> ahh ok so its not exposed by the example formulas because the example mysql formula sends the same user/pass to all service units in the relation
<SpamapS> ok so here's where I may just not understand relation-set and relation-get ...
<SpamapS> I thought that relation-set and relation-get were each reading/writing into an area unique to the two service units...
<SpamapS> so if   mysql/0 relation-set's with ENSEMBLE_REMOTE_UNIT=wordpress/1 .. thats a set of data unique to wordpress/1 and mysql/0's relationship
<SpamapS> niemeyer: ^^ is my assumption false?
<SpamapS> its been a while since I read the spec
<niemeyer> SpamapS: It's not
<SpamapS> so that confuses me then as to how the example mysql formula can work.. as it exits if the database exists
<SpamapS> without ever having done a relation-set
<niemeyer> SpamapS: If you don't provide additional arguments, and do a relation-set with a=b, that's specific to the involved units only
<SpamapS> niemeyer: ok, so the steps to reproduce what I'm seeing as wrong behavior are as follows:
<SpamapS> bzr branch lp:principia-tools && cd principia-tools && scripts/getall && tests/mediawiki.sh
<SpamapS> niemeyer: this assumes a bootstrapped environment
<SpamapS> niemeyer: the failure is that both demo-wiki's will receive eachothers' usernames in db-relation-changed at some point
<SpamapS> causing them to fail
<SpamapS> note that this will spawn about 7 machines. ;)
<SpamapS> you can probably reduce it down by removing the memcached stuff
<niemeyer> SpamapS: No worries.. I'll go through it in a moment
<niemeyer> SpamapS: Nah, it'll be nice to see it churning ;)
<SpamapS> niemeyer: much appreciated. I'll be in and out today, as its a U.S. holiday, but this one has been *killing me* this weekend. ;)
<SpamapS> so I'll check in as much as possible
<niemeyer> SpamapS: Ouch, sorry to hear it.. we'll get whatever is happening fixed
<SpamapS> I'm kind of hoping whats broken is my assumptions.
<SpamapS> I'm afraid I've churned a bit on it so the formulas are kind of ugly at the moment. :-P
<niemeyer> SpamapS: Your basic assumption is sane/right, at least
<SpamapS> on a side note... I wonder if we should make an environment config option to keep from putting all the aws machines' host keys in .ssh/known_hosts ... 
<SpamapS> I've spun up and down enough nodes, I get conflicts about 1 in 20
<kim0> Any way to remove "Front Page" from first line of https://ensemble.ubuntu.com/FrontPage :)
<koolhead17> replace it with startpage
<koolhead17> :)
<koolhead17> hey hazmat niemeyer obino
<niemeyer> SpamapS: Yeah, feels like we should do something on that area indeed 
<SpamapS> niemeyer: so if I were following the example formulas that exist now, I would believe that one relation-set feeds all units in a service.
<niemeyer> koolhead17: Hey there!
<niemeyer> SpamapS: Why?
<SpamapS> niemeyer: because db-relation-changed in the mysql formula only ever runs relation-set once per service
<SpamapS> niemeyer: once the database has been created and relation-set has been run, it just silently exits
<SpamapS> er.. db-relation-joined actually
<SpamapS> niemeyer: and in fact, this works beautifully, properly feeding that data to all units of wordpress when related
<niemeyer> SpamapS: I'm not sure I understand what you're saying
<niemeyer> SpamapS: Looking at db-relation-joined, why would it run only once per service?
<niemeyer> SpamapS: Oh, hmm, I think I see what you mean
<niemeyer> SpamapS: Feels like there's something catchy there indeed
<niemeyer> SpamapS: The database doesn't have to be created, but the relation settings should be piped through
<SpamapS> niemeyer: its as if relation-set is a broadcast channel
<niemeyer> SpamapS: Yeah, this is certainly bogus
<SpamapS> it didn't break until I added ip restrictions to the grants in the principia mysql formula
<niemeyer> SpamapS: It must be something minor, though.. I'm sure the concept was well understood the whole time
<niemeyer> SpamapS: We even have a parameter on relation-set when one wants to change info on a separate relation unit, for instance
<SpamapS> right
<SpamapS> in fact, I wonder if I explicitly state the remote unit if this goes away
<niemeyer> SpamapS: No, the default remote unit is the one you expect
<niemeyer> SpamapS: E.g. db-relation-joined will set the default remote unit to the joining unit
<SpamapS> actually, relation-set doesn't take a unit
<SpamapS> not from the cli options at least
<kim0> hmm so after the principia changes, what should I branch to work on a new formula
<niemeyer> SpamapS: Aw.. man, I think I'm doing a big confusion..
<niemeyer> SpamapS: I'm mixing relation-get and relation-set
<niemeyer> SpamapS: relation-get is the one that will look at the remote unit's data
<niemeyer> SpamapS: relation-set is always local
<niemeyer> SpamapS: So it changes the settings of "self", if you see what I mean
<niemeyer> SpamapS: and the other units will be notified of the change
<niemeyer> SpamapS: So it's working as intended, but not as I pointed out earlier
<niemeyer> SpamapS: You can think of it the following way: every unit in a relation has a bucket with its own settings.
<niemeyer> SpamapS: relation-set always changes the local bucket
<niemeyer> SpamapS: relation-get can retrieve settings from the bucket of any other unit within this relation, and defaults to the remote unit the event is running for
<niemeyer> SpamapS: The documentation and examples may be helpful: https://ensemble.ubuntu.com/docs/formula.html#hook-tools
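The bucket description above can be sketched as a toy model. This is only an illustration of the semantics niemeyer describes, not Ensemble's implementation: the `Relation` class, its method names, and the user/password values are all hypothetical.

```python
from collections import defaultdict


class Relation:
    """Toy model (hypothetical): one settings bucket per unit in a relation."""

    def __init__(self):
        self.buckets = defaultdict(dict)  # unit name -> settings

    def relation_set(self, local_unit, **settings):
        # relation-set always writes to the calling unit's own bucket.
        self.buckets[local_unit].update(settings)

    def relation_get(self, unit):
        # relation-get reads the bucket of the named unit; in a real hook
        # the default would be the remote unit the event is running for.
        return dict(self.buckets[unit])


rel = Relation()
# mysql/0 handles the join of wordpress/0, then of wordpress/1.
rel.relation_set("mysql/0", user="wp0", password="secret0")
rel.relation_set("mysql/0", user="wp1", password="secret1")

# Both wordpress units read the same mysql/0 bucket, so both see wp1.
seen_by_wp0 = rel.relation_get("mysql/0")
seen_by_wp1 = rel.relation_get("mysql/0")
print(seen_by_wp0 == seen_by_wp1)
```

Under this model the second relation-set overwrites the first, which matches the symptom SpamapS reported: each demo-wiki unit ended up seeing the other's username.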
<SpamapS> so I can't give each remote unit its own unique configuration? :(
<SpamapS> bummer I was kind of excited to be able to restrict each username/password to each unit
<SpamapS> but it makes perfect sense, and is easy to correct
<SpamapS> the key "aha" btw, is "relation-set is always local"
<SpamapS> niemeyer: well it takes away a tiny thing I was trying to do, but it simplifies the formulas, so my :( is turned around to :)
<niemeyer> SpamapS: Knowing that design, you actually can if you really want to
<SpamapS> The idea was simply that the units could be isolated from one another for greater security
<niemeyer> SpamapS: Ah, I see
<SpamapS> but, there are other ways to achieve that
<niemeyer> SpamapS: Right.. we have to consider further the security details intra-relations
<SpamapS> it was, I was thinking, a target of opportunity.. not something I see as key to formulas working
<SpamapS> I think given that the service units will be largely identical, its ok to treat them all as equals and just isolate the service from other services, not the units from eachother
<niemeyer> SpamapS: Right, this is likely the initial direction we'll go into
<niemeyer> SpamapS: In some cases it may actually be important to know details from other relations as well
<niemeyer> Sorry
<niemeyer> SpamapS: I mean from other units
<niemeyer> SpamapS: But, if we really want to, the current design doesn't make it impossible to do the isolation you mention
<niemeyer> SpamapS: Internally, that is
<SpamapS> sure, right now though, being able to assume that there is only one bucket per relation-set, allows me to delete a lot of code :)
<niemeyer> SpamapS: The conventions we build on it might make it hard, though.  IOW, people may well start depending on the fact they can see other unit's settings in useful cases.
<niemeyer> SpamapS: Sweet, that sounds like a good feature then ;)
 * SpamapS <heart> /^-/ in diffs
<niemeyer> +1
 * niemeyer lunch!
<SpamapS> ahh yes, deleting code to make things work is always refreshing
<SpamapS> kim0: nice job on the ensemble text. :)
<kim0> SpamapS: oh cool ! 
<kim0> SpamapS: thanks .. I could really use some tweak ups from a native speaker
<kim0> SpamapS: is there a short couple of commands to branch principia and work on a new formula
<kim0> if it's longish .. It's not urgent
 * kim0 is taking a first shot at writing a new formula
<SpamapS> kim0: I haven't gone through it with a fine toothed comb, but the bullet points portray what I think we all want to portray
 * kim0 nods .. cool
<kim0> n00b question, when I do ensemble add-relation mywiki mymemcached, assuming both mywiki and mymemcached have TWO service units deployed .. does Ensemble hook them up one memcache per mywiki, like I think it should ?
<SpamapS> err
<SpamapS> kim0: mywiki's units will all be related to mymemcached's units
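SpamapS's answer means the pairing is all-to-all rather than one-to-one. A minimal sketch, assuming two units per service; the unit names and numbering here are made up for illustration:

```python
from itertools import product

# Hypothetical unit names for two services with two units each.
wiki_units = ["mywiki/0", "mywiki/1"]
cache_units = ["mymemcached/0", "mymemcached/1"]

# Every unit on one side is related to every unit on the other side.
pairs = list(product(wiki_units, cache_units))
for wiki, cache in pairs:
    print(f"{wiki} <-> {cache}")
```

So each mywiki unit would see both memcached units, and it is up to the formula's hooks to decide how to use them.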
 * niemeyer waves
<niemeyer> kim0: The documentation feels like going in a good direction, but I'm not sure the front page should hold that description
<kim0> o/
<kim0> shoot
<kim0> what's your thinking
<niemeyer> kim0: The feeling I have when opening the page is that I have to read a lot to grasp what Ensemble is
<niemeyer> kim0: In a sense, it feels like the first two or three questions in the FAQ would be a good introduction: https://ensemble.ubuntu.com/docs/faq.html
<kim0> there's a short version and a long one especially for that 
<kim0> hmm
<kim0> perhaps there should be some graphic separating the two
<niemeyer> Or, actually, the whole FAQ is probably the front page :-)
<niemeyer> Or rather a seed for it
<kim0> niemeyer: I think the text looks a bit too much, just because it's a lot of text
<SpamapS> niemeyer: ok the test I gave you before passes now.. :) thanks for clarifying, time to BBQ!
 * SpamapS disappears
<kim0> probably adding a nice graphic would make it easier on the eye
<niemeyer> SpamapS: Sweet, have fun there
<kim0> but I don't think we should remove much of it .. 
<niemeyer> kim0: Well, it feels like a lot of text because it is a lot of text.. :-)
<kim0> I could just leave titles for bullets, and link the denser text in another page of course
<niemeyer> kim0: We can certainly keep some of it
<kim0> like the 5 bullets, would just be 5 lines
<kim0> with the longer version linked to
<niemeyer> kim0: I actually like the points below
<kim0> niemeyer: so what part should not be there ?
<niemeyer> kim0: Perhaps the at-a-glance idea is what needs some love
<niemeyer> kim0: Reading this, for instance:
<niemeyer> """
<niemeyer> Ensemble is a novel cloud orchestration framework. It lets you deploy, manage and scale software services on the cloud and soon physical servers. Ensemble uses "formulas" to capture the intelligence of managing deployments. If you want a piece of software on the cloud, "there's a formula for that!" The Ensemble community is working towards that goal!
<niemeyer> """
<niemeyer> kim0: It feels very hyped, but without any hints at all regarding what it *actually* does
<niemeyer> kim0: I can replace Ensemble by any other configuration management product name, and it remains true
<kim0> but that's what marketing is all about eh :)
<niemeyer> kim0: I'm not a marketing guy
<niemeyer> :)
<kim0> mmm
<niemeyer> kim0: I think we should take pieces of the FAQ and insert above as an introduction
<kim0> niemeyer: well .. that paragraph will be replaced with whatever Gerry and everyone agrees on
<kim0> to be the standard intro paragraph to Ensemble
<kim0> this is just a placeholder
<niemeyer> kim0: Absolutely, but it's live right now, and I'd like to have something nice there
<niemeyer> kim0: What do you think of having our FAQ entries as an introduction, and maintaining the large paragraph there as a more in-depth look?
<niemeyer> s/large paragraph/large section/
<kim0> generally it's absolutely fine :)
<kim0> ok maybe it's indeed too hype packed
<kim0> k, I'll edit it
<niemeyer> kim0: Thanks a lot
<kim0> rock n roll
 * niemeyer plays with packages
<niemeyer> ... and everything seems to work.. sweet!
<niemeyer> kim0: When are you planning to run the refactoring?
<niemeyer> kim0: In the front page, that is
<niemeyer> kim0: I want to put some basic information up regarding the packages
<kim0> I'm held up in real life for a couple of hours
<kim0> if you'd like to change something go ahead ..
<kim0> I may not be able to change it before the morning
<kim0> niemeyer: is that ok ?
<niemeyer> kim0: Cool, no worries
<kim0> okie great
<niemeyer> kim0: Tweaked
<niemeyer> kim0: Please let me know what you think
<kim0> great .. thank you
#ubuntu-ensemble 2011-05-31
<poolie> anyone here?
<TeTeT> no developers I fear
<kim0> Morning everyone
<hazmat> good morning
<kim0> hazmat: morning :)
<niemeyer> hazmat, kim0: Yos!
<niemeyer> hazmat: Good extended weekend?
<hazmat> niemeyer, indeed.. lots of play time
<kim0> hehe
<kim0> Love the Ensemble audition program
<kim0> wonder if we could have a miniature public version of it (like for a week, to write a formula)
<kim0> The aws trial thing was a big success, I suppose we could do something similar
<niemeyer> kim0: Maybe.. let's see how the program goes
<SpamapS> goood morning ensembladeros
<kim0> SpamapS: morning o/
<kim0> ensembladeros .. mm can we pick something easier to say :)
<SpamapS> Ensemblanators?
<SpamapS> Ensemblasters
<jimbaker> visions of the toy story ride at disney in my head right now
<SpamapS> Ensemble Space Rangers
<robbiew> bcsaller: thanks for stepping up for the demo at structure...I wasn't thrilled about flying to SF for a day :D
<bcsaller>  robbiew: no problem
<bcsaller> I like talking with people about the project
<robbiew> :D
<robbiew> bcsaller: how'd the cloudcamp go last week?
<robbiew> (or whatever it was called, heh)
<bcsaller> at first I was worried that it was too high level with lots of exec types running around, but by the end I think I made some good contacts and generated some interest 
<bcsaller> no one there had a good answer for orchestration, some didn't even understand they'd hit this problem but it became very clear in talking with people that a sysadmin named Tim sitting in the back room wasn't a viable orchestration solution
<kim0> SpamapS: 
<kim0> SpamapS: Ensemblasters actually sounds great :)
<robbiew> bcsaller: :)...sweet
<SpamapS> bcsaller: dude, so harsh.. Tim's just tryin to feed his family. ;)
<robbiew> lol
<bcsaller> tim shouldn't have to work so hard, we offer better tools
<SpamapS> True, though without work as a distraction Tim might head back to the bottle.. ;)
<bcsaller> Tim should go back to enchanting 
<SpamapS> bcsaller: I'm going to a similar event down in SD .. how would you suggest approaching it?
<jimbaker> at pycon, one reaction i had is that "tim" might want to ssh into an indiv machine and fix it manually. wrong, wrong :)
<jimbaker> still an attitude that needs to be addressed
<koolhead17> beep beep :)
<kim0> koolhead17: hey
<SpamapS> Actually to that point jim, the current way to do it "the right way" is to modify the formula and upgrade the service..
<SpamapS> jimbaker: I think there's a need for a parallel ad-hoc command interface
<koolhead17> kim0: am heading for niemeyer docs as you suggested.
<kim0> koolhead17: cool!
<bcsaller> SpamapS: I really try to vary it to the audience. I went in, listened to the talks and the questions people had about other things and adjusted what I had to say. This time I was able to focus a lot of talk around the idea "Elasticity is one of the key properties of cloud. As you become elastic the way you need to think about and model services changes..." and so on. It was buzzword compliant enough that I think even the non-t
<SpamapS> jimbaker: also sometimes it is as simple as un-sticking a stuck service with a gentle kill -9 ;)
 * niemeyer waves
<kim0> koolhead17: It's Ensemble docs :) you can write there too
<koolhead17> hello niemeyer :)
<SpamapS> bcsaller: what was the format like? "un-conference" where people suggest what they'd like to discuss?
<jimbaker> SpamapS, agreed for development all of these things are good (so ensemble ssh and other tools we will write). clearly we would hope that ad hoc needs are not necessary outside of dev
<koolhead17> kim0: are we waiting for oneiric then we will put a community documentation page for ensemble?
<kim0> koolhead17: not really .. docs are here https://ensemble.ubuntu.com/docs/
<bcsaller> SpamapS: in this case yes, for example audience might collect a list of questions and then people that think they can answer one or more of those questions might join a panel to field questions from the audience 
<kim0> koolhead17: I'm just used to the older place 
<koolhead17> http://people.canonical.com/~niemeyer/ensemble/user-tutorial.html :P
<jimbaker> SpamapS, i think the real issue was that "tim" might need to not only do some transient changes, but there would be a need to capture from production some permanent fixes too
<koolhead17> haha
<SpamapS> jimbaker: for instance, sometimes you want to restart all of your app servers because you changed a DNS setting and they're using a long TTL..
<kim0> koolhead17: both are almost identical
 * koolhead17 waves to jimbaker 
<koolhead17> kim0: ok
<bcsaller> I was a little nervous getting up in front of a room full of people when the people on either side tended to be a little more business focused, but it worked out fine
<SpamapS> jimbaker: yeah, in that instance "Tim" needs to learn how to use his tools better. :)
 * kim0 hugs Tim
<kim0> hehe
<jimbaker> koolhead17, hi
<SpamapS> bcsaller: alright that sounds great. :)
<bcsaller> SpamapS: the whole thing worked pretty well
<SpamapS> bcsaller: interesting.. the SD one is co-located with an event called "The Business of Cloud Computing" that is free for end users, but costs $1895.00 for "service providers"
<bcsaller> This was co-located with another conference as well. I think it's the pattern for cloud camps
<bcsaller> different messaging between the two as well. MS, IBM and Salesforce are really pushing PaaS at the CTO level 
<bcsaller> but that wasn't so much what people were there to talk about
<robbiew> bcsaller:  another one in Mountain View -> http://www.devopsdays.org/events/2011-mountainview/proposals/
<SpamapS> The attendees list of this business of cloud is quite heavy w/ C level
<robbiew> if you're interested ;)
<SpamapS> CTO, CIO, etc. etc.
<SpamapS> Devops days is *awesome*
<robbiew> June 17-18th
<robbiew> call for proposals deadline is tomorrow
<SpamapS> the one last year only had ignite talks and panels
<bcsaller> robbiew: I'll put a proposal in and see what happens
<robbiew> sweet
<robbiew> spreading the gospel!!!
<bcsaller> yes, but I really need to start scheduling time to follow up with people after these things. 
<SpamapS> Definitely should bring them in here too. :)
<robbiew> SpamapS: +1
<SpamapS> ooops.. left my 8 node mediawiki running for 24 hours
<bcsaller> some of the people I talked with might be too high level for this type of chat though
<bcsaller> SpamapS: Ensemble audition ftw :)
<robbiew> anyone in the Boston area?...interested in showing off ensemble thursday at a CloudCamp? http://www.cloudcamp.org/boston/2011-06-02
<SpamapS> any idea when 'ensemble set' will land?
 * robbiew quits soliciting for talks....for now
<robbiew> uuuahahahahaaa
<bcsaller> SpamapS: re: set, it's working, should be merging one half of it today and the rest will be in review, it's almost all through the process
<bcsaller> SpamapS: we were joking that it was hard to come up with a simple user visible example. I started with trying to change the blog title in wordpress and ouch, that was a can of worms, it really involves a mysql update and it spiraled out of control from there
<SpamapS> bcsaller: I'd think with a mysql update it should be *dead* simple
<SpamapS> bcsaller: mediawiki has me writing PHP files all over the place. :-P
<bcsaller> SpamapS: set wordpress blog-title="foo". Right, so now wordpress has to identify its relation to mysql but config option hooks are not built around relation context, I think if you follow that thread you'll see it gets silly fast
<bcsaller> writing a PHP file would be a single change with immediate results that don't cross relationship boundaries. might make more sense for this 
 * SpamapS exports the entire "OpenBSD" category from wikipedia to import into his mediawiki
<SpamapS> bcsaller: right, this is where I think one thing missing from  my current understanding of the model is ordering
<SpamapS> bcsaller: I think we may need to be able to guarantee ordering of relations
<SpamapS> there's no sense relating website to a loadbalancer until all the required things are related
<bcsaller> SpamapS: that's related but not quite where I was heading. 
<SpamapS> and since add-relation doesn't block until the relation exists.. this can be a problem I think.
<niemeyer> bcsaller: Glad to hear your talk went well
<SpamapS> bcsaller: Right, I guess what I'm saying is, that setting requires mysql... so settings should error or block until relations are setup. At that point you'd at least know wordpress knows how to talk to mysql.
<niemeyer> SpamapS: Ordering isn't entirely straightforward, as we discussed back in Cape Town
<bcsaller> SpamapS: relation settings are not even readily available in the config-changed hook as that's not a relation hook
<SpamapS> niemeyer: indeed.. the stacks may be the place to address that
<SpamapS> I'm still trying to see if something solves my problem of not being able to relate to my slave database until after it has been related to the master
<SpamapS> whoa.. I just discovered the resolved command
<SpamapS> sweeeet
<SpamapS> heh.. now that I'm importing a massive amount of data into my mediawiki.. I find myself wishing I had a graphing system setup. :)
 * SpamapS looks into reconnoiter formula
<niemeyer> Woot :)
<SpamapS> niemeyer: what would be cool would be if the EAP program shared a single ensemble environment... so we could all relate all of our services to one another. ;)
<SpamapS> have one EAP large instance for mysql for everybody's mysql needs. ;)
<niemeyer> SpamapS: I'm not entirely sure other people would agree with that :-)
<SpamapS> So here's an interesting conundrum. I want all the machines that I spawn to direct their syslog service to one syslog machine... essentially a system issue, not a service issue directly...
<SpamapS> i had one thought which is that formulas can have a "management" relation defined, which will run traditional config management type stuff like this.. and then a management formula that does all these tasks
<SpamapS> the other option is to just make it easy to drop in puppet/chef/cfengine to do these types of things
<bcsaller> its not yet clear to me how that would fit into the lifecycle or why the current lifecycle can't perform those tasks
<niemeyer> SpamapS: I'm not sure I get how that's any different from the other relations?
<niemeyer> SpamapS: Why not a "syslog" relation?
<SpamapS> niemeyer: because I'd have to define it for *every* formula in principia, and it would be the exact same program
<SpamapS> ergo, it is at a different level
<SpamapS> Its basically a machine policy, not a service policy
<SpamapS> so another option is to provide a machine analog for formulas
<niemeyer> SpamapS: We have to think about it a bit more.. I think it is a service policy because syslog is a service, but I see your point regarding handling that comfortably.
<SpamapS> yeah, the receiver is a service...
<bcsaller> SpamapS: we did talk about something like that, and something else that was provider specific, but nothing is spec'd yet
<niemeyer> SpamapS: There are class of things which are related to the machine that need further thinking
<SpamapS> I think making it as narrow and flexible as possible would be ideal. The thing I recall talking about was "ensemble deploy ... --policy-formula=X" which would deploy another formula on the machine as X
<SpamapS> But I recall that the implementation of that would be fairly disruptive
<niemeyer> SpamapS: Yeah, that feels pretty close to ringing a bell
<SpamapS> Another one that might be a simpler stop-gap would be to allow specifying cloud-config data to add to the initial cloud-config
<niemeyer> SpamapS: That sounds like something hard to get out of down the road
<SpamapS> It would at least be easier to migrate away from than AMI's, which is the other way I could see people solving it
<niemeyer> SpamapS: We have to spend some time thinking about these issues, collecting a few use cases, and then sit down to design it properly and start implementing the barebones functionality
<niemeyer> bcsaller, jimbaker, hazmat: Standup?
<bcsaller> yeah
<hazmat> sounds good
<jimbaker> niemeyer, sure
<hazmat> hmm.. skype still segfaulting on me
<hazmat> hanging out in #ensemble on mumble
<niemeyer> I'm happy to go with Mumble
<_mup_> ensemble/expose-provision-service-hierarchy r252 committed by jim.baker@canonical.com
<_mup_> Test corner case that service has been removed between watch and the watch function execution
<niemeyer> Not sure for how long we'll be able to count with Skype either way
 * SpamapS deploys his munin formula..
<niemeyer> Woah
<niemeyer> SpamapS: How did it go?
<SpamapS> niemeyer: working out kinks
<SpamapS> niemeyer: its doing everything I told it to.. but munin isn't picking up my file in /etc/munin/munin-conf.d .. have to figure that out.
<SpamapS> ah! the dreaded type-o in the bash script problem :)
<niemeyer> :-)
<niemeyer> SpamapS: We need to compile bash!
 * SpamapS imagines niemeyer writing a Go parser for bash
<niemeyer> SpamapS: Oh man.. I don't want to get anywhere near that
<robbiew> lol
<_mup_> ensemble/trunk r239 committed by gustavo@niemeyer.net
<_mup_> Include examples as documentation.
<niemeyer> hazmat: Packages churning
<hazmat> niemeyer, awesome, thanks
<niemeyer> hazmat: np
<SpamapS> http://ec2-50-17-114-201.compute-1.amazonaws.com/munin/
<niemeyer> \o/
<SpamapS> It should gain mysql stats too.. qps, cache hits, etc.
<niemeyer> hazmat: pkgs are up for all Ubuntu releases
<SpamapS> hrm I think I hit a weird bug in the agent
<SpamapS> http://paste.ubuntu.com/615517/
<SpamapS> load got crazy high on the box, I think that actually may have been what caused the issue
<SpamapS> it was streaming those stack traces
<_mup_> ensemble/expose-provision-service-hierarchy r253 committed by jim.baker@canonical.com
<_mup_> Don't ignore watch_exposed_flag problem, fix it
<niemeyer> SpamapS: Yeah, that's a weird traceback
<niemeyer> SpamapS: Do you have the top of it?
<_mup_> ensemble/expose-provision-service-hierarchy r254 committed by jim.baker@canonical.com
<_mup_> Merged trunk
<SpamapS> niemeyer: should be in the agent log right?
<niemeyer> SpamapS: Yeah
<niemeyer> SpamapS: These two tracebacks are just a side effect of something else that happened earlier
<SpamapS> niemeyer: I think the agent may have crashed
<niemeyer> SpamapS: Yeah, the traceback certainly looks bad
<SpamapS> the new log is very small and from about 2 minutes before the traceback I pasted
<niemeyer> SpamapS: In a weird way.. that's twisted complaining about something strange
<SpamapS> ahh I was running debug-log when it started...
<niemeyer> SpamapS: Hmm
<SpamapS> http://paste.ubuntu.com/615520/
<SpamapS> seems like it missed the top
<niemeyer> SpamapS: There's actually some interesting info there that I missed earlier 
<niemeyer> hazmat: Would you mind to have a look at this when you have a moment?
<niemeyer> hazmat: The first traceback paste is within exists_and_watch, which is breaking due to the deferred being called twice
<hazmat> niemeyer, looking
<niemeyer> hazmat: I suspect it may have something to do with the recent refactorings 
<hazmat> interesting
<SpamapS> if you guys want to login to the box or anything let me know
<SpamapS> relations don't seem to be working to the service anymore
<hazmat> SpamapS, thanks
<niemeyer> SpamapS: That's understandable, thanks
<niemeyer> SpamapS: The watching within the agent is borked, so it'll not behave properly
<SpamapS> can it be restarted or anything?
<hazmat> niemeyer, not related to the refactoring, just a new event type i think that wasn't in the txzookeeper event mapping for pretty names, which got exercised by a log statement
<niemeyer> SpamapS: Which agent was that
<niemeyer> SpamapS: You can restart it either way, but may need some env setup
<hazmat> SpamapS, it can be restarted by hand, but its tedious you have to setup the /proc/pid/env and launch with the same cmdline
<SpamapS> ah
<SpamapS> reboot?
<niemeyer> We need to work on this
<niemeyer> hazmat: new event type?
<hazmat> niemeyer, its not a new type.. just one that wasn't previously mapped
<niemeyer> hazmat: A deferred was called twice.. that sounds like wrong wired of deferred chaining
<niemeyer> s/wired/wiring
<hazmat> niemeyer, its caused by a SESSION_EVENT
<hazmat> niemeyer, indeed that too
<niemeyer> hazmat: How do you mean?
<niemeyer> hazmat:   File "txzookeeper/client.py", line 393, in callback
<hazmat> ah.. actually no the logging is async
<hazmat> File "txzookeeper/client.py", line 79, in type_name
<hazmat>     return self.type_name_map[self.type]
<hazmat> exceptions.KeyError: -1
<niemeyer> hazmat: This is within exists_and_watch
<SpamapS> Ok well ec2-50-17-114-201.compute-1.amazonaws.com is the hostname, I added keys for hazmat and niemeyer (from launchpad)
<hazmat> ah.. there are two tracebacks posted
<SpamapS> Still haven't grabbed that lunch.. ;)
 * SpamapS runs out for it
<niemeyer> SpamapS: Enjoy, and thanks!
<hazmat> niemeyer, i was referring to the second traceback
<hazmat> just looking at the first
<hazmat> hmmm.. the tailspin might have been a recursive error logging loop
<niemeyer> hazmat: Thanks
<niemeyer> hazmat: I'll leave that with you.. feeling very sleepy right now
<hazmat> niemeyer, get some sleep.. and thanks for setting up the ppa, much nicer to demo now
<niemeyer> hazmat: Will do, and np
<niemeyer> Laters
<_mup_> txzookeeper/fix-event-type-name-mapping r36 committed by kapil.foss@gmail.com
<_mup_> add some missing events to the event type name mapping
<hazmat> hmm.. this is a bug in the ensemble watch usage.. http://zookeeper-user.578899.n2.nabble.com/watcher-semantics-for-session-events-in-the-C-client-td6206081.html
<hazmat> not problematic for this particular case, which is fixed by the above commit
<hazmat> hmm. i guess it's not really an issue since all of our watches refetch current state
<hazmat> but it is an additional event firing which they mostly don't account for as spurious to their watch intent
<hazmat> niemeyer, jimbaker, bcsaller ^
<hazmat> so given our usage its fine, given that we check state
<hazmat> hmm.. actually we don't do it correctly, since in this case we also reset the watch
<hazmat> hmmm
<hazmat> it looks like we ran into this already.. http://comments.gmane.org/gmane.comp.java.hadoop.zookeeper.user/1951
<hazmat> but we never addressed it in our usage afaics
<hazmat> hmmm.. this feels like something we should set on the client
<hazmat> bcsaller, jimbaker could i get a +1 on this trivial.. http://paste.ubuntu.com/615542/
<bcsaller> hazmat: that's just mapping constants?
<hazmat> bcsaller, yup
<bcsaller> +1
<_mup_> txzookeeper/trunk r38 committed by kapil.foss@gmail.com
<_mup_> Add some missing constants to the client event type mapping. [trivial][r=bcsaller]
<SpamapS> hazmat: any progress on that problem? I am going to tear down the box if you don't need it
<hazmat> SpamapS, i committed a fix for the immediate cause (that change to txzookeeper), i'm still trying to figure out if we're handling the event that was received properly or not (the error was from printing the event), it looks like that machine got disconnected from zk for a little bit
<hazmat> SpamapS, feel free to tear down the machine
<SpamapS> hazmat: thanks :)
#ubuntu-ensemble 2011-06-01
<SpamapS> 2011-05-31 16:17:46,019 ERROR Formula %r is the latest revision known
<SpamapS> DOH
<SpamapS> Heh.. munin really is too much for a t1.micro. :-/
<SpamapS> http://ec2-50-16-148-122.compute-1.amazonaws.com/munin/
<SpamapS> mmmmmm... graphs
<SpamapS> hazmat: hit the problem again. :(
<bcsaller> SpamapS: nice graphs :)
<SpamapS> bcsaller: I'm besieging the cluster right now.... about to get more interesting! :)
<bcsaller> press record :)
<SpamapS> top - 00:00:59 up  6:56,  1 user,  load average: 12.00, 4.46, 1.65
<SpamapS> 2011-05-31 16:59:30,485 provision:ec2: ensemble.agents.provision INFO: Starting machine id:11 ...
<SpamapS> bcsaller: I'm thinking of writing a "test monster" formula that will crawl and try to destroy a website.. :)
<bcsaller> terrifying 
<SpamapS> blammo.. load drops to 4 new box up
<SpamapS> I also wonder if my t1.micro's are getting throttled
<SpamapS> might be interesting to reboot a few as c1.medium 
<SpamapS> hmm. I really should be hitting the 10.x address so I can just leave this running all night. ;)
<hazmat> SpamapS, do you have the relation hook installing munin-node?
<SpamapS> hazmat: yes
<SpamapS> hazmat: will have to discuss later, super late heading out the door.. :-P
<SpamapS> hazmat: any tips though, let me know I'll give it a shot
<hazmat> SpamapS, me too.. cheers
<hazmat> SpamapS, i have to rebuild the txzk packages to get that fix pushed out.. i'll have a look at it in the morning
<_mup_> ensemble/expose-provision-service-hierarchy r255 committed by jim.baker@canonical.com
<_mup_> Updated watch_exposed_flag (about to be moved into separate branch)
<_mup_> ensemble/expose-watch-exposed-flag r240 committed by jim.baker@canonical.com
<_mup_> Changes to watch_exposed_flag moved from expose-provision-service-hierarchy
<_mup_> ensemble/expose-provision-service-hierarchy r256 committed by jim.baker@canonical.com
<_mup_> Removed watch_exposed_flag changes in this branch
<_mup_> ensemble/expose-provision-service-hierarchy r257 committed by jim.baker@canonical.com
<_mup_> Merged in expose-watch-exposed-flag branch
<_mup_> ensemble/expose-provision-service-hierarchy r258 committed by jim.baker@canonical.com
<_mup_> Nonexistent service terminates corresponding watch on exposed flag
<_mup_> ensemble/expose-provision-service-hierarchy r259 committed by jim.baker@canonical.com
<_mup_> PEP8, removed unnecessary debug statements
<_mup_> ensemble/expose-provision-service-hierarchy r260 committed by jim.baker@canonical.com
<_mup_> Better docstrings + cleanup
<_mup_> Bug #791035 was filed: removed formula components are not cleaned up when upgrading <Ensemble:New> < https://launchpad.net/bugs/791035 >
<_mup_> Bug #791042 was filed: *-relation-broken has no way to identify which remote service is being broken <Ensemble:New> < https://launchpad.net/bugs/791042 >
<kim0> ERROR Formula %r is the latest revision known
<kim0> Is %r supposed to be substituted by something 
<kim0> anyone around today
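A note on the literal %r above: one common way this kind of output arises in Python is a log call whose format string has a placeholder but no argument, since the logging module only applies %-formatting when arguments are passed. The sketch below is an illustration only, not the actual Ensemble code; the logger name and the formula name "mysql" are made up.

```python
import io
import logging

# Hypothetical logger and formula name, for illustration only.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))
log = logging.getLogger("formula-upgrade-demo")
log.addHandler(handler)
log.setLevel(logging.INFO)
log.propagate = False

# No argument supplied: the placeholder is left untouched in the output.
log.error("Formula %r is the latest revision known")
# Argument supplied: %r is replaced by repr() of the argument.
log.error("Formula %r is the latest revision known", "mysql")

lines = stream.getvalue().splitlines()
```

If that is what happened here, the first call reproduces the message kim0 and SpamapS saw, and the second shows what the message was presumably meant to look like.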
<niemeyer> Good morning all!
<kim0> Morning :)
<niemeyer> kim0: Hey!
 * kim0 trying to write a new formula
<kim0> hmm .. in debug-hooks, # relation-list --format json
<kim0> why am I getting   No ENSEMBLE_AGENT_SOCKET/-s option found
<hazmat> kim0, are you in the window that popped up for the hook or just the default terminal?
<hazmat> re debug
<kim0> the one that popped
<hazmat> kim0, if you do ENV | grep ENSEMBLE what do you see?
<hazmat> er. env
<kim0> sorry, I had closed the window .. perhaps I shouldn't have
<hazmat> kim0, no worries, if it happens again let me know, i'd be happy to take a look
<kim0> hazmat: does debug-hooks execute the hook + give me shell, or just shell
<hazmat> kim0, hook + shell
<hazmat> the interactive shell replaces the hook execution
<hazmat> the exit of the shell is considered the exit of the hook
<kim0> so the original hook script is NOT executed
<hazmat> kim0, correct
<niemeyer> hazmat: Good morning
 * hazmat is on his 3rd unity reboot of the last 24hrs.
<hazmat> niemeyer, g'morning
<niemeyer> hazmat: From the logs, it looks like SpamapS got the same issue again last night
<hazmat> niemeyer, he did, i did a fix for the top level symptom which is committed on txzookeeper trunk
<hazmat> niemeyer, i'm still tracking down in the zookeeper client where/why the event occurs, its a session event
<hazmat> so it hits all the watchers extant
<niemeyer> hazmat: I'm not sure that really fixes it.. right.. it's a session timeout, so it'll continue breaking
<hazmat> niemeyer, indeed
<niemeyer> hazmat: Btw, the quotes were inconsistent, but that's minor
<hazmat> niemeyer, i'm wondering if we need some lower level bookkeeping in txzk client to be able to attach a handler for session events, what's unclear to me atm is if its a session expired scenario or not ( ie. recreate all watches and ephemerals)
<niemeyer> hazmat: Is the session issue happening before or after the already called error?
<hazmat> before
<niemeyer> hazmat: SpamapS said something about a micro machine being too small for whatever he was doing.. it may just be locked up completely for so long that it times out
<hazmat> niemeyer, yeah.. i think we need to up our session timeout values on the zk server, there's a faq on this for ec2
<hazmat> and the latencies inherent in that environment
<niemeyer> hazmat: Sounds good, if he's able to reproduce it easily, it would be good to try again under more controlled conditions
<hazmat> but that doesn't excuse that we need some handling for this behavior
<niemeyer> hazmat: Indeed
<niemeyer> hazmat: In gozk, I redirect session events to a specific channel
<hazmat> niemeyer, yeah.. i'm going to try and reproduce via manipulation of the ec2 firewall on the bootstrap node
<niemeyer> hazmat: and as an experiment, I'm panicking the application if the session event isn't handled in a given amount of time
<hazmat> niemeyer, yeah.. that's basically along the lines of what i was thinking for txzk
<niemeyer> hazmat: IOW, if the app isn't acknowledging the fact a session event happened
<hazmat> use a dedicated session event handler, and route those events there.
<hazmat> we'd also need to track the outstanding watches if we want to reduce the redundancy from the zk c client
<niemeyer> hazmat: Either way, we'll likely have to kill all the running callbacks
<hazmat> which notifies on all watchers
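The idea discussed above (a dedicated session event handler, with session events routed there instead of to every outstanding watch) can be sketched roughly as follows. This is a toy model under stated assumptions, not txzookeeper or gozk code: the WatchDispatcher class, its methods, and the event-type names are all hypothetical.

```python
from collections import defaultdict

# Hypothetical event-type names; real ZooKeeper clients use numeric constants.
SESSION_EVENT = "session"
NODE_CHANGED = "changed"


class WatchDispatcher:
    """Toy dispatcher: node events go to per-path watchers, session
    events go to one dedicated handler instead of every watcher."""

    def __init__(self, session_handler):
        self._session_handler = session_handler
        self._watchers = defaultdict(list)

    def add_watch(self, path, callback):
        self._watchers[path].append(callback)

    def dispatch(self, event_type, path=None, state=None):
        if event_type == SESSION_EVENT:
            # Watches stay registered; only the session handler is told.
            self._session_handler(state)
            return
        # Node watches are one-shot: remove them before firing.
        for callback in self._watchers.pop(path, []):
            callback(event_type, path)


# Usage: a session event reaches only the session handler.
session_states = []
node_events = []
dispatcher = WatchDispatcher(session_states.append)
dispatcher.add_watch("/services/demo", lambda t, p: node_events.append((t, p)))
dispatcher.dispatch(SESSION_EVENT, state="connecting")
dispatcher.dispatch(NODE_CHANGED, path="/services/demo")
```

Keeping the session handler separate means a watcher written for a node change never has to interpret a session event, which is the spurious firing described earlier in the log.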
<kim0> hmm, why does relation-list only result in  ["mysql/0"]  (no database, user, password ) ..etc This is in db-relation-changed second invocation
<niemeyer> kim0: That's the relation name, not really what you want
<niemeyer> kim0: Check relation-get
<hazmat> niemeyer, it depends.. does it mean the session is expired.. or not
<niemeyer> hazmat: I think the only thing coming through those channels is up/down notes
<niemeyer> hazmat: But it'd be worth double checking
<kim0> relation-get - --format json  results in {}
<niemeyer> kim0: That means the other side hasn't relation-set anything
<kim0> facing this consistently for the past hour .. I used remove-relation and add-relation many times
<kim0> the other side is mysql from examples
<niemeyer> kim0: Hmm
<kim0> should I login and check if a DB has been created for my instance (drupal) ?
<hazmat> kim0, if you're debugging both sides interactively
<kim0> just one side
<hazmat> the values aren't flushed for the remote to see till the hook/debug is ended
<hazmat> er. the debug window for the hook is ended
<kim0> hazmat: I'm not debugging mysql side, so that shouldn't be a problem correct
<niemeyer> kim0: Kind of..
<niemeyer> kim0: Try logging out and in again, just to try it out
<niemeyer> kim0: From the debug-hook session
<hazmat> kim0, are you in the relation join event or the changed event?
<kim0> hazmat: in "changed" in its 2nd invocation
<hazmat> kim0, hmm. that sounds like the right place then
<kim0> yeah and I've waited for mysql to create its stuff
<niemeyer> kim0: Try logging out and in
<niemeyer> kim0: From the debug session
<kim0> ok
<niemeyer> kim0: and then do the relation-get
<niemeyer> Oh, I think there's a bug, now that you mention it..
<kim0> niemeyer: so I closed the screen window
<kim0> opening it again
<niemeyer> kim0: Yeah, I suspect it won't work
<niemeyer> kim0: Our example is doing some magic it shouldn't
<kim0> niemeyer: so I should kill mysql and start a new one ?
<niemeyer> kim0: Follow with me so that you can actually fix it and try again
<kim0> ok
<niemeyer> kim0: Go to the mysql formula, and look at the joined hook
<kim0> opened
<niemeyer> kim0: You see that check under the comment # Determine if ...
<kim0> yeah
<niemeyer> kim0: You see, it's bumping out of the hook if the _service_ already has a database created
<kim0> which makes sense to me ?!
<kim0> what's the problem
<kim0> ah
<niemeyer> kim0: The problem is that it bumps out without ever letting the _relation_ to know of the settings
<kim0> so it's not relation-set'ing
<niemeyer> kim0: yep
<kim0> got it
<kim0> doh
<kim0> but it must have relation-set'ed the first time
<kim0> is that lost ?
<niemeyer> kim0: If you 
<niemeyer> kim0: Lost?
<kim0> why is the relation-set from first time not available ? 
<niemeyer> kim0: Oh, yeah, it seems to be
<kim0> ok
<hazmat> if the service name is reused
<kim0> I've just been remove-relation and add-relation
<hazmat> ie relate wordpress mysql.. destroy wordpress, recreate wordpress and relate
<niemeyer> hazmat: Ugh.. that's bad
<hazmat> yeah.. i'm not sure this was a problem before with the mysql formula
<niemeyer> hazmat: The service name is not being "reused".. the service is still around
<hazmat> kim0, are you using the example or principia?
<kim0> hazmat: example
<niemeyer> hazmat: It's just another relation
<hazmat> niemeyer, yes.. its another relation, but the association to the same related service name in the db is still present
<niemeyer> hazmat: Yes, but we shouldn't destroy the database like that no matter what
<hazmat> i think the principia (based on the original python) doesn't have this issue
<hazmat> yeah.. it doesn't
<hazmat> this is a bug in the mysql formula
<hazmat> in the examples directory
<hazmat> the original python one would just skip the db creation, but still set the credentials.. the shell script one exits the hook if the db is already present without setting credentials
<hazmat> iotw. the principia mysql formula should work fine here
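The difference hazmat describes between the two mysql hooks can be modelled in a few lines. This is a toy simulation only: plain dicts stand in for the MySQL server and the relation settings, and every function and variable name is hypothetical, not taken from either formula.

```python
# Toy model of the joined hook; every name here is hypothetical.

def joined_shell_version(databases, relation, service):
    """Behaviour described for the example formula: exit early if the
    database exists, so a new relation never sees any settings."""
    if service in databases:
        return  # early exit, relation settings stay empty
    databases[service] = {"user": service, "password": "secret"}
    relation.update(database=service, **databases[service])


def joined_original_version(databases, relation, service):
    """Behaviour described for the original Python hook: skip creation
    if the database exists, but always publish the credentials."""
    if service not in databases:
        databases[service] = {"user": service, "password": "secret"}
    relation.update(database=service, **databases[service])


dbs = {}
first, second, fixed = {}, {}, {}
joined_shell_version(dbs, first, "drupal")      # first add-relation
joined_shell_version(dbs, second, "drupal")     # after remove-relation + add-relation
joined_original_version(dbs, fixed, "drupal")   # same situation, original logic
```

In this model `second` stays empty, which matches the `{}` kim0 got back from relation-get, while `fixed` carries the credentials again.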
<kim0> ok I'll destroy everything and start again
<niemeyer> kim0: Thanks for bringing this up, we'll have to fix the example somehow
<niemeyer> hazmat: This is another use case for having a name for the actual relation
<niemeyer> hazmat: Or, an identifier
<hazmat> niemeyer, you mean a generated name?
<niemeyer> hazmat: The identifier we talked about before
<niemeyer> hazmat: relation-N
<hazmat> right.. to prevent anon relations
<hazmat> and make them addressable
<niemeyer> hazmat: Yeah.. this would be the right thing to use in this case
<niemeyer> rather than the service name
<niemeyer> then the logic would be right
<niemeyer> "If a database name with the relation name already exists, stop, otherwise create it and set the info"
<niemeyer> But, one thing at a time
<niemeyer> hazmat: What's your plan for the session issue?
<hazmat> debatable, we've allowed service names to be reusable (as opposed to unit names), so the question is are there scenarios where a service would want to pickup a previously set up database (a/b upgrades perhaps).. fwiw. the logic was right before it was rewritten.. given that we do need relation names to make them addressable in other context that seems like a good choice
<hazmat> niemeyer, setup a test environment and manipulate the firewall and zk server to reproduce
<hazmat> might need to send an email to the list as well, i've seen some contradictory information in regards to this
<hazmat> niemeyer, have a look at http://zookeeper-user.578899.n2.nabble.com/watcher-semantics-for-session-events-in-the-C-client-td6206081.html
<niemeyer> hazmat: The right thing wouldn't be to give the same database for all relations
<niemeyer> hazmat: So the original one wasn't proper either
<hazmat> the watches aren't freed after the session event by the client
<hazmat> niemeyer, working vs. proper
<niemeyer> hazmat: ?
<hazmat> niemeyer, it worked in this exact scenario
<hazmat> you can add/remove relations all day with the original, and the remote service will get the correct credentials and events
<niemeyer> hazmat: Ok, there are several things we might do that would work
<niemeyer> hazmat: I'm trying to figure how we should actually solve this issue
<hazmat> niemeyer, which issue session or relation?
<niemeyer> hazmat: Both :)
<hazmat> well that clarifies ;-)
<niemeyer> hazmat: So.. for relation I'll file a bug
<niemeyer> hazmat: Let's put it aside for now
<hazmat> niemeyer, sounds good
<niemeyer> hazmat: Interesting indeed
<niemeyer> hazmat: (the link)
<hazmat> niemeyer, so the ppa will auto update with the new package for txzk since there was a merge to trunk?
<hazmat> oh.. i need to increment the version i think
<niemeyer> hazmat: It will, but it will be daily
<niemeyer> hazmat: No
<hazmat> ah.. its got the revno on it
<niemeyer> hazmat: It will use the revision from the repo
<hazmat> niemeyer, can those builds be triggered by hand?
<hazmat> yup.. i see it now.. okay requested a rebuild of txzk
<niemeyer> hazmat: When you want to bump up due to a change you've done recently, go to the recipe page, and click on the button
<hazmat> wow.. the build wait time is much better now
<hazmat> 4-8m 
<hazmat> is much better than 12-36hrs
<niemeyer> hazmat: Btw, it's good to check if your revision isn't already built
<niemeyer> hazmat: It was, in this case
<niemeyer> txzookeeper - 0.2.1-0ensemble38~oneiric1
<niemeyer> hazmat: That 38 is the revno, which is the tip atm
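To check whether a given revision is already built, the revno can be read out of the package version string. A small sketch, assuming the version always follows the shape of the single example above; the helper name and the pattern are guesses, not part of any packaging tool.

```python
import re


def revno_from_version(version):
    """Pull the revno out of a version like '0.2.1-0ensemble38~oneiric1'.
    The pattern is an assumption based on that single example."""
    match = re.search(r"ensemble(\d+)~", version)
    return int(match.group(1)) if match else None


revno = revno_from_version("0.2.1-0ensemble38~oneiric1")
```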
<hazmat> niemeyer, ah good point
<niemeyer> hazmat: Don't use "Request build", in general
<niemeyer> hazmat: When there is a revno unprocessed, there will be a button at the top, right below "Build schedule"
<hazmat> ic
<niemeyer> hazmat: https://bugs.launchpad.net/ensemble/+bug/791370
<_mup_> Bug #791370: We need a relation identifier for hooks <Ensemble:New> < https://launchpad.net/bugs/791370 >
<_mup_> Bug #791370 was filed: We need a relation identifier for hooks <Ensemble:New> < https://launchpad.net/bugs/791370 >
<niemeyer> We should address this sooner rather than later
<hazmat> niemeyer, more immediately the formula should probably be fixed as well
<niemeyer> hazmat: It depends a bit on how long we take to get to it
<niemeyer> hazmat: E.g. if Ben is available today, it might be a good brain-break task
<niemeyer> I've assigned the bug to him
<niemeyer> Hah, just in time for bcsaller to be unable to say no!
<bcsaller> oh no
<hazmat> niemeyer, ;-) here's the diff https://pastebin.canonical.com/48037/
<bcsaller> what did I walk into?
<hazmat> bcsaller,  a bug in the example mysql formula
<niemeyer> bcsaller: https://launchpad.net/bugs/791370
<_mup_> Bug #791370: We need a relation identifier for hooks <Ensemble:Confirmed for bcsaller> < https://launchpad.net/bugs/791370 >
<niemeyer> hazmat: How does that solve anything?
<hazmat> bcsaller, and more generically identifiers for relations
<hazmat> niemeyer, it allows the formula to work if add/remove relation is used or the service is destroyed and recreated
<niemeyer> hazmat: How so?
<hazmat> oh.. i need to remove the exit 0 as well
<hazmat> niemeyer, it will setup credentials for the service and set them on the relation always this way
<niemeyer> hazmat: Ahhhh, yeah, if you fix it, maybe ;-)
<hazmat> niemeyer, can i get a +1  on the updated trivial https://pastebin.canonical.com/48038/
<niemeyer> hazmat: So it'll use a different user per relation?
<hazmat> niemeyer, yes
<niemeyer> hazmat: Cool
<hazmat> hmmm.. i wonder what the original did here
<niemeyer> hazmat: +1 if you test it :-)
<hazmat> this doesn't seem right either
<hazmat> it shouldn't be resetting the identity every time on join
<hazmat> it should be checking for the value on the relation
<niemeyer> Yeah, indeed
<hazmat> niemeyer, oddly though the original (in principia) did this.. and it worked
<hazmat> for scaling mediawiki
<hazmat> but i take that as some mysql aberration
<niemeyer> hazmat: Apparently we didn't test non-trivial scenarios, though
<niemeyer> hazmat: Well, or maybe you did with the old one
<hazmat> i did with the old one.. but i still wonder about this 
<_mup_> Bug #791382 was filed: Session events are not being handled properly <Ensemble:Confirmed for hazmat> < https://launchpad.net/bugs/791382 >
<niemeyer> Lunch time, biab.
<SpamapS> Hmm..
<SpamapS> so scrolling back a bit..
<SpamapS> the principia formula right now uses a flag file to know whether or not the relationship has been broken..
<SpamapS> If its been broken, then it creates a new user
<SpamapS> lp:~ensemble-composers/principia/oneiric/mysql/trunk btw
<SpamapS> hazmat: re the munin server, I had it happen yet again after a full shutdown/bootstrap.. and I tracked it down to heavy network + CPU. When munin was doing its updates and I would add or remove relations, the problem would happen.
<SpamapS> hazmat: so I think your hunch is correct that we probably need to think about longer timeouts for ZK
<_mup_> ensemble/expose-provision-service-hierarchy r261 committed by jim.baker@canonical.com
<_mup_> Tests verifying all service units are 'checked' upon destroying a service or unexposing it
<_mup_> ensemble/expose-provision-service-hierarchy r262 committed by jim.baker@canonical.com
<_mup_> Doc strings
<_mup_> ensemble/expose-provision-service-hierarchy r263 committed by jim.baker@canonical.com
<_mup_> Removed debug logging output
<niemeyer> Yo
<SpamapS> hrm.. augeas lenses have murky copyrights
 * SpamapS is starting to develop copyright policies for formulas
<kim0> yuck
<SpamapS> Yeah
<niemeyer> SpamapS: What, seriously?
<niemeyer> SpamapS: What's the copyright like?
<SpamapS> I'm surprised that augeas was allowed into Debian w/ such an inaccurate copyright file
<SpamapS> the copyright file claims that it is Copyright 2007, 2008 Red Hat Inc.
<SpamapS> but that copyright is only in the man pages and tests
<SpamapS> most of it is clearly Copyright 2007-2010 David Lutterkort
<SpamapS> And the lenses have almost no copyrights specified, and there is no blanket copyright specification
<SpamapS> the *license* is very clearly LGPL-2.1
<SpamapS> but most people seem to get Copyright wrong.
<kim0> Does the released column on https://ensemble.ubuntu.com/kanban/dublin.html refer to released in past week ?
<hazmat> SpamapS, it doesn't need to check a flag file, the relation is different if its broken, just checking for the value should do the trick
<hazmat> kim0, roughly.. everything since the last milestone, its cumulative. the milestone started roughly right after uds
<SpamapS> hazmat: well my point was simply that the database name should remain the same, as we want to preserve the data most of the time.
<hazmat> SpamapS, ah.. indeed
 * kim0 rings a little shiny bell
<SpamapS> I actually think even better would simply be to check for the existence of the user.. since I'm also revoking the user perms on broken
<kim0> Let's have our irc meeting in 1:15 hours
<kim0> Things to talk about 
<kim0> - Creating the principia distribution and principia-tools project
<kim0> - The Ensemble ppa being up
<kim0> - Plus any recent development done
<SpamapS> kim0: I'm going to miss it unfortunately.. have to take child to doctor. But next time!
<kim0> oh it would have been nice to have you on this one .. 
<kim0> no problem though
<SpamapS> Yes.. definitely.  :(
<SpamapS> just found out about it about 15 min ago
<kim0> SpamapS: all the best to our little sick friend
<SpamapS> He's fine.. its just not possible to convince his mother of that without a PhD. ;)
<kim0> HAHA
<hazmat> SpamapS, :-)
<kim0> it's amazing women are women everywhere I suppose
<hazmat> perhaps more specifically.. the nature of a mom to be a mom
<hazmat> i wish it were true
<kim0> yeah, they're hard wired to care for children more than others can imagine .. 
<niemeyer> SpamapS: Ah, that should be fine then
<niemeyer> SpamapS: We care mostly about license details, rather than copyright
<niemeyer> SpamapS: For using Augeas, that is
<niemeyer> SpamapS: +1 on having that cleared up for formulas
<niemeyer> bcsaller: Have an interview now, will push your review to completion after that
<niemeyer> sidnei_!
<sidnei_> yo
<sidnei_> so anyone wrote an ensemble formula for plone yet? :)
<hazmat> sidnei_, not yet ;-)
<hazmat> sidnei_, i was thinking it would be a zodb server formula, plone app server formula, varnish formula
<sidnei> hazmat, yeah, that's what i thought too
<sidnei> just wondering if i can get off the business of building the installer for windows and provide ensemble formulas instead :)
<niemeyer> That'd be *awesome*
<hazmat> the varnish is a little weird, in that its probably passing VCL directly via the service relation
<hazmat> for more advanced config.. and namespacing across multiple domains and different apps might be a little strange with varnish and service relations.. as its config is effectively code
<hazmat> but simple stuff should just work ootb
<hazmat> sidnei, did you see enfold's new  plone hosting system?
<sidnei> hazmat, yup, got early access and all :)
<hazmat> its insanely fast to setup a new plone with their stuff, i still haven't figured out how they did that
<hazmat> its way faster than the process start time
<sidnei> hazmat, i don't think they're setting up a new process, but i could be wrong. possibly shared instance but separate zodb?
<hazmat> sidnei, that makes sense, although there's some cache thrashing/mem ballooning possible with something like that
<sidnei> hazmat, indeed.
<sidnei> could ask alan but he's not online right now
<hazmat> sidnei, he told me secret sauce last i asked ;-)
<sidnei> haha
 * hazmat switches into concurrent mode
<_mup_> ensemble/expose-hook-commands r240 committed by jim.baker@canonical.com
<_mup_> Hook skeleton for open-port, close-port commands
<kim0> koolhead17: hey
<kim0> there ?
<koolhead17> kim0: hello
<kim0> hey :)
<kim0> koolhead17: can u join the online chat in 2 mins
<kim0> and tell us what you've been up to
<koolhead17> yes sure. am very much here.
<kim0> cool
<kim0> niemeyer: hazmat sidnei jimbaker bcsaller ready for some irc talk
<hazmat> niemeyer, that problem that SpamapS had seems pretty easy to reproduce.. start client with watch against local zk, restart zk, boom.
<kim0> let's hit #ubuntu-cloud
<niemeyer> hazmat: Sweet!
<hazmat> kim0, rock on
<niemeyer> hazmat: Well, "sweet!" :)
<koolhead17> cool
<niemeyer> COMMUNITY WEEKLY MEETING ROLLING ON #ubuntu-cloud
<niemeyer> ^^^
<niemeyer> :-))
<niemeyer> Ok, standup?
<niemeyer> Let's try to do a "real" one.. I have to stop talking and review some code :)
<koolhead17> haha
<jimbaker> niemeyer, sounds good
<niemeyer> hazmat, bcsaller: standup?
<koolhead17> obino: jimbaker hello guys
<bcsaller> niemeyer: mumble
<hazmat> niemeyer, we're all hanging in mumble
<jimbaker> koolhead17, hi
<_mup_> Bug #791501 was filed: Ensemble images should use the ppa <Ensemble:New> < https://launchpad.net/bugs/791501 >
<hazmat> niemeyer, on the call you were saying we shouldn't divert all session events to a separate channel if the session is still alive?
<niemeyer> hazmat: The opposite
<hazmat> okay.. cool, that's what i thought
<niemeyer> hazmat: We shouldn't divert the event if the session is really dead
<hazmat> or.. i should say.. yeah.. that makes more sense
<niemeyer> hazmat: Since we have to crash whoever is waiting on the watch
<hazmat> niemeyer, yeah.. in the case of disconnect we can still divert if the handler is a kill switch
<niemeyer> hazmat: Btw, it's good to double check what I said in the meeting re. the behavior of sessions.  That link you posted has someone saying the opposite, but I don't recall this being the case.
<hazmat> as we want to crash fast in that case, not go through every watcher callback failing
<hazmat> niemeyer, yeah.. i'm running some tests now
<niemeyer> hazmat: I recall a specific feature announcement saying that "now reestablishment of connections will preserve watches"
<hazmat> if connecting to a different server?
<niemeyer> hazmat: But this guy's email is from March.. so it's strange
<niemeyer> hazmat: Yeah, given the semantics of zk, switching server is really not a big deal
<niemeyer> hazmat:
<niemeyer> "If the client connects to a different ZooKeeper server, it will send the session id as a part of the connection handshake.
<niemeyer> "
<niemeyer> http://zookeeper.apache.org/doc/r3.1.2/zookeeperProgrammers.html
<hazmat> i wonder how vmware with activestate stealing the thunder of cloudfoundry
<hazmat> ^feels
<niemeyer> "Another parameter to the ZooKeeper session establishment call is the default watcher. Watchers are notified when any state change occurs in the client. For example if the client loses connectivity to the server the client will be notified, or if the client's session expires, etc..."
<niemeyer> hazmat: The distinction between precisely those two examples is what we were talking about
<hazmat> niemeyer, from the same page "If you are watching for a znode to come into existance, you will miss the event if the znode is created and deleted while you are disconnected."
<hazmat> i guess in that particular case its moot
<niemeyer> hazmat: Yeah, sure, but we don't care .. righ
<niemeyer> t
<hazmat> hmm
<niemeyer> hazmat: re. cloudfoundry, I don't see it as stealing.  I think vmware is probably delighted.
<hazmat> its a separate private cloud foundry installation hawking closed source features 
<niemeyer> hazmat: Yeah, but it's still announced as Cloud Foundry..  much better for them than anything else ActiveState pushed.
<bcsaller> niemeyer: I think we are missing a high level architecture doc. I got asked for something and realized we don't really have anything we can point people at that want to eval the overall system design. Unless you know of something I'm not seeing?
<hazmat> niemeyer, another useful doc http://outerthought.org/blog/435-ot.html
<niemeyer> bcsaller: Indeed
<niemeyer> bcsaller: I started such a doc in the old specifications section in the wiki
<niemeyer> bcsaller: But it's poor and unfinished
<niemeyer> bcsaller: Funny enough, I think the best we'd have today is our glossary
<bcsaller> maybe I'll take a stab at it a little later today
<niemeyer> bcsaller: Sweet!
<niemeyer> bcsaller: https://wiki.canonical.com/Ensemble/Specifications/0010
<bcsaller> thanks
<niemeyer> bcsaller: Not sure there's anything useful there
<niemeyer> bcsaller: It hasn't been reviewed this century
<bcsaller> understood :)
<_mup_> ensemble/expose-hook-commands r241 committed by jim.baker@canonical.com
<_mup_> Completed wiring up hook commands for open-port, close-port
<hazmat> looking at some of the callback oriented test code of txzookeeper is like reassembling a puzzle.
<jimbaker> hazmat, indeed, inlineCallbacks do make for a better experience. in talking to guido van rossum at the amazon python day event, he clarified his position on async code - it was the callback oriented code that made his head hurt, not async code per se
<bcsaller> niemeyer: thanks for the review
<niemeyer> bcsaller: No problem, excited to see this close to completion
#ubuntu-ensemble 2011-06-02
<niemeyer> Night all
<_mup_> txzookeeper/session-event-handling r40 committed by kapil.foss@gmail.com
<_mup_> allow connection using existing session, test session expiration, additional symbol name translation for exceptions.
<_mup_> txzookeeper/session-event-handling r41 committed by kapil.foss@gmail.com
<_mup_> pep8isms
<kim0> do I really need to type "yes" to the ssh authenticity question
<niemeyer> kim0: How do you mean?
<kim0> ensemble status
<kim0> I get the ssh yes/no prompt
<kim0> same for debug-hooks
<kim0> niemeyer: that is normal right ?
<niemeyer> kim0: You mean the prompt asking you if the fingerprint for the server is valid?
<kim0> yes
<niemeyer> kim0: Yeah, that's usual when connecting to a new server
<kim0> Would be great if Ensemble would get the machine log .. and verify it for me 
 * kim0 grabs his wish list bag
<kim0> Also if ensemble had a persistent connection to bootstrap node :) and perhaps run locally under a screen session with "watch status" and debug-log ..etc all running
<kim0> niemeyer: I think I am seeing strange behaviour which I hope someone can help me with since I really want to write this "write a formula doc". I just launched a mysql SU, and a drupal SU (based on an almost empty new formula)
<kim0> fired debug-hooks drupal/0 .. works
<kim0> add-relation drupal mysql
<kim0> I am not getting any new windows in the debug-hooks screen
<niemeyer> Hmm
<niemeyer> kim0: Thinking
<kim0> sure
<kim0> I think I saw this yesterday
<kim0> when I closed the debug-hooks screen ..
<kim0> hooks suddenly started firing
<kim0> it's like it was stuck 
<niemeyer> kim0: That's normal
<kim0> but I always blame myself :)
<niemeyer> kim0: Hooks are serially executed
<kim0> well they were not opening new windows in screen session
<kim0> they should right ?
<niemeyer> kim0: You won't get another hook window until you stop the existing one
<kim0> there was no existing one .. I was waiting for it
<kim0> just like now .. there's only window 0 in screen
<niemeyer> kim0: and what's 0?
<hazmat> kim0, it was disabled (the ssh fingerprint confirm prompt)
<kim0> niemeyer: just a shell
<hazmat> but it leaves things open to man in the middle
<hazmat> we should pull it down though automatically 
<niemeyer> hazmat: Do you have any ideas of what might be going on for kim0?
<kim0> hazmat: yeah .. check my wish list :) we could ec2-get-console-output and verify it :)
<niemeyer> kim0: We can do better than that
<niemeyer> kim0: We should inject the host key
<niemeyer> kim0: That's in our wishlist already :)
<hazmat> niemeyer, man in the middle was the primary reason fingerprint checking was re-enabled yes?
 * hazmat reads through log
<niemeyer> hazmat: That's right
<kim0> niemeyer: cool !
<kim0> niemeyer: cloud-init can inject host key already indeed .. that's even better
<hazmat> indeed, we should probably make use of that, but we need to store in zk for multi-client access
<hazmat> kim0, okay.. so you've got a debug hook session on drupal or mysql?
<hazmat> when doing the add relation
<kim0> drupal
<kim0> hazmat: debug-hooks drupal/0
<kim0> hazmat: add-relation drupal mysql
<kim0> that's it .. no new window in screen
<hazmat> kim0, okay.. so you do debugs for  install & start? 
<hazmat> or are you debugging after start?
<kim0> the sequence was
<kim0> deploy mysql
<kim0> deploy drupal
<kim0> debug-hooks drupal/0
<kim0> debug-log
<kim0> add-relation mysql drupal
<hazmat> kim0, could you paste your debug-log
<kim0> I got a "install" or "start" hook here can't remember .. which I closed
<kim0> I expected to get the db-relation-changed one after it .. but didn't
<kim0> sure
<hazmat> kim0, by closing are you exiting the shell or just closing the window?
<hazmat> hmm
<hazmat> i don't think i've tested closing the window instead of exiting the shell
<kim0> hazmat: ctrl + d
<kim0> exit shell
<kim0> hazmat: log http://paste.ubuntu.com/616692/
<hazmat> hmm that should be fine
<niemeyer> hazmat: Should be equivalent
<kim0> hazmat: status: http://paste.ubuntu.com/616694/
<hazmat> niemeyer, yeah.. but on the close window case, there is still a callback to screen to close the window after the process exit
<hazmat> but ctrl +d vs. exit is equiv
<niemeyer> hazmat: I don't understand that distinction
<hazmat> kim0, odd it seems like the unit hasn't picked up the relation
<kim0> :s
<kim0> can u connect to the env ?
<niemeyer> hazmat: The callback will execute after the shell process exits, right?  Either option would kill it
<niemeyer> hazmat: IOW, closing the window also terminates the shell
<niemeyer> hazmat: This would happen if the hook wasn't executed
<hazmat> niemeyer, right, but we have  another process checking on the shell and then instructing screen to kill the window, which is probably just a noop at that point
<hazmat> niemeyer, its unrelated to what kim0 is seeing
<niemeyer> hazmat: I mean, the relation not showing up
<niemeyer> kim0: Can you please paste ps auxw for that machine?
<niemeyer> kim0: The drupal one
<kim0> niemeyer: from the debug-hooks screen is ok right ?
<niemeyer> hazmat: Hmm.. unless we're running the shell script with -e, and screen exits with 1 because the window wasn't there?
 * niemeyer doing guess work
<niemeyer> kim0: Yeah
<kim0> http://paste.ubuntu.com/616696/
<niemeyer> hazmat: "install".. there's an old hook running still
<kim0> I hope I didn't do something stupid at the end :)
<niemeyer> kim0: I suspect your window 0 has the install hook running
<niemeyer> kim0: Can you please paste "env" from that window
<kim0> http://paste.ubuntu.com/616698/
<hazmat> niemeyer, window 0 is never used for hooks its .. its always a shell
<kim0> window 0 is always there 
<kim0> yeah 
<niemeyer> hazmat: It's trivial to shift windows around
<hazmat> niemeyer, but the names are distinct on the windows
<kim0> http://paste.ubuntu.com/616699/ is the install hook itself
<niemeyer> Ok, but that's not the case either way
<niemeyer> Still, we have a hook running
<niemeyer> hazmat: Ok
<hazmat> the debug stuff names the windows by hook , except window 0 which is named 'shell' afaicr
 * kim0 nods
<niemeyer> kim0: What's in /tmp/tmpLjxVDG-install
<kim0> niemeyer: http://paste.ubuntu.com/616700/
<kim0> scary script 
<hazmat> so it seems somehow the debug window was ended but the underlying debug process is still alive.
<niemeyer> hazmat: Yeah, it's still in the sleep loop
<niemeyer> hazmat: Which confirms your initial theory
<kim0> I probably closed the window too fast, if you think it needs time to do anything
<hazmat> it might be a different signal gets sent besides HUP that needs to be caught here
<hazmat> kim0, it shouldn't matter
<hazmat> we should never rely on user timing
<kim0> yeah I know 
<niemeyer> hazmat: TERM, KILL
<niemeyer> hazmat: Wait.. the HUP is catching the outside signal
<niemeyer> hazmat: That's not the problem.. that script is still running
<hazmat> yeah.. its not in the screen process
<niemeyer> kim0: One more: /proc/1585/environ
<kim0> http://paste.ubuntu.com/616708/
<kim0> niemeyer: not sure why it has no newlines
<kim0> doh
<niemeyer> hazmat: We should monitor it from outside instead of expecting it to do stuff before it dies
<kim0> sorry .. pastebinit error
<niemeyer> kim0: That's the file format indeed
<kim0> niemeyer: http://paste.ubuntu.com/616709/
<kim0> this is complete
<niemeyer> hazmat: e.g. writing to hook.pid when the process starts
 * kim0 probably just uncovered a pastebinit bug
<hazmat> niemeyer, yeah.. and then just doing something like kill -0 `cat hook.pid`  for the sleep condition
<niemeyer> hazmat: Right
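The hook.pid idea discussed above can be sketched roughly as follows. This is only an illustration, not the actual debug-hooks payload: the `workdir` directory and the stand-in "hook" process are assumptions made so the snippet runs on its own. The watcher never asks the hook to report anything before it dies; it just polls the recorded pid with `kill -0`.

```shell
#!/bin/sh
# Sketch: watch a hook process from the outside via a pid file.
workdir=$(mktemp -d)   # hypothetical location for hook.pid

# Stand-in for the hook shell: records its own pid on startup, then "works".
sh -c 'echo $$ > "$1/hook.pid"; sleep 2' hook "$workdir" &

# Wait until the pid file has been written.
while [ ! -s "$workdir/hook.pid" ]; do sleep 1; done
hook_pid=$(cat "$workdir/hook.pid")

# kill -0 delivers no signal; it only reports whether the process exists.
while kill -0 "$hook_pid" 2>/dev/null; do
    sleep 1
done

hook_status="finished"
echo "hook $hook_pid $hook_status"
rm -rf "$workdir"
```

Because the loop condition depends only on whether the pid is alive, it does not matter how the hook window was closed or which signal ended the process.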
 * hazmat files a bug
<niemeyer> hazmat: Another handy issue for a brain breaker.. will paste that conversation in a bug.
<niemeyer> hazmat: Oh, ok :)
<niemeyer> hazmat: Please paste the log for context
<niemeyer> hazmat: Thanks
<niemeyer> kim0: Alright.. we know what's wrong
<kim0> great :)
<niemeyer> kim0: For fixing your problem right now,
<niemeyer> kim0: kill 1585
<kim0> got it 
<kim0> thanks
<niemeyer> kim0: np
<kim0> wonder why no one else is hitting this
<niemeyer> kim0: Thanks a lot for your help uncovering the bug
<niemeyer> kim0: It's the way the debug-hook window was closed
<kim0> ah so you close it like ctrl-a c
<kim0> ok
<kim0> not c .. whatever closes windows :)
<kim0> ew
<kim0> ok probably hitting a new one
<kim0> I killed the process .. got the window for db-relation-changed
<kim0> relation-get inside it says: No ENSEMBLE_AGENT_SOCKET/-s option found
<kim0> env dump http://paste.ubuntu.com/616713/
<kim0> hazmat: could you please as well
 * hazmat looks
<hazmat> kim0, that's the env from window 0 ?
<hazmat> or the debug window?
<kim0> hazmat: no win 1
<kim0> the db-relation-changed window
<kim0> db-relation-joined actually
<hazmat> it doesn't look like it has the debug environment variables sourced
<niemeyer> That's the shell isn't it?
<niemeyer> SUDO_COMMAND=/usr/bin/byobu -xRS drupal-0-hook-debug -t shell
<niemeyer> kim0: I think the paste is bogus
<niemeyer> kim0: Or maybe I just misunderstand what the variables mean
<kim0> I can repaste manually
<kim0> anything to look for ?
<niemeyer> kim0: Just thinking how to get the pid for the parent shell
<kim0> ps -elf | grep $$ ?
<kim0> it'd be listed in ppid field
<niemeyer> kim0: echo $BASHPID
<kim0> 9229
<niemeyer> kim0: echo $PPID
<kim0> 935
 * kim0 feels like a shell
<niemeyer> Ok, cool
<niemeyer> kim0: Hehe :-)
<kim0> :)
<niemeyer> kim0: Yeah, looks like a failure in source it indeed
<niemeyer> sourcing
<kim0> we don't log those steps somewhere ?
<niemeyer> Can't imagine how that could happen, though, even if we killed the process
<niemeyer> kim0: Nope, this is the bootstrapping of debugging itself.. we might indeed have to log it in the future
<niemeyer> kim0: Can you please paste the new process list so we can reach the new hook
<kim0> http://paste.ubuntu.com/616722/
<kim0> guess no one uses debug-hooks really :)
 * kim0 afk for 5 mins
<niemeyer> kim0: We do, but we don't generally kill processes in the middle
<hazmat> kim0, just me ;-)
<hazmat> need to run a quick errand, back in a bit
<niemeyer> kim0: Let me know when you're back.. we can follow on a bit if you're interested
<kim0> niemeyer: back
<niemeyer> kim0: Ok, let's see /tmp/tmpR1UhMY-db-relation-joined then
<kim0> niemeyer: http://paste.ubuntu.com/616737/
<niemeyer> kim0: Ok, please figure ENSEMBLE_DEBUG from /proc/9187/environ, and list $ENSEMBLE_DEBUG/env.sh
<kim0> niemeyer: can't see ENSEMBLE_DEBUG .. paste http://paste.ubuntu.com/616742/
<niemeyer> kim0: Hmm.. I guess it wasn't exported
 * niemeyer htinks
<niemeyer> thinks
<niemeyer> kim0: Ok, let's try to find by force: cd /tmp && find -name env.sh
<niemeyer> kim0: Will probably see more than one
<kim0> niemeyer: 3 of em
<niemeyer> kim0: Ok, let's find the one with db-relation-joined
<kim0> http://paste.ubuntu.com/616746/
<kim0> http://paste.ubuntu.com/616747/
<kim0> http://paste.ubuntu.com/616748/
<kim0> niemeyer: I tried grep'ing .. doesn't have joined in them
<niemeyer> kim0: Well, that's likely the issue then.. let me check
<niemeyer> kim0: That's weird..
<niemeyer> kim0: All of them have ENSEMBLE_AGENT_SOCKET
<kim0> maybe none of them is sourced
<niemeyer> kim0: This is the right one for db-relation-joined: http://paste.ubuntu.com/616747/
<niemeyer> kim0: Can you please paste the hook.sh file living in the same directory?
<niemeyer> kim0: Well, that's the thing
<niemeyer> kim0: There's no easy way for bash to be executed without this being sourced
<kim0> niemeyer: http://paste.ubuntu.com/616754/
<niemeyer> kim0: As you can see..
<niemeyer> kim0: That prior paste is from /tmp/tmp.Sj2ilkd53B/env.sh, right?
<kim0> double checking 
<kim0> should be yes
<niemeyer> kim0: Ok, so there's really no way for bash to be executed without it being sourced, which is awkward..
<niemeyer> kim0: You got a bash, without the env variables, but the only way for that bash to have come up, was through the sourcing line
<niemeyer> Hmmm
<kim0> thinking as well
<kim0> niemeyer: parent process for the Window1 shell, is byobu, not the hook.sh ?
<niemeyer> kim0: Yeah, that's strange
<kim0> hook.sh should still be running if it fired us right
<niemeyer> kim0: Indeed
<niemeyer> kim0: This would also justify the previous issue as well, interestingly
<niemeyer> kim0: Hmm
<niemeyer> kim0: Let me do a local test, hold on
<kim0> niemeyer: pstree -p .. if that's helpful http://paste.ubuntu.com/616757/
<hazmat>   nice.. half way to my walking desk finished
<kim0> hazmat: walking desk ?
<hazmat> kim0, treadmill with keyboard tray and monitor stand
<kim0> oh that's new to me ... sounds cool indeed :)
<hazmat> kim0,  http://opinionator.blogs.nytimes.com/2010/02/23/stand-up-while-you-read-this/ ... http://www.nytimes.com/2008/09/18/health/nutrition/18fitness.html
<hazmat> kim0, lots of good evidence for the benefits vs sitting in a chair all day
<kim0> yeah that's intuitive
<hazmat> the only problem is that the treadmill weighs 250 pounds.. just carried it up the stairs.. so most of the way done on the setup
 * hazmat catches up the irc log to get up to speed on debug-hooks
<kim0> hazmat: congrats :) send pics to warthogs :)
<kim0> niemeyer: I'm going for a late lunch .. I have inserted your ssh key into ec2-67-202-22-46.compute-1.amazonaws.com (drupal/0) should you want to login to it
<niemeyer> kim0: Cheers
<niemeyer> kim0: Will check it out
<kim0> cool
<niemeyer> hazmat: I suspect both issues likely boil down to the way the shell is being executed
<niemeyer> hazmat: It's doing a two-step execution, and it's not entirely clear why
<hazmat> niemeyer, what's strange is that it works sometimes
<hazmat> writing up a reply to tom for his questions on list
<niemeyer> hazmat: It first creates a window, which spawns an outside shell by screen itself
<niemeyer> hazmat: then overwrites a shell onto it
<niemeyer> hazmat: I suspect we may be hitting some race within screen itself
<niemeyer> hazmat: Is there a reason why you coded it like that, or is it safe to change?
<hazmat> niemeyer, its safe to change, i thought that creation was per your suggestion
<niemeyer> hazmat: It's unrelated to my suggestion
<hazmat> the openstack nova screen setup does a similar setup
<niemeyer> hazmat: It's executing two shells for no reason
<niemeyer> hazmat: Rather than only hook.sh
<hazmat> right
<hazmat> so instead of creating the window it should just exec in the named window?
<niemeyer> hazmat: The "screen" command of screen takes an executable as an argument
<niemeyer> hazmat: -X screen -t .. hook.sh
<niemeyer> hazmat: The shell I'm seeing in kim0's is the shell from the first screen command, not the one from the exec
<hazmat> niemeyer, that sounds good to me 
<kim0> hope that bug got caught
<niemeyer> kim0: Sounds like so..
<niemeyer> kim0: Will try something this afternoon
<kim0> awesome
<niemeyer> kim0: Thanks for all your help
<kim0> All thanks to you :)
 * hazmat lunches
 * niemeyer too
<kim0> hmm
<kim0>         state: install_error
<kim0> if service is having install_error .. any facility to figure out what went wrong
<kim0> ok I could figure it out
<niemeyer> kim0: I was kind of expecting that..
<niemeyer> kim0: Is that the service we were debugging?
<kim0> niemeyer: I shutdown the env and started a fresh
<niemeyer> kim0: Oh, ok
<niemeyer> kim0: ensemble log, ensemble debug-hook, etc
<kim0> niemeyer: does the log not provide hooks stdout any more ?
<kim0> is it suppressed by default
<niemeyer> kim0: It does, but you have to turn it on earlier
<niemeyer> We should really have a feature where it logs by default
<niemeyer> and rotates them out after a while
<niemeyer> kim0: Otherwise, the best bet is logging in the machine and checking logs
<niemeyer> kim0: You should be able to retry, though
<niemeyer> kim0: Run debug-hook
<niemeyer> kim0: and then run ensemble resolved with the --retry argument
<kim0> niemeyer: hmm .. service unit is stuck somehow
<kim0> here is status http://paste.ubuntu.com/616891/
<kim0> I hope you don't mind all the questions 
<niemeyer> kim0: Not at all.. I'm actually going to fix some of the issues you found today
<kim0> yeah .. the basic workflow should be smoother ..
<kim0> so, I had an error in a hook, now I have no idea how to nudge things and get them back
<niemeyer> kim0: resolved, as I mentioned above
<kim0> bin/ensemble resolved --retry drupal/1
<kim0> tried this
<niemeyer> kim0: Ok, what happened next?
<kim0> debug-log only shows mysql related messages
<kim0> and status is the same
<niemeyer> kim0: Have you actually fixed the original reason why your hook failed?
<kim0> niemeyer: yes I did
<niemeyer> kim0: Why was it failing before?
<kim0> niemeyer: ssh'ed into the machine .. and ran it 
<kim0> niemeyer: some cd to a non existent directory
<kim0> niemeyer: I ran the script inside the instance .. it is fine now
<niemeyer> kim0: Is it returning successfully now? (exit status 0)
<kim0> checking again
<kim0> can't really check again accurately
<kim0> ensemble-log giving errors because it's running outside the environment
<kim0> apt-get saying packages already installed ..etc
<kim0> but yeah it seems correct
<niemeyer> kim0: So how did you run it befor?
<niemeyer> e
<kim0> niemeyer: in a debug-hooks session
<kim0> /var/lib/ensemble/units/drupal-1/formula/hooks/install
<niemeyer> kim0: If apt-get install is failing, it won't work as a hook either
<kim0> it's just saying .. the package is already installed
<kim0> the script is fine trust me :)
<niemeyer> kim0: :)
<niemeyer> kim0: If you have already executed it by hand, you can just say "resolved"
<niemeyer> kim0: Without --retry
<niemeyer> kim0: But I suspect that fuzzing may have triggered something else.. that "state: null" isn't really great 
<kim0> yeah
<niemeyer> kim0: Try the resolved trick
<kim0> did that
<niemeyer> kim0: This just states to Ensemble "I have resolved the problem"
<kim0> still null
<niemeyer> kim0: Ok, try redeploying the fixed formula then
<niemeyer> kim0: We'll have to investigate a bit that scenario
<niemeyer> kim0: (--retry with a broken script, etc)
<kim0> how do I redeploy
<niemeyer> kim0: But first, I'll fix the debug-hook stuff we debugged this morning
<kim0> ok np ..
<niemeyer> kim0: Same thing you did earlier?
<kim0> I'll pick this up later
<kim0> fresh environment .. ok
<kim0> In the tutorial .. I'm assuming the formula is going to have errors
<niemeyer> kim0: Nope
<niemeyer> kim0: Just remove the unit
<niemeyer> kim0: and add it again
<kim0> ok
<niemeyer> kim0: Yeah, that's a good thing
<niemeyer> kim0: You can also upgrade the formula in general
<kim0> says, error state cannot be upgraded
<niemeyer> kim0: This is what we're preparing Ensemble to be able to do
<kim0> or so
<niemeyer> kim0: Error?  Change, upgrade..
<kim0> says like, formula is in error state .. so it cannot be upgraded
<kim0> I lost the exact message though
<niemeyer> kim0: Yeah, you have to resolve it first..
<kim0> Ok .. I'll need to try this again (recovering from a formula with errors )
<kim0> and will discuss again with you
<niemeyer> kim0: Because otherwise we can't assume a known state
<niemeyer> kim0: Imagine an install hook failed in the middle
<kim0> that's what it was :D
<kim0> so I just need to know the recommended recovery steps
<niemeyer> kim0: Right.. simply upgrading won't really yield a working system necessarily 
<niemeyer> kim0: Because half of it executed
<kim0> how to know what went wrong .. release a fix .. start recovering
<niemeyer> kim0: What we want is this:
<niemeyer> kim0: install failed: check logs, fix it or retry, upgrade if wanted
<niemeyer> kim0: With debug-hook if desired, to understand what's going on
<kim0> then use "resolved" right ?
<niemeyer> kim0: Right, after the "fix it or retry"
<niemeyer> kim0: Or during it actually.. retry is done with resolved
<kim0> is there a way to upload the fixed new hook ?
<niemeyer> kim0: upgrade-formula
<kim0> which refuses to work in error state ?
<niemeyer> kim0: Yes, which is the right thing to do
<kim0> ok .. so I'd still need to upload my fixed hook
<niemeyer> kim0: The error state must be acknowledged by the administrator
<niemeyer> kim0: and if an install hook blows up in the middle, upgrading a new install hook with the error fixed won't necessarily make it work
<niemeyer> kim0: mkdir foo, run twice, breaks
<kim0> so our recommended approach is kill the instance, and start a fresh machine ?
<niemeyer> <niemeyer> kim0: What we want is this:
<niemeyer> <niemeyer> kim0: install failed: check logs, fix it or retry, upgrade if wanted
<niemeyer> <niemeyer> kim0: With debug-hook if desired, to understand what's going on
<kim0> the "fixing and trying" cycle is what I'm trying to grasp
<niemeyer> kim0: Fixing it means fixing the actual problem within the formula..
<kim0> and what about the trying
<niemeyer> kim0: Sorry, within the service unit
<kim0> on a new instance ?
<niemeyer> kim0: If there's nothing to do.. you just run "resolved"
<kim0> ah
<kim0> so I fix the problem manually .. then run resolved
<niemeyer> kim0: Yes, that's one way to do it
<niemeyer> kim0: The other way to do it is to code an idempotent hook
<niemeyer> kim0: This enables you to run resolved and upgrade a new formula
<kim0> and use, resolved --retry
<kim0> right ?
<niemeyer> kim0: If that's what you want to do
<niemeyer> kim0: The problem is really quite simple
<niemeyer> kim0: When a hook fails, Ensemble will stop running hooks until the admin acknowledges it
<niemeyer> kim0: If you run ensemble resolved, it forgets about the old hook, and continues execution
<niemeyer> kim0: If you want to run the old hook again before continuing, you run resolved --retry
<niemeyer> kim0: and that's it
<niemeyer> kim0: Whether you upgrade the formula, change the hook in place to try things out, run debug-hook, run ensemble log, etc, is really up to you
<niemeyer> kim0: We're coding tools to give you everything you need to understand how things are behaving
<niemeyer> kim0: and fixing them
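The "mkdir foo, run twice, breaks" point and the idempotent hook suggestion can be illustrated with a small sketch. The `target` path and the `install_step` function are made up for this example and are not part of any real formula; the only point is that each step tolerates having already run, so retrying the hook after a partial failure is safe.

```shell
#!/bin/sh
set -e
# Hypothetical directory standing in for whatever the install hook creates.
target="$(mktemp -d)/foo"

install_step() {
    # A plain "mkdir" fails on the second run; "mkdir -p" does not.
    mkdir -p "$target"
    # Only do the one-time setup if it has not been done already.
    [ -f "$target/configured" ] || echo "done" > "$target/configured"
}

install_step   # first run
install_step   # a retry after a failure elsewhere in the hook
runs_ok=yes
echo "install ran twice without error"
```

With a hook written this way, running `resolved --retry` after fixing the underlying problem simply re-executes steps that are safe to repeat.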
<kim0> I guess the workflow I had in mind is .. install hook blows up .. I ssh into machine .. figure out why it blew up .. then I fix the hook *locally* .. then somehow ensemble would upload the new version and run that
<kim0> but the workflow you explained is perfectly fine
<kim0> thanks
<niemeyer> kim0: Well, maybe there are further options we can develop around this
<kim0> Yeah, recovering from broken formulas should be smooth .. since people are going to make all sorts of mistakes :D
<kim0> niemeyer: thanks for all the explanation and patience :)
<niemeyer> kim0: No worries
<niemeyer> kim0: Good to talk about that stuff.. it's important to learn how other people feel about the system too
 * kim0 nods
 * SpamapS just discovered resolved yesterday btw
<SpamapS> would have saved me quite a few remove-relation/add-relation cycles ;)
<niemeyer> SpamapS: Sorry about that :-)
<SpamapS> Yes shame on you guys for making the thing work well enough to survive hundreds and hundreds of remove/adds.
<_mup_> ensemble/close-zk-port r240 committed by gustavo@niemeyer.net
<_mup_> Do not open zk port on AWS firewall.
<niemeyer> SpamapS: Yeah, good thing we are in a polishing cycle.. :)
<_mup_> Bug #791973 was filed: Ensemble shouldn't open the EC2 firewall for zk access <Ensemble:Confirmed> < https://launchpad.net/bugs/791973 >
<niemeyer> hazmat:   sudo -u ubuntu screen -dmS $SESSION_NAME
<niemeyer> hazmat: -u ubuntu?  Shouldn't this be root?
<niemeyer> Also, this command seems to create a new session, irrespective of whether there's an existing one with the same name
<hazmat> its connecting to the ubuntu user's screen session from the login shell
<niemeyer> hazmat: Yes, but isn't the ubuntu user connecting to root's screen using sudo?
<hazmat> niemeyer, no its not that requires a setuid binary screen program
<niemeyer> hazmat: Using sudo?
<hazmat> niemeyer, that would be fine
<hazmat> i thought you were referring to multi-user screen
<niemeyer> hazmat: ssh -t ubuntu@%s sudo byobu -xRS %s-hook-debug -t shell
<hazmat> sounds good
<niemeyer> hazmat: The debug-hook command connects a session from root
<niemeyer> hazmat: Which means doing -u ubuntu would put the session in another place 
<niemeyer> Ok, I'll fix that as well
<niemeyer> So there's apparently no way to start a screen session unless it doesn't yet exist.. :(
<niemeyer> Ok, I'll give tmux a try..
<niemeyer> smoser!
<niemeyer> smoser: Was just reminded of you while hacking a shell script
<niemeyer> smoser: exec &> foo.. that's an awesome trick I learned with you recently :)
<smoser> that is bash only
<smoser> exec > foo 2>&1
<smoser> would be the posix shell equivalent
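A minimal sketch of the POSIX form mentioned above, assuming a temporary file as the log target (the `logfile` name is only for illustration). The redirect is done inside a subshell so the rest of the script keeps its original stdout and stderr.

```shell
#!/bin/sh
logfile=$(mktemp)   # hypothetical log target

(
    # From here on, everything this subshell writes to stdout or stderr
    # ends up in the log file. "exec &> file" is the bash-only spelling.
    exec > "$logfile" 2>&1
    echo "to stdout"
    echo "to stderr" >&2
)

cat "$logfile"
```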
<niemeyer> smoser: Nice
<niemeyer> smoser: Will use that
<niemeyer> hazmat: If  a sigspec is EXIT (0) the command arg is executed on exit from the shell.
<niemeyer> hazmat: Looks handy
<niemeyer> hazmat: Sorry, that was a quote
<niemeyer> Wonder if that will *always* execute
 * niemeyer tests
<niemeyer> Yeah, works
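The EXIT trap quoted above can be tried out with a sketch like the one below. The `marker` file is an assumption for the demo. The inner shell exits with a non-zero status and the trap still runs on the way out. One caveat worth keeping in mind: SIGKILL cannot be trapped, so a process killed with `kill -9` never runs its EXIT handler.

```shell
#!/bin/sh
marker=$(mktemp)   # hypothetical file the trap writes to
export marker

# The inner shell reads its script from stdin, sets an EXIT trap, then exits.
sh <<'EOF'
trap 'echo cleaned > "$marker"' EXIT
exit 3
EOF
status=$?

echo "inner shell exited with $status, marker says: $(cat "$marker")"
```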
<_mup_> ensemble/expose-hook-commands r242 committed by jim.baker@canonical.com
<_mup_> Implemented open-port, close-port hook commands
<jimbaker>  biab
<_mup_> ensemble/debug-hook-fixes r240 committed by gustavo@niemeyer.net
<_mup_> Fixed several issues in the debug-hook shell payload, and replaced
<_mup_> screen with tmux to handle concurrent session creation without races.
<_mup_> Let's see if this works in real test cases now.
<niemeyer> What's the proper format to put ensemble-branch in the environment's file again?  I recall we had a weird issue with one of the url forms, and I killed my commented option by mistake.
<niemeyer> jimbaker, hazmat, bcsaller: ?
<bcsaller> niemeyer: ensemble-branch: lp:~bcsaller/ensemble/config-set-lifecycle
<niemeyer> bcsaller: Does that work?  Nice, I think that was one of the formats which was not working
<bcsaller> its been working for me
<niemeyer> bcsaller: Super, thanks
<niemeyer> bcsaller: It was a crazy issue we had in the sprint..  Launchpad was just barfing on https or http or lp, can't recall which one..
<SpamapS> niemeyer: I'm thinking of uploading principia-tools to the ensemble ppa... thoughts before I do that?
<niemeyer> SpamapS: No, it sounds good
<niemeyer> 2011-06-02 16:50:32,617 ERROR ProviderError: Interaction with machine provider failed: ConnectionTimeoutException('could not connect before timeout after 1 retries',)
<niemeyer> Feels like a regression on the waiting behavior
<niemeyer> kim0: You were right.. debug-hook needed some good debugging by itself
<_mup_> ensemble/debug-hook-fixes r241 committed by gustavo@niemeyer.net
<_mup_> - Add additional hook names to valid list on debug-hook.
<_mup_> - Fix debug-hook shell template.
<_mup_> Bug #792071 was filed: relation-get blowing up badly during install hook <Ensemble:New> < https://launchpad.net/bugs/792071 >
<niemeyer> Observing debug-hooks actually working is beautiful!
<niemeyer> Do changes on one side, exit.. boom! The other side pops up!
 * niemeyer ponders about how to execute a script by piping it on stdin
<niemeyer> Hah, /bin/bash -
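A small sketch of the stdin trick, assuming `bash` is on the PATH. A lone `-` ends option processing, and with no script file named bash reads its commands from standard input. The two-line script piped in here is made up for the demo.

```shell
#!/bin/sh
# Pipe a script into bash instead of passing a file name.
output=$(printf 'greeting="hello from stdin"\necho "$greeting"\n' | bash -)
echo "$output"
```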
<kim0> niemeyer: great news!
<kim0> so it's working as it should now .. woohoo
<kim0> niemeyer: Would you think it'd be better for debug-hooks to open the hook code in vim in the new screen window, instead of dropping me in a blank shell ?
<niemeyer> kim0: Hmm
<niemeyer> kim0: No, probably not.. this would likely pass the wrong idea about what you can do within the debug hooks session
<niemeyer> kim0: It's useful to look at the script, but you're free to do pretty much anything
<kim0> hmm
<kim0> to debug the hook .. I had to figure out where it was
<kim0> and for that I used "find /"
<kim0> which sux ofc
<niemeyer> kim0: Yeah, I noticed this as well
<bcsaller> it should cd you into the formula directory I think
<niemeyer> kim0: That's a failure in the hook execution logic
<bcsaller> I think we have a bug for that 
<niemeyer> kim0: Hooks should always be executed within the formula directory
<niemeyer> kim0: When debugging or not
<niemeyer> bcsaller: Hey!
<bcsaller> :)
<kim0> bcsaller: the cd into hooks dir, sounds like a good compromise .. is it planned
<bcsaller> kapil and I agreed it was a good idea when we talked about it 
<niemeyer> Agreed, that's important even for plain hooks
<niemeyer> I mean, when executing the real ones rather than debugging
<_mup_> ensemble/expose-hook-commands r243 committed by jim.baker@canonical.com
<_mup_> Testing on args and logging for port commands
<niemeyer> Woohay
<kim0> expose merged ?
<niemeyer> kim0: Not yet
<niemeyer> jimbaker: Is hard at work on it
<_mup_> ensemble/debug-hook-fixes r242 committed by gustavo@niemeyer.net
<_mup_> - Use the real unit name as the session, since tmux is happy with that.
<_mup_> - Send an initialization script with a simple tmux.conf when firing
<_mup_>   ssh through the debug-hooks command.  Use screen shortcuts since
<_mup_>   people will be happier with that.
<_mup_> [WIP]
<niemeyer> Okay, time to step out and do something else..
<niemeyer> See y'all tomorrow!
<jimbaker> kim0, the expose work is getting close - hook commands are almost ready for review, i have most of the remaining provisioning work already done from a spike branch, and the ec2 group authorization model maps readily against what is necessary for a provider
<kim0> jimbaker: sounds like great news .. rock on :)
<SpamapS> Hmm.. I've been thinking about proposing a specification for 'machine-info-get' .. Teyo from Puppet suggested that they'd be interested in collaborating on a library to collect info about the current machine.. and it would be really useful to have this...
<SpamapS> So I'm thinking the machine agent should have some of this information available.. some from the machine provider, some from this library
<SpamapS> That would solve the 'Whats my private IP? Whats my public IP?' case.
<SpamapS> How would I propose such a spec?
#ubuntu-ensemble 2011-06-03
<SpamapS> just fire up a merge proposal for the drafts directory?
<jimbaker> SpamapS, that would be a normal process. and this info would be really cool to have, and a nice way to integrate w/ puppet too
<hazmat> SpamapS, write a rest formatted specification, put in docs/specifications of a branch, and propose on list
<hazmat> SpamapS, alternatively just start the discussion on list
<SpamapS> I may try my hand at implementing it too... it should be pretty easy, as there's no data storage requirement.. just a proxy to the real info really
<hazmat> SpamapS, to save time/effort, i'd try to start the discussion b4 impl
<SpamapS> hazmat: but I wanna play with my six-guns!
<hazmat> SpamapS, word.. i've been thinking about plugin apis
<hazmat> probably wouldn't help here.. but it would be nice to have
<hazmat> i'd like to keep playing with the auto-resolve  i put together... which might work as a plugin
<SpamapS> yeah actually thats probably the right way to go since there are some murky aspects of what we actually want auto-resolution to do
<hazmat> SpamapS, course there's always room for a BFG  ;-)
<hazmat> SpamapS, minus the remote repo stuff, the one committed, is pretty solid.. takes into account env, pull extra deps and adds any needed relations (including against existing services).. but that's pretty trivial.. multiple providers in the environment and the formula are pretty much random selection.. remote should be straightforward.. still needs some tweaks on the repository lookup order (using the repository found for a dep as the primary lookup for the next dep for any formulas from that repo)
<hazmat> with the remote repo the sharing store should be pretty nice
<SpamapS> hazmat: I think we need a new way to express "recommends" and "suggests" type relations in formulas. I don't want to force people to use munin, just because mysql supports it.
<hazmat> SpamapS, i don't think that should be in the mysql formula
<hazmat> its not core to its responsibilties
<SpamapS> agreed, its a policy thing :)
<SpamapS> but it needs to be *doable* before I can take it out
<SpamapS> because I want to graph stuff, damnit. ;)
<SpamapS> also mediawiki works great w/o memcache
<hazmat> either inheritance, possibly containment based (pull additional stuff from env) or policy formulas.. policy formulas are a little better in avoiding possible tight association to a machine
<hazmat> yeah.. graphing stuff is required..
<hazmat> first rule of devops.. measure
<SpamapS> I was thinking maybe the answer to the policy thing is to have a way to model your desired machines much the same way you model your services
<SpamapS> which is sort of like inheritance
<SpamapS> like, inherit this class "ubuntu::mymachine" and morph it into ubuntu::mymachine::mysql
<hazmat> SpamapS, tight machine coupling to units is best avoided
<hazmat> it makes unit migration much harder
<hazmat> either for scaling up compute capacity available to a unit.... or resurrection in the case of a dead machine
<SpamapS> hrm, I guess what I mean is, I want to be able to say two things.. 1) I want my machines to look like X, and 2) I want my service to be deployed on my machines.
<SpamapS> Right now we're saying "I want my machine to look like a bare bones Ubuntu machine"
<hazmat> SpamapS, true.. the question of what to do with 1) is basically how does ensemble handle machine configuration management
<hazmat> 2) is a little dangerous.. because if the service depends on the machine configuration then we lose reuse
<hazmat> re 1) we either adopt an existing configuration management system, adopt policy formulas, or grow our own declarative config
<hazmat> pretty much every corp env.. i've seen has a bunch of policy associated to a machine.. nfs mounts to sans, auth mech integration, monitoring/alerting.
<SpamapS> hazmat: I'd say that we should make sure we have a hand in the machine config, as that way we can at least have a chance at early warning that a formula and a machine policy conflict
<SpamapS> hazmat: most of what is "machine policy" in big corp environments tho, is to get around how hard it is to deploy things. :)
<kim0> niemeyer: morning :)
<niemeyer> Good morning!
<kim0> niemeyer: I noticed your debug-hooks branch .. is it in a useable state ?
<niemeyer> kim0: It actually is!
<niemeyer> kim0: I'll be finishing it within the next hour or so, so any input you have will be welcome
<kim0> niemeyer: well I tried using it 
<kim0> but it blocks on "connecting to remote machine ... "
<kim0> I killed it multiple times .. always the same
<niemeyer> kim0: Have you tweaked your configuration to actually use the branch in the server side as well?
<kim0> ah :)
<kim0> nope
<niemeyer> kim0: Yeah, that won't work
<kim0> ok I'll do it
<kim0> thanks
<niemeyer> kim0: ensemble-branch: lp:~niemeyer/ensemble/debug-hook-fixes
<niemeyer> kim0: next to type: ec2
<kim0> Yeah thanks .. I did that one before
<kim0> Q: does relation-get foo, have a non-zero exit status if foo has not been set yet?
<niemeyer> kim0: IIRC, no.. for relation-get, not setting is equivalent to the empty value
<niemeyer> kim0: So much so that relation-set foo= will erase the value
<_mup_> ensemble/debug-hook-fixes r243 committed by gustavo@niemeyer.net
<_mup_> - Don't overwrite config if it exists.
<_mup_> - Tests work again, even though a bit difficult to test the shell
<_mup_>   scripts themselves without a real run.
<_mup_> Now, for some real world testing.
<niemeyer> kim0: Just pushed the RC (Review Candidate ;-), if you want to test it
<niemeyer> I'm doing the same here
<kim0> great .. pulling
<_mup_> Bug #792406 was filed: "Will wait" never returns <Ensemble:Confirmed> < https://launchpad.net/bugs/792406 >
<niemeyer> kim0: There's a bug on it, btw
<kim0> niemeyer: it's not finish initialization ?
<kim0> finishing*
<niemeyer> Yep
<niemeyer> kim0: Am on it
<_mup_> ensemble/debug-hook-fixes r244 committed by gustavo@niemeyer.net
<_mup_> Forgot to rename the control import after the rename of
<_mup_> the debug_hook file to reflect the actual debug-hooks
<_mup_> command name.
<niemeyer> kim0: ^
<niemeyer> Pushing
<kim0> niemeyer: is there some way for drupal/0 to talk to drupal/1 ? 
<kim0> as in tell it, "I already did the DB setup .. just use it"
<niemeyer> kim0: That's not a task for them
<niemeyer> kim0: The mysql units are the ones doing the database setup
<kim0> niemeyer: I kinda meant .. I populated the DB tables
<hallyn> hey - i'm trying to set up relations between a master and new slave node in jenkins.  Is there a definitive order in which the master and slave's -joined hooks fire?
<niemeyer> hallyn: Hey Serge!  Very happy to see you're experimenting with it
<hallyn> actually, maybe i'm being silly.  i guess one of the steps i wanted to sync doesn't need to wait
<hallyn> niemeyer: i'm really enjoying it!
<niemeyer> hallyn: Woot!
<niemeyer> hallyn: There isn't an order
<niemeyer> hallyn: On purpose, mostly
<hallyn> ok
<kim0> niemeyer: currently this is done by drupal/0 .. I'd like to tell drupal/1 to skip creating the tables and part and just connect (I could ofc check if the tables exist, just wondering if there's a better way)
<niemeyer> hallyn: But you can create the ordering by setting/getting relation settings as you see fit
<hallyn> what do you mean?  in particular - i think i don't need this after all, but am wondering - what if i need to do a 5-way protocol between master/server?
<niemeyer> hallyn: We should document better that feature, actually.
<hallyn> master/slave, that is
<niemeyer> hallyn: That's fine
<hallyn> oh, i can use relation-{set,get} as sync points?
<hallyn> suddenly i'm using Ada!
<niemeyer> hallyn: Yeah.. you can do something like this:
<niemeyer> hallyn: if [ -z "relation-get someinfo" ]; then exit 0; fi # Data isn't there yet
<niemeyer> hallyn: Ensemble guarantees that when anything on the relation is changed, the hook is called again
<hallyn> aaah
<hallyn> makes sense
<hallyn> meanwhile, -joined gets called for both ends, right?  
<niemeyer> Sorry, that should be backticks, but you get the idea
<hallyn> yup :)
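The sync-point idea above, with the command substitution niemeyer meant, might look roughly like the sketch below. It is only an illustration: relation_get is a stand-in function for the real relation-get command, and FAKE_SOMEINFO and hook_body are hypothetical names added so the snippet can run outside an Ensemble hook.

```shell
#!/bin/bash
# Stand-in for Ensemble's relation-get (assumption: the real command prints
# the value, or nothing when the key has not been set yet).
# FAKE_SOMEINFO is a hypothetical variable simulating the remote setting.
relation_get() {
    if [ "$1" = "someinfo" ]; then
        printf '%s' "${FAKE_SOMEINFO:-}"
    fi
}

# Hook body wrapped in a function, so `return 0` plays the role that
# `exit 0` would play in a real hook script.
hook_body() {
    someinfo=$(relation_get someinfo)
    if [ -z "$someinfo" ]; then
        # Data isn't there yet; stop quietly and rely on the hook being
        # called again when the relation settings change.
        echo "waiting"
        return 0
    fi
    echo "proceeding with $someinfo"
}

FAKE_SOMEINFO=""
hook_body
FAKE_SOMEINFO="ready"
hook_body
```

In a real hook the stub would be dropped and the body would call relation-get directly; the early exit is what turns the relation settings into a sync point.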
<niemeyer> hallyn: Yeah, joined is a single shot
<hallyn> cool.  thanks.  i'm off to try and finish this then
<niemeyer> hallyn: That's awesome, please let us know how it goes
<niemeyer> kim0: There is something for exactly this case, but we haven't put much work on it yet
<niemeyer> kim0: Besides client/server relations, we have something called peer relations
<niemeyer> kim0: Their use is precisely for that kind of thing
<kim0> a ha 
<niemeyer> kim0: You can talk to the other units of the node
<kim0> ok np
<niemeyer> kim0: Be aware though, we need to test it :-)
<kim0> I'll just interrogate the DB
<niemeyer> kim0: That's a good way for the moment
<niemeyer> kim0: peer relations will be perfect for that in the near future
 * kim0 proceeding to torture the debug-hooks branch
<niemeyer> kim0: Sweet :-)
<kim0> hehe
<niemeyer> jimbaker: ping
<kim0> can you guys silence the txaws warning :)
<niemeyer> kim0: Which warning?
<kim0> /usr/lib/pymodules/python2.7/txaws/ec2/client.py:223: FutureWarning: The behavior of this method will change in future versions.  Use specific 'len(elem)' or 'elem is not None' test instead.
<niemeyer> kim0: Weird.. I'm not seeing that
<kim0> I'm probably using an old version
<kim0> niemeyer: I though tmux was using screen cli shortcuts ?
<kim0> thought*
<kim0> niemeyer: also, is it normal that I don't see ensemble-log messages from install hooks
<niemeyer> kim0: Which shortcuts are you missing?
<kim0> Ctrl-A " ?
<niemeyer> kim0: re. logs, what's the context?
<niemeyer> kim0: What does that do?
<kim0> niemeyer: shows a list of windows .. allowing visual selection
<kim0> it was just the first thing I tried .. I see now it's not that bad :)
<jimbaker> niemeyer, hi
<niemeyer> kim0: CTRL-A 1?
<niemeyer> jimbaker: Hey jim
<kim0> niemeyer: using deploy drupal .. the install hook fires, calls ensemble-log .. but that doesn't show in my debug-log session
<kim0> jimbaker: morning o/
<niemeyer> jimbaker: Ensemble needs some more love on the connection establishment front
<kim0> any default filters exist ?
<jimbaker> kim0, beautiful morning here
<kim0> :) sweet
<niemeyer> jimbaker: Are you sure you want to continue connecting (yes/no)? yes
<niemeyer> ProviderError: Interaction with machine provider failed: ConnectionTimeoutException('could not connect before timeout after 1 retries',)
<niemeyer> 2011-06-03 12:02:44,706 ERROR ProviderError: Interaction with machine provider failed: ConnectionTimeoutException('could not connect before timeout after 1 retries',)
<niemeyer> etc
<niemeyer> jimbaker: It fails in different ways in different cases, doesn't wait properly, etc
<kim0> Would be awesome if we could reuse connections as well :)
<niemeyer> jimbaker: After you're done with exposure, it'd be good to spend some time actually trying to use it for real
<niemeyer> kim0: Just got my time machine.. ensemble open-tunnel
<kim0> have no idea what does that do ?
<niemeyer> kim0: What you asked for :)
<kim0> oh!
<jimbaker> niemeyer, ok. from what you posted, seems like a matter of tuning, but i'll look at it in depth as you suggest
<niemeyer> jimbaker: Yeah, tuning/bug fixing/polishing.. whatever we want to call it.. we just need a smoother experience on connection establishment
<niemeyer> jimbaker: How's that exposure stuff going?
<jimbaker> niemeyer, agreed. probably bumping the retries would suffice
<jimbaker> niemeyer, bumping up that is
<niemeyer> jimbaker: No, it likely wouldn't..
<niemeyer> jimbaker: I'm pretty sure *1* retry isn't the number we have today
<bcsaller> bug #792406 sounds like the halting problem ;)
<_mup_> Bug #792406: "Will wait" never returns <Ensemble:Confirmed> < https://launchpad.net/bugs/792406 >
<kim0> bin/ensemble debug-log -i '*'  .. still doesn't show ensemble-log messages fired at install/start stage ?!
<niemeyer> bcsaller: Hey!
<jimbaker> niemeyer, exposure is going fine - i have hook commands working, just need some more testing; and i'm waiting on the review of the expose-provision-service-hierarchy review
<bcsaller> hey
<kim0> bcsaller: hi there
<niemeyer> jimbaker: You have an open review on the pre-requisite branch
<niemeyer> jimbaker: I'm waiting for you to address this one before looking at the next one
<niemeyer> jimbaker: The branch is pretty simple, so I was expecting you'd get through it quickly
<jimbaker> niemeyer, ahh, ok, i will work on that now then
<niemeyer> jimbaker: Are you doing debug-log + debug-hooks?
<niemeyer> Sorry
<niemeyer> kim0: Are you doing debug-log + debug-hooks?
<hallyn> all right, if i have formulas 'x' and 'x-slave', with x providing master: x and x-slave providing slave:x (and each requiring the other), should the forumales be called 'x-relation-joined' or 'master-relation-joined'?
<kim0> niemeyer: yeah
<kim0> is there some conflict
<jimbaker> niemeyer, yes, it is really a matter of determining some code patterns you were asking about to see what's going on - most of that came from the waiting for godot redux work
<niemeyer> kim0: Yeah, the output from the hook on debug-hooks is your terminal, so you won't see it in the logs
<jimbaker> as in, presumably already vetted code, i wanted to see if i made some mistake in applying the pattern, and if not, how else is it used
<niemeyer> kim0: I'm not touching that area on this branch, and I'm not aware of any issues on debug-log, so if you have any known issues on debug-log, please bring them up
<kim0> niemeyer: sorry, in the install hook stage .. I hadn't opened debug-hooks yet 
<kim0> niemeyer: Also .. once a window opens in tmux .. it cds to /usr/lib/ensemble/txzookeeper .. not where the hooks are
<niemeyer> kim0: Do note, however, that today debug log only starts logging after you execute the command
<niemeyer> kim0: Yes, we talked about that yesterday
<kim0> should I open a bug on that
<niemeyer> kim0: As I pointed out yesterday, this is a more general issue.  All hooks are executed in another directory. We have to fix that generically, and put the hooks to execute with curdir set to the formula dir.
<niemeyer> kim0: I _think_ we already have a bug open for that
<kim0> ok got you
<niemeyer> kim0: That said, please do check it out and file if you can't find it
<niemeyer> kim0: This feels like something I can address right after this is finished
<niemeyer> kim0: This first one, on debug-hooks, that is
 * kim0 nods
<kim0> cd to hooks dir .. ok
<kim0> getting strange debug-log output: 2011-06-03 17:22:51,493 unit:drupal/0: hook.output ERROR: /tmp/tmpe7x6qc-db-relation-joined: line 37: kill: (2378) - No such process
<jimbaker> niemeyer, SSHClient.connect controls this with a timeout parameter. maybe this should be exposed to our commands as a  standard param or env var. right now it defaults to 30s.
<jimbaker> niemeyer, there is some testing to verify that the timeout actually is effective (and so can have more than one retry), but it is mock testing.
<niemeyer> jimbaker: Please do finish the expose stuff before working on this
<kim0> niemeyer: please check that kill error ^ looks weird
<jimbaker> niemeyer, sounds good
<niemeyer> kim0: Indeed, that sounds quite interesting
<niemeyer> kim0: Investigating
<niemeyer> Actually, I'll just merge a branch before that
<hallyn> ambiguous endpoints
<kim0> Writing formulas is much fun :)
<_mup_> ensemble/trunk r240 committed by gustavo@niemeyer.net
<_mup_> Merge close-zk-port branch [r=jimbaker,hazmat]
<_mup_> This prevents a very silly security issue which allows external people
<_mup_> to dial into the zookeeper port.
<_mup_> The expose+open-port work in progress at the moment will address this
<_mup_> in a better way, and the follow up work on ZK ACLs will close the door
<_mup_> more properly.
<kim0> I am genuinly happy btw :) no sarcasm
<SpamapS> kim0: +1 .. :)
<kim0> SpamapS: morning o/
<niemeyer> hallyn: When establishing a relation?
<niemeyer> SpamapS: Yo!  Awesome news re. the talk approval
<niemeyer> kim0: Aha, thanks for the warning re. kill
<niemeyer> kim0: It wasn't a bug, but rather a message we shouldn't print
<SpamapS> niemeyer: just a bof, but definitely should be cool. I actually hope the puppet and chef guys will be there (they're all speaking at oscon).
<niemeyer> kim0: kill -0 was expected to fail at some point there
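For context on that message: kill -0 delivers no signal and only reports whether the pid can still be signalled, so a loop that polls with it is expected to fail once the process is gone. A minimal sketch of that pattern, with the expected error sent to /dev/null, follows; wait_for_exit is a hypothetical helper name, and this is not the actual debug-hooks script.

```shell
#!/bin/bash
# Poll until the given pid no longer exists.
# `kill -0` sends no signal; it only checks that the process can be signalled.
# Its "No such process" error is expected at the end, so stderr is discarded.
wait_for_exit() {
    while kill -0 "$1" 2>/dev/null; do
        sleep 1
    done
}

# Demo: start a short-lived background process and wait for it to go away.
sleep 1 &
pid=$!
wait_for_exit "$pid"
echo "process $pid has exited"
```

Without the redirect, the final failing kill -0 would print the same kind of "No such process" line kim0 saw in debug-log.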
<SpamapS> no surprise that puppet labs would be there.. given its in Portland where their home office is.
<kim0> niemeyer: cool .. another suspicious message would be: 2011-06-03 17:22:51,565 unit:drupal/0: hook.output ERROR: duplicate session: drupal/0
<niemeyer> kim0: Hah, cool, same case
<niemeyer> kim0: That's actually one of the bugs being fixed
<niemeyer> kim0: Before it would simply create the two sessions
 * kim0 nods
<niemeyer> kim0: Ok, both fixed locally.. will do a last run after lunch, and then submit for review
<SpamapS> niemeyer: how do I take advantage of the open-tunnel command btw?
<niemeyer> kim0: Thanks a lot for your help
<kim0> niemeyer: aye aye 
<niemeyer> kim0: Please ping me if you find anything else you feel would need attention
<niemeyer> SpamapS: Just open it in a different terminal
<kim0> niemeyer: I'm running it in bg & 
<niemeyer> SpamapS: While the command is open/executing, every other command will make use of the tunnel if possible
<kim0> niemeyer: doesn't make status faster ?
<SpamapS> niemeyer: I was thinking it would give me some sort of indication its working. I tried reading the code and I have *no* idea how it can possibly work. ;)
<niemeyer> SpamapS: It's actually fairly simple.  We make heavy use of a feature of SSH which is not very explored, which is why you don't find much logic in there.
<niemeyer> SpamapS: The best indication is that your commands will be blazing fast :)
<SpamapS> niemeyer: is it using the Master/Slave stuff?
<hallyn> niemeyer: oh, yes.  when establishing relation.  probably my extraneous statements in the metadata.yaml.  trying again
<niemeyer> SpamapS: Yeah
<niemeyer> SpamapS: Check prepare_ssh_sharing, under ensemble/state/sshforward.py
<niemeyer> hallyn: Ah, that's fine
<niemeyer> hallyn: In general it just means there are multiple possible ways to make the relation
<niemeyer> hallyn: You can disambiguate by providing the relation name in addition to the service name
<niemeyer> hallyn: e.g. add-relation myservice1:therelationname service2
<niemeyer> hallyn: For instance
<hallyn> oh, will try that if it fails again, thanks.  but i was specifying 'provides: slave: x', which probably is confusing
<SpamapS> the mediawiki formula has two mysql interface requires relations, so one always has to specify whether they want the 'db' relation or the 'slave' relation.
<niemeyer> Ok, lunch!
<niemeyer> biab
<kim0> niemeyer: is there a good way to "debug-hooks" before "install" hook fires .. other than launching debug-hooks in the nick of time
 * SpamapS notes that ensemble should Recommend: python-pydot
<_mup_> txzookeeper/session-event-handling r42 committed by kapil.foss@gmail.com
<_mup_> watch wrapper that diverts session events to a session event callback set on a connection.
<kim0> SpamapS: what does that do
<SpamapS> kim0: bzr status --format dot
<SpamapS> err
<SpamapS> haha
<SpamapS> ensemble status --format dot
<SpamapS> I keep mixing those two up. :-P
<kim0> installing
<SpamapS> hmm doesn't seem to work ...
<SpamapS> bcsaller: having trouble using ensemble status --format dot ..
<SpamapS> http://paste.ubuntu.com/617639/
<bcsaller> ensemble status --format svg --output /tmp/status.svg ; eog /tmp/status.svg
<SpamapS> Oh!
<bcsaller> though that error is confusing, it looks like there is some other issue
<SpamapS> same error
<kim0> bcsaller: wow that rox :)
<bcsaller> SpamapS: can you email me the dot file, it seems like there is some version error happening here, I want to see if I can find it 
<SpamapS> bcsaller: where would the dot file be?
<bcsaller> hmm. nowhere if you can't even generate it I guess :-/
<SpamapS> bcsaller: Here's my status output... http://paste.ubuntu.com/617644/
<SpamapS> You can reproduce very easily with the mediawiki test from principia-tools
<bcsaller> the first error listed a tmp file, is that left around or cleaned up?
<SpamapS> bcsaller: if you have a quick answer that would be awesome.. trying to put together a "WTF is ensemble" blog post
<SpamapS> cleaned up
<_mup_> Bug #792448 was filed: status dot ouput fails with errors <Ensemble:New> < https://launchpad.net/bugs/792448 >
<kim0> Is it ok to put template files besides hooks .. and use them from the hooks ?
<SpamapS> yep
<SpamapS> kim0: the whole formula dir is zipped up and sent to the machine
<kim0> cool
<kim0> trying to embed php code in bash scripts is no fun :)
<SpamapS> Yeah I just went ahead and wrote a few hooks in PHP ;)
<SpamapS> so I could extract stuff from includes/settings files.
<SpamapS> PHP actually has a quite rich set of capabilities for file manipulation
<SpamapS> Hmm ok so if I wanted to spawn m1.small's instead of t1.micro's ... anybody have a clue how I might do that?
<SpamapS> I thought changing environments.yaml to add 'default-instance-type: m1.small' would do it, but still seem to be spawning t1.micro's
<hazmat> SpamapS, hmm
<hazmat> default-instance-type should do it
 * hazmat digs
<SpamapS> environments:
<SpamapS>   sample:
<SpamapS>     type: ec2
<hazmat> SpamapS, default-instance-type: m1.small after that should do it
<hazmat> hmm.. oh against an existing environment
<SpamapS> right
<SpamapS> can I specify it to add-unit/deploy ?
<SpamapS> that would be t3h awesome
<SpamapS> also before I go and try it, would you expect an instance that reboots to re-join ensemble properly?
<SpamapS> because I can always convert t1.micro's to m1.small's
<hazmat> SpamapS, no re rejoin.. not until we do upstart integration
<SpamapS> so on boot up it won't get run?
<hazmat> SpamapS, yeah
<SpamapS> doh
<hazmat> SpamapS, the config setting change locally to environments.yaml is  propagated to the provisioning agent on deploy
<SpamapS> so .. that would be something that needs to be overridable at runtime with a cmdline switch
<hazmat> making a provider image identifier part of the machine state would allow for the cmd line switch, to be processed by the provisioning agent.
<SpamapS> no offense, but all I see there is "eep opp glorp cmd line switch ok gru blorg" ;)
<jimbaker> hazmat, in the watcher function in this paste (http://paste.ubuntu.com/617670/), is it possible for the connection to be lost between the test in line 13 and the watch in line 16?
<SpamapS> hah, just got my AWS bill for May.. $67 .. a new record!
<hazmat> SpamapS, i think i got to the 200s playing around with cloudformation.. that was painful
<hazmat> SpamapS, fair enough re klingon speak
<SpamapS> hazmat: just giving you a hard time because I am too lazy to read the specs. ;)
 * niemeyer waves
<niemeyer> kim0: We have plans for that
<niemeyer> kim0: We discussed a bit the idea of having something like deploy --debug-hooks
<SpamapS> You know.. t1.micro's are really unpredictable for performance at all. :-P
<niemeyer> kim0: To fire a new unit with debugging on
<niemeyer> SpamapS: Yeah, I think we should switch back to small by default
<SpamapS> I have 6 in a mediawiki right now.. sometimes its *FAST* .. then a few of them just *stall*
<niemeyer> SpamapS: Otherwise Ensemble will end up being blamed for this
<hazmat> jimbaker, it's conceivable, but if you're getting that, there's background activity when the test closes.. which we're trying to avoid.. typically the teardown closes a client, to kill watches before clearing out the tree
<niemeyer> SpamapS: Yeah, that's exactly what they advertise them to be
<SpamapS> niemeyer: agreed, especially since people can't take advantage of the reboot-into-something-bigger game of t1.micro
<jimbaker> hazmat, this is in reference to https://code.launchpad.net/~jimbaker/ensemble/expose-watch-exposed-flag/+merge/63066, but the specific code is the watch_resolved function, which has the same guard
<niemeyer> SpamapS: It can spike in performance, at the price of stalling completely
<niemeyer> SpamapS: Cool, let's do this then
<SpamapS> I'm going to re-run my tests with m1.small and compare.
<SpamapS> They're, what, 2x the price?
<niemeyer> SpamapS: Sweet, let us know
<niemeyer> SpamapS: More
<hazmat> jimbaker, yeah.. its possibly not relevant anymore post godot-redux
<niemeyer> SpamapS: But it's worth it
<SpamapS> I was thinking it would be cool if we can also add in spot pricing capabilities...
<jimbaker> niemeyer, does that make sense to you?
<niemeyer> jimbaker: What's that?
<SpamapS> Since we can always add units.. spot priced units would be ideal for expanding beyond a certain level.. if they disappear.. oops, just add a full price unit.
<hazmat> jimbaker, i added that in when doing looped tests of some of the watch apis
<jimbaker> hazmat, correct, it's necessary to support such looped tests
<hazmat> jimbaker, i guess the question is are you seeing a problem?
<hazmat> and if so what is it
<jimbaker> niemeyer, this in reference to your review of the expose-watch-exposed-flag branch, and asking why this guard was necessary
<jimbaker> hazmat, i'm trying to investigate niemeyer'
<niemeyer> jimbaker: and what's the conclusion?
<jimbaker> s review point
<jimbaker> hazmat says it is necessary, and that there should not be a scenario where the connection is in fact lost between the guard and the establishment of the watch
<jimbaker> which was my concern
<jimbaker> it would be nice if we could have the need for this guard tested outside of using -u loops
<SpamapS> bcsaller: fyi, its something to do with the scale or make-up of the status.. if I do it with just the bootstrap node I don't get the dot error
<jimbaker> maybe a separate branch for that, so we can do it for all the watches using this guard?
<niemeyer> jimbaker, hazmat: I don't understand why this is necessary in this specific location, and not everywhere else in Ensemble
<bcsaller> SpamapS: interesting. The test case for it uses a med. size multi-node topology so this is a little unexpected. I'll try to get to it later today
<jimbaker> niemeyer, i understand the pragmatic reason for it. the tests otherwise fail. hazmat, anything else you would want to add?
<niemeyer> jimbaker: Why do they fail?
<jimbaker> niemeyer, for the same reason they always do - we are still doing something in the background when the tests are being torn down
<jimbaker> niemeyer, so there is a probably a good solution in this case
<jimbaker> write better tests :)
<jimbaker> in thinking this through, what's confusing about the guard is that it is making an assertion about being connected which as you mention applies to any code hitting zk
<niemeyer> jimbaker: Yes, please :)
<jimbaker> niemeyer, do you want me to fix the other watches with this problem?
<niemeyer> jimbaker: There shouldn't be background logic happening on tests in unexpected ways
<niemeyer> jimbaker: Not in this branch
<niemeyer> jimbaker: Just don't introduce more of it
<jimbaker> niemeyer, ok
<hazmat> jimbaker, if you remove the check, can you run the tests looped?
<jimbaker> niemeyer, actually most of the work i spent on the corresponding branch (expose-provision-service-hierarchy) was cleaning up this problem. so certainly one that has a familiar quality
<jimbaker> hazmat, no, they will always fail
<jimbaker> hazmat, but the check should not be there at all, per what niemeyer has said
<SpamapS> oh yeah, m1.small already just *feels* faster to navigate
<jimbaker> since it is compensating for bad tests
<niemeyer> Right
<jimbaker> that don't properly clean up
<hazmat> jimbaker, unless the watch is stopped there is always a chance for background activity on a watch
<jimbaker> hazmat, what can we do to fix that then?
<jimbaker> seems like a fundamental thing to get right
<niemeyer> jimbaker: Don't allow a watch to stay alive after the test has finished, ever
<hazmat> the state request protocol have lifecycle methods to control stopping them
<hazmat> jimbaker, in the interim.. raising a stopwatcher from the callback is an alternative
<niemeyer> hazmat: What about closing the connection?
<jimbaker> hazmat, right, that's what i do to control such watches for the provision agent tests
<jimbaker> niemeyer, that sounds like a good way to test this type of solution :)
<niemeyer> jimbaker: We do this in several places already
<jimbaker> niemeyer, sounds good, i will look for that code
<niemeyer> jimbaker: and it's always worked for me in the places I had such logic
<niemeyer> jimbaker: See tearDown of StateTestBase
<jimbaker> hazmat, it seems to me that we are moving to supporting StopWatcher in all the watches. certainly the one that is to be fixed for now that's the case, in fact that's part of the changes in this branch
<jimbaker> niemeyer, thanks
<hazmat> niemeyer, that's part of the problem, connection close while the watch is firing in the background
<niemeyer> hazmat: Well, that means a watch is firing as part of the test action itself
<niemeyer> hazmat: Which falls back to the conversation we had before
<niemeyer> hazmat: Currently we sleep in those cases, we need a way to sync up the zk connection
<niemeyer> jimbaker: StopWatcher is unrelated, IMO
<niemeyer> jimbaker: StopWatcher means the watch is being fired in the first place
<_mup_> ensemble/debug-hook-fixes r245 committed by gustavo@niemeyer.net
<_mup_> Send error output to /dev/null in cases we expect the
<_mup_> command to fail.
<niemeyer> Ok, I think I'm happy with this branch
<niemeyer> Will push it for review
<hallyn> jamespage: yay, we're in business.
<SpamapS> You guys ready to submit to principia? ;)
<hallyn> now my only problem is lp isn't letting me create lp:~serge-hallyn/principia/oneiric/jenkins-slave/trunk
<hallyn> it let me create jenkins/trunk...
<SpamapS> hallyn: you have to create a package called 'jenkins-slave' .. bug in LP
<SpamapS> hallyn: why isn't your formula just 'jenkins' though?
<hallyn> SpamapS: the one for jenkins is :)
<hallyn> i used a separate one for slaves
<SpamapS> I'm curious why
<SpamapS> I'd think thats just a difference of relations
<hallyn> uh
<hallyn> well i install different packages
<hallyn> at init
<hallyn> i probably could just install all everywhere...
<SpamapS> Right, or you could just install the packages you need at the time that the relationship is established.
<hallyn> meh
<SpamapS> I'm guessing a jenkins slave is way simpler than a jenkins master
<hallyn> so, whereas i now have 'jenkins-relation-joined' in both jenkins and jenkins-slave,
<hallyn> would i call them master/slave?
<hallyn> ok first step
<SpamapS> thats what I did in the mysql formula
<hallyn> how do i 'create the package'?
<hallyn> ok.
<SpamapS> upload a dummy package to a ppa anywhere on lp
<hallyn> i'll try consolidating
<hallyn> got it
<SpamapS> the benefit of consolidating is that you have one place to gather all of the operational knowledge for jenkins
<hallyn> yeah
<kim0> niemeyer: relation-get in "upgrade-formula" seems to blow up spectacularly
<hallyn> will do and get back to you
<SpamapS> the benefit of not consolidating is you have two simpler formulas instead of one possibly more complex formula
<SpamapS> kim0: probably same problem as the blow up in install
<niemeyer> kim0: I've reported this bug yesterday
 * kim0 nods
<SpamapS> hallyn: its been working great for mysql... I deploy a master-db, then slave-db .. then relate them
<niemeyer> kim0: upgrade-formula is not part of a relation, so it doesn't make sense to call relation-get
<hallyn> how do you deploy them?
<niemeyer> kim0: relation-get shouldn't break that way, though
<kim0> yeah exactly
<SpamapS> lp:principia-tools/tests/wiki-slave.sh is an example
<hallyn> i've not gotten around to using multiple instances of a formula, dunno how to name them :)
<SpamapS> ensemble deploy mysql master-db  names the service master-db
<hallyn> jamespage: so if a slave first does 'sudo apt-get install jenkins' that shouldn't be a big deal right?  do you think it's worth doing apt-get remove jenkins before installing jenkins-slave when it gets related to a master?
<hazmat> kim0, the relation cli api only works currently in relation hooks
<hallyn> oh, why not
<kim0> hazmat: yeah got that .. still a more graceful error would be nicer :)
<hazmat> indeed
<SpamapS> hallyn: is jenkins-slave way way simpler than jenkins?
<_mup_> ensemble/debug-hook-fixes r246 committed by gustavo@niemeyer.net
<_mup_> Self review:
<_mup_> - Some additional comments.
<_mup_> - Add new valid hooks on debug-hooks to tests, even though those tests
<_mup_>   aren't really great. They will not break once we have more hooks.
<SpamapS> hallyn: I mean, if you have to jump through hoops to morph it into something completely different, then I may say that its better to split them.
<_mup_> ensemble/debug-hook-fixes r247 committed by gustavo@niemeyer.net
<_mup_> Minor comment fix.
<jamespage> hallyn: well you will end up with another running jenkins master node in that case
<SpamapS> hrm.. wordpress's "<code>" tag kind of sucks in my theme. :-/
<hallyn> jamespage: right so i'm just doing apt-get purge
<_mup_> ensemble/micro-is-slugish r241 committed by gustavo@niemeyer.net
<_mup_> Switch default instance type to t1.micro.  Does that work?
<niemeyer> Erm
<_mup_> ensemble/micro-is-slugish r241 committed by gustavo@niemeyer.net
<_mup_> Switch default instance type to m1.small.  Does that work?
<hallyn> SpamapS: gr.  well it should be pretty simple, it just forces me to become more familiar with the terminology.  So, the consolidated version isn't working yet for me...  working on it
<SpamapS> hallyn: what I mean is, if jenkins-slave is an entirely different piece of software than jenkins, I may have been wrong that they should be consolidated.
<jamespage> SpamapS, hallyn
 * hallyn listens up
<jamespage> I was thinking that once we have this working well I might take a look at writing a Jenkins plugin to automagically add and remove slaves depending on demand
<jamespage> using ensemble
<SpamapS> NICE
<niemeyer> Sweet!
<kim0> niemeyer: btw the open-tunnel isn't really working for me
<niemeyer> jamespage: ensemble add-unit jenkins FTW!
<niemeyer> kim0: What happens?
<kim0> niemeyer: I think it's just equally slow
<niemeyer> kim0: When doing what?
<kim0> status
<kim0> niemeyer: status takes 5.4s 
<niemeyer> kim0: status should definitely improve with the tunnel open
<niemeyer> kim0: Let me try here
<kim0> niemeyer: could u launch as ensemble open-tunnel &
 * niemeyer cranks another environment.. Amazon says Katching!
<kim0> hehe
<niemeyer> kim0: I'd recommend against it
<niemeyer> kim0: Keep it open in the forefront
<niemeyer> kim0: and put a warning in your wall saying "MY ENVIRONMENT IS EXPOSED"
<kim0> the 2 times I tried it .. I bg'ed it
<kim0> exposed ?!
<kim0> isn't it only to my user?
<niemeyer> kim0: You have a root tunnel to your environment's control machine
<niemeyer> kim0: That's why we don't just do that magically for you
<kim0> niemeyer: got it .. so no technical reason not to background, but a security one
<niemeyer> kim0: It's neat and all, but be aware :)
<kim0> hehe yeah :)
<kim0> ok I'd love to have this working
<niemeyer> kim0: It should be working.. trying right now
<niemeyer> With tunnel:
<niemeyer> bin/ensemble status  0.29s user 0.08s system 8% cpu 4.309 total
<niemeyer> Without tunnel:
<niemeyer> bin/ensemble status  0.34s user 0.05s system 5% cpu 7.033 total
<niemeyer> kim0: Seems to work
<niemeyer> kim0: If you want to be sure:
<kim0> 0.3s! something must be wrong on my box (it's 5s)
<niemeyer> kim0: 1) Open the tunnel
<kim0> it's open
<niemeyer> kim0: On another terminal, run this:
<niemeyer> kim0: ensemble ssh 0
<niemeyer> kim0: Then, close the tunnel with CTRL-C
<niemeyer> kim0: What happened to your ssh?
<niemeyer> kim0: You were looking at the wrong number.. it's 4.3s
<kim0> niemeyer: it was terminated
<niemeyer> kim0: Bingo
<niemeyer> kim0: It was going through the tunnel
<kim0> ah yeah .. You'd think the tunnel would make it even faster
<niemeyer> kim0: That's the penalty of living at thousands of km from the DC
<kim0> is it bandwidth limited
<niemeyer> kim0: Latency
<niemeyer> Okay!
<niemeyer> m1.small works..
<niemeyer> hazmat++
<_mup_> Bug #792491 was filed: t1.micro is too sluggish <Ensemble:In Progress by niemeyer> < https://launchpad.net/bugs/792491 >
<niemeyer> In review
<niemeyer> kim0: I'll look at the problem of the default directory now
<niemeyer> Actually.. standup..
<niemeyer> Who's up for it?
<niemeyer> hazmat, jimbaker, bcsaller: Yo?
<hazmat> i'm game if need be.. almost done with session work
<bcsaller> almost done addressing review
 * hazmat parks in mumble
<jimbaker> niemeyer, sounds good to me
<niemeyer> jimbaker: We're all there
<jimbaker> yeah, my laptop just needed to reboot... unity issues ;)
<niemeyer> jimbaker: Heh, sure, blame unity :)
<jimbaker> which features a need for daily reboots it seems, otherwise everything else is great
<jimbaker> fortunately i know about alt-f1
<jimbaker> so fewer reboots required
<hallyn> SpamapS: lp:~serge-hallyn/principia/oneiric/jenkins works for me
<hallyn> (jenkins/trunc, that is)
<SpamapS> right
<hallyn> still have some jenkins usage to learn to figure out how to use this to test kernel compiles
<jimbaker> well it looks no sound for me
<SpamapS> hallyn: for new formulas, I'm going to put together a "how to propose a new formula" document at some point. Right now I think you can just bug me.. eventually we'll need a "NEW" queue like thing.
<hallyn> if it was all one bzr tree we could do merge requests...
<hazmat> jimbaker, unity again
<jimbaker> just one more reboot, now i have sound!
<niemeyer> https://code.launchpad.net/~jimbaker/ensemble/expose-watch-exposed-flag/+merge/63066
<SpamapS> http://fewbar.com/2011/06/so-what-is-ensemble-anyway/
<hazmat> SpamapS, woot!
<SpamapS> graph porn at the bottom. ;)
<bcsaller> SpamapS: now I'm extra sorry status didn't draw properly for you
<niemeyer> SpamapS: WOAH
<SpamapS> bcsaller: I can always add it back in. ;)
<niemeyer> SpamapS: Have you tweeted it yet?
<SpamapS> niemeyer: my blog auto-tweets. :)
<niemeyer> SpamapS: What's your username there?
<SpamapS> niemeyer: guess! ;) spamaps
<niemeyer> SpamapS: Cool :)
<koolhead17> SpamapS: is it your blogpage :D
 * koolhead17 just did niemeyer RT :P
<niemeyer> koolhead17: It is indeed
<niemeyer> An awesome intro
 * koolhead17 reading it
<_mup_> Bug #792540 was filed: example mysql formula should publish details of precreated databases <Ensemble:New for kim0> < https://launchpad.net/bugs/792540 >
<koolhead17> kim0: that website bug has been confirmed and assigned!! :D
<kim0> koolhead17: Yeah it's important ! thanks :)
<koolhead17> kim0: karma ++ :D
<kim0> haha :) enjoy
<koolhead17> http://www.youtube.com/canonicalmatters  it needs to be populated :F
<koolhead17> :D
<kim0> Just fixed the mysql example formula to make it publish DB details of precreated DBs .. yaay :)
<robbiew> SpamapS: so when can we get the narrated screencast version?
<robbiew> :P
<SpamapS> kim0: how did you manage that, given that the password is encrypted?
<kim0> SpamapS: caught it as it's being created and saved it in file like how the master password is?
<SpamapS> kim0: ah, I went through great pains to avoid that in principia. ;)
<SpamapS> kim0: but I'm not really sure why. ;)
<kim0> hehe :)
<kim0> I'm using those example formulas in tutorials .. so I value simplicity above all
<SpamapS> The reality is you shouldn't ever have to re-use those users. You just have to keep track of two things. 1) does the db exist yet or not, and 2) has the relationship been completely severed or not.
<SpamapS> as long as there is a relationship, the username/db doesn't have to change
<_mup_> ensemble/expose-watch-exposed-flag r241 committed by jim.baker@canonical.com
<_mup_> Removed unnecessary guard and separated out ports tests from ServiceStateManagerTest megatest
<_mup_> ensemble/expose-watch-exposed-flag r242 committed by jim.baker@canonical.com
<_mup_> PEP8
<_mup_> ensemble/expose-provision-service-hierarchy r264 committed by jim.baker@canonical.com
<_mup_> Merged upstread expose-watch-exposed-flag
<kim0> SpamapS: I've written a drupal formula at https://code.launchpad.net/~kim0/+junk/drupal .. Mind importing to principia ?
<SpamapS> kim0: reading now
<kim0> cool
<SpamapS> kim0: we had another guy submit one recently too.. ;)
<kim0> it's pretty basic :)
<kim0> oh np though :)
<SpamapS> kim0: I love how all formulas start with at least 5 - 10 revisions. :-p
<kim0> hahah :D
<kim0> even as basic as this one
<SpamapS> The other one had a good idea.. his was 201106011200
<kim0> a ha .. yeah like DNS zone versions
<SpamapS> Ok first off, descriptions shouldn't say what it "does", they should say what it "is".
<kim0> It's a blatant copy from the wordpress example .. so we probably should change both of em
<SpamapS> Yeah
<SpamapS> Been thinking about writing up a basic set of guidelines and sending it out for comment
<SpamapS> drush?
<kim0> drupal shell 
<kim0> nice little tool that automates lots of things
<SpamapS> kim0: right.. I'm not sure we're going to accept stuff in Principia that pulls things from archives other than Ubuntu...
<kim0> it doesn't ?
<kim0> drush is packaged
<SpamapS> ensemble-log "Using drush to download latest Drupal"
<SpamapS> cd /var/www && drush dl drupal --drupal-project-rename=ensemble
<kim0> ah ..
<kim0> hmm
<SpamapS> The reason being, then we lose the security updates of the Ubuntu release.
<SpamapS> Everything else looks fine for importing.
<kim0> although I'd argue a deployed ensemble, won't upgrade itself yet :)
<SpamapS> The drush stuff is *awesome* though.
<kim0> It can auto-update core drupal 
<kim0> and all modules/plugins/themes ..etc
<SpamapS> kim0: one can at least ssh to it and upgrade, but yeah, this goes back to being able to enforce policy across the machines.
<kim0> if ensemble would give me a hook to check/apply updates
<kim0> it would probably be better than apt :)
<SpamapS> we don't need a hook for that
<SpamapS> install a cron job
<SpamapS> But then you get into, why not just enable unattended upgrades
<kim0> ok I'll probably tweak it later for apt .. 
<SpamapS> We need to discuss this though
<kim0> Yeah .. 
<SpamapS> I don't feel adamantly opposed to having formulas that deploy software outside Ubuntu's archives in Principia.. but I do think we need to think about it.
<kim0> agreed 
<kim0> maybe start a thread
<kim0> SpamapS: if I were to deploy a drupal cluster, I'd need a shared filesystem across all nodes ... what's your take on implementing this in ensemble ? an NFS formula ?
<kim0> what about gluster on the same nodes
<SpamapS> I want to do it with several options
<SpamapS> gluster and NFS would be the simplest/safest
<SpamapS> NFS would need to be failover/drbd or something like that.
<SpamapS> gluster w/ a peer formula would be *pimp*
<kim0> I suppose that's good :)
<SpamapS> wordpress and mediawiki have the same needs
<SpamapS> Tho.. I'd bet all 3 have S3 plugins
<SpamapS> actually I *know* wordpress does because I use it
<kim0> yeah, having one formula per instance sounds a bit limiting ? wish I could group more than one formula on a single instance
<SpamapS> i totally agree
<SpamapS> like as long as neither of them provide the same interface, it should be no problem
<kim0> yeah
<kim0> and while we're at it .. I think we'd need a documented "interface" for a formula
<kim0> for example to write that drupal formula .. I had to read the mysql formula source, which doesn't sound scalable
<kim0> basically documenting the wire protocol .. nothing more
<SpamapS> Agreed, I think the providing formula should always document it. It would be great if ensemble even enforced/warned when something didn't follow the documented semantics.
 * kim0 nods
<SpamapS> http://pastebin.com/TqwWjgEB
<kim0> will we need "gets" as well
<SpamapS> http://pastebin.com/KaBHLnX9
<kim0> and will we need "when" .. which hook .. does that make a difference
<SpamapS> Yeah... s/expects/gets/
<SpamapS> Maybe
<kim0> good start though
<kim0> It will need a description .. like memcached gets/sets "ip" which is totally different afaict
<kim0> I was also thinking about the need to have "joined" when we have "changed" .. knowing that changed always fires after joined, and pretty much always has to detect whether or not this is its first run!
<SpamapS> hmm
<SpamapS> I tend to find it easy.. if relation-get has values, configure things. If not, don't.
<kim0> Yeah
<SpamapS> then joined can just create data/etc. if need be
<SpamapS> and departed/broken can clean up
<SpamapS> though broken needs something to tell us which relation broke.
<kim0> I feel like everything is still possible without joined
<kim0> will think about it a bit more
<SpamapS> well the idea is that joined *only* gets called when a new member arrives
<kim0> changed, would still be called, so it'd need to do the right thing as well
<kim0> any way .. it's past midnight for me .. nightie everyone
<kim0> enjoy your weekend
<SpamapS> kim0: will do, thanks for the formula work/ideas/everything :)
<kim0> SpamapS: thanks man :)
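
The hook pattern SpamapS describes above (joined creates data if need be, changed configures things only once relation-get has values, departed/broken clean up) might be sketched roughly as follows. This is only an illustration of the decision logic, not code from any real formula: the function names, the action strings, and the setting keys ("user", "password", "database") are all hypothetical, and the relation data is passed in as a plain dict rather than read through relation-get.

```python
# Hypothetical sketch of the hook decision logic from the discussion.
# Key names and action strings are assumptions made for illustration;
# they are not the documented interface of any real formula.

REQUIRED_KEYS = ("user", "password", "database")


def plan_changed(relation_data):
    """Return "configure" once the remote side has published every
    required setting, otherwise "wait" (the first run sees no values)."""
    if all(relation_data.get(key) for key in REQUIRED_KEYS):
        return "configure"
    return "wait"


def plan_hook(hook_name, relation_data):
    """Map a relation hook name to the action described in the chat."""
    if hook_name == "joined":
        # joined is only called when a new member arrives
        return "create-data"
    if hook_name == "changed":
        return plan_changed(relation_data)
    if hook_name in ("departed", "broken"):
        return "clean-up"
    raise ValueError("unknown hook: %s" % hook_name)
```

With this shape, changed does not need to track whether it is on its first run: an empty or partial set of relation values simply means "wait", and a later changed event with complete values means "configure".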
#ubuntu-ensemble 2011-06-04
<_mup_> ensemble/expose-hook-commands r244 committed by jim.baker@canonical.com
<_mup_> Tests to verify communications between unit agent client and server for port hook commands
 * niemeyer off!  Have a good weekend, and see you monday the latest! :-)
<koolhead17> hi all
