#ubuntu-ensemble 2011-09-05
<_mup_> ensemble/machine-with-type r337 committed by kapil.thangavelu@canonical.com
<_mup_> machine state includes provider type
<kim0> Hey folks .. is LXC deployment actually working right now
<kim0> Another question, is Ensemble known to run successfully client-side and cloud-side on Ubuntu 10.04+?
<hazmat> kim0, no to both
<hazmat> kim0, there used to be some compatibility fixes for 10.04 cloud-init, but those got yanked, so it no longer works
<kim0> hazmat: so supported versions are 11.04+ ?
<hazmat> the lxc work is in dev, i'm hoping to get it to a testing state for those who want to try off a branch by the end of the week, but first plumbers
<kim0> client and cloud ?
<hazmat> kim0, for client 10.04 is fine
<hazmat> for the cloud portion we need 11.04 or newer
<kim0> got it
<kim0> thanks
<hazmat> np
<hazmat> kim0, are you going to be at plumbers?
<kim0> nope
<kim0> Dustin is however
<hazmat> kim0, oh.. i thought you were co-track lead?
<hazmat> k
<fwereade> hazmat: btw, I have done my best to review everything, but I ended up a bit tired by the time I got to bcsaller's LXC branch -- can I ask you to pick that one up please?
<hazmat> fwereade, absolutely
<hazmat> fwereade, also fwiw today's a national holiday in the us
<fwereade> hazmat, cheers :)
<fwereade> hazmat: ah, I missed that
<hazmat> so the review queue might be sitting through till tomorrow.. i'll try to dig into some of that today though
<fwereade> hazmat: no rush, I don't mean to demand you work on a day off ;)
<hazmat> i'm going to spend most of the day getting ready for a presentation on wednesday anyways
<hazmat> fwereade, no worries,  wasn't planning on taking the day off
 * niemeyer waves
<fwereade> niemeyer: heyhey
<fwereade> niemeyer: hope you're ok?
<niemeyer> fwereade: Yeah, mostly alright
<fwereade> glad to hear it :)
<niemeyer> fwereade: Thanks to modern medicine :-)
<fwereade> niemeyer: what happened?
<niemeyer> fwereade: Fuzzed around with a minor tooth issue a bit too much and made it into a bigger issue
<fwereade> niemeyer: ouch :(
<fwereade> niemeyer: I'd been vaguely wondering if you were, I dunno, into extreme mountain fighting or something ;)
<niemeyer> fwereade: Haha, no.. I might feel less silly if that was the case, though :-)
<niemeyer> fwereade: Thanks for the docstring changes
<niemeyer> fwereade: Very nice to have it standardized
<niemeyer> fwereade: I'm slightly concerned it might be going overboard in terms of syntax in some cases.. not sure if the benefit of the linkage is worth the reduced readability in the code itself
<niemeyer> fwereade: Simple example:
<niemeyer> -        @param instance_id: instance_id of the `ProviderMachine` you want.
<niemeyer> vs.
<niemeyer> +        :param str instance_id: :attr:`instance_id` of the
<niemeyer> +            :class:`ensemble.machine.ProviderMachine` you want.
<hazmat> it reads better as html
<niemeyer> hazmat: Yeah, I'm just not sure if the benefit of the linkage in HTML is worth the reduced readability in the code itself
<niemeyer> hazmat: I suspect we'll be reading the code more often than the HTML
<hazmat> niemeyer, yeah.. that particular example is a bit obtuse
<hazmat> and self referential
<fwereade> I did wonder about that, but I found it very hard to justify not having stuff linked if it were there
<fwereade> I felt on balance that the awesomeness of the linkage made up for the ugliness of the code, for the simple reason that when we're reading the code we're reading the *code* -- I trust docstrings about as much as I trust comments, which is to say not at all ;)
<fwereade> and so I end up viewing the docstrings as targeted more to a reader of the docs than a reader of the code
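The two docstring styles being compared, expanded into complete (hypothetical) method stubs: the plain `@param` form versus the Sphinx info-field form with `:class:`/`:attr:` cross-references. The method names and bodies here are illustrative, not the actual trunk code.

```python
def get_machine_plain(self, instance_id):
    """Return the provider machine with the given id.

    @param instance_id: instance_id of the ProviderMachine you want.
    """


def get_machine_sphinx(self, instance_id):
    """Return the provider machine with the given id.

    :param str instance_id: :attr:`instance_id` of the
        :class:`ensemble.machine.ProviderMachine` you want.
    """
```

The second form is noisier in the source, but Sphinx turns each role into a hyperlink in the rendered HTML, which is the linkage being weighed against readability here.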
<hazmat> fwereade, instance_id = opaque string identifier specific to the provider
<hazmat> fwereade, were you able to build the html output?
<fwereade> I was, yes, are you having problems?
<hazmat> fwereade, haven't tried.. jim mentioned some issue, wasn't clear if it was endemic or not
<hazmat> fwereade, was that with the code linked using autodoc?
<fwereade> hazmat: I didn't end up hitting jim's issue, so I'm not sure if it's waiting to trip me up as I rationalise the rest, or whether it was just an artifact of his approach
<niemeyer> Will get some "food".. biab
<hazmat> niemeyer, cheers
<fwereade> hazmat: instance_id -- yes, I know; but I don't follow your train of thought
<fwereade> hazmat: they're also how we identify machines at the provider level
<hazmat> fwereade, ah.. i think i missed the context of the diff
<hazmat> i thought that was on a provider machine, but i see now it's actually more likely get_machine
<fwereade> hazmat: I'd be sympathetic to an approach in which we used machine_id throughout, but it doesn't seem very convenient atm ;)
<hazmat> fwereade, but you are using autodoc when building the html output?
<fwereade> hazmat: yes, but I have a little script that does both more and less than the script jim linked the other day to build the structure we want
<fwereade> hazmat: I output a very few directives basically just saying "autodoc this module here"
<fwereade> hazmat: but the resulting structure is (IMO) nicer, and tests get excluded without enormous hassle
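A sketch of the kind of generator script being described: walk the package, emit one stub per module that just says "autodoc this module here", and prune the tests. This is hypothetical (not fwereade's actual script); the function name and stub layout are illustrative.

```python
import os


def emit_autodoc_stubs(pkg_root, out_dir):
    """Write one minimal .rst stub per module, excluding tests."""
    os.makedirs(out_dir, exist_ok=True)
    for dirpath, dirnames, filenames in os.walk(pkg_root):
        # prune test packages so they never get documented
        dirnames[:] = [d for d in dirnames if d != "tests"]
        for name in sorted(filenames):
            if not name.endswith(".py") or name == "__init__.py":
                continue
            rel = os.path.relpath(os.path.join(dirpath, name[:-3]),
                                  os.path.dirname(pkg_root))
            module = rel.replace(os.sep, ".")
            with open(os.path.join(out_dir, module + ".rst"), "w") as f:
                f.write(module + "\n" + "=" * len(module) + "\n\n")
                f.write(".. automodule:: %s\n   :members:\n" % module)
```

Each stub contains nothing but a title and an `automodule` directive, so Sphinx's autodoc extension does all the real work at build time.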
<fwereade> hazmat: anyway, I must dash, ttyl :)
 * hazmat lunches
<hazmat> fwereade, cheers
<hazmat> niemeyer, the formulas/ -> formulas- change was to accommodate openstack's s3server, which didn't like the "/" separator on a key name
<hazmat> i'm not entirely sure why, as its trunk code seems to do the right thing with the key (hash on key name b4 store in dir).. it's possible it's interacting poorly with the route dispatch logic
<hazmat> but it seemed more expedient to change our storage key than to push the fixes upstream...
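The shape of the workaround, as a tiny sketch. The helper name and exact key layout are assumptions for illustration; the real change lives in the provider storage code.

```python
def formula_storage_key(name, revision):
    """Build a storage key for a formula bundle.

    nova's s3server rejects keys containing "/" (bug #829880), so
    join with "-" instead of using the old "formulas/" prefix.
    """
    return "formulas-%s-%d" % (name, revision)
```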
<hazmat> kirkland, ping
<niemeyer> hazmat: WTF, seriously?
<hazmat> niemeyer, indeed
<niemeyer> hazmat: Cool, thanks for the explanation.. +1
<hazmat> niemeyer, its the bug where upstream explicitly noted s3 compatibility isn't a goal.. https://bugs.launchpad.net/nova/+bug/829880
<_mup_> Bug #829880: object store doesn't like key with '/'  <Ensemble:Triaged by hazmat> <OpenStack Compute (nova):Confirmed> <ensemble (Ubuntu):New> < https://launchpad.net/bugs/829880 >
<niemeyer> hazmat: Yeah, it's just not clear anymore why we're even talking about S3 in this context now
<hazmat> niemeyer, well it is *just* enough s3 for  ec2 compatibility.. which is all ensemble needs as well atm.... 
<niemeyer> hazmat: Yeah, I commented on it even.. it just wasn't so obvious to me until now that it's another beast entirely
<hazmat> ah
 * hazmat gets back to work on his presentation
<hazmat> niemeyer, also fwiw, i'll be offline most of tomorrow morning, in transit to plumbers
<niemeyer> hazmat: Sounds great, thanks for your assistance today, btw
<thebishop> hello, I'm interested in the new "Orchestra" feature.  I know it's not the primary use, but can Orchestra be used as a Proxmox "private cloud" replacement as well?
<hazmat> thebishop, proxmox ve looks like a commercial ovirt?... orchestra+ensemble+openstack is effectively auto private cloud provisioning.. its not clear what exactly proxmox ve is doing outside of providing a web ui on top of a virt mechanism (kvm or openvz)
<hazmat> ah.. ic http://pve.proxmox.com/wiki/Vision
<hazmat> so ensemble does real encapsulation at a service level instead of just virtual image appliance management
<hazmat> but it doesn't include backup/restore builtin or live migration atm
<hazmat> lxc support for the latter features is still in the works
<hazmat> ugh.. i really want to use the local dev for the talk
<_mup_> ensemble/machine-with-provider-type r359 committed by kapil.thangavelu@canonical.com
<_mup_> pull machine-with-type into local dev pipeline
<_mup_> ensemble/no-regex-option r338 committed by gustavo@niemeyer.net
<_mup_> Dropped unused class var, as pointed out by William.
<_mup_> ensemble/trunk r337 committed by gustavo@niemeyer.net
<_mup_> Merged no-regex-option branch [r=fwereade,hazmat]
<_mup_> This removes the regex configuration type, and renames 'str' to 'string'.
<_mup_> For the second change, it also introduces backwards compatibility logic
<_mup_> so that we can continue to work with 'str' for the moment while we warn
<_mup_> authors to move out of it.
<_mup_> ensemble/trunk r338 committed by gustavo@niemeyer.net
<_mup_> Merged simplify-iface-schema branch [r=hazmat,fwereade]
<_mup_> Minor simplification in the implementation of the interface schema.
<_mup_> ensemble/go r3 committed by gustavo@niemeyer.net
<_mup_> Merged go-iface-schemas branch [r=fwereade,hazmat]
<_mup_> This kicks off the formula schema support in the Go port.
<_mup_> ensemble/go r4 committed by gustavo@niemeyer.net
<_mup_> Merged go-initial-formula-meta branch [r=fwereade,hazmat]
<_mup_> This introduces metadata.yaml parsing in the Go port.
<_mup_> ensemble/go r5 committed by gustavo@niemeyer.net
<_mup_> Merge go-rename-short-types branch [r=hazmat,fwereade,jkakar] (!)
<_mup_> This makes the schema.M/L types more readable, as suggested by Kapil.
<_mup_> ensemble/go r6 committed by gustavo@niemeyer.net
<_mup_> Merged go-final-formula-meta branch [r=fwereade,hazmat]
<_mup_> This completes the parsing of metadata.yaml in the Go port.
<_mup_> ensemble/go-formulas r14 committed by gustavo@niemeyer.net
<_mup_> Added formula(_test).go files with the generic bits that were
<_mup_> all together in meta.go at first.
<_mup_> ensemble/go-formulas r15 committed by gustavo@niemeyer.net
<_mup_> Implemented initial config parsing. No variable validation using the
<_mup_> parsed schema yet.
<_mup_> Bug #842195 was filed: config.yaml handling is necessary in Go port <Ensemble:In Progress by niemeyer> < https://launchpad.net/bugs/842195 >
#ubuntu-ensemble 2011-09-06
 * hazmat yawns
<_mup_> Bug #842488 was filed: Enable cloud-init debug output to better support problem analysis <Ensemble:In Progress by smoser> < https://launchpad.net/bugs/842488 >
<_mup_> ensemble/stack-crack r332 committed by kapil.thangavelu@canonical.com
<_mup_> clean up bad merge from formula-provider-storage url integration
<hazmat> hmm.. one more openstack ec2 incompatibility found.. only affects shutdown though
<_mup_> Bug #842497 was filed: Error on openstack env shutdown <Ensemble:New for hazmat> < https://launchpad.net/bugs/842497 >
<fwereade> hazmat: what's the distinction between a machine_id and a machine_state_id?
<fwereade> hazmat: because it seems to me that they're the same, except that the requirement that it be an int is never expressed with reference to machine_id
<smoser> SpamapS, i think you're planning on getting a new ensemble into ubuntu, right? when you do i would really like to have bug 842488 in.
<_mup_> Bug #842488: Enable cloud-init debug output to better support problem analysis <Ensemble:In Progress by smoser> < https://launchpad.net/bugs/842488 >
<smoser> if people are going to use those packages, then this will really help debug.
<SpamapS> smoser: its trivial enough, I'll drop it in as a patch until it gets merged
<SpamapS> hazmat: since the txaws MP's were approved, but I'm not on that particular team, can you merge both branches into txaws?
<SpamapS> hazmat: I'll handle putting them in as a patch to the package
<jimbaker`> fwereade, re machine_state_id - i only see one reference to that in trunk, in ensemble.agents.provision
<jimbaker`> fwereade, in that case, it's certainly the same as machine_id
<fwereade> jimbaker`: heh, I didn't actually look for other references, I thought I'd already seen it elsewhere ;)
<fwereade> jimbaker`, cheers
<jimbaker`> fwereade, re being constrained to an int, the one place that appears to be the case is in terminate_machine, in the corresponding argparse. normally it's something like 0 or 42, but we often test it as a "machine-0"
<jimbaker`> we have tried not to overly constrain what these IDs are
<fwereade> jimbaker`: it seems to me that it's tighter than that, just a sec
<fwereade> jimbaker`, ensemble.state.machine, get_machine_state
<fwereade> jimbaker`, if it's not an int (or a str that can be inted) it will fail
<jimbaker`> fwereade, indeed that's the case
<fwereade> jimbaker`, so while many things can still work if it's not an int, it feels kinda wrong to throw random data around elsewhere
<fwereade> if, say, bootstrap were to create a machine called "bootstrap-0", it would indeed start a machine, but the rest of the system wouldn't be able to use it properly
<jimbaker`> fwereade, yeah, i'm not certain what distinction is intended here
<fwereade> jimbaker`, it's the sort of thing that makes me want to just tighten things up a little by asserting that the machine-id field in machine_data actually is an int when we call start_machine
<fwereade> the tests imply you can pass anything :)
<jimbaker`> fwereade, even if you were to remove this constraint? i have never asked hazmat about it, but the tests as we have discussed do use strings preferentially
<fwereade> jimbaker`, I'm not sure I want to remove any constraints, just make the ones we already have (but I only just discovered) explicit
<jimbaker`> fwereade, the external machine id generation is just a (convenient) artifact of the internal ZK representation
<fwereade> jimbaker`, your point being that the constraint is itself unnecessary?
<jimbaker`> fwereade, it's quite possible this constraint is rather old
<jimbaker`> and we have never exercised it
<jimbaker`> it's certainly the intent expressed in a lot of tests that the machine id could in fact be something besides an int
<fwereade> jimbaker`, hmm, makes sense
<jimbaker`> fwereade, fwiw, hazmat is credited with that code (or at least touching those lines) in bzr blame. but my bzr fu is currently too weak to know how to map it to the rev # on trunk
<fwereade> jimbaker`, I'm just going to leave an XXX there for now, I don't want to make any logic changes while I'm documenting
<jimbaker`> fwereade, sounds like a good plan
<jimbaker`> fwereade, btw, are you doing the doc changes to use autodoc in one big branch?
<fwereade> jimbaker`, I'm stacking them
<fwereade> jimbaker`, I'm finding the documenting quite hard going tbh
<fwereade> jimbaker`, but it's teaching me a lot about the bits of the code I don't know
<fwereade> jimbaker`, but if I tried to do everything at once I think I'd go insane
<jimbaker`> fwereade, good to split up. it would make reviewers insane too. just don't know if you should be the one doing all of these fixes as well
<jimbaker`> fwereade, also, is it necessary to specify the full path in the rst? my understanding is that the full path is only necessary for resolving ambiguities
<jimbaker`> eg :class:`ensemble.machine.ProviderMachine` vs :class:`ProviderMachine`. i guess it's worth experimenting with
<jimbaker`> fwereade, regrettably it doesn't seem able to do the xref unless the path is explicit
<fwereade> jimbaker`, indeed, I resisted the temptation to start trying to hack up sphinx to make it resolve non-ambiguous references ;)
<fwereade> jimbaker`, and, well, it's teaching me a lot, but to be fair I'm not that likely to finish everything
<jimbaker`> fwereade, well it does seem some sort of hacking up is required, as seen in generate_modules.py
<fwereade> jimbaker` that's distinct from hacking sphinx itself ;)
<jimbaker`> fwereade, indeed, there is that distinction
<fwereade> jimbaker`, anyway, I have something else to do that's oddly uninspiring, and docs feel like a good thing to work on while I try to figure out what's bothering me in the other stuff 
<fwereade> jimbaker`, I'm not committing to documenting everything, but I can make things a little better and teach myself stuff in the process :)
<jimbaker`> fwereade, indeed, it's a very good way to learn the codebase by having to do some sort of explicit/formal pass over it, vs just reading code
<niemeyer> Hey folks
<jimbaker`> niemeyer, hi
<niemeyer> I'm heading out to lunch.. will be biab
<jimbaker`> niemeyer, enjoy!
<jcastro> lynxman: any word on macports?
<lynxman> jcastro: nope :( https://www.macports.org/ports.php?by=name&substr=ensemble
<jcastro> so is there like someone we can ping?
<m_3> jcastro, lynxman: what about homebrew?
<lynxman> m_3: if you get the port it works fine :)
<lynxman> jcastro: I'll try to ping them tomorrow in the irc channel, I'm supposedly off today (why am I on irc? I don't know)
<jcastro> heh, ok
<m_3> jcastro: thanks for the syndication!
<m_3> we're missing some posts though
<jcastro> yeah the plugin got upgraded
<m_3> Juan's got a couple in that same time period that've gone missing
<m_3> membase
<m_3> and hpcc
<jcastro> ok
<m_3> lemme know if we should do something different with the formatting to make it easier
<m_3> thanks man!
<jcastro> ok I think I see the problem
<jcastro> you need a "planet" tag AND the "featured" tag
<jcastro> so using both those tags should autosyndicate you guys
<jcastro> negronjl: ^
<m_3> ok, I'll add them on future posts
<m_3> jcastro: should I retroactively do it to old ones?
<jcastro> I've been adding them by hand in the syndication thing, I guess that would make more sense
<negronjl> jcastro:  I'll add them to all of mine ... let's see what happens then
<jcastro> k
<jcastro> hey so m_3 
<m_3> me2
<jcastro> about that node.js "easy" article I linked you to
<jcastro> the one on HN.
<m_3> I'm working on a post now
<jcastro> I was thinking a nice response on how we're making all that easier
<jcastro> and then when we pingback the guy I can send him a note
<jcastro> and then he can mention it on the original article, that would be win
<m_3> yup!
<kirkland> hazmat: pong
<m_3> one of the comments is by a friend
<m_3> small world
<jcastro> oh awesome
<negronjl> jcastro:  posts tagged
 * niemeyer waves
<niemeyer> fwereade: ping
<fwereade> niemeyer: pong
<niemeyer> fwereade: Heya!
<fwereade> niemeyer: hey :) how's it going?
<niemeyer> fwereade: Nicer today :)
<fwereade> niemeyer: jolly good
<fwereade> niemeyer: as you say, yay, modern medicine :)
<niemeyer> fwereade: How's the Orchestra stuff going?
<fwereade> niemeyer: I've been finding it oddly tricky to redo the put-auth stuff
<niemeyer> fwereade: Hmm.. how so?
<fwereade> niemeyer: I ended up doing docs and poking at go today, I'm hoping whatever's bothering me will sort itself out by tomorrow
<fwereade> niemeyer: whenever I start, it feels like I'm reinventing the wheel
<niemeyer> fwereade: Ok
<fwereade> niemeyer: HTTPClient doesn't seem to be quite right -- I need at least a couple of distinct classes that differ slightly from the ones they've provided 
<fwereade> niemeyer: really I just need to force myself to finish it off, and it'll probably turn out to be fine
<niemeyer> fwereade: Cool.. I'm hoping to have a hand from you on the store stuff
<fwereade> niemeyer: cool, that sounds like fun :)
<niemeyer> fwereade: But for that we need to have the Orchestra fully closed
<fwereade> niemeyer: well, I'll polish that auth stuff off as soon as I can and hassle RoAkSoAx for any more bugs he can think of ;)
<niemeyer> fwereade: Awesome
<niemeyer> fwereade: Once you've finished with that, please ping me so we can synchronize on the store
<fwereade> niemeyer: cool, will do :D
<fwereade> niemeyer: hopefully tomorrow :)
<niemeyer> Awesome!
<jimbaker`> fwereade, hazmat introduced that change in get_machine_state on jan 13 as part of the misc-demo-fixes branch. but it seems to be more about being able to parse internal ids
<hazmat> SpamapS, yes re merge, post plumbers
<hazmat> bus time.. bbiab
<hazmat> fwereade, jimbaker` the machine i corresponds to the zk sequence value of the machine state node
<hazmat> s/i/id
<jimbaker`> hazmat, correct
<hazmat> jimbaker`, trying to back track through the conversation, not sure what the question was
<jimbaker`> hazmat, fwereade was simply bringing up this restriction as part of his doc work
<hazmat> ah
<jimbaker`> hazmat, in many tests, machine id is often something like "machine-0"
<jimbaker`> but in a few places, like get_machine_state, we explicitly require it to be an integer, or something that has an integer portion
<jimbaker`> so "machine-0" would in fact qualify in that case ;)
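A sketch of the constraint under discussion, assuming get_machine_state just pulls the trailing integer out of the id. This is a hypothetical helper for illustration; the actual code is in ensemble.state.machine.

```python
def parse_machine_id(machine_id):
    """Extract the integer portion of a machine id.

    Accepts 0, "42", or "machine-0"; anything without a trailing
    integer (e.g. "bootstrap-x") raises ValueError.
    """
    # keep whatever follows the last "-", so "machine-0" -> "0"
    tail = str(machine_id).rsplit("-", 1)[-1]
    return int(tail)
```

This is why "machine-0" qualifies while arbitrary strings fail only at the point where the int conversion happens, rather than at the API boundary.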
<hazmat> yeah.. in retrospect, it would have been nice to have those namespaced to something more blatantly obvious.. ssh 0 .. vs ssh machine/0
<hazmat> hotspot on a bus .. rocks
<jimbaker`> hazmat, i always liked how aws does it, i-abcdef
<jimbaker`> and of course we do that in the topology tests, s-0 or m-0
<hazmat> jimbaker`, at this point our namespace precedent is "qualifier/id" for service units, which is pretty nice overall i think... definitely true that tests violate those blatantly where they can
<jimbaker`> hazmat, all vehicles should be a hotspot. i definitely enjoy my verizon router for that aspect
<hazmat> i was able to get the local dev provider bootstrapped on a plane
<jimbaker`> hazmat, and maybe the tests should do just that
<hazmat> SpamapS, is there a way to prime a debcache dir with additional packages.. i really want to have the ability to do completely offline demos, just trying to understand how to go about it.. afaics, i would also need to set up a file apt repo accessible to the containers for formula pkg installs
<_mup_> ensemble/local-ubuntu-provider r348 committed by kapil.thangavelu@canonical.com
<_mup_> work around path issues when launching machine agent, better multi bootstrap/shutdown handling, incorporate unpack formula into start unit, fix up admin identity into hierarchy initialization, make unitdeploy.is-running return a deferred
<hazmat> bcsaller, so i got about as far as creating the lxc container in response to ensemble deploy with the latest on the above branch, still lots of work, but i'm optimistic.. 
<bcsaller> hazmat: I am as well
<bcsaller> hazmat: where are you now?
<hazmat> bcsaller, i'm wondering if we can do away with the ensemble-branch/origin stuff and try to always deploy into the container the version running the cli, via copying the source tree into the container.. maybe too aggressively implicit.
<hazmat> bcsaller, on a bus just left sfo airport about 10m ago.. heading north on the 1
<SpamapS> hazmat: if the right .deb is in /var/cache/apt/archives it will not be downloaded
<bcsaller> hazmat: for local dev that might work, but the origin stuff would still be needed on EC2 I'd think
<SpamapS> /var/cache/apt/archive actually
<bcsaller> bus w/wifi, nice 
<hazmat> bcsaller, personal t-mobile hotspot.. works just as well :-)
<SpamapS> hazmat: oh debootstrap.. hm
<hazmat> bcsaller, its already outside the container on the host... its just a bit of an ugly issue when using multiple branches against the same env.. different deploys with different unit code.. and bad from an upgrade perspective.. probably makes a debian maintainer cry somewhere ;-)
<hazmat> SpamapS, excellent thanks
<hazmat> SpamapS,  i think the /var/cache/apt/archives priming should be able to do it.. the debcache thing isn't going to help per se, if we can't pass the additional packages to be installed when doing the debootstrap
<SpamapS> hazmat: you can also just build your own mirror w/ reprepro
<hazmat> SpamapS, that looks great, i was figuring we could do something like a shared local package cache on a block dev that we overlayfs into the containers
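The cache-priming trick SpamapS mentions, as a minimal shell sketch. The function name and the container rootfs layout are assumptions for illustration; the key fact is that apt skips downloading any .deb already present in /var/cache/apt/archives.

```shell
# prime_apt_cache SRC ROOTFS: copy pre-fetched .debs into a container
# rootfs's apt cache, so apt-get inside the container finds them there
# and skips the network download.
prime_apt_cache() {
    src=$1
    rootfs=$2
    cache="$rootfs/var/cache/apt/archives"
    mkdir -p "$cache"
    cp "$src"/*.deb "$cache"/
}
```

For a fully offline demo this still needs pairing with a local apt repo (e.g. reprepro) so that dependency resolution works without the network.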
<SpamapS> hazmat: lxc-clone perhaps? ;)
<hazmat> SpamapS, well i don't really want to clone the entire container, i just want a primed package cache for any given container
<SpamapS> hazmat: the default container builder only builds the mini-buntu once.
<hazmat> SpamapS, i know.. but it would be nice to extend that to common packages we might install.. i mean if i have 20 units of wordpress... i really don't need to download apache/php/wordpress 20 times
<SpamapS> IMO there's a missing hook command, "I'm ready to be turned into an image", executed once all the common operations (like package installs) are done.
<hazmat> SpamapS, perhaps.. its a little tricky... an image implies a static base point, but from the moment of deploy a formula may be launched with different configs.. it might work within a service, for a quick new unit, on the same machine...
<SpamapS> hazmat: it should work within every service if the formula author identifies the point at which they have done no more unique configuration.
<SpamapS> so during 'install' you say 'ready-for-imaging' right before you start services or record hostnames or anything like that.
<hazmat> bcsaller, we totally have to hit avatar's again... life changing
<SpamapS> Then the provisioning agent can snapshot your EBS, make an image of it, and every add-unit to that service is super fast.
<hazmat> SpamapS, good point.. assuming volume management.. which we'll probably need to tackle next cycle
<SpamapS> if there's no volume management, then the 'ready-for-imaging' command could simply do the bundling itself.
<SpamapS> Thinking in terms of real use cases vs. playing w/ ensemble.. the faster add-unit is, the more useful it is to people... they won't mind the first deploy being 10 minutes if every add-unit after that is 40 seconds.
<hazmat> SpamapS, so there are faster tricks we can do to get there
<hazmat> er.. not faster in terms of clock time, but in terms of dev time
<hazmat> i figured the easiest way to preallocate machines, just using the existing infrastructure, is to have our bootstrap set up the zk tree with however many nodes we want, and they'll be created as standby nodes for new deploys after the first deploy
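The preallocation idea, sketched as a pure helper a (hypothetical) bootstrap step could use to decide which machine nodes to pre-create. The function is illustrative; the 10-digit zero padding mirrors how ZooKeeper renders sequential child node names.

```python
def preallocate_machine_nodes(existing_ids, minimum):
    """Return the machine node names bootstrap should pre-create so
    that at least `minimum` machines exist as standbys.

    Machine ids are the ZK sequence values, rendered the way ZK
    renders sequential children (10-digit, zero-padded).
    """
    next_id = max(existing_ids) + 1 if existing_ids else 0
    needed = max(0, minimum - len(existing_ids))
    return ["machine-%010d" % (next_id + i) for i in range(needed)]
```

The provisioning side then sees the extra nodes as ordinary unassigned machines, which is what makes them act as standbys without any new infrastructure.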
<SpamapS> anyway, to your point about containers.. I think lxc-clone is what you want
<SpamapS> hazmat: yeah that would be huge.. to be able to decouple machine creation from deploy/add-unit
<hazmat> SpamapS, its basically in the vision of a min/max environment config
<hazmat> where you spec min/max machines you want to pay for
<hazmat> SpamapS, it starts to fall down because we're doing dynamic port management
<hazmat> so we can't ever do placement to avoid conflicts, since we have no static analysis capabilities upfront at formula deploy time
<hazmat> i don't really see the need for dynamic port management.. but otoh.. getting true density is going to need some network trickery anyways
<robbiew> okay folks....got a fire in some nearby woods...railroad tracks are a good barrier between the woods and my neighborhood, but I need to go afk for a bit until I know it's safe
<hazmat> robbiew, good luck and be safe
<niemeyer> robbiew: Ugh :(
<SpamapS> Austin dealing with the same type of fires we had 3 years ago. Nasty stuff.
<hazmat> SpamapS, re containers, i'm pretty sure lxc-clone isn't what i want... i really just want a local deb repo... for offline ensemble env deploys, the reprepo looks spot on for it, thanks
<SpamapS> robbiew: be safe
<SpamapS> hazmat: ok, cool. :)
 * hazmat goes back to working on presentation
<niemeyer> It surprises me a bit that we have no required/optional flag in the config 
<SpamapS> agreed
<SpamapS> there will be plenty of services with no sane default config.. they should error at deploy.
<SpamapS> alright rev 336 uploaded to ubuntu
<niemeyer> SpamapS: Hmm
<niemeyer> SpamapS: That was probably the rationale for the choice
<robbiew> fire contained....all is good
<niemeyer> SpamapS: The service should always have a sane default
<robbiew> :)
<niemeyer> robbiew: Phew
<SpamapS> niemeyer: there's no sane default title for a blog. :)
<SpamapS> "Abandoned Server" maybe
<niemeyer> SpamapS: Indeed, but it's fine to define it afterwards
<SpamapS> "TELL MY OWNER TO CHANGE ME"
<niemeyer> SpamapS: Exactly! ;-)
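What the two positions look like in a formula's config.yaml: today every option carries a default (the first entry uses niemeyer's placeholder style), while a `required` flag like the second entry is what SpamapS is asking for and does not exist. The option names and the `required` key are hypothetical.

```yaml
options:
  title:
    type: string
    default: "TELL MY OWNER TO CHANGE ME"
    description: Blog title, intended to be set after deploy.
  admin-password:
    type: string
    required: true    # hypothetical flag, unsupported today; the idea
                      # is that deploy would error if no value is given
    description: No sane default exists, so a value must be supplied.
```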
<robbiew> niemeyer: indeed...was a bit dangerous for a while, but the fire chief lives in our neighborhood...so I'm thinking that helped get resources here quickly :D
<SpamapS> niemeyer: we run into this all the time in packaging.. some services need deep configuration before being started.
<niemeyer> robbiew: As some say, it's important to know everything, but just to have the phone number of someone who does ;)
<niemeyer> s/it's im/it's not im/
<SpamapS> mysql you can really shoot yourself in the foot if you don't tune it first
<SpamapS> postgres too
<SpamapS> because the initial database creation can change the way the thing works.
<niemeyer> SpamapS: Sure, and that's where Ensemble gets in :)
<SpamapS> I guess my point is, deploying it w/o configuration would result in something that can't be related to things
<niemeyer> SpamapS: Hmm.. I don't get it..
<niemeyer> SpamapS: Why would we build a formula that can't be related to things?
<SpamapS> There are defaults, but they get you in trouble.
<niemeyer> SpamapS: Sounds like we should change them?
<SpamapS> niemeyer: if you don't turn on binary logging in mysql.. you can't relate to slaves.
<SpamapS> but if you turn it on, you double your write load
<SpamapS> which you may not want to do if you have chosen drbd as your HA method
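The tradeoff SpamapS describes, as a my.cnf fragment (paths illustrative): binary logging has to be enabled before a slave relation can work, at the cost of extra write load.

```ini
# my.cnf fragment (illustrative)
[mysqld]
log-bin   = /var/log/mysql/mysql-bin.log   # required before slaves can replicate
server-id = 1                              # must be unique across the replica set
```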
<niemeyer> SpamapS: Sounds like an easy choice to make.. enable by default.. we don't support drbd
<niemeyer> SpamapS: and offer a flag for people to tune
<niemeyer> SpamapS: Those who want something else, can have it
<SpamapS> What if you do your replication w/ rabbitmq
<SpamapS> meh
<SpamapS> Its not critical
<niemeyer> SpamapS: :-)
<SpamapS> I'm just saying that often times this is what annoys sysadmins about packaging.. that the packagers make too many decisions for the user.
<SpamapS> In fact
<SpamapS> what we need isn't optional/required
<SpamapS> but priority
<SpamapS> But thats later, when we have configurators and not just config.yaml .. n/m
<_mup_> ensemble/rename-shutdown-command r340 committed by jim.baker@canonical.com
<_mup_> Merged trunk
<hazmat> niemeyer, re config required/optional.. how about runtime relation declarations
<hazmat> robbiew, awesome, looks quite serious from a distance
<_mup_> ensemble/rename-shutdown-command r341 committed by jim.baker@canonical.com
<_mup_> Fixed review items
<hazmat> we just mod the formula state serialization directly to add the additional rel endpoints
<niemeyer> hazmat: How's one related to the other?
<hazmat> good question :-)
<niemeyer> :)
<hazmat> they're both related to runtime configuration of a service
<niemeyer> hazmat: Good try ;-)
<hazmat> say i have a generic wsgi/rails/node deploy formula.. pulls from vc... etc. based on configuration.. figures out what the particular service dependencies are for this app  and then updates the relation declarations
<hazmat> atm, i have to declare the entire world of possibilities as optional relations
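The "entire world as optional relations" workaround, as a hypothetical metadata.yaml for such a generic deploy formula. The formula name and interface names are illustrative, and whether optionality is spelled `optional: true` is an assumption.

```yaml
name: app-deploy
summary: Generic deploy formula that pulls an app from version control
requires:
  db:
    interface: mysql
    optional: true     # only one of these will actually be used,
  cache:               # depending on what the configured app needs
    interface: memcached
    optional: true
  queue:
    interface: rabbitmq
    optional: true
```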
<niemeyer> hazmat: Yeah, but there are some important reasons for that
<niemeyer> hazmat: I know what you mean, but we'll have to dig deeper to see how to model that kind of relationship
<niemeyer> hazmat: It's not sensible to simply dynamically show up with new relationships
<niemeyer> hazmat: The admin would be surprised, the formula itself would become weird (how can a new relation be introduced if there are no files expecting it), etc
<_mup_> ensemble/trunk r339 committed by jim.baker@canonical.com
<_mup_> Merged rename-shutdown-command branch [r=niemeyer,fwereade][f=838215]
<_mup_> Renames shutdown command to destroy-environment.
<hazmat> niemeyer, dynamic relations seem more 'sensible' to me than dynamic ports... everything is a tradeoff.. dynamic over static loses analysis, i mean the admin is configuring the formula to point to a particular platform app..
<niemeyer> hazmat: Awesome.. we've managed to talk about dynamic ports and required/optional config settings in the context of dynamic relations :-)
<hazmat> :-)
<hazmat> i should eat my lunch while its warm 
<niemeyer> hazmat: Enjoy!
<niemeyer> bcsaller: Is it the case that all of the inputs to config.validate are string=>string?
<niemeyer> bcsaller: If you remember that from the top of your head at all.. I can look otherwise
<bcsaller> niemeyer: I thought conversions came out of the validate process
<niemeyer> bcsaller: Yeah, that's what I think as well.. cool
<_mup_> ensemble/go-formulas r16 committed by gustavo@niemeyer.net
<_mup_> Added config validation support.
<_mup_> Bug #843299 was filed: Formula config in Go port needs validation support <Ensemble:In Progress by niemeyer> < https://launchpad.net/bugs/843299 >
#ubuntu-ensemble 2011-09-07
<_mup_> ensemble/go-formulas r17 committed by gustavo@niemeyer.net
<_mup_> Got started with formula.Dir implementation. Missing bundling
<_mup_> and expanding for completion.
<_mup_> Bug #843539 was filed: Statusd -- deployment side status computation <Ensemble:New> < https://launchpad.net/bugs/843539 >
<kim0> morning folks
<fwereade> heyhey
<_mup_> Bug #843667 was filed: Instance not provisioned in the specified region <Ensemble:New> < https://launchpad.net/bugs/843667 >
<niemeyer> Folks, it's a national holiday around here, and given the recent events, I'll probably be mostly offline today to rest a bit. 
<highvoltage> recent events?
<hazmat> g'morning
<kim0> yeah what events .. hope everything is fine
<hazmat> i think gustavo is recovering from some dentistry issues
<robbiew> nah, I know what he's talking about...nothing serious
<robbiew> lol
<jimbaker`> bcsaller, there's no bug linked to lp:~bcsaller/ensemble/statusd, so it's not showing up in the kanban
<bcsaller> jimbaker`: thanks, there was an error with lbox when I tried it, thought I fixed the issue
<jimbaker`> bcsaller, no worries
<bcsaller> jimbaker`: it created the bug, I'll link it manually
<SpamapS> wow so the shutdown command just *disappeared* ?!
<jimbaker`> SpamapS, yes, it's now destroy-environment 
<SpamapS> No "this is deprecated stop using it"? :-/
<SpamapS> Meh, it was an evil command anyway.
<jimbaker`> SpamapS, considering how devastating it can be, it seems reasonable to require this
<SpamapS> I really really don't like the introduction of incompatible changes this late tho.
<SpamapS> Even when they're quite beneficial.
<SpamapS> But, meh.
<jimbaker`> SpamapS, understood. i forgot to give you a shout when i pushed this merge in, since it was you that we were specifically trying to save ;)
<SpamapS> bug subscriptions are fine I saw it poking around
<SpamapS> I was just surprised to see it land so suddenly and break my test script
<SpamapS> but again, thats a good thing
<SpamapS> The really important one is for bootstrap to exit non-zero when an env is already bootstrapped
<jimbaker`> SpamapS, good point. re suddenly - it was marked as high and it was easy
<jimbaker`> SpamapS, i'll fix that as part of the more general bug 697093
<_mup_> Bug #697093: Ensemble command should return nonzero status code for errors <cli> <Ensemble:New for jimbaker> < https://launchpad.net/bugs/697093 >
<_mup_> Bug #844010 was filed: fix autocomplete for the rename of shutdown <Ensemble:New> < https://launchpad.net/bugs/844010 >
<jimbaker`> kim0, thanks for pointing that out, that's an unfortunate aspect of not having autocomplete going against source
<kim0> yeah :)
<kim0> wish it could
<jimbaker`> kim0, it's too bad that argparse in python leaves some obvious stuff unimplemented
 * kim0 nods
<fwereade> hey all
<jimbaker`> kim0, with bcsaller's statusd branch, we will be pretty close to having autocomplete on names
<fwereade> anyone got a few moments to talk about HTTP authentication?
<kim0> jimbaker`: that'd be awesome :)
<SpamapS> hazmat: I'm thinking I'll just patch in the openstack fixes into txaws for Oneiric.. its too critical to wait for a merge from "upstream" :p
<hazmat> SpamapS, i actually am upstream.. i can get to it tomorrow
<hazmat> SpamapS, i can cut a new release with the branch fixes as well
<hazmat> SpamapS, one additional issue is the shutdown error message, which isn't functionally bad.. and as a result i'm tempted to wave off on it for oneiric
<SpamapS> hazmat: I wasn't sure if you had to await process tho. :)
<SpamapS> hazmat: you mean the destroy-environment message :)
<hazmat> SpamapS, process is done, i got at least one other dev to sign off on it.. that's all i need
<hazmat> SpamapS, yeah that too :-)
<SpamapS> ah ok
 * niemeyer waves
* niemeyer changed the topic of #ubuntu-ensemble to: http://j.mp/ensemble-eureka | http://ensemble.ubuntu.com/docs | JUJU!
<hazmat> jimbaker`, do you know what this error might mean? /bin/ensemble deploy --repository=examples wordpress
<hazmat> No machines have addresses assigned yet
<hazmat> juju make me sick
 * hazmat looks for some mojo
<hazmat> oh.. i see its the bootstrap isn't complete yet
<hazmat> hmm.. this is odd.. error on destroy-environment.. http://paste.ubuntu.com/684699/
<robbiew> hazmat: heh
<niemeyer> hazmat: This error means the machines are still pending in EC2
<niemeyer> hazmat: I have a guess about what the underlying error means, but it looks like the exception is from a bug in txaws itself
<niemeyer> hazmat: I mentioned to jimbaker` that EC2 would raise an error if the machine disappeared while it was waiting, since ec2-describe-instances (the equivalent API) requires the id to exist
<niemeyer> hazmat: This may be the underlying cause.. but the txaws error will have to be fixed for us to be sure
<statik> lynxman: fyi I submitted a homebrew formula for ensemble. homebrew is a popular replacement for macports. https://github.com/mxcl/homebrew/pull/7488
<lynxman> statik: nice!
<lynxman> statik: hope they're faster ;)
<statik> yeah, previously when I've sent pull requests for things like upgrading to a newer version of bzr they have been merged in 45 minutes
<statik> macports has fallen out of favor with all the developers I know who use macs
<statik> i'll let you guys know if it gets merged. there are no "maintainers" so anyone can submit updates in the future.
<lynxman> statik: that's very cool, I've submitted things to macports in the past and they've been faster, I think they're trying to cope with the upgrade to Lion
<lynxman> statik: but definitely, as many flavours as possible of ensemble on the mac are a good thing
<jimbaker`> niemeyer, hazmat - i made that fix with respect to describe instances. obviously for now the immediate problem is in txaws, as mentioned
<smoser> SpamapS, can you re-review https://code.launchpad.net/~smoser/ensemble/cloud-init-output-log/+merge/73596
<smoser> should pass test now.
<SpamapS> smoser: ack
#ubuntu-ensemble 2011-09-08
<relateable> how do I configure which ssh key to use in my environments.yaml?
<niemeyer> Mornings!
 * niemeyer => lunch
<niemeyer> hazmat, jimbaker`, bcsaller: Meeting time
<niemeyer> ?
<jimbaker`> niemeyer, sounds good
<niemeyer> I've started the hangout, but if it's just the two of us it won't make much sense as a team meeting
<jimbaker`> niemeyer, maybe just wait until hazmat and bcsaller are available?
<hazmat> niemeyer, i'm in the middle of a keynote
<hazmat> niemeyer, ready in 3m
<niemeyer> hazmat: Ah, super, no worries.. we can also do it at another time
<niemeyer> hazmat: fwereade and bcsaller seem away too
<hazmat> ah
<niemeyer> hazmat: Is it going well there?
<hazmat> niemeyer, yeah.. talk went ok, sandwiched between some very technical talks (google on process scheduling efficiencies, and xen ha replication)
<niemeyer> hazmat: Wow.. pretty wide range :)
<hazmat> niemeyer, there are some interesting talks today on containers, but for the most part its not really the right audience
<hazmat> niemeyer, its all cloud ;-)
<niemeyer> True :)
<niemeyer> hazmat: I sorted out the GOPATH problem on gotest yesterday
<niemeyer> hazmat: Well.. mostly.. still need to polish it up a bit, and then battling reviews to get it merged
<hazmat> niemeyer, awesome.. incidentally google handed out gopher dolls as conference swag
<niemeyer> Wow
<niemeyer> hazmat: That's surprising
<niemeyer> hazmat: Was there anything about Go there besides the doll? :)
<hazmat> niemeyer, nope.. but there was a talk about using python for unit testing bioses ;-)
<niemeyer> hazmat: Very weird.. and a bit pointless it seems.. did people even notice the correlation?
<hazmat> niemeyer, not sure what you mean.. but the class of geeks here have very strong juju ;-)  getting a few interested seems like a cheap investment
<hazmat> niemeyer, can i start calling you medicine man? ;-)
<niemeyer> hazmat: I mean the correlation between the doll and the language
<hazmat> ah.. yeah.. it's labeled pretty clearly if discreetly
<niemeyer> I'm still waiting for Mark to join that thread :)
<niemeyer> hazmat: By the way.. today I passed by a juju label engraved in the sidewalk.. starting to get concerned about that stuff :)
<hazmat> niemeyer, we'll need to start sacrificing chickens at release time
<niemeyer> ROTFL
<jcastro> hey SpamapS and m_3 
<jcastro> we'll have 2 screens, and they can make wired ethernet available
<jcastro> but I still think we should shoot for no-network
<m_3> yes, no-net
<jcastro> SpamapS: and we can have the space anytime after 5:30 for rehearsals
<m_3> 2 screens'll be awesome
<robbiew> jcastro: m_3: so assume we have one screen for the demo, as the other will contain silbs' slides
<jcastro> ok
<jcastro> SpamapS: they need a more concrete power estimate for the server other than "4 2Uish units". Any ideas?
<m_3> gotcha
<m_3> negronjl, kirkland: is there some other ppa:canonical-sig/thirdparty with an oneiric index?
<kirkland> m_3: i think that's all legacy;  i don't think there's anything active/useful there, right negronjl ?
<m_3> we're pulling the hadoop packages from there in our formulas
<m_3> (or at least we were for natty images)
<negronjl> jcastro:  are you doing a demo on hadoop ?
<jcastro> yep
<negronjl> jcastro:  if so, it's all natty
<m_3> bummer
<negronjl> jcastro:  the current formulas will work just fine on natty.  I haven't done any testing on oneiric for hadoop yet.
<negronjl> jcastro:  It's in my TODO to work on this for oneiric
<jcastro> heh, I guess m_3 will find out how well they work
<negronjl> m_3:  ping iamfuzz on the hadoop packages for oneiric.  I believe he was working on something like that
<m_3> negronjl: ok, thanks!
<negronjl> m_3, jcastro, kirkland:  if iamfuzz is done with the oneiric packages, then, the only change will be the ppa change.
<SpamapS> jcastro: at least one 20AMP circuit .. maybe 2, or a single 30
<SpamapS> hazmat: talk to me about openstack support. txaws needs your branch and mine. What else needs to land?
<niemeyer> and, formula bundling works!
<niemeyer> On that note, I'll step out for some coffee
<niemeyer> SpamapS: Great question there. I'm interested on the answer too.
<niemeyer> bbiab
<niemeyer> Spoke too soon.. there was a bug in the bundling
<_mup_> ensemble/go r18 committed by gustavo@niemeyer.net
<_mup_> Implemented bundling of formulas!
<_mup_> ensemble/go r19 committed by gustavo@niemeyer.net
<_mup_> Unified formula.ParseMeta and ReadMeta.
<_mup_> Now only ReadMeta exists, and it accepts an io.Reader,
<_mup_> which is a common denominator and enables easily doing
<_mup_> what the other two could achieve.
<hazmat> bcsaller, have you run the tests on the lxc-lib branch recently?
<hazmat> bcsaller, they're all failing for me atm
<bcsaller> hazmat: its fixed locally, I'll push
<bcsaller> hazmat: you should be able to pull it now
 * hazmat updates, and reruns the tests
<SpamapS> You guys want to see something cool?
<_mup_> ensemble/go r20 committed by gustavo@niemeyer.net
<_mup_> Unified formula.ReadConfig and ParseConfig.
<_mup_> Now ReadConfig accepts an io.Reader, equivalent to the
<_mup_> ReadMeta change.
<SpamapS> checkout lp:~clint-fwebar/ensemble/gource-output ... 
<SpamapS> run 'python -u misc/status2gource.py | gource --log-format custom --file-idle-time 1000000 --highlight-dirs -
#ubuntu-ensemble 2011-09-09
<SpamapS> Oh, and then of course, add/remove stuff from your running ensemble environment
<niemeyer> jimbaker`: ping
<SpamapS> still needs to have the command changing added.. and show the units / machines.. but.. kind of cool. :)
<SpamapS> (oh, and install gource)
<hazmat> bcsaller, interesting so it still doesn't work
<hazmat> part of the issue i think is that the container needs to do an apt-get update prior to the pkg install
<hazmat> else it's referencing an old version of ensemble that's not upstream anymore
<bcsaller> hazmat: ahh, that is strange, I have a script that updates the cache locally, but its just a chroot, apt-get update/upgrade
<hazmat> bcsaller, it should be part of the ensemble-create script, else its going to break during dev cycles every time there's an upload or prods on srus
<hazmat> bcsaller, i'm also seeing this a lot.. ensemble.lib.lxc.LXCError: lxc-stop: failed to stop 'lxc_test': Operation not permitted
<hazmat> which causes other tests to fail because the container exists
<hazmat> bcsaller, any ideas on automated alternatives to apt-get upgrade in ensemble-create?
<hazmat> it does take quite a while
<bcsaller> yeah, I've been thinking about that, its like there is another type of bootstrap before spawning a stack that you might want before spawning many nodes
<SpamapS> I think they call that a "release" ;)
<hazmat> bcsaller, well its not even that.. the environment might live for quite a while.. and this will still cause breakage when adding a new unit to an existing env 
<SpamapS> hazmat: not so on a regular release of Ubuntu
<SpamapS> the versions in the lists never "disappear"
<hazmat> SpamapS, all it takes is an sru?
<SpamapS> nope
<SpamapS> they stay there, all of them
<hazmat> SpamapS, ah cool.. so its only for dev versions that old versions get yanked?
<SpamapS> this business of purged versions is only an issue during development
<SpamapS> Yeah, for this exact reason.
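The ordering hazmat is asking for above could be sketched like this — refresh the apt index inside the container rootfs before installing, so a dev archive that has purged old versions doesn't break the pinned install. The rootfs path, package name, and helper name are illustrative, not from ensemble-create itself:

```shell
#!/bin/sh
# Sketch: run apt-get update inside the container before apt-get install,
# so the index is never stale relative to the dev archive.
# "install_in_container" and its arguments are hypothetical stand-ins.

install_in_container() {
    rootfs=$1; shift
    chroot "$rootfs" apt-get update           # refresh the stale index first
    chroot "$rootfs" apt-get install -y "$@"  # then install the packages
}

# e.g. install_in_container /var/lib/lxc/lxc_test/rootfs ensemble
```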
<hazmat> bcsaller, so even updating the debootstrap cache by hand, i'm still seeing a bunch of errors.. do the tests work for you on the lxc-lib branch?
<bcsaller> hazmat: yes
<bcsaller> hazmat: _cmd does return the output of the command if you want to make a change to look at it
<SpamapS> jimbaker`: what is "butler" ?
<SpamapS> sounds a lot like jenkins. :)
<hazmat> bcsaller, http://paste.ubuntu.com/685682/
<_mup_> ensemble/lib-lxc-merge r339 committed by kapil.thangavelu@canonical.com
<_mup_> merge latest lxc-lib
<bcsaller> hazmat: I'll see what I can do :-/ Just not sure there is a good place in the lifecycle for this. You think the tests timeout was what killed it and then it didn't clean up for the later tests?
<hazmat> bcsaller, hmm.. there's a couple of issues, i don't think the timeout is one of them
<hazmat> bcsaller, the container cleanup needs to wait for stopped state before proceeding to destroy
<hazmat> _cmd is spitting the output on error, without any context of what command it ran... although that's not a functional problem
<SpamapS> hazmat: are you running the containers as pure daemons or foreground children?
<hazmat> SpamapS, daemons
<SpamapS> hazmat: so you need to watch the cgroup then
<SpamapS> or poll proc
<hazmat> SpamapS, lxc-wait does the trick
<SpamapS> oh nice
<SpamapS> didn't know that existed
<bcsaller> SpamapS: took us a while to find it too
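A minimal sketch of the cleanup ordering being discussed: block on lxc-wait until the container actually reaches STOPPED before destroying it, rather than racing lxc-destroy against an in-flight shutdown. The helper name and container name are illustrative:

```shell
#!/bin/sh
# Stop a container and only destroy it once it has really stopped.
# lxc-wait blocks until the named container reaches the given state.

stop_and_destroy() {
    name=$1
    lxc-stop -n "$name"
    lxc-wait -n "$name" -s STOPPED   # returns once the state is reached
    lxc-destroy -n "$name"
}

# e.g. stop_and_destroy lxc_test
```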
<hazmat> bcsaller, i don't see how the tests could be working
<hazmat> bcsaller, lxc-stop normally tosses an error
<hazmat> bcsaller, which will break per its integration with _cmd
<hazmat> and raise an exception
<hazmat> bcsaller, what version of lxc do you have?
<bcsaller> hazmat: 0.7.5-0ubuntu7
<hazmat> aha
<hazmat> i'm on the version in the ppa
<bcsaller> hazmat: let me know if that changes anything
<hazmat> SpamapS, any chance we can get the oneiric lxc into the ppa for natty
<hazmat> bcsaller, update-manager -d is broken for me at the moment..
<jimbaker`> SpamapS, historically a butler used to manage the buttery, which would in turn store the results of churning operations
<jimbaker`> SpamapS, that it is also similar to jenkins is not terribly coincidental either ;)
<SpamapS> hazmat: you should be able to just upload it.
<kim0> hmm does open-port support a port range
<niemeyer> Hallo Ensemblers
<kim0> Hello
<kim0> hmm .. Can I launch a long running program from an install hook
<niemeyer> kim0: Hey man
<niemeyer> kim0: Absolutely
<kim0> I was imagining I'd need tricks
<kim0> I'm doing an torrent download appliance on the cloud with Ensemble :)
<kim0> hope this will go popular with many users
<_mup_> Bug #845604 was filed: ensemble should show ports that need to be exposed <Ensemble:New> < https://launchpad.net/bugs/845604 >
<niemeyer> kim0: It does sound cooL!
<kim0> can open-port do a port-range
<niemeyer> kim0: No, but that's an interesting idea
<niemeyer> kim0: Can you please open a bug about this?
<kim0> sure thingie
<hazmat> kim0, so i imagine at some point we might try to put some sort of sensible time outs on  hooks
<hazmat> but if you fork something that should be fine
<_mup_> Bug #845616 was filed: open-port should support port ranges <Ensemble:New> < https://launchpad.net/bugs/845616 >
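Until open-port grows range support (the bug just filed), a hook can loop over the range itself. A sketch — the helper name and the port range are illustrative:

```shell
#!/bin/sh
# Workaround sketch: call open-port once per port in an inclusive range.
# "open_port_range" is a hypothetical helper; open-port is the real hook tool.

open_port_range() {
    start=$1; end=$2
    p=$start
    while [ "$p" -le "$end" ]; do
        open-port "$p/tcp"
        p=$((p + 1))
    done
}

# e.g. open_port_range 6881 6889   # illustrative torrent port range
```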
<kim0> hazmat: does that mean the install hook would not be considered "complete" 
<hazmat> kim0, if it hasn't exited ... yes
<kim0> hazmat: I suppose the better way is to double-fork my command 
<hazmat> definitely
<kim0> any advice on doing that, or should I google :)
<hazmat> kim0, probably google will be faster, what are you writing the program in?
<kim0> bash shell script
<kim0> it's a formula after all :)
<kim0> It might actually not be a bad idea for my use-case .. to start the command in screen and detach it
<kim0> I hope that would make the hooks happy
<niemeyer> kim0: start-stop-daemon may help you as well
<kim0> niemeyer: oh thanks looking at that
<kim0> woot, bash's version of double fork is: ( command & )
<kim0> of course!
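kim0's subshell trick, expanded into a hook-friendly helper — a sketch; the log path default and the example command are illustrative:

```shell
#!/bin/sh
# Detach a long-running command from an install hook so the hook can exit 0
# while the process keeps running. The subshell + background combination
# reparents the child to init, and dropping stdio keeps the hook's exit
# from hanging on open pipes. LOGFILE and the command are illustrative.

start_detached() {
    ( "$@" </dev/null >"${LOGFILE:-/tmp/daemon.log}" 2>&1 & )
}

# e.g. start_detached my-torrent-daemon   # hypothetical daemon name
```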
<kim0> When ensemble launches an instance, can it get its host SSH keys (ec2-get-console-output), such that I am not asked to confirm the machine's identity upon first login? worth a bug report?
<niemeyer> kim0: We already have one open for that 
<kim0> ah okie
<niemeyer> kim0: I *think*
<niemeyer> kim0: At least we're very aware of the issue
 * kim0 nods
<niemeyer> kim0: The proper way is to send the host key
<niemeyer> kim0: Rather than just ignoring
<niemeyer> kim0: Otherwise it's a security issue
<niemeyer> kim0: You'll likely be ignoring it anyway, but then you're the security issue! ;-D
<kim0> exactly :)
<niemeyer> kim0: We want to improve that, more seriously
<niemeyer> jimbaker`: Please ping me when you're around
<jimbaker`> niemeyer, hi
<niemeyer> jimbaker`: Yo
<niemeyer> jimbaker`: So, are we going to have a working waterfall today?
<jimbaker`> niemeyer, i should have butler working yes that runs the churns and generates the waterfall. to do so, i simply need to add code to walk the updates in a bzr branch
<niemeyer> jimbaker`: Cool
<jimbaker`> this is pretty straightforward, compare bzr revno of a local branch with bzr revno lp:ensemble
<jimbaker`> (or whatever branch)
<niemeyer> jimbaker`: Hmm.. not really.. the other point of comparison is the waterfall itself
<jimbaker`> niemeyer, what do you mean? in terms of the build runs in the waterfall directory?
<niemeyer> jimbaker`: 1) update bzr to tip; 2) i := max revno in branch; 3) j := get max revno in waterfall; 4) for j < i: update bzr to j + run tests
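The steps niemeyer lists can be sketched in shell. The bzr calls are real commands, but "run_tests" and the waterfall layout (one output file per revno) are assumptions for illustration:

```shell
#!/bin/sh
# Catch-up loop: run the test suite for every revision the waterfall
# hasn't seen yet. BRANCH/WATERFALL and "run_tests" are hypothetical.

catchup() {
    bzr update "$BRANCH"                                  # 1) update to tip
    i=$(bzr revno "$BRANCH")                              # 2) max revno in branch
    j=$(ls "$WATERFALL" 2>/dev/null | sort -n | tail -n1) # 3) max revno reported
    j=${j:-0}
    while [ "$j" -lt "$i" ]; do                           # 4) for j < i:
        j=$((j + 1))
        bzr update -r "$j" "$BRANCH"                      #    update to revno j
        run_tests "$j" > "$WATERFALL/$j"                  #    record its results
    done
}
```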
<jimbaker`> niemeyer, sounds good, thanks!
<kim0> hmm .. my ensemble deploy, results in "install_error", when I debaaaaapaaaa
<kim0> hung irc window .. continuing ..
<kim0> when I debug-hooks, and execute the hook manually .. it's exit code is 0
<kim0> any idea what could be going on
<niemeyer> kim0: No, but you can check logs locally
 * kim0 looks around
<niemeyer> kim0: In the machine itself, that is
<niemeyer> kim0: ensemble ssh <machine num>
<kim0> niemeyer: /var/log/ensemble/machine-agent.log ?
<niemeyer> kim0: yeah
<niemeyer> kim0: Wait, no
<m_3> kim0: /var/lib/ensemble/units/<unitname>/formula.log
<niemeyer> kim0: This is from the unit agent
<kim0> thanks :)
<niemeyer> kim0: What Mark says
<kim0> m_3: hmm I forgot what's the cwd for a hook?
<m_3> /var/lib/ensemble/units/<unitname>/formula/
<m_3> is that what you mean?
<kim0> yes, thanks!
<m_3> np
<SpamapS> jimbaker`: so you've never explained how your butler relates to jenkins..
<SpamapS> jimbaker`: is this just a "jenkins is too complex and I don't want to use it" or "jenkins is missing something fundamental" ?
<jimbaker`> SpamapS, this is a project specific tool
<SpamapS> Because its basically the industry standard.. and we already use it all over Ubuntu dev
<SpamapS> Still sounds a lot like it was a project specific invented tool to do what jenkins does. :-P
<SpamapS> I mean.. jenkins does code coverage analysis, distributed multi-platform testing, and a whole host of stuff I don't even understand yet.. so I'd like to understand why we're not just running bash scripts in jenkins.
<jimbaker`> SpamapS, these are all good points. the functional tests could be readily run by jenkins. however by being project specific, we can ensure that it can best meet our needs
<SpamapS> LOL, ok, thats true, integrating it with several releases is more my problem than "yours"
<SpamapS> jimbaker`: as long as I can also run it as part of our CI for upload to Ubuntu and maintenance of a "stable" PPA, I don't care what output it returns. :)
<jcastro> bike shed question: why does ensemble log in /var/lib/ensemble/whatever instead of just /var/log?
<jimbaker`> SpamapS, it will be very easy to take the output of churn and turn it into junitxml
<SpamapS> jcastro: it does use /var/log for the "machine" wide logs
<SpamapS> jcastro: the unit log ends up in /var/lib/ensemble because its eventually going to be in /var/lib/lxc/container/rootfs/....
<SpamapS> I think
<jcastro> yeah but I don't care about that, I care about the service being deployed and what not.
<SpamapS> jimbaker`: I don't even care about junitxml
<SpamapS> jimbaker`: just "pass/fail"
<SpamapS> jcastro: its a stop gap until the unit agent runs inside a container.
<jcastro> oh I see
<jimbaker`> SpamapS, sure you just need some way of summarizing the churn results
<SpamapS> jimbaker`: no, I need an exit code non 0
<SpamapS> jimbaker`: of course, I could just use run-parts on the same dir churn sees, why do I need churn? ;)
<jimbaker`> SpamapS, sounds like you don't need any part of the butler project to run the functional tests with jenkins. cool
<niemeyer> SpamapS: I don't want to buy into the whole Jenkins and all of the things it does that we don't know before we need to
<niemeyer> SpamapS: Right now our glorious functional test suite and Jenkins reinvention sums up to less than 100 lines
<SpamapS> Hey I'm not complaining.. I *do* need something jenkins has that you don't, which is running multiple tests on multiple platform slaves. :)
<niemeyer> SpamapS: Let's put that online ASAP and focus on the meat, which is the tests themselves and being able to see if trunk is working or not
<niemeyer> SpamapS: We certainly have it.. these scripts can run anywhere
<SpamapS> And one would need to coordinate the results of all of those tests.
<niemeyer> SpamapS: I know you're not complaining.. I'm just stating the reasoning we're doing this because I've heard the "Oh, but that's Jenkins" argument a few times, so wanted to explain
<niemeyer> SpamapS: Sure.. and nothing prevents us from using Jenkins when the threshold has been crossed
<SpamapS> The setup that we need is, run tests on [ all supported releases ] then copy the package into the "stable" PPA.
<SpamapS> and by we, I mean those of us integrating ensemble into Ubuntu and supporting people who use it for demos. :)
<SpamapS> triggered by changes in bzr.. and showing those changes in all reports... 
<niemeyer> SpamapS: I bet I can do this with less than 100 lines of fabric logic or similar
<niemeyer> SpamapS: But before even worrying about this, we need the tests
<SpamapS> niemeyer: I'd think we'd want to rally around one tool.. like we have for everything else at Canonical. Jenkins has been in use for well over 8 months in the platform team for testing.
<niemeyer> SpamapS: and being able to run them at all
<niemeyer> SpamapS: That's great, and nothing we're doing prevents its use
<niemeyer> SpamapS: But I don't want to buy a big truck when I need to walk next door
<niemeyer> SpamapS: We should be able to run these tests in any machine, anywhere
<niemeyer> SpamapS: checkout branch; run..
<niemeyer> SpamapS: With that covered, Jenkins support is trivial
<SpamapS> indeed, jenkins tries very hard to be "any machine" :)
<SpamapS> so getting that story right is the right focus. I was surprised to see a bunch of HTML output created and stuff.
<niemeyer> SpamapS: It's less than 50 lines of code that converts a directory full of output files into HTML
<niemeyer> SpamapS: and it's completely independent from the runner
<niemeyer> SpamapS: Which is completely independent from the Bazaar updating logic
<niemeyer> SpamapS: Again, trivial to do any of these steps in any other way..
<niemeyer> I need to get some food now.. biab
<SpamapS> ciao!
<hazmat> jcastro, things which definitively live outside of a container do log to /var/log/ensemble .. the machine and provisioning agent atm
<SpamapS> hazmat: so , given the impending release and such, I'm going to import your merge proposal as a patch to the oneiric txaws package..
<SpamapS> hazmat: even if you do make a release, there are other things in there that I'd rather just leave out of my sphere of concern
<hazmat> SpamapS, fair enough.. the biggest thing that's holding me up is i've seen an occasional regression against ec2 that i'm trying to track down
<SpamapS> hazmat: *UGH*
<SpamapS> hazmat: could you note that in the MP? That would be the suck to ship.
<hazmat> SpamapS, ugh indeed.. i'm doing some tests right now, but i'm not seeing any problems atm
<hazmat> i've definitely seen issues b4, but it might be they no longer exist
<SpamapS> do we do any extensive functional testing in txaws? Last I saw they weren't mocked up so they actually did hit Amazon
<hazmat> SpamapS, they are mocked up just differently
<hazmat> SpamapS, they have recorded responses from amazon that the parsing verifies against
<hazmat> and on the request side they verify the outbound request
<SpamapS> *ah*
<hazmat> but its not act
<SpamapS> biggest problem I keep running into is that canonistack's s3 is basically unusable 90% of the time
<niemeyer> hazmat: As ahasenack would say, problems that magically disappear, magically reappear :-)
<niemeyer> SpamapS: We should try to deploy Ceph there
<_mup_> ensemble/stack-crack r333 committed by kapil.thangavelu@canonical.com
<_mup_> merge trunk
<hazmat> SpamapS, its not that bad for me re canonistack
<hazmat> niemeyer, proper solution is to deploy swift
<hazmat> alternatively gluster
<hazmat> ceph + btrfs = chains of instability
<niemeyer> hazmat: "proper" depends a lot on context
<hazmat> niemeyer, well we're talking about a machine provider storage that has an s3 front end and scales.. swift is that
<niemeyer> hazmat: Really? Who's been using it at scale?
<hazmat> niemeyer, it powers rackspace cloud files today
<hazmat> its production code
<niemeyer> hazmat: Interesting.. I'm curious about the stability of it
<niemeyer> hazmat: Either way, Ceph is going to production soon as well
<hazmat> when we want to talk about volume/storage management by ensemble itself.. then tools like ceph/lustre/gluster are more appropriate, assuming an absence of a requisite provider capabilities (like orchestra)
<hazmat> niemeyer, i'm not sure how.. i still see lots of btrfs fails
<niemeyer> hazmat: They've been in beta for quite a while
<niemeyer> hazmat: objects.dreamhost.com
<niemeyer> hazmat: This is the restricted beta site
<hazmat> niemeyer, internal server error ;-)
<niemeyer> hazmat: Yeah, unfortunate timing
<niemeyer> hazmat: It's down ATM
<hazmat> ceph has many more moving parts and code, and depends on other things that are not production ready (btrfs)
<hazmat> compared to swift for example, but swift isn't block storage
<hazmat> er. volume storage
<hazmat> its REST object storage
<adam_g> does ceph or gluster export block devices to clients?
<adam_g> to the user, lustre is just NFS on 'roids and a nightmare to the sys admins :P
<niemeyer> adam_g: Yeah, Ceph has a kernel driver in the mainline
<niemeyer> But that's a separate piece from the object storage and S3 interfaces
<adam_g> niemeyer: right, a file system or a block driver?
<niemeyer> adam_g: "Rados block device (RBD).  The RBD driver provides a shared network block device via a Linux kernel block device driver (2.6.37+) or a Qemu/KVM storage driver based on librados.  In contrast to alternatives like iSCSI or AoE, RBD images are striped and replicated across the Ceph object storage cluster, providing reliable, scalable, and thinly provisioned access to block storage.  RBD supports read-only snapshots with rollback."
<adam_g> oh, cool
 * adam_g knows little about ceph
<adam_g> https://lists.launchpad.net/openstack/msg00053.html <- interesting
<niemeyer> adam_g: I don't claim to know much either, but its features resemble science fiction
<niemeyer> adam_g: Except it's real software backed by a real company that is doing that for quite a while
<adam_g> swift is definitely production ready stuff
<niemeyer> adam_g: It's good to hear you guys feel confident on it
<adam_g> it's the only openstack component that's seen production use. it's too bad it's lumped in and assumed to be as unstable as everything else under that umbrella
<niemeyer> adam_g: So, it's not clear to me.. how does Swift handle storage?
<adam_g> niemeyer: at what level?
<niemeyer> hazmat: Perhaps you can answer that as well.. have you been following it?
<niemeyer> adam_g: Replication, balancing, etc
<adam_g> niemeyer: http://swift.openstack.org/overview_architecture.html is a good overview
<niemeyer> adam_g: Neat, thanks
<SpamapS> hazmat: btw, re CEPH, its apparently ok to use it w/ ext3/4 now.. just not as performant.
<SpamapS> niemeyer: file level.
<SpamapS> niemeyer: swift is not a block store
<SpamapS> hazmat: maybe I'm doing something wrong w/ canonistack's s3.. it has been timing out with every request all day
<niemeyer> SpamapS: Yeah, was mostly wondering about the logic for replicating/load balancing the files
<SpamapS> Its pretty simplistic
<SpamapS> Thats a compliment to it btw. :)
<SpamapS> Its a bit more clever than MogileFS, which simply keeps track of all files in an underlying database.
<SpamapS> hazmat: heh, ignore my earlier comment about canonistack's s3 going slow.. I had left out my patch in the debian/patches/series file .. DOH!
<SpamapS> hazmat: so, do you have a workaround for the keys not being set?
<niemeyer> Stepping away.. have a good weekend folks
<SpamapS> you too niemeyer!
<hazmat> <hazmat> SpamapS, so i think the issue i'm able to trigger on ocassion also exists in txaws trunk
<hazmat> <hazmat> happens when the security group gets removed
<hazmat> <hazmat> some sort of error happens, that txaws doesn't parse properly and then it gets a traceback
<hazmat> SpamapS, as for key not set workaround not sure.. smoser has a branch for openstack and cloud-init
<hazmat> SpamapS, gustavo suggested working around by bypassing cloud-init key installation.. 
<smoser> cloud-init is uploaded
<hazmat> smoser, nice, thanks
<hazmat> but regarding lucid support we either fix in openstack, ensemble, or sru cloud-init
<_mup_> Bug #846055 was filed: Occasional error when shutting down a machine from security group removal <Ensemble:New> < https://launchpad.net/bugs/846055 >
<SpamapS> hazmat: heh, well there's no lucid series of principia.. so we don't have to worry about lucid.. right? ;-)
<SpamapS> I think fixing in nova is the right thing
<SpamapS> and it looks like the trivial MP has been approved, so just needs to land in OpenStack.
<hazmat> SpamapS, yeah.. that's ideal, i'd rather not hardcode things to bypass tools we already depend on
<adam_g> are there any plans to make the ensemble agents upstart services instead of being spawned by cloud-init?
<SpamapS> adam_g: yes, but there is some trouble to be tended to since the agents might miss changes in state if they're not running (something I think should be fine, but hazmat knows better than I do :)
<SpamapS> IMO the state is the state, and the agent's job is just to make that state a reality.. and formulas should be written that way as well.. not written in such a way where their ordering matters.
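SpamapS's "the state is the state" position amounts to a convergence loop: compare desired against current, apply only the differences. A toy sketch of that pattern (names and the `apply_change` effector are hypothetical, not Ensemble's actual agent code):

```python
def converge(desired_state, apply_change, current_state):
    """Drive current_state toward desired_state one idempotent step
    at a time. Re-running with the same inputs is a no-op, so it is
    safe to call after a restart or a missed event.

    `apply_change(key, value)` is a stand-in effector (install a
    package, write a config file, ...); it must itself be idempotent.
    """
    for key, value in desired_state.items():
        if current_state.get(key) != value:
            apply_change(key, value)
            current_state[key] = value
    # anything not in the desired state should be removed
    for key in list(current_state):
        if key not in desired_state:
            apply_change(key, None)
            del current_state[key]
    return current_state
```

The design point is that nothing depends on event ordering: however many changes were missed while an agent was down, one pass over the current state catches it up.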
<hazmat> adam_g, there are for the local dev
<hazmat> adam_g, we could go there for the provisioning and machine agent as they have no transient state; there's an issue for the unit agent that needs to be resolved before it can safely be moved over
<hazmat> adam_g, i'm using upstart for unit agents on local dev.. but its a little dicey.. there's an open ticket/bug for it
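For reference, an upstart job for an agent might look roughly like the following. The job name, module path, and log location are hypothetical; as hazmat notes, only the provisioning and machine agents are safe candidates until the unit agent's transient-state issue is resolved:

```
# /etc/init/ensemble-machine-agent.conf  (hypothetical job name and paths)
description "Ensemble machine agent"
start on runlevel [2345]
stop on runlevel [!2345]
respawn
# back off if the agent flaps instead of hammering zookeeper
respawn limit 10 5
exec /usr/bin/python -m ensemble.agents.machine >> /var/log/ensemble/machine-agent.log 2>&1
```

The `respawn` stanza is what buys automatic restarts after crashes, which cloud-init-spawned processes don't get.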
<adam_g> hazmat: im running into an issue where the agent loses connection to zookeeper due to something that the formula is doing, and needs to be restarted manually
<hazmat> adam_g, do you have any logs for the agent you can upload to a bug?
<adam_g> hazmat: yeah, let me get something and you can tell me if its relevant, or if perhaps the formula shouldn't be doing anything that would cause connectivity to drop
<hazmat> adam_g, is the formula manipulating the firewall?
<adam_g> hazmat: the firewall, no. but basically doing an ifdown -a ; ifup -a
<adam_g> http://paste.ubuntu.com/686183/
<hazmat> adam_g, hmm.. yeah.. we have some better reconnect capabilities in our zk api layer... but we haven't gone through and put the additional reconnect logic into the agents
<hazmat> adam_g, could you go ahead and file a bug for that... we should handle short disconnects a bit better
<adam_g> hazmat: ah.. i might end up not touching the network stack at all in these formulas, but that's not to say nothing else will. this problem didn't show up until deploying to hardware on a "real" network :)
<hazmat> long disconnects are a little more problematic (effectively the same problem as with upstart: transient state needs persistence, and needs to be deltaed against the remote state on connect)
<hazmat> adam_g, interesting.. its definitely on my todo list for next cycle re better disconnect handling in agents
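The "better disconnect handling" hazmat wants for short outages is usually a retry-with-backoff loop around session re-establishment. A minimal sketch, with a hypothetical `connect` callable standing in for reopening the zookeeper session (the real agents would also need to re-register watches and replay missed deltas afterwards):

```python
import time

def reconnect_with_backoff(connect, max_attempts=5, base_delay=0.1,
                           sleep=time.sleep):
    """Retry a flaky `connect()` callable with exponential backoff.

    Raises the last ConnectionError if all attempts fail, so a
    supervisor (e.g. upstart's respawn) can take over for long
    outages.
    """
    delay = base_delay
    for attempt in range(1, max_attempts + 1):
        try:
            return connect()
        except ConnectionError:
            if attempt == max_attempts:
                raise
            sleep(delay)
            delay *= 2  # exponential backoff between attempts
```

A brief `ifdown -a; ifup -a` like adam_g's would then be absorbed by the retries instead of requiring a manual agent restart.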
<_mup_> Bug #846106 was filed: Interruption of network connectivity should be handled gracefully <Ensemble:New> < https://launchpad.net/bugs/846106 >
<SpamapS> hazmat: explain transient state? Why can't we just look at what's there, and make it true?
<SpamapS> hazmat: like, if I'm starting up, and I see that there's a relation.. I should just pretend it's new and run the joined/changed hooks.
<SpamapS> hazmat: likewise for install
<SpamapS> all hooks must be idempotent
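The discipline SpamapS is asking for is that every hook converges to the same result no matter how often it runs. Real formula hooks are shell scripts, but the pattern is the same in any language; a small Python sketch (the helper name is illustrative) of an idempotent "ensure this config line exists" step:

```python
import os

def ensure_line(path, line):
    """Append `line` to `path` only if it isn't already present, so a
    hook calling this converges to the same file no matter how many
    times it runs. Returns True if the file was changed.
    """
    if os.path.exists(path):
        with open(path) as f:
            if line in (existing.rstrip("\n") for existing in f):
                return False  # already converged, nothing to do
    with open(path, "a") as f:
        f.write(line + "\n")
    return True
```

Replaying joined/changed after a restart is then harmless: the second run finds the work already done and exits without side effects.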
<_mup_> ensemble/stack-crack r334 committed by kapil.thangavelu@canonical.com
<_mup_> restore key name use temporarily
<hazmat> SpamapS, transient state, like: what have we already informed the formula about regarding the upstream zk state..
<hazmat> SpamapS, i'm trying not to assume any hooks are idempotent outside of config
<hazmat> SpamapS, ideally they should be
#ubuntu-ensemble 2011-09-10
<SpamapS> hazmat: Err...
<SpamapS> I think you should just assume it
<hazmat> SpamapS, the only bad thing is we'll basically end up re-executing hooks
<SpamapS> Yeah exactly
<SpamapS> all hooks should be able to be executed over and over and over
<hazmat> SpamapS, but we'll also miss changes, that the hook will need to detect when it gets the join event again
<SpamapS> join+changed always comes together tho
<hazmat> SpamapS, and we'll miss departures
<SpamapS> I've not found much use for those anyway ;)
<SpamapS> But I see the point
<SpamapS> seems to me that the model for those needs to be on join, each service unit gets its own node under the relation that means "I've been told about this" and then it must delete that node when it has been told about the departure
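SpamapS's per-unit marker-node idea can be sketched with an in-memory stand-in for the zookeeper nodes (this class is purely illustrative, not Ensemble code): diffing the markers this unit has created against current relation membership on reconnect yields exactly the joins and departures it missed while down.

```python
class RelationView:
    """Toy model of per-unit "I've been told about this" markers
    under a relation. `seen` stands in for the marker znodes this
    unit has created; `reconcile` stands in for the reconnect pass.
    """

    def __init__(self):
        self.seen = set()

    def reconcile(self, current_members):
        current = set(current_members)
        joined = current - self.seen    # run joined/changed hooks
        departed = self.seen - current  # run departed hooks
        self.seen = current             # create/delete marker nodes
        return sorted(joined), sorted(departed)
```

This directly addresses hazmat's "we'll miss departures" objection: a peer that vanished while the unit was offline still shows up in the `departed` set on the next pass.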
<_mup_> Bug #846129 was filed: After bootstrapping with a custom ensembe-branch, config-get no longer works. <Ensemble:New> < https://launchpad.net/bugs/846129 >
<_mup_> Bug #846197 was filed: Opinionated driver for functional tests <Ensemble:New> < https://launchpad.net/bugs/846197 >
<_mup_> Bug #846208 was filed: Provisioned nodes do not get a FQDN <Ensemble:New> <orchestra (Ubuntu):New> < https://launchpad.net/bugs/846208 >
<jcastro> negronjl: around?
<hazmat> can't seem to hit the archives 
<hazmat> odd
<hazmat> it works from host not container
<hazmat> ah.. its the race condition around resolv.conf
<_mup_> ensemble/local-ubuntu-provider r350 committed by kapil.thangavelu@canonical.com
<_mup_> redo setup of containers, and formula extraction, always log to /var/log/ensemble
<hazmat> interesting, the lxc cache captures the old dns server into /etc/resolv.conf, which if it's stale is toast
<_mup_> ensemble/local-ubuntu-provider r351 committed by kapil.thangavelu@canonical.com
<_mup_> update resolv.conf before attempting package installation in the container in the post-create script
<hazmat> hmm.. dangling symlink
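The fix in r351 boils down to refreshing the container's resolv.conf from the host before any package installation runs. A hypothetical sketch of that step (function name and path handling are illustrative, not the actual post-create script), including the dangling-symlink case hazmat hit:

```python
import os
import shutil

def refresh_resolv_conf(container_rootfs, host_resolv="/etc/resolv.conf"):
    """Copy the host's current resolv.conf into a container rootfs so
    a stale nameserver cached at image-creation time doesn't break
    package installation inside the container.
    """
    target = os.path.join(container_rootfs, "etc", "resolv.conf")
    if os.path.islink(target):
        os.unlink(target)  # a dangling symlink would make the copy fail
    shutil.copy(host_resolv, target)
    return target
```

Running this at the top of the container's post-create setup, before any apt operation, closes the race where the cached image points at a nameserver that no longer exists.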
