=== freeflyi1g is now known as freeflying === jcastro_ is now known as jcastro [04:46] hazmat: I'm not certain that the error I was getting was wrong.. need to fiddle with it more. === kim0|vacation is now known as kim0 [07:38] Morning everyone o/ === daker_ is now known as daker [13:02] kim0, g'morning [13:19] hazmat: morning :) [13:23] Good morning! [13:25] niemeyer, g'morning [13:27] niemeyer: Morning [13:32] How're things going there? [13:33] there where :) I'm back home [13:34] hey all [13:35] any particular reason we're not using FileStorage in LoadState/SaveState for EC2? [13:39] fwereade: Hmm, not sure [13:40] still trying to come to grips with setting up my new laptop [13:40] it just seems a bit odd [13:40] fwereade, we're not? [13:40] fwereade, oh.. i guess it predated filestorage [13:40] LoadState and SaveState would be completely generic if they just used provider.get_file_storage for everything [13:41] any objections to making it so? [13:41] fwereade, sounds good [13:41] hazmat: cool, cheers [13:43] on a related note [13:43] actually forget I said anything, thoughts need marshalling a mo [13:44] ok [13:44] FileStorage interface asymmetry [13:44] get returns a file handle [13:44] put takes a local file path [13:44] I favour making these consistent [13:45] is there some important feature of this interface that I'm missing? [13:46] (seems a good time to tidy that up, since a new FileStorage class is on its way) [13:47] fwereade, the get returns a file path because in some cases it's backed by a temp file [13:47] fwereade, if we returned the temp path, the cleanup becomes ambiguous [13:48] for some remote file storages (ec2 is the only extant one atm) we spool the file locally to a temp file [13:51] hazmat: I tend to favour using file handles over paths anyway [13:52] hazmat: so, let me try again: is there any reason not to pass a file handle to put, rather than a local file path?
[13:54] hazmat: if it's just an accident of convenience, I'd rather make it consistent now rather than entrenching the accident by writing a second confirming implementation ;) [13:56] er, conforming ^^^ [13:57] fwereade, re put taking a file handle, that sounds good [13:57] fwereade: There are reasons for the current interface, yes.. give me a moment and I'll be with you [13:58] fwereade, so remote_path, local_file ? [13:58] as args [13:59] hazmat: yep [13:59] hazmat: to go with local_file (*) remote_path [14:00] hazmat: as it were [14:00] niemeyer: cool [14:07] also as per mail on list, the kanban view is still borked, reviewers should look at https://code.launchpad.net/ensemble/+activereviews [14:08] fwereade: Ok, so.. [14:08] fwereade: It's not a big deal in the case of put, but note that the interface is symmetric [14:09] fwereade: get() results in a path, put() provides a path [14:09] fwereade: Or, takes a path [14:09] fwereade: There's no greater reason for the latter, besides symmetry [14:10] fwereade: For the former, it's actually more convenient to have it this way, so that we don't have to worry about the file size nor deal with full buffers and whatnot [14:10] niemeyer: well, that's what's documented, but the actual code in ec2's FileStorage appears to return a file handle [14:11] niemeyer: http://paste.ubuntu.com/651774/ [14:11] fwereade: Indeed, and that's pretty weird [14:11] fwereade: and it's not just a file handle..
[14:12] niemeyer: if the docs are the SPOT, I'll fix that as I go [14:12] fwereade: It's a hack that works around the issue I just pointed out [14:12] niemeyer: I have a preference for things I can read/write in interfaces, as opposed to paths, but it's not a big deal [14:12] fwereade: fh is a file object [14:13] fwereade: Well, if you want to go to the trouble of converting it, it's not such a big deal to me, as long as you maintain the properties just mentioned [14:13] niemeyer: quite so, sorry, poor terminology [14:13] niemeyer: I guess the other possibility is just to get/put the actual content [14:14] fwereade: Yes, that's one of the explicit things we want to _avoid_ there [14:14] niemeyer: but I guess at times that will be unhelpfully large, so scratch that [14:14] niemeyer: cool [14:14] niemeyer: I'll go with local paths throughout then [14:14] fwereade: If you want to pass a file object in and out, that's fine to me [14:15] niemeyer: it's a smaller change and it matches the docs ;) [14:16] fwereade: There's a disadvantage to it, though [14:16] fwereade: Which is likely why whoever wrote this logic ended up choosing this path [14:16] fwereade: closing the NamedTemporary will remove the file [14:16] niemeyer: ...heh [14:17] niemeyer: in that case, file objects both ways would seem to me to be the ideal [14:18] fwereade: Ok.. sounds good.. please push this change independently from the rest [14:18] niemeyer: yep, np [14:18] niemeyer: I thought of that myself this time :) [14:18] fwereade: Aha, conditioning! ;-D [14:18] niemeyer: don't stop reminding me though, I don't think it's ingrained yet ;) [14:38] hazmat: Do you know what's the reason why the kanban is breaking?
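The symmetric, file-object-based FileStorage interface that fwereade and niemeyer settle on above might look roughly like this. This is only a sketch: the class and method names are illustrative, an in-memory dict stands in for the real remote backend (e.g. S3), and none of this is the actual Ensemble code.

```python
import tempfile


class SketchFileStorage(object):
    """Illustrative only: get() and put() both deal in file objects.

    get() spools the remote content into a NamedTemporaryFile and returns
    the open handle; the temp file is deleted when the caller closes it,
    which is safe precisely because we hand back the handle rather than
    its path.
    """

    def __init__(self):
        self._blobs = {}  # in-memory stand-in for the real remote store

    def get(self, remote_path):
        fh = tempfile.NamedTemporaryFile()  # removed on close
        fh.write(self._blobs[remote_path])
        fh.seek(0)
        return fh

    def put(self, remote_path, file_object):
        # Takes an open file object, mirroring get(), instead of a local path.
        self._blobs[remote_path] = file_object.read()
```

Returning the handle rather than the temp file's path is what sidesteps the NamedTemporary cleanup ambiguity discussed above.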
[14:38] niemeyer, i was just playing around with that [14:39] niemeyer, i originally suspected it had something to do with the lp oops on some of my branches in review [14:39] niemeyer, but i'm not so sure anymore, i get unauthorized errors trying to use kanban atm against ensemble (fresh kanban checkout and oauth) [14:41] hazmat: Ok, if you're already investigating it, just let me know what conclusion you get to please [14:52] niemeyer, is there something more to setting up kanban than doing the lp login.. i get errors anytime i try to use it (http unauthorized, Unknown consumer in body) [14:52] hazmat: Not in general, IIRC [14:53] <_mup_> ensemble/robust-hook-exit r284 committed by jim.baker@canonical.com [14:53] <_mup_> Merged trunk [14:55] kim0, can you test this https://code.launchpad.net/~daker/ensemble/small-fix ? [14:55] https://code.launchpad.net/~daker/ensemble/small-fix/+merge/68038 [14:56] daker: I'm preparing my session in an hour .. I can test it afterwards [14:56] ty [15:07] niemeyer, it looks like kanban is using a deprecated api, i tried updating the api usage, but ended up just going with anonymous login which fixed the problem with the generation (reports work fine). it's a one-line patch to kanban [15:08] niemeyer, http://pastebin.ubuntu.com/651803/ [15:08] should probably get cleaned up if it's going upstream [15:15] hazmat: Just generated a kanban locally.. apparently it worked [15:15] niemeyer, without the patch? [15:15] hazmat: Yeah [15:15] hazmat: Just ran the code I had lying on my disk, without doing anything else [15:33] SpamapS: huh, today 'ensemble add-relation jenkins jenkins-slave' works. (on my other laptop) [15:58] Howdy folks, Ubuntu cloud days starting in #ubuntu-classroom on the hour .. see you there [16:08] hazmat: Kanban looks good apparently [16:20] * niemeyer => lunch [16:50] hallyn: weird [17:10] jimbaker, do you know what value for ZOOKEEPER_PATH is used for a deb installation of zookeeper?
[17:11] kim0: Thanks a lot for the class there [17:11] hazmat: None, supposedly [17:11] niemeyer: oh cool .. glad it went well [17:12] niemeyer, cool, that works, thanks [17:12] hazmat: np [17:12] niemeyer: smoser + roaksoax are doing the one after the current, on orchestra+ensemble integration [17:12] should be cool :) [17:12] Wow, neat [17:12] fwereade: That may be nice to watch too ^ [17:33] * SpamapS will tune in === daker is now known as daker_ [17:50] kim0: should be but afaik, the installer is broken or the kernel or something so we won't be able to demo it [17:50] :( [17:50] :/ [17:50] Should be ok .. really wanted to see it [17:50] but no worries [17:51] RoAkSoAx: an in-depth explanation should be great though :) [17:51] I want to understand all about it :) [17:51] thanks a lot for the sessions [17:52] hehe :) [17:52] soon it will fully work [18:13] Hi people. I'm installing libapache2-mod-wsgi as part of an install script for a formula, but it keeps failing, and when I debug the hook, I see http://paste.ubuntu.com/651735/ [18:14] afaict, it's related to this package requiring both python 2.6 and 2.7, even though I don't need 2.6 (https://groups.google.com/forum/#!topic/modwsgi/M1AZ5HHb3rY) [18:15] I realise it's not strictly an ensemble question, but was wondering if anyone else has hit this when deploying? [18:37] noodles775: Hmm [18:37] noodles775: That's super strange indeed [18:38] noodles775: It's worse than a package requiring both of these versions.. apparently Python 2.6's UserDict is managing to import Python 2.7 [18:38] noodles775: I've never seen this happening.. must be a bad environment somehow [18:39] Hrm... it's reproducible (you can see the WIP recipe here: https://code.launchpad.net/~michael.nelson/open-goal-tracker/ensemble_deploy/+merge/69078 ). 'tis strange, though I was hoping if I could somehow disable python2.6 I could avoid it :/ [18:39] s/recipe/formula.
[18:41] (without having to repackage libapache2-mod-wsgi into a ppa or similar - afaics, it's the only package requiring 2.6) [18:42] noodles775: Checking that out [18:42] * noodles775 tries cutting down the recipe to just install libapache2-mod-wsgi to verify the cause... [18:42] thanks niemeyer [18:43] Seems like that would be a bug in python2.6 or python 2.7 if they were interfering with one another [18:51] gtg for the day, *might* be able to pop on later [19:15] niemeyer: fwiw, I can reproduce it with a minimal install script (just installs apache2 libapache2-mod-wsgi): http://paste.ubuntu.com/651927/ [19:16] noodles775: Can you find abc.py under lib/python2.6? [19:18] gar, I've just apt-get upgraded (in case newer packages help), but will check when it finishes. [19:18] noodles775: Cool, I also suggest attempting to install python2.6 in isolation beforehand [19:18] noodles775: There may be some relations borked up [19:18] noodles775: Note this: [19:19] libpython2.6 depends on python2.6 (= 2.6.6-6ubuntu7); however: [19:19] Package python2.6 is not configured yet. [19:19] and dpkg: error processing python2.6 (--configure): [19:19] dependency problems - leaving unconfigured [19:21] After upgrading, my install hook runs without issues. [19:21] noodles775: I'm pretty sure these packages are in a bad state [19:22] Yeah, it sounds like it. [19:22] noodles775: Or were, anyway [19:22] * noodles775 updates formula to apt-get upgrade before install. [19:36] <_mup_> ensemble/security-acl r292 committed by kapil.thangavelu@canonical.com [19:36] <_mup_> node acl abstractions, reworked to utilize a retry_change-like pattern for concurrent updates [19:41] niemeyer: oh well :/. It actually fails with the same error during apt-get upgrade :/. http://paste.ubuntu.com/651938/ I'll have to leave it there for now, thanks for your help. [19:41] noodles775: This is really messed up :( [19:42] noodles775: Do you have anything about PYTHONPATH in the environment?
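The PYTHONPATH question above turns out to be the crux: a hook that inherits Ensemble's PYTHONPATH can break the Python-based packaging tools it invokes. One hedged way a hook could shield its subprocesses is sketched below; `sanitized_env` and `apt_get` are hypothetical helpers invented for illustration, not part of Ensemble.

```python
import os
import subprocess


def sanitized_env(environ):
    """Copy an environment mapping, dropping PYTHONPATH so that Python
    scripts run by package maintainer scripts import the right stdlib."""
    env = dict(environ)
    env.pop("PYTHONPATH", None)
    return env


def apt_get(*args):
    # Hypothetical helper: run apt-get with the cleaned environment,
    # e.g. apt_get("update") or apt_get("install", "libapache2-mod-wsgi").
    return subprocess.call(["apt-get", "-y"] + list(args),
                           env=sanitized_env(os.environ))
```

The equivalent in a shell install hook would be a plain `unset PYTHONPATH` at the top of the script, which is what is suggested a little further down.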
[19:42] noodles775: It shouldn't be the case, but I guess it's the only other reason besides a problem in the package itself [19:43] niemeyer: nope, that install hook is doing apt-get update followed by apt-get upgrade before anything else. [19:43] noodles775: Yeah, I'm just curious if Ensemble itself might be setting these vars [19:43] ah, checking. [19:44] hmm.. it might be [19:45] niemeyer: http://paste.ubuntu.com/651941/ [19:45] noodles785: Try unsetting PYTHONPATH [19:45] noodles785: At the top of the script [19:45] noodles785: I bet that's the error === noodles785 is now known as noodles775 [19:50] niemeyer: it certainly enables apt-get upgrade to complete without errors. Updating the formula to retry it from scratch. [19:50] <_mup_> ensemble/security-acl r293 committed by kapil.thangavelu@canonical.com [19:50] <_mup_> if principal not found in token db, raise a principalnotfound error instead of a keyerror. [19:55] hazmat: you mentioned you have a mongodb formula? [19:55] OMG, have three different events to claim expenses for [19:56] <_mup_> ensemble/security-acl r294 committed by kapil.thangavelu@canonical.com [19:56] <_mup_> remove ACL.update_grant, doesn't have a use case atm. [19:58] m_3, http://kapilt.com/files/mongodb-replicaset-formula.tgz [19:58] hazmat: awesome, thanks! [19:59] m_3, i haven't worked on it since the mongodb conference, i was structuring the sharding and router as separate formulas [19:59] niemeyer, speaking of which, how was the mongodb conference in São Paulo? [19:59] hazmat: It was fantastic [20:00] niemeyer, good attendance? any talk highlights? [20:00] hazmat: Great meeting some of the 10gen engineers, good talks overall [20:00] niemeyer, nice, yeah.. i think talking to the 10gen guys definitely made the dc one worthwhile for me. [20:01] hazmat: One talk on the internals was interesting for me in terms of having new information..
other talks were interesting in the sense of getting to know some of the people using it [20:02] hazmat: a pretty large local TV station picked it for high-performance tasks, for instance [20:02] hazmat: also got a nice quote from one of the engineers [20:02] hazmat: http://labix.org/mgo [20:03] <_mup_> Bug #816108 was filed: Ensemble needs a high level ACL api/abstraction < https://launchpad.net/bugs/816108 > [20:13] bcsaller: have you merged in the deploy stuff for configs yet? I want to make sure it gets into tomorrow's build so the weekly Oneiric upload has configs... definitely by Thursday since I'm including mention of the feature in my talk. [20:13] * SpamapS REALLY needs to finish these slides with more pictures. :-P [20:16] Is there a reason why some instances are incredibly slow? When deploying this last time, it took ~10mins for the state to update from null->started, and it's been a while now... still no install hook :/ [20:17] noodles775: are you using m1.small or t1.micro? [20:18] noodles775: I've found that t1.micro are unbelievably unreliable w.r.t. speed. m1.smalls also disappoint quite a bit, especially w/ anything CPU hungry at all. [20:20] SpamapS: just the default m1.small, but I've not changed it... I'm just comparing to previous runs, so yep, perhaps it's a reliability issue? [20:21] noodles775: I think it's actually luck of the draw.. my theory is that sometimes you get machines without adequate memory bandwidth.. but it's just a theory. :-P [20:27] SpamapS, it's merged i believe, re deploy with config [20:30] oi [20:31] are there directions written down somewhere for getting an ensemble developer environment set up? I was thinking about having a stab at a bite size bug [20:33] statik, check out https://ensemble.ubuntu.com/docs/drafts/developer-install.html [20:33] jimbaker, thanks! [20:36] jimbaker, are the versions of txaws and txzookeeper in natty new enough, or is trunk definitely needed for those?
[20:36] <_mup_> ensemble/robust-hook-exit r285 committed by jim.baker@canonical.com [20:36] <_mup_> Initial mock test for hanging process [20:36] <_mup_> ensemble/robust-hook-exit r286 committed by jim.baker@canonical.com [20:36] <_mup_> Initial mock test for hanging process [20:38] statik, i'm not certain about their update schedule... for those dependencies, i'm also running trunk [20:39] statik: ppa:ensemble/ppa ftw [20:39] you don't need trunk [20:39] SpamapS, perfect, I see that has the versions from oneiric backported [20:39] and natty txaws is fine as long as you don't want to use Eucalyptus/Openstack [20:39] txzookeeper has problems in natty [20:40] statik: that's a daily build PPA .. but trunk has been quite well cared for thus far. :) [20:40] cool [20:40] SpamapS, seems like a reasonable plan - to use the PPA for the dependencies [20:40] statik: oneiric has a weekly upload of ensemble .. in theory ;) [20:40] jimbaker: yeah, that's a pretty common scenario [20:41] jimbaker: drizzle developers even had a package.. drizzle-dev .. that just pulled in all the build deps [20:41] anyway, dentist time.. :-P [20:42] SpamapS, my favorite part of going to the dentist is that i always schedule it at the same time slot w/ my kids [20:42] my son enjoys looking in my mouth for some reason ;) [21:07] hmm.. it looks like when the unit tests use a deb-installed zk, they don't reset state directories properly [21:09] nevermind, looks like i had a background zk from the pkg install [21:09] killing that and it works properly [21:14] noodles775: Did it work? [21:14] hazmat: Ugh [21:15] niemeyer, yeah.. that's going to bite folks in the future [21:15] hazmat: We have to dump something to avoid having it on [21:30] I'm breaking off for a moment..
will be back later today to finish the expense reporting drama and unblock for more useful stuff tomorrow [21:31] <_mup_> ensemble/security-otp-principal r291 committed by kapil.thangavelu@canonical.com [21:31] <_mup_> use a class method to specify test ace, allows for better test usage when OTPPrincipal used by other components. [21:37] <_mup_> ensemble/robust-hook-exit r287 committed by jim.baker@canonical.com [21:37] <_mup_> Robust test of reaping a hanging process [21:37] <_mup_> ensemble/security-acl r296 committed by kapil.thangavelu@canonical.com [21:37] <_mup_> remove some debugging statements, update tests to specialized exceptions [21:43] niemeyer, when you have time (perhaps tomorrow) i'd like to discuss some of the lxc work, i don't really see how we're getting around the network issues of multi-unit machines on ec2 [21:44] afaics doing the machine provider for lxc as local dev is the appropriate local dev solution, with specialization of service unit deployment by provider type [21:45] setting up a vpn for the container doesn't really address the usability issue [21:46] hazmat: Ok.. what's up [21:46] we can separately address using openvswitch or other tunnels, but that's not core or incompatible with delivering the local dev story via an lxc machine provider [21:47] hazmat: They're two different code paths, with two different deployment models, incompatible with each other [21:47] hazmat: What's the actual issue you're seeing with deploying locally as we discussed? [21:47] niemeyer, i'm trying to understand two things.. one) what are we trying to deliver for the next release.. per my understanding that's a local dev story [21:48] hazmat: Yes, we're trying to deliver local development [21:48] two) how are we overcoming the networking issues with ec2 and multi-unit container-based machines..
and is it even worth doing in the ec2 provider [21:48] hazmat: We're not doing EC2 [21:48] SpamapS: yes, it's merged, was at lunch, sorry [21:49] hazmat: we're addressing (one) for now [21:49] I was thinking about this and I think we can do it with the same code path. The provider should actually determine what the best way to set up a "container" is inside of one of its machines. In this manner, we can have the local provider say "use the LXC container for my units" and the ec2 one can say "use the noop container for my units" .. then when we figure out how best to do multiple units per machine on ec2, we can make LXC the default. [21:49] SpamapS: Exactly! [21:50] niemeyer, the base layer for both is an lxc abstraction which is common, the differentiator is whether we want a machine provider or an encapsulation with the machine agent as it deploys units [21:50] the latter allows for a common code path, but leaves the ec2 problem unresolved [21:50] indeed it does [21:50] hazmat: Indeed.. [21:50] but allows flexibility [21:51] without compromising simplicity [21:51] hazmat: Imagine we had IPv6 working.. it'd be trivial, right? [21:51] niemeyer, that's not entirely clear to me, we'll need 6to4 for common accessibility by remote consumers [21:52] hazmat: No.. imagine we had *IPv6* working.. [21:52] niemeyer, we'd still have to bridge to something that's providing the address [21:53] hazmat: Why? [21:53] we need routable addresses [21:53] or maybe i'm unclear when you say *ipv6* working, is this the future perfect world where external clients are consuming ipv6 services? [21:54] hazmat: Yes, and IPv6 offers that.. [21:54] If amazon allowed multiple IPv6 addresses per machine, you'd just forward the packets that weren't for you on to the container virtual network. Right?
[21:54] hazmat: In an IPv6 world, each machine generally gets a block rather than an individual address [21:55] niemeyer, it's still an ipv6 nat afaics, if the provider is only giving out one address... is there some documentation for that? [21:56] my understanding is that you can trade in ipv4 addresses for an ipv6 address block but that's a policy allocation, not a device address notion [21:56] I'd imagine that, as niemeyer is suggesting, when Amazon rolls out IPv6, they'll do away with the internal/external distinction they use now, and just meter traffic as it traverses their borders. [21:57] Given that, they'll most likely also give the VM a block of 8 or 16 or maybe even 255 IPv6 addresses. [21:57] i guess i need to do some ipv6 research [21:58] hazmat: I don't think it's all that different. Just more available addresses so NAT is no longer a good idea or needed. [21:58] hazmat: http://en.wikipedia.org/wiki/IPv6_address [21:59] The detail that is more pressing is simply how to make sure Ensemble doesn't need refactoring when that magic day arrives where all endpoints on the net are IPv6 capable. [22:00] hazmat: E.g. [22:00] hazmat: "Each RIR can divide each of its multiple /23 blocks into 512 /32 blocks, typically one for each ISP; an ISP can divide its /32 block into 65536 /48 blocks, typically one for each customer;[17] customers can create 65536 /64 networks from their assigned /48 block, each having a number of addresses that is the square of the number of addresses of the entire IPv4 address space, which only supports 2^32 or about 4.3×10^9 addresses." [22:00] Who is to say how far out that day is? Exhaustion is estimated between 2 and 7 years away from the articles I've read.. depending on whose stats and projections you believe. [22:01] i thought we had already hit the tipping point on exhaustion [22:01] http://en.wikipedia.org/wiki/IPv4_address_exhaustion [22:01] SpamapS: Right, agreed.. [22:01] SpamapS: Yeah, it's over already.
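The arithmetic behind the quoted Wikipedia passage, spelled out (the 2^32 ≈ 4.3×10^9 figure is the size of the whole IPv4 space):

```python
# IPv6 allocation sizes implied by the quoted passage.
isp_blocks_per_rir_23 = 2 ** (32 - 23)     # a /23 split into /32s -> 512 ISPs
customer_48s_per_isp = 2 ** (48 - 32)      # a /32 split into /48s -> 65536
networks_64_per_customer = 2 ** (64 - 48)  # a /48 split into /64s -> 65536
hosts_per_64 = 2 ** 64                     # interface IDs in a single /64
ipv4_total = 2 ** 32                       # ~4.3 billion addresses in all of IPv4
```

So a single /64, the smallest network a customer would normally subnet, already holds the square of the entire IPv4 address space.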
[22:01] The exhaustion that has happened is merely that IANA has assigned all ASNs to regional bodies. [22:01] SpamapS: Agreed on the fact we have to make sure Ensemble doesn't need significant refactoring by then [22:02] the refactoring we're talking about is minimal [22:02] There's still hundreds of millions of addresses, and many many thousands of blocks to dole out, before they start culling unused blocks and clamping down on ISPs that waste them. [22:02] hazmat: Yes, we can use the tunneling on EC2 for when we actually look at this feature, while keeping the real end goal in mind the whole time [22:02] it's not even refactoring, it's implementing the feature [22:03] hazmat: It's kind of dropping code [22:03] hazmat: If we make it work with tunneling, one day we can just drop the logic which sets the tunneling in place [22:03] The tunneling is somewhat generic, so it shouldn't even be too much work [22:04] niemeyer: Make no mistake though, if you make people use tunneling, they will *reject* it hard. If you offer it as an option.. that may be a different story. [22:05] SpamapS: It will be an option, and right now it will be non-existent, so don't worry yet. :) [22:05] I did yoga this morning.. my worry is at least stayed until tomorrow. :) [22:06] the emails that have been exchanged on this are larger than the work by an order of magnitude imo [22:06] * SpamapS searches for funny pictures to depict "The Cathedral and The Bazaar" for his presentation.. [22:08] i think the contortions we're going through for policy assignments with a machine provider (local) with a single machine are worse than those that come out from specializing unit deployment by provider.. which we have to do anyways! [22:09] so it's not clear why we're investing time discussing a future solution in a future world, that doesn't address the current problem we have to solve, and whose implementation differential is minimal [22:10] we're not going to use ipv6 on a local host provider..
we just want to enable the local development story.. so we're not tunneling or natting for local dev, so i'm missing the value of not just treating the containers like machines [22:12] this all sounds like the perfect is the enemy of the good [22:19] hazmat: We're on equal ground.. I also don't understand why you want to do work that is not necessary [22:19] hazmat: rather than doing work that is needed anyway [22:20] hazmat: Being able to choose between deploying in LXC or without LXC, as suggested by SpamapS, is needed for all the deployment methods [22:20] niemeyer, because what it is needed for isn't decided [22:20] hazmat: Except the local one [22:21] hazmat: I'm missing what you mean by that [22:21] we haven't resolved how this mechanism can be used in ec2 [22:22] hazmat: What do you want to know about this? [22:22] hazmat: I'd rather not have to debate about details we won't be implementing right now, but I see that for some reason this is very important [22:22] and it implies just shifting the maintenance burden onto another shared component, the unit placement algorithm, instead of localizing it to the machine provider [22:23] since now we'll have a provider with exactly one machine. [22:23] hazmat: A provider that has to know all the details of using LXC isn't used in any other deployment method besides the local one [22:23] hazmat: An agent that is able to deploy units within LXC is useful in all deployment methods [22:23] hazmat: That's exactly the debate that created the huge thread which you said wasn't necessary [22:24] niemeyer, i didn't say the thread wasn't necessary, just that the implementation that's being discussed could have fit in a small portion of it [22:25] it's good to have the discussion, but i'm unclear that we have resolution, as i'm approaching implementing it [22:25] hazmat: well.. and the proportion increases as we speak :) [22:27] hazmat: It's pretty clear in my mind, at least.
Naturally, nothing is set in stone and it's just my opinion, but I don't see any arguments that contradict these. [22:31] I would say might be.. ;) [22:31] err [22:31] oops I scrolled back [22:34] hazmat: to get down to brass tacks, what I suggest is that each provider chooses how to contain service units and how many service units per machine are allowed. It's more invasive than the provider method.. [22:34] hazmat: but if we decide chroots are a better choice for EC2 .. it enables that quite nicely [22:35] That said, there is only one containment method known and desired at the moment.. and I'm loath to add abstraction layers for what-ifs. [22:35] as i said we have two approaches, a local machine provider that uses containers as machines, and a second one where the local provider has exactly one machine, and the machine agent deploys service units as containers. both solutions need a local machine provider, the second also needs service unit deployment specialization, and creates additional machine-unit placement constraints that seem rather arbitrary, like a provider with a max of one machine. The real question of how to realize the benefits of all seems to revolve around future notions of introducing additional network layers and complexity.. i'd argue that simplicity should win, and we should use the right tool for the job at hand, rather than trying to achieve some perfect symmetry that is illusory imo anyways [22:35] SpamapS: They are orthogonal.. it's not the provider that should decide, it's the user [22:36] SpamapS: I mean, the capacity of deploying in a chroot or in an LXC should be at the user's disposal [22:37] Sure, if we make it configurable that's fine, though I'd suggest we just be opinionated about what the default containment method and capacity for machine-splitting is for each provider. [22:37] If somebody comes up with a good one for EC2, then we can make it the default.
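SpamapS's suggestion — each provider picks its own containment strategy, with the user able to override — could be sketched roughly like this. All of the names below are hypothetical, invented for illustration; they are not Ensemble's real classes.

```python
class NoopContainer(object):
    """EC2 today: the unit just runs directly on the machine."""
    def deploy(self, unit_name):
        return "exec:%s" % unit_name


class LXCContainer(object):
    """Local provider: each unit gets its own LXC container."""
    def deploy(self, unit_name):
        return "lxc:%s" % unit_name


class MachineProvider(object):
    # Default until multi-unit networking on EC2 is solved: no containment.
    def container_for(self, unit_name):
        return NoopContainer()


class LocalProvider(MachineProvider):
    # The local dev story: always isolate units in LXC.
    def container_for(self, unit_name):
        return LXCContainer()
```

If LXC later becomes viable on EC2, only `MachineProvider.container_for` would need to change, which is the point of the design being argued for.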
[22:37] hazmat: Heh [22:38] And if somebody comes up with a narrow-use-case-satisfying version, we can make it usable in that case when the user decides. [22:38] BUT [22:38] hazmat: Complaining that what I say is "perfect symmetry that is illusory imo anyways" is non-constructive FUD [22:38] I'm suggesting that we are just yak shaving at this point. [22:38] hazmat: I'm providing details about why that is the case.. [22:39] I like flacoste's suggestion that we just do both, and learn from it. :-P [22:39] hazmat: Knowledge about LXC is involved.. all the details of how the machine communicates with it, start, stop, boot, kill, restart, etc.. must exist [22:39] hazmat: If you stuff that knowledge into a local provider that is only concerned about deploying in a local machine as if they were EC2 instances, that's not useful anywhere else [22:40] hazmat: If we teach the agent to deploy units within LXC and maintain them, that's useful across the board, *even* if there is additional logic that must be implemented for routing packets properly [22:40] niemeyer: Oh but it is. :) It's already split out into an 'LXCControl' class, which makes functional testing easier. [22:41] hazmat: If you disagree, please provide some details that may help me understand why that is so, rather than calling it illusory. [22:41] niemeyer: create a machine with this cloud-init.. start it, stop it, destroy it .. list running machines .. that's useful in all contexts that you need to use LXC [22:41] SpamapS: Yes, it is.. the command line tools are also shared.. the question is how much is being shared. [22:42] Heh.. it's just an adapter. Very little code. [22:43] SpamapS: We shouldn't go down a route because it's worse.. [22:43] I did sit down to do it at the machine agent/unit agent spawn level.. will be a lot more code than the provider was. [22:43] SpamapS: We'll have to implement either approach, one of them allows more sharing, more symmetry on deployments, etc.
[22:44] SpamapS: If it doesn't work because it's too involved, or whatever, we can rethink. But we shouldn't go down that path because it's "only slightly worse".. [22:44] Have to write new cloud-init stanzas that we never needed before. [22:44] No it will work. [22:44] For the local case, it's not that different in results. [22:45] For the EC2 case, we'll need to add NAT to make them addressable, or tunneling, or something. [22:47] Yes, we'll have to add something.. and reuse what will hopefully already work by then. [22:47] Rather than reimplementing it. [22:48] Of course, as hazmat is arguing.. you can reuse what already works now and tackle that use case when you get there, and as Francis suggested, you'll have more info, so you can embrace LXC more completely when the time comes. [22:49] SpamapS: There's no doubt about what we'll have to do by then.. we'll have to enable Ensemble to deploy unit agents in LXC. [22:49] SpamapS: I'm suggesting we do research now to know in detail what that means, and if we have to fall back we can take an informed decision. [22:50] SpamapS: I don't think I'm asking too much, honestly. [22:50] Yeah, let's just implement this and be done with it. [22:51] SpamapS: We have to do A or B.. A may be easier, but B is needed anyway. If we have to do A, it must be for a good reason which needs to be understood, because we'll be wasting time doing B anyway. [22:51] SpamapS: I offered to do that work myself. [22:51] As long as EC2 uses the "exec" container method until we do that research, I really don't care. Just give me local dev, and give it to me soon. I spent $300 last month on EC2 because of forgotten instances and 15+ node demos. [22:52] Also launchpad is chomping at the bit.. lifeless is trying to put together launchpad formulas and is unwilling to deploy LP into EC2 over and over again. [22:53] Another thing came up with containers btw.. what about services that are memory intensive?
Can LXC actually limit memory usage and does it expose that through /proc/meminfo ? [22:54] memcached comes to mind.. it really should use almost all the RAM on the box... we could arguably have it default to do just that in the formula. [22:55] SpamapS: Not sure, but it's worth considering indeed [23:14] my network disconnected, not sure what got lost.. [23:14] niemeyer, if you want to do that experimentation, i'm happy to not do the lxc stuff, and instead do some pending refactorings (the protocol stuff) and bugfixes in the codebase [23:14] but as far as implementing lxc and what we need today, i don't see the justification for doing things in such a way that we're going to ripple change costs throughout the system.. be it as simple as getting useful output of ensemble status, or dealing with unit machine placement, when we have unresolved questions of how its going to be used or implemented [23:14] hazmat: We'll have to debate this again.. it has changed several times, and I'm not sure that's feasible now. [23:15] hazmat: Someone will have to do the repository work.. do you want to do that instead? [23:16] hazmat: If you have unresolved questions, please ask them. [23:16] hazmat: I've been presenting why and how for some time now.. [23:18] hazmat: and in fact we had a call last week where we discussed exactly this [23:18] hazmat: I'm really expecting a little bit more insightful feedback now. [23:18] niemeyer, indeed, i've been thinking about it since [23:19] hazmat: What has to be changed then? Do you have that written down? [23:20] hazmat: This is not a fight, we just have to think through.. [23:20] hazmat: If we can't do it, let's not.. but we can't decide things by arguing over and over without some insight into the problem. [23:21] the fundamental of how we get around the network issue for ec2 providers, seems to be unresolved, the things proposed tunneling, vpn are about adding complexity vs. just utilizing the api that the provider gives us. 
be it ec2, where atm we only have one public address per machine, or where the openstack guys are digging into openvswitch etc, ensemble isn't a provider, its not clear we should be setting up secondary networking infrastructure on virtualized systems with poor i/o throughput [23:21] hazmat: Forget the EC2 providers.. [23:21] hazmat: We're not implementing them now. [23:23] hazmat: We want to implement local support through a mechanism that will be useful in the future in EC2. [23:23] hazmat: That's not the same as implementing support in EC2 now. [23:23] hazmat has the same concern that I do. That we will never do it. [23:23] hazmat: We don't need to implement tunneling [23:23] SpamapS: That's fine.. maybe we'll never do.. but that's how developing large software works. [23:23] To allay that concern, I'd say if we don't do LXC .. we'd do something else like chroot with collision avoidance mechanisms. [23:24] I'm not sure large is desirable. [23:25] SpamapS: Heh.. any other bikesheds for us to get into? [23:25] I'm honestly trying to be positive and helpful in that debate [23:25] But we can't just argue across each other, or it'll be hard to be effective [23:26] niemeyer, the same underlying abstraction (lxccontrol) done today for an lxc machine provider, makes the lxc service unit deployment in the future fairly trivial compared to the costs of implementing these network solutions afaics [23:26] I think we only differ on the perspectives. In my perspective, something that is so far off shouldn't be worried about in coding now. From yours, it must be worried about in coding now. I get the position, but I stand in another one. [23:26] hazmat: What is the cost of implementing an agent able to deploy LXC as an option, _today_? [23:26] hazmat: Please answer that question so that we can move forward. [23:27] niemeyer, we need an lxc control abstraction for start/stop/exec command.. but it also ripples into things like ensemble status and machine-unit mapping. 
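[Editor's note] The "lxc control abstraction for start/stop/exec" that hazmat mentions might look roughly like the sketch below. This is a hypothetical illustration, not the actual LXCControl class from the Ensemble codebase: the method names, the lxc command-line flags, and the injectable runner (to support the functional testing mentioned earlier in the log) are all assumptions.

```python
import subprocess


class LXCControl:
    """Hypothetical sketch of an LXC control abstraction.

    Builds the command lines for starting, stopping, and running
    commands inside a named container. The runner is injectable so
    tests can capture commands instead of invoking the lxc tools.
    """

    def __init__(self, container_name, runner=subprocess.check_call):
        self.container_name = container_name
        self._run = runner

    def start_command(self):
        # lxc-start -n <name> -d: boot the container in the background
        return ["lxc-start", "-n", self.container_name, "-d"]

    def stop_command(self):
        return ["lxc-stop", "-n", self.container_name]

    def exec_command(self, *argv):
        # lxc-attach runs a command inside an already-running container
        return ["lxc-attach", "-n", self.container_name, "--"] + list(argv)

    def start(self):
        self._run(self.start_command())

    def stop(self):
        self._run(self.stop_command())

    def exec(self, *argv):
        self._run(self.exec_command(*argv))
```

Keeping command construction separate from execution is what makes the "functional testing easier" point plausible: a test can pass a recording runner and assert on the command lists without any containers present.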
[23:27] hazmat: Ok, please provide an actual written down version of this with details. [23:27] we'll also need service unit deployment specialization by provider type to avoid using this for certain providers [23:27] niemeyer: I made a meager attempt to ballpark that. You have to create cloud-init rules that will start the unit agent the same way the machine agent is started, and then factor in a way to have it work differently for local vs. ec2 provider. [23:28] hazmat: Yeah, we'll have a base class which is pretty dumb [23:28] hazmat: Just opening the formula onto the directory [23:28] hazmat: We can, at some point, have other options such as plain chroots, etc [23:28] The former is fairly independent code and will be useful in any container/vm strategy used. The latter is invasive at the provider level. [23:29] hazmat: and maybe that's even a way for us to start [23:29] SpamapS: No, there's no point in using cloud-init.. that's why I didn't take the idea seriously [23:29] There's another option which is to let lxc just start the unit agent directly, but I don't like that one as it won't actually "boot" the container. [23:30] SpamapS: Agreed, ideally we'll boot it [23:30] So you'll end up with a machine that is not actually in the same state as a non container machine. Networking needs configuring, services starting, etc. [23:30] SpamapS, the unit agent could just be an upstart rule at that point [23:30] Yeah you can drop it on the disk too, that's fine. The point is that code is sort of easy and independent. [23:31] The refactoring of providers needs some thought, so that each provider knows what type of container it should provide. [23:32] That's as far as I got. [23:33] negronjl: hey, didn't you create a mongodb formula? [23:34] hi SpamapS: I did. let me find out where I put it. [23:34] negronjl: I want to mention it in my talk on Thursday.. just that it exists. [23:34] SpamapS: not in bzr yet. still refining it. 
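[Editor's note] hazmat's remark that "the unit agent could just be an upstart rule at that point" could be sketched as a job file along these lines. The file name, description, and exec line are illustrative assumptions, not Ensemble's actual packaging; the point is only that a booted container would pick the agent up through its normal init sequence.

```
# /etc/init/ensemble-unit-agent.conf -- hypothetical sketch, not real packaging
description "Ensemble unit agent (illustrative example)"

start on runlevel [2345]
stop on runlevel [!2345]

respawn

# assumed entry point; the real agent invocation may differ
exec /usr/bin/python -m ensemble.agents.unit
```

Dropped onto the container's disk before boot, a job like this would start the agent the same way on a real machine, an LXC container, or (in principle) any future container strategy.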
[23:35] negronjl: ok, no worries I'll leave it off [23:35] i dropped a dump of mine over at http://kapilt.com/files/mongodb-replicaset-formula.tgz [23:35] hazmat: didn't you have a riak formula too? [23:36] SpamapS, nope. my riak one is a skeleton [23:41] ah ok [23:47] SpamapS: there is a rabbitmq-server formula incoming today or tomorrow [23:49] adam_g: w00t.. yeah I already threw their logo up. :) [23:50] SpamapS: is it okay for me to just push that directly to principia or file a merge bug? [23:51] adam_g: just push it in as lp:principia/rabbitmq-server .. :)
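[Editor's note] On SpamapS's earlier question about memory-intensive services: LXC can cap a container's memory through the cgroup memory controller, though (at least in LXC of this era) the cap is not reflected in the container's /proc/meminfo, which still shows host-wide values, so a formula like memcached's "use most of the RAM" default could be misled. A minimal container-config fragment, with the 512M/1G values purely illustrative:

```
# fragment of an LXC container config -- limit values are illustrative
lxc.cgroup.memory.limit_in_bytes = 512M
# optionally also cap memory + swap together
lxc.cgroup.memory.memsw.limit_in_bytes = 1G
```

The memsw limit requires the kernel's swap accounting to be enabled; without it only the plain memory limit applies.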