[00:17] andrewsmedina: Heya
[00:17] andrewsmedina: Just delivered a review on your branch
[00:18] andrewsmedina: Seems pretty close..
[00:18] andrewsmedina: just trivial stuff
[00:20] Achievement unlocked!
[00:20] Nothing to review on the Go front!
[00:20] Phew
[00:45] niemeyer: thanks for review! I will work to resolve the things that you said
=== davechen1y is now known as davecheney
[04:16] anybody here?
[05:06] * davecheney waves
[05:28] andrewsmedina: are you doing a full port for the local provider to juju/go ?
[05:28] that would be awesome
[06:31] davecheney: hiya
[06:35] aoy
[06:56] davecheney: i'm a big fan of sharing-memory-by-communicating in general, but sometimes a mutex gets the job done more cleanly IMHO. http://paste.ubuntu.com/1000362/
[06:57] davecheney: (which also has the advantage that we don't serialise all requests)
[07:00] rogpeppe: yup, that will work also
[07:01] there is no reason why we couldn't also do
[07:01] case cmd := <- p.cmd ; go cmd(p.env)
[07:02] i'm not sure gustavo is sold on the idea of a proxy yet
[07:02] but I do admit your idea is simpler
[07:03] davecheney: invoking go doesn't quite have the same semantics (something may be operating on the environment while it changes) but perhaps that doesn't matter.
[07:04] rogpeppe: it's fine to operate on a stale copy
[07:04] because that is inevitable
[07:04] davecheney: if i'm passing a function into a goroutine just to get mutual exclusion, then i start to think i'm doing something a bit wrong.
[07:04] rogpeppe: i was following the way the topology worked
[07:05] * rogpeppe obviously hasn't looked too closely at that code :-)
[07:07] davecheney: if you don't care whether things can run on the same environment when it changes, then things can be even simpler: http://paste.ubuntu.com/1000370/
[07:08] nice
[07:08] i'll take that, thank you
[07:08] davecheney: cool, glad to be of help.
[07:09] i had some reason why it needed to be the original way
[07:09] but I can't remember them now
[07:09] so until I do
[07:09] let's go for simplicity
[07:14] davecheney: one thing: do we really need to block in NewProxyEnviron? can't we just let the first operation block until there's a valid environ?
[07:14] nope, no need to block
[07:14] davecheney: that would simplify things even more: http://paste.ubuntu.com/1000379/
[07:14] that was a personal choice
[07:14] davecheney: down from 167 lines to 86 :-)
[07:14] excellent
[07:15] i won't mention that it won't compile anymore
[07:15] davecheney: minor details
[07:15] :-)
[07:15] ahh, that is why we block til a valid environ
[07:15] so we don't need to check for nil
[07:16] davecheney: ah, that's easy to fix too.
[07:16] when NewProxyEnviron returns, there is something that purports to be an environ there
[07:16] so no worrying about NPE
[07:17] davecheney: something like this, perhaps? http://paste.ubuntu.com/1000381/
[07:17] no thanks
[07:17] locked isn't protected by a mutex, and the locks and unlocks are unbalanced
[07:18] davecheney: locked doesn't need to be protected by a mutex (it's local). we ensure that the lock is held on entry to loop. but yeah, it's not so pretty.
[07:19] davecheney: i had the locked logic wrong BTW: http://paste.ubuntu.com/1000384/
[07:23] davecheney: a slightly less unsavoury option: http://paste.ubuntu.com/1000388/
[07:24] davecheney: but maybe it's best just to do the wait explicitly in NewProxyEnviron as before.
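The pastes linked above are not reproduced in this log. As a rough sketch only, the mutex-based idea discussed so far (a proxy that blocks in its constructor until a first valid environ exists, then swaps in later ones under a lock) might look like the following. The one-method Environ interface, the namedEnviron type, and the updates channel are all assumptions made for illustration; they are not the real environs.Environ interface or the code from the pastes.

```go
package main

import (
	"fmt"
	"sync"
)

// Environ is a hypothetical, cut-down stand-in for environs.Environ.
type Environ interface {
	Name() string
}

// namedEnviron is a trivial Environ used only for this example.
type namedEnviron string

func (e namedEnviron) Name() string { return string(e) }

// ProxyEnviron forwards calls to the most recently delivered Environ.
type ProxyEnviron struct {
	mu  sync.Mutex
	env Environ
}

// NewProxyEnviron blocks until the first Environ arrives on updates, so
// that methods never have to check for nil. It fails if updates is
// closed before any Environ is delivered.
func NewProxyEnviron(updates <-chan Environ) (*ProxyEnviron, error) {
	first, ok := <-updates
	if !ok {
		return nil, fmt.Errorf("updates closed before a valid environ arrived")
	}
	p := &ProxyEnviron{env: first}
	go p.loop(updates)
	return p, nil
}

// loop replaces the current Environ each time a new one is delivered.
func (p *ProxyEnviron) loop(updates <-chan Environ) {
	for env := range updates {
		p.mu.Lock()
		p.env = env
		p.mu.Unlock()
	}
}

// current returns the Environ in use right now; callers may end up
// operating on a stale copy, which the discussion accepts as inevitable.
func (p *ProxyEnviron) current() Environ {
	p.mu.Lock()
	defer p.mu.Unlock()
	return p.env
}

// Name forwards to the current Environ.
func (p *ProxyEnviron) Name() string { return p.current().Name() }

func main() {
	updates := make(chan Environ, 1)
	updates <- namedEnviron("first")
	p, err := NewProxyEnviron(updates)
	if err != nil {
		panic(err)
	}
	fmt.Println(p.Name())
}
```

The lock is held only long enough to read or replace the pointer, so requests are not serialised behind one another.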
[07:24] let's wait for the request that it doesn't block during startup
[07:25] i don't see it as an issue, as nothing can act on the Environ til it's valid anyway
[07:25] thanks rogpeppe that made the code a lot simpler
[07:26] tests still don't pass, but whatever
[07:26] the main goal is to convince gustavo why a proxy is a good way of managing the unstable environment
[07:26] davecheney: i think that the simpler it is the more likely we are of that
[07:28] davecheney: BTW i'm not convinced this deserves its own package. it's quite specific to the provisioning agent, isn't it?
[07:28] i hope that it makes it easier to test
[07:28] ie, we test we can do all the dummy tests against this Environ
[07:28] rather than trying to test it via the provisioning agent
[07:29] davecheney: there's nothing stopping us having internal tests for the provisioning agent code
[07:29] but your point is well made that in theory the provisioning agent is the only one talking to the Environ
[07:29] davecheney: i've done that quite a bit in the ec2 provider
[07:29] davecheney: also, as a general mechanism it's not quite there
[07:30] davecheney: because if a client keeps a Storage around, that won't work well - you'd have to proxy that too.
[07:31] rogpeppe: yup - i saw that, maybe that can be solved via convention
[07:31] on error, get a new storage
[07:31] davecheney: yeah, but will the provisioning agent actually be writing to storage ever?
[07:31] davecheney: i.e. do we really care?
[07:31] rogpeppe: that, i do not know
[07:32] davecheney: we could have a Storage method that just panicked.
[07:33] davecheney: (or returned a Storage that always returned an error, i suppose)
[07:34] davecheney: oh, i'm talking bollocks, of course the provisioning agent needs to talk to the storage.
[07:34] ok, i'll make a proxy storage
[07:35] davecheney: yeah. it could always call the relevant storage method in proxy before invoking its method.
[07:46] davecheney: it's slightly easier if you assume that the provisioning agent will never need to *write* to the storage (which i think is the case) http://paste.ubuntu.com/1000412/
[07:46] davecheney: and i'm thinking that it's probably best to have NewProxyEnviron block until the environment is valid, and let the caller deal with that.
[07:46] davecheney: rather than putting an error test in every method.
[07:49] davecheney: anyway, i should get back to my *own* damn code!
[08:08] rogpeppe: thanks for your help
[08:08] out
[11:18] morning.
[11:21] Aram: hiya
[12:00] gotta love ec2 request reliability
[12:15] rogpeppe: Why?
[12:15] rogpeppe: Heya btw.
[12:15] TheMue: (hiya too) i've run the environs/ec2 amazon tests four times in a row and they've failed each time for different reasons.
[12:16] TheMue: we really need some retry logic at the goamz/ec2 level - that will help a lot.
[12:17] TheMue: it's great when a test fails 3/4s of the way through, reading a file that it already read successfully, saying "file not found"
[12:17] rogpeppe: Yes, transparent for the user. Only real hard errors we know should raise.
[12:17] TheMue: yeah. you can't deal with the eventual consistency stuff automatically sadly. but the "premature EOF" and "handshake failure" errors are entirely possible.
[12:18] rogpeppe: Is EC2 in general so unreliable? Or is it only a special part of their API?
[12:19] TheMue: i have a feeling it's all that unreliable
[12:19] TheMue: but i only have the one data point
[12:19] rogpeppe: Uuh, not good.
[12:19] TheMue: it applies to S3 too
[12:19] TheMue: yay, put in a couple of retry loops and it all works now
[12:21] rogpeppe: Are there any hints by Amazon on an optimal retry strategy (how often, which delay between requests, …)?
[12:21] TheMue: no. they don't even talk about eventual consistency issues. it all looks hunky-dory in their docs.
[12:22] rogpeppe: I've got to admit that I expected this answer. I only had a little hope left.
[12:22] TheMue: building on shifting sands...
[12:24] rogpeppe: So take a house boat, able to move but tied in place.
[12:26] TheMue: right, time for some lunch
[12:26] rogpeppe: Enjoy, I just had.
[12:39] Hullah!
[12:49] niemeyer: Moin. ;)
[12:49] TheMue: Heya
[12:58] niemeyer: I'm just thinking about the "endpointInfo" in topology.go. It's indeed no beautiful solution and tries to wrap what "get_relation_service" and "get_relations_for_service" return in the Python version.
[12:59] niemeyer: It's really very similar to "RelationEndpoint", only that there are names while "endpointInfo" deals with keys.
[13:00] TheMue: Right
[13:01] niemeyer: I could add the keys to "RelationEndpoint" as private fields, that would save us one type
[13:01] TheMue: Why do we need the keys there?
[13:01] TheMue: See the Python code
[13:02] niemeyer: Have to look, one moment. It's taken from the Python code.
[13:02] TheMue: I don't think it is
[13:02] TheMue: This branch diverges wildly from Python all around
[13:05] niemeyer: The information that the two functions using the "endpointInfo" return is the same as what the two Python functions named above return.
[13:05] niemeyer: But there dynamic lists and dictionaries are used.
[13:05] TheMue: That's not what we were talking about..
[13:05] niemeyer: I could add the keys to "RelationEndpoint" as private fields, that would save us one type
[13:06] TheMue: Why do we need the keys there?
[13:06] TheMue: See the Python code
[13:09] gorgeous weather out there for a change.
[13:10] niemeyer: mornin', BTW
[13:10] niemeyer: "get_relations_for_service" stores at least the "relation_id", I don't yet know where it is used. Regarding the "ServiceKey" William and I discussed that it's better than the name, which can be retrieved via "ServiceName(key)".
[13:11] niemeyer: thanks a lot for the review last night.
you might've seen two tiny branches proposed this morning: https://codereview.appspot.com/6220065/ and https://codereview.appspot.com/6221058/
[13:13] rogpeppe: Morning
[13:14] TheMue: Sorry, I don't understand what you mean.. the service key is a service key, the service name is a service name. Neither is better than the other without context.
[13:14] rogpeppe: No, haven't run through reviews yet
[13:14] niemeyer: that's fine. they're both independent anyway,
[13:15] rogpeppe: That's great, thanks
[13:17] niemeyer: If I have one of them (service name or key) I can retrieve the other (only that from key to name is a bit faster). For relations I need the key to retrieve the name.
[13:18] TheMue: Yes. How does that change the picture in the scenario we were talking about earlier?
[13:18] TheMue: You can use RelationEndpoint to work with relations in the same way that Python deos.
[13:18] does
[13:18] TheMue: You don't need to store any private keys in RelationEndpoint.
[13:19] niemeyer: That's only a first observation for me to get a better understanding
[13:19] TheMue: RelationEndpoint needs a RelationName
[13:19] TheMue: All of that has to be fixed
[13:19] TheMue: And all of it are divergences from what Python does
[13:19] niemeyer: It has a name now.
[13:20] TheMue: Simplifying is cool, but it needs a lot of attention to what is being done if we are to diverge from what is being done in the Python code. I get quite nervous when I see the implementation diverging wildly both in terms of implementation and in terms of *semantics*.
[13:21] niemeyer: What I'm looking for is how to optimally return the information that get_relations_for_service and get_relation_service return.
[13:21] TheMue: Ok, let me look at that
[13:21] niemeyer: There they are e.g. dynamic lists with the interface as first field and service topology as second field.
[13:22] TheMue: What is "service topology"?
[13:22] niemeyer: For the rest I already adopted your suggestions.
[13:24] TheMue: I suppose get_relations_for_service should return []relation?
[13:24] niemeyer: Eh, wrong wording, see return (relation_data["interface"], relation_data["services"][service_id])
[13:25] niemeyer: Good idea, would only have to add the key here (currently it doesn't have one).
[13:26] TheMue: Yeah, that sounds fine.. RelationKey
[13:26] Well, not really
[13:26] TheMue: relation{ Key: ... }
[13:26] niemeyer: It would be persisted.
[13:26] TheMue: Not necessarily.. omitempty
[13:26] niemeyer: Ah, yes, I remember.
[13:27] TheMue: service=services[service_id]))
[13:27] TheMue: What is that data in the service key?
[13:29] {"role": r, "name": n}, I suppose..?
[13:29] niemeyer: The function only looks for one service. So our Services map would contain the role and the key.
[13:30] TheMue: Why does it do that? I'm not sure we should mimic it
[13:30] niemeyer: Let me look for a consumer of this function, moment.
[13:30] TheMue: If we get a relation back, it should be the actual relation, with its actual.. it's not clear to me why we should be stripping it off arbitrarily
[13:31] s/its actual/its actual data/
[13:31] TheMue: I was looking at that too.. still not clear.. feels like a poor interface
[13:31] niemeyer: So far I don't see any reason either.
[13:32] TheMue: Ok.. let's just return a proper relation with its real data then
[13:32] TheMue: If we find out why we can talk again
[13:32] TheMue: But I suspect this should do quite nicely
[13:32] niemeyer: Think so too.
[13:32] niemeyer: At least it "feels" better than my current solution.
[13:36] rogpeppe: You got a major LGTM, and a major OMG WHAT ARE YOU DOING?
[13:36] niemeyer: interesting
[13:37] niemeyer: the way to fix that OMG is to implement signed URLs in s3, which i haven't got around to yet.
[13:37] niemeyer: if i put in a TODO, would that be ok?
[13:38] rogpeppe: No..
we're not making anything public and sweeping that under the carpet
[13:38] s/anything/everything/
[13:38] rogpeppe: Been there, doesn't end up well
[13:38] niemeyer: ok. i'll do the s3 thing first then.
[13:39] niemeyer: only problem is, i looked for how to do the signed URL thing before and didn't come up with anything
[13:39] rogpeppe: I can do that if necessary
[13:39] rogpeppe: Actually, let me do that
[13:39] rogpeppe: Are you happy with the former s3 branch?
[13:39] niemeyer: just a pointer to some docs would be fine. or, if you wanna do that, that would be lovely too!
[13:39] niemeyer: oh shit, forgot about that. one mo.
[13:40] rogpeppe: Cheers
[13:55] niemeyer: you've got a review
[13:58] rogpeppe: "could we give a nicer error here when we panic?"
[13:58] rogpeppe: When would it panic?
[13:59] niemeyer: if the host in the headers didn't have two dots in it?
[13:59] rogpeppe: This code can't execute if that's the case
[14:00] niemeyer: ah yes.
[14:00] rogpeppe: Will fix the rest
[14:00] rogpeppe: Thanks for the review
[14:00] niemeyer: doh, sorry for that
[14:01] rogpeppe: np
[14:01] rogpeppe: I'll add the "" on a line on its own.. it's not using field names purposefully in this case
[14:02] niemeyer: that's fine
[14:02] rogpeppe: It should fail if we add a field and forget to update one of the structs
[14:02] niemeyer: sure.
[14:06] rogpeppe: ${bucket} would probably be a more sensible replacement var
[14:07] niemeyer: yeah, that would be fine. although if there's ambiguity between $name and the rest, there's gonna be ambiguity between the bucket name and the rest...
[14:07] rogpeppe: Hm?
[14:07] niemeyer: isn't that why you're preferring ${bucket} ?
[14:08] niemeyer: because it's unambiguously replaceable?
[14:08] i suppose the host name might have a $ in.
[14:08] if that's allowed
[14:08] rogpeppe: $namesis shouldn't get replaced
[14:10] niemeyer: yeah, true. it's a URL really and URLs can contain $s
[14:10] niemeyer: ${bucket} is just fine.
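A minimal sketch of why a braced placeholder is unambiguous, assuming the substitution is a plain literal replace. The expandBucket helper and the example endpoint are hypothetical and are not taken from the s3 branch under review.

```go
package main

import (
	"fmt"
	"strings"
)

// expandBucket replaces every literal "${bucket}" in endpoint with name.
// Any other "$" sequence, such as "$namesis", is left untouched because
// only the exact braced token is matched.
func expandBucket(endpoint, name string) string {
	return strings.Replace(endpoint, "${bucket}", name, -1)
}

func main() {
	// Hypothetical endpoint template, for illustration only.
	fmt.Println(expandBucket("https://${bucket}.s3.example.com/$namesis", "mybucket"))
}
```

With an unbraced $name placeholder, a naive replace would also rewrite the start of $namesis, which is the ambiguity raised above.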
[14:10] niemeyer: lots better than %s :-)
[14:12] rogpeppe: Indeed
[14:15] "@ylastic #AWS #EC2 API calls returning intermittent errors "Unavailable: Unable to process request, please retry shortly""
[14:15] Things are getting more and more "eventual"
[14:18] rogpeppe: Branch re-proposed
[14:19] niemeyer: it's wonderful. i tried the amazon test 4 times in a row this morning and it failed each time with a different error. we need to get that retry logic in!
[14:22] niemeyer: LGTM
[14:23] rogpeppe: Thanks
[14:23] rogpeppe, niemeyer morning :D
[14:23] andrewsmedina: hiya!
[14:24] rogpeppe, niemeyer thanks for review
[14:24] andrewsmedina: np. thanks for bearing with us!
[14:24] rogpeppe: when you have time, I wanted to talk with you about the review
[14:25] andrewsmedina: i've got a meeting in 5 minutes, but in an hour or so, no problem.
[14:32] rogpeppe: ok
[14:50] andrewsmedina: ping
[14:57] rogpeppe: pong
[14:57] rogpeppe: about the %d
[14:58] andrewsmedina: about the %d
[14:58] rogpeppe: virsh uses the %d to create an autoincrement number for the net/bridge name
[14:59] rogpeppe: the juju python version uses it
[14:59] andrewsmedina: is that needed, given that the name needs to unique anyway?
[14:59] s/to /to be /
[15:00] andrewsmedina: fair enough. *is* it actually documented anywhere, or is it just a bit of black magic known to those who've read the source?
[15:01] rogpeppe: http://bazaar.launchpad.net/~juju/juju/trunk/view/head:/juju/providers/local/network.py#L13
[15:02] rogpeppe: I don't know if this is an official feature =/
[15:02] andrewsmedina: ok. i see that, thanks. but i'd like to know why we're doing that. or at any rate write a comment saying what it's there for.
[15:03] niemeyer, hazmat: do you know, by any chance, why we need an autoincrement number in the lxc network bridge name?
[15:03] rogpeppe: I will add a comment about it
[15:03] andrewsmedina: thanks a lot
[15:03] rogpeppe: about the split
[15:03] rogpeppe, we do?
we just use the default libvirt name
[15:04] s/name/network
[15:04] rogpeppe: virsh always returns the same string pattern
[15:04] hazmat: see link above
[15:04] I need to check anyway?
[15:04] hazmat: http://bazaar.launchpad.net/~juju/juju/trunk/view/head:/juju/providers/local/network.py#L13
[15:05] andrewsmedina: yes, but virsh isn't part of the juju program. if someone changes the virsh output, we want an "unexpected virsh output" error or similar, rather than an array bounds panic.
[15:05] andrewsmedina, we don't specify that value
[15:05] andrewsmedina, it's escaped because libvirt uses it
[15:05] %%d
[15:05] hazmat: why do we need it?
[15:06] rogpeppe, we don't.. libvirt does
[15:06] hazmat: isn't the bridge name unique enough, given that it includes the network name?
[15:06] i'd suggest not being slavish about copying that aspect.. i've got a separate branch local-cloud-img that gets rid of libvirt entirely
[15:07] launch time :(
[15:07] sorry
[15:07] rogpeppe, it's not
[15:07] hazmat: ok.
[15:07] hazmat: when you say "getting rid of libvirt", you mean "not using virsh" ?
[15:08] hazmat: i haven't looked too deeply into how all the lxc stuff is set up. i should!
[15:08] rogpeppe, no i mean getting rid of libvirt
[15:08] rogpeppe, the name comes out virbr
[15:08] again this has nothing to do with juju
[15:08] this is libvirt's configuration
[15:09] and it needs that string insertion for its templates %d
[15:10] hazmat: i didn't see any mention of the %d in the libvirt docs. but i'm probably looking in the wrong place.
[15:10] hazmat: (i looked in
[15:10] http://libvirt.org/formatnetwork.html and
[15:10] http://libvirt.org/sources/virshcmdref/html/sect-net-define.html
[15:10] )
[15:11] hazmat: so your branch eschews the portability layer and talks to lxc directly?
[15:11] rogpeppe, what portability layer?
[15:12] rogpeppe, i think you're misunderstanding something fundamental here
[15:12] hazmat: libvirt looks like a portability layer to me, but i may be misinterpreting
[15:12] hazmat: almost certainly!
[15:12] rogpeppe, we don't use libvirt outside of setting up the network
[15:12] rogpeppe, it's a rather broken portability layer when it comes to lxc
[15:12] rogpeppe, it has many additional functionalities that have no meaning in the context of lxc
[15:12] hazmat: sure. but that is the role it's trying to play, no? or am i fundamentally misunderstanding?
[15:12] It's lunch time
[15:12] rogpeppe, we're just piggybacking on its packaging of a default network
[15:13] in the lxc in precise, this is also in the lxc package
[15:13] moreover local provider doesn't use cloud-init or cloud images at the moment, the branch i mentioned local-cloud-img, rips out libvirt, uses cloud images for local provider, and uses cloud-init to configure
[15:14] rogpeppe, yes libvirt is an abstraction layer over different virt providers
[15:14] hazmat: ok, cool. i'm not too far off then!
[15:14] rogpeppe, but it's not particularly good for several of them
[15:15] hazmat: that's in the nature of portability layers i guess. lowest common denominator.
[15:15] rogpeppe, for example openstack uses libvirt to address several different tech, but there are also standalone providers for some of the same ones that libvirt provides an abstraction for
[15:15] within openstack
[15:15] rogpeppe, it's actually quite rich as a common api, but in the context of lxc specifically it's typically out-of-date, and overkill
[15:16] rogpeppe, we had a discussion about trying to use libvirt directly a while ago instead of lxc directly
[15:16] hazmat: given that we're directly reliant on lxc in other aspects, i'd be inclined to agree, although i haven't looked at how hard it is to configure networks in lxc directly.
[15:16] hazmat: that's the other possibility of course
[15:17] rogpeppe, it's pretty trivial, just needs the packaging effort around creating the bridge, etc
[15:17] trivial to configure that is
[15:17] * hazmat preps for his next meeting
[15:17] andrewsmedina: anyway, with a comment, i'm happy with that code.
[15:49] TheMue: i see sporadic test failures testing state: http://paste.ubuntu.com/1001083/
[15:51] TheMue: I've made no changes to the watcher for quite some time, but I've just merged the trunk into my branch and get a watcher failure too.
[16:02] TheMue: i don't *think* it's my fault :-)
[16:10] * niemeyer waves
[16:10] TheMue: Perhaps a race?
[16:17] niemeyer, TheMue: i suspect a race, or an inadequate timeout.
[16:17] niemeyer: BTW if i take out that PublicRead change, does that "make amazon tests work" branch LGTY?
[16:17] rogpeppe: I've stopped checking it out there
[16:18] So, back, had to do an emergency repair here (jalousie broke down).
[16:18] niemeyer: ok. 'cos i need that branch in an upcoming branch that already has a dependency...
[16:27] rogpeppe: Ok, what's the plan?
[16:28] niemeyer: i've pushed that branch without that change. the amazon tests no longer pass of course, so i'm changing it temporarily locally so i can test. but the other changes are less trivial and also necessary.
[16:28] niemeyer: so when s3 implements signed URLs we'll use them and the amazon tests will pass
[16:29] rogpeppe: Ok, I mean what's the plan as far as the branch goes
[16:29] niemeyer: the "fix amazon tests" branch?
[16:29] rogpeppe: Whatever branch you mean.. you said you need that branch.. what's your plan?
[16:30] rogpeppe: A few moments ago I had your watcher error too, but now everything runs. Will investigate it later.
[16:30] niemeyer: i'll hold off proposing again until the "fix amazon tests" branch is in.
[16:31] rogpeppe: Ok..
so I'm not sure about what you meant
[16:31] niemeyer: BTW if i take out that PublicRead change, does that "make amazon tests work" branch LGTY?
[16:31] niemeyer: ok. 'cos i need that branch in an upcoming branch that already has a dependency...
[16:31] niemeyer: it would be nice to have a review of the "fix tests" branch so i can propose that upcoming branch
[16:31] rogpeppe: You already have a review.. there's a change that can't go in
[16:31] niemeyer: it no longer has that PublicRead change
[16:32] niemeyer: i've re-proposed it
[16:32] rogpeppe: Ok.. so you've reproposed and you want another review?
[16:32] niemeyer: yes please
[16:32] niemeyer: you should have got a mail saying "please take a look" :-)
[16:32] rogpeppe: Heh
[16:33] rogpeppe: That's a pretty difficult way to ask for a review
[16:33] rogpeppe: Will have a look
[16:33] niemeyer: isn't that implied? if not, what's the point of the mail?
[16:34] rogpeppe: You see the history above?
[16:34] rogpeppe: "Please have a look at http://... again" is fine
[16:34] niemeyer: fair enough
[16:35] niemeyer: (but i thought that was the point of the "Please take a look" email!)
[16:35] rogpeppe: Nevermind!
[16:35] yay! go tools now working on the server again...
[16:41] niemeyer: BTW while (if?) you're on little reviews, this is trivial, assuming you agree with the premise. https://codereview.appspot.com/6188068/
[16:50] rogpeppe: What's the point of these loops?
[16:50] for i := 0; i < 5; i++ {
[16:50] ?
[16:50] niemeyer: ping
[16:50] andrewsmedina: hi
[16:50] niemeyer: thanks for your review
[16:50] andrewsmedina: np
[16:50] niemeyer: I think it is okay
[16:50] now
[16:50] niemeyer: same as for other eventual consistency issues
[16:51] niemeyer: i copied the retry code from elsewhere. we could use a different retry time or count
[16:51] rogpeppe: Ok, can you please reduce the delay on each retry then, to maybe 1e8, increase the limit to 30 or so, and add a comment explaining?
[16:51] niemeyer: ok, will do, and in the other place too.
[16:51] rogpeppe: No need to be too extensive in the comment, just a reminder
[16:52] niemeyer: I will work on local provider interface now :p
[16:52] rogpeppe: Thanks
[16:53] rogpeppe: "LGTM assuming reduced timeouts and increased N as discussed online."
[16:53] niemeyer: thanks a lot
[16:56] rogpeppe: No problem.. that was easy :)
[16:56] rogpeppe: Hopefully we can get a fix to the signature issue in very soon
[16:56] rogpeppe: I'll try to address that today still
[16:57] *splash*
[16:57] (the sound of a branch landing)
[16:58] niemeyer: Is anything more needed for my proposal to be accepted?
[17:00] right, gotta go. have fun all. see you tomorrow.
[17:13] rogpeppe: See ya
[17:13] andrewsmedina: I haven't re-reviewed your branch yet
[17:13] andrewsmedina: have you addressed the latest comments by rog/
[17:13] ?
[17:19] aujuju: What is the best way to use the mysql charm in Juju with dynamic database credentials?
[17:50] niemeyer: yes, about %d rog said it is ok
[17:50] niemeyer: about split, virsh always returns the same pattern
[17:50] andrewsmedina: Cool, need to step out for a doc appointment.. back soon, though
[17:50] niemeyer: I'm using the same strategy used in juju python code
[17:51] niemeyer: ok
[17:55] Multiple IPs!!
[17:56] * SpamapS does a dance
[17:56] niemeyer: great news!
[17:56] the multiple IPs, not the doc appointment ;)
[18:08] SpamapS: :-p
[19:25] SpamapS: Indeed!!
[20:48] Ouch.. that's the sleepiest time of the day
[20:54] Alright, not working.. I'll do it Windows style and take a short nap to reboot
[20:54] sleep is when the garbage collector runs.
[22:00] davecheney: Morning!
[22:00] Aram: Indeed
[22:02] morning!
[22:02] niemeyer: let's talk about the proxy environ
[22:02] davecheney: Yeah, was planning on that too
[22:03] davecheney: Good we're overlapping today
[22:03] how can I convince you of its utility
[22:03] davecheney: Hehe :-)
[22:03] niemeyer: yes, sorry about that, i had to go into the city to finish some of the employment paper work that was dropped when veronika moved on
[22:03] davecheney: Ah, no worries at all
[22:04] davecheney: The concept of having such a wrapper feels pretty icky to me
[22:04] niemeyer: ok, maybe the name is a problem
[22:04] davecheney: It's a bit like replacing the car every time the battery goes bad
[22:04] davecheney: Not really
[22:05] davecheney: It's the concept itself
[22:05] ok, let me put it like this
[22:05] we don't wrap one Environ in another
[22:05] the proxy wraps a *ConfigNode in an Environ
[22:05] davecheney: We wrap it so that it is magically replaced every time its configuration changes
[22:05] yes
[22:05] davecheney: That foundational idea is what seems misleading
[22:06] ok
[22:06] davecheney: If the configuration of an environment changes, we should replace its configuration at an appropriate well known time, not the entire thing behind the scenes at an arbitrary moment
[22:07] niemeyer: hmm, this is a bit of a change of heart from UDS
[22:07] where we talked about making the environment a *ConfigNide
[22:07] Node
[22:07] davecheney: We're talking about different things
[22:07] then the provisioning agent would start to watch that node, until it got a valid configuration
[22:07] ok
[22:08] davecheney: I'm talking about the thing that offers an environs.Environ interface, not the content of the /environment node
[22:08] ok
[22:09] davecheney: Whenever the configuration of an environment changes, there's a window of time that the value in memory we have that represents the environment (implements environs.Environ) will have a wrong configuration.
[22:09] yup
[22:10] i see that as inevitable
[22:10] davecheney: This is a problem inherent to the distributed context that we can't solve..
[22:11] davecheney: Understanding that gives us some freedom too, interestingly. It means we can *choose* the best moment to make the configuration active, since we have to deal with it being wrong no matter what.
[22:11] i don't think that matters
[22:11] the config changing is not the thing that broke the environment
[22:11] take the case of AWS keys changing
[22:11] davecheney: It matters, in the sense that we don't have to rush to replace the configuration of an environment.
[22:12] ok, so i see your concern as proxy.loop() replacing the configuration at will
[22:12] davecheney: Not the configuration, it is replacing *the environment*
[22:12] davecheney: That environs.Environ, is not just a configuration
[22:13] how are those not the same thing
[22:13] davecheney: It's the environment state we have in memory.
[22:13] the environ.Environ is formed from materialising the configuration stored in zk
[22:13] davecheney: Is httpd and httpd.conf the same thing?
[22:13] that is not a good example
[22:13] :-)
[22:13] but I take your point
[22:14] to offer a counter example, port 80 and httpd.conf are the same thing
[22:14] davecheney: Wait, what?
[22:14] davecheney: *port 80* and *httpd.conf* are the same thing!?
[22:14] as a counter example
[22:15] not being too literal
[22:15] davecheney: I can't process or even argue about that..
it seems so wrong on so many levels
[22:15] sure, let's get back to the point
[22:16] working from the point of view of the PA
[22:16] it needs to see changes to the topology relating to machines
[22:16] davecheney: Right
[22:16] and try to make reality match that
[22:17] so, it needs an Environ
[22:17] this we know
[22:17] davecheney: Right, I'm with you
[22:17] however the PA doesn't have environments.yaml, only the data stored in zk
[22:17] again, no surprise
[22:17] davecheney: Right, all good
[22:18] so we've added support to state and environs to be able to construct an Environ interface from zk data
[22:18] Yep, sensible
[22:18] (now comes the interesting leap)
[22:18] so, what happens when the underlying data in zk changes
[22:19] there are a number of possible choices, and it looks like I've chosen the wrong logic
[22:20] * davecheney re-reads the preceding discussion before continuing
[22:20] ah, yes, you previously said, "If the configuration of an environment changes, we should replace its configuration at an appropriate well known time,
[22:20] "
[22:21] i offer that the proxy environ does that
[22:21] without having to teach the real environs how to do that
[22:21] and it is not papering over their limitations
[22:21] but a good separation of concerns
[22:22] hey, what did you miss ?
[22:22] LOL, perfect timing for the disconnection..
[22:22] davecheney: Right, all good
[22:22] so we've added support to state and environs to be able to construct an Environ interface from zk data
[22:22] Yep, sensible
[22:22] (now comes the interesting leap)
[22:22] 08:18 < davecheney> so, what happens when the underlying data in zk changes
[22:22] 08:19 < davecheney> there are a number of possible choices, and it looks like I've chosen the wrong logic
[22:22] 08:20 * davecheney re-reads the preceding discussion before continuing
[22:23] 08:20 < davecheney> ah, yes, you previously said, "If the configuration of an environment changes, we should replace its configuration at an appropriate well known time,
[22:23] 08:20 < davecheney> "
[22:23] 08:21 < davecheney> i offer that the proxy environ does that
[22:23] 08:21 < davecheney> without having to teach the real environs how to do that
[22:23] 08:21 < davecheney> and it is not papering over their limitations
[22:23] 08:21 < davecheney> but a good separation of concerns
[22:23] davecheney: and that was the interesting leap
[22:23] davecheney: The proxy doesn't replace the configuration.. it replaces the whole environment
[22:23] i accept that
[22:24] davecheney: and has to wrap its whole interface so that whoever has it can pick the latest replacement
[22:24] it does so because there is no Environ.ReloadConfiguration() facility
[22:24] davecheney: Exactly
[22:24] davecheney: That's what we need..
well, something like it [22:24] how would such a facility work, given that Environ knows nothing of state [22:24] davecheney: SetConfig, potentially [22:25] davecheney: func (e) SetConfig(config EnvironConfig) error, more precisely [22:25] ok [22:26] i think that pushes a lot of logic down to the environ [22:27] it may potentially hold locks while its config is reloading [22:27] i think using a proxy accomplishes the same thing with less work [22:27] davecheney: The logic is in the environment anyway, if you see it from the perspective that the wrapper is actually an environment that deals just with that one factor [22:28] sounds like good separation of concerns [22:28] davecheney: Not really, from the perspective that two sequential calls to the same environment interface are actually dealing with different environments [22:28] davecheney: This is really not that great [22:29] ok, now I am understanding your concern [22:29] i see that as inevitable, and something we need to code for anyway [22:29] as the environ can't hold any state about its remote provider [22:29] davecheney: Why is it inevitable? [22:30] davecheney: Why? [22:30] well it can hold state, but it must validate that state before using it [22:30] but that is a digression [22:30] davecheney: Yes [22:30] so, we're always talking about the case of AWS creds failing [22:30] s/failing/changing [22:31] davecheney: Yes.. or failing.. both have to be dealt with [22:31] so, take the case of calling .Instances() to get the instance data, then using StopInstances to stop some of those instances [22:31] during the two calls the environ is replaced [22:31] i don't see the problem [22:32] the data from Instances() comes from the real AWS somewhere, and is passed back to the real AWS in the second call [22:32] ah, let me approach it another way [22:32] davecheney: I don't want to be using a value that is changing behind my feet.. I don't want to be programming with that mindset. 
The problems we handle are complex enough by themselves.. we don't have to spoil our own environments. [22:33] so you are happy for the configuration inside an Environ to change, but not the reference to the Environ itself ? [22:34] davecheney: Yes, I'm happy to see the configuration within the environment changing when the configuration of the environment changes [22:34] (!) [22:35] maybe i can approach this another way [22:35] with a different example [22:36] two machines are added to the topology, the PA starts the first, then is restarted, then starts the second [22:37] the reason i ask this is I am trying to understand what is so important inside the specific instance of the real Environ that you want to preserve (that isn't its configuration) [22:37] davecheney: func MyFunc(e environs.Environ) { ... ... }.. is it fine to replace what e *is* in the middle of MyFunc? [22:38] niemeyer: if e is an interface, yes [22:38] wait, let me reread that again [22:38] yes, if e is an interface, yes [22:38] davecheney: Oh, ok.. so it doesn't matter how e is implemented, or what it does? It's always ok, necessarily? [22:39] in the situation of juju, we have to code to that [22:39] for the specific case that e represents a set of calls to an external provider of resources [22:39] here is a better example [22:39] if e was an SQL connection [22:39] davecheney: Not at all.. 
we don't have to code with a wild environment where we have no idea about what our own code is doing [22:40] it is well established to give back a proxy e (sql connection) that goes and gets a real sql connection when invoked [22:40] davecheney: You've lost me [22:41] let me back up [22:41] davecheney: I have no idea about what you mean there [22:41] let's not talk about environs for a moment [22:41] you know how db connection pools work [22:41] davecheney: Yep [22:41] you get a proxy connection from the pool, then when you do work on it, it checks out a real connection [22:42] davecheney: No [22:42] ok, i'll cease that line of argument [22:42] ok, to summarise, i am arguing that using a proxy environ achieves the same as an (unwritten) Environ.SetConfig() [22:43] but I can't offer any argument that says it is superior (apart from we don't have to write SetConfig) [22:43] so I will drop this branch and go and do SetConfig [22:43] davecheney: I understand your argument, and I've been pointing out that it doesn't.. an Environ is an interface.. the implementation of that interface can do anything it pleases, cache anything it wants, start any goroutines it wants.. [22:44] niemeyer: ok, yes, and that would be sensible [22:44] however the consumer of the interface can't assume any of that [22:44] davecheney: Replacing the credentials that an environment uses should not destroy all of that. [22:44] davecheney: Nor should it cause code that takes an Environ to blindly talk to two different values at an arbitrary moment. [22:45] ok, you've won me over [22:45] i would point out that the python version does rebuild a new environ when the config changes [22:45] but it won't take long to do SetConfig [22:46] davecheney: We're talking about *credentials*.. 
why are we killing the value representing the whole environment to have its credentials changed [22:46] i have a slight concern that at some point in the future configuration may include more than credentials [22:46] but for right now, and what we need for this cycle [22:46] i can't see that happening [22:46] davecheney: The Python provisioning agent isn't something I'd put in a frame, for a few different reasons.. I'm sure you can come up with something we'll be a lot more proud of [22:47] flattery will get you everywhere :) [22:47] davecheney: I'm being honest! [22:47] ok, i'll add SetConfig [22:48] but I would point out my concern that SetConfig will have to drop (potentially) any caches or other internal state [22:48] but having just written that [22:48] it's not going to be that hard to recognise those things [22:49] davecheney: Agreed, and it's also conscious.. the exact moment in time when the agent replaces the environment configuration is under our control. [22:50] ah that raises an interesting point [22:51] when you say under our control, i imagine you saying 'when nothing else is using it' [22:51] is that fair to say ? [22:51] davecheney: You know it's not :) [22:51] good [22:51] because that would make things harder [22:52] davecheney: But the provisioning agent itself should not be spinning and replacing the configuration in the background.. 
there's no reason to, I believe [22:52] so, currently in the proposed PA, the actual stop/start signal is delegated to a goroutine to handle any potential retry logic [22:53] the aim of that was, the signal of topology change is sent one time [22:53] on the change, you can't observe it again [22:53] so we have to store it until it's actioned [22:53] davecheney: We've debated at length the idea of how the agents should be organized, and our tentative design direction was to have a main loop that is responsible for dispatching tasks [22:54] davecheney: I imagine the configuration replacement being done as one of the potential tasks in that main loop [22:54] ok my recollection was the opposite, but i'm happy to be corrected [22:54] davecheney: That's all theoretical, but it gives you the feeling we have about the problem, at least [22:55] davecheney: I recall we talked about one goroutine per machine.. is that what you are alluding to? [22:56] yes [22:56] davecheney: Yeah, this is a recent idea [22:56] so, either the main loop servicing the watcher maintains a list of machines to be added and delted [22:56] davecheney: I'm not entirely sure about it yet.. 
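(Editor's sketch, not part of the log.) The "main loop that dispatches tasks" idea discussed above could look roughly like the following. Everything here is an assumption made for illustration: the `agent` type, its channels (which stand in for the machine and configuration watchers), and the string-valued configuration are hypothetical and are not the juju API. The point is only that configuration replacement happens inside the loop, at a moment the loop chooses, while per-machine work is delegated to goroutines that the loop still tracks.

```go
package main

import (
	"fmt"
	"sync"
)

// agent is a hypothetical stand-in for the provisioning agent.
type agent struct {
	machines chan string   // stands in for the machine watcher
	configs  chan string   // stands in for the configuration watcher
	stop     chan struct{} // closed to request termination
	done     chan struct{} // closed once the loop and all workers have exited

	mu      sync.Mutex
	config  string
	started []string
}

func newAgent(initial string) *agent {
	a := &agent{
		machines: make(chan string),
		configs:  make(chan string),
		stop:     make(chan struct{}),
		done:     make(chan struct{}),
		config:   initial,
	}
	go a.loop()
	return a
}

// loop is the single place where configuration is replaced and
// where per-machine work is handed out.
func (a *agent) loop() {
	defer close(a.done) // runs last, after workers.Wait below
	var workers sync.WaitGroup
	defer workers.Wait()
	for {
		select {
		case cfg := <-a.configs:
			a.mu.Lock()
			a.config = cfg
			a.mu.Unlock()
		case m := <-a.machines:
			workers.Add(1)
			go func() {
				defer workers.Done()
				a.startMachine(m)
			}()
		case <-a.stop:
			return
		}
	}
}

// startMachine pretends to start a machine with whatever
// configuration is current when the worker gets to run.
func (a *agent) startMachine(name string) {
	a.mu.Lock()
	defer a.mu.Unlock()
	a.started = append(a.started, fmt.Sprintf("%s with %s", name, a.config))
}

// Stop asks the loop to exit and waits for it and its workers.
func (a *agent) Stop() {
	close(a.stop)
	<-a.done
}

func main() {
	a := newAgent("creds-v1")
	a.machines <- "machine-0"
	a.configs <- "creds-v2"
	a.machines <- "machine-1"
	a.Stop()
	fmt.Println(a.started)
}
```

Note that the worker for machine-0 may observe either configuration, depending on scheduling, which is the "it can happen at any time" concern raised about per-machine goroutines; the sketch does not try to resolve it.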
[22:56] then services that list when there is nothing else to do [22:56] davecheney: I'm not in a good position to say either way since it depends a lot on how the details are done [22:57] or do something like this [22:57] http://codereview.appspot.com/6220063/ [22:57] davecheney: I suspect having goroutines per machine should make the problem simpler, at least in terms of moving things concurrently and focusing on a single task at once [22:58] i think so too [22:58] however it makes the problem of reloading the environment configuration a bit trickier [22:58] davecheney: Still, there's some maintenance task which is unrelated to an individual machine, such as configuration changing, which seems to make sense to be done by a main loop [22:58] as, from the point of view of those workers, it can happen at any time [22:58] davecheney: That orchestrates the whole thing, if you will [22:58] yup, so it's selecting on the watcher for machines and the watcher for confgiuration [22:58] machine changes get delegated [22:59] configuration change call environ.SetConfig [22:59] davecheney: Right, but even those changes need some taking care of [22:59] davecheney: E.g. a controlled reboot needs to take into account those tasks [23:00] right, you're talking about limiting work in progress and the number of concurrent connections in progress [23:00] davecheney: No, not just that [23:00] can I ask a semi related question ? [23:00] davecheney: I'm saying that it's not just a background task that we fire and forget [23:00] do machines, that is to say their topology name, ever get reused [23:00] davecheney: It needs to be managed [23:01] davecheney: If the provisioning agent has to reboot (e.g. 
upgrades) it should be done in a controlled fashion [23:01] can you expand a bit on that last sentence please [23:01] davecheney: Pretty much all background activity must be accounted for, and properly terminated synchronously when necessary [23:01] davecheney: Watchers, for example, have Stop methods that request the termination of the background task [23:02] davecheney: Those methods are synchronous.. they ask for the background activity to die, and wait until it does so [23:02] can you expand on this bit "If the provisioning agent has to reboot (e.g. upgrades) it should be done in a controlled fashion" [23:02] davecheney: It's critical that things that start background activity, have hooks onto the termination of those activities, and that we can wait and verify that the given activity was ok [23:02] specifically about rebooting the PA [23:03] davecheney: We need to be able to upgrade, from day zero [23:03] davecheney: Upgrade == get new version, unpack, reboot [23:03] davecheney: s/reboot/restart/, maybe that's clearer [23:03] ok, i'm not sure i understand the significance of that [23:04] davecheney: process stops and starts? [23:04] i'm assuming this code will run on EC2 therefore stands a strong chance [23:04] of being rebooted at any time [23:04] davecheney: Yes, why does that matter? [23:04] i'm writing everything to assume no local state [23:04] everything is sensed from the zk state [23:05] and acts upon it anew every time [23:05] davecheney: Yep, that's nice.. but you have local state no matter how much you pray not to.. 
in memory [23:05] davecheney: Code running, variables flowing, error conditions [23:05] sure, but let's not get pedantic [23:05] davecheney: No no, I'm serious [23:05] davecheney: It's not ok to say 1) start machine; 2) reboot [23:06] the only assumption i'm building on is that the watcher will always tell me about anything I have missed on the first cycle [23:06] davecheney: Yes, that's fine [23:06] tf the PA can start at any point and learn about the complete topology to that point without having to maintain its own journal [23:06] davecheney: I'm talking about background activity.. it's not ok to put a goroutine to do something and then ignore it [23:06] davecheney: Watchers have Stop methods [23:06] davecheney: We can stop them, and get the error condition [23:07] davecheney: That's not what I'm saying.. I'm saying that activities should be controllable.. it's not ok to run stuff in the background and be unable to ask them to stop or to tell if they worked or not [23:07] davecheney: That's hard to test, hard to debug, hard to make work correctly [23:07] ok, i understand your concerns [23:08] i will work towards addressing them today [23:08] davecheney: Python code base is largely unmanaged.. watchers are very hard to deal with [23:09] davecheney: We have code hacks in place for managing the watcher logic that were obtained from trial and error [23:09] davecheney: "Oh, hey, yeah.. maybe it's still running" [23:09] davecheney: That sucks.. we know better now.. 
let's keep track of stuff running so we have a handle on them [23:09] right, because watchers drive twisted callbacks, which might still be firing [23:10] davecheney: Exactly [23:10] so, to conclude, because this has taken a lot of your time [23:10] you are happy for me to continue to iterate on this [23:11] i do worry that the PA is becoming a blocker to having _something_ to get us a golang juju [23:11] davecheney: I am very happy with that, and I find that conversation very healthy [23:11] davecheney: It is a blocker indeed, but you had to learn about the problems we face [23:11] davecheney: and we all have to learn about how to design it properly [23:12] davecheney: I suggest small branches, that we can quickly merge/refactor/reject/whatever [23:12] I'm back [23:12] davecheney: So that as a side effect of these conversations we move forward steadily, and get unblocked [23:12] * davecheney applauds [23:13] Alright.. I have to go help my wife for a moment.. back later [23:13] andrewsmedina: Heya [23:13] nw [23:13] thanks for the talk [23:13] andrewsmedina: Will still be back today, in case you wanna talk [23:15] niemeyer: ok [23:26] niemeyer: juju client on centos working fine :D [23:33] andrewsmedina: Woohay [23:35] andrewsmedina, is this the python version of the juju client? [23:35] I suppose.. the Go version isn't runnable yet [23:39] jimbaker`: yes, it's the python version [23:39] niemeyer: we will now work on making VMs created by juju work fine with centos :D [23:40] andrewsmedina: Sweeet [23:41] andrewsmedina: How's the PaaS side of things going? [23:43] niemeyer: the PaaS has a cli and a webserver (rest api) both written in go [23:43] andrewsmedina: cool [23:43] and that runs ok on RHEL5 ? [23:44] i ask because technically 2.6.18 isn't a supported kernel [23:44] davecheney: it's running on ubuntu at the moment [23:44] ah, ok [23:44] davecheney: we will use centos 6. 
[23:44] 6.2 [23:44] no problems then [23:44] 2.6.32 is fine [23:44] go could be made to run on 2.6.18 easily [23:46] to create machines for projects (django, php, rails) and services (mysql, mongo, elastic search) the webserver calls juju in the backend [23:46] and with juju we can use openstack or ec2 for IaaS [23:46] we are using openstack essex [23:47] this project is opensource https://github.com/timeredbull/tsuru [23:50] niemeyer: are you still there [23:51] niemeyer: nm, will be back in 30 mins (breakfast) [23:51] davecheney: Here [23:52] was just looking at the latest changes to environs/interface.go [23:52] Trying to get that S3 signing stuff working [23:52] davecheney: Ok? [23:52] what is the signature that Environs.SetConfig() should use [23:52] should it take an EnvironConfig ? [23:52] or just a plain map[string]interface{} [23:53] davecheney: Yeah, EnvironConfig [23:53] davecheney: So that we force the raw data to go through NewConfig [23:54] davecheney: Which validates and processes it into something the environment is happy with [23:54] ok, might have to do some type checking on the way in [23:55] davecheney: I think it's fine to assume it's the correct one [23:55] davecheney: It will panic if it's not, and that seems fine given it would be a major coding mistake [23:55] yup [23:55] ok, afk for real this time [23:55] davecheney: See ya later
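(Editor's sketch, not part of the log.) The closing exchange settles on `SetConfig` taking an `EnvironConfig` that has already been through `NewConfig`, with a type assertion that panics if the wrong concrete config is passed in. A minimal illustration of that shape follows. The names `exampleConfig`, `exampleEnviron`, `currentAccessKey`, and the `access-key`/`secret-key` attributes are hypothetical, the interfaces are cut down to what the example needs, and the mutex is one possible way to guard the swap rather than something the log prescribes.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// EnvironConfig is an opaque, already-validated configuration value
// (assumed shape, for illustration only).
type EnvironConfig interface{}

// Environ is a cut-down, hypothetical version of the interface under discussion.
type Environ interface {
	SetConfig(config EnvironConfig) error
}

// exampleConfig is the concrete config of a made-up provider.
type exampleConfig struct {
	accessKey string
	secretKey string
}

// NewConfig validates raw attributes and returns a config the
// environment is happy with, so raw data never reaches SetConfig.
func NewConfig(attrs map[string]interface{}) (EnvironConfig, error) {
	access, _ := attrs["access-key"].(string)
	secret, _ := attrs["secret-key"].(string)
	if access == "" || secret == "" {
		return nil, errors.New("access-key and secret-key are required")
	}
	return &exampleConfig{accessKey: access, secretKey: secret}, nil
}

// exampleEnviron keeps its identity (and any caches or goroutines it
// may own) while its configuration is replaced underneath it.
type exampleEnviron struct {
	mu  sync.Mutex
	cfg *exampleConfig
}

// SetConfig swaps the configuration in place. The unchecked type
// assertion panics if a config from another provider is passed,
// which would be a coding mistake rather than a runtime condition.
func (e *exampleEnviron) SetConfig(config EnvironConfig) error {
	cfg := config.(*exampleConfig)
	e.mu.Lock()
	defer e.mu.Unlock()
	e.cfg = cfg
	return nil
}

// currentAccessKey reads the configuration under the same lock.
func (e *exampleEnviron) currentAccessKey() string {
	e.mu.Lock()
	defer e.mu.Unlock()
	return e.cfg.accessKey
}

func main() {
	var env Environ = &exampleEnviron{}
	cfg, err := NewConfig(map[string]interface{}{
		"access-key": "key-1",
		"secret-key": "secret-1",
	})
	if err != nil {
		fmt.Println("bad config:", err)
		return
	}
	if err := env.SetConfig(cfg); err != nil {
		fmt.Println("SetConfig failed:", err)
		return
	}
	fmt.Println("configured with", env.(*exampleEnviron).currentAccessKey())
}
```

Callers holding the `Environ` keep talking to the same value before and after the credentials change, which is the property argued for in the discussion above.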