[00:04] fwereade: I'll look into the details, but am not able to get clarity on that right at this moment because people are not around
[00:04] mramm, np, I'm heading off to sleep soon
[00:04] fwereade: understood
[00:04] take care of yourself -- it's late your time!
[00:05] mramm, cheers :)
[00:17] ah ffs
[00:18] m_3: ping
[00:19] davecheney: I have bootstrap problems with raring with the 1.9.14 package, and tip of trunk
[00:19] davecheney: I'm in dive mode, but breaking for lunch now
[00:20] as in diving in with logging to try and figure out wtf is going on
=== thumper is now known as thumper-afk
[00:21] thumper-afk: ok, i'm doing the same
[00:21] it could be that we have too many tools in the public bucket
[00:21] in fact, that is probably it
[00:22] release mode uses best fit, so it iterates over the tools
[00:22] dev mode uses exact fit, so it's a straight hit and run
[00:26] thumper-afk, davecheney: sleep
[00:26] (me sleep, not you)
[00:28] go
[01:33] davecheney: pong
[01:39] m_3: wazzup ?
[01:40] back home
[01:40] right
[01:40] i'm going to make one more attempt to figure out what is going on with hp cloud
[01:40] i'm going to allocate 100 machines
[01:40] manually remove their public addresses
[01:40] another 100
[01:40] etc
[01:40] etc
[01:41] see if that gets us over the 2^8 hump
[01:41] yeah, did you ever try it from _outside_?
[01:41] that's what I was gonna do next
[01:41] m_3: i was able to launch 100 extra instances from inside via the nova command
[01:41] basically spin up what we have from outside of hp
[01:41] but I've seen this problem before, a bunch
[01:42] intermittent inability to resolve the endpoint urls
[01:42] maybe it will work from outside
[01:42] it's like when we spin up too many machines inside an openstack tenant
[01:42] this kills charmtests against hp sometimes (w/ juju-0.6)
[01:42] it runs out of ip's to respond _to_ dns queries
[01:42] I really don't understand it
[01:42] i can sort of explain it
[01:42] couldn't work with it yesterday cause of the talk
[01:43] given the ways that each tenant probably has the _same_ 10/8 address space
[01:43] dude, there were >300 people in our talk yesterday!
[01:43] so there is probably shittones of nat going on
[01:43] they mostly stayed
[01:43] m_3: i saw the photo
[01:43] that is fucking amazing !?!
[01:43] but /8's frickin huge
[01:44] sure, but it's the same 10/8 for each customer
[01:44] you're thinking that's natted out using a limited pool of outsides?
[01:44] just like your router is using 192.168.0/16
[01:44] my suspicion is because we're asking for so many public addresses, we're sort of choking off our own air supply
[01:45] anyway, going to explore that theory today
[01:45] hmmm... it really looks like the same as when I hit it from ec2
[01:46] m_3: is this at all related to the stuff mgz was saying about security groups and screwing ourselves by making too many stupid requests ?
[01:46] dude... dunno
[01:47] I plan to sort of wrap my head around all of this again tomorrow
[01:47] roger
[01:47] you've got to uncompress from ODS
[01:47] * davecheney waves his wand
[01:47] I'll start from scratch and grok what you and mgz's done this week
[01:47] "thou shall never have to say cloud again"
[01:47] haha
[01:47] yes
[01:47] "i've got a cloud in my pants, and everyone is invited"
[01:48] * m_3 groan
[01:49] too soon ?
[01:49] :)
[01:49] m_3: fyi, just bringing the code on juju-hpgoctrl2-machine-0
[01:49] up to date with the overnight changes
[01:50] davecheney: awesome... thanks man
[01:50] m_3: no worries, we fixed some good issues this week
[01:50] the deploy logs from the load test are much cleaner
[01:50] they actually tell you what is going on
[01:50] status is now usable while you are doing a big deploy
[01:50] etc
[01:51] also, when hp cloud wants to, it is nearly twice as fast to bring up an instance than ec2
[01:51] which doesn't suck
[02:00] 2013/04/19 01:59:49 ERROR worker/provisioner: cannot start instance for machine "16": cannot set up groups: failed to create a rule for the sec
[02:00] urity group with id:
[02:00] caused by: Maximum number of attempts (3) reached sending request to https://az-2.region-a.geo-1.compute.hpcloudsvc.com/v1.1/17031369947864/os-
[02:00] security-group-rules
[02:39] davecheney: &http.Response{Status:"200 OK", StatusCode:200, Proto:"HTTP/1.1", ProtoMajor:1, ProtoMinor:1, Header:http.Header{"X-Amz-Request-Id":[]string{"90D56A3D6895C07E"}, "X-Amz-Id-2":[]string{"6lThSRAi5lMeq9oe8oSeibO7fjvZQLjgKGYG0Gs7vRMBZrQ6Z0xVlIfyILAoWO4A"}, "Date":[]string{"Fri, 19 Apr 2013 02:38:03 GMT"}, "Content-Type":[]string{"application/xml"}, "Server":[]string{"AmazonS3"}}, Body:(*http.bodyEOFSignal)(0xf84022ec60),
[02:39] ContentLength:-1, TransferEncoding:[]string{"chunked"}, Close:false, Trailer:http.Header(nil), Request:(*http.Request)(0xf8403ac600)}
[02:39] davecheney: this is the response from the http request inside goamz/s3 for the request to list the public bucket
[02:39] seems like body is: Body:(*http.bodyEOFSignal)(0xf84022ec60)
[02:40] ContentLength:-1,
=== thumper-afk is now known as thumper
[02:41] hate it when I forget to reset the nick
[02:42] that is because it is chunked
[02:42] TransferEncoding:[]string{"chunked"},
[02:42] length is unknown from the server
[02:42] ok, what does that mean?
[02:42] rfc 2616 chunked transfer encoing
[02:42] encoding
[02:43] it's not a problem
[02:43] ok
[02:43] it's a way of sending the http body without having to specify the length first
[02:43] very common if you are streaming a response
[02:47] davecheney: this is the line that is failing: err = xml.NewDecoder(hresp.Body).Decode(resp)
[02:54] davecheney: I wonder if we have moved into a chunked response now, which the decoder can't handle due to number of tools...
[02:58] * thumper writes up findings to the list
[03:06] thumper: nah, it just gets a Reader
[03:06] what implements the reader is not important
[03:15] davecheney: so what would it be then?
[03:16] * thumper has put the heating up
[04:29] davecheney: ping
[06:02] mornin' all
[07:06] m_3: https://bugs.launchpad.net/juju-core/+bug/1170595
[07:06] bingo
[07:07] this is why we're having problems in load test
[07:07] 2013/04/19 07:07:20 INFO rpc: discarding obtainer method reflect.Method{Name:"Kill", PkgPath:"", Type:(*reflect.commonType)(0x7468a8), Func:reflect.Value{typ:(*reflect.
[07:07] commonType)(0x7468a8), val:(unsafe.Pointer)(0x4d6359), flag:0x130}, Index:4}
[07:07] 2013/04/19 07:07:20 INFO rpc: discarding obtainer method reflect.Method{Name:"requireAgent", PkgPath:"launchpad.net/juju-core/state/apiserver", Type:(*reflect.commonTyp
[07:08] e)(0x767768), Func:reflect.Value{typ:(*reflect.commonType)(0x767768), val:(unsafe.Pointer)(0x4d63e7), flag:0x131}, Index:8}
[07:08] 2013/04/19 07:07:20 INFO rpc: discarding obtainer method reflect.Method{Name:"requireClient", PkgPath:"launchpad.net/juju-core/state/apiserver", Type:(*reflect.commonTy
[07:08] pe)(0x767768), Func:reflect.Value{typ:(*reflect.commonType)(0x767768), val:(unsafe.Pointer)(0x4d64ac), flag:0x131}, Index:9}
[07:08] ^ rogpeppe1 is this a problem ?
[07:08] davecheney: no
[07:08] was spotted on pa restart
[07:09] davecheney: that's expected behaviour
[07:09] davecheney: the warnings are useful when developing
[07:09] davecheney: i know they're annoying otherwise
[07:09] ok, nm
[07:10] davecheney: i guess we should probably move those methods off the rpc root object to stifle the warnings.
[07:11] rogpeppe1: if they aren't bugs then I wouldn't worry about it for the moment
[07:13] davecheney: do you know if anything's happened about 1.10 yet?
[07:14] davecheney: 'cos i have a couple of minor bugs (i already have the fixes for them) that it would be great to sort out if there was a moment or two more.
[07:14] davecheney: it seems nobody has ever used juju get.
[07:14] rogpeppe1: yeah, i saw that bug
[07:14] i think you are right
[07:14] no one ever did use it
[07:15] davecheney: i had a fun time yesterday starting up a juju env, making some weirdish relations, upgrading charms, resolving hooks, etc
[07:15] davecheney: it actually seemed to work pretty well
[07:16] rogpeppe1: i don't doubt that, we have excellent charm compatibility
[07:18] davecheney: i've just had an idea for a way to make it easy to write little charms that exercise particular functionality; trying to knock something together today
[07:18] jujud 8613 root 1w REG 253,1 71378 131869 /var/log/juju/machine-0.log
[07:18] jujud 8613 root 2w REG 253,1 71378 131869 /var/log/juju/machine-0.log
[07:18] jujud 8613 root 3r CHR 1,9 0t0 5786 /dev/urandom
[07:18] jujud 8613 root 4w REG 253,1 71378 131869 /var/log/juju/machine-0.log
[07:18] do I even want to ask why we have 3 fd's pointing to the same log file ...
[07:19] rogpeppe1: sweet
[07:19] davecheney: stdout and stderr are expected
[07:19] davecheney: not sure about 4
[07:19] that isn't as important as # lsof -p $(pgrep jujud) | grep -c ESTABLISHED
[07:19] 129
[07:19] https://bugs.launchpad.net/juju-core/+bug/1170595
[07:20] that is why we can't provision more than about 200 machines in a run
[07:20] oops
[07:20] have you found the source of the leak?
[07:20] looking now
[07:20] shouldn't take long
[07:20] given the number of times this problem turns up
[07:20] i'm smacking myself it wasn't the first thing I looked for
[07:22] davecheney: this was the status from one of yesterday's environments http://paste.ubuntu.com/5719234/
[07:23] davecheney: note the interesting relationship between mongo and logging there
[07:24] * davecheney isn't quite sure what is wrong there
[07:24] are they circular ?
[07:24] davecheney: nope
[07:24] davecheney: there's nothing wrong
[07:24] davecheney: it's just quite cool that you can do it
[07:25] davecheney: basically, logging requires mongo to store its logs. but we also want to store the log files produced by mongo itself, so the logger is subordinate to mongo as well as being related to it.
[07:26] ok
[07:26] davecheney: i set it up deliberately like that, thinking it might not work
[07:26] davecheney: but it seems to work fine (at least on the surface! i haven't *actually* looked at the logs in mongo)
[08:28] anyone know if there's an easy way for a charm to find out its service name?
[08:29] you'd think that would be straightforward
[08:29] currently the only thing i can think of is `pwd | sed blahblah`
[08:29] which is a hack
[08:29] it isn't a config property ?
[08:30] davecheney: no
[08:30] * davecheney gives up
[08:30] davecheney: i'm not sure it should be a config property
[08:30] maybe i used the wrong word
[08:30] setting might be appropriate
[08:30] davecheney: it could easily be an env var though
[08:31] davecheney: settings can change
[08:31] davecheney: this is immutable
[08:31] again i'm using the wrong word
[08:31] surely we have a class of settings which are immutable
[08:31] davecheney: ah, ok
[08:31] davecheney: i don't *think* so
[08:32] davecheney: there might be a special case for public-address i suppose
[08:32] davecheney: ah, but that's relation setting anyway
[08:32] davecheney: currently service settings map exactly to the config defined in the charm
[08:32] davecheney: which seems good to me
[08:33] davecheney: i think just an env var JUJU_SERVICE to go along with JUJU_ENV_UUID would be good
[08:33] i agree
[08:33] sounds like something very useful
[08:33] davecheney: and i'd add JUJU_SERVERS too
[08:34] davecheney: yeah, it's very useful because it's an easy and predictable disambiguation mechanism
[08:35] davecheney: so i can create a directory that has a predictable name but is guaranteed not to clash with similar names chosen by other colocated charms
[08:36] davecheney: JUJU_SERVICE is a one-line change
[08:37] davecheney, fwereade, dimitern: do you know if trunk is still frozen?
[08:38] rogpeppe1, davecheney, dimitern: I have heard nothing from mramm re the deadline ambiguity alluded to by mgz
=== rogpeppe1 is now known as rogpeppe
[08:39] rogpeppe1, davecheney, dimitern: whoops, I did actually, had missed that mail
[08:39] fwereade: to you only? i don't think i saw anything
[08:39] rogpeppe, davecheney, dimitern: I think we should revert the 1.10 version for now
[08:39] fwereade: ok - what's the situation?
[08:39] rogpeppe, apparently the *real* deadline is EOD monday
[08:40] fwereade: oh, that's great! i'll propose a couple of bug fixes then, if that's ok.
[08:40] rogpeppe, so I think we should revert the version and keep going on low-risk/high-impact bugfixes for today at least
[08:41] fwereade: 1130149 and 1170425 are both easy and worth doing
[08:41] #1130149
[08:41] rogpeppe, although, tbh, today at *most* also applies ;p
[08:41] lp#1130149
[08:41] fwereade: agreed entirely
[08:42] fwereade: BTW what do you think about a $JUJU_SERVICE env var?
[08:42] fwereade: so a charm can know what service it's running as
[08:42] rogpeppe, use case?
[08:42] fwereade: to go along with JUJU_ENV_UUID
[08:43] fwereade: it gives an easy way for a charm to create a predictable directory that won't clash
[08:43] fwereade: also it provides a reliable way for a knowledgeable charm to find the unit config (although tbh i think we should provide JUJU_SERVERS or something like that instead of needing to do that)
[08:44] rogpeppe, I think service name is too coarse, and you really want unit name
[08:44] rogpeppe, sorry, what's the unit config?
[08:44] fwereade: the uniter agent config
[08:44] fwereade: yeah, unit name would be good
[08:45] fwereade: currently you *can* find it out, but only by mangling pwd, which is dreary and nasty.
[08:46] fwereade: mind you i'm not sure it's currently possible to have two units of the same service in the same container, is it?
[08:46] rogpeppe, yeah, but hitting the agent conf at all is dreary and nasty -- we should be explicitly making the API server addresses available if hooks need them
[08:47] rogpeppe, nothing stopping you doing that
[08:47] fwereade: yeah, i think we should; but i think the unit name is useful info too.
[08:47] rogpeppe, JUJU_UNIT_NAME is already there, isn't it?
[08:48] fwereade: for my particular use case, i'm wanted to write a charm that made it easy to test pwd
[08:48] fwereade: ah, i missed that
[08:48] rogpeppe, sorry, I'm being slow, test what about pwd?
[08:48] mistype!
[08:48] haha
[08:49] fwereade: for my particular use case, i'm wanting to write a charm that made it easy to test aspects of charm behaviour
[08:49] fwereade: $JUJU_UNIT_NAME is great
[08:49] rogpeppe, sweet
[08:49] fwereade: although perhaps $JUJU_SERVICE might be useful too, i dunno
[08:50] rogpeppe, I *am* wondering about the juju gui charm though
[08:50] fwereade: yeah
[08:50] fwereade: i really think we should provide server address info
[08:50] rogpeppe, I'm not sure the juju gui should be bound to the juju that deployed it
[08:50] fwereade: ah, that's an interesting point
[08:50] rogpeppe, I suspect that API information should just be service config
[08:51] rogpeppe, even if it's a little less convenient to set it up
[08:51] fwereade: the problem with that is that in a HA world that info changes
[08:52] fwereade: i could see that it might be good to allow both ways actually
[08:52] fwereade: use the local server unless a config option is set
[08:53] rogpeppe, maybe, I need to think about this for a bit
[08:53] fwereade: then we can potentially have something that watches some environment and makes config changes when the set of server addresses changes
[08:53] rogpeppe, it kinda feels like the same old service-output problem
[08:54] fwereade: anyway, i don't think there's a good reason to make it hard for a charm to access its own API server
[08:54] fwereade: ?
[08:54] fwereade: which problem was that?
[08:55] rogpeppe, that we'd kinda like to be able to get information back out of services
[08:55] fwereade: ah yes. we really really do
[08:55] rogpeppe, it should ideally always be possible to deploy a service with default configuration and have it work nicely
[08:56] fwereade: i think that's one of the most crucial missing juju features. that and allowing a charm to change things asynchronously.
[08:56] fwereade: i agree.
[08:56] rogpeppe, in the case of a password a default password is painfully insecure, and generating one on the fly should be perfectly possible, but there's no way to get it out
[08:56] rogpeppe, for the async stuff you mean juju-run basically?
[08:56] fwereade: yeah
[08:57] rogpeppe, agreed on both points
[08:57] rogpeppe, anyway, those bugs
[08:57] rogpeppe, 1130149, +100
[08:57] fwereade: because currently there's no way for a unit to *say* anything other than in response to something else
[08:57] rogpeppe, 1170425, I'll take quite a lot of convincing
[08:58] rogpeppe, yep, definitely
[08:58] fwereade: are you suggesting that juju get shouldn't work on a subordinate service?
[08:58] rogpeppe, I'm suggesting that calling Constraints on a subordinate service is DIW
[08:58] fwereade: did i suggest otherwise?
[08:59] rogpeppe, last night, I think you did ;p
[08:59] fwereade: ah, i didn't know you'd seen that :-)
[08:59] fwereade: i knew you'd be -1 on that suggestion
[09:00] rogpeppe, so long as it's done by skipping the Constraints call I'm fine, I guess, but I'm a bit surprised that the gui always wants to get constraints alongside config
[09:00] fwereade: i can't really see a down side, but there y'go.
[09:00] fwereade: it gets all the service info in one call
[09:01] fwereade: the fix i made just tested IsPrincipal
[09:01] rogpeppe, ok, that's fair enough in the current context
[09:02] wow, hp cloud is so much faster than ec2
[09:03] davecheney: it wouldn't take much :-)
[09:03] bootstraps take < 2 mins on hp cloud
[09:03] fwereade: your password use-case is an interesting one.
[09:05] fwereade: and highlights one particular issue with getting stuff out of a service - can the service somehow choose a "shared" value that all units agree on, or can you just see a set of values for each unit? i think probably just the latter actually.
[09:05] rogpeppe, that's just a matter of exposing stuff we already have, so it would certainly be simpler
[09:06] fwereade: in a way you could think of the service config the relation settings of the juju client
[09:06] s/the relation/as the relation/
[09:07] fwereade: so a similar model could apply - a charm could run config-set to set its own config settings that could be seen by the client.
[09:07] rogpeppe, that is a *very* nice way of looking at it
[09:08] rogpeppe, but it does bring up interesting race possibilities, I think
[09:08] fwereade: really? each unit would have its own set of config settings
[09:08] rogpeppe, if I deploy 3 units of something, which one gets to pick the output password for the service administrator?
[09:08] fwereade: they all pick their own passwords
[09:09] rogpeppe, I don't think it necessarily makes sense at a unit level but go on
[09:09] fwereade: as a client i have to choose which unit to get the password from
[09:10] fwereade: that's why the relation analogy is nice - with relations, there's one group of settings for each unit, and each unit can set its *own* settings, but can only read the remote settings.
[09:10] fwereade: if you have a service where all units must agree on a password, they can work it out together and present a unified front
[09:11] fwereade: doing leader election perhaps through a peer relation
[09:11] fwereade: shared read-write settings are a no-no i think
[09:15] https://codereview.appspot.com/8668048
[09:15] ^ fixes openstack connection leak
[09:15] davecheney: yay!
[09:16] doing a 300 node test now
[09:16] it's not leaking
[09:16] so sayeth lsof
[09:17] but i'll leave it running and get some dinner
[09:17] davecheney: i'm not sure the fix is quite right
[09:17] davecheney: it can still potentially leak, i think
[09:17] rogpeppe: oh really ?
[09:17] davecheney: if retryAfter == 0 we leak
[09:17] oh fuck, i didn't see all those stupid returns
[09:17] right, will fix some more
[09:18] davecheney: i'd be tempted to put it into its own function
[09:18] rogpeppe: ohhh
[09:18] i have many many refactors to this package
[09:18] davecheney: with a deferred "if err != nil {resp.Close()}"
[09:19] rogpeppe: the body closing is all over the shop in that package
[09:19] i have a branch for fixing that as well
[09:19] * rogpeppe is not greatly surprised
[09:20] PTAL https://codereview.appspot.com/8668048
[09:21] davecheney: i think that's wrong too, probably
[09:21] well fuck
[09:21] where do you think it goes ?
[09:22] davecheney: does nothing read the resp body returned from sendRateLimitedRequest ?
[09:22] oops sorry!
[09:22] you're good, i think
[09:22] cool
[09:23] it is hard to understand when it is read and not read
[09:23] and there are other potential places where the connection can leak
[09:23] check out client.BinaryRequest
[09:23] i've patched all those in my other branch, but they didn't appear to be the problem
[09:23] davecheney: LGTM
[09:25] no rush on the review, I have something similar bodged into the load testing machine
[09:25] and it's doing the job
[09:33] davecheney, LGTM also
[09:34] fwereade: what is the story with patches to trunk ?
[09:34] yes ? no ? please ? maybe ?
[09:34] davecheney, I'm going to revert the version right now
[09:34] ok
[09:34] davecheney, low-risk/high-value changes to trunk are fine for today I think
[09:35] AHHH SHIT
[09:35] this is too goose
[09:35] and jon's bot is fucked
[09:38] davecheney, hell-damn -- I think the juju-core revert still stands
[09:38] davecheney, rogpeppe: https://codereview.appspot.com/8855044
[09:38] davecheney, but jam's not around today is he?
[09:38] dimitern, do you know how we can land goose fixes ATM?
[09:38] * davecheney grumbles about things
[09:38] fwereade: LGTM trivial
[09:39] fwereade: I'm definitely not here and responding to davecheney's request
[09:39] definitely not right now.
[09:39] jam: good to know
[09:39] or not
[09:39] i think
[09:39] jam, well, that is very lazy and irresponsible of you ;P
[09:39] jam tyvm
[09:39] fwereade: LGTM, just commit it
[09:39] jam, sorry, I thought this was your day off
[09:39] davecheney, don't worry already happening ;
[09:40] :)
[09:40] fwereade: it is
[09:40] which is why I'm definitely *not* doing it exactly right now.
[09:41] and it should be done in as long as it takes to confirm it doesn't break juju-core's test suite
[09:41] jam: thanks for fixing gz's one as well
[09:41] to opine for a second
[09:41] the http package is trial by fire for everyone
[09:41] surely there must be a better way to write a http client that doesn't maim anyone who touches it
[09:42] davecheney: so why doesn't gc close the resp.Body stuff? Or it does, but may take a while. Or it doesn't because underlying it all is a shared http connection that keeps a reference?
[09:42] jam: there is no finaliser on the response body
[09:42] this is part of the connection reuse logic
[09:42] a very questionable decision
[09:43] eventually if every reference to the response, and hence the net.Conn was freed
[09:43] the finaliser on the fd would close it
[09:43] but because of the way the connection reuse logic works, a response (and hence the body) is 'checked out' until you close it
[09:43] fwereade: can I land the maas provider constraints stuff today?
[09:44] rvba, ...honestly I can't think why not, if it works, let me go review that right away
[09:44] rvba, I don't think it's likely to be destabilizing
[09:45] fwereade: it should be pretty safe
[09:45] rvba, but I seem to be being dense, because I don't see a review
[09:46] rvba, MP
[09:47] fwereade: https://codereview.appspot.com/8842045/
[09:47] rogpeppe, btw, are you planning to look at both those bugs you linked before?
[09:48] fwereade: it has been reviewed by dimitern already.
[09:48] fwereade: yeah, i'm doing them
[09:48] rogpeppe, <3
[09:49] rvba, we try to have 2 reviews (except for the truly trivial), may I take a quick look before I approve?
[09:49] fwereade: sure, please do.
[09:52] rvba, that's approved
[09:52] rvba, tyvm
[09:52] fwereade: ta
[09:53] rvba, I am dense, I found it in LP and reviewed there
[09:53] rvba, close enough
[10:10] fwereade: BTW the old "// Breaks compatibility with py/juju" comment in statecmd/get.go - do you know anything more about that? i'm presuming that py juju printed the actual value and the compatibility breakage is just because we're returning null
[10:13] rogpeppe: i think I wrote that
[10:14] davecheney: do you remember what the issue was?
[10:16] rogpeppe, I'm afraid I'm almost 100% ignorant of get, but, well, we should avoid compatibility breaks where possible
[10:16] rogpeppe: this was probably lisbon II
[10:16] and gustavo said do it this way
[10:16] fwereade: agreed totally.
[10:16] i think it was something he felt was an improvement over python
[10:16] davecheney: hmm, interestin
[10:16] g
[10:17] davecheney: surely the fact that it never prints default values isn't right though...
[10:20] rogpeppe: it was certainly this issue surrounding the difficulty in differentiating between the default value
[10:20] and a value which was set, but set to the default
[10:20] http://paste.ubuntu.com/5721158/
[10:20] ^ i've broken HP Cloud, where is my medal
[10:20] davecheney: yeah. maybe py juju didn't have the "default" bool
[10:21] from memory
[10:21] fwereade: the change to the MAAS provider is merged now. Make sure you have the last version of the gomaasapi lib otherwise some tests in environs/maas will fail.
[10:21] it was the issue of telling the default value, ie, nothing set, from the value which was set, but was set to the default
[10:21] rvba, cool, thanks
[10:22] davecheney, rogpeppe: isn't it important that we differentiate between those cases?
[10:22]
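
The distinction raised above, between an option that was never set and one that was explicitly set to the same value as its default, can be sketched as follows. The `option` type and its field names are hypothetical and are not taken from juju-core's statecmd package; the sketch only shows why a separate flag is needed, since the value alone cannot tell the two cases apart.

```go
package main

import "fmt"

// option is a hypothetical view of a single charm config option.
type option struct {
	Default interface{} // default declared by the charm, if any
	Value   interface{} // value explicitly set by the user
	IsSet   bool        // true when the user set Value, even to the default
}

// describe returns the effective value and whether it came from the default.
func describe(o option) (value interface{}, isDefault bool) {
	if o.IsSet {
		return o.Value, false
	}
	return o.Default, true
}

func main() {
	unset := option{Default: "info"}
	explicit := option{Default: "info", Value: "info", IsSet: true}
	for _, o := range []option{unset, explicit} {
		v, d := describe(o)
		fmt.Printf("value=%v default=%v\n", v, d)
	}
}
```

Both options report the same effective value; only the flag differs, which is the information a bare null or a bare value would lose.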