[00:26] perhaps I should stop sparring the week before a gathering
[00:26] I wonder if this facial bruising will be gone by Sunday
[00:27] I should learn not to block punches with my face
[00:50] bigjools: is it ok for me to use your maas? are you still getting that eof issue?
[00:55] wallyworld_: I am, but hang on I am just completing a test
[00:55] 10 mins
[00:55] ok
[00:56] rightio
[01:12] wallyworld_: have you used the garage maas before?
[01:12] nope
[01:12] i've not used *ant* maas
[01:12] any
[01:12] how's the face?
[01:15] wallyworld_: server is all yours
[01:16] bigjools: ta. how do i get an oauth key for the env.yaml?
[01:16] wallyworld_: it's all set up
[01:16] just log in and bootstrap
[01:17] ok
[01:18] bigjools: i'm scared to look inside the folder called "backdoor-image". shudder
[01:19] wallyworld_: left just for you
[01:19] \o/
[01:46] thumper: do you have/use any bzr plugins to get diff summaries? (file names with +/- lines)
[01:46] axw: bzr diff | diffstat
[01:47] that's all I use
[01:47] ah, didn't know diffstat
[01:47] thanks
[01:47] doesn't give full summary per file
[01:47] although it may have options
[01:47] thumper: that'll do nicely for me, thanks
[01:50] wallyworld_: face is a little bruised, that's all
[01:50] walked into a door
[01:51] thumper: nothing to do with the missus?
[01:51] boxing?
[01:52] sparring at boxing, yes
[01:52] the guy today is very fast
[01:52] top 5 in the country and under 20
[01:52] so much younger and faster
[01:52] but you don't learn unless you fight those better than you
[01:54] your nick gets better and better
[01:54] or possibly more ironic
[01:56] heh
[01:57] I have an interview with Rachel, Caitlin and her new principal later this afternoon
[01:57] slight shiner to go in with
[01:58] I should just ack all meek
[01:58] and flinch when Rachel looks at me :)
[01:58] school pickup time
[01:58] * thumper walks a block
[02:09] I want to test out my maas change to make sure that the allocated change is right
[02:09] hence the email about the garage maas
[02:57] bigjools: this maas eof thing is giving me the shits. it's failing doing a bog standard httpClient.Do(request) inside the gomaasapi client.go, complaining that "can't write HTTP request on broken connection"
[02:57] so something is closing the maas client's connection
[02:57] I know what it is
[02:57] but i don't know what
[02:57] pick me
[02:58] pick me
[02:58] pick me
[02:58] pick me
[02:58] ok!
[02:58] wallyworld_: quick hangout?
[02:58] sure
[02:58] https://plus.google.com/hangouts/_/ca28925e2921123581a410b465ff00dda5e3c11c?hl=en
[03:07] wallyworld_: I blame Go
[03:07] bigjools: it's all fooked
[03:07] \o/
[03:07] i just can't see what's wrong
[03:08] wallyworld_: try emulating with a simple curl request
[03:08] * bigjools looks in maas log as well
[03:08] bigjools: i think i did that and still got an error. oh, i did it using wget
[03:08] wallyworld_: what time did you try last?
[03:08] you need auth header remember
[03:09] bigjools: looks like remote end is closing the connection
[03:09] and as it happens, I don't know
[03:09] different problem
[03:09] bigjools: ah yes. can you remind me of the syntax
[03:09] wallyworld_: I can't :)
[03:10] fat lot of good you are then
[03:13] wallyworld_: what request is it making when it gets the EOF?
[03:13] FWIW I get the EOF sometimes when not using --upload-tools
[03:14] bigjools: http://10.0.0.9:80/MAAS/api/1.0/files/?op=list&prefix=tools
[03:14] or there abouts
[03:14] i had commented out the prefix to try something
[03:14] bigjools: we blame maas
[03:15] 10.0.0.9 - - [15/Oct/2013:11:15:17 +1000] "GET /MAAS/api/1.0/files/?op=list&prefix= HTTP/1.1" 200 807 "-" "Go 1.1 package http"
[03:15] maas is doing just fine it seems --^
[03:15] bigjools: why does this command return an error then: response, err := httpClient.Do(request)
[03:15] it's effectively a straight http get call
[03:15] which fails
[03:16] it did not do this until recently and maas has not changed around this call
[03:16] the code has not changed
[03:16] can you dump the request and response? I'll add the same to the maas log
[03:16] it's a simple list request
[03:17] bigjools: there is no response cause the request call errors
[03:17] &http.Request{Method:"GET", URL:(*url.URL)(0xc200230850), Proto:"HTTP/1.1", ProtoMajor:1, ProtoMinor:1, Header:http.Header{"Authorization":[]string{"OAuth oauth_signature_method=\"PLAINTEXT\", oauth_version=\"1.0\", realm=\"MAAS+API\", oauth_consumer_key=\"M2X2ZeCSNVWer6AEHc\", oauth_token=\"v3jKFjma2gZkhasdQR\", oauth_signature=\"%26yVgdrUAVWKCRsxNXGUEzyTrTaHYebAmH\", oauth_timestamp=\"1381806551\", oauth_nonce=\"38010820\""}},
[03:17] Body:io.ReadCloser(nil), ContentLength:0, TransferEncoding:[]string(nil), Close:false, Host:"10.0.0.9:80", Form:url.Values(nil), PostForm:url.Values(nil), MultipartForm:(*multipart.Form)(nil), Trailer:http.Header(nil), RemoteAddr:"", RequestURI:"", TLS:(*tls.ConnectionState)(nil)}
[03:17] is the request
[03:18] I'll try again and dump the maas log, hang on
[03:19] wallyworld_: ok that request isn't even hitting maas
[03:19] \o/
[03:19] Go is getting the EOF alllll on its own
[03:19] huzar!
[03:20] thumper: http://10.0.0.9:80/MAAS/api/1.0/files/?op=list&prefix=
[03:21] wallyworld_: can you paste a curl -v of that url ?>
[03:21] wallyworld_: the last request that maas sees is /MAAS/api/1.0/files/provider-state/
[03:21] davecheney: i'll have to look up the syntax - i can't recall how to construct a curl request with the headers etc
[03:23] curl -H "Authorization: OAuth " http://www.example.com
[03:25] ta
[03:28] bigjools: so i pasted in the entire oauth access token from the env.yaml file as the ACCESS_TOKEN, right?
[03:29] no
[03:29] something like:
[03:29] Authorization: OAuth oauth_signature_method="PLAINTEXT", oauth_version="1.0", realm="MAAS+API", oauth_consumer_key="M2X2ZeCSNVWer6AEHc", oauth_token="v3jKFjma2gZkhasdQR", oauth_signature="%26yVgdrUAVWKCRsxNXGUEzyTrTaHYebAmH", oauth_timestamp="1381807262", oauth_nonce="2814807
[03:30] ok
[03:30] it mght get a bit tricky actually
[03:31] you can always re-create the request with maas-cli
[03:31] which works fine, BTW
[03:31] does the maas-cli log the request?
[03:31] "maas-cli maas files list prefix=tools-"
[03:32] yeah I can see it in the mas log
[03:32] maas-cli maas files list prefix=releases/tools/juju- also works
[03:32] right
[03:33] so why is Go's http call failing
[03:33] there are no polite answers to that
[03:33] as a data point I can re-create this another way
[03:34] bigjools: do you have the output from mass-cli that you can give to davecheney who wanted to see a curl -v output?
[03:34] if I bootstrap without --upload-tools, destroy and then bootstrap without --upload-tools again, you get the EOF
[03:34] yes
[03:35] http://paste.ubuntu.com/6238905/
[03:35] that's the host side
[03:35] but amounts to the same thing
[03:36] davecheney: is that ^^^^ sufficient?
it sure seems like a http get request via other means works, but the same request sent from Go's http.Client fails
[03:37] wallyworld_: nah, i was hoping to see the curl output
[03:37] i know how to interpret that
[03:38] ok, i'll see if i can get it
[03:38] given that the request from Go isn't hitting maas, I suspect you need to look at why the Go internals are misbehaving
[03:38] no amount of curl log is going to help with that
[03:39] yeah
[03:44] davecheney: that's the pastebin https://pastebin.canonical.com/99027/
[03:44] but
[03:44] that worked
[03:44] the Go code doesn't
[03:44] the values in the pastebin were harvested by printing the http.Request struct from Go
[03:45] so the request is formed with the right header values etc
[03:45] but it doesn't reach the server side, and instead the Do() method returns an EOF
[03:45] or a "can't write HTTP request on broken connection" error
[03:46] wtf
[03:49] bigjools: how recently did you upgrade to Go 1.1.2? could that be an issue?
[03:50] wallyworld_: how can that affect anything? I am using a packaged juju binary on this box
[03:51] i only run Go 1.1.1 on mine. if it's a Go internal issue, the version may be relevant
[03:51] bigjools: you could have 1.0.x
[03:51] that is known to be broke
[03:51] that was all the difficulty we had bad in July
[03:51] I still don't know how that can affect a binary built by someone else
[03:51] davecheney: go version says 1.1.2 on his maas server
[03:52] also, does maas use your forked http code ?
[03:52] my maas server Go version is irrelevant
[03:53] bigjools: i compiled juju from source on your box
[03:53] to add in debugging
[03:53] sure, but given this is also a problem with the packaged juju ...
[03:53] bigjools: ooooooooooh dear
[03:53] i wonder if the host that build juju was using 1.1.2
[03:54] if it's on saucy, quite likely
[03:54] ok
[03:54] that is known to work
[03:54] except for this http request :-(
[03:55] is it re-using objects or otherwise similarly being stupid?
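A minimal Go sketch of how an Authorization value shaped like the one pasted at 03:29 can be assembled for a manual curl test. The parameter names and their order are copied from that paste; the function name and the sample keys are hypothetical. It assumes an empty consumer secret, so the PLAINTEXT signature is "&" followed by the token secret (percent-encoded, which is where the leading %26 comes from), and it assumes the keys are plain alphanumerics that need no further escaping.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// kv renders one header parameter as name="value".
func kv(name, value string) string {
	return name + `="` + value + `"`
}

// plaintextAuthHeader builds an OAuth 1.0 PLAINTEXT Authorization value.
// Hypothetical helper: it is not part of gomaasapi or maas-cli.
func plaintextAuthHeader(consumerKey, tokenKey, tokenSecret, timestamp, nonce string) string {
	// With PLAINTEXT the signature is "<consumer secret>&<token secret>".
	// Assumption: the consumer secret is empty, leaving "&<token secret>".
	signature := url.QueryEscape("&" + tokenSecret)
	params := []string{
		kv("oauth_signature_method", "PLAINTEXT"),
		kv("oauth_version", "1.0"),
		kv("realm", "MAAS+API"),
		kv("oauth_consumer_key", consumerKey),
		kv("oauth_token", tokenKey),
		kv("oauth_signature", signature),
		kv("oauth_timestamp", timestamp),
		kv("oauth_nonce", nonce),
	}
	return "OAuth " + strings.Join(params, ", ")
}

func main() {
	// Placeholder credentials, not real keys.
	header := plaintextAuthHeader("consumerkey", "tokenkey", "tokensecret", "1381806551", "38010820")
	fmt.Println("Authorization: " + header)
}
```

The printed line can be passed to curl with -H, in the same way as the 03:23 example.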
[03:55] a new http request is created each time, as is a new http.Client object
[03:55] oh google you chunk of crap
[03:56] silently rendering the word EOF as just "of" in my seach is stupid
[03:56] http://stackoverflow.com/questions/17714494/golang-http-request-results-in-eof-errors-when-making-multiple-requests-successi
[03:57] "You need to set Req.Close to true"
[03:57] just saw that
[03:57] is it doing that?
[03:57] nope, this is maas code
[03:57] is gomaasapi doing that?
[03:58] i don't think so, but i wonder if we do it *anywhere*
[03:58] * bigjools knows not of golang esoterics
[03:58] a quick code search seems to indicate we don't
[03:59] only in the forked gwacl code
[03:59] so why the fuck is this not failing elsewhere if it is a problem
[04:00] maas is the only provider that uses OAUTH ?
[04:00] ^ guess
[04:01] I think so
[04:01] but why is it failing like this now as well?
[04:01] 1.1.2?
[04:02] wallyworld_: this is easy to test out
[04:02] yeah, doing it now
[04:03] ok - I was also but noticed the file changed under my feet :)
[04:08] bigjools: davecheney: adding in req.Close = true after creating the req seems to have worked. but wtf. we don't do that anywhere else in juju-core
[04:08] \o/
[04:08] and this is in the gomaasapi library
[04:08] so why just maas and not goose or goaws wtc
[04:09] mystery
[04:09] * bigjools 's work here is done
[04:09] thanks for helping out wallyworld_
[04:09] np. but i'm pissed at it all. not sure where to direct my rage
[04:09] Rob Pike? :)
[04:09] i blame Go
[04:10] i mean if req.Close is required, why not just default to that?
[04:10] indeed
[04:10] and why is it not consistent?
[04:10] do we just change maasapi?
[04:10] or do we try and change *all* the other places we create requests
[04:11] something to do with HTTP1.1 I guess
[04:11] the document is singularly unhelpful
[04:11] // Close indicates whether to close the connection after
[04:11] // replying to this request.
[04:12] no hint of why you'd want to do that
[04:12] yeah, and the error seems like the opposite of that
[04:12] bigjools: how long should juju status take to come back on your maas setup?
[04:12] about 10 minutes
[04:13] me taps fingers on desk waiting, waiting
[04:13] my theory is that maas already closed the connection and it didn't detect that until trying to send on the same one again
[04:13] think yourself lucky, it used to be 20 minutes, the fast installer cut the time in half
[04:14] wonder why it didn't fail before now though
[04:14] and 8 of the 10 minutes is cloud-init and a reboot
[04:14] did maas change?
[04:14] I reckon davecheney may be right when he said it could be go 1.1.2
[04:15] maas hasn't changed here - it's using the exact same stuff
[04:15] Apache frontend
[04:15] and 1.1.4 shipped on Go 1.1.1?
[04:15] 1.14 i mean
[04:15] no idea
[04:15] when did 1.1.2 hit the archive?
[04:15] no idea here either
[04:16] could build 1.1.6 on go 1.1.1 and see
[04:16] 1.16 even
[04:16] go on then, i dare you
[04:17] and juju stat came back ok
[04:17] i'll revert my debugging
[04:18] so stackoverflow, I can't upvote an solution because I don't have enough reputation, but I *can* edit the answer to oblivion. *HEADDESK*
[04:18] funny, i searched for and found that page also, but my google didn;t "fix" the EOF spelling
[04:19] I used +"EOF"
[04:19] i didn't
[04:19] sigh
[04:20] Go seems to be a minefield where there's an actual mine wherever you step
[04:20] yep :-(
[04:20] AND no version control :0(
[04:21] what could possibly go wrong
[04:21] bigjools: you should have a working binary in your ~ubuntu/go/bin directory
[04:21] hurrah!
[04:22] update PATH quick!
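A minimal Go sketch of the workaround described at 04:08, setting req.Close = true right after creating the request. This is not the gomaasapi code; the helper names are hypothetical and the server is a throwaway local one from net/http/httptest. It only illustrates the visible effect of the flag: the client asks for the connection to be closed after each response, so every request goes out on a fresh socket instead of reusing an idle one that the far end may already have dropped (the theory floated at 04:13).

```go
package main

import (
	"fmt"
	"io"
	"net"
	"net/http"
	"net/http/httptest"
	"sync/atomic"
)

// newClosingGet builds a GET request that will not leave a reusable idle
// connection behind. Hypothetical helper, named for this sketch only.
func newClosingGet(url string) (*http.Request, error) {
	req, err := http.NewRequest("GET", url, nil)
	if err != nil {
		return nil, err
	}
	req.Close = true // the one-line change discussed in the log
	return req, nil
}

// countConnections sends n closing GETs to a local test server and reports
// how many TCP connections that server accepted.
func countConnections(n int) (int, error) {
	var accepted int32
	srv := httptest.NewUnstartedServer(http.HandlerFunc(
		func(w http.ResponseWriter, r *http.Request) {
			io.WriteString(w, "ok")
		}))
	srv.Config.ConnState = func(_ net.Conn, state http.ConnState) {
		if state == http.StateNew {
			atomic.AddInt32(&accepted, 1)
		}
	}
	srv.Start()
	defer srv.Close()

	client := &http.Client{}
	for i := 0; i < n; i++ {
		req, err := newClosingGet(srv.URL)
		if err != nil {
			return 0, err
		}
		resp, err := client.Do(req)
		if err != nil {
			return 0, err
		}
		// Drain and close the body so the transport can finish with the socket.
		io.Copy(io.Discard, resp.Body)
		resp.Body.Close()
	}
	return int(atomic.LoadInt32(&accepted)), nil
}

func main() {
	conns, err := countConnections(3)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("connections accepted for 3 requests:", conns)
}
```

The trade-off is the one raised later in the log: with Close set on every request there is no connection reuse, so each call pays for a new TCP handshake.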
[04:22] no need I can just run that binary
[04:22] there's also a bootstrapped env now
[04:22] huzzah
[04:22] not if you want to upload tools
[04:22] oh
[04:22] i think it needs to be in your path, or is that for local provider
[04:22] one last bitchslap
[04:23] default gopath is that env anyway
[04:23] i can't recall, it may only be for local provider
[04:23] ok, see how it goes
[05:17] hello jtv
[05:17] Hi bigjools
[05:17] you're not on the internal irc
[05:20] Reconnecting
[05:29] davecheney: you guys planning another release to go out with 13.10?
[05:30] bigjools: nope
[05:30] 1.1.2 has been in saucy since september
[05:31] davecheney: juju release I mean
[05:31] bigjools: i think so
[05:31] i don't know the detalis
[05:31] thumper and sinzui probably know
[05:31] thanks
[05:37] wallyworld_: if you have some time this week, would you mind having a look over https://codereview.appspot.com/14527043/
[05:37] there's some changes to simplestreams metadata merging in there
[06:30] axw: sure, looking now. was out buying dinner after school pickup
[06:32] wallyworld_: thanks
[06:33] axw: just had a quick read of some of the comments. i agree in theory the resolve can go away eventually. but my view is we need to now while we transition and have to cope with older metadata etc
[06:33] lenient on what's read, strict on what's written and all that
[06:34] wallyworld_: yep, sounds fair enough
[06:34] i reckon we can get rid of it for 1.18
[06:35] so maybe if this is going into trunk we don't need it
[06:35] or the 1.16 backport has it, land in trunk, and then follow up with a trunk branch to remove it
[06:36] davecheney: from your recollection, was juju 1.14 done with go 1.1.1?
[06:36] wallyworld_: yeah this isn't going into 1.16
[06:37] hmmm.
maybe we can/should get rid of it then
[06:37] reduce complexity
[06:37] all metadata should be good for 1.16
[06:37] and if it's not, we don't want to propagate the issue
[06:39] bigjools: i think there's going to be a 1.18 for saucy
[06:40] 1.18? not 1.16.1?
[06:40] do we not do stable updates like that?
[06:40] axw: i *thought* it was going to be 1.18, based off 1.17 trunk
[06:40] mmkay
[06:40] too many bugs etc we are fixing in trunk
[06:41] i could be wrong
[06:41] but that's what i heard
[06:41] okey dokey
[06:41] ot thought i heard
[06:41] wallyworld_: in that case, I'd rather hold off on changing the resolve logic for this CL
[06:42] ok
[06:42] oh but... we don't support going multiple versions do we
[06:42] upgrading
[06:42] axw: with WriteMetadata stuff, i have done something similar now for images metadata. but unlike for tools, i pulled the WriteXXX methods out into generate.go instead if stuffing into simplestreams.go
[06:42] ok
[06:43] i think this is the link, i have 3 of the fuckers https://codereview.appspot.com/14663043/
[06:43] generate.go is a new file
[06:44] axw: we support going from 1.14->1,16 and 1.16->1.18
[06:44] but not 1.14->1.18
[06:44] right, so doesn't matter either way then
[06:47] axw: it's extra work for you, but if the resolve bit were to be removed prior to committing to trunk, the code comes out cleaner
[06:47] i looked at the merge proposal, it look ok
[06:48] wallyworld_: yeah that's cool, it'll definitely clean it up - I'll do that
[06:48] thanks for checking over it
[06:48] I'm reviewing your CL now, looking good so far
[06:48] np, thanks for fixing it :-)
[06:48] axw: thanks :-) if you are a masocist, i have 3 related ones. the one you are looking at is the last of 3
[06:48] are they prereqs?
[06:49] yeah
[06:49] https://codereview.appspot.com/14502059/ is first, and https://codereview.appspot.com/14540055/ is second
[06:49] not a masochist, but I'll do what I can ;)
[06:49] the end result is that creating image metadata for private clouds is *much* easier
[06:49] thanks :-)
[06:49] cool
[06:50] with this work, the user can run the image metadata gtenerate plugin over and over to buiild up their metadata
[06:50] for different series, arches etc
[06:50] before, the tool was just a prototype and just did one image and overwrote each tome
[06:51] wallyworld_: ok ta
[06:52] rogpeppe: can you recall - did we release juju 1.14 using Go 1.1.1?
[06:55] wallyworld_: hmm, not sure
[06:56] rogpeppe: ok. there's a bug i'm fixing in gomaasapi which i want to blame on the go 1.1.2 upgrade
[06:56] cause nothing else makes sense
[06:56] wallyworld_: is that the EOF problem?
[06:56] yeah
[06:57] reg from http.NewRequest needs to have req.Close = true all of a sudden
[06:57] but only in gomaasapi
[06:57] nowhere else
[06:57] wallyworld_: FWIW if Close didn't default to false, you'd almost never get any connection reuse
[06:57] and no gomaasapi code has changed
[06:57] wallyworld_: i wonder how long connections are kept around before they're dumped
[06:58] not sure
[06:58] wallyworld_: it might be the maas server dumping old http connections
[06:58] but this happened during bootstrap, so < 1 second
[06:58] wallyworld_: ah, that seems wrong
[06:58] why did it on;y show up now?
[06:58] and not before
[06:58] wallyworld_: good question; does it happen reliably?
[06:59] yep
[06:59] only in 1.16 juju-core
[06:59] not 1.14 on same maas box
[06:59] wallyworld_: have you tried 1.14 compiled with go 1.1.2 ?
[07:00] rogpeppe: the front end is Apache, so whatever Apache is doing...
[07:00] bigjools: hmm, seems unlikely then
[07:00] rogpeppe: no
[07:00] wallyworld_: that would be a good way to confirm or deny your suspicions
[07:00] yeah
[07:01] rogpeppe: so will this have adverse performance impace for gomaasapi wrt connection resuse?
[07:01] if Close is always set to True
[07:01] which it needs to be to make it work
[07:01] wallyworld_: possibly
[07:02] wallyworld_: these are https connections, right?
[07:02] i think so
[07:02] ah no
[07:02] http
[07:02] eg http://10.0.0.9:80/MAAS/api/1.0/files/?op=list&prefix=
[07:03] bbiab
[07:07] I'd take bad performance over no performance :)
[07:08] bigjools: it would be nice to know what's going on though
[07:08] indeed
[07:08] could be a Go difference from 1.1.1 to 1.1.2
[07:09] bigjools: it is possible
[07:09] bigjools: that should be easy enough to confirm
[07:09] aye
[07:14] * rogpeppe is reading through the Go http.Transport logic
[07:30] have you guys done the environment uuid yet?
[07:31] rvba: I reckon that is still the best solution BTW, as you said the other one is seriouslu abusing tags
[07:43] If the environment uuid is not available yet, would it make sense to use a hash of the hash of the admin-secret as an identifier?
[07:47] rvba, bigjools: what's the issue here?
[07:48] rogpeppe: we want to fix https://bugs.launchpad.net/maas/+bug/1239488
[07:48] <_mup_> Bug #1239488: Juju api client cannot distinguish between environments
[07:48] rogpeppe: so we need a way to flag nodes to account for the fact that they belong to a juju environment.
[07:49] rogpeppe: we thought about two solutions… and we think solution 2 is best:
[07:49] = 3. Fix by storing the API key used to acquire nodes =
[07:49] ???
[07:50] Cons: harder to pull a key if compromised
[07:50] rogpeppe: arg, wrong paste, sorry.
[07:50] rogpeppe: http://pad.ubuntu.com/DnNONX6kFB
[07:50] rvba: i don't quite see how the UUID is an *alternative* to using tags
[07:50] rvba: won't you need to tag with the UUID?
[07:50] rogpeppe: no, we need the UUID in both cases.
[07:51] * rogpeppe goes to get his SSO key
[07:54] rvba: so are you thinking of solution 2 here?
[07:54] rogpeppe: yeah
[07:55] rvba: i've always wanted the env UUID to be generated when the environment is first created
[07:56] rvba: it should actually be quite a simple change now
[07:56] rvba: it can be done at Prepare time
[07:56] rvba: and would be passed through in the environ config
[07:57] rogpeppe: that would be great.
[07:58] rogpeppe: we need to fix this problem today/tomorrow morning… is that something (the UUID thingy) that could be done… like… now? ;)
[07:59] rvba: probably not that quickly - it involves changes in a few places. however...
[07:59] rvba: we could make a change in the maas provider only
[08:00] rvba: to make it generate its own uuid, to be replaced with the environment uuid at some point in the future
[08:01] rogpeppe: changing the uuid will imply dealing with already deployed environments.
[08:01] rvba: there's actually no particular need for it to be the *actual* environment UUID, is there?
[08:02] rogpeppe: no, we just need an identifier, specific to each environment.
[08:02] rvba: ok, so here's a possible way forward:
[08:02] bigjools: so we don't overlap, were you going to compile juju 1.14 with go 1.1.2 on your maas box?
[08:02] rvba: change maas's EnvironProvider.Prepare so that it generates its own UUID and stores it in the environ's configuration
[08:03] rvba: then change StartInstance and Bootstrap to tag the instance with that tag
[08:04] (plus change Instances() to pass that tag to MAAS when listing instances)
[08:04] rvba: yeah
[08:04] rvba: oh yeah, it's agent_name, not tag, also :-)
[08:04] Right, that's a detail :)
[08:05] rogpeppe: isn't EnvironProvider.Prepare called every time juju is run?
[08:05] rvba: nop
[08:05] e
[08:05] rvba: it's called just once for a given environment
[08:05] I mean, I'm not sure I see how the uuid would be created once and persisted.
[08:05] rvba: after Prepare is called, all the config attributes for that environment are stored in the .jenv file (as BootstrapAttrs)
[08:06] rogpeppe: I see… so if you manually get rid of the .jenv you'll be in trouble then.
[08:06] rogpeppe: what about using hash(admin-secret) ?
[08:07] rvba: i don't like that idea
[08:07] rvba: there's nothing guaranteeing that admin-secret is unique
[08:08] rvba: you'll be in trouble if you manually get rid of the .jenv file anyway
[08:08] rogpeppe: all right :). Then this seems like the best option.
[08:11] morning
[08:11] bigjools: I updated the plan with rogpeppe's idea.
[08:12] rvba: for an example of a Prepare method that adds an attribute, take a look at openstack's environProvider.Prepare method
[08:13] rogpeppe: okay, thanks.
[08:13] rvba: although that method allows the control-bucket to be overridden in environments.yaml, and i'm not sure you'd want to allow that for the uuid
[08:14] Probably not.
[08:14] rvba: you'd probably want to make the method return an error if the uuid is already specified
[08:18] wallyworld_: no, go for it
[08:18] afk
[08:20] rvba: ok. Existing deployments will be a problem IMO
[08:21] bigjools: true. That's the only remaining problem.
[08:21] rvba: and it's a hard one
[08:21] rvba: although we could get it to work if juju only generates its uuid on bootstrapping
[08:21] bigjools: one things we could do is detect that we don't have generated a UUID and that the env is already bootstraped, and in this case use '' as the agent name,
[08:21] and existing deployments can stay without the uuid
[08:22] thing*
[08:23] brb
[08:34] rvba: if you've upgraded a legacy environment, the uuid in the config will be unset
[08:35] rvba: so that case should be easy to cater for
[08:35] rvba: although it's perhaps a problem that upgrading to the new juju won't fix the current problem for existing envs
[08:36] rogpeppe: I don't really think we have a choice here.
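A minimal Go sketch of the plan outlined between 08:02 and 08:35: generate an identifier once at Prepare time, refuse a hand-specified one, and fall back to an empty agent name for legacy environments whose config has no identifier. This is not the juju-core API. The plain map standing in for the environ config, the attribute name "environment-uuid" and all function names are assumptions made for illustration.

```go
package main

import (
	"crypto/rand"
	"errors"
	"fmt"
)

// uuidKey is a hypothetical attribute name, not taken from juju-core.
const uuidKey = "environment-uuid"

// newUUID returns a random (version 4) UUID in its textual form.
func newUUID() (string, error) {
	var b [16]byte
	if _, err := rand.Read(b[:]); err != nil {
		return "", err
	}
	b[6] = (b[6] & 0x0f) | 0x40 // version 4
	b[8] = (b[8] & 0x3f) | 0x80 // RFC 4122 variant
	return fmt.Sprintf("%x-%x-%x-%x-%x", b[0:4], b[4:6], b[6:8], b[8:10], b[10:16]), nil
}

// prepare stands in for a provider's Prepare step: it returns a copy of the
// attributes with a freshly generated identifier added, and returns an error
// if the identifier was already specified.
func prepare(attrs map[string]interface{}) (map[string]interface{}, error) {
	if _, ok := attrs[uuidKey]; ok {
		return nil, errors.New(uuidKey + " must not be specified")
	}
	id, err := newUUID()
	if err != nil {
		return nil, err
	}
	out := make(map[string]interface{}, len(attrs)+1)
	for k, v := range attrs {
		out[k] = v
	}
	out[uuidKey] = id
	return out, nil
}

// agentName returns the identifier to send when acquiring or listing nodes.
// Environments prepared before the change have none, so they get "".
func agentName(attrs map[string]interface{}) string {
	if id, ok := attrs[uuidKey].(string); ok {
		return id
	}
	return ""
}

func main() {
	prepared, err := prepare(map[string]interface{}{"name": "my-maas"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("new environment agent name:", agentName(prepared))
	fmt.Printf("legacy environment agent name: %q\n", agentName(map[string]interface{}{}))
}
```

Because the prepared attributes are what gets persisted, the identifier is created once and then read back on later runs, which is the behaviour rogpeppe describes for the .jenv file.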
[08:36] rvba: is the plan to make the agent_name dynamically changeable, or will it only be specifiable when an instance is started?
[08:37] rogpeppe: I don't see why we should make it changeable.
[08:37] rvba: if it was, it might be possible to fix existing deployments
[08:54] axw_: you've got a review https://codereview.appspot.com/14430064/
[08:56] rogpeppe: sadly, juju-core does not appear to have been tagged for the 1.14 release. neither has goose. and other dependencies like gomaasapi have no tags at all :-( i can get the juju-core source for 1.14 because there's a series branch but for the dependencies i can't :-(
[08:56] so a bit ahrd to recompile 1.14 :-(
[09:00] wallyworld_: i think all the deps are in https://launchpad.net/juju-core/1.14/1.14.1/+download/juju-core_1.14.1.tar.gz
[09:01] rogpeppe: ah ok. we have a src tarball
[09:01] still not really best practice :-(
[09:04] Thanks a lot for your help rogpeppe, we are starting to work on a fix for our bug, we will probably come to you during the day for advice/reviews :).
[09:05] rvba: np
[09:18] rogpeppe: juju 1.14 and go 1.1.2 seems to work. i'm not sure why. juju 1.14 uses the old tools lookup code, but still does similar things eg storage.List() etc
[09:19] wallyworld_: ok, so i guess we can't blame the upgrade to go 1.1.2
[09:19] i am reticent just to set Close = true without knowing why
[09:19] guess not
[09:19] wallyworld_: i agree
[09:19] wallyworld_: what request is it getting the EOF on
[09:19] ?
[09:20] a stor List()
[09:20] http://10.0.0.9:80/MAAS/api/1.0/files/?op=list&prefix=tools/releases/juju-
[09:20] well, with the prefix quoted
[09:20] i unquoted it to paste
[09:21] the gomaasapi logic creates a new request, creates a new http client and then does a client.Do(req)
[09:21] not much to fail
[09:21] a curl done manually of the same thing worls
[09:21] wallyworld_: it might be worth changing the code to log all requests and responses, to see how they've changed between the releases
[09:22] yeah
[09:22] i just can't see why a simple http req fails
[09:23] rogpeppe: here's a pastebin of a manual curl to invoke the req that fails using httpClient.Do(req)
[09:23] https://pastebin.canonical.com/99027/
[09:23] wallyworld_: well, it'll be reusing a socket from a previous request
[09:23] sure
[09:23] wallyworld_: it is possible that the maas server doesn't like that
[09:24] but why now?
[09:24] wallyworld_: and i don't know if you can make curl do that
[09:24] wouldn't sockets always be reused
[09:24] wallyworld_: yeah, but perhaps we're making more requests now
[09:24] we would be
[09:24] since tools look up check for metadata etc
[09:24] and before it didn't
[09:25] wallyworld_: exactly - so our access pattern has changed, and perhaps that's triggering some bug/feature on the server
[09:25] hard to believe the server could be that fragile
[09:25] it's just an apache http server
[09:26] * rogpeppe doesn't find it that hard to believe...
[09:26] rogpeppe: so, worst case, we may have to set Close = true for mass
[09:26] just to get something working for release
[09:26] juju doesn't really hammer the connection anyway, right?
[09:27] wallyworld_: i'm not sure we know the worst case now - it may be that setting Close is just papering over a bug which will re-emerge later in some form
[09:27] sure, depends if we can find the root cause
[09:28] i really don't know where to start.
the maas logs show the req isn't even getting through
[09:29] wallyworld_: that's interesting in itself
[09:29] so if it never arrives, it mught be getting lost inside the Go http client lib, or maybe apache is discarding it
[09:29] i can ask bigjools to check the apache logs
[09:30] wallyworld_: that would be good, to start with
[09:30] or i can check
[09:30] ok, so i can see the legacy tools request
[09:30] i'll fire up 1.16
[09:31] wallyworld_: how long into the bootstrap do we see the EOF response?
[09:31] rogpeppe: very near the start - when it is syncing tools
[09:31] wallyworld_: i'm still wondering if it might be a stale-connection timeout issue
[09:31] wallyworld_: how near? (in seconds)
[09:32] um. 2?
[09:32] 5?
[09:32] not sure
[09:32] i'll fire up 1.16 and see
[09:34] rogpeppe: ffs . it worked that time
[09:34] wallyworld_: ok, so that's interesting too
[09:34] let me check something
[09:35] wallyworld_: i suppose that means that it's possible that it wasn't the Close=true addition that caused it to succeed last time
[09:36] rogpeppe: oh wait, i'm an idiot
[09:36] i compiled a juju version with close=true for jools to use
[09:36] and it's still in the path
[09:36] i'll revert and try again
[09:36] wallyworld_: ha
[09:41] wallyworld_: assuming you manage to reproduce the problem, i'd like to check one thing - in gomaasapi/client.go, i'd like to change the "return nil, err" after httpClient.Do to return a more distinctive error, so we can be sure that the EOF is coming from that
[09:42] rogpeppe: i've already logged that
[09:42] and it is coming from the Do()
[09:44] rogpeppe: error happens after about 7 seconds
[09:45] wallyworld_: have there been several successful requests before the one that failed?
[09:45] rogpeppe: log is full. let me clear it and i'll retry.
too hard to tell
[09:45] wallyworld_: i'd be interested to see the log actually
[09:45] thought so :-)
[09:46] will pastebin
[09:46] ta
[09:46] rogpeppe: actually, i deleted to log and there is no new one. it seems like the first http get fails
[09:46] which is what i think the tools look up is
[09:47] ie build tools locally, then look to see what's in target bucket, boom
[09:50] wallyworld_: hmm, that is interesting. i'm surprised that the Close=true change makes a difference then
[09:50] yeah
[09:50] so the log is the apache log
[09:50] so the request never leaves the client
[09:50] or apache eats it
[09:51] i can't see apache doing that
[09:51] no new errors logged
[09:52] wallyworld_: can you change gomaas API dispatchRequest to log every time it makes a request?
[09:52] i can
[09:53] i gotta attend to a couple of things. i'll do it between now and standup
[09:53] wallyworld_: thanks
[10:04] rogpeppe: there is one other request
[10:04] ------ GET http://10.0.0.9:80/MAAS/api/1.0/files/provider-state/
[10:04] ------ GET http://10.0.0.9:80/MAAS/api/1.0/files/?op=list&prefix=tools%2Freleases%2Fjuju-
[10:04] ERROR Get http://10.0.0.9:80/MAAS/api/1.0/files/?op=list&prefix=tools%2Freleases%2Fjuju-: EOF
[10:05] the provider-state lookup
[10:05] which is not a List() but a storage Get() I think
[10:06] wallyworld_: and you're saying the first request isn't in the apache log?
[10:07] rogpeppe: seems not. i deleted the access.log file from /var/log/apache2 and it hasn't created a new one
[10:08] that doesn't seem right
[10:08] wallyworld_: when you say "deleted", did you just truncate the file, or remove it? perhaps apache is just trying to append but not create?
[10:08] yeah, let me touch it and try again
[10:09] nope
[10:09] empty
[10:09] wtf
[10:10] wallyworld_: maybe logging just isn't configured for that apache?
[10:10] wallyworld_: or... are you sure it's actually going through that apache?
[10:11] wallyworld_: if you do a curl request, does it show up in the log?
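A minimal Go sketch of the per-request logging asked for at 09:52, producing lines shaped like the "------ GET ..." output pasted at 10:04. It does not touch gomaasapi's dispatchRequest; instead it wraps an http.RoundTripper, which is one generic way to see every request a client sends. The type and function names are hypothetical, and the server is a throwaway local one from net/http/httptest.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// loggingTransport logs each request before handing it to the wrapped
// RoundTripper, then logs either the error or the response status.
type loggingTransport struct {
	next http.RoundTripper
	logf func(format string, args ...interface{})
}

func (t loggingTransport) RoundTrip(req *http.Request) (*http.Response, error) {
	t.logf("------ %s %s", req.Method, req.URL)
	resp, err := t.next.RoundTrip(req)
	if err != nil {
		t.logf("ERROR %s %s: %v", req.Method, req.URL, err)
		return nil, err
	}
	t.logf("status %s", resp.Status)
	return resp, nil
}

// demo sends one GET through a logging client to a local test server and
// returns the collected log lines together with the URL that was requested.
func demo() (lines []string, url string, err error) {
	srv := httptest.NewServer(http.HandlerFunc(
		func(w http.ResponseWriter, r *http.Request) {
			io.WriteString(w, "ok")
		}))
	defer srv.Close()

	record := func(format string, args ...interface{}) {
		lines = append(lines, fmt.Sprintf(format, args...))
	}
	client := &http.Client{
		Transport: loggingTransport{next: http.DefaultTransport, logf: record},
	}
	url = srv.URL + "/files/?op=list"
	resp, err := client.Get(url)
	if err != nil {
		return lines, url, err
	}
	defer resp.Body.Close()
	_, err = io.Copy(io.Discard, resp.Body)
	return lines, url, err
}

func main() {
	lines, _, err := demo()
	for _, line := range lines {
		fmt.Println(line)
	}
	if err != nil {
		fmt.Println("error:", err)
	}
}
```

Logging on the client side like this complements the server's access log: a request that appears here but never in the Apache log was lost somewhere between the two.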
[10:11] trying a few things
[10:12] i ran juju 1.14 and lo log either
[10:12] perhaps apache didn't like its log file going away
[10:12] i'll retstart it
[10:12] wallyworld_: it's possible, yeah
[10:13] wallyworld_: perhaps it was still trying to write to the old (removed, but still there in the fs) log file
[10:13] yeah
[10:15] rogpeppe: i had started a juju env using 1.14. i destroyed using 1.16. it logged about 10 requests and did it no problem
[10:15] so it's only bootstrap
[10:16] wallyworld_: hmm
[10:16] rogpeppe: and yes, the provider state lookup is logged
[10:16] 10.0.0.9 - - [15/Oct/2013:20:14:58 +1000] "GET /MAAS/api/1.0/files/provider-state/ HTTP/1.1" 404 233 "-" "Go 1.1 package http"
[10:16] but NOT the tools list
[10:17] which is the next http get
[10:18] wallyworld_: could you paste the log of the requests it made when destroying the 1.16 env using 1.14?
[10:18] o
[10:18] k
[10:18] sorry, i mean the the other way around
[10:18] wallyworld_: destroying the 1.14 env using 1.16
[10:18] yep
[10:19] rogpeppe: https://pastebin.canonical.com/99039/ that's the whole log. it has the destroy followed by bootstrap
[10:19] bootstrap at 14:58
[10:19] destroy ends at 14:48
[10:21] wallyworld_: hmm, and that contains a successful list request too - exactly the same request that fails in Bootstrap
[10:21] wallyworld_: well, *almost*
[10:21] yep
[10:21] go fogure
[10:21] figure even
[10:22] wallyworld_: i wonder if we can try to repro this with a very simple example, rather than using bootstrap
[10:23] wallyworld_: we know that there are only two requests and the second one fails
[10:23] that's pretty simple
[10:23] in itself
[10:23] wallyworld_: so we could simply create a maas Environ and try to read its provider-state file and then list the tools
[10:23] wallyworld_: with a 7 second gap between them
[10:24] yes.
but what would that tell us [10:24] that we don't already know [10:24] wallyworld_: if that failed, that would be very useful [10:24] wallyworld_: because we then have a very simple example that we can reduce further to see what's actually going on [10:24] wallyworld_: if that succeeds then we know something else is up [10:25] true [10:25] wallyworld_: it should only take 5 minutes to write [10:25] yes [10:26] i'll try and do it after standup if i am still awake, othwrwise tomorrow [10:28] wallyworld_: one mo, i'll do it if you like [10:41] wallyworld_: try this (substitute "my-maas-environ-name" with your env name: http://paste.ubuntu.com/6239995/ [10:42] rvba: i'm a bit confused by the maas storage handling - how are the storages from different juju environments in the same MAAS kept separate from one another? [10:44] rogpeppe: sorry, was afk. looking [10:44] ubuntu@maas:~/go/src/launchpad.net/juju-core$ go run masstest.go [10:44] ------ GET http://10.0.0.9:80/MAAS/api/1.0/files/provider-state/ [10:44] 2013/10/15 20:44:31 get provider-state: file 'provider-state' not found not found [10:44] ------ GET http://10.0.0.9:80/MAAS/api/1.0/files/?op=list&prefix=tools%2Freleases%2Fjuju- [10:44] 2013/10/15 20:44:38 list tools ok [10:47] rogpeppe: standup? [10:48] TheMue: "the video call is full" [10:49] mgz: standup? [10:49] on my way [11:00] fwereade: ?? [11:24] wallyworld_: http://paste.ubuntu.com/6240109/ [11:27] rogpeppe: new version ran fine [11:28] * TheMue => lunch [11:28] wallyworld_: could you try the original version with a 15 second delay? [11:28] ok [11:28] rogpeppe: except bigjools just shot it down [11:29] shut [11:29] wallyworld_: oh [11:29] yeah :-( [11:36] rogpeppe: care to have a look at https://codereview.appspot.com/14696043/ ? [11:36] rvba: looking [11:39] rogpeppe: i got the server powerd up again. i added the 15s delay. 
all passed [11:39] wallyworld_: hmm [11:39] rogpeppe: i can add your ssh key [11:40] rogpeppe: done, you should be able to ssh in now [11:41] wallyworld_: great, thanks [11:41] the ~ubuntu/go dir is where the 1.16 source lives [11:41] i put the test go src you gave me in the juju-core root dir [11:45] wallyworld_: ok, i'm investigating now [11:45] rogpeppe: awesome. i'll try and hang around a bit. if you are finished and i'm not around, can you sudo poweroff? [11:49] rogpeppe, wallyworld_: thanks for looking into this -- I will try to pop back on and off, and I should be able to be around properly this evening if I can help you at all [11:49] fwereade: thanks [11:50] fwereade: np. we've spent a long time on it and at every turn it gets more confusing. we will kick ourselves when the issue is found i'm sure [12:37] rogpeppe: When does EnvironProvider.Prepare() get called? I only want to set the environment UUID when bootstrapping an environment; existing environments should not have a UUID. [12:37] rogpeppe: Hello too :) [12:38] allenap: hi :) [12:38] allenap: it gets called only if there's no .jenv file in ~/.juju/environments for that environment [12:38] allenap: and only if you call juju bootstrap or juju sync-tools [12:38] allenap: so that should be exactly what you need, i hope [12:39] rogpeppe: That sounds about right, thanks :) [13:05] rvba: Do you happen to know why maasEnvironConfig.attrs exists? environs.config.Config has two maps, one for known and one for unknown config. Does that not suffice, or was that not around at the time? [13:06] allenap: perhaps it's to save the map being copied every time an attr is used [13:07] rogpeppe: I'm a little rusty :) Can you explain why that would happen? [13:08] allenap: environ.Config makes a copy of the map that it returns, to prevent mutation of the immutable Config value. [13:08] allenap: no idea… by the looks of it, it just copies the pattern used in the openstack provider… maybe jtv (if he's around) will knwo. 
[13:08] know* [13:08] rogpeppe: Ah yes, I see, ta. [13:25] wallyworld_: FWIW, these are the (only) requests i see. the final request is the one that fails (it fails because it sees EOF on the persistent connection) http://paste.ubuntu.com/6240523/ [13:30] rogpeppe: that looks about right. the simplestreams code uses a different http client. maybe that is related. the http client used by juju-core code has the ssl support. the http client in gomaas doesn't [13:31] i'm not sure whar is done with the http transport in juju-core though [13:31] wallyworld_: they're both using the same transport [13:32] rogpeppe: so at bootstrap is when the different hhtp clients are used, whereas destoy just uses the gomaasapi one [13:32] surely not a coincidence? [13:33] rogpeppe: although, did you run bootstrap with --upload-tools? [13:33] wallyworld_: no [13:34] cause using --upload-tools doesn't first do a simplestreams search [13:34] and that still fails [13:34] wallyworld_: ah, i'll try that [13:37] wallyworld_: http.Client isn't stateful unless it's given a custom RoundTripper [13:38] wallyworld_: so it shouldn't make any difference if plain http.Get is used vs creating a new client with http.Client{} [13:38] i'm just clutching at straws, trying to think of twhat's different [13:39] sinzui: hi, did you see the streams repository is sortof set up? [13:40] wallyworld_, I did not [13:41] sinzui: ticket 63925 [13:41] it talks about the server sawo which i'm not familiar with [13:42] sinzui: i need to change the url embedded in juju-core, and the generated simplestreams metadata and tools need to be uploaded [13:42] I saw that and update a bug over my weekend [13:42] or the release scripts tweaked accordingly [13:43] i can easily do the juju-core side ahead of time [13:43] although if the tools were uploaded i could drop the legacy aws fallback at the same time [13:53] rogpeppe: i fell asleep on the couch before so i had better get myself off to bed for real. 
maybe you can drop me a quick note with any progress and i'll pick up tomorrow? also don't forget to poweroff the server [13:53] wallyworld_: how should i do that? [13:53] wallyworld_: just sudo shutdown? [13:53] sudo shutdown now [13:53] or does sudo poweroff work? not sure [13:53] wallyworld_: i'm making progress BTW [13:53] oh? [13:53] wallyworld_: i have a suspicion of what might be happening [13:54] oh great, nowyou've got me curious [13:54] wallyworld_: it looks like MAAS *is* closing the connection after 5s [13:54] wallyworld_: but somehow we're seeing that in the wrong place; not sure quite why yet [13:54] hmmm. ok. would be great to know the sequencing of things so explain why now and not before [13:55] wallyworld_: it *may* be to do with someone not closing an http request body. i need more investigation [13:55] if maas is going to do that, maybe we do need the close=true on the juju side? [13:56] wallyworld_: the http client *should* work ok even with that [13:56] ok. good luck and thanks. talk tomorrow [13:56] wallyworld_: but i guess there's always going to be a race. [13:56] wallyworld_: okeydokey. sweet dreams :-) [13:56] i'll ask jools tomorrow about the maas aspect of it [14:25] fwereade, sinzui: any critical bugs for 1.16.0 that I need to push in pre-release of saucy? [14:25] * sinzui looks [14:26] jamespage, I don't see any criticals that can be pushed [14:29] sinzui, coolio [14:29] * jamespage puts his feet up for a bit then [14:31] jamespage: I have one funny report on a bug fixed for 1.16 [14:31] that I want to (un)confirm [14:31] * jamespage takes his feet off the desk and listens [14:31] jamespage: see last two comments on bug 1236734 [14:31] <_mup_> Bug #1236734: juju 1.15.1 polls maas API continually [14:31] I'm pretty sure he just has the old version still [14:32] mgz, oh - I can confirm that has good [14:32] gone [14:32] excellent. 
[14:32] the load on the serverstack maas server went from 2.0 to 0.04 post upgrade [14:32] jamespage: excellent news - i was a bit concerned my fix hadn't [14:32] I'll comment on the bug and see if I can help the guy. [14:33] mgz, maybe he forgot to do juju upgrade-juju [14:33] fwiw that took a very long time to complete [14:33] which is worrying - as it would indicate that the performance of the storage on MAAS is not great - I only have 6 physical and 8 lxc service units in the deployment [14:37] hm, storage performance, or just general api being really slow? [14:38] django is a fair bit of overhead on top of postgres [14:41] * sinzui thinks there is a space in dependencies.tsv that breaks goamx [14:41] gah, and it's me... [14:41] sinzui: ah, i was wondering what the problem there might be [14:42] sinzui: I'll fix [14:42] should have thought of that when it was falling over yesterday rog... [14:42] my bad. [14:43] mgz: i should've thought of it too [14:43] sinzui: nice catch [14:46] fix proposed for rubber stamping. now, did I backport that too... [14:46] howdy all. Sorry to miss the standup this morning. [14:46] natefinch: blame columbus! [14:46] mgz: that bastard [14:48] fwereade, you about? [14:48] ah, actually, it wasn't my one that borked, so is just trunk [14:48] blame says nate :) [14:49] natefinch: https://codereview.appspot.com/14701043 [14:50] natefinch: heya [14:51] rogpeppe: could you help me with the provisioner? [14:51] mgz: dang, sorry [14:51] TheMue: sure; gimme a minute [14:51] rogpeppe: or even a level higher, I don't know if it is the provisioner [14:51] rogpeppe: thx [14:52] mgz: who uses tab separated values, anyway? Why not csv like normal people? :) [14:52] * natefinch grumbles about invisible, variable width characters [14:53] TheMue: what's the issue?
[14:54] natefinch: because commas are more common in normal values [14:54] rogpeppe: my question is that in case of a dying machine (e.g. by shutdown -fh now) is the a mechanism that updates the state on it? [14:54] TheMue: i'm not sure [14:55] /me goes to look [14:55] * rogpeppe goes to look [14:59] TheMue: i can't see anything that polls instances for their status, no [14:59] rogpeppe: yes, that's my impression too [14:59] rogpeppe: thx [15:00] TheMue: np [15:01] rogpeppe: cts reports of a customer which turn down machines (wonder why they do it instead of removing units) and then in status they see the machine as down but the unit as alive [15:01] rogpeppe: that makes me wonder [15:02] TheMue: i have a feeling that the address updater worker might be a good place to put this functionality (i guess it might be renamed to "machineupdater" if it that happened) [15:02] mgz: does that make sense to you? [15:03] seems reasonable, though I'm not sure it's the right fix for the problem of machines just going away [15:04] mgz, rogpeppe: I'm trying to get more details from CTS [15:04] mgz: yeah, that would probably need another fix too, in the provisioner. although... [15:04] mgz: we'd need to decide what we want to do about that [15:05] mgz: do we want to automatically start another unit? [15:05] that was the pyjuju behaviour, roughly [15:05] not sure if it's what we want though [15:05] mgz: indeed [15:05] mgz: we probably do if minunits is set [15:18] mgz: it get's more clear now, they use juju to deploy openstack on maas and there the hacluster charm. 
then they stop nodes for failover tests [15:24] TheMue, rogpeppe, mgz: we dropped that behaviour, the only time anyone noticed it was when they didn't want it [15:25] TheMue, it is surprising that the unit is not reported as down if it's not running [15:25] fwereade: yeah, i thought so, although with SetMinUnits, we should perhaps rethink that [15:26] fwereade: because that expresses a clear intent, i think [15:27] TheMue, look at the api server, maybe it's not noticing that the unit's connection is gone and is hence not killing the presence bit [15:27] fwereade: ah of course, i'd forgotten about the presence thing [15:27] rogpeppe, wrt units uncautious agreement; wrt machines uncautious disagreement ;p [15:27] er machines *cautious* disagreement [15:28] fwereade: agreed [15:28] rogpeppe, essentially I need to write some access-revocation stuff for agents we force destroy [15:29] How permanent is the theme-oil bug tag? Is there a better tag to represent the bugs? server?, hyperscale? cloud-server? [15:29] rogpeppe, if I have that I can be sure that removed units aren't coming back and are safe to replace without compounding confusion [15:29] fwereade: i guess so, although i suppose it's not a huge problem [15:29] fwereade: ah yes, of course [15:29] rogpeppe, I am still twitchy about presence nodes as an arbiter of true existence [15:29] fwereade: although... [15:29] fwereade: we won't be recreating the same units, will we? [15:30] fwereade: or the same machines to run them, come to that [15:30] rogpeppe, I've seen a couple of `down (alive)` reports that have always cleared themselves up before I've been able to figure them out [15:30] rogpeppe, so basically I'm just nervous about our ability to tell for sure what's in the system [15:31] fwereade: yeah [15:31] rogpeppe, and I don't want things coming back to life once we think they're dead [15:31] fwereade: if we've destroyed their state entities, how can they? 
[15:32] fwereade: or perhaps that's what you're thinking of when you say "access revocation" [15:32] rogpeppe, I'm just worried about the potential for races, with things coming back and managing to do something while we think they're gone [15:33] rogpeppe, access revocation == telling things they're dead immediately, even if they're not quite dead in state yet [15:34] fwereade: we can't just kill 'em dead in state immediately? [15:35] * rogpeppe is still going down the maas EOF bug rabbit hole [15:35] rogpeppe, not sure we can while keeping sane guarantees [15:36] rogpeppe, (tyvm for keeping on at that) [15:36] rogpeppe, think of the relations a subordinate has joined, when that subordinate is in a container in a machine that itself has a unit running directly [15:37] rogpeppe, even calculating the right transaction for smoothly cleaning up the top-level machine kinda makes me cry [15:37] fwereade: there's something really weird happening, which is causing the remote http connection to be dropped *just* before we make a GET request [15:37] rogpeppe, I would like a quick "cut all these off" to be followed by a more measured cleanup of all the various agents potentially in play [15:38] fwereade: yeah, seems reasonable on second thoughts [15:45] fwereade: do you know how far Tim got in looking at MaaS? I was looking into a couple of the bugs, but was having trouble with my maas environment (as usual). If he's progressing, it might be better for me to work on something else [15:45] Hi all - I asked about bug 1236734 yesterday and didn't get a response. This is listed as fixed in 1.16.0, but it's not. [15:45] <_mup_> Bug #1236734: juju 1.15.1 polls maas API continually [15:45] Any chance we could get some attention on this? [15:46] natefinch, I saw rvba and bigjools talking with rogpeppe this morning [15:46] natefinch: which MaaS problem? 
[15:47] kurt_, mgz was going to follow up with that -- last I heard it was confirmed fixed by jamespage [15:47] fwereade: we talked about something different, we are in the process of fixing bug 1239488. [15:47] <_mup_> Bug #1239488: Juju api client cannot distinguish between environments [15:47] rogpeppe: the couple I was looking at was destroying non-allocated machines and destroying machines from outside the juju environment [15:47] kurt_, are all your agents running 1.16? [15:47] fwereade: its not. I have 1.16 installed and a still having issue [15:47] rvba, I *think* that's what natefinch was thinking of [15:47] natefinch: that's the bug I just mentioned. [15:47] natefinch: ah, yeah, there's some stuff being done for that [15:48] rogpeppe, rvba, fwereade: that sounds like a great fix for the bug [15:48] fwereade: how can I bring up the agent inof? [15:48] info rather [15:48] kurt_, juju status [15:48] kurt_, included agent-version [15:48] sorry, brb, diaper duty [15:48] natefinch: there's a CL here: https://codereview.appspot.com/14696043/ [15:49] kurt_, you may need to sync-tools before an upgrade-juju sees the latest tools to fix the problem [15:49] Oh - good point - how can I update the agent? just sync-tools? [15:49] fwereade, mramm2: any call outs you would like to add for juju-core in https://wiki.ubuntu.com/SaucySalamander/ReleaseNotes#Ubuntu_Server ? [15:50] kurt_, sync-tools should get them in place for you; upgrade-juju will actually upgrade the agents [15:50] kurt_, we didn't quite get the global tools source in place in time for it to be 1-step on maas :( [15:50] oh - so even if I followed the correct upgrade process, I still would have ran in to this? [15:51] sorry, I know this is the dev list and not real appropriate for this discussion [15:53] kurt_, no worries :) and... 
I guess it depends on definitions, because upgrade-juju can't upgrade further than it knows; but it's not necessarily entirely obvious that you need to sync tools before a maas environment will see them [15:54] interesting, right. I'm trying to blog some stuff, so I'd like to capture that [15:54] kurt_, the bug report is appreciated all the same -- it helps make it clear to us that we still need to smooth the process [15:55] kurt_, *hopefully* you just hit a bad window, because we've got some infrastructure just coming online now to make it more transparent (unless you're actually cut off from the internet entirely -- if you're isolated, syncing will always be required) [15:55] I'm not isolate [15:55] isolated - full connectivity to internet on MaaS [15:55] and juju [15:55] kurt_, but, indeed, I don't think you'll see the benefits until 1.18 -- sorry for the infelicity [15:56] kurt_: I responded on the bug today after checking with others, you might not have seen the comment unless you subscribed to the bug [15:56] so when upgrading juju - can you outline what the fool proof process should be? [15:56] (as to avoid this situation in the future) :D [15:57] mgz: thanks [15:57] I didn't see that yet [15:57] kurt_, for most users: `juju upgrade-juju`; for those using maas, or those who have manually synced tools in the past, you'll need a `juju sync-tools` first [15:58] fwereade: I believe that I did the juju upgrade via apt-get upgrade after adding the ppa [15:58] kurt_, we will hopefully be able to cut out that step for those not in isolated environments, but it's not there today [16:00] kurt_, ha, sorry, I was focused on the agent side of things: yes, you should also apt-get upgrade your client; and you should probably prefer to upgrade the client first, because that way round has been exercised more, but either way round should be fine in practice [16:01] so the upgrade of juju can be thought of at 2 levels, right? 1. the juju binary 2. 
the agents - is that a reasonable assumption? [16:01] oh there are the tools too [16:01] ! [16:02] kurt_, yeah, both need to be done -- but 1.14/1.16 binaries and agents should interoperate happily [16:02] kurt_, the tools are the agent binaries [16:02] i was previously on .15 :D [16:02] kurt_, just a quirk of terminology [16:02] 1.15.1 that is [16:02] I C [16:02] kurt_, then that *should* actually work too, but we don't make guarantees for odd-numbered minor versions [16:03] kurt_, we have no intention of breaking upgrades ofc... but 1.15 *did* have upgrade troubles [16:03] 1.15.1 had some fixes I needed previously - but yeah I get that - but that's why I was asking about best way to fip bits between dev and stable stuff. [16:03] kurt_, think of 1.odd versions as canaries to make sure we don't mess up the 1.even path [16:04] ok [16:12] * rogpeppe gets some lunch [16:18] fwereade: hmm, thx for the hint, found the machinePinger in apiserver/admin.go, but funnily it's used nowhere [16:22] rogpeppe, fwereade: ec2 instance type constraints.... would love to have some input: https://codereview.appspot.com/14523052/ [16:22] natefinch: I was leaving that one for next week [16:24] if you read through the mail archives, you'll find various arguments, and just having a value that's ignored depending on provider and generally has uncertain meaning isn't really an improvement on not supporting it at all [16:40] fwereade: Hi William, please can you add me to ~juju? Or please mark https://code.launchpad.net/~allenap/juju-core/maas-environment-uuid/+merge/191146 as Approved :) [16:47] allenap: did you do the rename rogpeppe mentioned? [16:48] mgz: In a subsequent branch I changed it to maas-environment-uuid then back again when I read his comment. [16:49] so, we want another branch generating it in Prepare for all providers? [16:50] mgz: I guess so. I have added the code to return an error if it's set in environments.yaml, but it's still in the MAAS provider for now. 
[16:50] mgz: I have a very specific problem to solve :) [16:51] wow, this is incredibly weird [16:51] allenap: submitted [16:51] mgz: Ta. [16:53] fwereade, hazmat, thumper, You will be getting email about this bug https://bugs.launchpad.net/juju-core/+bug/1232304 [16:53] <_mup_> Bug #1232304: consider tuning git setup for juju-core, and document caveats [16:57] * TheMue is stepping out for dinner, cu [16:58] sinzui: hi, can you give me any context on that bug (1232304)? it's a real problem for us at the moment - means some environments can't run upgrade-charm without significant effort === natefinch is now known as natefinch-afk [17:05] mgz: If you're itching to do a review, https://codereview.appspot.com/14644045 is scratching at the door waiting to be let into your heart. [17:09] * rogpeppe has gone up a blind alley and needs to stop and make curry. [17:09] g'night all [17:09] mthaddon, I have asked for input from the engineers. [17:10] nn rogpeppe [17:10] allenap: cheerio [17:21] allenap: may have a chance to go through that briefly === natefinch-afk is now known as natefinch [20:09] morning thumper. How goes? [20:09] natefinch: morning, good [20:09] can tell that today is going to get a little broken up [20:10] taking the car for a service in just under an hour [20:10] and will work from a cafe across the road while that is happening [20:10] Wish I could do that, my mechanic is sorta in the middle of nowhere [20:11] * natefinch doesn't really want to choose mechanics based on amenities within walking distance, however.... [20:13] :) [20:33] fwereade: you around? i was hoping for a hand off email about the maas http connection issue. do you know the status? [20:39] wallyworld_, back soon -- rogpeppe was thinking for a while that it *was* an unclosed request body, but then I saw him saying blind alley :( [20:39] ok, thanks [21:14] is the go bot set up to land on 1.16? [21:14] thumper: yes [21:15] wallyworld_: so why isn't it [21:15] ? [21:15] um. 
[21:15] it did for me last week, no problems [21:15] https://code.launchpad.net/~thumper/juju-core/bootstrap-state-no-constraints/+merge/190852 [21:15] i'll log in and check [21:15] approved, commit message [21:15] not getting merged [21:16] thumper: https://pastebin.canonical.com/99089/ [21:17] wtf? [21:17] thumper: wtf is a ghost revision? [21:17] a revision with referenced as a parent that isn't in the repository [21:18] most likely, a stacking issue [21:18] but launchpad can see it [21:18] so NFI [21:18] maybe do a new branch, cherry pick changes, re-propose? [21:22] ugh [21:43] wallyworld_: I know what it is [21:43] do tell [21:43] wallyworld_: no I don't [21:43] * thumper thought he did [21:43] nothing to see here, move on [21:43] yes I do [21:43] maybe [21:43] argh [21:43] it does have to do with stacking [21:43] what I did was start with a 1.16 branch [21:44] and then moved back 3 revisions to be what was currently merged with trunk [21:44] that was the base revision for that branch [21:44] when pushed to launchpad, it gets stacked on trunk [21:44] trunk should have those revisions from 1.16 [21:44] so now I'm confused again [21:44] bzr is doing something weird [21:45] * thumper wonders if he can unstack it [21:45] * thumper pokes [21:46] nothing seems to be going right this week [21:47] :) [21:47] looking forward to next week? [21:47] because that'll be awesome [21:47] :-) [21:47] well, that branch says it is reconfiguring as unstacked, we'll see if it actually works [21:48] bzr reconfigure --unstacked lp:~thumper/juju-core/bootstrap-state-no-constraints [21:48] in case you cared [21:49] cool, i have not done that before [21:49] yes, next week wil be great [21:53] wallyworld_: do you need any reviews? 
[21:54] thumper: i have a couple actually [21:54] there's a pipeline of 3 [21:54] andrew has done the last [21:55] https://codereview.appspot.com/14663043/ and https://codereview.appspot.com/14540055/ are the others [21:55] paste the links here and I can go through them [21:55] thanks :-) [21:55] the last is https://codereview.appspot.com/14502059/ fwiw [21:56] i've been looking at the http maas issue so haven't yet really looked at the review i do have, will do that today [22:03] * thumper nods [22:04] * thumper starts on the first one [22:04] looks big [22:05] btw, unstacking that branch worked [22:05] should get merge in now [22:05] hopefully [22:12] wallyworld_, do we have docs for sync-tools --source somewhere [22:12] ? [22:12] thumper, hi, long time [22:12] fwereade: not sure. i haven't really read any of our docs [22:12] fwereade: hey [22:12] just noticed the provisioner tests timed out on go-bot for the 1.16 branch [22:13] anyone seen this before? [22:13] fwereade: do we have any docs for sync-tools at all? [22:13] thumper: i get random timeouts from the bot semi-regularly [22:13] wallyworld_, hell, possibly not :/ [22:35] wallyworld_: how'd you get on with the http problem? [22:36] bigjools: hi, long story, lots of dead ends. but it looks like maas is closing the http connection and then juju complains about it [22:36] wallyworld_: and juju just needs to deal with that I guess [22:37] still need to figure out root cause [22:37] it might be getting closed because it's not using http/1.1? [22:37] or the keep-alive header is not getting set? [22:37] * bigjools stabs in the dark [22:37] not sure, will check. but none of that has supposedly changed [22:38] bigjools: so all of the http requests to maas go via the apache server? [22:38] did you work out if Go 1.1.1 made a difference? [22:38] bigjools: no difference [22:38] wallyworld_: yes, maas runs as a wsgi container in Apache [22:38] compiled 1.14 with go 1.1.2 and it worked [22:38] ! 
[22:38] so something changed inbetween 1.14 and 1.16.... fun [22:38] nothing obvious though [22:39] and the EOF at the client is really due to the server closing the connection [22:39] but what triggers the closure [22:39] that is the question [22:39] you could do something silly like compile 1.16 with the version of gomaasapi from 1.14 [22:39] just to eliminate that [22:39] or confirm it [22:40] could do. a very quick look at gomaasapi showed no significant changes, or nothing obvious in the http request dispatch side of thnugs [22:40] i think though the 1.14 gomaasapi won't worj with 1.16? will need to checki guess [22:42] bigjools: you using the maas server this morning? [22:42] wallyworld_: no [22:43] I am weeping into my coffee [22:43] i'll poke around some more then if that's ok [22:43] because of this multi-environ crap [22:43] weeping? [22:43] np [22:43] ah, the users getting mixed up issue? [22:45] yeah [22:45] I maintain it's a juju self-inflicted problem but thumper disagrees with me [22:45] what would he know, right? [22:46] not a lot [22:46] at least we can disagree in a more violent manner in person next week :) [22:46] i'll remember to pack the popcorn [22:46] or maybe jelly [22:46] * bigjools will secretly put one fewer shots in thumper's coffee [22:47] * thumper packs the boxing gloves [22:47] heh [22:47] you still 120kg thumper? [22:47] of pure muscle [22:47] Dibs on the front row seats! [22:48] just under 97kg now [22:48] gosh [22:48] never wes 120 [22:48] 111 maybe [22:48] :) [22:48] wasting away to nothing [22:50] so rvba has come up with the great idea of using juju's uuids in the filenames we send to maas [22:51] rvba: of course we have to make sure the tools don't get those uuids in the names I think [22:51] yep, that would be bad [22:51] bigjools: well, we just need to handle that transparently in the provider's code. [22:52] not as simple as that [22:52] you mean send the tools with the uuids? [22:52] and translate inthe provider? 
[22:52] or strip out the uuid and send the tools uncganged? [22:52] unchanged [22:53] juju itself will ask to a file with name="zzz" but the provider will just fetch the file named uuid+"zzz" [22:53] Same when storing files. [22:53] hmm. but then we will have multiple copies of the same tools tarballs [22:53] The only trick is the anonymously accessible files… but that will also work transparently I think, because in this case we simply use generated ids. [22:54] canit be path based? ie don't use uuids for certain paths? [22:54] so stuff under tools/releases is left alone [22:54] wallyworld_: well, yes, that could be an improvement. [22:55] and tools/streams [22:55] or just /tools in general [22:55] that would save duplicating the same tools tarballs and metadata over and over again for each user [22:55] so leave tools/ alone but add uuid/ to everything else [22:55] WFM [22:55] yeah, think that will work [22:55] Yeah, totally possible. But again, I think this should be done as a second step. [22:56] why 2nd? i think it is flawed to needlessly duplicate lots of data [22:56] rvba: so I don't want to change the filestorage object willy nilly because of this [22:56] it needs to be done at the provider level [22:57] Absolutely. [22:57] There is actually very little to be done. [22:57] But that's the beauty of it. It's just a fix to the provider. [22:57] All in provider/maas/storage.go [22:58] Basically, we just need to translate filename <--> uuid+filename everywhere. [22:58] and also the output from list [22:58] Yep [22:58] storage.List(stor, "") [22:58] Becomes storage.List(stor, uuid) [22:58] i will be trivial to add the /tools exclusion [22:58] bigjools: this is really too good to be true! [22:59] rvba: I disagree that it can only go there because it will affect the tools [22:59] I think it needs to go in some of its callsites [22:59] tools are access via storage [22:59] Exactly. 
[22:59] so it should work, so long as tools are excluded from uuid mangling [22:59] via that storage object no? [22:59] yes [22:59] special casing tools in there is crazy [23:00] agreed, i don't care where it's done [23:00] but we should duplicate tools [23:00] shouldn't [23:00] I'll sort the call sites out later [23:00] right now I need breakfast and an inbox cleanout [23:01] Again, I suggest dealing with the tools second. [23:01] or is that an outbox cleanout [23:01] Because we need to implement this and test it. [23:01] but not in a flawed way [23:01] bigjools: so, I'll let the lab running so that you can test this [23:01] tools duplication not good [23:01] rvba: I can test it locally [23:01] not good indeed. But not fatal. [23:01] my maas setup is working fine here [23:01] too much scope for hidden gotchas [23:02] bigjools: all right. [23:02] I'd rather do it properly from the start, it's no big deal. [23:02] bigjools: except i'm playing with it [23:02] rvba: thanks [23:02] wallyworld_: easily fixed :) [23:02] bastard [23:02] mwahahaha [23:02] you have a while anyway [23:02] bigjools: don't forget to base your work on Gavin's branch: ~allenap/juju-core/maas-environment-uuid-use [23:03] doubt I'll get to the QA stage until this afternoon [23:03] rvba: affirmative, thanks [23:03] is he finishe dcoding? [23:03] is he finished coding? [23:03] Yeah, it's up for review. [23:03] * bigjools looks at wallyworld_ [23:04] yeeees? [23:04] bigjools: btw, thumper reviewed it. [23:04] \o/ [23:05] He has an interesting point about backward compatibility. It should be fine (I think extra arguments like agent_name will be ignored by earlier version of MAAS) but it's worth a test. [23:06] rvba: it will be fine and a desirable outcome. It leaves the functionality as-is [23:07] I'll try to land it later [23:07] All I'm saying is that it's worth a test. [23:07] absolutely [23:08] And on that note, I think I'll call it a night. 
[23:08] I'll be available to do some QA tomorrow morning if we still have time. [23:25] davecheney: ping [23:31] wallyworld_: ack [23:31] i have tweaked a go file from the std lib (net/http/transport.go). how do i force that to compile so that a go build -a ./... from juju-core uses my mods? [23:31] i tried go build in the /usr/lib/go/pkg/net/http dir [23:32] wallyworld_: you should 1. remove the packaged go [23:32] 2. download go from source [23:32] the packaged go cannot be modified [23:32] it's too mutated [23:32] davecheney: this is on jool's maas server which roger played with last night and he made changes [23:32] http://golang.org/doc/install/source [23:32] i'm trying to revert them [23:33] if they are to the packaged version [23:33] remove the package [23:33] make sure /var/lib/go is empty [23:33] reinstall the package? [23:33] would that work? [23:33] i think so. i assume go 1.1.2 is in the repos? [23:33] although if roger made his changes stick, i should be able to as well? [23:34] i wonder what command he ran? [23:34] wallyworld_: i don't think I really understand the problem [23:34] or the solution you are pursuing [23:34] ok, so roger modified transport.go [23:34] to add extra logging [23:34] wallyworld_: you could try, as root [23:34] cd /usr/lib/go/pkg/net/http [23:35] go install -x [23:35] and he did so by editing /usr/lib/go/pkg/net/http/transport.go directly [23:35] i don't know if that will work [23:35] never tried it [23:35] will try [23:35] davecheney: that looks like it worked. the timestamp on the .a file has now changed [23:36] thanks :-) [23:36] when screwing with this [23:36] i recommend -x and -v flags [23:36] so you can see when things are changing [23:36] or more important, when they are not 'cos the tool thinks everything is up to date [23:36] ok. i've not used -x before [23:37] yeah.
i thought go build -a would have done the job [23:37] but it seems that's not for the std libs, only deps of current project [23:39] -a has problems [23:39] don't we all [23:39] https://groups.google.com/d/msgid/golang-nuts/9120d622-e8a8-451f-941e-34899ae0a457%40googlegroups.com