wallyworld_ | bigjools: wtf. bootstrap on your maas box works now | 00:16 |
---|---|---|
wallyworld_ | bigjools: so roger and i were debugging and trying things. then the server got shutdown. now, it seems it has just started working. i tried with and without all the debug logging | 00:21 |
wallyworld_ | maybe the power cycle on the server helped, not sure | 00:21 |
bigjools | wallyworld_: dafuq | 00:22 |
thumper | haha | 00:23 |
wallyworld_ | bigjools: so looks like we have wasted 2 man days when we should just have listened to Roy and Moss and "turned it off and on again" | 00:24 |
bigjools | \o/ | 00:24 |
bigjools | I still think it's a bug | 00:24 |
wallyworld_ | well, not much juju can do if maas/apache closes the connection | 00:25 |
wallyworld_ | from underneath it | 00:25 |
wallyworld_ | i guess it could retry | 00:25 |
wallyworld_ | but wtf | 00:25 |
wallyworld_ | that's the only place that needs such logic | 00:25 |
bigjools | network code should never ever assume connections will stay open | 00:34 |
wallyworld_ | that's for the http lib to worry about | 00:34 |
bigjools | yes | 00:35 |
bigjools | I think the Go http lib is a little crazy | 00:35 |
bigjools | exposing that Close setting is one thing, but requiring it before it can cope with the other end closing it is a bug | 00:35 |
wallyworld_ | hard to argue that i think | 00:35 |
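The "Close setting" discussed above is presumably the `Close` field on the `Request` type in Go's `net/http`. As a minimal sketch, assuming a simple GET against a throwaway local test server, the following shows how a client can ask for a connection not to be reused, which avoids sending a request down a pooled connection the server has already dropped. `getWithoutReuse` and `demo` are hypothetical names for illustration; this is not taken from juju or its MAAS client code.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// getWithoutReuse is a hypothetical helper: it issues a GET with
// Request.Close set, so the connection is closed after the response
// instead of being returned to the client's idle pool.
func getWithoutReuse(client *http.Client, url string) (string, error) {
	req, err := http.NewRequest("GET", url, nil)
	if err != nil {
		return "", err
	}
	req.Close = true // do not keep this connection alive for later requests
	resp, err := client.Do(req)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return "", err
	}
	return string(body), nil
}

// demo starts a local stand-in server (an assumption for illustration,
// not a MAAS endpoint) and fetches from it once.
func demo() (string, error) {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, "ok")
	}))
	defer srv.Close()
	return getWithoutReuse(srv.Client(), srv.URL)
}

func main() {
	body, err := demo()
	fmt.Println(body, err)
}
```

Disabling reuse costs a fresh connection per request, but it means the client no longer depends on noticing a server-side close before the next request goes out.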
bigjools | anyway can someone land this please, I am not in the juju team any more: https://code.launchpad.net/~allenap/juju-core/maas-environment-uuid-use/+merge/191249 | 00:36 |
wallyworld_ | i'll add you | 00:36 |
bigjools | please don't :) | 00:36 |
wallyworld_ | what's it worth to you | 00:36 |
bigjools | coffee and lunch at the Tavern? | 00:36 |
wallyworld_ | tempting | 00:36 |
bigjools | lower latency to my maas server? | 00:37 |
wallyworld_ | who knows | 00:37 |
bigjools | I'm slobbing it on the outdoor sofas today | 00:37 |
wallyworld_ | bigjools: so that branch, does it duplicate tools? | 00:38 |
bigjools | not started it yet | 00:38 |
bigjools | I do not intend to duplicate them | 00:38 |
wallyworld_ | ah, sorry, i thought the one you wanted landed was it | 00:38 |
bigjools | no, it's gavin's agent_name fixes | 00:39 |
wallyworld_ | i should have read the description | 00:39 |
wallyworld_ | bigjools: do you intend to propose this against 1.16 too? | 00:39 |
bigjools | failing the tavern, $5 big mac? :) | 00:39 |
wallyworld_ | or just land in trunk? | 00:39 |
bigjools | honestly NFI what's best | 00:40 |
bigjools | I am not familiar with the release plans | 00:40 |
wallyworld_ | thumper: when is 1.18 due out? | 00:40 |
* thumper shrugs | 00:40 | |
bigjools | someone said yesterday that there's another release for saucy | 00:40 |
wallyworld_ | there is | 00:40 |
wallyworld_ | 1.18 i think | 00:40 |
bigjools | so trunk then | 00:40 |
wallyworld_ | bigjools: so gavin's fix, how critical is it | 00:41 |
bigjools | very | 00:41 |
bigjools | and the one I am about to do | 00:41 |
wallyworld_ | i wonder if we need a 1.16.1 then | 00:41 |
wallyworld_ | fwereade: any idea on release plans as per the backscroll? | 00:41 |
bigjools | are you going to approve the MP then? | 00:42 |
wallyworld_ | yeah, sorry saw something shiny and got distracted | 00:44 |
wallyworld_ | should be in the bot now | 00:44 |
bigjools | heh | 00:45 |
bigjools | thanks | 00:45 |
wallyworld_ | bigjools: about lunch, is there a place to buy decent coffee beans out your way? | 00:46 |
bigjools | wallyworld_: the little bean in Kenmore | 00:46 |
wallyworld_ | that's more out my way :-) | 00:46 |
bigjools | it's on the way :) | 00:46 |
bigjools | that's the nearest | 00:47 |
bigjools | though there's a new coffee shop coming apparently \o/ | 00:47 |
wallyworld_ | i need coffee. i could drive out to you and get beans also. kill two birds with the one stone | 00:47 |
bigjools | poyfekt | 00:47 |
bigjools | remember that the cafe closed down, you have to go to the smaller place on the other side of the road now | 00:48 |
wallyworld_ | what time? | 00:48 |
wallyworld_ | yes | 00:48 |
thumper | wallyworld_: I have a feeling that we'll only be able to put 1.16 point releases directly into saucy | 00:48 |
bigjools | any time you want | 00:48 |
bigjools | however | 00:48 |
thumper | but continue as normal with trunk | 00:48 |
bigjools | I have a call from 12-1 | 00:48 |
thumper | and the ppa | 00:48 |
wallyworld_ | bigjools: i'll leave soon i guess | 00:49 |
bigjools | sure | 00:49 |
wallyworld_ | thumper: that sucks | 00:49 |
thumper | wallyworld_: that's working with distro | 00:49 |
wallyworld_ | i thought 1.18 was going into saucy | 00:49 |
bigjools | they will only take cherry picks into saucy now | 00:49 |
thumper | wallyworld_: unlikely at this stage | 00:49 |
wallyworld_ | cause i've done a bunch of stuff in trunk | 00:49 |
wallyworld_ | assuming it would be in saucy | 00:49 |
wallyworld_ | this is very bad | 00:50 |
wallyworld_ | 1.16 is not ready | 00:50 |
wallyworld_ | there's still the tools repository to do | 00:50 |
wallyworld_ | and the ongoing maas stuff | 00:50 |
wallyworld_ | and lots of other tooling stuff | 00:50 |
wallyworld_ | if we are forced to cherry pick stuff, it will be like the whole fucking cherry tree | 00:51 |
wallyworld_ | bigjools: leaving now | 00:58 |
bigjools | wallyworld_: righto | 00:58 |
bigjools | every time I leave the juju code base for a while and then come back to work on it, I struggle to get everything compiling. I presume this is because of mismatched dependencies. What's the best way of dealing with this? | 01:10 |
bigjools | or, I suspect, branches moving and Go has the bug of using the wrong url for a branch :/ | 01:12 |
* thumper nods | 01:12 | |
thumper | that is one | 01:12 |
bigjools | yeah goamz moved it seems | 01:12 |
thumper | apparently jam had a proposal to get golang to use lp: urls for launchpad | 01:13 |
thumper | no interest | 01:13 |
bigjools | oh dear | 01:13 |
davecheney | bigjools: yup, if the owner of goamz has moved | 01:18 |
davecheney | the go get'd branch is probably pointing at the wrong place | 01:19 |
bigjools | indeed it was | 01:19 |
davecheney | bigjools: niemeyer added support for bzr to go get | 01:20 |
davecheney | if you can show me what is wrong, i can try to get it fixed | 01:20 |
bigjools | thumper can explain it better than me | 01:20 |
bigjools | but the upshot is that it needs to pull from lp:project | 01:20 |
bigjools | not the actual branch url | 01:20 |
thumper | davecheney: when the go tool resolves bzr branches to launchpad | 01:21 |
thumper | it expands the project name into the full http url with unique name | 01:21 |
thumper | this is very slow | 01:21 |
bigjools | the old url is http://bazaar.launchpad.net/~gophers/goamz/trunk/ | 01:21 |
thumper | most LP users have their lp identity set in bzr | 01:21 |
bigjools | the new url is bzr+ssh://bazaar.launchpad.net/+branch/goamz/ | 01:21 |
thumper | which means lp: urls resolve to bzr+ssh | 01:21 |
bigjools | the latter is owner-agnostic | 01:21 |
thumper | if you don't, lp: urls resolve to http | 01:21 |
thumper | so lp: is better | 01:21 |
thumper | also | 01:22 |
thumper | bzr+ssh://bazaar.launchpad.net/+branch/project | 01:22 |
thumper | always resolves to the development focus trunk of the project | 01:22 |
thumper | even if the owner changes | 01:22 |
thumper | but go get will turn "launchpad.net/loggo" into http://bazaar.launchpad.net/~thumper/loggo/trunk | 01:22 |
thumper | instead of bzr+ssh://bazaar.launchpad.net/+branch/loggo | 01:23 |
thumper | if go get passed "lp:loggo" to bzr | 01:23 |
thumper | bzr translates to the best it knows | 01:23 |
thumper | which is bzr+ssh if it has your id | 01:23 |
thumper | and http if not | 01:23 |
davecheney | thumper: i'm pretty sure the choice of http is deliberate | 01:23 |
thumper | davecheney: deliberate and stupid | 01:23 |
thumper | IMO | 01:23 |
davecheney | fair | 01:24 |
thumper | it is a choice made by someone who doesn't understand the bzr tool | 01:24 |
thumper | and when jam suggested a patch to golang, they ignored it | 01:24 |
* davecheney has no comment | 01:24 | |
thumper | even though he is probably the best person to make such a suggestion | 01:24 |
* thumper goes back to reviewing wallyworld's branch | 01:25 | |
wallyworld | \o/ | 01:25 |
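The two owner-agnostic forms thumper describes can be sketched as a small string mapping. This is only an illustration of the URL shapes quoted in the conversation, assuming a bare `launchpad.net/<project>` import path; `launchpadURLs` is a hypothetical helper and does not reflect what the go tool actually does.

```go
package main

import (
	"fmt"
	"strings"
)

// launchpadURLs is a hypothetical helper. Given an import path of the
// form "launchpad.net/<project>", it returns the lp: shorthand and the
// owner-agnostic bzr+ssh URL mentioned in the discussion.
func launchpadURLs(importPath string) (lp, branch string, err error) {
	const prefix = "launchpad.net/"
	if !strings.HasPrefix(importPath, prefix) {
		return "", "", fmt.Errorf("not a launchpad import path: %q", importPath)
	}
	project := strings.TrimPrefix(importPath, prefix)
	// Only bare project names are handled in this sketch; series or
	// ~owner paths are deliberately left out.
	if project == "" || strings.ContainsAny(project, "/~") {
		return "", "", fmt.Errorf("unsupported launchpad path: %q", importPath)
	}
	return "lp:" + project, "bzr+ssh://bazaar.launchpad.net/+branch/" + project, nil
}

func main() {
	lp, branch, err := launchpadURLs("launchpad.net/loggo")
	fmt.Println(lp, branch, err)
}
```

Neither form embeds the owner's name, which is why they keep working when a project's trunk changes hands.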
* thumper needs to go pick up the car from the garage | 01:34 | |
thumper | bbs | 01:34 |
=== thumper is now known as thumper-afk | ||
=== thumper-afk is now known as thumper | ||
thumper | axw: how are you doin? | 02:12 |
axw | thumper: heya | 02:14 |
axw | not too shabby | 02:14 |
axw | working on fixing null provider bugs | 02:14 |
axw | the apt repo one's a bit of a pain, need to extract the key from the keyserver... cloud-init would normally take care of that | 02:15 |
* thumper nods | 02:15 | |
thumper | there isn't a handy command we can use? | 02:15 |
thumper | doesn't add-apt-repository download the key? | 02:16 |
axw | thumper: only for ppas | 02:17 |
thumper | bummer | 02:17 |
axw | I'm looking at the cloud-archive case | 02:17 |
* thumper nods | 02:17 | |
* thumper goes to pick up the wife | 02:23 | |
thumper | geez | 02:23 |
thumper | broken day | 02:23 |
thumper | bbs | 02:23 |
axw | davecheney: are you aware of any tools for looking for unused functions/vars/types/etc.? | 02:27 |
axw | or, how can I identify all functions that are only ever used in tests | 02:28 |
davecheney | axw: I think there is a mode for go vet in 1.2 | 02:36 |
davecheney | and kamil kissel has written a tool | 02:37 |
axw | davecheney: thanks, I'll take a look | 02:37 |
wallyworld | thumper: ping | 02:53 |
wallyworld | thumper: i did some fixes for axw's review in the wrong branch in the pipeline. i'm fixing now so ignore the new diff in your review. | 02:54 |
thumper | ok | 03:09 |
wallyworld | thumper: what i did do though is reply to your comments on both merge proposals. i'll fix the issues like gc.HasLen etc but there's also a few things i've replied back to | 03:30 |
* thumper nods | 03:32 | |
thumper | wallyworld: I feel I may pop down to harvey norman to look at the coffee machine | 03:32 |
thumper | really need one that doesn't make me angry | 03:33 |
wallyworld | yes indeed | 03:33 |
wallyworld | get the dual boiler! | 03:33 |
wallyworld | thumper: do all tests really need to extend loggingsuite even if they don't require the base functionality | 03:33 |
wallyworld | seems like a waste | 03:34 |
thumper | wallyworld: the logging suite captures the logging | 03:34 |
thumper | without it, tests become noisy | 03:34 |
thumper | if someone decides to add logging somewhere | 03:34 |
thumper | that is in the testing path | 03:34 |
wallyworld | fair point | 03:34 |
thumper | that's all it really does | 03:34 |
wallyworld | i guess we should fix all existing test suites at some point then | 03:35 |
* thumper nods | 03:35 | |
wallyworld | thumper: if you still have any spare bandwidth left today, i've done fixes for those 2 mp's | 04:40 |
axw_ | wallyworld: when you have a moment, I've updated https://codereview.appspot.com/14527043/ | 04:48 |
wallyworld | sure | 04:48 |
axw_ | wallyworld: sync no longer resolves metadata, but "juju metadata generate-tools" will still | 04:48 |
wallyworld | axw_: i think everything should calc the sha etc, drop the option to allow it not to be done. the sha256 and size is absolutely needed for sync tools | 04:56 |
wallyworld | and generate metadata is typically run using local files so it can always be done for that as well | 04:57 |
axw_ | wallyworld: it's only for existing tools with no metadata - I thought the conclusion was that it would be okay after 1.16? | 04:57 |
wallyworld | after 1.16, there should be no metadata without size/checksum | 04:58 |
wallyworld | so if this mp is to go into trunk, then drop the resolve option altogether | 04:59 |
wallyworld | imo | 04:59 |
wallyworld | always just do the checksum/size | 04:59 |
axw_ | wallyworld: as in, behave as if the call were specified with fetch/resolve==true all the time? | 04:59 |
wallyworld | yeah | 05:00 |
axw_ | what's the point if there's no metadata without size/checksum? | 05:00 |
wallyworld | the fetch=true tells the command to read the tarball data to do the size/checksum when the metadata is generated | 05:00 |
wallyworld | and that's what we always want now | 05:00 |
wallyworld | since we don't want to produce metadata without size/checksum | 05:00 |
wallyworld | so fetch=false should be verboten | 05:01 |
wallyworld | make sense? or am i missing something? | 05:02 |
axw_ | wallyworld: with the change, metadata is still being generated with size/hash. It's populated when the tools are copied to storage | 05:02 |
axw_ | wallyworld: the only thing that's affected is tools that are in storage, but either don't have metadata, or have metadata without size or hash | 05:02 |
wallyworld | so - if i have some tarballs locally, and i just want to generate metadata json, and not copy the tarballs anywhere - that's what the generate-metadata command does - that should always happen with size/hash | 05:04 |
wallyworld | and even if the tarballs are not local, ie on a cloud, the same applies | 05:04 |
wallyworld | the generate-metadata command should always produce json with size/hash now | 05:04 |
axw_ | wallyworld: yes, it does and will continue to do so with this change | 05:04 |
axw_ | generate-tools only | 05:05 |
wallyworld | as of 1.16, there should be no metadata without size/hash | 05:05 |
wallyworld | so i'm not sure if your comment above holds? | 05:05 |
axw_ | right | 05:05 |
wallyworld | this one i mean | 05:05 |
wallyworld | [15:02:48] <axw_> wallyworld: the only thing that's affected is tools that are in storage, but either don't have metadata, or have metadata without size or hash | 05:05 |
axw_ | right, so my point is - the change won't break anything :) | 05:06 |
wallyworld | sure, but why cater for a forbidden scenario | 05:06 |
axw_ | as in, it only affects a scenario that won't occur | 05:06 |
wallyworld | it just complicates the code base | 05:06 |
axw_ | I'm explicitly not catering for it now | 05:06 |
wallyworld | but there's still the fetch option etc | 05:06 |
wallyworld | that is no longer needed | 05:07 |
wallyworld | fetch=true always | 05:07 |
axw_ | sorry, wallyworld are you talking just about the metadata plugin? | 05:07 |
axw_ | as in, get rid of the command line option and have *that* always fetch? | 05:07 |
wallyworld | i was just looking quite narrowly at the diff in the code review | 05:07 |
wallyworld | and saw the option to resolve or not still there | 05:07 |
axw_ | wallyworld: yeah, that's *only* in the plugin now. | 05:08 |
wallyworld | i do think we should always fetch, but we can do that as a separate mp | 05:08 |
axw_ | I can make it always do it | 05:08 |
wallyworld | sorry, my brain hadn't made the distinction of what was where when reading the diff | 05:08 |
axw_ | can I just confirm that it's okay *not* to resolve metadata for syncing? | 05:08 |
wallyworld | we do need to resolve for syncing | 05:09 |
axw_ | when I say resolve metadata, I mean fill in size/hash | 05:09 |
wallyworld | cause we may have new tools | 05:09 |
wallyworld | that need to be copied | 05:09 |
axw_ | wallyworld: heh, I mean for existing tools | 05:09 |
axw_ | sorry | 05:09 |
axw_ | not for newly copied ones | 05:09 |
axw_ | newly copied ones will always get it, there's no option to disable it | 05:10 |
wallyworld | ok, i think it's reasonable, in trunk, to assume existing tools will have size/hash | 05:10 |
wallyworld | agree? | 05:10 |
axw_ | okay, cool | 05:10 |
axw_ | yes | 05:10 |
wallyworld | by brain hurts :-) | 05:10 |
wallyworld | my | 05:10 |
axw_ | sorry :) | 05:10 |
wallyworld | not your fault | 05:10 |
axw_ | wallyworld: and I'll update the generate-tools command to always fetch | 05:10 |
wallyworld | ok, that would be great. i like leaving less legacy / tech-debt :-) | 05:11 |
wallyworld | thanks :-) | 05:11 |
axw_ | wallyworld: updated | 05:15 |
wallyworld | looking | 05:15 |
wallyworld | axw_: looks good, land that fucker :-) | 05:19 |
axw_ | sweet, thanks | 05:19 |
wallyworld | thank you for making it all work :-) | 05:19 |
axw_ | heh nps | 05:20 |
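"Fetching" in the exchange above means reading the tools tarball so its size and SHA-256 can be recorded in the metadata. A minimal sketch of that calculation, assuming only the Go standard library, is below. `sizeAndSHA256` and `hashBytes` are hypothetical names, not functions from juju-core.

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"fmt"
	"io"
)

// sizeAndSHA256 streams r to the end and returns the number of bytes
// read together with the hex-encoded SHA-256 digest of those bytes.
func sizeAndSHA256(r io.Reader) (int64, string, error) {
	h := sha256.New()
	n, err := io.Copy(h, r) // hashes while counting, without buffering the whole tarball
	if err != nil {
		return 0, "", err
	}
	return n, fmt.Sprintf("%x", h.Sum(nil)), nil
}

// hashBytes is a convenience wrapper for in-memory data.
func hashBytes(data []byte) (int64, string, error) {
	return sizeAndSHA256(bytes.NewReader(data))
}

func main() {
	// Placeholder bytes stand in for a real tools tarball.
	size, sum, err := hashBytes([]byte("abc"))
	fmt.Println(size, sum, err)
}
```

Because the reader is streamed, the same function works whether the tarball is a local file or a body being downloaded from storage.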
bigjools | pls to be reviewerating https://code.launchpad.net/~julian-edwards/juju-core/maas-uuid-file-prefix/+merge/191336 | 05:34 |
bigjools | sorry no Blofeld | 05:34 |
fwereade | bigjools, so how's the environment-uuid config field hooked up to the actual environment uuid? | 05:43 |
bigjools | fwereade: don't know the details, allenap did that already | 05:44 |
axw_ | fwereade: the UUID is allocated randomly, at prepare time | 05:46 |
axw_ | so... pointing at the same env requires sharing the UUID | 05:46 |
fwereade | axw_, bigjools: looks like that's not an environment UUID at all, it's just some made-up shit :/ | 05:50 |
axw_ | yeah | 05:50 |
* fwereade sighs deeply | 05:51 | |
fwereade | bigjools, your branch looks fine | 05:51 |
bigjools | fwereade: ok thanks | 05:51 |
bigjools | and wow are you working late or in a different TZ? | 05:52 |
fwereade | bigjools, early | 05:52 |
fwereade | bigjools, flying to the US later today | 05:52 |
bigjools | fwereade: it calls utils.NewUUID() in gavin's branch | 05:52 |
fwereade | bigjools, need to go and see laura for a bit though, might not be back | 05:52 |
bigjools | so what is the somemadeup-shit you're talking about? | 05:53 |
bigjools | hoho | 05:53 |
fwereade | bigjools, the problem is the overwhelming bugfuck insanity of naming that thing "environment-uuid" when we already have an "environment uuid" that is not at all the same thing | 05:53 |
fwereade | bigjools, how to write unmaintainable code vol 1 ch 1 page 1 | 05:53 |
fwereade | bigjools, but I cannot deal with this now, I might be back shortly | 05:54 |
bigjools | it's very easy to criticise | 05:54 |
bigjools | but at least it got done | 05:54 |
bigjools | so do you guys still need two +1s or can I land on one now? | 05:56 |
axw_ | bigjools: just one | 05:57 |
bigjools | thanks axw_ | 05:57 |
=== axw_ is now known as axw | ||
bigjools | axw: can you approve it please, I am not in the juju team so I can't do it | 05:59 |
axw | bigjools: sure | 05:59 |
bigjools | thank you sir | 05:59 |
fwereade | bigjools, ok, I did not express myself in a helpful way and I apologise for that | 06:42 |
fwereade | bigjools, but I think it really is a problem that some environments now have two UUIDs and there's no clear distinction between them | 06:42 |
fwereade | bigjools, would it be possible to do a quick branch that just s/environment-uuid/maas-agent-name/ and eliminates this source of confusion? | 06:43 |
bigjools | fwereade: I'm not sure where Gavin received his advice from, but I believe it was mostly under the direction of someone in the core team and that whoever it was had a plan to resolve this | 06:43 |
fwereade | bigjools, yeah, I just read the review :( | 06:44 |
bigjools | sadly this is what happens when stuff needs to go in quickly before a release | 06:45 |
fwereade | bigjools, yeah, I would kinda like to figure out how the api-key fiction got created and then propagated so widely in the first place | 06:47 |
fwereade | bigjools, it never even crossed my mind that it was completely made up | 06:47 |
fwereade | bigjools, because it's persisted all the way through back from python days | 06:47 |
fwereade | bigjools, and we never even had a maas environment to check against for such a long time | 06:48 |
rogpeppe | mornin' all | 07:04 |
rogpeppe | fwereade: the environment-uuid thing is all my fault | 07:04 |
rogpeppe | fwereade: i don't really see what harm it can cause tbh | 07:05 |
fwereade | rogpeppe, heyhey, I saw the review, and I think I see the reasoning... but ISTM that now we have two "environment uuids" for maas environments, and I don't see how we're ever going to be able to pull them back together | 07:05 |
rogpeppe | fwereade: they don't join up | 07:05 |
rogpeppe | fwereade: the environment-uuid in the config doesn't make anywhere else, does it? | 07:06 |
fwereade | rogpeppe, then why do they have the same name? it looked like it was justified on the strength of being step 1 towards picking one at prepare time rather than bootstrap time | 07:06 |
fwereade | rogpeppe, which would be great, if we did it | 07:07 |
fwereade | rogpeppe, but now we have an environment config with one value, used by some parts of the system, and an environment doc with another used by different parts of the system | 07:07 |
fwereade | rogpeppe, and to imagine that never the twain shall meet strikes me as... optimistic | 07:07 |
rogpeppe | fwereade: well, currently maas has a private attribute called environment-uuid; the environment uuid in state doesn't come from or go into the config | 07:08 |
rogpeppe | fwereade: given that state.Initialize takes an environ config, we can easily change that at a later stage to put the environ-uuid from that into the current uuid doc | 07:09 |
rogpeppe | fwereade: and likewise we can easily change environs.Prepare to create it | 07:09 |
rogpeppe | fwereade: and when we do that, i *think* everything will just work, and the maas environ-uuid will then join up with the state uuid | 07:10 |
rogpeppe | wallyworld_: after sleeping on it, i *think* i know what's going on with the maas EOF bug | 07:11 |
fwereade | rogpeppe, that's fine for new environments, but existing environments will need to keep both around | 07:11 |
rogpeppe | fwereade: is that a problem? | 07:12 |
fwereade | rogpeppe, I think so, yes, because there is no longer a singular concept of environment uuid | 07:12 |
fwereade | rogpeppe, and I don't see how an existing environment can ever be brought in line | 07:13 |
rogpeppe | fwereade: is that a problem? | 07:13 |
fwereade | rogpeppe, well, yes, because an environment uuid is the only thing we have for globally identifying an environment | 07:13 |
fwereade | rogpeppe, and the last thing I want is to have to respond to bug reports by saying "ah, yes, it doesn't work because you should have used the *other* environment uuid" | 07:14 |
rogpeppe | fwereade: is it any worse than if maas created a new attribute, for example maas-machine-identifier ? | 07:14 |
fwereade | rogpeppe, yes, I think it is much worse | 07:15 |
fwereade | rogpeppe, a new identifier would have been great | 07:15 |
fwereade | rogpeppe, I thought I even saw you advocating that yesterday morning as I rushed by, and I thought "ah cool, everything's under control" | 07:15 |
rogpeppe | fwereade: i advocated one or the other | 07:15 |
rogpeppe | fwereade: i quite liked the idea of just using environment-uuid, because i *don't* think there's a great problem currently - the maas attribute is not really visible to the user | 07:16 |
fwereade | rogpeppe, you think nobody looking at the environ config is going to be fooled? | 07:17 |
fwereade | rogpeppe, the environ config is most certainly visible | 07:18 |
fwereade | rogpeppe, it's *more* visible to the user than the one in the environ doc | 07:18 |
rogpeppe | fwereade: i actually think that fixing it properly is going to be quite a small change. | 07:18 |
fwereade | rogpeppe, what do we do about all the environments that have two uuids then? | 07:19 |
rogpeppe | fwereade: we just need to change environs/config to add UUID, change environs.Prepare to create it and change state.Initialize to use it | 07:19 |
fwereade | rogpeppe, apart from the fact that we have to carry code FOREVER to handle the fact that sometimes they're different | 07:19 |
rogpeppe | fwereade: really? | 07:19 |
rogpeppe | fwereade: what code would we need? | 07:20 |
fwereade | rogpeppe, code to figure out which one is "meant" at any given time | 07:20 |
fwereade | rogpeppe, as it is today we will be starting envs with two uuids | 07:20 |
fwereade | rogpeppe, both of which are exposed to external systems | 07:20 |
rogpeppe | fwereade: the other side of the coin is that in the future, we *would* like maas to use the environ uuid to tag its machines | 07:21 |
fwereade | rogpeppe, and which we therefore cannot change | 07:21 |
rogpeppe | fwereade: and if we don't make it use environ-uuid, it will forever use some other identifier | 07:21 |
rogpeppe | fwereade: well, some other attribute anyway | 07:22 |
rogpeppe | fwereade: because it could still take its value from environ-uuid | 07:22 |
fwereade | rogpeppe, yeah, that would be nice, we would be able to derive the differently-named attribute from the real uuid if a legacy one were not already set | 07:22 |
fwereade | rogpeppe, bigjools: is there *any* way we can get this fixed without releasing in this state? | 07:23 |
rogpeppe | fwereade: well, it's just a naming issue right? | 07:23 |
rogpeppe | fwereade: so we just need to change the name | 07:24 |
fwereade | rogpeppe, yeah, but I am out of the loop and have no idea what timelines etc are in play | 07:24 |
bigjools | fwereade: we are at the mercy of the release managers in ubuntu | 07:25 |
fwereade | rogpeppe, if you can fix it, or ask someone else to, in time to not release with it in place, please please do so... but I have about half an hour to get up, pack, and catch a taxi to the airport | 07:26 |
bigjools | this is a major flaw in juju and maas and really needs to at least be a zero-day fix | 07:26 |
bigjools | so there is time to change it I think | 07:26 |
rogpeppe | fwereade: ok. how about i just fix it properly? i *think* it's quite a small change, though i may be wrong | 07:26 |
fwereade | rogpeppe, if you were to use environment-uuid in InitializeState that would be fine with me too | 07:27 |
bigjools | but one of you needs to do it AFAIC because my engineers have done enough already | 07:27 |
rogpeppe | fwereade: i'll give it a go | 07:27 |
fwereade | rogpeppe, can you do that please? and coordinate with jamespage I guess? tyvm | 07:27 |
rogpeppe | fwereade: i know what's going on with the MAAS bootstrap EOF bug BTW, i'm pretty sure | 07:28 |
rogpeppe | fwereade: it's a very interesting conjunction of issues | 07:28 |
wallyworld_ | rogpeppe: hi | 07:40 |
rogpeppe | wallyworld_: hiya | 07:40 |
wallyworld_ | sorry, i was out getting my prescription filled before i go away | 07:41 |
rogpeppe | pwd | 07:41 |
wallyworld_ | rogpeppe: a reboot of the server fixed everything | 07:41 |
rogpeppe | wallyworld_: of the MAAS server? | 07:41 |
wallyworld_ | yep | 07:41 |
wallyworld_ | i think juju's http is flawed | 07:41 |
rogpeppe | wallyworld_: i don't believe the problem is fixed | 07:41 |
wallyworld_ | it should cope with disappearing connections | 07:41 |
wallyworld_ | any networking stack needs to be robust | 07:42 |
wallyworld_ | to connections going away | 07:42 |
rogpeppe | wallyworld_: i think the real problem is an underlying problem with the http protocol itself | 07:42 |
wallyworld_ | sure, but the http lib needs to hide that | 07:42 |
rogpeppe | wallyworld_: i'm not entirely sure whether it's possible | 07:42 |
wallyworld_ | http libs from python et al do | 07:42 |
rogpeppe | wallyworld_: i wonder how they cope with this race: | 07:43 |
rogpeppe | wallyworld_: you use an existing connection and send a request, but the remote end drops the connection before it reads your request | 07:43 |
frankban | hi juju devs: is it safe to use ~/.juju/current-environment as a reliable way to retrieve the current default env name? or should we just consider it an internal detail? | 07:43 |
rogpeppe | wallyworld_: then it looks like you're getting EOF in response to your request | 07:43 |
bigjools | why is a request data object dealing with protocols? | 07:44 |
rogpeppe | bigjools: ? | 07:44 |
bigjools | request has a Close on it | 07:44 |
bigjools | seems odd | 07:44 |
rogpeppe | bigjools: it's an http header | 07:44 |
wallyworld_ | frankban: the value in that file can be overridden by JUJU_ENV i think | 07:44 |
wallyworld_ | frankban: so i would not rely on it | 07:45 |
rogpeppe | wallyworld_: when the above scenario happens, should the http client resend the http request on a new connection (possibly duplicating side-effects) or just return the error? | 07:45 |
wallyworld_ | not sure. i'd like to know how other libs handle it | 07:46 |
rogpeppe | wallyworld_: me too | 07:46 |
frankban | wallyworld_: sure, I am trying to implement this logic: if JUJU_ENV is set, use it, otherwise, retrieve the default env as set by "juju switch". So my question is: how to reliably grab that value in the second code path? | 07:46 |
wallyworld_ | but i've never seen this sort of behaviour elsewhere | 07:46 |
rogpeppe | wallyworld_: the thing is, it's usually a race with a very narrow window | 07:47 |
rogpeppe | wallyworld_: but in this case, an unfortunate set of circumstances conspire to make it happen every time | 07:47 |
bigjools | rogpeppe: in that case I'd expect the transport to deal with headers that affect its operation | 07:47 |
wallyworld_ | frankban: what if juju switch has not been called yet? | 07:47 |
rogpeppe | bigjools: where should the user be able to tell the http package whether connections should be reused or not? | 07:48 |
wallyworld_ | not on the request object that is for sure :-) | 07:48 |
frankban | wallyworld_: it's ok, we tried and failed, and we have no default value. | 07:48 |
frankban | wallyworld_: the last chance could be looking for environments.yaml[default] actually | 07:49 |
wallyworld_ | frankban: is this a python script or something? | 07:49 |
frankban | wallyworld_: yes it is | 07:49 |
rogpeppe | wallyworld_: the reason (i'm pretty sure, though i haven't had time this morning to verify) why we were seeing the problem every time, is that just before we send the request that fails, we do some very cpu-intensive operations for more than 5 seconds | 07:50 |
wallyworld_ | frankban: so i think the order juju-core checks is: juju_env, juju switch file, env.yaml | 07:50 |
wallyworld_ | frankban: so if you do that, you should be ok | 07:51 |
frankban | wallyworld_: so, in order: JUJU_ENV -> juju switch -> environments.yaml[default] -> error "please specify an env name". | 07:51 |
wallyworld_ | frankban: i think so | 07:51 |
rogpeppe | wallyworld_: and that meant that the goroutine that usually sees the remote connection being dropped was not being scheduled in that time | 07:51 |
frankban | heh | 07:51 |
bigjools | rogpeppe: I'd have a higher level function on the transport rather than exposing protocol details on a request object | 07:51 |
rogpeppe | bigjools: the transport is actually lower level here, no? | 07:51 |
rogpeppe | bigjools: and most http clients don't see it | 07:52 |
bigjools | rogpeppe: not in that sense, I mean a function on the transport to say whether to do it or not. manipulating headers is low-level | 07:52 |
fwereade | rogpeppe, hey, change of heart -- please *don't* use environment-uuid, just change the name to something maas-specific | 07:52 |
frankban | wallyworld_: yes my question is about the "juju switch" part: parsing the output seems fragile, and I was wondering if ~/.juju/current-environment is considered an internal detail. anyway, implementing something like "juju switch --format json" could be a good idea | 07:52 |
fwereade | rogpeppe, I'm not convinced we have properly thought through the issues with setting it early | 07:52 |
fwereade | rogpeppe, and I don't want maas/juju collisions | 07:52 |
rogpeppe | fwereade: really? | 07:52 |
bigjools | fwereade: I chatted to wallyworld_ about this earlier and we concluded that its akin to a private bucket name | 07:53 |
fwereade | rogpeppe, really really | 07:53 |
rogpeppe | fwereade: ok - i've almost done it, BTW | 07:53 |
wallyworld_ | frankban: i just checked. the checks are done in the order specified | 07:53 |
fwereade | rogpeppe, just call it maas-agent-name or something | 07:53 |
jamespage | fwereade, rogpeppe: what do I need to know about? | 07:53 |
fwereade | rogpeppe, sorry, but I just want to avoid adding more little threads connecting different bits | 07:53 |
fwereade | rogpeppe, that at least betrays its actual usage | 07:54 |
fwereade | rogpeppe, then, as a followup, when we have done early-set-uuid properly | 07:54 |
fwereade | rogpeppe, we can make maas-agent-name derive therefrom if unset | 07:54 |
wallyworld_ | frankban: the current-environment file just has a single line with the env name. ideally i agree about the output bit you suggest. but i can't see it changing | 07:54 |
rogpeppe | fwereade: actually, we shouldn't allow maas-agent-name to be set explicitly | 07:54 |
fwereade | rogpeppe, it will already be set explicitly in environments we upgrade, ok? | 07:55 |
frankban | wallyworld_: ok, so it's ok to use the current-environment file, correct? | 07:55 |
rogpeppe | fwereade: i don't *think* so | 07:55 |
rogpeppe | fwereade: i think the only time it could be explicitly set is if someone specifies it in their environments.yaml | 07:56 |
fwereade | rogpeppe, it *will* because we will set it *now* and it will need to persist in the env | 07:56 |
wallyworld_ | frankban: well, given there's nothing else, then yes. but i think we need a command to print the current env name to hide the details | 07:56 |
fwereade | rogpeppe, when we upgrade those subsequently we will need to deal with it | 07:56 |
fwereade | rogpeppe, I don't much care how we react to its presence in Prepare, cutting it off there doesn't seem crazy | 07:56 |
wallyworld_ | frankban: we can whip something up next week at the sprint perhaps | 07:56 |
axw | wallyworld_, frankban "juju switch" shows you the current env | 07:57 |
rogpeppe | fwereade: i think that's fine - maas will just always use maas-agent-name | 07:57 |
rogpeppe | fwereade: but at Prepare time, it can derive maas-agent-name from environ-uuid | 07:57 |
wallyworld_ | frankban: oh, we already do it it seems | 07:57 |
wallyworld_ | that i didn't realise, sorry for the noise | 07:57 |
frankban | axw, wallyworld_: yes "juju switch" is already there, but it seems to me fragile to parse the output, that's why I was suggesting something like "juju switch --format json" | 07:58 |
axw | frankban: ah sorry, I missed that | 07:58 |
frankban | axw: the current output is: Current environment: "ec2" | 07:58 |
fwereade | rogpeppe, agreed I think | 07:59 |
wallyworld_ | frankban: almost json :-) | 07:59 |
rogpeppe | fwereade: it does mean that we'll need the maas-agent-name attribute indefinitely in the future, which is what i was hoping to avoid | 07:59 |
axw | frankban: agreed, I think we should have a machine readable output mode | 07:59 |
wallyworld_ | remove the space, add {} | 07:59 |
fwereade | rogpeppe, just keep environment-uuid out of there for now, I fear the tentacles' reach | 07:59 |
wallyworld_ | axw: i agree too | 07:59 |
rogpeppe | fwereade: but i see your point about having two things called "environment-uuid" being confusing too | 07:59 |
frankban | wallyworld_, axw: do you want me to file a bug about "juju switch --format machine-readable"? | 07:59 |
* fwereade has to go right now, thanks guys | 08:00 | |
wallyworld_ | yes | 08:00 |
rogpeppe | fwereade: safe journeys | 08:00 |
axw | later fwereade, see you next week | 08:00 |
fwereade | cheers | 08:00 |
axw | frankban: yes please | 08:00 |
* fwereade has to shut down an env before he flies actually | 08:00 | |
rogpeppe | axw: do you know much about the http protocol? | 08:00 |
axw | rogpeppe: I know enough to be dangerous, but maybe not intimately enough... why? | 08:01 |
rogpeppe | axw: just wondering how most http clients deal with the inherent race involved in reusing connections when the client might drop the connection at any moment | 08:01 |
TheMue | morning | 08:02 |
frankban | axw, wallyworld_: it seems there is one already: bug 1193244 | 08:02 |
_mup_ | Bug #1193244: juju env could be friendlier to scripts <improvement> <juju-core:In Progress by themue> <https://launchpad.net/bugs/1193244> | 08:02 |
axw | cool | 08:02 |
axw | good morning TheMue | 08:02 |
wallyworld_ | frankban: we'll fix that for sure | 08:02 |
TheMue | axw: oh, morning, came in at the right moment? | 08:02 |
axw | heh :) | 08:02 |
frankban | wallyworld_: great, thanks | 08:03 |
wallyworld_ | frankban: well, TheMue will :-) | 08:03 |
frankban | :-) | 08:03 |
TheMue | wallyworld_: yeah | 08:03 |
axw | rogpeppe: sorry, I know the protocol well enough, but not how most clients work in that regard | 08:03 |
TheMue | frankban: seen my last comment there? regarding the way to control the output? | 08:03 |
frankban | TheMue: --raw sounds good | 08:04 |
rogpeppe | axw: we came across an issue that triggered that race reliably | 08:04 |
wallyworld_ | rogpeppe: isn't it the server that drops the connection rather than the client? | 08:05 |
rogpeppe | axw: so we'd see the connection drop at *almost exactly* the same moment we make a request | 08:05 |
rogpeppe | wallyworld_: yes, but the Go client wasn't *seeing* the connection being dropped because it was busy doing other things | 08:06 |
wallyworld_ | rogpeppe: sure, but i was referring to your comment above that the client dropped the connection | 08:06 |
wallyworld_ | just clarifying | 08:06 |
rogpeppe | wallyworld_: ha, yes | 08:06 |
rogpeppe | wallyworld_: i meant server there | 08:06 |
wallyworld_ | :-) | 08:07 |
TheMue | frankban: fine, then I will do it that way | 08:07 |
frankban | TheMue: thanks! | 08:08 |
axw | rogpeppe: tbh, this seems like something the Go stdlib should handle. | 08:08 |
axw | it's suggested that you reuse clients, and that it's safe to do so | 08:09 |
rogpeppe | axw: agreed. but i'm not quite sure how it can do so reliably | 08:09 |
wallyworld_ | axw: that's what me and bigjools think too | 08:09 |
rogpeppe | axw: istm that this is an inherent race in the http protocol | 08:09 |
rogpeppe | axw: and i'm not sure how it can be dealt with other than just trying to reduce the window for the race | 08:10 |
axw | hmmm | 08:11 |
dimitern | morning all | 08:12 |
rogpeppe | axw: (the window could certainly be smaller in the Go http client) | 08:12 |
rogpeppe | dimitern: yo! | 08:12 |
axw | morning dimitern | 08:12 |
rogpeppe | axw: actually, on reading of the rfc 2616, it looks like the Go http client is just wrong here | 08:16 |
rogpeppe | " | 08:16 |
rogpeppe | Client software SHOULD reopen the | 08:16 |
rogpeppe | transport connection and retransmit the aborted sequence of requests | 08:16 |
rogpeppe | without user interaction so long as the request sequence is | 08:16 |
rogpeppe | idempotent (see section 9.1.2). | 08:16 |
rogpeppe | " | 08:16 |
axw | yeah, but "so long as the request sequence is idempotent" | 08:16 |
axw | I was thinking that, but how do you guarantee idempotency? | 08:16 |
axw | that's an application level thing | 08:17 |
rogpeppe | axw: yeah, but it defines GET, HEAD, PUT and DELETE as being idempotent | 08:17 |
rogpeppe | axw: i guess that means you still have a potential problem for POST | 08:17 |
* axw wonders how many applications are not idempotent for those methods ;) | 08:17 | |
rogpeppe | axw: http://tools.ietf.org/html/rfc2616#section-9.1.2 | 08:17 |
rogpeppe | axw: i wonder that too | 08:17 |
axw | rogpeppe: anyway, "wrong" is maybe too harsh for not implementing an RFC "SHOULD" | 08:18 |
axw | but it would certainly be useful for an option at least | 08:18 |
rogpeppe | axw: yeah | 08:19 |
TheMue | dimitern: morning, how has PyCon been? | 08:19 |
dimitern | TheMue, ah, it was interesting and not too long :) | 08:20 |
TheMue | dimitern: I somehow left Python a few years ago, so would almost have to relearn it. ;) | 08:21 |
rogpeppe | axw, wallyworld_: there's an outstanding Go issue for this actually: https://code.google.com/p/go/issues/detail?id=4677 | 08:22 |
axw | rogpeppe: heh, nice, same conclusion :) | 08:22 |
jamespage | fwereade, rogpeppe: what do I need to know about? (hint: release of saucy is tomorrow - if I need to upload it needs to be today otherwise I'm in SRU territory) | 08:22 |
wallyworld_ | rogpeppe: fat lot of good that is - it's not going to be fixed | 08:23 |
rogpeppe | jamespage: i need to make a small naming fix to the maas provider | 08:23 |
rogpeppe | wallyworld_: it's not going to be fixed for 1.2, yeah | 08:23 |
wallyworld_ | so go's http is broken and there's nothing we can do about it. oh joy | 08:23 |
wallyworld_ | wtf | 08:23 |
rogpeppe | wallyworld_: we can set the Close field | 08:23 |
wallyworld_ | and remove connection reuse? | 08:24 |
rogpeppe | wallyworld_: or change the transport to allow no idle connections | 08:24 |
wallyworld_ | why oh why does Go get so much so wrong | 08:24 |
rogpeppe | wallyworld_: yeah | 08:24 |
wallyworld_ | version control, proper http stack etc etc | 08:24 |
wallyworld_ | it's not like comp sci is a new field of science | 08:25 |
* wallyworld_ is very frustrated | 08:25 | |
axw | I like this comment from the Chromium dev: "* If you pipeline requests and get a transport error, we pray that HEADs and GETs are actually idempotent and retry." | 08:25 |
axw | wallyworld_: I'd actually prefer that this not be enabled by default, because it's generally not safe to assume GETs are idempotent | 08:26 |
bigjools | don't fret wallyworld_, you get the pleasure of my company on Saturday for 15 hours | 08:26 |
axw | opt-in would be good | 08:26 |
wallyworld_ | oh joy | 08:26 |
wallyworld_ | axw: sure, but we don't know if it's needed or not at any time | 08:26 |
bigjools | GET is supposed to be idempotent, are there really crackful websites that are not? | 08:27 |
wallyworld_ | +1 to that | 08:27 |
axw | bigjools: supposed to be, sure, but people do all sorts of wrong things in their web applications | 08:27 |
bigjools | then that's their tough titty | 08:27 |
wallyworld_ | axw: that's no excuse for not assuming the spec holds | 08:28 |
axw | not to mention HTTP servers and proxies that don't follow protocols | 08:28 |
bigjools | tools should not behave as stupidly | 08:28 |
wallyworld_ | no excuse | 08:28 |
wallyworld_ | the wrong implementations should get fixed | 08:28 |
wallyworld_ | not the clients | 08:28 |
bigjools | just look at the havoc IE6 created | 08:28 |
wallyworld_ | or else the problem will never be fixed | 08:28 |
wallyworld_ | ie6 is a great example | 08:28 |
wallyworld_ | axw: rogpeppe: so you'd think a robust http lib would be the cornerstone requirement of any new language. and yet they say not fixed for 1.2????? what else could be more important | 08:30 |
wallyworld_ | TLS is another example | 08:30 |
wallyworld_ | they refused to accept the need for it - so we are forced to fork | 08:31 |
rogpeppe | wallyworld_: i trust Adam Langley when he says that tls renegotiation is badly broken | 08:31 |
wallyworld_ | well it works for us | 08:32 |
wallyworld_ | or if broken, just fix it already | 08:32 |
wallyworld_ | and give us a feature complete http stack | 08:32 |
wallyworld_ | not like it's important or anything | 08:32 |
rogpeppe | wallyworld_: there's always a tension when trying to write clean software that's dealing with crappy standards | 08:33 |
wallyworld_ | sure, but it's not like Python, C++ etc etc etc etc weren't around to learn from | 08:33 |
wallyworld_ | solved problems | 08:33 |
wallyworld_ | meanwhile Juju breaks and we deal with the fallout | 08:34 |
wallyworld_ | customers don't care that Go is deficient, they blame Juju and us | 08:34 |
wallyworld_ | and we stuff Juju full of all of these hacks and workarounds to paper over Go's cracks | 08:34 |
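
The retry behaviour debated above (RFC 2616 says a client SHOULD retransmit an aborted idempotent request; axw would rather have it opt-in) can be sketched roughly as follows. This is a minimal illustration in Python, not the Go `net/http` implementation or anything juju does: `send_with_retry`, its `send` callable, and the `retry_idempotent` flag are hypothetical names invented for the example.

```python
from typing import Callable

# Methods RFC 2616 section 9.1.2 treats as idempotent. GET, HEAD, PUT and
# DELETE are the ones named in the discussion; the RFC also lists OPTIONS
# and TRACE. POST is deliberately absent.
IDEMPOTENT_METHODS = frozenset({"GET", "HEAD", "PUT", "DELETE", "OPTIONS", "TRACE"})


def send_with_retry(send: Callable[[str, str], str], method: str, url: str,
                    retry_idempotent: bool = False, max_retries: int = 1) -> str:
    """Call send(method, url); on a dropped connection, retry only if allowed.

    `send` is a hypothetical stand-in for whatever performs the request on a
    (possibly reused) connection and raises ConnectionError when the server
    has closed that connection underneath the client.
    """
    attempts = 0
    while True:
        try:
            return send(method, url)
        except ConnectionError:
            can_retry = (retry_idempotent
                         and method.upper() in IDEMPOTENT_METHODS
                         and attempts < max_retries)
            if not can_retry:
                raise
            # A real client would open a fresh connection before resending.
            attempts += 1
```

Retrying is off by default here, matching the opt-in preference voiced in the discussion; a POST is never resent, which is the remaining problem case rogpeppe points out.
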
jamespage | rogpeppe, bug reference? | 09:10 |
jamespage | I'd like to raise ubuntu tasks now so they get noticed in the right places | 09:10 |
rogpeppe | jamespage: i'll just file a bug for it | 09:10 |
rogpeppe | jamespage: https://bugs.launchpad.net/juju-core/+bug/1240423 | 09:13 |
_mup_ | Bug #1240423: provider/maas: environment-uuid is the wrong name to use for the machine disambiguation tag <juju-core:New> <https://launchpad.net/bugs/1240423> | 09:13 |
jamespage | rogpeppe, is that linked to bug 1229275 | 10:02 |
_mup_ | Bug #1229275: [maas] juju destroy-environment also destroys nodes that are not controlled by juju <maas> <theme-oil> <juju:Triaged> <juju-core:In Progress by thumper> <maas (Ubuntu):Triaged> <https://launchpad.net/bugs/1229275> | 10:02 |
rogpeppe | jamespage: yes | 10:03 |
jamespage | rogpeppe, so I'm right in saying the juju-core 1.16.0 with MAAS in Saucy will exhibit this problem? | 10:04 |
rogpeppe | jamespage: the above bug? yes, i believe so, unless https://code.launchpad.net/~allenap/juju-core/maas-environment-uuid/+merge/191146 has landed in 1.16 yet | 10:05 |
jamespage | rogpeppe, so I'm going to need that as a fix for 1.16.0 as well | 10:19 |
rogpeppe | jamespage: yes | 10:20 |
jamespage | :-) | 10:20 |
rogpeppe | jamespage: i hope to provide one in the next hour or so | 10:20 |
jamespage | rogpeppe, lovely - thanks! | 10:21 |
rogpeppe | jamespage: i've changed the code - i just need to test in various places | 10:21 |
rogpeppe | allenap: i'd appreciate a review of this please: https://codereview.appspot.com/14741045/ | 10:45 |
dimitern | rogpeppe, mgz, TheMue, others - standup | 10:46 |
rogpeppe | dimitern, mgz, TheMue: ^ (this urgently needs to go in BTW) | 10:46 |
TheMue | oh | 10:47 |
rogpeppe | bigjools: ^ | 10:48 |
natefinch | rogpeppe: the standup link seems not to be working? | 10:48 |
rogpeppe | natefinch: it works for me | 10:48 |
rogpeppe | natefinch: https://plus.google.com/hangouts/_/calendar/am9obi5tZWluZWxAY2Fub25pY2FsLmNvbQ.mf0d8r5pfb44m16v9b2n5i29ig?authuser=1 | 10:48 |
* TheMue => lunch | 11:35 | |
rogpeppe | mgz, natefinch, dimitern, TheMue: PLEASE could someone review this? https://codereview.appspot.com/14741045/ | 12:19 |
natefinch | rogpeppe: it's already open in my browser :)\ | 12:19 |
rogpeppe | natefinch: thanks | 12:19 |
mgz | rogpeppe: whoops, didn't hit publish | 12:20 |
mgz | rogpeppe: HIT IT. | 12:21 |
mgz | whoops caps, but kinda funny. | 12:22 |
natefinch | rofl | 12:22 |
rogpeppe | mgz: i thought that was deliberate | 12:23 |
natefinch | rogpeppe: me too | 12:23 |
rogpeppe | anyone around here know about the new maas agent_name semantics? | 12:23 |
rogpeppe | mgz: i've just realised that the branch might be wrong | 12:23 |
mgz | hm, in what way? | 12:24 |
natefinch | mgz: I couldn't help but hear it like this: http://www.youtube.com/watch?v=erbL_BxITHw | 12:24 |
rogpeppe | mgz: because it still filters by agent_name even when there's no maas-instance-uuid | 12:24 |
mgz | I didn't closely review the first branch, so may be missing some subtleties | 12:24 |
rogpeppe | mgz: i don't know what the agent_name filtering semantics are though | 12:24 |
rogpeppe | allenap: ping | 12:25 |
mgz | the filtering is all our code, no? | 12:27 |
mgz | yeah, the filter is just our standard stuff | 12:27 |
rogpeppe | mgz: um, it looks like the filter is passed directly to the maas API GET request | 12:28 |
mgz | which does also raise the back-compat question... | 12:28 |
mgz | rogpeppe: hm, I see what you mean | 12:28 |
rogpeppe | mgz: so i don't know what will happen if you pass an agent_name="" filter | 12:29 |
rogpeppe | mgz: it *might* match all instances, or it might not. | 12:29 |
rvba | rogpeppe: it will match all instances | 12:29 |
mgz | well, that's not something you've changed in this branch | 12:29 |
rogpeppe | mgz: no it isn't | 12:29 |
rogpeppe | rvba: ok, that's good | 12:29 |
rvba | It is good indeed :). | 12:30 |
rogpeppe | rvba: and there's no problem having a blank agent name in the acquire params? | 12:30 |
rvba | rogpeppe: that's fine too (I was sure of it but I still tested it this morning.) | 12:30 |
rvba | I mean, what I tested was a version of juju using agent_name with a version of MAAS which doesn't know about it. | 12:31 |
rvba | not exactly what you asked | 12:31 |
rogpeppe | rvba: yeah, true | 12:32 |
rvba | But using a blank agent name is the same as not providing one. | 12:33 |
rogpeppe | rvba: it would be nice to make sure. or, alternatively, a less risky strategy is just to lose the agent_name params if the agent name is blank | 12:33 |
rogpeppe | rvba: so you can't filter instances that have blank agent names? | 12:33 |
rvba | Like I said, as far as MAAS is concerned, this is the same. | 12:33 |
rvba | No. | 12:33 |
rvba | err | 12:34 |
rvba | Yes you can do that. But you have to know that a blank agent name is the default. | 12:34 |
rogpeppe | rvba: ah, so using a blank agent name isn't *quite* the same as not providing one, then? | 12:35 |
rvba | It is exactly the same. | 12:35 |
rogpeppe | rvba: so how would you distinguish between a) asking for all instances regardless of agent_name and b) asking for any instances which happen to have a blank agent_name ? | 12:36 |
rvba | rogpeppe: no, I was wrong, when listing instances, it's not the same. | 12:37 |
rvba | Sorry for the confusion. | 12:37 |
rogpeppe | rvba: ok, cool. i actually think that's better for our purposes | 12:37 |
rogpeppe | rvba: as it means that existing maas juju deployments won't see new environments | 12:38 |
rogpeppe | rvba: as long as they've been upgraded | 12:38 |
rvba | Right. | 12:38 |
rvba | But you can have only one of these "old" environments. | 12:38 |
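
The distinction rvba settles on above (when listing, a blank agent name is a real filter value and is not the same as passing no filter) can be modelled with a small in-memory sketch. This is not the MAAS API: `list_nodes` and the node dictionaries are hypothetical, and the only behaviour taken from the log is that a blank `agent_name` is the default for nodes that never had one set.

```python
from typing import Dict, List, Optional


def list_nodes(nodes: List[Dict[str, str]],
               agent_name: Optional[str] = None) -> List[Dict[str, str]]:
    """Return nodes, optionally restricted to one agent name.

    agent_name=None means "no filter": every node is returned.
    agent_name=""   means "only nodes whose agent name is blank", which is
    the default for nodes allocated without an agent name.
    """
    if agent_name is None:
        return list(nodes)
    return [n for n in nodes if n.get("agent_name", "") == agent_name]
```

Under this model, an upgraded pre-existing environment that filters on the blank name sees only its own untagged nodes and not those of newly bootstrapped environments, which is also why only one such "old" environment can coexist on a given MAAS.
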
TheMue | frankban: ping | 12:46 |
rogpeppe | rvba: yeah - that's a current restriction though | 12:46 |
frankban | TheMue: pong | 12:46 |
rogpeppe | rvba: i'd appreciate it a lot if you could take a glance through this before i merge: https://codereview.appspot.com/14741045 | 12:46 |
TheMue | frankban: just seen in the channel log that you talked about a json output of env/switch | 12:47 |
rogpeppe | mgz: i've renamed maas-instance-uuid to maas-agent-name throughout as it seemed to make more sense once i read through a bit more of the code | 12:47 |
TheMue | frankban: I understood raw as being simply the env name(s) | 12:47 |
TheMue | frankban: in case of json I would prefer a different flag than --raw | 12:48 |
TheMue | frankban: what do you say? | 12:48 |
frankban | TheMue: since "juju switch" will always return just a single string, --raw seems reasonable. I was thinking about --format just for symmetricity with "juju api-endpoints", but I don't think that's required | 12:52 |
frankban | TheMue: what is --raw supposed to return if no default env is set? just an empty string? | 12:53 |
TheMue | frankban: it is <not specified>, at least today | 12:54 |
rvba | rogpeppe: looking now | 12:54 |
rogpeppe | rvba: thanks | 12:54 |
frankban | TheMue: hum... perhaps an empty string is better? I guess spaces and <> are not allowed in env names, so that could be ok, but a user still has to parse the return value, or just compare with "<not specified>". | 12:57 |
TheMue | frankban: I prefer the "<not specified>" as it is more explicit than just an empty string | 12:59 |
frankban | TheMue: the retcode is in both cases, right? | 13:00 |
frankban | TheMue: is 0 I mean | 13:00 |
TheMue | frankban: yep | 13:01 |
frankban | TheMue: if you are ok with users comparing against "<not specified>" then it's all good. But then we must ensure that string will not change in the future. On the other hand, if --raw is the machine friendly command, I am not sure it has to be explicit for humans, but as said, I am +1 on your plan | 13:06 |
natefinch | frankban: when would a script ever need to check the output of juju switch? Couldn't it just always do juju switch <foo> or -e <foo> each time? Not saying no one will ever do it, it just seems like an unnecessary step | 13:08 |
rogpeppe | hmm, bzr's "merge specific revisions" logic seems not to work very well | 13:09 |
rogpeppe | natefinch: i agree | 13:09 |
abentley | rogpeppe: how do you mean? | 13:10 |
rogpeppe | abentley: i just had some conflicts with some very weird diffs in | 13:11 |
abentley | rogpeppe: Well, if you think it's bzr's fault, give me steps to reproduce, and I can have a look. I did write most of that code. | 13:12 |
frankban | natefinch: assume you have a script that needs to bootstrap an environment. that script can either 1) force the user to pass a "-e" parameter or 2) make that parameter optional. In the second case, we still want to ensure an environment is ready to be bootstrapped, and we might also want to grab the env name. So we check JUJU_ENV, then we ask to juju switch, and finally we look inside environments.yaml[default] | 13:12 |
rogpeppe | abentley: here's an example: http://paste.ubuntu.com/6245564/ | 13:13 |
abentley | natefinch: A script would need to check the output of "juju switch" in order to determine what the current environment is. For example, I have a script that runs "nova" using the credentials of the current juju environment. | 13:14 |
rogpeppe | abentley: note the line sitting in the middle of nowhere - it actually came from a test function in the merge-source that was almost entirely lost | 13:14 |
rogpeppe | abentley: ah, sorry, you're talking about juju switch, not bzr :-) | 13:14 |
natefinch | abentley: the script can tell if *an* environment is chosen, but not necessarily the right one. Better to just always switch to the right one first | 13:15 |
rogpeppe | perhaps what we need is a command that prints the current env name | 13:15 |
natefinch | brb screaming baby | 13:15 |
TheMue | frankban: in case of a set JUJU_ENV the command juju env returns it | 13:15 |
rogpeppe | so scripts aren't trying to second-guess JUJU_ENV, juju switch, environments.yaml juju logic | 13:16 |
abentley | natefinch: The right one is the current one. The best way to determine that is to ask juju what the current one is. | 13:16 |
mattyw | rogpeppe, is there a canonical example of writing a go client to connect to the api somewhere in the juju source? | 13:16 |
rogpeppe | mattyw: i don't think so. | 13:16 |
abentley | natefinch: One way to do that is to run "juju switch" with no env specified. | 13:17 |
frankban | TheMue: yes, and also if current-environment is missing, juju switch seems to return the default in envs.yaml. | 13:17 |
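
The lookup order frankban describes (JUJU_ENV, then the current-environment file written by "juju switch", then the default in environments.yaml, else an error) could be sketched like this. It is an illustration of what a script currently has to reimplement, not juju's own code: `current_env_name` is a hypothetical helper, the environments.yaml default is passed in already parsed to keep the example free of a YAML dependency, and the single-line file format is as wallyworld_ describes it.

```python
import os
from typing import Mapping, Optional


def current_env_name(environ: Mapping[str, str],
                     juju_home: str,
                     default_env: Optional[str]) -> str:
    """Resolve the environment name in the order discussed in the channel."""
    # 1. The JUJU_ENV environment variable wins if set.
    name = environ.get("JUJU_ENV", "").strip()
    if name:
        return name

    # 2. The current-environment file: a single line holding the env name.
    path = os.path.join(juju_home, "current-environment")
    try:
        with open(path) as f:
            name = f.read().strip()
    except OSError:
        name = ""
    if name:
        return name

    # 3. The "default" entry of environments.yaml (parsed by the caller).
    if default_env:
        return default_env

    # 4. Nothing found.
    raise LookupError("please specify an env name")
```

A call might look like `current_env_name(os.environ, os.path.expanduser("~/.juju"), default)`. Having juju print the name itself, as suggested in the discussion, would spare scripts from second-guessing this logic.
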
rogpeppe | mattyw: there's a command in launchpad.net/juju-utils that does, but it's quite possible it doesn't compile currently | 13:17 |
abentley | rogpeppe: But I bet if you look at THIS and OTHER, that line was preserved in both. | 13:18 |
* TheMue will come back in a few moments | 13:18 | |
rogpeppe | abentley: OTHER and THIS were both as i expected | 13:18 |
rogpeppe | abentley: but usually i expect the merge to contain all the lines from both | 13:19 |
rogpeppe | abentley: in this case almost an entire function had gone missing, leaving me wondering what else might have gone | 13:20 |
abentley | rogpeppe: That's not what merge does. It attempts to apply changes from both, which is both insertion and deletion. | 13:20 |
rogpeppe | abentley: i don't believe the source branch deleted anything in that place | 13:20 |
rogpeppe | abentley: note that this isn't a straight branch merge i'm talking about here | 13:21 |
rogpeppe | abentley: i did: merge -r1984..1985 trunk | 13:21 |
abentley | rogpeppe: Okay, and trunk here is juju-core? | 13:21 |
rogpeppe | abentley: to try to bring some trunk changes into 1.16 | 13:21 |
rogpeppe | abentley: yeah | 13:21 |
rogpeppe | abentley: you could duplicate the problem yourself easily, if you're interested | 13:22 |
abentley | rogpeppe: And you're merging into which branch? | 13:22 |
rogpeppe | abentley: into lp:juju-core/1.16 | 13:22 |
natefinch | abentley: (sorry, had to step out for a second) I'm confused as to why the script needs to know the current environment name, if it just assumes whatever is current is the right one. | 13:31 |
abentley | natefinch: Because it needs to parse environments.yaml and determine what the correct values are for OS_REGION_NAME, OS_USERNAME, OS_PASSWORD, OS_TENANT_NAME and OS_AUTH_URL | 13:35 |
abentley | natefinch: That varies depending on whether the current env uses my personal credentials or my team credentials. | 13:36 |
natefinch | abentley: why is the script parsing environments.yaml? That's for configuring juju... not configuring the script | 13:37 |
natefinch | abentley: oh wait, you mean, it's going to set the environment variables based on what environment is set in juju? | 13:38 |
abentley | natefinch: Right. | 13:38 |
natefinch | abentley: that seems like... a bad idea. If you're going to store the credentials in the script, why not store them in environments.yaml? Or do we not support those particular values in environments.yaml? | 13:39 |
abentley | I'm not storing the credentials in the script, I'm getting them from environments.yaml. | 13:40 |
natefinch | abentley: I'm confused again. If stuff is stored in environments.yaml already, why do you need to set them as environment variables? Won't juju pick them up from environments.yaml itself? | 13:40 |
rogpeppe | rvba, natefinch, TheMue, wallyworld_: this CL merges the maas changes into 1.16 https://codereview.appspot.com/14746044 | 13:41 |
abentley | natefinch: The script runs "nova", not "juju". | 13:41 |
natefinch | abentley: ahh, that's what I was misunderstanding. | 13:42 |
rogpeppe | rvba, natefinch, TheMue, wallyworld_: i've done it as a single branch to save an hour or so of committing overhead; i hope that's ok | 13:42 |
natefinch | abentley: I see you already said that, but I guess I missed it. | 13:42 |
rogpeppe | natefinch: do you have access to a maas environment that you could run up a quick live test of that branch on, by any chance? | 13:43 |
natefinch | rogpeppe: I wish I did... the virtual maas environment I had set up got nuked somehow | 13:44 |
rogpeppe | natefinch: hmm | 13:44 |
rogpeppe | anyone got a maas environment available? | 13:44 |
abentley | natefinch: I have another script that uses the current juju environment (by default) to run sshuttle. Again, it uses "switch" to determine the current environment. | 13:45 |
rogpeppe | abentley: could you clarify for me why the script needs to know the name of the current environment? | 13:46 |
rogpeppe | abentley: (i'm not saying you haven't got a good use case, but i'm interested in what it is) | 13:47 |
rogpeppe | if i wasn't clear, i would really like a review of this CL please - it *needs* to land today. https://codereview.appspot.com/14746044 | 13:53 |
rogpeppe | rvba, axw, natefinch, TheMue, wallyworld_: ^ | 13:53 |
rvba | rogpeppe: ah ok, I didn't get it had to be reviewed again. | 13:54 |
rogpeppe | rvba: well, i think it probably should be, as this is the actual change that's landing on 1.16, and i've had to do some manual merge conflict resolution | 13:54 |
rvba | okay | 13:54 |
rogpeppe | rvba: thanks | 13:55 |
TheMue | rogpeppe: just starting review | 13:56 |
TheMue | rogpeppe: reviewd | 14:05 |
rogpeppe | TheMue: thanks | 14:05 |
TheMue | eh, reviewed | 14:05 |
rogpeppe | TheMue: those are changes that should be made in trunk - i'm not making them independently in this review | 14:06 |
* rogpeppe grabs a bit to eat | 14:06 | |
rogpeppe | bite | 14:06 |
TheMue | rogpeppe: ok | 14:06 |
abentley | rogpeppe: My use case for the first script is to use nova to manipulate the instances of my current juju environment, especially "nova status" and "nova add-floating-ip". | 14:13 |
abentley | rogpeppe: The use case for the second script is to be able to access the current environment using an SSH tunnel, since the bootstrap node has a private IP. I use the current environment name to determine which "juju-*-machine-0" to ssh into for my tunnel. | 14:15 |
rogpeppe | abentley: so this is useful only because the openstack provider uses the environment name to name some of its resources, right? | 14:17 |
rogpeppe | abentley: s/this/switch printing the current environment name/ | 14:18 |
abentley | rogpeppe: Yes, the second script is useful because the instance name can be predicted from the environment name. The first script is useful regardless. | 14:18 |
rogpeppe | abentley: we'll probably change this behaviour in the future, BTW - the environment name doesn't make for a very good unique key. | 14:19 |
abentley | rogpeppe: In the context of a given nova account, I think there's some sense in requiring unique environment names. Things can be very confusing otherwise. | 14:22 |
rogpeppe | abentley: we are moving towards the idea of using the environ UUID | 14:22 |
rogpeppe | abentley: which will be generated at bootstrap time | 14:22 |
rogpeppe | abentley: so if you've destroyed an environment but for some reason some instances are still around, there's then no possibility of crosstalk with a newly bootstrapped environment, even if it happens to have the same name | 14:23 |
rogpeppe | abentley: is that the kind of thing you think might be confusing? | 14:24 |
abentley | rogpeppe: That seems okay, as long as it's not ephemeral. Writing essential info to *.jenv alone makes sharing accounts painful. | 14:24 |
rogpeppe | abentley: we already write essential info to .jenv alone | 14:24 |
abentley | rogpeppe: I know, and it's evil. | 14:24 |
rogpeppe | abentley: to share accounts, you'll need to share the .jenv file | 14:24 |
rogpeppe | abentley: (but you won't need to share anything else at all) | 14:25 |
rogpeppe | abentley: i don't see any other way that we can have local state | 14:25 |
abentley | rogpeppe: And then someone tears down the environment and re-bootstraps, and everyone's broken. | 14:25 |
abentley | rogpeppe: Write it to environments.yaml instead of .jenv. | 14:25 |
abentley | rogpeppe: Then everyone with the same config can access the same environment. | 14:26 |
rogpeppe | abentley: we can't without losing comments etc | 14:26 |
abentley | rogpeppe: Losing comments is better than breaking other team members. | 14:26 |
abentley | rogpeppe: And perhaps there is a yaml parser out there that doesn't lose comments. There certainly are for ini files. | 14:27 |
rogpeppe | abentley: most YAML parser out there can barely parse YAML :-) | 14:27 |
rogpeppe | parsers | 14:27 |
rogpeppe | abentley: it's not just comments - there are many ways to format YAML | 14:28 |
rogpeppe | abentley: i'm afraid the only way to do it is to share the .jenv files (possibly in a shared filesystem) | 14:28 |
abentley | rogpeppe: No, that's not acceptable as long as .jenv files are deleted by destroy-environment. | 14:29 |
rogpeppe | abentley: if you destroy an environment and then bootstrap it again, it's actually not the same environment any more | 14:29 |
abentley | rogpeppe: The same people need to be able to access it, though. | 14:29 |
abentley | rogpeppe: Why can't you store all the .jenv-unique data in swift/s3? | 14:30 |
rogpeppe | abentley: because some of that data is used to work out which swift/s3 bucket to talk to | 14:31 |
rogpeppe | abentley: for instance, control-bucket is now automatically generated | 14:31 |
abentley | rogpeppe: but if I specify control-bucket in environments.yaml, it's respected? | 14:32 |
rogpeppe | abentley: yes, currently | 14:32 |
rogpeppe | abentley: i agree that this is a very significant change in behaviour BTW, and i understand your point of view | 14:32 |
abentley | rogpeppe: It is very discouraging to see this. It seems like juju is going to get worse and worse from the perspective of teams. | 14:33 |
rogpeppe | abentley: i think the way of the future is to have a service that stores .jenv files | 14:33 |
rogpeppe | abentley: (the code is actually written with that in mind) | 14:34 |
abentley | rogpeppe: I don't think we need a new service. Just a swift/s3 bucket that doesn't change. One per provider would be fine. | 14:36 |
rogpeppe | abentley: s3 isn't great for shared write access | 14:37 |
rogpeppe | abentley: i.e. you can't change anything atomically AFAIK | 14:38 |
abentley | rogpeppe: Okay, but I don't want to live with a worse-and-worse juju for a long time because the jenv sharing service is more complex than necessary. | 14:38 |
abentley | and therefore takes more time to implement. | 14:38 |
rogpeppe | abentley: essentially all it needs to do is implement the Storage interface defined in environs/configstore/interface.go | 14:40 |
rogpeppe | abentley: there's some thinking to be done as to how to implement the encryption | 14:42 |
rogpeppe | abentley: (i.e. do you let the server see the plaintext contents of the .jenv files?) | 14:43 |
rogpeppe | abentley: but i don't anticipate it being more than a week's worth of work for the server and maybe another week to integrate it into juju-core. | 14:44 |
abentley | rogpeppe: Is there a plan to split environments.yaml into multiple files, or is that a misunderstanding of *.jenv files? | 14:46 |
rogpeppe | abentley: there's a plan to lose environments.yaml entirely | 14:46 |
abentley | rogpeppe: What replaces it? | 14:46 |
rogpeppe | abentley: another config file based around somewhat different abstractions | 14:47 |
rogpeppe | abentley: fwereade has a better idea than me - he thrashed out some of the details with bdfl | 14:47 |
abentley | rogpeppe: For teams, it would be nice if each environment had its own config, because some environments are team environments and some are personal environments. It's easier to maintain the team environments if they're not intermixed with personal ones. | 14:48 |
abentley | s/config/config file/ | 14:48 |
rogpeppe | abentley: personally, i'd be happy if the environments.yaml file became simply a URL to the config-file storage server and probably a key for that too | 14:49 |
rogpeppe | abentley: then we'll be able to operate more coherently in a team environment | 14:50 |
abentley | rogpeppe: I'd be unhappy, because then it wouldn't have the openstack credentials, and my first script wouldn't work. | 14:50 |
rogpeppe | abentley: your script could use the server to fetch the openstack credential | 14:50 |
rogpeppe | s | 14:50 |
abentley | rogpeppe: Don't I need credentials in order to use the server? | 14:50 |
rogpeppe | abentley: not provider-specific credentials | 14:51 |
abentley | rogpeppe: Doesn't that make it harder for private clouds? I would think you'd want it to be a per-provider service at least. | 14:52 |
rogpeppe | abentley: (if we do it right, a single config storage server could easily serve the needs of thousands of clients - the demands aren't high) | 14:52 |
rogpeppe | abentley: for private clouds, you would indeed need to run a server somewhere. | 14:52 |
rogpeppe | abentley: or have a file-based mechanism that you could use too | 14:54 |
rogpeppe | abentley: i'm thinking off the top of my head here BTW - none of this stuff has been discussed yet in the team | 14:56 |
w7z | yeay, finally internet back | 14:59 |
sinzui | rogpeppe, Thank you for fixing the gnuflag branch. I will remove the workarounds from the scripts | 15:14 |
abentley | rogpeppe: About the merge, the source deletes 4 lines from environprovider_test.go: http://pastebin.ubuntu.com/6246103/ | 15:14 |
abentley | rogpeppe: If you look at the diff of provider/maas/environprovider_test.go (with conflicts and everything), you'll see that no function was deleted. What happened is that trunk altered a function that doesn't exist in 1.16. | 15:16 |
abentley | rogpeppe: That's why there's a conflict with just one line showing, because that's the only line of the non-existent function that trunk wanted to change. | 15:17 |
rogpeppe | abentley: hmm, which function was altered that doesn't exist in 1.16? | 15:18 |
abentley | rogpeppe: TestUnknownAttrsContainEnvironmentUUID | 15:21 |
rogpeppe | abentley: hmm, interesting. i think i must have got the merge command wrong. i thought that "bzr merge -c 1234" was equivalent to "bzr merge -r 1234", but i'm guessing that it's actually equivalent to "bzr merge -r 1233..1234" | 15:28 |
* rogpeppe is not worried that more stuff has been lost | 15:28 | |
rogpeppe | s/not/now/ | 15:28 |
abentley | rogpeppe: The latter is correct. | 15:28 |
abentley | rogpeppe: You may have actually wanted bzr merge -r 1984..1985? | 15:29 |
abentley | rogpeppe: I mean you may have actually wanted bzr merge -r 1983..1985? | 15:29 |
rogpeppe | abentley: yes | 15:29 |
abentley | rogpeppe: It may help to know that the -r parameters affect diff the same way as merge. | 15:30 |
rogpeppe | abentley: yeah, i'd thought they were slightly different | 15:30 |
abentley | rogpeppe: So if you do diff -r 1983..1985, you'll see all of the changes that merge will attempt to integrate. | 15:30 |
rogpeppe | abentley: ok, it seems we're lucky in this case | 15:32 |
rogpeppe | abentley: the only changes that were made in 1984 were overwritten by changes made in a later branch that i also merged | 15:32 |
* rogpeppe slaps own wrist and considers himself once-bitten | 15:32 | |
rogpeppe | abentley: thanks for solving the mystery for me | 15:34 |
abentley | rogpeppe: You're welcome. | 15:34 |
rogpeppe | if anyone's been following it, i now have a tiny demo program of the net/http problem that has been causing us problems: http://paste.ubuntu.com/6246231/ | 15:48 |
rogpeppe | i'm thinking this might be the cause of bugs like this: https://bugs.launchpad.net/juju-core/+bug/1228255 | 15:49 |
_mup_ | Bug #1228255: Live bootstrap tests fail on canonistack <test-failure> <juju-core:Triaged> <https://launchpad.net/bugs/1228255> | 15:49 |
rogpeppe | see my comment on https://code.google.com/p/go/issues/detail?id=4677 | 16:02 |
rogpeppe | rebooting | 16:04 |
natefinch | rogpeppe: interesting about the http thing, though I think your bug report has a typo: "It succeeds when run with GOMAXPROCS>0." | 16:09 |
natefinch | rogpeppe: pretty sure GOMAXPROCS is always >0 ;) | 16:09 |
TheMue | natefinch: as long as your system doesn't block | 16:12 |
rogpeppe | natefinch: oops | 16:17 |
rogpeppe | natefinch: correct | 16:17 |
rogpeppe | natefinch: corrected | 16:17 |
TheMue | so, have a good night guys | 17:14 |
TheMue | cu tomorrow | 17:14 |
natefinch | TheMue: g'night | 17:14 |
rogpeppe | i'm also off now | 17:19 |
rogpeppe | g'night all | 17:19 |
natefinch | g'night! | 17:19 |
thumper | abentley: how's tricks? | 20:04 |
abentley | thumper: I'm doing fine. How are you? | 20:05 |
thumper | abentley: pretty good | 20:05 |
thumper | abentley: I was wondering about your bzr plugin for lbox | 20:05 |
thumper | abentley: did you have one? | 20:05 |
thumper | well, Rietveld | 20:05 |
thumper | I guess | 20:05 |
abentley | thumper: Yes, since Rietveld was updating launchpad and juju-gui has tarmac, it really just sets the commit message and marks the MP approved. | 20:07 |
gary_poster | thumper, abentley http://jujugui.wordpress.com/2013/05/24/thanks-to-diogo-matsubara-well-be-migrating-to/ | 20:07 |
smoser | https://bugs.launchpad.net/juju-core/+bug/1240667 | 20:07 |
thumper | abentley: ah, but not for submitting? | 20:07 |
_mup_ | Bug #1240667: Version of django in cloud-tools conflicts with horizon:grizzly <ubuntu-cloud-archive:Confirmed> <juju-core:Confirmed> <https://launchpad.net/bugs/1240667> | 20:07 |
thumper | well, proposing | 20:07 |
thumper | smoser: is that for maas? | 20:08 |
smoser | it's really only solvable in juju i think | 20:08 |
abentley | thumper: No, I didn't do one for proposing. Is lbox propose giving you grief? | 20:08 |
smoser | not specific to maas | 20:08 |
thumper | abentley: regularly | 20:08 |
thumper | smoser: but juju doesn't use django | 20:09 |
gary_poster | thumper, API server is falling over (it closes the socket). that usually means there's some juju error to look at. could you direct me towards the probably pertinent logs? | 20:09 |
gary_poster | oh wait | 20:09 |
gary_poster | machine 0 | 20:09 |
abentley | thumper: Any chance of switching to MPs? | 20:09 |
thumper | gary_poster: probably the api logs on machine-0 | 20:09 |
thumper | abentley: only when we get inline commenting | 20:09 |
thumper | abentley: I've raised this before | 20:09 |
smoser | django is in the cloud archive for maas, yes. but juju added the cloud archive to the node. | 20:10 |
smoser | which is where the problem came from. | 20:10 |
thumper | lbox diff generation isn't as good as LPs | 20:10 |
thumper | smoser: ah | 20:10 |
thumper | smoser: that's annoying | 20:10 |
abentley | thumper: Wish I had a quick solution for you, but I guess you'll have to roll your own if you want to propose on Rietveld. | 20:10 |
thumper | I thought you had one that dealt with pipelines | 20:11 |
thumper | I often forget the -req bit for lbox | 20:11 |
smoser | thumper, yeah. it's a problem there for django, but it could be a problem for anything really. | 20:11 |
smoser | and could expose itself with lxc. or even mongodb. | 20:11 |
* thumper nods | 20:12 | |
thumper | smoser: any plan or ideas? | 20:12 |
smoser | none that i like. | 20:12 |
abentley | thumper: lp-propose handles pipelines. I don't know of a Rietveld equivalent. I can dig up the plugin that generates diffs if you like. | 20:12 |
thumper | abentley: nah | 20:12 |
thumper | abentley: probably not going to get time to look at it | 20:13 |
smoser | when does juju need to enable the cloud archive ? | 20:13 |
thumper | too much else on | 20:13 |
smoser | see my comments there, am I correct? | 20:13 |
* thumper looks at the bug | 20:13 | |
smoser | if so, then the most reasonable solution is only to use cloud archive on bootstrap node or "un-containerized" node (that would then want to create containers with lxc) | 20:14 |
thumper | smoser: yeah, your comments are right | 20:14 |
thumper | I wish people didn't use the packaged django | 20:15 |
thumper | virtualenvs are the way to go with python IMO | 20:16 |
smoser | oh yeah, of course, installing random stuff from untrusted sources on the internet that can change is always the best way to build software. | 20:18 |
* smoser realizes that argument might not seem ridiculous here | 20:18 | |
natefinch | smoser: haha | 20:19 |
gary_poster | thumper or any sympathetic soul, there is no api log that I see on machine 0. I see /var/log/juju/machine-0.log. In it, http://pastebin.ubuntu.com/6247538/ seems to show that the AllWatcher is falling over. Any thoughts on where to look next? | 20:20 |
thumper | the machine-0.log has all the api stuff in it | 20:20 |
thumper | gary_poster: can you post the error? | 20:21 |
thumper | pastebin | 20:21 |
thumper | or something | 20:21 |
* thumper sighs | 20:21 | |
thumper | I see it there | 20:21 |
* thumper clicks | 20:21 | |
thumper | it is all organge | 20:21 |
thumper | orange | 20:21 |
hazmat | that looks like a client close of connection | 20:21 |
gary_poster | organge?: | 20:21 |
thumper | I'm used to seeing the hyperlinks as blue | 20:21 |
thumper | dumb client | 20:21 |
gary_poster | hazmat, nope. | 20:21 |
hazmat | gary_poster, do you have the corresponding browser trace? | 20:22 |
hazmat | websocket trace from the client that is | 20:22 |
gary_poster | hazmat, yes. Connection Close Frame | 20:22 |
hazmat | gary_poster, do you have a way to reproduce? | 20:22 |
thumper | hmm... | 20:22 |
gary_poster | hazmat, yes, but it is expensive. Deploy this and go to GUI. https://raw.github.com/paulczar/charm-championship/master/monitoringstack.sh | 20:23 |
hazmat | ah.. that one. | 20:23 |
thumper | gary_poster: what is the gui asking for? | 20:32 |
thumper | when it is falling over? | 20:32 |
gary_poster | thumper, that log is as close to a record of that as we have. in request 5, we ask the all watcher to give us the next output. this is the first such request, I am pretty sure, so it will be the full system status. We then ask for various other things, and get successful responses (such as line 3, which is our 16th request to juju in this connection) but then on line 5 (and 4?) of the pastebin, juju says that the | 20:35 |
gary_poster | AllWatcher Next request has an error...because the "state watcher was stopped". By whom? I'd love to know. That appears to be death knell for the whole connection. | 20:35 |
gary_poster | The correlation is by the "RequestId": line 5 is a reaction to line 1 | 20:36 |
thumper | gary_poster: the whole "all watcher" is a part I've not yet delved into, and I gather a rather complicated beast | 20:48 |
gary_poster | thumper, ack. this may be a big deal. it might explain some other reports I've heard. | 20:48 |
thumper | gary_poster: probably needs lots of debugging added to it to find out what is going on | 20:48 |
* thumper sighs | 20:48 | |
thumper | big deals always happen at the last moment, no? | 20:48 |
gary_poster | thumper, ack. I'm checking with another source to see if they can dupe in another situation | 20:49 |
gary_poster | yeah :-/ | 20:49 |
thumper | gary_poster: if you can get me something simpler to generate the problem with it'd be appreciated | 20:49 |
thumper | gary_poster: let me finish off the reviews I'm in the middle of | 20:50 |
thumper | and then I'll start poking | 20:50 |
gary_poster | thumper, heh, I'd love to. I think I need some hint on cause before I can come up with a smaller case. | 20:50 |
thumper | gary_poster: does that monitoringstack deploy list always cause the problem? | 20:50 |
thumper | is it really reproducible? | 20:50 |
thumper | do I have to open the gui? | 20:51 |
thumper | or does it just happen? | 20:51 |
gary_poster | thumper, so far, yes, it is reproducible. API connection falls over in about 4.35 seconds, from the perspective of the client. | 20:51 |
thumper | gary_poster: how far through the script does it get before it falls over? | 20:52 |
gary_poster | thumper, and yes, you go to the environment with the GUI and log in | 20:52 |
thumper | gary_poster: does the system appear stable prior? | 20:52 |
gary_poster | thumper, the script sets up an environment. then you go to the gui, and just look at it, and the connection falls over | 20:52 |
* thumper nods | 20:53 | |
thumper | ok | 20:53 |
gary_poster | thumper, other than the API connection, everything seems fine | 20:53 |
thumper | gary_poster: how long are you around for? | 20:53 |
gary_poster | thumper, and other than the AllWatcher Next, within that shining 4 seconds of connectivity, other replies in the connection seem to be behaving fine | 20:53 |
gary_poster | thumper, my EoD is in 6 minutes, and I should probably go not too soon after that. last night was already a long one. | 20:54 |
thumper | gary_poster: ack | 20:54 |
thumper | gary_poster: do we have a bug yet? | 20:54 |
gary_poster | thumper, no. I will file one before I leave, or, if I get confirmation from the reporter, add juju core to an existing gui bug. | 20:55 |
gary_poster | (add the dupe instructions I have so far) | 20:55 |
gary_poster | and add | 20:55 |
gary_poster | such as they are | 20:55 |
thumper | ok, ta | 20:56 |
sinzui | thumper, Can you take 10 minutes to advise/comment on this bug https://bugs.launchpad.net/juju-core/+bug/1232304 | 21:00 |
_mup_ | Bug #1232304: consider tuning git setup for juju-core, and document caveats <canonical-webops> <doc> <feature> <pes> <juju-core:Triaged> <https://launchpad.net/bugs/1232304> | 21:00 |
thumper | sinzui: I can try | 21:00 |
natefinch | sinzui, thumper: any process that relies on storing binary blobs in git is flawed | 21:04 |
sinzui | natefinch, agreed | 21:04 |
sinzui | I really think there is a process problem in the bug. If we tune git, how much more can we scale before we hit the next problem? | 21:05 |
natefinch | sinzui: tuning git is not the solution. Not storing blobs in git is the solution. Is it a matter of charm authors doing something wrong, or a problem of juju doing something wrong? I don't know the charm upgrade code at all. | 21:06 |
natefinch | and by "wrong" I can also mean "we haven't given them a better way" | 21:07 |
thumper | natefinch: care to comment on that bug? | 21:08 |
thumper | natefinch: I know nothing about git | 21:08 |
* thumper will try to replicate gary_poster's bug on the local provider to save AWS bills | 21:09 | |
gary_poster | +1 | 21:09 |
* gary_poster needs to try and get local working on this machine. lost > day on it so been putting it off | 21:09 | |
thumper | gary_poster: we could look next week :) | 21:10 |
gary_poster | thumper, it's my desktop. laptop was fine last I checked | 21:10 |
natefinch | thumper: all I know is that when you store binary in git, and then submit a change that changes the binary.... it doesn't store the diff of the two binaries, it just stores the two binaries. So, if you have a 200 meg zip file and you add one thing to it and re-commit, you now have two 200-meg zip files in the repo. | 21:10 |
thumper | gary_poster: ah | 21:10 |
natefinch | thumper: and whenever you get code from git, you get the WHOLE REPO. There's no getting a specific branch to reduce the amount you have to download. | 21:11 |
thumper | WTF... my local provider won't bootstrap... | 21:12 |
thumper | does no-one test this shit? | 21:13 |
* thumper rages | 21:13 | |
natefinch | thumper: anyway, I'll comment on the bug, but since I don't know the upgrade charm code, I'm not sure where the problem lies | 21:13 |
natefinch | thumper: I get ERROR juju supercommand.go:282 Get http://10.0.3.1:8040/provider-state: dial tcp 10.0.3.1:8040: connection refused | 21:15 |
thumper | natefinch: me too | 21:15 |
* thumper sighs heavily | 21:16 | |
* thumper wonders if 1.16 works | 21:16 | |
thumper | it used to | 21:16 |
thumper | no. | 21:16 |
thumper | that fails too | 21:16 |
thumper | WTF!!!! | 21:16 |
* thumper headdesks | 21:17 | |
natefinch | it's pretty epically bad if local bootstrap doesn't work in 1.16 | 21:17 |
thumper | true that | 21:17 |
* thumper files critical bug | 21:17 | |
gary_poster | thumper, I did not triage https://bugs.launchpad.net/juju-core/+bug/1240708 as critical yet but I filed it | 21:19 |
_mup_ | Bug #1240708: API server falls over repeatably during AllWatcher Next, killing GUI <juju-core:New> <https://launchpad.net/bugs/1240708> | 21:19 |
thumper | gary_poster: sorry, but have to fix the local provider first | 21:19 |
gary_poster | thumper, completely understood | 21:20 |
thumper | gary_poster: it is broken, AGAIN | 21:20 |
gary_poster | :-( | 21:20 |
thumper | yeah | 21:20 |
natefinch | thumper: I guess we don't have tests that actually try bootstrapping local? Seems like, of all the providers, that one should be the most thoroughly tested. | 21:21 |
thumper | natefinch: it needs root | 21:21 |
thumper | so we can't do it in unit tests | 21:21 |
thumper | natefinch: we plan to have it work in the qa lab | 21:21 |
thumper | but that is still being set up | 21:21 |
thumper | at least I think I know how to get this working | 21:22 |
gary_poster | thumper I have to run. I will check back later briefly | 21:23 |
* thumper puts on his debugging music nice and loud | 21:23 | |
thumper | gary_poster: ack | 21:23 |
natefinch | thumper: making people type in a sudo password when running the tests seems like a worthwhile pain in order to get full testing on the local provider | 21:24 |
thumper | natefinch: perhaps... | 21:24 |
* thumper goes to fix the problem | 21:24 | |
natefinch | thumper: good luck | 21:24 |
davecheney | or add themselves to whell | 21:25 |
davecheney | wheel | 21:25 |
davecheney | or sudoers | 21:25 |
davecheney | or something to make it automagic | 21:26 |
natefinch | indeed | 21:26 |
davecheney | but is probably going to be a non starter for CI | 21:27 |
* natefinch is at EOD | 21:28 | |
thumper | well that was easy | 21:28 |
natefinch | thumper: ha! awesome | 21:28 |
davecheney | what is the plan for getting all these post 1.16 fixes into Saucy | 21:31 |
thumper | davecheney: 1.16.1 | 21:35 |
thumper | I have no other answer at this stage | 21:35 |
davecheney | roger | 21:36 |
thumper | davecheney: if you insist, we could get roger to do everything | 21:36 |
thumper | :P | 21:36 |
* thumper is frustrated with lbox and how it reuses merged merge proposals | 21:37 | |
thumper | I never want that | 21:37 |
thumper | davecheney: you may have re-approved the old one, I'm resubmitting | 21:38 |
thumper | https://codereview.appspot.com/14573046 | 21:38 |
davecheney | thumper: LGTM. Looks like the same | 21:40 |
thumper | davecheney: it is :) | 21:40 |
thumper | davecheney: it was me leaving a clean trail behind me | 21:40 |
thumper | davecheney: once it lands in trunk, I'll submit for 1.16 | 21:41 |
thumper | I made sure I branched off a common ancestry revision | 21:42 |
thumper | so I don't need to cherry pick | 21:42 |
thumper | davecheney: do you handle the packaging for juju? | 21:45 |
thumper | davecheney: in main world, we had seb and didier | 21:45 |
thumper | davecheney: who we'd pass a patch to in situations like this and they'd rebuild the deb | 21:45 |
davecheney | thumper: that honor belongs to sinzui | 21:47 |
thumper | sinzui: ping | 21:47 |
sinzui | hi thumper | 21:47 |
thumper | sinzui: got time for a hangout? | 21:47 |
sinzui | yes | 21:47 |
thumper | sinzui: need bandwidth | 21:47 |
thumper | sinzui: https://plus.google.com/hangouts/_/95c73d1f1d77129b8096dd279bf17d654e856cda?hl=en | 21:48 |
wallyworld_ | sinzui: hi | 21:48 |
sinzui | hi wallyworld_ | 21:48 |
wallyworld_ | you otp? | 21:48 |
thumper | wallyworld_: he is soon :) | 21:48 |
davecheney | please form an orderly line | 21:49 |
wallyworld_ | ok, sinzui maybe you can ping me when done | 21:49 |
sinzui | wallyworld_, you're up | 22:06 |
wallyworld_ | sinzui: i was just wondering the plan for getting tools and metadata onto streams.canonical.com now that it has been commissioned | 22:07 |
sinzui | I plan to test it for 1.17 | 22:08 |
sinzui | 1.18 will make it available to everyone | 22:09 |
wallyworld_ | sinzui: ok. i will need to make changes to juju-core to update the url etc | 22:09 |
wallyworld_ | when were you planning to start? | 22:09 |
sinzui | Did you see the bug I updated about that? | 22:09 |
wallyworld_ | ah no | 22:09 |
sinzui | streams.canonical.com/juju/tools | 22:10 |
wallyworld_ | i mustn't be subscribed to the bug | 22:10 |
wallyworld_ | yeah, that url looks fine | 22:10 |
wallyworld_ | do you have a bug # handy? | 22:10 |
sinzui | ^ We share the host with cloud images. We made an executive decision that juju-dist was too long | 22:10 |
* sinzui looks | 22:11 | |
wallyworld_ | ok, it will take a little work to retool everything, but it's only software | 22:11 |
sinzui | wallyworld_, https://bugs.launchpad.net/juju-core/+bug/1220965 | 22:11 |
_mup_ | Bug #1220965: add official tools repository to metadata search <tech-debt> <juju-core:Triaged> <https://launchpad.net/bugs/1220965> | 22:11 |
sinzui | wallyworld_, really? you need juju-dist? | 22:12 |
wallyworld_ | wtf, i even reported that bug | 22:12 |
sinzui | wallyworld_, I read EVERY bug over the weekend. Even yesterday I was dizzy from too much information | 22:12 |
wallyworld_ | sinzui: actually, no. i was not thinking straight. juju-dist is assumed as the container name for cloud storage (ie private bucket) | 22:13 |
wallyworld_ | i'll need to check my mail filters to see where emails from that bug went | 22:13 |
wallyworld_ | anyways, that url will be fine | 22:13 |
wallyworld_ | i'll update juju-core with the necessary changes | 22:14 |
wallyworld_ | will be done this week i expect | 22:14 |
sinzui | wallyworld_, If it is a problem to match the path /juju/tools/, I can discuss the issue with Ben | 22:14 |
wallyworld_ | nah, should be fine | 22:14 |
wallyworld_ | will be fine, i was just caffeine deprived | 22:14 |
sinzui | :/ | 22:15 |
sinzui | Drink | 22:15 |
wallyworld_ | will do :-) | 22:15 |
wallyworld_ | sinzui: can you let me know when stuff has been uploaded? how hard is it to just copy across the tools from s3 or wherever? | 22:16 |
sinzui | Not sure yet | 22:16 |
wallyworld_ | ok | 22:16 |
sinzui | At the moment, I assemble all the tools to call sync-tools and make metadata. Since I have a cache from aws of the historic tools, I can build the tree in about 5 minutes. The release-public-tools script then deploys to all CPCs, taking 15 minutes | 22:18 |
sinzui | wallyworld_, So the new process is pull/sync from streams.canonical.com, then 15 minute publication to all CPCs | 22:19 |
wallyworld_ | the ticket talks about syncing from sawo or something like that | 22:19 |
wallyworld_ | my question above was more aimed at getting some initial metadata up on streams.c.c so i could test | 22:20 |
sinzui | that's right. I have no experience uploading to it and don't know how long it takes from that server to the new server | 22:20 |
thumper | gui won't install locally | 22:21 |
thumper | grr | 22:21 |
sinzui | thumper, dir permissions? | 22:21 |
wallyworld_ | sinzui: i'll leave you alone in peace to work your magic and you can let me know when there's some news :-) | 22:21 |
gary_poster | thumper, what is error? | 22:21 |
gary_poster | thumper, for the API server falling over, more info. From comment I added in bug: "We verified that the other bug has the same behavior (linked as dupe). Apparently, then, the charms are likely unrelated, because the other scenario is of an openstack bundle. This may simply be scale--and not very much of it." | 22:22 |
thumper | gary_poster: http://pastebin.ubuntu.com/6248091/ | 22:22 |
thumper | gary_poster: logged in, and all at latest revision | 22:22 |
thumper | gary_poster: in fact, every charm failed to install | 22:23 |
gary_poster | thumper, yeah, "apt-get -y install python-apt python-launchpadlib python-tempita" failed, which doesn't seem like a gui issue | 22:24 |
thumper | yeah... | 22:24 |
thumper | hang on | 22:24 |
* thumper pastebins | 22:24 | |
thumper | http://pastebin.ubuntu.com/6248095/ | 22:25 |
thumper | seems to be a cloud-init issue | 22:25 |
thumper | which is why all failed | 22:25 |
gary_poster | :-( :-( :-( | 22:25 |
sidnei | thumper: yeah, it's all broken :/ | 22:25 |
thumper | sidnei: any idea? | 22:25 |
sidnei | thumper: it's an issue with procps and lxc, we've been tracking it all day | 22:25 |
gary_poster | I have to run | 22:25 |
thumper | gary_poster: ack | 22:26 |
thumper | sidnei: oh, didn't know | 22:26 |
sidnei | thumper: it started happening yesterday because procps landed in precise-updates | 22:26 |
thumper | ah | 22:26 |
sidnei | bug #1157643 | 22:26 |
_mup_ | Bug #1157643: procps fail to start <failed> <patch> <procps> <start> <procps (Ubuntu):Confirmed> <https://launchpad.net/bugs/1157643> | 22:26 |
thumper | sidnei: are they going to roll it back? | 22:26 |
sidnei | thumper: nope, that doesn't help unfortunately | 22:27 |
thumper | why? | 22:27 |
sidnei | thumper: the problem is that procps has been broken in lxc since ever, but since it's installed by default and init doesn't block on it failing it went unnoticed | 22:27 |
sidnei | thumper: it only became an issue when calling the postinst script from dpkg | 22:27 |
thumper | haha | 22:27 |
thumper | oops | 22:27 |
sidnei | so reverting won't help | 22:28 |
sidnei | thumper: there's a patch in the bug, but it needs to go into sru and all that i guess | 22:29 |
* thumper nods | 22:29 | |
sidnei | until then, no workie lxc :/ | 22:30 |
thumper | oh well, I guess using aws to test this other failure is what we need then | 22:30 |
thumper | davecheney: this one is for 1.16 https://codereview.appspot.com/14439067/ | 22:34 |
* thumper self approves | 22:36 | |
thumper | huh - Rietveld ignores my own LGTMs :) | 22:37 |
sidnei | thumper: there :) | 22:41 |
bigjools | o/ | 22:43 |
thumper | o/ | 22:47 |
thumper | gym time | 22:49 |
wallyworld_ | thumper: don't forget my 2 branches :-) | 22:50 |
thumper | wallyworld_: oh yeah... after gym | 22:54 |
wallyworld_ | thumper: ok :-) also i think we need a 1.18 in saucy | 22:54 |
wallyworld_ | but we can discuss later | 22:54 |
thumper | wallyworld_: it won't happen dude | 22:54 |
thumper | yes, lets | 22:54 |
wallyworld_ | well, stuff will be broken | 22:54 |
wallyworld_ | no tools repository | 22:55 |
wallyworld_ | unless we back port lots :-( | 22:55 |
wallyworld_ | i wonder why it's so hard to get our own software into our own distro | 22:56 |
davecheney | wallyworld_: i couldn't agree more strongly | 23:00 |
wallyworld_ | davecheney: yeah :-( there's a lot going into trunk right now | 23:00 |
wallyworld_ | we either want juju to work or we don't | 23:01 |
rogpeppe1 | ooh, i *so* nearly just tried to buy a camera from this site. then i looked at the terms and conditions... http://www.pcmshop.info/index.php?route=information/information&information_id=5 | 23:15 |
rogpeppe1 | mark v cheney ftw | 23:15 |
davecheney | rogpeppe1: well, I didn't want you to have to fall on your sword this sprint | 23:16 |
davecheney | i figured it was my time | 23:16 |
wallyworld_ | rogpeppe1: where in goamz did you see req.Close = true? i can't seem to find it | 23:43 |
wallyworld_ | ah never mind, found it | 23:47 |
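
For readers following the net/http thread above (the server closing a connection from underneath the client, and the `req.Close = true` setting wallyworld_ was looking for in goamz), here is a minimal Go sketch of the two mitigations mentioned in the log: setting `Close` on the request so the connection is not reused, and retrying an idempotent GET. The helper names `newClosingRequest`, `getWithRetry` and `demo` are hypothetical and are not taken from goamz or juju-core; the local test server only stands in for maas/apache.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// newClosingRequest (hypothetical helper) builds a request with Close set,
// so the client closes the connection after the response instead of
// returning it to the idle pool for reuse.
func newClosingRequest(method, url string) (*http.Request, error) {
	req, err := http.NewRequest(method, url, nil)
	if err != nil {
		return nil, err
	}
	req.Close = true
	return req, nil
}

// getWithRetry (hypothetical helper) issues a GET up to attempts times.
// Retrying is only safe here because GET is idempotent.
func getWithRetry(client *http.Client, url string, attempts int) (string, error) {
	if attempts < 1 {
		attempts = 1
	}
	var lastErr error
	for i := 0; i < attempts; i++ {
		req, err := newClosingRequest("GET", url)
		if err != nil {
			return "", err // a malformed URL will not get better on retry
		}
		resp, err := client.Do(req)
		if err != nil {
			lastErr = err
			continue
		}
		body, err := io.ReadAll(resp.Body)
		resp.Body.Close()
		if err != nil {
			lastErr = err
			continue
		}
		return string(body), nil
	}
	return "", fmt.Errorf("giving up after %d attempts: %w", attempts, lastErr)
}

// demo fetches from a throwaway local server that stands in for the real one.
func demo() (string, error) {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, "ok")
	}))
	defer srv.Close()
	return getWithRetry(srv.Client(), srv.URL, 3)
}

func main() {
	body, err := demo()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(body)
}
```

Setting `Close` trades connection reuse for predictability: every request pays for a new connection, which is why bigjools argues in the log that needing it at all looks like a library bug rather than a fix.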
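
The `bzr merge -c` versus `-r` exchange at 15:28 to 15:30 comes down to simple revision arithmetic, which the Go sketch below models. It is only an illustration of what abentley confirms in the log (`-c N` behaves like `-r N-1..N`, and a `-r A..B` range covers the changes introduced after A up to and including B); it does not invoke bzr, and the function names are hypothetical.

```go
package main

import "fmt"

// changeToRange models "-c N" as described in the log: the single change N
// is the revision range N-1..N.
func changeToRange(c int) (from, to int) {
	return c - 1, c
}

// revisionsIntegrated lists the revisions whose changes a "-r from..to"
// merge or diff covers: everything after from, up to and including to.
func revisionsIntegrated(from, to int) []int {
	var revs []int
	for r := from + 1; r <= to; r++ {
		revs = append(revs, r)
	}
	return revs
}

func main() {
	from, to := changeToRange(1985)
	fmt.Printf("-c 1985 behaves like -r %d..%d, integrating %v\n",
		from, to, revisionsIntegrated(from, to))
	fmt.Printf("-r 1983..1985 integrates %v\n", revisionsIntegrated(1983, 1985))
}
```

Under this model, `-c 1985` picks up only revision 1985, which matches rogpeppe's finding that the changes from 1984 were left out, while `-r 1983..1985` picks up both 1984 and 1985.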