rogpeppe | mgz: reviewed | 00:09 |
---|---|---|
fwereade_ | rogpeppe, in case you're there, my unease has crystallized -- how does addressupdater play with containers? | 00:09 |
rogpeppe | fwereade_: it doesn't currently | 00:09 |
fwereade_ | rogpeppe, I guess it doesn't have to yet | 00:09 |
rogpeppe | fwereade_: we have to work out how we're going to do container addressing first | 00:10 |
rogpeppe | fwereade_: in case you missed it, i'm after a review of this, which actually integrates the address updater: https://codereview.appspot.com/14306043/ | 00:11 |
fwereade_ | rogpeppe, well, we know that one instance will have at least N addresses that need to be shared amongst the machine and its containers | 00:11 |
rogpeppe | fwereade_: i'm not sure who will be responsible for allocating container addresses | 00:12 |
rogpeppe | fwereade_: whatever happens, there has to be *something* like the address updater at the top level, i think | 00:15 |
fwereade_ | rogpeppe, yeah, I think the trickiness is just going to be passing the extra addresses on to containers | 00:15 |
rogpeppe | fwereade_: yes | 00:16 |
fwereade_ | rogpeppe, and that's orthogonal, so... LGTM | 00:16 |
fwereade_ | rogpeppe, and you reviewed mgz's already, so I'm going to bed :) | 00:17 |
fwereade_ | gn | 00:17 |
rogpeppe | fwereade_: gn | 00:17 |
rogpeppe | ' | 00:17 |
thumper | fwereade_: you still up? | 00:18 |
thumper | geez | 00:18 |
thumper | rogpeppe: you heading to bed too? | 00:18 |
rogpeppe | thumper: i was hoping i might get the API client address caching done... | 00:19 |
thumper | rogpeppe: how much does it still have to do? | 00:19 |
rogpeppe | thumper: 1) it needs State.APIAddresses to return addresses from the state server machines rather than from mongo peers | 00:20 |
rogpeppe | thumper: 2) it needs the API login to call State.APIAddresses and return them as the result | 00:21 |
rogpeppe | thumper: 3) it needs some code to actually save the API endpoint returned from the API login | 00:21 |
rogpeppe | thumper: the first two are pretty trivial; the third requires a little more thought but should be easy enough. | 00:21 |
rogpeppe | thumper: all of them can actually be done independently | 00:22 |
thumper | and before you sleep? | 00:24 |
rogpeppe | thumper: erm, maybe i'm being a little optimistic :-) | 00:27 |
rogpeppe | thumper: i think i'll probably just land the address updater | 00:27 |
* thumper nods | 00:28 | |
rogpeppe | thumper: it would be nice to change status so that we can actually see the new address info too... | 00:29 |
thumper | :) | 00:30 |
rogpeppe | right, i am doing No More | 00:40 |
rogpeppe | thumper: g'night | 00:40 |
thumper | night | 00:40 |
davecheney | sinzui: +1 on your change | 00:41 |
* rogpeppe just live bootstrapped with an environments.yaml entry that's simply : "envname": {"type": "ec2"} | 00:46 | |
davecheney | rogpeppe: wowzers | 00:53 |
davecheney | talk about simple | 00:53 |
davecheney | rogpeppe: what is the next step, bootstrap with no environments.yaml and just some flags? | 00:53 |
rogpeppe | davecheney: it did use the conventional $AWS_ environment vars, so it's a bit of a cheat really | 00:53 |
rogpeppe | davecheney: i think that would be good, yeah | 00:54 |
davecheney | juju bootstrap -e rog -t ec2 | 00:54 |
davecheney | creates ~/.juju/environments.yaml ? | 00:54 |
rogpeppe | davecheney: no need | 00:54 |
rogpeppe | davecheney: creates ~/.juju/environments/rog.jenv | 00:54 |
* davecheney twitches | 00:55 | |
davecheney | prefixing everything with j sounds very 2002 | 00:55 |
davecheney | :) | 00:55 |
rogpeppe | davecheney: look, there was a bikeshed CL specifically for that :-) | 00:55 |
rogpeppe | davecheney: you didn't weigh in so you're out | 00:55 |
rogpeppe | davecheney: FWIW i'm not that keen on .jenv either, but it was the only reasonable suggestion | 00:56 |
davecheney | rogpeppe: fair enough, my fault for not being involved | 00:57 |
davecheney | i'll shut up | 00:57 |
rogpeppe | davecheney: if you really have a better suggestion, i'm happy to hear it. | 00:57 |
rogpeppe | davecheney: it can be changed now; not so easily in the future. | 00:58 |
davecheney | rogpeppe: ignore my griping, i'm a chicken, not a pig | 00:58 |
* rogpeppe duly ignores | 00:58 | |
rogpeppe | right, the address updater has landed. i'm going to bed. | 00:59 |
rogpeppe | davecheney: g'night. | 00:59 |
* thumper sighs heavily | 00:59 | |
thumper | why is dummy provider so dumb | 00:59 |
thumper | is there any specific trick I need to know about dummy.Storage()? | 01:00 |
rogpeppe | thumper: what about it? | 01:00 |
thumper | I have a test hanging forever | 01:00 |
thumper | inside dummy bootstrap method, I'm trying to call common.SaveState | 01:00 |
thumper | so it behaves like a real environment | 01:00 |
thumper | for some other tests | 01:00 |
rogpeppe | thumper: dummy.Storage shouldn't be doing anything special | 01:01 |
thumper | but the test just hangs | 01:01 |
thumper | how can I tell where it is hanging? | 01:01 |
rogpeppe | thumper: ^\ | 01:01 |
rogpeppe | thumper: i.e. SIGQUIT | 01:01 |
thumper | ok, but it is | 01:01 |
rogpeppe | thumper: that'll give you a stack dump | 01:01 |
thumper | is there a control key for that? | 01:01 |
rogpeppe | thumper: control-backslash | 01:01 |
thumper | ta | 01:01 |
rogpeppe | thumper: paste the stack trace; i have a suspicion what might be your problem | 01:02 |
thumper | rogpeppe: http://pastebin.ubuntu.com/6186255/ | 01:02 |
rogpeppe | thumper: i think you're probably calling common.SaveState while the mutex is held | 01:02 |
thumper | rogpeppe: hopefully I waited long enough | 01:02 |
thumper | ah | 01:03 |
rogpeppe | thumper: yup, looks like that's the issue | 01:03 |
rogpeppe | thumper: (see goroutine 52) | 01:03 |
* thumper nods | 01:03 | |
rogpeppe | right, i really *am* going to bed now | 01:03 |
thumper | rogpeppe: that was it, passes now | 01:06 |
* thumper runs all the tests again | 01:07 | |
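
The hang diagnosed above (a helper that takes a lock being called while the same lock is already held) can be sketched roughly as follows. This is only an illustration: `fakeEnviron`, `saveState` and `bootstrap` are hypothetical stand-ins, not the dummy provider's real code. Go's `sync.Mutex` is not reentrant, so the sketch releases the lock before calling the helper.

```go
package main

import (
	"fmt"
	"sync"
)

// fakeEnviron is a hypothetical stand-in for a provider environment.
type fakeEnviron struct {
	mu           sync.Mutex
	bootstrapped bool
	state        []string
}

// saveState takes the lock itself, as a self-contained helper would.
func (e *fakeEnviron) saveState(id string) {
	e.mu.Lock()
	defer e.mu.Unlock()
	e.state = append(e.state, id)
}

// bootstrap updates its own fields under the lock, then releases the
// lock before calling saveState. Calling saveState while still holding
// e.mu would block forever, because sync.Mutex is not reentrant.
func (e *fakeEnviron) bootstrap(id string) {
	e.mu.Lock()
	e.bootstrapped = true
	e.mu.Unlock()

	e.saveState(id)
}

func main() {
	e := &fakeEnviron{}
	e.bootstrap("machine-0")
	fmt.Println(e.bootstrapped, e.state)
}
```

If a Go test does hang like this, sending SIGQUIT (control-backslash in a terminal) dumps every goroutine's stack, which is how the blocked goroutine was spotted in the exchange above.
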
thumper | oh... | 01:18 |
thumper | test panci | 01:18 |
thumper | panic | 01:18 |
* thumper wondered what I did | 01:18 | |
hatch | is there any way I can deploy a charm locally (lxc) from my launchpad branch? | 01:19 |
hatch | the branch is lp:~hatch/+junk/hadoop-charm-update | 01:19 |
hatch | but it says that is an invalid charm name | 01:19 |
thumper | hatch: I think you may need to have a copy locally | 01:24 |
thumper | axw: I got a test failure with the null provider tests | 01:25 |
thumper | axw: may be a timing thing | 01:25 |
axw | thumper: yeah, rog filed a bug | 01:25 |
hatch | thumper: yeah that's what it looks like in the docs....thanks for confirming | 01:25 |
axw | will look into it shortly | 01:25 |
thumper | axw: this one? environSuite.TestEnvironBootstrapStorager | 01:25 |
thumper | hatch: np | 01:25 |
axw | thumper: yup | 01:25 |
thumper | ok, | 01:26 |
axw | https://bugs.launchpad.net/bugs/1234125 | 01:26 |
_mup_ | Bug #1234125: provider/null: sporadic test failure <intermittent-failure> <juju-core:New for axwalk> <https://launchpad.net/bugs/1234125> | 01:26 |
* thumper sighs | 01:26 | |
thumper | why do core-devs file "New" bugs | 01:26 |
thumper | it should at least be triaged | 01:26 |
thumper | stabby stabby | 01:26 |
thumper | wow | 01:46 |
thumper | found out why so many of our bootstrap tests are slow | 01:47 |
thumper | automagical upload is now rebuilding jujud every test :) | 01:47 |
thumper | why is it one hour jobs become four hour jobs | 01:59 |
sidnei | thumper: around? | 02:43 |
thumper | sidnei: kinda | 02:43 |
sidnei | thumper: just wondering if bootstrapping from 1.5.1 just built from trunk this morning it should be picking tools 1.4.1.1 or am i doing something wrong? | 02:43 |
sidnei | and by 1.4 and 1.5 i meant 1.14 and 1.15 of course | 02:44 |
davecheney | sidnei: as i understand it from 1.15.x onwards it will only be able to work if it finds an exact tools match | 02:44 |
sidnei | uhm, so this shouldn't have happened, if that's indeed correct | 02:45 |
sidnei | unless im missing some branch that was landed recently | 02:45 |
thumper | sidnei: I'd say it certainly shouldn't be | 02:45 |
sidnei | https://pastebin.canonical.com/98419/ fyi | 02:46 |
sidnei | let me paste that to paste.ubuntu.com | 02:47 |
sidnei | oh, oh. i think i know what the problem is | 02:47 |
sidnei | yeah, much better now. 'sudo juju bootstrap' was picking /usr/bin/juju obviously. | 02:48 |
thumper | davecheney, axw: https://codereview.appspot.com/14279044/ | 03:08 |
axw | thumper: looking | 03:08 |
thumper | sidnei: yea, I've picked up `sudo $(which juju) bootstrap' from axw | 03:09 |
thumper | before I'd hardcode the path | 03:09 |
sidnei | ah, nice one | 03:09 |
* davecheney looks | 03:09 | |
thumper | it is a beautiful day here | 03:13 |
thumper | once my wife is back with kid from the doctor, we are going for a picnic | 03:13 |
thumper | \o/ | 03:13 |
=== julian__ is now known as julianwa | ||
davecheney | thumper: +1 on that change | 03:19 |
thumper | and this? https://codereview.appspot.com/14321043 | 03:19 |
davecheney | nope https://codereview.appspot.com/14279044/ | 03:20 |
davecheney | is there another review ? | 03:20 |
thumper | yeah, the one I just said :) | 03:20 |
axw | I think there's a bug for this too | 03:21 |
thumper | there is | 03:21 |
thumper | linked to the branch | 03:22 |
axw | ah | 03:22 |
thumper | https://bugs.launchpad.net/juju-core/+bug/1216775 | 03:22 |
_mup_ | Bug #1216775: cmd/juju: local provider doesn't give a clear explanation when lxc is not configured correctly <papercut> <juju-core:Triaged by thumper> <https://launchpad.net/bugs/1216775> | 03:22 |
axw | thumper: yep, just expected a "Fixed #..." | 03:22 |
thumper | axw: I use the bzr --fixes lp:nnn | 03:22 |
thumper | rather than the lbox thingy I can't remember | 03:22 |
axw | okey dokey | 03:22 |
davecheney | 22:08 < thumper> davecheney, axw: https://codereview.appspot.com/14279044/ | 03:24 |
davecheney | that was the one I reviewed | 03:24 |
thumper | davecheney: yeah, I know | 03:24 |
thumper | davecheney: and thanks | 03:24 |
thumper | davecheney: I was giving you another :) | 03:24 |
thumper | help punished by requesting more help | 03:24 |
thumper | :) | 03:24 |
davecheney | cokc | 03:24 |
davecheney | fuck the lags is bad here today | 03:25 |
davecheney | _+1 | 03:26 |
axw | thumper: https://codereview.appspot.com/14315044/ if you have the time | 03:28 |
* thumper looks | 03:29 | |
thumper | axw: I don't get this: ( echo $* | grep -q touch ) && head -n 1 > /dev/null | 03:33 |
thumper | can you explain? | 03:33 |
axw | thumper: one sec | 03:33 |
axw | thumper: actually that's broken | 03:34 |
axw | thumper: my intention was to only expect input for the second bash, but that's just wrong | 03:34 |
axw | thumper: PTAL | 03:49 |
* thumper had to decode petal | 03:49 | |
axw | sorry, Please Take Another Look :) | 03:49 |
thumper | I got it, it just took me a while | 03:50 |
thumper | my cat is attacking me, she wants food | 03:50 |
axw | thumper: nps, it can wait | 03:50 |
axw | the code.. not the cat | 03:50 |
axw | :) | 03:50 |
thumper | have to teach the cat some time | 03:50 |
thumper | fark | 03:51 |
thumper | exec 0<&- | 03:51 |
thumper | now that is cryptic | 03:51 |
thumper | even with the comment, I don't get it | 03:51 |
axw | hence the comment ;) | 03:51 |
axw | heh | 03:51 |
axw | can't say I grok the syntax either | 03:51 |
thumper | axw: do you have any other null provider things to get done? | 03:55 |
axw | thumper: nothing for 1.16 | 03:55 |
davecheney | help | 04:09 |
thumper | davecheney: whazzup? | 04:09 |
davecheney | does anyone remember the issue for the charm dir being owned by root 0700 ? | 04:09 |
thumper | no | 04:09 |
thumper | well, I don't | 04:09 |
davecheney | there must be one | 04:10 |
davecheney | a few people lost their shit over it | 04:10 |
thumper | axw: you're branch failed | 04:12 |
* thumper sighs | 04:13 | |
thumper | your | 04:13 |
axw | huh | 04:13 |
thumper | obviously you aren't a branch | 04:13 |
axw | I tested that | 04:13 |
davecheney | thumper: https://bugs.launchpad.net/juju-core/+bug/1226088 | 04:13 |
_mup_ | Bug #1226088: config-get fails with "open FORCE-VERSION: permission denied" <juju-core:Invalid> <https://launchpad.net/bugs/1226088> | 04:13 |
axw | heh | 04:13 |
davecheney | is a protuberance of this issue | 04:13 |
davecheney | but not the core issue | 04:13 |
davecheney | got it | 04:14 |
davecheney | https://bugs.launchpad.net/juju-core/+bug/1205286 | 04:14 |
_mup_ | Bug #1205286: charm directory permissions now more restrictive <canonical-webops> <juju-core:Won't Fix> <postgresql (Juju Charms Collection):Fix Released> <postgresql-psql (Juju Charms Collection):Fix Released> <https://launchpad.net/bugs/1205286> | 04:14 |
* thumper back later | 04:15 | |
rogpeppe | mornin' all | 07:21 |
fwereade_ | heya rogpeppe | 07:36 |
rogpeppe | fwereade_: hiya | 07:36 |
axw_ | jam: ping | 08:35 |
jam | hi rogpeppe fwereade_ and axw_ | 08:36 |
jam | axw_: what's up? | 08:36 |
rogpeppe | jam: hiya | 08:36 |
axw_ | jam: what do we want the cloud-tools pocket for? | 08:36 |
fwereade_ | jam, heyhey | 08:39 |
jam | axw_: it holds backports of tools related stuff. AIUI juju-core itself will be in there, but so will LXC | 08:47 |
jam | axw_: I *think* we'll migrate to using Cloud Tools instead of ppa:juju/stable | 08:48 |
rogpeppe | axw_: small change to environs/sshstorage - i started commenting on your CL but then realised it was merged; https://codereview.appspot.com/14327043 | 08:48 |
axw_ | jam: ah ok. should I bother trying to get this in today? or have we already cut off? | 08:49 |
axw_ | rogpeppe: looking | 08:49 |
axw_ | oh, no comment left | 08:49 |
jam | the discussion says cut off, but we can still land it | 08:49 |
axw_ | rogpeppe: is there a problem, or did I already fix it? :) | 08:49 |
axw_ | rogpeppe: I made another fix in a further bzr push, didn't repropose | 08:50 |
rogpeppe | axw_: no real problem - just that if there were several lines of output, they wouldn't be joined with newlines | 08:50 |
rogpeppe | axw_: ah, if the change wasn't reproposed, i have no idea | 08:50 |
axw_ | rogpeppe: agh, yeah, because I changed to use the scanner | 08:50 |
axw_ | I'll fix in another, thanks | 08:50 |
rogpeppe | axw_: that CL fixes it | 08:51 |
axw_ | oh sorry, that's yours | 08:51 |
* axw_ looks again | 08:51 | |
rogpeppe | axw_: yeah | 08:51 |
axw_ | rogpeppe: heh, I did actually split it out into a separate function, but reversed it to keep the EOF bits close together | 08:53 |
axw_ | but... meh, no big deal | 08:53 |
rogpeppe | axw_: yeah, i see that; i think the separate function is still worth it though. i think the @EOF is distinctive enough really. | 08:55 |
rogpeppe | axw_: i suppose we could pass the @EOF to copyAsBase64 as a "terminator" argument | 08:56 |
axw_ | rogpeppe: I think it's fine, that'd probably be overkill | 08:56 |
axw_ | rogpeppe: lgtm, thanks | 08:56 |
rogpeppe | axw_: ta | 08:57 |
* fwereade_ breakfast | 09:03 | |
axw_ | rogpeppe: anything we can do about this in the short term? https://bugs.launchpad.net/bugs/1234534 | 09:09 |
_mup_ | Bug #1234534: local provider spams machine 0 log with "localInstance.Addresses not implemented" <juju-core:New> <https://launchpad.net/bugs/1234534> | 09:09 |
axw_ | bbiab - making dinner | 09:09 |
jam | fwereade_: enjoy your food, but poke for when you get back | 09:09 |
rogpeppe | axw_: hmm interesting. i'll investigate. | 09:10 |
rogpeppe | axw_: there's a trivial fix | 09:11 |
rogpeppe | axw_: just remove that log statement :-) | 09:11 |
rogpeppe | axw_: but that leaves a slight problem | 09:11 |
rogpeppe | axw_: we shouldn't really be polling for address changes when the address can never change | 09:12 |
rogpeppe | axw_: i wonder if Addresses should return ErrUnimplemented; then the polling loop could just not bother anymore | 09:13 |
rogpeppe | fwereade_, dimitern, axw_: trivial CL to add a little bit of logging: https://codereview.appspot.com/14328043 | 09:14 |
dimitern | rogpeppe, looking | 09:15 |
dimitern | rogpeppe, reviewed | 09:21 |
rogpeppe | dimitern: ta | 09:21 |
axw_ | rogpeppe: would it be bad to just to return an empty list of addresses from the local provider? | 09:27 |
rogpeppe | axw_: no, but we don't really want to be continuously polling those addresses | 09:28 |
axw_ | rogpeppe: ah, it keeps checking if it has an empty list? ok | 09:28 |
rogpeppe | axw_: yes - it'll wait for the machine to get an address (currently it polls quite frequently at that stage, once a second) | 09:28 |
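
The idea floated above, having `Addresses` return an "unimplemented" error so the polling loop can stop, might look something like this. It is a sketch under assumptions: `errUnimplemented`, the `addresser` interface, `localInstance` and `pollAddresses` are all hypothetical names for illustration, not the address updater's actual API.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errUnimplemented is a hypothetical sentinel error meaning
// "this provider will never report addresses".
var errUnimplemented = errors.New("addresses not implemented")

// addresser is a hypothetical minimal view of an instance.
type addresser interface {
	Addresses() ([]string, error)
}

// localInstance stands in for an instance whose addresses never change.
type localInstance struct{ calls int }

func (l *localInstance) Addresses() ([]string, error) {
	l.calls++
	return nil, errUnimplemented
}

// pollAddresses asks for addresses up to maxAttempts times, sleeping
// between attempts. It stops immediately on errUnimplemented rather
// than polling (and logging) forever.
func pollAddresses(a addresser, interval time.Duration, maxAttempts int) ([]string, error) {
	for i := 0; i < maxAttempts; i++ {
		addrs, err := a.Addresses()
		if errors.Is(err, errUnimplemented) {
			return nil, err
		}
		if err == nil && len(addrs) > 0 {
			return addrs, nil
		}
		time.Sleep(interval)
	}
	return nil, errors.New("no addresses after polling")
}

func main() {
	inst := &localInstance{}
	_, err := pollAddresses(inst, time.Second, 10)
	fmt.Println(err, inst.calls)
}
```

Returning an empty list instead, as also suggested above, would keep the loop polling, which is why a distinct sentinel is used in this sketch.
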
rogpeppe | dimitern: you're right about the %#v thing - it was a cop-out | 09:30 |
rogpeppe | dimitern: i'm wondering about a nice printed format for addresses | 09:30 |
rogpeppe | dimitern, mgz: how about something like this: public:12.3.5.6(networkname) | 09:31 |
rogpeppe | dimitern, mgz: where (networkname) is omitted if NetworkName is unset | 09:31 |
rogpeppe | dimitern, mgz: and i'd omitted the address type because it's almost always easily divinable from the contents of the address. | 09:32 |
dimitern | rogpeppe, that lgtm | 09:34 |
rogpeppe | mgz: i suspect that Address.AddressType could actually be omitted entirely and implemented as a method. | 09:35 |
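
A rough sketch of the printed format proposed above, `public:12.3.5.6(networkname)`, with the address type derived by a method instead of stored. The `address` struct and its field names are assumptions made for illustration, and omitting an empty scope prefix is this sketch's own choice; only the omission of an unset network name comes from the discussion.

```go
package main

import (
	"fmt"
	"net"
)

// address is a hypothetical, simplified address record.
type address struct {
	Value       string
	NetworkName string
	Scope       string // e.g. "public"
}

// String renders scope:value(networkname), leaving out the
// "(networkname)" part when NetworkName is unset.
func (a address) String() string {
	s := a.Value
	if a.Scope != "" {
		s = a.Scope + ":" + s
	}
	if a.NetworkName != "" {
		s += "(" + a.NetworkName + ")"
	}
	return s
}

// Type works out the address type from the contents of the address,
// so it does not need to be stored as a field.
func (a address) Type() string {
	ip := net.ParseIP(a.Value)
	switch {
	case ip == nil:
		return "hostname"
	case ip.To4() != nil:
		return "ipv4"
	default:
		return "ipv6"
	}
}

func main() {
	a := address{Value: "12.3.5.6", NetworkName: "net0", Scope: "public"}
	fmt.Println(a, a.Type())
}
```
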
jam | dimitern or rogpeppe: https://bugs.launchpad.net/juju-core/+bug/1233936 | 09:38 |
_mup_ | Bug #1233936: worker/uniter: uniter restarts when relation removed <juju-core:Triaged> <https://launchpad.net/bugs/1233936> | 09:38 |
jam | It looks like if you delete a relation | 09:39 |
jam | then when the agents see the Changed event | 09:39 |
jam | they get an EPERM trying to find out what changed | 09:39 |
jam | (rather than, say, an ENOTFOUND) | 09:39 |
jamespage | can someone spare me some time to debug a juju-core 1.14.1/openstack havana problem I'm seeing? | 09:39 |
rogpeppe | wonderful errors without context | 09:39 |
rogpeppe | jamespage: what's the issue? | 09:39 |
jamespage | rogpeppe, I can bootstrap an environment OK; but when I deploy services, extra instances are not being provisioned | 09:40 |
jam | jamespage: does "juju status" tell you anything informative? | 09:41 |
jamespage | 'pending' | 09:41 |
jam | often it should give errors for the units you've asked for | 09:41 |
jamespage | no errors | 09:41 |
dimitern | jam, seems your analysis of that bug is correct | 09:41 |
jam | dimitern: I'm pretty sure I know why it is failing, I'm not 100% sure what the correct behavior is | 09:42 |
jamespage | jam, rogpeppe: I see this in machine-0.log | 09:42 |
jamespage | http://paste.ubuntu.com/6187432/ | 09:42 |
dimitern | jam, the problem is, before we get the relation we can't really decide whether the user is allowed to see it or not | 09:42 |
dimitern | jam, hence the ErrPerm there | 09:42 |
jam | dimitern: don't we encode the endpoints into the key itself? | 09:43 |
dimitern | jam, but I guess the correct behavior is, rather than trying to fix the API (which is correct in this case - at least consistent), we should fix the uniter not to die on ErrPerm there | 09:43 |
jam | so we could check if the agent is one piece of the relation that would-have-existed | 09:43 |
rogpeppe | jamespage: could you paste the whole log please? | 09:43 |
rogpeppe | jamespage: from the look of that last line it looks like it is actually trying to provision an instance | 09:44 |
jam | jamespage: so given this is "com.canonical.serverstack.serverstack:ubuntu:..." I'm guessing this is a custom data source | 09:44 |
dimitern | jam, "would-have-existed" for a remote entity (i.e. not our authenticated one), is meaningless, if we get NotFound from state | 09:44 |
jam | jamespage: it *looks* like it properly finds an image id to launch | 09:45 |
jam | candidate matches for products ["com.ubuntu.cloud:server:12.04:amd64"] are [0xc2004ac500] | 09:45 |
jamespage | jam: yes - thats the simplestreams sync of data into our testing cloud | 09:45 |
jam | (pointers don't help but the fact it isn't an empty list is a good sign) | 09:45 |
dimitern | jam, I mean we can't distinguish between "this is something you should be able to see" and not | 09:45 |
jam | dimitern: if "unit-0" has been validated, and asks about "relation-unit-0:unit-2" it doesn't seem like a problem to tell it ENOTFOUND | 09:46 |
jamespage | status output - http://paste.ubuntu.com/6187449/ | 09:46 |
jam | dimitern: at least, I thought we changed relation-tags to hold the relation-keys which are crafted based on the unit endpoints involved | 09:46 |
dimitern | jam, relation tags are exactly as the relation keys in state - no more, no less | 09:46 |
dimitern | jam, informationwise | 09:46 |
jamespage | jam, rogpeppe: http://paste.ubuntu.com/6187453/ | 09:46 |
jamespage | machine-0.log file | 09:47 |
jam | jamespage: the fact that we produce "openstack user data" sounds like we are trying to start an instance | 09:47 |
jam | So it feels like we are missing the next lines | 09:47 |
dimitern | jam, and just because unit 0 was validated doesn't mean "have access to any arbitrary service a relation tag might specify" imo | 09:47 |
jam | dimitern: if it has *unit-0* in the tag | 09:47 |
jam | then it has access to relations involving *unit-0& | 09:47 |
jam | unit-0 | 09:47 |
rogpeppe | jam: yes, that message is actually produced inside the openstack StartInstance method | 09:48 |
jamespage | jam: agreed - I see calls to the nova-api for flavors and stuff | 09:48 |
dimitern | jam, well, I guess, although not entirely convinced it should | 09:48 |
jamespage | but nothing related to actually starting an instance! | 09:48 |
rogpeppe | jamespage: i suppose it's possible that the startinstance request is blocking forever | 09:48 |
jam | dimitern: why would a unit *ever* not have access to a relation involving that unit ? | 09:48 |
dimitern | jam, if we check the relation tag before trying to get the relation from state and make sure one of the services is the same as our unit's service, then report NotFound instead of ErrPerm | 09:49 |
rogpeppe | jamespage: could you try something for me? kill -QUIT the jujud process | 09:50 |
jam | jamespage: so, looking at the code | 09:50 |
jam | the next thing it does | 09:50 |
jam | is try to set up a Security Group | 09:50 |
jam | it is possible you're out of security groups | 09:50 |
jam | but our error logging is terrible ? | 09:50 |
fwereade_ | jm, heyhey | 09:50 |
fwereade_ | jam ^ | 09:50 |
jam | hey fwereade_ | 09:50 |
rogpeppe | jamespage: that *should* produce a stack trace to the log file, showing where everything is | 09:50 |
dimitern | jam, how critical is this bug? | 09:51 |
rogpeppe | jam: if the StartInstance call fails, the provisioner *does* actually log the error immediately | 09:51 |
jamespage | rogpeppe, http://paste.ubuntu.com/6187464/ | 09:51 |
jam | dimitern: it really depends, *I* would think that if a relation was deleted then we should fire a charm hook | 09:51 |
jam | so the charm can clean itself up | 09:51 |
jam | rogpeppe: but I'm wondering if we are hitting a failure in ensureSecurityGroup | 09:51 |
jam | which I don't see any log messages about | 09:52 |
rogpeppe | jamespage, jam: yes, looks like the StartInstance request is blocked forever | 09:52 |
dimitern | jam, that's a tall order - not many charms actually use that | 09:52 |
jamespage | rogpeppe, jam: log from nova api server - http://paste.ubuntu.com/6187468/ | 09:52 |
jam | dimitern: if the appropriate behavior is "deleted relation, nothing happens" then it doesn't matter if the nothing-happens is done by restarting the agent or by just continuing the loop | 09:52 |
rogpeppe | jam: see goroutine 78 in that last paste | 09:52 |
dimitern | jam, although restarting the agent kinda sucks | 09:52 |
dimitern | jam, I'll look into it later today then | 09:53 |
jam | jamespage: it does, indeed, appear to be stalled in an HTTP request | 09:53 |
fwereade_ | jam, the uniter did itself clean up that relation completely, didn't it? | 09:53 |
jam | fwereade_: no mention of it in https://bugs.launchpad.net/juju-core/+bug/1233936 | 09:53 |
_mup_ | Bug #1233936: worker/uniter: uniter restarts when relation removed <juju-core:Triaged by dimitern> <https://launchpad.net/bugs/1233936> | 09:53 |
fwereade_ | jam, looks like it's doing it to me | 09:54 |
jam | fwereade_: however that loop is trying to aggregate f.relationsChanged(ids) | 09:54 |
jam | fwereade_: if it can't get the id and returns immediately in the "range keys" loop | 09:54 |
jam | then it won't call f.relationsChanged | 09:54 |
jam | because the Uniter dies before it gets to call that | 09:54 |
jam | fwereade_: now, I could be completely wrong about where the error is originating, because we don't have any clue but where the error was finally logged | 09:55 |
fwereade_ | jam, it doesn't need to call relationsChanged | 09:56 |
jam | fwereade_: but if you hit the line 350 because err != nil and err != ENOTFOUND then you won't call f.relationsChanged | 09:56 |
jam | fwereade_: so, am I wrong in believing that if you delete a relation it should trigger a charm hook ? | 09:56 |
fwereade_ | jam, the hooks have all already run | 09:56 |
rogpeppe | jam: is that the entire nova server log? i'd expect to see something around 09:34 (assuming the clocks are vaguely in sync) | 09:56 |
fwereade_ | jam, relation removal was triggered by the completion of the last hook essentially | 09:56 |
jam | rogpeppe: I think you mean jamespage ^^ | 09:56 |
rogpeppe | jam: i do - very inconvenient juxtaposition of irc nicks :-) | 09:57 |
jam | fwereade_: so dave's original bug was that he deleted the relation from the client | 09:57 |
jamespage | rogpeppe, those were the calls I saw when I issued the QUIT | 09:57 |
jam | jamespage: do you know if nova logs when it starts a request or when it finishes them ? | 09:57 |
rogpeppe | jamespage: ah, i'd like to see what happened around the time the request was issued | 09:57 |
jam | jamespage: given the "time" field | 09:57 |
fwereade_ | jam, if the filter sees a relation that doesn't exist, the uniter *by definition* has no knowledge of it | 09:57 |
jam | sounds like when it finishes them | 09:58 |
jam | jamespage: if something was hung, then you wouldn't have a nova log either | 09:58 |
jamespage | rogpeppe, full log - http://paste.ubuntu.com/6187481/ | 09:58 |
jamespage | http://paste.ubuntu.com/6187481/ | 09:58 |
fwereade_ | jam, if the uniter knew about it, it'd be in scope | 09:58 |
fwereade_ | jam, if it's in scope, the relation can't be removed | 09:58 |
jam | fwereade_: so what does "delete the relation from a client mean" ? just trigger the teardown of the process ? | 09:59 |
fwereade_ | jam, destroy-relation will remove it straight off if no units are in scope; otherwise it sets dying and waits for the last departing unit to remove the relation as it does so | 10:00 |
fwereade_ | jam, it clearly ran the hooks that need to be run before leaving scope | 10:00 |
jamespage | jam: not sure bout the logging | 10:00 |
fwereade_ | jam, and then the relation was somehow removed | 10:00 |
jamespage | but I do see several established connections on the API server from the bootstrap node | 10:00 |
fwereade_ | jam, the overwhelming balance of prob is that the uniter really did leave scope | 10:00 |
jam | jamespage: .100 is the bootstrap node, right? | 10:02 |
jamespage | yes | 10:02 |
jam | fwereade_: so I think it boils down to: we used to get ENOTFOUND and now we get EPERM, is it best to return a nice ENOTFOUND if it looks like the uniter would have had access to that relation if it did exist. | 10:03 |
jam | fwereade_: I'm still skeptical that rebooting is "just ok" | 10:03 |
jam | as there is a step that would have happened if it had got ENOTFOUND | 10:03 |
jam | so, (1) I definitely think this should be fixed, but (2) I'd accept that it isn't Critical | 10:04 |
jam | fwereade_: but I actually wanted to chat about Uniter.CharmURL when we finish the other priority interrupts :) | 10:04 |
fwereade_ | jam, rebooting is just fine | 10:04 |
fwereade_ | jam, all the local uniter state for that relation is cleaned up before that can happen | 10:04 |
fwereade_ | I think the right thing is for the client to handle ErrPerm explicitly, given that we basically always return perm rather than notfound | 10:04 |
jam | fwereade_: that seems *very* cavalier to me | 10:05 |
jam | perhaps a "rebooting is fine in this case because..." | 10:05 |
fwereade_ | jam, rebooting should not happen, but it doesn't cause any harm | 10:05 |
jam | fwereade_: hence my "this should be fixed, but isn't Critical" | 10:06 |
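
One way to read the suggestion above (have the client handle ErrPerm explicitly, since the API tends to return a permission error where a not-found might be expected) is sketched below. Everything here is hypothetical: `errPerm`, `errNotFound`, `changedRelations` and the `lookup` callback are illustrative names, not the uniter's real filter code.

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical sentinel errors standing in for the API's error codes.
var (
	errNotFound = errors.New("not found")
	errPerm     = errors.New("permission denied")
)

// changedRelations resolves relation keys to ids. A key that comes back
// as not-found OR permission-denied is treated as a removed relation
// and skipped, so one vanished relation does not abort the whole loop.
func changedRelations(keys []string, lookup func(string) (int, error)) ([]int, error) {
	var ids []int
	for _, key := range keys {
		id, err := lookup(key)
		if errors.Is(err, errNotFound) || errors.Is(err, errPerm) {
			continue // relation is gone; nothing left to report
		}
		if err != nil {
			return nil, err
		}
		ids = append(ids, id)
	}
	return ids, nil
}

func main() {
	known := map[string]int{"wordpress:db mysql:server": 0}
	lookup := func(key string) (int, error) {
		if id, ok := known[key]; ok {
			return id, nil
		}
		return 0, errPerm
	}
	ids, err := changedRelations([]string{"wordpress:db mysql:server", "removed:rel"}, lookup)
	fmt.Println(ids, err)
}
```
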
axw_ | jam, rogpeppe, dimitern: joining? | 10:07 |
axw_ | mgz: | 10:07 |
jam | jamespage: so from what I can see we are making an attempt, it is possible that the Openstack server is telling us "too many requests, try again after X seconds" and we will retry up to 3 times for that | 10:09 |
jam | I would expect that to show up in the nova log, though. | 10:10 |
jamespage | jam, rogpeppe: thats odd - the two missing instances just started up | 10:15 |
rogpeppe | jamespage: ha ha | 10:15 |
jamespage | rogpeppe, yeah - but the lag was massive | 10:16 |
rogpeppe | jamespage: that's probably because killing the agent terminated the requests, then it retried when it restarted | 10:16 |
rogpeppe | jamespage: so i suspect that for some reason those requests were blocked forever - i don't know if it's our problem or nova's | 10:17 |
rogpeppe | jamespage: perhaps we should time-limit our requests | 10:17 |
jamespage | rogpeppe, maybe - I turned off ratelimiting | 10:17 |
jamespage | that might be why it started working but I'm uncertain | 10:18 |
jamespage | let me test that again | 10:18 |
jam | rogpeppe: I agree, we've had a bug reported that we would end up with huge amounts of "dead" connections because they never timed out | 10:20 |
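
Time-limiting requests, as suggested above, can be sketched with the standard library's `http.Client` `Timeout` field. This shows the general technique only, not how juju's OpenStack client is actually configured; `newClient` and the slow test server are made up for the example.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"time"
)

// newClient returns an HTTP client that abandons any request, including
// reading the response body, once the given duration has elapsed.
func newClient(timeout time.Duration) *http.Client {
	return &http.Client{Timeout: timeout}
}

func main() {
	// A stand-in server that answers far too slowly.
	slow := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		time.Sleep(200 * time.Millisecond)
	}))
	defer slow.Close()

	// The request fails with a timeout error instead of blocking.
	_, err := newClient(50 * time.Millisecond).Get(slow.URL)
	fmt.Println("request failed:", err != nil)
}
```

Without a timeout, the zero-value client waits indefinitely, which matches the blocked StartInstance request seen in the stack dump.
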
arosales | fwereade_, thanks for the fix on bug https://bugs.launchpad.net/juju-core/+bug/1217781 | 10:24 |
_mup_ | Bug #1217781: machine destruction depends on machine agents <cts> <cts-cloud-review> <juju-core:Fix Committed> <https://launchpad.net/bugs/1217781> | 10:24 |
jamespage | jam, rogpeppe: hmm - so I turned rate-limiting back on and it's still fine | 10:24 |
jamespage | odd | 10:24 |
jam | jamespage: well we haven't done enough requests yet to cause a problem :) | 10:24 |
jam | it may not be rate limiting | 10:25 |
jam | but something in nova that starts a request and never finishes it | 10:25 |
jamespage | add-unit -n 16 worked just fine as well | 10:25 |
jamespage | jam: lots of variables - might be sucky networking on 12.04 precise LXC (the cloud-controller is running in juju managed LXC) | 10:26 |
arosales | natefinch, was the maas bug you were going to look into bug https://bugs.launchpad.net/gomaasapi/+bug/1222671 ? | 10:33 |
_mup_ | Bug #1222671: maas provider must only attempt to stop machines in the allocated state <cts-cloud-review> <Go MAAS API Library:Triaged> <https://launchpad.net/bugs/1222671> | 10:33 |
natefinch | arosales: no, there's a bug about juju going out and finding nodes that are part of a different juju environment and shutting them down | 10:36 |
arosales | natefinch, ok | 10:38 |
arosales | natefinch, added a comment to your merge request | 10:54 |
arosales | https://code.launchpad.net/~natefinch/juju-core/018-azure-help/+merge/188936/comments/432802 | 10:54 |
arosales | the hp cloud trailing "/" is important as that actually breaks config | 10:55 |
natefinch | arosales: you have to do "publish and mail comments" for me to see the comments. | 10:57 |
natefinch | arosales: I saw that bug you filed... is the slash supposed to be there or not? | 10:57 |
arosales | natefinch, sorry I am not following usual core process here :-/ | 10:58 |
arosales | but you should see the comment in lp @ https://code.launchpad.net/~natefinch/juju-core/018-azure-help/+merge/188936 | 10:58 |
arosales | natefinch, the trailing slash is supposed to be there for hp. | 10:59 |
mramm | jam: you said you had two new bugs | 10:59 |
natefinch | arosales: ahh I see.. thanks. | 10:59 |
arosales | jam, fwereade_ can we quickly discuss https://bugs.launchpad.net/gomaasapi/+bug/1222671 ? | 11:00 |
_mup_ | Bug #1222671: maas provider must only attempt to stop machines in the allocated state <cts-cloud-review> <Go MAAS API Library:Triaged> <https://launchpad.net/bugs/1222671> | 11:00 |
mramm | jam: can you target them to 1.15.1? | 11:00 |
jam | mramm: will do: bug #1234577 | 11:01 |
_mup_ | Bug #1234577: Uniter needs to support ssl-hostname-verification: false <juju-core:Triaged by jameinel> <https://launchpad.net/bugs/1234577> | 11:01 |
jam | and bug #1234576 | 11:01 |
_mup_ | Bug #1234576: Upgrader needs to support ssl-hostname-verification: false <juju-core:Triaged by jameinel> <https://launchpad.net/bugs/1234576> | 11:01 |
natefinch | arosales: changes made from your comments. Thanks for looking. | 11:04 |
arosales | natefinch, thanks | 11:05 |
arosales | thanks for getting that in too :-) | 11:05 |
jam | dimitern: https://bugs.launchpad.net/juju-core/+bug/1233451 you can dupe it to something else if you already had a bug | 11:05 |
_mup_ | Bug #1233451: juju upgrade-juju results in unsupported behavior <juju-core:In Progress by dimitern> <https://launchpad.net/bugs/1233451> | 11:05 |
arosales | keeping the help menu up to date that is | 11:05 |
natefinch | arosales: no problem. It bugs me when there aren't good docs on tools. | 11:06 |
arosales | natefinch, yup and a bug UX issue | 11:11 |
natefinch | arosales: I also bugged evilnick and the web guys to make the docs more prominent on juju.ubuntu.com since they're really hard to find even if you know they're there somewhere | 11:12 |
mgz | I wonder if we could fix bug 1222671 at the juju-core level, it seems a little similar to filtering the result of instances to machines in the building/running state | 11:12 |
mgz | not sure if the list maas gives us actually has enough to trim out non-allocated machines though | 11:12 |
mgz | bug 1222671 | 11:13 |
_mup_ | Bug #1222671: maas provider must only attempt to stop machines in the allocated state <cts-cloud-review> <Go MAAS API Library:Triaged> <juju-core:Triaged> <https://launchpad.net/bugs/1222671> | 11:13 |
arosales | natefinch, I will also follow up with the web team on that feedback. | 11:13 |
natefinch | arosales: I literally had to do a text search of the front page to convince myself there even was a link to the docs there | 11:14 |
arosales | ouch | 11:14 |
arosales | natefinch, a user has to go to Resources --> then docs | 11:15 |
jam | mgz: it does say "maas provider" which sounds juju-y, | 11:15 |
natefinch | davecheney: I called azure Windows Azure because that's what Microsoft calls it (even though it has nothing to do with windows): http://www.windowsazure.com/en-us/ | 11:15 |
arosales | natefinch, that is the correct name branding | 11:17 |
rogpeppe | fwereade_, axw_, mgz, dimitern: here's the fix for that address polling problem: https://codereview.appspot.com/14337043/ | 11:17 |
dimitern | rogpeppe, looking | 11:18 |
natefinch | arosales: Cool. I just copied what they put on their website, figure they can't get mad at us for that :) | 11:18 |
rogpeppe | dimitern: thanks | 11:18 |
jam | rogpeppe: it doesn't seem like IsUnimplemented should be a Warning | 11:20 |
rogpeppe | jam: ? | 11:20 |
rogpeppe | jam: oh, i see | 11:21 |
jam | rogpeppe: https://codereview.appspot.com/14337043/patch/1/1007 | 11:21 |
rogpeppe | jam: i think it's reasonable to see that single message | 11:21 |
jam | rogpeppe: on every boot, on etc, I really don't think we want a Warning | 11:21 |
rogpeppe | jam: won't you only see it in the machine agent log file? | 11:22 |
jam | rogpeppe: it will end up in debug-log | 11:22 |
dimitern | rogpeppe, reviewed | 11:22 |
rogpeppe | dimitern: ta | 11:22 |
jam | and every time the agent restarts | 11:22 |
jam | etc | 11:22 |
dimitern | jam, a little clarification about bug 1234576 ? | 11:22 |
_mup_ | Bug #1234576: Upgrader needs to support ssl-hostname-verification: false <juju-core:Triaged by dimitern> <https://launchpad.net/bugs/1234576> | 11:22 |
rogpeppe | jam: ok, i'll make it not log anything in that case | 11:22 |
jam | rogpeppe: I would be ~ ok if it was INFO/DEBUG but Warning says that you might need to fix something, and this is explicitly unfixable | 11:22 |
jam | dimitern: Upgrader grabs a Tools URL and then does net/http/Get of that file | 11:23 |
rogpeppe | jam: fair enough | 11:23 |
jam | we need a way to have Upgrader realize that EnvironConfig.SSLHostnameVerification() == false | 11:23 |
rogpeppe | dimitern: would you be happier if it was 100 years? | 11:23 |
jam | dimitern: and then use utils.NonValidatingClient | 11:23 |
dimitern | rogpeppe, why poll at all in this case? | 11:24 |
rogpeppe | dimitern: because it doesn't complicate the code any more | 11:24 |
dimitern | jam, so basically the change is in fetchTools in the upgrader worker | 11:24 |
rogpeppe | dimitern: and we don't care if it does | 11:24 |
jam | dimitern: and some sort of API to get the environ setting into the worker | 11:24 |
dimitern | rogpeppe, it doesn't seem right | 11:24 |
jam | dimitern: I was going to change the Tools api to include that bit | 11:25 |
jam | but fwereade_ asked to make it a separate api call | 11:25 |
rogpeppe | dimitern: because...? | 11:25 |
jam | I don't really care | 11:25 |
dimitern | rogpeppe, it feels like we should stop polling and report an error there | 11:25 |
dimitern | jam, why is the API involved at all here? | 11:25 |
rogpeppe | dimitern: the machine poller has to stay around, unless we refactor most of the logic in that worker | 11:25 |
jam | dimitern: the Uniter worker doesn't have Env creds | 11:25 |
jam | and doesn't have EnvironConfig | 11:25 |
jam | so it doesn't *know* if ssl-hostname-verification was set or not | 11:25 |
rogpeppe | dimitern: so it might as well just have a very long poll interval, i think | 11:26 |
dimitern | jam, ah, it's an environ config setting, ok, I get it now | 11:26 |
jam | dimitern: and essentially the same thing for Uniter downloading a charm | 11:26 |
rogpeppe | dimitern: which fixes this problem without making the code more complex for a case which is going to go away anyway | 11:26 |
jam | they don't have access to that config setting | 11:26 |
jam | and need it in one fashion or another | 11:26 |
dimitern | rogpeppe, doesn't it feel bad fixing it like that? :) | 11:26 |
rogpeppe | dimitern: no | 11:26 |
rogpeppe | dimitern: :-) | 11:26 |
rogpeppe | dimitern: it feels like a minimally invasive and perfectly sufficient change | 11:27 |
dimitern | jam, so some call shared by the uniter and upgrader, called SSLHostVerification bool ? | 11:27 |
rogpeppe | dimitern: and when the local provider gets an Addresses implementation, we just need to delete 4 lines of code. | 11:27 |
fwereade_ | jam, dimitern: I think I'm sold on putting the ssl setting in the api calls that return urls | 11:27 |
fwereade_ | jam, dimitern: we can just use the env setting now, and we're free to improve it as jam suggested at our leisure | 11:27 |
jam | dimitern: right, as mentioned I was going to put it into the existing API that they are already calling, but fwereade_ thought that was more risky | 11:27 |
rogpeppe | fwereade_: that seems good to me | 11:28 |
jam | fwereade_: makes me happy, though dimitern the Uniter charm one is using a StringsBoolResult | 11:28 |
jam | which is shared with GetP? call | 11:28 |
fwereade_ | jam, dimitern: I thought I just backpedalled on that but I can't find where I typed it | 11:28 |
jam | which is what I wanted to talk with fwereade_ about earlier | 11:28 |
fwereade_ | jam, dimitern: so: I'm fine putting a DisableSSLHostnameVerification bool into the existing results, I think | 11:29 |
dimitern | fwereade_, jam, so adding a field to the result of the Tools() upgrader API call (which will make it available to the provisioner as well), and a field to the CharmURL() uniter call | 11:29 |
jam | dimitern: that was the idea, yeas | 11:29 |
jam | yes | 11:29 |
jam | fwereade_: to be clear, the EnvironConfig setting is "SSLHostnameVerification" | 11:29 |
fwereade_ | jam, dimitern: it can always match the env config setting for now, and it's all behind the api so it's possible to be more sophisticated in future | 11:30 |
dimitern | jam, camel case?? | 11:30 |
jam | dimitern: I mean, inverted true/false | 11:30 |
jam | the config setting | 11:30 |
jam | you set it to "false" to disable | 11:30 |
jam | false => stop checking certs | 11:30 |
rogpeppe | dimitern: another possibility would be to have EnvironProvider implement a SupportsInstanceAddresses method, then avoid starting the worker at all if that returns false. that's more efficient but much more invasive. | 11:30 |
dimitern | ok | 11:30 |
fwereade_ | jam, dimitern: and definitely not CharmURL | 11:31 |
dimitern | rogpeppe, but certainly feels more like the right thing to do | 11:31 |
dimitern | fwereade_, what? | 11:31 |
rogpeppe | dimitern: we'll only want to rip it out again later. | 11:31 |
fwereade_ | jam, dimitern: CharmURL is completely irrelevant AFAICT | 11:32 |
fwereade_ | jam, dimitern: why would we ever change that? | 11:32 |
rogpeppe | dimitern: there's no point in making all the code base more complex for this little temporary hack. | 11:32 |
dimitern | fwereade_, I don't know, just asking to figure it out what needs changing | 11:32 |
dimitern | rogpeppe, why temporary? when is it going away? | 11:32 |
fwereade_ | jam, dimitern: we're thinking of CharmArchiveURL | 11:32 |
jam | rogpeppe: dimitern: for polling, I could see some value in a say 1/hour message indicating that if the IP address were to change, we wouldn't notice | 11:32 |
fwereade_ | jam, dimitern: CharmURL itself is completely different | 11:33 |
jam | fwereade_: so *shrug* I hadn't finished writing it. But whatever we actually download needs to change. | 11:33 |
fwereade_ | jam, dimitern: obviously :-/ | 11:33 |
rogpeppe | dimitern: when the local provider implements Instance.Addresses, which i hope will happen quite soon | 11:33 |
fwereade_ | rogpeppe, you have any idea how to do that? | 11:33 |
dimitern | rogpeppe, ok, but please bug it and add a TODO about that | 11:33 |
dimitern | lest we forget later | 11:34 |
jam | rogpeppe: an infrequent message so when something does happen and the user goes WTF is kind of nice, but I'm happy with what you've done, (though it shouldn't be a Warning) | 11:34 |
fwereade_ | jam, dimitern: so anyway -- the only other thing to be careful about is how a false value for that field will be interpreted, because that's what we'll always get returned from 1.14 | 11:35 |
jam | rogpeppe: and I'm pretty strong on a JFDI to be done :) | 11:35 |
jam | fwereade_: fair point | 11:35 |
jam | and one that I would have gotten to, yes | 11:35 |
rogpeppe | fwereade_: well, the local provider implements DNSName and i am presuming that Addresses is going to supplant DNSName completely | 11:35 |
jam | fwereade_: I think I was leaning towards Disabled | 11:35 |
jam | but when I saw it just now I thought it should line up | 11:35 |
dimitern | fwereade_, jam, ok, it seems not CharmURL, but ArchiveURL is the one to patch in the uniter | 11:35 |
fwereade_ | jam, dimitern: so we might need to invert meaning for sanity's sake, but today I fail boolean logic | 11:35 |
dimitern | fwereade_, right, "false" will mean do verification | 11:37 |
dimitern | fwereade_, so the field will be called SkipSSLHostnameVerification | 11:37 |
dimitern | or NoSSLHostnameVerification | 11:38 |
fwereade_ | dimitern, let's call it Disable to match the setting it's mirroring maybe? | 11:38 |
rogpeppe | jam: it would be nice if you didn't get a log message every time it polls actually, although i'm not sure how to do that while preserving useful information. | 11:39 |
dimitern | fwereade_, ok DisableSSLHostnameVerification in ArchiveURL() and Tools() | 11:39 |
jam | fwereade_: so it *really* looks like CharmURL from everything I've traced through | 11:39 |
fwereade_ | dimitern, actually I may be on crack, I have no idea | 11:39 |
fwereade_ | jam, CharmURL is "cs:" or "local:" | 11:39 |
jam | fwereade_: ok, it turns out the object returned which has URL inside it is "live" and have a connection to the API | 11:39 |
dimitern | jam, fwereade_, uniter.charm.download() uses ArchiveURL() | 11:40 |
jam | it wasn't very clear that it wasn't just a blob of data | 11:40 |
rogpeppe | jam: we could log only if the message is different, but i suspect that some providers will include a bunch of other stuff in the error message including request ids etc which would make that not work | 11:40 |
jam | dimitern: well, getArchiveInfo("CharmArchiveURL") | 11:40 |
jam | dimitern: ah, but the API for it is ArchiveURL | 11:40 |
jam | gotcha | 11:40 |
dimitern | yep | 11:40 |
jam | it is hard to figure out what Charm object I'm looking at | 11:40 |
jam | across 3 level | 11:40 |
jam | levels | 11:40 |
jam | or more | 11:40 |
fwereade_ | rogpeppe, AFAICT the local provider Instance is completely fucked | 11:40 |
jam | as they are all *just called Charm* | 11:41 |
fwereade_ | rogpeppe, and will never work | 11:41 |
fwereade_ | rogpeppe, except for machine 0 :/ | 11:41 |
rogpeppe | fwereade_: i didn't look at it too much | 11:41 |
rogpeppe | fwereade_: fucked how, exactly? | 11:41 |
jam | fwereade_: so in default Precise, you can't find the IP address for the lxc's you started, only the instances themselves can report it back. *however* the 12.04.03 update gives us better LXC tools and lxc-ls *does* give us the info | 11:42 |
jam | fwereade_: so I don't think we're just-fucked | 11:42 |
fwereade_ | rogpeppe, getAddressForInterface("eth0") to get an address for some other container? | 11:42 |
jam | fwereade_: ^^ we change to use the lxc-* tools once we can assume we have the updated tools | 11:42 |
jam | which is at least *some* of what we want Cloud-tools archive for | 11:42 |
fwereade_ | jam, indeed, that's cool | 11:43 |
fwereade_ | jam, sooooo | 11:43 |
fwereade_ | jam, rogpeppe: ...we'll need an address updater per machine agent, then? | 11:43 |
rogpeppe | fwereade_: don't they all share the same address space currently? | 11:43 |
fwereade_ | rogpeppe, sure | 11:44 |
fwereade_ | rogpeppe, but getting one's own eth0 is unlikely to help in determining the address of something completely distinct | 11:44 |
rogpeppe | fwereade_: so it's kinda fit for purpose *currently*... | 11:45 |
jam | rogpeppe: right, william is just remarking that the current implementation will never work to get addresses for another machine, but there are plans to change how we do it | 11:45 |
fwereade_ | rogpeppe, by sheer ridiculous luck, yes, it works in the single situation it's used because it happens to run on the correct machine | 11:46 |
rogpeppe | fwereade_: i won't argue with that :-) | 11:47 |
yolanda | hi, i'm using juju-deployer to deploy a set of charms, but i find this error : error: cannot get latest charm revision: charm not found in "/home/yolanda/development/canonical-ci": local:precise/postgresql - shouldn't be local:postgresql, not local:precise/postgresql ? | 11:48 |
jam | yolanda: official charm locations have the series in them | 11:49 |
jam | local repos have a directory with the series | 11:49 |
jam | so $REPO/precise/postgresql would be the structure that you would do "juju deploy --local --repo $REPO postgresql" | 11:49 |
jam | and juju 'fills in' the default series (aka precise) | 11:49 |
yolanda | jam, i know it, but i'm asking about juju-deployer, it embeds the precise into it | 11:50 |
yolanda | if i deploy locally with local:charm works, but juju-deployer is deploying that as local:precise/charm | 11:50 |
jam | yolanda: you can also "juju deploy precise/postgresql" | 11:50 |
jam | I think | 11:50 |
jam | yolanda: I'm pretty sure that is supposed to work | 11:50 |
yolanda | i don't have control about juju deploy commands using juju-deployer wrapper, it's automated | 11:50 |
yolanda | so my question is about deploying using juju-deployer wrapper, not manually using juju deploy, which works for me | 11:51 |
yolanda | jam ^ | 11:52 |
fwereade_ | yolanda, a charm url is not valid without a series | 11:53 |
jam | yolanda: and I'm saying, that shouldn't be the problem, because both syntaxes are supposed to be valid | 11:53 |
fwereade_ | yolanda, local:postgresql is shorthand for user input only | 11:54 |
fwereade_ | yolanda, there is no such actual charm as local:postgresql | 11:54 |
yolanda | jam, fwereade, but then juju complains if i use local:precise/postgresql, and works if i use local:postgresql | 11:54 |
yolanda | as juju-deployer uses first syntax, it gives me error | 11:54 |
yolanda | if first url is set to be working, what can be stopping to work in my environment? | 11:54 |
fwereade_ | yolanda, so the charm is in $REPO/precise/postgresql, right? | 11:55 |
yolanda | yes | 11:55 |
jam | yolanda: it isn't something like you changed default series and actually have $REPO/saucy/postgresql locally, right? | 11:56 |
yolanda | jam, no, series is set as precise | 11:56 |
yolanda | and i have $REPO/precise/postgresql charm there | 11:56 |
fwereade_ | yolanda, and "juju deploy local:precise/postgresql" does not work, while "juju deploy local:postgresql" does? | 11:57 |
yolanda | fwereade_, sorry, tried manually now with local:postgresql and doesn't work also | 11:57 |
yolanda | but i have the charm in my local repo | 11:58 |
fwereade_ | yolanda, what's the charm name in the metadata? | 11:58 |
yolanda | postgresql | 11:58 |
fwereade_ | yolanda, if you run with --debug, do you see any "failed to load charm at" warnings? | 12:01 |
yolanda | let me try it | 12:01 |
yolanda | mm... 2013-10-03 12:01:48 WARNING juju repo.go:341 charm: failed to load charm at "/home/yolanda/development/canonical-ci/precise/postgresql": YAML error: line 6: found a tab character where an intendation space is expected | 12:02 |
yolanda | that should be a bug in postgres charm? | 12:02 |
jam | yolanda: looks like | 12:02 |
fwereade_ | yolanda, sounds like | 12:02 |
fwereade_ | :) | 12:02 |
yolanda | i can fix it and do an mp | 12:03 |
yolanda | fwereade. should i raise a bug? | 12:04 |
fwereade_ | yolanda, sorry, against what? it looks like it's a charm problem, but equally juju could probably somehow do better on that front | 12:05 |
fwereade_ | yolanda, local repos are pretty baroque | 12:05 |
rogpeppe | fwereade_: BTW rather my don't-poll fix, why don't i just implement Addresses in the local provider to simply call the existing DNSName method | 12:06 |
rogpeppe | ? | 12:06 |
rogpeppe | s/rather/rather than/ | 12:06 |
rogpeppe | fwereade_, mgz, dimitern: can you think of any down sides to the above? | 12:06 |
yolanda | fwereade, well, i was thinking in an MP for postgresql charm, but a bug against juju, to deal with malformed files... what do you think? | 12:06 |
mgz | rogpeppe: I nearly asked in the standup why not just implement it for local | 12:07 |
dimitern | rogpeppe, if it works by live testing, why not | 12:07 |
mgz | the hard case is containers in another provider, I didn't remember any local catches | 12:07 |
rogpeppe | mgz: i don't know why i didn't think of that, tbh | 12:07 |
jam | yolanda: if "juju deploy cs:postgresql" works, I would guess there is something other than a bug in the charm itself | 12:07 |
rogpeppe | right, i'll ditch that CL | 12:07 |
yolanda | jam, i branched lp:charms/postgresql, and config.yaml file has tabs instead of spaces, that's right | 12:08 |
yolanda | maybe it's not the same version as in charmstore? | 12:08 |
jam | yolanda: I see it here, I think: http://bazaar.launchpad.net/~charmers/charms/precise/postgresql/trunk/view/head:/config.yaml | 12:09 |
jam | It looks like someone's editor changed spaces to tabs | 12:09 |
jam | "helpfully" | 12:09 |
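
The failure mode found above (tabs used for indentation in config.yaml, which the YAML parser rejects) is easy to check for before committing. A minimal sketch in Go, assuming nothing about the real charm: `tabIndentedLines` is a hypothetical helper and the sample text is made up, not the postgresql charm's actual config.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// tabIndentedLines returns the 1-based numbers of the lines whose leading
// whitespace contains a tab. YAML does not allow tabs for indentation.
func tabIndentedLines(text string) []int {
	var bad []int
	sc := bufio.NewScanner(strings.NewReader(text))
	for n := 1; sc.Scan(); n++ {
		line := sc.Text()
		// The indent is whatever TrimLeft strips from the front of the line.
		indent := line[:len(line)-len(strings.TrimLeft(line, " \t"))]
		if strings.Contains(indent, "\t") {
			bad = append(bad, n)
		}
	}
	return bad
}

func main() {
	// Made-up sample: the third line is indented with a tab.
	cfg := "options:\n  port:\n\tdefault: 5432\n"
	fmt.Println("tab-indented lines:", tabIndentedLines(cfg))
}
```

This only looks at leading whitespace, so tabs inside a value are left alone; it is a lint-style check, not a YAML parser.
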
yolanda | do you want me to create the mp? | 12:10 |
jam | yolanda: and that change is in the very last commit to lp:charms/postgresql | 12:10 |
yolanda | or are you dealing with that? | 12:10 |
fwereade_ | jam, also looks like whoever committed it didn't try deploying it before doing so | 12:10 |
jam | fwereade_: yep | 12:10 |
fwereade_ | jam, yolanda: but I imagine the charm store just ignored it because it's invalid | 12:10 |
jam | fwereade_: stub merged richard's patch, but looks like he accidentally broke it | 12:10 |
mgz | yolanda: go ahead and fix and propose I'd say | 12:10 |
yolanda | ok | 12:10 |
fwereade_ | jam, ahhh :) | 12:10 |
fwereade_ | yolanda, so an MP against the postgres charm would be great | 12:11 |
yolanda | ok, doing it | 12:11 |
jam | fwereade_: it looks like the last change from Richard was to clean-up the description for one of the fields | 12:11 |
fwereade_ | yolanda, re juju-core, the idea was that local repos would ignore things that aren't valid charms | 12:11 |
jam | and, naturally, that breaks everything | 12:11 |
jam | but was "just a comment fix" | 12:11 |
jam | so it wasn't tested | 12:11 |
fwereade_ | yolanda, we have had trouble in the past in which one broken charm in a repo prevents anything being deployed from that repo | 12:12 |
fwereade_ | jam, heh | 12:12 |
yolanda | fwereade, ideally if these changes can't go into cs, it will be ok, the problem will be only if using some launchpad branch | 12:13 |
fwereade_ | yolanda, so I can't see a clear way forward that fixes your surprise without breaking things much worse | 12:13 |
fwereade_ | yolanda, yeah | 12:14 |
jam | fwereade_: reporting the warning by default would help :) | 12:14 |
jam | (there was something that looks like what you requested, but it isn't actually valid) | 12:14 |
yolanda | fwereade, jam: https://code.launchpad.net/~yolanda.robla/charms/precise/postgresql/fix_tabs/+merge/189057 | 12:17 |
jam | yolanda: lgtm, but Stub is the maintainer of that charm | 12:17 |
fwereade_ | jam, there wasn't really anything that actually looked like you wanted though | 12:18 |
fwereade_ | jam, directory name is irrelevant | 12:18 |
jam | I hopefully poked him in another window | 12:18 |
fwereade_ | jam, hmm, ok, maybe it did, I guess we read the metadata without difficulty | 12:18 |
yolanda | jam, cool, i'll update manually in the meantime | 12:19 |
fwereade_ | jam, actually showing that output by default would have been helpful though | 12:19 |
jam | yolanda: stub says he's landing it now | 12:19 |
yolanda | but yes, instead of reporting as "charm does not exist", maybe juju deploy could show some error about invalid charm or something like that | 12:19 |
fwereade_ | yolanda, well, juju was 100% accurate, there was no such charm in the repo | 12:21 |
fwereade_ | yolanda, I think that the error is correct, and that just warning about broken charms is the Right Thing to do | 12:22 |
fwereade_ | yolanda, it feels like the worst bit is that the warning got swallowed by default | 12:22 |
dimitern | jam, 2013-10-03 12:22:39 ERROR juju supercommand.go:282 disabling ssh-hostname-verification is not supported | 12:22 |
dimitern | jam, how can I test it if I cannot disable it? | 12:23 |
jam | dimitern: so it should work for openstack, which is the one we care about | 12:23 |
yolanda | fwereade, if you enable debugging you can see it, but if not you aren't aware of the error | 12:23 |
jam | dimitern: or you could just comment out that config validation failure | 12:23 |
dimitern | jam, oh, I need to dust out my canonistack permissions | 12:23 |
jam | if you really want to test on EC2 | 12:23 |
jam | dimitern: or hp | 12:23 |
jam | dimitern: fwiw I tested the previous steps by mv /usr/share/ca-certificates and then running commands | 12:24 |
jam | otherwise the cert is still valid so it wouldn't fail whether you had that flag or not | 12:24 |
fwereade_ | yolanda, yeah, hiding important messages STM like a juju-core bug, please go ahead | 12:25 |
rogpeppe | is there any way to get sudo to preserve your existing $PATH ? | 12:25 |
fwereade_ | yolanda, thanks | 12:25 |
fwereade_ | -E | 12:25 |
jam | rogpeppe: sudo -E ? | 12:25 |
rogpeppe | jam: doesn't preserve $PATH, it seems | 12:25 |
dimitern | jam, /usr/share/ca-certificates where? | 12:25 |
jam | rogpeppe: "man sudo" says "Environment PATH may be overridden by the security policy" | 12:26 |
fwereade_ | rogpeppe, works for me, anyway | 12:27 |
jam | rogpeppe: so you could probably remove the Paths line from config if you wanted to avoid that | 12:27 |
jam | but you can't guarantee it | 12:27 |
jam | for $ARBITRARY_USER | 12:27 |
dimitern | jam, on your machine or ? | 12:27 |
jam | dimitern: so for "juju bootstrap" I did it on my machine, and then ssh'd into the started machine and did it there to test the line in cloud-init | 12:27 |
rogpeppe | jam: i'm just trying to work out a decent way of using the local provider when you're not using /usr/bin/juju | 12:28 |
dimitern | jam, ok, so I'll do it on machine 0 once it starts and restart the agent | 12:28 |
jam | dimitern: but honestly, if you are using utils.NonValidating that is *known* to work properly with non-validating certs. | 12:28 |
jam | rogpeppe: sure, thumper said earlier "sudo `$(which juju)` bootstrap" | 12:28 |
dimitern | jam, it's actually utils.GetNonValidatingClient() | 12:29 |
jam | dimitern: sure | 12:29 |
rogpeppe | jam: that doesn't work | 12:29 |
dimitern | ok | 12:29 |
rogpeppe | jam: i'm doing this currently: x=$PATH sudo -E sh -c 'export PATH=$x; juju bootstrap --debug' | 12:29 |
jam | dimitern: if you're using the one that the other stuff uses, I've tested that pretty well. You *can* write a test case for it | 12:29 |
rogpeppe | jam: which isn't ideal | 12:29 |
jam | using httptest.Server | 12:29 |
jam | dimitern: I don't have a test for cloud-init specifically, because we don't have any 3rd-party clouds that we have creds to that don't use valid certs | 12:30 |
jam | dimitern: but I *do* have a bunch of openstack localHTTPSServer tests | 12:30 |
jam | for the actual Provider interaction | 12:30 |
dimitern | jam, well, it seems to work ok | 12:35 |
rogpeppe | dimitern, jam, mgz: alternative fix to the logging spam problem: https://codereview.appspot.com/14339043 | 12:36 |
rogpeppe | fwereade_: ^ | 12:36 |
dimitern | rogpeppe, no tests? | 12:38 |
rogpeppe | dimitern: there are no tests for DNSName, (this is a thin wrapper around that), and I don't want to block this on adding appropriate testing to that | 12:38 |
dimitern | fwereade_, jam, https://codereview.appspot.com/14340043 - fix for bug 1234576 (one of the best ids so far!) | 12:38 |
_mup_ | Bug #1234576: Upgrader needs to support ssl-hostname-verification: false <juju-core:In Progress by dimitern> <https://launchpad.net/bugs/1234576> | 12:38 |
rogpeppe | dimitern: i raised a bug | 12:38 |
dimitern | rogpeppe, ok, please live test it at least | 12:39 |
rogpeppe | dimitern: i have | 12:39 |
rogpeppe | dimitern: at least, i've verified that we don't get the log spam - there's no externally visible way currently to see the output of Addresses. | 12:39 |
dimitern | rogpeppe, reviewed | 12:40 |
dimitern | rogpeppe, why? | 12:40 |
rogpeppe | dimitern: because nothing uses it yet | 12:40 |
dimitern | rogpeppe, can't you log what addresses you get and compare them? | 12:41 |
dimitern | rogpeppe, not as part of the code, just for testing | 12:41 |
rogpeppe | dimitern: we could write a test that does that, yes | 12:41 |
rogpeppe | dimitern: and the addressupdater code tests that | 12:41 |
rogpeppe | dimitern: but you can't see what Addresses are attached to a machine by looking at juju status, for example | 12:42 |
dimitern | rogpeppe, I meant simply adding a log.Errorf("Addresses returns: %v", addresses) and bootstrapping a local environment | 12:42 |
dimitern | rogpeppe, and use lxc-ls or something to get the container addresses? | 12:42 |
mgz | dimitern: we run into the precise problem with that | 12:44 |
mgz | lxc-ls is useless on precise | 12:45 |
rogpeppe | dimitern: we're talking about two statements here. i believe that they work, and it doesn't actually matter if they don't. we need more testing in this area, but i don't think it matters at this moment. | 12:45 |
dimitern | rogpeppe, are you running precise? | 12:45 |
rogpeppe | dimitern: no | 12:45 |
dimitern | rogpeppe, ok then | 12:46 |
mgz | ah, you didn't mean doing that in the code? | 12:46 |
mgz | read through the conversation a bit too fast :) | 12:46 |
dimitern | I simply meant as a local live test | 12:46 |
dimitern | but whatever | 12:46 |
dimitern | seeing is better than believing alone | 12:47 |
yolanda | fwereade, jam: https://bugs.launchpad.net/juju-core/+bug/1234687 | 12:47 |
_mup_ | Bug #1234687: juju is hiding bugs in charms <juju-core:New> <https://launchpad.net/bugs/1234687> | 12:47 |
natefinch | mgz, rogpeppe, fwereade_, jam: anyone want to finish up the review Dave started? Just docs, but I'd like them in asap: https://codereview.appspot.com/14207048/ | 12:47 |
mgz | natefinch: if no one beats me to it, I'll look after doing various code cleanup things on my branch | 12:52 |
natefinch | mgz: thanks | 12:53 |
rogpeppe | natefinch: any particular reason you added the extra newline before the Doc text in addmachine.go ? | 12:53 |
rogpeppe | natefinch: ha, it looks like it's stripped anyway | 12:54 |
natefinch | rogpeppe: it makes the text in-code more clear not to have it indented due to the variable assignment, and they produce the same output anyway... I'm trying to keep all the doc formatting the same | 12:54 |
natefinch | rogpeppe: yep | 12:54 |
fwereade_ | rogpeppe, I am confused by https://codereview.appspot.com/14339043/ | 12:56 |
rogpeppe | fwereade_: go on | 12:57 |
fwereade_ | rogpeppe, didn't we agree that local.Instance.DNSName is no good unless it's run on the relevant instance? | 12:57 |
jam | dimitern: so one test we *could* add is to set up the local dummy service with an HTTPS Server and assert that the Upgrader is able to find the tools, do you think that is worthwhile? | 12:57 |
rogpeppe | fwereade_: well, Addresses is in exactly the same boat | 12:57 |
fwereade_ | rogpeppe, I would *much* rather have the notimplemented hack than poke bad data into state | 12:57 |
rogpeppe | fwereade_: and it's never called for real | 12:58 |
rogpeppe | fwereade_: the bad data is already in state | 12:58 |
fwereade_ | rogpeppe, how did it get there? | 12:58 |
rogpeppe | fwereade_: by calling Instance.DNSName, no? | 12:58 |
dimitern | jam, seems extreme | 12:58 |
fwereade_ | rogpeppe, when did that go into state? | 12:58 |
jam | dimitern: well it is the only thing that we actually care about | 12:59 |
jam | you don't ever test that the boolean we return is actually acted upon | 12:59 |
fwereade_ | rogpeppe, the only addresses we have hitherto stored (that I am aware of) have come from code running on the instances in question, as part of uniter setup | 12:59 |
dimitern | jam, I have no idea how to do that | 12:59 |
rogpeppe | fwereade_: good point, but... how is Addresses any different from DNSName ? | 13:00 |
rogpeppe | fwereade_: the result of Addresses isn't going into the state either | 13:00 |
rogpeppe | fwereade_: .... is it? | 13:00 |
fwereade_ | rogpeppe, huh? isn't that precisely what addressupdater does? | 13:00 |
rogpeppe | fwereade_: ha, i see | 13:00 |
jam | dimitern: so in the interests of getting 1.15.1 out the door, I think we should probably just land it, have you implement the next one, land it, and then come back to fill in the tests | 13:01 |
rogpeppe | fwereade_: erm, aren't things using the result of instance.DNSName currently to decide where to connect to? | 13:01 |
jam | dimitern: but http://bazaar.launchpad.net/~go-bot/juju-core/trunk/view/head:/provider/openstack/local_test.go#L705 is what I set up for the Openstack HTTPS tests | 13:01 |
fwereade_ | rogpeppe, yeah, but by sheer luck the only things that do are running somewhere where Instance.DNSName happens to be correct | 13:01 |
rogpeppe | fwereade_: i don't really mind storing the DNSName results in state - it's not as if they're permanent | 13:02 |
fwereade_ | rogpeppe, they are *wrong*, and they will fuck everything up | 13:02 |
rogpeppe | fwereade_: any time we upgrade to make a better implementation, the addresses stored in state will change appropriately | 13:02 |
fwereade_ | rogpeppe, asking a unit for its addresses asks the machine first | 13:02 |
fwereade_ | rogpeppe, if you put bad data into machines, units start reporting the wrong addresses | 13:03 |
dimitern | jam, sgtm, will have a look after landing these two | 13:03 |
rogpeppe | fwereade_: hmm, so if there's no machine address, the uniter uses an EnvironProvider method to find the address of itself? | 13:04 |
fwereade_ | rogpeppe, yeah | 13:04 |
fwereade_ | rogpeppe, well, the unit always does that | 13:04 |
fwereade_ | rogpeppe, but machine addresses are the canonical location | 13:04 |
fwereade_ | rogpeppe, we just can't yet drop the unit address lookup | 13:05 |
fwereade_ | rogpeppe, precisely because we can't get good addresses for the machines in all cases | 13:05 |
rogpeppe | fwereade_: does that mean the address updater cannot work in the local provider? | 13:05 |
fwereade_ | rogpeppe, I thought you already knew that, and that was the reasoning behind the notimplemented thing :) | 13:05 |
fwereade_ | rogpeppe, until we can use a new lxc-ls everywhere, yes | 13:06 |
rogpeppe | fwereade_: ah, i see - i hadn't realised that was the blocker | 13:06 |
mgz | gah, right we do need a working lxc-ls for the local provider | 13:07 |
rogpeppe | fwereade_: in which case, there's https://codereview.appspot.com/14337043/ instead | 13:07 |
fwereade_ | rogpeppe, that LGTM if it's really hard to abort the loop vs polling once per year | 13:09 |
rogpeppe | fwereade_: it would have to duplicate a lot of the loop's logic (reading on Dying, sending on died, reading on changed, exiting when dead); i don't really see the advantage of doing it. | 13:11 |
rogpeppe | fwereade_: unless, as i said above, we added some method to EnvironProvider, but that has tentacles and seems way overkill for killing a few log messages. | 13:11 |
jam | dimitern: step one LGTM | 13:12 |
fwereade_ | rogpeppe, were it not for the tentacles, that would be my favoured solution | 13:12 |
fwereade_ | rogpeppe, but I'm looking for mr right now, not mr right ;p | 13:13 |
jam | fwereade_: but the tentacles are the tasty part :) | 13:13 |
fwereade_ | jam, I have a sudden recollection of laura over summer... playing peter pan with cuddly toys... "and cthooley can be tinkerbell" | 13:15 |
fwereade_ | (my stepmother had a cuddly cthulhu) | 13:15 |
mgz | that's pretty ace | 13:16 |
fwereade_ | I thought so :) | 13:16 |
dimitern | jam, thanks | 13:19 |
rogpeppe | fwereade_: i had a sudden thought that if it found an unimplemented error, it could kill the whole worker (ensuring it doesn't restart), but it's not that straightforward to do, sadly. | 13:22 |
fwereade_ | rogpeppe, ah, not to worry | 13:23 |
fwereade_ | rogpeppe, maybe a short comment explaining why it's necessary would be a good idea though | 13:23 |
rogpeppe | fwereade_: i'm adding one | 13:23 |
fwereade_ | rogpeppe, cheers | 13:24 |
rogpeppe | it wouldn't be *too* hard (just make a special error that is recognised by worker.Runner that says "i really want to quit without taking anything else down or being restarted"), but not for now. | 13:24 |
fwereade_ | rogpeppe, agreed | 13:25 |
dimitern | jam, added bug 1234715 for that | 13:26 |
_mup_ | Bug #1234715: Verify SSLHostnameVerification: false behavior with a test (upgrader, uniter) <tech-debt> <juju-core:Triaged> <https://launchpad.net/bugs/1234715> | 13:26 |
rogpeppe | dimitern, fwereade_: landing | 13:32 |
dimitern | rogpeppe, the "no log spam" thing? | 13:33 |
rogpeppe | dimitern: yeah | 13:33 |
dimitern | rogpeppe, sweet | 13:33 |
rogpeppe | dimitern: i changed unimplementedError to notImplementedError | 13:33 |
rogpeppe | dimitern: and added a comment about the 1y thing | 13:34 |
dimitern | rogpeppe, great, thanks | 13:34 |
dimitern | jam, fwereade_ this is the fix for bug 1234577 https://codereview.appspot.com/14337044 | 13:34 |
_mup_ | Bug #1234577: Uniter needs to support ssl-hostname-verification: false <juju-core:In Progress by dimitern> <https://launchpad.net/bugs/1234577> | 13:34 |
rogpeppe | fwereade_: i don't know if you saw last night, but i succeeded in bootstrapping a live ec2 environment with an environments.yaml entry that was just "myenv": {"type": "ec2"} | 13:35 |
rogpeppe | fwereade_: which is quite cool | 13:35 |
dimitern | rogpeppe, you have your AWS_* env vars set then | 13:36 |
rogpeppe | dimitern: yeah, it needed them of course | 13:36 |
fwereade_ | rogpeppe, nice, I did that too, it was very satisfying :) | 13:36 |
jam | dimitern: reviewed LGTM | 14:00 |
mattyw | fwereade_, I'm hoping to point you in the direction of a merge proposal later for the id stuff (just the api and owner-get tool) would I be able to grab you today to talk about the next stage (user creation and deletion) | 14:04 |
mattyw | (I appreciate you're busy at the moment) | 14:05 |
fwereade_ | mattyw, I have 20 mins until my next meeting and had not had much expectation of accomplishing other things in the interim, so that's perfect :) | 14:10 |
fwereade_ | mattyw, otherwise sometime later should also be fine | 14:10 |
mattyw | fwereade_, if now's good I'm ready to listen :) | 14:11 |
fwereade_ | mattyw, I think there's an intermediate step | 14:12 |
fwereade_ | mattyw, in which we explicitly set the admin user on the services she creates | 14:12 |
mgz | rogpeppe: don't know if you want to re-stamp the default vpc branch before I land, have made the changes you suggested | 14:13 |
fwereade_ | mattyw, taking care to keep the services which don't have that field still working | 14:13 |
fwereade_ | mattyw, at the state level, that involves adding a param to AddService | 14:14 |
fwereade_ | mattyw, and at the apiserver level it involves extracting the entity tag from the connection and passing it into AddService | 14:14 |
fwereade_ | mattyw, shouldn't otherwise hit the API at all I think | 14:15 |
dimitern | jam, cheers | 14:15 |
mattyw | fwereade_, so that would mean that any service that gets deployed would have the admin-user set as the owner of that service? | 14:15 |
mattyw | and no change to the command line args | 14:15 |
fwereade_ | mattyw, yeah, that'd be the effect | 14:15 |
mattyw | fwereade_, and again - we'd just hardcode the user - so we'd add a parameter to AddService - but we'd also pass a hardcoded "user-admin" to it? | 14:17 |
fwereade_ | mattyw, the api server knows who's connected to it | 14:20 |
fwereade_ | mattyw, you can get the tag from ... uh, somewhere | 14:20 |
fwereade_ | mattyw, so while there's only one user still it *will* always be user-admin | 14:21 |
fwereade_ | mattyw, but it sets us up to do the right thing transparently when there are more users | 14:21 |
fwereade_ | mattyw, adding users is easy, and deleting them is a bit more interesting, but I think that's something we can ignore for a little bit longer | 14:22 |
fwereade_ | mattyw, sorry, popping out for a quick cig before 4:30 | 14:23 |
fwereade_ | mgz, rogpeppe, natefinch, dimitern: are your https://launchpad.net/juju-core/+milestone/1.15.1 bugs up to date? | 14:30 |
fwereade_ | and would someone please take a look at axw's https://codereview.appspot.com/14329043/ and, if it checks out, land it for him please? (unless he's actively here?) | 14:31 |
mgz | fwereade_: yes, but I'm about to bot-propose mine now, so the bot will change status shortly | 14:32 |
mgz | can also pick up that cl if needed | 14:33 |
dimitern | fwereade_, yeah, except I'm not doing the upgrade one for now, until I land the other 2 fixes | 14:34 |
dimitern | and the bot is not being helpful | 14:34 |
mgz | dimitern: what's up with the bot? | 14:36 |
mgz | I'm just about to poke | 14:36 |
dimitern | mgz, it seems the random test failures have increased | 14:37 |
dimitern | mgz, and the variety of the packages that fails as well | 14:37 |
rogpeppe | hmm, the error you now get when destroying an already-destroyed environment is not great: http://paste.ubuntu.com/6188368/ | 14:41 |
natefinch | wow, that's bad | 14:42 |
fwereade_ | mramm, call is full | 14:44 |
mramm | :( | 14:44 |
mramm | sinzui: hey, is this in progress bug completed? https://bugs.launchpad.net/juju-core/+bug/1234456 | 15:09 |
_mup_ | Bug #1234456: release-public-tools.bash must be hacked to work with debs <juju-core:In Progress by sinzui> <https://launchpad.net/bugs/1234456> | 15:09 |
dimitern | mgz, did you update bot's goamz? | 15:12 |
mgz | dimitern: I did | 15:12 |
mgz | was the only poke I made in that case | 15:13 |
dimitern | ok | 15:13 |
mgz | but I can try and help you with other things | 15:13 |
sinzui | mramm, I need to land it | 15:13 |
sinzui | oh, and it didn't land | 15:13 |
mramm | ok | 15:13 |
mgz | so, 1.15.1 is going to be from r194X, right? when the last bit we want gets landed today? | 15:14 |
mgz | sinzui: can you link me the release notes doc so I can add a note? | 15:15 |
sinzui | mgz, https://docs.google.com/a/canonical.com/document/d/1o8YsLrQuadB1gbd5veJ3cpN_r2uozKwwTmIh1RmOVHM/edit#heading=h.h7wry0fbg197 | 15:16 |
mgz | sinzui: ta! | 15:17 |
sinzui | mgz, can you give me a clue to resolve this error: http://pastebin.ubuntu.com/6188547/ | 15:25 |
mgz | sinzui: to land things on juju core, we just mark the mp approved and the bot looks for that | 15:26 |
sinzui | mgz, I think it is https://code.launchpad.net/~sinzui/juju-core/release-public-tools-with-streams/+merge/188966 | 15:26 |
mgz | aaron's lp:rvsubmit is a pretty bzr plugin to make it easy | 15:27 |
mgz | sinzui: +add a commit message | 15:27 |
sinzui | doh!, I get hate mail from the lander we created for charmworld. I assumed it did the same | 15:27 |
mgz | yes, it's very confusing needing to use `lbox propose` but needing to NOT use `lbox submit` | 15:28 |
dimitern | well, the first 2 times maybe, but then you just forget about lbox submit | 15:28 |
dimitern | fwereade_, ping | 15:30 |
mgz | dimitern: till you then need to land something on goamz or the like, and rediscover it :) | 15:31 |
dimitern | mgz, yeah, although we can bring goamz under the bot as well :P | 15:31 |
fwereade_ | dimitern, pong | 15:32 |
dimitern | fwereade_, re bug 1233451 | 15:33 |
_mup_ | Bug #1233451: juju upgrade-juju results in unsupported behavior <juju-core:In Progress by dimitern> <https://launchpad.net/bugs/1233451> | 15:33 |
dimitern | fwereade_, is this for today as well? | 15:33 |
dimitern | fwereade_, (i.e. 1.15.1) | 15:33 |
fwereade_ | dimitern, I had not thought it was, I was a little surprised to see it there | 15:36 |
fwereade_ | dimitern, it's only necessary if we have to do 1.20 before 2.0 | 15:37 |
dimitern | fwereade_, yeah, because I definitely won't manage for today with it | 15:37 |
fwereade_ | dimitern, and certainly shouldn't block today, indeed | 15:37 |
fwereade_ | dimitern, I guess take it off the milestone | 15:37 |
dimitern | fwereade_, done | 15:38 |
rogpeppe | these kinds of sporadic failures are worrying; i have no idea what's going on here: https://code.launchpad.net/~rogpeppe/juju-core/436-addressupdater-log/+merge/189012/comments/432808/+download | 15:43 |
fwereade_ | rogpeppe, the bright side I suppose is that the suite eventually recovers from the dirty socket problems | 15:50 |
* fwereade_ has to go out for a while -- would mgz, rogpeppe, dimitern, natefinch look after the remaining 1.15.1 bugs please? | 16:04 | |
rogpeppe | fwereade_: yeah, i'm still live verifying stuff | 16:05 |
natefinch | fwereade_: sure thing | 16:05 |
fwereade_ | thanks guys | 16:06 |
natefinch | rogpeppe or mgz: if one of you can check out my docs changes, I can land them to get that one off the list: https://codereview.appspot.com/14207048/ | 16:06 |
mgz | natefinch: right, really doing that now | 16:06 |
rogpeppe | natefinch: sorry, i got diverted while looking at them | 16:06 |
dimitern | sinzui, https://bugs.launchpad.net/juju-core/+bug/1234456 fix has landed, right? | 16:10 |
_mup_ | Bug #1234456: release-public-tools.bash must be hacked to work with debs <juju-core:In Progress by sinzui> <https://launchpad.net/bugs/1234456> | 16:10 |
sinzui | dimitern, yes | 16:10 |
dimitern | sinzui, ok, marked as Fix Committed | 16:10 |
sinzui | oh, sorry, I thought gobot could do that | 16:10 |
dimitern | lbox does weird things sometimes, like changing milestones around | 16:10 |
dimitern | sinzui, it used to but it stopped a while ago I think | 16:11 |
mgz | natefinch: see review | 16:13 |
natefinch | mgz: initial newlines are stripped automatically already... it just makes the code cleaner when you have it all smashed up against the left side of the window. And I ordered the providers alphabetically on Dave's suggestion... it seems more fair and an easier rule to follow. | 16:15 |
mgz | fair enough | 16:16 |
natefinch | mgz: I'll make those removals though, thanks. | 16:16 |
dimitern | approving axw's fix for bug 1234127 | 16:17 |
_mup_ | Bug #1234127: juju should enable cloud-tools pocket for Precise <juju-core:In Progress by axwalk> <https://launchpad.net/bugs/1234127> | 16:17 |
dimitern | that's the last one for 1.15.1 after nate's | 16:17 |
rogpeppe | mgz: reviewed | 16:19 |
mgz | rogpeppe: thank you. natefinch ^ | 16:20 |
rogpeppe | mgz: ta | 16:20 |
rogpeppe | natefinch: thanks for doing that - it's really good to make docs more consistent | 16:20 |
natefinch | rogpeppe: np. Happy to put my little bit of OCD to good use :) | 16:20 |
natefinch | rogpeppe: thanks for the review | 16:21 |
natefinch | so glad sublime text has alt-q to wrap the current paragraph to the column ruler.... it even preserves indents and line prefixes so you can use it on stuff that's commented out | 16:24 |
hazmat | arosales, so what's been missing in the azure provider affinity group discussion is what changed from when it worked well for other regions in 1.14.1 | 16:31 |
arosales | hazmat, the defaults were East | 16:33 |
arosales | and the tools were in east | 16:33 |
arosales | so not affinity issues | 16:33 |
arosales | issue is when you try to deploy in West | 16:33 |
hazmat | arosales, hmm.. ic, thanks | 16:34 |
rogpeppe | fwereade_: i'm wondering about the behaviour of juju when you try to use (not bootstrap or sync-tools) an environment that has no associated info. at the moment you'll get a config error if the environments.yaml file omits some prepare-time attributes, such as admin-secret. | 16:38 |
rogpeppe | fwereade_: i'm thinking that if there's no associated info and creating a new Environ directly from environments.yaml fails, we should just return an "environment does not exist" error. | 16:39 |
rogpeppe | fwereade_: or perhaps "environment has not been created" | 16:40 |
rogpeppe | fwereade_: and that when everyone has .jenv files, we can just fail when there's no associated info, with the same error | 16:40 |
arosales | hazmat, I don't know if it is a juju thing that is setting the affinity group or Azure | 16:43 |
arosales | hazmat, I found this http://michaelwasham.com/2012/08/07/http-error-message-the-location-or-affinity-group-east-us-specified-for-source-image/ | 16:44 |
arosales | but juju is only reading the public tools which I wouldn't think would cause an affinity group set, but I don't know. That is the main question | 16:44 |
arosales | the juju tool read does happen first, | 16:44 |
hazmat | arosales, interesting that link does suggest a solution | 16:45 |
arosales | hazmat, basically upload tools to the same region you deploy in, or are you reading something different there? | 16:45 |
hazmat | although exactly where the issue is is not entirely clear either without further investigation | 16:45 |
arosales | hazmat, it also may be that my juju control bucket for storage is in east when I try to deploy in west . . . | 16:46 |
hazmat | arosales, i'll try digging deeper over the weekend if no one else has gotten to it. i did have azure working last week. | 16:47 |
hazmat | arosales, the azure provider does a very good job of cleaning up after itself | 16:47 |
hazmat | arosales, at the cost of being synchronous, including its tool bucket. | 16:47 |
arosales | hazmat, it does but it takes some time. | 16:47 |
arosales | hazmat, there is a new api for delete that we should work in some time. | 16:48 |
hazmat | arosales, its hard to avoid given the api.. to kill the vnet, it has to wait on instance kills, etc. although ideally its doing that in parallel with a wait for the group (for instance stop). | 16:48 |
arosales | hazmat, yup and I think msft tried to clean that code up. | 16:49 |
hazmat | arosales, did i read that right? msft cleaned up juju code? will wonders never cease. | 16:53 |
natefinch | rogpeppe, mgz, fwereade_, dimitern: just landed my docs branch (well, set to approved, waiting for the bot). Gotta run out for a doctor's appointment, but I'll be back later in the day. | 16:54 |
mgz | I'm going to be around but only with one eye here for the evening | 16:57 |
sinzui | rogpeppe, I see a GhostRevisionHasNoRevno error in gnuflag. I think you have the power to fix it: http://pastebin.ubuntu.com/6188999/ | 17:23 |
rogpeppe | sinzui: interesting | 17:24 |
* rogpeppe doesn't know about ghost revisions | 17:24 | |
sinzui | yeah, I haven't seen a ghost issue in years | 17:24 |
sinzui | I don't know the details, I think abentley, jam, and mgz do though | 17:25 |
rogpeppe | sinzui: | 17:26 |
rogpeppe | bzr fetch-ghosts | 17:26 |
rogpeppe | bzr: ERROR: unknown command "fetch-ghosts" | 17:26 |
abentley | rogpeppe: it's from bzrtools. | 17:26 |
* sinzui is updating the make tarball script to honour dependencies.tsv | 17:27 | |
rogpeppe | abentley: thanks | 17:28 |
rogpeppe | sinzui: how can i tell if i've fixed the issue? | 17:28 |
sinzui | rogpeppe, this is how I found the issue | 17:30 |
sinzui | bzr checkout --lightweight -r 12 lp:gnuflag gnuflag | 17:30 |
rogpeppe | sinzui: i just tried that actually - it still failed | 17:30 |
sinzui | :/ | 17:30 |
rogpeppe | sinzui: i'll try the second suggestion | 17:30 |
rogpeppe | sinzui, abentley: nope. nothing doing. i've tried all the combinations i can think of | 17:38 |
rogpeppe | sinzui: the bzr push always says "No new revisions or tags to push" | 17:38 |
rogpeppe | sinzui: which i presume means it's not making any changes | 17:39 |
sinzui | hmm. this is tricky | 17:39 |
rogpeppe | sinzui: should fetch ghosts, then make a blank commit and try again? | 17:39 |
rogpeppe | s/should/should i/ | 17:39 |
sinzui | push --overwrite says there is nothing to do? | 17:39 |
rogpeppe | sinzui: yes | 17:39 |
sinzui | I don't think fetch will help | 17:39 |
sinzui | I can change the release tarball script to not use lightweight checkouts. I think that will unblock me | 17:40 |
rogpeppe | sinzui: if i knew what the issue was, i could probably do something about it | 17:40 |
sinzui | abentley, ^ do you think the project's repo is corrupt? | 17:40 |
rogpeppe | sinzui: i could erase most of the history of the branch, i suppose, by rewinding it and applying a patch, and push --overwriting. | 17:41 |
sinzui | rogpeppe, me too. | 17:41 |
sinzui | me looks at branch again | 17:41 |
rogpeppe | sinzui: i don't really want to lose the repo history though - it's quite useful sometimes | 17:42 |
sinzui | rogpeppe, I agree. | 17:42 |
sinzui | rogpeppe, I see from https://code.launchpad.net/~gnuflag/gnuflag/trunk that the branch we would normally use as the base of the stack is actually stacked on the original branch | 17:43 |
sinzui | ^ abentley could this issue be a manifestation of stacking, and will `bzr reconfigure --unstacked` address the issue | 17:44 |
* sinzui pulls the branch and tries | 17:44 | |
rogpeppe | sinzui: apologies for my ignorance. what's a stacked branch? | 17:44 |
sinzui | rogpeppe, lp:~gophers/gnuflag/trunk | 17:45 |
rogpeppe | sinzui: that's the trunk branch, yes? | 17:45 |
rogpeppe | sinzui: what does it mean for it to be "stacked"? | 17:45 |
abentley | sinzui: No, the problem is that bzr doesn't normally push ghost revisions. That's why you need to use the fetch-ghosts command, not "pull" or "push". | 17:45 |
sinzui | rogpeppe, stacking is a disappointing space optimisation used by Lp. the true trunk branch (yours) does not have all its revisions, so bzr will pull the remaining revs from ~gophers. Shared repos are better...but Lp doesn't know about them | 17:46 |
rogpeppe | sinzui: so the true trunk branch isn't lp:~gophers/gnuflag/trunk ? | 17:47 |
sinzui | actually it is... | 17:48 |
sinzui | rogpeppe, the project thinks ~gnuflag/gnuflag/trunk is the current focus of development. It will stack all new branches on that, so that all new branches will be based on true trunk. But true trunk doesn't have all its revisions, most are in lp:~gophers/gnuflag/trunk. This probably happened when the branches were switched | 17:50 |
sinzui | This is one of many disappointing "features" of stacking | 17:51 |
rogpeppe | sinzui: ah, i hadn't twigged to ~gnuflag vs ~gophers distinction | 17:51 |
rogpeppe | s/ to / the / | 17:51 |
rogpeppe | sinzui: well, if you work out a way of fixing it, let me know what to type and i'll type it :-) | 17:52 |
sinzui | okay. | 17:52 |
* rogpeppe has definitely reached eod | 18:00 | |
rogpeppe | g'night all | 18:00 |
abentley | sinzui: If you manually stack ~gnuflag/gnuflag/trunk on ~gophers/gnuflag/trunk, that should fix it. And if it does, I would expect it would stay fixed after reconfigure --unstacked. | 18:29 |
sinzui | abentley, I will try that | 18:30 |
abentley | sinzui: Stacking uses the database ids of branches, so renaming branches should not break stacking. | 18:30 |
sinzui | I just realised that -r revno is broken in this case, but -r revid works | 18:32 |
abentley | sinzui: I have scripted out basically the whole process of setting up for lxc. Now setting it up on juju-gui-qa. I love watching when scripts Just Work. | 19:10 |
sinzui | me too abentley | 19:12 |
thumper | morning | 19:47 |
thumper | morning natefinch | 19:48 |
natefinch | thumper: morning | 19:48 |
thumper | I was just looking at the 1.15.1 milestone | 19:48 |
thumper | and the only non-fix committed bug is the azure help | 19:48 |
thumper | how's that going? | 19:48 |
natefinch | thumper: it's committed... the bot just seems to not want to land it | 19:48 |
thumper | ?! | 19:49 |
natefinch | thumper: https://code.launchpad.net/~natefinch/juju-core/018-azure-help/+merge/188936 | 19:49 |
natefinch | *shrug* | 19:49 |
* thumper looks | 19:49 | |
thumper | natefinch: you need to set the commit message :) | 19:49 |
natefinch | sonofa | 19:49 |
thumper | heh | 19:49 |
natefinch | why can't it just take the description? :/ | 19:49 |
thumper | there is a long and sordid history | 19:50 |
thumper | my answer to that is this: | 19:50 |
thumper | the description is what you write to help the reviewer, explaining the changes, what why and how | 19:50 |
thumper | the commit message is what the change is | 19:50 |
natefinch | Fair enough | 19:50 |
thumper | they have different audiences | 19:50 |
thumper | however, most people in the team just copy the description | 19:51 |
thumper | I often edit it | 19:51 |
natefinch | do I have to poke it in and out of approved, or will the bot notice the change to the commit message? | 19:51 |
natefinch | The commit message thing would bug me a lot less if the bot would just email me that I forgot the commit message, instead of silently failing to do anything | 19:52 |
thumper | the bot will notice | 19:56 |
* natefinch thinks that line sounds ominous | 19:59 | |
natefinch | I have the root-disk constraint working on EC2, just need to get it to actually report back the correct disk size (currently it just hard codes the 8G default) | 20:01 |
thumper | hmm.. | 20:01 |
thumper | the bot failed | 20:01 |
thumper | but because it couldn't create a directory | 20:01 |
* thumper kicks it into approved again | 20:02 | |
natefinch | I can retry... though not being able to create a directory doesn't sound like something that's likely to work a second time either | 20:02 |
natefinch | yeah, cool | 20:02 |
thumper | it could be a random clash | 20:03 |
thumper | if not, someone needs to log into the machine and clear out the /tmp dir | 20:03 |
thumper | from all the gocheck temp dirs | 20:03 |
thumper | I have a feeling that some are left behind if there are issues with teardown | 20:03 |
natefinch | ug | 20:03 |
* thumper nods | 20:03 | |
thumper | yeah | 20:03 |
thumper | I have about 10 in my /tmp dir | 20:04 |
natefinch | I gotta run my daughter to a doctor's appointment, but I'll be back on after for a bit. | 20:04 |
thumper | ack | 20:04 |
=== natefinch is now known as natefinch-afk | ||
thumper | fwereade_, mgz: anyone know where the landing bot instructions are? | 20:22 |
thumper | need to clean out the tmp dir | 20:22 |
thumper | bot is failing | 20:22 |
=== _mup__ is now known as _mup_ | ||
thumper | fwereade_: if you are around, I'd love to chat and rant | 21:39 |
natefinch-afk | thumper: how goes? | 23:06 |
=== natefinch-afk is now known as natefinch | ||
=== natefinch is now known as natefinch-bedtim | ||
=== natefinch-bedtim is now known as natefinch-bed |
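
fwereade_'s objection at 13:02 to 13:05 (asking a unit for its addresses asks the machine first, so bad machine data makes units report wrong addresses) can be illustrated with a minimal sketch. The `machine` and `unit` types, their fields, and the `address` method are hypothetical stand-ins, not juju-core's state API; the example addresses are invented.

```go
package main

import "fmt"

// machine is a hypothetical stand-in for a machine record in state.
type machine struct {
	addresses []string // addresses written by something like the address updater
}

// unit is a hypothetical stand-in for a unit record in state.
type unit struct {
	machine      *machine
	selfReported string // address the unit looked up for itself during setup
}

// address prefers the machine's recorded address (the canonical location)
// and falls back to the unit's own lookup only when the machine has none.
func (u *unit) address() (string, bool) {
	if u.machine != nil && len(u.machine.addresses) > 0 {
		return u.machine.addresses[0], true
	}
	if u.selfReported != "" {
		return u.selfReported, true
	}
	return "", false
}

func main() {
	m := &machine{}
	u := &unit{machine: m, selfReported: "10.0.3.15"}

	addr, _ := u.address()
	fmt.Println(addr) // no machine address yet, so the unit's own lookup is used

	// If a wrong address is poked into the machine record, it wins.
	m.addresses = []string{"wrong.example.invalid"}
	addr, _ = u.address()
	fmt.Println(addr)
}
```

Under this ordering an empty machine record is harmless, whereas an incorrect one silently overrides the unit's own (correct) lookup, which is the reason given in the log for preferring the not-implemented approach over storing unreliable DNSName results.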