wallyworld | ericsnow: has it been like that forever? i didn't think lxc had that issue? | 00:00 |
---|---|---|
ericsnow | wallyworld: this was just the lxd provider | 00:01 |
wallyworld | ok, that makes sense since that provider was re-written from scratch | 00:01 |
wallyworld | i didn't think we had this issue in 1.25 | 00:01 |
wallyworld | with lxc | 00:02 |
ericsnow | wallyworld: I'm guessing no-one had tried to use the LXD provider with non-amd64 before | 00:02 |
wallyworld | not till now :-) | 00:02 |
ericsnow | wallyworld: anyway, I gotta run; natefinch can fill in the rest :) | 00:02 |
wallyworld | ok, ty | 00:03 |
wallyworld | ttyl | 00:03 |
=== Spads_ is now known as Spads | ||
axw | menn0: if you were going the first route, I'd just supply a names.Tag rather than a GlobalEntity | 00:17 |
menn0 | axw: yep, fair enough | 00:18 |
axw | menn0: I'm ambivalent though. I've found having the methods on state make it a bit easier to mock in tests | 00:18 |
axw | but at the same time, I don't like throwing it all on the one type | 00:18 |
menn0 | that ship has sailed I think :) | 00:18 |
axw | menn0: :) | 00:19 |
menn0 | but I guess I don't have to make it worse | 00:19 |
menn0 | axw: you're right about mocking in tests though... given that we only have concrete Machines I'm setting myself up for difficult testing if I add methods to Machine | 00:21 |
menn0 | difficult / dumb | 00:21 |
axw | menn0: some packages go to lengths to mock Machine out too, but it is a PITA | 00:22 |
menn0 | axw: especially for something so simple | 00:23 |
axw | menn0: not sure if you've looked at state/volume.go or state/filesystem.go, but this is why I put everything at the top level | 00:23 |
axw | it made testing much easier | 00:23 |
axw | menn0: I ended up having to have some methods on Volume/Filesystem to support, e.g. StatusSetter/StatusGetter | 00:24 |
menn0 | yep, makes sense | 00:24 |
rick_h_ | axw: howdy, wanted to chat on the model statuses and get your thought on something | 00:31 |
rick_h_ | axw: since the end state is archived, what about destroying is archiving? | 00:31 |
axw | rick_h_: it did occur to me. I would say it's not really archiving at that point. It's destroying the resources within the model, it's just the model docs that are archived | 00:32 |
rick_h_ | axw: right, but it's working toward the archived state. And it's not used elsewhere because of that idea | 00:32 |
rick_h_ | axw: I guess it's not quite 'factual' but as far as state transitions it goes through a few moving bits on the way to archived. Picking any one points out a bit of what's up? | 00:33 |
axw | rick_h_: so I kinda agree that you should have to go through "archiving" to get to "archived", but at the same time I don't think "archiving" implies that anything is being cleaned up | 00:34 |
rick_h_ | axw: do you have a link to the bug there? I wanted to go back and refer to the list you had but I'm failing to see the bug | 00:34 |
axw | but more ... put away | 00:34 |
axw | 1 sec | 00:34 |
axw | rick_h_: https://bugs.launchpad.net/bugs/1534627. I've currently got it as "active", "destroying", "archived" in my branch | 00:35 |
mup | Bug #1534627: Destroyed models still show up in list-models <2.0-count> <conjure> <juju-release-support> <juju-core:In Progress by axwalk> <https://launchpad.net/bugs/1534627> | 00:35 |
rick_h_ | axw: so what caused us to keep it around by default anyway? | 00:36 |
rick_h_ | axw: vs a cleanup all the way and removal? | 00:36 |
axw | rick_h_: I don't know, that predates my involvement. thumper? | 00:36 |
rick_h_ | axw: I mean it seems useful, but it seems like the exception to the rule. A destroy-model --archive that has this seems like the cleaner way | 00:36 |
axw | it definitely is an exception to the rule | 00:37 |
rick_h_ | I guess I assumed we had a good reason from some stakeholder but I'm coming up empty for making this the default behavior. | 00:37 |
rick_h_ | wallyworld: ^ | 00:37 |
wallyworld | rick_h_: predates me too | 00:38 |
thumper | hmm... | 00:38 |
axw | rick_h_: it doesn't seem useful to me, TBH. I'd prefer to just lean on external logging to keep records | 00:38 |
thumper | was there mostly so we don't suddenly error | 00:38 |
rick_h_ | axw: with the rsyslog logging work going in I'm +1 to that idea | 00:38 |
thumper | because the way people watch an environment being destroyed is status | 00:38 |
thumper | if you had "watch juju status" running | 00:39 |
thumper | you would watch the services go down | 00:39 |
thumper | machines being removed | 00:39 |
rick_h_ | thumper: oic, so this was so that you could see it die vs go away and then if it failed to destroy you had no good way to get at it? | 00:39 |
thumper | then "error: unknown environment" | 00:39 |
thumper | mostly so you could watch it die without errors | 00:39 |
thumper | and grab logs... maybe | 00:40 |
rick_h_ | axw: can you investigate if we can just destroy in a clean way and do away with the kept around models at all please? | 00:40 |
rick_h_ | axw: I feel like we're chasing the wrong end of the problem atm | 00:40 |
wallyworld | rick_h_: our general pattern has been to shepherd entities through dying -> dead. until they are reaped by a cleaner, they will still be there | 00:40 |
axw | rick_h_: sure | 00:41 |
rick_h_ | wallyworld: right, so can we do that better vs keep things around like this? | 00:41 |
thumper | the 24 hours thing was entirely arbitrary | 00:41 |
thumper | and something I picked out of the air | 00:41 |
thumper | we could have a sensible default... | 00:41 |
thumper | of a much smaller number | 00:41 |
rick_h_ | thumper: yea, I mean can we just watch it until it's dead cleanly and then remove it right away? | 00:41 |
wallyworld | rick_h_: oh, i didn't realise they were kept for 24hrs | 00:41 |
rick_h_ | e.g. check every 5s or something? | 00:41 |
wallyworld | rick_h_: i thought they were removed immediately once dead, sorry | 00:41 |
rick_h_ | I'm all for giving the user confidence/observability in what they're doing | 00:42 |
thumper | it is the client going "juju status" | 00:42 |
thumper | and us wanting to say "environment is dead" | 00:42 |
thumper | rather than "error, you suck" | 00:42 |
rick_h_ | right, so at some point it fails that "model is not available" or the like | 00:42 |
thumper | eventually | 00:42 |
rick_h_ | you asked for it to go away, what do you expect? | 00:42 |
thumper | right now, 24 hours later | 00:42 |
rick_h_ | yea, that's gotta go imo | 00:42 |
thumper | this is the job of the undertaker | 00:42 |
rick_h_ | if we can tell from status it's dead, then we should be able to reap it right then and there | 00:42 |
wallyworld | +1 | 00:43 |
rick_h_ | and if status says it's not dead, then we don't reap it and you can get at it and diagnose | 00:43 |
wallyworld | that's how i thought it worked | 00:43 |
rick_h_ | heh no | 00:43 |
wallyworld | didn't realise models were special | 00:43 |
rick_h_ | so we were looking to do all this renaming and such which :/ | 00:43 |
rick_h_ | little snowflakes in the wind :) | 00:43 |
thumper | oh shit | 00:45 |
thumper | I think I need to work out this juju 2.0 cli thing | 00:45 |
* thumper needs to add a maas cloud | 00:45 | |
thumper | perhaps after a dog walk | 00:45 |
* rick_h_ grumbles | 00:45 | |
rick_h_ | thumper: on that please fix the whole maas as a cloud vs not needing credentials/etc kthx | 00:45 |
thumper | wat? | 00:46 |
* thumper has no idea about that | 00:46 | |
axw | rick_h_: as in maas showing up in list-clouds? | 00:46 |
rick_h_ | thumper: there's a thing in that maas setup that's confusing users | 00:46 |
thumper | maas is a cloud isn't it? | 00:46 |
axw | maas is a type of cloud | 00:46 |
rick_h_ | axw: yea, and if you do add it and then you try to add-credential it tells you maas doesn't need credentials | 00:46 |
axw | wat | 00:46 |
rick_h_ | but then users don't know how to use it since it's different | 00:46 |
thumper | wat? | 00:47 |
* redir goes eod | 00:47 | |
redir | see you tomorrow juju-dev | 00:47 |
rick_h_ | night redir | 00:47 |
axw | rick_h_: I think that might be covered by anastasiamac_'s branch, http://reviews.vapour.ws/r/4573/ | 00:47 |
axw | night redir | 00:47 |
anastasiamac_ | rick_h_: what axw said - it's being fixed ^^ | 00:47 |
anastasiamac_ | (landing, really) | 00:47 |
=== redir is now known as redir_afk | ||
anastasiamac_ | thumper: about cloud vs cloud type, see https://bugs.launchpad.net/juju-core/+bug/1564054 | 00:48 |
mup | Bug #1564054: lxd, maas and manual do not make sense in list-clouds <juju-release-support> <juju-core:In Progress by anastasia-macmood> <https://launchpad.net/bugs/1564054> | 00:48 |
wallyworld | rick_h_: what do you mean? juju add-credential maas works fine | 00:49 |
rick_h_ | wallyworld: hmm, a user was hitting it and gave me a pastebin the other day | 00:49 |
wallyworld | rick_h_: that would have been an old beta | 00:50 |
rick_h_ | wallyworld: I'm trying to find it, they got an error out of juju along the lines of "xxx does not need a credential" | 00:50 |
wallyworld | this is all wip | 00:50 |
thumper | :) | 00:50 |
rick_h_ | heh | 00:50 |
wallyworld | it used to be that | 00:50 |
wallyworld | we have been delivering things as they are finished in each beta | 00:50 |
rick_h_ | ah ok | 00:50 |
rick_h_ | never mind then | 00:50 |
wallyworld | add credentials came late in the beta cycle | 00:50 |
wallyworld | rick_h_: stakeholders want maas included in list clouds | 00:51 |
rick_h_ | wallyworld: did we have plans for an add-cloud as well? | 00:51 |
wallyworld | that's why we did it; we know maas is not a cloud | 00:51 |
rick_h_ | wallyworld: well, maas clouds need to be there. The trouble is the lack of an add-cloud command for now | 00:51 |
anastasiamac_ | wallyworld: a user would hit a "boom" if they specify cloud as part of command arguments.. | 00:51 |
rick_h_ | wallyworld: yea, I was one of those, but mark is right | 00:51 |
axw | rick_h_: we have add-cloud, but not interactive | 00:51 |
rick_h_ | anastasiamac_: wallyworld so I +1 going with Mark's feedback there | 00:51 |
rick_h_ | axw: right, interactive is what I'm thinking | 00:51 |
axw | you just point at a YAML file | 00:51 |
rick_h_ | axw: not for 2.0 but we should add it down the road | 00:52 |
wallyworld | rick_h_: will you break the news to adam? | 00:52 |
axw | it would be helpful | 00:52 |
rick_h_ | axw: right, but folks are confused as to what goes in the clouds.yaml, the credentials, and the config | 00:52 |
* axw nods | 00:52 | |
rick_h_ | wallyworld: stokachu? | 00:52 |
wallyworld | yeah | 00:52 |
wallyworld | he needs maas in list clouds for his app | 00:52 |
rick_h_ | wallyworld: heh ok, will do | 00:52 |
rick_h_ | wallyworld: I'm still -1 on the maas:/ special thing where you don't have to add it to the clouds.yaml | 00:53 |
wallyworld | rick_h_: that was also well received and asked for by users :-) | 00:53 |
rick_h_ | wallyworld: bah, it just adds a 'different way to do it' that's special and unique and causes confusion folks have to look it up | 00:54 |
wallyworld | rick_h_: you mean like lxd :-P | 00:54 |
wallyworld | there's no lxd in clouds.yaml also | 00:54 |
rick_h_ | wallyworld: heh | 00:55 |
thumper | ah crap... | 00:55 |
wallyworld | rick_h_: the use case driving this is - people want to know what they can put with juju bootstrap <controllername> <what goes here> | 00:56 |
thumper | I don't think my 1gb kvm maas instances are up to running juju are they? | 00:56 |
thumper | wallyworld: remember when I said the whole cloud credential spec would be a big pile of work? | 00:56 |
rick_h_ | wallyworld: yea...but they can't do maas without config info (in this case the IP address) so we've changed the bootstrap command to make it work | 00:56 |
wallyworld | rick_h_: altered slightly yeah, but still intuitive imo | 00:58 |
* thumper goes to walk the dog and get away from the computer a bit | 00:58 | 
wallyworld | thumper: oh i know | 00:58 |
wallyworld | i never doubted it | 00:58 |
wallyworld | we piled in so much stuff in such a short time, and delivered incrementally over several betas and got beat up when it wasn't all there beta 1 :-/ | 00:59 |
rick_h_ | wallyworld: no one got beat up :P | 01:09 |
rick_h_ | wallyworld: I couldn't reach you! | 01:09 |
wallyworld | lol | 01:09 |
wallyworld | rick_h_: not beat up directly | 01:09 |
rick_h_ | :P | 01:10 |
wallyworld | , more complaints :-) | 01:10 |
rick_h_ | wallyworld: we just had an opinion that add-credential should have come first :P | 01:15 |
rick_h_ | not complaints, opinions...I hear it's like something else everyone has :) | 01:15 |
wallyworld | rick_h_: sure, but it was harder to add and was icing - we needed the core functionality with credentials added by hand. having fancy add credentials with nothing working would not have been cool | 01:16 |
rick_h_ | wallyworld: understand completely...after I stopped and thought about it :P | 01:16 |
wallyworld | :-) | 01:16 |
wallyworld | rick_h_: next time we'll start the work at the beginning of the cycle, not a month or 2 out :-) | 01:17 |
rick_h_ | wallyworld: psh, don't go changing everything on me | 01:17 |
wallyworld | great, now i've got that song stuck in my head | 01:17 |
rick_h_ | lol, glad to be of service | 01:17 |
wallyworld | i love you just the way you are.... ta de dum | 01:18 |
wallyworld | axw: got 5 minutes, standup? | 01:43 |
axw | wallyworld: sure, brt | 01:43 |
bradm | should juju2 be able to bootstrap against a private openstack cloud? I'm getting: | 01:50 |
bradm | 2016-04-14 01:30:25 ERROR cmd supercommand.go:448 failed to bootstrap model: model "admin" of type openstack does not support instances running on "amd64" | 01:50 |
bradm | my index.json does list 14.04 images... | 01:50 |
bradm | ah, when I don't use --upload-tools it gives: | 02:09 |
bradm | 2016-04-14 02:05:56 ERROR cmd supercommand.go:448 failed to bootstrap model: cannot start bootstrap instance: no "trusty" images in bootstack-canonistack-bos01 matching instance types [m1.small m1.medium m1.large m1.xlarge] | 02:09 |
thumper | wallyworld: ping | 02:20 |
wallyworld | wot | 02:21 |
wallyworld | thumper: ? | 02:22 |
thumper | wallyworld: got a few minutes to chat? | 02:22 |
wallyworld | sure | 02:22 |
wallyworld | bradm: juju 2 should work similar to juju 1 for private clouds | 02:23 |
wallyworld | it's all about getting the correct streams metadata which can be tricky | 02:23 |
bradm | wallyworld: just filed LP#1570162 about it | 02:23 |
wallyworld | but if it works for juju 1 it should work for juju 2 | 02:24 |
wallyworld | ok, we'll look at the bug | 02:24 |
bradm | let me know if you need any more info about it | 02:24 |
thumper | wallyworld: 1:1 ? | 02:24 |
wallyworld | yup | 02:24 |
bradm | wallyworld: this is mitaka on trusty, and it has different arch compute nodes too | 02:24 |
bradm | grabbing some lunch, back in a while | 02:25 |
mup | Bug #1570162 opened: juju2 openstack private cloud cannot start bootstrap instance <canonical-bootstack> <juju-core:New> <https://launchpad.net/bugs/1570162> | 02:25 |
mwhudson | wallyworld: juju-mongo-tools3.2 just got accepted \o/ | 02:45 |
* mwhudson runs away for a bit | 02:45 | |
wallyworld | mwhudson: awesome | 02:45 |
menn0 | axw: I was thinking that the machiner would send the SSH host key to the state server (via the machiner facade) | 03:00 |
axw | menn0: sounds fine to me | 03:01 |
menn0 | axw: seem reasonable? | 03:01 |
natefinch | wallyworld: btw, this is the fix for the lxd arch bug that I'm pretty sure works: https://github.com/juju/juju/pull/5116/files. But I still can't quite fully test due to not being able to fully disable upload-tools.. I haven't figured out how to disable saving the tools (I've found a couple likely spots and commented out code, but to no avail). | 03:04 |
wallyworld | ok, i'll look at pr, thanks | 03:04 |
wallyworld | there's a toolsstorage that manages the tools in gridfs | 03:05 |
natefinch | wallyworld: I commented out this whole loop: https://github.com/juju/juju/blob/master/apiserver/tools.go#L262 which looks like the place where we store the tools after uploading them... but it didn't seem to change anything | 03:07 |
wallyworld | natefinch: why not just make storage.Add() itself a no op | 03:09 |
wallyworld | natefinch: anyway, here's what you wanted | 03:10 |
wallyworld | fetchAndCacheTools | 03:10 |
wallyworld | or maybe not, that is when we download them from streams | 03:10 |
wallyworld | it's in jujud/bootstrap.go | 03:11 |
wallyworld | logger.Debugf("Adding tools: %v", toolsVersion) | 03:11 |
wallyworld | if err := toolstorage.Add(bytes.NewReader(data), metadata); err != nil { | 03:11 |
wallyworld | i just searched for usages of toolsstorage.Add() | 03:11 |
wallyworld | there's only a few to check | 03:11 |
natefinch | wallyworld: thanks... I really wasn't sure what to search for | 03:13 |
wallyworld | just the .Add() | 03:13 |
wallyworld | and see what calls it | 03:13 |
wallyworld | as that's where tools are added to state | 03:13 |
natefinch | wallyworld: yes but, I didn't know it was called .Add | 03:13 |
wallyworld | but you commented it out so you must have ? | 03:13 |
wallyworld | in that place | 03:14 |
wallyworld | anyways, you'll need to talk to john about that todo | 03:14 |
natefinch | wallyworld: I looked in the server code for where we were handling the http post for tools upload | 03:14 |
wallyworld | as he is making lxd work on provisioned machines | 03:14 |
wallyworld | sure, and toolsstorage.Add() was right there | 03:14 |
natefinch | wallyworld: it never occurred to me that there would be code in the client that would be identical | 03:14 |
wallyworld | bootstrap is special | 03:15 |
natefinch | wallyworld: indeed | 03:15 |
wallyworld | it has to have logic built in as there's no server running yet | 03:15 |
wallyworld | so to recap, we'll need to get that todo sorted asap | 03:15 |
wallyworld | i think | 03:15 |
natefinch | not really | 03:15 |
natefinch | this is the lxd provider | 03:16 |
natefinch | we only support localhost | 03:16 |
wallyworld | actually, yeah maybe not | 03:16 |
wallyworld | it's only local yeah | 03:16 |
natefinch | some day we maybe might support a remote host. Today's not that day :) | 03:16 |
wallyworld | i was confusing the provisioner | 03:16 |
natefinch | yeah, it's confusing | 03:16 |
wallyworld | good that it's fixed, ty | 03:17 |
natefinch | well, I'll go test it, but I'm pretty certain that fixes it | 03:17 |
wallyworld | yeah, the code looks correct | 03:18 |
natefinch | there was a suspicious log message on boot that we were saving amd64 tools, even though we were bootstrapping with arm64... and that message changes to arm64 now... but I have to do the full test with upload-tools disabled to know for sure. | 03:18 |
wallyworld | natefinch: maybe not - cloud init gets tools via the controller which acts as a proxy | 03:19 |
wallyworld | so even if upload tools is used, you just trace requests into the controller | 03:20 |
wallyworld | i didn't think of that previously | 03:20 |
wallyworld | so just trace the tools download handler gets | 03:20 |
natefinch | I'm building on the arm64 machine which is just about the slowest machine in existence, I'm pretty sure | 03:28 |
thumper | quick | 03:29 |
thumper | someone | 03:30 |
thumper | where is the cloudinit data stored | 03:30 |
thumper | on the cloud machine | 03:30 |
thumper | I need to get it off before the code kills the machine | 03:30 |
natefinch | dunno, sorry | 03:31 |
thumper | menn0: ^^? | 03:31 |
thumper | juju brought the machine up but can't ssh in | 03:32 |
thumper | axw: hey... | 03:32 |
thumper | this may be part your history | 03:33 |
thumper | 2016-04-14 03:33:01 DEBUG juju.provider.common bootstrap.go:328 connection attempt for 192.168.100.3 failed: /var/lib/juju/nonce.txt does not exist | 03:33 |
thumper | axw: we just bring up a machine with ssh keys | 03:33 |
thumper | and I'm guessing that file | 03:33 |
axw | thumper: yes, so we know we've connected to the right machine | 03:34 |
thumper | axw: but that file isn't there | 03:34 |
axw | thumper: something's not doing the right thing with cloud-init then I guess? | 03:34 |
thumper | axw: know where the cloud init file is? | 03:35 |
axw | thumper: /var/lib/cloud I think? | 03:35 |
* axw rummages | 03:35 | |
thumper | I have user_data.txt from there | 03:35 |
thumper | but it is base64 encoded | 03:35 |
thumper | and decoded looks binary | 03:35 |
axw | thumper: there should be a plaintext file nearby | 03:35 |
axw | thumper: instance/cloud-config.txt maybe | 03:36 |
thumper | doesn't exist | 03:36 |
thumper | oh | 03:36 |
thumper | zero bytes | 03:36 |
axw | thumper: interesting, I don't think I've ever seen a zero-byte one. maybe part of the problem. | 03:37 |
thumper | hmm... | 03:37 |
thumper | I think I'll add logging to the creation of the user data | 03:37 |
mup | Bug #1570175 opened: juju2 kill-controller doesn't work when bootstrap server is unreachable <canonical-bootstack> <juju-core:New> <https://launchpad.net/bugs/1570175> | 03:37 |
thumper | axw: we are this |-| close to having maas2 boot | 03:38 |
axw | thumper: cool :) | 03:38 |
natefinch | ahh man, the fact we can't cross compile is killing me on this dumb arm64 machine | 03:42 |
natefinch | my laptop compiles juju in like 10 seconds. This machine takes like 3 minutes | 03:42 |
menn0 | thumper: sorry, missed your message... I don't know the answer either (cloud-init newbie) | 03:43 |
thumper | well, I've added some debugging code to the cloud init renderer | 03:43 |
thumper | which I found... | 03:44 |
menn0 | thumper, axw: would the cloudinit logs give some clues? (/var/log/cloud-init.output and cloud-init.log) | 03:44 |
thumper | so I can catch it as it is yaml-ified, before gzip, base64 | 03:44 |
thumper | ok, got the cloud init | 03:45 |
thumper | but it is kinda big... | 03:45 |
* thumper looks deeper | 03:45 | |
axw | menn0 thumper: *is* there a /var/log/cloud-init-output.log? | 03:45 |
thumper | I'll look this time | 03:45 |
thumper | last time 10 minutes passed and machine was released | 03:45 |
thumper | - install -D -m 644 /dev/null '/var/lib/juju/nonce.txt' | 03:46 |
thumper | - printf '%s\n' 'user-admin:bootstrap' > '/var/lib/juju/nonce.txt' | 03:46 |
thumper | from the local userdata it is sending down | 03:46 |
thumper | not much of a nonce :) | 03:46 |
axw | thumper: heh yeah, but it's better than connecting to some random machine on your local network and running juju setup on it :) | 03:47 |
thumper | true | 03:48 |
thumper | 2016-04-14 03:47:40,431 - __init__.py[WARNING]: Unhandled non-multipart (text/x-not-multipart) userdata: 'H4sIAAAJbogA/+w7aXPbuJLf...' | 03:50 |
thumper | WTF... | 03:50 |
thumper | basically it stopped cloudinit | 03:50 |
axw | thumper: nice :) | 03:50 |
axw | thumper: so MAAS is just dropping it on the floor? | 03:50 |
* thumper shrugs | 03:50 | |
thumper | I'll have to dig a bit more | 03:50 |
axw | thumper: people have complained about the nonce.txt file being missing before, I never could repro tho | 03:51 |
axw | maybe it depends on contents/padding/something | 03:51 |
natefinch | wallyworld: huzzah, success | 04:00 |
wallyworld | great | 04:01 |
wallyworld | after waiting ages for a compile | 04:01 |
natefinch | wallyworld: well, I kept getting stupid compile errors... finally realized I could just slap a return nil before the code I wanted to avoid | 04:01 |
wallyworld | thumper: anastasiamac_ committed a fix for that maas bug you saw with listing credentials | 04:07 |
bradm | wallyworld: hmm, I can bootstrap with juju2 beta 3 on canonistack-lcy01, but not on this new mitaka cloud | 04:09 |
wallyworld | bradm: it may not have the flavours set up correctly? | 04:09 |
wallyworld | juju uses those plus arch to determine the instance id | 04:10 |
wallyworld | and matches it all with simplestreams | 04:10 |
wallyworld | it all needs to match up correctly | 04:10 |
bradm | wallyworld: it has flavours, I'm not sure how you have to set them up, we've never done anything about mapping them before | 04:12 |
bradm | wallyworld: I can certainly boot instances, although that is done by specifying an image id | 04:12 |
wallyworld | right, it appears the image id selection algorithm fails due to reaons | 04:13 |
wallyworld | reasons | 04:13 |
wallyworld | it would need investigation to see what's wrong with the set up | 04:13 |
bradm | I can see all 3 arches in the index.json | 04:13 |
wallyworld | i personally have not internalised the algorithm - would need to trace it all out and see what's needed where | 04:14 |
bradm | something must have changed with mitaka | 04:14 |
bradm | or juju2 | 04:14 |
bradm | hm, thats a point, I should try juju1 on it | 04:15 |
wallyworld | bradm: also anastasiamac_ thinks there could be related (same?) issue that is fixed in beta 4 | 04:15 |
bradm | wallyworld: yeah, maybe I should just wait until beta 4 is out | 04:16 |
bradm | are we there yet? are we there yet? ;) | 04:16 |
wallyworld | bradm: in maybe 24 hrs tops | 04:16 |
wallyworld | in time for xenial cut off | 04:17 |
bradm | wallyworld: awesome. I'll try out juju1 on this to see how it goes, might narrow things down a bit. | 04:17 |
wallyworld | bradm: yes please, that data point would be helpful in case there's still an issue | 04:17 |
wallyworld | juju1 and 2 should behave the same AFAIK | 04:17 |
wallyworld | that bug mentioned above may well be an existing issue | 04:18 |
wallyworld | i don't know the details | 04:18 |
bradm | a newly released mitaka being deployed via unreleased charms and trying to use a beta juju on top of it? nah, couldn't be any problems there. :) | 04:18 |
wallyworld | so we'll take it a step at a time: try juju1, wait till beta4 | 04:18 |
wallyworld | of course not | 04:18 |
wallyworld | what could possibly go wrong | 04:18 |
bradm | its a house of unreleased cards | 04:19 |
bradm | but thats why we're doing it, to help bash out any bugs early | 04:19 |
bradm | wallyworld: aha, it fails in the same way | 04:25 |
wallyworld | \o/ | 04:25 |
wallyworld | not juju2 :-) | 04:25 |
bradm | the index.json is pretty simple | 04:28 |
bradm | oho | 04:30 |
bradm | arch: x86_64 | 04:30 |
bradm | why is that there | 04:30 |
natefinch | whelp, I have a PR up for that fix here: http://reviews.vapour.ws/r/4555/ I haven't been able to successfully change the tests such that they actually fail with the old code, but it's time for bed. | 04:35 |
bradm | wallyworld: hah, that was it. for some reason the version of glance-simplestreams-sync we have was canonicalizing the arch from amd64 to x86_64. now I get a different error. :) | 04:40 |
wallyworld | bradm: progress! | 04:40 |
bradm | wallyworld: now it wants me to give it a network id | 04:40 |
wallyworld | what is "it"? | 04:41 |
bradm | the error from the bootstrap | 04:41 |
wallyworld | i haven't done much with networking, not sure | 04:42 |
bradm | Multiple possible networks found, use a Network ID to be more specific. | 04:42 |
bradm | that feels like a nova error, though. | 04:42 |
wallyworld | yes it does | 04:42 |
wallyworld | could be propagated back from start instance | 04:43 |
bradm | how do you tell juju2 what the default network is, like the network setting in juju1 | 04:43 |
bradm | just trying to add it where you define the endpoint | 04:45 |
bradm | nope, no go :-/ | 04:47 |
bradm | do I just mark the bug as invalid? | 04:48 |
wallyworld | bradm: juju has spaces and things. but if you are asking about config, instead of foo=bar in environments.yaml, you use --config "foo=bar" on the bootstrap cmd line | 04:50 |
wallyworld | juju2 i mean | 04:50 |
wallyworld | if it's a set up issue with openstack, then yeah, bug may be invalid | 04:51 |
bradm | wallyworld: thats got it | 04:53 |
bradm | wallyworld: at least, its trying to get an address now, so its much further than before. looks like we need a way to set the default network | 04:53 |
wallyworld | that worked? | 04:53 |
wallyworld | ok | 04:53 |
wallyworld | that bit i am not sure of | 04:53 |
wallyworld | bradm: dimiter and andy are the folks to ask about that | 04:54 |
bradm | oh boy, not quite. now it errored out with: | 04:57 |
bradm | error: flag provided but not defined: --model-config | 04:57 |
wallyworld | bradm: you need upload-tools if you are running from trunk | 05:00 |
bradm | wallyworld: that was with upload-tools... | 05:00 |
bradm | trying now without | 05:00 |
wallyworld | the error looks like it is because the jujud binary from tools is old | 05:01 |
bradm | it seems to be getting further without the upload tools | 05:05 |
bradm | oh, still fails. | 05:05 |
bradm | https://pastebin.canonical.com/154283/ <- error message | 05:05 |
bradm | maybe I should just wait for beta4 | 05:09 |
menn0 | axw: here's the state part and some of the API work for storing SSH host keys. http://reviews.vapour.ws/r/4586/ | 05:09 |
axw | menn0: cool, looking | 05:09 |
wallyworld | bradm: do you have the latest code checked out? that could explain why upload tools give poor results. also that error - is the auth url correct? just a guess | 05:09 |
bradm | wallyworld: nope, I'm just using the beta3 version from the ppa | 05:09 |
wallyworld | bradm: in that case upload tools does nothing | 05:10 |
bradm | wallyworld: that can't be true, I saw very different things | 05:10 |
wallyworld | ok, so it grabs the first juju binary from the search path and uses that | 05:10 |
menn0 | axw: next up... some routines in juju/utils for parsing host key files and generating known hosts files | 05:10 |
wallyworld | bradm: or builds from source, but you don't have source checked out | 05:11 |
bradm | wallyworld: and what auth url do you mean? the endpoint in the clouds.yaml | 05:11 |
wallyworld | bradm: so there may be another juju in the path | 05:11 |
wallyworld | bradm: for openstack, yes i think so | 05:11 |
bradm | wallyworld: there's no juju source on this box at all | 05:11 |
bradm | wallyworld: there is juju 1.25.5 | 05:11 |
wallyworld | bradm: so in that case upload tools is using those binaries | 05:12 |
wallyworld | which explains a lot | 05:12 |
wallyworld | the --model-config unknown etc | 05:12 |
bradm | I used juju 1.25.5 to deploy openstack | 05:12 |
wallyworld | sure but you said you were using juju2 from a ppa | 05:12 |
bradm | yeah, juju2 from ppa, juju1 from ppa | 05:12 |
wallyworld | so if you use a juju2 client and upload tools picks the 1.25 binaries it will screw up | 05:12 |
wallyworld | don't use upload tools please :-) | 05:13 |
wallyworld | unless you are a developer and have source | 05:13 |
bradm | righto. | 05:13 |
bradm | I'm just trying different things to work out what its doing :) | 05:14 |
wallyworld | bradm: so now it may be that keystone is messing up | 05:14 |
wallyworld | bradm: i would love upload tools to be removed tbh | 05:14 |
wallyworld | it causes too many issues unless used under strict conditions | 05:15 |
bradm | I'm sure we've had to use it to fix things in the past | 05:15 |
bradm | but yeah, if its no longer of use | 05:15 |
wallyworld | with source yes, or custom binaries put in the right place | 05:15 |
wallyworld | it has a use, but you need to be careful | 05:15 |
bradm | wallyworld: I've been using juju since the python days, I'm sure I've got all sorts of redundant things in my brain about it :) | 05:16 |
wallyworld | and in this case, it caused "weird" errors until i found out you didn't have source code and had 2 versions etc | 05:16 |
wallyworld | bradm: understood. you deserve a medal :-) | 05:16 |
wallyworld | for consuming all of our bugs for so long | 05:16 |
mup | Bug #1570162 changed: juju2 openstack private cloud cannot start bootstrap instance <canonical-bootstack> <juju-core:Invalid> <https://launchpad.net/bugs/1570162> | 05:17 |
bradm | oh my | 05:20 |
bradm | that error message really is true, I can't reach the keystone IP from a VM | 05:20 |
wallyworld | at least juju is not lying :-) | 05:20 |
bradm | juju would never lie to us, would it? | 05:21 |
wallyworld | *never* | 05:21 |
bradm | missing a route. | 05:24 |
axw | menn0: code looks good, but I have a question about the structure of the keys | 05:24 |
axw | (in RB) | 05:24 |
menn0 | axw: can the SSH server really have multiple host keys for a given algorithm? | 05:26 |
axw | menn0: I think so, but I'll test to make sure | 05:26 |
menn0 | axw: I just checked the man page... I don't think it's possible | 05:27 |
axw | menn0: sshd_config? what part? | 05:28 |
menn0 | axw: man sshd | 05:28 |
bradm | juju2 likes leaving secgroups around, just hit a quota limit | 05:29 |
menn0 | axw: the "-h host_key_file" part and the FILE section | 05:29 |
menn0 | FILES | 05:29 |
menn0 | axw: there's one host key file per key/algorithm type | 05:29 |
axw | menn0: AFAIK they're just the default ones, referenced by /etc/ssh/sshd_config | 05:30 |
menn0 | axw: the wording in man sshd_config is more vague about it | 05:31 |
menn0 | axw: I guess it's safer to use []string (with no real downside) | 05:32 |
menn0 | axw: good catch | 05:32 |
menn0 | axw: in that case, I'll just store the key files in state verbatim (they're one line each) | 05:32 |
menn0 | axw: and handle the parsing and reformatting in the client when it generates the bespoke known_hosts file | 05:33 |
axw | menn0: SGTM. FWIW, starting sshd with multiple RSA keys works fine | 05:33 |
* axw nods | 05:33 | |
menn0 | axw: you just tried it? | 05:33 |
axw | menn0: yep | 05:33 |
axw | menn0: BTW, there's a function that you can use to parse the public keys: https://godoc.org/golang.org/x/crypto/ssh#ParseAuthorizedKey | 05:33 |
menn0 | axw: ok cool.. that's definitely the right approach then | 05:33 |
menn0 | axw: good to know... I just found something in juju/utils/ssh which also does it :) | 05:35 |
axw | heh | 05:35 |
axw | can't have too many | 05:36 |
menn0 | axw: maybe the keys should be parsed to a (type, keydata, comment) struct and send and stored that way? | 05:36 |
bradm | oh, you can just run juju2 enable-ha ? | 05:36 |
bradm | er, can't just | 05:36 |
menn0 | axw: rather than the raw strig | 05:36 |
menn0 | string | 05:36 |
axw | menn0: *shrug* if it's easy enough to do without losing any info, maybe | 05:37 |
axw | I'm not sure it's worth the effort tho | 05:37 |
axw | IOW, authorized key format is already perfect information, so probably not worth destructuring at that point unless we think we're going to query on the individual fields | 05:39 |
menn0 | axw: you're right | 05:41 |
menn0 | axw: what threw me a little was that the known_hosts file doesn't include the comment field on my machine | 05:42 |
menn0 | axw: so I was thinking it would need to be stripped | 05:42 |
menn0 | axw: but looking at the docs, it's fine if it's there | 05:42 |
menn0 | axw: []string it is | 05:42 |
axw | menn0: cool. a comment saying that it's in authorized_keys format would be helpful | 05:43 |
menn0 | axw: yep, will add. | 05:43 |
bradm | wallyworld: yeah, this network thing is going to be a blocker. need a way to set it as a default somewhere. just setup a multiuser env, tried to boot a VM and it errored out with the same thing about multiple networks | 05:57 |
wallyworld | bradm: i recall conversations in this area but not any specifics, not sure of the status | 05:57 |
wallyworld | maybe there's a solution already, i just don't know it | 05:57 |
wallyworld | axw_: ping | 06:09 |
axw_ | wallyworld: pong | 06:09 |
wallyworld | axw_: stupid question of the day, i'll make a dick of myself i'm sure. can you look at line 71 of environ_broker.go in the lxd provider | 06:10 |
axw_ | wallyworld: yep? | 06:11 |
wallyworld | should be finishInstanceConfig() | 06:11 |
wallyworld | the arg struct is passed by value | 06:11 |
wallyworld | so how will it ever work | 06:11 |
axw_ | wallyworld: pretty sure InstanceConfig is a pointer | 06:11 |
axw_ | yep | 06:11 |
wallyworld | ah right | 06:11 |
wallyworld | yes i missed that | 06:12 |
axw_ | well shit. I just bootstrapped and it panicked on the server due to "send on closed channel" in the systemd package | 06:13 |
wallyworld | yay | 06:17 |
menn0 | axw_: please take another look at http://reviews.vapour.ws/r/4586/diff/ | 06:20 |
menn0 | axw_: no rush as I'm about to EOD. but if you could look before you finish that would be great | 06:21 |
axw_ | menn0: no worries, have a nice evening | 06:21 |
menn0 | axw_, wallyworld : it's feeling like this ssh host key handling issue is going to be easier to lick than it seemed (still a bit to do I realise) | 06:22 |
wallyworld | win | 06:22 |
axw_ | wallyworld: is there an agenda for the sprint yet? formal or informal | 06:30 |
axw_ | wallyworld: I mean, topic list we're compiing | 06:30 |
axw_ | compiling* | 06:30 |
axw_ | wallyworld: CI for storage really needs to happen | 06:30 |
wallyworld | axw_: sort of - right as of now, it's digesting the roadmap wish list | 06:30 |
axw_ | it's been broken in master since last year, in Malta... | 06:30 |
axw_ | yep | 06:31 |
axw_ | ok | 06:31 |
wallyworld | axw_: agreed about CI for storage. can we discuss in 1:1 tomorrow? | 06:31 |
axw_ | wallyworld: sure | 06:31 |
=== bradm_ is now known as bradm | ||
mwhudson | wallyworld: is there any other packaging stuff juju is waiting on? | 07:17 |
mwhudson | other than juju itself ;-p | 07:17 |
wallyworld | not that i know of | 07:17 |
wallyworld | well, we do want mongo 3.2 in trusty and wily at some stage | 07:17 |
wallyworld | soon hopefully :-) | 07:17 |
mwhudson | wallyworld: somehow the deadlines on those don't seem so tight | 07:18 |
=== blahdeblah_ is now known as blahdeblah | ||
mwhudson | eg i guess i should backport go 1.6.1 to trusty... | 07:19 |
wallyworld | mwhudson: they are not as tight, but we do want a consistent mongo experience across series at some stage | 07:20 |
mwhudson | i guess we can find out if the packages build for a start | 07:20 |
wallyworld | axw_: could you look at http://reviews.vapour.ws/r/4587/ ? i want to land it because CI is failing with lxd on arm. i have taken nate's work and added tests | 07:41 |
wallyworld | i want to try and get this in for beta4 | 07:42 |
axw_ | wallyworld: looking | 07:42 |
wallyworld | ta | 07:42 |
bradm | I've filed LP#1570219 if anyone who knows more about the networking side of things could take a look, that'd be great, thanks. | 07:42 |
axw_ | wallyworld: done | 07:44 |
wallyworld | ta | 07:44 |
mup | Bug #1570216 opened: juju2 not cleaning up nova secgroups with openstack provider <canonical-bootstack> <juju-core:New> <https://launchpad.net/bugs/1570216> | 07:47 |
mup | Bug #1570219 opened: juju2 openstack provider setting default network <canonical-bootstack> <juju-core:New> <https://launchpad.net/bugs/1570219> | 07:47 |
fwereade__ | ashipika, cmars: responded to a couple of bits of review, went into detail with the problems with the idiosyncratic approach to workers; we should probably all talk about this live today | 07:50 |
ashipika | fwereade__: thanks.. it's ok.. i was not going to land this PR as it is.. i still welcome any and all comments, but i'll be breaking it up into smaller manageable bits, i expect.. | 07:51 |
fwereade__ | ashipika, cool, thanks, let me know if you want to talk about any of it | 07:53 |
ashipika | fwereade__: i expect i will.. many times along the way :) but this week we have our priorities elsewhere, so this might have to wait a bit.. anyways it needs to land by may, if i understand the timeline.. | 07:53 |
fwereade__ | ashipika, cool | 07:54 |
ashipika | fwereade__: and the thing is: there was no spec.. it was just "oh, this needs to work.. " | 07:54 |
fwereade__ | ashipika, yeah, that was rather my reading of it | 07:54 |
ashipika | fwereade__: we kept asking for more details, but nothing came back.. other than DTAG wants rsyslog forwarding | 07:55 |
=== terje is now known as Guest76597 | ||
voidspace | babbageclunk: so, in my test for waitForNodeDeployment I'm seeing the same NotFound error | 08:50 |
voidspace | babbageclunk: so looks like a straightforward bug | 08:50 |
babbageclunk | voidspace: yay! | 08:50 |
babbageclunk | voidspace: with my stuff landed now, what should I pick up? | 08:50 |
voidspace | babbageclunk: maas2Instance.volumes would be good | 08:51 |
babbageclunk | voidspace: ok | 08:51 |
voidspace | babbageclunk: there's a list at the top of the status document of tasks | 08:51 |
voidspace | babbageclunk: you could take on fixing the behaviour when you run against MAAS2 without the feature flag | 08:52 |
voidspace | babbageclunk: currently it just panics | 08:52 |
voidspace | babbageclunk: instead we should detect MAAS 2 and exit with an error | 08:52 |
voidspace | babbageclunk: that's easy enough to do - might be good to get that in first | 08:52 |
voidspace | babbageclunk: so attempt to create the controller even without the feature flag | 08:52 |
voidspace | babbageclunk: if it succeeds then we're on MAAS 2 | 08:53 |
voidspace | babbageclunk: if we don't have the feature flag error out with a NotSupported error | 08:53 |
babbageclunk | voidspace: yeah, I'll do that first. | 08:55 |
mwhudson | wallyworld: juju-mongo* stuff builds on trusty with a bit of flailing for -tools https://launchpad.net/~mwhudson/+archive/ubuntu/devirt/+packages/?field.series_filter=trusty | 08:56 |
voidspace | babbageclunk: ah no - my NotFound error is because my test doesn't give the fakeController.Machines method anything to return | 08:58 |
voidspace | babbageclunk: so not the same bug... | 08:58 |
voidspace | babbageclunk: ooh, see Tim's status update - he did some work for us | 09:00 |
voidspace | babbageclunk: and got bootstrap further | 09:00 |
* thumper wonders if he is in a different hangout to the others | 09:01 | |
voidspace | thumper: morning | 09:01 |
voidspace | thumper: keen bean | 09:01 |
voidspace | babbageclunk: just added a new task to the list - use maas2NetworkInterfaces from StartInstance | 09:23 |
voidspace | babbageclunk: should be trivial, but needs a test as well | 09:23 |
voidspace | babbageclunk: (a test at the StartInstance level) | 09:24 |
=== terje is now known as Guest32809 | ||
=== meetingology` is now known as meetingology | ||
voidspace | dimitern: frobware: babbageclunk: http://pastebin.ubuntu.com/15826362/ | 09:52 |
voidspace | dimitern: frobware: babbageclunk: that's real progress | 09:52 |
dimitern | voidspace: awesome! | 09:53 |
frobware | voidspace: indeed | 09:54 |
dimitern | voidspace: I see you're possibly hitting the same ssh issue I had - my key is ssh-dss, apparently no longer considered secure | 09:54 |
dimitern | voidspace: but I found a way around it, if you need | 09:55 |
voidspace | dimitern: go ahead | 09:55 |
voidspace | dimitern: I thought this was the issue tim had with not being able to ssh in | 09:55 |
dimitern | might still be that, but try this: | 09:56 |
voidspace | dimitern: can you join us in #maas on canonical | 09:56 |
voidspace | dimitern: (as well) | 09:56 |
dimitern | voidspace: http://paste.ubuntu.com/15826416/ | 09:57 |
dimitern | ah, I thought I'm there already | 09:57 |
dimitern | voidspace: so replace the IP ranges to match yours - the important bits are the last 4 sections | 09:57 |
voidspace | dimitern: thanks | 09:58 |
voidspace | dimitern: /etc/ssh/ssh_config | 09:59 |
dimitern | voidspace: ~/.ssh/config | 09:59 |
voidspace | dimitern: cool, thanks | 09:59 |
voidspace | dimitern: can you confirm that we gzip userdata for allenap in #maas on canonical irc | 09:59 |
dimitern | voidspace: looking | 10:01 |
voidspace | retrying | 10:01 |
voidspace | (the bootstrap I mean) | 10:01 |
mup | Bug #1570269 opened: state: ensure that Models are always paired with the correct State <tech-debt> <juju-core:Triaged> <https://launchpad.net/bugs/1570269> | 10:05 |
babbageclunk | voidspace: awesome | 10:05 |
thumper | voidspace: no, that is a different bug | 10:09 |
TheMue | morning | 10:09 |
thumper | 2016-04-14 03:57:54 ERROR cmd supercommand.go:448 failed to bootstrap model: waited for 10m0s without being able to connect: /var/lib/juju/nonce.txt does not exist | 10:09 |
thumper | o/ TheMue | 10:09 |
* thumper is outa here | 10:09 | |
voidspace | thumper: o/ | 10:09 |
TheMue | n8 thumper | 10:09 |
TheMue | :) | 10:10 |
mup | Bug #1570285 opened: worker/undertaker: update status with remaining resources <tech-debt> <juju-core:Triaged> <https://launchpad.net/bugs/1570285> | 10:32 |
babbageclunk | dimitern: how do I bootstrap to a different region in AWS? If I use --to zone=eu-west-1 I get this: http://pastebin.ubuntu.com/15827282/ | 10:47 |
babbageclunk | dimitern: also, why do I always need to specify --upload-tools? | 10:47 |
mwhudson | babbageclunk: i don't know but eu-west-1 is a region, not a zone | 10:47 |
dimitern | babbageclunk: juju bootstrap <controller-name> aws/<region> ... | 10:48 |
babbageclunk | mwhudson: Ah, ok - I was following some old docs, I think | 10:48 |
dimitern | babbageclunk: since that changed I tend to keep this around for reference: http://paste.ubuntu.com/15127859/ | 10:48 |
babbageclunk | dimitern: thanks | 10:49 |
babbageclunk | dimitern, mwhudson - also, juju help placement led me astray too (I guess there are people writing new help messages). | 10:50 |
dimitern | babbageclunk: docs can be improved indeed, but have you tried 'juju help bootstrap' ? | 10:51 |
babbageclunk | dimitern: yup, that was how I got to 'juju help placement' | 10:51 |
dimitern | babbageclunk: ah :) I see - "placement" is related to the --to argument | 10:52 |
frobware | dimitern: can you drop into the sapphire standup HO? | 10:55 |
dimitern | frobware: sure, just a sec | 10:55 |
frobware | thx | 10:55 |
babbageclunk | dimitern: Right - totally missed the bit at the top of the bootstrap help - was confused because the placement docs matched what I saw in the old docs on the web. | 10:56 |
babbageclunk | dimitern: thanks! | 10:57 |
voidspace | frobware: dimitern: babbageclunk: http://reviews.vapour.ws/r/4591/ | 12:23 |
dimitern | voidspace: LGTM | 12:38 |
perrito666 | bbl | 12:38 |
mup | Bug #1570368 opened: juju commands timeout while a bootstrap is in process <conjure> <juju-core:New> <https://launchpad.net/bugs/1570368> | 12:57 |
voidspace | dimitern: thanks | 13:12 |
* voidspace lunches | 13:12 | |
mattyw | fwereade_, ping? | 13:13 |
mattyw | fwereade_, I have some questions if you can spare 5 minutes? | 13:16 |
babbageclunk | voidspace: what kind of error should I return for when the endpoint's MAAS 2 but the flag isn't set? errors.NotSupportedf? | 13:19 |
babbageclunk | voidspace: struggling to give an error message that makes sense with " not supported" stuck on the end. | 13:20 |
babbageclunk | voidspace: ok "unless the 'maas2' feature flag is set MAAS 2 is" | 13:22 |
dimitern | babbageclunk: for those cases there's also an errors.NewNotSupported(nil, fmt.Sprintf("fmt str", args,...)) you can use | 13:43 |
babbageclunk | dimitern: great, thanks¬ | 13:43 |
babbageclunk | oops, ! | 13:43 |
katco | morning all | 14:00 |
natefinch | morning katco | 14:02 |
katco | and actually need to reboot... brb | 14:02 |
voidspace | babbageclunk: just errors.New and a sensible error message of your choice will be fine | 14:08 |
babbageclunk | voidspace: hello | 14:08 |
voidspace | babbageclunk: hello | 14:09 |
babbageclunk | voidspace: so, obviously that change was trivial | 14:09 |
voidspace | babbageclunk: cool | 14:09 |
babbageclunk | voidspace: but working out why the tests weren't failing already in the same way wasn't | 14:09 |
voidspace | babbageclunk: don't forget to update the status doc | 14:09 |
voidspace | babbageclunk: hah | 14:09 |
voidspace | babbageclunk: I added the feature flag to our tests at some point | 14:09 |
babbageclunk | voidspace: it turns out maas2 just returns some html when you ask for /1.0/version/ | 14:10 |
babbageclunk | voidspace: rather than 404ing | 14:10 |
voidspace | babbageclunk: ah, it used to return null | 14:10 |
voidspace | babbageclunk: they've changed it | 14:10 |
voidspace | dimitern: babbageclunk: I'm leaving early today to go to a tattooist | 14:11 |
voidspace | dimitern: babbageclunk: then I'm coming back in again later | 14:11 |
babbageclunk | voidspace: well, it returns null when you parse it as json | 14:11 |
dimitern | voidspace: ok, have phun ;) | 14:11 |
voidspace | ah | 14:11 |
voidspace | dimitern: I will | 14:11 |
voidspace | babbageclunk: that makes sense | 14:11 |
voidspace | babbageclunk: well, not returning a 404 doesn't make sense | 14:11 |
voidspace | but there you go | 14:11 |
babbageclunk | voidspace: have a nice tattoo appointment! | 14:12 |
voidspace | babbageclunk: I'm sure I will, not going yet - but soonish | 14:13 |
babbageclunk | voidspace: won't forget the doc this time, sorry! | 14:13 |
voidspace | heh, np | 14:14 |
mup | Bug #1453805 opened: Juju takes more than 20 minutes to enable voting <blocker> <ci> <ensure-availability> <intermittent-failure> <regression> <juju-core:Triaged by menno.smits> <juju-core 1.23:Fix Released by menno.smits> <juju-core 1.24:Fix Released by menno.smits> <https://launchpad.net/bugs/1453805> | 14:50 |
babbageclunk | voidspace: still around? | 14:57 |
voidspace | dimitern: ping | 14:57 |
voidspace | babbageclunk: yes | 14:57 |
voidspace | babbageclunk: I might have found the bug | 14:57 |
voidspace | babbageclunk: gomaasapi does base64 encoding for us, and so do we | 14:57 |
babbageclunk | voidspace, dimitern, frobware: http://reviews.vapour.ws/r/4595/ | 14:58 |
rick_h_ | voidspace: do you know if frobware is around today? | 14:58 |
babbageclunk | voidspace: Oops | 14:58 |
voidspace | rick_h_: he was earlier, yes | 14:58 |
babbageclunk | voidspace: nice | 14:58 |
rick_h_ | voidspace: k, ty | 14:58 |
babbageclunk | voidspace: Is there an easy way to explore the maas api? | 14:58 |
voidspace | babbageclunk: I use the CLI... | 14:58 |
babbageclunk | voidspace: ah, I keep forgetting about that. | 14:59 |
babbageclunk | voidspace: thanks | 14:59 |
babbageclunk | voidspace: the docs are singularly unhelpful. | 15:00 |
frobware | rick_h_: yep, here, but IRC dropping out a lot atm | 15:00 |
rick_h_ | frobware: ah ok, I asked stokachu to shoot you an email on a potential network/bridge issue he was seeing last night | 15:01 |
katco | natefinch: standup time | 15:01 |
rick_h_ | frobware: wanted to let you know I asked him to and I know you've been doing MAAS2/bug stuff but wanted to see if you or someone could poke at it and see if it's a bug or working as intended/etc | 15:01 |
alexisb | rick_h_, we should have him open a bug so we can get it on the squad board | 15:03 |
alexisb | that is where the full team is pulling priority bugs | 15:03 |
rick_h_ | alexisb: rgr, the question was "is this a bug?" so just wanted to make sure first | 15:04 |
frobware | rick_h_: I semi-stalled on an answer to stokachu. tych0 is proposing a patch for the problems discussed in that email. I also owe tych0 a patch too. | 15:06 |
katco | natefinch: ping? | 15:06 |
rick_h_ | frobware: ok, cool. Ignore me then. | 15:06 |
natefinch | katco: sorry | 15:07 |
natefinch | katco: lost track of time, coming | 15:07 |
voidspace | dimitern: babbageclunk: frobware: removing the extra base64 encode from gomaasapi fixes the issue Tim reported this morning | 15:08 |
voidspace | and now we die in a new way | 15:08 |
frobware | voidspace: that was base64 on base64 then? | 15:10 |
dimitern | voidspace: ah, good - too much encoding then :) | 15:10 |
voidspace | frobware: yep | 15:11 |
tych0 | rick_h_: yeah, i know what the issues are with a bridge | 15:11 |
tych0 | gonna send some patches today | 15:11 |
tych0 | just need to catch up on email :) | 15:11 |
rick_h_ | tych0: ok cool, thanks for all the help in figuring it out! | 15:11 |
voidspace | dimitern: where is that done - in the cloudinit package? | 15:11 |
dimitern | voidspace: in cloudconfig/providerinit IIRC | 15:13 |
voidspace | dimitern: you are correct | 15:13 |
voidspace | dimitern: I've asked Tim to fix it in gomaasapi | 15:14 |
dimitern | voidspace: sweet! | 15:14 |
tych0 | rick_h_: sure, np | 15:15 |
voidspace | dimitern: do we propagate feature flags onto the juju controller machine? | 15:16 |
voidspace | we must do | 15:16 |
voidspace | however, the issue I'm seeing now kinda implies not | 15:17 |
dimitern | voidspace: yeah | 15:19 |
dimitern | voidspace: we do | 15:19 |
voidspace | ok, kinda hard to see where this "requested map got nil" comes from | 15:20 |
voidspace | I'm bootstrapping with debug to see | 15:20 |
voidspace | maybe Subnets | 15:20 |
dimitern | voidspace: this sounds like a GetMap() failed somewhere | 15:21 |
voidspace | heh, possibly from space discovery | 15:21 |
voidspace | dimitern: well yes... | 15:21 |
dimitern | but on a jsonobject, not a maasobject | 15:21 |
dimitern | i.e. while processing a response | 15:21 |
voidspace | that's the error message we usually get hitting a 1.0 endpoint against 2.0 | 15:22 |
voidspace | but I'm trying to work out where | 15:22 |
voidspace | the error message is pointing me to a non existent line in supercommand.go and --debug provided no extra information | 15:22 |
voidspace | although the debug line before it is immediately before a call to NewEnviron - which would report that error message if it thought the feature flag wasn't set | 15:23 |
voidspace | dimitern: where in juju are feature flags set on the controller machine | 15:24 |
voidspace | dimitern: if it's after we attempt to open an environ then we'll fail in this way | 15:24 |
voidspace | when running jujud on the controller | 15:24 |
dimitern | voidspace: let me check exactly | 15:24 |
dimitern | voidspace: cmd/jujud/main_nix.go | 15:25 |
voidspace | dimitern: I see a call to SetFlagsFromEnvironment in jujud/main_nix.go | 15:25 |
voidspace | dimitern: right, but what puts them in the environment - cloud init? | 15:26 |
dimitern | voidspace: no, they are part of the agent config we pass via the userdata | 15:26 |
dimitern | voidspace: check also cmd/jujud/agent/machine.go - in the beginning of Run() | 15:27 |
dimitern | you could grep for "developer feature flags enabled" in the logs | 15:27 |
voidspace | ok | 15:28 |
voidspace | dimitern: thanks | 15:28 |
alexisb | ericsnow, when you have a moment can you kindly review http://reviews.vapour.ws/r/4583/ | 15:29 |
ericsnow | alexisb: will do | 15:44 |
dimitern | babbageclunk: you've got a review btw | 15:47 |
voidspace | dimitern: babbageclunk: as I suspected. With maas2 including babbageclunk's branch *and* my gomaasapi fix, bootstrap now dies with | 16:06 |
voidspace | 2016-04-14 16:03:57 ERROR cmd supercommand.go:448 MAAS 2 is not supported unless the 'maas2' feature flag is set | 16:06 |
voidspace | dimitern: babbageclunk: so the feature flag isn't being propagated correctly / early enough to the controller machine | 16:06 |
dimitern | voidspace: you're not seeing the log saying they're enabled? | 16:07 |
voidspace | dimitern: haven't checked yet - we haven't touched that code path though! | 16:09 |
voidspace | off to the tattooist | 16:09 |
voidspace | will look when I return | 16:09 |
dimitern | ok | 16:09 |
=== Spads_ is now known as Spads | ||
tych0 | rick_h_: jam: frobware: https://github.com/juju/juju/pull/5164 | 16:26 |
dimitern | tych0: agent.LxdBridge is never ever set | 16:28 |
dimitern | tych0: only agent.LxdBridge is | 16:28 |
dimitern | if that | 16:28 |
tych0 | dimitern: sorry, i don't understand? | 16:28 |
dimitern | tych0: e.g. here https://github.com/juju/juju/pull/5164/files#diff-7db54798352f1e675c4e2ecba7bc349dR57 | 16:29 |
dimitern | or the one below | 16:29 |
dimitern | in MaintainInstance | 16:29 |
tych0 | dimitern: you said "agent.LxdBridge is never ever set only agent.LxdBridge is" | 16:30 |
frobware | tych0: btw, the merge should be into next | 16:30 |
dimitern | agent.LxcBridge is non-empty if explicitly set by a provider - MAAS and EC2 used to do that, but no longer do | 16:30 |
dimitern | tych0: sorry :) | 16:30 |
dimitern | so 'only agent.LxcBridge is' | 16:31 |
dimitern | but as I said, now agent.LxcBridge is no longer set and is always empty | 16:32 |
dimitern | the confusion comes from bad naming - agent.LxcBridge should've been called agent.ContainerBridge | 16:33 |
dimitern | frobware: gofmt breaks alignment when it finds a blank line | 16:33 |
tych0 | dimitern: ok. so you're saying we should just delete that entirely and always use lxcbr0? or? | 16:36 |
tych0 | i don't actually know where that comes from, i just figured it was configuration from the user | 16:36 |
dimitern | tych0: yes, I think that's correct | 16:36 |
tych0 | dimitern: i guess i'm a little gunshy about making that change | 16:37 |
tych0 | since i don't understand any of this very well :) | 16:37 |
dimitern | tych0: long, long ago there was a "network-bridge" setting you could use to override agent.LxcBridge, but it's long gone | 16:37 |
voidspace | babbageclunk: have you made much progress on volumes in gomaasapi? | 16:38 |
tych0 | dimitern: ok. it seems like that should be part of a larger change to get rid of it everywhere else then i guess? | 16:39 |
voidspace | babbageclunk: it would be good to let Tim know where you got to in the status doc | 16:39 |
tych0 | i can drop that patch if you think it doesn't matter though | 16:39 |
babbageclunk | voidspace: nope - struggling to understand how the current code works. | 16:39 |
dimitern | tych0: so now unless the provider populates ContainerBridgeName in the BootstrapParams passed to providercommon.Bootstrap(), agent.LxcBridge won't be set in the agent config | 16:39 |
babbageclunk | voidspace: It seems to rely on attrs that aren't in the 1.9 JSON. | 16:39 |
voidspace | babbageclunk: can you write it up in the doc - Tim can look at it or we can feature request the maas guys | 16:40 |
babbageclunk | voidspace: writing my own little test harness | 16:40 |
voidspace | babbageclunk: cool | 16:40 |
voidspace | babbageclunk: maybe there's another api to get the information | 16:40 |
mup | Bug #1570473 opened: juju lxd bridge detection fallback is not reliable <conjure> <juju-core:New> <https://launchpad.net/bugs/1570473> | 16:44 |
alexisb | ericsnow, or dimitern. or frobware : this is a high priority PR for review today: http://reviews.vapour.ws/r/4598/ | 16:50 |
ericsnow | alexisb: k | 16:50 |
ericsnow | alexisb: already looking at it :) | 16:51 |
alexisb | sweet :) | 16:51 |
dimitern | alexisb: I've already added comments and discussed a few points with tych0 | 16:52 |
dimitern | tych0: apart from using the always empty agent.LxdBridge (or agent.LxcBridge) - LGTM | 16:52 |
tych0 | dimitern: yeah. i guess i'm not super comfortable getting rid of that because i don't really know how it works | 16:53 |
tych0 | it seems like if we want to get rid of it, we should get rid of it everywhere | 16:53 |
mup | Bug #1570473 changed: juju lxd bridge detection fallback is not reliable <conjure> <juju-core:New> <https://launchpad.net/bugs/1570473> | 16:53 |
dimitern | tych0: sgtm | 16:53 |
mup | Bug #1570473 opened: juju lxd bridge detection fallback is not reliable <conjure> <juju-core:New> <https://launchpad.net/bugs/1570473> | 16:59 |
ericsnow | tych0: FYI, ship-it | 17:00 |
ericsnow | tych0: (with one small comment) | 17:00 |
tych0 | ericsnow: no, that constant isn't exported in the LXD package; i moved it to lxdclient because we needed it there | 17:10 |
ericsnow | tych0: sounds good | 17:11 |
tych0 | how do i change the branch target? | 17:11 |
tych0 | seems like i might need a new pr? | 17:11 |
ericsnow | tych0: of the PR? yeah, make a new PR and link to the old review request | 17:11 |
tych0 | ericsnow: ok, cool. and then i'm good to merge right away? | 17:12 |
ericsnow | tych0: yep | 17:12 |
tych0 | ok, cool | 17:12 |
tych0 | ericsnow: wait, next is older than master? | 17:13 |
ericsnow | tych0: no, though it may have temporarily diverged a little | 17:14 |
tych0 | ok | 17:14 |
alexisb | tych0, remind me, what version of lxd did the switch to lxdbr0? | 17:21 |
alexisb | was it rc9?? | 17:21 |
tych0 | i think so | 17:22 |
* tych0 looks | 17:22 | |
tych0 | yeah | 17:23 |
tych0 | rc9 | 17:23 |
perrito666 | mm, how long until it's morning in nz? | 17:24 |
* perrito666 needs a hand from menn0 | 17:24 | |
alexisb | perrito666, you have about 2 hours | 17:26 |
perrito666 | one of the fun things of this job :p one question I hardly thought I would be making | 17:26 |
redir | bbiab | 17:27 |
alexisb | and tych0 do you have the link handy for your insights write-up on lxd init and bridge setup? | 17:27 |
alexisb | cheryl linked me to it yesterday but now I can't find it :) | 17:28 |
alexisb | tych0, nevermind | 17:28 |
alexisb | found it | 17:28 |
alexisb | sorry | 17:28 |
perrito666 | agh why are the tests that take the longest the ones that always fail | 17:28 |
tych0 | alexisb: cool, np | 17:28 |
perrito666 | dimitern: voidspace can any of you make anything of the first error in https://pastebin.canonical.com/154358/ ? | 17:35 |
dimitern | perrito666: looks like map ordering issue? | 17:37 |
alexisb | katco, did you add channels to the release notes? | 17:37 |
katco | alexisb: no | 17:38 |
katco | alexisb: we didn't do the front-end work... did it not make it in there? | 17:38 |
alexisb | nope | 17:38 |
dimitern | perrito666: in any case, feel free to skip/ignore or just delete this test, as it's no longer relevant - uses state.NetworkInterface which must be removed (no longer used) - just haven't got there yet myself | 17:38 |
alexisb | katco, would you be up to adding something? | 17:38 |
alexisb | we need it to release | 17:38 |
katco | alexisb: yeah adding now | 17:38 |
alexisb | thanks | 17:38 |
alexisb | heading under "Whats new for beta4" please | 17:39 |
katco | alexisb: do you think this should be an overview of channels, or simply a blurb stating that they exist | 17:42 |
alexisb | katco, I think an overview would be nice to have | 17:42 |
alexisb | so that people know | 17:43 |
alexisb | but it doesn't have to be overly detailed | 17:43 |
katco | alexisb: ok, i'm going to ping someone from the CS side of things as they are way more familiar | 17:43 |
alexisb | fair enough | 17:43 |
perrito666 | dimitern: tx, Just making sure everything tests properly with mongo3 and was not sure if I should pay attention to that test | 17:43 |
katco | alexisb: are you fine with me linking to our already excellent documentation, and then providing info about juju's command line? https://jujucharms.com/docs/devel/authors-charm-store#entities-explained | 17:52 |
alexisb | katco, yes that is fine | 17:53 |
katco | alexisb: k | 17:53 |
perrito666 | hey, I suddenly have to go for like an hour, I'll be back later, mail me if you need anything | 17:59 |
=== redir_afk is now known as redir | ||
voidspace | perrito666: no idea without digging into it, sorry | 18:16 |
perrito666 | Voidspace no worries dimitern told me what I needed | 18:28 |
mup | Bug # changed: 1450299, 1538303, 1554675, 1556207, 1559099, 1560391, 1564694, 1567017, 1567020, 1568092, 1569982 | 19:42 |
perrito666 | I see some bugs changed | 19:44 |
katco | ericsnow: i'm in our 1:1 if you're ready | 20:05 |
alexisb | wallyworld, when you are in please ping me | 20:30 |
wallyworld | alexisb: give me 5 | 20:31 |
alexisb | lol | 20:31 |
alexisb | it is not urgent | 20:31 |
katco | wallyworld: you are a robot, i'm convinced | 20:31 |
alexisb | katco, me too | 20:34 |
alexisb | convinced that wallyworld is a robot | 20:34 |
alexisb | fueled by hoity-toity coffee | 20:35 |
redir | wall-eworld | 20:35 |
alexisb | lol | 20:35 |
alexisb | that was good redir | 20:35 |
redir | :) | 20:35 |
wallyworld | alexisb: zup? | 20:39 |
alexisb | 1x1 HO | 20:40 |
wallyworld | ok | 20:42 |
mup | Bug # changed: 1426729, 1516668, 1524077, 1533262, 1537620, 1538735, 1543223, 1553272, 1554251, 1554687, 1555083, 1555248, 1556249, 1560201, 1560511, 1560520, 1560531, 1560595, 1560665, 1560667, 1563576, 1563615, 1563628, 1563762, 1563843, 1563845, 1563853, 1563923, 1563924, 1563927, 1563928, 1563938, 1563958, 1564057, 1566237, 1566589, 1566628, 1567182, 1567228, 1567683, 1568312, 1568390, 1569024, 1569097, 1569196, 1569408, 1569725 | 20:42 |
thumper | hi ho | 20:47 |
thumper | hi ho | 20:47 |
* thumper thinks he knows what's wrong with maas2 bootstrapping | 20:47 | |
mup | Bug #1570594 opened: read access to admin model allows grant <docteam> <juju-core:New> <https://launchpad.net/bugs/1570594> | 21:09 |
alexisb | thumper, and it is your fault | 21:18 |
alexisb | the bootstrap issue | 21:18 |
thumper | alexisb: there is another one :) | 21:18 |
thumper | probably also my fault | 21:18 |
thumper | but from much earlier | 21:18 |
thumper | wallyworld: would love a chat when you have a minute | 21:22 |
thumper | damn... | 21:24 |
wallyworld | thumper: sure, after release standup | 21:29 |
thumper | wallyworld: s'ok, I think I've sorted it out | 21:29 |
thumper | code has changed from what I remembered it being | 21:30 |
thumper | and I was having to work through things | 21:30 |
* thumper crosses his fingers | 21:41 | |
thumper | oh... getting close... | 21:45 |
thumper | fuck yeah!!! | 21:47 |
thumper | alexisb: bootstrap maas2 succeeded | 21:47 |
perrito666 | I take it you found it? | 21:47 |
* thumper tries deploy | 21:48 | |
thumper | hmm... | 21:48 |
thumper | wat | 21:48 |
mgz_ | so, the correct final step after dpkg-reconfigure lxd on a fresh xenial, | 21:48 |
mgz_ | is `systemctl restart lxd`, right? | 21:48 |
thumper | can't deploy? | 21:48 |
thumper | wat? | 21:49 |
perrito666 | mgz_: dpkg should do that for you | 21:49 |
perrito666 | mgz_: in any case, if it doesn't, lxd-bridge | 21:49 |
mgz_ | it does, but that doesn't create the bridge | 21:49 |
thumper | http://paste.ubuntu.com/15839571/ | 21:49 |
perrito666 | mgz_: lxd-bridge is the service to restart | 21:49 |
thumper | anyone had deploy issues? | 21:49 |
mgz_ | okay | 21:49 |
thumper | trying to deploy ubuntu charm dies talking to charmstore | 21:49 |
perrito666 | mgz_: did you instruct dpkg to create the ipv4 network? | 21:50 |
mgz_ | yeah | 21:50 |
thumper | WAT? debug-log not supported? | 21:51 |
alexisb | \o/ | 21:51 |
alexisb | thumper, that is freak'n awesome!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! | 21:51 |
thumper | alexisb: except debug log isn't working | 21:51 |
wallyworld | perrito666: did you want to chat now? | 21:51 |
thumper | deploy isn't working | 21:51 |
thumper | and neither of these things are maas bugs | 21:52 |
wallyworld | thumper: that error looks like juju not charm store | 21:52 |
alexisb | heh baby steps | 21:52 |
alexisb | and at least you know enough about the bug to know that | 21:52 |
thumper | maybe it is maas's fault | 21:55 |
thumper | jujud is panicking | 21:55 |
alexisb | thumper, i just deployed on lxd provider on latest next build | 21:55 |
thumper | nice | 21:56 |
menn0 | thumper: I did think of this, and then forgot. Hooray for tests :) http://paste.ubuntu.com/15839652/ | 22:00 |
thumper | :) | 22:01 |
thumper | alexisb: the failure was due to the maas provider needing to do subnet/space discovery the new way | 22:01 |
thumper | so I'll leave that for voidspace | 22:01 |
thumper | and get on to the filesystem bits of gomaasapi | 22:02 |
perrito666 | wallyworld: going | 22:02 |
perrito666 | wallyworld: ?? | 22:04 |
wallyworld | perrito666: cpu 100%, hangout frozen | 22:04 |
alexisb | thumper == gomaasapi guy | 22:08 |
alexisb | ;) | 22:08 |
* thumper goes to make a coffee | 22:09 | |
* thumper looks to see who is on-call reviewer | 22:43 | |
thumper | ericsnow: still around? | 22:43 |
ericsnow | thumper: yep | 22:44 |
thumper | I'm just proposing a very simple branch that we need for maas2 | 22:44 |
ericsnow | thumper: k | 22:44 |
thumper | damn it | 22:45 |
thumper | proposed against master, not next | 22:46 |
* thumper redoes | 22:46 | |
thumper | ericsnow: http://reviews.vapour.ws/r/4603/diff/# | 22:47 |
ericsnow | thumper: ah, bootstrap-state :) | 22:48 |
thumper | ericsnow: were you around when voidspace was having these issues? | 22:49 |
ericsnow | thumper: around but not involved | 22:49 |
* thumper nods | 22:49 | |
ericsnow | thumper: Windows isn't a concern here, right? | 22:50 |
thumper | ericsnow: no, because we only bootstrap on ubuntu | 22:51 |
thumper | windows is currently workload only | 22:51 |
thumper | not apiserver | 22:51 |
ericsnow | katco: ^^^ | 22:51 |
ericsnow | thumper: sounds good | 22:51 |
ericsnow | thumper: ship-it | 22:52 |
thumper | ericsnow: ta | 22:52 |
menn0 | wallyworld: an old MongoDB HA bug has resurfaced (it's one of the blockers) | 23:46 |
wallyworld | menn0: which branch? next? | 23:46 |
wallyworld | bug number? | 23:47 |
menn0 | wallyworld: yep on next | 23:47 |
wallyworld | menn0: it may be fixed in master | 23:47 |
wallyworld | we fixed a bunch of ha stuff for beta4 | 23:47 |
mgz_ | they will be reconverging shortly | 23:47 |
menn0 | wallyworld: bug 1453805 | 23:47 |
mup | Bug #1453805: Juju takes more than 20 minutes to enable voting <blocker> <ci> <ensure-availability> <intermittent-failure> <regression> <juju-core:Triaged by menno.smits> <juju-core 1.23:Fix Released by menno.smits> <juju-core 1.24:Fix Released by menno.smits> <https://launchpad.net/bugs/1453805> | 23:47 |
menn0 | wallyworld: ok that's good to know | 23:47 |
wallyworld | oh i haven't seen that bug | 23:47 |
menn0 | wallyworld: it's an old one that aaron reopened because the symptoms look the same | 23:48 |
menn0 | wallyworld: what happens is that after enable-ha the new controller hosts come up and the agents can connect to MongoDB but then get disconnected | 23:48 |
menn0 | wallyworld: we don't have the mongodb logs to confirm what's going on | 23:48 |
wallyworld | joy | 23:49 |
menn0 | wallyworld: but off memory I think that can happen when the replicaset isn't ready yet | 23:49 |
menn0 | wallyworld: it's intermittent, I can't replicate it | 23:49 |
wallyworld | sigh | 23:49 |
menn0 | wallyworld, mgz_ : we really need those MongoDB logs to know what's happening | 23:49 |
wallyworld | if it's an existing bug why is it a regression? | 23:49 |
menn0 | wallyworld: it was fixed in 1.24 and 1.25 and has now come back | 23:50 |
menn0 | wallyworld: it could well be a completely different cause | 23:50 |
wallyworld | i'd say so because i don't think we messed with those bits | 23:50 |
wallyworld | but there were a lot of changes | 23:50 |
menn0 | wallyworld: what about those changes to mongodb setup in the machine agent that you made? (all that deleted code) | 23:51 |
menn0 | wallyworld: could that reorg have something to do with it? | 23:51 |
wallyworld | the deleted code was for pre ha environments where stuff wasn't set up yet for replication | 23:51 |
wallyworld | that setup is now done in bootstrap | 23:51 |
menn0 | wallyworld: yeah... seems unlikely | 23:52 |
menn0 | wallyworld: looking at the failures it's happening in master and next | 23:52 |
wallyworld | and i think next was branched before my changes | 23:52 |
wallyworld | but i did notice it took a while to transition to has-vote | 23:52 |
wallyworld | i just thought it was mongo behaving as normal, because well, you know, mongo is web scale | 23:53 |
menn0 | wallyworld: not 20mins though right/ | 23:54 |
menn0 | ? | 23:54 |
wallyworld | not sure tbh | 23:54 |
wallyworld | maybe 5? | 23:54 |
redir | heading out for a while. I'll check back later this eve to see if things merged... | 23:54 |
wallyworld | let's hope so | 23:55 |
=== redir is now known as redir_afk | ||
menn0 | wallyworld: 5 is acceptable I think, 20+ is not | 23:56 |
wallyworld | even 5 seems unfortunate | 23:56 |
wallyworld | i mean, wtf is it doing | 23:56 |