[00:16] axw: ah, i think i understand what you might be saying - snap lxd daemon picks up the request from juju rather than deb lxd daemon and hence the wrong lxd gets used. i thought setting LXD_DIR was supposed to cause the right lxd to get used (at least that's what nicholas thought). [00:22] wallyworld: I just realised that I hit the same thing with the conjure-up snap last week. it ships with its own version of juju and has LXD issues. [00:22] menn0: yeah. adam and nick know about it. not sure exactly what their solution is. my system is still somewhat screwed even after removing lxd snap [00:23] need to look at it again today [00:23] wallyworld: do you need to run sudo lxd init again? [00:23] that's my last resort, yeah === ben__ is now known as benk01 [01:25] wallyworld: PR to update azure regions: https://github.com/juju/juju/pull/6969. seeing as we need to get the public-clouds.yaml updated anyway, figured we may as well try and get this one in too? [01:25] sure [01:45] axw: sorry about delay. lgtm but there's a block on landing it due to current QA policy. there's a meeting tomorrow where that will be fixed [01:46] wallyworld: thanks [03:13] wallyworld: were you running into this at bootstrap? 2017-02-13 03:05:33 ERROR cmd supercommand.go:458 new environ: Get https://10.0.8.1:8443/1.0: x509: certificate has expired or is not yet valid [03:13] anastasiamac: or is this what you saw? ^^ [03:13] menn0: similar, yeah, can't recall exact wording [03:14] it's happening for me [03:14] too [03:14] axw: could ^^^ be related to the lxd cert caching change? [03:14] menn0: i did not see this. wallyworld may have... just fixed my lxd to work over the weekend. my failures were related to lxc profile misconfiguration [03:15] menn0: i've had to totally purge lxd, manually remove networks and ip links, lxd init again etc. still not there though - juju can't talk to container [03:15] menn0: maybe. or the snap. are you using the snap?
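[The snap-vs-deb confusion above comes down to which control socket the lxc client finds. A minimal sketch of the mechanism, assuming the conventional deb and snap data directories; the variable names here are illustrative, and on a real system you would only point LXD_DIR at a directory that actually contains a running daemon's socket:]

```shell
# Sketch: LXD_DIR decides which daemon the lxc client (and juju) talks to.
# These are the conventional locations; adjust for your system.
deb_dir=/var/lib/lxd
snap_dir=/var/snap/lxd/common/lxd

# The client looks for the control socket under $LXD_DIR:
export LXD_DIR="$deb_dir"
echo "client will use socket: $LXD_DIR/unix.socket"

# If LXD_DIR points at the snap's directory instead, the snap daemon
# handles the requests - the situation described above:
export LXD_DIR="$snap_dir"
echo "client will use socket: $LXD_DIR/unix.socket"
```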
[03:15] no snaps involved [03:16] menn0: you haven't installed the juju or lxd snaps? [03:16] not on this machine [03:17] menn0: ok. seems most likely related to my changes then. it doesn't happen to me though. can you try and isolate it? [03:17] axw: sure. where's the cache file? [03:18] menn0: juju will pull certs out of either ~/.config/lxc or ~/.local/share/juju/lxd [03:19] axw: I don't have ~/.local/share/juju/lxd [03:19] menn0: either/or. if that one doesn't exist, it'll look in ~/.config/lxc [03:19] (I don't have ~/.local/share/juju/lxd either) [03:20] axw: lxc is working fine by itself [03:20] axw: I just launched a container with "lxc launch" [03:21] menn0: lxc will use the unix socket locally [03:21] menn0: so HTTPS won't factor at all [03:21] axw: ok right [03:31] axw: moving the .config/lxc directory out of the way doesn't fix things [03:31] thumper: menn0: coming by to say hi? [03:32] jam: yeah [03:32] jam: coming [03:35] menn0: can you try adding some logging to finalizeLocalCertificateCredential in provider/lxd/credentials.go? we should be generating a new cert, uploading it to lxd, and then using that [03:41] wallyworld: can you pastebin the output of "lxc config trust list" for me? [03:43] wallyworld menn0: one possibility is that the certs in ~/.config/lxc are expired. mine expire in 2026... [03:43] axw: but I moved them out of the way and the symptoms stayed the same? [03:43] menn0: ok, weird [03:45] menn0: even that doesn't repro for me [03:45] menn0: which version of ubuntu, lxd? [03:47] axw: well those certs *have* just expired [03:47] Validity [03:47] Not Before: Feb 10 02:21:23 2016 GMT [03:47] Not After : Feb 9 02:21:23 2017 GMT [03:47] lxd 2.0.8, xenial [03:49] menn0: did you add a lxd cert credential to credentials.yaml? by autoload-credentials perhaps? 
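[The Validity block quoted above is openssl output. A sketch of how to check a client cert's expiry the way the chat does - the real file under inspection is ~/.config/lxc/client.crt, but here a throwaway self-signed cert is generated so the commands can be run anywhere:]

```shell
# Generate a throwaway cert standing in for ~/.config/lxc/client.crt:
crt=$(mktemp -d)/client.crt
key="${crt%.crt}.key"
openssl req -x509 -newkey rsa:2048 -nodes -days 1 \
    -subj "/CN=test-client" -keyout "$key" -out "$crt" 2>/dev/null

# Print the validity window (the "Not Before"/"Not After" fields above):
openssl x509 -in "$crt" -noout -dates

# Exit status is non-zero if the cert expires within N seconds (0 = now):
openssl x509 -in "$crt" -noout -checkend 0 && echo "still valid" || echo "expired"
```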
[03:50] axw: no lxd creds in credentials.yaml [03:51] menn0: well if you moved the certs, I can't see how their expiry matters :/ [03:53] axw: unless there's something inside the lxd daemon? [03:54] menn0: maybe the server cert is the thing that has expired? [03:55] menn0: the client credential changes could be a coincidence. seeing as the client certs you had have expired, possibly the server cert has too [03:56] axw: I think you're right. the problem happens with juju 2.0.3 too [03:57] menn0: ok, cool [03:57] * axw wonders how to regen [03:59] menn0: looks like if you delete /var/lib/lxd/server.{crt,key}, lxd will recreate them on startup [04:00] wallyworld: ^^ [04:00] axw: they've expired too. that's got to be the problem. [04:00] menn0: terribly confusing coincidence :) [04:10] axw: yeah, i did that earlier, full lxd reinstall and init was also needed for me [04:10] due to weird network issues === frankban|afk is now known as frankban [12:29] wallyworld: ping if you're around [12:29] hey [12:30] morning you two [12:30] evening here :-) [12:33] wallyworld: any urgent bug that was left last night? otherwise ill just pick from the pile [12:35] perrito666: this one is important https://bugs.launchpad.net/juju/+bug/1623217 [12:35] Bug #1623217: juju bundles should be able to reference local resources [12:36] not sure if it can be done in time [12:36] ie not sure of the scope of any change, haven't looked into it [12:40] jam: can i help with something? [12:41] on bug #1577556 they just mentioned that they saw a 'statuses' doc that didn't have a txn-revno. I was a bit confused but it looks like statuseshistory is where we don't use TXNs but 'statuses' we *should* be using txns, right? [12:41] Bug #1577556: unit failing to get unit-get private-address in the install hook [12:43] wallyworld: sorry, wrong bug.
actual is bug #1484105 [12:43] Bug #1484105: juju upgrade-charm returns ERROR state changing too quickly; try again soon [12:44] wallyworld: tx [12:44] jam: yeah, status history avoids txns. but the status doc itself in the status collection should use txn [12:45] afaik the regular status collection does use txns [12:46] jam: all usages of statusDoc should be in the context of a txn.Op slice [12:47] perrito666: right, its supposed to, I was worried we had a case where sometimes we were and sometimes we weren't. [12:47] but they are reliably (?) hitting a case where the statuses docs are missing the txn-revno, which is bad mojo for the TXN logic [12:49] jam: odd, I wonder if someone confused status with status history [12:49] jam: the latest status added was model status iirc [12:49] jam: i just did a (quick) code search and can't see anywhere where we are writing to the status collection not using txns [12:50] perrito666: wallyworld: yeah, I grepped the code as well. It does mention an assert that "txn-revno == 0" which would indicate we tried to read it, got back 0 and then said "well, its gotta stay 0 for this update" [12:51] jam: you talking about assert := bson.D{{"txn-revno", txnRevno}} in statusSetOps() I assume [12:51] wallyworld: comment #16 on the bug [12:52] there is an entry in txn queue that says: "a": { "txn-revno": NumberLong(0) } [12:52] my guess is that it read the doc, saw there was no txn-queue so got the 'zero' value and then put that back into the assert. [12:52] So I don't think that is what *caused* the problem in the first place, just those txns all fall over because the data is bad. [12:53] we don't really know what caused it to be bad in the beginning. [12:56] jam: juju uses the txn-revno assert in a set status function (this pattern is used elsewhere too, eg updating ports). but i think that relies on a create being done first to insert the original doc and set the txn-revno field. and we do have a createStatusOp.
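[The txn-revno assert being discussed is mgo/txn's optimistic-concurrency check: read a doc, remember its revno, and make the write conditional on the revno being unchanged. A stdlib-only toy of that idea, with illustrative names - the real mechanics live in the mgo/txn package and juju's statusSetOps, not in this sketch:]

```go
// Toy model of the check behind Assert: bson.D{{"txn-revno", txnRevno}}.
package main

import "fmt"

type doc struct {
	status   string
	txnRevno int64 // mgo/txn bumps this on every applied transaction
}

// setStatus applies the update only if the doc's revno still matches the
// value read earlier; otherwise the write is rejected, which is how
// "if this doc changed underneath us, ignore this change" is expressed.
func setStatus(d *doc, readRevno int64, status string) bool {
	if d.txnRevno != readRevno {
		return false // doc changed since we read it: abort
	}
	d.status = status
	d.txnRevno++ // an applied txn bumps the revno
	return true
}

func main() {
	d := &doc{status: "allocating", txnRevno: 2}

	revno := d.txnRevno // read the doc, remember its revno
	fmt.Println(setStatus(d, revno, "started")) // applies: revno matched

	// A writer still holding the old revno now loses the race:
	fmt.Println(setStatus(d, revno, "error")) // rejected: revno moved on
}
```

[This also shows why a doc missing txn-revno is "bad mojo": the reader sees the zero value, asserts `txn-revno == 0`, and every such txn falls over against the real data.]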
i can't see how we would attempt to set a status value on a doc with a given doc id without having done a create first [12:56] and that create should insert the txn-revno field [12:57] wallyworld: (a) race? (b) if we're creating the doc with an upsert, we're still using the txn logic, and the mgo/txns is the thing that keeps txn-revno correct [12:57] we're just using an assert to say "if this doc is changed underneath us, ignore this change" [12:58] that matches my understanding [12:58] jam: just a wild idea, do you think this could be caused by mgopurge? [12:59] we create the doc with an insert and a doc not exists assert [12:59] wallyworld: https://bugs.launchpad.net/juju/+bug/1623217 seems like it will need some spec-ing and agreement from stakeholders [12:59] Bug #1623217: juju bundles should be able to reference local resources [13:00] probably, i just saw it and it looked important [13:00] should move to 2.2 [13:00] wallyworld: it has been like that since december, I dont think it is that urgent that we can skip proper procedure :) [13:01] we should get someone started on that spec though [13:01] fair enough, i just skimmed it [13:01] true, was just a comment not a rant [13:01] perrito666: the goal there is just to allow local file paths like local charms [13:01] perrito666: so hopefully a short spec [13:02] rick_h: yes [13:09] rick_h: perrito666: not sure how much of a spec it should have, vs you already have a syntax for 'use this local charm' ./path, we just support that for a resource blob as well [13:11] jam: +1 [13:11] perrito666: short enough spec for you ?
:) [13:13] jam: sure it is, if anyone comes asking ill post that line :p === plars-away is now known as plars === perrito667 is now known as perrito666 [13:38] perrito666: I didn't think that was actually going into 2.1, and doing it post rc feels a bit late, but if it isn't terribly hard, it is a nice quality-of-life for a bunch of people [13:48] jam: I dont think ill be able to fit this into 2.1 most likely tonight ill move that to 2.2 === balloons26 is now known as balloons === frankban is now known as frankban|afk [18:49] https://github.com/cloud-green/juju-relation-mongodb/pull/6 cmars, mattyw - JFYI: sent out a couple of patches for the mongodb interface [19:19] morning folks [19:25] * perrito666 touches tip of hat [19:25] Dmitrii-Sh, hey there, thanks very much, will take a look === elmo_ is now known as elmo [19:45] could there be a reason why juju 1.25.6 fails to bootstrap an aws model with authentication failure, while juju 2.0.0 works fine (with the same credentials) [19:48] tasdomas: no idea sorry [20:27] Bug #1664359 opened: Authentication fails for juju 1.25.6 on aws [23:09] perrito666: thumper: IMO the proposed test fix for that status history test is too lenient - it throws away the notion of the order of the expected status messages [23:10] the fix should have been to be lenient with the expected count of messages, not the order [23:15] wallyworld: I can very well send a follow up, since the test below does test the order I thought it was not really that important
they should essentially test the same things the same way [23:17] just with and without filter [23:19] yes, but there is a race there, that infrastructure was built by william but he did not notice that by inserting the records with identical dates the ordering is non-deterministic [23:20] two items with the same date can be returned in any given order, and that is ok [23:20] because in reality it will never happen that two items have the same date to the nanosecond [23:24] ah, i see, that helps explain it a bit
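[The race described above - records inserted with identical dates can come back in either order - and the "lenient on order, strict on count" fix can be sketched as below. The types are illustrative, not juju's actual status-history docs:]

```go
// Entries with equal timestamps sort non-deterministically, so a test
// asserting an exact sequence fails intermittently. Comparing the
// multiset of messages instead is order-insensitive but count-strict.
package main

import "fmt"

type entry struct {
	when    int64 // unix nanos; the racy test inserted identical values
	message string
}

// sameMessages checks that got and want contain the same messages the
// same number of times, in any order.
func sameMessages(got, want []entry) bool {
	if len(got) != len(want) {
		return false
	}
	counts := map[string]int{}
	for _, e := range got {
		counts[e.message]++
	}
	for _, e := range want {
		counts[e.message]--
	}
	for _, n := range counts {
		if n != 0 {
			return false
		}
	}
	return true
}

func main() {
	want := []entry{{100, "installing"}, {100, "started"}}
	// With equal timestamps the store may legitimately return either order:
	got := []entry{{100, "started"}, {100, "installing"}}

	fmt.Println(sameMessages(got, want))     // true: same messages, any order
	fmt.Println(sameMessages(got[:1], want)) // false: count mismatch
}
```

[In practice, as noted in the chat, real entries differ at nanosecond resolution, so ordering only becomes ambiguous in tests that fabricate identical dates.]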