[01:10] tlm: can you review this plz? https://github.com/juju/juju/pull/11591
[01:10] sure
[01:11] wallyworld: free HO?
[01:11] kelvinliu: ok
[01:12] lgtm babbageclunk, will try again with that when it merges. Just got it again
[01:29] tlm: thanks!
[01:29] sorry about that
[01:34] not your fault
[02:27] hpidcock: if you are going to update juju to use -N on ppc64el, did you want to look at the etcd-io/bbolt?
[02:28] thumper: I can look at that too
[02:29] It looks like a drop-in replacement with a mem fix
[02:29] perhaps extra fixes too
[02:29] at least that is the theory
[02:29] hpidcock: how's the python fun going?
[02:31] thumper: I think I'm done for pylibjuju for now, just need to cut a release, might have it coincide with rc2
[02:31] tlm: looks like a test failure in your model agent landing
[02:31] hpidcock: ack
[02:31] tlm: FAIL: machine_test.go:640: MachineSuite.TestManageModelRunsCleaner
[02:37] tlm: I'm wondering how useful that test is, looking at the content, it is doing a hell of a lot that isn't what we care about
[02:54] tlm: looking, I'm not sure you touched that one at all, and it is just a time-sensitive test
[02:56] thumper: sorry, back from lunch
[02:56] argh, this test is driving me nuts
[03:00] thumper: what do you recommend?
[03:06] tlm: sorry - I've been playing whack-a-mole with those tests. I'm going to bump up the timeouts wholesale
[03:07] ok babbageclunk, no issues at all. Weird that mine is the one suffering. Making me wonder if I have missed something
[03:12] tlm: it's possible I guess but I can't see how it would be something you've done - those tests won't be running your workers
[03:29] tlm: just try to merge again
[03:29] I have a branch that should fix that intermittent timeout
[03:29] just pushing now
[03:32] babbageclunk: https://github.com/juju/juju/pull/11592
[03:33] wallyworld: can you fork github.com/hashicorp/raft-boltdb into juju/raft-boltdb please
[03:34] ok
[03:34] hpidcock: what level of changes do we need?
[03:34] thumper: just a path rewrite on the bolt db
[03:34] because etcd renamed the project
[03:34] hpidcock: and go mod doesn't help there?
[03:34] ah...
[03:34] poo
[03:34] not when they renamed it
[03:35] thumper: could you do the fork, i'm in the middle of some Z^%W%@! unit tests
[03:35] wallyworld: ack
[03:35] hpidcock: here you go: https://github.com/juju/raft-boltdb
[03:35] thumper: thanks
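For context on the rename discussed above: the bolt project moved from github.com/boltdb/bolt to go.etcd.io/bbolt, and since the new project is described as a drop-in replacement, the fork's job is essentially an import-path rewrite. A minimal sketch of what that rewrite looks like — the package name and openStore wrapper here are illustrative, not the actual juju/raft-boltdb code:

```go
// Illustrative only: the essence of "just a path rewrite on the bolt db".
package raftboltdb

import (
	// Was: bolt "github.com/boltdb/bolt"
	// The renamed project is API-compatible, so only the path changes.
	bolt "go.etcd.io/bbolt"
)

// openStore shows that existing call sites compile unchanged against
// the renamed package: bbolt keeps the same Open signature.
func openStore(path string) (*bolt.DB, error) {
	return bolt.Open(path, 0600, nil)
}
```

On the consumer side, juju would then depend on the fork (github.com/juju/raft-boltdb) rather than the upstream module; the exact go.mod wiring isn't shown in the log.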
[03:40] thumper: approved with gusto!
[03:45] thumper: wallyworld: can you both review and merge please https://github.com/juju/raft-boltdb/pull/1
[03:57] thumper: thanks for the PR
[04:15] thumper: your pr hit a different intermittent failure, I kicked it off again
[04:15] duh, sorry, that was a check build not a merge one
[04:15] * babbageclunk is a dork
[04:16] * tlm offers babbageclunk a run
[04:17] * babbageclunk accepts
[04:21] babbageclunk: no worries
[04:21] I've filed a bug for that
[04:21] we get a lot of intermittent failures in that package
[04:21] I feel that they all have the same root cause
[04:21] but I've not looked yet
[04:22] tlm: for the record, I kicked your PR merge again
[04:24] thanks thumper
[04:29] thumper: can you both review and merge please https://github.com/juju/raft-boltdb/pull/1
[04:33] babbageclunk: it's bigger than it looks due to deleting a lot of code and moving some code. i still have a unit test to fix in worker/uniter but good apart from that https://github.com/juju/juju/pull/11593
[04:33] wallyworld: normally people say it the other way?
[04:33] hpidcock: looking now
[04:33] wallyworld: ok. looking
[04:33] oops, meant a comma there
[04:33] full stop sounds super terse!
[04:33] all good
[04:35] whoa, looks big!
[04:35] lots of deleted code
[04:35] and moved code
[04:35] core changes not too bad
[04:35] hpidcock: done
[04:35] wallyworld: many thanks
[04:36] babbageclunk: there's 4 commits which match the pr description if that helps. the raft and lease worker bits should be familiar hopefully
[04:37] ok
[04:37] yeah, that definitely helps
[04:37] it's all a bit of a rush, sorry
[04:37] otherwise i'd have done separate prs
[04:38] just got to fix this %W@$!%$ uniter test
[04:43] no worries!
[04:48] wallyworld: oh, you've done the autoexpire removal work, nice
[04:48] * babbageclunk gets rid of that part of his branch
[04:51] babbageclunk: yeah, sorry, i had to 'cause it was all mixed up in the work
[04:53] makes sense
[04:54] there's still the dummy provider stuff though
[04:54] i think there's a fair bit that can be deleted off that
[04:59] * tlm ducking out for a little bit to get some air
[05:05] babbageclunk: i added an implementation of RevokeLease() in the dummy store and that fixes the tests
[05:06] nice
[05:07] maybe i can delete ExpireLease() now for the dummy store, i think we only use it to claim a lease for leadership testing
[05:08] yup, nothing uses it
[05:08] wallyworld: the only extra bit is that there needs to be a background goroutine for the dummy lease store so it can expire leases internally
[05:08] babbageclunk: i thought about it but from what i can see, we only ever claim a lease to set up a unit leader
[05:09] i am pretty sure the tests will now all pass
[05:09] ok, if you don't think there are any places that need expiry that's easier
[05:09] yeah, i'll see if the current tests pass
[05:09] sounds good
[05:09] i'll add expiry if needed but i don't think so
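To make the dummy-store change above concrete: the shape being described is an in-memory lease store with explicit claim and revoke but no time-based expiry, matching the conclusion that ExpireLease was only ever used to set up a unit leader in tests. The types and signatures below are a simplified sketch, not juju's actual lease interfaces:

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// leaseKey identifies a lease; a simplification of what the dummy
// provider would track.
type leaseKey struct {
	namespace, lease string
}

// dummyLeaseStore is just enough store for tests: claim a leadership
// lease, revoke it again, and never expire anything on a timer.
type dummyLeaseStore struct {
	mu     sync.Mutex
	holder map[leaseKey]string
}

func newDummyLeaseStore() *dummyLeaseStore {
	return &dummyLeaseStore{holder: make(map[leaseKey]string)}
}

// ClaimLease grants the lease to holder if it is unheld or already theirs.
func (s *dummyLeaseStore) ClaimLease(key leaseKey, holder string) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	if h, ok := s.holder[key]; ok && h != holder {
		return errors.New("lease already held")
	}
	s.holder[key] = holder
	return nil
}

// RevokeLease releases the lease if holder currently holds it — the
// operation whose absence was failing the tests above.
func (s *dummyLeaseStore) RevokeLease(key leaseKey, holder string) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.holder[key] != holder {
		return errors.New("lease not held by " + holder)
	}
	delete(s.holder, key)
	return nil
}

func main() {
	s := newDummyLeaseStore()
	k := leaseKey{"application-leadership", "mysql"}
	fmt.Println(s.ClaimLease(k, "mysql/0"))  // <nil>
	fmt.Println(s.RevokeLease(k, "mysql/0")) // <nil>
}
```

If expiry did turn out to be needed, the background goroutine babbageclunk mentions would scan holders on a ticker and delete entries past a deadline; per the discussion it was left out.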
[05:11] kelvinliu: did moving the uniter struct initialisation help?
[05:11] HO?
[05:11] sure
[05:26] thumper: the deferreturn issue fix was landed https://go-review.googlesource.com/c/go/+/234105/
[05:31] hasn't been picked up for a backport to 1.14 yet. Will need to keep an eye out for it.
[06:17] thumper: https://github.com/juju/juju/pull/11594
[06:19] kelvinliu: this solves most of it - leadership stable after removing wrench. it keeps logging that it wants to depose leadership so a small issue to solve still https://pastebin.ubuntu.com/p/Fx8Y8XsfSd/
[06:26] kelvinliu: just afk for a bit, be back soon
[06:28] wallyworld: looking now, ty
[07:20] kelvinliu: did it work for you too?
[07:21] yes, finishing the pr now
[07:21] kelvinliu: did you see the repeated messages about running a leader deposed hook?
[07:22] seems to be more log noise than anything since show-status-log is ok
[07:22] but something needs fixing
[07:23] it might be the addition of the logger which now prints messages
[07:23] so it's always been there
[07:27] I saw the warning message even before this branch
[07:29] lots of them repeated?
[07:29] i'll see if i can fix
[07:29] did u see lots of repeats? I only saw once
[07:29] i saw lots of repeats
[07:30] we expect one but not repeated
[07:41] I can't reproduce the warning message now...
[07:44] i'm trying again, we'll see
[07:46] kelvinliu: it happens after adding and removing the wrench file
[07:47] kelvinliu: and it happens because the unit agent local state struct gets Leader=true for non-leaders for some reason
[07:48] because it looks like the leader tracker is setting the remotestate leader to true
[07:49] u mean the local leader state is out of sync
[07:49] seems like it, need to do more debugging
[07:57] kelvinliu: yeah, the new tracker still gives bad results for non-leaders after the wrench file is removed :-(
[07:59] wallyworld: did u build the latest code?
[07:59] it works fine for me
[08:00] kelvinliu: i'll pull your latest code and try again
[08:00] i was working with my initial diff
[08:01] wallyworld: I just removed debugging msg and fixed tests. not much change
[08:02] ok, i'll pull latest anyway and try
[08:10] yep
[10:27] manadart, https://github.com/juju/python-libjuju/pull/423
[10:28] or hpidcock if you're around
[10:31] stickupkid: Approved it.
[10:31] ta
[14:06] stickupkid: Landed on develop instead of 2.8. Backport: https://github.com/juju/juju/pull/11597
[14:24] achilleasa, hml: Can you tick that patch? ^
[14:24] manadart: looking
[14:25] hml, petevg: Tested the shutdown service. It does indeed just fail on Bionic, and is pointless.
[14:26] manadart: i'm getting a compare changes screen, not a pr
[14:26] hml: https://github.com/juju/juju/pull/11597
[14:26] manadart, done
[14:26] manadart: hah! I guess that's a strong argument for just queuing it up for now. And also for making a bug to actually fix it ...
[15:13] manadart, achilleasa do we still need this acceptance test, or does the new CI test cover this? https://github.com/juju/juju/blob/develop/acceptancetests/assess_network_spaces.py
[15:31] stickupkid: Which new one do you mean?
[15:31] https://github.com/juju/juju/tree/develop/tests/suites/spaces_ec2
[15:33] stickupkid: Thought so. It doesn't really test the same things. That one tests bindings, including the upgrade-charm path.
[15:33] shame, wanted to get rid of another test tbh
[15:33] stickupkid: Python one tests space constraints, including container-in-machine.
[15:34] manadart, fiiiiiiiiiiiiiiiiiiine, will add it to the list of things to move rather than delete
[15:34] stickupkid: I can re-write in shell style when I do that bindings card in the "Doing" lane.
[15:35] manadart, let's do that because the python one doesn't do a good job at cleaning up
[15:35] stickupkid: Ack.
[15:35] also I'm pretty sure we can reuse the VPC... in the python test, but let's ignore that for now
[15:36] hml, you could fix that as a temporary measure I guess for running out of VPCs in eu-west-1
[15:37] hml, petevg: Did a quick smoke test on MAAS. Bionic containers appear to release IPs upon both remove-machine and kill-controller, so we don't need a network shutdown service.
[15:37] * manadart heads home.
[20:57] petevg: https://github.com/juju/juju/pull/11598 plz
[20:58] thumper: taking a look
[20:58] petevg: it is just forward porting the fix I did on Friday
[20:58] as wallyworld mentioned, should get it into the 2.8 branch
[20:58] I should have done it Friday, or yesterday, but it slipped my mind
[20:59] thumper: Got it. I marked it as approved.
[21:00] petevg: ta
[21:00] np
[21:25] petevg: bug 1876849
[21:25] Bug #1876849: [bionic-stein] openvswitch kernel module was not loaded prior to a container startup which led to an error
[23:42] wallyworld: https://github.com/juju/juju/pull/11599
[23:43] looking
[23:43] tlm: ta, lgtm
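A note on bug 1876849 above: the failure mode is a container starting before the openvswitch kernel module is loaded. One common mitigation for this class of bug is a pre-start guard that loads the module on demand; the sketch below illustrates that pattern under stated assumptions — moduleLoaded and ensureModule are hypothetical helpers, and this is not the fix actually adopted for the bug:

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
	"strings"
)

// moduleLoaded reports whether the named kernel module appears in
// /proc/modules, where the kernel lists every loaded module.
func moduleLoaded(name string) (bool, error) {
	data, err := os.ReadFile("/proc/modules")
	if err != nil {
		return false, err
	}
	for _, line := range strings.Split(string(data), "\n") {
		if strings.HasPrefix(line, name+" ") {
			return true, nil
		}
	}
	return false, nil
}

// ensureModule loads the module via modprobe if it is not already
// present, so anything started afterwards (here, a container) can
// rely on it.
func ensureModule(name string) error {
	loaded, err := moduleLoaded(name)
	if err != nil || loaded {
		return err
	}
	return exec.Command("modprobe", name).Run()
}

func main() {
	if err := ensureModule("openvswitch"); err != nil {
		fmt.Fprintln(os.Stderr, "openvswitch module unavailable:", err)
		os.Exit(1)
	}
	fmt.Println("openvswitch module loaded; safe to start containers")
}
```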