[00:03] Bug #1634390 changed: jujud services not starting after reboot when /var is on separate partition
[00:12] * thumper relocates while care gets serviced
[00:28] Any ETA on 2.2 release? Asking for a friend's HA controller. :-)
[01:17] blahdeblah: 2.2 rc1 this week or more likely early next week; 2.2 fa soon after
[01:17] s/ga/fa
[01:18] anastasiamac: or s/fa/ga/ even :-)
[01:19] wallyworld: thanks
[01:19] blahdeblah: as long as u've figured :)
[01:19] details... you know?... :D
[01:21] babbageclunk: free now if you want to chat
[01:24] wallyworld: hey, yes please - back in standup?
[01:25] sure be there in a sec
[01:46] wallyworld: we've got a lot of landing failures in the api package, going to look at that before windows things
[01:47] axw_: ok. i'm free now too whenever you wanted to talk
[01:47] wallyworld: ok, see you in standup then
[02:11] wallyworld: yay, it turns out that MachineAgent.apiserverWorkerStarter would just leak the state if any error occurred when creating the apiserver in MachineAgent.newAPIserverWorker. The syslog tests pass now.
[02:12] whoot
[02:47] anastasiamac: would you kindly review https://github.com/juju/juju/pull/7428?
[02:47] me looking
[02:48] axw_: lgtm, tyvm!!
[02:48] gracias
[03:45] axw_: is this one that either you or wallyworld have fixed? https://bugs.launchpad.net/juju/+bug/1665040
[03:45] Bug #1665040: Race in github.com/juju/juju/worker/peergrouper
[03:46] github.com/juju/juju/worker/peergrouper.(*workerSuite).TestSetsAndUpdatesMembers.func1.1()
[03:46] thumper: yep, axw i think, i'd need to check the pr
[03:46] thumper wallyworld: pretty sure I fixed a different one, checking now
[03:46] but there's no more peer grouper races in the lastest runs
[03:46] thanks
[03:47] thumper: actually my PR would have fixed a bunch of tests, it was related to some common code. so yes
[03:47] sweet
[03:47] axw_: can you put the pr in that bug?
[03:47] yup
[03:47] ta
[04:18] wallyworld: https://github.com/juju/juju/pull/7429 should fix the windows test failure. going for a ride, bbs
[04:18] feel free to $$merge$$ if you're happy with it
[04:18] axw_: alright. after that i need to talk to you about storage
[04:54] veebers: ping
[04:54] veebers: are you able to jump on a quick hangout?
[04:56] thumper: yep, real quick have a standup coming up :-)
[04:56] oh, you go to jam's?
[04:56] aye, most of the time
[04:56] yeah
[04:56] https://hangouts.google.com/hangouts/_/canonical.com/quick
[05:01] ha ha, state has 270 public methods.
[05:02] :)
[05:02] hazaah
[05:17] thumper: any luck with that test now?
[05:37] veebers: got some time?
[05:37] I only have a few minutes before heading out
[05:37] taking Maia to guides
[05:40] veebers: I'm still in quick
[05:44] veebers: nm, have to head out now
[05:44] I'm running an attempt at a test fix
[05:44] babbageclunk: burton-aus: i think this failure may just be a slight difference in file content, but i haven't looked closely http://reports.vapour.ws/releases/5321/job/log-forward/attempt/1238
[05:45] axw_: babbageclunk: I've been doing some tweaking on the internals of mgo/txn, and while I'd usually reach out to menn0, he's not around to discuss them. Are either of you interested?
[05:47] jam: I would be, but I need to drop soon for child feeding and hosing down, sorry.
[05:47] babbageclunk: well, these are not high priority, so if you're interested in the area, we can schedule it for the future
[05:47] jam: yeah, definitely!
[05:48] whoa, that was probably more enthusiasm than I intended.
[05:48] But I definitely am interested.
[05:49] ugh, sorry thumper was peeling potatoes :-\
[05:53] wallyworld, burton-aus: I haven't looked very hard at that, but there are definitely forwarded messages in the logs there, so I think it might be a test issue?
[05:54] babbageclunk: yeah.
my initial thought was that the changes done should have been transparent, so if the test was passing before, it should pass now also
[05:56] wallyworld: now that I think about it there might be ordering differences (since the logs for each model would be forwarded independently), or the forwarding might only be set up for the controller model in the test (expecting that would also forward the model logs, which isn't true any more).
[05:57] babbageclunk: the latter sounds more plausible
[05:57] we'll have to get the test updated
[05:58] wallyworld: want me to take a look at that? I think I'm finished the log collection splitting. Just fixing state tests that look directly in the logx collection, and then migration steps.
[05:58] I mean, upgrade steps
[05:59] babbageclunk: i think it maybe better to continue your wip
[06:02] babbageclunk: burton-aus might get to it first, otherwise you could look after putting up the log split PR
[06:02] even if we just identify that the test needs fixing, we can sort out something to unblock the release
[06:15] wallyworld: sorry, you wanted to chat storage? 1:1?
[06:25] axw_: yeah, standup ho?
[06:26] axw_: https://github.com/wallyworld/juju/compare/cleanup-removes-app-artefacts...wallyworld:cleanup-removes-app-artefacts2?expand=1
[06:41] jam: if you have something written down about your changes, I'd be interested to read - I don't know enough about the insides of mgo/txn to provide immediate useful feedback
=== axw_ is now known as axw
[06:42] axw: sure. the specific changes in this case are doing some caching and preloading of db requests
[06:54] wallyworld: yeah, I thought that too, just checking.
[07:10] wallyworld, burton-aus: with the log-forwarding test, the test uses a regex to look for log entries that come from the other machine, that might need to be updated (maybe simplified)
[07:11] veebers this is the one I guess:
[07:12] "^[A-Z][a-z]{,2}\ +[0-9]+\ +[0-9]{1,2}:[0-9]{1,2}:[0-9]{1,2}\ machine-0.3ec9b846\-9520\-4d40\-87f2\-5c9114c8a28f\ jujud-machine-agent-3ec9b846\-9520\-4d40\-87f2\-5c91\ .*$"
[07:12] veebers though that machine related string is just fetched from the run.
[07:12] burton-aus: aye, that's the one.
[07:16] babbageclunk, wallyworld, burton-aus: It's kind of hidden but the failure is in ensure_multiple_models_forward_messages, which adds a new model and deploys something, then checks that logs from that model appears in the rsyslog machine logs
[07:17] so Looking at what babbageclunk mentioned, perhaps there is some extra config needed to make sure those logs get forwarded as well?
[07:21] veebers: that bit should be transparent IIANM
[07:23] wallyworld: I'm sorry I don't understand, which part, that there needs to be extra config for the models? Or that there shouldn't be any need for extra config?
[07:23] no need for extra config
[07:24] if the test is checking logs from a model, that bit should work the same as before
[07:24] axw: i think we have an issue still - cleanupDyingUnit calls cleanupUnitStorageAttachments() with remove=false, so the storage removal doesn't happen, and EnsureDead() fails. i can't see that the processing of a dying unit adds a cleanup job to remove dying storage
[07:25] wallyworld: ah cool, thanks for clarifying.
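(Editor's sketch, not part of the log.) The regex quoted at 07:12 hard-codes one run's model UUID and escapes every character by hand. A relaxed version could be built from the machine tag and model UUID at run time, roughly as below. The function name and the sample line are hypothetical; the line shape is only inferred from the quoted regex, and the month is assumed to be a three-letter abbreviation.

```python
import re


def forwarded_log_pattern(machine_tag, model_uuid):
    """Build a relaxed pattern for one forwarded rsyslog line.

    Assumed shape, inferred from the regex quoted in the log:
    "<Mon> <day> <h:m:s> <machine_tag>.<model_uuid> jujud-machine-agent-<id> <msg>"
    """
    # Escape the run-specific part once instead of hand-escaping each dash.
    host = re.escape(f"{machine_tag}.{model_uuid}")
    return re.compile(
        r"^[A-Z][a-z]{2}\s+\d{1,2}\s+\d{1,2}:\d{2}:\d{2}\s+"
        + host
        # The agent name is truncated in the quoted regex, so accept any suffix.
        + r"\s+jujud-machine-agent-\S+\s+.*$"
    )


# Hypothetical sample line, shaped like the quoted regex expects.
sample = (
    "May 31 07:12:03 machine-0.3ec9b846-9520-4d40-87f2-5c9114c8a28f "
    "jujud-machine-agent-3ec9b846-9520-4d40-87f2-5c91 some message"
)
pattern = forwarded_log_pattern("machine-0", "3ec9b846-9520-4d40-87f2-5c9114c8a28f")
print(bool(pattern.match(sample)))
```

Building the pattern per model would also let the test look for lines from the newly added model separately from the controller model's lines.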
It's possible the regex check needs tweaked (and or relaxed) if the format has changed a bit
[07:25] i am likely missing something
[07:25] wallyworld: just a minute, looking
[07:25] veebers: the format should be the same also
[07:25] veebers: xtian will need to look into it a bit
[07:26] wallyworld: right, so cleanupDyingUnit causes the storage attachments to go to Dying (detach but don't remove)
[07:26] veebers: it could be a test tweak as well, we just don't know yet
[07:26] wallyworld: then the uniter will run detach-storage hooks
[07:26] and will then remove them
[07:26] axw: ah right, i need to run that bit manually as well
[07:32] wallyworld: ack ok, keep us posted :-)
[07:33] jam: just here to beg a review
[07:33] thumper: of?
[07:33] https://github.com/juju/juju/pull/7430
[07:34] thumper: I missed your ping before, available now if you like
[07:34] veebers: see PR
[07:35] veebers: can you just check the CI test aspects?
[07:36] thumper: link to PR?
[07:36] veebers: two lines above your mention
[07:37] thumper: ah ha :-) looking now
[07:39] it really is pretty simple
[07:39] 7 files, +10 −6
[07:40] thumper: sweet, commented. LGTM
[07:40] jam: ?
[07:40] thumper: was otp, do you want it right away?
[07:41] I'll poke axw
[07:41] I was wanting to kick off the merge
[07:41] it's very very simple
[07:41] * thumper looks at axw
[07:42] * thumper will pop back in 10min
[07:42] * thumper needs to clean house a bit
[07:42] veebers: I'm assuming for develop we use the in tree tests and charms
[07:42] thumper: are we guaranteed never to see .log before it becomes .log.gz
[07:44] thumper: not yet. That's something we're working toward (won't be far away)
[07:45] veebers: oh... well the CI test will fail then
[07:46] you do see a .log before it becomes .log.gz, but just very briefly
[07:47] thumper: ack, once that branch lands we can do the separate "update all" which will propagate the changes (then re-run the test again if needed)
[07:47] veebers: but... but..
then the tests will fail from older versions
[07:48] or have you fixed that?
[07:48] thumper: sorry I was afk, you wanted a review from me?
[07:48] https://github.com/juju/juju/pull/7430
[07:48] axw: discussing ^^
[07:48] jam: I guess if we hit a weird timing issue, we can add a sleep 5 to the action
[07:48] :)
[07:49] but it passed here
[07:49] on lxd with ssd
[07:49] well... I guess we'll find out
[07:49] * thumper needs to head off now...
[07:56] veebers: I can merge tim's branch, but CI will start failing - how much longer are you around? isn't it already past your EOD?
[07:59] axw: aye it is, the CI test will fail on the revision-build, we can make it pass though by updating the nodes once it lands
[07:59] It's a bit messy as we're in the process of making it so testing is done from in tree
[08:00] veebers: ok. looks like this is meant to be going into rc1, so I'll merge and hopefully balloons can sort it out when he wakes up
[08:01] axw: ack, I'll email and let him kknow
[08:02] veebers: thanks :)
=== salmankhan1 is now known as salmankhan
[12:09] rogpeppe: hey, i'm told a recent change to add/use a dns cache may have added a flakey test, TestDNSCacheUsed... any chance you could look? we are trying to get an rc out this week. here's an example of a failure http://reports.vapour.ws/releases/5325/job/run-unit-tests-xenial-amd64/attempt/255
[12:12] wallyworld: was that before https://github.com/juju/juju/pull/7429 landed?
[12:13] rogpeppe: it's off the latest CI run, let me check to see that rev it is
[12:14] wallyworld: thanks
[12:15] rogpeppe: yeah, the CI run is from testing PR 7430 which landed 5 hours after
[12:16] wallyworld: OK, i'll look into it
[12:16] rogpeppe: tyvm, i'm off to bed real soon
[12:16] we are looking to get a good CI run for the morning in australia
[12:18] around midnight UTC?
[12:18] wallyworld: there's one problem that really should be fixed before release
[12:20] wallyworld: https://bugs.launchpad.net/juju/+bug/1692905
[12:20] Bug #1692905: cert error on public controller: cannot validate certificate
[12:20] wallyworld: i'm working on the fix
[12:22] hey wallyworld ;)
[12:24] Looks like if you land the unit test fix the only issue will be with the windows deploy test. We think the slave is sick
[12:25] rogpeppe, will you have a fix for that today?
[12:25] balloons: i am hoping to, yes
[12:26] balloons: i've fixed the code - just writing tests for it
[12:27] Awesome. So we can get a bless on that landing. Changing any dependencies?
[12:27] balloons: here's a fix for another flaky test of mine... https://github.com/juju/juju/pull/7434
[12:29] rogpeppe, ack. Good stuff.
=== akhavr1 is now known as akhavr
[13:28] Bug #1694988 opened: AWS instances created by juju don't have an associated IPv6, even if "auto-assign IPv6 addresses" is enabled for the subnet
=== pathcl is now known as path
=== path is now known as pathcl
=== akhavr1 is now known as akhavr
[17:15] this PR fixes juju bug 1692905: https://github.com/juju/juju/pull/7438; reviews appreciated
[17:15] Bug #1692905: cert error on public controller: cannot validate certificate
[18:00] rogpeppe: unit tests are failing
[18:17] wpk, could i please get a review of https://github.com/juju/juju/pull/7439 ?
[19:04] cmars: done.
[19:04] wpk, thanks
[19:12] wpk: looking
[19:14] cmars: you could look at the server version and warn if it's 2.2 or greater, I guess
[19:15] wpk: ok, a bunch of fairly trivial things; i was too lazy to run the tests my own machine, can you tell?
:)
[19:26] wpk: i'd very much appreciate a review if you're up for it, BTW
[19:29] wpk: tests should be fixed now
[19:40] rogpeppe: full suite kills my laptop, so I always do the bare minimum and then test it in Jenkins :)
[19:40] wpk: at least the tests are now run before you hit $$merge$$
[21:14] veebers: hey, sorry to miss the discussion last night - I think if you set up the model defaults with log forwarding settings before creating the model it should start forwarding logs straight away.
[21:18] babbageclunk: that's contrary to what wallyworld said re: the settings being transparent isn't it?
[21:19] veebers: you always did need to set up log forwarding in the initial config
[21:19] veebers: It's transparent if you were already setting the model defaults. ;)
[21:19] ah right
[21:19] * veebers checks what the test is doing now
[21:21] veebers: There is a change in behaviour - before if you set up forwarding for the controller it would automatically forward for all models. Now if you want that you need to put the settings in model defaults.
[21:22] babbageclunk: ah right ok, I think that's the missing part in my thinking, cheers
[21:23] veebers: Sorry, I probably should have mentioned that earlier!
[21:24] babbageclunk: you have docs re: what the settings values are for that?
[21:26] Does anyone happen to know if there exists CI jobs that depend on the fact that GO is being installed by Juju's top level Makefile?
[21:28] balloons: is it only unit tests that we expect juju to install go as part of it's own setup?
[21:28] externalreality, yes we do depend on it
[21:29] veebers, we we run the merge jobs we use the makefile on the new instance to test
[21:32] veebers: no, I don't think it's documented anywhere. The settings haven't changed - still logforward-enabled, syslog-host, syslog-ca-cert, syslog-client-cert, syslog-client key.
[21:33] veebers: Just how they're used has changed.
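(Editor's sketch, not part of the log.) Since forwarding now has to be configured in model defaults rather than only on the controller model, a test could check the defaults before adding a model. The helper below is hypothetical and does not call any Juju API; it only inspects a plain dict. The key names come from the 21:32 message, with the last one assumed to be spelled `syslog-client-key` (the log writes "syslog-client key").

```python
# Setting names as listed at 21:32; "syslog-client-key" is an assumed spelling.
LOG_FORWARD_KEYS = (
    "logforward-enabled",
    "syslog-host",
    "syslog-ca-cert",
    "syslog-client-cert",
    "syslog-client-key",
)


def missing_log_forward_settings(model_defaults):
    """Return the log-forwarding keys that are absent or empty in a dict."""
    return [
        key
        for key in LOG_FORWARD_KEYS
        if model_defaults.get(key) in (None, "")
    ]


# Hypothetical defaults with only two of the five settings filled in.
defaults = {"logforward-enabled": True, "syslog-host": "10.0.0.5:10514"}
print(missing_log_forward_settings(defaults))
```

A non-empty result would suggest that a newly created model will not forward logs, which matches the behaviour change described at 21:21.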
[21:37] babbageclunk: do you need to set the syslog-* stuff on the model config too?
[21:39] veebers: yup - they could be independent, in theory.
[21:40] veebers: Want to have a quick hangout about it?
[21:44] babbageclunk: would love to, just in release call, will be a little bit before i'm free, can I ping you?
[21:45] thumper, axw, wallyworld: you might wanna take a look at this PR - I'm hoping it can land before the release: https://github.com/juju/juju/pull/7438
[21:46] rogpeppe: might be able to :-) btw there's still a dns cache related race
[21:46] veebers: yup yup
[21:47] wallyworld: got a link to a failure?
[21:47] give me a sec to find it
[21:48] rogpeppe: TestDNSCacheUsed. I think you fixed a different one. http://reports.vapour.ws/releases/5331/job/run-unit-tests-race/attempt/2831
[21:48] wallyworld: ooh, an actual race!
[21:48] wallyworld: thanks
[21:49] ooh!
[21:49] rogpeppe: we are delaying rc1 till next monday/tuesday, so that will get time for your pr to land
[21:50] rogpeppe: yeah, we have fixed several actual races this week elsewhere as well. so close to getting a proper blessed CI run
[21:52] wallyworld: ok, that's a trivial fix
[21:52] yay
[21:58] Is anyone else having trouble with pushes to github taking a long time?
[21:58] babbageclunk: I did last night and thought it was just my internet (or that I did something wrong)
[21:59] wallyworld: https://github.com/juju/juju/pull/7440
[22:00] rogpeppe: you rock, ty, will look real soon
[22:00] i'll merge
[22:00] rick_h: just finishing meeting, be there in a sec
[22:00] wallyworld: all good
[22:03] veebers: Mine's just sitting here with a git pack-objects process doing nothing. I guess the other end of the connection is loaded?
[22:05] ugh, finally!
[22:08] babbageclunk: can you join the release call plz?
[22:08] thumper: sure
[22:08] rick_h: still in release call, can we delay for 15 mins, or defer?
[22:08] rick_h: I'm feeling left out, you haven't asked me for a call
[22:08] wallyworld: rgr, just setup something that works for you next week if that's ok
[22:08] thumper: well, I like wallyworld :P
[22:09] rick_h: sure, and sorry, 2.2 release is so close
[22:09] we need to get stuff sorted
[22:09] wallyworld: <3
[22:09] yea
[22:11] wallyworld: thumper the one thing for 2.2 I wanted to bring up is if this article effects instance type availability and needs to be mentioned. https://goo.gl/VNe9oC
[22:11] wallyworld: other than that I'll catch you later
[22:11] rick_h: we'll look at it
[23:14] babbageclunk: could you also tweak the controller setting max-txn-log-size when you do the mustString() thing for the other ones?
[23:15] wallyworld: ok
[23:15] wallyworld: we don't ww
[23:15] oops
[23:15] wallyworld: covfefe
[23:16] lol
[23:16] wallyworld: We don't want to run the log pruner per-model do we?
[23:16] it should be like the status history pruner
[23:16] i think that's per model
[23:17] wallyworld: oh, ok - I'll take a look at that