[00:15] anastasiamac: thanks
[00:16] wpk: anytime \o/
=== bdx_ is now known as bdx
[00:45] babbageclunk: I guess the lazy lookup was in case something happened to import the package without actually running mongod-using tests?
[00:45] babbageclunk: for that reason I guess it makes sense to keep it lazy
[00:49] wallyworld: I have things to do in town this afternoon, so let's cancel our call
[00:49] wallyworld: we will be seeing each other in two days anyway
[00:51] thumper: sgtm
=== mup_ is now known as mup
[01:05] menn0: I've noticed that we're not passing "--storageEngine wiredTiger" to mongo 3.2 in the tests. have you tested with that added?
[01:20] menn0: yeah, that'd be the reason not to do it at import time.
[01:52] axw: isn't the default WT?
[01:53] menn0: yeah, just the help text is a bit confusing "defaults to wiredTiger if no data files present"
[01:53] menn0: anyway, I tested and it made no difference
[01:53] axw: ok good to know
[01:54] menn0: forcing it to mmapv1 makes it faster though, in the limited testing I did
[01:54] menn0: I don't think we want to do that though, since we use WT in prod
[01:54] though maybe just having WT in CI would be good enough
[01:56] axw: we think the problem is that we're still deleting DBs instead of clearing the collections in a lot of places
[01:56] WT is much slower at deleting DBs
[01:59] menn0: could be, but I did notice that at least some of the test code (i.e.
between SetUpTest completion and before TearDownTest start) is slower with WT
[02:01] ok, so there's issues there as well
[02:32] axw, thumper, babbageclunk: there's quite a significant performance increase with wiredtiger if you disable transparent huge pages
[02:32] echo never > /sys/kernel/mm/transparent_hugepage/enabled
[02:33] without changing anything else this takes the agent/agentbootstrap tests from 2.8s to 1.8s on my machine
[02:33] consistently
[02:35] interestingly, Juju is supposed to be setting that on controller machines but I don't see that
[02:36] menn0: right, but do you know what other impact that would have on our systems?
[02:36] babbageclunk: not sure yet
[02:36] menn0: hmm interesting. doesn't improve allWatcherStateSuite.TestChangeApplications in state for me
[02:36] axw: what's the setting set to on your machine?
[02:36] menn0: it was set to always
[02:37] ok
[02:37] menn0: It makes sense to do it on dedicated machines (although I guess it wouldn't help in lxd?)
[02:37] duh
[02:37] menn0: with mongo 2.4 (juju-mongodb), that test takes just under 0.4s. with mongo 3.2 (juju-mongodb3.2) it takes 1.3s
[02:38] that sucks
[02:38] maybe we shouldn't be using wiredtiger at all
[02:38] there's a lot of horror stories online about WT
[02:38] lots of people seem to have switched back
[02:43] babbageclunk: https://github.com/juju/testing/pull/123/files
[02:43] hmm...
[02:43] wow
[02:45] menn0: initial timing tests... 2.6 mongod 23m17s, api 32s, apiserver 138s, state 671s
[02:45] babbageclunk, axw, thumper: seems like we get most of the performance back by passing --storageEngine mmapv1
[02:45] but then it's not like production
[02:45] menn0: I think I floated that at the time.
[02:46] menn0: initial timing tests... 3.2 mongod 55m36s, api 366s, apiserver 1515s, state 1455s
[02:46] api and apiserver packages are 10x slower
[02:46] thumper: can you try with huge transparent pages turned off?
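The transparent huge pages exchange above (one machine reporting "always", another seeing no change with "never") depends on knowing which mode is actually active. A minimal Go sketch for checking it follows. It relies on the Linux sysfs convention that the file lists every mode and brackets the active one, for example "always madvise [never]". The helper name activeTHPMode is hypothetical and is not taken from juju or juju/testing.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// thpPath is the sysfs file written to by the echo command in the log.
const thpPath = "/sys/kernel/mm/transparent_hugepage/enabled"

// activeTHPMode (hypothetical helper) returns the bracketed entry from the
// file's contents, so "always madvise [never]" yields "never".
func activeTHPMode(contents string) (string, error) {
	for _, field := range strings.Fields(contents) {
		if strings.HasPrefix(field, "[") && strings.HasSuffix(field, "]") {
			return strings.Trim(field, "[]"), nil
		}
	}
	return "", fmt.Errorf("no active mode found in %q", contents)
}

func main() {
	data, err := os.ReadFile(thpPath)
	if err != nil {
		// The file is absent on non-Linux hosts and in some containers.
		fmt.Println("cannot read THP setting:", err)
		return
	}
	mode, err := activeTHPMode(string(data))
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println("transparent_hugepage/enabled:", mode)
}
```

Printing the mode next to each timing run would make numbers from different machines easier to compare.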
[02:46] menn0: I guess the main risk/annoyance would be if there's some behaviour difference and we only find out when the CI tests fail.
[02:46] let me address the clear databases
[02:47] then I'll try with the transparent pages off
[02:47] thumper: yep ok.
[02:47] menn0: yeah that's what I found too. I think I'm OK with that as long as our CI still uses 3.2 (which it would, since there's no way to override that?)
[02:47] menn0: less of a problem now that we have check builds though
[02:51] babbageclunk: do you have time for a small review? https://github.com/juju/testing/pull/124
[02:52] * thumper has kicked off the new test run
[02:53] babbageclunk, thumper: I'm going to return to other work for now but will be happy to discuss ideas or test stuff
[02:53] menn0: ack
[02:57] babbageclunk: using clear databases rather than reset actually looks like it is taking longer...
[03:00] hmm
[03:00] api package was 378s
[03:01] menn0, babbageclunk: I'll wait for the apiserver package timings, but the clear databases call makes it even slower
[03:01] I'll try the transparent pages turned off
[03:03] axw: sure, looking at that now (and menn0's too).
[03:10] axw: LGTM'd
[03:19] babbageclunk: thanks
[03:26] babbageclunk: did you forget about this one? https://github.com/juju/description/pull/8
[03:26] there's no bot on that repo, I can merge if you like
[03:27] axw: oh, yes please - I think I decided to merge it when thumper was away and then forgot to follow it up.
[03:27] ta
[03:27] babbageclunk: done
[03:31] babbageclunk: thanks for the review
[03:31] menn0:
[03:32] menn0, babbageclunk: changing the huge transparent pages is making very little difference
[03:32] I don't know what to do next
[03:33] thumper: weird. it seemed to make a big difference for me.
[03:33] thumper: :( you said clearDatabases is slower?
[03:33] yeah
[03:35] thumper: well bums
[03:35] yeah
[03:37] even with the huge pages set to never, the api package is 10x slower
[03:37] so 300s instead of 30s
[03:38] NFI what to do next
[03:39] convenient timing too
[03:41] thumper: --storageEngine mmapv1 :)
[03:41] menn0: where?
[03:41] in juju/testing/mgo.go, in the args we pass to mongod
[03:42] have to be careful that mongod is 3.x though as --storageEngine doesn't exist in 2.x
[03:43] thumper: also, what about dropping DBs in MgoSuite.TearDownTest instead of clearing?
[03:54] menn0: did you do something about caching the mgo version?
[03:54] thumper: yes
[03:55] menn0: where?
[03:55] https://github.com/juju/testing/commit/3ccb7d0a3f3412b41f8c1bf000ff7078d28c0af9
[03:59] k
[04:00] thumper: that doesn't deal with the slowness. it just closes a potential problem.
[04:02] hmm with mmapv1 storage engine it is only 30% slower
[04:09] are we prepared to accept a 30% decrease in test speed?
[04:09] and what other choice do we have?
[04:10] and should this info drive production choices?
[04:13] thumper: production usage patterns are very different from the tests though - we very rarely drop databases. :)
[04:13] yeah...
[04:13] * thumper needs to organise some trip stuff
[05:19] jam: ping?
[06:16] wallyworld: oracle bootstraps for me on my trial account FWIW
[06:17] axw: it is failing because it is starting a yakkety instance (and looking for yakkety tools in /var/lib/juju) even though juju thinks it has asked for a xenial instance. so there's an issue with image selection :-(
[06:18] i will try and remove the yakkety image from my account, that should fix it
[06:18] okey dokey.
I think I only have xenial in mine
[06:19] yeah, i'll let gabriel know, just testing to make sure
[07:22] axw: re: proxy - this can break things, as it really sets the proxy globally
=== frankban|afk is now known as frankban
[07:45] axw: I forwarded you an e-mail from jam describing changes, I don't want to land it without a proper test run, preferably by someone who uses proxying
[07:45] axw: there is a 'workaround' (remove systemd code), but that has yet to be decided
[07:59] wpk: how about if we leave the ability to do systemd, but have it disabled, and land the rest ?
[07:59] I think the rest is a clear improvement
[08:00] and we can discuss whether systemd will actually break things
[09:14] Bug #1686938 opened: During a destroy-model the units first update/upgrade which delays the destroy process
[09:59] Is anyone willing to review #7204? I know it's big but it's been waiting for almost a month now..
=== fnordahl_ is now known as fnordahl
[11:12] wpk: did you go through it as though you were doing a review as well?
[11:26] jam: yes, although it is a huge one so I might have missed some things...
[11:27] wpk: so it seems there is one piece that needs fixing wrt bridge_ports and inet and inet6 stanzas
[11:28] jam: we should only put one?
[11:28] bug #1650304
[11:28] Bug #1650304: Juju2: 'Creating container: failed to ensure LXD image: image not imported!'
[11:29] from axw: https://bugs.launchpad.net/juju/+bug/1650304/comments/7
[11:37] jam: Reading through bridge-utils, IMHO it's OK to have it but I'll check
[11:38] wpk: it wasn't ok from experience, given the original bug
[11:41] jam: as I understand that wasn't the problem, and looking at the code it's simply doing for (bridge in bridge_ports) if (bridge not already in the bridge) brctl addif....
[11:42] hopefully that's if "port not already in the bridge" :)
[11:42] but I get your point
[11:43] wpk: are we sure that is true across trusty+xenial+yakkety, etc?
[11:46] trusty, xenial, zesty
[11:46] if [ "$MODE" = "start" ] && [ ! -d /sys/class/net/$IFACE/brif/$port ]; then
[11:46] I don't have any Yakkety but I don't think they'd change that line just for this one :)
=== tinwood is now known as tinwood_lunch
[12:35] wpk: if you're changing it back (I'm happy to be told I was wrong - again, I'm no expert), can you please run the QA steps from that PR that jam linked to?
[12:35] I mean, the PR linked from the bug that jam linked to...
[12:44] axw: will do, I'm checking how the 'old version' worked and bridge_ports wasn't the only problem (repeated auto, etc.)
[12:44] axw: was the problem only with rackspace or with other providers too?
[12:45] wpk: yep, I may well have conflated the other issues. that's the only place I observed and reproduced the issue. I think it was also seen on MAAS by others, but could never confirm myself
[12:48] here's a tiny little PR for review, controller-specific cookie jars: https://github.com/juju/juju/pull/7294
[12:49] axw, wpk, jam, wallyworld: if someone manages to review it, i'll owe them a beer or two :)
[12:50] rogpeppe: heh. I've already got a beer and it's 9pm on a Friday night ;) I'll take a look on monday if nobody does it sooner
[12:50] axw: thanks
[12:51] axw: this work was the subject of this tweet: https://twitter.com/rogpeppe/status/849963422032723968 :)
[12:52] rogpeppe: I figured :)
=== tinwood_lunch is now known as tinwood
=== freyes__ is now known as freyes
=== frankban is now known as frankban|afk