| wpk | anastasiamac: thanks | 00:15 |
|---|---|---|
| anastasiamac | wpk: anytime \o/ | 00:16 |
| === bdx_ is now known as bdx | ||
| menn0 | babbageclunk: I guess the lazy lookup was in case something happened to import the package without actually running mongod using tests? | 00:45 |
| menn0 | babbageclunk: for that reason I guess it makes sense to keep it lazy | 00:45 |
| thumper | wallyworld: I have things to do in town this afternoon, so let's cancel our call | 00:49 |
| thumper | wallyworld: we will be seeing each other in two days anyway | 00:49 |
| wallyworld | thumper: sgtm | 00:51 |
| === mup_ is now known as mup | ||
| axw | menn0: I've noticed that we're not passing "--storageEngine wiredTiger" to mongo 3.2 in the tests. have you tested with that added? | 01:05 |
| babbageclunk | menn0: yeah, that'd be the reason not to do it at import time. | 01:20 |
| menn0 | axw: isn't the default WT? | 01:52 |
| axw | menn0: yeah, just the help text is a bit confusing "defaults to wiredTiger if no data files present" | 01:53 |
| axw | menn0: anyway, I tested and it made no difference | 01:53 |
| menn0 | axw: ok good to know | 01:53 |
| axw | menn0: forcing it to mmapv1 makes it faster though, in the limited testing I did | 01:54 |
| axw | menn0: I don't think we want to do that though, since we use WT in prod | 01:54 |
| axw | though maybe just having WT in CI would be good enough | 01:54 |
| menn0 | axw: we think the problem is that we're still deleting DBs instead of clearing the collections in a lot of places | 01:56 |
| menn0 | WT is much slower at deleting DBs | 01:56 |
| axw | menn0: could be, but I did notice that at least some of the test code (i.e. between SetUpTest completion and before TearDownTest start) is slower with WT | 01:59 |
| menn0 | ok, so there's issues there as well | 02:01 |
| menn0 | axw, thumper, babbageclunk: there's quite a significant performance increase with wiredtiger if you disable transparent huge pages | 02:32 |
| menn0 | echo never > /sys/kernel/mm/transparent_hugepage/enabled | 02:32 |
| menn0 | without changing anything else this takes the agent/agentbootstrap tests from 2.8s to 1.8s on my machine | 02:33 |
| menn0 | consistently | 02:33 |
| menn0 | interestingly, Juju is supposed to be setting that on controller machines but I don't see that | 02:35 |
| babbageclunk | menn0: right, but do you know what other impact that would have on our systems? | 02:36 |
| menn0 | babbageclunk: not sure yet | 02:36 |
| axw | menn0: hmm interesting. doesn't improve allWatcherStateSuite.TestChangeApplications in state for me | 02:36 |
| menn0 | axw: what's the setting set to on your machine? | 02:36 |
| axw | menn0: it was set to always | 02:36 |
| menn0 | ok | 02:37 |
| babbageclunk | menn0: It makes sense to do it on dedicated machines (although I guess it wouldn't help in lxd?) | 02:37 |
| menn0 | duh | 02:37 |
| axw | menn0: with mongo 2.4 (juju-mongodb), that test takes just under 0.4s. with mongo 3.2 (juju-mongodb3.2) it takes 1.3s | 02:37 |
| menn0 | that sucks | 02:38 |
| menn0 | maybe we shouldn't be using wiredtiger at all | 02:38 |
| menn0 | there's a lot of horror stories online about WT | 02:38 |
| menn0 | lots of people seem to have switched back | 02:38 |
| menn0 | babbageclunk: https://github.com/juju/testing/pull/123/files | 02:43 |
| thumper | hmm... | 02:43 |
| thumper | wow | 02:43 |
| thumper | menn0: initial timing tests... 2.6 mongod 23m17s, api 32s, apiserver 138s, state 671s | 02:45 |
| menn0 | babbageclunk, axw, thumper: seems like we get most of the performance back by passing --storageEngine mmapv1 | 02:45 |
| menn0 | but then it's not like production | 02:45 |
| babbageclunk | menn0: I think I floated that at the time. | 02:45 |
| thumper | menn0: initial timing tests... 3.2 mongod 55m36s, api 366s, apiserver 1515s, state 1455s | 02:46 |
| thumper | api and apiserver packages are 10x slower | 02:46 |
| menn0 | thumper: can you try with huge transparent pages turned off? | 02:46 |
| babbageclunk | menn0: I guess the main risk/annoyance would be if there's some behaviour difference and we only find out when the CI tests fail. | 02:46 |
| thumper | let me address the clear databases | 02:46 |
| thumper | then I'll try with the transparent pages off | 02:47 |
| menn0 | thumper: yep ok. | 02:47 |
| axw | menn0: yeah that's what I found too. I think I'm OK with that as long as our CI still uses 3.2 (which it would, since there's no way to override that?) | 02:47 |
| babbageclunk | menn0: less of a problem now that we have check builds though | 02:47 |
| axw | babbageclunk: do you have time for a small review? https://github.com/juju/testing/pull/124 | 02:51 |
| * thumper has kicked off the new test run | 02:52 | |
| menn0 | babbageclunk, thumper: I'm going to return to other work for now but will be happy to discuss ideas or test stuff | 02:53 |
| thumper | menn0: ack | 02:53 |
| thumper | babbageclunk: using clear databases rather than reset actually looks like it is taking longer... | 02:57 |
| thumper | hmm | 03:00 |
| thumper | api package was 378s | 03:00 |
| thumper | menn0, babbageclunk: I'll wait for the apiserver package timings, but the clear databases call makes it even slower | 03:01 |
| thumper | I'll try the transparent pages turned off | 03:01 |
| babbageclunk | axw: sure, looking at that now (and menn0's too). | 03:03 |
| babbageclunk | axw: LGTM'd | 03:10 |
| axw | babbageclunk: thanks | 03:19 |
| axw | babbageclunk: did you forget about this one? https://github.com/juju/description/pull/8 | 03:26 |
| axw | there's no bot on that repo, I can merge if you like | 03:26 |
| babbageclunk | axw: oh, yes please - I think I decided to merge it when thumper was away and then forgot to follow it up. | 03:27 |
| babbageclunk | ta | 03:27 |
| axw | babbageclunk: done | 03:27 |
| menn0 | babbageclunk: thanks for the review | 03:31 |
| babbageclunk | menn0: <thumbsup emoji> | 03:31 |
| thumper | menn0, babbageclunk: changing the huge transparent pages is making very little difference | 03:32 |
| thumper | I don't know what to do next | 03:32 |
| menn0 | thumper: weird. it seemed to make a big difference for me. | 03:33 |
| babbageclunk | thumper: :( you said clearDatabases is slower? | 03:33 |
| thumper | yeah | 03:33 |
| babbageclunk | thumper: well bums | 03:35 |
| thumper | yeah | 03:35 |
| thumper | even with the huge pages set to never, the api package is 10x slower | 03:37 |
| thumper | so 300s instead of 30s | 03:37 |
| thumper | NFI what to do next | 03:38 |
| babbageclunk | convenient timing too | 03:39 |
| menn0 | thumper: --storageEngine mmapv1 :) | 03:41 |
| thumper | menn0: where? | 03:41 |
| menn0 | in juju/testing/mgo.go, in the args we pass to mongod | 03:41 |
| menn0 | have to be careful that mongod is 3.x though as --storageEngine doesn't exist in 2.x | 03:42 |
| menn0 | thumper: also, what about dropping DBs in MgoSuite.TearDownTest instead of clearing? | 03:43 |
| thumper | menn0: did you do something about caching the mgo version? | 03:54 |
| menn0 | thumper: yes | 03:54 |
| thumper | menn0: where? | 03:55 |
| menn0 | https://github.com/juju/testing/commit/3ccb7d0a3f3412b41f8c1bf000ff7078d28c0af9 | 03:55 |
| thumper | k | 03:59 |
| menn0 | thumper: that doesn't deal with the slowness. it just closes a potential problem. | 04:00 |
| thumper | hmm with mmapv1 storage engine it is only 30% slower | 04:02 |
| thumper | are we prepared to accept a 30% decrease in test speed? | 04:09 |
| thumper | and what other choice do we have? | 04:09 |
| thumper | and should this info drive production choices? | 04:10 |
| babbageclunk | thumper: production usage patterns are very different from the tests though - we very rarely drop databases. :) | 04:13 |
| thumper | yeah... | 04:13 |
| * thumper needs to organise some trip stuff | 04:13 | |
| babbageclunk | jam: ping? | 05:19 |
| axw | wallyworld: oracle bootstraps for me on my trial account FWIW | 06:16 |
| wallyworld | axw: it is failing because it is starting a yakkety instance (and looking for yakkety tools in /var/lib/juju) even though juju thinks it has asked for a xenial instance. so there's an issue with image selection :-( | 06:17 |
| wallyworld | i will try and remove the yakkety image from my account, that should fix it | 06:18 |
| axw | okey dokey. I think I only have xenial in mine | 06:18 |
| wallyworld | yeah, i'll let gabriel know, just testing to make sure | 06:19 |
| wpk | axw: re: proxy - this can break things, as it really sets the proxy globally | 07:22 |
| === frankban|afk is now known as frankban | ||
| wpk | axw: I forwarded you an e-mail from jam describing changes, I don't want to land it without a proper test run, preferably by someone who uses proxying | 07:45 |
| wpk | axw: there is a 'workaround' (remove systemd code), but that has yet to be decided | 07:45 |
| jam | wpk: how about if we leave the ability to do systemd, but have it disabled, and land the rest ? | 07:59 |
| jam | I think the rest is a clear improvement | 07:59 |
| jam | and we can discuss whether systemd will actually break things | 08:00 |
| mup | Bug #1686938 opened: During a destroy-model the units first update/upgrade which delays the destroy process <apt> <delay> <destroy-model> <upgrade> <juju-core:New> <https://launchpad.net/bugs/1686938> | 09:14 |
| wpk | Is anyone willing to review #7204? I know it's big but it's been waiting for almost a month now.. | 09:59 |
| === fnordahl_ is now known as fnordahl | ||
| jam | wpk: did you go through it as though you were doing a review as well? | 11:12 |
| wpk | jam: yes, although it is a huge one so I might have missed some things... | 11:26 |
| jam | wpk: so it seems there is one piece that needs fixing wrt bridge_ports and inet and inet6 stanzas | 11:27 |
| wpk | jam: we should only put one? | 11:28 |
| jam | bug #1650304 | 11:28 |
| mup | Bug #1650304: Juju2: 'Creating container: failed to ensure LXD image: image not imported!' <oil> <oil-2.0> <regression> <juju:Incomplete> <juju 2.1:Incomplete> <https://launchpad.net/bugs/1650304> | 11:28 |
| jam | from axw: https://bugs.launchpad.net/juju/+bug/1650304/comments/7 | 11:29 |
| wpk | jam: Reading through bridge-utils, IMHO it's OK to have it but I'll check | 11:37 |
| jam | wpk: it wasn't ok from experience, given the original bug | 11:38 |
| wpk | jam: as I understand that wasn't the problem, and looking at the code it's simply doing for (bridge in bridge_ports) if (bridge not already in the bridge) brctl addif.... | 11:41 |
| jam | hopefully that's if "port not already in the bridge" :) | 11:42 |
| jam | but I get your point | 11:42 |
| jam | wpk: are we sure that is true across trusty+xenial+yakkety, etc? | 11:43 |
| wpk | trusty, xenial, zesty | 11:46 |
| wpk | if [ "$MODE" = "start" ] && [ ! -d /sys/class/net/$IFACE/brif/$port ]; then | 11:46 |
| wpk | I don't have any Yakkety but I don't think they'd change that line just for this one :) | 11:46 |
| === tinwood is now known as tinwood_lunch | ||
| axw | wpk: if you're changing it back (I'm happy to be told I was wrong - again, I'm no expert), can you please run the QA steps from that PR that jam linked to? | 12:35 |
| axw | I mean, the PR linked from the bug that jam linked to... | 12:35 |
| wpk | axw: willdo, I'm checking how the 'old version' worked and bridge_ports wasn't the only problem (repeated auto, etc.) | 12:44 |
| wpk | axw: was the problem only with rackspace or with other providers too? | 12:44 |
| axw | wpk: yep, I may well have conflated the other issues. that's the only place I observed and reproduced the issue. I think it was also seen on MAAS by others, but could never confirm myself | 12:45 |
| rogpeppe | here's a tiny little PR for review, controller-specific cookie jars: https://github.com/juju/juju/pull/7294 | 12:48 |
| rogpeppe | axw, wpk, jam, wallyworld: if someone manages to review it, i'll owe them a beer or two :) | 12:49 |
| axw | rogpeppe: heh. I've already got a beer and it's 9pm on a Friday night ;) I'll take a look on Monday if nobody does it sooner | 12:50 |
| rogpeppe | axw: thanks | 12:50 |
| rogpeppe | axw: this work was the subject of this tweet: https://twitter.com/rogpeppe/status/849963422032723968 :) | 12:51 |
| axw | rogpeppe: I figured :) | 12:52 |
| === tinwood_lunch is now known as tinwood | ||
| === freyes__ is now known as freyes | ||
| === frankban is now known as frankban|afk | ||