[00:09] menn0: fair enough
[00:09] there is an option for running tests within a package in parallel
[00:09] but
[00:09] axw: trivial review when you have a moment http://reviews.vapour.ws/r/825/
[00:09] a. it's not the default (nobody writes tests that can be run concurrently)
[00:09] b. you have to specifically enable it on a per test by test basis
[00:10] c. it wouldn't matter because gocheck makes everything look like one test case anyway
[00:11] davecheney: thanks
[00:16] thumper, wallyworld__ : possible fix to the CI blocker. http://reviews.vapour.ws/r/826/
[00:19] menn0: how confident are you that it fixes it?
[00:19] it certainly looks like a candidate
[00:20] * thumper is munching then free for the call
[00:20] ah shoot
[00:20] menn0: I forgot about my chat with wallyworld__
[00:20] wallyworld__: what is your schedule like?
[00:20] i can wait
[00:20] thumper: well I can't replicate the problem locally, it's timing dependent. it only seems to fail for i386 and ppc in CI.
[00:20] cool cool
[00:20] ping when ready
[00:20] menn0: I can try it on my slow ec2 instance
[00:21] thumper: but there was definitely a problem which this PR fixes and I can see how the bug could lead to the problems we're seeing.
[00:21] menn0: although after reviewing the fix my error might be unrelated
[00:21] jw4: that would be great
[00:22] jw4: the error you added to the bug could definitely be related to this fix
[00:22] menn0: cool - I'll spin it up and try
[00:22] jw4: thank you
[00:53] menn0: hmm - first test run without your change did not fail - I'll try again
[01:03] wallyworld: chat?
[01:03] jw4: I would expect the problem to be somewhat intermittent
[01:03] yup
[01:04] wallyworld: can you just change the timeout to be based on the bootstrap-timeout?
[01:04] wallyworld: it's configurable
[01:16] wallyworld: I'm in the hangout, but I'm going to make a coffee
[01:29] menn0: still not repro'ing it - i'm gonna try with an i386 image
[01:30] jw4: ok thanks
[01:31] menn0: otherwise we'll just have to watch for it in the CI servers
[01:31] jw4: the CI unit test runs with the fix in are about to finish so that might tell us more as well
[01:31] menn0: +1
[01:32] jw4: this is the key one to watch: http://juju-ci.vapour.ws:8080/job/run-unit-tests-precise-i386/1327/console
[01:32] jw4: the jujud tests have already passed which is a good sign
[01:33] sweet
=== kadams54-away is now known as kadams54
=== kadams54 is now known as kadams54-away
=== kadams54-away is now known as kadams54
[02:06] wallyworld: missed you, I'm off to the vet, bbl
[02:06] thumper: np, i'm off to buy a new farking network adaptor
[02:06] can't stand these disconnects
[02:39] wallyworld: how common is it for the jujud test runs to time out on the (apparently slower) i386 CI hosts?
[02:39] wallyworld: because a lot of runs just did
[02:40] not sure, but often i suspect
[02:40] our tests timeout on my laptop
[02:49] axw: pr updated
[02:50] wallyworld: just noticed. LGTM
[03:05] wallyworld: your default size change is ineffective, sorry I didn't pick up on it earlier. your change just alters a temporary
[03:05] I'll fix it in my branch
[03:08] ah ball ok, thanks
[03:08] that's what i get for rushing
[03:12] wallyworld: you back?
[03:13] yep
[03:13] haven't left yet
[03:13] chat?
[03:13] sure
=== kadams54 is now known as kadams54-away
[03:39] wallyworld: https://github.com/wallyworld/juju/pull/21
[03:39] looking
=== kadams54-away is now known as kadams54
[03:46] axw: gh not letting me comment.
but i think we need allcons[name] = cons after cons.Pool = defaultPool
[03:49] wallyworld: ah yeah, I'll fix it
[03:49] ta
[03:51] wallyworld: updated
[03:51] thanks, merging
=== kadams54 is now known as kadams54-away
=== kadams54-away is now known as kadams54
[04:43] night all
=== kadams54 is now known as kadams54-away
[04:57] menn0: in case i don't catch you later, see you in capetown
[04:58] axw: small problem to add to todo list - storage constraints reject pool names with "_" eg fast_ebs
[04:58] wallyworld_: yeah, we need to decide what's a valid name. I just copied what was valid for service names
[04:59] np, i'll add to spreadsheet
[05:25] axw: i can't get ebs volumes with provisioned iops to work - the ec2 instance doesn't attach any ebs volumes and doesn't finish coming up in AWS
[05:26] wallyworld_: just about to propose UML changes, will take a look in a sec
[05:26] that's using volume-type = provisioned-iops
[05:26] and iops = 2000 or whatever
[05:26] i blame aws
[05:26] folks - is it possible to change the juju admin secret in a deployed environment?
[05:26] also - morning
[05:27] not sure
[05:27] wallyworld_: there is a ratio between size and IOPS that you need to adhere to, could be that
[05:27] ah ok
[05:28] wallyworld_: http://reviews.vapour.ws/r/828/
[05:31] morning mattyw.
I don't think so, but not 100% sure
[05:36] wallyworld_: I'm going to $$__JFDI__$$ for hopefully obvious reasons :)
[05:36] yeah, np
[05:36] i need to land the maas fix also
[05:37] i'm off for a bit - going to computer shop to get network card
[05:38] later
[05:39] will let you know how I go with IOPS
[05:49] axw, morning - I was in the process of bringing the env up - so I decided to just destroy and start again
=== Guest8948 is now known as jam
[06:39] axw: anastasiamac: i think we should enhance storage list to display the provider type
[06:39] axw: wallyworld_: sounds good but let me merge my stuff 1st
[06:39] axw: wallyworld_: i can add it next :D
[06:41] sure :-)
[06:42] wallyworld_: provider type, or pool?
[06:42] wallyworld_: I was thinking pool...
[06:42] axw: both maybe. as a user, i'd like to see provider type as well
[06:43] wallyworld_: I'd hope the pools are named to make it obvious which provider is involved
[06:43] that's up to the user :-)
[06:43] well if you know what pool to select when you deploy... you should also know what it is when you list them
[06:44] supporting "_" would help
[06:44] that only holds if the person listing and deploying are the same tho
[06:44] wallyworld_: why? you can use "-"
[06:44] i agree in principle, but yes, as you just said
[06:44] true
[06:44] i guess i'm an old python guy at heart
[06:44] :)
[07:22] wallyworld_: https://github.com/wallyworld/juju/pull/23
[07:23] wallyworld_: did you need any other docs or anything for next week, or is that UML update sufficient?
[07:24] wallyworld_: btw I just confirmed that with a pool having iops=3000, I can deploy with a 100GiB volume
[07:24] otherwise the block device mapping seems to get silently eaten up...
[07:24] that branch just beefs up the validation
[07:40] davecheney, ping?
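(Editor's note on the 05:27 and 07:24 messages.) The size-to-IOPS ratio axw mentions, and the "beefed up" validation in his branch, could look roughly like the sketch below. It is not the code from that pull request. The name validateIOPS, the constant maxIOPSPerGiB and its value of 30 are assumptions made for illustration; the only data point from the log is that 3000 IOPS on a 100GiB volume was accepted. The 10 GiB size in the second call is also hypothetical, since the log does not say what size was used with iops = 2000.

```go
package main

import (
	"errors"
	"fmt"
)

// maxIOPSPerGiB is an ASSUMED provider limit on provisioned IOPS per GiB of
// volume size. The value is illustrative only; check the provider's docs.
const maxIOPSPerGiB = 30

// validateIOPS (hypothetical helper) rejects a request whose IOPS exceed
// sizeGiB * maxRatio, so that a bad combination fails early with a clear
// message instead of the block device mapping being silently dropped.
func validateIOPS(sizeGiB, iops, maxRatio uint64) error {
	if sizeGiB == 0 {
		return errors.New("volume size must be greater than zero")
	}
	if limit := sizeGiB * maxRatio; iops > limit {
		return fmt.Errorf(
			"%d IOPS exceeds the %d:1 IOPS-to-size limit for a %d GiB volume (max %d)",
			iops, maxRatio, sizeGiB, limit)
	}
	return nil
}

func main() {
	// The combination reported as working at 07:24.
	fmt.Println(validateIOPS(100, 3000, maxIOPSPerGiB))
	// A hypothetical small volume with a high IOPS request.
	fmt.Println(validateIOPS(10, 2000, maxIOPSPerGiB))
}
```

Passing the ratio as a parameter keeps the check usable if the provider's limit differs from the assumed value.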
[08:15] mattyw: ack
[11:16] voidspace, jamestunnicliffe, TheMue, fix for bug 1416134 - please take a look http://reviews.vapour.ws/r/830/
[11:16] Bug #1416134: Unable to override network-bridge if container type is kvm (local provider)
[11:33] voidspace, http://reviews.vapour.ws/r/831/
[11:43] TheMue, http://reviews.vapour.ws/r/832/
[11:45] thanks for the reviews voidspace and TheMue!
[11:45] dimitern: yw
[12:11] morning
=== lazyPower is now known as lp|Metrics
=== liam_ is now known as Guest51703
[12:58] wallyworld_: still here
[12:58] ?
[13:00] hi, about to sleep, i catch a taxi soon
[13:01] wallyworld_: the short answer is tounit <-- testing purposes
[13:02] sure, but i think the function signature is wrong
[13:02] you mean my findEntity?
[13:02] yeah - why not have it return a StatusSetter
[13:03] wallyworld_: because it needs to satisfy an interface to be used by common statusetter?
[13:03] right, so change that
[13:03] the operation is to call SetStatus
[13:03] the difference is how the thing to set status on is found
[13:04] for machines and units, shared logic can be used
[13:04] for unit agents, the Agent() method is used
[13:04] anyways, have a think about it, i gotta get a few hours sleep before i catch the plane
[13:05] go, have a nice trip
[13:05] i might be off base too, so see if it makes sense looking at the code
[13:05] we can re-discuss this via mail whenever you are more awake, cheers
[13:05] ty, see you later
[14:04] anyone know anything about the proxyupdateworker?
[14:11] wallyworld_: ping if you're still awake at this horrendous hour
[14:13] voidspace, might do a little, I touched it semi-recently
[14:13] voidspace, I am 80% convinced that it's fundamentally dangerous
[14:14] voidspace, but it was a reduce-duplication driveby, not a make-things-right
[14:22] fwereade_: so this worker updates the .juju-proxy file with the http(s) (etc) settings from the config
[14:22] voidspace, yeah
[14:22] fwereade_: so these values are propagated from environments.yaml into .juju-proxy and /home/ubuntu/.profile is updated to source this file
[14:22] voidspace, it's the env vars that that http module uses that worry me
[14:22] fwereade_: however the environment variables are never set in the jujud process
[14:23] fwereade_: so http.DefaultClient (which uses these environment variables) does not see the proxy settings
[14:23] fwereade_: so if they are needed (which is usually why they are set...) everything fails
[14:23] fwereade_: so my intention is that the proxyupdater should also set the environment variables for the process
[14:24] fwereade_: https://bugs.launchpad.net/juju-core/+bug/1403225
[14:24] Bug #1403225: charm download behind the enterprise proxy fails
[14:24] voidspace, proxyupdater.go:151?
[14:24] lol, just found in the code
[14:24] // TODO(fwereade) GAAAAAAAAAAAAAAAAAH this is LUDICROUS.
[14:24] :)
[14:24] fwereade_: ah right, but only the first time - right
[14:25] fwereade_: interesting. I can attach to jujud and see that they are *not* set.
[14:25] voidspace, I think that says "always on first call or if they've changed since last time"?
[14:27] voidspace, best guess is that jujud changes somehow stopped it from being started properly?
[14:27] fwereade_: hmmm... indeed. Well, it doesn't seem to be "working". I'll do some debugging. Thanks.
[14:27] voidspace, I thought I had tests for that, but possibly I screwed them up
[14:28] no problem, I know where I'm looking now - so it should be easy from here ;-)
[14:28] jam, dimitern: We have a licensing bug to address. bug 1416425 which has an easy fix I think
[14:28] Bug #1416425: src/bitbucket.org/kardianos/osext/LICENSE is wrong
[14:28] jam dimitern: Can you ask someone to address the licensing bug
[14:30] voidspace, (also: it's started in both the unit agent and the machine agent, please check both)
[14:30] fwereade_: yep
[14:31] also dimitern , or anyone, why did the license to juju/cmd change?
[14:31] sinzui, i'll have a look
[14:32] sinzui, so is this about using rev 44140c5 in dependencies.tsv for bitbucket.org/kardianos/osext - in 1.23, 1.22, and 1.21 ?
[14:35] dimitern, yes, tip has the fix
[14:36] fwereade_: ftr, the ensuing line is indeed ludicrous
[14:36] cmd := exec.Command("go", "build", "github.com/juju/juju/cmd/jujud")
[14:36] * fwereade_ knows :(
[14:40] sinzui, ok, I'll prepare a fix for it
[14:50] oh boy.. I found a bug with retry-provisioning
[14:51] using it on a failed container in EC2 caused a new *instance* to be provisioned for "0/lxc/0"
[14:52] how do you like this http://paste.ubuntu.com/9957564/
[15:12] dimitern, Sorry, several more licensing bugs were found. Can you find people to look into the issues. they all affect 1.23, 1.22, and 1.21 https://bugs.launchpad.net/juju-core/+milestone/1.21.2
[15:13] * sinzui returns to the broken publish-revision job
[15:14] sinzui, I can fix the osext one, but unfortunately we're close to finishing the sprint here and somebody else can take over
[15:16] alexisb, hey, are you around?
[15:27] does anyone have any tips on getting around a "virbr0": net: no such interface error on bootstrap of local provider on 1.21.1?
[15:28] coreycb, do you have virbr0 on your machine?
[15:28] dimitern, nope
[15:28] dimitern, I know I could create it, but wondering more about if my juju config is wrong
[15:28] coreycb, and you're using container: kvm ?
[15:28] dimitern, yes
[15:29] coreycb, looks like you're affected by bug 1416134 which I just fixed this morning :)
[15:29] Bug #1416134: Unable to override network-bridge if container type is kvm (local provider)
[15:29] dimitern, sweet, I think :)
[15:30] dimitern, thanks
[15:30] coreycb, in the mean time you can either create a virbr0 or use "network-bridge" setting but lxcbr0 there won't work due to that bug
[15:30] dimitern, ok, thanks
[15:31] np :)
[15:33] voidspace, http://reviews.vapour.ws/r/833/, http://reviews.vapour.ws/r/834/, http://reviews.vapour.ws/r/835/ PTAL :)
[15:46] natefinch around?
[15:46] bodie_: no natefinch today
=== kadams54 is now known as kadams54-away
=== benji_ is now known as benji
=== kadams54-away is now known as kadams54
=== lp|Metrics is now known as lazyPower
[19:42] sinzui: ping
[19:42] hi menn0
[19:43] sinzui: so it's hard to know if I've fixed the CI blocker because we seem to be hitting test timeouts for the i386 and ppc unit test runs
[19:43] sinzui: I know my recent changes added a few seconds to the cmd/jujud/agent test run (on my machine at least)
[19:44] sinzui: I wonder if we were close to the 20min threshold on the i386 and ppc CI machines and I've just tipped it over?
[19:45] menn0, I am not sure. but looking at the FAIL: in the tests, these are very different from what we usually see
[19:46] menn0, and the ppc tests are run very quickly
=== kadams54 is now known as kadams54-away
[19:47] sinzui: run-unit-tests-trusty-ppc64el and run-unit-tests-precise-i386 seem to be hitting timeouts I don't see fails there
[19:48] menn0: the fact that we are still seeing panic means something is wrong.
We know that some tests will fail, but pass when retried, but we don't see panics
[19:48] sinzui: since the last fix was committed the only panics I see are about tests taking too long
[19:48] menn0, the 386 tests are ridiculously slow. we cannot get a fast cloud image
[19:49] sinzui: can you point me at another panic?
[19:49] menn0, http://data.vapour.ws/juju-ci/products/version-2286/run-unit-tests-trusty-ppc64el/build-2281/consoleText
[19:49] sinzui: that's because "panic: test timed out after 20m0s"
[19:50] sinzui: go test triggers a panic because the timeout set for the test run was exceeded
[19:50] menn0, but about that 386. We have a fast trusty 386 machine, but the tests don't pass because of two issues. I reported a bug about this. If the two tests are fixed or *skipped* you will see a pass http://data.vapour.ws/juju-ci/products/version-2286/run-unit-tests-trusty-i386/build-164/consoleText
[19:51] bug 1408459
[19:51] Bug #1408459: pingerSuite tests consistenly fail on trusty 386
[19:53] sinzui: ok, I see that and we need to fix that. but that's an unrelated issue to the current CI blocker.
[19:53] sinzui: what I'd like is if we could bump up the test timeout to 25 min or something to see if the panics related to the CI blocker have been resolved.
[19:54] menn0, the machine is healthy
[19:54] and I can see it doesn't need an update
[19:54] menn0, and we can see that 1.21 and 1.22 pass
[19:55] menn0, so I think master + gccgo have a problem that 1.21 doesn't
[19:55] sinzui: I'm not saying there's anything wrong with the machine. I think that the addition of extra tests in trunk recently has pushed us over the 20min timeout for the machine agent tests.
[19:57] menn0, the whole suite when it was passing takes 37 to 44 minutes to run
=== kadams54-away is now known as kadams54
[19:58] menn0, this is the last healthy master run http://juju-ci.vapour.ws:8080/view/Juju%20Revisions/job/run-unit-tests-trusty-ppc64el/2263/
[19:59] disregard the tar timestamp issue.
I fixed that this morning
[20:02] sinzui: ok, that's useful
[20:02] * menn0 checks how the timeout works
[20:02] oh good, because I am running out of things to look for
[20:07] sinzui: are there any plans to get the tests working on windows and osx ?
=== kadams54 is now known as kadams54-away
[20:08] sinzui: ok... so i'm pretty sure the timeout is per Go package and it's looking like on i386 and ppc64 the cmd/jujud/agent package is now taking over 20mins
[20:09] jw4 yes We test the client every revision and we run windows units every revision http://reports.vapour.ws/releases/2286
[20:09] sinzui: that's a massive jump from the last successful runs
[20:09] sinzui: so it's not just tests taking a little bit longer... something is stuck
[20:09] sinzui: but these tests pass fine on amd64
[20:09] menn0, yep
[20:09] sinzui: am I ok to use those hosts later on today
[20:09] ?
[20:10] sinzui: I think I still have the details to access the ppc64 host that you gave me some time ago
=== kadams54-away is now known as kadams54
[20:10] jw4, when there is a way for juju to provision a windows machine and deploy an agent we will add that test to every revision
[20:11] sinzui: cool. That link doesn't work with my ubuntu credentials, probably because I'm not a cougar (<-- is that even a thing ? ;) )
[20:11] menn0, oh, you may have visited a different machine even if you had. these tests are run on stilson-08.
I can add your ssh keys if needed
[20:11] sinzui: I'm just curious how they're being run because on the surface they're not even close to running on my windows and osx machines
[20:12] * menn0 checks
[20:12] jw4, only canonical staff can see the results, but the raw results are public
[20:12] sinzui: kk - tx
[20:12] sinzui: that's what I have access for and I can still get in
[20:13] jw4, the last 2 tests are windows client and units http://juju-ci.vapour.ws:8080/view/Juju%20Revisions/
[20:13] fab
[20:13] sinzui: access to the i386 host might be useful too if that's possible
[20:13] menn0, we spin up an instance for that
[20:13] sinzui: sweet - thanks again
[20:14] sinzui: ok, I can spin up my own then.
[20:14] menn0, but I can get you access to the machine we want to run tests on. again, about that 386 bug. when those tests pass, we will remove the old 386 job
[20:14] sinzui: ok
[20:15] sinzui: hmm the last 'successful' run of win-client-deploy doesn't seem to be actually successful : http://juju-ci.vapour.ws:8080/view/Juju%20Revisions/job/win-client-deploy/1381/console
[20:16] jw4, looks like a successful bootstrap to me
[20:16] sinzui: it probably is - I just see a strange error about 5 lines up from the bottom
[20:17] jw4, The SUCCESS was on the win machine the other lines are from ssh and the test running. we don't want them cursing the win success
[20:17] and ssh did bugger up again
[20:18] sinzui: I see - also I noticed that the run-unit-tests-win2012-amd64 is getting the same '/usr/lib/juju/bin/mongod' file does not exist error that I'm getting on my windows tests
[20:19] sinzui: do you have any details or a script for spinning up the precise and trusty i386 test hosts?
[20:19] sinzui: I want to make sure I replicate as closely as possible
[20:21] menn0, juju-ci-tool/run-unit-tests m1.medium ami-81dee0e8
[20:21] sinzui: perfect. thanks. i've got a few things to take care of right now but I'll jump back on this today.
[20:22] menn0, aws is the only provider that permits 386 and they limit it to a slow machine, and aws is phasing out 386, which is why we need to switch to our special machine
[20:22] sinzui: got it
=== kadams54 is now known as kadams54-away
=== kadams54 is now known as kadams54-away
=== kadams54-away is now known as kadams54
=== kadams54 is now known as kadams54-away
=== kadams54-away is now known as kadams54
=== kadams54 is now known as kadams54-away