/srv/irclogs.ubuntu.com/2014/08/26/#juju-dev.txt

wallyworld_thumper: how long do the cmd/juju tests take for you? on my average laptop, they take about 160s. JujuConnSuite test set up is about 200ms for each00:08
* thumper runs00:08
thumperah...00:08
thumpermy state is somewhat broken right now00:08
wallyworld_the bot seems way slower00:08
wallyworld_i'm pretty sure we're using tmpfs00:09
* thumper scrolls way up00:09
thumper~170s00:09
wallyworld_so that's well within the timeout limit00:09
thumperwe are on fairly high specced machines though00:10
wallyworld_so yes, the tests are bad, but they should run00:10
davecheneywallyworld_: on my laptop it was 400s00:10
wallyworld_wow00:10
thumperI have ssd00:10
wallyworld_are you using tmpfs?00:10
davecheneyyes00:10
davecheneycore i5 thinkpad x22000:10
davecheneythere is a 3-4 second delay between each test run00:10
davecheneyi'm assuming that is setup/teardown00:10
wallyworld_i don't see that00:10
wallyworld_i see 200ms00:10
davecheneydon't look at the times00:10
davecheneygo test -gocheck.v00:11
wallyworld_i did00:11
davecheneywell, your machine is the odd man out00:11
davecheneyit takes 400s on my machine00:11
davecheneyand 570+ in CI00:11
wallyworld_thumper gets the same times as me00:11
davecheneyi only have 4 cores00:11
davecheneyjust like CI00:11
wallyworld_i have an 8 core i700:12
wallyworld_shouldn't be that much faster00:13
davecheneyyes, that is what I said00:13
wallyworld_i'm not sure CI should be blocked on this - it's not a regression00:14
davecheneyi agree00:14
wallyworld_i'll update the bug00:14
davecheneythanks00:14
wallyworld_maybe in the interim we can run CI on a larger instance type00:16
davecheneymaybe00:32
davecheneythe largest that ec2 offer is an 8 core00:32
davecheneyand in my testing00:32
davecheneythat is still slower than my 2 year old core i500:32
davecheneybuilding gccgo as a test00:32
wallyworld_davecheney: thumper: changing the instance type to "c3.2xlarge" (8 core, 16GB) seems to have helped the cmd/juju tests to pass. More data points needed, but -p 2 was faster than full parallelisation for this run. there was a spurious apiserver/client failure which caused -p 2 to run. interestingly, the -p 2 run was faster for many tests this time around01:17
wallyworld_maybe we try running with -p 2 to start with for a bit01:18
perrito666thumper: are you around?01:18
thumperyeah...01:18
perrito666thumper: don't sound so happy01:18
perrito666:p01:18
thumperI'm writing docs/bootstrapping.txt01:18
perrito666thumper: my condolences01:19
thumperafter having to work it out for the umpteenth time...01:19
thumperI thought I'd write it down01:19
perrito666thumper: how savvy are you on menn0's "upgrade mode" ?01:19
thumperish01:19
thumperwhazzup?01:19
perrito666thumper: I am trying to replicate the idea for restore and I would like some overview but apparently I missed menn0 this week01:20
perrito666I worked out a decent part of it01:21
perrito666ok guys brain shutting down, see you all or some of you tomorrow02:04
waigani_cya perrito66602:09
waigani_I'm getting a "state changing too quickly; try again soon" error02:25
davecheneyo_O02:26
waigani_how soon? should I have a cup of tea?02:26
waigani_I'm looping over a collection of state users and adding each one as an environ user via a new AddEnvironmentUser func - maybe I need to add them all as one transaction?02:27
waigani_thumper: ^?02:28
thumperwaigani_: it is lying to you02:42
thumperthat is another error that returns when the assertion fails02:42
thumperit tries a few times02:42
thumperand then assumes that the assertion is due to someone else02:42
thumperin dev, it is most likely you02:42
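A minimal sketch of the retry behaviour thumper describes, assuming a transaction layer that reports a failed assertion with a distinct error. The names errAborted, errContention and runWithRetry are hypothetical and are not taken from the juju source; only the error message is quoted from the log.

```go
package main

import (
	"errors"
	"fmt"
)

// errAborted stands in for the error a transaction layer returns when an
// assertion attached to the transaction does not hold (hypothetical name).
var errAborted = errors.New("transaction aborted")

// errContention carries the message quoted in the log (hypothetical name).
var errContention = errors.New("state changing too quickly; try again soon")

// runWithRetry calls attempt up to maxAttempts times. It retries only when
// attempt reports errAborted; success or any other error is returned at once.
// If every attempt is aborted, the caller is assumed to be racing with
// someone else and gets errContention back.
func runWithRetry(maxAttempts int, attempt func(n int) error) error {
	for n := 0; n < maxAttempts; n++ {
		err := attempt(n)
		if !errors.Is(err, errAborted) {
			return err
		}
	}
	return errContention
}

func main() {
	// An assertion that can never hold, for example because the caller
	// built the operations incorrectly, ends up looking like contention.
	err := runWithRetry(3, func(int) error { return errAborted })
	fmt.Println(err)
}
```

In a loop like this, an assertion that is simply wrong fails on every attempt, so the caller sees the contention message even though nobody else touched the data, which is why the error can be misleading during development.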
thumperwaigani_: did that help?02:53
* thumper goes to turn the coffee machine on02:53
waigani_thumper: sorry I was afk02:53
axwwallyworld__: I've got to fix up my bootstrap tools branch, it's incompatible with your changes to use /var/lib/juju in the containers02:57
axwgah, not sure how I'm going to fix this...02:57
wallyworld__axw: that's ok, i mounted that directory to avoid the need for the container to call out to http (if i am thinking of the right thing)02:58
wallyworld__calling out to http on the host02:58
axwwallyworld__: yes that is the one. in my branch, file:// is treated specially. it means read the file contents locally and then add to the cloud-init script02:58
wallyworld__i did that change because people were seeing errors02:59
wallyworld__this branch i think obsoletes the need for that02:59
wallyworld__since the tools are copied into the container via ssh init02:59
axwwallyworld__: only for bootstrap ,not for containers...02:59
wallyworld__the containers will get from state server03:00
wallyworld__we can handle retries at that point03:00
wallyworld__it was only a short term quick fix03:01
axwwallyworld__: different branch. I guess I can revert it temporarily... I'm checking if we can do better tho. we may be able to load the tools into the userdata for the local provider03:01
wallyworld__could do. i just wanted a quick way to avoid the observed source of the errors03:02
wallyworld__i knew it would be throwaway03:02
axwwallyworld__: can you PTAL: https://github.com/juju/juju/pull/600/commits03:44
wallyworld__sure03:44
axwfrom the 5th commit on03:44
axwwallyworld__: it looks like lxc doesn't have the same limit on userdata size, so if we need to we have the option of serialising the tools in there for all lxc containers03:45
wallyworld__good to know03:46
wallyworld__i think that would be useful03:46
wallyworld__avoid networking calls back to the state server to get the tools03:46
wallyworld__axw: supportedArchitecturesCount is just for testing?03:47
wallyworld__ah nevermind03:48
wallyworld__i thought it was in production code03:48
* thumper needs to work out how to poke inside of mongo03:53
thumperalso noticed some wonderful weirdness in our code03:54
thumperthe password hash of the admin-secret is used as the actual password for the admin user to mongo (which is "machince-0" btw)03:54
* thumper goes to make that coffee03:55
thumperbugger, timer on the machine would have turned it off again by now03:56
thumperugh...04:12
thumperspent the day teasing apart the layers of juju to work out where to put my change04:12
thumperstill not entirely sure... but closer04:13
stokachuwhere does juju log its debugging output when attempting to bootstrap into openstack?04:15
stokachui ran with --debug but it's just sitting at apt-get update http://paste.ubuntu.com/8146668/04:15
thumpergrr...04:20
thumperstokachu: not sure sorry04:20
stokachuno worries04:21
* thumper can't push because master is dirty 04:21
thumperericsnow: you need to fix your pre-push hooks04:21
davecheneythumper: hang on04:36
davecheneymartin fixed that, twice04:36
davecheneyhow did a change get past the bot04:37
thumperwallyworld__: chat?04:37
thumperdavecheney: no idea04:37
wallyworld__thumper: hiya, ok04:37
wallyworld__onyx standup?04:37
thumperhttps://plus.google.com/hangouts/_/g6ga27vzkwgy3dz4s7xly5sivia?authuser=1&hl=en04:37
wallyworld__ok04:38
thumperwallyworld__: https://github.com/juju/juju/pull/60404:47
thumperdavecheney: https://github.com/juju/juju/pull/60505:02
* davecheney looks05:04
davecheneythumper: not LGTM05:04
davecheneydo not use NewXXTag in production code05:04
davecheneyunless you are 100% sure that the tag is valid05:04
thumperdavecheney: the line above it validates05:05
thumperdavecheney: how else should we do it?05:05
davecheneynames.ParseEnvironTag("environment-"+string)05:05
davecheneyif you're sure the tag is valid then LGTM05:05
davecheneybut this is a warning05:06
thumperhmm...05:06
davecheneyhonestly those NewXXTag functions shouldn't be in the package05:06
davecheneythey are a footgun05:06
thumperdavecheney: but they have to be created somewhere, right?05:06
davecheneyin most cases they come as strings on the wire05:06
davecheneykind-id05:06
davecheneyso we use parse05:06
thumperbut something puts them on the wire05:06
davecheneyyup tag.String()05:07
thumperbut something creates the tag05:07
thumperyou have to have trust somewhere05:07
davecheneyno argument there05:07
davecheneybut you should look suspiciously at every case05:07
thumperthe same method is used in state.Initialize05:08
thumperwe have a uuid05:08
thumperthen create an environment tag from it05:08
davecheneysure05:08
davecheneybut you are arguing that using a dangerous weapon is ok because others have used it heaps of times in the past with nothing going wrong05:09
davecheneypast performance is no guarantee of future profit05:09
thumperno, I am saying that this is one of the places where you use the dangerous weapon carefully05:09
davecheneysure05:09
thumperthere are always places where we need to create tags with known data05:09
davecheneyyes05:10
davecheneybut I don't see any validation there05:10
davecheneyyou just take what comes out of the jenv file05:10
thumperno, this isn't the jenv05:10
davecheney+ssInfo, err := st.StateServerInfo()05:10
thumperright, st here is *state.State05:10
davecheneyssInfo should have an envTag field or method05:10
davecheneyno, StateServerInfo is not a *state.State05:11
davecheneyit's some turd that got passed back from the api05:11
thumperno, st is05:11
davecheneyssInfo05:11
thumperStateServerInfo is a method on *State05:11
davecheney+st.environTag = names.NewEnvironTag(ssInfo.EnvUUID)05:11
davecheneyyou have a LGTM with reservations05:11
davecheneythere is no value in grinding on this point05:12
thumperright, here is some ickyness...05:12
thumperwhich we can fix05:12
thumperStateServerInfo is a POD structure05:12
thumperPlain Old Data05:12
thumperall public05:12
davecheneyyeah, this is the same POS that infests the state and the mongo packages05:12
davecheneyand binds them tightly to the _client_ api05:12
* thumper nods05:13
thumperwe should separate the serialization structure from the info structure05:13
davecheneyit's fine for it to be public fields05:13
davecheneyit's returned by value05:13
davecheneywe can't change the copy that state has05:13
thumperbut then adding a method that creates an environ tag from the uuid in the struct is meaningless05:13
thumperas it gives you a false sense of security05:13
thumperwhen there is none05:14
davecheneyobviously we'd remove the envUUID field05:14
thumpernah... I just put it there05:14
davecheneywell shit05:14
thumperthis is all about our shitty data structures05:14
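The disagreement above, constructing a tag directly versus parsing it, can be sketched with a toy type. This is not the juju names package: EnvironTag, NewEnvironTag and ParseEnvironTag below are stand-ins written for illustration, and the UUID-shaped check is an assumption about what "valid" means.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// EnvironTag is a toy stand-in for a tag of the form "environment-<uuid>".
type EnvironTag struct{ uuid string }

// String renders the tag in its wire form, kind-id.
func (t EnvironTag) String() string { return "environment-" + t.uuid }

// validUUID is an assumed validity rule: lower-case hex in 8-4-4-4-12 groups.
var validUUID = regexp.MustCompile(
	`^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$`)

// NewEnvironTag trusts its input completely. It is convenient when the
// caller has already validated the id, and a footgun when it has not.
func NewEnvironTag(uuid string) EnvironTag { return EnvironTag{uuid: uuid} }

// ParseEnvironTag checks both the kind prefix and the id before returning
// a tag, so an invalid string never becomes a tag value.
func ParseEnvironTag(s string) (EnvironTag, error) {
	const prefix = "environment-"
	if !strings.HasPrefix(s, prefix) {
		return EnvironTag{}, fmt.Errorf("%q is not an environment tag", s)
	}
	uuid := strings.TrimPrefix(s, prefix)
	if !validUUID.MatchString(uuid) {
		return EnvironTag{}, fmt.Errorf("%q is not a valid environment UUID", uuid)
	}
	return EnvironTag{uuid: uuid}, nil
}

func main() {
	// The unchecked constructor happily wraps garbage.
	fmt.Println(NewEnvironTag("oops"))

	// The parser rejects the same garbage.
	_, err := ParseEnvironTag("environment-oops")
	fmt.Println(err)
}
```

Both sides of the argument show up here: something has to create the first tag from known-good data, but a constructor that cannot fail moves the burden of validation to every call site.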
=== Ursinha is now known as Ursinha-afk
axwwallyworld__: https://github.com/axw/juju/commit/b56a48d3bd760f9ab58ccada562dd663b1786a0d#commitcomment-752668505:21
wallyworld__rightio05:21
thumperdavecheney: let me ponder this env tag for a bit05:21
axwcan you please take a look at that and see if I'm making sense05:21
thumperbefore I merge it05:21
thumperI'd like to ensure that we do it right05:21
axwwallyworld__: thanks. going to do some more testing before I land, and double check coverage05:26
wallyworld__ok, it's got potential to break things does this branch05:27
wallyworld__jam1: i got the bot "fixed" by throwing a larger instance at it - our tests are still horrible05:28
axwwallyworld__: well it's pretty invasive so yeah... I have done some targeted testing with non-amd64 arch, will test the whole lot before attempting merge tho05:28
wallyworld__ty05:29
dimiternmorning all05:42
jam1wallyworld__:  :(05:56
axwhazmat: in case this got lost in the noise of github activity: https://github.com/juju/juju/pull/59606:07
=== JoshStrobl[ZZZ] is now known as JoshStrobl
TheMuemorning08:10
jammorning TheMue08:31
TheMuejam: seen your mail, could you tell me a bit more about it?08:37
dimiternmorning jam, TheMue08:59
dimiternjam, the meeting will start any minute now :)08:59
TheMuedimitern: heya08:59
dimiternjam, ping09:02
jamdimitern: pong10:18
jamsorry about the delay10:18
jamdimitern: dang it10:18
jamsorry I missed it10:18
jamI have to take the dog out now, will be back in 20 min or so10:18
dimiternjam, no worries, i'll bring you up-to-speed at the standup10:21
jamdimitern: speaking of which :)10:44
jamTheMue: ^^10:45
dimiternbrt10:46
mattywdimitern, ping - when you have a moment10:54
dimiternmattyw, a tentative pong (doing standup now)? :)10:57
=== Ursinha-afk is now known as Ursinha
hazmatdo actions use a hook context? (ie. long running)11:49
hazmatbodie_, fwereade ^11:53
gsamfirahazmat: hi. There does not seem to be any difference between Hooks and Actions aside from location (when it comes to running). https://github.com/juju/juju/blob/master/worker/uniter/context.go#L33011:55
hazmatgsamfira, thanks11:55
gsamfiraglad I could help :)11:58
=== Ursinha is now known as Ursinha-afk
=== Ursinha-afk is now known as Ursinha
TheMueeh, maybe I’m blind, but do we have a github.com/juju/juju/upstart?14:26
TheMuecmd/jujud/machine.go imports it, but I cannot find it (neither can my compiler)14:27
TheMuedimitern: you’re around for a little crosscheck?14:28
TheMuemattyw: ping14:31
mattywTheMue, hey there14:32
TheMuemattyw: could you take a look please? it seems the repo has a problem14:33
alexisbmgz, I am on the hangout whenever you are ready14:33
mattywTheMue, I certainly can but I might not be the best person for the job14:34
mgzalexisb: thanks for the poke14:34
TheMuemattyw: maybe I already found the checkin14:36
TheMueso currently our master won’t build, a package is missing14:37
dimiternTheMue, now I'm here14:37
mattywTheMue, this one yeah? https://github.com/juju/juju/commit/1f7148c5e2ae9f68eb9f8b0c94f6c00b82ee4a1814:38
TheMuedimitern: thx, but found it already.14:38
TheMuemattyw: yeah, exactly14:38
dimiternok14:38
TheMuedimitern: in jujud machine.go imports a non-existing package :(14:39
mattywTheMue, dimitern but it looks like the package isn't used either14:39
dimiternTheMue, what?14:39
TheMuemattyw: yep14:39
TheMuemattyw: I’m only wondering how it passed the bot14:39
mattywTheMue, me too - who's the best person to ask about the bot?14:40
mattywTheMue, the tests run http://juju-ci.vapour.ws:8080/job/github-merge-juju/431/console14:43
mattywTheMue, but that error is in main - do tests on main get run?14:43
dimiternmattyw, TheMue, is this about upstart?14:43
TheMuemattyw: this seems to be the problem14:43
TheMuedimitern: yep14:44
dimiternTheMue, mattyw, it seems juju/upstart moved into juju/service/upstart14:44
dimiternand perhaps wallyworld had goimports installed and juju/upstart code was in GOPATH before juju/service/upstart, and probably the same happened on the bot14:45
TheMuedimitern: hmm, could be the reason14:46
dimiternit happened in https://github.com/juju/juju/commit/190f98fcab118b5dce269e8c0021a563455fee39#diff-88ad1ca7d18fe89a76f6348caf6ddd4214:46
mattywdimitern, makes sense14:47
mattywdimitern, TheMue anyidea how we can stop this from happening next time?14:48
dimiternmattyw, TheMue, it just happened so that code importing the old path was merged last14:48
dimiternhttps://github.com/juju/juju/commit/880aaa83f1a474ef7856f1237c3781ab6a51dbfe14:48
TheMuealso upgrades isn’t used in machine.go14:48
dimiternmattyw, I'm not sure if that's the case, but if it is, then we should look into the bot and see how it does fetch dependencies, etc.14:49
mattywdimitern, where is the code for the bot?14:49
dimiternit might be that ian added the import line manually, rather than using goimports14:50
dimiternmattyw, mgz would know that I guess14:50
mgzwhich bot bit?14:51
mgzmattyw: you want to look at the make-release-tarball script in lp:juju-release-tools14:52
mattywmgz, ok thanks TheMue ^^14:52
TheMueyep14:53
mgzwhat was the symptom exactly? I'm a little confused from the log14:53
mgzwe had broken import that got past the landing?14:53
mgzor didn't get past the landing but did get past the build?14:53
TheMuemgz: cmd/jujud/machine.go imports packages it doesn’t use and that doesn’t exist14:54
mgzI see, on trunk currently.14:54
dimiternwe should file a critical ci blocker bug for that14:55
dimiternso nothing lands until it gets fixed14:55
mgzand blame is on the last rev of master? or an earlier one?14:55
dimiternmgz, last rev14:57
mattywtasdomas, dimitern in other news this is ready for more reviews when you have a moment https://github.com/juju/juju/pull/56214:57
dimiternmattyw, will have a look shortly14:57
mgzoh, I see14:57
mgzgo fmt passes...14:57
mgzand the go build line has gone14:58
mgzdimitern: my suggestion, I land a backout of the last rev14:59
mgzadd `go build ./...` back to the tarball script14:59
dimiternmgz, there seems to be another issue14:59
mgzreland the earlier rev and see that it borks?14:59
dimiternmgz, ../../state/backups/metadata/metadata.go:10:2: cannot find package "github.com/juju/utils/filestorage" in any of:14:59
dimitern/usr/lib/go/src/pkg/github.com/juju/utils/filestorage (from $GOROOT)14:59
dimiternand it's not in dependencies.tsv as well15:00
mgzalso the same rev?15:00
mgzif so, covered by the backout15:00
dimiternmgz, let me check15:00
mgzseems not..15:00
mgzprobably eric's change?15:01
dimiternmgz, yes, on trunk15:01
dimiternmgz, but it's not the same rev I think15:01
mgzyeah, looks like that's ericsnowcurrently backups-storage15:01
dimiternmgz, yep https://github.com/juju/juju/commit/f4da7f542947abb798da7da730a5482a029eee4415:02
mgzso, two borked landings from the build line going... now, why was that removed...15:02
dimiternmgz, so we're not even trying if it builds ? lol..15:02
mgzwell, not at the tarball stage15:04
dimiternmgz, iirc there was a unit test for deps.tsv..15:04
mgzwe do before running the tests, and that's working for some reason15:04
dimiternmgz, it takes like 10 secs - we should do it before running tests, not as late as tarball packaging time i think15:04
mgztar15:04
dimiternah, I see15:04
mgzsorry,15:04
mgztarball build comes before tests15:05
dimiternmgz, hmm.. I wonder why that is15:05
mgzwe get deps and make tarball on the main juju machine15:05
mgzthen send the tarball to a new instance to run the tests15:05
mgzso, the bot *should* still be failing before we run tests, but from the logs it's not for some reason15:05
mgzcan see the line `go build github.com/juju/juju/...` in the landing console log, and it's not got the error15:07
mgza little worried it's not actually testing the right juju at present15:08
mgzhm,15:11
mgzI'm tempted to blame a godeps change15:11
mgznothing on the ci side has changed15:11
mgzheh15:14
mgzokay, got it15:14
dimiternmgz, yep? what is it?15:15
TheMueah?15:16
mgzfor some reason, fixDetachedHead from cmd/go/vcs.go is now getting called, when it wasn't before15:16
mgzand that does checkout master... overwriting the merge15:17
dimiternyay! :D15:17
mgznot sure *what* has triggered this, but can fix at least15:18
dimiternlots of fun15:18
dimiterngodeps perhaps15:18
mgzlets hope it was recent15:18
TheMuestrange kind of error15:18
dimiternit depends on how it gets missing revisions from git - if it does not use fetch but pull it can happen15:18
mgzbecause we've only been testing the current head, rather than the pending merge, for the last few landings at least15:18
TheMueI’ve seen it first when testing the dummy provider. here I got it in github.com/juju/juju/environs/jujutest/livetests.go:124: build command „go“ failed …15:20
dimiternnope, scratch that - godeps uses git fetch, at least in lp:godeps trunk15:20
mgzfor now, I want to just back those two changes out15:20
mgzand eat lunch...15:20
=== kwmonroe_ is now known as kwmonroe
alexisbjcw4, bodie_ we are on the hangout when you guys are ready16:02
jcw4woo hoo16:02
jcw4TheMue: #jujuskunkworks16:08
perrito666hello everyone17:00
mgzhey!17:00
mgzanyone: pr #60917:07
mgzgd17:09
mgzurk17:09
mgzgsamfira, perrito666: ^17:09
perrito666mgz: on which repo?17:13
mgzjuju/juju17:13
* perrito666 receives confirmation from msdn of subscription... I wonder when did I subscribe17:16
perrito666it was at least one month ago17:16
bodie_so what's the deal with upstart and how far back do we have to roll back to get it to build?17:21
gsamfirawell, 2 options. there is one commit that was ported forward, and if we remove that one, it will build17:22
gsamfiraor, we can do a PR, that removes the extra imports and calls agentConfig.Tag().String() instead of agentConfig.Tag() in a couple of places17:23
gsamfiraand it will also build17:23
gsamfirabut I have not investigated the issues related to the second PR that is being reverted by https://github.com/juju/juju/pull/60917:23
gsamfiramgz might have more info on that17:23
perrito666mgz: btw, can https://bugs.launchpad.net/juju-core/+bug/1361721 be reproduced in something other than utopic?17:24
mupBug #1361721: MachineSuite.TestDyingMachine failing frequently <juju-core:Triaged> <https://launchpad.net/bugs/1361721>17:24
mgzperrito666: that's the only job it's on I think, but it's been a dodgy test for a while17:24
mgzbodie_: I'm not sure, which upstart what?17:26
bodie_the failing build on the latest master17:26
gsamfirabodie_ : the upstart package was moved to the service package quite a while ago.17:26
jcw4bodie_: mgz has a revert in the pipeline now17:27
mgzbodie_: my pr should fix the failing build17:27
bodie_ah, great17:27
gsamfiraif there is no other issue with the commits that are being reverted, the change to get it to build without reverting is about 4 lines. I am running the tests now. Should I let them finish and see if that fixes it?17:28
mgzgsamfira: I want to just revert, because the tests were never run on those changes17:31
mgzthen fix the landing before putting in new code17:31
gsamfirafair enough. As you wish. I am running the tests on that code now, with the fix. If you prefer to revert, its fine with me :). I was just offering an alternative that would be shorter17:33
=== jheroux_away is now known as jheroux
=== urulama is now known as urulama-afk
ericsnowperrito666: how's your morning go?19:26
perrito666ericsnow: wonderful19:26
ericsnowperrito666: glad to hear it19:27
perrito666ericsnow: btw, one of your PRs has just been reverted, please contact mgz for more info19:27
ericsnowperrito666: yeah, I saw :(19:27
ericsnowmgz: how exactly was my patch failing?19:29
ericsnowmgz: I'm guessing it's related to updating dependencies.tsv19:30
mgz16:02 < dimitern> mgz, ../../state/backups/metadata/metadata.go:10:2: cannot find package "github.com/juju/utils/filestorage" in any of:19:31
mgz16:02 < dimitern> /usr/lib/go/src/pkg/github.com/juju/utils/filestorage (from $GOROOT)19:31
mgz16:02 < dimitern> and it's not in dependencies.tsv as well19:31
ericsnowmgz: weird19:32
mgzworth trying the merge again and building locally, see if it's actually okay19:33
ericsnowmgz: github.com/juju/utils/filestorage has existed for some time and it's in the revision listed in dependencies.tsv19:33
ericsnowmgz: at least as long as that revision didn't get reverted too19:34
perrito666all: I just sent an email to juju-dev in the thread "getting rid of all-machines.log" your opinion will be greatly appreciated19:36
=== arosales_ is now known as arosales
perrito666ok good news is: I don't need utopic to fix 1.20 tests20:18
mgzace20:19
perrito666sweet, 8G of ram really did the trick for test running20:21
perrito666why cant I see builds before #545 for http://juju-ci.vapour.ws:8080/job/run-unit-tests-utopic-amd64/ ?20:28
perrito666abentley: mgz jog anyone can tell since when has 1.20 -> http://juju-ci.vapour.ws:8080/job/run-unit-tests-utopic-amd64/ been broken? I know it failed for the last revision, but do we know if this is indeed something new20:41
perrito666?20:41
perrito666sorry, shift too close to enter20:41
abentleyperrito666: The last revision that passed was eba6e37f20:42
perrito666abentley: thanks a lot20:43
abentleyperrito666: r6dc9a588 was tested and failed, but I need to check to see whether it was the same failure mode.20:43
abentleyperrito666: Silly me, that's the candidate.20:44
perrito666weird, gitk says r6dc9a588 is not part of 1.2020:44
abentleyperrito666: I'm sorry, the last to pass was eba6e37f.  The way jenkins displays this is confusing.20:46
perrito666abentley: I know, dont worry, confuses me each time20:47
perrito666mm, changes from eba6e37f to head of 1.20 contain a patch mgz just reverted from master20:48
wwitzel3ping ericsnow, perrito66620:51
ericsnowwwitzel3: hey20:51
wwitzel3ericsnow, perrito666: got time for a quick meeting / standup?20:52
ericsnowwwitzel3: sure20:52
wwitzel3ok, going to moonstone20:53
ericsnowwwitzel3: sorry, thought I had joined!20:59
perrito666sorry was in another window, you guys still there?21:17
ericsnowperrito666: nope21:17
ericsnowperrito666: we didn't talk for long21:18
ericsnowperrito666: just a quick recap for Wayne21:18
perrito666well not much from me either I am fixing a bug in 1.2021:18
thumpermorning21:26
perrito666thumper: morning21:27
perrito666mgz: still here?21:27
=== jheroux is now known as jheroux_away
mgzperrito666: yup21:33
perrito666mgz: ah nevermind I was just curious if git revert worked for you21:37
mgzperrito666: it does, but is a little finickity21:38
perrito666mgz: I tried git revert -m 1 hash21:38
perrito666and got all kind of conflicts21:38
perrito666I really expected it to be slightly smarter21:38
mgzare you sure the -m was right?21:39
perrito666I ... I am not sure, I guess it was not given the result, I do not feel the explanation for what -m does was meant to be understood21:40
perrito666would anyone please https://github.com/juju/juju/pull/61021:48
perrito666it fixes https://bugs.launchpad.net/juju-core/+bug/136172121:48
mupBug #1361721: MachineSuite.TestDyingMachine failing frequently <juju-core:Triaged by hduran-8> <https://launchpad.net/bugs/1361721>21:48
mgzwell if it reverted the right stuff, even with conflict pain, presumably21:49
perrito666mgz: I end up doing it by hand way easier21:51
perrito666thumper: cmars you are the ocrs21:51
thumperperrito666: no, that was yesterday :)21:51
perrito666mgz: if you append .diff to the pr in ghub it will yield the patch in plain text21:52
perrito666thumper: ah true, it is still today for me lol21:52
thumperfunny, it is still today for me too21:52
wallyworld_perrito666: why does my branch break that test? the tests pass if run with reduced parallelism so it's likely coincidental that that commit is blamed21:52
perrito666wallyworld_: well it is consistent when I run them and I believe mgz has the same results21:53
wallyworld_they pass for me locally21:54
wallyworld_and the bot or else it wouldn't have landed21:54
perrito666wallyworld_: mm, strange, they fail here and in CI21:54
perrito666wallyworld_: http://juju-ci.vapour.ws:8080/job/run-unit-tests-utopic-amd64/21:55
wallyworld_they will likely pass if the tests are run with reduced parallelism21:55
wallyworld_we have a number of tests that fail without -p 221:55
wallyworld_that test has also failed intermittently previously21:55
* perrito666 does21:56
wallyworld_my branch changes machine agent startup to write the tools version earlier in the startup, so it is hard to see how that could affect that test21:56
wallyworld_once machine agent is up and running, there's no difference21:57
wallyworld_since it only fails on utopic, it is very likely to be a timing issue, which is an issue many tests have sadly21:58
wallyworld_since we tend to use timeouts all over the place rather than channels and signals to coordinate21:59
perrito666wallyworld_: I can reproduce it with trusty21:59
wallyworld_with -p 2?21:59
perrito666wallyworld_: I am running with p221:59
perrito666lets get coffee while we wait :p21:59
wallyworld_i have had that test fail even before my branch landed21:59
mgzwallyworld_: did you see the revert on trunk?22:02
mgzI still havent fully resolved what changed to make the bot pass borked merges, but will have it fixed22:03
wallyworld_mgz: haven't seen that revert yet, let me look22:04
wallyworld_mgz: how did the backup pr break the tests? looks very self contained?22:08
mgzwallyworld_: it may have actually been okay, but dimitern flagged it as dodgy as well22:09
mgzthe dep borked for him22:09
wallyworld_the utils dep? how did it bork?22:10
mgzlack of filestorage package22:12
mgzmay have just been a mistake, I told eric to reland if it built for him locally22:12
mgzI just wanted to back out all suspect changes as the bot had not in fact been testing them22:13
perrito666wallyworld_: tests fail with go test -test.parallel=2 github.com/juju/juju/...22:14
wallyworld_perrito666: same test?22:14
arosaleswallyworld_, mgz abentley: added a comment to https://bugs.launchpad.net/juju-core/+bug/136172122:15
mupBug #1361721: MachineSuite.TestDyingMachine failing frequently <juju-core:Triaged by hduran-8> <https://launchpad.net/bugs/1361721>22:15
perrito666wallyworld_: exact same test22:15
perrito666golang-go                                             2:1.2.1-2ubuntu122:16
perrito666wallyworld_: I can run any other sort of test for you if you want22:16
wallyworld_arosales: i've only just SOD, but will look into the test and try and see what the issue is, and we'll get 1.20.6 out today22:17
wallyworld_perrito666: thanks, i need to look at the test to see where it's failing22:17
perrito666wallyworld_: exact same output as in jenkins22:17
wallyworld_perrito666: and yet the test passed the bot22:18
mgzwallyworld_: you mean, you don't love us all bugging you before breakfast? :)22:19
perrito666wallyworld_: I recall mgz saying earlier that the bot was letting things pass22:19
wallyworld_we have so many tests that fail due to subtle changes in timing due to different instance types etc22:19
wallyworld_mgz: before breakfast is ok, not before coffee :-(22:19
perrito666wallyworld_: let me run one more test and I might be able to give you more info22:20
wallyworld_ok, thanks22:21
=== mup_ is now known as mup
=== allenap_ is now known as allenap
wallyworldperrito666: the test passes for me - can you try running it in isolation?22:38
wallyworldi'm running on an SSD, many of our tests pass more often with fast i/o22:39
perrito666wallyworld: I am running in an ssd too22:39
perrito666everything in this machine is ssdish22:39
perrito666wallyworld: what do you mean in isolation22:40
wallyworldgo test -gocheck.f TestDyingMachine22:40
wallyworldcd to the cmd/jujud package22:40
wallyworldand just run that one test22:40
perrito666wallyworld: running22:40
perrito666it passes22:40
wallyworldyup, so it's just another case of our tests being stupid :-(22:41
perrito666wallyworld: well the tests were written by us so...22:41
perrito666:(22:41
wallyworldperrito666: agreed. there's sooo much that needs fixing22:42
=== Ursinha_ is now known as Ursinha
=== mup_ is now known as mup
=== TheRealMue is now known as TheMue
perrito666wallyworld: well I made a few tries and definitely I am not able to figure out why your patch triggers this failure22:51
wallyworldperrito666: my patch doesn't - this test has failed several times in the past before my patch22:51
perrito666wallyworld: let me rephrase22:51
wallyworldi knew what you meant, sorry :-)22:52
perrito666wallyworld: I believe that your patch somehow triggers our underlying test error, yet I cannot figure out why on the universe22:53
* perrito666 reruns22:53
wallyworldand sadly, it seems to happy just on utopic on CI, on trusty elsewhere, and not for me at all22:53
wallyworlds/happy/happen22:54
perrito666that speaks so bad about the affected piece of code consistency22:55
wallyworldperrito666: if you can get it to fail, maybe try increasing the poll timeout to see if that makes a difference, just to see if the agent will eventually die or is hung22:55
perrito666wallyworld: I believe I tried22:56
perrito666wallyworld: I find somehow interesting the various log entries that state that Open has been called without addresses22:56
* perrito666 swims in seas full of red herrings22:57
wallyworlds/swims/drowns22:58
* perrito666 does as always in case of fish and starts the barbecue grill23:02
perrito666wallyworld: I found it23:09
perrito666MachineSuite is not properly isolated23:09
wallyworldthat and several other test suites :-(23:10
wallyworldwhat particular issue did you find?23:10
perrito666wallyworld: by removing the tests you added to machine_test the whole suite runs23:10
wallyworldit works for me even with those tests23:10
perrito666wallyworld: well I guess I'll have to be the one to find the bug then23:10
wallyworldi'm looking into why the agent is not stopping - well it is stopping according to the logs, but the test doesn't see it23:11
wallyworldtrouble is there's not enough logging23:12
wallyworldin the machine agent Run() method23:12
perrito666wallyworld: what is primeAgent?23:12
wallyworldthat creates a machine and tools and sets up a machine agent23:13
perrito666I have a hunch that at some point a machine is being shared23:14
wallyworldperrito666: maybe, but the logs show the agent dying in response to the machine being marked as dead. it's just that the test doesn't find that out. and for some reason, the agent tries to start again23:18
wallyworldperrito666: would be interesting, if you can get it to fail, to add logging around these lines at the end of machine agent's Run() method23:19
wallyworldif err == worker.ErrTerminateAgent {23:19
wallyworlderr = a.uninstallAgent(agentConfig)23:19
wallyworld}23:19
wallyworlderr = agentDone(err)23:19
wallyworlda.tomb.Kill(err)23:19
wallyworldreturn err23:19
perrito666wallyworld: ok, going23:20
wallyworldperrito666: so it looks like the logic in one of the runners is not detecting that the worker is dying, and is attempting to restart everything23:24
wallyworldthe agent itself correctly notices that the machine is dead, which is what the test is testing for, but the worker doesn't allow the agent to exit23:24
perrito666so this actually is a bug23:26
wallyworldwell, it's a bug somewhere because the test fails when it shouldn't. but not sure where exactly23:28
wallyworldi'm guessing it's in the worker/runner infrastructure23:28
thumperdavecheney: https://github.com/juju/juju/pull/605/files23:31
wallyworldfunc (runner *runner) run() error {   <-- this function in worker/runner.go is noticing that the runner has been stopped but then attempts to restart because it doesn't know that it's deliberate23:31
davecheneythumper: looking23:32
thumperdavecheney: ta23:32
thumperwaigani: https://github.com/juju/juju/pull/519 has a merge conflict with master23:33
perrito666wallyworld: whoa23:34
perrito666wallyworld: I am running it with some more logging23:34
waiganithumper: thanks, looking23:34
wallyworldperrito666: it appears the logs are missing the 'killing "api"' line which means that the api worker is not being killed like it should23:34
wallyworldthat's the worker that is then being erroneously restarted23:35
perrito666wallyworld: good catch23:35
perrito666looking23:35
wallyworldperrito666: func killWorker(id string, info *workerInfo) {   <---- if this is not called, then info.start is not set to nil, so when the worker terminates, it will just be restarted again23:36
wallyworldwhich is not what we want23:36
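[Editor's note: the restart behaviour wallyworld describes can be sketched roughly as below. This is a simplified illustration, not the actual worker/runner.go code: only the names killWorker, workerInfo and the start field come from the discussion above; the runs field, onWorkerDone and the synchronous structure are hypothetical stand-ins.]

```go
package main

import "fmt"

// workerInfo is a simplified, hypothetical stand-in for the runner's
// per-worker record. A nil start means "do not restart this worker".
type workerInfo struct {
	start func() error
	runs  int // hypothetical counter, used only to observe restarts
}

// killWorker marks a worker as deliberately stopped by clearing start.
// The id parameter mirrors the signature quoted in the log; it is unused here.
func killWorker(id string, info *workerInfo) {
	info.start = nil
}

// onWorkerDone (hypothetical) models what the runner does when a worker
// exits: restart it unless killWorker cleared start beforehand.
// It reports whether the worker was restarted.
func onWorkerDone(info *workerInfo) bool {
	if info.start == nil {
		return false // deliberate stop: leave the worker down
	}
	info.runs++
	_ = info.start() // the exit looked accidental, so start it again
	return true
}

func main() {
	noop := func() error { return nil }

	killed := &workerInfo{start: noop}
	killWorker("api", killed)
	fmt.Println("killed worker restarted:", onWorkerDone(killed))

	notKilled := &workerInfo{start: noop}
	fmt.Println("unkilled worker restarted:", onWorkerDone(notKilled))
}
```

[In this sketch, a worker that exits without killWorker having been called is indistinguishable from one that crashed, so it is started again. That matches the symptom described above, where the missing 'killing "api"' log line goes together with the api worker being restarted.]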
perrito666wallyworld: I am quite close to call you23:39
wallyworldi introduced a deliberate error into the test and compared the logs - it seems when it fails, the "deployer" worker is not killed as it should be23:44
wallyworldhmmm, but that's because the deployer is not started23:47
thumperdavecheney: still happy with that? If so, I'll merge it (when landing unblocked)23:49
davecheneythumper: lgtm23:51
davecheneyminor gripes23:51
davecheneybut lgtm23:51
thumperwhat are the gripes?23:54
* thumper looks at that test23:55
davecheneythumper: that's my only comment23:55
davecheneyeverything else looks good23:55
perrito666thumper: well in spanish is the plural for the flu23:56
davecheneyperrito666: lol23:56
perrito666thumper: also your last name pronounced as read in spanish means penis :p </end of trivia>23:57
thumperperrito666: yea, back to high school23:58
perrito666thumper: actually It triggered a very weird look from my wife when I told her your name when I was chatting with you the other night23:58
perrito666and then I realized23:58
