/srv/irclogs.ubuntu.com/2015/07/01/#juju-dev.txt

davecheneymwhudson: nope00:07
davecheneywell i tried to run the tests a while back00:07
davecheneybut a cavalcade of blocking issues have meant I haven't looked at arm64 for going on 6 weeks00:08
mwhudsonfair enough00:08
davecheneythumper: sorry i missed the standup00:33
davecheneythere are two main races remaining00:33
davecheneyone the cert update worker, which I don't think I can fix00:33
davecheneyand a race counting outstanding connections in the api server00:33
davecheneywhich I think I can fix00:33
davecheneyas I'm going to be travelling for the next three weeks, i'd like to talk to you about how to fix the cert update worker00:34
davecheneyit's a big job00:34
davecheneywhen I say fix, it needs to be rewritten00:34
mupBug #1470297 opened: worker/uniter/storage: data race in test <juju-core:New> <https://launchpad.net/bugs/1470297>01:53
menn0thumper: ok I have managed to eventually reproduce 1469199 by doing some very nasty things to mongodb03:24
thumperoh?03:24
menn0hang on03:24
menn0thumper: http://paste.ubuntu.com/11802709/03:25
menn0that's the script that does it03:25
menn0the env that triggered the bug report only has one state server03:25
menn0but mongodb briefly went from PRIMARY to SECONDARY and back again in 1 sec03:26
menn0after a little playing around I found that any change to the replicaset config causes the primary to drop to secondary for a bit03:26
menn0that script keeps causing that to happen03:26
menn0most of the time Juju handles the mongodb disconnect as it should03:27
menn0until it doesn't03:27
menn0I now have an env which is in the same state as described in the bug03:27
menn0no API server running03:27
menn0the api worker is continually trying to get a connection but failing03:28
menn0I actually think it's the state worker which is stuck/broken03:28
menn0possibly related:03:28
menn02015-07-01 03:20:57 DEBUG juju.mongo open.go:122 TLS handshake failed: tls: first record does not look like a TLS handshake03:28
menn02015-07-01 03:20:57 DEBUG juju.mongo open.go:122 TLS handshake failed: local error: unexpected message03:28
menn0that happened shortly before things went bad03:29
menn0might be a red herring though03:29
* menn0 goes to instrument the state and apiserver workers03:29
thumpercould be...03:29
thumperbut then again...03:29
thumpermenn0: have you dropped the idea of internally catching and retrying the 'bad record MAC'?03:29
menn0thumper: I'm not sure if that's the root cause here so I'm not focussing on that at the moment03:30
thumperok03:31
davecheneythumper: http://paste.ubuntu.com/11802397/04:25
davecheneycurrent status04:25
davecheneysome failures when run under -race04:25
thumperdavecheney: did you want to have a chat?04:25
davecheneynah04:25
davecheneyi'm working on the apiserver race04:25
davecheneywe'll talk tomorrow about the cert updater issue04:26
thumperok04:27
thumperdavecheney: I noticed a few other failures in that listing that don't look racy04:28
thumperdavecheney: like the runner_test.go:134: runnerSuite.TestOneWorkerStartWhenStopping one04:28
thumperI might poke that with a stick04:28
davecheneyyes, that is what I mean by some failures04:29
thumperand this one: FAIL: ssh_gocrypto_test.go:137: SSHGoCryptoCommandSuite.TestCommand04:29
davecheneyi should have said "-race triggers some other unrelated failures"04:29
davecheneySHUT04:29
davecheneySHIT04:29
davecheneythat is supposed to be fixed04:29
thumperwhich?04:30
davecheneyoh, ignore04:30
thumperand is it just with -race?04:30
davecheneyi fixed a different issue with that package04:30
davecheneythat is a straight up failure04:30
thumperkk04:30
* thumper fetches his pointy stick04:30
thumper:-(04:32
thumperdavecheney: when I run utils/ssh with -race I get a race04:32
davecheneypastebin ?04:35
davecheneythere was one race that needed the update to gocheck04:35
davecheneyis there another one ?04:35
thumperah... I may not have the latest04:35
thumperI'm in the 1.24 branch04:36
davecheneyshould I backport the gocheck dep fix ?04:36
davecheneycan't hurt04:36
davecheneywe'll be stuck with 1.24 for a while04:36
thumperdavecheney: http://paste.ubuntu.com/11802877/04:39
thumperdavecheney: that is with the gocheck fix as far as I can tell04:39
davecheneyok04:39
davecheneyi've seen that one before04:39
davecheneyit doesn't happen every time04:39
thumperdavecheney: I went into the gopkg.in/check.v1 dir, and did git pull origin v104:39
davecheneyi'll put it on the list04:39
thumperdavecheney: I'm using go 1.4.204:39
menn0thumper: this is a proper heisenbug... I can't trigger it when I add extra logging04:48
thumperyay... NOT04:48
davecheneythumper: https://github.com/juju/juju/pull/269404:49
davecheneyi wont submit this one til roger signs off on it04:50
* thumper looks04:50
davecheneytbh, i'm not 100% on the description of the fix04:50
davecheneybut the race was 100% reproducible before04:50
davecheneyand now it is not04:50
thumperdavecheney: how does this change anything?04:53
thumperI don't get it04:53
davecheneyso, there is a difference between waiting on Dying04:53
davecheneyand waiting on Dead04:53
davecheneyDying happens when you use tomb.Kill04:53
davecheneyDead happens when someone calls tomb.Done04:53
davecheneyso there is a window where some waiting on the tomb, on tomb.Done are still running04:53
davecheneyalso04:54
davecheneytbh, I cannot explain it fully04:54
davecheneybut the logic now is straight forward04:54
davecheneywe only start the defer chain when the listener is shutdown04:54
thumperI would have thought that the wait group will be executed first (LIFO)04:54
davecheney???04:55
thumperdefer srv.wg.Wait() // wait for any outstanding requests to complete.04:55
davecheneyit will be, then tomb.done04:55
thumperright04:55
davecheneybut somehow the http server is accepting a final request04:55
davecheneyi cannot explain how04:55
davecheneybut in this new form, the listener is 100% closed before calling wg.Wait()04:55
thumperseems weird, but ok04:57
davecheneyanyway, i want roger to have a look at it04:57
thumperagreed04:57
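A minimal sketch of the shutdown ordering davecheney describes above (stop accepting first, only then wait for outstanding requests), written against the standard library rather than the tomb package. The `server` type, its `accepting` flag (standing in for the open listener), and the `Serve`, `Kill` and `Wait` methods are all hypothetical names, not the code in the pull request. The one rule it relies on is from the `sync.WaitGroup` documentation: an `Add` that takes the counter up from zero must happen before `Wait`, which is only guaranteed here because intake is closed before `Wait` is called.

```go
package main

import (
	"fmt"
	"sync"
)

// server is a hypothetical stand-in for the API server under discussion.
type server struct {
	mu        sync.Mutex
	accepting bool           // plays the role of the open listener
	wg        sync.WaitGroup // counts outstanding requests
	dying     chan struct{}  // closed by Kill: shutdown requested
	dead      chan struct{}  // closed once shutdown has fully completed
	killOnce  sync.Once
}

func newServer() *server {
	s := &server{
		accepting: true,
		dying:     make(chan struct{}),
		dead:      make(chan struct{}),
	}
	go s.loop()
	return s
}

// Kill requests shutdown and returns immediately (the "dying" state).
func (s *server) Kill() { s.killOnce.Do(func() { close(s.dying) }) }

// Wait blocks until shutdown has completed (the "dead" state).
func (s *server) Wait() { <-s.dead }

// Serve runs handler as one request, or returns false if intake is closed.
func (s *server) Serve(handler func()) bool {
	s.mu.Lock()
	if !s.accepting {
		s.mu.Unlock()
		return false
	}
	s.wg.Add(1) // Add only ever happens while intake is still open
	s.mu.Unlock()
	go func() {
		defer s.wg.Done()
		handler()
	}()
	return true
}

func (s *server) loop() {
	<-s.dying
	s.mu.Lock()
	s.accepting = false // 1. close the "listener": no further Add calls
	s.mu.Unlock()
	s.wg.Wait()   // 2. drain outstanding requests
	close(s.dead) // 3. only now report dead
}

func main() {
	s := newServer()
	release := make(chan struct{})
	s.Serve(func() { <-release })
	s.Kill()
	close(release)
	s.Wait()
	fmt.Println("accepted after shutdown:", s.Serve(func() {}))
}
```

Waiting on `dying` alone would let a caller proceed while a request is still running; waiting on `dead` does not, which is the distinction drawn in the conversation.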
davecheneythumper: https://canonical.leankit.com/Boards/View/115065967/11580814204:58
davecheneywhat do you want to do with this card ?04:58
* thumper looks04:58
thumperI think we can remove it04:58
thumperhaving found a different way to deal with it04:58
davecheneyhow do we record time for a card that was not landed04:59
davecheneyhow do we record time for a card that was not landed ?04:59
davecheneyalso, what's going on with the in review column04:59
davecheneythere is more work there than in progress04:59
thumperdavecheney: we don't care about recording time for a card not landed05:00
davecheneyfairy'nuff05:00
* thumper found a real race in the runner code05:08
thumperwhich only raised its ugly head in the -race test runs05:08
* thumper is submitting05:09
thumperthis *may* be the cause of the timeout we were seeing on ppc05:10
thumperno... don't think so05:10
davecheneyrunner code ?05:11
davecheneya data race05:11
davecheneyor a change in timing ?05:11
thumperhttp://reviews.vapour.ws/r/2073/diff/#05:11
thumpertiming change05:11
thumperbut not when dying05:12
thumpercalling start worker, then stop worker real quick05:12
thumperwill not stop the worker05:12
thumperbecause the worker hasn't told the runner it has started05:12
davecheneyright05:13
davecheneythat is because there is a nil info.Worker field05:13
davecheneyship that shit05:13
davecheneythat code needs a shotgun renovation05:14
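The start-then-stop problem thumper describes can be modelled in a few lines. This is only a sketch under assumptions: the `runner`, `workerInfo` and `workerStarted` names are hypothetical, the model is single-threaded (a real runner would serialise these calls in its own loop), and remembering the stop request is one possible remedy, not necessarily what the linked review does.

```go
package main

import "fmt"

// worker is the minimal hypothetical interface the runner needs.
type worker interface{ Kill() }

// workerInfo tracks one worker; the worker field stays nil until the
// worker has reported back that it started.
type workerInfo struct {
	worker   worker
	stopping bool // a stop was requested before or after start-up
}

type runner struct{ workers map[string]*workerInfo }

func newRunner() *runner { return &runner{workers: make(map[string]*workerInfo)} }

// StartWorker registers the worker; the real start is asynchronous.
func (r *runner) StartWorker(id string) { r.workers[id] = &workerInfo{} }

// StopWorker records the request even when the worker is not running yet,
// so that a quick start/stop pair is not lost.
func (r *runner) StopWorker(id string) {
	info, ok := r.workers[id]
	if !ok {
		return
	}
	info.stopping = true
	if info.worker != nil {
		info.worker.Kill()
	}
}

// workerStarted is called when the worker reports that it is running.
func (r *runner) workerStarted(id string, w worker) {
	info, ok := r.workers[id]
	if !ok {
		w.Kill() // nobody is tracking this worker any more
		return
	}
	info.worker = w
	if info.stopping {
		w.Kill() // honour the stop that arrived while info.worker was nil
	}
}

type fakeWorker struct{ killed bool }

func (w *fakeWorker) Kill() { w.killed = true }

func main() {
	r := newRunner()
	w := &fakeWorker{}
	r.StartWorker("api")
	r.StopWorker("api") // arrives before the worker has reported in
	r.workerStarted("api", w)
	fmt.Println("killed:", w.killed)
}
```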
mupBug #1470345 opened: provider/local: test failure <juju-core:New> <https://launchpad.net/bugs/1470345>05:33
mupBug #1470345 changed: provider/local: test failure <juju-core:New> <https://launchpad.net/bugs/1470345>05:54
mupBug #1470345 opened: provider/local: test failure <juju-core:New> <https://launchpad.net/bugs/1470345>05:57
=== ashipika1 is now known as ashipika
* dimitern TIL: given func f1(strings ...string); f1([]string(nil)...) works just as well as f1([]string{}...)07:18
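A small sketch of the behaviour dimitern mentions. The `describe` helper is made up for illustration. Both spread forms pass zero arguments; the only observable difference is whether the slice seen inside the function is nil.

```go
package main

import "fmt"

// describe reports how many arguments arrived and whether the
// variadic slice itself is nil.
func describe(args ...string) (n int, isNil bool) {
	return len(args), args == nil
}

func main() {
	n, isNil := describe([]string(nil)...)
	fmt.Println(n, isNil) // nil slice passed through unchanged

	n, isNil = describe([]string{}...)
	fmt.Println(n, isNil) // empty but non-nil slice

	n, isNil = describe()
	fmt.Println(n, isNil) // no arguments at all: the parameter is nil
}
```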
=== mup_ is now known as mup
mattywgsamfira, ping?08:58
gsamfiramattyw: pong09:00
voidspacedimitern: ping09:01
fwereadejam, standup?09:02
mattywgsamfira, hey there, I'm just doing a code review (not yours) but there's a suggestion of something that might break windows, so I wanted your thoughts: http://reviews.vapour.ws/r/2030/diff/#09:02
gsamfiramattyw: looking09:02
gsamfiramattyw: Should be fine as far as I can tell.09:09
gsamfiramattyw: should be easy to test if there are any doubts. GOOS=windows go install github.com/juju/juju/...09:10
mattywgsamfira, does it look like a reasonable thing to do in windows?09:10
gsamfiramattyw: I can not promise that errors will be the same in this scenario on both windows and Linux. So while it will probably not error out, you might not catch the error you are expecting.09:12
gsamfiramattyw: also, debuglog is not yet supported on windows.09:13
mattywgsamfira, ah ok09:13
gsamfiramattyw: it relies on tmux and ssh, both of which are missing from windows :)09:14
mattywgsamfira, good point, tmux is the thing I miss most09:14
mattywgsamfira, thanks for your help09:24
gsamfiramattyw: my pleasure :)09:26
dimiternTheMue, hey10:41
TheMuedimitern: yup10:41
dimiternTheMue, here's a sketch of what I think is needed - http://paste.ubuntu.com/11803914/10:41
dimiternTheMue, feel free to modify/simplify it, but it should include all the relevant pieces10:42
dimiternfwereade, have a look if you can whether I missed something? ^^10:43
fwereadedimitern, sure10:43
TheMuedimitern: looks indeed similar to my IPAddressWatcher10:43
TheMuedimitern: only missing the generic part and having a concrete one instead10:43
TheMuedimitern: nice10:43
dimiternTheMue, cool10:44
fwereadedimitern, strong +1 to :19010:45
dimiternfwereade, cheers10:46
fwereadedimitern, TheMue: otherwise looks sane to me10:47
dimiternTheMue, fwereade, in that case the api client-side will treat both EntityWatcher and StringsWatcher apiserver facades the same10:47
dimitern(save for the used facade name)10:47
fwereadedimitern, I think they *are* pretty much the same though -- it's just that an EntityWatcher makes more generally helpful guarantees about the semantic content of its values10:48
fwereadedimitern, the one quibble is whether the client-side one should be returning actual parsed tags10:48
TheMueyep, more than just *any* string10:48
TheMuefwereade: would expect it (parsed tags)10:49
fwereadeTheMue, dimitern: yeah10:49
fwereadeTheMue, dimitern: it would be a shame to copy-paste the client-side strings watcher just for that though -- please avoid duplication where you can10:51
dimiternfwereade, agreed10:52
TheMuefwereade: hey, that's HA by redundancy10:52
fwereadeLOL10:52
dimiternfwereade, that's a fair point, but if we return []names.Tag we can't reuse the client-side strings watcher proxy10:53
fwereadedimitern, TheMue: yeah -- and it may be that golang effectively just forces us to copy-paste anyway10:54
dimiternTheMue, tags are serialized as []string in params, that's intentional as in other cases, but the client-side could very well convert them to []names.Tag10:54
fwereadedimitern, TheMue: but please see what you can do :)10:54
dimiternfwereade, sure10:54
TheMuedimitern: yes, paring on client side would have been my approach. I like to keep the wire protocol as simple as possible10:55
TheMues/paring/parsing/10:56
dimiternTheMue, +110:59
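One way to get parsed tags on the client side without copy-pasting the strings watcher is to wrap its channel. The sketch below assumes a lot: `Tag` and `parseTag` are simplified stand-ins for `names.Tag` and the real tag parser (here a tag is just "kind-id", split at the first hyphen), and `entityChanges` and `change` are hypothetical names. The wire format stays `[]string`, as agreed in the discussion.

```go
package main

import (
	"fmt"
	"strings"
)

// Tag is a simplified, hypothetical stand-in for names.Tag.
type Tag struct{ Kind, ID string }

// parseTag splits "kind-id" at the first hyphen. The real parser
// validates far more than this.
func parseTag(s string) (Tag, error) {
	i := strings.Index(s, "-")
	if i <= 0 || i == len(s)-1 {
		return Tag{}, fmt.Errorf("%q is not a valid tag", s)
	}
	return Tag{Kind: s[:i], ID: s[i+1:]}, nil
}

// change is one batch of parsed tags, or the error that stopped parsing.
type change struct {
	Tags []Tag
	Err  error
}

// entityChanges adapts a strings-watcher style channel into parsed tags.
// It stops after the first batch that fails to parse. The consumer is
// expected to keep reading until the returned channel is closed.
func entityChanges(in <-chan []string) <-chan change {
	out := make(chan change)
	go func() {
		defer close(out)
		for raw := range in {
			tags := make([]Tag, 0, len(raw))
			var err error
			for _, s := range raw {
				var t Tag
				if t, err = parseTag(s); err != nil {
					break
				}
				tags = append(tags, t)
			}
			if err != nil {
				out <- change{Err: err}
				return
			}
			out <- change{Tags: tags}
		}
	}()
	return out
}

func main() {
	in := make(chan []string, 1)
	in <- []string{"machine-0", "unit-mysql-0"}
	close(in)
	for c := range entityChanges(in) {
		fmt.Println(c.Tags, c.Err)
	}
}
```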
fwereadedoes anyone know if there's a time when we'd call state.Open *without* knowing the environment and state server uuids?10:59
fwereadebecause there's this really bloody inconvenient bit where we call .StateServingInfo somewhere inside Open, and use it to fill in missing fields11:01
* TheMue is afk, lunchtime11:02
fwereadeand I'd like to pass it in from outside, but this branch is a monster already, so if anyone knows why it's a bad idea before I start, please tell me :-/11:03
fwereadebut, for now, lunch sounds good :)11:03
dimiternfwereade, AFAIK for backwards compatibility with older versions11:04
dimiternfwereade, but that shouldn't make the interface awkward to use and test IMO11:05
mattywfwereade, 2 minutes?11:08
mattywbogdanteleaga, is this still needed? https://github.com/juju/charm/pull/4711:12
bogdanteleagamattyw: nope, they seem to be already updated11:13
bogdanteleagamattyw: somehow :)11:13
mattywbogdanteleaga, magic :)11:13
voidspacedimitern: generating devices from a template isn't too hard11:49
voidspacedimitern: https://code.launchpad.net/~mfoord/gomaasapi/devices/+merge/26337011:49
voidspacedimitern: see newDeviceHandler11:49
voidspacedimitern: little bit of work needed on that (generating a proper system id), but should be plain sailing from here11:50
dimiternvoidspace, awesome!12:00
mattywfwereade, sorry do you have 2 minutes?12:05
tasdomasfwereade, ping?12:35
dimiternvoidspace, ping13:11
voidspacedimitern: pong13:23
mupBug #1455623 changed: TestPingCalledOnceOnlyForSeveralWorkers fails <ci> <intermittent-failure> <test-failure> <juju-core:Fix Released by thumper> <juju-core 1.23:Won't Fix> <juju-core 1.24:Fix Released by thumper> <https://launchpad.net/bugs/1455623>13:25
dimiternvoidspace, did you manage to sort out your issue with maas 1.9 ?13:26
voidspacedimitern: no, still talking to Raphael about it13:27
voidspacedimitern: a migration got changed in the daily builds13:27
voidspacedimitern: which screwed things up13:27
voidspacedimitern: we're on step 5 of trying to sort it out13:27
dimiternvoidspace, I see13:27
dimiternvoidspace, I hope it works then :)13:27
voidspaceme too :-/13:28
mupBug #1463910 changed: Upgrade tests timeout on ppc64 <ci> <gccgo> <intermittent-failure> <regression> <juju-core:Fix Released by thumper> <juju-core 1.24:Fix Released by thumper> <https://launchpad.net/bugs/1463910>13:43
mupBug #1470526 opened: Juju bootstrap with local provider fails on precise <juju-core:New> <https://launchpad.net/bugs/1470526>14:25
=== Ursinha_ is now known as Ursinha
=== hazmat_ is now known as hazmat
=== cppforlife__ is now known as cppforlife_
=== kadams54_ is now known as kadams54-away
katcoericsnow: meeting16:02
ericsnowkatco: just when I thought it was safe to go back in the water :)16:03
katcoericsnow: haha sorry =/16:03
=== kadams54-away is now known as kadams54_
katcoericsnow: you have 2 ship its16:43
ericsnowkatco: thanks!16:43
katcoericsnow: i'm going to place a card on our kanban to ensure we revisit the state stuff16:44
katcoericsnow: we don't want that to fall off our radar16:44
ericsnowkatco: good idea16:44
katcoericsnow: also: http://reviews.vapour.ws/r/2075/16:46
ericsnowkatco: I'll take a look16:46
=== liam_ is now known as Guest12050
mupBug #1470601 opened:  UniterSuite.TestLeadership fails on windows <blocker> <ci> <regression> <unit-tests> <juju-core:Triaged> <https://launchpad.net/bugs/1470601>18:05
=== kadams54_ is now known as kadams54-away
natefinchericsnow: you around?18:35
ericsnownatefinch: yep18:35
natefinchericsnow: I noticed that the ListProcesses state method (as I last saw it) is effectively a bulk call into state... but it doesn't seem to have a way to return multiple errors.  How will it handle getting called with a mix of valid and invalid IDs, for instance?18:35
ericsnownatefinch: it just returns the ones it found18:36
ericsnownatefinch: it's up to the caller to sort that out18:36
ericsnownatefinch: FYI, I've already fixed that up in my "adjustments" patch18:36
=== \b is now known as benonsoftware
wwitzel3ericsnow: so if i propose my branch again the feature branch, will the RB review be made for me?19:04
wwitzel3s/again/against19:05
ericsnowwwitzel3: yep19:05
wwitzel3ericsnow: I need to add a few more, but no reason it can't be up for review while I take care of that.19:09
ericsnowwwitzel3: k19:10
wwitzel3tests that is19:10
wwitzel3well maybe I do, I don't know19:10
wwitzel3I think I'm actually exercising all of the client paths .. oh error paths, I need to do those19:10
wwitzel3ok, yeah19:10
wwitzel3good talk19:10
ericsnowwwitzel3: :)19:10
mupBug #1470220 changed: Juju-deployer incorrectly reporting errors in juju 1.24.0 environment <juju-core:New> <juju-deployer:New> <juju-gui:New> <https://launchpad.net/bugs/1470220>19:20
wwitzel3natefinch: ping19:37
wwitzel3natefinch: the card I was doing, the API client abstractions, is the card you are doing :)19:38
wwitzel3natefinch: I think we just copied too many of the server cards19:38
natefinchwwitzel3: oops!19:38
wwitzel3natefinch: my branch is up, I just have to add one more method and a couple tests, but the PR is up19:39
natefinchwwitzel3: I'll take a look and see if I have anything to add19:39
natefinchwwitzel3: that looks great, and pretty much just what I was doing.19:41
wwitzel3natefinch: actually I can ask you, I didn't see a stand alone Status on the API server side?19:43
wwitzel3natefinch: I am assuming ListProcess is filling that role?19:43
natefinchwwitzel3: I was just exposing what ericsnow had in state... I think list process is the thing, though maybe we want a shortcut for a brief status message19:45
ericsnownatefinch, wwitzel3: yeah, we might want that19:45
ericsnownatefinch, wwitzel3: we can tackle that when we add support to juju status19:46
natefinchericsnow: is your state work merged into the feature branch?19:47
ericsnownatefinch: not yet19:47
ericsnownatefinch: multi-environment stuff is killing me19:47
natefinchericsnow: is there anything I can do to help?19:48
ericsnownatefinch: I don't think so19:49
ericsnownatefinch: I've almost got it19:49
natefinchericsnow: awesome19:51
thumperbogdanteleaga: ping20:20
bogdanteleagathumper: pong20:20
thumperbogdanteleaga: hey there20:20
thumper"2015-07-01 17:27:20 WARNING utils.featureflag flags_windows.go:34 Failed to open juju registry key HKLM:\\SOFTWARE\\juju-core; feature flags not enabled\n"20:20
thumperI don't think we should emit warnings if the registry key isn't there20:20
thumperis there a difference  between non existent and can't open?20:21
bogdanteleagayes20:21
thumperbogdanteleaga: this made master fail CI BTW20:21
bogdanteleagaI got a fix for it though20:21
thumperawesome20:21
bogdanteleagahere you go https://github.com/juju/juju/pull/269920:22
bogdanteleagathumper: since you're here a fast review wouldn't hurt :p20:23
thumperbogdanteleaga: looking now20:23
thumperbogdanteleaga: shipit20:27
bogdanteleagathumper: cool, thanks20:28
thumperbogdanteleaga: np20:28
thumpersinzui: I'm assuming the upgrade testing for 1.24.2 all went well?20:32
sinzuithumper: sorry? I don’t know which testing that is20:32
thumpersinzui: didn't you say that you were going to make sure that the proposed release goes through additional CI testing around upgrades?20:33
sinzuithumper: yes, I did the 1.22.6 -> 1.24.220:33
thumpercool20:33
thumperI was so happy to finally see that bless come through yesterday evening20:34
bogdanteleagasinzui: I tried doing an upgrade from 1.25 to 1.26 using zip for 1.26 and metadata taken from simplestreams on a webserver set up by me; can you think of anything else I should test?20:34
sinzuibogdanteleaga: 1.24.x -> 1.26.x. I expect a message that there are no candidates (no crash). Also, we need to bootstrap juju 1.18.1 and 1.20.11 and confirm they do not choke on streams with zips20:37
natefinchwwitzel3: got you a review on the client stuff. There's a couple pretty important changes that need to be made.20:48
wwitzel3natefinch: thanks, the second comment, not sure what you mean? Isn't it up to the caller to inspect the results for errors?20:50
wwitzel3natefinch: oh, I see what i did, heh20:51
wwitzel3I stripped them out before the caller gets a chance to act on them20:52
natefinchyeah, I think we can just return the API objects that the API calls return.. maybe stripped of an extraneous containing struct (like if the struct is just holding a slice of something)20:52
natefinchahh hmm.... error in the code   ProcessResults  should not have an Error value itself20:53
natefinchor maybe ericsnow added that on purpose20:53
natefinchthough I kind of thought that was what the error return from the API call was for20:53
ericsnownatefinch: yeah, I added that on purpose20:54
wwitzel3natefinch: ProcessResult or ProcessResults?20:54
ericsnowmy understanding is that the error return is more for errors in the machinery, not in the handling logic of the request20:55
natefinchericsnow: other bulk calls don't seem to have an Error on the top level result20:59
ericsnownatefinch: k21:00
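The convention natefinch points to (one result per requested ID, each carrying its own error, with the function's error return kept for failures of the machinery) might look like the sketch below. The field names, the `listProcesses` helper and the map-backed store are assumptions made for illustration, not the types in the branch under review.

```go
package main

import (
	"errors"
	"fmt"
)

// ProcessInfo is a hypothetical payload for one workload process.
type ProcessInfo struct{ ID, Status string }

// ProcessResult is the outcome for a single requested ID.
type ProcessResult struct {
	ID    string
	Info  *ProcessInfo
	Error error // set when this particular ID could not be served
}

// listProcesses returns one result per requested ID, in request order.
// The error return is reserved for failures of the call as a whole.
func listProcesses(store map[string]ProcessInfo, ids ...string) ([]ProcessResult, error) {
	if store == nil {
		return nil, errors.New("process store unavailable")
	}
	results := make([]ProcessResult, len(ids))
	for i, id := range ids {
		results[i].ID = id
		info, ok := store[id]
		if !ok {
			results[i].Error = fmt.Errorf("process %q not found", id)
			continue
		}
		results[i].Info = &info
	}
	return results, nil
}

func main() {
	store := map[string]ProcessInfo{"db/0": {ID: "db/0", Status: "running"}}
	results, err := listProcesses(store, "db/0", "missing/1")
	if err != nil {
		fmt.Println("call failed:", err)
		return
	}
	for _, r := range results {
		if r.Error != nil {
			fmt.Println(r.ID, "error:", r.Error)
			continue
		}
		fmt.Println(r.ID, "status:", r.Info.Status)
	}
}
```

With this shape a client wrapper can hand the per-item results straight to the caller, so a mix of valid and invalid IDs stays visible instead of being stripped out.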
natefinchI have to go make dinner, but I'll be on later.21:00
=== natefinch is now known as natefinch-afk
=== kadams54 is now known as kadams54-away
fwereadethumper, I'm here if you're free21:32
thumperfwereade: otp now, release call21:32
fwereadethumper, np, ping me when you're free, I may or may not be around :)21:34
davechen1ythumper: there are lots of calls to os.Exit(2) inside cmd/go21:42
davechen1ythe error could be coming from there21:42
davechen1yhttps://bugs.launchpad.net/juju-core/+bug/147060121:52
mupBug #1470601:  UniterSuite.TestLeadership fails on windows <blocker> <ci> <regression> <unit-tests> <juju-core:Fix Committed by bteleaga> <https://launchpad.net/bugs/1470601>21:52
davechen1yhow long until the build is unblocked ?21:52
katcowwitzel3: meeting22:34
=== kadams54 is now known as kadams54-away
perrito666anastasiamac:  axw I am going to bail today the context switch would be really expensive today22:57
anastasiamacperrito666: nps :D could u send a briefing email? tyvm :D22:58
=== kadams54-away is now known as kadams54
perrito666anastasiamac: I will22:59
anastasiamacperrito666: \o/ have fun!22:59
perrito666anastasiamac: I actually am having fun :) txx23:00
* perrito666 is having the fun of finally cracking a problem23:00
perrito666completely unrelated question, is anyone using vim and a decent buffer switcher23:01
wwitzel3katco: shoot, sorry23:12
wwitzel3katco: still going?23:12
wwitzel3katco: sorry, I'm sure the plan was "Wayne will do everything" since I wasn't there .. I deserve that23:15
ericsnowwwitzel3: FYI, I'm merging the state patch right now23:16
wwitzel3ericsnow: good deal23:16
axwwwitzel3: you'll be manning the booth by yourself wed-fri, we'll do the rest23:16
axw;)23:16
wwitzel3axw: haha23:16
wwitzel3axw: yeah, I finished a meeting for the dockercon wrapup doc, and in my brain, that was my late meeting, so I ran to the store23:17
wwitzel3just completely spaced23:17
=== kadams54 is now known as kadams54-away
axwwwitzel3: nps, otp, will fill you in later23:18
wwitzel3axw: thanks23:18
thumpersinzui: we have a problem: http://data.vapour.ws/juju-ci/products/version-2844/kvm-deploy-trusty-amd64/build-1206/consoleText23:19
thumpersinzui: there are conflict markers in the main CI branch code23:20
sinzuithumper: we have just fixed it and queued the retest23:20
thumpersinzui: cool23:20
sinzuithumper: and it is just on the kvm-slave23:20
thumpersinzui: I was trying to get a jump on looking at master CI :)23:20
mgzthumper: yeah, my bad, I forgot I needed to revert a hack before doing the real landing23:20
thumperha23:21
thumpernp23:21
cheryljwwitzel3: we are putting the hardware requirements for the demo on you, though.  So if we need any systems, monitors, etc, get that info to alexisb by tomorrow.23:24
alexisbheh thumper you really want to see that bless :)23:25
thumperoh ye23:27
thumpers23:27
