/srv/irclogs.ubuntu.com/2016/07/20/#juju-dev.txt

thumperdog walk, then addressing review comments...00:44
* thumper afk for a bit00:44
redirmanana juju-dev01:21
wallyworldmenn0: a small one, fixes 2 blockers, when you have a moment http://reviews.vapour.ws/r/5276/01:41
menn0wallyworld: looking01:43
menn0wallyworld: ship it01:45
wallyworldmenn0: ta01:45
anastasiamacmenn0: axw beat u to it too :D01:45
menn0wallyworld: I think I've figured out what's going on with https://bugs.launchpad.net/juju-core/+bug/160451402:02
mupBug #1604514: Race in github.com/joyent/gosdc/localservices/cloudapi <blocker> <ci> <joyent-provider> <race-condition> <regression> <juju-core:In Progress by menno.smits> <https://launchpad.net/bugs/1604514>02:02
menn0it's certainly not a new issue02:02
menn0and I really don't think it should be a blocker02:02
wallyworldyeah, i'd be surprised if it were02:02
menn0I think the problem is that the joyent provider destroys machines in parallel02:03
wallyworldit's not a regression02:03
wallyworldi'm surprised it was marked as such02:03
menn0but the joyent API test double isn't safe to access concurrently02:03
wallyworldsounds plausible02:03
menn0the correct place to fix it is in the test double but that's not our code02:04
wallyworldyep, i think we can unmark as a blocker and figure out what to do from there02:04
wallyworldwe may need to pull in that external code, as i doubt we will get it fixed02:05
menn0wallyworld: ok, i'll update the ticket so it's no longer blocking02:05
menn0wallyworld: and then I'll poke it some more to see if I can figure out a fix02:06
menn0wallyworld: I can /occasionally/ reproduce the race if I use dave's stress script02:06
wallyworldmaybe there's a workaround in the non-test code, but it would be better to fix upstream i guess02:07
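
The race menn0 and wallyworld are narrowing down here (a provider tearing down machines from several goroutines while the test double keeps unguarded state) is conventionally fixed by serialising access to the double with a mutex. A minimal sketch of that shape, assuming an illustrative in-memory double rather than gosdc's actual localservices code:

```go
package cloudapi

import "sync"

// fakeCloudAPI stands in for a test double like gosdc's
// localservices/cloudapi: it tracks machines in a map, which races
// when the provider destroys machines from parallel goroutines
// unless every accessor holds the mutex.
type fakeCloudAPI struct {
	mu       sync.Mutex
	machines map[string]bool
}

func newFakeCloudAPI() *fakeCloudAPI {
	return &fakeCloudAPI{machines: make(map[string]bool)}
}

func (api *fakeCloudAPI) CreateMachine(id string) {
	api.mu.Lock()
	defer api.mu.Unlock()
	api.machines[id] = true
}

func (api *fakeCloudAPI) DeleteMachine(id string) {
	api.mu.Lock() // serialises concurrent deletes from parallel StopInstance calls
	defer api.mu.Unlock()
	delete(api.machines, id)
}
```
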
stokachumenn0: im still seeing https://bugs.launchpad.net/juju-core/+bug/160464402:13
mupBug #1604644: juju2beta12: E11000 duplicate key error collection: juju.txns.stash <blocker> <conjure> <mongodb> <juju-core:Triaged> <https://launchpad.net/bugs/1604644>02:13
stokachujust fyi02:13
menn0stokachu: that's the issue xtian was looking at02:13
stokachumenn0: this one was https://bugs.launchpad.net/bugs/159382802:14
mupBug #1593828: cannot assign unit E11000 duplicate key error collection: juju.txns.stash <ci> <conjure> <deploy> <intermittent-failure> <oil> <oil-2.0> <juju-core:Fix Released by 2-xtian> <https://launchpad.net/bugs/1593828>02:14
stokachuand it was marked fixed02:14
menn0stokachu: they're the same issue (dup)02:15
menn0stokachu: which version of Juju are you using? I think it was only fixed very recently (not sure exactly when though)02:16
stokachumenn0: correct, i opened a new issue as the previous issue was marked fix released02:16
stokachuBug #1604644: juju2beta12: E11000 duplicate key error collection: juju.txns.stash02:17
mupBug #1604644: juju2beta12: E11000 duplicate key error collection: juju.txns.stash <blocker> <conjure> <mongodb> <juju-core:Triaged> <https://launchpad.net/bugs/1604644>02:17
stokachujuju beta 1202:17
* menn0 checks when the fix went in02:17
stokachubeta12 lol02:18
thumpermenn0: perhaps the patch approach didn't work?02:19
mupBug #1589471 changed: Mongo cannot resume transaction <canonical-bootstack> <juju-core:Invalid> <https://launchpad.net/bugs/1589471>02:20
menn0stokachu, thumper: nope the fix didn't make beta1202:20
mupBug #1604641 opened: restore-backup fails when attempting to 'replay oplog' again <backup-restore> <blocker> <ci> <regression> <juju-core:Triaged> <https://launchpad.net/bugs/1604641>02:20
mupBug #1604644 opened: juju2beta12: E11000 duplicate key error collection: juju.txns.stash <blocker> <conjure> <mongodb> <juju-core:Triaged> <https://launchpad.net/bugs/1604644>02:20
stokachulmao02:20
stokachuit got marked fix released02:20
menn0the fix is here: 99cb2d1c148f5ed1d246bf4fe44064363226e12e (Jul 15)02:20
menn0it's not in beta1202:20
stokachumenn0: can you update that bug with your findings02:21
stokachu160464402:21
menn0stokachu: will do. shall I also mark it as a dup of the other one?02:21
stokachumenn0: the other bug is already marked fix released02:22
thumpermenn0: I thought the patch was applied to the top of our mgo branch02:22
stokachui think we should leave that one alone and work off this new one02:22
thumpermenn0: check with mgz and sinzui02:22
thumperand balloons I suppose02:22
stokachusinzui: ^ they are saying it didnt make it in02:22
stokachumake it in beta1202:22
menn0thumper: no, it looks like we copied a fixed version of mgo's upsert code into juju02:22
menn0ah crap... chrome crash02:24
menn0thumper: oh never mind, you're right we patch over mgo in the build02:26
menn0thumper: at any rate, that change isn't in beta1202:26
thumperWhich was the release we just did? It should be in that02:26
thumperif stokachu is building from source, he won't have it02:27
stokachuthis is from the ppa02:27
thumperhmm...02:27
thumperthat should have the fix02:27
menn0thumper: the latest tag in git is "juju-2.0-beta12"02:27
menn0the fix is 99cb2d1c148f5ed1d246bf4fe44064363226e12e02:27
menn0when I check out the tag, the fix isn't there02:27
menn0when I check out master, it is02:28
thumperugh02:28
stokachuim guessing a one-off was done for this issue02:28
stokachu?02:28
menn0perhaps there was some miscommunication about when the release was ok to cut02:29
lazyPowerbooo, that was in the release notes too02:29
lazyPowermgo package update that retries upserts that fail with ‘duplicate key error’ lp159382802:29
lazyPowerspeaking of o/ hey core team :02:29
lazyPower:)02:29
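
The release note lazyPower quotes describes the shape of the missing fix: wrap mgo upserts in a retry loop that tolerates the transient E11000 duplicate-key error raised when concurrent upserts collide. A hedged sketch using the mgo.v2 API; the helper name and retry count are illustrative, not the actual patch:

```go
package mongoutil

import (
	mgo "gopkg.in/mgo.v2"
	"gopkg.in/mgo.v2/bson"
)

// upsertWithRetry retries an upsert a few times when mongo reports a
// duplicate-key error (E11000), which can happen transiently when two
// upserts race to insert the same key.
func upsertWithRetry(c *mgo.Collection, selector, update bson.M) (err error) {
	for attempt := 0; attempt < 5; attempt++ {
		_, err = c.Upsert(selector, update)
		if err == nil || !mgo.IsDup(err) {
			return err
		}
	}
	return err
}
```
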
stokachuso we're sure that fix isn't in beta 12 from the ppa?02:31
stokachubecause it's also uploaded to the archive :)02:32
menn0stokachu: pretty sure. the release tag is there in git, and the fix isn't part of that release.02:32
stokachumenn0: ok, if you don't mind updating that bug so i can follow up with balloons/mgz in the morning02:33
menn0awesome :(02:33
menn0stokachu: will do. i'll poke xtian too so he's in the loop02:33
stokachumenn0: ok cool thanks a bunch02:33
sinzuimenn0: thumper: The patch was added to the juju tree, and the script that makes the tar file applies it. that is the hack that mgz put together02:33
thumpersinzui: looks like something didn't take though02:33
thumpero/ lazyPower02:34
axwwallyworld: I'm planning to add this to the cloud package: http://paste.ubuntu.com/20129296/. one of those will be present in a new environs.OpenParams struct. sound sane?02:34
menn0sinzui: it looks like the rev didn't make the cut of the release.02:34
axwsound/look02:34
sinzuiYeah, that is a bad way to deliver a fix02:34
wallyworldaxw: looking02:34
menn0sinzui: what to do now?02:34
axwwallyworld: open to suggestions for a better name also02:35
sinzuimenn0: I have no idea. I think godeps should define the repo and rev. Otherwise we continue to maintain the patch in the tree and apply it each time the tar file is made02:35
menn0sinzui: the immediate problem is that beta12 didn't include the fix at all. the revision with the fix was committed *after* beta12 was cut.02:36
menn0sinzui: the mgo patch doesn't exist in beta1202:37
stokachuwe should amend the release notes and set the fix for beta1302:37
wallyworldaxw: i don't think that struct belongs in cloud - it's an amalgamation of things used for an environ so it really belongs in environs02:37
wallyworldand then it could be called CloudSpec02:37
wallyworldor something02:37
sinzuimenn0: I cannot help at this point. The release was started, we aborted, and tried again.02:37
stokachuso can the mgo fix be pulled into godeps now?02:38
stokachuwhat was the reason for applying the fix during the tarball build02:38
thumperanastasiamac: while you are doing virt-type fixes, core/description/constraints_test.go:25, the virt type needs to be added to the allArgs func02:39
axwwallyworld: yeah ok, that's what I had to start with. issue is how to then make State implement EnvironConfigGetter. I think I'll have to define a type outside of the state package that adapts it to that interface02:39
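
A sketch of the design being discussed: a CloudSpec living in the environs package (not cloud), gathered into OpenParams, with a small interface that an adapter around *state.State could implement from outside the state package. Every name here is an assumption drawn from the conversation, not the code that eventually landed:

```go
package environs

import (
	"github.com/juju/juju/cloud"
	"github.com/juju/juju/environs/config"
)

// CloudSpec bundles the cloud-specific values an Environ needs at
// open time; the field set is illustrative.
type CloudSpec struct {
	Type       string
	Name       string
	Region     string
	Endpoint   string
	Credential *cloud.Credential
}

// OpenParams carries a CloudSpec alongside the model config.
type OpenParams struct {
	Cloud  CloudSpec
	Config *config.Config
}

// EnvironConfigGetter is the interface axw mentions; a thin adapter
// defined outside the state package can wrap *state.State to satisfy it.
type EnvironConfigGetter interface {
	ModelConfig() (*config.Config, error)
	CloudSpec() (CloudSpec, error)
}
```
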
wallyworldstokachu: the reason was we don't control upstream and we could not get the fix landed for us to use02:40
wallyworldso we were forced to adopt a solution where the change was patched in as part of the build02:40
stokachuwallyworld: so the fix was pulled in before the PR was accepted?02:40
thumperstokachu: more complicated than that...02:40
stokachuah ok02:40
stokachujust trying to understand02:41
thumperrelated to golang, imports and the mgo release process02:41
wallyworldstokachu: no, the upstream PR was unaccepted but it was landed in an unstable v2 branch which we could not use directly02:41
wallyworldit's all a mess02:41
thumpers/unaccepted/accepted/02:41
stokachuok, but the status in master is it is now part of the tree?02:41
wallyworldno :-(02:41
thumperkinda02:41
wallyworldnot that i am aware of02:41
thumperbut poorly02:41
thumperwallyworld: it is in a patch...02:42
thumperin the tree02:42
thumperick02:42
stokachuhow do you guys do it, this makes my head hurt02:42
wallyworldsure, but unless you apply the patch manually....02:42
thumperyes02:42
wallyworldmine too02:42
thumperstokachu: many years of built-up resistance02:42
lazyPowerstokachu - i'm going to say copious amounts of beer and callouses to shenanigans02:42
* thumper goes to put the kettle on02:42
stokachuthumper: lol, you guys will lead the zombie resistance02:42
stokachulazyPower: :D02:43
thumperI for one await the zombie apocalypse02:43
menn0stokachu: this is partially due to the way Go handles imports02:43
lazyPowerI never trusted go imports02:43
stokachuok so not as simple as placing the git rev in the Godeps stuff02:43
menn0stokachu: b/c mgo is imported all over the place across multiple repos, if we want to fork it, we would have to change *everything*02:43
menn0stokachu: no, b/c the fix got accepted into mgo's unstable branch, but isn't yet in the stable branch02:44
stokachuah i see02:44
stokachugotcha, i didnt realize it was never in the stable branch02:44
menn0stokachu: we *could* use the unstable branch, but that pulls in a bunch of other stuff we don't really want02:44
stokachuunderstood02:44
lazyPowerdoesn't that mean its going to wind up landing in stable and pull in that bunch of other stuff eventually?02:46
* lazyPower is showing his ineptitude at golang02:46
natefinchthe whole "unstable" thing in the import path just seems like a bad idea.  Either make it a new version or don't.  If you want to mark it as unstable, do so in the readme.02:52
* thumper notes that we are still using charm.v6-unstable02:57
natefinchyep02:57
natefinchdumb idea02:57
natefinchinstead of having to go change all the imports once when we move to a new version, we have to do it twice.  Assuming we ever actually bother to rename it from unstable.02:58
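
The underlying problem natefinch and menn0 are describing is that Go bakes a dependency's identity, including its major version or "-unstable" tag, into the import path of every consuming file, so forking mgo, or promoting charm.v6-unstable to stable, means a sweep across every repo that imports it. An illustration, with the fork path being hypothetical:

```go
package example

import (
	// The path pins both source and version, so every one of these
	// lines across juju and its dependencies must be edited to pick
	// up a fork (e.g. the hypothetical "github.com/juju/mgo") ...
	mgo "gopkg.in/mgo.v2"
	// ... and promoting an unstable package is the same kind of
	// sweep: "-unstable" disappears from the path itself.
	charm "gopkg.in/juju/charm.v6-unstable"
)

// Reference the imports so the example compiles standalone.
var (
	_ = mgo.Dial
	_ = charm.MustParseURL
)
```
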
menn0wallyworld: fix for the joyent race: http://reviews.vapour.ws/r/5277/04:08
wallyworldlooking04:08
wallyworldmenn0: lgtm04:09
menn0wallyworld: thansk04:10
menn0thanks even04:10
menn0wallyworld: backport to 1.25 as well/04:11
menn0?04:11
wallyworldmenn0: um, it's such a simple fix, why not04:11
menn0wallyworld: ok04:11
wallyworldmight get a bless more often than twice a year04:11
thumpermenn0: re dump-model review, and See Also, I copied it from elsewhere...04:22
thumperI did think it was strange04:23
* thumper looks for a good example04:25
thumpermenn0: updated http://reviews.vapour.ws/r/5265/04:38
thumperadded a few drive by fixes for "See also:" formatting, made consistent with juju switch04:38
thumpermenn0: made the apiserver side a bulk call, client api still single04:38
thumperadded client side formatting04:38
menn0thumper: looking. I wasn't really suggesting that you had to do the bulk API work given the rest of the facade but great that you did anyway :)04:44
menn0thumper: "See also" is already quite inconsistent between commands04:45
menn0sigh04:45
thumperI thought that switch was most likely to be right04:45
thumperI looked at quite a few04:45
menn0thumper: oh hang on... you fixed them all!04:45
thumperand picked the resulting style04:45
menn0thumper: nice04:45
thumperwell, in that package04:46
menn0thumper: ship it!04:49
thumpermenn0: ta04:50
babbageclunkmenn0: D'oh.05:53
=== frankban|afk is now known as frankban
frobwaredooferlad: ping08:00
dooferladfrobware: hi08:01
frobwaredooferlad: any change we can meet now?08:01
frobwarechance08:02
dooferladfrobware: need 5 mins08:02
frobwaredooferlad: I have a plumber arriving in ~30 mins which is likely to clash with our 1:108:02
frobwaredooferlad: ok08:02
babbageclunkmenn0: ping?08:03
menn0babbageclunk: howdy... i'm in the tech board call atm. talk after?08:04
babbageclunkmenn0: cool cool08:04
menn0babbageclunk: hey, done now09:13
babbageclunkmenn0: Sorry, in standup.09:14
wallyworldfwereade: in prep for some work, i have needed to move model config get/set/unset off the client facade to their own new facade, so essentially a copy of stuff and a bit of boilerplate for backwards compat until the gui is updated. would love a review at your leisure so i can land when CI is unblocked http://reviews.vapour.ws/r/5279/09:14
wallyworldi also removed jujuconnsuite tests \o/09:15
menn0babbageclunk: np, I'll hang around for a bit.09:17
fwereadewallyworld, ack, thanks09:23
fwereadewallyworld, I presume: s/have needed to/gladly took the opportunity to/ ;p09:24
wallyworldfwereade: that too, but also a need09:34
wallyworld:)09:34
babbageclunkmenn0: Sorry, rambling discussion about godeps and vendoring. Nearly done.09:44
menn0babbageclunk: sounds like a repeat of the tech board meeting :)09:45
babbageclunkmenn0: quite09:45
babbageclunkmenn0: ok, done09:45
babbageclunkmenn0: did you manage to reproduce stokachu's problem?09:46
babbageclunkmenn0: sorry, I mean, has anyone had a chance to reproduce it?09:46
menn0babbageclunk: nope. I gave stokachu a rebuild of 2.0-beta12 which definitely had the patch applied.09:47
babbageclunkmenn0: And does he see it with that?09:47
menn0babbageclunk: he was going to try it out and see if the problem happened with that as he's able to make it happen fairly reliably.09:47
menn0babbageclunk: I don't know. He never got back to me. I think it was quite late for him at the time.09:48
menn0babbageclunk: he was going to report back on the ticket but hasn't yet.09:48
babbageclunkmenn0: Ok, cool - I had a go with a checkout of the right commit and the patch applied, but no luck yet - not sure which bundle to use.09:48
menn0babbageclunk:09:48
menn0babbageclunk: my goal was to establish whether or not the patch made it into the release09:48
menn0(and whether or not it worked)09:49
menn0babbageclunk: I imagine we'll hear back from stokachu when he starts work again09:49
babbageclunkmenn0: Also not sure whether my laptop has enough oomph to cause the contention needed.09:49
menn0babbageclunk: it seems like there was some process failure when the official beta12 was produced so I'm not ruling out that the patch didn't actually make it into the release09:50
babbageclunkmenn0: Yeah, it was a bit crazy.09:50
menn0babbageclunk: stokachu said he could make the problem happen quite often just by using add-model and destroy-model09:52
menn0I'm not sure how hard he was really pushing things09:52
babbageclunkmenn0: Ok, I'll try that a few more times. The hadoop-spark-zeppelin bundle really squishes my machine. It's pretty cool.09:53
menn0babbageclunk: I guess you could try making the problem happen with a juju that's built without the patch09:53
menn0and when you have a reliable way of triggering the problem09:54
menn0rebuild with the patch and see if it goes away09:54
babbageclunkmenn0: Well, I'm more concerned that the 5-retry thing just made it a bit less likely, but not really better.09:54
menn0or, you could hold off and do something else until we hear more from the QA peeps and stokachu09:54
babbageclunkmenn0: I'll give it a couple more kicks and then get in touch with the US peeps.09:55
menn0you would think 5 would be enough...09:55
babbageclunkI would and did!09:55
menn0maybe a random short sleep between each loop would help?09:55
menn0ethernet style09:55
babbageclunkYeah, could help - want to be sure it's happening first though.09:56
menn0for sure... need more info09:56
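
menn0's "ethernet style" suggestion is randomised backoff: sleep a short random interval between attempts so competing writers stop colliding on every retry, the way Ethernet resolves collisions. A minimal sketch of that loop; the function and its parameters are illustrative, not juju code:

```go
package mongoutil

import (
	"math/rand"
	"time"
)

// retryWithJitter runs op up to attempts times, sleeping a short
// random (and slowly growing) interval between retryable failures so
// two competing writers are unlikely to collide again next attempt.
func retryWithJitter(attempts int, op func() error, retryable func(error) bool) error {
	var err error
	for i := 0; i < attempts; i++ {
		if err = op(); err == nil || !retryable(err) {
			return err
		}
		// Random sleep: up to 10ms after the first failure, up to
		// 20ms after the second, and so on.
		time.Sleep(time.Duration(rand.Int63n(int64(10*time.Millisecond) * int64(i+1))))
	}
	return err
}
```
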
babbageclunkamusing - the test that was originally causing the problem in tests has been deleted.09:57
babbageclunkI mean, in our suite.09:57
menn0babbageclunk: for unrelated reasons?09:58
babbageclunkyeah, because address picking has been removed.09:58
menn0ha funny... still needs to be fixed of course10:01
menn0babbageclunk: I've got to go. I've got a literal mountain of washing to contend with.10:01
babbageclunkmenn0: ok, thanks. Happy climbing!10:03
mupBug #1604785 opened: repeatedly getting rsyslogd-2078 on node#0 /var/log/syslog <canonical-bootstack> <juju-core:New> <https://launchpad.net/bugs/1604785>11:40
mupBug #1604787 opened: juju agents trying to log to 192.168.122.1:6514 (virbr0 IP) <canonical-bootstack> <juju-core:New> <https://launchpad.net/bugs/1604787>11:40
frankbancherylj: hey morning, could you please merge trivial http://reviews.vapour.ws/r/5280/ ?11:51
cheryljfrankban: sure11:55
frankbancherylj: ty!11:55
mupBug #1598272 changed: LogStreamIntSuite.TestFullRequest sometimes fails <ci> <intermittent-failure> <test-failure> <juju-core:Fix Released by fwereade> <https://launchpad.net/bugs/1598272>12:10
stokachubabbageclunk: retrying to reproduce this morning, was late last night for me12:20
perrito666morning all12:22
frankbancherylj: how do I check what failed at /var/lib/jenkins/workspace/github-merge-juju/artifacts/trusty-err.log ?12:44
frankbancherylj: sorry, at http://juju-ci.vapour.ws:8080/job/github-merge-juju/8475/console12:44
cheryljfrankban: I've pinged mgz to take a look.  I think it's a merge job failure12:45
frankbancherylj: ty12:45
mupBug # changed: 1603596, 1604176, 1604408, 1604561, 160464412:46
perrito666wallyworld: go to sleep?13:31
wallyworldok, about that time13:31
mupBug #1604817 opened: Race in github.com/juju/juju/featuretests <blocker> <ci> <intermittent-failure> <race-condition> <regression> <unit-tests> <juju-core:Triaged> <https://launchpad.net/bugs/1604817>13:31
natefinchwallyworld: if you he 2 minutes, I'd love it if you could just read and maybe quickly respond to a couple review comments I have: http://reviews.vapour.ws/r/5238/13:33
natefinchs/he/have13:34
wallyworldok13:34
wallyworldnatefinch: done13:36
natefinchwallyworld: thanks13:36
natefinchhey, we're down to just two blocking tests in master, awesome13:37
natefinch(sorta)13:37
babbageclunkfwereade: ping?13:48
frankbancherylj: should I try merge again?13:48
fwereadebabbageclunk, pong13:49
fwereadebabbageclunk, what can I do for you?13:50
babbageclunkfwereade: I'm trying to understand the relationship between container and machine provisioners.13:50
babbageclunkfwereade: Sorry, environ provisioners13:50
babbageclunkfwereade: (looking at bug 1585878)13:51
mupBug #1585878: Removing a container does not remove the underlying MAAS device representing the container unless the host is also removed. <2.0> <hours> <maas-provider> <network> <reliability> <juju-core:Triaged by 2-xtian> <https://launchpad.net/bugs/1585878>13:51
fwereadebabbageclunk, at the heart of a provisioner there is a simple idea: watch the machines and StartInstance/StopInstance in response13:52
fwereadebabbageclunk, I think that's called ProvisionerTask?13:52
babbageclunkfwereade: yup, and it's the same between the environ and container provisioners.13:53
babbageclunkfwereade: but with different brokers, I think.13:53
fwereadebabbageclunk, yeah, exactly13:53
babbageclunkfwereade: So it looks like the environ provisioner explicitly excludes containers from the things it watches13:54
fwereadebabbageclunk, ultimately we *should* be able to just start each of them with a broker, an api facade, and knowledge of what set of machines they should watch13:54
fwereadebabbageclunk, yeah, that should be encapsulated in what it watches13:54
fwereadebabbageclunk, I expect they actually make different watch calls or something, though? :(13:55
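
For reference, the structure fwereade describes above: one ProvisionerTask-style loop, parameterised by a broker, shared by the environ and container provisioners, with only the broker and the set of machines watched differing between them. A rough sketch under those assumptions, not juju's actual ProvisionerTask or broker types:

```go
package provisioner

// Broker abstracts where instances come from: the environ provisioner
// plugs in an environ-backed broker, the container provisioner an
// LXD/KVM-backed one. Signatures are illustrative and simplified.
type Broker interface {
	StartInstance(machineID string) error
	StopInstance(instanceID string) error
}

type provisionerTask struct {
	broker  Broker
	changes <-chan []string // machine IDs from a watcher
	alive   func(machineID string) bool
}

// loop is the heart of a provisioner: watch the machines and
// StartInstance/StopInstance in response. Which machine IDs arrive on
// the channel (hosts vs. containers) is what distinguishes the two
// provisioners; error handling is elided for brevity.
func (t *provisionerTask) loop() {
	for ids := range t.changes {
		for _, id := range ids {
			if t.alive(id) {
				_ = t.broker.StartInstance(id)
			} else {
				_ = t.broker.StopInstance(id)
			}
		}
	}
}
```
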
babbageclunkfwereade: Ok - in the maas case I need to tell maas the container's gone away after getting rid of it.13:55
cheryljfrankban: yes, looks like one PR went through, so something's working...13:55
cheryljfrankban: so I'd retry13:55
frankbancherylj: retrying13:56
fwereadebabbageclunk, ha, ok, let me think13:56
babbageclunkfwereade: Until I started saying this, I thought that the container broker didn't talk to the environ, but now I think that's wrong - it needs to tell it when it starts, right?13:56
fwereadebabbageclunk, I am confident that a container provisioner should *not* talk to the environ directly, because that would entail distributing environ creds to every machine13:57
babbageclunkfwereade: Ok, that makes sense. So in order to clean up the maas record of the container, the environ provisioner would also need to watch containers, right?13:58
babbageclunkI should trace the start path so I can see where maas gets told about the container.13:59
fwereadebabbageclunk, I would be most inclined to have a separate instance-cleanup worker on the controllers, fed by provisioners leaving messages (directly or indirectly) on instance destruction14:00
babbageclunkfwereade: leaving messages how? Files?14:00
fwereadebabbageclunk, db docs?14:00
babbageclunkfwereade: oh, duh14:00
fwereadebabbageclunk, ;p14:00
fwereadebabbageclunk, there is a general problem with having all-the-necessary-stuff set up before a provisioner sees a machine to try to deploy14:01
babbageclunkfwereade: ok, so the container provisioner creates a record indicating that it killed a container, and a controller-based worker watches those and does the environ-level cleanup.14:02
fwereadebabbageclunk, trying to set up networks etc in the provisioner is wilful SRP violation -- but I think we do have a PrepareContainerNetworking (or something) call that the provisioner task makes14:02
babbageclunkfwereade: ok14:03
fwereadebabbageclunk, yeah, I would be grateful if we would cast it in terms that applied to machines and containers both, and didn't distinguish between them except in the worker that actually handles them14:03
babbageclunkfwereade: so that's in the environ provisioner - it talks to the provider.14:03
fwereadebabbageclunk, I don't think any provisioner should be responsible for doing this work, I think it should be a separate instance-cleanup worker14:04
babbageclunkfwereade: (oops, that was in response to the prev)14:04
fwereadebabbageclunk, yuck :)14:04
babbageclunkfwereade: Ok - so the provisioner task would just say "this instance needs cleaning"...14:05
babbageclunkfwereade: and then the new worker would see all of them and just do stuff for the containers for now.14:05
fwereadebabbageclunk, so, really, *that* should be happening in an instance-preparer worker, which creates tokens watched by the appropriate provisioner, which can then only try to start instances that have all their deps ready14:05
fwereadebabbageclunk, yeah, I think so14:06
fwereade(I refer, above, to the instance-prep work currently done by the provisioner, not to what you just said, which I agree with)14:06
babbageclunkfwereade: right, I was just going to check that.14:06
babbageclunkfwereade: sounds good, thanks!14:07
fwereadebabbageclunk, note that there's an environ-tracker manifold available on environ manager machines already, it gets you a shared environ that's updated in the background, you don't need to dirty your worker up with those concerns14:07
babbageclunkfwereade: ok, I'll make sure to base my worker on that.14:08
fwereadebabbageclunk, and it is called "environ-tracker", set up in agent/model.Manifolds14:10
fwereadebabbageclunk, just use it as a dependency and focus the worker around the watch/response loop14:10
fwereadebabbageclunk, you should then be able to just assume the environ's always up to date, and if you do race with a credential change or something it's nbd, just an error, fail out and let the mechanism bring you up to try again soon14:11
babbageclunkfwereade: ok14:12
fwereadebabbageclunk, ...or, hmm. be careful about those errors, actually14:12
fwereadebabbageclunk, we want those to be observable, I think14:12
babbageclunkfwereade: observable?14:13
fwereadebabbageclunk, and we probably shouldn't mark the machine that used them dead until they've succeeded14:13
fwereadebabbageclunk, report the error in status, I think, nothing should be competing for it by the time this is running14:13
babbageclunkfwereade: oh, gotch14:14
babbageclunka14:14
fwereadebabbageclunk, so, /sigh, this implies moving responsibility for set-machine-dead off the provisioner and onto the instance-cleaner14:14
fwereadebabbageclunk, which is clearly architecturally sane, but a bit of a hassle14:14
fwereadebabbageclunk, otherwise we'll be leaking resources and not having any entity against which to report the errors14:15
fwereadebabbageclunk, sorry again: not set-machine-dead, but remove-machine14:15
fwereadebabbageclunk, the machine agent sets itself dead to signal to the rest of the system that its resources should be cleaned up14:15
babbageclunkfwereade: ok14:16
frankbancherylj: looked at the tests and I've found that the failure is real for my branch. I have a fix already, but how do I check the tests that actually failed from the CI logs?14:16
fwereadebabbageclunk, but we shouldn't *remove* it until both the instance (by the provisioner) and other associated resources (by instance-cleaner, maybe more in future) have been cleaned up14:16
babbageclunkfwereade: Yeah, that makes sense.14:17
cheryljfrankban: go to your merge job:  http://juju-ci.vapour.ws:8080/job/github-merge-juju/8475/14:17
cheryljfrankban: and click trusty-err.log14:17
fwereadebabbageclunk, ...and ofc *that* now implies that we *will* potentially have workers competing for status writes14:17
=== natefinch is now known as natefinch-afk
cheryljfrankban: argh, looks like it failed to run again14:17
cheryljballoons, sinzui - can you take a look:  http://juju-ci.vapour.ws:8080/job/github-merge-juju/8475/artifact/artifacts/trusty-err.log14:17
fwereadebabbageclunk, so... it's not trivial, I'm afraid, but I can't think of any other things that'll interfere14:18
frankbancherylj: I am running 8477 now14:18
frankbancherylj: let's see if it will fail to run again, it should fail with 2 tests failures in theory14:18
fwereadebabbageclunk, do you know what dimitern has been doing lately? I think he had semi-detailed plans for addressing the corresponding setup concerns but I'm not sure he started implementing them14:19
cheryljfrankban: ah, well, when it completes you can view that trusty-err.log file for the test output14:19
babbageclunkfwereade: sorry, no - he's been away for the last week and a bit, not sure what he's working on at the moment.14:19
frankbancherylj: yes thank you, good to know14:19
fwereadebabbageclunk, no worries14:19
fwereadebabbageclunk, do sync up with him when he returns14:20
babbageclunkfwereade: hang on, why multiple workers competing to write status?14:20
fwereadebabbageclunk, if the provisioner StopInstance fails that should report; if the instance-cleaner Whatever fails, that should also report14:20
fwereadebabbageclunk, it might also be useful to look at what storageprovisioner has done14:21
fwereadebabbageclunk, with the internal queue for delaying operations if they can't be done yet14:21
babbageclunkfwereade: Oh I see, so if both of them fail then an error in the provisioner might be hidden by one in the cleanup worker.14:22
fwereadebabbageclunk, yeah, exactly14:22
babbageclunkfwereade: ok, that's heaps to go on with - I'll probably need more pointers once I'm a bit further along.14:23
babbageclunkfwereade: Thanks!14:23
fwereadebabbageclunk, (nothing would be *lost*, because status-history, but it would be good to do better)14:23
fwereadebabbageclunk, np14:23
fwereadebabbageclunk, always a pleasure14:23
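
Pulling the thread of this design together: provisioners record in the DB which dead instances still need provider-level cleanup; a controller-side worker, holding the shared environ from the environ-tracker manifold, watches those records, performs the cleanup (e.g. telling MAAS a container device is gone), reports failures via status rather than swallowing them, and only removes the machine once everything has succeeded. A sketch under those assumptions; none of these names are real juju APIs:

```go
package instancecleaner

// Environ is the slice of provider behaviour the worker needs; for
// the MAAS bug above, cleanup means deleting the device record for a
// removed container.
type Environ interface {
	CleanupInstance(instanceID string) error
}

type worker struct {
	environ    Environ               // kept fresh by the environ-tracker manifold
	changes    <-chan string         // instance IDs from a watcher over cleanup docs
	setStatus  func(id, msg string) error
	removeMach func(id string) error // final machine removal, once all cleanup is done
}

// loop watches for cleanup docs and responds; on failure it reports
// against the machine and leaves the doc in place so the next event
// (or a worker restart) retries.
func (w *worker) loop() error {
	for id := range w.changes {
		if err := w.environ.CleanupInstance(id); err != nil {
			if err := w.setStatus(id, "instance cleanup failed: "+err.Error()); err != nil {
				return err
			}
			continue
		}
		if err := w.removeMach(id); err != nil {
			return err
		}
	}
	return nil
}
```
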
mupBug #1604644 opened: juju2beta12: E11000 duplicate key error collection: juju.txns.stash <conjure> <mongodb> <juju-core:New> <https://launchpad.net/bugs/1604644>14:25
sinzuisorry cherylj: got pulled into a meeting. Go is writing errors to stdout. You can see the failure in http://juju-ci.vapour.ws:8080/job/github-merge-juju/8475/artifact/artifacts/trusty-out.log14:43
sinzuicherylj: I think we can create a unified log so that the order of events and where to look are in a single place14:43
rick_h_katco: ping for standup15:02
katcorick_h_: oops omw15:02
mupBug #1604883 opened: add us-west1 to gce regions in clouds via update-clouds <juju-core:New> <https://launchpad.net/bugs/1604883>16:16
mupBug #1604883 changed: add us-west1 to gce regions in clouds via update-clouds <juju-core:New> <https://launchpad.net/bugs/1604883>16:25
mupBug #1604883 opened: add us-west1 to gce regions in clouds via update-clouds <juju-core:New> <https://launchpad.net/bugs/1604883>16:34
=== natefinch-afk is now known as natefinch
perrito666does anyone have spare time to review this http://reviews.vapour.ws/r/5282/diff/# ? it's not a very short one, it's part of a set of changes to support ControllerUser permissions, I am happy to discuss what this particular patch does if anyone goes for the rev16:41
natefinchrick_h_: I have a ship it for the interactive bootstrap stuff... should I push it through or wait for master to be unblocked?16:45
rick_h_natefinch: wait for master please atm16:45
rick_h_natefinch: just mark it as blocked on the card on master16:46
=== frankban is now known as frankban|afk
natefinchrick_h_: will do16:51
rick_h_natefinch: got a sec?16:52
natefinchrick_h_: yep16:52
rick_h_natefinch: https://hangouts.google.com/hangouts/_/canonical.com/rick?authuser=116:52
* rick_h_ goes for lunchables then16:58
natefinchgod I love small unit tests17:37
natefinchI love that it tells me "you have an error in this 20 lines of code"17:38
rick_h_jcastro: marcoceppi arosales heads up docs switch is done and the jujucharms.com site is all 2.0 all the time https://jujucharms.com/docs18:49
marcoceppirick_h_: yesssss18:49
mupBug #1604915 opened: juju status message: "resolver loop error" <oil> <oil-2.0> <juju-core:New> <https://launchpad.net/bugs/1604915>18:50
rick_h_marcoceppi: will send an email shortly, want to check on status of b12 in xenial update to go along with it18:50
mupBug #1604919 opened: juju-status stuck in pending on win2012hvr2 deployment <oil> <oil-2.0> <juju-core:New> <https://launchpad.net/bugs/1604919>19:05
natefinchrick_h_: output for interactive commands on stdout or stderr?19:06
rick_h_natefinch: so jam had some thoughts and added notes to the interactive spec on that19:07
natefinchrick_h_: ok, I was wondering who added that.  it was incomplete so I was hoping for clarification19:07
* rick_h_ loads doc to double check19:07
rick_h_natefinch: ah, yea looks like he didn't finish typing19:08
natefinchrick_h_: I know the answer for non-interactive commands, but not sure if it should be different for interactive19:08
natefinchrick_h_: given that there's no real scriptable output19:08
natefinch(I mean, you can script anything, but it's not made with that in mind)19:09
rick_h_natefinch: can you ping him to clarify the rest, but the start is there as far as for interactive I think that's the idea that the questions/etc should go to stderr, but if we confirm things "successfully added X" it's stdout19:09
natefinchrick_h_: ok, yeah, I'll talk to him about it.19:09
rick_h_natefinch: ty19:09
natefinchoh man... writing this package to handle the formatting of user interactions was the best idea I ever had.19:15
natefinchok, maybe not the best idea ever. But... it's certainly saving my ass.19:16
rick_h_natefinch: <319:22
alexisbnatefinch, so that begs the question, what was your best idea ever19:23
natefinchalexisb: that's like the best set up for a joke I've ever had....19:24
natefinchalexisb: marrying my wife, obviously.  Only slightly behind would be the idea to switch from Mechanical Engineering to Computer Science in school.  Dodged a bullet there.19:25
natefinchI have a couple mech-e friends... they basically design screws all day long19:26
alexisbnatefinch, yep19:26
alexisbI got to my first statics class, followed by drafting and went "o hell no!"19:27
alexisbI also had some time at Racor systems (Parker affiliate) and watched their engineers at a ProE screen all day19:28
alexisbno thank you19:28
natefinchyuuup19:28
natefinchI realized fairly early that I found physics fascinating in the abstract, but the reality of actually figuring shit out was mind-bogglingly boring.19:29
alexisbat racor, the actual factory was AWESOME, which is where I started with control systems19:29
perrito666wanna do some boring mech things? try calculating elevators for a living19:30
perrito666most revealing class I ever had19:30
natefinchfriend of mine makes maglev elevators for things like aircraft carriers.... still pretty boring work in the small19:30
natefinchhe'd probably say the same for my job, though ;)19:31
natefinch"So... you twiddled with carriage returns all day?"19:31
perrito666lol "so, found that missing statement?"19:32
mupBug #1604931 opened: juju2beta12: unable to destroy controller properly on localhost <conjure> <juju-core:New> <https://launchpad.net/bugs/1604931>19:32
perrito666but I was talking about building elevators, I actually had to spend a semester calculating those19:32
* rick_h_ runs to get the boy from school19:38
arosalesrick_h_: great to hear thanks for the fyi19:53
natefinchare we supposed to be able to add-cloud for providers like ec2?20:30
natefinchit doesn't look like we're stopping people from doing that20:31
mupBug #1604955 opened: TestUpdateStatusTicker can fail with timeout <ci> <intermittent-failure> <test-failure> <juju-core:Triaged> <https://launchpad.net/bugs/1604955>20:35
mupBug #1604959 opened: Failed restore juju.txns.stash 'collection already exists' <backup-restore> <ci> <intermittent-failure> <juju-core:Triaged> <https://launchpad.net/bugs/1604959>20:44
natefinchrick_h_: It's a little weird that clicking on "stable" in jujucharms brings you to the 2.0 docs, which say at the top, in red "Juju 2.0 is currently in beta which is updated frequently. We don’t recommend using it for production deployments."20:48
natefinchthe problem with MAAS API URL is that it looks like I'm shouting, but really it's just TLA proliferation20:58
=== natefinch is now known as natefinch-afk
mupBug #1604961 opened: TestWaitSSHRefreshAddresses can fail on windows <ci> <intermittent-failure> <test-failure> <windows> <juju-core:Triaged> <https://launchpad.net/bugs/1604961>21:11
mupBug #1604965 opened: machine stays in pending state even though node has been marked failed deployment in MAAS <oil> <oil-2.0> <juju-core:New> <https://launchpad.net/bugs/1604965>21:11
rediryou sure are21:17
redirignore ^^21:19
menn0perrito666: ping21:25
perrito666menn0: ping21:26
perrito666sorry pong21:27
perrito666menn0: did I break something?21:27
menn0perrito666: no, I just wanted to apologise for not getting to your ACLs PR yesterday... the day got swallowed up by critical bugs21:28
menn0perrito666: I was about to review now and see that it's been discarded?21:28
perrito666menn0: oh, no worries, you would not have been able to review it yesterday anyway, it had a dependency on an unmerged branch and the diff was incomprehensible, I dropped it, merged the pending branch and re-proposed21:29
perrito666RB is really misleading, I thought that adding the dependency in the "depends on" field would fix the diff but it did nothing at all and then it would not allow me to upload my own diff21:30
=== akhavr1 is now known as akhavr
perrito666I think we should swap RB for something a bit more useful, like snapchat21:30
menn0perrito666: LOL :)21:30
menn0perrito666: I've been wondering about Gerrit or Phabricator, they seem like the best alternatives21:31
perrito666I checked one of those during the sprint, and I liked it, I cant remember which one though, Phabricator I think21:31
perrito666menn0: also, the only person that knew something about our RB is no longer on this team, which makes for an interesting SPOF21:32
menn0perrito666: I don't think the ops side of RB is particularly hard21:32
menn0perrito666: and I *think* the details are written down /somewhere/21:32
perrito666menn0: I fear that the certain somewhere is an email :s21:33
perrito666anyway, eric usually knew the dark secrets like how to actually make a branch depend on another21:33
menn0perrito666: phab is nice. I've used it a bit at one job. it supports enforcing a fairly strict (but customized) development process...21:33
menn0perrito666: you do this: rbt post --disable-ssl-verification -r <review_number> --parent <parent_branch>21:34
menn0perrito666: and then check how it looks on the RB website and hit Publish21:34
katcomenn0: perrito666: i've been interested in how this works out for teams: https://github.com/google/git-appraise21:35
perrito666menn0: ah, I need some non magic interaction :)21:35
perrito666menn0: if you ask me (and even if you dont) if it cant be done on the website, its broken21:35
menn0perrito666: I think you can upload arbitrary diffs to RB... but I've never done it21:36
menn0katco: looks interesting! I hadn't heard of git-appraise before21:36
* menn0 reads more21:36
katcomenn0: i enjoy the decentralized nature. no ops needed21:37
katcomenn0: or at least i think i *would*. i've never used this21:37
perrito666menn0: well I actually tried, it seems to assume RB has something it doesnt, we might have broken that particular workflow with our magic bot21:37
perrito666katco: that looks amazing but seems to not work very nicely with github workflow (which we sort of use)21:39
menn0katco: storing the reviews in git is a nice idea. the way you add comments is a little unfriendly though. I guess the expectation is that someone will create a UI/tool for that.21:39
katcoperrito666: just saw this: https://github.com/google/git-pull-request-mirror21:40
katcomenn0: and just found this: https://github.com/google/git-appraise-web21:40
redirwho's the resident data race expert?21:41
perrito666katco: mm, really interesting, do you know actual users of this, I am interested in seeing how it behaves in heavily conflictive envs21:41
perrito666redir: we all are good adding data races :p21:41
redirperrito666: OK who's the resident data race tortoise?21:42
perrito666redir: well, you are not in luck, its dave cheney :p21:42
menn0katco: that improves the situation somewhat! :)21:42
perrito666redir: just throw the problem to the field and well see how can we attack it21:42
katcoperrito666: i do not. this looks fairly active? https://git-appraise-web.appspot.com/static/reviews.html#?repo=23824c02939821:43
perrito666man, was that english broken or what? ;p I am loosing my linguistic skills21:43
redirI think it is pretty straightforward21:43
perrito666katco: very interesting, I really like the idea of storing these things in the repo21:46
perrito666but ill say something very shallow21:46
redirhttps://github.com/go-mgo/mgo/blob/v2/socket.go#L329 needs to be locked so it doesn't race with https://github.com/go-mgo/mgo/blob/v2/stats.go#L5921:46
redir I think21:46
perrito666the UI is ugly as f***21:46
redirtrouble reproducing21:46
katcoperrito666: it is certainly spartan21:46
katcoperrito666: personally, i would be writing an emacs plugin for this if someone hasn't already21:47
katcoredir: why are stats being reset before kill has been returned? i think there's a logic bomb there21:48
perrito666I dont know what kind of spartans you know, the ones from the movie certainly look better than that UI :p21:49
katcoperrito666: sorry, i intended this usage: "adj.Simple, frugal, or austere: a Spartan diet; a spartan lifestyle."21:49
perrito666katco: I know, I intended to : " troll (/ˈtroʊl/, /ˈtrɒl/) is a person who sows discord on the Internet by starting arguments or upsetting people,"21:51
katcolol21:51
perrito666redir: while killing, imho you should be locking everything indeed, but I have not checked past these two links to know if I am speaking the truth about this particular issue21:52
perrito666katco: I do dislike the ui though, I prefer something like github without the insane one mail per comment thing21:53
katcoredir: also i don't think that's the race. socketsAlive locks the mutex before doing anything: https://github.com/go-mgo/mgo/blob/v2/stats.go#L13521:55
redirmkay thanks21:56
redirperrito666: katco ^21:56
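
The pattern under discussion in mgo's stats.go: package-level counters that every reader and writer must touch under one mutex. The race redir suspects would come from a code path (such as the socket-kill path) mutating a counter without the lock; katco's point is that socketsAlive itself does lock. A minimal illustration of the safe shape, not mgo's actual code:

```go
package stats

import "sync"

var (
	statsMutex   sync.Mutex
	socketsAlive int
)

// adjustSocketsAlive is the writer side: every mutation must hold the
// same mutex the readers take, or the race detector will flag it.
func adjustSocketsAlive(delta int) {
	statsMutex.Lock()
	socketsAlive += delta
	statsMutex.Unlock()
}

// SocketsAlive is the reader side, locking just as mgo's stats
// accessors do.
func SocketsAlive() int {
	statsMutex.Lock()
	defer statsMutex.Unlock()
	return socketsAlive
}
```
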
perrito666moving to a silent neighbourhood is glorious for work21:59
mupBug #1604988 opened: Inconsistent licence in github.com/juju/utils/series <jujuqa> <packaging> <juju-core:Triaged> <juju-core 1.25:Triaged> <juju-core (Ubuntu):New> <https://launchpad.net/bugs/1604988>22:05
menn0katco: you're convincing me that we should experiment with vendoring some more :)22:43
katcomenn0: eep...22:43
katcomenn0: as long as how go does vendoring is well understood, i'm happy. i am scared of diverging too much without forethought22:43
menn0katco: sure... it's not something we should do lightly. and if we do it, it should use Go's standard mechanism.22:44
perrito666menn0: re our previous talk http://reviews.vapour.ws/r/5282/diff/#22:44
katcomenn0: yeah, agreed22:45
menn0perrito666: ok. I can take a look.22:48
menn0perrito666: my initial comment is that I wish this was 2 PRs: one for state and one for apiserver (but I will cope)22:49
perrito666menn0: I am sorry I promise I tried to make it smaller22:50
perrito666menn0: its smaller than it looks though, small changes in many files22:51
katcomenn0: i think i messed up the tech board permissions. i was trying to get a link and it looked publicly accessible, so i disabled that. now i can't view it22:52
menn0katco: I'll take a look22:53
katcomenn0: sorry about that22:53
menn0katco: you completely removed canonical access :) not sure how to put it back yet22:53
mupBug #1605008 opened: juju2beta12 and maas2rc2:  juju status shows 'failed deployment' for node that was 'deployed' in maas <oil> <oil-2.0> <juju-core:New> <MAAS:New> <https://launchpad.net/bugs/1605008>22:54
katcomenn0: wait what! all i did was turn off link sharing :(22:54
menn0katco: figured it out. what was it before? anyone at canonical can edit or view?22:54
menn0or comment?22:55
katcomenn0: could comment i think, but it looked like external people with link could view as well22:55
menn0katco: ok, it's fixed. anyone from canonical can comment again.22:56
katcomenn0: ta, sorry22:56
axwwallyworld: did I miss anything on the call? slept through my alarm supposedly, pretty sure it didn't go off though22:57
axwneed to ditch this dodgy phone22:57
perrito666axw: or get a clock22:58
wallyworldaxw: not a great deal, just release recap, tech board summary22:58
axwperrito666: could do that too, I'd rather have it near my head so I don't wake up my wife22:58
axwsuppose I could move the clock...22:58
perrito666axw: get a deaf people clock22:58
axwwallyworld: ok, ta22:58
perrito666(not trolling, these are a thing)22:58
axwperrito666: ah, have not seen one22:59
perrito666they have a thing that you put in your pillow and it vibrates22:59
perrito666much like your phone, but less points of failure22:59
axwI guess I could just use my fitbit then. if I can find it, and my charger...22:59
mupBug #1605008 changed: juju2beta12 and maas2rc2:  juju status shows 'failed deployment' for node that was 'deployed' in maas <oil> <oil-2.0> <juju-core:New> <MAAS:New> <https://launchpad.net/bugs/1605008>23:00
axwanyway23:00
* axw stops debugging alarm replacement issues23:00
mupBug #1605008 opened: juju2beta12 and maas2rc2:  juju status shows 'failed deployment' for node that was 'deployed' in maas <oil> <oil-2.0> <juju-core:New> <MAAS:New> <https://launchpad.net/bugs/1605008>23:06
=== akhavr1 is now known as akhavr
alexisbaxw, thumper ping23:18
axwcoming, sorry23:18
thumpercoming23:18
=== akhavr1 is now known as akhavr
rediraxw: thanks for the protip in the review. helpful23:49
axwredir: np23:49
thumpermenn0: so you are working with redir on the race?23:57
