/srv/irclogs.ubuntu.com/2016/07/20/#juju-dev.txt

thumper	dog walk, then addressing review comments...	00:44
* thumper afk for a bit		00:44
redir	manana juju-dev	01:21
wallyworld	menn0: a small one, fixes 2 blockers, when you have a moment http://reviews.vapour.ws/r/5276/	01:41
menn0	wallyworld: looking	01:43
menn0	wallyworld: ship it	01:45
wallyworld	menn0: ta	01:45
anastasiamac	menn0: axw beat u to it too :D	01:45
menn0	wallyworld: I think I've figured out what's going on with https://bugs.launchpad.net/juju-core/+bug/1604514	02:02
mup	Bug #1604514: Race in github.com/joyent/gosdc/localservices/cloudapi <blocker> <ci> <joyent-provider> <race-condition> <regression> <juju-core:In Progress by menno.smits> <https://launchpad.net/bugs/1604514>	02:02
menn0	it's certainly not a new issue	02:02
menn0	and I really don't think it should be a blocker	02:02
wallyworld	yeah, i'd be surprised if it were	02:02
menn0	I think the problem is that the joyent provider destroys machines in parallel	02:03
wallyworld	it's not a regression	02:03
wallyworld	i'm surprised it was marked as such	02:03
menn0	but the joyent API test double isn't safe to access concurrently	02:03
wallyworld	sounds plausible	02:03
menn0	the correct place to fix it is in the test double but that's not our code	02:04
wallyworld	yep, i think we can unmark as a blocker and figure out what to do from there	02:04
wallyworld	we may need to pull in that external code, as i doubt we will get it to be fixed	02:05
menn0	wallyworld: ok, i'll update the ticket so it's no longer blocking	02:05
menn0	wallyworld: and then I'll poke it some more to see if I can figure out a fix	02:06
menn0	wallyworld: I can /occasionally/ reproduce the race if I use dave's stress script	02:06
wallyworld	maybe there's a work around in the non test code, but would be better to fix upstream i guess	02:07
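The fix menn0 describes above (making the test double safe to access concurrently) can be sketched roughly as follows. This is only an illustration, not the gosdc code or the change that later went up for review: fakeCloud, DeleteMachine and destroyAll are hypothetical names, and the sketch simply assumes the double keeps its machines in a map that parallel destroy calls mutate.

```go
package main

import (
	"fmt"
	"sync"
)

// fakeCloud is a hypothetical stand-in for a provider API test double.
// The mutex guards the map so that parallel callers cannot race on it.
type fakeCloud struct {
	mu       sync.Mutex
	machines map[string]bool
}

func newFakeCloud(ids ...string) *fakeCloud {
	m := make(map[string]bool, len(ids))
	for _, id := range ids {
		m[id] = true
	}
	return &fakeCloud{machines: m}
}

// DeleteMachine removes one machine; it is safe to call from many goroutines.
func (f *fakeCloud) DeleteMachine(id string) error {
	f.mu.Lock()
	defer f.mu.Unlock()
	if !f.machines[id] {
		return fmt.Errorf("machine %q not found", id)
	}
	delete(f.machines, id)
	return nil
}

// Count reports how many machines are left.
func (f *fakeCloud) Count() int {
	f.mu.Lock()
	defer f.mu.Unlock()
	return len(f.machines)
}

// destroyAll mimics a provider that destroys machines in parallel.
// Each goroutine writes only to its own slot of errs.
func destroyAll(f *fakeCloud, ids []string) []error {
	errs := make([]error, len(ids))
	var wg sync.WaitGroup
	for i, id := range ids {
		wg.Add(1)
		go func(i int, id string) {
			defer wg.Done()
			errs[i] = f.DeleteMachine(id)
		}(i, id)
	}
	wg.Wait()
	return errs
}

func main() {
	f := newFakeCloud("m0", "m1", "m2")
	destroyAll(f, []string{"m0", "m1", "m2"})
	fmt.Println("machines left:", f.Count())
}
```

Without the mutex, running a test like this under the Go race detector (go test -race) would be expected to flag the concurrent map writes, which matches the kind of intermittent report discussed above.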
stokachu	menn0: im still seeing https://bugs.launchpad.net/juju-core/+bug/1604644	02:13
mup	Bug #1604644: juju2beta12: E11000 duplicate key error collection: juju.txns.stash <blocker> <conjure> <mongodb> <juju-core:Triaged> <https://launchpad.net/bugs/1604644>	02:13
stokachu	just fyi	02:13
menn0	stokachu: that's the issue xtian was looking at	02:13
stokachu	menn0: this one was https://bugs.launchpad.net/bugs/1593828	02:14
mup	Bug #1593828: cannot assign unit E11000 duplicate key error collection: juju.txns.stash <ci> <conjure> <deploy> <intermittent-failure> <oil> <oil-2.0> <juju-core:Fix Released by 2-xtian> <https://launchpad.net/bugs/1593828>	02:14
stokachu	and it was marked fixed	02:14
menn0	stokachu: they're the same issue (dup)	02:15
menn0	stokachu: which version of Juju are you using? I think it was only fixed very recently (not sure exactly when though)	02:16
stokachu	menn0: correct, i opened a new issue as the previous version was marked fixed release	02:16
stokachu	Bug #1604644: juju2beta12: E11000 duplicate key error collection: juju.txns.stash	02:17
mup	Bug #1604644: juju2beta12: E11000 duplicate key error collection: juju.txns.stash <blocker> <conjure> <mongodb> <juju-core:Triaged> <https://launchpad.net/bugs/1604644>	02:17
stokachu	juju beta 12	02:17
* menn0 checks when the fix went in		02:17
stokachu	beta12 lol	02:18
thumper	menn0: perhaps the patch approach didn't work?	02:19
mup	Bug #1589471 changed: Mongo cannot resume transaction <canonical-bootstack> <juju-core:Invalid> <https://launchpad.net/bugs/1589471>	02:20
menn0	stokachu, thumper: nope the fix didn't make beta12	02:20
mup	Bug #1604641 opened: restore-backup fails when attempting to 'replay oplog' again <backup-restore> <blocker> <ci> <regression> <juju-core:Triaged> <https://launchpad.net/bugs/1604641>	02:20
mup	Bug #1604644 opened: juju2beta12: E11000 duplicate key error collection: juju.txns.stash <blocker> <conjure> <mongodb> <juju-core:Triaged> <https://launchpad.net/bugs/1604644>	02:20
stokachu	lmao	02:20
stokachu	it got mark fixed release	02:20
menn0	the fix is here: 99cb2d1c148f5ed1d246bf4fe44064363226e12e (Jul 15)	02:20
menn0	it's not in beta12	02:20
stokachu	menn0: can you update that bug with your findings	02:21
stokachu	1604644	02:21
menn0	stokachu: will do. shall I also mark it as a dup of the other one?	02:21
stokachu	menn0: the other bug is already mark fixed released	02:22
thumper	menn0: I thought the patch was applied to the top of our mgo branch	02:22
stokachu	i think we should leave that one alone and work off this new one	02:22
thumper	menn0: check with mgz and sinzui	02:22
thumper	and balloons I suppose	02:22
stokachu	sinzui: ^ they are saying it didnt make it in	02:22
stokachu	make it in beta12	02:22
menn0	thumper: no, it looks like we copied in a fixed version of mgo's upsert code into juju	02:22
menn0	ah crap... chrome crash	02:24
menn0	thumper: oh never mind, you're right we patch over mgo in the build	02:26
menn0	thumper: at any rate, that change isn't in beta12	02:26
thumper	Which was the release we just did? It should be in that	02:26
thumper	if stokachu is building from source, he won't have it	02:27
stokachu	this is from the ppa	02:27
thumper	hmm...	02:27
thumper	that should have the fix	02:27
menn0	thumper: the latest tag in git is "juju-2.0-beta12"	02:27
menn0	the fix is 99cb2d1c148f5ed1d246bf4fe44064363226e12e	02:27
menn0	when I check out the tag, the fix isn't there	02:27
menn0	when I check out master, it is	02:28
thumper	ugh	02:28
stokachu	im guessing a one-off was done for this issue	02:28
stokachu	?	02:28
menn0	perhaps there was some miscommunication about when the release was ok to cut	02:29
lazyPower	booo, that was in the release notes too	02:29
lazyPower	mgo package update that retries upserts that fail with ‘duplicate key error’ lp1593828	02:29
lazyPower	speaking of o/ hey core team :	02:29
lazyPower	:)	02:29
stokachu	so we're sure that fix isn't in beta 12 from the ppa?	02:31
stokachu	because it's also uploaded to the archive :)	02:32
menn0	stokachu: pretty sure. the release tag is there in git, and the fix isn't part of that release.	02:32
stokachu	menn0: ok, if you don't mind updating that bug so i can follow up with balloons/mgz in the morning	02:33
menn0	awesome :(	02:33
menn0	stokachu: will do. i'll poke xtian too so he's in the loop	02:33
stokachu	menn0: ok cool thanks a bunch	02:33
sinzui	menn0: thumper: The patch was added to the juju tree, and the script that makes the tar file applies it. that is the hack that mgz put together	02:33
thumper	sinzui: looks like something didn't take though	02:33
thumper	o/ lazyPower	02:34
axw	wallyworld: I'm planning to add this to the cloud package: http://paste.ubuntu.com/20129296/. one of those will be present in a new environs.OpenParams struct. sound sane?	02:34
menn0	sinzui: it looks like the rev didn't make the cut of the release.	02:34
axw	sound/look	02:34
sinzui	Yeah, that is a bad way to deliver a fix	02:34
wallyworld	axw: looking	02:34
menn0	sinzui: what to do now?	02:34
axw	wallyworld: open to suggestions for a better name also	02:35
sinzui	menn0: I have no idea. I think godeps should define the repo and rev. Otherwise we continue to maintain the patch in the tree and apply it each time the tar file is made	02:35
menn0	sinzui: the immediate problem is that beta12 didn't include the fix at all. the revision with the fix was committed after beta12 was cut.	02:36
menn0	sinzui: the mgo patch doesn't exist in beta12	02:37
stokachu	we should amend the release notes and set the fix for beta13	02:37
wallyworld	axw: i don't think that struct belongs in cloud - it's an amalgamation of things used for an environ so should really belong in there	02:37
wallyworld	and then it could be called CloudSpec	02:37
wallyworld	or something	02:37
sinzui	menn0: I cannot help at this point. The release was started, we aborted and tried again.	02:37
stokachu	so can the mgo fix be pulled into godeps now?	02:38
stokachu	what was the reason for applying the fix during the tarball build	02:38
thumper	anastasiamac: while you are doing virt-type fixes, core/description/constraints_test.go:25, the virt type needs to be added to the allArgs func	02:39
axw	wallyworld: yeah ok, that's what I had to start with. issue is how to then make State implement EnvironConfigGetter. I think I'll have to define a type outside of the state package that adapts it to that interface	02:39
wallyworld	stokachu: the reason was we don't control upstream and we could not get the fix landed for us to use	02:40
wallyworld	so we were forced to adopt a solution where the change was patched in as part of the build	02:40
stokachu	wallyworld: so the fix was pulled in before the PR was accepted?	02:40
thumper	stokachu: more complicated than that...	02:40
stokachu	ah ok	02:40
stokachu	just trying to understand	02:41
thumper	related to golang, imports and the mgo release process	02:41
wallyworld	stokachu: no, the upstream PR was unaccepted but it was landed in an unstable v2 branch which we could not use directly	02:41
wallyworld	it's all a mess	02:41
thumper	s/unaccepted/accepted/	02:41
stokachu	ok, but the status in master is it is now part of the tree?	02:41
wallyworld	no :-(	02:41
thumper	kinda	02:41
wallyworld	not that i am aware of	02:41
thumper	but poorly	02:41
thumper	wallyworld: it is in a patch...	02:42
thumper	in the tree	02:42
thumper	ick	02:42
stokachu	how do you guys do it, this makes my head hurt	02:42
wallyworld	sure, but unless you apply the patch manually....	02:42
thumper	yes	02:42
wallyworld	mine too	02:42
thumper	stokachu: many years of built up resistance	02:42
lazyPower	stokachu - i'm going to say copious amounts of beer and callous to shenanigans	02:42
* thumper goes to put the kettle on		02:42
stokachu	thumper: lol, you guys will lead the zombie resistance	02:42
stokachu	lazyPower: :D	02:43
thumper	I for one await the zombie apocalypse	02:43
menn0	stokachu: this is partially due to the way Go handles imports	02:43
lazyPower	I never trusted go imports	02:43
stokachu	ok so not as simple as placing the git rev in the Godeps stuff	02:43
menn0	stokachu: b/c mgo is imported all over the place across multiple repos, if we want to fork it, we would have to change everything	02:43
menn0	stokachu: no, b/c the fix got accepted into mgo's unstable branch, but isn't yet in the stable branch	02:44
stokachu	ah i see	02:44
stokachu	gotcha, i didnt realize it was never in the stable branch	02:44
menn0	stokachu: we could use the unstable branch, but that pulls in a bunch of other stuff we don't really want	02:44
stokachu	understood	02:44
lazyPower	doesn't that mean its going to wind up landing in stable and pull in that bunch of other stuff eventually?	02:46
* lazyPower is showing his ineptitude at golang		02:46
natefinch	the whole "unstable" thing in the import path just seems like a bad idea. Either make it a new version or don't. If you want to mark it as unstable, do so in the readme.	02:52
* thumper notes that we are still using charm.v6-unstable		02:57
natefinch	yep	02:57
natefinch	dumb idea	02:57
natefinch	instead of having to go change all the imports once when we move to a new version, we have to do it twice. Assuming we ever actually bother to rename it from unstable.	02:58
menn0	wallyworld: fix for the joyent race: http://reviews.vapour.ws/r/5277/	04:08
wallyworld	looking	04:08
wallyworld	menn0: lgtm	04:09
menn0	wallyworld: thansk	04:10
menn0	thanks even	04:10
menn0	wallyworld: backport to 1.25 as well/	04:11
menn0	?	04:11
wallyworld	menn0: um, it's such a simple fix, why not	04:11
menn0	wallyworld: ok	04:11
wallyworld	might get a bless more often than twice a year	04:11
thumper	menn0: re dump-model review, and See Also, I copied it from elsewhere...	04:22
thumper	I did think it was strange	04:23
* thumper looks for a good example		04:25
thumper	menn0: updated http://reviews.vapour.ws/r/5265/	04:38
thumper	added a few drive by fixes for "See also:" formatting, made consistent with juju switch	04:38
thumper	menn0: made the apiserver side a bulk call, client api still single	04:38
thumper	added client side formatting	04:38
menn0	thumper: looking. I wasn't really suggesting that you had to do the bulk API work given the rest of the facade but great that you did anyway :)	04:44
menn0	thumper: "See also" is already quite inconsistent between commands	04:45
menn0	sigh	04:45
thumper	I thought that switch was most likely to be right	04:45
thumper	I looked at quite a few	04:45
menn0	thumper: oh hang on... you fixed them all!	04:45
thumper	and picked the resulting style	04:45
menn0	thumper: nice	04:45
thumper	well, in that package	04:46
menn0	thumper: ship it!	04:49
thumper	menn0: ta	04:50
babbageclunk	menn0: D'oh.	05:53
=== frankban|afk is now known as frankban
frobware	dooferlad: ping	08:00
dooferlad	frobware: hi	08:01
frobware	dooferlad: any change we can meet now?	08:01
frobware	chance	08:02
dooferlad	frobware: need 5 mins	08:02
frobware	dooferlad: I have a plumber arriving in ~30 mins which is likely to clash with our 1:1	08:02
frobware	dooferlad: ok	08:02
babbageclunk	menn0: ping?	08:03
menn0	babbageclunk: howdy... i'm in the tech board call atm. talk after?	08:04
babbageclunk	menn0: cool cool	08:04
menn0	babbageclunk: hey, done now	09:13
babbageclunk	menn0: Sorry, in standup.	09:14
wallyworld	fwereade: in prep for some work, i have needed to move model config get/set/unset off client facade to their own new facade, so essentially a copy of stuff and a bit of boiler plate for backwards compat until gui is updated. would love a review at your leisure so i can land when CI is unblocked http://reviews.vapour.ws/r/5279/	09:14
wallyworld	i also removed jujuconnsuite tests \o/	09:15
menn0	babbageclunk: np, I'll hang around for a bit.	09:17
fwereade	wallyworld, ack, thanks	09:23
fwereade	wallyworld, I presume: s/have needed to/gladly took the opportunity to/ ;p	09:24
wallyworld	fwereade: that too, but also a need	09:34
wallyworld	:)	09:34
babbageclunk	menn0: Sorry, rambling discussion about godeps and vendoring. Nearly done.	09:44
menn0	babbageclunk: sounds like a repeat of the tech board meeting :)	09:45
babbageclunk	menn0: quite	09:45
babbageclunk	menn0: ok, done	09:45
babbageclunk	menn0: did you manage to reproduce stokachu's problem?	09:46
babbageclunk	menn0: sorry, I mean, has anyone had a chance to reproduce it?	09:46
menn0	babbageclunk: nope. I gave stokachu a rebuild of 2.0-beta12 which definitely had the patch applied.	09:47
babbageclunk	menn0: And does he see it with that?	09:47
menn0	babbageclunk: he was going to try it out and see if the problem happened with that as he's able to make it happen fairly reliably.	09:47
menn0	babbageclunk: I don't know. He never got back to me. I think it was quite late for him at the time.	09:48
menn0	babbageclunk: he was going to report back on the ticket but hasn't yet.	09:48
babbageclunk	menn0: Ok, cool - I had a go with a checkout of the right commit and the patch applied, but no luck yet - not sure which bundle to use.	09:48
menn0	babbageclunk:	09:48
menn0	babbageclunk: my goal was to establish whether or not the patch made it into the release or not	09:48
menn0	(and whether or not it worked)	09:49
menn0	babbageclunk: I imagine we'll hear back from stokachu when he starts work again	09:49
babbageclunk	menn0: Also not sure whether my laptop has enough oomph to cause the contention needed.	09:49
menn0	babbageclunk: it seems like there was some process failure when the official beta12 was produced so I'm not ruling out that the patch didn't actually make it into the release	09:50
babbageclunk	menn0: Yeah, it was a bit crazy.	09:50
menn0	babbageclunk: stokachu said he could make the problem happen quite often with just using add-model and destroy-model	09:52
menn0	I'm not sure how hard he was really pushing things	09:52
babbageclunk	menn0: Ok, I'll try that a few more times. The hadoop-spark-zeppelin bundle really squishes my machine. It's pretty cool.	09:53
menn0	babbageclunk: I guess you could try making the problem happen with a juju that's built without the patch	09:53
menn0	and when you have a reliable way of triggering the problem	09:54
menn0	rebuild with the patch and see if it goes away	09:54
babbageclunk	menn0: Well, I'm more concerned that the 5-retry thing just made it a bit less likely, but not really better.	09:54
menn0	or, you could hold off and do something else until we hear more from the QA peeps and stokachu	09:54
babbageclunk	menn0: I'll give it a couple more kicks and then get in touch with the US peeps.	09:55
menn0	you would think 5 would be enough...	09:55
babbageclunk	I would and did!	09:55
menn0	maybe a random short sleep between each loop would help?	09:55
menn0	ethernet style	09:55
babbageclunk	Yeah, could help - want to be sure it's happening first though.	09:56
menn0	for sure... need more info	09:56
babbageclunk	amusing - the test that was originally causing the problem in tests has been deleted.	09:57
babbageclunk	I mean, in our suite.	09:57
menn0	babbageclunk: for unrelated reasons?	09:58
babbageclunk	yeah, because address picking has been removed.	09:58
menn0	ha funny... still needs to be fixed of course	10:01
menn0	babbageclunk: I've got to go. I've got a literal mountain of washing to contend with.	10:01
babbageclunk	menn0: ok, thanks. Happy climbing!	10:03
mup	Bug #1604785 opened: repeatedly getting rsyslogd-2078 on node#0 /var/log/syslog <canonical-bootstack> <juju-core:New> <https://launchpad.net/bugs/1604785>	11:40
mup	Bug #1604787 opened: juju agents trying to log to 192.168.122.1:6514 (virbr0 IP) <canonical-bootstack> <juju-core:New> <https://launchpad.net/bugs/1604787>	11:40
frankban	cherylj: hey morning, could you please merge trivial http://reviews.vapour.ws/r/5280/ ?	11:51
cherylj	frankban: sure	11:55
frankban	cherylj: ty!	11:55
mup	Bug #1598272 changed: LogStreamIntSuite.TestFullRequest sometimes fails <ci> <intermittent-failure> <test-failure> <juju-core:Fix Released by fwereade> <https://launchpad.net/bugs/1598272>	12:10
stokachu	babbageclunk: retrying to reproduce this morning, was late last night for me	12:20
perrito666	morning all	12:22
frankban	cherylj: how do I check what failed at /var/lib/jenkins/workspace/github-merge-juju/artifacts/trusty-err.log ?	12:44
frankban	cherylj: sorry, at http://juju-ci.vapour.ws:8080/job/github-merge-juju/8475/console	12:44
cherylj	frankban: I've pinged mgz to take a look. I think it's a merge job failure	12:45
frankban	cherylj: ty	12:45
mup	Bug # changed: 1603596, 1604176, 1604408, 1604561, 1604644	12:46
perrito666	wallyworld: go to sleep?	13:31
wallyworld	ok, about that time	13:31
mup	Bug #1604817 opened: Race in github.com/juju/juju/featuretests <blocker> <ci> <intermittent-failure> <race-condition> <regression> <unit-tests> <juju-core:Triaged> <https://launchpad.net/bugs/1604817>	13:31
natefinch	wallyworld: if you he 2 minutes, I'd love it if you could just read and maybe quickly respond to a couple review comments I have: http://reviews.vapour.ws/r/5238/	13:33
natefinch	s/he/have	13:34
wallyworld	ok	13:34
wallyworld	natefinch: done	13:36
natefinch	wallyworld: thanks	13:36
natefinch	hey, we're down to just two blocking tests in master, awesome	13:37
natefinch	(sorta)	13:37
babbageclunk	fwereade: ping?	13:48
frankban	cherylj: should I try merge again?	13:48
fwereade	babbageclunk, pong	13:49
fwereade	babbageclunk, what can I do for you?	13:50
babbageclunk	fwereade: I'm trying to understand the relationship between container and machine provisioners.	13:50
babbageclunk	fwereade: Sorry, environ provisioners	13:50
babbageclunk	fwereade: (looking at bug 1585878)	13:51
mup	Bug #1585878: Removing a container does not remove the underlying MAAS device representing the container unless the host is also removed. <2.0> <hours> <maas-provider> <network> <reliability> <juju-core:Triaged by 2-xtian> <https://launchpad.net/bugs/1585878>	13:51
fwereade	babbageclunk, at the heart of a provisioner there is a simple idea: watch the machines and StartInstance/StopInstance in response	13:52
fwereade	babbageclunk, I think that's called ProvisionerTask?	13:52
babbageclunk	fwereade: yup, and it's the same between the environ and container provisioners.	13:53
babbageclunk	fwereade: but with different brokers, I think.	13:53
fwereade	babbageclunk, yeah, exactly	13:53
babbageclunk	fwereade: So it looks like the environ provisioner explicitly excludes containers from the things it watches	13:54
fwereade	babbageclunk, ultimately we should be able to just start each of them with a broker, an api facade, and knowledge of what set of machines they should watch	13:54
fwereade	babbageclunk, yeah, that should be encapsulated in what it watches	13:54
fwereade	babbageclunk, I expect they actually make different watch calls or something, though? :(	13:55
babbageclunk	fwereade: Ok - in the maas case I need to tell maas the container's gone away after getting rid of it.	13:55
cherylj	frankban: yes, looks like one PR went through, so something's working...	13:55
cherylj	frankban: so I'd retry	13:55
frankban	cherylj: retrying	13:56
fwereade	babbageclunk, ha, ok, let me think	13:56
babbageclunk	fwereade: Until I started saying this, I thought that the container broker didn't talk to the environ, but now I think that's wrong - it needs to tell it when it starts, right?	13:56
fwereade	babbageclunk, I am confident that a container provisioner should not talk to the environ directly, because that would entail distributing environ creds to every machine	13:57
babbageclunk	fwereade: Ok, that makes sense. So in order to clean up the maas record of the container, the environ provisioner would also need to watch containers, right?	13:58
babbageclunk	I should trace the start path so I can see where maas gets told about the container.	13:59
fwereade	babbageclunk, I would be most inclined to have a separate instance-cleanup worker on the controllers, fed by provisioners leaving messages (directly or indirectly) on instance destruction	14:00
babbageclunk	fwereade: leaving messages how? Files?	14:00
fwereade	babbageclunk, db docs?	14:00
babbageclunk	fwereade: oh, duh	14:00
fwereade	babbageclunk, ;p	14:00
fwereade	babbageclunk, there is a general problem with having all-the-necessary-stuff set up before a provisioner sees a machine to try to deploy	14:01
babbageclunk	fwereade: ok, so the container provisioner creates a record indicating that it killed a container, and a controller-based worker watches those and does the environ-level cleanup.	14:02
fwereade	babbageclunk, trying to set up networks etc in the provisioner is wilful SRP violation -- but I think we do have a PrepareContainerNetworking (or something) call that the provisioner task makes	14:02
babbageclunk	fwereade: ok	14:03
fwereade	babbageclunk, yeah, I would be grateful if we would cast it in terms that applied to machines and containers both, and didn't distinguish between them except in the worker that actually handles them	14:03
babbageclunk	fwereade: so that's in the environ provisioner - it talks to the provider.	14:03
fwereade	babbageclunk, I don't think any provisioner should be responsible for doing this work, I think it should be a separate instance-cleanup worker	14:04
babbageclunk	fwereade: (oops, that was in response to the prev)	14:04
fwereade	babbageclunk, yuck :)	14:04
babbageclunk	fwereade: Ok - so the provisioner task would just say "this instance needs cleaning"...	14:05
babbageclunk	fwereade: and then the new worker would see all of them and just do stuff for the containers for now.	14:05
fwereade	babbageclunk, so, really, that should be happening in an instance-preparer worker, which creates tokens watched by the appropriate provisioner, which can then only try to start instances that have all their deps ready	14:05
fwereade	babbageclunk, yeah, I think so	14:06
fwereade	(I refer, above, to the instance-prep work currently done by the provisioner, not to what you just said, which I agree with)	14:06
babbageclunk	fwereade: right, I was just going to check that.	14:06
babbageclunk	fwereade: sounds good, thanks!	14:07
fwereade	babbageclunk, note that there's an environ-tracker manifold available on environ manager machines already, it gets you a shared environ that's updated in the background, you don't need to dirty your worker up with those concerns	14:07
babbageclunk	fwereade: ok, I'll make sure to base my worker on that.	14:08
fwereade	babbageclunk, and it is called "environ-tracker", set up in agent/model.Manifolds	14:10
fwereade	babbageclunk, just use it as a dependency and focus the worker around the watch/response loop	14:10
fwereade	babbageclunk, you should then be able to just assume the environ's always up to date, and if you do race with a credential change or something it's nbd, just an error, fail out and let the mechanism bring you up to try again soon	14:11
babbageclunk	fwereade: ok	14:12
fwereade	babbageclunk, ...or, hmm. be careful about those errors, actually	14:12
fwereade	babbageclunk, we want those to be observable, I think	14:12
babbageclunk	fwereade: observable?	14:13
fwereade	babbageclunk, and we probably shouldn't mark the machine that used them dead until they've succeeded	14:13
fwereade	babbageclunk, report the error in status, I think, nothing should be competing for it by the time this is running	14:13
babbageclunk	fwereade: oh, gotch	14:14
babbageclunk	a	14:14
fwereade	babbageclunk, so, /sigh, this implies moving responsibility for set-machine-dead off the provisioner and onto the instance-cleaner	14:14
fwereade	babbageclunk, which is clearly architecturally sane, but a bit of a hassle	14:14
fwereade	babbageclunk, otherwise we'll be leaking resources and not having any entity against which to report the errors	14:15
fwereade	babbageclunk, sorry again: not set-machine-dead, but remove-machine	14:15
fwereade	babbageclunk, the machine agent sets itself dead to signal to the rest of the system that its resources should be cleaned up	14:15
babbageclunk	fwereade: ok	14:16
frankban	cherylj: looked at the tests and I've found that the failure is real for my branch. I have a fix already, but how do I check the tests that actually failed from the CI logs?	14:16
fwereade	babbageclunk, but we shouldn't remove it until both the instance (by the provisioner) and other associated resources (by instance-cleaner, maybe more in future) have been cleaned up	14:16
babbageclunk	fwereade: Yeah, that makes sense.	14:17
cherylj	frankban: go to your merge job: http://juju-ci.vapour.ws:8080/job/github-merge-juju/8475/	14:17
cherylj	frankban: and click trusty-err.log	14:17
fwereade	babbageclunk, ...and ofc that now implies that we will potentially have workers competing for status writes	14:17
=== natefinch is now known as natefinch-afk
cherylj	frankban: argh, looks like it failed to run again	14:17
cherylj	balloons, sinzui - can you take a look: http://juju-ci.vapour.ws:8080/job/github-merge-juju/8475/artifact/artifacts/trusty-err.log	14:17
fwereade	babbageclunk, so... it's not trivial, I'm afraid, but I can't think of any other things that'll interfere	14:18
frankban	cherylj: I am running 8477 now	14:18
frankban	cherylj: let's see if it will fail to run again, it should fail with 2 tests failures in theory	14:18
fwereade	babbageclunk, do you know what dimitern has been doing lately? I think he had semi-detailed plans for addressing the corresponding setup concerns but I'm not sure he started implementing them	14:19
cherylj	frankban: ah, well, when it completes you can view that trusty-err.log file for the test output	14:19
babbageclunk	fwereade: sorry, no - he's been away for the last week and a bit, not sure what he's working on at the moment.	14:19
frankban	cherylj: yes thank you, good to know	14:19
fwereade	babbageclunk, no worries	14:19
fwereade	babbageclunk, do sync up with him when he returns	14:20
babbageclunk	fwereade: hang on, why multiple workers competing to write status?	14:20
fwereade	babbageclunk, if the provisioner StopInstance fails that should report; if the instance-cleaner Whatever fails, that should also report	14:20
fwereade	babbageclunk, it might also be useful to look at what storageprovisioner has done	14:21
fwereade	babbageclunk, with the internal queue for delaying operations if they can't be done yet	14:21
babbageclunk	fwereade: Oh I see, so if both of them fail then an error in the provisioner might be hidden by one in the cleanup worker.	14:22
fwereade	babbageclunk, yeah, exactly	14:22
babbageclunk	fwereade: ok, that's heaps to go on with - I'll probably need more pointers once I'm a bit further along.	14:23
babbageclunk	fwereade: Thanks!	14:23
fwereade	babbageclunk, (nothing would be lost, because status-history, but it would be good to do better)	14:23
fwereade	babbageclunk, np	14:23
fwereade	babbageclunk, always a pleasure	14:23
mup	Bug #1604644 opened: juju2beta12: E11000 duplicate key error collection: juju.txns.stash <conjure> <mongodb> <juju-core:New> <https://launchpad.net/bugs/1604644>	14:25
sinzui	sorry cherylj: got pulled into a meeting. Go is writing errors to stdout. You can see the failure in http://juju-ci.vapour.ws:8080/job/github-merge-juju/8475/artifact/artifacts/trusty-out.log	14:43
sinzui	cherylj: I think we can create unified log so that the order of events and where to look are in a single place	14:43
rick_h_	katco: ping for standup	15:02
katco	rick_h_: oops omw	15:02
mup	Bug #1604883 opened: add us-west1 to gce regions in clouds via update-clouds <juju-core:New> <https://launchpad.net/bugs/1604883>	16:16
mup	Bug #1604883 changed: add us-west1 to gce regions in clouds via update-clouds <juju-core:New> <https://launchpad.net/bugs/1604883>	16:25
mup	Bug #1604883 opened: add us-west1 to gce regions in clouds via update-clouds <juju-core:New> <https://launchpad.net/bugs/1604883>	16:34
=== natefinch-afk is now known as natefinch
perrito666	anyone has spare time to review this http://reviews.vapour.ws/r/5282/diff/# ? its not a very short one, its part of a set of changes to support ControllerUser permissions, I am happy to discuss what this particular patch does if anyone goes for the rev	16:41
natefinch	rick_h_: I have a ship it for the interactive bootstrap stuff... should I push it through or wait for master to be unblocked?	16:45
rick_h_	natefinch: wait for master please atm	16:45
rick_h_	natefinch: just mark it as blocked on the card on master	16:46
=== frankban is now known as frankban|afk
natefinch	rick_h_: will do	16:51
rick_h_	natefinch: got a sec?	16:52
natefinch	rick_h_: yep	16:52
rick_h_	natefinch: https://hangouts.google.com/hangouts/_/canonical.com/rick?authuser=1	16:52
* rick_h_ goes for lunchables then		16:58
natefinch	god I love small unit tests	17:37
natefinch	I love that it tells me "you have an error in this 20 lines of code"	17:38
rick_h_	jcastro: marcoceppi arosales heads up docs switch is done and the jujucharms.com site is all 2.0 all the time https://jujucharms.com/docs	18:49
marcoceppi	rick_h_: yesssss	18:49
mup	Bug #1604915 opened: juju status message: "resolver loop error" <oil> <oil-2.0> <juju-core:New> <https://launchpad.net/bugs/1604915>	18:50
rick_h_	marcoceppi: will send an email shortly, want to check on status of b12 in xenial update to go along with it	18:50
mup	Bug #1604919 opened: juju-status stuck in pending on win2012hvr2 deployment <oil> <oil-2.0> <juju-core:New> <https://launchpad.net/bugs/1604919>	19:05
natefinch	rick_h_: output for interactive commands on stdout or stderr?	19:06
rick_h_	natefinch: so jam had some thoughts and added notes to the interactive spec on that	19:07
natefinch	rick_h_: ok, I was wondering who added that. it was incomplete so I was hoping for clarification	19:07
* rick_h_ loads doc to double check		19:07
rick_h_	natefinch: ah, yea looks like he didn't finish typing	19:08
natefinch	rick_h_: I know the answer for non-interactive commands, but not sure if it should be different for interactive	19:08
natefinch	rick_h_: given that there's no real scriptable output	19:08
natefinch	(I mean, you can script anything, but it's not made with that in mind)	19:09
rick_h_	natefinch: can you ping him to clarify the rest, but the start is there as far as for interactive I think that's the idea that the questions/etc should go to stderr, but if we confirm things "successfully added X" it's stdout	19:09
natefinch	rick_h_: ok, yeah, I'll talk to him about it.	19:09
rick_h_	natefinch: ty	19:09
natefinch	oh man... writing this package to handle the formatting of user interactions was the best idea I ever had.	19:15
natefinch	ok, maybe not the best idea ever. But... it's certainly saving my ass.	19:16
rick_h_	natefinch: <3	19:22
alexisb	natefinch, so that begs the question, what was your best idea ever	19:23
natefinch	alexisb: that's like the best set up for a joke I've ever had....	19:24
natefinch	alexisb: marrying my wife, obviously. Only slightly behind would be the idea to switch from Mechanical Engineering to Computer Science in school. Dodged a bullet there.	19:25
natefinch	I have a couple mech-e friends... they basically design screws all day long	19:26
alexisb	natefinch, yep	19:26
alexisb	I got to my first statics class, followed by drafting and went "o hell no!"	19:27
alexisb	I also had some time at Racor systems (Parker affiliate) and watched their engineers at a ProE screen all day	19:28
alexisb	no thank you	19:28
natefinch	yuuup	19:28
natefinch	I realized fairly early that I found physics fascinating in the abstract, but the reality of actually figuring shit out was mind-bogglingly boring.	19:29
alexisb	at racor, the actual factory was AWESOME, which is where I started with control systems	19:29
perrito666	wanna do some boring mech things, try calculating elevators for a living	19:30
perrito666	most revealing class I ever had	19:30
natefinch	friend of mine makes maglev elevators for things like aircraft carriers.... still pretty boring work in the small	19:30
natefinch	he'd probably say the same for my job, though ;)	19:31
natefinch	"So... you twiddled with carriage returns all day?"	19:31
perrito666	lol "so, found that missing statement?"	19:32
mup	Bug #1604931 opened: juju2beta12: unable to destroy controller properly on localhost <conjure> <juju-core:New> <https://launchpad.net/bugs/1604931>	19:32
perrito666	but I was talking about building elevators, I actually had to spend a semester calculating those	19:32
* rick_h_ runs to get the boy from school		19:38
arosales	rick_h_: great to hear thanks for the fyi	19:53
natefinch	are we supposed to be able to add-cloud for providers like ec2?	20:30
natefinch	it doesn't look like we're stopping people from doing that	20:31
mup	Bug #1604955 opened: TestUpdateStatusTicker can fail with timeout <ci> <intermittent-failure> <test-failure> <juju-core:Triaged> <https://launchpad.net/bugs/1604955>	20:35
mup	Bug #1604959 opened: Failed restore juju.txns.stash 'collection already exists' <backup-restore> <ci> <intermittent-failure> <juju-core:Triaged> <https://launchpad.net/bugs/1604959>	20:44
natefinch	rick_h_: It's a little weird that clicking on "stable" in jujucharms brings you to the 2.0 docs, which say at the top, in red "Juju 2.0 is currently in beta which is updated frequently. We don’t recommend using it for production deployments."	20:48
natefinch	the problem with MAAS API URL is that it looks like I'm shouting, but really it's just TLA proliferation	20:58
=== natefinch is now known as natefinch-afk
mup	Bug #1604961 opened: TestWaitSSHRefreshAddresses can fail on windows <ci> <intermittent-failure> <test-failure> <windows> <juju-core:Triaged> <https://launchpad.net/bugs/1604961>	21:11
mup	Bug #1604965 opened: machine stays in pending state even though node has been marked failed deployment in MAAS <oil> <oil-2.0> <juju-core:New> <https://launchpad.net/bugs/1604965>	21:11
redir	you sure are	21:17
redir	ignore ^^	21:19
menn0	perrito666: ping	21:25
perrito666	menn0: ping	21:26
perrito666	sorry pong	21:27
perrito666	menn0: did I break something?	21:27
menn0	perrito666: no, I just wanted to apologise for not getting to your ACLs PR yesterday... the day got swallowed up by critical bugs	21:28
menn0	perrito666: I was about to review now and see that it's been discarded?	21:28
perrito666	menn0: oh, no worries, you would not have been able to review it yesterday anyway, it had a dependency on an unmerged branch and the diff was incomprehensible, I dropped it, merged the pending branch and re-proposed	21:29
perrito666	RB is really misleading, I thought that adding the dependency on the depends on field would fix the diff but did nothing at all and then it would not allow me to upload my own diff	21:30
=== akhavr1 is now known as akhavr
perrito666	I think we should change RB for something a bit more useful, like snapchat	21:30
menn0	perrito666: LOL :)	21:30
menn0	perrito666: I've been wondering about Gerrit or Phabricator, they seem like the best alternatives	21:31
perrito666	I checked one of those during the sprint, and I liked it, I can't remember which one though, Phabricator I think	21:31
perrito666	menn0: also, the only person that knew something about our RB is no longer in this team which makes an interesting SPOF	21:32
menn0	perrito666: I don't think the ops side of RB is particularly hard	21:32
menn0	perrito666: and I think the details are written down /somewhere/	21:32
perrito666	menn0: I fear that the certain somewhere is an email :s	21:33
perrito666	anyway, eric usually knew the dark secrets like how to actually make a branch depend on	21:33
menn0	perrito666: phab is nice. I've used it a bit at one job. it supports enforcing a fairly strict (but customized) development process.	21:33
menn0	perrito666: you do this: rbt post --disable-ssl-verification -r <review_number> --parent <parent_branch>	21:34
menn0	perrito666: and then check how it looks on the RB website and hit Publish	21:34
katco	menn0: perrito666: i've been interested in how this works out for teams: https://github.com/google/git-appraise	21:35
perrito666	menn0: ah, I need some non magic interaction :)	21:35
perrito666	menn0: if you ask me (and even if you dont) if it cant be done on the website, its broken	21:35
menn0	perrito666: I think you can upload arbitrary diffs to RB... but I've never done it	21:36
menn0	katco: looks interesting! I hadn't heard of git-appraise before	21:36
* menn0 reads more		21:36
katco	menn0: i enjoy the decentralized nature. no ops needed	21:37
katco	menn0: or at least i think i would. i've never used this	21:37
perrito666	menn0: well I actually tried, It seems to assume rb has something it doesnt, we might have broken that particular workflow with our magic bot	21:37
perrito666	katco: that looks amazing but seems to not work very nicely with github workflow (which we sort of use)	21:39
menn0	katco: storing the reviews in git is a nice idea. the way you add comments is a little unfriendly though. I guess the expectation is that someone will create a UI/tool for that.	21:39
katco	perrito666: just saw this: https://github.com/google/git-pull-request-mirror	21:40
katco	menn0: and just found this: https://github.com/google/git-appraise-web	21:40
redir	who's the resident data race expert?	21:41
perrito666	katco: mm, really interesting, do you know actual users of this, I am interested in seeing how it behaves in heavily conflictive envs	21:41
perrito666	redir: we all are good adding data races :p	21:41
redir	perrito666: OK who's the resident data race tortoise?	21:42
perrito666	redir: well, you are not in luck, its dave cheney :p	21:42
menn0	katco: that improves the situation somewhat! :)	21:42
perrito666	redir: just throw the problem to the field and well see how can we attack it	21:42
katco	perrito666: i do not. this looks fairly active? https://git-appraise-web.appspot.com/static/reviews.html#?repo=23824c029398	21:43
perrito666	man, was that english broken or what? ;p I am loosing my linguistic skills	21:43
redir	I think it is pretty straightforward	21:43
perrito666	katco: very interesting, I really like the idea of storage of these things In the repo	21:46
perrito666	but ill say something very shallow	21:46
redir	https://github.com/go-mgo/mgo/blob/v2/socket.go#L329 needs to be locked so it doesn't race with https://github.com/go-mgo/mgo/blob/v2/stats.go#L59	21:46
redir	I think	21:46
perrito666	the UI is ugly as f***	21:46
redir	trouble reproducing	21:46
katco	perrito666: it is certainly spartan	21:46
katco	perrito666: personally, i would be writing an emacs plugin for this if someone hasn't already	21:47
katco	redir: why are stats being reset before kill has been returned? i think there's a logic bomb there	21:48
perrito666	I dont know what kind of spartans you know, the ones from the movie certainly look better than that UI :p	21:49
katco	perrito666: sorry, i intended this usage: "adj.Simple, frugal, or austere: a Spartan diet; a spartan lifestyle."	21:49
perrito666	katco: I know, I intended to : " troll (/ˈtroʊl/, /ˈtrɒl/) is a person who sows discord on the Internet by starting arguments or upsetting people,"	21:51
katco	lol	21:51
perrito666	redir: while killing, imho you should be locking everything indeed, but I have not checked past these two links to know if I am speaking the truth about this particular issue	21:52
perrito666	katco: I do dislike the ui though, I prefer something like github without the insane one mail per comment thing	21:53
katco	redir: also i don't think that's the race. socketsAlive locks the mutex before doing anything: https://github.com/go-mgo/mgo/blob/v2/stats.go#L135	21:55
redir	mkay thanks	21:56
redir	perrito666: katco ^	21:56
perrito666	moving to a silent neighbourhood is glorious for work	21:59
mup	Bug #1604988 opened: Inconsistent licence in github.com/juju/utils/series <jujuqa> <packaging> <juju-core:Triaged> <juju-core 1.25:Triaged> <juju-core (Ubuntu):New> <https://launchpad.net/bugs/1604988>	22:05
menn0	katco: you're convincing me that we should experiment with vendoring some more :)	22:43
katco	menn0: eep...	22:43
katco	menn0: as long as how go does vendoring is well understood, i'm happy. i am scared of diverging too much without forethought	22:43
menn0	katco: sure... it's not something we should do lightly. and if we do it, it should use Go's standard mechanism.	22:44
perrito666	menn0: re our previous talk http://reviews.vapour.ws/r/5282/diff/#	22:44
katco	menn0: yeah, agreed	22:45
menn0	perrito666: ok. I can take a look.	22:48
menn0	perrito666: my initial comment is that I wish this was 2 PRs: one for state and one for apiserver (but I will cope)	22:49
perrito666	menn0: I am sorry I promise I tried to make it smaller	22:50
perrito666	menn0: its smaller than it looks though, small changes in many files	22:51
katco	menn0: i think i messed up the tech board permissions. i was trying to get a link and it looked publicly accessible, so i disabled that. now i can't view it	22:52
menn0	katco: I'll take a look	22:53
katco	menn0: sorry about that	22:53
menn0	katco: you completely removed canonical access :) not sure how to put it back yet	22:53
mup	Bug #1605008 opened: juju2beta12 and maas2rc2: juju status shows 'failed deployment' for node that was 'deployed' in maas <oil> <oil-2.0> <juju-core:New> <MAAS:New> <https://launchpad.net/bugs/1605008>	22:54
katco	menn0: wait what! all i did was turn off link sharing :(	22:54
menn0	katco: figured it out. what was it before? anyone at canonical can edit or view?	22:54
menn0	or comment?	22:55
katco	menn0: could comment i think, but it looked like external people with link could view as well	22:55
menn0	katco: ok, it's fixed. anyone from canonical can comment again.	22:56
katco	menn0: ta, sorry	22:56
axw	wallyworld: did I miss anything on the call? slept through my alarm supposedly, pretty sure it didn't go off though	22:57
axw	need to ditch this dodgy phone	22:57
perrito666	axw: or get a clock	22:58
wallyworld	axw: not a great deal, just release recap, tech board summary	22:58
axw	perrito666: could do that too, I'd rather have it near my head so I don't wake up my wife	22:58
axw	suppose I could move the clock...	22:58
perrito666	axw: get a deaf people clock	22:58
axw	wallyworld: ok, ta	22:58
perrito666	(not trolling, these are a thing)	22:58
axw	perrito666: ah, have not seen one	22:59
perrito666	they have a thing that you put in your pillow and it vibrates	22:59
perrito666	much like your phone, but less points of failure	22:59
axw	I guess I could just use my fitbit then. if I can find it, and my charger...	22:59
mup	Bug #1605008 changed: juju2beta12 and maas2rc2: juju status shows 'failed deployment' for node that was 'deployed' in maas <oil> <oil-2.0> <juju-core:New> <MAAS:New> <https://launchpad.net/bugs/1605008>	23:00
axw	anyway	23:00
* axw stops debugging alarm replacement issues		23:00
mup	Bug #1605008 opened: juju2beta12 and maas2rc2: juju status shows 'failed deployment' for node that was 'deployed' in maas <oil> <oil-2.0> <juju-core:New> <MAAS:New> <https://launchpad.net/bugs/1605008>	23:06
=== akhavr1 is now known as akhavr
alexisb	axw, thumper ping	23:18
axw	coming, sorry	23:18
thumper	coming	23:18
=== akhavr1 is now known as akhavr
redir	axw: thanks for the protip in the review. helpful	23:49
axw	redir: np	23:49
thumper	menn0: so you are working with redir on the race?	23:57
