wallyworld | veebers: do you know if bug 1621576 is in progress? | 00:03 |
---|---|---|
mup | Bug #1621576: get-set-unset config will be renamed <juju-ci-tools:Triaged> <https://launchpad.net/bugs/1621576> | 00:03 |
veebers | wallyworld: it looks like curtis landed the fix in the tests for that. Seems he didn't update the bug. I'll confirm with him that it's finished (but it looks like it is) | 00:09 |
wallyworld | veebers: awesome. the reason for asking is that we will be looking to land the code changes to juju to use the new command syntax | 00:09 |
wallyworld | and without the ci script changes, ci will break | 00:09 |
veebers | wallyworld: ack, I'll have confirmation for you tomorrow :-) But I'm pretty sure that fix is complete | 00:15 |
wallyworld | ty | 00:15 |
wallyworld | the juju pr needs a little fixing, so it won't be ready until the US comes back online anyway | 00:16 |
mup | Bug #1560487 changed: local provider fails to create lxc container from template <canonical-is> <local-provider> <juju-core:Won't Fix> <juju-core 1.25:Triaged by alexis-bruemmer> <OPNFV:New> <https://launchpad.net/bugs/1560487> | 00:45 |
menn0 | axw: easy one: http://reviews.vapour.ws/r/5646/ | 01:56 |
axw | menn0: LGTM | 01:57 |
menn0 | axw: thanks | 01:59 |
perrito666 | Night all, does anyone happen to know dimitern's mobile? | 02:50 |
menn0 | wallyworld: another migration cleanup: http://reviews.vapour.ws/r/5648/ | 03:06 |
wallyworld | ok | 03:07 |
anastasiamac | wallyworld: reinstatement (?) of vsphere supported architectures - http://reviews.vapour.ws/r/5649/ | 03:11 |
wallyworld | ok, on my list | 03:12 |
anastasiamac | :) | 03:12 |
rick_h_ | perrito666: not here, you make it in? | 03:15 |
rick_h_ | wallyworld: do you have the config changes spec handy? I thought application config was just config now? | 04:02 |
wallyworld | rick_h_: it is | 04:02 |
wallyworld | but not in beta18 | 04:02 |
rick_h_ | wallyworld: did it not make b18? oh crap | 04:02 |
wallyworld | PR up today, will land tonight | 04:02 |
rick_h_ | ah, thought b18 got all but one commit | 04:02 |
rick_h_ | ah, gotcha | 04:02 |
wallyworld | rick_h_: sorry :-( | 04:02 |
wallyworld | we just ran out of time | 04:03 |
rick_h_ | wallyworld: all good, just working on my slides for tomorrow and checking my thoughts vs reality in the beta | 04:03 |
wallyworld | needed to coordinate with CI etc | 04:03 |
rick_h_ | wallyworld: understand | 04:03 |
wallyworld | put an asterisk :=) | 04:03 |
rick_h_ | yep, will work it out | 04:03 |
wallyworld | rick_h_: also, i will have a fix today for machines not reporting as Down when they get killed | 04:04 |
wallyworld | just a cosmetic thing, but very annoying | 04:04 |
rick_h_ | wallyworld: <3 | 04:04 |
wallyworld | especially if you are an admin trying to script whether to enable ha or not | 04:05 |
anastasiamac | ... if CI gets a run without a failure, that is; all landings I've seen today report similar failures :) | 04:13 |
perrito666 | rick_h_: just getting out of the airport after an hour or more in the immigration queue. I wanted to message him to get dinner but I guess I'll be arriving too late | 04:17 |
rick_h_ | perrito666: gotcha, sucky on the queue fun | 04:19 |
perrito666 | Happens :) seems I picked an especially busy day | 04:21 |
perrito666 | Juju is in town so all these people are coming for the charmer summit, evidently :p | 04:23 |
menn0 | wallyworld, axw: do you know if anyone is looking into all the test timeouts in apiserver/applications | 04:27 |
menn0 | it's happened to me and lots of other merge attempts it seems | 04:27 |
menn0 | apiserver/application | 04:27 |
axw | don't know | 04:27 |
wallyworld | menn0: i'm not, i haven't been monitoring landing bot today | 04:27 |
veebers | menn0: you're seeing this in the merge job? (anastasiamac ^^) | 04:27 |
menn0 | wallyworld: ok... i'll start looking | 04:27 |
wallyworld | damn, something broke | 04:27 |
wallyworld | menn0: i am fixing the annoying go cookies issue | 04:28 |
menn0 | veebers: yep, most merge attempts today have failed because of this | 04:28 |
menn0 | so someone managed to land something which is failing most of the time | 04:28 |
* menn0 hopes it wasn't him :) | 04:28 | |
veebers | menn0: right, I was checking to see if it was CI/infra related. I've changed which machine the next run will happen on in hopes it might help. | 04:28 |
menn0 | veebers: ok thanks. | 04:29 |
menn0 | veebers: I can't repro the problem locally of course | 04:29 |
veebers | menn0: heh :-\ always the way. FYI the last merge that passed on that job was: "fwereade charm-life" (http://juju-ci.vapour.ws:8080/job/github-merge-juju/9167/) | 04:30 |
veebers | menn0: I'll track the next queued up job that will run on the older machine and let you know how it gets on | 04:30 |
menn0 | wallyworld, axw, anastasiamac: the stuck test appears to be TestAddCharmConcurrently if that rings any bells? | 04:30 |
anastasiamac | menn0: no bells but veebers pointed out the commit ^^ that seems to b the culprit :D | 04:31 |
* anastasiamac have to get a kid from school, b back l8r | 04:32 | |
menn0 | veebers: cool, I'll start looking at that merge | 04:33 |
anastasiamac | wallyworld: m considering removing arch caching from vsphere on the current pr as well.. any idea how heavily supported architectures retrieval is used? | 04:36 |
anastasiamac | wallyworld: it'll b calling simplestreams image retrieval every time the constraints validator is constructed... | 04:37 |
wallyworld | in a couple of places | 04:37 |
wallyworld | twice in one api call | 04:37 |
wallyworld | when adding a machine i think | 04:37 |
anastasiamac | wallyworld: k.. i'll leave it cached for now.. let's tackle it later for 2.1 maybe... | 04:38 |
wallyworld | that's from memory though | 04:38 |
wallyworld | would need to check code again | 04:38 |
anastasiamac | wallyworld: k.. i've created a separate bug for it and we'll address separately then | 04:39 |
anastasiamac | maybe we'll even have some help with performance benchmarking (veebers :D) to determine how much better/worse we'd do without caching supported architectures :) | 04:39 |
veebers | heh :-) | 04:40 |
wallyworld | menn0: you're busy, if you get a chance later, here's that status fix http://reviews.vapour.ws/r/5651/. If no time, I can ask elsewhere | 04:45 |
menn0 | wallyworld: looking now | 04:47 |
wallyworld | ta | 04:48 |
menn0 | wallyworld: good stuff. | 04:55 |
menn0 | wallyworld: i'm creating a card now as the migrations prechecks will need to use this too | 04:55 |
wallyworld | menn0: thanks menno. btw did you book your flights yet? when are you arriving/leaving? | 04:55 |
menn0 | wallyworld: I've sent the email to the agent but haven't heard back yet (unsurprisingly since they're not at work yet) | 04:57 |
wallyworld | you looking to arrive the first sat and leave the following sat? | 04:57 |
menn0 | wallyworld: I'm likely to be leaving on Saturday night, which gets me in on Sunday evening | 04:58 |
menn0 | wallyworld: leaving the sprint on Saturday morning | 04:58 |
wallyworld | you'll miss drinks :-) | 04:58 |
menn0 | wallyworld: possibly | 04:58 |
menn0 | wallyworld: sometimes they're a bit later | 04:58 |
wallyworld | depending on flights, i'm going to try and arrive sat evening | 04:59 |
menn0 | wallyworld: so it looks like Will hit the timeout in apiserver/application twice while trying to merge. He assumed it was bug 1596960 | 05:15 |
mup | Bug #1596960: Intermittent test timeout in application tests <tech-debt> <unit-tests> <juju:Triaged> <https://launchpad.net/bugs/1596960> | 05:15 |
menn0 | but that one says it's only on windows | 05:15 |
menn0 | I'm guessing his changes have made it more likely to happen | 05:15 |
wallyworld | damn, sounds plausible | 05:15 |
wallyworld | looks messy as well | 05:17 |
axw | wallyworld: will you have a chance to look at that ca-cert issue? I'm trying to stay focused on azure | 05:54 |
wallyworld | axw: yeah, i can look | 05:55 |
wallyworld | axw: just read emails, so the cert issue is just disabling the validation check IIANM | 05:56 |
axw | wallyworld: see uros's latest email, there's also an issue with credentials looking up provider based on controller cloud | 05:57 |
axw | which seems wrong... | 05:57 |
wallyworld | yeah | 05:59 |
blahdeblah | Quick Q: in order for a unit to log to rsyslog on node 0, should there be a rule in the secgroup that allows access to tcp port 6514? And should juju add this automatically? | 07:26 |
=== frankban|afk is now known as frankban | ||
wallyworld | urulama: http://reviews.vapour.ws/r/5652/ FYI | 07:52 |
urulama | thanks | 07:54 |
wallyworld | blahdeblah: units can ask for ports to be opened on a bespoke basis | 07:54 |
wallyworld | it's not something we'd do unilaterally | 07:55 |
blahdeblah | wallyworld: so it wouldn't be done as part of add-unit when a machine is added via the manual provider? | 07:55 |
urulama | wallyworld: been running it with that fix since axw pointed it out :) | 07:55 |
wallyworld | blahdeblah: not that i am aware of. manual provider assumes pretty much that everything is in place. juju tends to try not to mess with manual machines | 07:57 |
blahdeblah | wallyworld: OK - thanks | 07:57 |
wallyworld | urulama: i was hinting for a review from your folks :-) | 07:58 |
wallyworld | axw: fyi urulama thinks that add-model issue may be with the controller proxy, so we're off the hook for now | 07:59 |
axw | wallyworld urulama: yeah, I think it's most likely due to something around Cloud.DefaultCloud and/or Cloud.Cloud | 08:00 |
wallyworld | axw: yep, i traced it to the cli making an api call to client.Cloud() and it's all goo in core | 08:00 |
wallyworld | good | 08:00 |
wallyworld | but something missing in proxy most likely | 08:00 |
voidspace | babbageclunk: https://github.com/juju/juju/compare/master...voidspace:1534103-run-action | 09:55 |
* frobware needs to run an errand; back in an hour. | 09:57 | |
fwereade | voidspace, may I have a 5-second review of http://reviews.vapour.ws/r/5653/ please? | 10:04 |
fwereade | voidspace, apparently it has been failing a bunch | 10:04 |
voidspace | fwereade: ok | 10:05 |
voidspace | fwereade: LGTM | 10:12 |
fwereade | voidspace, ta | 10:13 |
mup | Bug #1594977 changed: Better generate-image help <helpdocs> <oil-2.0> <v-pil> <juju:Triaged> <https://launchpad.net/bugs/1594977> | 11:55 |
mup | Bug #1622581 opened: Cryptic error message when using bad GCE credentials <juju-core:New> <https://launchpad.net/bugs/1622581> | 11:55 |
mup | Bug #1622581 changed: Cryptic error message when using bad GCE credentials <juju-core:New> <https://launchpad.net/bugs/1622581> | 12:19 |
fwereade | is anyone free for a ramble about cleanups with a detour into refcounting? axw, babbageclunk? | 13:05 |
babbageclunk | yup yup | 13:12 |
fwereade | babbageclunk, so, the refcount stuff I extracted | 13:14 |
fwereade | babbageclunk, short version: it's safe in parallel but not in serial | 13:14 |
babbageclunk | babbageclunk: ? | 13:14 |
natefinch | fwereade: that is impressive | 13:15 |
voidspace | that's impressive | 13:15 |
babbageclunk | I didn't think that was a thing we needed to worry about. | 13:15 |
voidspace | hard to do | 13:15 |
natefinch | voidspace: hi5 | 13:15 |
voidspace | o/ | 13:15 |
fwereade | babbageclunk, i.e. refcount is 2; 2 separate transactions decref; one will fail, reread with refcount 1, successfully hit 0 and detect | 13:15 |
voidspace | natefinch: :-) | 13:15 |
fwereade | voidspace, natefinch: I'm rather proud of it, indeed | 13:15 |
natefinch | lol | 13:15 |
babbageclunk | but isn't serial just slow parallel? | 13:16 |
fwereade | babbageclunk, refcount is 2, one transaction gets composed of separate ops that hit the same refcount: it will decref it to 0, but won't ever "realise" it did so, so there's no guaranteed this-will-hit-0 detection | 13:16 |
babbageclunk | ugh | 13:17 |
fwereade | babbageclunk, we're always composing transactions from ops based on a read state from before the txn started | 13:17 |
babbageclunk | All the asserts happen before all of the ops? | 13:17 |
fwereade | babbageclunk, yeah | 13:17 |
fwereade | babbageclunk, that's how it works | 13:17 |
babbageclunk | of course. ouch. so each assert passes, but they leave it at 0 with no cleanup | 13:18 |
fwereade | babbageclunk, yeah, exactly | 13:18 |
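
To make the race above concrete, here is a minimal Go sketch of the pattern being described, written in the style of mgo/txn ops. The collection names, document IDs, assert shape, and the `removeApplicationOps` helper are hypothetical illustrations, not Juju's actual state code.

```go
package refcount

import (
	"gopkg.in/mgo.v2/bson"
	"gopkg.in/mgo.v2/txn"
)

// decRefOps builds ops from a refcount read *before* the transaction runs,
// which is how ops-generation works throughout this discussion.
func decRefOps(readCount int) []txn.Op {
	ops := []txn.Op{{
		C:      "refcounts",
		Id:     "application#mysql",
		Assert: bson.D{{"refcount", readCount}},
		Update: bson.M{"$inc": bson.M{"refcount": -1}},
	}}
	if readCount == 1 {
		// Parallel case: two independent decref txns race; the loser's assert
		// (stale count) aborts it, it rebuilds with readCount == 1, takes this
		// branch, and the zero is detected. Safe.
		//
		// Serial case: one txn is composed of two of these op sets, both built
		// from readCount == 2. Per the discussion above, the asserts are all
		// checked against the pre-txn state, so both pass, the doc ends at 0,
		// and this branch is never taken -- nothing notices the count hit zero.
		ops = append(ops, removeApplicationOps("mysql")...)
	}
	return ops
}

// removeApplicationOps stands in for whatever final-removal ops would apply.
func removeApplicationOps(name string) []txn.Op {
	return []txn.Op{{C: "applications", Id: name, Remove: true}}
}
```
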
voidspace | fwereade: you have two days to fix this, right | 13:19 |
fwereade | voidspace, perhaps :) | 13:19 |
voidspace | :-) | 13:19 |
fwereade | voidspace, stateful refcount thingy is one, wanton spamming of possibly-needed cleanups is another | 13:20 |
fwereade | voidspace, I'm slightly hopeful that you have a third? | 13:20 |
voidspace | I hope so too | 13:20 |
fwereade | voidspace, oh | 13:21 |
voidspace | oh, no | 13:21 |
voidspace | sorry | 13:21 |
fwereade | voidspace, days, I thought you said ways | 13:21 |
voidspace | fwereade: I hope I have at least a third day | 13:21 |
fwereade | voidspace, I would imagine so ;) | 13:21 |
* babbageclunk lols sadly. | 13:21 | |
voidspace | fwereade: unless they purge juju of everyone you know... | 13:21 |
voidspace | fresh start and all that | 13:21 |
fwereade | voidspace, we have always been at war with... | 13:22 |
voidspace | :-) | 13:22 |
fwereade | voidspace, babbageclunk: so on the one hand there is this problem with txns | 13:23 |
fwereade | voidspace, babbageclunk: and it's one that bites us increasingly hard as we try to isolate and decouple individual changes to the db | 13:24 |
fwereade | voidspace, babbageclunk: and I don't really have an answer to either the problem or the increased risk we take on as we further isolate ops-generation | 13:25 |
babbageclunk | Why do we compose these operations into one transaction? Shouldn't they be multiple transactions? | 13:28 |
mup | Bug #1622136 changed: Interfaces file source an outside file for IP assignment to management interface <juju:Triaged by rharding> <https://launchpad.net/bugs/1622136> | 13:28 |
fwereade | babbageclunk, that is basically where I'm going | 13:28 |
babbageclunk | Not sure how we could prevent it though | 13:28 |
fwereade | babbageclunk, I cannot, indeed, think of a reason that the app remove ops have to be bundled into the final-unit-remove ops | 13:29 |
fwereade | babbageclunk, and in fact, that approach is itself vulnerable to that style of bug -- if we wrap up the final *2* unit-removes, we'd miss the app-remove code | 13:30 |
babbageclunk | fwereade: But it would be nice if the transaction system could prevent you from combining these transactions together somehow since they're not valid. | 13:30 |
fwereade | babbageclunk, that would probably be sensible, but I can't see any non-blunt ways of doing it -- only one op per doc, I guess? but that works *hard* against any prospect of decomposition | 13:32 |
fwereade | babbageclunk, the usual escape valve is cleanup ops, ofc -- you can apply a partial change and leave a note to pick it up later, and that's great | 13:33 |
babbageclunk | fwereade: can it be more fine-grained than that - one op touching any attribute of a doc in one transaction? | 13:33 |
fwereade | babbageclunk, perhaps so, but it sorta sucks not to be able to incref unitcount by 5, for example | 13:34 |
babbageclunk | (Not sure how easy that would be to do in the mongo expression lang) | 13:34 |
babbageclunk | true | 13:34 |
fwereade | babbageclunk, and anything at the txn layer has sort of lost the real context of the ops, so it's likely hard/impossible to DTRT re compressing ops into one | 13:35 |
fwereade | babbageclunk, (I heartily support this style of thinking, I just don't think I can do much about it in 2 days, hence cleanups) | 13:36 |
babbageclunk | fwereade: yeah, it seems like it would be hard to do that in a generic way - I can see it working for refcounts, but I'm sure the same problem can come from other things harder to reason about. | 13:37 |
babbageclunk | so, cleanups! | 13:37 |
fwereade | babbageclunk, so, if we simplify unit removal (and relation removal, same latent bug) such that it doesn't even care about app refcounts, and just read life and drop in a maybe-clean-the-app-up op | 13:38 |
fwereade | babbageclunk, the cleanups will run and everyone is happy | 13:38 |
fwereade | babbageclunk, except that the time taken to remove a service once its last unit goes away has gone from ~0s to 5-10s | 13:39 |
fwereade | babbageclunk, because the cleanups run on the watcher schedule | 13:39 |
babbageclunk | fwereade: Oh, 'cause that's when a cleanup will run. | 13:39 |
voidspace | so that's the "spam extra cleanup checks" approach | 13:39 |
voidspace | but removing the service once the units have gone is *mostly* an admin action, right? | 13:40 |
fwereade | babbageclunk, yeah -- and the more we do this, the better our decoupling but the more we'll see cleanups spawning cleanups and require ever more generations to actually get where we're going | 13:40 |
voidspace | or is there resource removal that only happens at cleanup time too? | 13:40 |
fwereade | voidspace, yeah, but you can't deploy another app with the same name, for example | 13:40 |
voidspace | right | 13:40 |
voidspace | is that a common need? | 13:41 |
voidspace | maybe I guess | 13:41 |
babbageclunk | do watcher polling more frequently! | 13:41 |
babbageclunk | ;) | 13:41 |
fwereade | babbageclunk, that is certainly an option, and it does speed things up, but it's also the sort of tuning parameter that I am loath to fiddle with without paying close attention to the Nth-order effects at various scales and so on | 13:42 |
babbageclunk | What about rather than dropping a cleanup you drop another txn that does the removal, with an assert that the refcount's 0? | 13:43 |
fwereade | babbageclunk, can't guarantee they both apply -- that is the purpose of a cleanup, to queue another txn, really | 13:43 |
babbageclunk | Ah, no - the cleanup gets created in the txn, right? | 13:43 |
fwereade | babbageclunk, and you can't really write ops for future execution in the general case -- if they fail, there's no attached logic to recreate or forget about them, we can only forget | 13:44 |
fwereade | babbageclunk, voidspace: anyway, one watcher-tick delay is not so terrible | 13:45 |
babbageclunk | no | 13:46 |
fwereade | babbageclunk, voidspace: so I was thinking I could just tweak the cleanup worker: expose NeedsCleanup, and check it in a loop that cleans up until nothing's left | 13:47 |
fwereade | babbageclunk, voidspace: which at least gives us freedom to explore more-staggered cleanup ops without macro-visible impact | 13:48 |
fwereade | babbageclunk, voidspace: and which I can probably get done fairly quickly | 13:48 |
voidspace | sounds reasonable | 13:48 |
babbageclunk | +1 | 13:49 |
fwereade | babbageclunk, voidspace: barring unexpected surprises in trying to separate service-remove from unit-remove | 13:49 |
fwereade | babbageclunk, voidspace: excellent, thank you | 13:49 |
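
A rough sketch of the worker tweak proposed above: keep running cleanup passes until NeedsCleanup reports nothing left, so chained cleanups complete within a single watcher tick. The `StateCleaner` interface and function names here are assumptions for illustration; the real cleaner worker's API may differ.

```go
package cleaner

import "time"

// StateCleaner is a hypothetical stand-in for the state methods the cleanup
// worker would need; only NeedsCleanup is named in the discussion above.
type StateCleaner interface {
	Cleanup() error              // run one pass over the queued cleanup docs
	NeedsCleanup() (bool, error) // report whether any cleanup docs remain
}

// cleanupUntilDone runs cleanup passes until no cleanup documents remain, so
// cleanups that enqueue further cleanups (e.g. removing an application after
// its last unit dies) finish in one watcher tick instead of one tick per
// generation of cleanups.
func cleanupUntilDone(st StateCleaner) error {
	for {
		if err := st.Cleanup(); err != nil {
			return err
		}
		more, err := st.NeedsCleanup()
		if err != nil {
			return err
		}
		if !more {
			return nil
		}
		// Brief pause so a pathological self-perpetuating cleanup cannot spin hot.
		time.Sleep(10 * time.Millisecond)
	}
}
```
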
voidspace | # github.com/juju/juju/cmd/jujud | 14:11 |
voidspace | /usr/lib/go-1.6/pkg/tool/linux_amd64/link: running gcc failed: fork/exec /usr/bin/gcc: cannot allocate memory | 14:11 |
mgz | yeah, I suffer a fair bit from that | 14:12 |
mgz | linking with 1.6 takes a lot of memory | 14:12 |
voidspace | time to switch to 1.7 then I guess | 14:12 |
voidspace | I haven't seen it before and now I'm seeing it consistently with master | 14:16 |
voidspace_ | ok, so a reboot fixed the memory issues | 14:55 |
dimitern | frobware: hey, not sure if you've seen my PM | 15:34 |
dimitern | frobware: here's the PR I'm talking about: https://github.com/juju/juju/pull/6219 | 15:34 |
=== hml_ is now known as hml | ||
fwereade | babbageclunk, voidspace_: I think I have a happier medium, in case I don't land anything else: http://reviews.vapour.ws/r/5644/ | 15:51 |
fwereade | babbageclunk, voidspace_: would either of you be free to take a look before EOD? | 15:51 |
babbageclunk | fwereade: Sure, looking now | 15:52 |
fwereade | babbageclunk, tyvm | 15:52 |
redir | morning juju-dev | 16:01 |
babbageclunk | fwereade: Sorry, I got distracted - still looking! | 16:48 |
voidspace | fwereade: you still here? | 16:51 |
fwereade | babbageclunk, voidspace: heyhey | 17:00 |
voidspace | fwereade: so this implementation of a failaction operation seems to work and "do the right thing" https://github.com/juju/juju/compare/master...voidspace:1534103-run-action#diff-ae955475ac58e0d2683d2cfd6101b3f7R1 | 17:01 |
=== frankban is now known as frankban|afk | ||
voidspace | fwereade: which is mostly copied from runaction.go | 17:03 |
fwereade | voidspace, that certainly looks sane to me | 17:07 |
voidspace | fwereade: cool, it seems to fix the bug and behave sanely - so I'll add tests and propose | 17:07 |
fwereade | voidspace, cool, tyvm | 17:16 |
perrito666 | hey, juju restore survives suspending the machine for 10 mins, sweet | 17:31 |
perrito666 | does anyone know if there is a way to list all models? | 17:55 |
perrito666 | fwereade: ? | 17:55 |
fwereade | perrito666, I thought there was literally a list-models? | 17:56 |
perrito666 | fwereade: sorry I meant in state :p | 17:56 |
fwereade | perrito666, not sure offhand, how does the list-models apiserver do it? | 17:56 |
* perrito666 accidentally mixed chia and earl grey and is not happy about the result | 17:56 | |
perrito666 | fwereade: an ugly thing that gets models for a user | 17:57 |
perrito666 | I was trying to avoid constructing another one of those | 17:57 |
perrito666 | :p | 17:57 |
perrito666 | hey, there is an AllModels here | 17:59 |
perrito666 | nice | 17:59 |
fwereade | perrito666, well, the raw collection is pretty terrible | 18:01 |
fwereade | perrito666, but, resolved anyway ;p | 18:02 |
mbruzek | hmo: http://ppa.launchpad.net/juju/devel/ubuntu/pool/main/j/juju-core/juju-core_2.0-beta15-0ubuntu1~16.04.1~juju1.debian.tar.xz | 18:52 |
perrito666 | is the message "Contacting juju controller <private ip>" correct here? http://pastebin.ubuntu.com/23170667/ | 20:01 |
natefinch | perrito666: buh... that can't be right, unless somehow you can connect to the private address of AWS from where you're running the client | 20:08 |
perrito666 | I can't | 20:09 |
natefinch | perrito666: weird then. probably just posting the first address in whatever list | 20:12 |
perrito666 | yep, after a restore, juju status will also show that address | 20:12 |
=== rmcall_ is now known as rmcall | ||
mup | Bug #1622738 opened: Multi-series charms failing in 1.25.6 <juju-core:New> <https://launchpad.net/bugs/1622738> | 20:39 |
wallyworld | redir: i am free for a bit but need coffee so give me 5 if you still have a question | 22:33 |
redir | cool | 22:33 |
redir | I'm here | 22:33 |
redir | but going to make tea while you make coffee | 22:36 |
perrito666 | wallyworld: hey, this https://bugs.launchpad.net/juju/+bug/1595720 is still happening but now it's a big issue since admin users are hitting it :( | 22:55 |
mup | Bug #1595720: Problems using `juju ssh` with shared models <ssh> <usability> <juju:Triaged> <https://launchpad.net/bugs/1595720> | 22:55 |
wallyworld | perrito666: damn, i'll add it to the list of todo items for rc1, yay | 23:12 |
wallyworld | thumper: standup? | 23:16 |
thumper | review up: http://reviews.vapour.ws/r/5657/ | 23:36 |
marcoceppi | rick_h_ https://bugs.launchpad.net/juju/+bug/1622787 | 23:40 |
mup | Bug #1622787: If you name a credential with an @ Juju barfs <juju:New> <https://launchpad.net/bugs/1622787> | 23:40 |
rick_h_ | marcoceppi: lol ty for keeping the barf part | 23:41 |
thumper | o/ marcoceppi | 23:43 |