/srv/irclogs.ubuntu.com/2013/04/17/#juju-dev.txt

gary_poster	if there's a review of https://codereview.appspot.com/8811043/ we could get it in under the wire	00:14
gary_poster	which would be nice :-)	00:14
gary_poster	and it is very small	00:15
* davecheney looks		00:16
davecheney	gary_poster: this diff is dirty	00:22
davecheney	there is some bzr .THIS crap in thre	00:22
gary_poster	davecheney, I was assuming that it was a clean up	00:22
gary_poster	davecheney, might a repush clean it up?	00:23
gary_poster	Makyo, ^^^ could you try a repush?	00:23
gary_poster	davecheney, than kyou for looking	00:23
davecheney	actually, this removes a mistaken checkin	00:23
davecheney	let me publish my review, there is some more dirtyness	00:23
gary_poster	ok thanks	00:23
davecheney	hazmat: this was your fault!	00:24
davecheney	done	00:25
davecheney	thumper: https://codereview.appspot.com/8811043/	00:25
davecheney	can you give the gui guys a second LGTM	00:25
davecheney	gary_poster: or you could just commit it	00:25
davecheney	if time is a factor	00:25
gary_poster	davecheney, thanks much. I'll check what is possible,	00:25
* thumper looks		00:25
thumper	done	00:27
* thumper off to the gym shorly for some sparring		00:28
davecheney	gary_poster: Makyo you've got two, make those changes and fire at will	00:28
gary_poster	awesome thanks very much davecheney and thumper	00:28
=== thumper is now known as thumper-afk
davecheney	does anyone have a link to last week's hangout agenda ?	00:38
davecheney	m_3: ping	00:38
mgz	okay, have the other bug down as well	00:46
m_3	davecheney: hey	00:49
m_3	mgz: we're stuc	00:49
m_3	k	00:49
davecheney	m_3: how's it going ?	00:49
m_3	mgz: I'm leaving it be atm	00:49
m_3	davecheney: seems hung up on something	00:50
davecheney	hang on, changing hosts	00:50
davecheney	m_3: tell me where it hurts	00:50
m_3	davecheney: we brought up 200 earlier to checck on adding incremental nodes via nova cli	00:51
mgz	m_3: I saw it's not increasing, but didn't see anything in the log	00:51
m_3	davecheney: mgz played a bit	00:51
m_3	davecheney: then added another 200 (juju add-unit hadoop-slave -n200)	00:51
mgz	I have two of the three issues I've seen tracked down	00:51
m_3	davecheney: it added 15 more and hung	00:51
* m_3 gotta relocate... getting kicked out of the expo area		00:51
davecheney	mgz: i've seen an issue where the PA will get into a loop if there are problems creating security groups	00:52
m_3	back on after while... I'll leave it up for y'all	00:52
davecheney	m_3: mgz: i suggest destroying this instance	00:52
davecheney	bzr pull to rev 1164	00:52
davecheney	then trying again	00:52
davecheney	2013/04/16 23:41:06 ERROR provisioning worker: failed to GET object provider-state from container goscale2-1	00:53
davecheney	caused by: https://region-a.geo-1.objects.hpcloudsvc.com/v1/17031369947864/goscale2-1/provider-state%!(EXTRA string=failed executing the request)	00:53
davecheney	i think i have a fix for these formatting errors	00:53
davecheney	mgz: i'd like to destroy this envbironment and start again with a fresh build	00:55
mgz	okay, we could probably do with fixing a couple of the dumb issues too	00:56
davecheney	rev 1164 cuts down on 66% of the log span	00:56
davecheney	spam	00:56
mgz	that particular one is probably api flood issue again	00:56
davecheney	i'll grap the all machines log	00:56
davecheney	the blow away the environment	00:57
davecheney	ubuntu@juju-hpgoctrl2-machine-0:~$ scp 15.185.160.145:/var/log/juju/all-machines.log .	00:59
davecheney	all-machines.log 20% 141MB 5.6MB/s 01:38 ETA	00:59
davecheney	that is a great network hp	00:59
mgz	filed bug 1169778	01:03
mgz	davecheney: compress first :)	01:03
davecheney	mgz: sure	01:04
davecheney	but 5.4Mb/s is shit	01:04
davecheney	my pandaboard can do better than that	01:04
davecheney	mgz: it's nice to know you can resume juju destroy-environment	01:04
davecheney	mgz: m_3 we have 7 stray nova instances after destroy-environemnt	01:08
davecheney	hmm	01:08
davecheney	actually only 2	01:08
davecheney	never mind	01:08
mgz	one was mine, that I killed, and one is the manage one?	01:12
davecheney	fixed it	01:12
davecheney	wc -l was the wrong thing to tuse	01:12
davecheney	mgz: m_3 deploying 200 hadoop-slave units	01:15
davecheney	100% of the time is being consumed waiting on hp to respond to new instance requests	01:17
m_3	davecheney: hey	01:57
m_3	davecheney: how's it going?	01:57
m_3	davecheney: btw, you never want to use 'hadoop-slave' the charm... (long story)	01:58
davecheney	m_3: 2013/04/17 01:52:25 NOTICE worker/provisioner: started machine 200 as instance 1509985	01:58
m_3	davecheney: make sure you're doing `juju deploy hadoop hadoop-slave -n200`	01:58
davecheney	m_3: sorry, i was looking in the history, that is all I could find	01:58
m_3	ack	01:58
davecheney	nm, we can destroy the instance	01:58
m_3	always ran from bin/hadoop-stack	01:58
davecheney	ok, let me nuke this environment	01:59
davecheney	(it worked fine btw	01:59
davecheney	rev 1164 is much less chatty	01:59
m_3	davecheney: did it get above 200?	01:59
m_3	davecheney: we have perms to go to 2000	01:59
davecheney	just did -n200	01:59
m_3	were just having probs right near the 200 mark	01:59
davecheney	i'll bootstrap a fresh env	01:59
davecheney	do bin/hadoop-stack	01:59
m_3	I did a `-n1997` before	01:59
davecheney	yeep!	02:00
davecheney	thati'll take hours	02:00
m_3	did you get mgz's changes in?	02:00
davecheney	no, but the logging change went in	02:00
m_3	yup... about 8hrs or so	02:00
davecheney	so we can actually see what is going on	02:00
m_3	ah, cool	02:00
m_3	wanna do 300?	02:00
davecheney	ok	02:00
m_3	or shall we go big	02:00
davecheney	500 should see us throught to drinks	02:00
davecheney	how is ODS ?	02:00
davecheney	is the marketing completely off tap	02:00
davecheney	?	02:01
m_3	dude	02:01
davecheney	hookers and blow ?	02:01
m_3	it's pretty over the top	02:01
m_3	haha	02:01
m_3	yeah, all hookers and blow	02:01
m_3	supposed to meet kapil in the lobby for dinner in a bit	02:01
davecheney	actaully, card tables, laptops and tshirts	02:01
m_3	yeah :(	02:02
m_3	better than couch, popcorn, and underwear at least	02:02
davecheney	m_3: i don't need to go to a conference for that	02:02
davecheney	m_3: we're good to go	02:04
davecheney	hit it with the old -n300	02:04
davecheney	hmm, that isn't good	02:05
davecheney	let me drive for a second	02:05
davecheney	ok, still piss farting around	02:05
davecheney	the last bootstrap was much faster	02:05
m_3	ack	02:06
davecheney	still scroblling in apt	02:06
davecheney	├─cloud-init-cfg─┬─add-apt-reposit	02:06
davecheney	│ └─sh───tee	02:06
davecheney	wow	02:06
davecheney	that is slow	02:06
davecheney	this instance may fail to bootstrap	02:06
m_3	might just killl it	02:07
davecheney	do it	02:07
davecheney	oh wait	02:07
davecheney	yeah, that is really slow	02:07
davecheney	nuke it	02:07
m_3	one that one's up just run bin/hadoop-stack	02:08
m_3	I'll chcek in after dinner if that's cool with you?	02:08
davecheney	m_3: ok, is that configured for -n300 ?	02:08
m_3	maybe you can catch what's going wrong near the 200 mark	02:08
davecheney	totolly cool	02:08
m_3	yup	02:08
davecheney	you go have dinner with K	02:08
m_3	awesome thanks man... ttyl	02:08
m_3	oh, and let's archive the logs in ~/arch/<date>	02:09
davecheney	sure, i've been scp'ing them onto my machine	02:09
davecheney	will fix	02:09
m_3	at least to keep some sort of record of what we're doing	02:09
m_3	thanks	02:09
davecheney	actually, i'll put it in google drive	02:09
m_3	I'm still rsyncing the whole control node's /home/ubuntu offsite daily	02:09
m_3	perfect	02:09
davecheney	probably isn't a bad idea	02:10
davecheney	make sure you use rsync -z	02:10
m_3	(-azvP)	02:10
m_3	cool... /me food	02:10
m_3	thanks man	02:10
davecheney	m_3: also, status now shows relations	02:12
davecheney	2013/04/17 02:13:05 DEBUG environs/openstack: openstack user data; 2708 bytes	02:13
davecheney	2013/04/17 02:13:05 ERROR settings for entity with unrecognized key "r#0#requirer#hadoop-slave/6"	02:13
thumper	\o/	02:13
davecheney	2013/04/17 02:13:05 ERROR settings for entity with unrecognized key "r#1#requirer#hadoop-slave/6"	02:13
davecheney	2013/04/17 02:13:05 ERROR settings for entity with unrecognized key "r#0#requirer#hadoop-slave/5"	02:13
davecheney	2013/04/17 02:13:05 ERROR settings for entity with unrecognized key "r#1#requirer#hadoop-slave/5"	02:13
davecheney	2013/04/17 02:13:05 ERROR settings for entity with unrecognized key "r#0#requirer#hadoop-slave/6"	02:13
davecheney	2013/04/17 02:13:05 ERROR settings for entity with unrecognized key "r#1#requirer#hadoop-slave/6"	02:13
davecheney	2013/04/17 02:13:05 ERROR settings for entity with unrecognized key "r#0#requirer#hadoop-slave/5"	02:14
davecheney	2013/04/17 02:13:05 ERROR settings for entity with unrecognized key "r#1#requirer#hadoop-slave/5"	02:14
davecheney	if this is an error, why does the provisioner not stop	02:14
thumper	davecheney: that doesn't look good	02:14
davecheney	?	02:14
davecheney	2013/04/17 02:12:29 INFO environs/openstack: starting machine 14 in "goscale2" running tools version "1.9.14-precise-amd64" from "https://region-a.geo-1.objects.hpcloudsvc.com/v1/17031369947864/goscale2-1/tools/juju-1.9.14-precise-amd64.tgz"	02:15
davecheney	2013/04/17 02:12:29 DEBUG environs/openstack: openstack user data; 2710 bytes	02:15
davecheney	2013/04/17 02:12:40 ERROR settings for entity with unrecognized key "r#0#requirer#hadoop-slave/1"	02:15
davecheney	2013/04/17 02:12:40 ERROR settings for entity with unrecognized key "r#1#requirer#hadoop-slave/1"	02:15
davecheney	2013/04/17 02:12:40 ERROR settings for entity with unrecognized key "r#0#provider#hadoop-master/0"	02:15
davecheney	2013/04/17 02:12:40 ERROR settings for entity with unrecognized key "r#1#provider#hadoop-master/0"	02:15
davecheney	2013/04/17 02:12:40 ERROR settings for entity with unrecognized key "r#0#requirer#hadoop-slave/1"	02:15
davecheney	2013/04/17 02:12:40 ERROR settings for entity with unrecognized key "r#1#requirer#hadoop-slave/1"	02:15
davecheney	2013/04/17 02:12:40 ERROR settings for entity with unrecognized key "r#0#provider#hadoop-master/0"	02:15
davecheney	2013/04/17 02	02:15
davecheney	starts to screw up here	02:15
davecheney	caused by: https://az-2.region-a.geo-1.compute.hpcloudsvc.com/v1.1/17031369947864/servers%!(EXTRA string=failed executing the request)	02:17
davecheney	i really want to fix this stupid formatting errror	02:17
davecheney	shit shit shit	02:47
davecheney	the 1.9.14 release cannot seem to download any tools	02:47
davecheney	however when I build the release from source	02:47
davecheney	it can !	02:47
davecheney	2013/04/17 02:50:21 DEBUG checking tools 1.9.14-precise-amd64	02:51
davecheney	2013/04/17 02:50:21 INFO environs/openstack: starting machine 298 in "goscale2" running tools version "1.9.14-precise-amd64" from "https://region-a.geo-1.objects.hpcloudsvc.com/v1/17031369947864/goscale2-1/tools/juju-1.9.14-precise-amd64.tgz"	02:51
davecheney	2013/04/17 02:50:21 ERROR worker/provisioner: cannot start instance for machine "298": cannot find image satisfying constraints: failed to get list of flavours	02:51
davecheney	caused by: https://az-2.region-a.geo-1.compute.hpcloudsvc.com/v1.1/17031369947864/flavors%!(EXTRA string=failed executing the request)	02:51
davecheney	caused by: Get https://az-2.region-a.geo-1.compute.hpcloudsvc.com/v1.1/17031369947864/flavors: lookup az-2.region-a.geo-1.compute.hpcloudsvc.com: Temporary failure in name resolution	02:51
davecheney	i'm having a lot of trouble validating the 1.9.14 release	03:12
davecheney	is anyone in a position to install 1.9.14 from ppa:juju/devel	03:13
davecheney	and try	03:13
davecheney	(don't forget to make sure you're calling /usr/bin/juju, not ~/bin/juju)	03:13
fwereade	hazmat, if you're there, I don't think we necessarily want to destroy machines	03:33
fwereade	hazmat, I suspect the deploy-ubuntu-to-prime, force-machine-to-deploy (anti-)pattern will be popularly used	03:34
fwereade	davecheney, thumper: if you're there, can I get a trivial on https://codereview.appspot.com/8815043 please?	03:54
davecheney	fwereade: i'll take a look	03:57
davecheney	fwereade: i'm very concerned that 1.9.14 is stillborn	03:57
davecheney	i'm thinking of proposing we move to daily point releases	03:58
davecheney	so m_3 and I stop using the development chain	03:58
fwereade	davecheney, I don't think it'll build, given this stuff	03:58
fwereade	davecheney, it's an old test file that somehow crept in	03:58
davecheney	wut ?	03:59
fwereade	davecheney, at a casual glance it looks like it slipped in from the force-machine branch	04:00
davecheney	this is the second turd to fall out of that branch	04:00
fwereade	davecheney, although it's not quite clear how	04:00
davecheney	without being rude, 4 people looked at that branch	04:00
fwereade	davecheney, ouch :(	04:00
davecheney	and nobody spotted the .THIS file there	04:00
davecheney	fwereade: LGTM, please submit	04:02
fwereade	davecheney, happening right now	04:03
fwereade	davecheney, looks like that file slipped in somewhere between 1 and 5 days ago after the main round of reviewing	04:03
davecheney	so that means the submitter did not run the tests before submitting	04:04
davecheney	do we need to make another adding to .lbox.check ?	04:04
davecheney	/s/adding/addition	04:04
fwereade	I never found out what happened to the bot proposal but that also seems to be stillborn	04:05
fwereade	in its absence, maybe we do	04:05
thumper	hi fwereade	04:05
thumper	what are you doing up?	04:06
fwereade	thumper, fell asleep early last night, back up now to merge a whole pile of things and do some reviews	04:06
davecheney	fwereade: we can't have a bot til the tests pass reliably	04:06
davecheney	that was my understanding	04:06
thumper	fwereade: oh, I have four reviews up?	04:06
* thumper goes to drink the coffee		04:06
fwereade	davecheney, if we can't have a bot without reliable tests but we can't commit reliably without a bot something needs to change	04:08
davecheney	fwereade: i think we can at least do a test compile of the tests	04:09
davecheney	that would ahve caught this error	04:09
davecheney	but it does seem a bit futile	04:09
davecheney	we're like the TSA	04:09
davecheney	running after the horse	04:09
fwereade	davecheney, oh! how do we compile tests without running them?	04:09
fwereade	davecheney, actually, don't tell me, I might start doing it ;p	04:10
davecheney	fwereade: i'm sure there is a way	04:11
davecheney	(cd /tmp ; go test -c $PKG) will do it	04:11
fwereade	davecheney, emphatic +1 then	04:12
davecheney	fwereade: http://paste.ubuntu.com/5714987/	04:17
davecheney	can anyone else test the 1.9.14 release ?	04:17
davecheney	the top invocation is from the pacakge	04:18
davecheney	the bottom is the same source	04:18
davecheney	2013/04/17 02:50:17 INFO environs/openstack: starting machine 207 in "goscale2" running tools version "1.9.14-precise-amd64" from "https://region-a.geo-1.objects.hpcloudsvc.com/v1/17031369947864/goscale2-1/tools/juju-1.9.14-precise-amd64.tgz"	04:23
davecheney	2013/04/17 02:50:17 ERROR worker/provisioner: cannot start instance for machine "207": cannot find image satisfying constraints: failed to get list of flavours	04:23
davecheney	caused by: https://az-2.region-a.geo-1.compute.hpcloudsvc.com/v1.1/17031369947864/flavors%!(EXTRA string=failed executing the request)	04:23
davecheney	caused by: Get https://az-2.region-a.geo-1.compute.hpcloudsvc.com/v1.1/17031369947864/flavors: lookup az-2.region-a.geo-1.compute.hpcloudsvc.com: Temporary failure in name resolutio	04:23
davecheney	after 200 machines, we loose the ability to resolve the name of the hp endpoint	04:23
fwereade	davecheney, hmm, I seem to be bootstrapping ok in us0east-1 with 1.9.14 from the package	04:24
thumper	fwereade: so is it like 6:30am for you now?	04:24
fwereade	thumper, yeah	04:25
thumper	fwereade: cool, not even mind-numbingly early	04:25
thumper	fwereade: don't suppose you have time for a quick hangout?	04:25
fwereade	as predicted I fell asleep early and woke up early	04:25
fwereade	thumper, sure	04:25
davecheney	fwereade: i cannot bootstrap into ap-southeast-1 or 2 with that release	04:25
thumper	fwereade: I'll start	04:25
davecheney	but it works perfectly using the source	04:25
* davecheney has had a gutfull of unreliable cloud providers		04:31
davecheney	ubuntu@juju-hpgoctrl2-machine-0:~$ juju destroy-environment	04:31
davecheney	error: failed to delete server with serverId: 1510323	04:31
davecheney	caused by: https://az-2.region-a.geo-1.compute.hpcloudsvc.com/v1.1/17031369947864/servers/1510323%!(EXTRA string=failed executing the request)	04:31
davecheney	caused by: remote error: handshake failure	04:31
fwereade	davecheney, ap-southeast-2 is... well, appearing to bootstrap anyway	04:50
fwereade	davecheney, but acting a bit weird, I will let you know how I get on	04:50
fwereade	davecheney, ok, the good news is that the bootstrap node comes up and works fine, the bad news is that the CLI will not acknowledge its existence	04:56
davecheney	fwereade: -v ?	04:57
davecheney	did it bootstrap the wrong tools	04:57
davecheney	?	04:57
fwereade	2013/04/17 06:55:37 INFO environs/ec2: waiting for DNS name(s) of state server instances [i-79c80744]	04:57
fwereade	2013/04/17 06:55:42 ERROR command failed: no instances found	04:57
fwereade	error: no instances found	04:57
davecheney	eep	04:58
fwereade	davecheney, the instance definitely exists and it's definitely got the right tools	04:58
davecheney	fwereade: i;m getting these errors constantly during the load tests	04:58
davecheney	2013/04/17 04:57:50 ERROR settings for entity with unrecognized key "r#0#requirer#hadoop-slave/2"	04:58
davecheney	2013/04/17 04:57:50 ERROR settings for entity with unrecognized key "r#1#requirer#hadoop-slave/2"	04:58
davecheney	2013/04/17 04:57:50 ERROR settings for entity with unrecognized key "r#0#requirer#hadoop-slave/0"	04:58
davecheney	2013/04/17 04:57:50 ERROR settings for entity with unrecognized key "r#1#requirer#hadoop-slave/0"	04:58
davecheney	2013/04/17 04:57:50 ERROR settings for entity with unrecognized key "r#0#requirer#hadoop-slave/2"	04:58
fwereade	davecheney, so that is good anyway	04:58
davecheney	2013/04/17 04:57:50 ERROR settings for entity with unrecognized key "r#1#requirer#hadoop-slave/2"	04:58
davecheney	2013/04/17 04:57:50 ERROR settings for entity with unrecognized key "r#0#requirer#hadoop-slave/0"	04:58
davecheney	2013/04/17 04:57:50 ERROR settings for entity with unrecognized key "r#1#requirer#hadoop-slave/0"	04:58
fwereade	davecheney, ah ffs there is no way those should be logged	04:59
fwereade	davecheney, they're not errors, it's just the allwatcher getting its knickers in a twist because it's watching things it doesn't know about	05:00
fwereade	davecheney, btw, we seem to be waiting only a very few seconds for the DNS name (apse2 issues), which is surprising to me	05:01
davecheney	fwereade: oh, so that is the api server	05:10
fwereade	davecheney, yeah	05:10
davecheney	i was wondering why provisioning didn't land in a screaming heap	05:10
davecheney	fwereade: i'll log a bug	05:10
fwereade	davecheney, they're not even really errors -- it's just logging and stopping processing for those docs	05:11
fwereade	davecheney, cheers	05:11
davecheney	i'll log two bugs	05:11
davecheney	can I get a review on https://codereview.appspot.com/8818043/	05:11
davecheney	https://bugs.launchpad.net/juju-core/+bug/1169825	05:14
fwereade	davecheney, what happened to errors.Newf (if anything)?	05:21
davecheney	nothign, just people were not calling it correctly	05:22
davecheney	URL was not a format specifier	05:22
davecheney	it was arg 1	05:22
* fwereade sees now		05:23
fwereade	davecheney, LGTM	05:23
davecheney	http://paste.ubuntu.com/5715085/	05:24
davecheney	https://bugs.launchpad.net/juju-core/+bug/1169826	05:32
TheMue	morning	06:05
TheMue	fwereade: seen my part 1? during sleep I found a by far simpler approach. will continue after breakfast.	06:07
jam1	afternoon wallyworld, how's it going today?	06:21
fwereade	TheMue, great news, I'm afraid I fell asleep last night ;p	06:21
wallyworld	jam1: going well thanks, just sorting out a test isolation issue and i will send a mp for my work	06:21
=== jam1 is now known as jam
TheMue	fwereade: no wonder after your night before	06:22
jam	fwereade: second LGTM on https://codereview.appspot.com/8818043/	06:28
jam	sorry, I meant davecheney ^^	06:29
jam	bigjools, jtv: Is the maas provider ready to be exposed in jujud/juju client?	06:31
* jtv is not here		06:32
bigjools	jam: what is the effect of that? Sorry, I have no context on it.	06:33
jam	bigjools: if you have it enabled, people can configure it in ~/.juju/environments.yaml and try to bootstrap against a maas service.	06:34
bigjools	as far as I know it all works, so if there's something missing I don't know why it's needed	06:34
jam	If it is ready for people to do so, then we should expose it.	06:34
jam	(which it has been)	06:34
jam	But we also want to configure jujud to require it	06:34
bigjools	jam: we can already bootstrap with it, I don;t understand	06:34
jam	so we don't accidentally lose it later.	06:34
jam	bigjools: I was just confirming that the code has gotten to the "working" state. I hadn't heard a specific update to where you guys were.	06:35
jam	great.	06:35
davecheney	jam: bzr: ERROR: Cannot lock LockDir(chroot-72851216:///%2Bbranch/goose/.bzr/branch/lock): Transport operation not possible: readonly transport	06:35
davecheney	I cannot submit that mp	06:35
jam	davecheney: so I would say LGTM trivial on exposing it.	06:35
bigjools	jam: ok :) yes it should all work	06:35
davecheney	any ideas ?	06:35
jam	davecheney: you can't use 'lbox submit' for goose. You mark the MP as approved (make sure it has a commit message), and then goose bot will run the tests and land it for you.	06:35
jam	I can do it if you need help.	06:35
davecheney	ta	06:36
m_3	davecheney: how's it going?	06:36
jam	davecheney: so on here: https://code.launchpad.net/~dave-cheney/goose/001-fmt-error/+merge/159287	06:36
davecheney	m_3: hey	06:36
jam	there is a "Set commit message" which you can just copy the description if you are happy with it.	06:36
jam	and then mark it Approved.	06:36
davecheney	sent you an email with the sitrep	06:37
davecheney	basicallyu can't get above 228 machines	06:37
davecheney	i susecpt we're running out of ipv4 allocations	06:37
m_3	ouch	06:37
davecheney	and it is being reported in a strange way	06:37
m_3	actually, that's awesome... this is the shit we're doing this to flush out	06:37
davecheney	the way the dns address of the keystone endpoint stops resolving is both	06:38
davecheney	a. repeatable	06:38
davecheney	b. fucked up	06:38
* m_3 reading mail		06:38
m_3	davecheney: we can see our ip quotas... one sec	06:39
davecheney	228 is close enough to 254 that it makes me go hmmm	06:40
davecheney	m_3: ahh	06:43
davecheney	interesting	06:43
davecheney	228 isn't near any of those numbers	06:43
davecheney	jam: now what happens ?	06:44
m_3	davecheney: hmmmm.... thought that showed ip allocations too	06:45
* m_3 rtfm-ing		06:45
davecheney	jam: is the bot broken ?	06:49
m_3	davecheney: I'll dig through the acct setup with antonio tomorrow	06:50
m_3	davecheney: jorge and I are talking tomorrow, so won't get to this til later in my daytime	06:51
jam	davecheney: usually it takes about 5 min for it to wake up and run the test suite	06:51
jam	but if it fails it should report a failure to Launchpad and unmark the proposal as pneding	06:51
* davecheney thinks its dead		06:52
m_3	davecheney: thanks for the help... /me beddy-bye	06:52
davecheney	m_3: no worries mate	06:52
jam	davecheney: from what I can tell 'canonistack' thinks the machine is running, but I'm getting 0 response from port 22	06:53
davecheney	jam: i remember a discussion a while ago where you metnoined that you weren't able to contact th emachine	06:56
davecheney	but it appeared to be doing stuff, so whateva	06:56
wallyworld	jam: i'm off to soccer, but here's a wip mp. dimiter was interested to see it. i can't land till his work goes in since i need to account for whatever changes he has made. i also need to land a small goose change	07:14
wallyworld	https://code.launchpad.net/~wallyworld/juju-core/openstack-image-lookup/+merge/159301	07:14
wallyworld	the covering letter also need to be expanded, but i am late	07:15
* TheMue is happy, the simpler approach works fine. now part two adding the subordinate unite information.		07:30
rogpeppe2	mornin' all	07:54
TheMue	rogpeppe2: morning	07:56
=== TheRealMue is now known as TheMue
rogpeppe2	in case my earlier email didn't get through, my phone line is down currently	09:25
rogpeppe2	i'm connected through my mobile currently, but i don't know how reliable it will be	09:25
rogpeppe2	trivial branch that fixes #1169825 ; review appreciated please: https://codereview.appspot.com/8821043	09:33
TheMue	YEEEEEAH!	09:58
* TheMue does the subordinate dance ...		09:59
rogpeppe2	fwereade: i'm after a review of this fairly trivial branch, please. i think it's important to get in before the release. https://codereview.appspot.com/8823043	10:09
fwereade	rogpeppe2, ok, I'll chat about s3 in a mo	10:09
rogpeppe2	fwereade: ok	10:10
fwereade	rogpeppe2, nice, LGTM	10:11
rogpeppe2	fwereade: trivial?	10:12
TheMue	fwereade: you've got a proposal :)	10:12
fwereade	rogpeppe2, yeah, I think so	10:12
rogpeppe2	fwereade: cool, thanks	10:12
fwereade	TheMue, cheers	10:12
* rogpeppe2 wonders how long it'll take to run lbox submit through the mobile data connection		10:12
TheMue	fwereade: it's https://codereview.appspot.com/8824043/	10:13
=== rogpeppe2 is now known as rogpeppe
fwereade	TheMue, cheers	10:13
fwereade	rogpeppe, I am somewhat ambivalent about changing Storage.Put	10:13
rogpeppe	fwereade: i think that's going to be a necessary thing if we ever want retries to work.	10:14
fwereade	rogpeppe, I would be much happier if it took a Reader and tried to ReadSeeker it	10:15
rogpeppe	fwereade: i wondered about that, but i was very glad i didn't	10:15
fwereade	rogpeppe, oh yes?	10:15
rogpeppe	fwereade: because the static type caught lots of places that i would not otherwise have found	10:16
rogpeppe	fwereade: it's not that we can stream data without having it first either	10:17
rogpeppe	fwereade: because we need to know the length	10:17
fwereade	rogpeppe, it is not unheard of for content lengths to be declared	10:17
rogpeppe	fwereade: that is true, i suppose	10:17
fwereade	rogpeppe, if they are, then great,but even if not I don't really think we should force all clients to buffer everything	10:18
rogpeppe	fwereade: but in all the cases we care about, we do want retries to work	10:18
rogpeppe	fwereade: and they won't work on a straight io.Reader	10:18
fwereade	rogpeppe, maybe it's our own responsibility to buffer though	10:19
rogpeppe	fwereade: i think so	10:19
fwereade	rogpeppe, ha, different "our" I suspect	10:19
rogpeppe	fwereade: ah, maybe so :-)	10:19
rogpeppe	fwereade: i really don't think that Put should do the buffering itself	10:19
rogpeppe	fwereade: if that's what you were thinking of	10:20
fwereade	rogpeppe, yeah, I definitely see that side of it too	10:20
fwereade	rogpeppe, but I'm not sure it's a very strong objection -- convince me?	10:20
rogpeppe	fwereade: in many of the places, we already have the data in a file	10:21
fwereade	rogpeppe, in which case, yay, easy ReadSeeker	10:21
rogpeppe	fwereade: buffering in Put would force us to make another copy of that before doing the Put	10:21
fwereade	rogpeppe, I'm only suggesting we should buffer ourselves if we can't ReadSeeker it	10:21
rogpeppe	fwereade: oh i see, you mean "if it doesn't implement ReadSeeker, then pull everything into a buffer" ?	10:21
fwereade	rogpeppe, more ideally a temp file	10:22
rogpeppe	fwereade: i don't really see the point.	10:22
rogpeppe	fwereade: in all the places we have, it's trivial to make a ReadSeeker	10:22
rogpeppe	fwereade: and i much prefer to avoid dynamic type checking magic where reasonable	10:22
fwereade	rogpeppe, so that we can inject the reliability as close to the problematic bit as possible, and so we don't push a new restriction into the API that I'm not sure is justified	10:23
rogpeppe	fwereade: you're saying we shouldn't force all clients to buffer everything, but we're going to buffer everything anyway, right?	10:24
rogpeppe	fwereade: for the rare (currently non-existent) case where the client doesn't have a seeker, it's easy for them to make one.	10:24
rogpeppe	fwereade: and know about the trade-offs involved	10:25
rogpeppe	fwereade: as i said, having the static type caught lots of cases where i could trivially use a readseeker but didn't. if i hadn't made that change, we'd be creating a temp file for each of those.	10:26
fwereade	rogpeppe, or maybe we'd be making use of the length param to decide whether or not to create one ;)	10:26
rogpeppe	fwereade: that seems to me like lots of heuristics and extra code where currently we don't need any	10:27
rogpeppe	fwereade: if we find lots of places that want to use an io.Reader, we can easily make a func PutBuffered(s Storage, name string, r io.Reader, length int64) error	10:28
fwereade	rogpeppe, isn't that because you're just pushing the responsibility onto the client?	10:28
rogpeppe	fwereade: so it's obvious to the caller that we're going to be creating a temporary file	10:28
rogpeppe	fwereade: sure. but creating a temporary file is something i think it's worth the client be aware of	10:28
rogpeppe	fwereade: and it means we don't have to write more code now	10:29
fwereade	rogpeppe, possible compromise: at the point where you'd seek, just error out if it's not a ReadSeeker?	10:30
rogpeppe	fwereade: that's worse, i think	10:31
fwereade	rogpeppe, automatic reliability in the cases we already have, no external code changes, no magic buffering	10:31
fwereade	rogpeppe, I guess there's something I'm missing	10:31
fwereade	rogpeppe, (ok, maybe we should check first to know where to seek back to)	10:31
rogpeppe	fwereade: except that i want automatic reliability in all cases, and it's not hard	10:32
rogpeppe	fwereade: i don't really see why there's a problem with requiring ReadSeeker. it loses no generality	10:32
rogpeppe	fwereade: yes, that is also an issue with just dynamically type converting to ReadSeeker	10:33
rogpeppe	fwereade: we'd have to say "if this is a ReadSeeker, it must be positioned at the start"	10:33
rogpeppe	fwereade: that's ok i guess, but i'd much prefer to just require a ReadSeeker	10:34
fwereade	rogpeppe, the issue is that it's a restrictive interface change to all the providers to implement a change that's ec2-specific	10:35
fwereade	rogpeppe, I feel like it's ec2 driving the interface, not the other way round	10:36
rogpeppe	fwereade: it's not necessarily ec2-specific. it applies to any provider where the request might encounter a transient failure	10:36
fwereade	rogpeppe, it is in practice ec2-specific... and any other provider that implements Put-reliability will need to make similar internal changes to those in ec2 regardless	10:38
rogpeppe	fwereade: i agree. but if you've just got an io.Reader, you cannot retry a request - so regardless of the provider implementation, they'll want some way of rewinding the stream.	10:40
rogpeppe	fwereade: so we're going to force all providers that want to retry to implement their own temp-file buffering stuff	10:40
rogpeppe	fwereade: i'd prefer to keep that cleverness outside of the individual providers	10:40
fwereade	rogpeppe, the tradeoff is between asking a limited number of providers to do so, and asking every client to do so	10:41
fwereade	rogpeppe, our perspective on this may skewed because we are focused on writing one client, and we have 3+ providers to think about	10:42
rogpeppe	fwereade: if it's just a difference between s.Put(r, name, len) and environs.PutBuffered(r, name, len) then i don't see the issue	10:42
rogpeppe	PutBuffered(s, name, r, len) of course	10:42
rogpeppe	fwereade: then the temp-file buffering is implemented in a single place, once and for all	10:43
rogpeppe	fwereade: (and currently we don't even need it, because there's not a single place that's really streaming)	10:43
fwereade	rogpeppe, the question I am trying to get clarity on is: purely at the level of the interface, why is it better to have a ReadSeeker than a Reader?	10:43
rogpeppe	fwereade: because a Reader allows streaming only, and here we can't stream	10:44
fwereade	rogpeppe, we can stream, it's just not necessarily so reliable	10:44
rogpeppe	fwereade: and that's a good thing?	10:45
fwereade	rogpeppe, considering the capabilities of the various providers, I think it is all we can reasonably guarantee	10:46
rogpeppe	fwereade: we can guarantee to make available to a provider some data that they can retry on. i think that's a nice thing to do. it makes the provider's job easier.	10:47
rogpeppe	fwereade: your solution is asking me to put more intelligence and code in the provider, where actually we want to move as much as possible outside the provider	10:48
fwereade	rogpeppe, we can, yes -- but I thought we weren't writing the interfaces just for our own convenience?	10:48
fwereade	rogpeppe, I don't know how you are going to put S3-specific buffering outside the provider	10:48
rogpeppe	fwereade: as far as i'm concerned, it's the job of that interface to make it as easy as possible to write a provider	10:49
fwereade	rogpeppe, the provider should absolutely contain provider-specific intelligence and code	10:49
rogpeppe	fwereade: for clients, we can easily write convenience functions that wrap it	10:49
rogpeppe	fwereade: agreed. but buffering the reader is not provider-specific. i'm not sure what you mean by "S3-specific buffering".	10:49
rogpeppe	fwereade: another possibility is to pass a function (func() io.Reader) that can be called multiple times to get the source.	10:53
rogpeppe	fwereade: that would make it easier for a client to do streaming.	10:54
fwereade	rogpeppe, does every provider have to implement this retry stuff?	10:54
fwereade	rogpeppe, it seems it is a provider-specific enhancement	10:54
rogpeppe	fwereade: it's a good question. i suspect it's not unlikely.	10:54
rogpeppe	fwereade: the point is that any provider that does want to implement retry must implement the same buffering code.	10:55
rogpeppe	fwereade: and that client code must work against any provider	10:55
rogpeppe	fwereade: i suppose your argument could be: for providers that don't implement retry, we'll be doing more work than necessary.	10:56
rogpeppe	fwereade: assuming we actually have some client code that wants to stream.	10:57
fwereade	rogpeppe, well, we already did, even if it was just a smattering of test code	10:57
rogpeppe	fwereade: really?	10:58
rogpeppe	fwereade: i thought the changes were just changing bytes.Buffer to bytes.Reader in the main	10:58
rogpeppe	fwereade: no extra buffering required	10:58
fwereade	rogpeppe, my argument is really that most providers don't retry, and that even if we get to a point where most do I would rather have a new provider able to implement the interface to the full more easily	10:58
rogpeppe	fwereade: huh? this doesn't make it harder in the slightest for a new provider	10:59
fwereade	rogpeppe, rather than make Put-retries a fundamental requirement for a provider, I would like them to be an optional enhancement	10:59
rogpeppe	fwereade: they are	10:59
rogpeppe	fwereade: a provider can totally ignore the fact that it's a ReadSeeker	10:59
fwereade	rogpeppe, if an Environ is always required to retry, ReadSeeker makes sense	10:59
fwereade	rogpeppe, Storage sorry	10:59
rogpeppe	fwereade: i don't see any requirement - it makes it possible for a provider to retry, but by no means mandatory	11:00
fwereade	rogpeppe, a Reader in the interface implies potentially-unreliable streaming, a ReadSeeker tenacious persistence	11:01
fwereade	rogpeppe, if the lowest common denominator is unreliable streaming, that is what we should advertise	11:01
rogpeppe	fwereade: i don't see any difference in reliability between a Reader and a ReadSeeker	11:01
rogpeppe	fwereade: they both implement io.Reader	11:01
rogpeppe	fwereade: a seeker has no more reliability guarantees	11:02
fwereade	rogpeppe, if any, or all, providers want to get smarter and buffer, then they can do so and everyone who uses them gets to see the benefits transparently	11:02
fwereade	rogpeppe, I'm talking about the reliability of the operation, not its data source	11:02
rogpeppe	fwereade: but we can implement this behaviour once and for all providers	11:02
rogpeppe	fwereade: another possibility would be to hoist the retry logic outside the providers, but i think it's too inherently provider-specific	11:03
rogpeppe	fwereade: and even then, we'd need a stream we could rewind	11:04
fwereade	rogpeppe, how about EnsureSeeker(Reader) (ReadSeeker, error), and fall back to only trying once?	11:05
rogpeppe	fwereade: who calls that?	11:05
fwereade	rogpeppe, any provider that wants to add retry logic	11:06
fwereade	rogpeppe, they just have to be prepared for it to fail and fall back to non-retrying	11:06
rogpeppe	fwereade: what would EnsureSeeker do other than .(io.ReadSeeker) ?	11:06
fwereade	rogpeppe, one day, maybe, buffer to a file if that fails	11:07
fwereade	rogpeppe, I think that's a cleaner separation of concerns	11:08
rogpeppe	fwereade: i'd find it preferable for a client to explicitly disable retrying (for instance by passing in a ReadSeeker where the Seek always fails)	11:09
rogpeppe	fwereade: then a client has to make a positive action to disable reliability, rather than the other way around	11:09
fwereade	rogpeppe, except in the cases where it isn't reliable	11:09
rogpeppe	fwereade: func ErrorReadSeeker(r io.Reader) io.ReadSeeker	11:09
rogpeppe	fwereade: sure, some providers just aren't reliable. but when we use a reliable provider, we are assured of reliability	11:10
rogpeppe	fwereade: rather than, some time in the future "oh, that's failing because i forgot to use a bytes.Reader rather than a bytes.Buffer"	11:11
rogpeppe	fwereade: and this question is currently academic - we have a seekable reader in every single case, trivially.	11:12
fwereade	TheMue, https://codereview.appspot.com/8824043/ reviewed	11:12
TheMue	fwereade: cheers	11:12
fwereade	rogpeppe, if not all providers are reliable, why would we imply they are?	11:12
rogpeppe	fwereade: we are not. we are implying that a provider may be reliable.	11:13
fwereade	rogpeppe, and placing the burden for supporting reliability on the client	11:13
rogpeppe	fwereade: ... which is trivial!	11:14
fwereade	rogpeppe, despite the fact that it may not actually be present	11:14
rogpeppe	fwereade: there is no burden	11:14
rogpeppe	fwereade: honestly - how does it actually burden the client?	11:14
fwereade	rogpeppe, you just suggested ErrorReadSeeker	11:15
fwereade	rogpeppe, which STM to be boilerplate you expect every client to use if they don't know the provenance of their Reader	11:15
fwereade	rogpeppe, which, honestly, I do not think they should have to care about	11:16
rogpeppe	fwereade: yes - i'd like each client to be fully aware when they might not get reliability against an otherwise bulletproof provider	11:16
rogpeppe	fwereade: but currently there are no clients that would use it	11:17
fwereade	rogpeppe, IIRC Put returns an error	11:17
fwereade	rogpeppe, under-promise, over-deliver ;)	11:18
rogpeppe	fwereade: this isn't promising anything, really.	11:19
rogpeppe	fwereade:	11:19
rogpeppe	1) this does not require any providers to do anything different	11:19
rogpeppe	2) this enables a provider to implement retry easily	11:19
rogpeppe	3) there is no significant impact on how easy it would be to write a client that uses the Storage.	11:19
rogpeppe	i really can't see a down side	11:19
fwereade	rogpeppe, every provider needs to Read; some providers may be able to be smarter if they can also Seek	11:21
rogpeppe	fwereade: sure.	11:21
fwereade	rogpeppe, this seems to be exactly why we can do `rs, ok := r.(io.ReadSeeker)`	11:22
rogpeppe	fwereade: and every client needs to work with every provider	11:22
rogpeppe	fwereade: so what the client provides will enable those some providers to be smarter.	11:22
rogpeppe	fwereade: if there was a significant difficulty for clients to provide a Seeker, i'd agree with you.	11:23
rogpeppe	fwereade: but really, there's not AFAICS.	11:23
fwereade	rogpeppe, whether or not a given Reader is also a Seeker is not something clients should have to think about	11:23
rogpeppe	fwereade: it really is - a client should be aware that by using a reader that's not a seeker it is making the operations more unreliable on some providers.	11:24
rogpeppe	fwereade: and we can let the compiler tell us that	11:24
rogpeppe	fwereade: i think that's awesome	11:24
* rogpeppe really doesn't like dynamic type checks where they're easily avoided.		11:25
fwereade	rogpeppe, you're proposing that introducing a magic implicit reliability switch will make people's lives easier?	11:25
fwereade	rogpeppe, the client should not get to affect that choice	11:26
rogpeppe	fwereade: it's not magic. doing a dynamic type check under the hood to see if the client might have turned on that switch is magic.	11:26
fwereade	rogpeppe, how does it affect the client, except to cause fewer errors?	11:27
rogpeppe	fwereade: fewer errors is a huge thing	11:27
fwereade	rogpeppe, yes	11:27
fwereade	rogpeppe, and a good thing	11:28
fwereade	rogpeppe, that the client doesn't have to think about	11:28
rogpeppe	fwereade: i'm glad we agree on something :-\|	11:28
rogpeppe	fwereade: indeed - the io.ReadSeeker contract implies that they get reliability for free	11:28
rogpeppe	fwereade: and is trivial for all current (and probably most future) clients to provide	11:29
rogpeppe	fwereade: that's really the crux - if that's not true, then i think you're right.	11:29
fwereade	rogpeppe, saying "you have to provide a readseeker, and hence reliability, but it might not be used" is not better than saying "just give us a Reader -- if it's also a ReadSeeker, you might see some benefits on some providers"	11:30
rogpeppe	fwereade: i think it is	11:30
rogpeppe	fwereade: because it means that the compiler tells us where things might be unreliable.	11:31
rogpeppe	fwereade: so we know that in all current cases, you will see benefits on the providers that can use it	11:31
rogpeppe	fwereade: and when making this change, i was surprised by the places that it told me	11:32
rogpeppe	fwereade: and they were all trivially changeable to make work	11:32
dimitern	jam: you around?	11:32
rogpeppe	fwereade: i think that's a concrete benefit	11:32
fwereade	rogpeppe, you'll have to explain further	11:33
rogpeppe	fwereade: so, when i changed the signature, i ran the build	11:33
rogpeppe	fwereade: and it said, in various places "that argument does not implement io.ReadSeeker"	11:33
rogpeppe	fwereade: and i went there	11:33
fwereade	rogpeppe, so, it doesn't get to be reliable on ec2, just like it doesn't everywhere	11:34
rogpeppe	fwereade: and, lo! it was using bytes.Buffer not bytes.Reader	11:34
wallyworld	dimitern: https://code.launchpad.net/~wallyworld/juju-core/openstack-image-lookup/+merge/159301	11:34
jam	dimitern: I'm back	11:34
rogpeppe	fwereade: i don't understand that	11:34
fwereade	rogpeppe, the interface should reflect the minimum set of required capabilities	11:35
fwereade	rogpeppe, if I've just got a Reader I know its potentially unrepeatable anyway	11:35
fwereade	rogpeppe, you're asking me to turn it into a ReadSeeker that you might not bother to use	11:36
fwereade	rogpeppe, how does it help?	11:36
fwereade	rogpeppe, OTOH transparently making use of reliability, or injecting it ourselves if it's really critical, frees the client to solve their own problem	11:37
rogpeppe	fwereade: we require that code to run against every provider. we know that when we run with the ec2 provider that the ReadSeeker will be used (assuming an operation fails in a transient way)	11:37
fwereade	rogpeppe, why do all the clients for all the other providers have to care?	11:37
rogpeppe	fwereade: there is no client "for another provider". a client is for all providers	11:38
fwereade	rogpeppe, you have been talking like the client knows its using ec2	11:38
rogpeppe	fwereade: we are indeed writing code that we know will be used against ec2 (and all the other providers too)	11:39
fwereade	rogpeppe, the all providers context should only make it more starkly clear that we are messing with a perfectly good interface so we can show off a special feature of a single provider	11:39
rogpeppe	fwereade: on the contrary, i think that the ec2 requirement shows the inadequacy of the interface	11:40
rogpeppe	fwereade: and i'm sure there will be other providers that will want to retry a Put too.	11:42
rogpeppe	fwereade: by making the change now, we make it easy for them to do so	11:42
fwereade	rogpeppe, AFAICT the only "inadequate" thing about the interface is that it requires that you do about the most innocuous possible type check check	11:45
fwereade	rogpeppe, in order to add provider-specific functionality	11:45
fwereade	rogpeppe, ReadSeeker makes no sense unless reliability-vs-transient-errors is in the contract of the Storage type	11:46
rogpeppe	fwereade: the inadequate thing is that it's not obvious to people using that interface that if the happen to pass a ReadSeeker in, that somehow things will magically get more reliable on some providers.	11:46
fwereade	rogpeppe, but you seemed to be saying that people would usually be using ReadSeekers anyway	11:47
rogpeppe	fwereade: as far as i'm concerned, the contract of the Storage type is "do whatever you can to try and fulfil these requests"	11:47
rogpeppe	fwereade: not really - i'm saying that people would usually be using some that could trivially be a ReadSeeker	11:48
fwereade	rogpeppe, so you're breaking the contracts of maas and openstack,because they don;t do that	11:48
rogpeppe	fwereade: not at all	11:48
rogpeppe	fwereade: perhaps they don't get transient errors	11:48
fwereade	rogpeppe, then they definitely don;t need ReadSeekers!	11:48
rogpeppe	fwereade: that's fine. all they need is the reader bit then.	11:49
rogpeppe	fwereade: there's no requirement to use the seeker part	11:49
fwereade	rogpeppe, then why would we require that it be supplied?	11:49
rogpeppe	fwereade: ec2 won't use the seeker part unless something fails with a certain kind of error	11:49
rogpeppe	fwereade: because thing client needs to supply a reader for all providers!	11:49
rogpeppe	s/thing/the/	11:50
fwereade	rogpeppe, and it does	11:50
fwereade	rogpeppe, and ec2 would make use of it if you would just make use of the language feature designed explicitly for that purpose	11:50
fwereade	rogpeppe, type casts and types switches were not just put in for fun	11:50
rogpeppe	fwereade: and if clients actually passed in a readseeker	11:51
fwereade	rogpeppe, we don;t get to control our inputs and probably nor do they	11:51
fwereade	rogpeppe, they then have to advertise ReadSeeker	11:51
rogpeppe	fwereade: when would it ever not be appropriate for a client to pass in a ReadSeeker?	11:51
fwereade	rogpeppe, all the places you changed, for a start	11:53
rogpeppe	fwereade: in that branch, there are various places where the changed code does "bytes.NewReader(buf.Bytes())". that code is totally non-obvious if the argument to Put is just a Reader. but it has significant runtime implications.	11:53
fwereade	rogpeppe, being?	11:55
rogpeppe	fwereade: that when running against ec2 the operation will be less reliable if you don't	11:56
rogpeppe	fwereade: essentially we want all our client code to work as well as possible against all providers. that means that the client code should work as well as possible against at least the ec2 provider. so all clients should provide the thing that makes ec2 work well - i.e. a ReadSeeker. given that, why not make the interface appropriate to that?	11:57
* rogpeppe thinks it might be possible to phrase most of that in mathematical notation		11:58
wallyworld	fwereade: are you free sometime soon for a meeting with me and dimitern ?	11:59
wallyworld	soon = next 30 minutes ?	11:59
dimitern	fwereade: about openstack constraints/images/flavors selection	11:59
fwereade	wallyworld, dimitern: sure, I can do now	12:02
dimitern	fwereade: ok, I'll start a g+	12:02
wallyworld	ok	12:02
dimitern	https://plus.google.com/hangouts/_/edc5333a9131548ea93258b2c5c90f6a9ef4af15?authuser=0&hl=en << fwereade, wallyworld	12:03
wallyworld	trying, not connecting	12:04
dimitern	wallyworld: william was having problems with sound	12:06
jam	wallyworld: as a side thing, when you know your holidays, try to put them on the juju team calendar. I think I did it correctly for you this time.	12:13
wallyworld	jam: i did add them	12:13
wallyworld	the week before i left	12:13
jam	wallyworld: I mean the ones you just applied for national holidays in April and June	12:14
wallyworld	np sorry	12:14
jam	april 25th june 10th I think	12:14
rogpeppe	fwereade: "Why aren't we doing this at the end of cmd.Main? We already get a log message on error."	12:48
rogpeppe	fwereade: that's the way i tried to do it first	12:48
rogpeppe	fwereade: but unforunately it breaks the jujuc log command	12:48
rogpeppe	fwereade: we really don't want jujuc log to say "command finished" every time it runs.	12:49
rogpeppe	jam: "	12:52
rogpeppe	Some	12:52
rogpeppe	platforms allow renaming an empty dir over another empty directory.	12:52
rogpeppe	"	12:52
rogpeppe	really?	12:52
rogpeppe	i thought the whole point of using directories in fslock was that that wasn't the case	12:52
rogpeppe	fwereade: if that is the case, then fslock is broken AFAICS	12:53
* dimitern bbiab		13:01
TheMue	fwereade: the propose is in again	13:06
* TheMue is at lunch		13:06
fwereade	rogpeppe, I'm not at fslock properly yet	13:07
rogpeppe	fwereade: ok	13:07
rogpeppe	fwereade: a review from you would be good before it goes in	13:08
fwereade	rogpeppe, re jujuc: hmm, I'm not so sure jujuc logging is entirely a misfeature	13:08
fwereade	rogpeppe, if anything I'd say "good point, add start logging to jujuc commands"	13:09
fwereade	rogpeppe, I would expect that in the eventual case that would be logged at a lower level than that specified for juju-log	13:09
rogpeppe	fwereade: the problem is the jujuc log command. do we really want that to produce two lines of log for every line the user intended to log?	13:10
rogpeppe	fwereade: i thought that the level was one of the arguments to juju-log	13:10
rogpeppe	fwereade: well, one that we currently discard, i'm aware	13:10
rogpeppe	fwereade: but given that we have those levels now, i think it probably should support that	13:11
fwereade	rogpeppe, it is one of the arguments to juju-log, and ofc it should support that	13:11
fwereade	rogpeppe, but that's the level at which the supplied message should be logged, and it's independent of the level at which we log command completion	13:12
rogpeppe	fwereade: so if i do juju-log --level=debug foo, do i really want to see two lines at that level in the log?	13:12
rogpeppe	fwereade: if you think that's ok, then i'll go with logging in cmd.Main. i didn't think it was, which was why i changed it.	13:13
fwereade	rogpeppe, yes, I think so, because if we unfuck juju-log we will also, I imagine, badge its output clearly such that we can filter that stuff easily	13:13
fwereade	rogpeppe, might be wrong	13:13
rogpeppe	fwereade: i'm just imagining a scenario where the user is producing lots of output using juju-log. we just doubled the number of lines produced	13:14
rogpeppe	fwereade: and log file size is actually a significant issue for us.	13:14
fwereade	rogpeppe, how much would we save if we dropped the logging in state.Open except (1) before the dial func and (2) on errors inside the dial func?	13:16
fwereade	rogpeppe, saving 240 lines/agent/hour will probably leave us some wiggle room:)	13:17
rogpeppe	fwereade: i don't know. i have thrown away the last huge log file i acquired	13:17
fwereade	rogpeppe, not as much help as silencing the txn spam, I agree	13:18
rogpeppe	fwereade: i just think it's a bit weird that a command whose sole purpose is to produce a line of log output actually produces two.	13:18
rogpeppe	fwereade: well, probably more still actually, as the uniter probably logs when it gets a jujuc command excecution request	13:19
rogpeppe	fwereade: i think the changes to logging in state.Open sound reasonable.	13:20
mramm	morning all (well early my morning)	13:20
rogpeppe	mramm: hiya	13:20
fwereade	rogpeppe, I think that the general "it's good to know what commands we run" principle probably beats the (otherwise clearly sane) juju-log consideration	13:20
rogpeppe	mramm: i won't be able to join the kanban meeting today as my broadband connection is out	13:21
rogpeppe	fwereade: i think i agree. but the "finished" message perhaps doesn't	13:22
fwereade	rogpeppe, sorry, the agent finished messages?	13:22
mramm	bummer that	13:22
rogpeppe	fwereade: the "finished" message in cmd.Main	13:22
rogpeppe	mramm: you never know, the engineers might turn up (chortle, chortle)	13:23
fwereade	rogpeppe, I don't see one	13:24
fwereade	rogpeppe, I see"command failed"	13:24
rogpeppe	fwereade: that's the one you're suggesting	13:24
rogpeppe	fwereade: "Why aren't we doing this at the end of cmd.Main?"	13:24
fwereade	rogpeppe, and I just suggested an alternative: in SuperCommand	13:24
rogpeppe	fwereade: orly? i don't think i've seen that	13:25
fwereade	rogpeppe, or... apparently I didn't	13:25
fwereade	rogpeppe, the insanity peppers must be kicking in	13:25
fwereade	rogpeppe, or maybe I'll discover it in the wrong buffer somewhere in an hour	13:25
rogpeppe	:-)	13:25
* fwereade pokes himself with something sharp		13:25
rogpeppe	fwereade: that happens to me too often	13:26
rogpeppe	fwereade: why would doing it in SuperCommand help? doesn't jujuc use SuperCommand?	13:26
fwereade	rogpeppe, I think niemeyer talked me out of it, let me check	13:26
rogpeppe	fwereade: you're right, i don't think it does	13:27
rogpeppe	pwd	13:27
rogpeppe	fwereade: still seems a bit iffy to me. we're logging in SuperCommand because use of SuperCommand happens to correspond exactly with the commands we want to log finished messages for.	13:28
fwereade	rogpeppe, feels a bit icky but maybe not too much so	13:28
fwereade	rogpeppe, because SuperCommand at least does know about logs	13:28
fwereade	rogpeppe, and in general I don't think we should expect either main.Main or cmd.Main to have logging configured	13:28
rogpeppe	fwereade: and juju doesn't?	13:28
fwereade	rogpeppe, not much point logging if you don't have a target IMO	13:29
rogpeppe	fwereade: true, but no harm either, no?	13:29
fwereade	rogpeppe, message that looks like it should be printed but actually isn't	13:29
rogpeppe	fwereade: true of all log messages...	13:29
fwereade	rogpeppe, I think that , all other things being equal, it is better to write a given message when you know that logging setup has occurred, and to write important messages elsewhere if you can't be sure	13:31
fwereade	rogpeppe, but that is kinda a derail	13:31
fwereade	rogpeppe, how do you feel about dropping it from cmd.Main and logging success/failure in SuperCommand?	13:31
fwereade	rogpeppe, independent of reasons I might like it ;)	13:31
rogpeppe	fwereade: that seems reasonable to me. i'd forgotten that logging was so tightly bound up with SuperCommand.	13:32
fwereade	TheMue, https://codereview.appspot.com/8824043/ LGTM	13:38
TheMue	fwereade: yeah, just read it, thank.	13:39
fwereade	dimitern, rogpeppe, jam, anyone: a second look at that would be handy; I am not entirely happy about some of the code but it produces the correct output and that's the critical thing at this point	13:39
TheMue	fwereade: the deleting of the units of subordinate service is done this way in py too. the data is used before to add the unit info to the principal units.	13:41
rogpeppe	fwereade, TheMue: looking	13:42
fwereade	rogpeppe, https://codereview.appspot.com/8821043/ LGTM trivial	13:42
rogpeppe	fwereade: ta!	13:43
* rogpeppe is trying hard to grok the code in processServices		13:48
TheMue	rogpeppe: maybe i should move those loops into two proprocessing funcs with a speaking name	13:49
rogpeppe	istm that it should be possible to get the type system working for us there, rather than fighting it in every line	13:49
TheMue	rogpeppe: it's only a mix of the collected data into the output	13:49
TheMue	rogpeppe: especially the lower loop is exactly how it is done today in py	13:50
rogpeppe	TheMue: the difficulty i have is that we're spending a lot of code dynamically type casting stuff that perhaps we can already know the types of.	13:50
TheMue	rogpeppe: but this map[string]interface{} indeed complicates it sigh	13:50
TheMue	rogpeppe: yes, i'm not happy about it too	13:50
TheMue	rogpeppe: but regarding the feature freeze i think we should change this afterwards	13:51
rogpeppe	TheMue: at the least it could do with some comments so we know what "post-processing the subordinates" is actually doing.	13:52
TheMue	rogpeppe: will add them	13:52
=== wedgwood_away is now known as wedgwood
=== wedgwood is now known as Guest86505
rogpeppe	TheMue: what do subFromMap and subToMap do?	13:59
TheMue	rogpeppe: will find better names. the subToMap had its name from collecting the service names for the output "subordinate-to". ;)	14:00
rogpeppe	TheMue: what's are the keys and what to the keys map to?	14:01
rogpeppe	s/to the/do the/	14:01
TheMue	rogpeppe: they do hangout ;)	14:01
rogpeppe	TheMue: given that the data types are so dynamic, we really need more comments	14:01
TheMue	rogpeppe: i'll add comments and change the names	14:01
rogpeppe	hmm, the link box says it's got a link, though the phone's still dead. will try broadband again.	14:02
rogpeppe1	woo!	14:03
mramm	https://plus.google.com/hangouts/_/539f4239bf2fd8f454b789d64cd7307166bc9083	14:04
rogpeppe1	fwereade: hmm, i'm not sure if it's working	14:05
rogpeppe1	very limited bandwidth!	14:06
rogpeppe1	dimitern: i saw you for a moment...	14:06
=== wedgwood_ is now known as wedgwood
TheMue	fwereade: is it ok to move the status in with your comments handled and improvements in readability and comments?	14:34
fwereade	TheMue, has anyone else at least glanced at it?	14:35
fwereade	TheMue, I care a lot more about output than about readability though :)	14:35
TheMue	fwereade: roger is looking, and i could as dimitern	14:36
fwereade	TheMue, but yes, you have my LGTM, just get someone else's too :)	14:37
TheMue	fwereade: ok, will do	14:37
rogpeppe1	interesting, bandwidth tester indicates 1Mb download (not too bad) but 52Kb upload. not surprising that the hangout worked ok when i turned off my vid and microphone	14:38
dimitern	TheMue: you really hate merging conflicts it seems ;)	14:39
TheMue	dimitern: yeah, really, i do	14:39
TheMue	dimitern: and also i got my first lgtm ;)	14:39
dimitern	TheMue: no worries, it'll be easier for me as well once you land your stuff	14:40
TheMue	dimitern: great. had a chance to take a look at it?	14:41
dimitern	TheMue: just a brief glance	14:42
TheMue	dimitern: can i charm you a lgtm? ;)	14:43
dimitern	TheMue: in a bit, just looking through cards and bugs to make sure we're in sync	14:44
TheMue	dimitern: just wait, a new propose will fly in in a few moments	14:52
dimitern	TheMue: sure, np	14:52
TheMue	fwereade: ping	14:54
fwereade	TheMue, pong	14:55
TheMue	fwereade: your comment in the status_test, line 842, how do you think it shall work?	14:56
fwereade	dimitern, rogpeppe1, TheMue: trivial if anyone wants it: https://codereview.appspot.com/8768045	14:56
TheMue	fwereade: i'm lost with this comment.	14:56
dimitern	fwereade: LGTM, trivial	14:56
TheMue	fwereade: LGTM	14:57
fwereade	thanks guys	14:57
TheMue	fwereade: i thought i followed your first idea, but it seems that this has been wrong	14:57
fwereade	TheMue, I mean that adding a relation is a prereq of adding a subordinate, and that won't (shouldn't ;)) work if you add a subordinate of service S to more than one unit of service T	14:58
fwereade	TheMue, it's a good start	14:58
fwereade	TheMue, but relateServices already exists and shouldn't be duplicated	14:58
fwereade	TheMue, while adding the subordinate is an operation on an existing unit and an existing service	14:59
TheMue	fwereade: ah, ic	14:59
fwereade	popping out for a mo, bbiab	15:01
benji	gary_poster/rick_h_: yeah, I'm looking at it	15:13
gary_poster	benji, hi channel jumper	15:13
benji	pff	15:13
TheMue	dimitern: https://codereview.appspot.com/8824043/ is back in again	15:26
dimitern	TheMue: I'm on it	15:26
dimitern	TheMue: you got LGTM from me	15:33
dimitern	TheMue: please go ahead an land it, so I can continue on mine	15:34
TheMue	dimitern: thx, will do	15:34
TheMue	anyone experiencing the same bootstrap_test problems like me? i just merged the trunk and since then the test fails	15:48
dimitern	TheMue: what's you series?	15:52
TheMue	precise	15:53
dimitern	TheMue: i'm on quantal - jut got latest trunk and running tests now; haven't seen this issue before though	15:53
fwereade	TheMue, if you're seeing https://bugs.launchpad.net/juju-core/+bug/1169826 then I need help tracking it down please :)	15:56
TheMue	fwereade: hmm, no, i don't get a panic	15:57
dimitern	I also didn't get a panic, but got this error in environs/openstack: http://paste.ubuntu.com/5716314/	15:58
TheMue	fwereade: it's in bootstrap_test line 105	15:58
TheMue	fwereade: c.Check(urls, HasLen, len(test.uploads))	15:58
TheMue	fwereade: if i get it right it's one url instead of the expected two	15:59
fwereade	TheMue, expected/actual? which test?	16:00
dimitern	and i can see the cmd/juju tests almost doubled in running time the past 2 weeks	16:00
TheMue	fwereade: cmd/juju/bootstrap_test.go line 105	16:01
fwereade	dimitern, that's because we're actually testing stuff	16:01
fwereade	TheMue, that's a table test, the line number tells me very little	16:01
dimitern	fwereade: that's good then :)	16:01
fwereade	dimitern, ha, I can repro yours (one of them anyway) if I edit /etc/lsb-release	16:02
TheMue	fwereade: does "test 10: --upload-tools always bumps build number" help you more?	16:02
fwereade	TheMue, helpful, tyvm	16:02
fwereade	TheMue, dimitern, sorry I have to take a break :/ back soon	16:02
TheMue	fwereade: as my CL doesn't cover this is it ok to still submit it?	16:03
dimitern	fwereade: that's awesome, I'll remember this trick to change current series	16:03
dimitern	TheMue: submitting onto a broken trunk is a no no	16:03
hazmat	ods keynote livestream fwiw http://openstackportland2013livestream.eventbrite.com/	16:04
TheMue	dimitern: ok, so status has to wait	16:05
dimitern	hazmat: it says it's ended, but I can't see a link to the video..	16:05
hazmat	dimitern, by bad.. better link.. http://www.openstack.org/	16:06
dimitern	hazmat: cool, 10x!	16:07
hazmat	dimitern, mark's keynote is in +1hr 15m	16:08
dimitern	TheMue: just dump the whole error output + logs and paste it please	16:08
=== BradCrittenden is now known as bac
dimitern	hazmat: isn't it streaming live now? I can see it has started..	16:11
hazmat	dimitern, the keynotes have started, but there are some other ones scheduled first	16:11
dimitern	hazmat: ah, ok, cheers	16:12
dimitern	fwereade: I can reproduce the error I posed above consistently	16:14
TheMue	dimitern, fwereade: http://paste.ubuntu.com/5716425/	16:42
dimitern	TheMue: cheers	16:46
* TheMue has to take a break		16:59
=== TheMue_ is now known as TheMobileMue
=== deryck is now known as deryck[lunch]
dimitern	mark's keynote is staring now	17:23
rogpeppe1	dimitern: link?	17:50
dimitern	rogpeppe1: http://www.openstack.org/home/Video/	17:50
rogpeppe1	dimitern: is it really not possible to find out a service's subordinate or principal without looking at its units?	17:53
dimitern	rogpeppe1: I don't think you can	17:54
rogpeppe1	dimitern: seems a bit weird. so if there are no units, there's no subordinate-principal relationship between two units?	17:54
rogpeppe1	s/two units?/two services?/	17:54
dimitern	rogpeppe1: you can take a look at the service's charm as well	17:55
rogpeppe1	dimitern: how does that help?	17:55
rogpeppe1	dimitern: presumably you actually need to look at the relations	17:55
dimitern	rogpeppe1: if the service is running a subordinate charm, then all it's units are/will be subordinates	17:55
rogpeppe1	dimitern: yes, but for status we need to find out a service's principal service, irrespective of units, no?	17:56
dimitern	rogpeppe1: well, from the list of relation a subordinate service is participating in then	17:57
dimitern	rogpeppe1: relations*	17:57
rogpeppe1	dimitern: yup. i think that if we see that there's a relation between two services and scope is container, then we know	17:58
dimitern	rogpeppe1: makes sense, yeah	17:59
rogpeppe1	niemeyer: is there any way of telling goyaml to ignore a field when marshalling? (unexported fields seem to cause an error)	18:00
niemeyer	rogpeppe1: unexported fields shouldn't cause errors	18:01
niemeyer	rogpeppe1: Example?	18:01
rogpeppe1	niemeyer: one mo	18:01
rogpeppe1	niemeyer: http://paste.ubuntu.com/5716622/	18:03
rogpeppe1	niemeyer: one mo, perhaps i'm running an old version of goyaml	18:03
rogpeppe1	niemeyer: when i run that code, i see "error: YAML error: reflect.Value.Interface: cannot return value obtained from unexported field or method"	18:03
niemeyer	rogpeppe1: Haven't made any changes about that	18:03
niemeyer	rogpeppe1: Will test	18:03
niemeyer	rogpeppe1: That's a bug	18:03
niemeyer	rogpeppe1: Will fix	18:03
rogpeppe1	niemeyer: ah, ok	18:04
rogpeppe1	niemeyer: there's another bug i just found too	18:04
niemeyer	Surprising to see that broken	18:04
rogpeppe1	niemeyer: i haven't looked into the source, but it seems like it might be hashing types by name somewhere	18:05
niemeyer	I'd have guessed someone would have reported by now	18:05
niemeyer	rogpeppe1: Can't tell what that means	18:05
rogpeppe1	niemeyer: because if you use two function-scoped types of the same name, goyaml gets confused	18:05
niemeyer	rogpeppe1: Sorry, I still have no idea about what you mean.. an example helps	18:05
rogpeppe1	niemeyer: ok, here's an example (slightly bigger than it could be, but it's the stage i got to when i realised i knew what the problem was:	18:07
rogpeppe1	niemeyer: http://paste.ubuntu.com/5716642/	18:07
* rogpeppe1 has a very dodgy network connection currently		18:08
rogpeppe1	niemeyer: search for "noMethods" in that code	18:08
niemeyer	rogpeppe1: Ok?	18:09
niemeyer	rogpeppe1?	18:11
rogpeppe1	niemeyer: sorry, i'm just finding the output	18:12
rogpeppe1	niemeyer: ok, this is a slightly smaller example: http://paste.ubuntu.com/5716665/ and this is its output: http://paste.ubuntu.com/5716667/	18:16
rogpeppe1	niemeyer: the fields showing under exposed-service/0 are from the serviceStatus type, not from the unitStatus type as they should be	18:17
rogpeppe1	niemeyer: for example "charm" is showing the field from slot 1 of the unitStatus struct	18:18
niemeyer	rogpeppe1: It looks correct given the code	18:19
niemeyer	func (s serviceStatus) GetYAML() (tag string, value interface{}) {	18:20
niemeyer	type noMethods serviceStatus	18:20
niemeyer	return "", noMethods(s)	18:20
niemeyer	}	18:20
niemeyer	rogpeppe1: serviceStatus is returning a serviceStatus	18:20
rogpeppe1	niemeyer:	18:20
rogpeppe1	yuip	18:20
rogpeppe1	niemeyer: and unitStatus is returning a unitStatus	18:20
niemeyer	rogpeppe1: Yeah, and how will it ever get into a unitStatus?	18:20
rogpeppe1	niemeyer: but goyaml thinks that the unitStatus returned (well actually a noMethods value) is a serviceStatus	18:21
niemeyer	rogpeppe1: let me run the code here rather than guessing.. hold on	18:22
niemeyer	rogpeppe1: Yeah, there's a bg	18:23
niemeyer	bug	18:23
rogpeppe1	niemeyer: indeed :-)	18:24
niemeyer	rogpeppe1: I actually improved that logic a long time ago in bson, and forgot to implement it in goyaml	18:24
niemeyer	rogpeppe1: Easy to fix, though	18:24
rogpeppe1	niemeyer: cool	18:24
=== deryck[lunch] is now known as deryck
niemeyer	rogpeppe1: Both fixed and pushed.	18:49
rogpeppe1	niemeyer: woah!	18:50
rogpeppe1	niemeyer: nice one!	18:50
niemeyer	rogpeppe1: I already had the context for the latter issue, and the former was a silly mistake	18:51
rogpeppe1	niemeyer: at some point, it would be nice for goyaml to support `goyaml:"-"` like json does	18:52
=== TheMobileMue_ is now known as TheMobileMue
niemeyer	rogpeppe1: Done and pushed. Next? :)	18:55
* rogpeppe1 can't think of any more at the moment :-)		18:56
niemeyer	rogpeppe1: Cool, thanks for the reports	18:57
rogpeppe1	niemeyer: np. i was also surprised by the unexported field error, BTW.	18:58
niemeyer	rogpeppe1: I found a better naming for the flusher stuff, btw, which made me happier as well.	18:58
niemeyer	rogpeppe1: v2 is coming	18:58
rogpeppe1	niemeyer: great!	18:58
rogpeppe1	niemeyer: new name is?	18:58
niemeyer	rogpeppe1: type Task interface { Run; Kill }	18:58
rogpeppe1	niemeyer: perfect	18:59
rogpeppe1	niemeyer: that maps much better to how i came to understand it	18:59
niemeyer	rogpeppe1: Yeah, it maps better to the overall problem indeed.. and, funny enough, it was the previous name	19:00
niemeyer	rogpeppe1: But there were several refactorings after that which made things flow back and forth	19:00
rogpeppe1	niemeyer: i know how that goes...	19:01
niemeyer	rogpeppe1: The current design was finally sound, but the naming wasn't perfect anymore	19:01
rogpeppe1	niemeyer: overall it's a nice small package. i really like it.	19:01
niemeyer	rogpeppe1: Nice, I'm glad to hear it	19:01
* TheRealMue goes to bed now, good night		20:18
thumper	morning	20:42
fwereade	thumper, heyhey	20:44
thumper	hi fwereade	20:44
fwereade	thumper, I WIPped most of your fslock branches but there's only one serious question at the root of it, and then a feature we'd all like that will hopefully not hurt too much ("message" or similar on acquire)	20:45
thumper	fwereade: ok, what is the one serious question?	20:45
fwereade	thumper, the latter will enable units to reliably break their own locks, which I think is a Good Thing	20:45
fwereade	thumper, if some FSs can rename over empty dirs, shouldn't be we breating the held file before moving the dir into place?	20:46
fwereade	thumper, or does that break something I'm not seeing?	20:46
fwereade	s/breating/creating/	20:46
thumper	actually that makes a lot of sense, and yes, I'm +1 on that	20:46
fwereade	thumper, sweet	20:47
mgz	dumb go question, I have n int, how do I do (n != nil ? n : 0) idiomatically?	20:49
mgz	this is to fill in a bunch of struct fields, so I really don't want a three line if block per one...	20:49
fwereade	mgz, func valueOrZero(*int) int, I suspect	20:52
fwereade	mgz, nothing neater springs to mind	20:52
thumper	fwereade: reading your review comment, how would you confirm that you didn't take a lock while someone else had it? when running in a go routine, repeated?	21:07
fwereade	thumper, fist thought, may not be very smart, send a bunch of "acquired" and "released" thingies down a channel from n goroutines and check that you always get true/false/true/false/true/false	21:09
fwereade	thumper, can't think of anything obviously wrong with it though	21:10
fwereade	thumper, acq1/rel1/acq3/rel3/acq3/rel3/acq1/rel1 etc	21:11
* thumper nods		21:11
thumper	I get the feeling that I should smash the pipes together, and have a single proposal...	21:12
fwereade	thumper, I'd be fine with that I think	21:12
fwereade	thumper, (oh, I would also really like tests for the uniter)	21:12
thumper	fwereade: yeah, read that, will look into it	21:12
thumper	fwereade: is there any way to nicely simulate a crash during hook execution?	21:13
thumper	fwereade: actually, we can trivially have a test that has a lock for that uniter	21:13
fwereade	thumper, not nicely -- I think best to create the lock dir really	21:13
thumper	fwereade: and we can check when we create one, it unlocks it	21:13
fwereade	thumper, exactly	21:13
* thumper nods		21:13
thumper	fwereade: also, I don't like the message being the held file...	21:14
thumper	fwereade: feels wrong some how	21:14
thumper	fwereade: how do I check I own it? before unlocking	21:15
thumper	if we use message?	21:15
thumper	I'll update the code to use atomic writes for message	21:15
fwereade	thumper, message doesn't have to be the held file, you can create the two indpendently before the copy	21:15
thumper	and have an expected message on Lock	21:15
thumper	fwereade: jam was suggesting to use the message for held	21:15
thumper	just saying I'm -1 on that	21:16
fwereade	thumper, yeah, I rather like the nonce in held	21:16
fwereade	thumper, there is I think a meaningful distinction between "did this identity create the lock" and "did this specific chunk of memory create the lock"	21:17
fwereade	thumper, so, yeah, I don't think it's necessary to conflate the two	21:17
* thumper nods		21:17
thumper	fwereade: one problem with logging every time you can't get a lock	21:18
thumper	fwereade: when trying, is that retries are every second (by default)	21:18
thumper	fwereade: and hook execution could be a while	21:18
fwereade	thumper, ok, that's too many	21:18
thumper	in the order of minutes	21:18
thumper	fwereade: how about writing it the first time	21:19
fwereade	thumper, (not too many checks, just too many messages)	21:19
fwereade	thumper, first time is fine by me	21:19
thumper	fwereade: and each subsequent time the message changes?	21:19
fwereade	thumper, ah, yeah, nice	21:19
thumper	should give an obvious, not too invasive trail	21:19
fwereade	thumper, and maybe one every 120 times though the loop or something?	21:20
thumper	fwereade: I'd rather use time than a loop counter	21:20
thumper	although	21:20
thumper	hmm...	21:21
thumper	synchronization is hard	21:21
* thumper recalls his talk on "multithreading 101"		21:21
fwereade	thumper, follow your heart :)	21:21
thumper	gave that talk at a conference 7 years ago	21:21
thumper	I thought I knew heaps about it, in the end, I learned so much researching my talk it scared me	21:22
thumper	:)	21:22
fwereade	haha	21:22
fwereade	I know enough to be scared in general	21:22
fwereade	I have learned a lot over the years but somehow the amount I know remains characterizable by "enough to be scared"	21:22
thumper	I'm busy downloading sabdfl's openstack keynote in the background	21:23
thumper	youtube keeps killing it watching live	21:23
fwereade	it's on youtube, I'm listening to it	21:23
fwereade	ah it's only fallen over once for me	21:23
thumper	I had a huge amount of fun writing some lock free algorithms	21:23
rogpeppe1	thumper: i might be tempted to use atomic.AddInt32 to check that two things aren't holding a lock at the same time	21:23
thumper	after the third time it fell over, I decided o download it	21:23
thumper	rogpeppe1: can you explain more?	21:24
thumper	I don't quite get it	21:24
thumper	oh, so increment on lock	21:24
thumper	and decrement on unlock?	21:24
rogpeppe1	thumper: yup	21:24
thumper	and assert values	21:24
rogpeppe1	thumper: and after the increment, you look at the value returned	21:24
rogpeppe1	thumper: if >1 you've fucked up	21:24
* thumper nods		21:24
* fwereade expected rogpeppe1 to show up and explain how to do it better , cheers :)		21:24
thumper	sounds like a plan	21:24
rogpeppe1	thumper: also, i'd change the retry interval to zero when doing that test	21:25
rogpeppe1	thumper: all the better to stress with	21:25
thumper	rogpeppe1: ok, sounds good	21:26
rogpeppe1	thumper: what platform are we worried about not having O_EXCL, BTW?	21:30
thumper	rogpeppe1: I don't have an explicit reason, just went with what bzrlib had because I had talked to lifeless a lot about it, and it is convenient to add informational messages to	21:30
rogpeppe1	thumper: ok. messages would be just as convenient with a file though, i think, no?	21:32
thumper	I'm not familiar with O_EXCL	21:32
thumper	but I trusted the giant who came before me :)	21:32
rogpeppe1	thumper: :-)	21:32
rogpeppe1	thumper: bzr probably needs to run on way more platforms than us, but i suppose it can't harm	21:33
* thumper nods		21:33
rogpeppe1	thumper: i'm just slightly concerned about EBS network traffic though, when we could really do it all locally.	21:34
thumper	what EBS network traffc?	21:34
rogpeppe1	thumper: doesn't disk storage on amazon instances with EBS go across the network?	21:34
thumper	well, we are storing in the agent datadir, where it puts tools	21:35
thumper	we could do it anywhere	21:35
thumper	/var/run/juju would make sense to me	21:35
rogpeppe1	thumper: /tmp ?	21:35
thumper	I would assume some of these would be local	21:35
thumper	no, /tmp is fugly	21:35
thumper	/var/run was made for this reason	21:36
* rogpeppe1 has never heard of /var/run		21:36
thumper	AFAIK	21:36
fwereade	thumper, btw, I forget, do you definitely create the temp dirs on the same filesystem for actual atomic renames?	21:36
thumper	fwereade: I was assuming that /tmp and the juju data dir were on the same filesystem...	21:37
thumper	could create a temp filename in the lock dir to be sure	21:38
fwereade	thumper, +1	21:38
thumper	but I thought that was a bit messy	21:38
fwereade	thumper, subdir with invalid lock name?	21:38
thumper	.nonce or something	21:38
thumper	where nonce is the hex nonce	21:38
* thumper will work something out		21:39
* thumper goes to make a coffee and toast		21:39
fwereade	rogpeppe1, thumper: can I get a quick look at https://codereview.appspot.com/8804044/ please?	22:27
* thumper looks		22:27
fwereade	rogpeppe1, thumper: I promise I was using quantal for some of my testig across the pipeline, but clearly not quite as much as I thought I had	22:27
thumper	fwereade: given, with trivial	22:29
fwereade	rogpeppe1, thumper: the only other quantal complaints I am aware of are (1) verified transient, fixed by completing pipeline and (2) cannot repro at all -- that's a weird error in UpgradeJujuSuite that I propse to ignore until I can squeeze more details out of themue	22:29
fwereade	thumper, tyvm	22:30
rogpeppe1	fwereade: looking	22:30
thumper	np	22:30
rogpeppe1	fwereade: PTAL https://codereview.appspot.com/8842043	22:32
rogpeppe1	thumper: would appreciate it if you could take a look too	22:33
* thumper looks		22:33
rogpeppe1	fwereade: reviewed	22:33
* rogpeppe1 has to go to bed very soon		22:39
thumper	rogpeppe1: +1	22:43
rogpeppe1	thumper: thanks	22:44
thumper	rogpeppe1: personally I wouldn't have had map[string]map[string]interface{}, but instead had a few other structs instead of just reusing *state bits, but that's just me	22:44
thumper	and your approach is equally valid	22:44
thumper	so no review comment on that :)	22:44
rogpeppe1	thumper: that's about my maximum tolerance	22:45
rogpeppe1	thumper: thanks for forbearing :-)	22:45
thumper	np	22:45
rogpeppe1	thumper: actually, i don't have map[string]interface{} - i don't use interface{} at all and i'm really happy about that	22:48
thumper	actually you are right...	22:48
thumper	I typed from memory miss remembering	22:48
rogpeppe1	thumper: that was what really prompted this branch was the million occurrences of .(sometype) and thinking "i have no idea where this might panic or if it's valid"	22:49
thumper	:)	22:49
fwereade	rogpeppe1, LGTM just trivials	22:49
thumper	for me, I would have had the units map inside a struct with the service state	22:49
thumper	that's all	22:49
rogpeppe1	fwereade: oh sorry, i've just submitted	22:49
thumper	the rest looks really good	22:49
fwereade	rogpeppe1, meh, fix them tomorrow	22:50
rogpeppe1	fwereade: i saw your earlier LGTM	22:50
fwereade	rogpeppe1, huge win regardless	22:50
fwereade	rogpeppe1, yep, np at all	22:50
rogpeppe1	fwereade: i generally agree with your remarks. putting StatusError alongside other fields would be wonderful and mean that the json/goyaml custom marshallers could go	22:53
fwereade	rogpeppe1, no worries tonight though	22:53
rogpeppe1	fwereade: i just tried to be ultra-compatible because i didn't know what the constraints were	22:53
fwereade	rogpeppe1, +1	22:53
fwereade	rogpeppe1, in general adding fields is fine	22:53
fwereade	rogpeppe1, existing stuff we're trying to be a bit more careful about	22:54
rogpeppe1	fwereade: right, it's that time of day. actually way past. and if it's that time for me, then i suspect it might be for you too :-)	22:54
rogpeppe1	thumper, fwereade: g'night	22:54
fwereade	SLEEP IS FOR THE WEAK	22:54
thumper	rogpeppe1: night	22:54
fwereade	rogpeppe1, gn :)	22:54
* thumper is writing the lock stress test		22:54
davecheney	m_3: ping	23:04
mgz	davecheney: I can go find him if it's anything urgent	23:07
thumper	ah poos	23:08
davecheney	mgz: nah, just wanted to talk about load testing stuffs	23:11
* thumper facepalms		23:15
wallyworld_	thumper: how's the face?	23:15
thumper	covered	23:15
wallyworld_	what exasperated you?	23:16
thumper	$ go test	23:17
thumper	throw: all goroutines are asleep - deadlock!	23:17
wallyworld_	\o/	23:19
* thumper sighs c.Check(state, Equals, 1)		23:24
thumper	... obtained int32 = 1	23:24
thumper	... expected int = 1	23:24
=== wedgwood is now known as wedgwood_away
wallyworld_	lol	23:24
thumper	davecheney: is there a nice way to get a go routine to release it's time segment?	23:25
thumper	davecheney: I want to say, go run something else for a bit	23:25
thumper	davecheney: will time.sleep(0) do that?	23:25
* thumper recalls it does in other languages		23:25
davecheney	that will do it	23:26
davecheney	runtime.Gosched()	23:26
* thumper goes to the manual		23:26
davecheney	godoc runtime Goschel	23:27
thumper	that's what I want, awesome	23:27
davecheney	godoc runtime Gosched	23:27
davecheney	sleep will do the same thing	23:27
thumper	hmm...	23:28
davecheney	thumper: sorry, maasive lag in australia	23:28
thumper	stress test now fails	23:28
thumper	:(	23:28
thumper	it shouldn't	23:28
* thumper goes to read some more		23:28
davecheney	also try with GOMAXPROCS=8 (your n cores) go test	23:29
davecheney	will probably have a similar effect	23:29
thumper	heh, passes with max procs	23:31
thumper	fails without	23:31
thumper	failed that time	23:32
thumper	bugger	23:34
thumper	debugging this is going to be a PITA	23:34
davecheney	thumper: tried the race detector	23:38
davecheney	?	23:38
davecheney	if you have a branch, I can test it for you	23:38
thumper	what is the race detector?	23:39
davecheney	it's a feature on go 1.1	23:39
davecheney	it is hte same thread santiser that is available in clag 3.2 / gcc 4.8	23:39
davecheney	thumper: http://tip.golang.org/doc/articles/race_detector.html	23:39
thumper	davecheney: lp:~thumper/juju-core/fslock-mashup has a failing stress test in utils/fslock	23:40
thumper	passed with 1 and 2 concurrent locks, failed with 3	23:40
thumper	that is why it currently says 3	23:41
thumper	I want 10 in the end...	23:41
davecheney	thumper: two secs	23:44
davecheney	mgz: is mramm around ?	23:48
mgz	not near me currently, could either be at the booth or in a meeting	23:49
davecheney	mgz: nm	23:50
davecheney	thumper: sorry, got distracted by submitting my own branch	23:57

Generated by irclog2html.py 2.7 by Marius Gedminas - find it at mg.pov.lt!