/srv/irclogs.ubuntu.com/2013/09/26/#juju-dev.txt

=== gary_poster is now known as gary_poster|away
* thumper misses list comprehension in go		02:23
wallyworld_	thumper: mr ocr, i have a branch which hooks up simplestreams mirrors support for tools https://codereview.appspot.com/13952043	03:02
wallyworld_	for fuck sake, out landing bot has been shot down	03:07
wallyworld_	shut	03:07
wallyworld_	ah maintenance i think	03:07
thumper	hi wallyworld_	03:13
wallyworld_	hi	03:13
thumper	I'll look shortly	03:13
wallyworld_	i hope canonistack is back soon	03:13
wallyworld_	np	03:13
bradm	wallyworld_: I thought it was back already?	03:20
wallyworld_	bradm: when i nova list it says the instances are shutdown	03:20
wallyworld_	+--------------------------------------+----------------------+---------+-------------------------+	03:20
wallyworld_	| ID | Name | Status | Networks |	03:20
wallyworld_	+--------------------------------------+----------------------+---------+-------------------------+	03:20
wallyworld_	| 4829b364-72ad-4ee7-a21c-3ba640f28854 | juju-gobot-machine-0 | SHUTOFF | canonistack=10.55.32.55 |	03:20
wallyworld_	| 97a7c226-a195-4014-9df5-c998bba3a491 | juju-gobot-machine-3 | SHUTOFF | canonistack=10.55.32.52 |	03:20
wallyworld_	+--------------------------------------+----------------------+---------+-------------------------+	03:20
bradm	wallyworld_: yeah, the compute node being rebooted will do that	03:21
wallyworld_	bradm: it would not have been in the procedures to restart stuff that was running?	03:21
bradm	wallyworld_: I wasn't directly involved, but it would seem not to be the case	03:22
wallyworld_	:-(	03:22
wallyworld_	this is the second time our instances have been broken :-(	03:22
bradm	you can't just power it on?	03:22
wallyworld_	i'm not sure how	03:22
wallyworld_	i assume there's a nova command	03:23
wallyworld_	i'll take a look	03:23
bradm	nova start <id>	03:23
wallyworld_	yeah, trying that now	03:23
wallyworld_	bradm: back running, seems quicker perhaps	03:24
bradm	wallyworld_: probably, there's likely hardly anyone else's instances going :)	03:25
wallyworld_	\o/	03:25
bradm	the compute nodes are pretty beefy machines	03:26
bradm	they're just being overcommitted by a lot	03:26
bradm	I'll chase up what happened internally, the announcements did say things would be restarted	03:29
bradm	but that definitely appears not to be the case, or at least not consistently	03:29
wallyworld_	thanks :-)	03:31
thumper	wallyworld_: something is wrong with the gobot	03:33
thumper	no mongod	03:33
wallyworld_	:-(	03:33
wallyworld_	i'm not familiar with how it is set up sadly	03:34
thumper	we need more monday gods	03:34
thumper	mon-god	03:34
wallyworld_	yeah	03:34
wallyworld_	although stopping and starting should not have affected it you'd think	03:34
bradm	fwiw with my dinky little juju test env on lcy02 the reboot didn't break it, its back up and going	03:36
thumper	oh good	03:37
wallyworld_	thumper: i had a quick look - mongod is in /usr/local/bin and /usr/local/bin is in the path so i'm not sure	03:38
* thumper -> haircut		03:41
hazmat	thumper, testing saucy local fwiw	03:50
hazmat	thumper, is there a particular version of interest? trunk i assume?	03:52
hazmat	thumper, fails for me.. although looks like diff issue, namely the upstart job needs a wait between dropping an upstart template to disk and starting till inotify triggers and register with upstart.	04:16
bradm	this is very interesting, a default juju bootstrap on lcy02 fails, since the instance type isn't big enough	04:32
bradm	mongodb shuts itself down saying there's not enough space	04:34
wallyworld_	bradm: that started happening about a week ago for some reason, i think folks are looking into it	04:53
bradm	wallyworld_: I can tell you why	04:53
bradm	wallyworld_: the default instance is a m1.tiny, which has a 2G /	04:54
wallyworld_	ok :-)	04:54
wallyworld_	juju used to be ok in 1G	04:54
wallyworld_	or even 512	04:54
bradm	wallyworld_: I just bootstrapped with more, mongodb alone uses 3G	04:54
bradm	wallyworld_: its disk thats the issue, not memory	04:54
wallyworld_	serious? the landing bot bootstrap machine used to be a 512M instance	04:54
wallyworld_	ah disk	04:54
wallyworld_	i thought you were talking about ram	04:55
wallyworld_	still, juju should not pick tiny on canonistack	04:55
bradm	I can't say why mongodb suddenly wants all your disk, but that seems to be the case	04:55
bradm	bootstrap it with a m1.tiny and you'll see, check the logs in /var/log/mongodb	04:56
bradm	it pretty clearly says it needs more disk	04:56
wallyworld_	there's some issue in how juju is choosing the instance	04:56
wallyworld_	it used to work. it should be picking small	04:56
wallyworld_	i'm not sure of the current status though but it is being looked at	04:56
bradm	yeah, not sure where it changed, but that's the fix, to bootstrap with constraints that give you a bigger disk	04:56
bradm	is that whats happening with your gobot? needs more disk for mongodb?	04:59
bradm	I wonder if mongodb should be using the smallfiles option	05:01
wallyworld_	bradm: could be, but it was running fine before the shutdown	05:02
bradm	wallyworld_: /var/log/mongodb/mongodb.log should make it pretty clear	05:04
wallyworld_	bradm: yeah, true. i'm tied up trying to get some coding finished, but i'll look soon	05:05
bradm	wallyworld_: cool, I can do some more testing myself once I've gotten this charm done	05:15
wallyworld_	bradm: ok. i'm flat out right now as i'm off from tomorrow for a week and am trying to get everything done before i go. i'll hopefully be able to look a bit later	05:16
bradm	wallyworld_: actually, I'm off next week too :)	05:17
wallyworld_	\o/	05:17
wallyworld_	going anywhere?	05:17
bradm	yeah, my parents have taken our son for a holiday, we're driving up there to pick him up and spend some time with them	05:18
bradm	its one of our first times (outside of hospital) that we've been away from him, its interesting	05:18
wallyworld_	how old is he?	05:19
bradm	6	05:19
wallyworld_	yeah, we didn't spend time away from our son for a few years also	05:19
bradm	there are medical issues with him too, so we're probably a bit more protective than normal	05:20
wallyworld_	yeah, i can understand that	05:20
bradm	he had 2 open heart surgeries before he was 5	05:20
wallyworld_	wow	05:25
wallyworld_	glad he's ok	05:25
bradm	yeah, he's pretty good given what he's had to go thru	05:25
bradm	how about you? going anywhere interesting?	05:25
wallyworld_	hervey bay to watch whales, then to fraser island for a few days	05:26
wallyworld_	looking forward to it	05:26
bradm	ahh, nice - I've been to hervey bay whale watching before, lots of fun	05:27
wallyworld_	yeah, me too about 10 years ago	05:28
wallyworld_	with kid #1. now with kid #2	05:28
bradm	we're starting to think along those lines for holidays as the boy gets older, he might actually get a bit more out of it	05:29
wallyworld_	yep. we took kid #1 to nz when he was 4 and he remembers nothing. what a waste	05:29
bradm	it'd be pointless for us before now, we always seemed to spend a good portion of the year with him in and out of hospitals	05:30
wallyworld_	that's a shame, i hope he gets well asap	05:32
bradm	he's been really good this year	05:32
bradm	usually a flu would mean a trip to hospital, this year so far things have been good	05:32
wallyworld_	\o/	05:34
bradm	ohh, there's two mongodb running in my juju env	05:35
bradm	and its the non juju one taking up all the space	05:35
bradm	the juju started one has --smallfiles, the other one doesn't	05:37
wallyworld_	ah	05:38
=== thumper is now known as thumper-afk
jam	wallyworld_: as I'm working through some other things, I came across this question. How does "juju bootstrap --upload-tools" work today w/ openstack. Doesn't it put the tools in your private bucket, which should not be world readable?	06:13
wallyworld_	jam: yes	06:13
wallyworld_	it puts them in private bucket	06:13
jam	wallyworld_: right, and both cloud-init and Upgrader just use a "wget" to get the tools	06:14
jam	no Auth	06:14
wallyworld_	jam: bot is down since the canonistack maintenance. i haven't had a chance to look deeply, but running tests says it can't find mongod in path, but mongod is in /usr/local/bin and that dir is in the path from what i can see	06:14
wallyworld_	jam: it uses a temp url	06:15
wallyworld_	which is publicly readable	06:15
wallyworld_	jam: i've tested bootstrapping with upload tools and simplestreams and it works fine	06:15
wallyworld_	unless i've missed something	06:16
wallyworld_	jam: when getting to tools URL, it does a storage.URL() which for environs storage returns a url from which anyone can read	06:17
jam	wallyworld_: we don't have temp urls on canonistack, IIRC. I'm worried we're actually making our private containers world readable	06:17
davecheney	wallyworld_: sounds like the bot is using the old tarball	06:17
davecheney	it should use mongodb from ppa:juju/stable	06:17
jam	wallyworld_: when working it out originally, we decided it was ok that the "public-bucket-url" had to be world readable	06:17
jam	I don't expect it to be any different with tools-url	06:18
jam	but I'm seriously suspecting that we should be able to "juju bootstrap --upload-tools" on Canonistack	06:18
wallyworld_	jam: i'll have to check but it all seemed to work ok	06:18
wallyworld_	upload tools is now automatic	06:18
jam	wallyworld_: you mean sync-tools ?	06:18
wallyworld_	no, upload	06:18
jam	again, I think if it is working, we have a security hole	06:18
wallyworld_	i don't recall explicitly setting permissions on the control bucket	06:19
jam	wallyworld_: "swift stat $PRIVATE-BUCKET" has ".r:*,.rlistings"	06:21
jam	wallyworld_: :(	06:21
wallyworld_	hmmm. the tool stuff doesn't set that i'm pretty sure	06:21
jam	wallyworld_: I don't know who is setting it, but it is wrong, and it means private tools won't work when we "fix" it.	06:21
jam	wallyworld_: I think the "auto-upload-tools" stuff creates a bucket and sets it world readable	06:22
wallyworld_	i'll have to check	06:22
wallyworld_	jam:	06:23
wallyworld_	containerName: ecfg.controlBucket(),	06:23
wallyworld_	// this is possibly just a hack - if the ACL is swift.Private,	06:23
wallyworld_	// the machine won't be able to get the tools (401 error)	06:23
wallyworld_	containerACL: swift.PublicRead,	06:23
wallyworld_	this was put in in january	06:23
wallyworld_	by dimiter i think	06:23
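
A minimal sketch of the trade-off discussed above, assuming a hypothetical test server in place of a real Swift container. `newContainer` and `anonymousStatus` are illustrative names, not juju-core or goose code. It shows why a credential-free fetch (like the bare "wget" jam mentions) only succeeds when the container is world readable, and gets a 401 otherwise.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// newContainer starts a hypothetical stand-in for an object store container.
// When publicRead is false, requests without an X-Auth-Token header get a 401.
func newContainer(publicRead bool) *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if !publicRead && r.Header.Get("X-Auth-Token") == "" {
			w.WriteHeader(http.StatusUnauthorized)
			return
		}
		fmt.Fprint(w, "tools tarball bytes")
	}))
}

// anonymousStatus performs a GET with no credentials, like a bare wget,
// and returns the HTTP status code.
func anonymousStatus(url string) (int, error) {
	resp, err := http.Get(url)
	if err != nil {
		return 0, err
	}
	defer resp.Body.Close()
	return resp.StatusCode, nil
}

func main() {
	for _, public := range []bool{true, false} {
		srv := newContainer(public)
		code, err := anonymousStatus(srv.URL)
		srv.Close()
		if err != nil {
			fmt.Println("request failed:", err)
			continue
		}
		fmt.Printf("publicRead=%v -> HTTP %d\n", public, code)
	}
}
```

The same property that lets the machine fetch its tools also lets anyone else read everything in the container, which is the concern raised in the channel.
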
jam	wallyworld_: with nobody realizing "you can't get the tools, but you're exposing all of your secrets to the world" ?	06:23
jam	I'm not 100% sure what goes in the private bucket	06:24
jam	as I don't think we put creds there.	06:24
jam	So it might be ok	06:24
wallyworld_	we put the state file there	06:24
jam	wallyworld_: which is just the IP address, right?	06:24
wallyworld_	i'd have to check but i don't think creds go there	06:24
wallyworld_	i think so	06:24
jam	wallyworld_: I think the only actually private thing is potentially private charms	06:29
jam	As I'm pretty sure we put the charm data in there	06:30
jam	however	06:30
jam	that also needs to be accessible via "wget" because of how we removed Environ creds from the Uniter agents.	06:30
wallyworld_	so it could be worse i guess	06:30
jam	fwereade: I need to chat with you about this.	06:31
jam	wallyworld_, fwereade: security bug #1231278	06:31
wallyworld_	ok	06:31
jam	(mup won't find it because it is private)	06:31
fwereade	wallyworld_, jam, reading back	06:32
jam	We just need a discussion, because there is certainly a "vulnerability vs not working at all" that we have to sort through.	06:32
jam	fwereade: G+ might be appropriate	06:32
jam	wallyworld_: bigjools seems to be enjoying himself without you so far :)	06:33
fwereade	jam, well, I'm here, if the sight of a dressing gown will not be damaging to your sensibilities	06:33
wallyworld_	jam: how do you know?	06:33
jam	wallyworld_: he's posting pics of the great barrier reef on G+	06:33
wallyworld_	fwereade: my eyes, my poor, poor eyes	06:33
jam	fwereade: started: https://plus.google.com/hangouts/_/26fdcf993421ca83a1cf0b1a3ddd35772695e493	06:33
wallyworld_	jam: ah ok. that social networking thing i ignore	06:33
jam	fwereade: you could just turn the camera off :)	06:34
davecheney	https://code.launchpad.net/~dave-cheney/juju-core/158-lp-1210407/+merge/187675	06:44
davecheney	axw: thanks for your review, see my comments	06:56
axw	looking	06:56
axw	davecheney: will lgtm, just curious about this: "we don't reboot machines" -- it doesn't work?	07:04
axw	I get your point though - it doesn't really matter	07:05
davecheney	axw: if you reboot a machine it gets a new ephemeral ip	07:05
davecheney	and at that point, nothing works	07:05
davecheney	axw: why do you say twice ?	07:09
davecheney	I get your point. It just feels wrong to do it twice when it only ought	07:09
davecheney	to be done once. But, given that's not really possible... LGTM.	07:09
davecheney	"	07:09
axw	davecheney: if it were able to reboot	07:09
axw	it's idempotent though, so doesn't matter.	07:10
davecheney	axw: fair point	07:10
davecheney	also, bootcmd http://cloudinit.readthedocs.org/en/latest/topics/examples.html	07:11
davecheney	does what runcmd does	07:11
davecheney	it has the same firstboot properties	07:11
axw	davecheney: ok, then the doc comment on juju-core/cloudinit/Config.AddBootCmd is wrong :)	07:12
davecheney	axw: ok, i'll fix that in a followup	07:12
axw	davecheney: in that page, the comment for bootcmd has a hidden gem: " * bootcmd will run on every boot"	07:13
davecheney	urgh	07:14
davecheney	oh well	07:14
davecheney	care factor, quite small	07:14
davecheney	this may fix the azure disk suckage	07:14
davecheney	but while reading that page	07:15
davecheney	where does it say runcmd is only run once ?	07:15
* axw shrugs		07:15
davecheney	i'm glad we've arrived at this place	07:16
axw	it says it in juju-core, but that's maybe not authoritative	07:16
davecheney	axw, best I can tell, we've never tried	07:17
davecheney	everyone knows rebooting an ec2 instance will screw it	07:17
axw	no worries, it's not a big deal	07:18
axw	fwereade: when you have a moment, would you mind expanding on your comments here? https://codereview.appspot.com/13832045/	07:18
fwereade	axw, sure	07:19
axw	I've changed the authentication stuff around a bit to allow HTTP GETs & HTTPS PUTs; wanted to know what you meant first, though, in case I was expending too much effort on this...	07:19
fwereade	axw, I was just ruminating that if cert distribution proves to be some sort of hassle (mainly because the CLI still needs direct storage access for deploy/upgrade-charm) we could use ssh storage for the manual provider and filesystem storage for the local one, because the clients that need write access should already have the information needed to set up the appropriate Storage types	07:21
axw	ah, right	07:22
axw	fwereade: does anything other than CLI need write access?	07:22
fwereade	axw, the API server itself may do	07:22
fwereade	axw, but (assuming non-HA, anyway) that's doable via the filesystem	07:23
axw	fwereade: ok. it can write directly, given it's local, so that's fine	07:23
axw	yep	07:23
axw	ok, so yes I did expend too much effort	07:23
axw	oh well	07:23
fwereade	axw, well, you expended it too early, at least	07:23
axw	:)	07:23
fwereade	axw, but no real harm done, I think	07:23
fwereade	axw, we would ideally like to not depend on provider storage at all but that's not an immediate plan	07:25
axw	ok	07:26
fwereade	axw, what we will need to do soon, though, is start exposing storage access via the API, so that an API-only CLI can still upload charms from local repos	07:27
axw	fwereade: also, when you're not busy, would you please look at my latest replies on these two: https://code.launchpad.net/~axwalk/+activereviews	07:27
axw	fwereade: I was wondering if/when storage would be API based	07:27
fwereade	axw, will do	07:27
axw	thanks	07:28
axw	fwereade: actually, my changes to httpstorage aren't for naught	07:34
axw	they'll allow GETs to not require a self-signed cert	07:34
axw	forgot that important bit :)	07:34
fwereade	axw, I don't think they are, indeed	07:34
axw	fwereade: I mean, changes I haven't pushed yet	07:34
fwereade	axw, ah right -- cool then :)	07:34
axw	I've been changing things today	07:34
fwereade	axw, https://codereview.appspot.com/13632046/ LGTM	07:38
axw	fwereade: thanks. the error is tested in jujutest/livetests	07:39
fwereade	axw, I was just thinking of direct tests for New/Is	07:39
axw	fwereade: ok, then no. I'll add some before landing	07:40
fwereade	axw, cheers	07:40
axw	eh that package has no tests... time to add some	07:42
rogpeppe1	mornin' all	07:48
axw	morning rogpeppe1	07:59
rogpeppe1	axw: hiy	07:59
rogpeppe1	a	08:00
rogpeppe1	:-)	08:00
fwereade	axw, https://codereview.appspot.com/13255051/ nearly LGTM, take a look and let me know your thoughts	08:04
fwereade	rogpeppe1, morning	08:04
rogpeppe1	fwereade: yo!	08:04
axw	fwereade: thanks, reading	08:04
axw	fwereade: replying now, but yeah, there is currently no handling of destruction for bootstrap nodes	08:12
axw	the others can be destroyed as usual	08:12
axw	I wasn't really sure where to draw the line with "null" :)	08:13
rogpeppe1	fwereade: i don't quite understand this comment: https://codereview.appspot.com/13912043/diff/1/environs/configstore/disk.go#newcode134	08:24
rvba	Hi jam, hi mgz… would any of you have time to talk about what seems to be a serious bug in the MAAS provider (bug 1229275).	08:25
rogpeppe1	fwereade: i think the only time we add attributes is when we call Prepare	08:25
_mup_	Bug #1229275: juju destroy-environment also destroys nodes that are not controlled by juju <maas (Ubuntu):New> <https://launchpad.net/bugs/1229275>	08:25
rogpeppe1	rvba: oops!	08:25
mgz	rvba: yup, but I'll need to get on a bus in a sec	08:25
fwereade	rogpeppe1, it's not really actionable, especially in the light of our later discussions	08:27
rogpeppe1	fwereade: ok, cool	08:27
fwereade	rogpeppe1, prepare chooses bootstrap-state and writes it; bootstrap uses exactly that	08:27
rogpeppe1	fwereade: yup	08:27
fwereade	rogpeppe1, it may involve some light massage of bootstrap responsibilities vs prepare responsibilities	08:27
fwereade	rogpeppe1, but nbd	08:27
rvba	So basically, juju destroys all the instances it gets back from the provider's instances() method, and that is basically all the instances.	08:27
fwereade	rvba, that looks like a critical to me	08:28
rvba	Critical indeed.	08:28
fwereade	rvba, how does the maas provider mark instances as controlled by itself?	08:28
rogpeppe1	rvba: the provider's Instances method should not be returning instances it didn't itself create	08:28
rvba	fwereade: it doesn't	08:28
rvba	rogpeppe1: that's the problem indeed.	08:28
rogpeppe1	rvba: the other providers take care to avoid that	08:28
fwereade	rvba, well, crap -- as someone maasy, how would you recommend we do so?	08:29
rvba	fwereade: if this needs to be addressed on the MAAS side, then the easiest way is probably to set a tag on the nodes. A tag identifying the juju environment.	08:30
rvba	Out of curiosity, how do the other providers do it?	08:31
mgz	rvba: either by looking at the security groups or the name attached to the instances, I believe	08:32
mgz	instances that env controls are given names juju-ENVNAME-*	08:33
rvba	Right… that's how the Azure provider works too now that I think of it.	08:36
fwereade	mgz, rvba: fwiw envname is bad	08:38
fwereade	mgz, rvba: long-term, envname can only ever be a local alias for the actual environment uuid	08:39
mgz	it's all a little dodgy, but I don't like the alternatives much	08:39
fwereade	mgz, rvba: and we've already had problems with two people using the same env name and same provider credentials	08:39
fwereade	it's easy to say "don't do that then"	08:40
mgz	well, we should check for that on bootstrap and blow up	08:40
mgz	right, I need to get on bus	08:40
fwereade	but that's not as helpful as designing things such that we don't have to do so in the first place	08:40
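
A minimal sketch of the name-based ownership check mgz describes, assuming instances carry names of the form juju-ENVNAME-*. The `instance` type and `ownedBy` function are hypothetical illustrations, not the code of any juju provider.

```go
package main

import (
	"fmt"
	"strings"
)

// instance is a hypothetical, minimal stand-in for a provider instance record.
type instance struct {
	ID   string
	Name string
}

// ownedBy returns only the instances whose name starts with the
// juju-ENVNAME- prefix, so unrelated machines are never reported.
func ownedBy(envName string, all []instance) []instance {
	prefix := "juju-" + envName + "-"
	var owned []instance
	for _, inst := range all {
		if strings.HasPrefix(inst.Name, prefix) {
			owned = append(owned, inst)
		}
	}
	return owned
}

func main() {
	all := []instance{
		{ID: "1", Name: "juju-gobot-machine-0"},
		{ID: "2", Name: "unrelated-database"},
		{ID: "3", Name: "juju-gobot-machine-3"},
	}
	for _, inst := range ownedBy("gobot", all) {
		fmt.Println(inst.ID, inst.Name)
	}
}
```

This also illustrates fwereade's objection: two users who pick the same environment name under the same credentials both match the same prefix, and an environment named "prod" would also claim machines belonging to "prod-eu". Keying on a unique environment identifier avoids both cases.
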
fwereade	rvba, I need to take a break for a bit, but... actually just a mo	08:41
fwereade	rvba, how does juju not destroy those other instances first?	08:41
fwereade	rvba, the provisioner will be asking for all instances and culling those it doesn't recognise	08:42
fwereade	rvba, so starting a juju environment should also kill everything else	08:42
fwereade	rvba, as should upgrading it	08:42
fwereade	rvba, do you know if that's the case?	08:42
rvba	fwereade: I just tested it, that's not what happens.	08:43
rvba	(I'm testing with the latest trunk)	08:43
fwereade	rvba, ok, that's weird	08:44
fwereade	rvba, if it's not culling unknown instances it implies that actually AllInstances is reporting the right ones	08:45
rvba	fwereade: that should happen during bootstrap right?	08:46
jam	fwereade: well, you also have to run "juju status" first or it can't poll the Provider at all yet	08:48
jam	we've had many "run status and everything dies" bugs :)	08:48
rvba	fwereade: I simply tested running "juju bootstrap", is the culling supposed to happen there or later, for instance when the bootstrap node comes up?	08:50
fwereade	rvba, jam speaks truth, you need to connect once before the bad things will happen	08:50
fwereade	rvba, it'll happen when the provisioner starts running	08:50
rvba	Okay, testing that now.	08:50
rvba	(node is installing)	08:50
fwereade	rvba, which will happen just after the first command that connects	08:51
fwereade	rvba, cheers	08:51
fwereade	axw, responded, let me know what you think of the Destroy error question	08:51
fwereade	bbiab	08:51
axw	fwereade: yes sorry, I agree Destroy should return an error for now	08:53
rvba	fwereade: you were right, culling did happen.	08:56
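
A minimal sketch of the culling behaviour confirmed above, assuming the provisioner compares what the provider reports against what the environment's state knows about. `unknownInstances` is a hypothetical helper, not the actual provisioner code.

```go
package main

import "fmt"

// unknownInstances returns the instance IDs the provider reported that
// the environment has no record of. A provisioner that culls unknown
// instances would stop every ID in the returned slice.
func unknownInstances(reported []string, known map[string]bool) []string {
	var unknown []string
	for _, id := range reported {
		if !known[id] {
			unknown = append(unknown, id)
		}
	}
	return unknown
}

func main() {
	// The provider reports every node it can see, not just juju's.
	reported := []string{"node-a", "node-b", "node-c"}
	// The environment only knows about the node it bootstrapped.
	known := map[string]bool{"node-a": true}

	fmt.Println("would be culled:", unknownInstances(reported, known))
}
```

If the provider's listing is not scoped to the environment, every unrelated machine ends up in the unknown set, which matches what rvba observed.
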
fwereade	rvba, well, ok, the good thing here is we don't have to worry about backward compatibility then, because nothing (sensible) we do can make the situation any worse	09:02
fwereade	axw, this right here ^ is a reason for an EnvironUpgrader that acts directly on the environment (independent of the should-it-hit-state discussion)	09:05
* axw reads back		09:06
fwereade	axw, short version: maas instances are not tied back to their environment, and getting instances from maas gets all instances, not all instances in the environment	09:07
* axw nods		09:07
axw	and destroys them all	09:07
axw	fwereade: where's the link to EnvironUpgrader?	09:07
axw	I didn't get much sleep last night, so a little slower than usual today	09:08
fwereade	axw, sorry, we were chatting about it in the state-upgrades thread	09:08
fwereade	axw, your contention was that it should connect to state	09:09
fwereade	axw, I think that's the wrong way round	09:09
fwereade	axw, but that adding an optional upgrade method to environ might be a good idea for other reasons	09:10
axw	fwereade: for example, so you could add a tag to the maas nodes that you control?	09:10
fwereade	axw, exactly so :)	09:10
fwereade	axw, or indeed so we could correct the envname problem (above) for the other providers	09:11
axw	fwereade: your latest reply clarifies things for me, and yes, much nicer to not manipulate state from environ	09:11
fwereade	axw, great	09:11
axw	fwereade: I've updated https://codereview.appspot.com/13255051/	09:15
axw	okay if I handle Destroy properly in a followup?	09:16
fwereade	axw, absolutely	09:19
fwereade	axw, LGTM	09:20
axw	thanks	09:20
axw	fwereade: I'll get the last of the httpstorage stuff in next, then get onto Destroy	09:20
fwereade	axw, perfect, tyvm	09:21
axw	fwereade: and then Prechecker wireup	09:21
fwereade	great -- that one's going to be a bit interesting, I think, we should plan how we get it in there ahead of time	09:22
* fwereade bbiab again, see you all at the meeting		09:22
axw	me too, I need a break. bbl	09:23
=== thumper-afk is now known as thumper
thumper	rogpeppe1: I've realized that I really don't like mornings	09:48
rogpeppe1	thumper: that's taken you a while :-)	09:49
rogpeppe1	thumper: i've realised that i forgot (again!) about our chat last night	09:49
thumper	My head just isn't in it that early	09:49
thumper	I should go check the agenda	09:50
jam	https://bugs.launchpad.net/bugs/1229275 is that actually Critical ?	10:36
_mup_	Bug #1229275: juju destroy-environment also destroys nodes that are not controlled by juju <juju-core:Triaged> <maas (Ubuntu):Triaged> <https://launchpad.net/bugs/1229275>	10:36
jam	seems High at best	10:36
jam	especially given "nobody is assigned to it"	10:36
dimitern	fwereade, there it is https://codereview.appspot.com/13963043 - first part, the secrets blanking will follow	10:58
jam	dimitern: the other way around	11:02
=== gary_poster|away is now known as gary_poster
fwereade	dimitern, would you take a really quick look at https://bugs.launchpad.net/juju-core/+bug/1229286 ? it feels somewhat likely to be unitery	11:18
_mup_	Bug #1229286: debug-log and boolean options are broken in trunk <juju-core:New> <https://launchpad.net/bugs/1229286>	11:18
dimitern	fwereade, looking	11:19
fwereade	dimitern, the config bits specifically	11:19
fwereade	dimitern, may be helpful to confer with TheMue, he was touching config recently	11:20
dimitern	fwereade, I haven't tried juju set when live testing the api uniter	11:20
dimitern	fwereade, just debug-hooks and relation-set/get	11:21
fwereade	dimitern, yeah, I should have thought of that	11:21
fwereade	dimitern, in fact the stuff you're doing is as critical as this regardless	11:21
fwereade	TheMue, is there any likelihood you'll be able to look into it this pm?	11:21
TheMue	fwereade: yep, will do	11:22
TheMue	fwereade: lunch in a few moments, but then	11:23
dimitern	fwereade, did debug-log show the hooks output before?	11:23
dimitern	frankban, hey	11:23
fwereade	TheMue, cool, thanks, please just verify what's happening with set vs config-changed	11:23
dimitern	frankban, about that bug ^^	11:24
dimitern	frankban, have you tried using debug-hooks instead?	11:24
frankban	dimitern: no	11:24
fwereade	dimitern, frankban: re logging you need to enable that logging in env config now	11:24
fwereade	dimitern, frankban: thumper knows exactly	11:24
dimitern	frankban, debug-hooks will show you if config-changed got fired	11:25
frankban	dimitern: as I mentioned in the bug description, I am pretty sure that config-changed is called	11:25
thumper	dimitern, frankban: it is due to logging changes that were made recently to make things more "productiony"	11:25
thumper	bootstrap with --debug	11:25
thumper	or --log-config=<root>=DEBUG	11:26
frankban	thumper: cool, good to know	11:26
thumper	or whatever you want	11:26
thumper	this log config then propagates to all the agents	11:26
frankban	thumper: so, by default, hooks output is not displayed in the debug log, correct?	11:26
dimitern	ah, good to know	11:26
thumper	can be updated using "juju set-env log-config=blah"	11:26
thumper	frankban: correct	11:26
thumper	only warning and errors	11:26
thumper	used to be debug for everything	11:27
thumper	I'll write an email for juju-dev tomorrow to explain the changes	11:27
thumper	and hooks	11:27
thumper	not juju hooks	11:27
thumper	but how to do other logging stuff	11:27
frankban	dimitern: so, the real bug is about boolean options: it seems they are always set to false	11:28
frankban	thumper: thanks for the clarification	11:28
thumper	np	11:29
dimitern	frankban, hmm.. TheMue, can this be relevant to your recent config changes?	11:29
TheMue	dimitern: should not, only empty settings have been touched	11:30
TheMue	dimitern: will take a look after lunch	11:30
* TheMue => lunch		11:30
dimitern	fwereade, did you have a chance to look at https://codereview.appspot.com/13963043 ?	11:35
fwereade	dimitern, been in meetings I'm afraid, i'll try to fit it in before I go for lunch	11:35
dimitern	fwereade, ok	11:36
fwereade	dimitern, did we not have an implementation for Upgrader that swapped out 127.0.0.1?	11:41
fwereade	dimitern, erDeployer	11:41
dimitern	fwereade, that's from there	11:41
dimitern	fwereade, it's not swapping anything	11:42
dimitern	fwereade, and it actually works like proposed - live tested on ec2	11:42
fwereade	dimitern, I see, ok, no quibbles with what we're doing	11:42
fwereade	dimitern, but would you please pull the common implementation of those methods out into a common type we can embed, like the other shared functionality?	11:43
fwereade	dimitern, I can live with that as an immediate followup	11:43
dimitern	fwereade, even though it's going away as soon as we have machine addresses?	11:44
fwereade	dimitern, we're still going to need to do the same thing in the same two places, aren't we?	11:44
dimitern	fwereade, I'll do it in this CL, not too much to do I think	11:45
fwereade	dimitern, we'd just stop using an environ to do so, surely	11:45
fwereade	dimitern, that's even better :)	11:45
fwereade	thanks	11:47
* fwereade quick lunch		11:48
jam	dimitern: https://codereview.appspot.com/13964043/ looks pretty much the same as the one you set back to WIP and were going to resubmit. Did you mark the wrong one?	11:52
jam	https://code.launchpad.net/~dimitern/juju-core/145-apiserver-provisioner-blank-secrets/+merge/187577 looks just like https://code.launchpad.net/~dimitern/juju-core/147-apiprovisioner-blank-env-secrets/+merge/187738	11:52
jam	dimitern: maybe you meant to reject https://code.launchpad.net/~dimitern/juju-core/146-apiprovisioner-addresses/+merge/187719 ?	11:53
dimitern	jam, no, it has almost the same description and diff, but different prereq	11:55
gary_poster	TheMue, when you get back would like to know how https://bugs.launchpad.net/juju-core/+bug/1224568 is doing	12:38
_mup_	Bug #1224568: Improve hook error reporting <juju-core:In Progress by themue> <https://launchpad.net/bugs/1224568>	12:38
TheMue	gary_poster: it's almost done, one smaller CL is missing. after investigating the problem of frankban i'll continue (tests are missing)	12:39
gary_poster	awesome thanks TheMue @	12:40
gary_poster	!	12:40
TheMue	frankban: ping	12:52
frankban	TheMue: pong	12:55
TheMue	frankban: the boolean value, how is it configured?	12:55
frankban	TheMue: I saw every boolean values set to False, both if they are true by default (in config.yaml) and when they are set to True using "juju set". Hope that answers your question	12:57
TheMue	frankban: the setting makes me wonder, there has been a change in getting handling nil values when default is set	13:00
TheMue	frankban: the change happened with rev 1800	13:00
frankban	TheMue: it is possible, I saw this problem in trunk, but it works as usual reverting to 1750	13:01
frankban	TheMue: the bug includes instructions to dupe, I'd ensure this is not something wrong in my local configuration before investigating	13:02
TheMue	frankban: so if 1799 would be ok and 1800 not we've got it ;)	13:02
TheMue	frankban: the change has been to omit nil values if default is set. and this may be interpreted as false	13:04
frankban	TheMue: the weird thing is that it seems the value is False in the hooks execution even when you explicitly set an option to true (and the default is false)	13:05
TheMue	frankban: are you still on 1750 or back on trunk	13:05
frankban	TheMue: 1750	13:05
TheMue	frankban: the hook execution part is strange	13:06
TheMue	frankban: take a look at http://bazaar.launchpad.net/~go-bot/juju-core/trunk/revision/1800, get.go line 52 (the rest are tests)	13:06
jam	TheMue: so rev 1800 has "if option.Default != nil { info["value"] = option.Default" which seems to be the only change. Otherwise we leave value untouched.	13:07
TheMue	frankban: yes, exactly	13:07
TheMue	frankban: before that change the map contains the key "value", only with a value nil	13:08
TheMue	frankban: so with a quick hack on your 1750 to behave here like the 1800 and showing the same errors shows that it's a shitty CL :/	13:10
frankban	TheMue: so you duped?	13:11
TheMue	frankban: yes, i would revert it then	13:11
TheMue	frankban: but it would help me if you make that quick hack test to be sure that this is the correct concluion	13:12
TheMue	conclusion	13:12
frankban	TheMue: are you sure the problem is there? AFAICT ServiceGet works correctly (the correct values are shown, i.e. in the GUI, and the GUI takes that information using the API)	13:16
TheMue	frankban: no, i'm not sure, that's so far the only change i've found regarding config later than 1750	13:17
fwereade	TheMue, there's another biiig one	13:17
fwereade	TheMue, uniter working via API	13:17
TheMue	frankban: so you see the correct values in GUI? fine	13:18
frankban	TheMue: yes	13:18
TheMue	frankban: ok, will investigate there (uniter)	13:18
dimitern	fwereade, updated https://codereview.appspot.com/13963043	13:24
fwereade	dimitern, cheers	13:25
fwereade	dimitern, nice and clean, LGTM	13:28
dimitern	fwereade, thanks	13:29
fwereade	dimitern, remind me what else is on your plate after that one? the blanking?	13:29
mgz	got a lead on our memory/tiny booting issues, bug 1227425 may be related	13:30
_mup_	Bug #1227425: Cloud images do not need apt-xapian-index <bot-comment> <cloud-images-build> <ubuntu-cloud-images> <Ubuntu:New> <https://launchpad.net/bugs/1227425>	13:30
fwereade	TheMue, ah-ha	13:30
fwereade	TheMue, a true boolean is being reported to the uniter as ""	13:31
dimitern	fwereade, I realized we no longer need StateAddresses() and APIAddresses() on agent.Config, so I'll remove these as well	13:31
fwereade	dimitern, nice	13:31
fwereade	dimitern, thanks	13:31
TheMue	fwereade: i'm currently digging in the uniter	13:31
TheMue	fwereade: where are you	13:31
fwereade	TheMue, add a boolean to testing/repo/series/wordpress/config.yaml	13:32
fwereade	TheMue, find the uses of assertYaml in uniter_test.go	13:33
fwereade	shit	13:38
fwereade	config data is getting squeezed through map[string]string and we didn't spot because we didn't have tests involving non-string config settings at the sharp end	13:38
rogpeppe1	a small MP that might speed up tests slightly: https://codereview.appspot.com/13968043/	13:39
frankban	TheMue: revno 1800 works well fwiw. trying 1845 now	13:39
TheMue	fwereade: testing it, just had to change something in my test code ;)	13:40
TheMue	frankban: aha	13:40
fwereade	TheMue, frankban, dimitern: state/apiserver/uniter/uniter.go:509	13:40
fwereade	TheMue, frankban, dimitern: those are not relation settings and are most definitely not a map[string]string	13:40
fwereade	TheMue, frankban, dimitern: this is critical	13:41
frankban	so sval, _ := v.(string) is killing booleans?	13:41
dimitern	fwereade, hmm	13:42
dimitern	fwereade, ok, so we need map[string]interface{} there?	13:42
fwereade	dimitern, yeah	13:42
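A minimal Go sketch of the failure frankban identifies: when the second result of a type assertion is discarded, every non-string value silently becomes the empty string. The function name and the option names are made up for illustration; only the `sval, _ := v.(string)` line comes from the discussion.

```go
package main

import "fmt"

// toStringMap copies settings the lossy way: the ok result of the type
// assertion is discarded, so every non-string value becomes "".
func toStringMap(in map[string]interface{}) map[string]string {
	out := make(map[string]string, len(in))
	for k, v := range in {
		sval, _ := v.(string)
		out[k] = sval
	}
	return out
}

func main() {
	settings := map[string]interface{}{
		"blog-title": "My Title", // hypothetical option names
		"debug":      true,
	}
	lossy := toStringMap(settings)
	fmt.Printf("%q %q\n", lossy["blog-title"], lossy["debug"]) // "My Title" ""
}
```

Keeping the values as `map[string]interface{}` end to end, as dimitern suggests, avoids the conversion entirely.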
TheMue	wow	13:43
fwereade	dimitern, the confusing range of configgy/settingsy types with their selection of arbitrarily different rules is deeply depressing to me	13:43
dimitern	fwereade, if it's only that, it's easy enough to fix the API	13:43
fwereade	dimitern, bad luck for getting caught up in it (and I probably reviewed it too :/)	13:44
fwereade	dimitern, I believe so	13:44
fwereade	dimitern, we did release with the uniter api active, didn't we?	13:44
dimitern	fwereade, we did	13:45
fwereade	dimitern, still, upgrading the return type won't actually hurt	13:45
fwereade	dimitern, or will it	13:45
fwereade	dimitern, what happens if we try to deserialize a map[string]interface{} with mixed values into a map[string]string?	13:45
dimitern	fwereade, it ignores non-strings?	13:46
fwereade	dimitern, that'd be nice, and I think it might, but we should check	13:46
dimitern	fwereade, I mean - non-strings get empty string values	13:47
fwereade	dimitern, that would mean behaviour wouldn't change	13:47
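The question of what happens when mixed values are decoded into a map[string]string can be checked in isolation. The sketch below assumes a codec that behaves like Go's standard encoding/json, which may not match every layer of the juju API; the function and option names are hypothetical. With encoding/json the mismatch is reported as an error rather than ignored, so an older client would need to tolerate that error for behaviour to stay unchanged.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// decodeAsStrings decodes a JSON object into map[string]string and
// returns whatever encoding/json reports.
func decodeAsStrings(data []byte) (map[string]string, error) {
	var out map[string]string
	err := json.Unmarshal(data, &out)
	return out, err
}

func main() {
	// A server-side map with mixed value types, as sent on the wire.
	data, err := json.Marshal(map[string]interface{}{
		"blog-title": "My Title",
		"debug":      true,
	})
	if err != nil {
		panic(err)
	}
	out, err := decodeAsStrings(data)
	// The string entry is decoded; the bool entry yields a type error.
	fmt.Println(out["blog-title"], err)
}
```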
dimitern	fwereade, I can do a CL that changes the result of ConfigSettings() to params.ConfigResults (new type - like SettingsResults, but with params.Config instead)	13:48
fwereade	dimitern, can we give them explicit ConfigSettingsResults and RelationSettingsResults names please?	13:49
fwereade	dimitern, and name the types they use ConfigSettings and RelationSettings	13:49
dimitern	fwereade, well, ConfigResult is used by the provisioner actually, for environ config result	13:49
rogpeppe1	dimitern, fwereade, TheMue, natefinch, mgz, jam: environment file extension: anyone want to weigh in? https://codereview.appspot.com/13969043	13:50
dimitern	fwereade, we can change these, but that means even more api incompatibility	13:50
fwereade	dimitern, type names are arbitrary, aren't they? where's the incompatibility?	13:50
fwereade	dimitern, field names are a problem	13:50
dimitern	fwereade, protocol on-the-wire might change?	13:50
dimitern	fwereade, or not, ok	13:51
fwereade	dimitern, if they suck we just have to eat it up and hope we learn from our mistakes :)	13:51
dimitern	fwereade, next CL will be about that then	13:51
fwereade	dimitern, I think it's even more important than the secret-masking tbh	13:51
fwereade	dimitern, this is a pretty devastating regression	13:52
TheMue	rogpeppe1: reviewed	13:53
dimitern	fwereade, I'm done with the provisioner for now - submitted the first for landing, the second one is next, and while waiting I'll tend to the uniter	13:53
* fwereade throws flowers before dimitern's path		13:54
natefinch	I like jenv because if we decide we don't like yaml anymore, we can put something else in there. I do sorta have a hatred for prefixing things with j, just due to an inordinate amount of time exposed to java crap	13:54
* natefinch isn't bitter though...		13:55
dimitern	fwereade, (if we ask Captain Hindsight for advice it'll be:) we would've caught this if we had tests for non-string settings	13:55
fwereade	thank you, Captain Hindsight!	13:55
fwereade	dimitern, perfectly correct	13:56
dimitern	fwereade, so I'll look about adding some	13:56
fwereade	dimitern, stick to local unit tests for the bit you change, for now, please -- I consider this critical and don't want to release with it again ;p	13:58
fwereade	dimitern, changing the uniter tests to exercise it may be noisy	13:58
fwereade	dimitern, they must ofc be done but they'll delay landing the fix	13:59
dimitern	fwereade, ok	13:59
fwereade	dimitern, that said, hmm, how do we test in the api?	14:00
smoser	hey	14:01
smoser	looking at https://codereview.appspot.com/13962043/	14:01
fwereade	dimitern, if we use wordpress' config settings	14:01
smoser	rather than disabling certificate checking ...	14:01
smoser	wouldn't it be better to add the certificates ?	14:01
smoser	it seems juju would know them.	14:01
smoser	cloud-init has config that explicitly allows adding certificates that should then be accepted.	14:02
smoser	hazmat, ^ ?	14:02
fwereade	jam, smoser makes an interesting point ^	14:02
fwereade	dimitern, anyway: if we are using wordpress as the "standard" testing charm	14:03
smoser	http://bazaar.launchpad.net/~cloud-init-dev/cloud-init/trunk/view/head:/doc/examples/cloud-config-ca-certs.txt	14:03
dimitern	fwereade, I have some simple charms I can use	14:04
fwereade	dimitern, we should probably just add all config types to it and so gently encourage people testing to actually check them all	14:04
fwereade	dimitern, you may find that the uniter is tightly coiled around the fake wordpress charm	14:04
fwereade	dimitern, but, eh, that's the next branch anyway, I'll stop distracting you	14:05
fwereade	smoser, I think the encompassing issue may be that some clouds don't even have certs configured	14:06
smoser	is that possible ?	14:06
smoser	ignorance being exposed....	14:06
smoser	but when i go to some https site with firefox	14:07
smoser	it says "Hey, this doesn't look right". You want to get the certificate and trust it ?	14:07
fwereade	smoser, I have only second-hand "knowledge", inferred from the conversations of those who know more than me	14:07
smoser	can't juju client just do the "get the certificate" bit. and then launch instances with that.	14:07
fwereade	mgz, IIRC you were doing ugly things to induce certificate errors recently I recall -- did I misunderstand your saying you'd been removing the certs temporarily and things had still worked?	14:08
dimitern	fwereade, the fix is done, testing now	14:34
mgz	fwereade: jam had done testing along those lines, but only for the client side so far I think (as it's harder to screw up the certs on a booted node and check that works)	14:38
jam	mgz, fwereade: I ssh'd into the node and messed up the certs for testing the patch I proposed.	14:43
jam	fwereade, smoser: While I like adding the functionality to allow a new known cert, I don't think it has the same user impact	14:43
jam	because digging up the cert and adding it to the config is far more complex than just shoving a "false" in there when you are testing.	14:43
jam	so I'd be happy to add support for custom certs	14:43
jam	but I think we still need the "disable" ability	14:44
smoser	jam, not necessarily	14:45
smoser	see my comment about firefox above	14:45
smoser	firefox basically allows me to say 'false' for checking of that server. and it does the rest.	14:45
smoser	i've actually done this once before on a project for explicitly this reason. i figured out how firefox did what it does... how it gets the certificate and did that. and inserted that certificate.	14:46
smoser	i do see the point about this being "testing" and that https is likely only used without certificates on "test" scenarios.	14:47
rvba	mgz: just one question about the tag solution: if you upgrade a juju deployment that was created before we used the tags and then use a version of juju which uses the tags to filter out machines, your deployment will be broken. What's the policy to solve that kind of upgrade problems in juju?	14:48
mgz	hm, good question	14:50
mgz	that would be the case with either solution	14:50
rvba	True.	14:51
mgz	we could use compat code that detected "hey, no tag named after our environment" and assumed the old behaviour that all machines are ours	14:51
mgz	but that may not be the best way	14:52
fwereade	rvba, mgz: we are getting closer to sanity for upgrades, but there's little so far	14:52
rvba	mgz: that seems like the only solution	14:52
fwereade	rvba, I was tending towards mgz's suggestion myself... it's bad but I don't see alternatives	14:52
rvba	Well, another solution is to have juju detect that there is no tag, and then create it and attach all the nodes it knows about to it.	14:53
mgz	we'd need to be doubly sure that destroy-environment twice wouldn't then go and delete all maas nodes anyway	14:54
fwereade	rvba, mgz, tag only the machines that have instance ids assigned in state?	14:54
mgz	because hey, the second time there's no tag named after our env, so everything must be ours, so wipe it...	14:54
rvba	fwereade: yes	14:54
rvba	mgz: the second time, no machine id will be stored, so no machine removed.	14:55
fwereade	rvba, mgz: that can't happen automatically within the environment though	14:55
mgz	it seems an easy enough disaster to avoid	14:56
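The compat idea and the hazard mgz raises can be sketched roughly as follows. This is an illustration only: the `node` type, the tag name, and `ownedNodes` are hypothetical and do not reflect the MAAS provider's real code or API.

```go
package main

import "fmt"

// node is a hypothetical provider machine with its set of tags.
type node struct {
	ID   string
	Tags []string
}

// hasTag reports whether n carries the given tag.
func hasTag(n node, tag string) bool {
	for _, t := range n.Tags {
		if t == tag {
			return true
		}
	}
	return false
}

// ownedNodes returns the nodes an environment would treat as its own.
// If no node carries envTag, it assumes a pre-tag deployment and falls
// back to the old behaviour of claiming every node.
func ownedNodes(all []node, envTag string) []node {
	var tagged []node
	for _, n := range all {
		if hasTag(n, envTag) {
			tagged = append(tagged, n)
		}
	}
	if len(tagged) == 0 {
		return all // compat path: this is the risky case
	}
	return tagged
}

func main() {
	nodes := []node{{ID: "a", Tags: []string{"juju-env-x"}}, {ID: "b"}}
	fmt.Println(len(ownedNodes(nodes, "juju-env-x"))) // 1: only node "a"
	// Once the tagged node is gone, a second pass claims node "b" too,
	// even though it never belonged to the environment.
	fmt.Println(len(ownedNodes(nodes[1:], "juju-env-x"))) // 1: node "b"
}
```

The second call shows why a repeated destroy needs an extra guard, such as the recorded instance ids rvba mentions.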
fwereade	rvba, mgz: yeah -- axw has a lot on his plate right now but he seems enthusiastic about doing the long-overdue upgrade stuff in the near future	14:57
rvba	mgz: maybe the first solution (explicitly supporting the old behavior) is simpler after all.	15:00
rvba	fwereade: out of curiosity, why doesn't juju itself keep track of the machines it owns?	15:02
fwereade	rvba, it does -- but Destroy is entirely internal to the environment, which is itself expected to keep track of its own machines and differentiate between those in and out of the environment	15:03
fwereade	rvba, it would indeed be possible to have written it such that juju had to specify all the instances it knew about	15:04
dimitern	anyone seen this local provider error: http://paste.ubuntu.com/6159055/	15:04
fwereade	rvba, but I think that would make it very hard for juju to effectively reap instances that it needed to itself	15:04
dimitern	it used to work fine a week ago	15:04
dimitern	loaded invalid environment configuration: storage-port: expected int, got 8040	15:05
fwereade	dimitern, that looks kinda like an int has been inappropriately coerced to a string somewhere, doesn't it	15:06
rvba	fwereade: I don't want to bother you with that, but I don't really understand. If juju has the list of all the machines it owns, it can pass it to the environment when destroying it.	15:06
dimitern	fwereade, it does	15:06
rvba	But that's not the way it works now so we have to fix the MAAS provider anyway :).	15:07
fwereade	rvba, if we start an instance but fail to record it against a machine, we want to automatically trash that instance	15:07
rvba	fwereade: hum, I see.	15:08
fwereade	rvba, I will try to make the situation clearer than it currently is in the writing-a-provider doc I'm working on	15:08
rvba	Cool	15:08
rogpeppe1	mgz: what's the status of the VPC-only bug?	15:17
hazmat	mgz, if you read the bug report, it states in the description how to get enabled with that on an existing account	15:22
hazmat	https://bugs.launchpad.net/juju-core/+bug/1221868	15:22
_mup_	Bug #1221868: juju broken with ec2 and default vpc <juju-core:Confirmed for gz> <https://launchpad.net/bugs/1221868>	15:22
hazmat	it took about 2 biz days	15:22
dimitern	fwereade, ping	15:26
fwereade	dimitern, pong	15:26
dimitern	fwereade, how do you suggest to live test that thing? so far I tried ec2 live testing and calling juju set svc flag=True, calls config-changed in a debug hooks session and config-get shows it as expected	15:27
fwereade	dimitern, that sounds solid	15:27
fwereade	dimitern, but that local provider thing is really alarming	15:27
dimitern	fwereade, I'll check on trunk to see if it's my branch or it's broken	15:28
fwereade	dimitern, thanks	15:28
TheMue	ah, tests pass	15:31
dimitern	fwereade, same effect in trunk	15:37
natefinch	args... couple annoying bugs in goyaml..... unmarshaling "" into a *string makes the string nil (not an empty string), and unmarshalling [] into a slice gives you a nil slice (not an empty slice). PITA	15:37
dimitern	fwereade, so the local provider was broken earlier	15:37
* fwereade freaks out at dimitern but wants to chat to nate for a moment		15:37
fwereade	natefinch, that's annoying	15:38
natefinch	fwereade: yeah, we already had one workaround in constraints	15:38
fwereade	natefinch, I'm sure there was a similar bug with goyaml in the past	15:38
natefinch	fwereade: yeah, we had to set up a whole SetYAML method because the containertype was getting unmarshaled as nil instead of empty.	15:40
fwereade	natefinch, ouch -- do you know if there's a goyaml bug for that?	15:40
dimitern	can someone else try bootstrapping a local environment from trunk and deploying anything, to see if all-machines.log shows this error http://paste.ubuntu.com/6159055/	15:40
natefinch	fwereade: didn't look like it when I perused the bug list (only 13 bugs listed)	15:41
dimitern	TheMue, rogpeppe1, jam, mgz ^^ ?	15:44
dimitern	and please make sure you did go install . in cmd/juju and jujud/, and use --upload-tools on bootstrap	15:45
mgz	hazmat: thanks, I'm just not certain I want to do that on the shared bzr account, how disruptive was it for you?	15:49
dimitern	fwereade, there's the fix https://codereview.appspot.com/13908044	15:53
hazmat	mgz, seamless, just pick a region you're not using	15:54
hazmat	mgz, you have to clear out ec2 resources in that region (ie no running instances, also good to clear out groups)	15:54
mgz	ah, that does seem good	15:54
hazmat	mgz, so i take it then there hasn't been any progress on this? we really need it for 1.16..	15:55
hazmat	i ran into two users last week, who couldn't use juju on ec2..	15:55
natefinch	fwereade: now there are bugs	15:56
fwereade	natefinch, thanks	15:56
dimitern	ok, so no one wants to try to reproduce the local provider issue, i'm filing a bug	15:58
sinzui	fwereade, do you have a revision that you want to release as 1.15.0?	16:01
fwereade	sinzui, I am very worried that I do not, because dimitern's problem seems pretty critical to me	16:02
sinzui	fwereade, okay. That's fine. Is there a bug I can track	16:03
fwereade	sinzui, dimitern is filing it as we speak	16:03
sinzui	fab. Thank you.	16:04
dimitern	fwereade, sinzui: there it is bug 1231543	16:07
_mup_	Bug #1231543: upgrader startup failure with local provider <juju-core:New> <https://launchpad.net/bugs/1231543>	16:07
sinzui	Thank you dimitern	16:09
fwereade	dimitern, would you please mark that critical and start investigating? TheMue, are you on something else or can you assist reproing?	16:10
dimitern	fwereade, it's filed as critical	16:10
dimitern	fwereade, and I'm looking at it	16:10
dimitern	fwereade, the uniter fix is proposed already	16:10
fwereade	dimitern, you anticipate my micromanagement with aplomb and panache	16:10
fwereade	dimitern, I'm about to LGTM it I think	16:11
fwereade	dimitern, yep, LGTM, just one tweak needed	16:12
dimitern	fwereade, ok, will tend to it afterwards	16:12
TheMue	fwereade: can do tomorrow morning, have to reactivate the matching VM (not enough space anymore on disk)	16:13
TheMue	fwereade: currently I'm fighting with a called but non-existing constructor sigh	16:15
* TheMue still will propose now, so the changes can be reviewed		16:16
* fwereade is taking a short family break but will return anon		16:18
TheMue	shit, propose will not work with the missing function :(	16:19
TheMue	dimitern: i'll start to setup my testing vm now	16:21
TheMue	dimitern: will you note any findings in the issue so that i can support you after setup later	16:22
TheMue	cu later	16:24
dimitern	TheMue, so far I tested it happens in trunk and r1885, will go further	16:24
rogpeppe1	dimitern, mgz, jam, natefinch: next stage in environment info storage, reviews appreciated please: https://codereview.appspot.com/13970043	16:26
rogpeppe1	fwereade: ^	16:26
rogpeppe1	dimitern: ping	16:30
dimitern	ok, so it doesn't happen as far as r1844, going back up	16:30
dimitern	rogpeppe1, pong	16:30
dimitern	rogpeppe1, I'm up to my elbows into the local provider atm	16:30
rogpeppe1	dimitern: i'm just wondering about API connections and how they can find the API addresses to store locally	16:30
dimitern	rogpeppe1, expand a bit please	16:31
rogpeppe1	dimitern: so, the plan is that when we make an API connection, we find out the current set of API addresses and store that locally in a .jenv file	16:31
dimitern	rogpeppe1, how about if they change after that?	16:31
rogpeppe1	dimitern: we refresh the cache each time we connect	16:32
rogpeppe1	dimitern: and fall back to environ config info if the connection fails	16:32
dimitern	rogpeppe1, sgtm	16:32
rogpeppe1	dimitern: but we need to find out the current set of API addresses so we can store them	16:32
rogpeppe1	dimitern: and i'm thinking of an API call that's available to anyone that can access the API that returns them	16:33
jam	rogpeppe1: it could be returned from Login	16:33
dimitern	rogpeppe1, so like a Login call	16:33
rogpeppe1	jam: that's an interesting idea	16:34
rogpeppe1	jam: i quite like that actually.	16:34
rogpeppe1	jam: then api.Open can cache it, so it can be retrieved by a later call	16:34
rogpeppe1	jam: so we don't have to change the type sig	16:34
jam	something like that, yeah	16:34
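The plan rogpeppe1 outlines (try the cached addresses, refresh the cache on every successful connection, fall back to the environment config if the cache fails) might look roughly like this. Everything here is hypothetical: `dialFunc`, `connect`, `fakeDial` and the addresses are invented for the sketch and are not juju's api.Open.

```go
package main

import (
	"errors"
	"fmt"
)

// dialFunc is a hypothetical dialer: it connects to one of addrs and
// returns the server's current list of API addresses (for example as
// part of the Login response, as jam suggests).
type dialFunc func(addrs []string) (current []string, err error)

// connect tries the cached addresses first and falls back to the
// addresses derived from the environment config. On success it returns
// the refreshed list that the caller would write back to the cache.
func connect(cached, fromConfig []string, dial dialFunc) ([]string, error) {
	if len(cached) > 0 {
		if current, err := dial(cached); err == nil {
			return current, nil
		}
	}
	return dial(fromConfig)
}

// fakeDial simulates a server that is unreachable via a stale address.
func fakeDial(addrs []string) ([]string, error) {
	if len(addrs) == 0 || addrs[0] == "stale:17070" {
		return nil, errors.New("cannot connect")
	}
	return []string{"10.0.0.1:17070", "10.0.0.2:17070"}, nil
}

func main() {
	fresh, err := connect([]string{"stale:17070"}, []string{"10.0.0.1:17070"}, fakeDial)
	fmt.Println(fresh, err)
}
```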
rogpeppe1	ah, there's a problem, i think	16:35
rogpeppe1	jam: i think that State.APIAddresses just returns the same IP addresses that mongo peers use to talk to each other	16:36
rogpeppe1	jam: which probably won't be public IP addresses	16:36
dimitern	rogpeppe1, they aren't	16:36
rogpeppe1	damn. i guess i'll need to fix that first	16:36
dimitern	rogpeppe1, but with the addresser stuff coming up it might not be needed	16:37
dimitern	machine addressability	16:37
rogpeppe1	dimitern: go on... how does that help?	16:37
dimitern	rogpeppe1, machines will know their own addresses (public, private, all)	16:37
rogpeppe1	dimitern: go on	16:38
dimitern	rogpeppe1, and you can query state for them, and there will be a worker to update them as needed	16:38
dimitern	rogpeppe1, mgz is working on that I think for some time	16:38
rogpeppe1	dimitern: so to find the API addresses, you do a search for all machines with ManageState, then query their addresses?	16:38
rogpeppe1	s/ManageState/JobManageState/	16:39
dimitern	rogpeppe1, yes	16:39
dimitern	rogpeppe1, and for other potential new jobs we have	16:39
rogpeppe1	dimitern: that seems somewhat inefficient. wouldn't it be a linear scan?	16:39
dimitern	rogpeppe1, who needs to know?	16:40
rogpeppe1	dimitern: it'll happen every time someone connects to the API	16:40
dimitern	rogpeppe1, and currently it happens thorough the StateInfo	16:40
rogpeppe1	dimitern: i was thinking that we'd have a doc in mongo which held the API addresses, then some agent would maintain that	16:40
dimitern	through	16:40
dimitern	rogpeppe1, that might be an addition to the addressability stuff, or even orthogonal to it	16:41
rogpeppe1	dimitern: i think it's orthogonal, yes	16:41
rogpeppe1	hmm, how does a machine's public address get filled in now? by the provisioner, i guess	16:42
mgz	rogpeppe1: that's the idea	16:42
mgz	not sure what you mean by "linear scan" though	16:43
rogpeppe1	mgz: well, if i want to find out the addresses of all machines that are state servers, how should i do it?	16:43
dimitern	rogpeppe1, not really	16:43
dimitern	rogpeppe1, the unit's addresses are set by the uniter, but the machine addresses are taken from the environment	16:44
mgz	query out machines that have the stateserver bit set in mongo, and pull the address?	16:44
dimitern	rogpeppe1, by the provisioner, but it doesn't set them anywhere yet	16:44
rogpeppe1	mgz: won't that be a linear scan through all machines?	16:44
mgz	having a separate table with addresses of state servers doesn't sound faster to me	16:44
mgz	but is also perfectly possible, it's just a denormalisation	16:45
dimitern	fwereade, I found the culprit - the issue in bug 1231543 starts to happen in r1877	16:46
_mup_	Bug #1231543: upgrader startup failure with local provider <juju-core:New> <https://launchpad.net/bugs/1231543>	16:46
rogpeppe1	mgz: to me it sounds like one fetch of a document in a single document collection, versus a scan through potentially many hundreds	16:46
rogpeppe1	mgz: but... i think that for the time being it's probably fine	16:47
rogpeppe1	mgz: storing the addresses separately is an optimisation really.	16:47
rogpeppe1	dimitern: hmm, so the uniter API has PublicAddress and SetPublicAddress. is there any particular reason for that?	16:48
dimitern	rogpeppe1, the uniter sets these on startup	16:49
rogpeppe1	dimitern: what i mean is: why have the PublicAddress method if it's only there to pass its result to SetPublicAddress?	16:50
rogpeppe1	dimitern: (which also gives a compromised uniter the potential freedom to muck with its reported public address, something you probably don't want)	16:51
dimitern	rogpeppe1, the uniter needs both to set public/private addresses of a unit, and to read them	16:51
rogpeppe1	dimitern: why's that?	16:52
dimitern	rogpeppe1, the addresses shouldn't be on a unit at all - they should be on a machine, but that's that	16:52
rogpeppe1	dimitern: i'm wondering about an API call, say Start, which informs the API that the uniter has started	16:52
dimitern	rogpeppe1, because public-address is one of the relation settings set automatically when entering scope for example	16:52
rogpeppe1	dimitern: ah, good point, so we need PublicAddress	16:53
dimitern	rogpeppe1, the API very well knows when the unit agent connects, and starts a pinger now	16:53
rogpeppe1	dimitern: in that case, that's probably the moment that the public and private addresses should be set	16:53
dimitern	rogpeppe1, perhaps, if we're not using a separate worker for that	16:54
dimitern	rogpeppe1, and setting them on the machine, not on the unit	16:54
rogpeppe1	dimitern: yeah	16:54
rogpeppe1	dimitern: but the point is that we could remove that stuff from ModeInit, i think	16:55
rogpeppe1	dimitern: hmm, except not right now of course	16:55
rogpeppe1	dimitern: because it really does get the public address from the provider	16:56
rogpeppe1	dimitern: ok, ignore my stupidity	16:56
mgz	I've added an explanation to bug 1227533 about our memory woes the last week	16:56
_mup_	Bug #1227533: Juju fails to bootstrap if memory is lower than 1GB <juju-core:Triaged> <https://launchpad.net/bugs/1227533>	16:56
mgz	now I must depart, farewell!	16:56
rogpeppe1	mgz: one mo, please?	16:56
mgz	one mo while I close things :)	16:57
dimitern	rogpeppe1, there's a todo about it in mode init	16:57
rogpeppe1	mgz: kapil was asking about the status of the VPC-only bug...	16:57
rogpeppe1	dimitern: yeah, i understand that now :-)	16:57
dimitern	rogpeppe1, ...and a few other places, and there's the tech-debt bug 1205371	16:58
_mup_	Bug #1205371: state.Addresses and APIAddresses need better implementation <juju-core:In Progress by gz> <https://launchpad.net/bugs/1205371>	16:58
rogpeppe1	dimitern: hmm, so there's no way of finding out a machine's public address currently unless it has a unit on it?	16:58
mgz	rogpeppe1: it's the next on my list, but haven't started yet, saw his comments earlier	16:58
rogpeppe1	mgz: ok, cool	16:58
mgz	will tackle the registration stuff at least tomorrow	16:58
mgz	okay, now must fly	16:58
* dimitern is totally puzzled how r1877 could lead to that local provider issue		17:00
rogpeppe1	i'd love a review of https://codereview.appspot.com/13970043/ if anyone has a little time	17:22
natefinch	rogpeppe1: I can take that	17:22
rogpeppe1	natefinch: ta muchly	17:23
rogpeppe1	natefinch:	17:23
fwereade	dimitern, thanks, I will meditate upon 1877	17:33
fwereade	dimitern, "The simplestreams tools metadata includes a sha256..."?	17:34
natefinch	rogpeppe1: what's the difference between done := make(chan struct{})	17:35
natefinch	go func() { info.BootstrapConfig(); done <- struct{}{} }()	17:35
natefinch	<-done	17:35
natefinch	and just calling info.BootstrapConfig() in the current goroutine? They both just block waiting for bootstrapconfig to finish, right?	17:35
rogpeppe1	natefinch: ha, there is a subtle difference, but it's just a debugging remnant	17:35
rogpeppe1	natefinch: i'll revert it	17:35
rogpeppe1	natefinch: 2 points if you can tell me why i did it :-)	17:35
natefinch	rogpeppe1: if you had a panic in bootstrap config it would make the call stack a lot shorter	17:36
rogpeppe1	natefinch: close	17:36
natefinch	rogpeppe1: could be something to do with the scheduler, but that seems too subtle to matter	17:38
rogpeppe1	natefinch: nah	17:38
rogpeppe1	natefinch: it's to do with gocheck	17:38
rogpeppe1	natefinch: if you panic, then gocheck catches it and distorts things	17:38
rogpeppe1	natefinch: so by panicking in a goroutine you get a much cleaner idea of what's going on at that moment	17:38
natefinch	ahh ok	17:39
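The debugging trick works because recover only intercepts panics raised in its own goroutine. A small sketch, with `runRecovered` standing in for a test runner that recovers panics (a hypothetical helper, not gocheck's actual code) and `inGoroutine` reproducing the done-channel pattern natefinch quotes:

```go
package main

import "fmt"

// runRecovered stands in for a test runner that recovers panics raised
// by the function it calls in the same goroutine.
func runRecovered(f func()) (caught interface{}) {
	defer func() { caught = recover() }()
	f()
	return nil
}

// inGoroutine runs f in a new goroutine and blocks until it returns.
// A deferred recover in the caller cannot see a panic raised inside f,
// so such a panic would terminate the program with f's own stack trace.
func inGoroutine(f func()) {
	done := make(chan struct{})
	go func() {
		f()
		done <- struct{}{}
	}()
	<-done
}

func main() {
	// Same goroutine: the runner swallows the panic.
	fmt.Println(runRecovered(func() { panic("boom") })) // boom

	// Separate goroutine: normal completion still blocks the caller.
	ran := false
	inGoroutine(func() { ran = true })
	fmt.Println(ran) // true
}
```

For code that does not panic the two forms behave the same, which is why the wrapper can be reverted once debugging is done.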
natefinch	rogpeppe1: I presume you'll take out the log messages in there as well	17:40
rogpeppe1	natefinch: yes	17:40
natefinch	k	17:40
natefinch	rogpeppe1: btw, is "erewhemos" someone misspelling "somewhere" backwards, or something that actually makes more sense?	17:42
rogpeppe1	natefinch: the former :-)	17:42
dimitern	rogpeppe1, sweet! i'll remember that trick next time i'm fighting tests panic	17:42
natefinch	rogpeppe1: ha, ok. I thought so, but you never know	17:43
rogpeppe1	natefinch: just a nonsense name that's unlikely to be confused with anything in the production code	17:43
fwereade	natefinch, I'm sorry about that, there was a satirical work by samuel butler called "erewhon" which is not quite "nowhere" backwards	17:43
fwereade	natefinch, it seemed like a good idea at the time	17:43
dimitern	fwereade, yes that's what i found so far	17:43
fwereade	dimitern, just to be crystal clear: 1876 works, 1877 does not?	17:44
rogpeppe1	fwereade: we're in the dystopia right?	17:44
natefinch	fwereade: haha, ok. not up on my Victorian authors	17:44
fwereade	rogpeppe1, heh	17:44
dimitern	fwereade, that's what I see, but I'll double check, just a minute	17:44
dimitern	fwereade, indeed	17:49
dimitern	fwereade, and the error now makes sense 2013-09-26 17:48:00 ERROR juju runner.go:211 worker: exited "upgrader": cannot set agent tools for machine 0: empty size or checksum	17:49
dimitern	fwereade, but, interestingly the coercing error is not there in 1877	17:51
rogpeppe1	natefinch: still waiting for that review, BTW :-)	18:02
natefinch	rogpeppe1: still doing it. Had to stop in the middle for a little bit. Almost done :)	18:03
rogpeppe1	natefinch: np	18:03
dimitern	fwereade, so the other error starts to show in my r1884 that switches to api provisioner	18:08
rogpeppe1	fwereade: do you know what stage mgz is at with the addressing stuff?	18:13
natefinch	rogpeppe1: done	18:13
rogpeppe1	fwereade: i just started hacking up the publisher/addresser worker, then realised that he might already have done/nearly done it	18:13
rogpeppe1	natefinch: thanks	18:13
fwereade	rogpeppe1, I'm afraid I do not actually know, i was kinda expecting a CL from him today	18:18
rogpeppe1	fwereade: i need that, or something like it, to cache the API addresses	18:19
fwereade	dimitern, ah ok	18:19
fwereade	dimitern, so the upgrader thing appears to be a problem	18:19
rogpeppe1	fwereade: this is the sketch of the code i just wrote: http://paste.ubuntu.com/6159815/	18:19
dimitern	fwereade, yeah	18:19
rogpeppe1	fwereade: oops, this is better: http://paste.ubuntu.com/6159817/	18:20
fwereade	dimitern, I thought all we were meant to be setting was a version, not a whole tools	18:20
dimitern	fwereade, and the other thing - it doesn't seem to be an int coerced to string, it's an int - I debugged so far as to say the provisionerAPI returns the correct map[string]interface{} in worker/WaitForEnviron	18:21
fwereade	dimitern, oh, ffs, is it possibly a json problem? definitely an int and not a float?	18:21
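fwereade's int-versus-float suspicion matches a known property of Go's encoding/json: a JSON number decoded into an interface{} value arrives as float64. The sketch below shows only that property; whether it is the actual cause of the "storage-port: expected int, got 8040" error is not established in this conversation, and `roundTrip` and `kindOf` are hypothetical helpers.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// kindOf reports the dynamic Go type of a decoded config value.
func kindOf(v interface{}) string {
	return fmt.Sprintf("%T", v)
}

// roundTrip encodes cfg as JSON and decodes it into a generic map, as
// a transport carrying map[string]interface{} might do.
func roundTrip(cfg map[string]interface{}) (map[string]interface{}, error) {
	data, err := json.Marshal(cfg)
	if err != nil {
		return nil, err
	}
	var out map[string]interface{}
	err = json.Unmarshal(data, &out)
	return out, err
}

func main() {
	out, err := roundTrip(map[string]interface{}{"storage-port": 8040})
	if err != nil {
		panic(err)
	}
	// The value prints like an int but its type has changed.
	fmt.Println(kindOf(out["storage-port"]), out["storage-port"]) // float64 8040
}
```

An error message built with a default format verb would print such a value as 8040, so the message alone cannot distinguish an int from a float64.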
fwereade	rogpeppe1, sorry, I have only skim-read it, but I think it may well have overlap	18:23
dimitern	fwereade, trying to see exactly what now	18:23
rogpeppe1	fwereade: yeah, if he's doing an addresser worker, it almost certainly will	18:24
rogpeppe1	fwereade: well, i'll keep it around in case	18:24
dimitern	any idea why this error? ERROR juju.provider.local environ.go:482 could not install machine agent service: exec ["start" "juju-agent-dimitern-local"]: exit status 1 (start: Job is already running: juju-agent-dimitern-local)	18:24
rogpeppe1	time to stop for the day	18:25
fwereade	dimitern, aw hell, that really should be fixed for 1.16 too, we don't seem to shut down local envs cleanly	18:26
dimitern	fwereade, hmm - we are stopping them, but the upstart job remained and it thought "because it's there, it must be running"	18:27
fwereade	dimitern, looks like we're calling StopAndRemove though	18:28
dimitern	fwereade, hmm.. it gets deeper	18:30
dimitern	fwereade, so now the upstart job hangs	18:30
dimitern	fwereade, that's why the bootstrap doesn't complete and I terminated it	18:30
rogpeppe1	g'night all	18:31
rogpeppe1	might be back later, actually	18:31
fwereade	rogpeppe1, see you soon	18:31
fwereade	dimitern, "cannot install, already running" seems to imply that it really was running	18:32
fwereade	dimitern, and was thus not properly cleaned up	18:32
dimitern	fwereade, believe me, ps xa \| grep juju was the first thing I did - no results, even as root	18:33
dimitern	fwereade, just the upstart job was there	18:33
fwereade	dimitern, very strange	18:33
dimitern	fwereade, so the mongo hangs at bootstrap	18:38
dimitern	fwereade, and that fails the whole thing	18:38
dimitern	fwereade, it's indeed running now, and the error is correct	18:38
fwereade	dimitern, ok, so we have some sort of poorly characterized local provider cleanup problem	18:39
dimitern	fwereade, and even upstart believes jujud job is running	18:39
dimitern	fwereade, and I can't see it	18:39
fwereade	dimitern, and a clear current issue: that we're recording full agent tools including hashes for no clear reason, when all we really care about is the binary version they're running	18:40
fwereade	dimitern, concur>	18:40
dimitern	fwereade, not sure I get you there	18:42
fwereade	dimitern, so the problem seems to be that we're setting tools on the agent, rather than just setting the binary version which is all anyone cares about AFAIK	18:43
fwereade	dimitern, and we can't set tools because we didn't record the hash we downloaded and verified	18:44
fwereade	dimitern, and it seems a bit pointless to report it back to juju when juju told it to us in the first place	18:44
dimitern	fwereade, yes, that seems likely	18:45
dimitern	fwereade, I have to stop though.. lest my head explodes :/	18:48
fwereade	dimitern, no worries at all, you are already above and beyond	18:48
fwereade	dimitern, is there a specific bug for the tools issue?	18:49
dimitern	fwereade, don't know	18:49
dimitern	fwereade, I added the one for the upgrader, but this seems unrelated	18:49
fwereade	dimitern, the upgrader was what I meant by the tools issue	18:50
dimitern	fwereade, bs, actually the upgrader error is about tools, the other errors were different	18:50
dimitern	fwereade, :)	18:50
fwereade	dimitern, I think there is one for screwy local-env destruction	18:50
dimitern	fwereade, maybe	18:51
rogpeppe1	fwereade: the point of setting tools on the agent was so that it was possible to make available that information in the status, so you could know exactly what s/w was running on each machine	19:25
fwereade	rogpeppe1, ok, so we should have to record and write into the tools dirs the hashes of the original tarballs?	19:27
rogpeppe1	fwereade: yes	19:27
fwereade	rogpeppe1, I don't really see how that helps anyone	19:29
rogpeppe1	fwereade: when debugging stuff it means you have an unambiguous record of what is being run where, which i think could be very useful at times	19:30
rogpeppe1	fwereade: for reproducibility and diagnosis of difficult issues in a highly distributed environment	19:30
rogpeppe1	fwereade: and i don't really see why it should be a hard thing to do, though i haven't read through the discussion above, so i don't know what the current issue is	19:31
fwereade	rogpeppe1, it looks like we're barfing when calling SetAgentTools because the tools in state now demand a hash	19:32
rogpeppe1	fwereade: and you can't have a Tools with an empty hash?	19:32
fwereade	rogpeppe1, apparently not	19:32
fwereade	rogpeppe1, it seems to be demanding that if there's a URL, there must be a size and checksum	19:33
rogpeppe1	fwereade: oh yes, checkToolsValidity	19:34
fwereade	rogpeppe1, but not barfing if there's no URL	19:34
fwereade	rogpeppe1, when I thought we always wrote a URL	19:34
fwereade	rogpeppe1, but ofc do not necessarily have the original tgz available and so can't always manage size/hash	19:35
fwereade	rogpeppe1, (not that we do, even when we do, AFAIK -- maybe that changed somewhere?)	19:35
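A minimal Go sketch of the validity rule as fwereade describes it above: a URL without a size and checksum is rejected, while an empty URL passes. The `toolsInfo` type, its field names, and `checkValid` are hypothetical stand-ins for illustration, not juju-core's actual `checkToolsValidity`.

```go
package main

import (
	"errors"
	"fmt"
)

// toolsInfo is a hypothetical stand-in for the tools record discussed
// in the log; the field names are assumptions.
type toolsInfo struct {
	Version string
	URL     string
	Size    int64
	SHA256  string
}

// checkValid applies the rule described in the chat: if a URL is set,
// a size and checksum must be set too. A record with no URL passes.
func checkValid(t toolsInfo) error {
	if t.URL == "" {
		return nil
	}
	if t.Size == 0 || t.SHA256 == "" {
		return errors.New("tools with a URL need a size and checksum")
	}
	return nil
}

func main() {
	versionOnly := toolsInfo{Version: "1.15.0"}
	urlOnly := toolsInfo{Version: "1.15.0", URL: "https://example.invalid/tools.tgz"}
	fmt.Println("version only:", checkValid(versionOnly))
	fmt.Println("url without hash:", checkValid(urlOnly))
}
```

Under this rule an agent that reports only a version is accepted, which is why the failure only shows up when a URL is written without the matching size and hash.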
rogpeppe	fwereade: sorry, computer just crashed	19:42
rogpeppe	fwereade: last thing i was was "it seems to be demanding that if there's a URL, there must be a size and checksum"	19:43
rogpeppe	s/was/saw was/	19:43
natefinch	sigh.... goyaml doesn't differentiate between nil slices and empty slices :/	20:26
=== sidnei` is now known as sidnei
wallyworld	fwereade: hiya, saw the email about the error, i can take a look	22:22
fwereade	wallyworld, tyvm	22:22
wallyworld	any clues to get me started? i see a few comments in the bug	22:22
wallyworld	could it be related to the env split up?	22:22
thumper	grr	22:25
thumper	I have the upgrader constantly bouncing	22:25
thumper	any one else noticed?	22:25
thumper	wallyworld: fwereade: ??? http://paste.ubuntu.com/6160651/	22:25
fwereade	thumper, https://bugs.launchpad.net/juju-core/+bug/1231543	22:26
_mup_	Bug #1231543: upgrader startup failure with local provider <regression> <juju-core:In Progress by dimitern> <https://launchpad.net/bugs/1231543>	22:26
fwereade	thumper, wallyworld is looking at it now dimitern has I think stopped	22:26
wallyworld	thumper: that error looks like tools checksum is failing to be calculated	22:26
thumper	kk	22:27
thumper	I'm trying to chase the lxc issues	22:27
wallyworld	fwereade: thumper's error message mentions checksums, whereas bug says something about ports	22:27
fwereade	wallyworld, that is also a problem	22:27
wallyworld	yeah, so issues \o/	22:28
wallyworld	2	22:28
fwereade	wallyworld, but the tools checksum is easier to get a handle on and isolate	22:28
wallyworld	the tools one is my fault	22:28
wallyworld	if i can't easily find it i can just disable the checksum check for now	22:28
fwereade	wallyworld, so do we now write out size/sha256 into the tools dir when we unbundle?	22:28
wallyworld	we do	22:28
wallyworld	but for some reason the checksum is not getting passed down the api	22:29
fwereade	wallyworld, I bet we just miss it in the local provider then	22:29
fwereade	wallyworld, or is it happening everywhere?	22:29
wallyworld	it could be that the tools are being read from the old place which means no checksum	22:29
fwereade	wallyworld, although, hmm, yeah exactly	22:29
wallyworld	fwereade: i tested bootstrapping on ec2, hp etc with the new stuff and it works	22:29
fwereade	wallyworld, I'm a little sceptical about the value of recording all that in state anyway	22:29
fwereade	wallyworld, cool	22:30
wallyworld	fwereade: we recorded the url in state, from which a tools struct is made. and that tools struct is used to find a tools tarball. so it needs the checksum	22:31
fwereade	wallyworld, we only ever call SetAgentTools in code that has already been extracted from the tarball in question	22:32
wallyworld	fwereade: i'll have to re-read the code - what do we use the agent tools stored in state for? the tools info from SetAgentTools?	22:33
fwereade	wallyworld, not much	22:33
thumper	fwereade: we should get around to fixing the tools for the local provider	22:34
wallyworld	so i could drop the checksum requirement. i thought it was needed somewhere, can't recall though	22:34
fwereade	wallyworld, that said, minimal changes good, I am not encouraging you to rewrite and would most favour a simple tweak to the local provider that made sure it wrote its tools dir properly	22:34
thumper	rather than the upload-tools malarky we do now	22:34
fwereade	thumper, oh, god, yes we should	22:34
thumper	fwereade: however I'm not sure what the best way is	22:34
fwereade	thumper, I'm quite sure we can harmonise it with all the simplestreams stuff	22:35
thumper	I hope so	22:35
wallyworld	fwereade: when you say "not much" - is there a simple explanation of why we store the tools url and version in state?	22:35
fwereade	wallyworld, the version we need for status	22:36
fwereade	wallyworld, series is duplicated, a machine should already know its own series	22:36
wallyworld	why the url?	22:36
fwereade	wallyworld, and for that matter arch should always be in hardware characteristics too	22:36
wallyworld	do we ever use the url to fetch tools?	22:37
wallyworld	if not, i can drop the need for insisting on checksum	22:37
fwereade	wallyworld, I was asking rogpeppe -- I hope I am not mischaracterising him to say that it's there just in case it turns out to be useful one day	22:37
fwereade	wallyworld, SetAgentTools is, as far as I'm aware, purely a record of what the agent reports itself to be running	22:38
wallyworld	well	22:38
fwereade	wallyworld, url and checksum and size are not, I think, exposed anywhere	22:38
wallyworld	not sure i agree with recording all that extra info just to report a version	22:38
fwereade	wallyworld, all that detail in (once) state.Tools would have been great if we'd ever stored an environment's available tools in state	22:39
wallyworld	fwereade: would you object if i zero out url and checksum in set agent tools	22:39
wallyworld	if we have a url and not the checksum, that is not something we should encourage	22:40
fwereade	wallyworld, because then we could just grab the tools for a particular machine with a trivial query, get the url and size and checksum, and hand them straight over	22:40
fwereade	wallyworld, well	22:40
wallyworld	or i could find out why checksum is missing	22:40
fwereade	wallyworld, the url really just indicates "this is where we got them from"	22:40
wallyworld	ok, i'll see how it pans out. for the release, where we need something done, it may just be easier to drop the mandatory checksum requirement	22:41
wallyworld	and fix next week	22:41
fwereade	wallyworld, indeed, if that's what it comes to then so be it	22:41
wallyworld	cause the other issue sounds more tricky	22:42
thumper	wallyworld, fwereade: I'll look at the port int issue	22:42
fwereade	thumper, <3	22:42
thumper	wallyworld: if you want to tackle the checksum thing	22:42
wallyworld	yes indeed	22:42
thumper	heh, interesting,	22:43
wallyworld	fwereade: i'm also part way through ripping out all legacy tools support - that will need to be landed after 1.15 when all clouds have had simplestreams tools uploaded by the release team	22:43
thumper	I can see from the rpc logging that the value is being sent through as an int	22:43
* thumper digs		22:43
thumper	what the actual fuck...	22:44
fwereade	wallyworld, awesome news	22:45
fwereade	thumper, that sounds less awesome	22:45
* thumper just digging		22:45
* wallyworld needs a coffee		22:45
wallyworld	thumper: how do i reproduce your issue?	22:56
thumper	wallyworld: all I did was bootstrap the local provider	22:56
wallyworld	ok	22:56
thumper	I did try to deploy some things	22:56
thumper	before I checked the logs	22:56
thumper	so not entirely sure	22:56
wallyworld	np thanks	22:56
thumper	but I feel just bootstrap is enough	22:56
thumper	I also feel that my problem may be shadowing yours	22:57
wallyworld	should be easy to find then hopefully	22:57
thumper	so you might not get yours fixed	22:57
thumper	until mine is	22:57
wallyworld	let's find out	22:57
thumper	hmm...	23:07
thumper	I think I know what it is, but it is weird	23:07
thumper	and not sure why it hasn't broken before this	23:07
thumper	if it is what I think it is	23:07
* fwereade wants to watch, but is going to bed instead		23:08
fwereade	gn all	23:08
wallyworld	fwereade: night	23:10
wallyworld	thumper: i found the spot where SetAgentTools was passing in incomplete tools	23:10
thumper	wallyworld: cool, I've found out where the validate is failing, but unsure as to why	23:11
wallyworld	but i'm not sure i have the size and checksum info at that point to pass in also	23:11
wallyworld	it really is just passing in a version wrapped in a tools struct which seems silly	23:11
wallyworld	thumper: ah, actually i think when local provider starts up, the tools hack it uses might not be recording the checksum etc, so when that info is read back later, it is missing	23:14
* wallyworld is guessing		23:14
thumper	how do I get the type of something printed out?	23:14
wallyworld	%T	23:14
wallyworld	fmt.Println("%T", thing)	23:14
wallyworld	Printf	23:14
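A short Go sketch of the tip above. `fmt.Printf` (and `fmt.Sprintf`) interpret the `%T` verb and print a value's dynamic type, whereas `fmt.Println` would print the format string literally. The `describe` helper is a hypothetical name used only for this illustration.

```go
package main

import "fmt"

// describe formats a value together with its dynamic type,
// for example "float64(8040)". Hypothetical helper for illustration.
func describe(v interface{}) string {
	return fmt.Sprintf("%T(%v)", v, v)
}

func main() {
	var thing interface{} = 8040.0
	// %T prints the dynamic type of the value held in the interface.
	fmt.Printf("%T\n", thing)
	// Combining %T and %v gives the style of message seen later in the log.
	fmt.Println(describe(thing))
}
```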
thumper	stabby!!!!!!!!!!!!!!!!!!1	23:16
thumper	error used to be : storage-port: expected int, got 8040	23:16
thumper	added type info	23:16
thumper	guess what?	23:16
thumper	storage-port: expected int, got float64(8040)	23:16
thumper	this is why it is failing	23:16
thumper	FFS	23:16
thumper	is it because json serialization only has float64?	23:17
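On thumper's question: Go's `encoding/json` does decode every JSON number as `float64` when the destination is an `interface{}` (for example a `map[string]interface{}`). A minimal sketch follows; the `storage-port` key comes from the error message above, while `decodePort` is a hypothetical helper and not juju code.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// decodePort unmarshals a JSON object into a generic map and returns the
// "storage-port" value exactly as encoding/json produced it.
// Hypothetical helper for illustration.
func decodePort(data []byte) (interface{}, error) {
	var attrs map[string]interface{}
	if err := json.Unmarshal(data, &attrs); err != nil {
		return nil, err
	}
	return attrs["storage-port"], nil
}

func main() {
	v, err := decodePort([]byte(`{"storage-port": 8040}`))
	if err != nil {
		panic(err)
	}
	// The integer literal in the JSON arrives as a float64.
	fmt.Printf("%T(%v)\n", v, v)
}
```

Decoding into a struct with an `int` field, or using `json.Decoder.UseNumber`, avoids the float, but neither helps when the value has to travel through a generic attribute map.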
thumper	how do we fix this in a non sucky way?	23:17
* thumper wonders how the api port is handled		23:18
* thumper digs		23:18
thumper	stabby stabby	23:18
thumper	the difference is:	23:18
thumper	schema.Int	23:18
thumper	vs	23:18
thumper	schema.ForceInt	23:18
thumper	guess which is which?	23:18
thumper	huh?	23:20
thumper	I change it now I get a panic	23:20
wallyworld	thumper: you need a custom json demarshaller i think	23:22
wallyworld	for the struct	23:22
thumper	no, found it	23:22
thumper	you wouldn't believe it if I told you	23:22
thumper	well, you might	23:22
thumper	schema.Int -> int64	23:22
thumper	schema.ForceInt -> int	23:23
wallyworld	wtf	23:23
thumper	ok, that fixes it	23:23
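A rough Go sketch of the difference thumper describes: a strict checker that rejects floats and yields `int64`, and a forcing one that accepts floats and yields `int`. `strictInt` and `forceInt` are hypothetical stand-ins written for this illustration; they are not the juju `schema` package's implementation.

```go
package main

import "fmt"

// strictInt accepts only Go integer values and returns an int64.
// Hypothetical stand-in for the behaviour described for schema.Int.
func strictInt(v interface{}) (int64, error) {
	switch n := v.(type) {
	case int:
		return int64(n), nil
	case int64:
		return n, nil
	}
	return 0, fmt.Errorf("expected int, got %T(%v)", v, v)
}

// forceInt also accepts float64 and returns a plain int.
// Hypothetical stand-in for the behaviour described for schema.ForceInt.
func forceInt(v interface{}) (int, error) {
	switch n := v.(type) {
	case int:
		return n, nil
	case int64:
		return int(n), nil
	case float64:
		return int(n), nil
	}
	return 0, fmt.Errorf("expected number, got %T(%v)", v, v)
}

func main() {
	// A port that has been through JSON arrives as a float64.
	var port interface{} = float64(8040)
	if _, err := strictInt(port); err != nil {
		fmt.Println("strict:", err)
	}
	n, err := forceInt(port)
	fmt.Println("forced:", n, err)
}
```

Because the two return different types, a caller that asserts `.(int64)` on the result would panic after switching checkers, which is consistent with the panic mentioned above.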
wallyworld	\o/	23:23
wallyworld	thumper: save me some time - can you point me to where the local provider does its tools hacky thing to find the tools to bundle	23:24
thumper	it does the default --upload-tools bit	23:24
thumper	what do you mean exactly?	23:24
wallyworld	for some reason, the tools struct passed to bootstrap is (i think) missing the checksum info	23:25
wallyworld	i need to find out how that is happening	23:25
wallyworld	just working backwards to find it	23:27
thumper	probably the possible tools created by the upload-tools stuff	23:28
thumper	at a guess	23:28
* thumper proposes a couple of branches		23:35
wallyworld	thumper: found it, fixed, testing	23:36
thumper	https://codereview.appspot.com/14005043/ is just logging tweaks	23:37
wallyworld	the local environ did not implement CustomToolsSource interface	23:37
wallyworld	so it did not find tools using simplestreams, and defaulted to legacy	23:37
wallyworld	which means no checksums	23:37
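The failure mode wallyworld describes is the usual optional-interface pattern in Go: the caller type-asserts for an extra interface and silently falls back when the concrete type does not implement it. A minimal sketch, assuming hypothetical names throughout (`environ`, `customToolsSource`, `ToolsSources`, `findTools`); the real juju interfaces and method sets may differ.

```go
package main

import "fmt"

// environ is a hypothetical minimal environment interface.
type environ interface {
	Name() string
}

// customToolsSource is a hypothetical optional interface; environments
// that implement it advertise where simplestreams tools metadata lives.
type customToolsSource interface {
	ToolsSources() []string
}

// cloudEnv implements both interfaces.
type cloudEnv struct{}

func (cloudEnv) Name() string           { return "cloud" }
func (cloudEnv) ToolsSources() []string { return []string{"https://example.invalid/tools"} }

// localEnv implements only environ, like the local provider before the fix.
type localEnv struct{}

func (localEnv) Name() string { return "local" }

// findTools picks a lookup strategy based on the optional interface.
func findTools(e environ) string {
	if src, ok := e.(customToolsSource); ok && len(src.ToolsSources()) > 0 {
		return "simplestreams"
	}
	// No custom source: fall back to the legacy lookup, which in the
	// discussion above carries no checksum information.
	return "legacy"
}

func main() {
	for _, e := range []environ{cloudEnv{}, localEnv{}} {
		fmt.Printf("%s: %s\n", e.Name(), findTools(e))
	}
}
```

Nothing fails at compile time when a type omits the optional method, so the fallback only shows up at run time.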
thumper	https://codereview.appspot.com/14006043 is the fix for the config	23:38
thumper	ah	23:38
* thumper goes to set commit messages in prep		23:38
* thumper waits for review		23:39
thumper	almost time for lunch	23:39
wallyworld	thumper: done with one comment	23:40
thumper	added a little context	23:42
thumper	wallyworld: the new test failed with the expected same error output to the log file	23:43
thumper	changed the schema, and all good \o/	23:43
wallyworld	yay	23:43
wallyworld	thumper: i'll be proposing a fix soon, maybe you can look after lunch	23:43
thumper	ok	23:44
* thumper is heading into town to lunch with veebers		23:44
thumper	wallyworld: once you review the actual fix, you can approve it	23:44
thumper	I'm hoping you won't find any issue	23:44
* thumper -> lunch		23:45
wallyworld	ok	23:45
