davecheney | thumper: http://paste.ubuntu.com/11360922/ | 01:47 |
davecheney | current state of play | 01:47 |
* thumper looks | 01:47 | |
thumper | davecheney: seems only 10 packages have races | 01:48 |
davecheney | this was run without -p 1 | 01:48 |
davecheney | so some tests timed out | 01:49 |
davecheney | because of contention on the cpu | 01:49 |
davecheney | yeah 10 looks about right | 01:49 |
* davecheney makes cards | 01:49 | |
mwhudson | davecheney: so, the "don't strip go binaries" thing | 02:14 |
mwhudson | davecheney: do you know what the actual problems are, or is it more "it's not tested and sometimes breaks things so don't do it"? | 02:15 |
axw_ | thumper: when you have a moment, can you glance over https://github.com/juju/utils/pull/134 and tell me if there's any reason why this should break "go run"? | 02:50 |
axw_ | please | 02:50 |
thumper | ok | 02:50 |
=== axw_ is now known as axw | ||
thumper | axw: do I take it from this question that it is breaking juju run? | 02:51 |
axw | thumper: err yeah, juju run not go run :) | 02:51 |
davecheney | mwhudson: it's sort of self referential | 02:51 |
davecheney | strip(1) doesn't really follow elf | 02:52 |
axw | thumper: context: fixing https://bugs.launchpad.net/juju-core/+bug/1454678 | 02:52 |
mup | Bug #1454678: "relation-set --file -" doesn't seem to work <landscape> <relation-set> <juju-core:Triaged> <juju-core 1.24:In Progress by axwalk> <https://launchpad.net/bugs/1454678> | 02:52 |
davecheney | it just doesn't mangle gcc produced things | 02:52 |
davecheney | so that broke go binaries | 02:52 |
davecheney | mainly anything that wasn't amd64 | 02:52 |
axw | thumper: with my pending fix, jujud would consume stdin and pass it to the backend | 02:52 |
davecheney | now, we don't test stripped binaries | 02:52 |
davecheney | so if they got better or worse over time, we don't know | 02:52 |
axw | thumper: that breaks juju run, because it reads the subsequent commands piped to bash | 02:52 |
thumper | hmm... | 02:52 |
axw | thumper: e.g. if you did "juju run 'cat; echo 123'", you'd get output of "echo 123" rather than "123" | 02:53 |
davecheney | so it's sort of a circular problem, we tell people not to strip, they file bugs, we close them, we don't test that strip works, we tell people not to strip binaries, etc | 02:53 |
thumper | axw: well, juju run just calls 'juju-run' on the server, which enters a hook context to execute the commands... | 02:55 |
thumper | couldn't we just change how the juju-run server side command sends the actual script? | 02:55 |
thumper | axw: cmd/jujud/run.go | 02:56 |
thumper | axw: couldn't we just hook up the stdin around line 111 | 02:56 |
thumper | ? | 02:56 |
axw | thumper: that doesn't solve this particular issue, though we might want to do that too. the problem is that at the moment, hook tools don't accept stdin at all | 02:57 |
axw | hang on, I'll link my branch | 02:57 |
thumper | I have the pull from above | 02:58 |
axw | wallyworld: http://reviews.vapour.ws/r/1776/ | 02:58 |
wallyworld | ok | 02:58 |
axw | err sorry, thumper^^ | 02:58 |
axw | wallyworld: ignore sorry | 02:58 |
axw | thumper: so, atm you cannot do "echo yaml | relation-set ... --file=-" | 02:59 |
axw | thumper: my branch changes it so you can. but that showed up a problem in a test where a hook tool was running underneath "juju run" | 02:59 |
axw | thumper: if there are multiple hook tool commands in the same juju-run, then the first one would consume the stdin which happened to be the rest of the juju-run commands | 03:00 |
thumper | ah... | 03:01 |
thumper | that's kinda weird | 03:01 |
thumper | and a bit strange... | 03:01 |
thumper | not quite sure how to fix that | 03:02 |
thumper | sorry | 03:02 |
axw | thumper: my change to utils/exec fixes it :) I'm just wondering if there's any reason why we shouldn't do it.. I don't think so | 03:02 |
thumper | axw: I can't see a reason not to | 03:03 |
axw | thanks | 03:03 |
davecheney | thumper: there are a SHITLOAD of changes on juju/utils | 03:06 |
davecheney | which aren't deployed because godeps has pinned the version way back in the past | 03:06 |
thumper | success! | 03:06 |
davecheney | no | 03:07 |
davecheney | hold on that | 03:07 |
davecheney | for some reason godeps didn't update my working copy | 03:07 |
davecheney | anyone http://reviews.vapour.ws/r/1782/ | 03:13 |
axw | thumper: how do I turn up logging in tests? is there a doc on this somewhere? | 03:17 |
thumper | axw: in the setup, do something like this: | 03:17 |
axw | thumper: no env var? :\ | 03:18 |
thumper | loggo.GetLogger("juju.whatever").SetLogLevel(loggo.TRACE) | 03:18 |
axw | ok, thanks | 03:18 |
thumper | axw: no we protect all the tests from the environment | 03:18 |
axw | sure, we could set up logging and then remove the env var tho | 03:18 |
axw | doesn't matter, that'll do for now | 03:19 |
davecheney | is anyone looking at the bug in reviewboard that causes it to shit on markdown links ? | 03:21 |
davecheney | axw: thanks for the review, here is another https://github.com/juju/juju/pull/2420/files | 03:27 |
axw | LGTM | 03:28 |
mup | Bug #1458717 was opened: utils/featureflag: data race on feature flags <juju-core:New> <https://launchpad.net/bugs/1458717> | 03:28 |
mup | Bug #1458721 was opened: lease: data races in tests <juju-core:New> <https://launchpad.net/bugs/1458721> | 03:28 |
axw | davecheney: dunno about the markdown links. I pinged ericsnow, but didn't hear back | 03:28 |
mwhudson | davecheney: right, i get the self-referential bit | 03:31 |
mwhudson | maybe i'll try to bang on the details for 1.6 or something | 03:31 |
davecheney | mwhudson: external linking passes everything to /bin/ld ? | 03:35 |
davecheney | that may work | 03:35 |
mwhudson | davecheney: yes | 03:35 |
davecheney | but using the internal linker will probably cause sadness | 03:35 |
mwhudson | ah yeah | 03:35 |
mwhudson | makes sense | 03:35 |
menn0 | thumper: here's the PR to move the unit agent: http://reviews.vapour.ws/r/1784/ | 03:57 |
* thumper looks | 03:57 | |
thumper | shipit | 04:00 |
menn0 | thumper: sweet | 04:01 |
davecheney | thumper: on kanban, the LP bug link just sends me back to the board, not to lp | 04:12 |
thumper | davecheney: I'll fix it | 04:13 |
thumper | it is board specific | 04:13 |
thumper | and I didn't set it assuming the board I copied did | 04:13 |
davecheney | ta | 04:14 |
thumper | davecheney: done | 04:22 |
thumper | menn0: I'm thinking I should have perhaps, maybe, not tried to do all this at once | 04:23 |
* thumper takes another bite of the elephant in the package | 04:23 | |
* thumper makes it compile first | 04:23 | |
menn0 | thumper: I know that feeling well | 04:24 |
davecheney | menn0: nice change on moving code out of the cmd | 04:24 |
thumper | order of operation: | 04:24 |
davecheney | testing commands is a pain | 04:24 |
thumper | tests compile first | 04:24 |
davecheney | move the code elsewhere | 04:24 |
thumper | tests pass second | 04:24 |
thumper | tests right and correct third | 04:24 |
thumper | although perhaps 2 and 3 will be reversed | 04:25 |
davecheney | 1, 2, you know what to review, http://reviews.vapour.ws/r/1785/ | 04:31 |
menn0 | davecheney: thanks... the change was essential in order to properly test what i'm working on | 04:32 |
axw | davecheney: RB is screwed, I can't reply to your comment. I don't think it makes sense to change to io.Writer, since we want to buffer the output and return it as []byte | 04:36 |
davecheney | fair enough | 04:37 |
davecheney | i couldn't see from the diff | 04:37 |
davecheney | so it was easier to throw a comment over the wall | 04:37 |
davecheney | anyone want to return the favor | 04:39 |
davecheney | http://reviews.vapour.ws/r/1785/ | 04:39 |
davecheney | it's a 2 line change | 04:39 |
mup | Bug #1458693 was opened: juju-deployer fills up ~/.ssh/known_hosts <juju-core:New> <https://launchpad.net/bugs/1458693> | 04:43 |
davecheney | axw: why do you think moving the line above the go statement changes the semantics of the test ? | 04:45 |
axw | davecheney: because the time is going to be different | 04:45 |
axw | davecheney: seems the time is meant to be after the lease was claimed | 04:45 |
davecheney | sure, but that go routine may not be scheduled til some point in the future | 04:46 |
davecheney | how about I move more code up ? | 04:46 |
axw | davecheney: that's what I'm suggesting: move the ClaimLease call above "leaseClaimedTime := time.Now()" | 04:48 |
davecheney | axw: done | 04:48 |
davecheney | ptal | 04:48 |
davecheney | fwiw both versions passed my stress test | 04:48 |
davecheney | but yours is more correct | 04:48 |
axw | davecheney: LGTM | 04:49 |
axw | thanks | 04:49 |
thumper | ok... I gotta go cook dinner before picking rachel up from the airport | 04:51 |
thumper | see you folks tomorrow | 04:51 |
davecheney | oh the irony | 05:10 |
davecheney | http://paste.ubuntu.com/11364012/ | 05:10 |
mup | Bug #1458741 was opened: cmd/jujud/agent: TestJobManageEnvironRunsMinUnitsWorker fails <juju-core:New> <https://launchpad.net/bugs/1458741> | 05:25 |
anastasiamac | axw_: tyvm :) | 06:05 |
anastasiamac | axw_: I'll look tonite :D | 06:05 |
axw_ | anastasiamac: nps | 06:11 |
anastasiamac | axw_: this store that I am adding ("allecto") exists for the charm that I am using. | 06:14 |
anastasiamac | axw_: the whole idea was to use charm with storage | 06:14 |
anastasiamac | axw_: and this one has 2 charm stores :D | 06:14 |
anastasiamac | i'll update the code later on but i think u r spot on the money with writeChanges! | 06:14 |
anastasiamac | axw_: brilliant! tyvm :))) | 06:15 |
axw_ | anastasiamac: sorry, didn't realise storage-block had been updated | 06:15 |
anastasiamac | axw_: guilty as charged :)) | 06:15 |
axw_ | anastasiamac: writeChanges shouldn't cause your test to pass though, that would only make a difference if you passed an error into FlushContext | 06:16 |
axw_ | anastasiamac: ah, I know what the issue is then | 06:16 |
axw_ | anastasiamac: you didn't specify a Count, so it was set to the MinCount of that store which is 0 | 06:16 |
axw_ | anastasiamac: it should default to 1 | 06:17 |
axw_ | (in the case of this method only) | 06:17 |
anastasiamac | axw_: omg! u r 100% right!!! thnx!!! | 06:17 |
anastasiamac | axw_: :D | 06:17 |
anastasiamac | axw_: i need this store to have 0, so I'll pass Count as 1 in the test :) | 06:18 |
anastasiamac | axw_: the whole idea of adding this store to test charm was to have a 0 for count range :) | 06:18 |
axw_ | anastasiamac: I think state.AddStorageToUnit should set Count to 1 if it's 0 | 06:18 |
anastasiamac | axw_: sure? | 06:18 |
anastasiamac | axw_: u don't want it to send an error back? saying env default is 0 so storage wasn't added? | 06:19 |
axw_ | anastasiamac: doesn't make sense to add storage with 0 count | 06:19 |
axw_ | anastasiamac: IMO, storage-add should add a single instance unless otherwise specified | 06:19 |
axw_ | anastasiamac: so maybe the state method should just error if Count is 0/unspecified | 06:20 |
axw_ | and require the client to specify it | 06:20 |
anastasiamac | axw_: k, i'll add it to the PR too! thanks for the thoughts :D | 06:20 |
anastasiamac | axw_: at state - err if count is 0; in storage-add - set count to 1 if none specified | 06:22 |
axw_ | anastasiamac: yep. storage.ParseConstraints already does that (you're using that right?) | 06:22 |
axw_ | yes you are | 06:23 |
axw_ | anastasiamac: so, just error if Count is 0 and fix the tests to specify non-zero count | 06:24 |
anastasiamac | axw_: will do! tyvm :))))))))) | 06:30 |
mup | Bug #1458754 was opened: $REMOTE_UNIT not found in relation-list during -joined hook <juju-core:New> <https://launchpad.net/bugs/1458754> | 06:32 |
mup | Bug #1458758 was opened: enable to execute a command/script on lxc/kvm hypervisors before containers are created <feature-request> <juju-core:New> <https://launchpad.net/bugs/1458758> | 06:56 |
dimitern | reviewers ? PTAL http://reviews.vapour.ws/r/1777/ | 07:17 |
wallyworld | dimitern: what are the plans for bug 1348663 ? given 1.24 is delayed till next week, are there plans to fix? | 07:25 |
mup | Bug #1348663: DHCP addresses for containers should be released on teardown <maas-provider> <network> <oil> <juju-core:Triaged by mfoord> <juju-core 1.24:Triaged by mfoord> <MAAS:Invalid> <https://launchpad.net/bugs/1348663> | 07:25 |
dimitern | wallyworld, yes, the plan is to work around this by using the new devices api from maas - michael is working on implementing it this week | 07:26 |
wallyworld | dimitern: awesome ty. for 1.24 then i asume? | 07:26 |
dimitern | wallyworld, at the very least juju lets maas (1.8+) know when it spins up a container and which node is its parent | 07:27 |
wallyworld | great | 07:27 |
dimitern | wallyworld, yes, I hope we'll make it for 1.24.0, if not - for .1 | 07:27 |
wallyworld | dimitern: ok, maybe then we move that bug off beta5 milestone and onto 1.24.0 | 07:28 |
dimitern | wallyworld, sounds good to me | 07:28 |
wallyworld | done | 07:28 |
dimitern | cheers! | 07:29 |
dimitern | wallyworld, if you can, can you review http://reviews.vapour.ws/r/1777/ please? | 07:32 |
wallyworld | ok | 07:32 |
axw_ | fwereade: any thoughts on how to fix this? https://bugs.launchpad.net/juju-core/+bug/1457728/comments/6 | 07:34 |
mup | Bug #1457728: `juju upgrade-juju --upload-tools` leaves local environment unusable <local-provider> <upgrade-juju> <vagrant> <juju-core:Triaged> <juju-core 1.24:In Progress by axwalk> <https://launchpad.net/bugs/1457728> | 07:34 |
axw_ | fwereade: my initial thought is to make it more like the watcher API, which can be canceled when the worker is killed | 07:35 |
wallyworld | dimitern: done, but a few comments sorry. i have to run away to soccer for a bit but will be back later | 07:41 |
dimitern | wallyworld, ta! | 07:41 |
dimitern | wallyworld, I was trying to find a way not to use JujuConnSuite, but couldn't find how - ideas welcome | 07:42 |
dimitern | axw_, ^^ | 07:42 |
axw_ | dimitern: see {api,apiserver}/diskmanager for example | 07:45 |
dimitern | axw_, ah, ok - thanks! | 07:45 |
axw_ | dimitern: convert the state.State to an interface {ResumeTransactions()} | 07:45 |
axw_ | then in the tests you replace the state.State with a mock version | 07:46 |
wallyworld | dimitern: i referenced diskmanger in the comments :-) | 07:46 |
dimitern | axw_, the problem is RegisterStandardFacade needs a factory method taking *state.State | 07:46 |
* wallyworld runs away to soccer | 07:46 | |
axw_ | dimitern: yeah that's a bit of a pain. couple of options: limited use of PatchValue as in apiserver/diskmanager, or have the factory defer to some other code that takes an interface | 07:47 |
dimitern | axw_, right, that's an option, but we really should change facade factory methods across the board to avoid the need to pass state | 07:48 |
axw_ | dimitern: I agree | 07:48 |
axw_ | just haven't gotten around to it :) | 07:48 |
fwereade | axw_, oops, sorry, looking | 08:35 |
fwereade | axw_, I'm not sure the Block is intrinsically the problem; but, yes, a watcher-style approach would be much more in keeping with everything else in juju | 08:37 |
fwereade | axw_, the core problem I *think* is that the block can outlive the manager responsible for notifying of the change | 08:38 |
axw_ | fwereade: yeah, the lease manager on the apiserver just exits without notifying the subscribers | 08:39 |
axw_ | fwereade: so they just sit there waiting, forever | 08:39 |
fwereade | axw_, grrrmbl | 08:39 |
fwereade | axw_, it has a few other hang bugs too | 08:39 |
axw_ | fwereade: so we can close those channels, but I'm not too sure how to prevent new ones from coming in yet. the whole thing's a singleton, which makes it slightly difficult | 08:40 |
fwereade | axw_, the singleton is a goddamn nightmare | 08:40 |
fwereade | axw_, let me forward you a couple of mails | 08:40 |
axw_ | okey dokey | 08:40 |
fwereade | axw_, if you have input re replacing it cleanly I would be most grateful | 08:42 |
* axw_ lights the pipe and puts on his reading glasses | 08:42 | |
axw_ | sure thing | 08:42 |
fwereade | axw_, but every approach I can see has tentacles :( | 08:42 |
fwereade | axw_, I'm going out for a short run soon but ping me and I'll respond when I can | 08:43 |
axw_ | fwereade: will do, I'll have to digest all of this first | 08:43 |
fwereade | axw_, yeah, I'm not expecting immediate responses at all :) | 08:44 |
axw_ | :) | 08:44 |
axw_ | fwereade: I'll investigate making lease a non-singleton. will let you know if I get anywhere | 09:12 |
fwereade | axw_, awesome, tyvm, http://reviews.vapour.ws/r/1787/ and my responses may be relevant background also | 09:13 |
axw_ | ok | 09:13 |
axw_ | fwereade: re worker dependencies, I think I'd avoid that initially and return an error if the apiserver facade attempts to use the lease manager if the worker is stopped. is that reasonable? | 09:15 |
fwereade | axw_, yeah, that's fine by me | 09:15 |
fwereade | axw_, but then we need a strategy for wiring the fresh lease manager into the api server when it's bounced... | 09:16 |
axw_ | fwereade: ah, I was thinking they'd all bounce.. that won't happen though will it. unless we make all lease-manager errors fatal. | 09:16 |
fwereade | axw_, if we made the lease manager part of state directly we might cut through that problem entirely | 09:16 |
fwereade | axw_, a state already looks after the watcher and presence "worker"s | 09:17 |
fwereade | axw_, it's not a *good* solution but it might make a good solution easier to see | 09:17 |
fwereade | axw_, not sure | 09:17 |
fwereade | axw_, really have to go out now, bbs | 09:17 |
axw_ | sure, ttyl | 09:17 |
dimitern | axw_, fwereade - http://reviews.vapour.ws/r/1777/ PTAL | 09:28 |
dimitern | fwereade, you'll like this I believe :) ^^ | 09:28 |
axw_ | dimitern: is resumer really run once per env? I would've thought it'd be once for the state server | 09:30 |
axw_ | I don't think there's a separate txn log per env is there? | 09:30 |
dimitern | axw_, I think it's run once per state server (jobmanageenviron) | 09:31 |
axw_ | dimitern: sorry, reading fail. I saw perEnvSingular and read perEnv | 09:31 |
dimitern | axw_, ah :) | 09:31 |
dimitern | axw_, yeah - perEnvSingular could be named better - like envManagerWorkers | 09:32 |
axw_ | dimitern: actually... it does look like it'll be one per (hosted) env | 09:33 |
axw_ | env worker manager starts those workers for each env in state | 09:33 |
* axw_ doesn't know JES well | 09:34 | |
dimitern | axw_, hmm - well, that smells fishy | 09:34 |
dimitern | axw_, but I haven't changed the logic there I believe | 09:34 |
axw_ | dimitern: you moved it into startEnvWorkers, so I *think* there'd be one of them per hosted env. I could be wrong, thumper and co could tell you definitively. anyway, I'll keep reviewing | 09:36 |
dimitern | axw_, fair point, will ping thumper or menn0 | 09:37 |
axw_ | dimitern: stupid question. what do we gain by running this over the API anyway? it's pretty closely tied to mongo | 09:39 |
dimitern | axw_, satisfying the "thou shalt not use state directly ever" concept :) | 09:42 |
dimitern | axw_, fwereade is really keen on this and I agree - better isolation, mockability, etc. | 09:42 |
dimitern | axw_, I guess I could move the starting of resumer in postUpgradeAPIWorker when isEnvironManager == true | 09:44 |
axw_ | dimitern: mk. well, what's there LGTM, apart from that possible per-env issue | 09:44 |
dimitern | axw_, thanks! | 09:44 |
axw_ | dimitern: yeah that looks like it'd work | 09:45 |
dimitern | axw_, it will still run 1 resumer per apiserver I guess, but it should work regardless | 09:45 |
dimitern | (for all hosted envs and in HA setup) | 09:46 |
axw_ | hm yeah, we don't have singular workers over API. welp, I dunno. is it valid for two things to try to resume transactions? | 09:47 |
axw_ | I guess it must be | 09:47 |
dimitern | axw_, looking in state/txn.go - ResumeAll() that ultimately gets called, it seems we always find all txns and try to resume !tapplied || !taborted | 09:49 |
perrito666 | mornin | 10:16 |
wallyworld | fwereade: with that pr, i was only trying to do the minimal work to improve what was there for 1.24, not solve the bigger picture issues which would take a lot more effort. i was hoping that as long as what was there was no worse, and hopefully better than what exists, it could solve the huge txn queue issues (but not everything else) | 12:05 |
fwereade | wallyworld, I *suspect* that all that'd take is dropping the delete/add, and leaving everything else as is | 12:10 |
fwereade | wallyworld, but the txn builder doesn't add anything afaics -- if anything it makes it slightly worse by making the lease managers more relentless in overwriting one another | 12:11 |
fwereade | wallyworld, (I think?) | 12:11 |
wallyworld | fwereade: that last point i did question - i think it could be changed to just error out if the txn revno differed | 12:12 |
fwereade | wallyworld, it doesn't help | 12:12 |
fwereade | wallyworld, you're just checking that the database looked how it did when you decided to make the change | 12:12 |
fwereade | wallyworld, but you're not using the database to help you decide whether that change is sane | 12:13 |
wallyworld | well isn't the database looking as you expect sufficient? | 12:13 |
fwereade | wallyworld, no, because the only component that knows how it should look is the lease manager | 12:13 |
fwereade | wallyworld, the lease persistor is just doing as it's told and not synchronising anything afaics | 12:14 |
fwereade | wallyworld, it's only the lease manager that understands on what basis it's replacing the lease, but it's keeping that basis secret from the persistor, so the persistor can't know whether it's still a good idea at the time it looks at the db | 12:15 |
wallyworld | hmmm, sounds like the lease manager needs to use the db as a point of synchronisation rather than an in memory model | 12:15 |
fwereade | wallyworld, I think that is unquestionable | 12:15 |
wallyworld | it could work if we could guarantee that the db 1:1 reflected the in memory model, but that doesn't work for ha etc | 12:16 |
fwereade | wallyworld, it's one of those communication screwups where I'd thought that was the only way that could ever possibly work, and that clever in-memory stuff might be a smart optimisation | 12:16 |
fwereade | wallyworld, it didn't even cross my mind that we'd try to build a distributed lease manager *without* synchronisation | 12:16 |
wallyworld | it wouldn't be so bad if mongo wasn't so fucking dumb | 12:16 |
fwereade | wallyworld, yeah, it's a genuinely interesting problem | 12:17 |
wallyworld | so i was looking for a quick 1.24 fix (not perfect) | 12:17 |
wallyworld | i thought that by at least making the db writes conditional, we may avoid the huge txn queue issue | 12:18 |
wallyworld | not trying to fix everything | 12:18 |
wallyworld | also not ignoring errors | 12:18 |
fwereade | wallyworld, I haven't checked yet but I strongly suspect that the huge queues are because of the delete/add | 12:18 |
wallyworld | at least we'd see what may be failing | 12:18 |
wallyworld | right, so the delete add is gone | 12:18 |
fwereade | wallyworld, and the trouble with not ignoring errors is that you can't really escape the tentacles | 12:18 |
wallyworld | by using the buildtxn function we avoid the delete/add | 12:19 |
wallyworld | as i said, not meant to be perfect | 12:19 |
wallyworld | but no worse | 12:19 |
wallyworld | with visible errors | 12:19 |
fwereade | wallyworld, errors visible in the wrong place to a random subset of clients, I think? | 12:20 |
wallyworld | errors will cause worker to reboot | 12:20 |
wallyworld | with logging | 12:20 |
fwereade | wallyworld, right | 12:20 |
wallyworld | so better since they are visible | 12:20 |
wallyworld | and maybe txn issue solved | 12:20 |
fwereade | wallyworld, but the worst worker problems that cause hangs and deadlocks are not touched | 12:20 |
wallyworld | yes | 12:20 |
fwereade | wallyworld, and you're delivering the errors to inappropriate places | 12:20 |
wallyworld | but that wasn't the goal | 12:20 |
wallyworld | why inappropriate? the worker will reboot, the cache will be reloaded, the error will be logged = improvement | 12:21 |
wallyworld | as it is now, the cache can be corrupt | 12:21 |
fwereade | wallyworld, the clients who called the method will get some weird error they should never see | 12:21 |
fwereade | wallyworld, other clients will just hang | 12:21 |
wallyworld | but that's no worse than now is it? | 12:22 |
wallyworld | at least the error will be visible somehow instead of swallowed | 12:22 |
fwereade | wallyworld, some errors will be visible to some clients | 12:22 |
wallyworld | right, but only if something failed | 12:23 |
fwereade | wallyworld, no | 12:23 |
fwereade | wallyworld, ...or maybe I misunderstood you | 12:23 |
wallyworld | quick hangout maybe? | 12:24 |
fwereade | wallyworld, sure, 5 mins? | 12:24 |
wallyworld | ok | 12:24 |
wallyworld | in our 1:1 | 12:24 |
mup | Bug #1457218 changed: failing windows unit tests <ci> <regression> <windows> <juju-core:Fix Committed by ericsnowcurrently> <juju-core 1.23:Fix Committed by ericsnowcurrently> <juju-core 1.24:Fix Committed by ericsnowcurrently> <https://launchpad.net/bugs/1457218> | 12:53 |
jam | wallyworld: fwereade: any solutions coming out of the hangout? | 13:02 |
wallyworld | jam: you could join us briefly? | 13:03 |
wallyworld | https://plus.google.com/hangouts/_/canonical.com/ian-william | 13:03 |
jam | wallyworld: link? (I'm supposed to be meeting with mramm, but he's not showing up yet) | 13:03 |
jam | wallyworld: he just showed up | 13:04 |
wallyworld | jam: tl;dr; i think we can land the pr with slight mods | 13:04 |
wallyworld | jam: fwereade is thinking about it :-) | 13:04 |
jam | wallyworld: fwereade: can we do it with opaque tokens? (manager gives a request to persister which manager needs to pass back in the next time) | 13:06 |
wallyworld | jam: i'm off to bed, fwereade will fill you in | 13:31 |
fwereade | jam, so, I'm reasonably sure that wallyworld's PR doesn't make things *worse*, with a couple of fixes we can put that in | 13:32 |
fwereade | jam, re passing tokens -- possibly? I couldn't think of a way to do that nicely, because of the smearing of knowledge across the layers (lease persistor knows what's written; lease manager knows what those leases mean; leadership manager knows how leases map to leadership) | 13:34 |
fwereade | jam, but maybe I mistake what problem you're addressing? | 13:35 |
wwitzel3 | natefinch: ping | 14:01 |
natefinch | ericsnow: check out https://github.com/natefinch/pie | 14:23 |
ericsnow | natefinch: nice :) | 14:23 |
voidspace | dimitern: ping | 14:43 |
dimitern | voidspace, pong | 14:47 |
voidspace | dimitern: I've created three tasks for working with the devices api | 14:48 |
voidspace | dimitern: pre-generating MAC addresses is actually probably simpler than our initial approach of a machine agent and apiserver methods for the container to report the MAC address after provisioning | 14:49 |
dimitern | voidspace, great, thanks! I'll have a look shortly | 14:49 |
voidspace | dimitern: there are some open questions however | 14:49 |
voidspace | dimitern: it doesn't look like you can associate a "device" with a "host" | 14:49 |
voidspace | dimitern: so on host destruction we'll still have to manually release the addresses (destroy the containers) | 14:49 |
voidspace | dimitern: that's easy, but not what we hoped | 14:49 |
dimitern | voidspace, wait I don't quite follow | 14:49 |
voidspace | dimitern: I thought part of the point we were hoping to get from the devices api was the ability to declare a container as belonging to a host machine | 14:50 |
dimitern | voidspace, you need the system-id (instance id in juju terms) of the host to pass as parent= in device new, right? | 14:50 |
voidspace | dimitern: gah | 14:50 |
dimitern | voidspace, that establishes the link | 14:50 |
voidspace | dimitern: I was looking at get not new | 14:50 |
voidspace | dimitern: so I didn't see parent | 14:51 |
voidspace | dimitern: cool, that's great | 14:51 |
dimitern | voidspace, :) yeah | 14:51 |
voidspace | dimitern: storing the devices uuid will be interesting | 14:51 |
voidspace | dimitern: 1) it's provider specific | 14:51 |
voidspace | dimitern: 2) the logical place for it is in instanceData - but that normally doesn't get created until after provisioning | 14:51 |
voidspace | dimitern: so there'll be some re-working there | 14:51 |
dimitern | voidspace, yeah, true | 14:53 |
dimitern | voidspace, it seems like we need to extend SetInstanceInfo to take an extra argument | 14:56 |
voidspace | dooferlad: dimitern: I picked up that PDU you recommended (dooferlad) for cheap on ebay (about half the price of that refurbed one) | 14:56 |
voidspace | dimitern: right | 14:56 |
dimitern | voidspace, if that argument is set, we'll store it in a new field in the instanceData doc for the container | 14:56 |
dimitern | voidspace, nice! does it work ok? | 14:57 |
voidspace | dimitern: waiting for it to arrive | 14:57 |
voidspace | dimitern: alternatively, we can fetch the device id from the mac address | 14:57 |
voidspace | dimitern: so we can just store that, and it's not provider specific | 14:58 |
dimitern | voidspace, interesting | 14:59 |
dimitern | voidspace, so an environ method like InstanceIdFromMAC(mac string) (instance.Id, error) | 15:00 |
voidspace | dimitern: well, the release IP address method could do that | 15:01 |
voidspace | dimitern: the MAAS specific one | 15:01 |
voidspace | dimitern: probably no need for a new public method on Environ | 15:01 |
dimitern | voidspace, I like this! | 15:02 |
dimitern | voidspace, the hostname can be used as well | 15:02 |
voidspace | dimitern: right | 15:02 |
dimitern | (but it needs to be a FQDN) | 15:02 |
voidspace | dimitern: so it should be easy, and no need to store provider specific information | 15:02 |
dimitern | voidspace, cool! | 15:02 |
voidspace | dimitern: so MAC address is not stored on the machine, nor the instanceData but in a networkInterfaceDoc | 16:18 |
voidspace | dimitern: (in terms of state) | 16:19 |
voidspace | dimitern: and that's done from SetInstanceInfo | 16:20 |
dimitern | voidspace, yeah, that's a bit crappy and needs fixing at some point | 16:24 |
voidspace | dimitern: is it the right way to store container mac address for now? | 16:26 |
voidspace | dimitern: or is it *already* done like that | 16:26 |
ericsnow | dimitern: is there (or will there be) networking info in charm metadata? | 16:26 |
voidspace | dimitern: i.e. if we specify the MAC address for the container on creation, it will be populated correctly in state by SetInstanceInfo | 16:26 |
voidspace | ericsnow: networking will largely be done as deploy time constraints and environment configuration | 16:27 |
ericsnow | voidspace: hmm, I would have thought it would be similar to storage, where the charm specifies up-front what networking resources it will need | 16:28 |
ericsnow | voidspace: see http://bazaar.launchpad.net/~axwalk/charms/trusty/postgresql/trunk/view/head:/metadata.yaml | 16:28 |
dimitern | voidspace, well, considering we'll most likely change what we do in SetInstanceInfo apart from calling SetProvisioned | 16:28 |
voidspace | ericsnow: what networking resources do you have in mind? | 16:28 |
ericsnow | voidspace: not sure exactly :) | 16:29 |
voidspace | ericsnow: what *could* a charm usefully specify... | 16:29 |
dimitern | voidspace, I'd suggest to reuse SetInstanceInfo, if possible (pass the MAC as part of the network info) | 16:29 |
ericsnow | voidspace: what have you got? :) | 16:29 |
voidspace | dimitern: they should be already - as interfaces | 16:29 |
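dimitern's suggestion above — pass the MAC as part of the network interface info handed to SetInstanceInfo — might look roughly like this sketch. The type and function names here are hypothetical simplifications, not juju's actual state API:

```go
package main

import "fmt"

// InterfaceInfo is a simplified stand-in for the per-interface details
// juju records in state (networkInterfaceDoc); fields are illustrative.
type InterfaceInfo struct {
	InterfaceName string
	MACAddress    string
}

// setInstanceInfo mimics recording provisioning data, including each
// interface's MAC address, in a single call (hypothetical sketch).
func setInstanceInfo(machineID, instanceID string, ifaces []InterfaceInfo) {
	for _, i := range ifaces {
		fmt.Printf("machine %s (instance %s): %s -> %s\n",
			machineID, instanceID, i.InterfaceName, i.MACAddress)
	}
}

func main() {
	// A container whose MAC was chosen at creation time would have it
	// populated in state via the same provisioning path.
	setInstanceInfo("0/lxc/1", "i-abc123", []InterfaceInfo{
		{InterfaceName: "eth0", MACAddress: "00:16:3e:aa:bb:cc"},
	})
}
```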
voidspace | ericsnow: what "spaces" a unit can be in - specified at deploy time | 16:29 |
voidspace | ericsnow: and then the creation of spaces and the creation of subnets and allocating them to spaces | 16:29 |
ericsnow | voidspace: spaces as in subnets? | 16:30 |
voidspace | ericsnow: a space is a collection of subnets | 16:30 |
ericsnow | voidspace: k | 16:30 |
voidspace | ericsnow: and they're environment specific, so you can't usefully specify anything about them in a charm | 16:30 |
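voidspace's definition above — a space is a collection of subnets, specific to an environment — could be sketched with illustrative types (not juju's actual model code):

```go
package main

import "fmt"

// Subnet and Space are illustrative only: a space groups subnets, and
// which spaces exist is an environment-level fact, so a charm cannot
// usefully name them up front.
type Subnet struct {
	CIDR string
}

type Space struct {
	Name    string
	Subnets []Subnet
}

func main() {
	s := Space{
		Name:    "dmz",
		Subnets: []Subnet{{CIDR: "10.0.1.0/24"}, {CIDR: "10.0.2.0/24"}},
	}
	fmt.Printf("space %q has %d subnets\n", s.Name, len(s.Subnets))
}
```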
ericsnow | voidspace: so "space" is what could be meaningful in the charm metadata | 16:31 |
voidspace | ericsnow: raise ParseError("what?") | 16:31 |
ericsnow | voidspace: you could at least identify the space | 16:31 |
voidspace | ericsnow: but each environment will have different spaces | 16:31 |
voidspace | ericsnow: so you specify them at deploy time | 16:31 |
ericsnow | voidspace: I'm asking in context of charm-launched containers | 16:32 |
ericsnow | voidspace: we are looking to specify them in the charm metadata | 16:32 |
voidspace | ericsnow: well, a container will only be able to be in the spaces that the host can see | 16:32 |
ericsnow | voidspace: part of that would be identifying the networking resources the container should use | 16:33 |
=== kadams54-away is now known as kadams54 | ||
voidspace | ericsnow: the spaces available to a container will depend on the host - if the physical (or virtual!) machine a container is *in* doesn't have access to the subnets in a space then the container can't either | 16:33 |
voidspace | ericsnow: so I don't think there's anything useful to specify in the charm metadata there | 16:34 |
voidspace | ericsnow: unless the charm can get the spaces available at container creation time and (effectively) say "be on this subnet" | 16:35 |
voidspace | ericsnow: which, if the host is in several spaces, may be useful | 16:35 |
ericsnow | voidspace: exactly | 16:35 |
ericsnow | voidspace: if there is only one possibility then there's no need to decide :) | 16:36 |
voidspace | ericsnow: this is metadata added at charm runtime, not upfront then? | 16:36 |
ericsnow | voidspace: it's in the face of multiple options that we'd like to be explicit | 16:36 |
ericsnow | voidspace: no, it will be part of the charm metadata | 16:36 |
voidspace | ericsnow: you can't know at charm creation time what spaces will be accessible to a machine at arbitrary machine creation time | 16:37 |
voidspace | ericsnow: so you can't know anything useful upfront, it's deploy time data not charm data | 16:37 |
ericsnow | voidspace: mostly declaring the space to use for a container is relevant if the charm has multiple containers and multiple spaces and the containers should be on the same subnet | 16:38 |
voidspace | ericsnow: so if this is metadata encoded into the charm (i.e. not to be determined at hook runtime / container creation time) then you can't know ahead | 16:38 |
voidspace | ericsnow: but what spaces units of a charm are to be deployed to is the decision of the person deploying the charm not the person writing the charm | 16:39 |
voidspace | ericsnow: so you can't encode that into the charm | 16:39 |
voidspace | I think if a charm (unit of a service) creates a container, the assumption has to be that it will have the same constraints as those specified for the charm | 16:40 |
perrito666 | /query natefinch | 16:40 |
perrito666 | lol | 16:40 |
perrito666 | my irc client has the worst UI in history | 16:40 |
ericsnow | voidspace: okay, so we'll just have to wing it :) | 16:40 |
voidspace | ericsnow: yeah | 16:40 |
voidspace | ericsnow: so there may need to be some code / checking that we *do* pick the same subnet for configuring the networking of the container | 16:41 |
voidspace | ericsnow: but I think that's deterministic, so it shouldn't be a problem currently | 16:41 |
ericsnow | voidspace: agreed | 16:43 |
voidspace | ericsnow: eventually we will do per-instance (including containers) firewalling - and setup routing rules so that spaces are isolated from each other | 16:44 |
voidspace | ericsnow: so the host will need to know what ports the container is using as we're doing NAT | 16:44 |
voidspace | ericsnow: at least with addressable containers we are | 16:45 |
voidspace | ericsnow: but per-instance firewalling, and routing rules for spaces, are both some way off | 16:45 |
ericsnow | voidspace: you mean like we mostly had to do for the new vsphere provider? :) | 16:45 |
voidspace | ericsnow: thankfully I have no idea... | 16:45 |
voidspace | g'night all | 18:11 |
=== urulama is now known as urulama__ | ||
=== kadams54 is now known as kadams54-away | ||
natefinch | I hate it when my job comes down to: let's find the least-sucky way to do this. ...because invariably people disagree which way is least sucky. | 20:04 |
natefinch | wwitzel3: you around? | 20:08 |
wwitzel3 | natefinch: yeah | 20:08 |
wwitzel3 | natefinch: in moonstone with ericsnow | 20:08 |
natefinch | kk | 20:08 |
natefinch | I was wondering if you knew if it's possible to load the existing syslogconfig ... I can find a Write method, but not a Read method... so I don't know if we even support reading from whatever config we wrote to disk. | 20:10 |
wwitzel3 | natefinch: don't know off hand, I can poke around in a bit | 20:13 |
natefinch | wwitzel3: that's ok, I can poke around, just figured I'd ask if you knew | 20:14 |
natefinch | dammit, I hate it when the docs don't specify what happens in edge conditions. If you os.Rename a file and the target exists.. what happens? | 20:32 |
perrito666 | In unix, most likely an overwrite | 20:32 |
perrito666 | unless there is a guard | 20:32 |
wwitzel3 | anyone able to explain the workflow process of developing new stuff in juju/charms? | 21:17 |
wwitzel3 | do you work against v5-unstable? and propose to v5? | 21:17 |
niedbalski | Has anybody experienced this error (missing series) "21": agent-state-info: invalid binary version "1.23.3--armhf" ? | 21:55 |
thumper | cmars: we on for today? | 21:58 |
thumper | niedbalski: wow, cool... | 21:59 |
thumper | unknown series? | 21:59 |
thumper | niedbalski: what host? | 21:59 |
niedbalski | thumper, 1.23.3-vivid (client), 1.23.2 ( bootstrap node ) on armhf. This happens on sync-tools / add-machine operations. | 22:00 |
thumper | niedbalski: what hardware are you using? | 22:01 |
thumper | for armhf? | 22:01 |
niedbalski | thumper, raspberry pi 2 | 22:01 |
niedbalski | thumper, this is not super critical, it's for my local lab, but the bug is ugly anyway :) | 22:02 |
thumper | ack | 22:02 |
thumper | can you file a bug plz? | 22:02 |
thumper | cmars: nm, I just saw the email about the decline | 22:02 |
niedbalski | thumper, ok, it seems that other archs experienced this same issue in the past, btw. (http://irclogs.ubuntu.com/2014/09/24/%23juju.txt) | 22:03 |
niedbalski | thumper, https://bugs.launchpad.net/juju-core/+bug/1459033, anything else I can add? | 22:14 |
mup | Bug #1459033: Invalid binary version, version "1.23.3--amd64" or "1.23.3--armhf" <juju-core:New> <https://launchpad.net/bugs/1459033> | 22:14 |
thumper | niedbalski: nah, that is a good start | 22:20 |
thumper | niedbalski: thanks | 22:20 |
mup | Bug #1459033 was opened: Invalid binary version, version "1.23.3--amd64" or "1.23.3--armhf" <juju-core:New> <https://launchpad.net/bugs/1459033> | 22:22 |
=== kadams54 is now known as kadams54-away | ||
waigani | wallyworld, axw: I've hit a bug with 1.24, ec2 --upload-tools - there are a bunch of CLOSE_WAIT connections on the server to s3 - full details: #459047 | 23:41 |
mup | Bug #459047: [105158.082974] ------------[ cut here ]------------ <amd64> <apport-kerneloops> <kernel-oops> <linux (Ubuntu):Confirmed> <https://launchpad.net/bugs/459047> | 23:41 |
wallyworld | oh joy | 23:42 |
wallyworld | maybe bug 1459047 perhaps | 23:42 |
mup | Bug #1459047: juju upgrade-juju --upload-tools broken on ec2 <juju-core:New> <https://launchpad.net/bugs/1459047> | 23:42 |
waigani | wallyworld: ugh, what did I paste? | 23:43 |
wallyworld | missing the 1 | 23:43 |
waigani | ah, right heh | 23:43 |
wallyworld | waigani: so i think you're on bug duty for onyx? looks like you've a bug to work on :-) | 23:47 |
waigani | wallyworld: yep | 23:48 |
wallyworld | waigani: we're having fun fixing lease manager stuff \o/ | 23:49 |
waigani | wallyworld: any idea why we're connecting to s3 with --upload-tools? I thought it was using gridfs? | 23:49 |
waigani | wallyworld: oh yeah, that one looked interesting | 23:49 |
wallyworld | s3 was at one stage a repository for public tools | 23:49 |
waigani | wallyworld: do you know if we are using it for anything now? | 23:50 |
waigani | s/are/should be | 23:50 |
wallyworld | and s3 is still used for bootstrap state file i think (need to check) | 23:50 |
wallyworld | i don't think we've ported off that yet | 23:50 |
waigani | right | 23:51 |
wallyworld | so very minimal use for new environments | 23:51 |
waigani | okay, I'll leave you to your leasing :) | 23:51 |
wallyworld | we can swap :-P | 23:53 |
waigani | haha | 23:53 |
mup | Bug #1459047 was opened: juju upgrade-juju --upload-tools broken on ec2 <juju-core:New> <https://launchpad.net/bugs/1459047> | 23:58 |
Generated by irclog2html.py 2.7 by Marius Gedminas - find it at mg.pov.lt!