/srv/irclogs.ubuntu.com/2015/03/12/#juju-dev.txt

=== kadams54 is now known as kadams54-away
=== kadams54-away is now known as kadams54
=== kadams54 is now known as kadams54-away
niedbalskiericsnow, ping01:49
niedbalskiericsnow, http://reviews.vapour.ws/r/1088/ , re-submitted, thanks !01:49
jamkatco: just wondering if you got to the LeaderSettings watcher registration issue or whether I should be picking it up02:14
katcojam: hey i'm working on that right now02:20
jamkatco: k02:20
katcojam: it turned out to be a bit trickier to diagnose; the code was registering the worker correctly, but the worker itself didn't adhere to the state.NotifyWatcher interface02:20
katcojam: wallyworld figured that one out02:21
katcojam: so there's a few issues; one question is why is the type assertion failure not being bubbled up02:21
katcojam: but at any rate, i'll have a patch momentarily. had to eat dinner.02:21
jamkatco: when you EOD can you make sure to push it up so I can pick it up if it isn't done? and a ping to me to look at it would be good02:22
katcojam: it's 21:22 here, so i'll get a PR in before i stop :)02:22
katcojam: http://reviews.vapour.ws/r/1141/ i will stay on for questions/fixes02:39
jamkatco: looking now02:39
katcojam: wallyworld made an excellent suggestion for making the test better, but i don't have the time tonight to grok it; i can shore that up later: http://pastebin.ubuntu.com/10582820/02:40
jamkatco: right. I was about to suggest using the NotifyAsserter directly on the Changes rather than a secondary stream02:41
jamkatco: reviewed02:45
katcojam: ty looking02:45
* katco smacks head02:46
katcojam: so, you're absolutely right02:47
katcojam: do you think i should just implement a full watcher instead of attempting to piggy-back off of settingsWatcher?02:47
katcojam: or do you think a simple for loop wrapping the settings-watcher's change is sufficient?02:47
jamkatco: if all it needs to do is proxy another one, thats fine (I think). Is there a reason you can't just hand back the settings watcher itself?02:48
katcojam: yeah, the state.NotifyWatcher signature demands a signature of type Changes() <-chan struct{}02:48
jamkatco: if you are embedding *settingsWatcher, doesn't that give you *its* Changes channel?02:49
katcojam: and that's the bit preventing it from being registered02:49
jamah, so we have changes but its not a struct{}02:49
jamk02:49
katcojam: the settings watcher is not of type NotifyWatcher; it's signature is <-chan Setting something or other02:49
jamkatco: from being registered? or from being instantiated ?02:49
katcojam: i believe from being registered02:49
katcojam: we do some reflection from what wallyworld found02:49
katcojam: and anything stuffed into resources which does not conform to NotifyWatcher will not be utilized02:50
jamkatco: right but that's not *registration* thats *extraction*02:50
jamkatco: because resources can be whatever02:50
katcojam: right sorry; extraction02:50
jambut when you go to look them up it checks that the thing that's there matches.02:51
katcoright02:51
jamprobably we should have good logging as well02:51
jamsomething that says "I found something, but its not what you're looking for"02:51
katcoyeah we discussed that; it's a bit odd that this didn't trigger some kind of error/log, *something*02:51
jamwe generally don't send that as an error to the caller (though I think we could here) because of Permission/Priority concerns.02:51
katcowe also discussed maybe having separate register methods for different types to force this into a compile-time error02:52
jamanyway, I'll be back in 1002:52
katcok, i'll implement the for loop02:52
katcojam: i'm a bit concerned about what happens when the embedded settingsWatcher is told to stop; what would happen to the LeadershipSettingsWatcher's Changes loop?02:55
wallyworldkatco: coming late to the party, but in such cases you'd implement Stop() on the outer struct and call the embedded Stop() as well as cleaning up anything else03:01
wallyworldaxw: i've made a few comments on the gh pr03:02
katcowallyworld: yeah the thought crossed my mind; but that means i need to probably implement a tomb of my own03:02
axwwallyworld: thanks. I've gotta head out shortly, will respond when I get back03:03
wallyworldsure03:03
wallyworldkatco: yeah maybe, i'd need to check the code in more detail. the reason for not using settings watcher directly is the changes method returns a settings delta, not just an event? and we want notify watcher samantics?03:05
katcowallyworld: yeah03:05
wallyworldhmmmm03:05
wallyworldkatco: could we just add a Changes() <-chan struct{} to settings watcher03:06
wallyworldto make it also a notify watcher03:06
katcoi don't think go allows overloading like that03:06
katconaming conflict03:06
katcoit doesn't take into account the entire signature03:06
wallyworldah, yeah03:06
wallyworldnot i remember why i hate Go often03:06
wallyworldnow03:06
katcolol03:07
wallyworldthat and no exceptions and no generics03:07
katcoit forces you to stay simple... which can be an immensely good thing :)03:07
wallyworldkatco: so we could extract a base settings watcher and embed that in 2 variations - what we have now and a new notify variant03:08
wallyworldthe 2 new versions of the containing struct would have the different Changes() signatures03:08
katcowallyworld: hm. i'm kind of liking that idea. i do worry about unforeseen consequences03:08
wallyworldthat's what tests are for :-D03:09
wallyworldi think we have pretty ok coverage in that area03:09
katcomaybe earlier today i would have agreed with that03:10
katco:)03:10
wallyworldlol03:10
katcoi think i'll give that way a go though03:10
katcogood idea03:11
wallyworldgood luck :-)03:11
wallyworldi think it will work fine, and solves the issues encountered03:11
katcowallyworld: mmm... i think that just makes the issue worse. because i'd then have the same problem in 2 places. the base watcher would still have to notify on a <-chan settings03:15
wallyworldcouldn't the base watcher use an embedded interface to call when it needs to notify?03:17
wallyworldi'd need to look at the code to map it out03:17
katcowallyworld: hm. no that may work.03:18
wallyworldthe one with the notify semantics would just discard the info given and shove struct{} onto a channel03:19
katcowallyworld: right, but you're still talking to a channel; you lose the magic of select03:20
katcoso the way this would work is the i'm in the base type's select, specifically in the case statement that received a change03:21
katcoi then call the notify function that was passed to me03:21
katcothat notify function has to send on a channel03:21
katcoexcept that now if stop is called, it will block03:21
katcoas it's implemented now, the select would select the stop even if it was waiting on sending the notification03:22
katcobecause it's in its select03:22
wallyworldhmmm03:22
katconow mind you03:22
katcomy brain is revolting a bit at this time of night03:22
thumperwallyworld: do you have a link to the CI jobs on jenkins?03:23
katcothis is where generics would be wonderful03:23
thumperurl hacking is failing me03:23
katcothumper: o/03:23
thumperhey katco03:23
wallyworldthumper: http://juju-ci.vapour.ws:8080/job/github-merge-juju/ ?03:23
wallyworldor http://juju-ci.vapour.ws:8080/03:23
wallyworldthe last one is the dashboard for all the jobs03:24
thumperthat doesn't have the CI jobs on it03:24
wallyworldthumper: you need to be logged in03:24
thumperas who?03:24
thumperI don't have any creds that AFAIK03:25
rick_h_thumper: everything got locked down due to the creds leakage issue raised last week03:25
thumperhey rick_h_03:26
wallyworldkatco: wanna leave it till tomorrow? I'll see if i can hack it a bit today03:26
katcowallyworld: nah i'm close03:26
katcowallyworld: i don't think there's a way around it. i have to do a copy/paste job. it needs its own tomb setup03:27
wallyworldyeah, could be right03:27
rick_h_thumper: howdy03:27
katcothis is when i think go's simplicity works against it03:27
wallyworldexactly03:28
wallyworldand i run into that *all* the time03:28
katcoi think they are trying to solve this still03:28
wallyworldfor me, it's too simple03:28
katcobut code duplication of this complexity is annoying03:28
wallyworldleads to *lots* ot cut and paste bilerplate03:28
wallyworld code duplication of any complexity can get annoying - if i never see another cut and paste arraycontainsstring() it will be too soon03:29
thumperFYI: both CI blockers have now been cleared03:32
thumperverified that the fix that landed earlier fixed the upgrade issue03:32
thumperand the power test isn't timing out on cmd/jujud/agent either03:32
thumperpretty sure those two things were related03:33
katcowallyworld: jam: helper method for registering watcher: RegisterWatcher(...) or RegisterNotifyWatcher(...)?03:38
wallyworldNotify imo03:39
wallyworldas there are idfferent kinds03:39
wallyworldeg StringsWatcher03:39
mupBug #1430791 changed: Upgrade-juju is broken on most/all substrates <ci> <regression> <upgrade-juju> <juju-core:Fix Released by waigani> <https://launchpad.net/bugs/1430791>03:39
mupBug #1430898 changed: run-unit-tests ppc64el timeout <ci> <gccgo> <ppc64el> <regression> <test-failure> <juju-core:Fix Released> <https://launchpad.net/bugs/1430898>03:39
mupBug #1431130 was opened: make kvm containers addressable (esp. on MAAS) <addressability> <kvm> <maas-provider> <network> <juju-core:In Progress by dooferlad> <https://launchpad.net/bugs/1431130>03:39
katcowallyworld: gosh... it makes apiserver/common depend on state03:40
katcoi don't know if it's worth that.03:40
wallyworldno03:40
wallyworldlet's not do that03:40
katcopushing up changes with a little piece of my rotted heart.03:42
wallyworldlol03:43
katcohttp://reviews.vapour.ws/r/1141/03:43
katcoi think we could probably do something better by passing in a channel into a base type03:43
katcobut i'm not sharp enough right now to conceive of that03:43
wallyworldkatco: i know it's late for you, and i'm happy to do it, but i do think the test needs to use something like the pastebin i offerred earlier03:48
katcowallyworld: i was going to land that separately... if you want it in i'll have to hand off03:48
katcowallyworld: otherwise i won't be fresh for tomorrow. b/c babies don't have a snooze ;)03:48
wallyworldkatco: ok, so long as you land first thing tomorrow before we branch 1.23 :-)03:49
katcowallyworld: land the separate branch? or land changes you will make?03:49
wallyworldthe followup branch with revised tests03:49
wallyworldthis branch at least will fix the issue itself03:50
katcowallyworld: keep in mind i also have to do the amz fix03:50
wallyworldi can do the tests one03:50
katcowallyworld: it would be appreciated... so does this one have a ship it then?03:50
wallyworldkatco: just eyeballing changes - they are a cut and paste right?03:51
katcowallyworld: yup03:51
katcowallyworld: from settingsWatcher03:51
wallyworldkatco: ok, tyvm for perservering with this, +103:52
* katco teuche, Jenkins. it is i who waits on you.04:06
katcotouche even04:07
mupBug #1430898 was opened: run-unit-tests ppc64el timeout <ci> <gccgo> <ppc64el> <regression> <test-failure> <juju-core:In Progress> <https://launchpad.net/bugs/1430898>04:09
mupBug #1431134 was opened: fix container addressability issues with cloud-init, precise, when lxc-clone is true <addressability> <cloud-init> <ec2-provider> <lxc> <maas-provider> <network> <precise> <usability> <juju-core:In Progress by dimitern> <https://launchpad.net/bugs/1431134>04:09
axwwallyworld: I just realised there's more to do, I hadn't updated the apiserver/storageprovisioner code to use the new watchers04:09
axwdoing so now04:10
katcogah04:11
katcobetween the time i submitted and the job queued up that bug hit04:11
katcoand now it rejected the build04:11
katcocan someone land this when trunk opens up? https://github.com/juju/juju/pull/181304:14
katcoi need to go to bed04:14
katcowallyworld: axw: anastasiamac ^^04:15
axwkatco: ok, will keep an eye out04:16
axwgood night04:16
katcoaxw: ty sir. have a good day04:16
jamwallyworld: katco: so I looked at the change, and we need to tweak it a bit for not actually being a settings watcher, also it is completely untested as is. But if you want to hand it off to me, since I actually need it for my patch, I can pick it up.04:41
jamwallyworld: would you rather just hand it off to me?04:41
wallyworldjam: if you want, i realise the tests need updating, but it was late for katco so i was going to follow up with a branch, but if you want to do it for your patch that wuld be fine with me04:42
jamwallyworld: well both that we now have a lot of untested code in LeadershipSettingsWatcher, and the test needs tweaking for featuretests04:42
wallyworldjam: agreed, hence the tests were to be updated straight away04:43
wallyworldjam: i was looking to use something like statetesting.NewNotifyWatcherC as is used elsewhere04:45
wallyworldwas that ow you wre going to redo the tests?04:45
jamwallyworld: yeah, I'm grabbing your paste as a reference04:47
wallyworldoh, that paste was just an initial brain dum04:47
wallyworldp04:47
jamwallyworld: I had forgotten that NotifyWatcherC was the original that I used to create the NotifyAsserterC channel stuff04:47
wallyworldbut should be close04:47
jamwallyworld: yeah, I'm tweaking it, but it gives a good reference04:47
jamlike, we have to import statetesting :)04:47
wallyworldof course :-)04:47
wallyworldi didn't compile it, just dumpt the text into pastebin04:48
wallyworldjam: also for later, did you talk with mark about Properties vs Hints?04:49
jamno, mark was out of town last week04:49
wallyworldok, np04:50
jamwallyworld: do you know why featuretests starts mongo in a different way?04:50
wallyworldno04:51
wallyworldi either didn't realise or forgot that it did04:51
wallyworldjam: but isn't this standard? coretesting.MgoTestPackage(t)04:51
jamhmm… maybe it doesn't. I was running into problems with a —journal flag but only for the featuretests directory04:51
wallyworldin package_test04:51
jambut they seem to be running now04:52
jammaybe it was something else04:52
wallyworldsymlink issue :-P04:52
jamwallyworld: from what I've seen could be04:53
wallyworldoh, i was joking :-)04:53
jamnope, still happening: [LOG] 0:03.670 ERROR juju.worker exited "state": cannot create database index: cannot use 'j' option when a host does not have journaling enabled04:53
jammy state worker spins endlessly in those tests04:54
wallyworldhmmmm, not seen that before04:54
=== beisner- is now known as beisner
axwwallyworld: can you please check my latest diff in http://reviews.vapour.ws/r/1128/05:40
axwwallyworld: I've updated storageprovisioner to match the scope changes05:40
wallyworldsure05:41
wallyworlddimitern: do you know why the blocking bug fails on ppc64 only?07:20
dimiternwallyworld, I don't know that for sure - it might be gccgo-specific07:21
wallyworlddimitern: it's hard to tell from the console log - there's so many go routines, most of which are blocked on a select; where did you see an concrete error info?07:22
dimiternwallyworld, search for the test I mentioned in the log - it's there, about the newDummyWorker (or whatever, the second thing) - towards the end of the panic dump you can see upgrade_test.go:1779 (or was it machine_test.go)07:24
wallyworlddimitern: thank you, my eyes weren't working properly07:24
dimiternwallyworld, oh, np - it's hard to see anything useful in most of these panics07:27
axwwallyworld: we're going to need to expand the storage provider and provisioner. if a machine reboots, we'll have to recreate tmpfs/loop08:00
TheMuemorning o/08:22
dimiternTheMue, o/08:27
wallyworldaxw: unless they say they want transient right?08:29
axwwallyworld: transient volumes should still be attached after restart, they just may be empty08:29
axwwallyworld: at least, in the ec2 definition08:29
wallyworldyes, that is true08:29
axwwallyworld: I'm thinking of folding the filesystem mounting into the storage provider/provisioner08:30
axwwallyworld: and how to deal with remounting after restart. then I realised we need to do the same for tmpfs/loop08:30
* TheMue just read the State discussion. yeah, there are always repeating names in many projects, e.g. also Context.08:30
axwTheMue:  it occurred to me as I used the word "context" a lot in that email ;)08:31
TheMue..ooOO( Data, Dump, Thing, Stuff, ... )08:31
TheMueaxw: yeah, as "context of a state"08:31
axwwallyworld: so, my current thinking is that we require Attach{Volume,Filesystem} to be idempotent08:31
wallyworldaxw: i think it makes sense for the provisioner worker to have that responsibility of dealing with restart/remounting08:31
wallyworldyes08:32
axwwallyworld: and then the first time we see an already provisioned storage, we request a re-attach08:32
wallyworldlike unit hooks08:32
axwwhich may be a no-op08:32
wallyworldyep08:32
axwwallyworld: also, we don't need to think about htis too much now, but I think we're going to want to run Attach on both env + machine for env-scoped attachments08:33
axwwallyworld: e.g. for Ceph where we'd want to prepare the resource in the env storage provisioner, then attach/mount in the machine storage provisioner08:34
wallyworldyeah, i haven't thought about that08:34
wallyworldnot enough brain space08:34
axwthat can come later, I'm pretty confident we can work it in08:34
axwwallyworld: anyway, I think I'll work on provisioning filesystems/filesystem attachments in the storage provisioner now08:35
axwI won't worry about loop/tmpfs for the moment, will come back to them later08:35
wallyworldyou mean the restart behaviout for tmpfs/loop?08:36
axwwallyworld: we don't watch volume attachments in the storage provisioner yet, so we can't handle them until we do08:37
axwwallyworld: if you prefer I can do it for volumes first, I was going to leave it so you could continue on it in parallel08:37
wallyworldaxw: ok, start on fs, i have some health stuff to do but can then pick up volumes08:38
axwthough TBH, it's kinda much the same... I might see if I can abstract some stuff to make the core logic common08:38
wallyworldok sounds good08:39
katcough sleepless night in our household o.009:03
katcogoing to work on goamz while i'm up09:04
coreycbbodie_, here now.  I'm on UK time this week and back to US eastern time next week.09:07
wallyworldkatco: is that you or a bot?09:09
katcobeep boop09:10
wallyworldR2D2?09:10
katcolol09:11
katcoi'm going to drink an r2d2 sized coffee09:11
katcoin a few hours09:11
wallyworlddouble shot09:11
katcowe think she is teething again09:12
voidspacedimitern: if you get a chance09:19
voidspacedimitern: http://reviews.vapour.ws/r/1138/09:19
dimiternvoidspace, cheers, will look shortly09:20
=== ashipika1 is now known as ashipika
=== perrito667 is now known as perrito666
dimiternvoidspace, dooferlad, TheMue, please take a look at this http://reviews.vapour.ws/r/1144/09:45
TheMuedimitern: *click*09:45
voidspacedimitern: s/live tests on MASS/live tests on MAAS/ ?09:51
voidspacedimitern: in PR description09:51
voidspacedimitern: ah, dhcp leases... :-)09:52
voidspacedimitern: sounds like you've been having fun working all this out09:52
dimiternvoidspace, :) oh yeah - ofc09:52
dimiternvoidspace, it was a fun week, yeah :D09:53
dimiternTheMue, thanks!09:53
TheMuevoidspace: MASS is ok, Metal as Super-Service09:53
* dimitern has bootstrapped a maas environment at least a 100 times in the past 4 days09:53
TheMueouch09:53
dimiternvoidspace, you've got a review09:59
voidspacedimitern: thanks09:59
voidspacedimitern: PR updated with extended test10:34
dimiternvoidspace, great! I'm having another look10:34
dimiternvoidspace, looks good to land10:39
voidspacedimitern: cool, thanks10:41
katcoaxw: ping10:52
wallyworldkatco: axw is here now, if you ping him again, or at least rub his lamp, he may appear in a puff of smoke11:10
wallyworldor he was11:10
katcolol11:10
katcoi think i figured it out, i just wanted to pick his brain11:10
mupBug #1431286 was opened: juju bootstrap fails when http_proxy is set in environments.yaml <juju-core:New> <https://launchpad.net/bugs/1431286>11:10
katcowallyworld: got time for a review of goamz?11:18
katcowallyworld: dimitern: https://github.com/go-amz/amz/pull/3711:21
dimiternkatco, will have a look shortly11:25
katcodimitern: thx11:27
TheMuedimitern: ready with a first pass and like it so far. but would like a 2nd review11:37
dimiternTheMue, sure, thanks so far :)11:37
* TheMue steps out for lunch11:43
* fwereade has confused his sleeping patterns again and would like to beg for someone to implement a trivial branch for him while he goes and has breakfast and stuff11:44
wallyworldkatco: sorry, was afk. looking11:44
axw_katco: pong11:44
katcowallyworld: np at all11:44
* axw_ emerges in a puff of smoke11:44
katcolol11:44
=== axw_ is now known as axw
katcoaxw_: hey, was mostly going to pick your brain re: io.Copy moving the stream pointer of a request body to the end11:44
* fwereade needs "leader-elected", "leader-deposed", and "leader-settings-changed" hooks added to charm/hooks11:44
wallyworldah, awx has already reviewed11:44
katcowallyworld: ah cool11:45
axwkatco: I just commentd on the PR11:45
axwcame on here to check if you were around :)11:45
fwereadeaxw, jam, I have a couple of concerns re http://reviews.vapour.ws/r/959/ -- possibly unfounded? please consider11:45
axwfwereade: okey dokey11:46
katcoaxw: so i don't think making a type assertion for ReadSeeker will work, because the body of http.NewRequest wraps the incoming body in a ioutil.NopCloser11:47
jamfwereade: I agree with you about axw's comment. I had one written myself, but forgot that publish button11:47
jamwhy is it at the top....!11:47
fwereadeaxw, actually I'm not sure I expressed them very well -- I just have a bit of a nervous about not coalescing11:47
fwereadeaxw, it was an uphill struggle to convince myself it was ok in the client watcher code11:48
axwkatco: rats11:48
katcoaxw: i know, i tried that at first =/11:48
jamfwereade: so the tick-tock stuff was just preserving what we had and going forward with it. I also wasn't entirely happy as it seems very different from our standard behavior.11:48
katcoaxw: i also realize there is an API break; but we *just* cut v311:48
katcoaxw: not really that sure what our stance is on that11:49
axwkatco: fair enough11:49
katcoaxw: i suppose i'll let dimitern weigh in on that one11:50
dooferladdimitern: Could you take a look at https://github.com/juju/testing/pull/54/files11:50
axwfwereade: I don't really understand about the watcher ownership. if the source Stop()s it cleanly, why wouldn't its Err() be nil?11:50
fwereadeaxw, IMO the source shouldn't stop its watcher until it's exiting its goroutine11:51
jamkatco: official versions is "what was in a stable release" IMO11:51
dooferladdimitern: I tried running the .bat version and it doesn't work inside "wine cmd" so I am guessing it was broken anyway. I can test in Windows in a moment.11:51
katcojam: stable release of juju? or of goamz?11:52
katcojam: how did the leader settings work out btw. sorry i had to (try) and go to bed last night11:52
jamkatco: well goamz is trickier, as its a dependency. I thought it was an API version.11:52
jambut if things are just broken in v3 seems like people can't use it11:52
jamkatco: working on the patch to tests, been very distracted this morning, but I can build on what you landed.11:53
katcojam: well, not so much broken. v4 signing just uses more memory than it should.11:53
fwereadeaxw, "just one deferred Stop, guaranteed to happen after the loop exits" feels much cleaner -- and thus indicates that any close of that channel inside the loop implies that *if* there's no error returned *someone else* has interfered with us, and we should complain11:53
axwfwereade: I'm sure I'm being slow, but I still don't see how that affects the error state of the watcher. One way or another the source stops the watcher; if it's *asking* to stop it, why would the watcher stop with an error?11:53
jamkatco: does fixing the memory issue actually cause a change to the API?11:53
katcojam: distracted? people on juju core? no! ;)11:53
jamaxw: if someone stops something you depend on, isnt that an error?11:54
jamaxw: (if I pull the handbrake while you're driving, I don't think you'd exit cleanly :)11:54
katcojam: it does, because it necessitates some signature changes to take in io.ReadSeeker and not io.Reader11:54
katcojam: we don't know how awesome axw is at drifting...11:54
axwfwereade: so liveSource.Stop() should always error?11:55
katcojam: if you're interested, here is one of the API breakers: https://github.com/go-amz/amz/pull/37/files#diff-7db3cb93944d57b2ffc803281c906018R20911:55
* axw is confused11:56
fwereadeaxw, I think what would happen is11:56
jamkatco: so if you're really concerned, you could just add a new function PutReadSeeker11:56
fwereadeaxw, liveSource.Stop calls q.tomb.Kill(nil); return q.tomb.Wait()11:56
jamnot an API break by most people's definition11:56
fwereadeaxw, the watcher lives on happily in the background11:56
jamI guess if the underlying code *requires* read seaker11:56
katcojam: except the old function could not exist as is11:56
jamseeker11:56
katcojam: yeah exactly11:57
jamkatco: well you could always read and buffer11:57
dimiternkatco, dooferlad, ok, I'm back and responding in order :)11:57
katcojam: well, that's what causes the memory concern. that's what we were doing prior11:57
fwereadeaxw, the main loop observes that <-q.tomb.Dying(), and exits with ErrDying, whithout touching the watcher11:57
axwfwereade: ahh, I see. it makes sense when taken with your other comment11:57
katcodimitern: ty!11:57
fwereadeaxw, cool :)11:57
fwereadeaxw, and fwiw I don't think you're being slow, these types with internal goroutines are subtle and quick to anger11:59
katcodaughter is awake (at the appropriate time this time)... afk for a bit11:59
dooferladdimitern: no rush - was about to get lunch anyway11:59
dimiterndooferlad, cheers12:00
dimiternkatco, reviewed12:04
dimiternkatco, couldn't help much I'm afraid12:04
axwjam: if you come up with something better than the tick/tock, I'm happy to change the storage source. I'll fix the EnsureErr/watcher.Stop bit tomorrow12:06
jamaxw: well that patch hasn't landed has it?12:06
axwjam: which patch?12:06
axwjam: I mean, I'll change the storage source to match whatever you do for the relation one12:07
jamaxw: ah12:07
jamkatco: so you can add a new memory-efficient API, and leave the old inneficient one around for compatibility12:07
fwereadejam, just posted http://reviews.vapour.ws/r/1145/, please let me know if you spot any impedance mismatches with what you're doing with Filter12:09
jamfwereade: if it is run-leader-elected, I just merged it, the API fits12:09
fwereadejam, will be writing proper tests for it today once I'm ~functional again12:09
fwereadejam, cool12:09
fwereadejam, pushed an update or two since I last spoke to you12:09
fwereadejam, but I don't think I touched leader-settingschanged12:09
fwereadejam, would also appreciate a second opinion on http://reviews.vapour.ws/r/1134/12:11
fwereadejam, and if you run your merged branch and find that it looks like it roughly works, would you send jamespage a link to the code?12:13
fwereadejam, l-s-c is the important bit for him AIUI, l-e is just a bonus, but giving the structure a bit of a kick around would be good all the same12:14
katcodimitern: ty for the review12:21
katcodimitern: so i can't keep the old version. it will error if we don't seek back to 0 in the stream12:23
dimiternkatco, right, and it can't be used only for signV2 ?12:24
katcodimitern: correct. we deprecated v2 for s312:25
dimiternkatco, can you at least add PutReader that has the same signature, but panics when called (e.g. "use PutReadSeeker in v3") ?12:26
katcodimitern: absolutely12:26
dimiternkatco, +112:27
katcodimitern: well actually... typing this out. isn't a compile time error better than a runtime panic?12:28
dimiternkatco, depends I guess12:30
dimiternkatco, ok, let's suck it up then12:31
dimiternkatco, leave PutReader as you did it12:31
katcodimitern: or i could change the name, and any references to the old name would be compile-time errors12:31
dimiternkatco, and add a comment about the change (was there an issue # as well?) - in case somebody wonders12:31
katcodimitern: yes there was12:32
dimiternkatco, it will still be a compile time error if the name is the same but not the interface type12:32
katcodimitern: ok, so land it as-is?12:32
katcodimitern: or i'm sorry... with a comment12:33
katcodimitern: updated: https://github.com/go-amz/amz/pull/3712:36
dimiternkatco, with a comment, but how about those benchmarks?12:37
jamfwereade: well. my tests are currently hanging right in the middle, which I don't understand, as they should at least be timing out, but I'm trying to sort through it.12:37
jamI tried SIGQUIT but the scrollback is soo big12:37
jamah, debug has it, the Merge API request is hanging12:38
jamkatco: !!! :)12:38
dimiterndooferlad, reviewed, how about you to review mine ? :)12:39
katcojam: merge settings is hanging?12:39
jamyep12:39
jamI see the request go in, and no response12:39
jamdebugging now12:39
jamkatco: http://paste.ubuntu.com/10585240/12:40
katcodimitern: i don't know how long this will take, but we want these changes in before the cut of v1.23 today12:40
katcodimitern: i need to help jam out right now12:40
katcojam: do you have a link handy to show the code that generates this?12:42
fwereadejam, ah, I think I know what you need12:43
fwereadejam, run a WorkerLoop in the test12:43
katcothat would do it12:43
katcofwereade: i don't think i've actually said hi to you in awhile12:44
katcosorry i've been scattered all over the place12:44
fwereadekatco, not sure I have either, how goes it?12:44
katcoi am a bit o.012:45
katcoup all night with my daughter12:45
fwereadekatco, no worries, I have been in full basement-troll mode myself12:45
fwereadekatco, ouch12:45
katcolol12:45
katcocan i be a basement elf instead?12:45
jw4Galadriel Cox-Buday12:45
fwereadehaha12:46
katcohigh praise jw412:46
jw4:)12:46
dimiternkatco, ok, so land it then12:46
katcodimitern: at worst it will not make things worse12:46
dimiternkatco, you have my lgtm and I get it :)12:46
fwereadespeaking of trolls, if anyone was reading hpmor and then kinda dropped out because update delays, it's finishing in 2 days12:46
jamkatco: so is there an easy way to update leadership settings outside the API? I'd probably just work around this, but we'd want to fix it12:47
fwereadejam, I think we do want to fix it, but I'm shelving it under the general "we need to fix shared resources" heading in my mind12:48
katcojam: sure, let me grab some code12:48
jamfwereade: "fix it" being ?12:48
fwereadejam, ie an apiserver needs an explicitly-created and managed lease-manager to run, rather than just hoping someone else has started one in the background12:48
fwereadejam, katco: the action-at-a-distance thing of WorkerLoop is what makes me edgy12:49
jamkatco: so I added debugf statements, and I see it get to the point of checking if we're still the leader12:49
jamthen it hangs12:49
jamwithout getting a failure or success12:49
fwereadejam, are you running a WorkerLoop in the test?12:49
jamfwereade: TestLeaderSettingsEvents code12:49
jamdo you need another worker running?12:49
fwereadejam, yeah12:49
fwereade                workerLoop := lease.WorkerLoop(st)12:50
fwereade                return worker.NewSimpleWorker(workerLoop), nil12:50
fwereadeif you've got one of those running for the duration of the test it'll be fine12:50
jamfwereade: for the *api* not to hang?12:50
fwereadejam, yes12:50
jam*ugh*12:50
fwereadejam, hence what I was saying above about making the dependency explicit12:50
jamat th ev12:50
katcojam: the reason being there needs to be a lease manager instance running12:50
jamat the very least dying an ugly death of "nobody responding" would be far better12:50
katcojam: and the lease manager runs in a worker12:50
jamkatco: fwereade: wouldn't JujuConnSuite be the place to be starting this sort of thing?12:52
katcojam: to answer your other question, you can get leadership settings off of state, and then update that12:52
katcojam: currentSettings, err := st.ReadLeadershipSettings(serviceId)12:52
katcocurrentSettings.Update(rawSettings)12:52
katco_, err = currentSettings.Write()12:52
fwereadejam, katco: I guess that's probably better but I'd almost rather have the ugliness front-and-centre so we maintain an incentive to notice and fix it12:53
fwereadejam, katco: perhaps that is optimistic12:53
katcofwereade: you are saying there is an issue running the lease manager in a worker?12:53
fwereadejam, katco: but even so, what I hate about JCS is that it just sets up *everything* our code might need12:54
fwereadekatco, I'm saying that having the api able to hang because a completely separate worker hasn't been started is an issue12:54
katcofwereade: i don't think we can do much better than an error there. the two are coupled12:55
fwereadekatco, we could start it explicitly inside apiserver and that'd be ~fine, modulo all the yuckiness of writing ad-hoc Runners into our workers12:55
fwereadekatco, or we could [write some more infrastructure] and have it created and passed into the apiserver worker, and that'd also be fine12:55
fwereadekatco, it's really just the action-at-a-distance thing12:55
katcofwereade: what about simply returning bool if it's not setup yet, and then returning an error?12:56
fwereadekatco, (and, btw, it's not very safe -- I caused myself a bunch of confusion by accidentally starting two workers in a test)12:56
katcofwereade: of course that should never happen in an actual environment12:56
fwereadekatco, doesn't address the "why does package X arbitrarily depend on someone else calling some function in package Y"12:57
jamkatco: fwereade: now I'm getting "failed to merge leadership settings"12:57
jamseems wordpress/0 isn't the leader of wordpress ?12:57
jamfwereade: so I need to claim leadership and all that stuff first?12:57
jamI think I'll just write to state12:57
jamfar less setup overhead12:57
jamfwereade: ideally this sort of test wouldn't even need a state running12:58
katcofwereade: well, i mean that's just properly decoupled code, right? the leadership api has a dependency on the lease manager. best to keep those decoupled than to more tightly couple them for the sake of testing12:58
katcojam: yeah you have to be leader12:58
katcojam: dems the business rules12:58
jamkatco: being able to run an API server that will hang responding to API calls seems a bit broken12:58
katcojam: i'm claiming that's a separate issue; we can easily return an error there12:59
katcojam: also i'm questioning whether that situation will arise anywhere but full-stack testing scenarios12:59
jamkatco: I'm saying that having the Leadership API exposed means it should be running the stuff it needs underneath12:59
jamIMO12:59
fwereadekatco, they're coupled,  full stop. not very tightly, but they are; and so an explicit dependency, whether external ("you can't create an apiserver without a lease manager") or internal ("when starting an apiserver, start and track a lease manager") is desirable12:59
fwereadekatco, and IMO an external one is far superior13:00
fwereadekatco, that way you don't have the mess around monkey-patching global funcs to test the code with the dependency13:00
jamfwereade: agreed (I think we're saying the same thing from different POV)13:00
fwereadejam, yeah, I haven't perceived us disagreeing13:00
jamfwereade: well I disagree with you all the time, but that's just when you're wrong :)13:01
fwereadejam, this was one of the major motivations for what I've been saying about JFDIing a worker registry13:01
fwereadejam, ;p13:01
jamfwereade: on a different note, we have a small naming issue between LeadershipSetting and LeaderSetting13:01
jamfwereade: I have the feeling we should try to standardize somewhere13:01
fwereadejam, bugger, I have not been consistent, have I13:01
jamfwereade: the API you requested from me was LeaderSettingsEvents13:02
fwereadejam, I think LeaderSettings is perhaps more correct?13:02
jamthe uniter method is s.uniter.LeadershipSettings.Merge13:02
fwereadejam, they're the settings that are/were set by the leader, not the settinngs that pertain to the concept of leadership13:02
jamfwereade: I feel like LeadershipSettings would be "who is the leader" (settings about Leadership) vs "LeaderSettings" is what did the leader *say*13:02
fwereadejam, paraphrase jinx13:02
jamfwereade: :)13:02
jamfwereade: so I'll stick with LeaderSettings at this level, and we can look to push that down into apiserver/leadership/settings.go13:03
fwereadejam, shtm, thanks13:03
jamsounds hot to me ?13:03
fwereadejam, that was *exactly* what I though, and decided to let it stand :)13:04
jamso hairy thank mary?13:04
katcofwereade: sorry back13:04
fwereadekatco, np13:04
jamfwereade: so one thing that surprised me13:05
jamfwereade: Discard*Event13:05
jamblocks trying to send the discard message13:05
jamis that what we want?13:05
voidspacedimitern: ping13:05
fwereadejam, yes13:05
jamI guess so13:05
fwereadejam, it guarantees that by the time the func returns, the discard has completed, and we can safely read from the corresponding channel without getting another event until there's a real change13:06
katcofwereade: i agree with creating an apiserver and passing in a lease manager. i don't think i knew how to do that given the constraints of the apiserver instantiation at runtime13:06
jamfwereade: I really wish we didn't have 50 lines of simplestreams noise when a test fails...13:06
fwereadejam, me too13:06
jam(though that's a sign of running too much in jujuconnsuite13:06
fwereadekatco, yeah, exactly, that was where I fucked up13:06
fwereadekatco, I sorta threw it over the wall at you without making clear that the worker dependency was going to be a problem13:07
katcofwereade: but you can at least see the nice dependency tree being built up; it's completely possible if apiservers allowed for it13:07
dimiternvoidspace, pong13:07
katcoi.e. leadershipservice takes a leadershipmanager takes a leasemanager13:07
fwereadekatco, oh absolutely -- I think all our workers need that13:08
katcoIoC?13:08
katcoor the n-tier thing13:08
fwereadekatco, pretty much -- I think the dependency chains are potentially long13:08
fwereadekatco, in terms of things that can be usefully modelled as persistent shared resources13:09
katcothat's where DI comes in usually... with go we'd have to use factories13:09
fwereadekatco, pretty much everything needs (1) an api connection and (2) an environment in which we can guarantee we won't be running upgrade steps13:09
fwereadekatco, the apiserver needs (2), and a state connection, and a lease manager13:10
katcoyeah13:11
fwereadekatco, so I *think* I have some worthwhile ideas here and I hope to get onto them soon13:11
katcoawesome :)13:11
jamfwereade: so I'm trying to write a test that we can Discard the first leadership message13:11
jamhowever, the test gets to DiscardLeaderSettingsEvent before it actually gets the underlying first watcher event13:12
fwereadejam, don't start reading from the discard chan until you've got the first watcher event13:12
jamfwereade: is that *correct* or just good for testing?13:12
fwereadejam, I think it's correct13:13
voidspacedimitern: when we call StopInstances on the kvm / lxc brokers we use instance ids not container tags13:13
voidspacedimitern: so inside StopInstances we don't have that information13:13
fwereadejam, otherwise no client can create,discard without races13:13
voidspacedimitern: it's not obvious to me how to go from one to the other13:13
voidspacedimitern: it's provisionerTask that calls StopInstances and it's called with instance.Instance which doesn't have container id either13:14
fwereadevoidspace, whatever did the StartInstance ought to have somehow recorded the instance id assigned to the container13:14
voidspacefwereade: ok, I'll look - thanks13:14
jamfwereade: k, it *is* what discardConfig does, but I felt having the extra local variable was clumsy. will do13:14
fwereadevoidspace, instance id is the responsibility of the broker layer -- it creates them and returns them and ought to be able to understand them when passed back in13:14
fwereadejam, that whole type is clumsy :(13:15
jamheh13:15
voidspacefwereade: it's definitely "somehow recorded", it just might be that we need to change the provisioner api to take instance id and do the lookup itself13:15
jamits only 1000 lines in a single select loop13:15
jamsurely that can't be bad13:15
voidspacefwereade: the new provisioner api takes a container id13:15
jamfwereade: I will say, the need to have everything in the select is a bit of a shame13:15
jamyou can't farm out to other selects13:15
jamI guess if they all pipelined into a channel13:15
* fwereade scratches head at voidspace -- link me code please?13:15
voidspacefwereade: I'll look at how the two are associated when the instance is started13:15
voidspacefwereade: this just landed this morning, a new ReleaseContainerAddresses api13:16
voidspacefwereade: for releasing (with the provider and in state) and IP addresses statically allocated to the container13:16
voidspacefwereade: so it needs to be called on StopInstances13:16
dooferladdimitern: Oh, I had already looked at yours but I thought TheMue said he was doing the review. Looked fine, but I didn't dig into great detail in terms of code style, I was mostly interested in the new/changed functionality.13:16
voidspacefwereade: https://github.com/juju/juju/pull/1810/files13:16
* jam still loves that "remove my local changes" in git is "git co file"13:17
jamI *guess* checkout the file again sort of makes sense if you cross your eyes and pull your brain out through your nose...13:17
jamand it *is* the way you do it in SVN as well13:18
dimiterndooferlad, ok, so you can approve it :)13:18
dimiternvoidspace, let me have a look13:19
fwereadevoidspace, it looks to me like that code is itself getting instance ids from container ids because those ids are recorded in the state anyway?13:20
fwereadevoidspace, would you explain the problem again, I think I'm confused13:21
voidspacefwereade: that code needs the container id and the instance id13:22
voidspacefwereade: the ip addresses are allocated to the container id in state and the provider needs the instance id to do the release on the provider side13:22
voidspacefwereade: so we need both13:22
fwereadevoidspace, right, and given a container id it looks in the state to figure out the instance id assigned to it13:22
voidspacefwereade: that api needs to be called when we shut down a container - so at StopInstances time13:22
voidspacefwereade: yes, from a container id we can get instance id13:23
voidspacefwereade: but at StopInstances time we have instance id and not container id13:23
voidspacefwereade: so I either need the broker to go from instance id to container id13:23
voidspacefwereade: or change the api to take instance ids13:23
fwereadevoidspace, I don't *think* so, but let me marshal my thoughts for a couple of minutes13:24
voidspacecool13:24
jamfwereade: so, WantLeaderSettingsEvents() I'd like to discuss the proper way to enabled/disable them. As I think we decided you definitely want an event immediately if you pass true, even if we didn't think there was an event.13:24
jamfwereade: do you feel it is better to point-to/not-point to a Changes channel, always read from the channel but not send it on, or create a new watcher each time13:24
jamcertainly in handling a "true" case, we can just set outLeaderSettingsEventOn = outLeaderSettingsEvents and it will send an event.13:25
jambut it *might* send 2 events if there is also a Changes() pending, right?13:25
fwereadejam, queued13:26
fwereadevoidspace, context assumption13:27
fwereadevoidspace, we're stopping an instance, and we have to unassign all the addresses before we stop the instance, because we can't reference the instance once it's destroyed13:27
fwereadevoidspace, accurate?13:27
voidspacefwereade: not quite, we can do it afterwards - order isn't important13:28
voidspacefwereade: we just need to ensure that we release the ip addresses with the provider when we shut down the instance13:28
mupBug #1427342 changed: juju package should have strict version dep with juju-core <packaging> <juju-core:Won't Fix by sinzui> <https://launchpad.net/bugs/1427342>13:29
fwereadevoidspace, ok, who is it that's assigning these addresses?13:29
voidspacefwereade: the broker in StartInstance does13:29
fwereadevoidspace, dimitern: I can't remember how dynamic we're trying to be at the moment13:29
voidspacefwereade: by calling the provisioner PrepareContainerInterfaceInfo which requests an IP address from the provider and associates it with host nic13:30
fwereadevoidspace, dimitern: ok, so, looking ahead a little13:30
voidspacefwereade: the broker then sets up the iptables rules and container config so that IP address routes to the new container13:30
dimiternvoidspace, fwereade, so from an instanceId for an lxc container you can go back to the machine it13:31
fwereadevoidspace, dimitern: what's the plan for when we're adding new addresses to containers at runtime?13:31
dimiternvoidspace, fwereade it's always the same format: prefix+newMachineTag(containerMachineId).String()13:31
fwereadevoidspace, dimitern: that feels hard to integrate with the provisioner13:32
dimiternfwereade, voidspace, we haven't go that far13:32
dimiterngone13:32
fwereadedimitern, [that feels to me like a broker implementation detail on which we should not be depending?]13:32
voidspaceI agree13:33
dimiternfwereade, it's a container/lxc implementation detail13:33
fwereadedimitern, voidspace: ok, still thinking13:33
fwereadedimitern, voidspace: what's the lifecycle handling like for addresses?13:33
dimiternvoidspace, fwereade, you can get the prefix (usually "juju" but can be other as well) from the broker.manager.ContainerConfig() -> PopValue(container.ConfigName)13:34
dimiternfwereade, voidspace, we release addresses when their assigned machine is no longer alive (at least for containers - that's the first step)13:34
voidspacefwereade: the don't have a Life (as nothing else references them) - so they're created when allocated (currently only for a new container) and removed when the container is stopped13:35
voidspacefwereade: or at least they will be removed...13:35
sinzuidimitern, fwereade katco, mgz: do you have a minute to review a branch for the release of 1.22.0 http://reviews.vapour.ws/r/1146/13:35
dimiternsinzui, looking13:36
fwereadedimitern, voidspace: ok, my main worry here is that the provisioner shouldn't really be responsible for it13:36
mgzdimitern wins...13:36
fwereadedimitern, voidspace: looking ahead, we're going to want to be assigning and removing addresses dynamically13:36
fwereadedimitern, voidspace: I know we're not doing that yet, so having the assignment in the provisioner *for now* is probably ok13:37
dimiternsinzui, ship it13:38
fwereadedimitern, voidspace: but if we're just implementing the removal now, I think it would probably be cleaner for machine death to queue a cleanup that directs all associated addresses to the attention of some address worker13:38
dimiternfwereade, *for now* yeah - that's the key takeaway :)13:38
fwereadevoidspace, dimitern: yeah, I understand it's a viable shortcut for now13:38
dimiternfwereade, the plan going forward is to resurrect and fix the networker, which will take care of the dynamic config at run time13:39
fwereadevoidspace, dimitern: I'm just suggesting that my spidey sense says this is a networker's job, and that it's better to implement a half-networker for cleanup that to add the cleanup to the provisioner as well13:39
fwereadevoidspace, dimitern: which is already creaking under the weight13:39
fwereadevoidspace, dimitern: and will need to have the address logic extracted regardless13:40
fwereadevoidspace, dimitern: am I making sense here?13:40
voidspacefwereade: dimitern: so either put it in the broker for now and move it to the networker when that is resurrected, or create a new worker and move *that* to the networker later13:41
voidspacethe former seems like less busy work :-)13:42
dimiternfwereade, I agree putting that responsibility into the provisioner is not ideal and can only be temporary13:42
dimiternfwereade, voidspace, yes - I was going to suggest what voidspace just said13:42
voidspaceor resurrect the networker now with just this functionality enabled13:42
dimitern"the former" I mean13:42
fwereadejam, yes, it might send 2 events if a change happens soon enough before we discard13:42
dimiternvoidspace, unfortunately it's a wee bit beyond repair in its current state13:43
voidspaceah, ok13:43
jamfwereade: I'm trying to sort through it now, because my current code is sending 2 events when I set it to true13:43
fwereadejam, I think for all practical purposes the problematic case is at startup, though, and we get good behaviour by only accepting discards once we've had an initial change13:43
jamfwereade: I have that behavior no problem. I'm talking about Want(true) vs Want(false)13:44
fwereadejam, is the watcher being read from continuously, or do you switch off reading from the source when the filter's client doesn't want the results?13:44
jamfwereade: currently the watcher is read continuously but it doesn't forward on an event when it is set to false13:44
dimiternfwereade, so can we agree, for now to leave the cleanup of addresses in the brokers with a TODO + bug filed to move it into the networker ASAP13:45
jamfwereade: it seems to be a timing thing13:45
jamspecifically13:45
jamif you start with Want(false) and then watch for changes, you get nothing as expected13:45
jamand then making a change, again gets nothing as expected13:45
jambut internally that change hasn't actually been seen13:45
fwereadedimitern, if we risk messing around with magical conversions from instance id to container id, I'm not sure we can13:46
fwereadedimitern, but wait13:46
fwereadedimitern, the provisioner knows container id and instance id13:47
fwereadedimitern, making StopInstances stop magically-associated addresses feels like a Very  Bad Thing -- shouldn't the provisioner itself just loop through a dead machine;'s addresses, cleaning them up, and then finish off the machine?13:47
fwereadewe already have all the info we need there13:47
dimiternfwereade, we absolutely can pass instanceIds to ReleaseContainerAddresses API13:48
fwereadedimitern, but why is the api server talking to the env to release them in the first place?13:48
fwereadedimitern, the provisioner has the env right there13:48
dimiternfwereade, that way the provisioner having state access can find the container id from the instance id13:48
fwereadedimitern, it has the instance ids13:48
fwereadedimitern, the provisioner is already working in terms of container ids13:49
dimiternfwereade, because addrs are given to juju by the environment, and we keep track of them in state - i.e. we need access to both13:49
fwereadedimitern, the provisioner *already has an Environ* -- surely it should not be farming environ calls out to the api server13:50
fwereadedimitern, it can say "hey tell me about the addresses for this container id" as well13:50
fwereadedimitern, and then tell the Environ to clean up those addresses13:51
dimiternfwereade, no it doesn't actually13:51
dimiternfwereade, we're talking about the provisioner on the machine, which doesn't have access to the complete environ13:51
fwereadedimitern, oh, hell, sorry13:51
* fwereade recontextuallises13:51
voidspaceheh13:52
dimiternfwereade, so that's the obstacle there - and it felt ok to me to do it in the env provisioner, where we have both access to state and the environ13:53
fwereadevoidspace, dimitern: so, a container has (let's say just) one address13:53
fwereadevoidspace, dimitern: when its host is finished with the container, it needs to somehow inform the state server that the address is no longer required13:54
dimiternfwereade, voidspace, that's part of the story - the other part needs to happen on the host (cleaning up routes, etc.)13:55
fwereadevoidspace, dimitern: and the state server needs to somehow determine <x information> about that container's address, so that it can [unassign it from the instance?] and destroy the address itself13:55
voidspacefwereade: the host, state and the provider all need action13:56
voidspacefwereade: yes, currently that means - fetch all ip addresses associated with the container id from state, unassign them from *the host nic* with the provider and delete from state13:56
fwereadevoidspace, dimitern: I'm mainly focused on the interactions between the model and the environ here, I assume for the purposes of this discussion that host cleanup is fine and once it's no longer using the address it needs to make sure it's cleaned up somewhow13:57
dimiternfwereade, voidspace, yeah, and that information for now is just a doc in the ipaddressesC13:57
jamfwereade: I need to EOD, I think things are "working" but the double change is causing the test to break. Any chance I can hand off to you?13:57
fwereadejam, go for it, just push and remind me the branch13:57
fwereadedimitern, voidspace: forgive my slowness, but let me restate again13:58
dimiternfwereade, no worries, should we do a g+ in fact?13:59
fwereadedimitern, voidspace: we have: [a Dead container]; [a host that's just destroyed the container instance]; [a host machine with an address assigned for the dead container]13:59
fwereadedimitern, voidspace: is irc ok? I feel like I'll need to be doing some looking back13:59
voidspaceirc is fine with me13:59
dimiternfwereade, voidspace, fine with me too14:00
dimiternfwereade, voidspace, so what do you mean by that last part14:00
voidspacefwereade: you are correct, with the note that "host machine with an address assigned for the dead container" means that this address is assigned by the *provider* and needs unassigning there14:00
dimiternvoidspace, yeah, that was my point as well14:01
fwereadevoidspace, dimitern: so, this provisioner is *almost* ready to remove the Dead machine, but it doesn't want to do that until it's cleaned up everything that the machine depends on, lest they leak14:01
fwereadevoidspace, and it doesn't have an env connection itself14:02
fwereadedimitern, ^^14:02
voidspacefwereade: we don't really care about order - but we do need to ensure they don't leak14:02
dimiternfwereade, voidspace, so far, so good yeah14:02
fwereadevoidspace, that's what Dead means essentially14:02
fwereadevoidspace, nothing should be seeing or interacting with this entity except its appointed undertaker as it were14:02
fwereadevoidspace, ie the provisioner14:02
voidspaceheh to "appointed undertaker"14:03
fwereadevoidspace, dimitern: so it would seem like the thing to do would be to mark the address dying, and have another worker detect that, clean up the environ-side resource, and set it to dead14:04
fwereadevoidspace, dimitern: and in general to follow the model defined by machine, where there's a model object with a lifetime that may or may not have an associated environ reference14:05
fwereadevoidspace, dimitern: and in the long term that'd be how we'd create and assign them as well, I expect, but that doesn't have to happen now14:05
dimiternfwereade, voidspace, that *does* make sense to me14:05
fwereadevoidspace, dimitern: so in that case we should just be able to Destroy() all those model-side addresses and have a worker to clean them all up separately14:06
dimiternfwereade, voidspace, and fits well with the original job of the networker - watching for changes to NICs and addresses and reacting14:06
voidspaceexcept for allocation we can't add indirection - we must sychronously request the address and then do associated setup14:06
voidspaceas we don't know if allocation of any specific address will succeed14:06
voidspaceand we want the address at container  setup to write the configs correctly14:07
voidspacefor death it's fine14:07
dimiternyeah14:07
fwereadevoidspace, you don't have to do it now, but I don't think it's impossible -- in the easy static case you just gate container setup on address validity, and in the yucky dynamic case you're no more screwed than before ;p14:07
voidspacefwereade: dimitern: which is fine for the apiserver implementation, but *still* leaves the same question about what information to pass from the machine worker to the apiserver14:08
voidspacefwereade: yep, cool14:08
voidspaceand instance id is fine as we know the host / provider so the api server can get the container id14:08
voidspaceand that maybe the least effort to provide14:09
dimiternfwereade, I'd rather *not* gate container creation on address availability14:09
fwereadevoidspace, I think it's just a matter of giving the addresses tags (which I'm pretty sure should not include their ip, they should probably just be address-<uuid>)14:09
dimiternfwereade, we'll end up in a world of pain14:09
dimiternfwereade, even now, if we can't allocate a static IP we still start the container - it will be addressable from its host14:09
fwereadevoidspace, and then we can (1) ask for the machine's addresses, and get juju entity ids, which we then pass back up in a Destroy call14:10
fwereadevoidspace, I think?14:10
fwereadedimitern, disagree, I think14:10
fwereadedimitern, when we create a container, we know what addresses it'll need [in the static case]14:10
voidspacefwereade: so, are you suggesting that we watch for model side container death to Destroy the addresses14:10
dimiternfwereade, static addresses being available cannot be guaranteed14:11
voidspacefwereade: or that the provisioner requests destruction?14:11
voidspacebecause it would be much easier to not have the provisioner have to know all the addresses14:11
fwereadevoidspace, I think the provisioner requests destruction of those addresses because it knows they're no longer needed14:11
fwereadedimitern, right, but consider14:11
dimiternfwereade, yes, but we only try to allocate it when we're about to start it initially14:11
voidspaceso how is the provisioner going to know the addresses to request destruction of?14:11
fwereadevoidspace, I think it should be able to ask for them?14:12
voidspaceif it has to make an api call to get addresses associated with the container, and then request destruction of them14:12
dimiternvoidspace, by their life cycle value I guess14:12
fwereadevoidspace, although actually that is a pointless round trip14:12
voidspaceit might as well just make one call requesting the destruction of the addresses for this container14:12
voidspaceright14:12
fwereadevoidspace, dimitern: can we not just add a cleanup for machine that destroys the associated addresses and keep that all server-side?14:12
fwereadevoidspace, dimitern: the provisioner says it's done with the machine14:13
voidspaceif we're going to have a worker doing the cleanup it might as well just watch for machine lifecycle14:13
dimiternfwereade, yes! indeed we can14:13
fwereadevoidspace, we'll need dynamic addresses soon enough, let's not couple address lifetime to machine lifetime unless we have to14:13
dimiternvoidspace, I was thinking about it but then I forgot14:14
voidspaceso machine destruction can request Destroy on all addresses and an address watcher do the cleanup14:14
voidspaceand that leaves us free to *also* destroy addresses dynamically14:14
fwereadedimitern, about the gate-on-address-availability, I think I want to keep pushing a little14:15
dimiternfwereade, voidspace, which means we shouldn't try to remove any machine that still has allocated addresses to cleanup, rather than "delete cascade"14:15
fwereadedimitern, if we know we need a container with an address on X network, shouldn't we be getting that address before we even start to run the container? and shouldn't the lack of an address cause us to fail fast, ideally before the provisioner even sees the machine?14:16
dimiternfwereade, so far CA is a new feature and unobtrusive - it will work or not, but won't make things worse than before CA got introduced14:16
fwereadedimitern, hmm, good point14:17
fwereadedimitern, have people been using unaddressable containers though?14:17
dimiternfwereade, however, I can see your point as well - and makes sense14:17
voidspacefwereade: people have been using containers14:17
dimiternfwereade, yeah - except on maas they're unaddressable everywhere14:18
voidspacefwereade: the fact that they're now meant to be addressable shouldn't cause them to fail to be created14:18
dimiternfwereade, voidspace, exactly - at least at this point14:18
fwereadedimitern, voidspace: huh? I thought they were always addressable on maas?14:18
dimiternshould we decide to make "have a static address on subnet X" a requirement (e.g. like having lxc-ls installed) - then, yes14:18
fwereadedimitern, voidspace: by means of black magic and layer-breaking, true, but still14:18
voidspaceheh, right14:19
fwereadedimitern, but ok, shouldn't that still be a model-level consideration either way?14:19
dimiternfwereade, they were kinda addressable (mostly), before I started fixing stuff14:19
dimitern:)14:19
fwereadedimitern, so to preserve behaviour, you create machines with 0 addresses, so the provisioner doesn't have to wait for any of them14:19
fwereadedimitern, haha14:19
dimiternfwereade, it sounds like a constraint of a sort14:19
fwereadedimitern, well, yes -- I think the idea has always been that everything should have an address on the juju-private network14:20
fwereadedimitern, but yes there is a use case for unaddressable containers14:21
fwereadedimitern, and I suppose we shouldn't take away that ability14:22
fwereadedimitern, bah14:22
fwereadedimitern, I am still not comfortable with giving people unaddressable containers by default though14:22
dimiternfwereade, at least not right now I think :)14:22
fwereadedimitern, I really think that should be opt-in, and that fact that it's hitherto been the deafult is the bug14:22
dimiternfwereade, now we're not - we're making a best effort to allocate an address every time, if supported and have available14:23
mupBug #1430049 changed: unit "(AnyCharm)" is not assigned to a machine when deploying with juju 1.22-beta5 <oil> <openstack> <uosci> <juju-core:Fix Released by axwalk> <https://launchpad.net/bugs/1430049>14:23
mupBug #1431372 was opened: juju restore fails with "/var/lib/juju/agents: No such file or directory" <backup-restore> <juju-core:Triaged> <https://launchpad.net/bugs/1431372>14:23
fwereadedimitern, yeah, and that's cool, but it's unpredictable in practice -- even if it works 90% of the time, that 10% will be baffling14:24
fwereadedimitern, it's not very useful if all we can say is "your container might get an address"14:24
dimiternfwereade, it's very well logged every step of the way, but it has to be exposed more14:24
fwereadedimitern, "your container will get an address, *or* fail, *or* not get an address because you explicitly said you didn't want it" is more palatable14:25
dimiternfwereade, we also can't guarantee you won't get a broken instance from the environ14:25
fwereadedimitern, yay good logging, but it's not there to support the UX -- that should surface the important things directly14:26
dimiternfwereade, but it happens now and then on EC2 with CI14:26
dimiternfwereade, yes, I agree - address allocation should be exposed with the status / health checks at some poit14:26
dimiternpoint14:26
fwereadedimitern, different class of problem -- that's equivalent to "the provider said it gave you X address but it lied"14:27
fwereadedimitern, but my point I think is that a failure to allocate an address does indeed need to feed back into an error on the machine14:27
dimiternfwereade, from UX POV it's not "the provider" it's juju's fault :)14:27
dimiternfwereade, ok, I'll keep this in mind14:28
voidspaceso for the immediate problem14:28
dimiternfwereade, but unlike the explicit relation endpoints bindings in the net phase 1 model, we don't have a way for the user to express intent14:29
voidspacewe keep the provisioner api, called by the provisioner/broker14:29
voidspacebut change the server side to give addresses a life-cycle and a cleanup worker14:29
voidspace*or* we keep all the cleanup server side14:29
dimiternvoidspace, fwereade, for the immediate problem, let's agree to have a cleanup procedure as part of the machine cleanup in the cleanup worker14:29
voidspaceok14:29
dimiternit will get veeeery clean14:29
voidspace:-)14:30
voidspacecleaner the better14:30
voidspaceso we can revert my last merge14:30
fwereadevoidspace, dimitern: only concern: calling into the environ to do work from within cleanup14:30
fwereadevoidspace, dimitern: would be much more comfortable with a lifecycle watcher for addresses that just worries about dying->dead14:31
dimiternfwereade, why a concern?14:31
dimiternfwereade, yeah, +1 - that will be the next step - networker14:31
natefinchfwereade, dimitern, anyone else:  wwitzel3 and I are working on upgrading non-state servers into state servers... we have the code to update the data in mongo, is a watcher the correct way to have the intended server notice it's supposed to convert into a state server, or is there a better way?14:32
fwereadenatefinch, +1 watcher, I think14:32
dimiternvoidspace, it doesn't have to be reverted now - it can be molded into the cleanup procedure :)14:32
fwereadedimitern, voidspace: I am failing to articulate it, but it's really not the right layer14:32
fwereadedimitern, voidspace: we need to do dances to use the environs from inside state anyway14:33
dimiternnatefinch, yeah14:33
fwereadedimitern, voidspace: I'd prefer not to encourage it except for really important reasons14:33
dimiternfwereade, how about a worker then - akin to the instance poller14:34
fwereadedimitern, voidspace: namely preflighting env requests to help give good fast-failing UX in response to nonsensical requests14:34
fwereadedimitern, perfect14:34
fwereadedimitern, and, yes, it's *another* Environ in the same process, which is bad and wrong, but I think I have a plan for that14:34
dimiternfwereade, voidspace, great - then it's decided14:34
* fwereade cheers14:34
fwereadethanks guys14:34
voidspacelovely14:34
voidspacedimitern: please summarise :-)14:34
voidspacea new worker to do address cleanup14:35
dimiternfwereade, thank you!14:35
voidspaceaddresses to have a lifecyle14:35
dimiternvoidspace, just a sec14:35
voidspaceok14:35
dimiternvoidspace, we can do it with a life cycle or otherwise, but essentially a strings/notify  watcher worker14:36
dimiternvoidspace, which listens to changes to ipaddresses and responds by releasing those no longer needed14:36
dimitern(hmm it can even eventually also handle the allocation i guess)14:37
dimiternvoidspace, we need to have a way to signal that worker - having a life is the simplest and obvious choice14:38
dimiternvoidspace, however, we already have AddressState14:38
=== mgz is now known as mgz_
dimiternvoidspace, which is kinda like life - moving forward14:38
voidspacedimitern: although that changes more frequently14:38
voidspacedimitern: so the watcher would be triggered more often14:38
voidspacedimitern: but fine14:38
fwereadedimitern, voidspace: I would recommend using Life myself14:39
voidspacecool14:39
fwereadevoidspace, sad to say that wouldn't help with frequent triggering, but for coincidental implementation reasons rather than fundamental design ones iyswim14:39
voidspaceheh14:39
fwereadevoidspace, dimitern: *unless* we put the life field in a distinct doc14:39
voidspacethat all got a lot more meta than I anticipated14:39
voidspacebut thanks14:39
dimiternfwereade, voidspace, ok, for pre-existing docs it won't be there, which would mean "alive" I guess.. hmmm smells like an upgrade step for sure14:39
voidspaceright14:39
fwereadevoidspace, dimitern: yeah, think so14:40
voidspacewe could make alive the empty string14:40
voidspacethen there's no upgrade needed...14:40
voidspace;-)14:40
dimitern:) if only..14:40
fwereadevoidspace, dimitern: I would quite like life in a distinct doc, because trigger patterns, but the inconsistency bugs me14:40
voidspaceso a separate collection with an ipaddress reference and a life14:41
dimiternvoidspace, fwereade, ok, that starts to sound to me like the old idea we had about an "environ-level networker"14:41
voidspaceI would *really* like to go to lunch14:41
fwereadedimitern, voidspace: fwiw Alive *is* iota14:42
dimiternvoidspace, sure :)14:42
voidspaceso if there are any more changes, dimitern can communicate them to me - but it sounds like we're there14:42
dimiternfwereade, nice!14:42
dimiternyes I think so14:42
fwereadedimitern, voidspace: but I would rather just write the fiueld properly than depend on unmarshalling, sometimes we write code that looks in the db not just the local docs, I think it's a landmine14:42
voidspaceyep, agreed14:42
fwereadedimitern, voidspace: so upgrade step please14:42
dimiternfwereade, +114:43
dimiternvoidspace, I'll summarize the steps needed and we can discuss them tomorrow?14:43
mupBug #1431401 was opened: Juju appears hung when using the local provider for the first time <juju-core:New> <https://launchpad.net/bugs/1431401>14:53
sinzuixwwt, alexisb dimitern natefinch fwereade : 1.23 has a pass. Do you want to branch? Now, or wait for more features/fixes to merge14:58
alexisbsinzui, lets branch14:59
=== kadams54 is now known as kadams54-away
dimiternsinzui, I have one more fix, but it will get merged after branching just the same15:00
katco 15:00
katco/query alexisb15:00
dimiterndooferlad, that's yours ^^15:00
=== kadams54-away is now known as kadams54
alexisbdimitern, fixes can still go in 1.23 :)15:00
alexisbjust no more features with with the exceptions fo the ones the release team and I have agreed to :)15:01
alexisbkatco, you looking for me?15:01
dimiternalexisb, yeah :)15:01
sinzuidimitern, master is 1 commit beyond out test. You need to merge dimitern/ca-fixes-clone in the 1.23 I create. If you are fine with this, I will create the branch this minute15:20
dimiternsinzui, ok, I can do this15:22
sinzuidimitern, done. https://github.com/juju/juju/commits/1.2315:23
dimiternsinzui, cheers15:24
dimiternsinzui, is the merge bot monitoring 1.23 yet?15:36
sinzuidimitern, adding it right now in fact15:36
dimiternit is15:36
dimitern:) great timing sinzui15:36
xwwtsinzui: branch it.15:37
=== kadams54 is now known as kadams54-away
mattywjam, do you have a moment to do a review for me? http://reviews.vapour.ws/r/1005/15:49
mattywsinzui, is landing in master blocked?16:02
mattywsinzui, or rather - it appears to be blocked but I didn't think it was16:03
sinzuimattyw, no, but try again, the bug is fix16:03
natefinchmattyw: Why does get meter status return a string for the code and not a MeterStatusCode?  Why even have the type if you're not going to use it?16:04
mattywnatefinch, where are you looking? it used to return strings but in theory I've made changes to use the types now16:05
natefinch(I know that's not your change here, just curious)16:05
mattywnatefinch, you looking at 1005?16:05
natefinchmattyw: yes16:07
natefinchmattyw: https://github.com/juju/juju/pull/1677/files#diff-8319a651ebe0aac63caf50325cef6903R11216:07
mupBug #1430898 changed: run-unit-tests ppc64el timeout <ci> <gccgo> <ppc64el> <regression> <test-failure> <juju-core:Fix Released> <https://launchpad.net/bugs/1430898>16:29
mupBug #1431444 was opened: juju run results contain extraneous newline <juju-core:New> <https://launchpad.net/bugs/1431444>16:29
natefinchyou know your product is becoming successful when people start complaining about newlines16:31
=== kadams54 is now known as kadams54-away
=== kadams54-away is now known as kadams54
=== kadams54 is now known as kadams54-away
=== kadams54-away is now known as kadams54
natefinchwow, just stumbled on cmd/jujud/agent/machine.go which has 77 import statements :/18:21
natefinchwow, google code is shutting down18:27
natefinchhttp://google-opensource.blogspot.com/2015/03/farewell-to-google-code.html18:27
bodie_davecheney, http://reviews.vapour.ws/r/1139/ should be satisfactory now :)18:28
bodie_natefinch ... wow18:28
perrito666natefinch: I think that is the nature of all google projects18:31
lazyPowernatefinch: is there way to block on juju run until the service is up?18:45
lazyPoweror do i need to write some glue code the parse juju status until the state is "started"18:45
natefinchlazyPower: hmm18:48
lazyPoweryeah, i kinda thought i'd have to write some glue code. nbd - just wanted to make sure i wasn't reinventing something that already existed18:54
natefinchlazyPower: Juju run might work to keep your refresh of juju status from running more often than you need... but I'm not sure that juju run will block until the service is started.19:01
lazyPoweryeah it actually returns "No service" so its getting run before the service is even deployed. Either way i'm going to have to block myself.19:01
lazyPowerand it doesn't make sense for juju run to block when the service doesn't even exist19:01
natefinchahh yeah, that makes sense19:01
bogdanteleagacan apt proxy/mirror settings change while an instance is running?19:22
natefinchbogdanteleaga: probably19:23
bogdanteleaganatefinch: have any idea where's the code that deals with it?19:24
natefinchbogdanteleaga: I didn't say we had code to deal with it, I said I wouldn't be surprised if it could happen :)19:25
natefinchbogdanteleaga: what would you need to do in response to a change?  Isn't the point of the proxy just that apt will now download stuff from somewhere else?19:26
bogdanteleaganatefinch: well I found some stuff that's done wrt it when the instance is booted through cloudinit19:27
bogdanteleaganatefinch: not sure about changes after that19:28
=== kadams54 is now known as kadams54-away
=== kadams54-away is now known as kadams54
natefinchthumper: got a little time to talk about converting servers into state servers?  having an auth problem logging into the mongodb, wondering if you might have insight as to how to fix that up20:25
thumperum... sure20:29
natefinchhaha20:30
natefinchjust hoping you'll think of something I've missed :)20:30
thumpernatefinch: hangout?20:31
natefinchthumper: https://plus.google.com/hangouts/_/canonical.com/moonstone?authuser=120:31
=== kadams54 is now known as kadams54-away
=== kadams54-away is now known as kadams54
perrito666oh man, I was also interested in natefinch and thumper conversation, I have a similar problem20:55
=== kadams54 is now known as kadams54-away
=== kadams54-away is now known as kadams54
thumperperrito666: you missed an awesome conversation. Pure magic.21:04
perrito666thumper: Well I dont doubt there was magic :p I was hoping for a bit of engineering though ;)21:18
thumperperrito666: nah, not much of that21:27
* perrito666 shrugs and puts his gandalf suit21:28
=== robbiew is now known as robbiew-afk
=== robbiew-afk is now known as robbiew
perrito666I am going to make a very generic question so please dont hate me and, if you think you have any answer whatsoever that might help please contribute it21:47
perrito666I want to change an api server machine tag, what else, besides changing the tag in agent config, you think I should do?21:48
perrito666thumper: ok, apparently my browser spell checker thinks your name should be Time, sorry22:13
thumper")22:13
thumperperrito666: why change the machine tag?22:13
thumperperrito666: you will also need to change the symlink for the tools22:14
perrito666thumper: I already have code that does that22:14
thumpermy question though is why?22:14
perrito666thumper: because, when you restore you cannot guarantee that the freshly created state server has the same tag than the old one22:14
* thumper has a theory22:14
thumperhmm...22:14
perrito666buuut22:15
perrito666I do think there should be another way22:15
thumperdoes the upstart script mention the machine id anywhere?22:15
perrito666thumper: not that I recall22:15
thumperif you change the tag of the machine, it will no longer be able to connect to the api server22:16
thumperbecause the password will not work22:16
perrito666I have to make a change in one of two, either the backed up data or the newly created machine22:16
thumpera mongo user is created for the machine with a particular password22:17
thumperthat user is the machine tag IIRC22:17
perrito666yes22:17
thumperon the physical machine, the places are the agent config file, and the link to the tools22:17
thumperI think that is all22:17
thumperactually22:17
thumperI think the machine tag is mentioned in the service file22:17
perrito666what worries me is the values inside the mongo22:18
thumperat least the name of the script22:18
thumperyes, the password will be wrong in mongo22:18
=== kadams54 is now known as kadams54-away
thumperthere are quite a few moving parts there22:19
perrito666you just made me realize that, since I am changing everything in the newly created machine, I could just change whatever reference is not replaced by the backup data22:19
perrito666and that would save me from actually changing the tag22:19
perrito666that leaves the question I asked in the email22:20
perrito666I need admin privileges to jumpstart the rs22:21
perrito666and iirc, some of the changes I do are also admin-ish22:22
thumperperrito666: I only just noticed the email, been focused on doing HR stuff22:43
thumperI'll think about it and try to have a well thought out response :)22:43
perrito666thanks, that way I can feel worse about just copy pasting the email22:44
bodie_any chance we can get this bugfix merged for 1.23?  https://bugs.launchpad.net/juju-core/+bug/143161223:05
mupBug #1431612: Action defaults don't work for nil params <actions> <defaults> <juju-core:New> <https://launchpad.net/bugs/1431612>23:05
mupBug #1431612 was opened: Action defaults don't work for nil params <actions> <defaults> <juju-core:New for binary132> <https://launchpad.net/bugs/1431612>23:15

Generated by irclog2html.py 2.7 by Marius Gedminas - find it at mg.pov.lt!