/srv/irclogs.ubuntu.com/2015/09/10/#juju-dev.txt

moqq(in the logs for the stalled agent it's going crazy with many “panic: runtime error: invalid memory address or nil pointer dereference” and their stack traces)00:00
thumperwallyworld: ha00:14
wallyworldthumper: i'll raise a bug00:14
thumperwallyworld: never ran or never did the right thing?00:14
wallyworldnever ran00:14
thumperugh00:15
wallyworldthere's a check in the steps that the tag passed in is a unit tag00:15
thumperhaha00:15
wallyworldand it exits if not00:15
thumperoops00:15
wallyworldyeah00:15
wallyworldwe cargo culted a 123 step for 126 and it didn't run00:15
wallyworldand it caused an issue in CI and that's how we found out00:15
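[Editor's note] The failure mode described above, where a guard on the tag kind makes a copied step return early without error, can be sketched as follows. This is a minimal illustration under assumptions: `runUnitStep` and its string-prefix check are hypothetical and are not juju's upgrade code; only the `unit-...` / `machine-...` tag spelling follows juju's convention.

```go
package main

import (
	"fmt"
	"strings"
)

// runUnitStep is a hypothetical upgrade step, not juju's actual code.
// It reports whether it did any work so the silent skip is visible.
func runUnitStep(agentTag string) (ran bool, err error) {
	// Guard carried over from an older step: only unit agents proceed.
	if !strings.HasPrefix(agentTag, "unit-") {
		// Returning nil means the caller sees success even though
		// nothing happened, which is why the problem went unnoticed.
		return false, nil
	}
	// ... the real migration work would go here ...
	return true, nil
}

func main() {
	for _, tag := range []string{"unit-mysql-0", "machine-0"} {
		ran, err := runUnitStep(tag)
		fmt.Printf("%s ran=%v err=%v\n", tag, ran, err)
	}
}
```

Returning an explicit error (or at least logging) when the tag kind is unexpected would make a skipped step show up earlier than a CI failure.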
thumperheh00:16
mwhudsonwallyworld: want to test a package that makes the race detector work?00:17
wallyworldmwhudson: sure :-)00:18
* thumper is being summoned for lunch00:18
mwhudson(maybe, this is the first package i've ever created from scratch i think)00:18
thumperbbl00:18
mwhudsonwallyworld: http://people.canonical.com/~mwh/go-race-detector-runtime_229396-0ubuntu1_amd64.deb00:20
* wallyworld downloads00:20
mwhudsonwallyworld: caveat emptor but at worst it should simple not install / not help00:20
mwhudson*simply00:20
wallyworldmwhudson: it worked :-)00:22
mwhudsonwallyworld: omg00:22
wallyworldhave some faith in your own awesomeness :-)00:22
davechen1yis the build blocked ?00:41
mupBug #1494070 opened: unit agent upgrade steps not run <juju-core:Triaged> <juju-core 1.25:Triaged> <https://launchpad.net/bugs/1494070>00:42
natefinch-afkdavechen1y: http://juju.fail01:05
wallyworldniemeyer: hey gustavo, you around?01:07
davechen1ynatefinch-afk: nice01:08
niemeyerwallyworld: Heya01:08
niemeyerwallyworld: Sorry, haven't had a chance to look at the issue you mailed me about, but I will01:09
wallyworldniemeyer: just wondering about mongo 2.6 onwards not liking $ in the doc field names01:09
wallyworldnp01:09
wallyworldjuju seems to run ok01:09
wallyworldbut mongo says it is unhappy01:09
wallyworldwhen you run the compatibility check tool01:09
niemeyerwallyworld: Need to check the details01:09
wallyworldnp, i'll wait to hear back. ty01:10
niemeyerwallyworld: Thanks for your patience, and for the report01:10
wallyworldsure, np. i know you are busy01:10
=== natefinch-afk is now known as natefinch
thumperdavechen1y, mwhudson: are either of you actively using rugby? There is an RT to upgrade its firmware02:26
davechen1ythumper: go for it02:34
natefinchQuick poll: creating a struct to hold function arguments to avoid code churn for all the millions of places where tests call this function every time it changes.... good idea or bad idea?02:53
natefinch(the function is state.AddService)02:54
natefinch(technically a method not a function)02:54
natefinch_thumper, wallyworld, davechen1y: ^ ?03:03
davechen1ynatefinch_: sounds abstract03:04
davechen1yhard to comment03:04
wallyworldnatefinch_: i like structs to hold args03:04
davechen1yas in03:04
davechen1yi don't think i understand what you are asking03:04
thumperme neither03:05
natefinch_davechen1y: instead of func foo(a int, b string, c instance.Something)    you have func foo(args FooArgs)    where FooArgs is just   type FooArgs struct{  A int, B string, C instance.Something }03:05
natefinch_the idea is that if we expect the arguments to the function to change often with optional parameters getting added to the end, this prevents code churn in the 100 tests that use this function only for the most basic functionality.03:06
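[Editor's note] A minimal sketch of the argument-struct idea natefinch describes. The names `AddServiceArgs`, its fields, and the stand-in `AddService` function are assumptions made for illustration; they are not the real `state.AddService` signature.

```go
package main

import "fmt"

// AddServiceArgs is a hypothetical argument struct; the field names are
// illustrative and are not taken from juju's state package.
type AddServiceArgs struct {
	Name      string
	Owner     string
	NumUnits  int
	Storage   map[string]string // optional; nil means "no storage"
	Placement []string          // optional; nil means "no explicit placement"
}

// AddService stands in for the real method. It only describes what it
// was asked to do so the example stays self-contained.
func AddService(args AddServiceArgs) string {
	return fmt.Sprintf("%s owner=%s units=%d storage=%d placement=%d",
		args.Name, args.Owner, args.NumUnits, len(args.Storage), len(args.Placement))
}

func main() {
	// A basic test names only the fields it cares about. Adding another
	// optional field to AddServiceArgs later leaves this call untouched.
	fmt.Println(AddService(AddServiceArgs{Name: "wordpress", Owner: "admin"}))
}
```

Because omitted fields take their zero value, the many call sites that only exercise basic behaviour do not need to change when a new optional field is added.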
=== natefinch_ is now known as nateinch
=== nateinch is now known as natefinch
natefinchNotably, AddService is going from 5 arguments to 8 in my change, and there's a TODO for modifying it further (from someone else)03:07
davechen1ynatefinch: what is changing that often ?03:07
davechen1ynatefinch: are most of those arguments usually the defaults ?03:07
natefinchdavechen1y: every time we add a new feature - storage, spaces, different kinds of placement etc03:07
davechen1ynatefinch: sounds like you should be using functional arguments03:07
davechen1ybut if you don't want to do that, then use a struct03:08
natefinchdavechen1y: in the tests, yes. There's about 40 places where we do AddService(something, something, something, nil, nil, nil, nil)03:08
davechen1ymethods/functions with many parameters are a smell03:08
davechen1ysounds like all those parameters have a sensible default, the zero value03:08
natefinchindeed03:08
natefinchthus a struct is easy and obvious.  Functional arguments are kinda obscure, though I do sorta like them03:09
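[Editor's note] For comparison, here is a rough sketch of the "functional arguments" alternative davechen1y mentions, usually called functional options in Go. Everything here (`serviceConfig`, `Option`, `WithUnits`, `WithSpaces`, and this `AddService`) is hypothetical and only illustrates the pattern.

```go
package main

import "fmt"

// serviceConfig is a hypothetical collection of optional settings.
type serviceConfig struct {
	name     string
	numUnits int
	spaces   []string
}

// Option adjusts the config before the service is created.
type Option func(*serviceConfig)

// WithUnits overrides the default unit count.
func WithUnits(n int) Option {
	return func(c *serviceConfig) { c.numUnits = n }
}

// WithSpaces appends space names to the config.
func WithSpaces(names ...string) Option {
	return func(c *serviceConfig) { c.spaces = append(c.spaces, names...) }
}

// AddService applies the defaults first, then each option in order.
func AddService(name string, opts ...Option) serviceConfig {
	cfg := serviceConfig{name: name, numUnits: 1}
	for _, opt := range opts {
		opt(&cfg)
	}
	return cfg
}

func main() {
	fmt.Printf("%+v\n", AddService("mysql"))
	fmt.Printf("%+v\n", AddService("mysql", WithUnits(3), WithSpaces("db")))
}
```

The trade-off raised in the discussion applies: options allow non-zero defaults and keep basic call sites short, while a plain struct is more obvious to readers who have not met the pattern.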
mupBug #1493850 changed: 1.22 cannot upgrade to 1.26-alpha1: run.socket: no such file or directory <1.22> <blocker> <ci> <regression> <run> <upgrade-juju> <juju-core:Fix Released by cmars> <https://launchpad.net/bugs/1493850>03:09
davechen1ynatefinch: your call03:10
davechen1yone request03:10
davechen1ypass the configuration structure _by value_03:11
davechen1yso callers cannot hold a reference to it03:11
davechen1yconfig := service.Config { .... }03:11
davechen1yAddService(config)03:11
natefinchyep, I'm a bigbfan of passing by value03:12
natefinchbig fan03:12
* thumper agrees with davechen1y03:13
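[Editor's note] A small sketch of why passing the configuration by value matters. The `Config` type and `AddService` function below are hypothetical stand-ins for the `service.Config` snippet above.

```go
package main

import "fmt"

// Config is a hypothetical configuration struct.
type Config struct {
	Name     string
	NumUnits int
}

// AddService receives its own copy of cfg. Filling in defaults here
// does not touch the caller's variable, and later changes by the
// caller cannot reach the copy that was stored.
func AddService(cfg Config) Config {
	if cfg.NumUnits == 0 {
		cfg.NumUnits = 1
	}
	return cfg
}

func main() {
	config := Config{Name: "mysql"}
	stored := AddService(config)
	config.Name = "changed-later"
	fmt.Println(stored.Name, stored.NumUnits, config.NumUnits)
}
```

One caveat: only the struct itself is copied. Map and slice fields inside it still share their underlying data with the caller, so those would need an explicit copy if isolation is required.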
davechen1yis trunk still blocked ?03:13
davechen1yapparently not ;)03:14
wallyworlddavechen1y: i unblocked master03:15
wallyworldmarked bug as fix released03:15
mupBug #1493850 opened: 1.22 cannot upgrade to 1.26-alpha1: run.socket: no such file or directory <1.22> <blocker> <ci> <regression> <run> <upgrade-juju> <juju-core:Fix Released by cmars> <https://launchpad.net/bugs/1493850>03:15
davechen1ywallyworld: ta03:16
mupBug #1493850 changed: 1.22 cannot upgrade to 1.26-alpha1: run.socket: no such file or directory <1.22> <blocker> <ci> <regression> <run> <upgrade-juju> <juju-core:Fix Released by cmars> <https://launchpad.net/bugs/1493850>03:21
wallyworldthumper: replied, hopefully it adds some extra context to the discussion04:13
thumperta04:16
wallyworldawesome, unity crashed, reboot time04:17
davecheneyhere is a small one http://reviews.vapour.ws/r/2622/04:31
davecheneyhttps://bugs.launchpad.net/juju-core/+bug/149412104:39
mupBug #1494121: worker/uniter/remotestate: data race  <juju-core:New> <https://launchpad.net/bugs/1494121>04:39
mupBug #1494121 opened: worker/uniter/remotestate: data race  <juju-core:New> <https://launchpad.net/bugs/1494121>04:48
ericsnowdavecheney: could you spare me a review of a backport of a 1.25 patch of yours: http://reviews.vapour.ws/r/2624/05:14
davecheneyericsnow: looking05:31
axwwallyworld: what's the problem with the race detector in 1.5? works on my machine05:39
wallyworldaxw: i have no idea. i just saw the bug. davecheney ^^^^ are you using the packaged go 1.5?05:39
axw(I built from source FWIW)05:40
wallyworldaxw: there is an issue in the go 1.5 packaged for wily. race detection is broken. i installed a patch from mwhudson to fix it for me05:41
axwwallyworld: ah I see05:41
wallyworldthe breakage though is just compiling the test binaries i think05:41
wallyworldaxw: fwiw, i tried the race detector just before and got the same issue as the bug05:42
axwwallyworld: yes, I can repro. I was just curious05:42
axwwallyworld: going to take a break from azure to fix it05:42
wallyworldtyvm05:42
axwwallyworld: I just got a failure in WatcherSuite.TestActionsReceived too ... not a data race, just a failure05:42
wallyworldoh dear05:43
axwwallyworld: is this really Critical? it's a data race in a test, not in the code under test05:48
wallyworldaxw: i think the policy is that all data races are to be considered critical (or at least that was the case)05:54
wallyworldnow that we have the count at 0, we need to maintain that05:55
wallyworldbut i could be misremembering the policy05:55
axwwallyworld: ok. does not seem more important to me than any other shitty tests, but ok ;)05:55
axwfound the issue anyways05:55
wallyworldaxw: it came about i think because of the need to get it to 0 so we could get upstream to fix that gccgo bug05:56
wallyworldso any data race was totally unacceptable05:56
urulamawallyworld: that's the Entity in charm store: https://github.com/juju/charmstore/blob/v5-unstable/internal/mongodoc/doc.go06:00
wallyworldlooking06:00
* wallyworld has to go to do school pickup, bbiab06:01
urulamawallyworld: i'm taking kids to school as well06:01
davecheneyaxw: http://reviews.vapour.ws/r/2625/diff/#06:15
davecheneyi don't get it06:15
davecheneythis change just adds a test06:15
axwwallyworld: no, it splits part off the end of a test06:15
axwand gives it a name06:15
axwerr sorry, davecheney06:15
axwdavecheney: hang on, I'll point out the problem line06:16
davecheneyoh i see06:16
axwdavecheney: https://github.com/juju/juju/blob/master/worker/uniter/remotestate/watcher_test.go#L374    <- here we trigger a watcher, it wakes up and we expect it to do nothing interesting.. but it will go off and read s.st.storageAttachment06:17
axwdavecheney: the test was overloaded anyway, hence I split it06:17
davecheneygot it06:18
davecheneythanks06:18
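[Editor's note] The change under review fixed the race by splitting the test. As a generic illustration of the class of problem, where a test writes a field while a woken watcher goroutine reads it, the sketch below guards the shared field with a mutex. The `mockState` type and its methods are assumptions for this example and do not mirror the code in watcher_test.go.

```go
package main

import (
	"fmt"
	"sync"
)

// mockState is a hypothetical test double shared between the test
// goroutine and a watcher goroutine.
type mockState struct {
	mu                sync.Mutex
	storageAttachment string
}

// setStorageAttachment is what the test would call.
func (s *mockState) setStorageAttachment(v string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.storageAttachment = v
}

// getStorageAttachment is what the watcher would call when it wakes up.
func (s *mockState) getStorageAttachment() string {
	s.mu.Lock()
	defer s.mu.Unlock()
	return s.storageAttachment
}

func main() {
	st := &mockState{}
	var wg sync.WaitGroup
	wg.Add(1)
	go func() {
		defer wg.Done()
		// Without the mutex, this read racing with the write below is
		// the kind of access `go test -race` reports.
		_ = st.getStorageAttachment()
	}()
	st.setStorageAttachment("data/0")
	wg.Wait()
	fmt.Println(st.getStorageAttachment())
}
```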
davecheneyaxw: http://paste.ubuntu.com/12326655/06:20
davecheneystill racey06:20
axwdavecheney: that one I'm fixing now06:25
axwdavecheney: may as well put it in the same PR I guess06:26
axwdavecheney: oh, I just saw the action failure... I see. storage is still racy06:27
davecheneyyea06:32
davecheneythanks06:32
axwdavecheney: should be fixed now. that one was a bug in the watcher, not the test. fixed the action test while I was there.06:46
davecheneylooking06:47
davecheneyaxw: lgtm06:52
axwdavecheney: thanks06:53
davecheneyaxw: http://reviews.vapour.ws/r/2622/06:53
davecheneyhow about one in return06:53
axwdavecheney: just a move, no code change?06:54
davecheneyyup06:54
davecheneymoving the bzr pack to utils06:54
davecheneyit doesn't need to be in juju/juju06:54
axwdavecheney: agreed, LGTM06:54
fwereadewallyworld, worried about https://bugs.launchpad.net/juju-core/+bug/148367207:47
mupBug #1483672: Allow charms to associate structured data with status <cloud-installer> <landscape> <juju-core:Fix Committed by hduran-8> <juju-core 1.25:Fix Committed by hduran-8> <https://launchpad.net/bugs/1483672>07:47
wallyworldwhich bit?07:48
fwereadewallyworld, apparently we've just implemented rich status without a spec?07:48
wallyworldsay wot?07:48
wallyworldwe allowed name/value pairs to be added for non error status07:48
fwereadewallyworld, yeah07:49
wallyworldwe talked about this when the api was being discussed and neither of us recalled a good reason for disallowing it07:49
wallyworldand landscape wanted it07:49
fwereadewallyworld, that's rich status, except that it doesn't take any of sabdfl's requirements for it into account07:49
wallyworldi must admit i don't see the connection straight up - i'd have to find a rich status spec07:50
fwereadeoutput documents?07:51
fwereadethat is more or less an output doc07:51
wallyworldno, really?07:51
fwereadeexcept it's not persisted usefully07:51
wallyworldi don't see it as that at all07:51
fwereadewell, we've just grabbed the spelling sabdfl wanted for rich status07:52
wallyworldaren't output docs a totally different semantic than allowing a charm to record why it is in maintenance07:53
fwereadepossibly07:53
fwereadebut we are now expressly using the spelling earmarked for a different feature, but implemented with completely different semantics07:54
=== mgz is now known as _mgz
frobwareTheMue, I had to move our 1:1 today as I have a conflict.08:10
TheMuefrobware: yeah, just discovered. it's ok08:10
frobwareTheMue, I have a P&C induction session. Also missing the standup too.08:12
TheMuefrobware: P&C?08:12
frobwareTheMue, HR08:12
* TheMue missed that acronym08:12
TheMuefrobware: ah, thx08:13
frobwareTheMue, People & Culture to be more specific.08:13
TheMuefrobware: which definitely sounds better than Human "Resources" *iirks*08:13
TheMuefrobware: hmm, trusting my calendar we're overlapping with the core meeting08:16
=== akhavr1 is now known as akhavr
TheMuefrobware: but as long as we don't need more than 30 minutes it fits08:16
frobwareTheMue, ah, I see. I'm not in the meeting and I have another at 12 which is why it ended up where it is. 30 mins should be good. Otherwise I have back-2-back meetings from 9-3.08:18
fwereadewallyworld, ...and it looks like we've invented a new convention for passing k/v data into hook tools?08:19
TheMuefrobware: from my side it's enough, yes. currently mostly focussed on final pre-vacation tasks08:19
wallyworldfwereade: yaml isn't it?08:19
fwereadewallyworld, yeah -- where else do hook tools accept yaml?08:19
wallyworldrelation-set i thought08:19
fwereadewallyworld, definitely not08:20
fwereadewallyworld, relation-set letter=y08:20
wallyworldso just kv pairs then08:20
wallyworldi thought i was told it was yaml08:21
fwereadewallyworld, that, and the action-set stuff08:21
wallyworldit can be changed to kv pairs08:21
fwereadewallyworld, which is the existing convention for arbitrary structured data08:21
fwereadewallyworld, *but* the rich status plans included output schemas for the arbitrary structured data08:21
fwereadewallyworld, and we don't have anything like that08:22
fwereadewallyworld, *and* we've just implemented a new side channel for peer-relation-like data08:22
wallyworldfwereade: so relation-set does read yaml from a file08:22
wallyworldjust not from cmdline08:22
fwereadewallyworld, yes, via a --file arg08:23
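[Editor's note] The `key=value` command-line convention fwereade refers to (as in `relation-set letter=y`) could be parsed along these lines. This is a sketch under assumptions: `parseSettings` is a hypothetical helper, not the hook tool's implementation, and it ignores the YAML `--file` path entirely.

```go
package main

import (
	"fmt"
	"strings"
)

// parseSettings is a hypothetical helper that turns key=value
// arguments into a map. Only the first "=" separates key from value,
// so values may themselves contain "=".
func parseSettings(args []string) (map[string]string, error) {
	settings := make(map[string]string, len(args))
	for _, arg := range args {
		parts := strings.SplitN(arg, "=", 2)
		if len(parts) != 2 || parts[0] == "" {
			return nil, fmt.Errorf("expected key=value, got %q", arg)
		}
		settings[parts[0]] = parts[1]
	}
	return settings, nil
}

func main() {
	settings, err := parseSettings([]string{"letter=y", "message=a=b"})
	fmt.Println(settings, err)
}
```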
wallyworldi think that's where the confusion came from08:23
wallyworldfwereade: so do we or do we not want to fix that landscape reported bug08:24
wallyworldi mean, this fix brings the cli into line with the api08:24
wallyworldthe api now allows kv pairs with arbitrary status values08:25
wallyworldit didn't before08:25
wallyworldand the hook tools didn't so were inconsistent08:25
wallyworldwith the api08:25
fwereadewallyworld, I honestly think the landscape bug is essentially a feature request for rich status08:26
fwereadewallyworld, it's certainly not an invitation to occupy a bunch of the hook-env design space without consultation or spec or any apparent consistency with what went before08:27
wallyworldso we can revert the commit. but that still leaves api inconsistent08:27
wallyworldthe inconsistency was a mistake, should have been kv08:28
wallyworldthe intent was not to occupy design space without a spec - it was to fill in an inconsistency between api and li08:29
wallyworldcli08:29
frankbanhi all core devs: could you please take a look at https://github.com/juju/juju/pull/3249 (initial bundle deployment support)? thanks!08:29
wallyworldfwereade:  because someone could write a cli client to do the same thing as we are now going to disallow from a hook tool directly08:30
fwereadewallyworld, I don't think the api implementation details have any reason to affect the hook environment we expose08:30
wallyworldsee above08:30
wallyworldsomeone could backdoor it08:30
wallyworldconsistency is good08:31
wallyworldso maybe we should revert the api changes to again disallow it08:31
fwereadewallyworld, params.SetStatus has accepted data for a good couple of years now?08:34
wallyworldonly for error states08:34
wallyworldthere were explicit checks in code08:34
wallyworldremember how we talked about this?08:34
wallyworldand couldn't see a reason to continue that behaviour?08:34
fwereadewallyworld, right... still not seeing how this means "let's change the data model and interaction patterns for the hook context"08:37
fwereadewallyworld, you can't just add stuff to the hook env without thinking about it08:37
fwereadewallyworld, is this intended to be the complement to leader-settings? in a way, it's kinda cool that it is08:38
wallyworldit was merely meant to bring the cli in line with the api08:38
fwereadewallyworld, I don't think that's a relevant consideration08:38
fwereadewallyworld, if you want you can write an api client that impersonates the unit and sets it to dead08:39
wallyworldotherwise people would just backdoor it anyway08:39
wallyworldsure, i meant with the SetStatus api08:39
wallyworldnot the api in general08:39
fwereadewallyworld, then they'd be dumb to do so, because they'd be depending on arbitrary implementation details08:39
wallyworldpeople use upload-tool08:39
fwereadewallyworld, and it would only work half the time anyway08:39
wallyworldseems easiest to revert for now08:40
fwereadewallyworld, I guess :(08:41
wallyworldbut how do we give landscape what they want08:41
wallyworldfwereade: about the side channel comment. i'll use the same argument as you used - "they'd be dumb to do so". i guess people can always find a way to manipulate a system.08:50
fwereadewallyworld, does the word "affordance" mean anything to you?08:51
wallyworldso it comes down to - do we or do we not want to allow status other than error to have a little bit of extra data besides a human string08:51
fwereadewallyworld, ofc we do08:51
fwereadewallyworld, but this is not an ok way to do it08:51
wallyworldit seemed ok as it uses the current tool08:52
wallyworldwith extra params analogous to the api08:52
fwereadewallyworld, right08:53
fwereadeso now a bunch of charms will fail in surprising ways on old jujus08:53
wallyworldunless we implement min version08:53
wallyworldwhat's the status of that?08:53
fwereadewallyworld, seems like it's been deprioritised again :-/08:54
wallyworldso suggestions then08:54
wallyworldhow would we implement this08:54
fwereadewallyworld, (1) stop and think -- who is this side channel for, what data should it contain, who is notified of changes and how, what are the consequences of that08:55
fwereadewallyworld, what we've done here08:56
fwereadewallyworld, is create the first channel that outputs both to users and to other units in the service08:56
wallyworldwhich can be done via the api08:57
wallyworldso that's not really a key rebuttal08:57
fwereadewallyworld, well, no08:57
fwereadewallyworld, you're exposing the status data dict to the leader08:57
fwereadewallyworld, it is now suddenly a programmatic control channel, with new, surprising, and undocumented semantics, that will inevitably start to contain sensitive data08:58
wallyworldonly if people put it there08:58
wallyworldit's only a control channel if people misuse it that way08:59
fwereadewallyworld, no08:59
fwereadewallyworld, we control the environment08:59
fwereadewallyworld, we control the data that goes in and out08:59
wallyworldexcept when we don't09:00
fwereadewallyworld, if we put a big radio in the room, marked "messages from minions you can't get any other way", but tell people they shouldn't use it, we're being actively user-hostile09:00
fwereadewallyworld, btw, did we ever implement minion-status-change watching?09:01
fwereadewallyworld, don't think I've seen any code for it09:01
fwereadewallyworld, making service-status work reliably should, I think, take precedence over adding new ways for it to break the model further09:02
fwereadewallyworld, or did we explicitly decide that service status should be composed from arbitrarily out-of-date unit statuses?09:04
fwereadewallyworld, sorry, we're probably desynchronised, I have been whinging down an empty pipe09:07
fwereadewallyworld, can we go back to the "except when we don't" bit?09:09
fwereadewallyworld, the point of the hook environment is to provide the underlying guarantees that let juju work09:09
wallyworldsorry, i keep getting disconnected the past 1 hour or so09:10
fwereadewallyworld, like, if we tell you some information, we will also tell you when that information has changed09:10
fwereadewallyworld, and, we will rabidly restrict the information you are allowed to access, because every side channel we provide is an *official* side channel -- we know that people will use everything we provide, so we only provide things we're willing to build a proper eventual-consistency convergence model for09:11
fwereadewallyworld, and every piece of information you can access *without* a mechanism for seeing when it's changed is, basically, a bug09:12
wallyworldfwereade: maybe my network will stay up for a bit09:22
fwereadewallyworld, ...so, did status-get always have --include-data?09:22
wallyworldnot sure, i'd have to check09:22
fwereadewallyworld, seems like it wasn't added in that CL so I guess it's oldish09:23
wallyworldyeah, it has been there a bit09:23
fwereadewallyworld, and, yeah, I suppose *that* is not bad in isolation09:23
fwereadewallyworld, until we turned it into a subtly-broken variant of a minion-settings bucket, anyway09:24
wallyworldit's not supposed to be a settings bucket09:25
wallyworldit's not settings09:25
fwereadewallyworld, but you've made it one09:25
wallyworldi guess people could misuse it that way09:25
fwereadewallyworld, it's a data channel from one unit to another09:25
wallyworldonly if misused09:25
fwereadewallyworld, you expose it in status-get, therefore you evidently want people to use that data09:26
wallyworldi think a unit can only get its own settings09:26
fwereadeyeah but the service gets all of them09:26
wallyworldso it can aggregate overall state using the individual unit status09:26
fwereadewallyworld, right -- and09:27
fwereadeoh god09:27
fwereadeit's not the workload status, is it?09:27
fwereadeit's the workload status or maybe the agent status09:27
wallyworldunit and agent have separate status09:28
fwereadeso it's a doubly unreliable channel because we'll hide any important data whenever the agent gets into an error state09:28
wallyworldyeah because the spec is "wrong"09:29
fwereadeehh, the spec didn't even bother to consider that case09:29
fwereadeit was all in terms of what we expose to the user09:29
fwereadeit's bad enough that we lie to the user09:29
wallyworldno, i meant that we were told the workload had to reflect the agent error09:29
fwereaderight, and we got explicit agreement from the very top that that was a UX consideration and shouldn't have to impact the model09:30
wallyworldright, so we do store separate status always09:30
wallyworldit is a ui thing09:30
fwereadewallyworld, different user, different interface09:30
wallyworld?09:31
fwereadewallyworld, lying to end users because we think they can't handle the truth is just kinda dumb09:31
wallyworldi agree09:31
wallyworldbut we were told to do it09:31
fwereadewallyworld, actively subverting the mechanism we use to tell the service what its components are up to is actively broken09:31
fwereadedammit I have to get to the shops before my meeting-block starts, bbiab09:34
dimiternTheMue, you've got a review09:35
dimiternaxw, are you around by any chance?09:36
axwdimitern: hiya, I am09:36
TheMuedimitern: thx09:37
dimiternaxw, about that change in provider common about subnets and zones, do you have a few minutes to discuss it?09:40
axwdimitern: yes, sure09:40
axwdimitern: hangout or here?09:40
dimiternaxw, here's fine09:40
dimiternaxw, I'm open to a better solution - basically we need to take into account 3 things: 1) zone placement; 2) units auto distribution across zones; 3) spaces constraints (implying a given list of subnets to use)09:41
dimiternaxw, while 1) when given overrides 2), but can cause an error if it conflicts with 3)09:41
dimiterns/, but can cause/, it can also cause/09:42
dimiternaxw, and since most of that is happening in AvailabilityZoneAllocations, I was thinking it's the least obtrusive solution to just give it a list of subnet ids (if spaces constraints are given and the provisioner already populated SubnetsToZones map in StartInstanceParams)09:43
axwdimitern: sorry, just need to clarify. you want to prevent the user from forcing a machine into a zone when it specifies constraints?09:44
axwdimitern: if so - I'm pretty sure up until now the idea has been to ignore constraints when placement is specified09:45
dimiternaxw, ah, well that sounds sane09:45
dimiternaxw, however it might be surprising, if we at least don't issue a warning09:45
axwdimitern: IMO we just need to consider auto-placement in the face of those constraints09:45
axwdimitern: could be helpful I guess, but I don't think it's worth introducing more concepts into the AZ handling code. I'd prefer to see a more general way of indicating that a zone is not valid09:47
dimiternaxw, e.g. consider you do $ juju deploy postgres --constraints spaces=db,^apps --to zone=one-of-the-zones-not-matching-db09:47
axw(not a valid choice for those constraints)09:47
dimiternaxw, right, so if we can detect the conflict at deploy time we fail early (or proceed with a warning), rather than at provisioning time09:48
axwdimitern: equally in MAAS you can do "juju deploy postgres --constraints mem=1024M --to puny-node"09:48
axwbut yes, it would be ideal to fail early09:48
axwdimitern: well 1024M is puny, but you know what I mean :)09:49
dimiternaxw, right :)09:49
dimiternaxw, ok, so about AvailabilityZoneAllocations..09:50
dimiternaxw, you're suggesting to change it to call AvailabilityZones() before InstanceAvailabilityZoneNames() ?09:50
axwdimitern: nope. I assumed you were going to modify AvailabilityZoneAllocations to call SubnetsAvailabilityZoneNames, and ignore any results from AvailabilityZones that are not in that09:52
axwdimitern: is that right, or am I way off?09:52
dimiternaxw, that was my plan, yes09:52
dimiternaxw, so in case both candidates []instance.Id and subnetIds []network.Id are given, we call InstanceAvailabilityZoneNames() for the former and SubnetsAvailabilityZoneNames() for the latter09:53
axwdimitern: my only issue is that I'm not sure how many providers that will make sense for09:54
dimiternaxw, and finally return []AvailabilityZoneInstances (which should grow a Subnets field []network.Id, like it has Instances)09:54
dimiternaxw, well, SAZNs() will be called only if the provider supports spaces, as otherwise the SubnetsToZones StartInstanceParams field won't be populated by the provisioner09:56
axwdimitern: ok, but we are forcing all the implementers of ZonedEnviron to implement SubnetAvailabilityZoneNames09:56
dimiternaxw, right, I see your point - it might be better to have SubnetsAvailabilityZoneNames() as a package-level func09:58
dimiternaxw, but then we need to extend common.AvailabilityZone to have SubnetIDs() method09:59
axwdimitern: I think if DistributeInstances (and AvailabilityZoneAllocations) were passed a function to filter out invalid zones, that'd work?09:59
axwdimitern: not even necessarily to AvailabilityZoneAllocations. the filtering logic could be done in DistributeInstances alone10:00
dimiternaxw, I'm not sure about that10:00
dimiternaxw, DistributeInstances is called in state when assigning a unit, right?10:00
axwdimitern: yes. it would need to query the instances to determine their subnets.10:01
axwdimitern: yeah both functions would need the filter, one for add-machine, one for deploy/add-unit10:02
dimiternaxw, hmm..10:02
axwdimitern: team meeting time, if you're coming10:02
dimiternaxw, but would that be enough for StartInstance to do the right thing?10:02
dimiternaxw, ah, yeah, with a callback it will10:02
dimiternaxw, omw10:02
dimiternaxw, thanks, I'll work out a sketch of what we discussed and propose it - will ping you to have a look tomorrow10:18
axwdimitern: thanks, sounds good10:18
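[Editor's note] axw's suggestion of passing a filter function, rather than teaching the zone-handling code about subnets, might look roughly like the sketch below. All names (`ZoneFilter`, `zonesForSubnets`, `filterZones`) and the `map[string][]string` shape used for the subnets-to-zones data are assumptions for illustration; the real `DistributeInstances` and `AvailabilityZoneAllocations` signatures are not shown in the log.

```go
package main

import "fmt"

// ZoneFilter reports whether a zone is a valid choice. A nil filter
// means every zone is acceptable.
type ZoneFilter func(zone string) bool

// zonesForSubnets builds a filter that accepts only zones reachable
// through at least one of the given subnets. The map shape is assumed.
func zonesForSubnets(subnetsToZones map[string][]string) ZoneFilter {
	allowed := make(map[string]bool)
	for _, zones := range subnetsToZones {
		for _, zone := range zones {
			allowed[zone] = true
		}
	}
	return func(zone string) bool { return allowed[zone] }
}

// filterZones keeps the candidate zones the filter accepts, preserving
// their order. The distribution code would call this before choosing.
func filterZones(zones []string, valid ZoneFilter) []string {
	if valid == nil {
		return zones
	}
	var out []string
	for _, zone := range zones {
		if valid(zone) {
			out = append(out, zone)
		}
	}
	return out
}

func main() {
	filter := zonesForSubnets(map[string][]string{
		"subnet-db-1": {"zone-a"},
		"subnet-db-2": {"zone-c"},
	})
	fmt.Println(filterZones([]string{"zone-a", "zone-b", "zone-c"}, filter))
}
```

With this shape, the zone code only needs to know "is this zone valid?", so providers without spaces support can pass a nil filter and keep their current behaviour.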
perrito666I definitely want one of these at home https://pbs.twimg.com/media/CDC9FfDWEAAYgfS.jpg:large12:07
TheMuedimitern: time for a quick HO regarding one of your review comments?12:42
perrito666fantastic, something decided that I wanted my locale in spanish12:48
dimiternTheMue, not right now - we have another call with the MAAS guys in less than 10m12:50
TheMuedimitern: ok, only need infos about the full stack feature test. I've got them in there.12:51
dimiternTheMue, I meant in featuretests/ - e.g. cmd_juju_space_test.go12:52
TheMuedimitern: yes, there I do have them12:52
TheMuetoo12:52
TheMuedimitern: TestSpaceCreateNotSupported and TestSpaceListNotSupported12:52
dimiternTheMue, then it's fine - no follow-up needed :)12:53
TheMuedimitern: hehe, ok, thx12:53
dimiternTheMue, and you can drop the test and helper around the "not supported" case via the supercommand12:54
TheMuedimitern: you mean RunSuperNotSupported? it only has been a convenience helper for Run12:56
dimiternTheMue, yeah12:56
dimiternTheMue, both not supported cases are tested in the subcommand tests12:57
dimiternTheMue, and testing it via the supercommand running create|list when not supported is covered in featuretests/12:57
TheMuedimitern: yes, I needed this helper due to the ErrSilent12:57
mgzrogpeppe: are you around to talk charm/charmstore dependencies?13:31
rogpeppemgz: sure13:31
rogpeppemgz: wanna hangout?13:32
mgzrogpeppe: sure13:37
mgzrogpeppe: hm, http://paste.ubuntu.com/12328486/13:58
pmatulisre 'upgrade-juju --version',  ① how to get a list of available versions and ② what logic is used to pick a version?14:05
mgzrogpeppe: bumping github.com/juju/schema to version as in juju/juju works14:05
perrito666pmatulis: 1) juju upgrade-juju --dry-run14:05
rogpeppemgz: i'll push a better version of dependencies.tsv14:09
mgzrogpeppe: ta14:10
pmatulisperrito666: did that already. it gives me versions according to some algorithm, not according to a forced version (--version). at this time the output is14:10
mgzyou can save the landing to be a test run of the gating14:10
pmatulisno upgrades available14:10
perrito666:(14:10
pmatulisperrito666: 'xactly14:11
perrito666pmatulis: current version on the server?14:12
pmatulisperrito666: my agents are currently running 1.22.8, if that's what you meant14:13
perrito666pmatulis: yes thank you14:13
rogpeppemgz: https://github.com/juju/charm/pull/15214:16
perrito666mgz: did you ever review the patch I proposed to fix the migration of status history??14:17
mgzrogpeppe: that's `godeps > dependencies.tsv` with your current working set of deps?14:19
rogpeppemgz: pretty much, yes14:19
rogpeppemgz: i've landed it14:19
mgzperrito666: I did read the branch, was past my eod so was hoping someone else would pick it up14:19
rogpeppemgz: it should be ok now14:19
perrito666mgz: no one did14:19
perrito666how sad14:19
mgzperrito666: I can +1 but would like someone else to look as well, I am well removed from this code14:20
perrito666np14:20
mgzrogpeppe: you didn't let me use the change as a guinea pig... ;_;14:21
rogpeppemgz: i can back it out...14:21
mgzrogpeppe: nah, I can test without needing a real landing14:21
rogpeppemgz: ok, cool14:21
mgzactually, I'll just use 150, it's nice and trivial14:25
TheMuedimitern: found and removed it, now using your RunCreate and a similar RunList to not swallow the expected error. has been the needed hint, thx.14:26
mgzrogpeppe: worked,14:28
mgzhttps://github.com/juju/charm/pull/15014:28
mgzhttp://juju-ci.vapour.ws:8080/job/github-merge-juju-charm/1/console14:28
rogpeppemgz: awesome, thanks14:29
mgzperrito666: you seem to have some review comments from the antipodes14:35
perrito666m?14:35
katcoxwwt: ping14:35
mgzperrito666: trivial stuff14:36
perrito666mgz: indeed, nice anyway14:36
dimiternTheMue, cheers :)14:37
* dimitern can't stand what cloudconfig/userdatacfg_test.go has become - turns out we're not even testing what non-ubuntu series InstanceConfig looks like14:40
dimiternI'm fixing this and adding centos7 tests14:41
natefinch<fwereade> natefinch, ping me when you come on and make me talk about the queued-action watcher and why it's good/bad and should be copied/not14:44
fwereadenatefinch, ah, bother, I haven't looked at it properly14:45
natefinchfwereade: np14:45
katcofrobware: last meeting wrapped up a bit early if you have time now14:47
mupBug #1494356 opened: OS-deployer job fails to complete <blocker> <ci> <regression> <juju-core:Triaged> <juju-core 1.25:Triaged> <https://launchpad.net/bugs/1494356>14:47
frobwarekatco, sure14:47
frobwarekatco, meh. let me restart chrome...14:48
katcofrobware: haha k14:49
fwereadenatefinch, ok, so, actions use a thing called an idPrefixWatcher14:49
fwereadenatefinch, which is a StringsWatcher14:49
fwereadenatefinch, but which behaves differently to other StringsWatchers14:49
frobwarekatco, hehe. "There is a problem connecting to this video call. Try again in a few minutes.". joy.14:49
katcofrobware: doh! possibly your auth. expired? seems to happen a lot14:50
natefinchfrobware: make sure you're using the right account, it might not be using your canonical account14:50
natefinchfwereade: ok14:50
katcofrobware: i see you in the meeting which is odd... now 2 of you :p14:50
fwereadenatefinch, so many StringsWatchers are on sets of lifecycle entities14:52
natefinchfwereade: ug, this sounds like it's encoding data in the ID and then relying on parsing the ID to re-extract that data.... can we avoid doing that?  It's always bit me in the past14:53
fwereadenatefinch, and they notify by sending the appropriate entity ids in response to enter-set, change-life-to-dying, and remove-or-set-dead14:53
fwereadenatefinch, I am keen to hear alternatives14:53
fwereadenatefinch, but we have some fun restrictions14:53
fwereadenatefinch, like, we have to be incredibly stingy with db access in the watchers14:54
fwereadenatefinch, because any time a watcher is not selecting on the channel it registered, it might be blocking *every other watcher*14:55
fwereadenatefinch, that said14:56
fwereadenatefinch, in this case I actually think we don't have to14:56
mupBug #1494356 changed: OS-deployer job fails to complete <blocker> <ci> <regression> <juju-core:Triaged> <juju-core 1.25:Triaged> <https://launchpad.net/bugs/1494356>14:56
fwereadenatefinch, or, well, hm14:56
natefinchfwereade:  I am probably misunderstanding something - I thought watchers were per-collection... and since this is a new collection with a single purpose... do we really need to do more than get the ID from the channel and use it to do the work it needs to do?14:57
fwereadenatefinch, well, if the id does encode the data that's all we need14:57
fwereadenatefinch, if it's a nice opaque id we have to hit the db to find out anything useful14:57
fwereadenatefinch, and, fwiw, yes, almost all the watchers just look in one collection14:58
natefinchfwereade: how does one watch block all the other watchers?14:59
fwereadenatefinch, but they're all sharing an underlying mechanism, with which they interact by registering/unregistering channels to receive events14:59
fwereadenatefinch, and the underlying watcher just loops over everything that's registered for each event and delivers them all in sequence15:00
fwereadenatefinch, I'll give you a moment to recover ;p15:00
natefinchfwereade: so is it blocking all other watchers or all other watchers of the same collection?15:01
fwereadenatefinch, all other watchers15:01
fwereadenatefinch, there's just one state/watcher.Watcher15:01
natefinchso it's like our very own global interpreter lock15:02
fwereadenatefinch, one instance of which backs all the various watchers defined in state15:02
fwereadenatefinch, yeah, close enough :)15:02
natefinch...this statement coming from someone who knows nothing about the GIL except it's bad and stops multithreading ;)15:02
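[Editor's sketch] The blocking behaviour described above can be illustrated with a small Go program. It is not the real state/watcher.Watcher: the hub type, deliver and deliveryDelay are hypothetical names, and the only assumption taken from the discussion is that one shared loop delivers each event to every registered channel in sequence.

```go
package main

import (
	"fmt"
	"time"
)

// hub is a hypothetical stand-in for a single shared watcher: it holds
// every registered channel and delivers each event to them in order.
type hub struct {
	registered []chan string
}

// deliver sends the event to each registered channel in sequence. The
// channels are unbuffered, so a receiver that is not currently reading
// holds up delivery to everyone after it.
func (h *hub) deliver(event string) {
	for _, ch := range h.registered {
		ch <- event
	}
}

// deliveryDelay measures how long one delivery takes when the first
// receiver is busy for the given duration before it reads its channel.
func deliveryDelay(busy time.Duration) time.Duration {
	slow := make(chan string)
	fast := make(chan string)
	h := &hub{registered: []chan string{slow, fast}}

	start := time.Now()
	done := make(chan struct{})
	go func() {
		time.Sleep(busy) // e.g. a watcher hitting the db instead of selecting
		<-slow
	}()
	go func() {
		<-fast // always ready, but still queued behind the slow receiver
		close(done)
	}()

	h.deliver("doc-changed")
	<-done
	return time.Since(start)
}

func main() {
	busy := 50 * time.Millisecond
	fmt.Println("fast receiver waited for the slow one:", deliveryDelay(busy) >= busy)
}
```

The fast receiver never does anything wrong here; it is delayed purely because it sits behind a slow one in the delivery loop, which is why the watchers are asked to be stingy with db access.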
* fwereade once wrote a bridge between GILful and GILfree python interpretations; that was fun, but most of the time the GIL-handling is safely out of the way of actual code15:05
fwereadenatefinch, the same mitigation strategy probably applies though15:05
fwereadenatefinch, run a bunch of them and distribute your requests among them so no one instance can lock everything up15:06
fwereadenatefinch, although ofc that's a tad wasteful15:06
fwereadenatefinch, taste and discretion required :)15:06
fwereadenatefinch, so, anyway15:07
natefinchfwereade: well, if you say it's for the best, I believe you... but that sort of sounds like we need to encode everything in the ID15:07
mupBug #1494356 opened: OS-deployer job fails to complete <blocker> <ci> <regression> <juju-core:Triaged> <juju-core 1.25:Triaged> <https://launchpad.net/bugs/1494356>15:08
fwereadenatefinch, the forces very often push us that way, yes :(15:09
fwereadenatefinch, however, in this case I think we *can* quite happily use opaque ids -- uuids or something15:09
fwereadenatefinch, and send those out from the watcher15:10
fwereadenatefinch, the worker doesn't *have* to do anything more than send up a call saying "please run this list of assignment request ids"15:11
fwereadenatefinch, and figure out what retry strategy it needs to handle failures15:11
natefinchfwereade: right, so the worker just passes the id to the API and the API handles it.15:12
fwereadenatefinch, yeah, exactly -- and my expectation here is that we have one global assigner, so there's no benefit to encoding classification data in the id anyway15:13
fwereade(one assigner per env, rather((15:13
fwereade)))15:13
fwereadenatefinch, I think the most important part is going to be surfacing failures in the unit status, and knowing how we go about retrying; and making sure that, arrgh, we don't break the interaction between unit status and fast-forward unit destruction15:15
fwereadenatefinch, am I saying helpful things?15:17
=== _mgz is now known as mgz
natefinchfwereade: yep15:18
fwereadenatefinch, cool -- so, to go back to watcher semantics15:19
fwereadenatefinch, I think it's fine to have a StringsWatcher that sends [initial set] on first event, and [newly added ids] on subsequent events15:19
fwereadenatefinch, and I *think* that one is so simple as to be best implemented standalone (well, on top of commonWatcher)15:20
fwereadenatefinch, the main thing is to hunt down the existing StringsWatchers and make sure that the events they're signalling are clearly documented15:20
fwereadenatefinch, (that's something that should have happened when we added the action watcher, sorry I missed it)15:23
fwereadenatefinch, actually, it looks like many of them are already documented correctly15:24
fwereadenatefinch, and we can follow the same form15:24
fwereadenatefinch, // WatchAssignmentQueue returns a StringsWatcher that notifies of every item added to the queue.15:25
fwereadenatefinch, or something15:25
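[Editor's sketch] The semantics proposed above (initial set on the first event, newly added ids afterwards) could look roughly like this. It is an in-memory illustration only: queueWatcher, Add and Stop are hypothetical names, nothing here uses commonWatcher or the database, and Stop must be called at most once.

```go
package main

import "fmt"

// queueWatcher is a hypothetical, in-memory stand-in for a StringsWatcher
// over an assignment queue. It is not juju's implementation.
type queueWatcher struct {
	changes chan []string
	added   chan string
	done    chan struct{}
}

func newQueueWatcher(initial []string) *queueWatcher {
	w := &queueWatcher{
		changes: make(chan []string),
		added:   make(chan string),
		done:    make(chan struct{}),
	}
	go w.loop(initial)
	return w
}

func (w *queueWatcher) loop(initial []string) {
	defer close(w.changes)
	// The first event carries the initial set, even when it is empty.
	pending := append([]string{}, initial...)
	out := w.changes
	for {
		select {
		case <-w.done:
			return
		case id := <-w.added:
			pending = append(pending, id)
			out = w.changes // there is something to send again
		case out <- pending:
			pending = nil
			out = nil // a nil channel disables this case until an id arrives
		}
	}
}

// Changes returns the channel on which batches of ids are delivered.
func (w *queueWatcher) Changes() <-chan []string { return w.changes }

// Add records a newly queued id; it is dropped if the watcher has stopped.
func (w *queueWatcher) Add(id string) {
	select {
	case w.added <- id:
	case <-w.done:
	}
}

// Stop shuts the watcher down and closes the Changes channel.
func (w *queueWatcher) Stop() { close(w.done) }

func main() {
	w := newQueueWatcher([]string{"a", "b"})
	defer w.Stop()
	fmt.Println("initial:", <-w.Changes())
	w.Add("c")
	fmt.Println("added:", <-w.Changes())
}
```

Ids added before the consumer reads the first event are folded into that first event, so the consumer never needs to distinguish "initial" from "new" beyond treating every id it receives as work to do.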
fwereadenatefinch, have you written watchers before?15:26
xwwtHi katco15:27
natefinchfwereade: sorry, had to step away for a second.15:30
natefinchfwereade: I have, but it was a long time ago at this point15:31
fwereadenatefinch, been a while for me too :)15:33
fwereadenatefinch, I think the important considerations here are: (1) as always, go for at-least-once-delivery, so start the watch before reading initial state15:34
fwereadenatefinch, (2) expect bursty writes, so do that thing where we keep sucking events off the watch channel for a few ms before sending a batch15:35
fwereadenatefinch, (3) send out events as uuids (if that's what you pick) -- but not raw internal ids, or tags, even though we'll want to send them back up as tags15:36
fwereadenatefinch, because even though it's dumb that we send out state-client ids over watcher channels15:36
fwereadenatefinch, we should address this with a watcher-event-translation layer in apiserver15:37
fwereadenatefinch, rather than pervert the state watchers by making *some of them* return api tags15:37
fwereadenatefinch, sane-ish?15:38
fwereadenatefinch, re (2): updates, ok := collect(ch, in, w.tomb.Dying())15:39
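[Editor's sketch] The body of the collect helper called above is not shown in this log. The function below, collectIDs, is a hypothetical simplification of the batching idea in point (2): starting from one id already received, keep reading for a short fixed window, then hand back the whole batch. The signature, the string ids and the window parameter are all assumptions made for illustration.

```go
package main

import (
	"fmt"
	"time"
)

// collectIDs gathers ids for a fixed window after the first one arrives.
// It returns false if stop is closed before the window elapses. This is
// an illustrative sketch, not juju's collect function.
func collectIDs(first string, more <-chan string, stop <-chan struct{}, window time.Duration) ([]string, bool) {
	batch := []string{first}
	timer := time.NewTimer(window)
	defer timer.Stop()
	for {
		select {
		case <-stop:
			return nil, false
		case id, ok := <-more:
			if !ok {
				// The source closed: nothing more can arrive.
				return batch, true
			}
			batch = append(batch, id)
		case <-timer.C:
			return batch, true
		}
	}
}

func main() {
	more := make(chan string, 2)
	more <- "id-2"
	more <- "id-3"
	stop := make(chan struct{})

	batch, ok := collectIDs("id-1", more, stop, 50*time.Millisecond)
	fmt.Println(batch, ok)
}
```

A caller would typically receive one event, pass it as first, and only then send the returned batch on its output channel, so a burst of writes turns into one notification instead of many.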
natefinchfwereade: re: send events as UUIDs - do you mean to add a field to the document that is a UUID and separate from the _id itself?15:41
fwereadenatefinch, I forget the details of how watchers interact with the multiEnv stuff in state15:41
natefinchI Think I'm confusing myself, if the watchers only get the IDs anyway15:42
fwereadenatefinch, I *think* that in this case a plain string UUID is how we want to represent it as it leaves state15:42
fwereadenatefinch, and we may or may not need to pay attention, at some point, to the fact that its _id is *really* going to be prefixed with the env id15:42
fwereadenatefinch, menn0 would have the latest on how well the leaks in that abstraction have been patched15:43
fwereadenatefinch, but *most* of the time, when we're safe from multi-env leaks, yes, we can just use the UUID as the _id15:43
natefinchfwereade: in theory if you're just using the _id as an opaque id, it doesn't matter what we've encoded into it... which is kind of the point of not parsing the id15:44
fwereadenatefinch, yeah, hopefully all that is handled for you one layer below15:44
natefinchfwereade: when you say we'll want to send them back up as tags, what do you mean?15:45
fwereadenatefinch, I mean that tags are the language of the api, and I would prefer to always represent references to juju entities in that format over the wire15:46
fwereadenatefinch, it's annoying that the watchers don't respect that15:46
natefinchfwereade: how does a worker translate an id to a tag without accessing the database?15:47
fwereadenatefinch, if the tag is always just, say, "queued-assignment-<uuid>" it's pretty easy to convert15:48
fwereadenatefinch, the watcher concerns have sent tentacles all the way through the codebase, really15:48
fwereadenatefinch, generally tag and id are two-way convertible though15:49
fwereadenatefinch, without context15:49
fwereadenatefinch, as are id and (internal) _id15:50
natefinchfwereade: so a tag is basically an id that also specifies its type15:50
fwereadenatefinch, yeah15:50
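[Editor's sketch] The context-free, two-way conversion between tag and id can be shown in a few lines. The "queued-assignment-" prefix is the form floated in the discussion above; tagFromID and idFromTag are hypothetical helpers and are not taken from juju's names package.

```go
package main

import (
	"fmt"
	"strings"
)

// tagPrefix follows the "queued-assignment-<uuid>" form suggested in the
// discussion; it is illustrative only.
const tagPrefix = "queued-assignment-"

// tagFromID builds the API-facing tag from an opaque id.
func tagFromID(id string) string {
	return tagPrefix + id
}

// idFromTag recovers the id from a tag without any database access.
func idFromTag(tag string) (string, error) {
	if !strings.HasPrefix(tag, tagPrefix) || len(tag) == len(tagPrefix) {
		return "", fmt.Errorf("%q is not a valid queued-assignment tag", tag)
	}
	return strings.TrimPrefix(tag, tagPrefix), nil
}

func main() {
	tag := tagFromID("some-uuid")
	id, err := idFromTag(tag)
	fmt.Println(tag, id, err)

	_, err = idFromTag("unit-mysql-0")
	fmt.Println(err)
}
```

Because the tag is just the id plus its type, a worker can translate in either direction locally, which is what makes opaque ids workable without extra round trips.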
dimiternfwereade, hey15:56
dimiternfwereade, looking at a few unit logs from the last blocker bug: http://data.vapour.ws/juju-ci/products/version-3040/OS-deployer/build-250/machine-0/unit-mysql-0.log15:57
dimiternfwereade, why does the uniter always seem to be waiting to lose leadership the first time ModeAbide is entered?15:58
dimiternlooks like it happens just before the first relation-joined hook is called for mysql:cluster16:00
fwereadedimitern, that should certainly always be happening if it's the only unit of the service16:11
fwereadedimitern, minions will be waiting to gain leadership16:11
fwereadedimitern, the vast majority of the time those tickets will never fire16:14
alexisbkatco, dimitern juju team, master and 1.25 are now officially blocked16:19
alexisbwe will need to identify what is causing 1.25 to fail and get a fix committed16:19
alexisbsinzui, abentley do we have a bug open for the current CI failure on 1.25?16:19
alexisbmgz, ^^^16:21
mgzalexisb: bug 149435616:21
mupBug #1494356: OS-deployer job fails to complete <blocker> <ci> <regression> <juju-core:Triaged> <juju-core 1.25:Triaged> <https://launchpad.net/bugs/1494356>16:21
alexisbmgz, thanks, let me go look16:21
sinzuialexisb: also bug 149388716:21
mupBug #1493887: statusHistoryTestSuite teardown fails on windows <blocker> <ci> <regression> <test-failure> <windows> <juju-core:Triaged> <https://launchpad.net/bugs/1493887>16:21
mgzmaster only for that one16:22
alexisbcherylj, I see you are looking at 149435616:22
alexisbfirst off thank you16:22
alexisbare you planning on working that bug, cherylj ?16:23
alexisbif so can you please assign it to yourself16:23
cheryljalexisb: yeah, I can do that16:23
alexisbmgz, can you address cherylj's question in the bug please16:23
mgzalexisb: sure16:25
alexisbthanks all16:25
cheryljmgz:  I will need to step out for a bit, but shouldn't be gone for more than an hour.16:26
mgzcherylj: short version is the jobs should include all the lxc logs if they were actually on the machine, but I will double check16:28
dimiternalexisb, I was looking at the OS-deployer bug for some time now16:28
alexisbdimitern, thank you16:28
dimiterncherylj, alexisb, unfortunately it's not clear why it happens yet16:28
rogpeppewith feature branches, what's the preferred method for keeping them up to date with master? merge or rebase?16:49
mgzcherylj, dimitern: I am rerunning the job with a shorter timeout and extra log capturing, should be done in 45mins16:50
dimiternmgz, cheers - btw do you know what version of maas is that os-deployer trying to use?17:03
mgzdimitern: it's running on our maas1817:06
mgzthere's nothing obviously borked about it, I've been poking it today looking for something17:07
mgzbut it's possible our networking got screwed up or something else non-obvious17:07
mgzI just can't see any evidence of that from the run17:07
dimiternmgz, I found out with maas 1.9 we're having issues, but that's with a yet-uncommitted change on trunk there17:08
dimiternmgz, it seems both timeouts are due to not provisioning lxc containers, but I can see the juju-trusty-lxc-template is created ok17:10
dimiternmgz, and the X/lxc/Y machine starts; where are the container logs and cloud-init then?17:11
mgzdimitern: my rules were only capturing extra lxc logging for the local case, I added in a pattern for remote as well, so we will see17:12
dimiternmgz, awesome!17:13
=== akhavr1 is now known as akhavr
mupBug #1494441 opened: ppc64el: cannot find package "encoding" <blocker> <ci> <ppc64el> <regression> <unit-tests> <juju-core:Triaged> <https://launchpad.net/bugs/1494441>18:57
alexisbrogpeppe, ^^^19:01
rick_h_alexisb: he's EOD19:02
rogpeppealexisb: i am eod, but not sure what you were pointing me at19:02
alexisbthe bug above19:02
alexisbit is your commit19:02
alexisblp 149444119:03
natefinchalexisb: the last bug of that type was an environmental one.... I'm pretty sure it's still an environmental issue19:05
natefinchsinzui: ^^    not being able to find a package in the standard library is not a bug in juju19:05
sinzuinatefinch: sure, but we wont be releasing juju until someone on core fixes it19:06
natefinchit's an environmental issue, just like it was last time19:06
natefinchsomehow the go standard library on the machine doing the build is messed up19:07
sinzuinatefinch: We have new machines and clean containers.19:07
sinzuiSince we are building like Lp, and it fails, I cannot see how we can release19:07
natefinchcertainly, the problem needs to be fixed, I'm just saying, nothing we change on github is going to fix the problem19:07
sinzuinatefinch: I can fix the issue by backing out the bad commit so can any member of the core team19:08
natefinchsinzui: that's like blaming the car manufacturer for building a car that hits a pothole in the road. The pothole is the problem, not the car19:10
natefinchin this case, the build infrastructure is the road with the pothole19:11
natefinchmaster builds fine with gccgo on my machine19:11
sinzuinatefinch: no it is not. We are obligated to deliver to Ubuntu a version that they can build and distribute on trusty ppc64el. There are several forks of the xml package already in the code base, something needs to be taught to use the fork19:12
natefinchwe need to fix the damn pothole, and stop changing the car to avoid it19:12
alexisbnatefinch, we are not going to get a gccgo update into trusty19:13
sinzuinatefinch: can we hangout. I want to tell you about my upcoming MIR meeting and my hope for the one true path.19:13
natefinchgccgo works on my machine19:13
natefinchthat's the thing19:13
natefinchin a meeting now... we can talk after19:14
alexisbsinzui, natefinch I agree with both of you19:14
alexisbhowever, for an immediate fix the commit needs to be reverted19:14
alexisbkatco, given it is rogpeppe eod, if there is a member of your team that has bandwidth we should revert the commit19:15
alexisbotherwise it will have to wait for tomorrow19:15
katcoalexisb: k, sec in meeting19:15
* alexisb changes location 19:15
alexisbnow that I have caused trouble ;)19:15
mgznatefinch: are you confusing gccgo bugs?19:30
mgznatefinch: bug 1440940 wasn't changed by altering the ppc build environment19:30
mupBug #1440940: xml/marshal.go:10:2: cannot find package "encoding" <blocker> <ci> <regression> <test-failure> <juju-core:Fix Released by ericsnowcurrently> <juju-core 1.24:Fix Released by ericsnowcurrently> <juju-release-tools:Fix Released by gz> <https://launchpad.net/bugs/1440940>19:30
mgzit was fixed twice, by:19:30
mgz* first me hacking around it in the juju/xml package19:31
mgz* second by eric making the vsphere provider *never even try to compile* on gccgo19:31
mgzas juju/httprequest introduces a new problem import we're going to have to hack around it again19:32
natefinchmgz: I swear last time I got on a fresh ppc machine and was able to build just fine with the packages provided by apt.19:42
mwhudsonwaait, i fixed that can't find encoding bug like a year ago19:59
mwhudsonor at least i think i did20:03
mwhudsonbuuut somehow the fix isn't in trusty20:05
mwhudsonffs20:05
natefinchmwhudson: dave said the fix was in the gccgo in trusty-updates20:14
mwhudsonyeah well it doesn't seem to be20:15
natefinchmwhudson: ahh, well, that explains some things20:15
mupBug #1494476 opened: MAAS provider with MAAS 1.9 - /etc/network/interfaces "auto eth0" gets removed and bridge is not setup <juju-core:New> <https://launchpad.net/bugs/1494476>20:42
=== urulama is now known as urulama__
=== natefinch is now known as natefinch-afk
perrito666lol I rushed back to a meeting... it was in half an hour22:04
perrito666wallyworld: disregard my email, Ill be on time22:08
wallyworldok22:08
ericsnowwallyworld: could you take a look at #149312322:53
ericsnowwallyworld: it's similar to #1472729 which you fixed in July22:53
mupBug #1493123: Upgrade in progress reported, but panic happening behind scenes <landscape> <landscape-release-29> <upgrade-juju> <juju-core:In Progress by ericsnowcurrently>22:53
mup<juju-core 1.24:In Progress by ericsnowcurrently> <juju-core 1.25:In Progress by ericsnowcurrently> <https://launchpad.net/bugs/1493123>22:53
mupBug #1472729: Agent shutdown can cause cert updater channel already closed panic <regression> <upgrade-juju> <juju-core:Fix Released by wallyworld> <juju-core 1.24:Fix Released by wallyworld> <https://launchpad.net/bugs/1472729>22:53
wallyworldericsnow: will do, just in a meeting22:53
ericsnowwallyworld: np; I'll be in and out22:54
wallyworldericsnow: yeah, looks like a similar fix is needed at first glance23:03
ericsnowwallyworld: I'm just not positive that certChangedChan is the offending channel in this case23:04
wallyworldme either, i haven't looked in detail at the logs23:04
ericsnowwallyworld: the corresponding timeline wouldn't line up23:04
ericsnowwallyworld: k, I'll poke at it some more; feel free to grab it too :)23:05
wallyworldwill do, got some stuff i have to get done this morning first up, will look after that23:06
