/srv/irclogs.ubuntu.com/2013/04/04/#juju-dev.txt

=== thumper-gym is now known as thumper
* thumper is back to go through the review00:19
thumperdavecheney:  https://codereview.appspot.com/8348043 has been updated following fwereade__'s review01:09
thumpermore looks would be good though01:09
* davecheney looks01:18
* thumper needs food, goes hunting in the kitchen01:23
* bigjools imagines thumper spearing a bear roaming in his kitchen01:24
davecheneyhttp://creativeroots.org/2011/08/kiwi-anatomy-t-shirt/01:25
bigjoolshaha01:27
thumperdavecheney: how many people do we have in this TZ that can approve my branch?01:54
* thumper wonders when jam starts01:54
thumperbigjools: not a bear, but I did get some salami and cheese :)01:56
thumperlucky me, they didn't put up much resistence01:56
bigjoolsthumper: I found a hot cross bun that was toasted01:57
thumperbigjools: did you have to light the fire to toast it?01:57
thumperor was it left by a passer-by01:57
thumper?01:57
bigjoolsI could use Ian's dog's breath01:57
bigjoolsit's enough to curl hair01:58
thumperheh01:58
* thumper needs to collect a young child from school01:58
thumperbbs01:59
thumperdavecheney: ping04:27
thumperdavecheney: https://code.launchpad.net/~thumper/juju-core/package/+merge/15702004:30
* thumper off until the meeting04:31
jamjtv1: just trying to sort out what branch you want reviewed06:01
=== ChanServ changed the topic of #juju-dev to: https://juju.ubuntu.com | On-call reviewer: dimitern | Bugs: 2 Critical, 53 High - https://bugs.launchpad.net/juju-core/
dimiternmorning all06:13
jamhi dimitern06:15
jtv1Hi jam, hi dimitern06:16
jtv1jam: I was about to bring it up...  Had some trouble with lbox, but had to go with the normal way instead:06:16
jtv1https://code.launchpad.net/~maas-maintainers/juju-core/maas-provider-skeleton/+merge/15702506:17
jtv1I'm just fixing the conflicts.06:17
jtv1Oh dear.  I'm getting build errors, but I get the same ones in trunk.06:29
jtv1Nonexistent functions in "log," and more.06:30
dimiternjtv1: which revision are you on?06:31
jtv1110406:31
dimiternjtv1: me too, just pulled trunk and running the tests; can you paste the errors?06:33
jtv1Coming up.06:34
jtv1I get the exact same ones for trunk and our branch, so I guess that's a good sign w.r.t. ours.06:34
jtv1dimitern: http://paste.ubuntu.com/5675740/06:35
dimiternall passes here06:35
dimiternjtv1: these are build errors, not test failures - something is not updated I think06:36
jtv1I'll run a "go get -u juju-core" then  I guess06:38
dimiternjtv1: well, it'll possibly screw up your working dir06:38
dimiternjtv1: if you have branches, etc. - maybe try changing $GOPATH to a new dir and get it afresh?06:39
jtv1Ouch.06:39
jtv1OK, I'll try that, thanks06:39
rogpeppemorning all!06:43
jtv1Hi rogpeppe06:44
rogpeppejtv1: hiya06:44
dimiternrogpeppe: hey06:44
* davecheney waves06:44
dimiterndavecheney: hi :)06:44
rogpeppedavecheney: good evening; dimitern: yo!06:45
dimiternfwereade__: I have a branch for you https://codereview.appspot.com/8330043/06:47
fwereade__dimitern, hey, just reviewed06:50
dimiternfwereade__: cheers! btw the pending/starting is just the first step, the MA will set the correct status in a follow-up I think06:52
fwereade__dimitern, I think I would prefer to set nothing than to set something that's only mostly correct06:52
fwereade__dimitern, there's an argument to be made that "provisoned" or "starting" or whatever can be inferred perfectly well from InstanceId06:53
dimiternfwereade__: ok, i'll think about it some more06:53
fwereade__dimitern, cheers06:53
dimiternfwereade__: why do you think breakJuju should panic in the provisioner test?06:56
dimiternis the meeting now or in 1h ?07:01
dimiternmramm2: ping07:02
* davecheney wonders if he has time to order a pizza before the meeting07:03
dimiternso it's not now07:03
davecheneynope, still 6pm in AU07:04
davecheneymeeting is at 7pm07:04
dimiternso when does AU switch to DST?07:05
davecheneysoon07:05
davecheneydimitern: 7th Apr07:06
dimiternI see07:06
davecheneybut not Queensland07:06
davecheneythey never adopted daylight savings because they were worried it would confuse the Cows07:06
dimiternso you're last to go - we already switched last sunday, and US - before 3 weeks07:06
dimitern:D cool07:07
davecheneydimitern: even inside australia the 7 states shift at different times07:07
davecheney/s/times/dates07:07
davecheneybecause when something is hard, it helps to make it even harder by not coordinating07:07
* davecheney says fuckit, and orders a pizza'07:07
dimiterninteresting :) and opinionated07:09
fwereade__dimitern, sorry07:26
fwereade__dimitern, the provisioner test should not have a JujuHome07:26
dimiternfwereade__: it's not using it really07:27
fwereade__dimitern, it runs in the cloud and doesn't have an environments.yaml or any of the associated crap07:27
fwereade__dimitern, yes it is07:27
dimiternfwereade__: it has environ config though07:27
fwereade__dimitern, it's getting the environment from environments.yaml07:27
fwereade__dimitern, which invokes a whole bunch of crap that is *wrong* for this context07:28
fwereade__dimitern, I see why it works though07:28
dimiternfwereade__: well, it's changing that, not reading it, but I see your point07:28
fwereade__dimitern, it's because JujuConnSuite is broken07:28
fwereade__dimitern, but fixing that is out of scope07:28
fwereade__dimitern, it's changing it and then reading it, right?07:28
dimiternfwereade__: I'll split the "make broken env.yaml" from "make broken environ config" for a given dummy method07:29
dimiternfwereade__: not07:29
fwereade__dimitern, then why's it writing it? :)07:29
dimiternfwereade__: reading it - just changing it and returning config.Config07:29
dimitern:)07:29
dimiternfwereade__: because breakJuju seemed like a good candidate for that, already used in cmd tests07:30
dimiternfwereade__: i changed it to reuse it, but it diverged too much from the original and now it's doing 2 unrelated things07:30
fwereade__dimitern, sure, but the dummy provider already has exactly the capability you need -- you just set one field in the existing config07:30
fwereade__dimitern, I think that just doing that in the provisioner test is neatest07:31
dimiternfwereade__: wasn't sure, but now I see it better (after having done it the wrong way first) :)07:31
fwereade__dimitern, don't worry, this stuff is nasty to begin with and only made worse by the essentially-random contexts in which we run tests07:31
dimiternfwereade__: how about adding an exported call to the dummy provider to make breaking methods easier?07:32
fwereade__dimitern, honestly its not that much work to change a config07:32
fwereade__dimitern, cfg, err := environ.Config().Apply(map[string]interface{}{"broken": "StartInstance"}); c.Assert(err, IsNil); err = environs.SetConfig(cfg); c.Assert(err, IsNil)07:33
dimiternfwereade__: it's not obvious and it's definitely useful to have a shortcut I think, for future use as well07:33
dimiternfwereade__: (and documented properly, rather than having to read through the source to figure out how)07:34
fwereade__dimitern, isn't that a method that'll be used in exactly one place?07:34
dimiternfwereade__: in any place that needs to break the dummy provider's methods without needing environments.yaml07:35
dimiternfwereade__: i'm sure there'll be others07:35
fwereade__dimitern, I suspect that `UpdateConfig(env Environ, attrs map[string]interface{}) error` would actually be useful07:36
dimiternfwereade__: anyway, too early for me to argue properly :) I'll just do it in the provider tests as a helper, which can be extracted if needed07:36
fwereade__dimitern, provider or provisioner? :)07:37
dimiternfwereade__: yeah, provisioner07:37
fwereade__dimitern, cool, thanks07:37
* dimitern dives into the review queue07:38
jtv1dimitern: about those test failures on trunk...  I run "go test ./..." in a fresh Go environment, against trunk, and I get further now but it's still pretty bad: http://paste.ubuntu.com/5675840/07:41
jtv1Is that what you're getting?07:42
* thumper is back07:42
dimiternjtv1: you need mongod installed for the state tests07:43
jtv1Ah damn, forgot about that.  Will it work on i386 now?07:43
dimiternjtv1: i think now, unless you build it + ssl07:43
dimiterns/now/not/07:44
jtv1OK, I'll try it on the cloud.  Thanks.07:44
dimiternjtv1: but it seems mongo is not the only problem - some things are still missing from the checkout07:45
dimiternrogpeppe: can you help jtv1 with these issues? ^^07:45
rogpeppejtv1: looks like you might not have mongod installed07:46
jtv1I don't — Dimiter already spotted that one.07:47
jtv1I'm trying it on the cloud now.07:47
dimiternrogpeppe: that's not the only problem though - see the undefined stuff07:47
jtv1It was in a fresh Go environment.07:47
rogpeppejtv1: what's your current directory and what's your GOPATH set to?07:48
jtv1My GOPATH is /tmp/go (where I built a fresh environment)07:49
jtv1I softlinked $GOPATH/src/launchpad.net/juju-core to my own checkout of trunk, and was cd'ed into that.07:49
rogpeppejtv1: ah, that might be your problem07:50
jtv1The softlink, the working directory, or the combination?07:52
rogpeppejtv1: the fact that the working directory isn't directly below $GOPATH07:52
rogpeppejtv1: what happens if you do "go test launchpad.net/juju-core/..." ?07:53
davecheneyjtv1: you can't softlink GOPATH07:53
davecheneysorry07:54
davecheneythe go tool is too smart for symlinks07:54
jtv1Too smart for its own good then :)07:54
rogpeppejtv1: it tries to work out the package path by looking at the current working directory07:54
rogpeppejtv1: unfortunately there's not enough information for it to do so07:55
jtv1That other command line says "launchpad.net/juju-core/..." matched no packages07:55
rogpeppejtv1: i'm surprised it didn't give you an error actually07:55
dimiternjtv1: you can softlink stuff inside $GOPATH though - I have ~/work/go/src/launchpad.net/juju-core -> ~/work/juju-core and it works07:55
jtv1Well I'm getting plenty :)07:55
jtv1That's what I did too.07:55
rogpeppedimitern: i'm surprised that works actually07:56
dimitern(actually it's the other direction, ~/work/juju-core is a symlink to $GOPATH/src/launchpad.net/juju-core)07:57
rogpeppedimitern: ah, that makes more sense07:57
rogpeppet07:57
jtv1davecheney: I didn't softlink GOPATH... just $GOPATH/src/launchpad.net/juju-core07:57
davecheneysorry, symlinks inside GOPATH are a no no07:58
rogpeppejtv1: the problem is that when the go tool gets the current working directory, it sees ~/mycheckout, not $GOPATH/src/launchpad.net/juju-core07:58
rogpeppejtv1: so it can't work out that you're building launchpad.net/juju-core07:58
jtv1rogpeppe: but it'll work as long as I use the launchpad.net/juju-core/... path?08:03
rogpeppejtv1: no; i thought it would, but it doesn't08:03
jtv1:(08:03
rogpeppejtv1: (i'm not sure why)08:03
jtv1But then how does that softlink work for you!?08:03
rogpeppejtv1: it doesn't08:03
jtv1:(08:03
rogpeppejtv1: it works for dimiter because his softlink points *inside* GOPATH08:03
jtv1I wonder if maybe it isn't better to have one Go environment per branch...08:04
=== mthaddon` is now known as mthaddon
rogpeppejtv1: i use cobzr, but you can also use bzr colocation08:06
jammgz: we miss you08:07
mgztakes too long to sign in to g+ these days...08:08
jammramm2: so you don't feel unloved. PING :)08:13
thumperdavecheney: would you be happy if I made environs.config.DefaultSeries a const string?08:41
thumperI could make the change and merge08:41
davecheneythumper: +108:41
thumperdavecheney: just proposing those changes08:50
davecheneythumper: where does version.CurrentNumber come from ?08:52
davecheneywas it added recently ?08:52
thumperyes, added in that branch08:52
jtv1fwereade__: thanks for the  notes on our feature branch!  We were just looking at how big the divergence was, and I think this'll be helpful.08:52
davecheneythumper: i can't find it ?08:53
davecheneyam i an idiot ?08:53
thumperdavecheney: to of version/version.go08:53
thumperdavecheney: first change in that file08:53
thumpernot an idiot, just blind :)08:53
davecheneyahh, yes, v sorts afrer e08:53
davecheneyso, i was right first time :)08:53
thumper:(09:03
thumperfor some reason merging in trunk breaks it09:03
* thumper pokes09:03
thumperjam: your sync-tools changes have broken my branch09:06
* thumper tries to work out the test09:06
mgzyup, that was the plugin crashing...09:06
mgzI won't rejoin, signin takes too long09:07
jamthumper: the mongo branch or ?09:08
thumperjam: yeah, my refactoring...09:08
thumperwhich hits some fake tools somehow09:08
jamthumper: I set up a dummy env, and then copy tools out of it09:09
thumperI'm getting some extra tools there...09:09
jambut you can't set up a dummy env09:09
jamwithout fake tools09:09
jamso I delete them09:09
jamso the test has known state09:09
thumperah, but not all of them...09:09
jamnow you need to delete 2 of them, I guess09:09
thumperright/09:09
thumperI'll poke09:09
thumperisn't there a delete all?09:09
jamthumper: no, but you can list, iterate, and delete.09:09
jamStorage, has Put, Get, List, Remove09:10
jamthumper: line 67 is where yoou can change the "lookup Current" into "list all"09:11
jamsync_tools_test.go09:11
* thumper nods09:11
jamtechnically you should be matching against tool patterns, etc. But in practice, I think you can just nuke everything in the dummy env.09:11
jam(may get us in trouble in the future :)09:11
jtv1Say, does anyone know why the tests for launchpad.net/juju-core/cmd/juju might be timing out?09:12
jamjtv1: if you are timing out, most likely you aren't running a mongo that is new enough to support ssl09:12
jamthat was the most common occurance I've encountered.09:12
jamjtv1: this is a 600s timeout, right?09:12
jtv1Yes.09:12
jtv1During TestPackage.09:13
jtv1So maybe I should try Raring?09:13
jamjtv1: if you have the wrong mongod in your PATH, it doesn't start09:13
jamjtv1: you have to run from the PPA or use the tarball09:13
jtv1I just installed the quantal  mongod.09:13
jamjtv1: the one from Julian?09:13
jtv1No, just the standard one.  What PPA should I use?09:13
jamjtv1: https://launchpad.net/~juju/+archive/experimental09:14
jamthat only has raring09:14
jamjust a sec09:14
jamhttps://launchpad.net/~julian-edwards/+archive/mongodb09:14
jamhas quantal09:14
jtv1Thanks.09:14
jamit looks like raring might come with a default one that is new enough: https://launchpad.net/ubuntu/raring/+source/mongodb/1:2.2.4~rc0-0ubuntu109:15
jambut I'm not positive09:15
jambecause the SSL support is contentious because of GPL licensing vs OpenSSL, etc.09:15
jtv1OK, I'll try the PPA first then.  I'm also keeping notes so hopefully people can re-use them.  Thanks.09:16
jamthumper: "constant wouldn't be a string": "const foo string = "stuff"09:16
thumperjam: yeah, that is what I have done09:16
thumperjam: the target bucket also needs fixing :(09:21
jamthumper: I couldn't find exactly when the bucket gets filled, so I just allowed it to have one extra09:24
thumpernm, I'm fixing09:24
jamIt looked like putFakeTools was done on first *connect*09:24
jamrather than on creation.09:24
jamthumper: the other option is to have a dummy config items that lets you disable auto tools.09:25
thumperjam: sure... but not now :)09:25
mattywrogpeppe, it looks like Status has been removed from unitinfo recently?09:32
rogpeppemattyw: yes, unfortunately. i'm working on putting it back right now.09:32
mattywrogpeppe, ok no problem09:33
=== andrewdeane is now known as awd
thumperjam: managed to fix all the issues during the call :)09:35
jam:)09:35
thumperjust doing a final test run prior to submitting09:35
fwereade__jtv1, you're welcome :)09:35
fwereade__jtv1, there will certainly be more but I need to look at it in bits :)09:35
thumperI have a feeling tomorrow will be doing all the peer reviews09:35
thumperfor me at least09:35
thumperalso09:35
thumperyay for checking the tests prior to merging09:35
thumperwith trunk merged in09:36
jtv1fwereade__: I'm trying to get trunk to pass tests right now, and after that I can get a bit of an overview of the divergence.  rvba and I are planning to do the fixing for this one.09:37
fwereade__jtv1, cool09:37
* thumper has merged the branch, and is leaving the office09:41
thumperciao09:41
dimiternfwereade__: the provisioner currently aborts if any machine failed to start10:12
dimiternfwereade__: I didn't change that, it was already like this10:13
rogpeppedimitern: just FYI, revision 1101 broke our tests against go tip for me.10:20
dimiternrogpeppe: can I see  a paste?10:20
rogpeppedimitern: it's probably the go runtime's fault, not yours10:20
rogpeppedimitern: http://paste.ubuntu.com/5676161/10:21
rogpeppedimitern: that's running TestUniterRelations only10:21
dimiternrogpeppe: I have no idea why10:22
dimiternrogpeppe: could it be a bug in mgo?10:23
rogpeppedimitern: i really don't know. i'm trying to narrow down the problem first.10:25
rogpeppedimitern: it's not helped by the fact that from go revisions 15910 to 16052 our stuff does not build due to a bug in the go compiler.10:27
dimiternrogpeppe: ha.. interesting10:28
fwereade__dimitern, yeah, but making that sane is part of the job, I think10:35
fwereade__dimitern, sorry, bbiab10:35
dimiternfwereade__: my point is, when it fails it exits the loop, like the uniter does, and gets restarted10:44
dimiternupdated https://codereview.appspot.com/8330043/10:53
fwereade__dimitern, sorry, still half-having lunch, but: the uniter sets a hook error state and keeps going, because it's meant to handle that failure11:04
fwereade__dimitern, the situation is analogous11:04
fwereade__dimitern, besides, letting the pro die will fuck the firewaller at the same time, and that has even more complex state to set up every time11:05
dimiternfwereade__: so this means general refactoring of the provisioner, not just setting an error status, which I think is outside of this CL11:05
dimiternfwereade__: it'll die exactly the same way now, no regression introduced11:06
dimiternfwereade__: i'm happy to do the refactoring though, in a follow-up - will get me a chance to know it a whole lot better11:07
jam'AaaaaAaa,e.11:24
dimiternjam: very expressive encoding :)11:26
jamdimitern: 5-year old keyboard typing. And me hitting <enter> instead of <delete>11:26
jamdimitern, mgz: standup?11:29
fwereade__dimitern, I think there has been a communication error11:35
dimiternfwereade__: oh, what was it?11:36
fwereade__dimitern, meh, just what I thought I meant and what you thought I meant re the story11:37
fwereade__dimitern, it is ofc my fault :)11:37
dimiternfwereade__: sorry, but it's still not clear for me11:38
fwereade__dimitern, in my mind, making the provisioner react sensibly to predictable errors -- as does the uniter -- is absolutely part of this story11:38
fwereade__dimitern, think through step by step what will happen when we try to provision a machine with bad constraints11:39
fwereade__dimitern, the provisioner starts up, sees an unprovisioned machine, tries to provision it... falls over11:41
fwereade__dimitern, gets restarted, sees an upprovisioned machine, tries to provision it... falls over11:42
fwereade__dimitern, and may well continue doing this forever even if there are other machines that *could* be provisioned11:42
fwereade__dimitern, now we have a channel by which errors can escape to something that'll handle them -- however badly -- we should (1) send the errors down that channel and (2) not send them elsewhere11:44
fwereade__dimitern, in my mind, the two pieces of work are two sides of the same coin11:44
dimiternfwereade__: wanna have a quick g+?11:46
fwereade__dimitern, sure11:46
fwereade__TheMue, ping12:21
TheMuefwereade__: pong12:40
TheMuefwereade__: just proposing the changes ;)12:41
fwereade__TheMue, heyhey, just wanted to say I'm around if you want to chat about the uuid stuff12:41
TheMuefwereade__: found your feedback clear (at least i hope i got it right ;) )12:41
fwereade__TheMue, cool :)12:41
TheMuefwereade__: so, it's in again12:42
fwereade__TheMue, what are the plans for env doc key vs globalKey then? ah ok cool I'll read it :)12:42
TheMuefwereade__: i liked your idea, it's then a real global key.12:43
fwereade__TheMue, sorry, I think I failed to be clear here12:44
TheMuefwereade__: i'm listening12:45
fwereade__TheMue, the globalKey for the environment was until recently "e"12:45
fwereade__TheMue, I'm not sure why that changed on the Environment entity in the first place12:45
TheMuefwereade__: "e#<<name>>"12:45
fwereade__TheMue, http://paste.ubuntu.com/5676494/12:47
fwereade__TheMue, the global key for the environment is still "e"12:47
fwereade__TheMue, that globalKey() method is I think objectively wrong12:47
TheMuefwereade__: ah, yes, i meant the the method globalKey() returned12:48
TheMuefwereade__: the "e", like for the settings, has been the reason i've used "e" in the first run as document id too12:48
fwereade__frankban, ping12:48
fwereade__TheMue, yes, exactly12:49
frankbanfwereade__: pong12:49
fwereade__frankban, Environment.globalKey()12:49
fwereade__frankban, it doesn't match the usual environment global key as used elsewhere; is this because the existing one ("e") is crazy, or because you didn't know about it?12:49
fwereade__TheMue, so, whatever we pick for globalKey is irrelevant12:50
dimiternfwereade__: in order for the PA to pick up machines in error state _at all_ is to watch for something other than Life12:51
frankbanfwereade__: OIC, you mean "e#[name]" vs just "e"? if so, the latter.12:51
dimiternfwereade__: I mean machines that were in error status but was resolved12:51
fwereade__frankban, cool, thanks, just checking12:51
fwereade__dimitern, ah-ha!12:52
fwereade__dimitern, ok that makes things more interesting/annoying in the future12:52
dimiternfwereade__: PA uses WatchMachines() which is a lifecycle watcher, not on status or anything else12:52
fwereade__dimitern, but for now remember we don't expose any way to resolve12:52
TheMuefwereade__: so globalKey() should only return "e" and that's also ok as document id for the environment doc in its collection?12:52
fwereade__dimitern, so it doesn't matter yet12:52
fwereade__TheMue, (1) probably (2) maybe12:52
TheMuefwereade__: hehe12:53
dimiternfwereade__: so I cannot test the machine is skipped - it'll always be skipped once created and is in error state12:53
fwereade__TheMue, in both cases it's worth a discussion if you're not familiar with the context12:53
dimiternfwereade__: even after restarting the PA the machine still won't be picked up12:53
fwereade__dimitern, and that is correct behaviour12:54
fwereade__dimitern, verify that it happens and you're good, I think12:54
dimiternfwereade__: ok, but seems incomplete12:54
TheMuefwereade__: as the doc is the only one in it's collection the id only has to be static, so i initially thought "e" like the one for the environment conf or its constraints would be fine.12:54
dimiternfwereade__: but I guess when we have to deal with resolve it'll come back to this12:54
fwereade__dimitern, yeah -- I think it's a problem we can safely solve then12:55
TheMuefwereade__: for the globalKey() i have to look more, indeed. interestingly the tests pass all.12:55
niemeyerGood mornings12:55
niemeyerdimitern: Fixed: https://codereview.appspot.com/835504412:55
fwereade__TheMue, I think I'm fine with "e" all round, actually, but please make sure it's a const somewhere and not used in magic string form12:56
dimiternniemeyer: cool, thanks I'll take a look12:56
TheMuefwereade__: so shall i change all other places where today simply "e" is used too?12:57
fwereade__TheMue, yes please12:57
TheMuefwereade__: will do so12:57
fwereade__TheMue, although, wait a mo12:57
TheMuefwereade__: yes, until your review is in12:58
fwereade__TheMue, I'm suspicious of having the two keys be the same12:58
fwereade__TheMue, any particular reason not to use the uuid?12:58
fwereade__TheMue, it *is* a unique identifier for the environment, after all :)12:59
fwereade__TheMue, (we should not use the uuid for the global key, just for the entity key)12:59
TheMuefwereade__: when doing the FindId() i have to know the id. otherwise i could also read the first document of the collection. would work too.12:59
fwereade__TheMue, just-pick-one is OK for now13:00
fwereade__TheMue, if we end up sharing that collection when multitenanting it'll break, but we'd always have to have the env uuid available in the *State, so it'd be a trivial fix13:01
TheMuefwereade__: ok, then will do it this way. so i'll use the uuid field as _id.13:03
dimiternniemeyer: LGTM13:07
dimiternfwereade__: i think this should be about all of it: https://codereview.appspot.com/8330043/13:13
fwereade__dimitern, ack13:15
* dimitern lunch13:15
=== sidnei` is now known as sidnei
benjiI have a small branch up for review, it's pretty straight-forward: https://codereview.appspot.com/836604313:54
* rogpeppe is in a sea of *almost* reproducible bugs and uncertain dependencies13:56
dimiternbenji: I'm on it shortly14:03
benjithanks dimitern14:04
rogpeppemgz: do you know of a way of getting bzr to print the revision id it's currently on (not just the latest revision id it knows about)?14:05
mgz--tree14:05
rogpeppemgz: ah, i missed that, thanks14:06
rogpeppemgz: i'm quite surprised it's not the default14:06
rvba`fwereade__: Hi… I just wanted to let you know that I was able to implement the missing method InstanceId() in the MAAS provider.  I've followed your advice and used cloudinit to populate a file containing the instanceId and then the method InstanceId() reads from that file.  I've just tested my code in the lab with real machines and a real MAAS deployment and it seems to work ok.14:06
fwereade__rvba`, thanks, that's great news14:07
mgzit is a little surprising sometimes14:07
=== rvba` is now known as rvba
mgznot an easy default to flip though, revno and revision-info commands mostly used from scripts that would potentially break if the semantics changed14:08
fwereade__rvba, I hope it wasn't too much hassle... (there is a path towards completely dropping provider.InstanceId, but it's really nice not to have to do it right now ;))14:09
fwereade__mramm2, rogpeppe, TheMue, hey, should we be meeting?14:09
TheMuefwereade__: already done :D14:09
rogpeppefwereade__: ah yes14:09
fwereade__TheMue, sorry14:10
rogpeppefwereade__, TheMue, mramm2: i'm there14:10
TheMuerogpeppe, fwereade__: we've seen that after this morning and with the current state on the board we don't need the meeting today14:10
fwereade__rogpeppe, funny, I think I'm there too14:10
fwereade__rogpeppe, moot anyway :)14:10
rvbafwereade__: it was actually pretty straightforward apart from one trick: I added a method in cloudinit.go to be able to add a runcmd at the *beginning* of the list of commands.14:10
rvbaBecause jujud is started by cloudinit and we need the file containing the instanceId to be present before jujud is started.14:11
fwereade__rvba, I've been kinda expecting us to need something like that for about a year, so np there :)14:11
rvbaAll right :)14:12
rogpeppefwereade__, rvba: another possibility would be to change environs/cloudinit so you pass in the cloudinit.Config instead of returning it14:13
fwereade__rogpeppe, hmm, that's rather nice I think, and seems to fit the style of cloudinit-composition already used in maas -- rvba, comment?14:14
TheMuefwereade__: which way do you prefer the passing of the UUID to NewHookContext to be tested? it's done implicit by compared what is passed to what later is retrieved as env variable.14:16
rogpeppefwereade__: we might rename cloudinit.New to cloudinit.Config.Configure if we did that, perhaps14:16
rvbafwereade__: rogpeppe sounds like a good idea…14:16
TheMuefwereade__: the field is only used to set the env variable, there's no other accessor14:17
fwereade__rvba, then maybe go with that if that's ok?14:17
fwereade__TheMue, so, you need to check it in an actual hook14:17
rvbafwereade__: ok, I'll give it a try.14:17
rogpeppervba: thanks14:17
fwereade__TheMue, ATM you don't test that the env var actually has the correct value14:17
fwereade__rvba, rogpeppe: tyvm14:17
TheMuefwereade__: i do, with the latest CL14:18
fwereade__TheMue, you do not14:18
TheMuefwereade__: it's generated once, passed to NewHookContext and compared with the env variable in AssertEnv()14:18
fwereade__TheMue, you test HookContext14:18
fwereade__TheMue, you do not ever test the correct value of the environment uuid, for which you need to actually run a uniter14:19
fwereade__TheMue, you do test, perfectly well, that a hook context created with a given UUID will put that UUID into the env14:20
TheMuefwereade__: then i got your comment wrong. thought you meant the passing to the HookContext14:20
fwereade__TheMue, that's exactly what I mean14:20
TheMuefwereade__: hmm14:20
fwereade__TheMue, if I changed runHook() to pass in trivial.NewUUID(), what test would fail?14:20
TheMuefwereade__: yeah, i know what you mean, but i would have expected a different comment. ;)14:21
dimiternbenji: reviewed14:21
benjithanks14:21
fwereade__TheMue, sorry :)14:21
TheMuefwereade__: will change (or add) the test14:22
TheMuenice, uniter_test has "only" 1632 loc14:25
dimiternTheMue: but getting to know all of them is incredibly rewarding in the long term ;)14:27
TheMuedimitern: thankfully i know all chars, numbers and signs. will sort it and check it in again, then it's more simple. 1859 x "a", 198 x "b", 1329 x "c" ... :D14:29
benjidimitern: thanks for the review, I made the changes you suggest.  Now for a second review.  Any takers?14:33
dimiternbenji: cheers14:35
fwereade__TheMue, you can do it all in the context type14:37
rogpeppeniemeyer: ping14:39
niemeyerrogpeppe: pongus14:39
rogpeppeniemeyer: heya14:39
rogpeppeniemeyer: i'm seeing strange failures when running the uniter tests against go tip, and i'm trying to narrow down the issue14:39
niemeyerrogpeppe: Ok14:40
rogpeppeniemeyer: one symptom in particular i'm seeing is that mgo is returning an error like: Invalid ns [@\x10!]14:40
rogpeppeniemeyer: the "Invalid ns" comes from within mongod14:40
rogpeppeniemeyer: the ns itself varies - it looks like garbage14:40
niemeyerrogpeppe: Ok14:41
niemeyerrogpeppe: ns is the collection name14:41
niemeyerrogpeppe: plus the database name usually14:41
rogpeppeniemeyer: i'm also sometimes seeing a panic in mongoSocket.Acquire14:41
rogpeppeniemeyer: do you have an idea of places that it might become corrupt?14:42
fwereade__TheMue, just (1) drop context id, add env uuid instead (2) s/UniterSuite-%d/%s/ in goodHook, badHook and hookPattern (and use ctx.uuid instead of ctx.id)14:42
niemeyerrogpeppe: No.. mgo doesn't use anything unsafe14:42
rogpeppeniemeyer: yeah, i didn't think so.14:42
rogpeppeniemeyer: the tests pass ok under the race detector14:43
niemeyerrogpeppe: I've seen one of your tracebacks.. it looks like corruption indeed14:43
niemeyerrogpeppe: Like, bad corruption, not just a race14:43
rogpeppeniemeyer: yeah14:43
niemeyerrogpeppe: The thing that it said was nil in the traceback was actually a field that was not a pointer14:43
rogpeppeniemeyer: yeah, and i added a nil check before the place that panicked, and i still got a panic there at some point14:44
rogpeppeniemeyer: something is quite wrong14:44
niemeyerrogpeppe: Might be worth raising that with the core debs14:44
niemeyerdevs14:44
rogpeppeniemeyer: i bisected the problem to the scheduler change, but that might be just exposing some underlying GC race14:45
rogpeppeniemeyer: they're following the issue already, i think14:45
niemeyerrogpeppe: It sounds like 1.1 is pretty close to release.. it'd be sad if something like that went out14:45
rogpeppeniemeyer: i totally agree14:46
niemeyerrogpeppe: Oh, you filed something upstream about it already?14:46
rogpeppeniemeyer: http://code.google.com/p/go/issues/detail?id=519314:47
rogpeppeniemeyer: it's a pity we can't *entirely* rule out our own stuff, as we do use cgo, but tbh the yaml stuff has been stable for ages.14:47
niemeyerrogpeppe: Yeah14:48
niemeyerrogpeppe: I may actually end up porting goyaml pretty soon as a non-Canonical task14:49
niemeyerrogpeppe: This will help us ruling that stuff out as well14:49
rogpeppeniemeyer: that would be marvellous, yes.14:49
mattywrogpeppe, sorry to interrupt, I'm getting a "juju home hasn't been intialized" when trying to call the api, But I've got an envionment running - there aren't any units but the state server is there14:52
rogpeppemattyw: you'll need to import environs/config and copy this function from cmd/juju: http://paste.ubuntu.com/5676897/14:57
rogpeppemattyw: this is an unfortunate consequence of recent changes14:57
niemeyerrogpeppe: Here is a crazy idea: would the goyaml package work over rpc?14:57
rogpeppefwereade__: ^14:57
rogpeppeniemeyer: hmm, depends on the C API i think14:58
niemeyerrogpeppe: Not really14:58
niemeyerrogpeppe: What'd go over the wire has nothing to do with C14:58
fwereade__mattyw, is this a "juju home..." error coming back from the API server?14:58
niemeyerrogpeppe: The point is to quickly build a mock version of goyaml that is type safe14:59
rogpeppeniemeyer: i'm thinking of the reflection stuff, but perhaps all that is done Go side14:59
niemeyerrogpeppe: It doesn't make a difference14:59
fwereade__mattyw, wait, that error is a panic, right?14:59
niemeyerrogpeppe: The real goyaml would sit on the other side of the wire14:59
rogpeppeniemeyer: to unmarshal, you'd presumably need to send the other side a description of the kind of type you want to unmarshal into, right?15:00
fwereade__rogpeppe, API and yaml might not mix so wonderfully... AIUI yaml is a hassle in javascript15:00
fwereade__rogpeppe, but maybe I mistake you15:00
niemeyerrogpeppe: I don't know.. haven't used the rpc package at all, so I don't know what it is sending15:00
mattywfwereade__, yes it is15:00
niemeyerrogpeppe: I know it does send information about the real types, though15:00
rogpeppeniemeyer: it sends gob by default15:00
dimiternfwereade__: I'm not getting your point here: https://codereview.appspot.com/8330043/diff/11001/worker/provisioner/provisioner.go#newcode27315:01
niemeyerrogpeppe: I imagined for the purposes of juju-core, we only marshal a very limited set of types15:01
dimiternfwereade__: processMachines already behaves this way, due to the change in pendingOrDead15:01
niemeyerrogpeppe: Which means it'd be easy to custom-code it15:01
rogpeppefwereade__: the problem is we've got a real client here, and if you want to use environs.New and have the normal default thing happening, you need to call SetJujuHome15:01
fwereade__mattyw, ok, basically an incomplete environ config has somehow made its way into state, and something is throwing a fit because the user environment that usually fills that stuff in is not present15:01
niemeyerrogpeppe: and tell the other side what it should unmarshal onto, and send the full value back over the rpc15:02
fwereade__rogpeppe, mattyw: sorry, I'm not 100% clear on the exact context15:02
* rogpeppe goes to look at goyaml15:02
fwereade__mattyw, but based on what rogpeppe said I said at lest one untrue thing above15:02
fwereade__dimitern, processMachines starts new machines before trashing unknown ones15:03
niemeyerrogpeppe: I don't think there's anything interesting to look there15:03
niemeyerrogpeppe: The point is to use the real goyaml15:03
fwereade__dimitern, if it continues to do this, we could end up with multiple instances of individual agents, and that would be bad15:04
niemeyerrogpeppe: and to request unmarshalling over a socket instead of locally15:04
mattywfwereade__, does it sound like my env hasn't bootstrapped properly?15:04
dimiternfwereade__: so you suggest moving the trashing before starting?15:04
fwereade__dimitern, yes please15:04
rogpeppeniemeyer: i don't think you can - you can't pass pointers over the wire15:04
fwereade__mattyw, so this is a real live environment that's doing this?15:04
niemeyerrogpeppe: You surely can pass structs containing pointers that are set, right?15:05
niemeyerrogpeppe: GiveMeAFoo(data []byte) *Foo15:05
rogpeppefwereade__: the point is that, i think, this (previously valid) code will now fail: http://paste.ubuntu.com/5676928/15:05
mattywfwereade__, yes, on aws, I've been having trouble with it recently ( "state entity not found" on the juju mailing list)15:05
mattywfwereade__, but this bootstrap seemed to work15:06
rogpeppeniemeyer: the problem is that Unmarshal needs to know what it's unmarshalling into15:06
niemeyerrogpeppe: Like, GiveMeAFoo?15:06
dimiternfwereade__: sure thing15:06
rogpeppeniemeyer: if you can enumerate all the possible types for Foo, that'll work. but i'm not sure you can do that easily.15:07
niemeyer<niemeyer> rogpeppe: I imagined for the purposes of juju-core, we only marshal a very limited set of types15:07
fwereade__rogpeppe, I don't consider that code to be valid... the amount of magic we put in the environment config was a fuckup of the first order, and this is IMO just the latest toll it is extracting15:07
rogpeppefwereade__: so you don't want us to be able to use juju stuff from a normal Go program?15:08
fwereade__rogpeppe, I don't want us to be able to magically infer stuff, no15:08
rogpeppefwereade__: it's not about magically inferring stuff. it's about actually using the juju libraries to interact with juju.15:08
rogpeppeniemeyer: i'm not sure. maybe we actually never unmarshal into a struct.15:09
rogpeppeniemeyer: but we do15:09
fwereade__rogpeppe, the whole problem is the magic inference15:09
niemeyerrogpeppe: Hmm.. can't parse that.. :)15:10
rogpeppefwereade__: what magic inference?15:10
TheMuefwereade__: thx, needed a bit to follow but i think i've got it15:10
rogpeppeniemeyer: here's an example of a type that we unmarshal into: http://paste.ubuntu.com/5676945/15:10
fwereade__rogpeppe, everything in environs/config that assumes the existence of $HOME or $JUJU_HOME, or any of the environs themselves that pull stuff out of the env15:10
TheMuefwereade__: the UniterSuite-%d has the comment "prevents cross-pollution". so shouldn't it stay?15:11
rogpeppefwereade__: so user code should just duplicate the checkJujuHome function from cmd/juju and be done with it?15:11
fwereade__rogpeppe, as far as I'm concerned, if you want to interact programmatically you can supply a valid config15:12
rogpeppefwereade__: the config comes from the user's home directory15:12
rogpeppefwereade__: i think it's perfectly reasonable that if you write a juju-interacting program that it can understand $HOME/.juju/environsments.yaml just like the normal juju command15:13
niemeyerrogpeppe: yep?15:14
rogpeppeniemeyer: so would we want to hard-code that type as "one of the limited set of types that we use with yaml"? or would we need to break it apart and make an unmarshal RPC for each component piece of it?15:15
TheMuerogpeppe: $JUJU_HOME/environments.yaml15:15
rogpeppeTheMue: either, right?15:16
niemeyerrogpeppe: The former15:16
TheMuerogpeppe: yes15:16
rogpeppeniemeyer: so every time we change that type, we'd need to update goyaml in sync with it?15:16
fwereade__rogpeppe, I think that asking that programmer to be explicit about where he wants his data to come from is pretty reasonable15:16
niemeyerrogpeppe: man..15:16
niemeyerrogpeppe: I'm suggesting a *test case*15:16
rogpeppeniemeyer: ah!15:17
fwereade__rogpeppe, if there were lots of people doing that, it might even make sense to export that function15:17
niemeyerrogpeppe: To see if things break when we don't have any cgo in the system15:17
rogpeppeniemeyer: that's well doable, yeah15:17
niemeyerrogpeppe: I'm not suggesting you do any such awful hacks permanently, or even to commit something as bad as that15:17
rogpeppeniemeyer: cool, sorry, i didn't realise it was a totally temporary thing you were suggesting15:17
niemeyerrogpeppe: np15:18
rogpeppefwereade__: that's what i would suggest15:18
fwereade__mattyw, (sorry, btw, I am still thinking... but popping out to get ciggies, brb)15:18
mattywfwereade__, ok thanks15:18
fwereade__mattyw, would you paste me the panic please?15:19
rogpeppeniemeyer: i'd start by instrumenting goyaml to see what types are actually used (and in fact i think it might just be interface{})15:19
niemeyerrogpeppe: Probably not.. but it's a reduced set either way15:20
rogpeppeniemeyer: in which case i'm not entirely sure we can make gob work. json might be sufficient though.15:20
niemeyerrogpeppe: You can make it work as suggeste15:20
niemeyerd15:20
niemeyerrogpeppe: By asking the other side to unmarshal onto a specific type, and send back the result15:20
rogpeppeniemeyer: gob doesn't allow you to unmarshal onto *interface{} by default.15:20
niemeyerrogpeppe: Why does that matter?15:21
rogpeppeniemeyer: because (i think) you need to be able to tell the other side what specific type you want to unmarshal onto15:21
niemeyerrogpeppe: As I just mentioned, the wire is seeing concrete types15:21
rogpeppeniemeyer: and you don't have that information15:21
niemeyerrogpeppe: I'm not following..15:22
niemeyerrogpeppe: "you need to be able to tell the other side".. so just do!15:22
rogpeppeniemeyer: i'm not entirely sure that you can marshal and unmarshal a map[interface{}]interface{} in gob.... but if you register all the possible concrete types, it may work.15:23
niemeyer"""15:24
niemeyerInterface types are not checked for compatibility; all interface types are treated, for transmission, as members of a single "interface" type, analogous to int or []byte - in effect they're all treated as interface{}. Interface values are transmitted as a string identifying the concrete type being sent (a name that must be pre-defined by calling Register), followed by a byte count of the length of the following data (so the value can be skipped if i15:24
niemeyert cannot be stored), followed by the usual encoding of concrete (dynamic) value stored in the interface value. (A nil interface value is identified by the empty string and transmits no value.) Upon receipt, the decoder verifies that the unpacked concrete item satisfies the interface of the receiving variable.15:24
niemeyer"""15:24
rogpeppeniemeyer: yeah, it may work, but i haven't tried it yet15:25
mattywfwereade__, http://pastebin.canonical.com/8841915:25
niemeyerrogpeppe: Well, either it works, or there's a bug in the code, or in the documentation15:25
rogpeppeniemeyer: it's, let's say, a less explored area :-)15:26
niemeyerrogpeppe: Ugh.. I see you got another crash, inside the gc this time15:26
niemeyerrogpeppe: and Russ is out.. sucs15:26
niemeyerks15:26
rogpeppeniemeyer: yeah.15:27
rogpeppeniemeyer: and i have other stuff that i need to be doing too. but... this is important stuff to sort out, i think.15:27
niemeyerrogpeppe: If 1.1 is shipped and stuff breaks randomly, yeah, it'll be pretty bad15:28
rogpeppeniemeyer: yeah, the gob thing works ok: http://play.golang.org/p/XGWLfQgPzr15:32
niemeyerrogpeppe: Cool15:33
rogpeppeniemeyer: BTW here's little go hack i'd not thought of before to give you arbitrary variable-length keys to maps: http://play.golang.org/p/cs0HTON9nj15:36
rogpeppeniemeyer: i don't think i'd ever actually want to *use* it, but it's kinda cool15:36
niemeyerrogpeppe: That's very cool.. hadn't thought about it either (but haven't needed it so far as well)15:39
dimiternfwereade__: I disagree with some of your suggestions, not the important ones though - would you take a look again please? https://codereview.appspot.com/833004315:39
fwereade__mattyw, would you explain a little bit more about your use case here? you're trying to open an API connection given an env config?15:39
niemeyerrogpeppe: It reminds of erlang15:39
fwereade__dimitern, ack15:39
rogpeppeniemeyer: binary trees as keys, yay!15:39
niemeyerrogpeppe: I'd call {a, b} as head, tali15:40
niemeyertail15:40
rogpeppeniemeyer: yeah15:40
niemeyerrogpeppe: There are tons of algorithms around this logic.. erlang lists are fully based on them, given the immutability of the language15:40
niemeyerrogpeppe: Good one, thanks15:40
mattywfwereade__, at the moment the use case is just, spin up some units, connect to the api and start adding and removing units to see what I get back from the api15:45
mattywfwereade__, (the adding and removing of units would be done using the juju command - not the api)15:46
fwereade__dimitern, did you propose your changes?15:46
fwereade__mattyw, ok, I do see the problem; and rogpeppe, I take your point, but the fact remains that if we'd done it properly in the first place we wouldn't have to deal with any of this absurdity :(15:48
fwereade__rogpeppe, config.SetCLIMode()? :(15:49
rogpeppefwereade__: i'm not sure. for the user-level logic, it makes sense to fall back to stuff in $HOME15:49
rogpeppefwereade__: i'd prefer the logic the other way around actually15:49
rogpeppefwereade__: config.SetDaemonMode15:49
rogpeppefwereade__: so that user-level client code can proceed with as little friction as possible15:49
fwereade__rogpeppe, if stuff is going to fail stupidly I want it to fail on the client where we can see it happening15:50
rogpeppefwereade__: haven't we just set stuff up so we'll get a panic on the server not the client?15:50
dimiternfwereade__: yeah, I proposed them15:51
fwereade__rogpeppe, what I *thought* we were doing was setting stuff up so we'd get a panic in the *tests*15:51
fwereade__rogpeppe, but it turns out that JujuConnSuite shares certain characteristics with the honey badger15:51
dimitern:D15:52
fwereade__rogpeppe, and renders all that useless15:52
dimiternthe honey badger is the best15:52
* rogpeppe doesn't know anything about honey badgers15:52
dimiternthe honey badger just doesn't give a shit basically15:53
dimiternrogpeppe: http://en.wikipedia.org/wiki/The_Crazy_Nastyass_Honey_Badger15:53
rogpeppedimitern: 58m views + 1 :-)15:54
rogpeppeniemeyer: sum total of ways we use goyaml: http://paste.ubuntu.com/5677075/15:56
niemeyerrogpeppe: Hah :)15:57
rogpeppeniemeyer: well, from the uniter tests anyway15:57
niemeyerrogpeppe: Sure.. you've managed to break things wiht it alone, right?15:57
rogpeppeniemeyer: yup15:58
niemeyerrogpeppe: Superb15:58
niemeyerrogpeppe: piece of cake then15:58
* rogpeppe is on it15:58
=== deryck is now known as deryck[lunch]
fwereade__mattyw, ok, I have not yet come up with the Right answer16:04
fwereade__mattyw, the expedient one, though, is as rogpeppe anti-suggested -- copy checkJujuHome from cmd/juju/main.go and call it before you mess with environments locally16:05
* fwereade__ remains convinced that the CLI is the special case, and that privileging that case to the point where it's inseparable from environ config was a seriously Bad Move16:09
rogpeppefwereade__: i'm not sure about the former, but i think you're probably right about the latter.16:10
hazmatdoes the charm and tools cache also get moved with juju_home?16:10
fwereade__hazmat, yeah16:10
rogpeppehazmat: i think so, yes16:10
fwereade__hazmat, authorized-keys is the wrinkle16:10
hazmatfwereade__, ic. i ended up with authorized-keys-path lookup relative to juju home in addition to absolute lookups. authorized-keys by themselves are content16:13
fwereade__hazmat, hell, that is a very good point, would you file a bug please?16:14
hazmatfwereade__, sure16:16
dimiternrogpeppe: I need one more review of the provisioner stuff https://codereview.appspot.com/8330043/ - it should be quick and simple16:16
fwereade__hazmat, that is strange, because the code seems to all be in terms of $HOME16:16
hazmatfwereade__, trunk is yes, i needed it for some easier multi-env management so i just added it16:17
hazmati need to reinstall 64bit i think to get going with juju-core.. doing it in vms isn't quite as nice.16:17
hazmator amenable to hacking16:18
hazmaton a branch re adding16:18
hazmatfwereade__, bug is here.. i think that's what your referencing. bug 116460116:19
_mup_Bug #1164601: JUJU_HOME should support relative lookup paths for authorized-keys-path <juju-core:New> < https://launchpad.net/bugs/1164601 >16:19
rogpeppedimitern: i'm just wondering what is the motivation for the change to processMachines16:21
dimiternrogpeppe: so we can remove unknown/not matching machines before trying to start matching unprovisoned ones16:22
rogpeppedimitern: is it just to make sure that we stop unknown instances even when startMachines continually fails?16:22
dimiternrogpeppe: it won't fail more than once now - the machine will be in error state and will be ignored16:23
dimiternrogpeppe: on the other hand, if a machine was started but SetInstanceId failed we'll remove it next time before16:24
dimiternrogpeppe: before trying to start new ones16:24
rogpeppedimitern: i'm not sure i see why the order matters16:25
rogpeppedimitern: won't it all work ok in the original order?16:25
rogpeppedimitern: my reason for asking is that findUnknownInstances and stopInstances may be quite slow, and it might be better if we did the start instance ASAP16:26
dimiternrogpeppe: I should refer you to fwereade__ for better explanation then, sorry16:28
fwereade__rogpeppe, if we stop unknown ones before starting new ones, we should be able to avoid ever having 2 instances of the same agent running at the same time16:29
rogpeppefwereade__: ah!16:29
* dimitern still doesn't get how this could happen though..16:30
fwereade__dimitern, SetInstanceId fails16:30
rogpeppefwereade__: assuming ec2 eventual consistency does actually tell you the names of all recently  started machines of course :-)16:31
fwereade__rogpeppe, yeah, there's that16:31
fwereade__rogpeppe, call it a best effort that should hit a number of cases in which we would otherwise be encouraging that sort of inconsistency16:31
rogpeppefwereade__: so should an agent check that its instance id is set before carrying on to do stuff?16:31
rogpeppefwereade__: because otherwise it will be killed ad-hoc16:32
fwereade__rogpeppe, probably, that would be sensible16:32
fwereade__rogpeppe, and that would eliminate the case I was trying not to think about, too16:32
dimiternif SetInstanceId fails, how come we'll have 2 instances having the same agent?16:32
rogpeppedimitern: we've started the instance16:33
rogpeppedimitern: but we haven't managed to set the instance id of the machine16:33
rogpeppedimitern: so the new machine comes up16:33
rogpeppedimitern: and starts running an agent16:33
dimiternrogpeppe: but the MA is local to the instance, right? each instance will have its own MA16:33
rogpeppedimitern: but in the meantime, the provisioner sees that there's a machine without an instance id16:33
fwereade__rogpeppe, the only one left over is the initial password-setting... I think we could possibly get an ugly race there, in which we have 2 machines running the agent but neither able to log in16:33
rogpeppedimitern: and starts another instance16:34
fwereade__rogpeppe, but it would be *very* hard to trigger, I think16:34
rogpeppefwereade__: it could be mitigated if the agent checked the instance id *before* setting the password16:34
fwereade__rogpeppe, oh, maybe not16:34
fwereade__rogpeppe, yep16:34
dimiternrogpeppe: yeah, a third one, while the other two (one with instid, other without) are still running16:34
fwereade__rogpeppe, but wait... ok, I need to go chop vegetables and think about this16:35
dimiternrogpeppe: and we'll have 2 valid running machines and one stray16:35
dimiternrogpeppe: how does this translate to having the same agent on 2 machines?16:35
fwereade__dimitern, +100 to rog's suggestion of making the UA resilient to this16:36
fwereade__dimitern, two machines are both configured to run the agent for machine 316:36
fwereade__dimitern, both of them start running wordpress/016:36
fwereade__dimitern, everything goes to shit16:36
dimiternfwereade__: ah! the machineid is the same, but one doesn't have instanceid!16:36
fwereade__dimitern, yeah16:36
dimiterngot it finally :) cheers16:37
rogpeppefwereade__, dimitern: ha, it's easy to check the InstanceId before the password's set16:37
dimiternrogpeppe: where?16:37
rogpeppedimitern: in MachineAgent.Entity16:37
dimiternrogpeppe: shouldn't this be a separate CL?16:37
rogpeppedimitern: definitely16:38
fwereade__dimitern, yes16:38
rogpeppedimitern: but i'd like a comment16:38
fwereade__dimitern, sorry, I should have been clear about that16:38
rogpeppedimitern: in processMachines16:38
rogpeppedimitern: because this is subtle stuff16:38
fwereade__dimitern, yeah, maybe best if you rewrite the explanation according to your understanding16:38
dimiternrogpeppe: cool, write a review comment and I'll add a card to do it in a follow-up16:38
rogpeppedimitern: ok16:39
dimiternfwereade__: sgtm, will just add the "same machine ID" stuff to the comment and it should be clear enough16:39
rogpeppefwereade__: of course, we are assuming that StopInstance is synchronous here16:39
rogpeppefwereade__: the machine agent should probably check that the instance id of the machine matches the instance id of the machine it's actually running on16:40
rogpeppefwereade__: in fact, with that check we *could* start new machines before stopping unknown ones, i think16:42
dimiternrogpeppe: ok, noted down 2 changes for a follow-up: MA should ensure the instance id of the machine in state matches the one it's running on, and then set the password to connect to state16:46
rogpeppedimitern: sounds good16:47
dimiternrogpeppe: sounds good?16:47
rogpeppedimitern: you've got a review BTW16:47
dimitern:) great, I'll add a card16:47
dimiternrogpeppe: tyvm16:47
dimiternfwereade__: you're also fine with the changes I did?16:49
dimiternrogpeppe: where's that MachineAgent.Entity you mentioned?17:02
rogpeppedimitern: in jujud/machine.go17:02
dimiternrogpeppe: and when's the password being set - I can't see17:04
rogpeppedimitern: in RunAgentLoop (by openState)17:04
dimiternrogpeppe: ok, thanks17:05
dimiternfwereade__: ping17:06
dimiternok i'm off, good night all17:08
=== ChanServ changed the topic of #juju-dev to: https://juju.ubuntu.com | On-call reviewer: - | Bugs: 2 Critical, 53 High - https://bugs.launchpad.net/juju-core/
=== deryck[lunch] is now known as deryck
=== BradCrittenden is now known as bac
=== BradCrittenden is now known as bac
rogpeppeniemeyer: i did the goyaml-as-rpc thing and it still crashes (earlier actually, but in a similar way)18:50
niemeyerrogpeppe: Oh, "sweet"18:50
rogpeppelol18:50
rogpeppeniemeyer: here's the rpc server BTW: http://paste.ubuntu.com/5677572/18:51
rogpeppeniemeyer: i got a fair few of the standard goyaml tests passing but not all18:51
niemeyerrogpeppe: Thta's neat!18:51
niemeyerrogpeppe: Well, that's not necessary18:52
niemeyerrogpeppe: We just need enough of it to get the juju tests passing18:52
rogpeppeniemeyer: indeed. but it was useful to get a reasonable assurance that it was doing something right18:52
niemeyerrogpeppe: +118:52
niemeyerrogpeppe: So there we go18:52
rogpeppeniemeyer: yup18:52
niemeyerrogpeppe: We should really raise a harder flag on that one18:52
niemeyerrogpeppe: Did you increment that data on the issue yet?18:53
rogpeppeniemeyer: it's consistently dying here now: http://paste.ubuntu.com/5677577/18:53
rogpeppeniemeyer: just about to18:53
niemeyerrogpeppe: Some weird addresses there18:55
rogpeppeniemeyer: if put an "if nil" check and the problem moved elsewhere. darn heisenbugs!18:59
rogpeppes/if/i/18:59
niemeyerrogpeppe: Okay, is that nil value still that in that field that is actually a by-value field?19:01
niemeyerrogpeppe: If so, it's of course impossible for it to be nil without a GC error, so this might be a good case to raise19:01
rogpeppeniemeyer: i'm checking for socket==nil in mongoSocket.Acquire19:02
rogpeppeniemeyer: it looks like that can be nil19:02
niemeyerrogpeppe: Okay, I'm thinking of that traceback on server.go19:02
rogpeppeniemeyer: except i'd expect the first arg to Acquire to be 0x0 if that was the case19:03
niemeyerrogpeppe: If it can be reproduced, that'd be an easier one19:03
rogpeppeniemeyer: i just want to catch it in the act19:03
niemeyerrogpeppe: Because it was claiming there was a nil value that was impossible to be nil without the GC being on crack19:03
rogpeppeniemeyer: really? i'm not sure i see that19:04
niemeyer(or an unsafe error, which we can't have anymor)19:04
niemeyerrogpeppe: Do you have the old traceback around?19:04
rogpeppeniemeyer: how old? :-)19:04
niemeyerrogpeppe: The first one I saw you mentioning on that regard19:05
niemeyerrogpeppe: So pretty old I suspect19:05
rogpeppeniemeyer: this one is the first one i saw, i think: http://paste.ubuntu.com/5677624/19:06
rogpeppeniemeyer: (as reported in the issue)19:06
rogpeppeniemeyer: it looks fairly similar19:06
niemeyerrogpeppe: Okay, it wasn't that one19:06
niemeyerrogpeppe: Hmm19:09
niemeyerrogpeppe: There's a race there19:09
niemeyerrogpeppe: It may be an actual issue in mgo19:10
rogpeppeniemeyer: FWIW i ran the tests with the race detector enabled ok19:10
niemeyerrogpeppe: I don't know about that, but there's a logic error in the acquireSocket function19:10
niemeyerrogpeppe: Open session.go in 289819:11
niemeyerrogpeppe: Invert the two Acquire calls so that RUnlock is done after they're the Acquire19:11
rogpeppeniemeyer: s.masterSocket.Acquire() ?19:11
niemeyerrogpeppe: Both slave and master19:11
niemeyerrogpeppe: They both have the same mistake19:12
rogpeppeniemeyer: ok, will do19:12
niemeyerrogpeppe: It's entirely possible that in a concurrent scenario the sockets are niled out before the Acquire is done, because they're outside of the lock19:12
rogpeppeniemeyer: yes19:12
niemeyerrogpeppe: Hah.. on a unrelated note, check out this weird bug that was fixed in tip: http://play.golang.org/p/FceemIemVw19:16
rogpeppeniemeyer: ah yes, i caught that one going through19:17
rogpeppeniemeyer: apparently some people were relying on it19:17
niemeyerrogpeppe: Indeed.. I got an error report about that :)19:17
niemeyerrogpeppe: How's the test going?19:18
rogpeppeniemeyer: just crashed that second19:18
niemeyerrogpeppe: Where?19:18
rogpeppeniemeyer: same place19:18
niemeyerrogpeppe: Can I see the new traceback?19:19
rogpeppeniemeyer: http://paste.ubuntu.com/5677661/19:19
rogpeppeniemeyer: the line number's slightly different because the start of Acquire does this check:19:20
rogpeppeniemeyer: http://paste.ubuntu.com/5677666/19:20
rogpeppeniemeyer: so it really does look like corruption19:20
rogpeppeniemeyer: otherwise the first panic would have been triggered19:21
niemeyerrogpeppe: Wow.. yeah19:21
niemeyerrogpeppe: What's the Go issue again? It scrolled out of view19:22
rogpeppeniemeyer: 519319:22
niemeyerrogpeppe: Thanks19:22
rogpeppeniemeyer: (what, no search?)19:22
niemeyerrogpeppe: I'm lazy.. but if I have to debate that with you it doesn't work anymore.. :-)19:23
rogpeppelol19:23
niemeyerrogpeppe: The fact mgo puts every socket on their own goroutine is probably helping to trigger the bug19:25
rogpeppeniemeyer: yeah.19:25
niemeyerrogpeppe: Or.. the reading side of the socket, anyway19:25
niemeyerrogpeppe: I sent a private ping to Rob about the issue, just to make sure it won't be left behind19:36
rogpeppeniemeyer: thanks19:36
niemeyerrogpeppe: np.. usually I wouldn't be worried since rsc is CCd, but given he's away, I'm actually a bit concerned19:36
rogpeppeniemeyer: it is marked as Priority-ASAP19:36
niemeyerrogpeppe: It can easily be unmarked as well :)19:38
rogpepperight, that's me utterly done for the day20:10
rogpeppeniemeyer: thanks for the assistance20:10
thumperhi rogpeppe20:10
thumpernight rogpeppe20:10
rogpeppethumper: hiya20:10
rogpeppethumper: how's tricks?20:10
thumperrogpeppe: good, was just thinking about code, then realised that today is the last day for peer reviews20:11
rogpeppethumper: anything you wanna chat about while we're online together? :-)20:11
thumperso I have those to do instead20:11
thumpernot really...20:11
rogpeppethumper: oh frick20:11
thumperreviews suck the life out of me20:11
thumperI know they shouldn't20:11
thumperbut they do20:11
thumperat least I'm not doing manager ones20:11
rogpeppethumper: code reviews or peer reviews... or both?20:11
thumperI would have had eight extra to do otherwise20:11
thumperjust peer / manager / report reviews20:12
thumpercode reviews are fine20:12
* rogpeppe grabs a beer and settles down to do some reviews20:14
rogpeppethumper: where should i have found out about the peer review deadline, other than by word of mouth?20:15
thumperrogpeppe: in the all hands email20:15
thumperabout reviews20:15
thumperquite a few weeks back20:15
thumpergave the timeline in it20:15
rogpeppethumper: ah, found it. i wonder if "by 5th April" is by end of day or end of previous day.20:17
thumperit normally means by the end of the 5th in the west-most time zone20:18
thumper:)20:18
rogpeppethumper: cool. tomorrow, then...20:20
rogpeppethumper: have a good day20:21
thumperrogpeppe: keep the beer for it though20:21
thumperrogpeppe: it helps20:21
rogpeppethumper: i certainly will20:21
niemeyerrogpeppe: have a good night20:21
* thumper has put off reviews long enough, and goes to log into the system20:57
thumperhi davecheney23:52
thumperdavecheney: https://code.launchpad.net/~thumper/juju-core/package/+merge/15702023:53
thumperdavecheney: and https://code.launchpad.net/~thumper/+recipe/juju-core-daily23:53
thumperdavecheney: you can see in the built packages that the files are there and the man page too23:53

Generated by irclog2html.py 2.7 by Marius Gedminas - find it at mg.pov.lt!