/srv/irclogs.ubuntu.com/2014/06/26/#juju-dev.txt

bodie_thumper -- sorry about that!  oops.... sigh00:08
bodie_I'd cherry-picked the PR content it's waiting on in order to put the work through.  let's see here...00:09
bodie_apologies for wasting your time with that00:11
bodie_there we go00:15
bodie_https://github.com/juju/juju/pull/16300:15
sinzuiwallyworld, the job failed even after I cleaned up the environment. Here are the logs http://juju-ci.vapour.ws:8080/job/local-upgrade-precise-amd64/1434/00:36
wallyworldsinzui: sec, talking to alexisb , will contact you soon00:36
wwitzel3wallyworld: do you know where I can get the instance type information I need for the client API?00:37
wallyworldwwitzel3: yeah, sorry, i read your email and haven't had a chance to respond yet - been in meetings all morning. there is an api there, i just have to look at it and let you know. will do so soon00:45
wwitzel3wallyworld: np, thanks.00:47
wallyworldsinzui: i can see the state server workers all start up, and also mongo. there appears to be no reason why the api client cannot connect to port 17070 - is it possible to do a netstat to see if the state server is indeed listening on the correct port?00:51
sinzuiwallyworld, This is what I saw during the upgrade http://pastebin.ubuntu.com/7703456/00:55
sinzuiwallyworld, WTF, this just happened on the next test00:56
sinzuihttp://juju-ci.vapour.ws:8080/job/local-upgrade-precise-amd64/1435/console00:56
wallyworldsinzui: is that done after these lines00:56
wallyworld2014-06-26 00:13:38 INFO juju.mongo open.go:90 dialled mongo successfully00:56
wallyworld2014-06-26 00:13:38 DEBUG juju.state open.go:58 connection established00:56
* wallyworld looks at new console00:56
wallyworldseriously? it is mocking us00:57
sinzuiwallyworld, it is a pass with a panic...that is a first00:57
wallyworldsinzui: i think that's due to the agent shutting down for upgrade, just caught it at a bad time00:58
sinzuiwallyworld, status may have panicked, it is called several times. the last call gave a result showing all machines upgraded00:58
sinzuiwallyworld, could be...but the code i wrote tries to capture that...that is why status will be called several times00:59
wallyworldsinzui: i can see why status is behaving that way and there was a recent change there - i think it's missing a sanity check01:00
wallyworldwe can get back an error from the call to get status and still have partial status to display01:01
wallyworldbut we should check that we do indeed have some status and not nil01:01
wallyworldso it will be a simple fix if i am correct01:01
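The sanity check wallyworld describes can be sketched roughly as follows. This is a minimal illustration, not the fix that landed: Status, fetchStatus and render are hypothetical names invented here, and the only point is that a call may return an error together with partial status, or an error with no status at all, so the pointer has to be checked before it is displayed.

```go
package main

import (
	"errors"
	"fmt"
)

// Status is a hypothetical stand-in for the status result type.
type Status struct {
	Machines map[string]string
}

// fetchStatus is a hypothetical status call. It always reports an error,
// and returns either partial status or nothing at all alongside it.
func fetchStatus(partial bool) (*Status, error) {
	err := errors.New("some machines could not be queried")
	if partial {
		return &Status{Machines: map[string]string{"0": "started"}}, err
	}
	return nil, err
}

// render displays whatever status is available. The nil check is the
// sanity check: without it, ranging over or dereferencing status panics
// whenever the call failed before producing any status.
func render(status *Status, err error) (string, error) {
	if status == nil {
		if err == nil {
			err = errors.New("no status returned")
		}
		return "", fmt.Errorf("unable to obtain status: %v", err)
	}
	out := fmt.Sprintf("%d machine(s)", len(status.Machines))
	if err != nil {
		// Partial status is still worth showing, flagged as incomplete.
		out += fmt.Sprintf(" (incomplete: %v)", err)
	}
	return out, nil
}

func main() {
	for _, partial := range []bool{true, false} {
		out, err := render(fetchStatus(partial))
		fmt.Println(out, err)
	}
}
```

The design choice here is to keep the error and the partial result separate, so a caller can show what it has while still reporting that something went wrong.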
wallyworldsinzui: but i wonder why the CI job failed the first time, there's no obvious reason01:02
sinzuiwallyworld, I am unsure what to do now. If I wasn't watching that happen, I would declare the test good and just focus on the ill azure01:03
wallyworldsinzui: 2 out of 3? :-D01:03
sinzuiwallyworld, exactly01:03
wallyworldlet's run it again01:03
sinzuiazure and joyent are messed up. I am in the consoles killing machines01:04
wallyworldmenn0: hi ya, i think there's an issue with the recent changes to status. i think we are missing a nil check. see http://pastebin.ubuntu.com/7703483/01:04
wallyworlddo you agree?01:05
menn0yep. davecheney has already fixed it01:06
menn0https://github.com/juju/juju/pull/12701:06
menn0wallyworld: it looks like the landing bot didn't pick up the merge request though.01:06
menn0wallyworld: how do we make it notice the PR?01:07
wallyworldum01:07
wallyworld$$merge$$ should have been enough01:07
wallyworldi'll look into it01:07
menn0It could be because this was a PR where the merge started and then was aborted because the initial proposal wasn't quite right01:07
menn0wallyworld: ^^01:07
wallyworldah01:08
menn0wallyworld: I think you were the one who killed it in Jenkins (at dave and my request)01:08
wallyworldmenn0: yes. but after that it needs $$merge$$ again to re-trigger01:08
wallyworldi'll do that01:09
menn0well it's had that01:09
menn0but try again I guess01:09
sinzuiwallyworld, (Sorry for this awkward question from my daughter), Are all 40-50 year-old Aussies obsessed with ABBA.01:09
sinzuiI said no, but she doesn't believe me01:09
wallyworldrotfl01:09
wallyworldonly some of us :-)01:09
wallyworldabba were very, very popular here01:09
wwitzel3who isn't obsessed with ABBA? am I right?01:10
sinzuiI know. I told her about how many weeks Fernando and Dancing Queen spent at number one and she then decided there is a cohort who can't let the band go01:10
wallyworldshe is right01:14
wallyworldbut i am not one of them :-)01:14
wallyworldwwitzel3: i sent you an email - there's a little refactoring required, sorry01:15
wwitzel3wallyworld: ok, yeah, I saw that .. I think I can just add that to the interface in common provider? But I am still not sure how to actually get the provider to call the method on.01:16
wallyworldwwitzel3: you have it in your method01:17
wallyworld func (api *EnvironmentAPI) getInstanceTypes(env environs.Environ)01:17
wallyworldenv is the provider01:17
wallyworldso you add the method to Environ01:17
wwitzel3wallyworld: lol01:18
wwitzel3wallyworld: of course it is01:18
wallyworldthe ConstraintsValidator() method is already there01:18
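A rough sketch of the pattern being described: the environ passed into the API method is the provider, so exposing new provider information means adding a method to the interface and calling it on env. Everything below is illustrative only. The Environ interface is a trimmed-down stand-in rather than the real environs.Environ, and the InstanceTypes method name, its signature and the sample values are assumptions.

```go
package main

import "fmt"

// Environ is a hypothetical, trimmed-down stand-in for the provider
// interface. Only the assumed new method is shown.
type Environ interface {
	// InstanceTypes is an assumed name for the new method that each
	// provider would implement.
	InstanceTypes() ([]string, error)
}

// fakeEnviron is a made-up provider used only for this example.
type fakeEnviron struct{}

func (fakeEnviron) InstanceTypes() ([]string, error) {
	// Sample data for illustration only.
	return []string{"small", "large"}, nil
}

// EnvironmentAPI mirrors the receiver named in the conversation.
type EnvironmentAPI struct{}

// getInstanceTypes needs no extra lookup: env is already the provider,
// so it simply calls the method on it.
func (api *EnvironmentAPI) getInstanceTypes(env Environ) ([]string, error) {
	return env.InstanceTypes()
}

func main() {
	api := &EnvironmentAPI{}
	types, err := api.getInstanceTypes(fakeEnviron{})
	fmt.Println(types, err)
}
```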
wallyworld:-)01:18
wwitzel3I was sooo close01:18
wallyworldyep :-)01:18
wwitzel3wallyworld: thanks :)01:18
wallyworldnp01:18
wallyworldmenn0: i have no idea what's wrong, i'm just going to merge it directly01:19
menn0wallyworld: ok thanks01:19
wallyworldsinzui: i just merged in a fix for that status panic01:20
sinzui:)01:21
wallyworldsinzui: it was proposed a few days ago it seems but the bot just didn't want to pick it up01:21
jcw4review requested: https://github.com/juju/juju/pull/164 ;  this just updates for the newly updated names package and makes the internal structure of the Action consistent with other state structures.01:46
thumperwallyworld: with you shortly02:01
wallyworldok02:01
waiganithumper: I'm thinking of picking up this bug: https://github.com/juju/juju/issues/13802:28
waiganithumper: I see there are two ssh clients: openssh and gocrypto embedded. Does the gocrypto save known_hosts?02:29
thumperwaigani: first thing, can you move that bug to launchpad?02:29
waiganithumper: sure02:29
waiganithumper: https://bugs.launchpad.net/juju-core/+bug/133448102:32
_mup_Bug #1334481: juju should not record ssh certificates of ephemeral hosts <juju-core:Triaged by waigani> <https://launchpad.net/bugs/1334481>02:32
thumperwaigani: can you also link that on github too? for the issue02:32
waiganithumper: done02:33
waiganiadded comment02:33
waiganithumper: shall I do the same for this one: https://github.com/juju/juju/issues/13302:34
thumperwaigani: check to see if it has been done already, but yes02:34
waiganithough there has already been some discussion on github02:34
waiganiok02:34
waiganithumper: done, and linked on github02:37
thumperwaigani: yes, working on that issue would be good02:37
waiganithumper: cool. I'll start with a failing test. So we will just not store the known hosts at all on ssh right?02:38
thumperwaigani: yeah... but just for juju ssh02:40
waiganithumper: cmd/juju/ssh ?02:41
thumperyup02:41
wallyworldaxw: got time for a quick hangout?02:49
axwwallyworld: can you give me 5 mins please?02:50
wallyworldsure02:50
axwwallyworld: I'm in the tanzanite-daily hangout02:54
jcw4thumper: fwiw, on PR 163, a lot of the tag/id stuff has been clarified with that last names package update I submitted02:55
thumperheh...02:55
thumperI'm just commenting on what I see02:55
thumper:)02:55
jcw4thumper: bodie_ told me this morning that he expected to have to do some refactoring once my change was in02:56
jcw4thumper: :)02:56
* jcw4 is just nervous about when thumper's beady eye gets on my PR next02:56
thumperjcw4: which PR is it you want me to look at?02:57
jcw416402:57
jcw4or not... y'know02:57
jcw4if you need a nap or something...02:57
bodie_thumper, cleaned up 163, btw -- it looks like you're commenting on the content I removed :(02:57
jcw4all jokes aside, I'm *loving* the code review process on this project02:57
bodie_it's still useful -- the code you're commenting on is for PR 140 and 141, so I can still make use of it02:58
thumperbodie_: hmm... ok, what should I be looking at?02:59
bodie_https://github.com/binary132/juju/commit/1de2d29aba97a422da32fcfde1a15c94e150e1ad02:59
jcw4bodie_: is that because your PR 163 was rebased on top of your pending 140 and 141 ?03:00
thumperbodie_, jcw4: something to be aware of, with the upcoming work on multi-environment state servers, all the document _id fields will change to include the env uuid03:00
bodie_yes, I removed the condensed commit and pushed --force so I thought it would be clear it was gone from the PR03:00
bodie_but, the rebased commit was just the same content from 140 and 141, so I could run tests03:01
jcw4thumper: so if we hide that _id behind the public api of the state types we should be fine right?03:02
thumpergenerally...03:02
rick_h_wallyworld: :P how many bundles have you created by hand?03:02
wallyworldrick_h_: none03:03
wallyworldbut it sure would be nice to have a cli for it03:03
rick_h_wallyworld: sure, from an existing environment as a dump/backup.03:03
wallyworldeg juju start bundle followed by a series of deploy, relate commands, then juju end bundle03:04
rick_h_wallyworld: but anyway, replied. rubbing some ointment over the gui comment sting :P03:04
rick_h_wallyworld: it's called a shell script, you can do that today03:04
wallyworldrick_h_: sorry if it came out bad, wasn't intended03:05
rick_h_wallyworld: I'm joking with you03:05
wallyworldi just wanted to make the point that most of our target audience don't use guis03:05
rick_h_wallyworld: except I don't think bundle creation is a cli/scriptable thing03:05
rick_h_wallyworld: do we have a target audience now?03:05
wallyworlddevop people?03:05
wallyworldi guess windows folks want a gui03:06
rick_h_because if we're going to talk small devs I'll argue with you03:06
wallyworldi'm a small dev03:06
wallyworldi don't use guis03:06
rick_h_but yes, at scale people want scriptable > * (*cough* thumper *cough*)03:06
wallyworldyup03:06
rick_h_wallyworld: never confuse you vs a target audience.03:06
rick_h_I use vim and a terminal all day and the only gui app I run is a browser03:07
thumperrick_h_: I'm going to make you happy and make it all script happy03:07
thumperrick_h_: leave it with me :)03:07
rick_h_thumper: :) hey wallyworld is preaching it too03:07
thumperrick_h_: I've already cleared the approach with fwereade03:07
rick_h_woot03:07
* rick_h_ does happy dance03:07
wallyworldthumper: you have a doc to share on that?03:07
thumperwaigani: nope, it is inside my head, but not complex03:08
rick_h_wallyworld: thumper and I were having this conversation yesterday so glad to see your chime in as well.03:08
wallyworldthumper: must be simple to be in your head03:08
thumperwallyworld: it is03:08
rick_h_ouch03:08
rick_h_maybe there's a GUI in it03:08
wallyworldlol03:09
rick_h_har har!03:09
wallyworldyou funny03:09
rick_h_I try, it's past my bedtime03:09
wallyworldcareful, you'll turn into a pumpkin03:09
rick_h_half way there, let me get an orange shirt03:09
wallyworldlol03:11
wallyworldand a green hat03:11
wallyworldand a camera03:12
* thumper takes a deep breath and moves to the next PR03:24
* bodie_ hands thumper a bottle of water and cheers him on03:27
sinzuithumper, wallyworld. A recent rev broke the win installer. We cannot compile it https://bugs.launchpad.net/juju-core/+bug/133449303:37
_mup_Bug #1334493: Cannot compile win client <regression> <windows> <juju-core:Triaged> <https://launchpad.net/bugs/1334493>03:37
* sinzui forces a build of a revision before the win and os revisions03:42
thumperbodie_, jcw4: some comments on PR 16403:45
jcw4thumper: right behind you03:45
thumperparticularly the last one, as that is the biggest question I have03:45
jcw4I'll comment on the pr thumper, but this goes back to that watcher point03:46
jcw4if we have a watcher on the actions collection03:47
jcw4and that watcher gets _id's for *free*03:47
jcw4we can filter on those _id's without another db hit03:47
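The idea of filtering on the ids a watcher already delivers can be sketched like this, assuming the unit name is encoded as a prefix of each action's _id. The "_a_" marker and the function names are hypothetical and are not taken from the PR under review.

```go
package main

import (
	"fmt"
	"strings"
)

// actionMarker is a hypothetical separator between the unit name and the
// rest of the action id.
const actionMarker = "_a_"

// actionPrefix returns the id prefix shared by all actions queued for a
// unit. Including the marker stops "mysql/1" from matching "mysql/10".
func actionPrefix(unitName string) string {
	return unitName + actionMarker
}

// filterForUnit keeps only the changed ids that belong to the given unit.
// It works purely on the id strings, so no further database query is made.
func filterForUnit(changedIDs []string, unitName string) []string {
	prefix := actionPrefix(unitName)
	var matched []string
	for _, id := range changedIDs {
		if strings.HasPrefix(id, prefix) {
			matched = append(matched, id)
		}
	}
	return matched
}

func main() {
	changes := []string{"mysql/1_a_0", "mysql/10_a_0", "wordpress/0_a_3", "mysql/1_a_1"}
	fmt.Println(filterForUnit(changes, "mysql/1"))
}
```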
thumpersure, but that doesn't answer the question03:48
jcw4thumper: because there could be multiple actions with the same name03:48
thumperjcw4: a key question is: "Is the combination of unit and action name unique?"03:48
jcw4thumper: no03:48
thumperwhy?03:48
thumperwhat is the differentiating point that makes actions here special?03:48
thumperhow does a user differentiate?03:49
thumperif I say "run the backup action" it may mean multiple things?03:49
thumperif so, why?03:49
thumperor is this "an instance of someone running the backup action" ?03:49
jcw4thumper: every time a user types 'juju do <action>' an Action gets queued on the actions collection using the assigned unit and <action> name03:50
jcw4thumper: I may say the same command twice03:50
jcw4thumper: intending it to run twice03:50
thumperok...03:50
thumperhow come an action doesn't have a user?03:50
thumperor a date requested?03:51
thumperI think an action should have a timestamp that it was created03:51
thumperand who requested it03:51
jcw4thumper: my very first PR for this document had unitName, timestamp, (no user), etc.03:51
thumperheh03:51
jcw4in discussion w/fwereade we eliminated the unitName because it would be encoded in the _id03:52
wallyworldsinzui: looking03:52
jcw4the timestamp was deemed unnecessary for now03:52
jcw4thumper: the intent is for us to basically have a super lightweight 'tracer' implementation03:52
* thumper coughs03:52
jcw4thumper: and then fill in the details later03:53
* thumper looks shiftily at fwereade's shadow03:53
* jcw4 feels guilty for throwing fwereade under the bus03:53
thumperjcw4: what is the lifetime of an action?03:53
jcw4fwiw, I think fwereade's case was sound03:53
thumperwhen do we remove it?03:53
jcw4thumper: as long as it takes for the unit to execute it.03:53
jcw4probably minutes or seconds03:54
jcw4usually03:54
thumperso we end up with an action result?03:54
thumperhow long do they live?03:54
jcw4forever03:54
jcw4and ever03:54
thumperouch...03:54
* thumper foresees an issue03:54
thumperwe obviously have different definitions of lightweight03:55
thumperto me remembering who asked and when is part of very lightweight03:55
jcw4to be fair, we haven't discussed any archiving of the results yet03:55
thumperwhen you record the result, you then have a timestamp for finish and can then deduce a duration03:55
thumperjcw4: but results could be big right?03:55
jcw4thumper: indeed03:55
thumperjcw4: or do they point to locations on file?03:55
jcw4not in the current implementation03:56
thumperwell... they could...03:56
jcw4thumper: yep... tbh we hadn't thought that far ahead yet03:56
jcw4(we being me)03:56
thumpergiven that we want to back up the db periodically03:56
thumperand I don't want all my postgresql database backups stored in mongo03:56
* davecheney shrieks03:57
thumpersorry davecheney, bad moment to listen03:57
davecheneyi've been listening for a while03:57
davecheneyi just couldn't stand it any longer :)03:57
thumperhaha03:57
davecheneyhttp://paste.ubuntu.com/7703965/03:57
davecheneystill one more race in the state/apiserver package03:57
davecheneyi'm on it03:57
thumperta03:58
jcw4thumper, davecheney to be fair we don't have *any* actions actually runnable yet, so the danger isn't there until we do :)03:58
davecheneyjcw4: anything that ends with 'my backups are stored in mongodb' is horrifying03:59
thumperjcw4: so you are just going to hand us a hand grenade and say "here you go, juggle"03:59
* thumper chuckles03:59
* jcw4 wonders how to respond to that03:59
thumperheh03:59
jcw4well....03:59
jcw4:)03:59
thumperjcw4: we'll need a way for a user to say "please discard the results for this action now"03:59
davecheneywhen the only tool you have is mongodb, everything looks like /dev/null03:59
thumperdavecheney: mongo is web scale04:00
davecheneythumper: so's /dev/null04:00
jcw4:)04:00
thumperexactly04:00
davecheneyaxw: wallyworld is there a race build in jenkins ?04:00
wallyworldum04:00
jcw4thumper: so... we're trying to build/define actions here as we go04:00
wallyworldno04:00
davecheneyjcw4: that's going to end in tears04:00
thumperhaha04:01
wallyworldwe are considering it04:01
thumperhmm...04:01
davecheneypossible race from sabdfl, possible sadness from your team04:01
davecheneywallyworld: i'll add it to the weekly meeting notes as a discussion point04:01
wallyworldsure04:01
davecheneywallyworld: do you know the status of the release / upgrade ?04:01
davecheneyi was watching a bunch of reverts overnight04:01
davecheneythat then got reverted04:01
thumperjcw4: so... one question04:01
wallyworlddavecheney: reverts were red herring, i think a few conclusions were jumped to04:02
* thumper tries to formulate...04:02
wallyworlddavecheney: someone broke the windows build, i'm fixing that now04:02
sinzuiwallyworld, the build of the older revision, the one that only reverts daves rev04:02
davecheneywallyworld: i think it was a good hunch04:02
jcw4thumper, davecheney we've purposefully not exposed cli usage yet so that there's minimal exposure until we're done.04:03
thumperjcw4: ack04:03
sinzuiwallyworld, This is the first time I have specifically tested a rev to get a pass04:03
thumperjcw4: IMO, and fwereade may disagree, the id for any document should be composable from attributes in that document04:03
wallyworldsinzui: you talking about the local upgrade?04:03
thumperjcw4: so we don't need to parse the id to get attributes04:04
thumperjcw4: especially if parts of said id are used in other places04:04
thumpersuch as the tag04:04
sinzuiwallyworld, yes, but since dave's rev was immediately restored, CI never tested just the revision we wanted04:04
thumpergoing from a set of attributes to an ID is easier than trying to do the reverse04:04
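thumper's suggestion, sketched under assumptions: the document stores the unit name and a per-unit sequence number as ordinary fields, and the _id is derived from them, so nothing has to parse the id later. The field names, the sequence number and the "_a_" marker are hypothetical; the real document layout was still being discussed.

```go
package main

import "fmt"

// actionMarker is a hypothetical separator used when building the id.
const actionMarker = "_a_"

// actionDoc is an illustrative document shape. Every component of the id
// is also stored as its own attribute, at the cost of a few redundant bytes.
type actionDoc struct {
	Id       string `bson:"_id"`
	UnitName string `bson:"unitname"`
	Sequence int    `bson:"sequence"`
	Name     string `bson:"name"`
}

// actionID composes the id from attributes. Going this way is simple;
// recovering the attributes from an id would need parsing.
func actionID(unitName string, sequence int) string {
	return fmt.Sprintf("%s%s%d", unitName, actionMarker, sequence)
}

// newActionDoc builds a document whose id is always consistent with its
// own fields.
func newActionDoc(unitName string, sequence int, name string) actionDoc {
	return actionDoc{
		Id:       actionID(unitName, sequence),
		UnitName: unitName,
		Sequence: sequence,
		Name:     name,
	}
}

func main() {
	// The same action name queued twice on one unit still yields two ids.
	first := newActionDoc("mysql/0", 0, "backup")
	second := newActionDoc("mysql/0", 1, "backup")
	fmt.Println(first.Id, second.Id, first.UnitName)
}
```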
sinzuiwallyworld, the rev could be restore until CI had built juju without it04:04
jcw4thumper: that makes sense; it feels a little redundant, but makes sense to me04:04
thumperand the amount of data we are storing is minimal04:04
jcw4thumper: +10004:04
thumperseriously, minimal04:05
wallyworldsinzui: oh, so that one *may* have broken upgrades? i thought we just got a passing CI test?04:05
davecheneyhttps://bugs.launchpad.net/juju-core/+bug/133450004:05
_mup_Bug #1334500: state/apiserver: more data races <race-condition> <juju-core:Triaged by dave-cheney> <https://launchpad.net/bugs/1334500>04:05
wallyworldthumper: https://github.com/juju/juju/pull/165 fixes a release blocker sinzui found04:05
jcw4thumper, davecheney what we need is some way to incrementally build / design what we're doing, and get active feedback (like this), without a fully specified feature doc04:05
davecheneyi'll throw this back in the pool if I can't fix this by EOD04:05
sinzuiwallyworld, I think we got lucky. the test passed, yet there is a panic in it04:05
thumperdavecheney: ack04:05
thumperwallyworld: looking04:05
davecheneyjcw4: yes, if you don't have that, success will be hard04:05
wallyworldsinzui: the panic was just in a juju cmd04:06
sinzuiIf this rev I am testing passes I will release it. that is all I want a rev that passes that developers don't also say has a hidden bug04:06
thumperwallyworld: lgtm04:06
wallyworldit's fixed now but would have had little impact04:06
wallyworldthumper: thanks04:06
wallyworldthumper: i'm just going to hit merge directly so sinzui can rerun the windows build04:06
jcw4davecheney, thumper I know you're in the middle of a couple other issues here; but we're also in somewhat of a tight spot because sabdfl is anxious for a version of actions that works end to end (even if it's very minimal)04:07
sinzuiwallyworld, I cannot04:07
sinzuiI am testing a previous rev04:07
thumperjcw4: ok... please can we start by updating the state doc so the id is composed from other attributes?04:07
wallyworldah ok04:07
jcw4thumper: +104:07
sinzuiCI gets nasty if I try to make it change what is being tested04:07
thumperjcw4: and if you don't decide to add a timestamp and user, add a note that says thumper wants it there04:07
jcw4thumper: absolutely, and I'll add jcw4 too04:08
thumper\o/04:08
jcw4thumper: I'll also add a note to ActionResults about the long term risk of not managing old results04:08
thumperjcw4: I think that removing old action results must be part of the initial release04:09
thumperotherwise crazy ensues04:09
jcw4thumper: agreed04:09
thumperprobably something as easy as "juju action rm <resultid>"04:10
thumperjcw4: so... actions are defined in the charm metadata, yes?04:10
jcw4thumper: yes04:11
thumperjcw4: do we do validation somewhere on action names being requested04:11
thumperjcw4: is there a command to list action results?04:11
jcw4thumper: yes, and will be04:11
thumperjcw4: we are going to have to have user there... ASAP04:11
thumperjcw4: because I will most of the time only be interested in seeing the actions I asked for04:11
thumperjcw4: but I should be able to see all04:11
jcw4thumper: interesting04:11
thumper(assuming I have permissions)04:12
jcw4thumper: makes sense04:12
thumperjcw4: as an aside, we will probably have permissions fine-grained enough to say who can do what actions on which service04:12
jcw4thumper: were you involved in the draft spec of Actions ?04:12
* thumper handwaves04:12
thumperjcw4: not really, I think that was mostly sabdfl04:12
thumperjcw4: although I have spent most of the last two weeks just writing specs04:13
* thumper sighs04:13
jcw4:(04:13
jcw4https://docs.google.com/document/d/14W1-QqB1pXZxyZW5QzFFoDwxxeQXBUzgj8IUkLId6cc/edit#heading=h.q6wtcjv2r9h04:13
jcw4thumper: I think I want to capture a lot of your suggestions there04:13
thumperheh04:14
thumperjcw4: looks like the doc suggests a uuid for an action04:14
jcw4yep04:14
jcw4I don't recall if we explicitly discarded that idea or if it just slipped by us when we started worrying about filtering the events on the watcher04:15
thumperjcw4: also notice that the spec shows that the action records when it was invoked04:16
thumperthat looks like a timestamp to me04:16
* jcw4 blushes04:17
jcw4not for the first time tonight04:17
thumperhmm...04:17
thumperI do think that the design has gotten a little overcomplicated, in that we only need one action doc, not two04:17
jcw4two?04:17
thumperwe should have the action results stored with the action04:17
jcw4I see04:18
thumperI don't think we need an ActionResult doc04:18
thumperthe result belongs to an action04:18
thumperthis way you don't need to copy fields across04:18
thumperconsider this:04:18
thumper$ juju status action:UUID04:18
thumperin the spec, there are two options:04:18
thumperrunning, or failed04:19
thumperthis indicates to me that we are looking in one place to see the information04:19
thumperwhich means a simple database query04:19
thumperto get the action whether it is running or done04:19
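One way to picture the single-document design thumper is arguing for: the action carries its own status, requester, timestamps and result, so one lookup by id answers "is it running or done?" and a duration falls out of the two timestamps. This is only a sketch. The type, field and status names are assumptions, and a map stands in for the database collection.

```go
package main

import (
	"fmt"
	"time"
)

// ActionStatus is a hypothetical status enumeration.
type ActionStatus string

const (
	StatusRunning   ActionStatus = "running"
	StatusCompleted ActionStatus = "completed"
	StatusFailed    ActionStatus = "failed"
)

// Action holds the request and its result in one record, so no fields
// need copying into a separate result document.
type Action struct {
	UUID      string
	Unit      string
	Name      string
	User      string
	Status    ActionStatus
	Requested time.Time
	Finished  time.Time
	Output    string
}

// Finish records the outcome on the same record.
func (a *Action) Finish(status ActionStatus, output string, now time.Time) {
	a.Status = status
	a.Output = output
	a.Finished = now
}

// Duration is deduced from the two timestamps once the action is done.
func (a *Action) Duration() (time.Duration, bool) {
	if a.Finished.IsZero() {
		return 0, false
	}
	return a.Finished.Sub(a.Requested), true
}

// runExample queues an action, looks it up while running, then finishes it.
func runExample() (before, after ActionStatus, took time.Duration) {
	requested := time.Date(2014, 6, 26, 4, 0, 0, 0, time.UTC)
	actions := map[string]*Action{} // stand-in for the actions collection
	a := &Action{UUID: "example-uuid", Unit: "mysql/0", Name: "backup",
		User: "example-user", Status: StatusRunning, Requested: requested}
	actions[a.UUID] = a

	before = actions["example-uuid"].Status
	a.Finish(StatusCompleted, "ok", requested.Add(90*time.Second))
	after = actions["example-uuid"].Status
	took, _ = actions["example-uuid"].Duration()
	return before, after, took
}

func main() {
	fmt.Println(runExample())
}
```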
* thumper takes a deep breath04:19
thumperI feel a real design review coming along04:20
thumperhow much time do you have?04:20
jcw4thumper: yes that makes sense... believe it or not, we started there and currents and eddies along the way pushed us to the two docs we have now04:20
jcw4I *want* to go for hours04:20
* thumper smiles04:20
jcw4I *should* have been off hours ago04:20
jcw4:)04:20
* thumper looks in trunk04:21
thumperhmm...04:21
* thumper goes back to the spec04:22
thumperjcw4: ok where should I dump my thoughts?04:24
thumperjcw4: I don't want to put them in the spec04:25
jcw4How about an email to the list?04:25
thumperjcw4: do you have a design spec?04:25
thumperum... yeah... ok04:25
thumpermore potential for bikeshedding04:25
jcw4I almost emailed the list a couple days ago, but didn't04:25
thumperbut ok04:25
jcw4thumper: that's true04:25
thumperlet's try it :)04:25
jcw4we started a couple spec docs, but nothing worth sharing04:25
jcw4Maybe you might craft a new doc and link to it from an email?04:26
thumperone may fall out of the conversation04:26
jcw4thumper: ack04:26
jcw4<--- did you notice that?04:26
jcw4;)04:26
jcw4learning new catch phrases as I go04:27
thumperjcw4: nice04:31
wallyworlddavecheney: i'm not sure the data races are critical blockers for the 1.19.4 release - so long as CI is happy, we can fix them post release04:38
davecheneywallyworld: sure thing04:39
davecheneyyou're the judge04:39
wallyworldbut if you can fix quickly....04:39
davecheneyi'm fixing it anyway04:39
davecheneybut please let's not block this release any further04:39
wallyworldgreat, may be able to sneak it in :-)04:39
wallyworldyup04:39
wallyworldthat was the thinking04:39
wallyworldi'll take off the 1.19.4 milestone04:39
wallyworldsinzui: so what's the verdict with the release at the moment?04:42
sinzuiIf I must release, I can use a8f48d14 which is before the 1.18.x upgrade fix04:43
sinzuiThe revision under test has that fix, is before the win build broke, and might be without the local precise upgrade problem04:44
wallyworldlooks reasonable04:46
wallyworlddoes github show commits in order?04:48
thumperjcw4: sent04:48
wallyworldin order of merging?04:48
thumpersinzui: remember that I want to disable the user command for the 1.20 release04:48
wallyworldsinzui: if the commits are in order, the 1.18 upgrade fix predates commit a8f48d14 doesn't it?04:48
thumpersinzui: but really just in the release branch04:48
wallyworldand yes, i just check that rev, and my 1.18 fix is there04:50
wallyworldthe one to stop peer grouper publishing empty api addresses04:50
thumperjcw4: sorry I haven't looked in earnest before now04:50
jcw4thumper: just one more thing on the list... I'm glad you've done what you have :)04:52
thumpernp04:52
sinzuithumper, wallyworld I am honestly just looking for a passing rev. We wanted to release week, so that rev would not have had any of these fixes. I was a moron for not choosing 6a2c202d when it passed 2 days ago04:52
thumpersinzui: you still can right?04:53
* thumper is about to leave and cook04:53
thumper2 hours of meetings tonight04:53
=== thumper is now known as thumper-afk
davecheneywallyworld: fix coming up, 20 seconds04:53
davecheneyFUCK04:54
davecheneymy working copy is screwed04:54
sinzuiI will release the newest passing rev tomorrow, when I am awake enough to not make a mistake04:54
wallyworldsinzui: next time we'll branch off a release candidate so commits to trunk don't screw us04:55
davecheneyhttps://github.com/juju/juju/pull/16604:55
davecheneywallyworld: pls review04:55
wallyworldlooing04:55
sinzuiwallyworld, Next time I won't listen to developers saying we have a blocking bug.04:55
davecheneyi'm removing that errant wercker.yml file04:55
wallyworldlooking04:55
sinzuiwallyworld, I care about regressions in the recent commit. I don't care about something that has been broken for weeks or months04:56
wallyworldsinzui: agreed. if something does come up and it "needs" to be marked critical, we should then get consensus at least04:56
sinzuiwallyworld, in the case of the upgrade bug, we took more time coming to consensus than fixing it. maybe a deadline is more important. time-based releases work best when we just release04:58
sinzuiwallyworld, 1.19.3 was made from a  week-old rev because trunk was broken04:58
wallyworldyeah, if we can keep to a short enough release cadence04:59
davecheneythumper-afk: thanks for the review04:59
sinzuiwallyworld, maybe we need to stop the line. no one lands a branch until the critical is fixed...no one adds another regression until we fix the current one04:59
davecheneyanyone else ?04:59
wallyworlddavecheney: was it just that one attr?04:59
wallyworldsinzui: i'd rather branch04:59
wallyworldthat way trunk development is not held up05:00
davecheneywallyworld: thanks gents05:00
davecheneyi'll submit this now05:00
wallyworldsinzui:  a few days before release cut a 1.19.4 rc branch05:00
davecheneythis doesn't feel like the root cause of the mongo panics05:00
davecheneyi'll keep looking05:00
wallyworldand test it, and address any blockers on that branch05:00
wallyworlddavecheney: awesome for looking, thank you05:00
sinzuiwallyworld, when do we branch? trunk has been broken most of this month05:01
davecheneythumper-afk: fyi - i'm going to stop landing names changes for the immediate05:01
davecheneyuntil05:01
wallyworldsinzui: fair point. ideally, i'd say stop the line when CI breaks. but with unreliable clouds.....05:01
davecheneya. 1.19.4 lands05:01
davecheneyb. i can find those f'n races05:01
sinzuiwallyworld, exactly. I am awake now because azure and joyent cannot be trusted05:02
wallyworlddavecheney: did the race detector pick up that envuuid one?05:02
wallyworldsinzui: that's the root cause of some of our "trunk is broken" issues05:02
wallyworldbecause we can't *enforce* a "you break it, you fix it" approach05:02
wallyworldbecause we can't trust why CI is broken05:03
sinzuiwallyworld, HA is the root cause of most trunk brokenness, followed by API05:03
wallyworldah, true05:03
* jcw4 goes to bed... 05:03
wallyworldmore specifically, mongo is terrible05:03
wallyworldand mongo + replicaset is even worse05:03
wallyworldbut mongo is web scale :-/05:04
davecheneywallyworld: yup05:08
davecheneyi am *sure* there are other races05:08
davecheneybut right now, we can't see the wood for the trees05:08
wallyworlddavecheney: there indeed are, but if you fix the apiserver one, then woot05:08
wallyworldthere's another in our watcher shutdown code05:08
wallyworldcausing session closed errors05:08
davecheneywallyworld: on it05:09
wallyworlddavecheney: you know the error i mean?05:10
davecheneywallyworld: and it was the one we hoped jam's PR would fix05:10
wallyworlddavecheney: bug 130501405:11
_mup_Bug #1305014: panic: Session already closed in TestManageEnviron <intermittent-failure> <test-failure> <juju-core:Triaged by rogpeppe> <https://launchpad.net/bugs/1305014>05:11
wallyworldthat's the main one i think05:11
davecheneyshit05:11
davecheneythat was supposed to be fixed05:11
davecheneywe spent three days agonising over the bloody fix05:11
davecheneybefore it landed05:11
wallyworldoh?05:11
davecheneywallyworld: this is _not_ the change that you hulk smashed for jam last night ?05:11
wallyworldbefore the api races over the past week, that session closed one was the main reason tests failed05:12
wallyworldnope05:12
wallyworldmy change was about stopping the peer grouper publishing empty apiaddress lists05:12
wallyworldif that's the one you are referring to05:12
davecheneyright05:13
davecheneyon it then05:14
davecheneywallyworld: is it always one package that blows up with the session already closed error ?05:33
wallyworldaxw: wtf does this mean. i haven't pushed to my branch since last time, yet it is saying my remote branch is behind the remote counterpart05:33
wallyworldgit push origin managed-resources05:33
wallyworldTo https://github.com/wallyworld/juju05:33
wallyworld ! [rejected]        managed-resources -> managed-resources (non-fast-forward)05:33
wallyworlderror: failed to push some refs to 'https://github.com/wallyworld/juju'05:33
wallyworlddavecheney: mostly i think05:33
rogpeppedavecheney, wallyworld, axw: hiya05:33
wallyworldthe tracebacks in the bug report should show it05:33
wallyworldhi05:34
davecheneywallyworld: i'll figure it out05:35
davecheney... me waits for tests to run05:35
wallyworlddavecheney: sorry, yeah i don't have the info to hand, i'd need to go and look at the bug report05:36
davecheneywallyworld: meh, i'll live05:37
=== vladk|offline is now known as vladk
wallyworldaxw: wtf, now the pull request diff shows all the changes to trunk after I did the initial proposal05:44
wallyworldi did rebase my branch so i could confirm there were no conflicts with trunk since it may have bit rotted05:45
wallyworldhow the fark then do you do that and not have github mess everything up? this stuff just worked flawlessly with launchpad :-(05:45
jam1wallyworld: "i did rebase" sounds like the start of your problems05:46
jam1rebase throwing away history means DAG related operations lose context (IMO)05:46
jam1wallyworld: so if you merge trunk, and then rebase that later05:46
jam1all those changes look like you introduced them (I believe)05:46
jam1depends on if rebase throws out the merge commit or not05:47
wallyworldjam1: i thought rebase in this case simply moved your stuff out the way, merged in tip of trunk, and put your changes back?05:47
jam1wallyworld: well in your history one of your changes is merging trunk, right?05:47
wallyworldjam1: sure, but with launchpad, that all just worked05:47
jam1anyway, it isn't something I've used tremendously05:47
jam1wallyworld: you never rebased in LP05:47
jam1you don't *have* to rebase in git05:48
wallyworldso how else do i bring in trunk and not have my wip commits all sprined through?05:48
wallyworldsprinkled05:48
jam1wallyworld: live with them being sprinkled, like we did with LP05:48
wallyworldwell, not really05:49
jam1wallyworld: they were hidden by default, but you can get that with "git log --first-parent" as well05:49
wallyworldwhen you did the merge into trunk in lp, your merge commit was correctly placed in the timeline05:49
wallyworldso if i $$merge$$ what's there now, i assume all the commits already in trunk will be ignored and just my new stuff will go in?05:50
jam1wallyworld: well, it should try to merge the two, hopefully the changes that you brought in from trunk just apply cleanly05:51
jam1You can try just doing "git merge master --no-commit"05:51
jam1and see if that works without conflict.05:51
jam1(or upstream/master, or however you relate to the github.com/juju/juju master branch)05:52
wallyworldok. the rebase workflow i got from rick - that's how he brings in trunk to ensure his work is sane with tip05:52
wallyworldwhat should i use instead?05:52
jam1wallyworld: "git merge master" is what I would ued05:52
wallyworldpull upstream/master maybe05:52
jam1use05:52
jam1'pull' might work, as I think it is just fetch+merge05:52
wallyworldok, will try that05:53
jam1I'm just not sure if it is also going to change the defaults for "fast forward"05:53
wallyworldi really don't understand the love for git05:53
jam1wallyworld: I don't know if it is love as much as "it does the job, it was popular so everyone jumped on the bandwagon, and switching tools is hard so I always prefer the one I know"05:56
jam1and *probably* a little bit of Stockholm syndrome "this was hard so it must be good"05:56
davecheneywallyworld: good news and bad news05:56
wallyworldsure. i wouldn't mind switch if git were better than bzr05:56
davecheneygood news: i found the data race05:56
davecheneybad news: it's in the upgrade code05:56
davecheneywhich is probably why you guys can't cut a release05:56
wallyworlddavecheney: which one? that watcher/session closed?05:57
davecheneywallyworld: this shit takes _so_ long to run, i'm only reporting what I can see05:57
davecheneythe more I look, the more i'll find05:57
wallyworldok05:57
wallyworlddavecheney: upgrade only fails on precise/local05:58
wallyworldworks on other clouds and series05:58
* davecheney reaches for table leg06:00
jam1wallyworld: so a few things that I can concretely say are better: a) git commit is faster for really big trees and lots of history, b) git push/pull logs into github faster than Launchpad, because of LP limitations that I tried to fix, but ran into odd bugs and never got time to finally address, c) the actual transfer times are also a lot faster, d) colocated branches by default are  BigDeal(tm) that you could configure Bazaar to work well, but not out of t06:00
wallyworldjam1: so i came back to work on this branch after several days. when i went to push, it complained about my branch is behind remote counterpart and to pull. so i did, but there were conflicts which precisely corresponded to the changes i had made locally, and i had to resolve by "accept mine"06:01
wallyworldjam1: bzr handles history and file renames better too06:01
jam1wallyworld: that sounds like a mistaken set of targets for your push and pull06:01
jam1wallyworld: bzr's view of history (default in log) is beautiful (IMO)06:01
jam1*but*06:01
jam1it is very expensive to compute06:01
jam1as it is O(allhistory)06:01
wallyworldi push/pulled from origin/<branch> where origin is gh.com/wallyworld/juju06:02
jam1so we paid a lot of user visible performance, and didn't push hard enough for how much better it actually presents history06:02
wallyworldsure, but computers are fast06:02
wallyworldhow fast is fast enough06:02
jam1wallyworld: I certainly have my bias in that06:02
jam1but it didn't actually win hearts and minds06:02
davecheneywallyworld: faster than mercurial would be fine :)06:02
wallyworldyup :-)06:03
axwwallyworld: sorry was out. did you sort the PR issue?06:03
jam1wallyworld: also, when we had breakpoints like trying to get Mozilla (lost to hg) we were *very* slow because we were using a bad format.06:03
* axw hasn't read all the history yet06:03
jam1we fixed that format in the next release06:03
jam1but too late06:03
jam1same thing for python's switchover06:03
wallyworldaxw: it's all screwed. if you look, you'll see my latest commits at the end06:03
jam1we had an improvement in the works, but it didn't land before they made their decision.06:03
wallyworldyeah :-(06:03
jam1mercurial has a strong advantage that they didn't try to abstract things06:03
jam1they supported 1 format and focused tightly on it06:04
jam1git and hg both went with the "sync to local is important, remote support is not" while Bazaar abstracted out "I can treat anything as just another branch"06:04
jam1which also cost Performance and developer time06:04
jam1wallyworld: but it means you can "bzr log lp:juju-core" whereas you can't do that with git06:05
wallyworldi do like that about bzr06:05
jam1git only supports sync to local, and then you log, etc locally06:05
wallyworlda lot06:05
wallyworldyep06:05
davecheneysadface, http://paste.ubuntu.com/7704302/06:05
jam1wallyworld: but it means the primitives for log, etc, know that they have a local file they can just mmap, etc.06:05
wallyworldindeed06:05
wallyworlddavecheney: funny, that test never fails in practice06:06
wallyworldwe have other races in production code i'd be more interested in fixing06:07
davecheneywallyworld: i think it's not a real race, it's just in the cleanup code, like most of our races06:08
wallyworldok06:08
axwwallyworld: I suspect you rebased on something other than upstream/master06:09
wallyworldaxw: i rebased on master (local)06:10
wallyworldafter pulling in tip from remote master06:10
wallyworldjam1: so the pr on github doesn't seem to show the latest diff vs tip of trunk like lp does after you just push shit up06:10
wallyworldjam1: because all of the noise in the pr now are actually commits in juju master06:11
jam1wallyworld: that I don't really know github, it is possible they find the ancestor they want when they start, and then they just  stick with that one for the rest of the review06:11
wallyworldthey should be ignored06:11
jam1wallyworld: launchpad actually does a merge without committing it, and shows that diff06:11
jam1which means it can even show you conflicts, etc.06:11
wallyworldi suspect you are right which makes me very sad06:11
axwjam1 wallyworld: yep, ancestor only for the initial diff AFAIK06:11
jam1vs just "diff from common ancestor"06:12
wallyworldthat sucks balls06:12
wallyworldreally06:12
wallyworldhow do so many people work that way?06:12
wallyworldmakes it very hard to have work in progress06:12
axwwallyworld: do you want to have a hangout and screen share to fix it?06:13
wallyworldok06:13
axwbrb06:13
axwwallyworld: in the tanzanite hangout06:15
wallyworldaxw: changes pushed06:43
axwwallyworld: cool, looks happier now06:44
axwnfi what happened before though06:44
wallyworldaxw: still had to do a push -f even the second time06:44
axwwallyworld: yeah because it failed to push before06:44
axwwallyworld: every time you rewrite history you have to do that06:44
axwforce push that is. you can only push without force if previously pushed history is unchanged06:45
jam1wallyworld: axw: who worked on "consider retry loop for failing direct db operations" ?06:45
jam1It looks like a card your team would have worked on, but nobody is assigned06:45
wallyworldjam1: no one yet06:45
jam1wallyworld: it was in the 'merged' column as of last week06:45
jam1when I moved everything from merged into the archive06:46
jam1wallyworld: should it be pulled out somewhere?06:46
wallyworldthat wasn't intentional06:46
jam1wallyworld: ok, put it in your todo then?06:46
axwaccidental, should be in backlog or deleted I think06:46
wallyworldi think it can be deleted now06:46
wallyworldno need for it atm06:46
jam1wallyworld: and I'm pretty sure menn0 was the one who worked on "Show relation name in status output", correct?06:46
jam1bug #119448106:46
_mup_Bug #1194481: Can't determine which relation is in error from status <hours> <observability> <ui> <juju-core:Fix Committed by menno.smits> <https://launchpad.net/bugs/1194481>06:46
wallyworldyep06:46
wallyworldi think so06:47
jam1wallyworld: is there a user for "unit tests fail on utopic" ?06:47
wallyworldjam1: is that a completed card?06:48
wallyworldi fixed a couple of those06:48
jam1bug #132507206:48
_mup_Bug #1325072: unit tests fail on utopic <ci> <test-failure> <utopic> <juju-core:Fix Committed by wallyworld> <juju-core 1.18:Fix Released by wallyworld> <https://launchpad.net/bugs/1325072>06:48
jam1wallyworld: that is from "Week Ending June 6"06:49
wallyworldsounds right06:49
jam1wallyworld: k, I'm writing a script that pulls out stuff like velocity via the Kanban API and it's showing some holes in our old labels06:50
jam1nothing too bad, and I probably won't worry much farther back06:50
wallyworldok06:50
jam1bug #128139406:51
_mup_Bug #1281394: uniter failed to run non-existant config-changed hook <regression> <juju-core:Fix Released> <https://launchpad.net/bugs/1281394>06:51
axwwallyworld: you changed the name of the result error but didn't change the defers06:52
wallyworldoh ffs, sigh06:52
wallyworldwill fix06:52
wallyworldaxw: done06:55
axwwallyworld: thanks, reviewed06:56
wallyworldthank you06:56
wallyworldwas a good review06:56
davecheneyhttp://paste.ubuntu.com/7704476/06:59
davecheneyi'm trying to fix the race in the upgrade test06:59
davecheneybut now it fails constantly on the safety check i put in06:59
davecheneygoroutine 4930 [sleep]:07:02
davecheneytime.Sleep(0xdf8475800) /home/dfc/go/src/pkg/runtime/time.goc:39 +0x3107:02
davecheneygithub.com/juju/juju/state/api.(*State).heartbeatMonitor(0xc20822d5e0, 0xdf8475800) /home/dfc/src/github.com/juju/juju/state/api/apiclient.go:264 +0x6607:02
davecheneycreated by github.com/juju/juju/state/api.Open /home/dfc/src/github.com/juju/juju/state/api/apiclient.go:196 +0xae307:02
davecheneywe leak a shitload of these goroutines07:02
davecheneyin the tests07:02
* davecheney creates issue07:02
wallyworldjam1: i'm off to soccer, but maybe you could get someone to look at why we continue to have very limited success with CI passing the local upgrade test only on precise. the latest machine-0 log from the failed test shows nothing obvious to me - previously there were errors in the log which showed why api server on port 17070 didn't start. only thing i can see is an apt get of a mongo-server package in the middle of the re-start after07:06
wallyworldupgrade initiated. could just be log interleaving, not sure. here's a link to the latest failing job from which machine-0 log can be got http://juju-ci.vapour.ws:8080/job/local-upgrade-precise-amd64/1436/07:06
wallyworldcurtis considers this a release blocker07:07
wallyworldthe apt get mongo-server thing does look like the only suspicious thing i can see that may be different to trusty07:08
jam1wtf... 10 minutes ago leankit was reporting 700 cards, it now only reports 225. it just decided that our archive was old enough it could throw it away.... ?07:10
jam1wallyworld: k, I'll try to give it a look07:10
davecheneyjam1: there must be some cutoff07:11
davecheneyand many of those cards are OLD07:11
davecheneymany of them date back to Atlanta07:11
jam1davecheney: sure, but 10 minutes ago it gave me 700, I don't think we crossed the threshold in 10 mins07:11
davecheneyjam1: dunno, just trying to help07:12
davecheneyi'm probably not helping07:12
wallyworldjam1: i was thinking you'd delegate to someone07:12
jam1wallyworld: I'm pretty good at debugging stuff like this, so I'll at least give it a shot.07:12
wallyworldok07:13
wallyworldjam1: frustratingly it passes sometimes07:14
jam1wallyworld: well if it is a racy install of stuff, and sometimes we manage to install first07:14
wallyworldjam1: yeah, i didn't get to look to see at what stage we apt installed, i only just looked at the log07:14
jam1wallyworld: "2014-06-26 06:26:41 INFO juju.state open.go:337 found existing state servers []07:25
jam1"07:25
jam1sounds problematic...07:25
davecheneyerk07:26
jam1I don't know that it is the specific problem, that is in "cloud-init" so maybe no servers are available during the first connect, but it does seem weird.07:27
TheMuemorning07:42
=== vladk is now known as vladk|offline
sinzuiwallyworld, jam1: I am off to get some sleep. I will release the blessed revision from this page, http://juju-ci.vapour.ws:8080/job/revision-results/ it will probably be a8f48d14 because I don't believe trunk will get better in a few hours08:21
sinzuiwallyworld, jam1 The rev I forced CI to test will pass, though I still believe local precise upgrades are dodgy08:34
jam1sinzui: so you think tip will pass, but we still should be investigating getting reliable P upgrades, right?08:35
sinzuijam1 I didn't test tip. I tested an older rev that was skipped08:36
jam1sinzui: do you mean a8f48d1408:36
jam1or something else? as I don't see any other revs being tested in "revision-resultS"08:36
jam1sinzui: the current local-upgrade-precise-amd64 is still blinking red, afaict08:37
sinzuijam1 I tested 1d57f5208:37
jam1sinzui: http://juju-ci.vapour.ws:8080/job/local-upgrade-precise-amd64/ shows that rev as failing 3 times08:37
sinzuiJam1 yes I am waiting for the destroy-env to complete http://juju-ci.vapour.ws:8080/job/local-upgrade-precise-amd64/1439/console08:37
sinzui^ that is a pass08:37
sinzuibut I dare not hurry destroy-env for fear that the act will cause an error08:38
jam1sinzui: so blinking red is because it was red in the past but is running now?08:38
sinzuijam1 yes08:39
sinzuinot obvious08:39
sinzuilxc-destroy is taking forever08:39
mivtachyahuapologies for the cross-post, but has anyone ever seen a bug where juju confuses what machines are which machine numbers?08:40
sinzuimivtachyahu, I haven't seen that before08:41
mivtachyahuI've come into work this morning to find that all the servers are jumbled up, ie what was machine 7 yesterday is machine 12 today08:42
sinzuimivtachyahu, yes, that happens, machine numbers cannot be reused. so a number given to a machine that is added then removed is also removed forever08:43
mivtachyahuah, no, you misunderstand, 7 is now 12, 12 is 17, 17 is now 8, 8 is now 7, they're jumbled, not removed.08:44
mivtachyahu(those numbers illustrative, I've not mapped which machines are actually which)08:44
sinzuijuju-ci's highest machine number is 52, but there are only 10 active machines08:44
sinzuimivtachyahu, that is mad. How do you know they are jumbled? the ip addresses?08:45
sinzuiwallyworld, jam1, all the circles are blue http://juju-ci.vapour.ws:8080/08:46
mivtachyahuwhen I juju ssh <machine number> they have the wrong contents, when I issue a juju status, the units are showing on the correct machine *numbers*, but the public-addresses have changed.08:46
jam1sinzui: so we still have a chance for trunk tip if we get fixes, but we expect to release 1d57f5208:46
sinzuijam1. You do. so CI has about 4 hours to work08:47
axwmivtachyahu: which version of juju, and which provider type?08:48
mivtachyahujuju 1.18.1 and on azure.08:48
axwok, nothing comes to mind. if it were in the 1.19 series then I'd be blaming availability sets because service units get a single load balanced IP08:51
sinzuiwallyworld, jam1. I reopened https://bugs.launchpad.net/juju-core/+bug/1334493 because juju doesn't execute after it is compiled on windows08:56
_mup_Bug #1334493: Cannot compile win client <regression> <windows> <juju-core:Triaged by wallyworld> <https://launchpad.net/bugs/1334493>08:56
* sinzui tries to rebuild and hopes for the best08:57
axwwallyworld: AFAICT, the Azure vhds cannot be reused. each one is a disk image for a separate VM instance, like you'd have if you were running VMs in VMWare or VirtualBox08:59
axwwallyworld: i.e. they're not pristine OS images, but VM disks08:59
axwwallyworld: going to move that card to "done"09:00
sinzuiaxw, thank you for investigating that.09:00
axwsinzui: nps09:00
axwsinzui: I think we used to leak those VHDs because we were using a more error prone method of deleting disks before09:01
axwsinzui: I switched the code over to using an API that deletes all associated disks when we terminate VMs09:01
axwI think it's only in the 1.19 series tho09:01
sinzuiaxw, the official api didn't let you delete them when you deleted disks until a few months ago09:02
sinzuiI had to upgrade the libraries we use to delete them09:02
mivtachyahuwell, good news, my weird bug has fixed itself. :)09:04
axwweird indeed. mivtachyahu if you stumble across the steps to reproduce the issue, please file a bug (or ping someone in here to do so)09:05
mivtachyahuwill do09:06
axwdavecheney: this is committed, right? https://bugs.launchpad.net/juju-core/+bug/133450009:11
_mup_Bug #1334500: state/apiserver: more data races <race-condition> <juju-core:In Progress by dave-cheney> <https://launchpad.net/bugs/1334500>09:11
davecheneyaxw: yes, committed09:21
davecheneysorry, i didn't update the status09:21
axwnps09:22
jamaxw: I believe it is, but due to the revision that sinzui was actually able to get to pass CI, it probably won't be in 1.19.409:36
jamdavecheney: fwiw, we really don't need a get+set operation, just a simple mutexed get that will populate the cached value if it is empty would have been a better fit.09:36
davecheneyjam: i didn't want to hold the lock over that other operation09:37
jamdavecheney: given the whole point is that it is just a cache, I don't think we want to trigger the operation 2x while getting it. but it isn't like it is a big deal.09:41
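[editor's note] A minimal Go sketch of the "mutexed get that populates the cache if empty" that jam describes above. The type cachedValue, its load field and the "env-uuid" value are hypothetical stand-ins, not the code from the actual fix. It deliberately holds the lock across the load, which is the trade-off davecheney says he wanted to avoid.

```go
package main

import (
	"fmt"
	"sync"
)

// cachedValue is a hypothetical cache; load stands in for the
// expensive operation whose result is being cached.
type cachedValue struct {
	mu    sync.Mutex
	value string
	load  func() (string, error)
}

// Get returns the cached value, populating it under the lock when it
// is still empty, so concurrent callers trigger load at most once.
func (c *cachedValue) Get() (string, error) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if c.value != "" {
		return c.value, nil
	}
	v, err := c.load()
	if err != nil {
		return "", err // nothing cached; the next caller retries
	}
	c.value = v
	return v, nil
}

func main() {
	calls := 0 // only touched inside load, i.e. while c.mu is held
	c := &cachedValue{load: func() (string, error) {
		calls++
		return "env-uuid", nil // hypothetical value
	}}
	var wg sync.WaitGroup
	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			c.Get()
		}()
	}
	wg.Wait()
	fmt.Println("load calls:", calls)
}
```

With a separate get and set, two callers can both see an empty cache and both run the operation; the single locked Get trades that duplicate work for a longer critical section.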
axwdavecheney: I'm gonna have a look at the leaking heartbeat goroutine bug09:44
axwseems to be a bunch of api.Opens without corresponding Closes.09:44
jamaxw: we seem to do that a fair bit in the test suite, I've caught a few in the past.09:46
jamOnce you have a couple Copy & Paste helps ensure it spreads :)09:46
jamaxw: in other code bases we had things like "ensure 0 threads are running when a test ends"09:47
axwyeah, that'd be nice09:47
axwperhaps we should add that to the base suite's TearDownTest09:47
jamaxw: well, we could try to move towards that, I think we'd find a lot of problems to start with.09:48
jamaxw: I also don't know if golang gives you a great view of "what is running", but probably it does somewhere09:48
jamdimitern: TheMue: vladk|offline: just a reminder that we're skipping our daily standup for the team standup in 10 min09:49
axwjam: we could compare runtime.NumGoroutine() before/after test run. I expect you're right and it'd be painful initially09:50
dimiternjam, ok09:50
jamaxw: so for threads we compared the set of thread ids at start and end.09:51
jamaxw: so set(at_end) - set(at_beginning) must be empty09:51
jamaxw: though my google-fu says "you can't get a list of all running goroutines"09:53
jamI know that you can, given that panic can print it out, but I imagine using that trick would be really really bad :)09:54
axwyeah. it would be nice to compare sets, but I think just comparing size would be good enough09:55
jamaxw: (runtime.Stack(, all=true) and then parsing that for what is running)09:55
perrito666morning everyone09:55
jamaxw: main problem with just doing the count, is that it sometimes passes accidentally, and it still doesn't give you any information about what is running that shouldn't be that you need to go fix.09:56
jamIn that respect, the runtime.Stack() method actually isn't terrible, as you could print out "these goroutine stacks are running and probably shouldn't be)09:56
axwjam: true, though in that case you could just dump runtime.Stack(..., all)09:56
jamaxw: or as I was pointing out, you could just use Stack(…,all) and use that for set difference09:56
axwyes, I suppose you could compare entry points09:57
jamTheMue: just a reminder you're OCR today09:59
TheMuejam: sure, already done first10:00
TheMuefirst ones10:00
jamTheMue: great10:00
TheMuejam: made a calendar entry for it to not forget it ;)10:00
jamTheMue: :), team standup now10:01
TheMuejam: yeah, here also my calendar reminded me10:02
=== vladk|offline is now known as vladk
=== vladk is now known as vladk|offline
=== vladk|offline is now known as vladk
=== vladk is now known as vladk|offline
TheMueafk for lunch11:03
=== vladk|offline is now known as vladk
vladkdimitern: ping11:33
vladkdimitern: I created WatchInterfaces, My current problem is that it's impossible now to add network interfaces after they were provisioned.11:35
vladkI can remove this check from machine.go, but this breaks some tests. Otherwise, I can11:35
vladkdimitern: Otherwise, I can't test the watcher when I add the network interface11:36
dimiternvladk, there was a slight change11:36
vladkdimitern: what do you mean?11:37
dimiternvladk, jam, fwereade and i discussed and we can use a notifywatcher instead of a stringswatcher11:37
dimiternvladk, jam, for the network interfaces11:37
dimiternvladk, jam, that way we don't need to care about tags for interfaces11:37
dimiternvladk, as for your question, you'll need to change AddNetworkInterface slightly, so it doesn't fail when the machine is provisioned11:39
vladkdimitern: this breaks some of the tests, so I need to fix them, too11:39
dimiternvladk, i.e. assertAliveAndNotProvisioned becomes aliveDoc, and the if m.doc.Nonce != "" needs to go11:39
dimiternvladk, yep, naturally11:40
vladkdimitern: should I change stringswatcher to notifywatcher?11:40
dimiternvladk, yes11:42
dimiternvladk, i'm updating the model doc today to reflect what we discussed11:43
dimiternvladk, that's the only thing affecting your work now11:43
vladkdimitern: may I do PR with stringswatcher to get a quick feedback and change it later?11:43
dimiternvladk, of course11:44
vladkthanks11:44
vladkdimitern: please, review https://github.com/juju/juju/pull/16911:54
=== vladk is now known as vladk|offline
=== vladk|offline is now known as vladk
wallyworldmgz: i'm still in a meeting, i'll ping you soon for 1:112:00
mgzwallyworld: sure, I'll hang out there for when you arrive12:01
wallyworldbe there soon12:01
=== vladk is now known as vladk|offline
=== eagles0513875 is now known as greenrice
=== greenrice is now known as eagles0513875
rogpeppe2trivial update to dependencies.tsv, anyone? https://github.com/juju/juju/pull/17712:49
rogpeppe2fwereade, dimitern, mgz, natefinch, wwitzel3: ^12:49
=== urulama is now known as uru-food
* rogpeppe2 thinks it's trivial enough to just merge anyway12:54
* rogpeppe2 does that12:54
TheMuerogpeppe2: taking a look12:54
TheMuerogpeppe2: argh, too quick12:55
TheMuerogpeppe2: ;)12:55
rogpeppe2TheMue: that's ok - there's not exactly much to review...12:55
TheMuevladk|offline: made some comments12:55
TheMuerogpeppe2: have to compare this nice number to available revisions :D12:56
rogpeppe2TheMue: the 'bot will complain if it doesn't work...12:56
TheMuerogpeppe2: taedd? (trial-and-error driven development)12:56
rogpeppe2TheMue: with changes that simple, it seems reasonable to me12:57
TheMuerogpeppe2: yep12:57
fwereadenatefinch, I'll be with you soon13:04
=== vladk|offline is now known as vladk
=== uru-food is now known as urulama
lazypowerGreetings juju-core. There was an LXC update this morning that wipes mount fstype=rpc_pipefs, if i recall correctly this causes problems with containers does it not?13:33
lazypowerhttp://i.imgur.com/tjSkSG6.png13:34
TheMuerogpeppe2: seen that the merge failed?13:37
rogpeppe2TheMue: no i hadn't. thanks13:37
TheMuerogpeppe2: yw13:37
* rogpeppe2 wants to work out a decent way to get an obvious warning when a merge fails13:38
TheMue+113:40
rogpeppe2oh bugger, it's been changed to break the API13:40
rogpeppe2i'm stuffed now13:40
rogpeppe2because the new charm changes require the new names package13:41
* rogpeppe2 wonders why all those tag changes needed to happen13:41
=== rogpeppe2 is now known as rogpeppe
rogpeppehmm, i guess i'll just hack around the issue for now13:44
lazypowerwrt my question above, here's a bug that was filed that shows the behavior: https://bugs.launchpad.net/juju-core/+bug/131952513:44
_mup_Bug #1319525: juju-local LXC containers hang due to AppArmor denial of rpc_pipefs mount with local charms <local-provider> <lxc> <juju-core:Invalid> <lxc (Ubuntu):Incomplete by tyhicks> <https://launchpad.net/bugs/1319525>13:44
perrito666rogpeppe: "for now"®13:44
perrito666is there anything in place to tell a machine "hey, apiserver and stateserver have changed" ?13:46
rogpeppeperrito666: the state server addresses should change13:47
rogpeppeperrito666: and they can be watched13:47
perrito666rogpeppe: come again please, I cannot join those two things you just said into something I understand13:48
rogpeppeperrito666: :)13:48
rogpeppeperrito666: what are you trying to do?13:48
perrito666rogpeppe: restore ;)13:48
perrito666current restore ssh's into all of the agents and runs a sed script to change apiadresses and stateaddress13:49
perrito666I really would like to do something prettier13:49
rogpeppeperrito666: well, i think ssh'ing in is probably the only option13:49
rogpeppeperrito666: but what you do *when* you've ssh'd in could be prettier13:50
rogpeppeperrito666: you could add a jujud subcommand which updates the addresses in the agent.conf file13:50
rogpeppeperrito666: and then invoke that from the ssh command13:50
lazypowerDoes anyone know why LXC would give up on round robin dns assignment? I have evidence here it has done so: http://pastebin.ubuntu.com/7705971/13:52
perrito666rogpeppe: it will have to do ¯\_(ツ)_/¯13:53
rogpeppeperrito666: what kind of thing would you *like* to be able to do?13:53
perrito666rogpeppe: well I think that your idea pretty much sums what I would like to be able to do, perhaps wrapped, something like having agents listen for "control commands" and a mechanism to issue those, I think I have some bias for working too much with embedded devices :p13:55
=== vladk is now known as vladk|offline
rogpeppeperrito666: it wouldn't be too hard to get agents to listen on a local socket for control commands14:07
sinzuimgz, 1.19.4 will be the revision you created. CI had skipped it for a new rev yesterday. I made CI test just your rev to get a pass14:07
sinzuimgz: I am very interested in your work to play unittests in lxc14:07
rogpeppeperrito666: but that does mean the agent has to be up and running at the moment you're doing the restore14:07
bodie_morning all14:08
perrito666rogpeppe: well restore always assumed the agents are up14:08
rogpeppeperrito666: really?14:08
rogpeppeperrito666: how so?14:09
perrito666rogpeppe: well, the script that runs on all machines does:14:11
perrito666450         initctl stop jujud-$agent14:11
perrito666which would fail and exit the script if jujud-$agent was not up14:12
rogpeppeperrito666: that'll work ok if the agent is already stopped though, won't it?14:12
rogpeppeperrito666: oh really - i thought initctl stop was idempotent14:12
rogpeppeperrito666: that's a bug then14:12
rogpeppeperrito666: blame me :-)14:12
mgzsinzui: ace, thanks - I did reland the change, so will keep an eye on the job as well14:12
perrito666rogpeppe: :) oh, then I un-assume that14:13
rogpeppeperrito666: no, you're right14:14
rogpeppei wonder if there's a way to tell initctl to stop a service only if it's already running14:15
perrito666rogpeppe: || true14:15
rogpeppeperrito666: ha ha14:15
rogpeppeperrito666: that's indeed the simplest solution, though not great14:16
rogpeppeperrito666: better would be to test the output of initctl status first, i think14:16
perrito666rogpeppe: you would have to check status I guess14:16
rogpeppeperrito666: yeah14:16
perrito666returns stop/waiting or sth like that when not started14:16
jcastrohttps://bugs.launchpad.net/juju-core/+bug/133468314:42
_mup_Bug #1334683: juju machine numbers being incorrectly assigned <juju-core:New> <https://launchpad.net/bugs/1334683>14:42
jcastrohas anyone seen this before?14:42
jcastroit's affecting someone in production14:43
natefinchjcastro: looking14:49
jcastrothey are early adopters, so any help you can lend would be <314:49
alexisbjcastro, what version of juju did they hit that bug?14:49
=== makyo_ is now known as Makyo
natefinchI wonder if this is azure being wacky14:50
jcastroI'll ask them to update the bug14:50
natefinchalexisb: looks like 1.18.114:50
alexisbjcastro, wallyworld's team will be tackling azure issues this cycle, this may be one of them14:51
alexisb^^ just an fyi14:51
jcastrorock and roll!14:52
natefinchjcastro: updating to 1.18.4 couldn't hurt14:55
sinzuinatefinch, do you have a few minutes to review https://github.com/juju/juju/pull/17815:09
perrito666sinzui: what happened with 1.19.5?15:10
sinzuiperrito666, well. we really want 1.20.0, though my scripts want to make 1.19.5. We will create a stable 1.20 branch and let master think it is 1.20.015:11
perrito666sinzui: that explains the commit message which says something very different from the actual patch15:12
sinzuiperrito666, natefinch yep. I realised that if I make the branch 1.19.5, I need to land another branch next week to get the version right for june 3015:13
sinzuimaybe I am wrong15:13
perrito666sinzui: perhaps I am about to say something sinful but, wouldn't it be nice if you re-wrote a bit the past so that commit message says the right thing?15:15
sinzuiperrito666, I was thinking something a little different, but it also means retracking the PR15:15
perrito666sinzui: if you use git commit --amend then push it should look as if this little mistake never happened15:16
natefinchsinzui: LGTM15:17
sinzuiperrito666, I need to fork at juju-1.19.4 to create stable 1.20 branch. I think I need to merge a branch into both devel and stable that sets the version. stable branch will want to be 1.20.0 and I will merge select revs from devel into it. Maybe devel needs to be 1.20-alpha to indicate it is devel15:17
sinzui^ natefinch maybe I want to do something different because I need a stable branch and juju will switch to the new version rules15:18
sinzuiperrito666, natefinch and the *next* unstable version that thumper and I discussed would be 1.21-alpha115:20
* sinzui delete PR15:20
=== vladk|offline is now known as vladk
=== vladk is now known as vladk|offline
ericsnownatefinch: are we doing standup now?17:00
natefinchericsnow: I can't, sorry.  Probably will have to be very late today if at all.  I have to take my daughter to her 1 year checkup in an hour.17:06
ericsnownatefinch: no worries17:06
natefinchLet's shoot for 3.5 hours for now, hopefully I'll be back in and working17:06
perrito666natefinch: 3.5hours from now?17:08
ericsnownatefinch: sounds good17:12
natefinchperrito666: from now, yeah, sorry17:19
perrito666natefinch: I think Ill be around17:19
alexisbnatefinch, Do we need more time scheduled with gsamfira and team for the workload stuff?17:24
alexisbwwitzel3, ping17:26
natefinchalexisb: probably.... it's been slow going.  Good, but not fast17:27
alexisbnatefinch, ok, I will put an hour on the calendar for tomorrow, then we can discuss if we want to do a few days next week17:28
natefinchalexisb: I'm on vacation next week :/17:30
alexisbcrap thats right17:31
natefinchthey're actually doing well, so it might not be so bad17:31
alexisbheh ok I will schedule a bit more time tomorrow then17:31
alexisband then we can exit with a game plan while you are gone17:31
bachi i'm trying to bootstrap an environment on azure and it is not coming up.  all-machines.log shows that 'machiner' cannot set the machine address and it is constantly restarting: http://paste.ubuntu.com/7707189/18:25
bacis this situation recoverable?18:25
=== BradCrittenden is now known as bac
perrito666sinzui: I'll lgtm if you promise me that you took care of the extra step that broke things the other time :)20:21
sinzuiperrito666, I am pondering those same consequences for my inc-1.20-alpha1 branch20:22
sinzuiperrito666, We change the transition number to 1.19.9...but I don't think we can land a version change to 1.20-alpha1 until after 1.20.0 is released. 1.18.x throws a wobbly when we ask it to upgrade to a version with alphas20:24
* sinzui ponders 1.19.5 for master until 1.20.0 is released20:25
perrito666ericsnow:20:47
perrito666news about nate?20:47
ericsnowperrito666: nope20:48
perrito666ericsnow: he is not in the hangout20:49
ericsnowperrito666: yeah, not on IRC either20:49
perrito666he most likely fell on the netsplit20:50
bacsinzui: do you have much juju/azure experience?20:53
perrito666sinzui: you got lgtmd20:53
sinzuithank you perrito66620:53
sinzuibac I have a lot of janitorial azure experience20:53
=== Guest8558 is now known as wallyworld
wallyworldsinzui: hi, you finally got a rev to release :-) with the 1.20 branch you want to create off master, will CI be able to run tests for both the release candidate branch and trunk? will you set up a jenkins slave to test our future RC branches as well as trunk?21:35
sinzuiwallyworld CI knows how to watch any bzr or git branch21:40
sinzuiwallyworld_, will the lander/git-merge-juju work with a non-master branch?21:45
sinzuiI have a merge ready to try when we want21:46
sinzuiwallyworld_, also I have built a juju env from 3 clouds, a private vpn, and have some nigh impossible archs http://juju-ci.vapour.ws:8080/computer/21:46
sinzuiI think I can now afford to be sick and get rest21:47
jcw4is there a problem upgrading my 14.04 ubuntu that I use for development to Go 1.3?21:53
sinzuijcw4, You will discover the 1.2-1.3 bugs faster than CI's gccgo testing will report22:00
jcw4hehe22:00
jcw4that's what I was afraid of22:01
jcw4does juju have a 'support matrix' of which versions of Go are supported on which platforms?22:01
sinzuijcw4, OSX appears to be building with 1.3. It was disconcerting to see since I don't have osx hardware to test with.22:01
wallyworld_sinzui: the lander should handle a non master branch - i'll confirm with martin22:01
jcw4sinzui: I had a hard time getting all the tests to work (go 1.2) on osx22:02
sinzuijcw4, We are officially 1.2 on all OSes for all series...except ubuntu doesn't officially provide 1.2 for precise22:02
jcw4sinzui: I see22:02
sinzuiwallyworld_, do I not have $$merge$$ special powers? I thought my inc of master to 1.19.5 would work22:03
wallyworld_sinzui: anyone in juju team should be able to type $$merge$$, did it not work?22:04
sinzuijcw4, you added series (maverick) support to the version name? I was pleased to see that in my test of that today22:04
jcw4wallyworld_: ^^22:04
sinzuiI may be impatient22:04
wallyworld_jcw4: ?22:05
jcw4sinzui addressed that comment to me but I think it was intended for you wallyworld_ ?22:05
wallyworld_i didn't add maverick support, not sure why we did since maverick is EOL22:06
wallyworld_isn't it?22:06
sinzuijcw4, no you. I was surprised to not see unknown when I bootstrapped today with an osx client22:06
sinzuisorry osx mavericks22:06
wallyworld_sinzui: you are right, the lander has not picked up your $$merge$$, i'll look  into it22:06
jcw4oh; no :-(22:07
jcw4sinzui: I don't even know how to do that yet :)22:07
wallyworld_sinzui: just to confirm - you created the 1.20 branch off the rev used to cut 1.19.4, right?22:07
sinzuithat's okay, I am EOD now. no 19-hour days now that I have a release to create stable from. and I have an army of slaves to do my bidding22:08
sinzuiwallyworld_, I sure did22:08
wallyworld_awesome22:08
wallyworld_i'll inc the version number to 1.20 also if it hasn't been done22:08
wallyworld_sinzui: you have indeed been working too hard, you need to go rest and get better, perhaps with a glass of red22:09
sinzui:)22:09
sinzuiI will call that medicine for my sore throat. a cough suppressant22:10
wallyworld_sinzui: i'll add a lander job to look at and land stuff off the 1.20 branch22:10
wallyworld_sinzui: so maybe tomorrow when you come in to work you can then hook 1.20 up to CI22:11
sinzuiwallyworld_, I will be visiting mgz tomorrow. Now that I have all my slaves, I want to run unit tests in lxc on them. I think that will take 30-40 minutes off the time it takes CI to run unit tests, build packages, and test local provider22:12
wallyworld_\o/22:12
=== alexisb is now known as alexisb_afk
sinzuiwallyworld_, I just added 1.20 to the list of branches to test. Ci is testing it now.22:32
wallyworld_sinzui: you are f*cking amazing22:32
alexisb_afkwallyworld_, +1 to that :)22:32
=== Guest28217 is now known as wallyworld
* perrito666 needs to autodocument his code because he is losing track of it22:40
wallyworldsinzui: is there a separate dashboard for 1.20 vs trunk?22:42
sinzuiwallyworld, no, sorry22:42
wallyworldthat's ok just wondering22:42
wallyworldso how do you see that 1.20 vs trunk is ok?22:43
wallyworldthumper: can you ping me after your standup?22:51
waiganidavecheney: standup take two23:09
davecheneywaigani: rightou23:13
davecheneywallyworld: is the tree open or closed ?23:26
wallyworldopen, we've created a separate 1.20 branch23:26
wallyworldon my todo list to send email23:27
thumperwallyworld: ping23:43
wallyworldthumper: hey, have you seen this issue https://launchpad.net/bugs/132905123:43
_mup_Bug #1329051: local charm deployment fails on "git not found" due to wrong apt proxy <amd64> <apport-bug> <third-party-packages> <utopic> <juju-core (Ubuntu):New> <https://launchpad.net/bugs/1329051>23:43
wallyworldwrong proxy being used inside lxc23:43
thumperno23:44
wallyworldok, it seems Juju uses the apt_proxy setting from host machine when setting up proxy inside lxc23:44
wallyworldwhich is wrong23:44
wallyworldi'll schedule for next stable milestone23:45
thumperyes, we do just blindly use the apt proxy of the host23:46
wallyworldok, seems like a legit issue then23:50
