[01:15]  * wallyworld sighs. ocr on leave means reviews kinda stall
[01:20] <alexisb> wallyworld, we should start having folks assign back-up for ocr when they go on vacation
[01:20] <wallyworld> yes we should - we used to have that as a policy
[01:20] <wallyworld> i guess we still do in theory
[01:20] <wallyworld> folks just have to follow it :-)
[02:04] <menn0> wallyworld: just so you know, i'm currently dealing with a nasty upgrade issue that CTS has run into
[02:04] <wallyworld> oh no
[02:04] <wallyworld> bug?
[02:05] <menn0> wallyworld: seems to affect any upgrade from 1.24.x
[02:05] <menn0> wallyworld: the agent won't restart... a worker is failing to stop
[02:05] <menn0> wallyworld: only seems to happen with big complex envs (bootstack in this case)
[02:05] <menn0> wallyworld: i'm using the bootstack staging env atm for the repro and am making some progress
[02:06] <menn0> wallyworld: RT 82240... i'll make sure there's a Juju ticket if there isn't already
[02:06] <menn0> thumper was looking at this last week but I've taken it over in his absence
[02:06] <wallyworld> oh, ty
[02:07] <wallyworld> menn0: is it bug 1468653
[02:07] <mup> Bug #1468653: jujud hanging after upgrading from 1.24.0 to 1.24.1(and 1.24.2) <canonical-bootstack> <juju-core:Triaged> <juju-core 1.24:In Progress by thumper> <https://launchpad.net/bugs/1468653>
[02:07] <alexisb> menn0, that is a bootstack bug
[02:08] <menn0> alexisb: I don't think so
[02:08] <menn0> alexisb: jujud is definitely not doing the right thing... it's getting stuck when trying to shut down
[02:08] <menn0> wallyworld: that is the ticket though (LP was timing out for me)
[02:09] <wallyworld> ok
[02:58] <menn0> wallyworld: this is definitely leadership related
[02:58] <wallyworld> oh joy
[02:58] <menn0> wallyworld: when I reproed there were 9 stuck API connections
[02:58] <menn0> wallyworld: and there were 9 goroutines waiting in BlockUntilLeadershipReleased
[02:58] <menn0> wallyworld: in 1.24.0 at least there's a naked channel read there
[02:59] <wallyworld> stuck before upgrade, during shutdown of agents?
[02:59] <menn0> wallyworld: yep
[02:59] <menn0> wallyworld: when did you fix all the naked channel ops?
[02:59] <wallyworld> i didn't fix those
[02:59] <wallyworld> i think maybe william did?
[02:59] <wallyworld> or tim?
[03:00] <menn0> ok, I thought you did some
[03:00] <wallyworld> if i did i can't remember
[03:00] <menn0> anyway, I should hopefully have this soon
[03:00] <wallyworld> i guess we think that 1.24.2 is ok
[03:01] <wallyworld> and 1.24.0 upgrades may need a manual process if it hangs
[03:01] <menn0> not sure, I think the problem still happens when upgrading from 1.24.2
[03:01] <menn0> I'll check that soon
[03:01] <wallyworld> ok
[03:02] <menn0> it takes about 30 mins to build up the env to test it
[03:03] <menn0> and I don't want to tear it down just yet until I've finished looking at this env
[03:08] <wallyworld> ok
[03:17] <menn0> wallyworld: ok, i understand the problem now
[03:17] <menn0> wallyworld: checking to see if someone has already fixed it in a later 1.24 or in master
[03:17] <wallyworld> ok
[03:18] <menn0> wallyworld: basically if any of the leadership API requests are active (and some are quite long running) while an upgrade is initiated the server will get stuck
[03:19] <menn0> wallyworld: the more units you have the more likely you are to hit the problem
[03:20] <wallyworld> menn0: yes, that sounds very plausible based on what has been observed before and what andrew/william fixed
[03:20] <wallyworld> i don't think there's a fix we can do because 1.24.0 is already running
[03:22] <menn0> wallyworld: that's true, but it would be good to ensure that the next 1.24.0 doesn't have the problem
[03:22] <menn0> sorry 1.24.x
[03:23] <menn0> wallyworld: from looking at the code, the problem is still there in master
[03:23] <menn0> wallyworld: hangout?
[03:23] <wallyworld> sure, sec
[03:24] <menn0> wallyworld: onyx standup?
[04:02] <menn0> wallyworld: it's not actually as simple as we thought... the thing returned by NewLeaseManager is actually the singleton which is supposedly getting killed
[04:02] <menn0> wallyworld: there must be some other aspect
[04:03] <wallyworld> otp, sec
[04:03] <menn0> wallyworld: no worries
[04:03] <menn0> wallyworld: i'll sort it out
[04:04] <wallyworld> ty
[08:51] <menn0> wallyworld: I have a fix for the problem
[08:51] <menn0> wallyworld: it's way past my EOD and i'm not working tomorrow so I'm writing up notes for thumper so he can write some tests around it and land it
[08:52] <menn0> wallyworld: we're not out of the woods yet though... post upgrade about 50% of the units on this env have hook failures
[08:52] <menn0> wallyworld: will send an email
[08:52] <wallyworld> menn0: thanks for sticking with it, i'll talk to tim tomorrow
[09:48] <mup> Bug #1471231 changed: debugLogDBIntSuite teardown fails <ci> <unit-tests> <juju-core db-log:Fix Committed> <https://launchpad.net/bugs/1471231>
[12:46] <perrito666> morning
[12:59] <alexisb> morning perrito666
[13:04] <perrito666> there is something about the annual medical check that makes me feel old
[13:04]  * perrito666 sighs and makes appt
[13:17] <anastasiamac> perrito666: wait until u get kids... :)
[13:18] <ashipika> anastasiamac: +1
[13:25] <sinzui> perrito666: katco I cannot find the on-call reviewer calendar. Any clues?
[13:25] <katco> sinzui: it's just the juju team calendar
[13:35] <sinzui> katco: Any more clues? Canonical's Google Calendar tells me none of the juju email addresses have a calendar?
[13:41] <sinzui> wwitzel3: Can you review http://reviews.vapour.ws/r/2140/
[13:42] <natefinch> man, coming back from vacation is always so hard
[13:43] <katco> natefinch: o/ hope you had a good time
[13:43] <natefinch> katco: amazing time.  Could have used another week (and a raise to be able to afford it ;)
[13:44] <TheMue> katco: he had, seen it on Instagram ;)
[13:44] <katco> natefinch: lol
[13:44] <katco> TheMue: :)
[13:44] <wwitzel3> sinzui: taking a look now
[13:45] <TheMue> natefinch: looked like a lot of fun in a cool environment
[13:45] <natefinch> TheMue: it was great.  We did it last year in a house half this size... the extra interior space and nicer beach made this year even better.
[13:45] <natefinch> (and 50% more expensive... but worth it)
[13:46] <katco> ericsnow: wwitzel3: natefinch: we have 2 meetings overlapping. just meet in moonstone
[13:46] <wwitzel3> sinzui: just the dep updates? I was able to update to them and build juju and bootstrap, so combined with the tests you did, LGTM.
[13:46] <TheMue> natefinch: your family grew, you need the space
[13:46] <TheMue> ;)
[13:47] <natefinch> TheMue: yeah, I gotta stop doing things
[13:47] <TheMue> natefinch: cute family, no need to stop
[13:47] <sinzui> wwitzel3: it is, I just wanted a dev to ask the hard questions about consequences. Thank you. This is the comparable branch for master. http://reviews.vapour.ws/r/2141/
[13:49] <natefinch> TheMue: haha... the number of bedrooms in my house, seats in my car, and lack of hair on my head say otherwise ;)
[13:49] <TheMue> natefinch: hmm, ok, there are constraints, yep :D
[13:58] <perrito666> sinzui: tim and wayne
[13:58] <sinzui> thank you perrito666 katco and fwereade sorted me out
[13:58] <natefinch> katco: I didn't check my calendar until now and just realized we have the iteration meeting nowish.  I have to take my daughter to a swim lesson in about 15 minutes... can we push the iteration meeting back a couple hours?  Sorry for the late notice... I forgot we'd pushed the iteration meeting to today.
[13:59] <katco> natefinch: not really, i am taking the middle of today off to catch up on some things. you need to start checking your calendar dude
[14:00] <natefinch> katco: I know, I know.  Totally my fault. I'm sorry.
[16:37] <cherylj> Is there someone who owns the CentOS support within Juju?
[16:40] <cherylj> alexisb: ^^
[16:40] <cherylj> (I figure you'd be the most likely to know :)
[16:41] <alexisb> gsamfira and team did the work
[17:06] <natefinch> katco, wwitzel3, ericsnow_afk: how goes?
[17:06] <wwitzel3> natefinch: good, I'm just working on wpm bugs
[17:08] <bogdanteleaga> cherylj: I might be able to answer questions
[17:20] <katco> natefinch: pick up some of the bugs in the backlog if you don't mind
[17:20] <katco> wwitzel3: please tag the bug you're working on and move to actively working
[17:21] <natefinch> katco: will do
[17:21] <wwitzel3> katco: thanks
[17:46] <natefinch> katco: FYI: one bug was fixed by someone else, one was marked invalid, and one seems to be assigned to gsamfira, though that was 5 days ago, so I'm not sure if he's actually working on it.  The other bug in the backlog is being worked on by wwitzel3.  I could do my "clean up assigned bugs" task, unless you think there's something more important
[19:48] <alexisb> natefinch, pending katco's arrival, there are plenty of bugs against 1.25 you can tackle :)
[19:48] <alexisb> lots and lots
[19:48] <natefinch> alexisb: heh ok
[19:56] <mup> Bug #1424892 changed: rsyslog-gnutls is not installed when enable-os-refresh-update is false <cloud-init> <logging> <juju-core:Fix Released by natefinch> <juju-core 1.24:Fix Released by natefinch> <https://launchpad.net/bugs/1424892>
[20:12] <natefinch> mgz: don't suppose you're around?
[20:16] <natefinch> sinzui: is my CI blockers bookmark incorrect?  It shows no blockers, but trying to merge some code to main returns "does not match fixes-blah"  My bookmark, for reference: https://bugs.launchpad.net/juju-core/+bugs?field.status%3Alist=TRIAGED&field.status%3Alist=INPROGRESS&field.importance%3Alist=CRITICAL&field.tag=ci+regression+&field.tags_combinator=ALL
[20:53] <sinzui> natefinch: The status changes about 6 months ago, and the tags 3 months ago: look at this
[20:54] <sinzui> https://bugs.launchpad.net/juju-core/+bugs?field.status%3Alist=NEW&field.status%3Alist=CONFIRMED&field.status%3Alist=TRIAGED&field.status%3Alist=INPROGRESS&field.status%3Alist=FIXCOMMITTED&field.status%3Alist=INCOMPLETE_WITH_RESPONSE&field.status%3Alist=INCOMPLETE_WITHOUT_RESPONSE&field.importance%3Alist=CRITICAL&field.tag=ci+blocker+&field.tags_combinator=ALL
[20:54] <sinzui> natefinch: CI is testing the fixes now
[20:55] <sinzui> looks like the osx change is good too
[20:56] <natefinch> sinzui: mind if I add that link to the blocking bugs wiki page that Martin made today?  That way, hopefully it'll get updated if the requirements change
[20:57] <sinzui> natefinch: go ahead
[20:59] <natefinch> done
[20:59] <natefinch> thanks sinzui
[21:08] <mup> Bug #1473461 changed: OSX/darwin builds fail: undefined: password.EnsureJujudPassword <blocker> <ci> <osx> <regression> <juju-core:Fix Released by bteleaga> <https://launchpad.net/bugs/1473461>
[21:33] <thumper> ah mah gard, so many emails
[21:35] <thumper> fwereade: I'm here
[21:44] <fwereade> thumper, heyhey
[21:44] <fwereade> thumper, not so critical really, I think it's just JujuConnSuite being shite
[21:44] <fwereade> thumper, and I've convinced myself that it's an INFO log anyway so it's moot
[21:45] <fwereade> thumper, but if, in your Copious Free Time, you were to come up with a clean way of separating the logging (that wasn't just "replace JujuConnSuite"), that would be awesome
[21:51] <thumper> fwereade: there is a way...
[21:52]  * fwereade is all ears
[21:53] <thumper> the base suite brings in a logging suite
[21:53] <thumper> the logging suite captures the logs
[21:53] <thumper> and replaces the default logger (stderr) with one that goes to gocheck
[21:54] <thumper> so... wondering what the problem is
[21:54] <fwereade> thumper, well, me too, I'm vaguely assuming that because JCS has everything running all at once there's some global logging setup somewhere that dumps the state stuff into the stderr of the testing.Context
[21:55] <fwereade> thumper, I imagine the cmd.Logger or whatever it is has a hand in it?
[21:55] <thumper> IIRC, there was some change to the default loggers with the log roller
[21:55] <fwereade> thumper, and it's not wrong to be sending all those logs to stdout
[21:55] <thumper> but I've not looked deeply
[21:56] <fwereade> thumper, it's just that it's happening in the same process, which is out of the ordinary, and so gets logged with everything else
[21:56] <fwereade> thumper, I guess the answer with that specific test is to run it against an api stub and check it doesn't log when nicely isolated
[21:58] <thumper> :)
[22:36] <davecheney> \o/
[22:49] <thumper> hi davecheney
[22:50] <thumper> davecheney: how'd the conference go for you?
[22:51] <davecheney> thumper: excellently
[22:51] <davecheney> i guess that means I beat axw back to austalia
[23:13] <bradm> is there any way to see what jujud is doing, load wise?  we've got one constantly sitting between 100% - 150% cpu, and the logs aren't particularly illuminating - doesn't look too busy at all
[23:20] <mup> Bug #1446871 changed: Unit hooks fail on windows if PATH is uppercase <ci> <hooks> <windows> <juju-core:Fix Released by natefinch> <juju-core 1.24:Fix Released by natefinch> <https://launchpad.net/bugs/1446871>
[23:43] <thumper> bradm: best suggestion is to change the log settings to debug
[23:43] <thumper> bradm: or are they at debug already?
[23:44]  * thumper takes a deep breath and resolves conflicts between master and jes-cli branch
[23:56] <bradm> thumper: we have a 20G log file, so either we're on debug or its very very verbose for info, but we'll check.
[23:57] <bradm> yes, we're definitely on debug
[23:57] <bradm> we're seeing a lot about ClaimLeadership