/srv/irclogs.ubuntu.com/2015/07/13/#juju-dev.txt

* wallyworld sighs. ocr on leave means reviews kinda stall01:15
alexisbwallyworld, we should start having folks assign back-up for ocr when they go on vacation01:20
wallyworldyes we should - we used to have that as a policy01:20
wallyworldi guess we still do in theory01:20
wallyworldfolks just have to follow it :-)01:20
menn0wallyworld: just so you know, i'm currently dealing with a nasty upgrade issue that CTS has run in to02:04
wallyworldoh no02:04
wallyworldbug?02:04
menn0wallyworld: seems to affect any upgrade from 1.24.x02:05
menn0wallyworld: the agent won't restart... a worker is failing to stop02:05
menn0wallyworld: only seems to happen with big complex envs (bootstack in this case)02:05
menn0wallyworld: i'm using the bootstack staging env atm for the repro and am making some progress02:05
menn0wallyworld: RT 82240... i'll make sure there's a Juju ticket if there isn't already02:06
menn0thumper was looking at this last week but I've taken it over in his absence02:06
wallyworldoh, ty02:06
wallyworldmenn0: is it bug 146865302:07
mupBug #1468653: jujud hanging after upgrading from 1.24.0 to 1.24.1(and 1.24.2) <canonical-bootstack> <juju-core:Triaged> <juju-core 1.24:In Progress by thumper> <https://launchpad.net/bugs/1468653>02:07
alexisbmenn0, that is a bootstack bug02:07
menn0alexisb: I don't think so02:08
menn0alexisb: jujud is definitely not doing the right thing... it's getting stuck when trying to shut down02:08
menn0wallyworld: that is the ticket though (LP was timing out for me)02:08
wallyworldok02:09
menn0wallyworld: this is definitely leadership related02:58
wallyworldoh joy02:58
menn0wallyworld: when I reproed there were 9 stuck API connections02:58
menn0wallyworld: and there were 9 goroutines waiting in BlockUntilLeadershipReleased02:58
menn0wallyworld: in 1.24.0 at least there's a naked channel read there02:58
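A minimal sketch of what a "naked channel read" means here and why it can keep a goroutine stuck at shutdown. This is not the Juju code: `blockNaked`, `blockUntilReleased`, `errStopped` and the `stop` channel are hypothetical names used only for illustration. The naked form returns only if the channel it reads ever fires; the `select` form can also be told to give up.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errStopped is a hypothetical error returned when the wait is abandoned.
var errStopped = errors.New("stopped before leadership was released")

// blockNaked is the "naked" form: a bare receive with no way out.
// If released never fires, the calling goroutine blocks forever.
func blockNaked(released <-chan struct{}) {
	<-released
}

// blockUntilReleased is the guarded form: it waits for either the
// release event or a stop signal, whichever comes first.
func blockUntilReleased(released, stop <-chan struct{}) error {
	select {
	case <-released:
		return nil
	case <-stop:
		return errStopped
	}
}

func main() {
	released := make(chan struct{}) // never closed in this demo
	stop := make(chan struct{})
	done := make(chan error, 1)

	go func() { done <- blockUntilReleased(released, stop) }()

	// Simulate the agent shutting down while the wait is in progress.
	close(stop)

	select {
	case err := <-done:
		fmt.Println("wait returned:", err)
	case <-time.After(time.Second):
		fmt.Println("wait is still blocked")
	}
}
```

Swapping `blockUntilReleased` for `blockNaked` in the goroutine would leave it blocked after `close(stop)`, which is the shape of the hang described above.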
wallyworldstuck before upgrade during shutdown of agents?02:59
menn0wallyworld: yep02:59
menn0wallyworld: when did you fix all the naked channel ops?02:59
wallyworldi didn't fix those02:59
wallyworldi think maybe william did?02:59
wallyworldor tim?02:59
menn0ok, I thought you did some03:00
wallyworldif i did i can't remember03:00
menn0anyway, I should hopefully have this soon03:00
wallyworldi guess we think that 1.24.2 is ok03:00
wallyworldand 1.24.0 upgrades may need a manual process if it hangs03:01
menn0not sure, I think the problem still happens when upgrading from 1.24.203:01
menn0I'll check that soon03:01
wallyworldok03:01
menn0it takes about 30 mins to build up the env to test it03:02
menn0and I don't want to tear it down just yet until I've finished looking at this env03:03
wallyworldok03:08
menn0wallyworld: ok, i understand the problem now03:17
menn0wallyworld: checking to see if someone has already fixed it in a later 1.24 or in master03:17
wallyworldok03:17
menn0wallyworld: basically if any of the leadership API requests are active (and some are quite long running) while an upgrade is initiated the server will get stuck03:18
menn0wallyworld: the more units you have the more likely you are to hit the problem03:19
wallyworldmenn0: yes, that sounds very plausible based on what has been observed before and what andrew/william fixed03:20
wallyworldi don't think there's a fix we can do because 1.24.0 is already running03:20
menn0wallyworld: that's true, but it would be good to ensure that the next 1.24.0 doesn't have the problem03:22
menn0sorry 1.24.x03:22
menn0wallyworld: from looking at the code, the problem is still there in master03:23
menn0wallyworld: hangout?03:23
wallyworldsure, sec03:23
menn0wallyworld: onyx standup?03:24
menn0wallyworld: it's not actually as simple as we thought... the thing returned by NewLeaseManager is actually the singleton which is supposedly getting killed04:02
menn0wallyworld: there must be some other aspect04:02
wallyworldotp, sec04:03
menn0wallyworld: no worries04:03
menn0wallyworld: i'll sort it out04:03
wallyworldty04:04
menn0wallyworld: I have a fix for the problem08:51
menn0wallyworld: it's way past my EOD and i'm not working tomorrow so I'm writing up notes for thumper so he can write some tests around it and land it08:51
menn0wallyworld: we're not out of the woods yet though... post upgrade about 50% of the units on this env have hook failures08:52
menn0wallyworld: will send an email08:52
wallyworldmenn0: thanks for sticking with it, i'll talk to tim tomorrow08:52
mupBug #1471231 changed: debugLogDBIntSuite teardown fails <ci> <unit-tests> <juju-core db-log:Fix Committed> <https://launchpad.net/bugs/1471231>09:48
perrito666morning12:46
alexisbmorning perrito66612:59
perrito666there is something about annual medical check that makes me feel old13:04
* perrito666 sighs and makes appt13:04
anastasiamacperrito666: wait until u get kids... :)13:17
ashipikaanastasiamac: +113:18
sinzuiperrito666: katco I cannot find the on-call reviewer calendar. Any clues?13:25
katcosinzui: it's just the juju team calendar13:25
sinzuikatco: Any more clues? Canonical's Google Calendar tells me none of the juju email addresses have a calendar.13:35
sinzuiwwitzel3: Can you review http://reviews.vapour.ws/r/2140/13:41
natefinchman, coming back from vacation is always so hard13:42
katconatefinch: o/ hope you had a good time13:43
natefinchkatco: amazing time.  Could have used another week (and a raise to be able to afford it ;)13:43
TheMuekatco: he had, seen it on Instagram ;)13:44
katconatefinch: lol13:44
katcoTheMue: :)13:44
wwitzel3sinzui: taking a look now13:44
TheMuenatefinch: looked like a lot of fun in a cool environment13:45
natefinchTheMue: it was great.  We did it last year in a house half this size... the extra interior space and nicer beach made this year even better.13:45
natefinch(and 50% more expensive... but worth it)13:45
katcoericsnow: wwitzel3: natefinch: we have 2 meetings overlapping. just meet in moonstone13:46
wwitzel3sinzui: just the dep updates? I was able to update to them and build juju and bootstrap, so combined with the tests you did, LGTM.13:46
TheMuenatefinch: your family grew, you need the space13:46
TheMue;)13:46
natefinchTheMue: yeah, I gotta stop doing thing13:47
TheMuenatefinch: cute family, no need to stop13:47
sinzuiwwitzel3: it is, I just wanted a dev to ask the hard questions about consequences. Thank you. This is the comparable branch for master. http://reviews.vapour.ws/r/2141/13:47
natefinchTheMue: haha... the number of bedrooms in my house, seats in my car, and lack of hair on my head say otherwise ;)13:49
TheMuenatefinch: hmm, ok, there are constraints, yep :D13:49
perrito666sinzui: tim and wayne13:58
sinzuithank you perrito666 katco and fwereade sorted me out13:58
natefinchkatco: I didn't check my calendar until now and just realized we have the iteration meeting nowish.  I have to take my daughter to a swim lesson in about 15 minutes... can we push the iteration meeting back a couple hours?  Sorry for the late notice... I forgot we'd pushed the iteration meeting to today.13:58
katconatefinch: not really, i am taking the middle of today off to catch up on some things. you need to start checking your calendar dude13:59
natefinchkatco: I know, I know.  Totally my fault. I'm sorry.14:00
=== ericsnow is now known as ericsnow_afk
cheryljIs there someone who owns the CentOS support within Juju?16:37
cheryljalexisb: ^^16:40
cherylj(I figure you'd be the most likely to know :)16:40
alexisbgsamfira and team did the work16:41
natefinchkatco, wwitzel3, ericsnow_afk: how goes?17:06
wwitzel3natefinch: good, I'm just working on wpm bugs17:06
bogdanteleagacherylj: I might be able to answer questions17:08
katconatefinch: pick up some of the bugs in the backlog if you don't mind17:20
katcowwitzel3: please tag the bug you're working on and move to actively working17:20
natefinchkatco: will do17:21
wwitzel3katco: thanks17:21
=== liam_ is now known as Guest62504
natefinchkatco: FYI: one bug was fixed by someone else, one was marked invalid, and one seems to be assigned to gsamfira, though that was 5 days ago, so I'm not sure if he's actually working on it.  The other bug in the backlog is being worked on by wwitzel3.  I could do my "clean up assigned bugs" task, unless you think there's something more important17:46
=== ericsnow_afk is now known as ericsnow
alexisbnatefinch, pending katco's arrival, there are plenty of bugs against 1.25 you can tackle :)19:48
alexisblots and lots19:48
natefinchalexisb: heh ok19:48
mupBug #1424892 changed: rsyslog-gnutls is not installed when enable-os-refresh-update is false <cloud-init> <logging> <juju-core:Fix Released by natefinch> <juju-core 1.24:Fix Released by natefinch> <https://launchpad.net/bugs/1424892>19:56
natefinchmgz: don't suppose you're around?20:12
natefinchsinzui: is my CI blockers bookmark incorrect?  It shows no blockers, but trying to merge some code to main returns "does not match fixes-blah"  My bookmark, for reference: https://bugs.launchpad.net/juju-core/+bugs?field.status%3Alist=TRIAGED&field.status%3Alist=INPROGRESS&field.importance%3Alist=CRITICAL&field.tag=ci+regression+&field.tags_combinator=ALL20:16
sinzuinatefinch: The status changed about 6 months ago, and the tags 3 months ago: look at this20:53
sinzuihttps://bugs.launchpad.net/juju-core/+bugs?field.status%3Alist=NEW&field.status%3Alist=CONFIRMED&field.status%3Alist=TRIAGED&field.status%3Alist=INPROGRESS&field.status%3Alist=FIXCOMMITTED&field.status%3Alist=INCOMPLETE_WITH_RESPONSE&field.status%3Alist=INCOMPLETE_WITHOUT_RESPONSE&field.importance%3Alist=CRITICAL&field.tag=ci+blocker+&field.tags_combinator=ALL20:54
sinzuinatefinch: CI is testing the fixes now20:54
sinzuilooks like the osx change is good too20:55
natefinchsinzui: mind if I add that link to the blocking bugs wiki page that Martin made today?  That way, hopefully it'll get updated if the requirements change20:56
sinzuinatefinch: go ahead20:57
natefinchdone20:59
natefinchthanks sinzui20:59
mupBug #1473461 changed: OSX/darwin builds fail: undefined: password.EnsureJujudPassword <blocker> <ci> <osx> <regression> <juju-core:Fix Released by bteleaga> <https://launchpad.net/bugs/1473461>21:08
thumperah mah gard, so many emails21:33
thumperfwereade: I'm here21:35
fwereadethumper, heyhey21:44
fwereadethumper, not so critical really, I think it's just JujuConnSuite being shite21:44
fwereadethumper, and I've convinced myself that it's an INFO log anyway so it's moot21:44
fwereadethumper, but if, in your Copious Free Time, you were to come up with a clean way of separating the logging (that wasn't just "replace JujuConnSuite"), that would be awesome21:45
thumperfwereade: there is a way...21:51
* fwereade is all ears21:52
thumperthe base suite brings in a logging suite21:53
thumperthe logging suite captures the logs21:53
thumperand replaces the default logger (stderr) with one that goes to gocheck21:53
thumperso... wondering what the problem is21:54
fwereadethumper, well, me too, I'm vaguely assuming that because JCS has everything running all at once there's some global logging setup somewhere that dumps the state stuff into the stderr of the testing.Context21:54
fwereadethumper, I imagine the cmd.Logger or whatever it is has a hand in it?21:55
thumperIIRC, there was some change to the default loggers with the log roller21:55
fwereadethumper, and it's not wrong to be sending all those logs to stdout21:55
thumperbut I've not looked deeply21:55
fwereadethumper, it's just that it's happening in the same process, which is out of the ordinary, and so gets logged with everything else21:56
fwereadethumper, I guess the answer with that specific test is to run it against an api stub and check it doesn't log when nicely isolated21:56
thumper:)21:58
davecheney\o/22:36
thumperhi davecheney22:49
thumperdavecheney: how'd the conference go for you?22:50
davecheneythumper: excellently22:51
davecheneyi guess that means I beat axw back to Australia22:51
bradmis there any way to see what jujud is doing, load wise?  we've got one constantly sitting between 100% - 150% cpu, and the logs aren't particularly illuminating - doesn't look too busy at all23:13
mupBug #1446871 changed: Unit hooks fail on windows if PATH is uppercase <ci> <hooks> <windows> <juju-core:Fix Released by natefinch> <juju-core 1.24:Fix Released by natefinch> <https://launchpad.net/bugs/1446871>23:20
thumperbradm: best suggestion is to change the log settings to debug23:43
thumperbradm: or are they at debug already?23:43
* thumper takes a deep breath and resolves conflicts between master and jes-cli branch23:44
bradmthumper: we have a 20G log file, so either we're on debug or it's very very verbose for info, but we'll check.23:56
bradmyes, we're definitely on debug23:57
bradmwe're seeing a lot about ClaimLeadership23:57
