[00:00] <mup> Bug #1494947 opened: Panic keyManagerSuite.TestImportKeys on wily <ci> <intermittent-failure> <panic> <unit-tests> <wily> <juju-core:Triaged> <https://launchpad.net/bugs/1494947>
[00:00] <mup> Bug #1494948 opened: Panic charmVersionSuite.TearDownTest on wily <ci> <intermittent-failure> <panic> <unit-tests> <wily> <juju-core:Triaged> <https://launchpad.net/bugs/1494948>
[00:00] <mup> Bug #1494949 opened: Panic charmVersionSuite.TearDownTest on wily <ci> <intermittent-failure> <panic> <unit-tests> <wily> <juju-core:Triaged> <https://launchpad.net/bugs/1494949>
[00:06] <davecheney> hello
[00:06] <davecheney> both the blocking bugs have been marked fixed committed
[00:06] <davecheney> how long until CI is unblocked
[00:06] <davecheney> thanks
[00:27] <mup> Bug #1496184 opened: juju bootstrap on armhf/keystone hangs   juju version 1.24.5 <juju-core:New> <https://launchpad.net/bugs/1496184>
[00:54] <mup> Bug #1496188 opened: panic in juju worker <juju-core:New> <https://launchpad.net/bugs/1496188>
[01:07] <rick_h_> anastasiamac_: around?
[01:12] <anastasiamac_> rick_h_: Hi :D
[01:13] <rick_h_> anastasiamac_: howdy, so the meeting time is a bit off for my tastes. So I wanted to check in and ask what I could provide to help in my absence?
[01:13] <anastasiamac_> rick_h_: apologies for funny timing for the meeting - it's just a placeholder... plz let me know what works for u :)
[01:13] <rick_h_> anastasiamac_: aside from the info on the bundle break down lib and such, is there anything else you needed on my end?
[01:13] <anastasiamac_> rick_h_: ur presence? :P
[01:14] <rick_h_> anastasiamac_: :)
[01:14] <anastasiamac_> rick_h_: what time would suit u better?
[01:14] <rick_h_> anastasiamac_: the other one worked better, that's just getting up early
[01:15] <rick_h_> anastasiamac_: but if you know what you're looking for input-wise maybe I can do it async?
[01:15] <rick_h_> I don't want to hold things up on me
[01:15] <rick_h_> as I think we had most things in hand back in the spec aside from the new bundle stuff and I've given you all the details on that
[01:17] <anastasiamac_> rick_h_: from my perspective, I think m k at the moment with info
[01:17] <anastasiamac_> rick_h_: would it suit u better if I'll try to move the meeting by 5hrs?
[01:17] <rick_h_> anastasiamac_: ok, nvm. I'll be there
[01:17] <rick_h_> anastasiamac_: don't worry about it
[01:17] <rick_h_> anastasiamac_: forget I was here... /me disappears into the night
[01:17] <anastasiamac_> rick_h_: i would like u to sleep :D
[01:18] <anastasiamac_> but I'll move the meeting to later as I think it may suit u and John better :D
[01:18] <rick_h_> anastasiamac_: naw, I like to party :P
[01:18] <rick_h_> anastasiamac_: ask my team, it's not that bad for me anyway and I can sleep in thus :)
[01:19] <anastasiamac_> rick_h_: if u reconsider, m happy to move to more suitable time ;D
[01:19] <rick_h_> anastasiamac_: all good, go back to happy work thoughts for your day
[01:20] <anastasiamac_> rick_h_: thnx :D get rest!
[02:49] <thumper> menn0: hey, got time to talk about the unit agent upgrades?
[02:49] <thumper> davecheney: care to look at this bug? http://reports.vapour.ws/releases/3059/job/run-unit-tests-trusty-ppc64el/attempt/3804
[02:50] <thumper> davecheney: weird that it happens on power but not amd64
[02:53] <menn0> thumper: yep
[02:54] <thumper> davecheney: although, that code looks so fucked up
[02:54]  * thumper is asking menn0 too
[02:54] <thumper> menn0: standup hangout?
[02:54] <menn0> thumper: see you there
[02:55]  * thumper wonders how this code works
[03:07] <davechen1y> thumper: honestly no idea
[03:07] <thumper> davechen1y: anyway, I've got this one
[03:07] <davechen1y> we've had continual problems getting the right version of gccgo and libgo5 on ppc64 machines
[03:31] <thumper> menn0: this is the uniter panic on power http://reviews.vapour.ws/r/2671/diff/#
[03:31] <thumper> davecheney: I have a golang question...
[03:31] <thumper> if I have a function, and that function defines a pointer variable
[03:32] <thumper> and in that same function I create some other closures (functions) that refer to that variable
[03:32] <thumper> are the closures defined to refer to that variable by reference?
[03:33] <davecheney> thumper: yes
[03:33] <thumper> good
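The behaviour davecheney confirms here is standard Go semantics: closures capture enclosing variables by reference, so every closure sees later assignments to the variable. A minimal standalone illustration (all names invented for the example, not juju code):

```go
package main

import "fmt"

func main() {
	var p *int // pointer variable defined in the enclosing function

	read := func() *int { return p } // closure referring to p
	n := 7
	write := func() { p = &n } // closure assigning to p

	fmt.Println(read() == nil) // true: p is still nil at this point
	write()                    // mutate p via the second closure
	fmt.Println(*read())       // 7: the first closure sees the new value
}
```

Both closures share the single variable `p`; neither gets a copy taken at the time the closure was created.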
[03:34] <davecheney> thumper: uniter fix LGTM
[03:35] <thumper> davecheney: ta
[03:46] <mup> Bug #1496217 opened: panic in cmd/jujud on power <blocker> <ci> <regression> <juju-core:In Progress by thumper> <https://launchpad.net/bugs/1496217>
[03:46] <mup> Bug #1496221 opened: Whitespace or new lines in "juju status --format=tabular" output <juju-core:New> <https://launchpad.net/bugs/1496221>
[03:50] <davecheney> thumper: when do "fix committed" blockers roll off the dashboard ?
[03:50] <thumper> davecheney: when they have passed CI
[03:51] <thumper> wallyworld: http://reports.vapour.ws/releases/3059/job/aws-upgrade-20-trusty-amd64/attempt/29 fix didn't fix
[03:52] <davecheney> thumper: do we have to wait til tomorrow for that ?
[03:52] <thumper> davecheney: I just removed one by marking it fix released
[03:52] <thumper> davecheney: we are down to two failures
[03:52] <thumper> I have submitted a fix for one of them
[03:52] <thumper> which in honesty I think will change a panic into a different failure
[03:52] <thumper> the other was the upgrade one...
[03:54] <davecheney> thumper: true, if that value was nil
[03:54] <davecheney> something else failed to initialise
[03:55] <thumper> holy shit mup is somewhat behind
[03:57] <thumper> wallyworld: is the uniter failure due to the missing upgrade step?
[04:04] <mup> Bug #1495591 changed: TestRunCommand fails on windows <blocker> <ci> <test-failure> <windows> <juju-core:Fix Released by cmars> <https://launchpad.net/bugs/1495591>
[04:05] <anastasiamac_> thumper: which uniter failure r u referring to?
[04:05] <thumper> anastasiamac_: the most recent failure from ci - http://reports.vapour.ws/releases/3059/job/aws-upgrade-20-trusty-amd64/attempt/29
[04:06] <thumper> menn0: I have a hacky fix for the uniter upgrade step in a 1.25 branch, running tests now
[04:07] <menn0> thumper: cool
[04:14] <thumper> wallyworld: after chatting with anastasiamac_ the answer I think is "yes, the lack of uniter upgrade steps is causing the upgrade failure"
[04:14]  * thumper has a hacky fix
[04:15]  * thumper shakes his fist at the sky
[04:15]  * thumper is grumpy
[04:15] <thumper> in our current tests... even 1.25, I've come across three different intermittently failing tests
[04:19] <davecheney> thumper: -race
[04:20] <thumper> bah humbug
[04:20] <thumper> no presents for anyone this year
[04:22] <thumper> github.com/juju/juju/worker/peergrouper fails every time for me...
[04:22]  * thumper sighs
[04:22] <davecheney> yeah, that one has failed for months for me
[04:22] <davecheney> especially on non intel platforms
[04:23] <anastasiamac_> for bug 1495542, i think that the reported bug is no longer a blocker as the test exercising the unused code which caused the trace has been skipped...
[04:23] <anastasiamac_> This bug still needs to be addressed but it should not be blocking master.
[04:23] <anastasiamac_> The failure observed now seems to be different - the trace looks different. What we currently observe MAY warrant another bug ;D
[04:23] <mup> Bug #1495542: 1.20.x cannot upgrade to 1.26-alpha1 <blocker> <ci> <regression> <upgrade-juju> <juju-core:In Progress by anastasia-macmood> <https://launchpad.net/bugs/1495542>
[04:24] <anastasiamac_> thumper gives presents? each year?
[04:24] <thumper> anastasiamac_: not any more...
[04:24] <anastasiamac_> thumper: no need to stop on my account :)
[04:25] <thumper> davecheney: I'm running go 1.5 now, wondering if that is related...
[04:25]  * thumper hits the -race fails with the go package...
[04:25] <thumper> no races for me
[04:26] <davecheney> thumper: almost certainly
[04:26] <davecheney> see long discussion about getting 1.5 rolled out everywhere before committing to 1.6 for 16.04
[04:32]  * thumper nods
[04:32] <thumper> wallyworld: actually back to the upgrade failure, it seems that yes, while there is an error with a bad kind, things progress anyway
[04:32] <wallyworld> thumper: "ModeAbide: cannot set invalid status "started"" - that looks like it's caused by new juju status enums not being processed correctly, but that code has been in place for months. not sure why that is showing up now.
[04:32] <davecheney> thumper: AFAIK juju passes on intel with 1.5
[04:32] <wallyworld> maybe it was always there and we didn't notice
[04:33] <davecheney> it almost passes on non intel with go 1.5
[04:33] <thumper> wallyworld: the fundamental error is that the source sets relation config, the sink never runs relation-changed hook
[04:33] <davecheney> but slowness makes it hard to be absolute here
[04:33] <davecheney> see previous plea for faster ppc64 hardware
[04:33] <thumper> davecheney: amd64 and go 1.5 fails peergrouper almost every time for me
[04:33] <wallyworld> thumper: ok, will start at that point
[04:34] <wallyworld> thumper: peergrouper tests kinda suck (from memory). they need to be transformed into proper unit tests
[04:34] <thumper> wallyworld: ack
[04:35] <davecheney> thumper: try GOMAXPROCS=1 go test .../peergrouper
[04:35] <davecheney> that will make the scheduler look more like the 1.2.1 scheduler
[04:35] <davecheney> and may improve reliability
[04:35] <thumper> menn0: is this what you were thinking for the uniter upgrades?  http://reviews.vapour.ws/r/2672/diff/#
[04:35]  * menn0 looks
[04:35] <davecheney> if so, then those tests need to be fixed to not expect a certain order of operations
[04:35]  * thumper pokes the peergrouper
[04:37] <sinzui> anastasiamac_: As long as master is not blessed, master is blocked. If you close bug 1495542, you need to replace it with a critical ci regression bug because stakeholders will not allow us to release if the test shows they cannot upgrade from 1.20 to 1.24.
[04:37] <mup> Bug #1495542: 1.20.x cannot upgrade to 1.26-alpha1 <blocker> <ci> <regression> <upgrade-juju> <juju-core:In Progress by anastasia-macmood> <https://launchpad.net/bugs/1495542>
[04:38] <sinzui> anastasiamac_: CI cares about tests passing; fixing bugs without getting the tests passing does not mean we can release
[04:39] <menn0> thumper: that's pretty much what I was thinking although in my mind those upgrade steps would be run from the unit agent Run()
[04:39] <anastasiamac_> sinzui: ah... my bad - the bug is not about the trace but upgrade difficulties :D of course, it should stay :)
[04:40] <menn0> thumper: calling them from Run is probably slightly preferable in case there are some others we need to add that aren't uniter related
[04:40] <menn0> thumper: also, I thought you said there were 2 upgrade steps like this?
[04:42] <thumper> menn0: the other one is introduced in master, this is a 1.25 branch
[04:43] <thumper> where is the unit agent Run func?
[04:43] <menn0> cmd/jujud/agent/unit.go
[04:44] <menn0> thumper: ^
[04:44] <thumper> ta
[04:45] <menn0> FFS!
[04:45] <menn0> I'm getting TLS handshake errors with a freshly bootstrapped env using master
[04:46] <menn0> actually... ignore me
[04:47] <thumper> davecheney: FWIW setting GOMAXPROCS=1 makes the peergrouper tests pass
[04:47] <menn0> I think there's agents from a hosted env trying to contact the previous state server
[04:47] <thumper> heh
[04:56] <thumper> menn0: I'll move the run upgrades func to the unit agent code
[04:57] <menn0> thumper: they're probably also fine where they are (in the uniter package)
[04:57] <thumper> menn0: nah...
[04:57] <thumper> moving already
[05:07] <mup> Bug #1496237 opened: peergrouper tests very unstable with Go 1.5 <intermittent-failure> <tech-debt> <test-failure> <juju-core:Triaged> <https://launchpad.net/bugs/1496237>
[05:10] <thumper> menn0: this better? http://reviews.vapour.ws/r/2672/diff/#
[05:14]  * menn0 looks
[05:15] <menn0> thumper: LGTM
[05:16] <thumper> menn0: I'll await fwereade's comment
[05:16] <menn0> cool
[05:29] <thumper> ok... I'm pretty much done for today
[05:29] <thumper> see all y'all tomorrow
[06:30] <jam1> wallyworld: ping
[06:31] <wallyworld> jam1: hi
[06:32] <jam> wallyworld: so of Series in Charm Metadata and Resources, I think we'd rather focus on the former
[06:33] <jam> I know you poked me about the latter, and I can try to bring it up in 30 min, though I have other higher priority right now.
[06:33] <wallyworld> jam: yeah, that work is underway. i'd also like to get resources spec at least approved so we can start on it this cycle. eco really wants it. maybe you can do it next week, or just ask the spec to be reviewed over the next few days
[06:34] <wallyworld> jam: series in metadata won't take 100% so there will be a little spare capacity
[07:39] <axw> fwereade: I'm debugging #1495542, getting a lot of "2015-09-16 07:24:41 DEBUG juju.worker.dependency engine.go:438 "uniter" manifold worker stopped: getting resource leadership-tracker: dependency not available"
[07:39] <mup> Bug #1495542: 1.20.x cannot upgrade to 1.26-alpha1 <blocker> <ci> <regression> <upgrade-juju> <juju-core:In Progress by anastasia-macmood> <https://launchpad.net/bugs/1495542>
[07:40] <axw> fwereade: restarting the API server did not appear to help, but restarting the unit agent did. makes me suspect the dependency engine isn't delivering a new API connection to the leadership tracker... kind of clutching at straws though
[07:41] <axw> fwereade: other part to the error: "leadership-tracker" manifold worker returned unexpected error: leadership failure: leadership manager stopped
[07:42] <axw> fwereade: and after I restarted the API server, I was getting: engine.go:438 "leadership-tracker" manifold worker stopped: leadership failure: error making a leadership claim: connection is shut down
[07:44] <axw> cmars: ^^ on upgrade from 1.20 to 1.26, "metricspooldir" doesn't exist. is there meant to be an upgrade step for this?
[07:44] <dimitern> axw, o/
[07:44] <axw> dimitern: heya
[07:45] <dimitern> axw, just a heads up - we decided to do a simpler implementation (not as effective as I'd like) for provisioning instances into subnets of a space, only in EC2 and not integrating that with AZ distribution
[07:46] <dimitern> axw, then, we should have a discussion how to properly do it with you and/or fwereade
[07:47] <axw> dimitern: okey dokey, so does it disable distribution when a space is specified?
[07:48] <dimitern> axw, it tries to accommodate both - i.e. starting from all AZs and restricting them to only those relevant to a given space
[07:49] <dimitern> any reviewers? this is a forward port to master of the 1.25 fix for bug 1492066: http://reviews.vapour.ws/r/2673/ - please take a look
[07:49] <mup> Bug #1492066: cloud-init fails when deploying CentOS with Juju. <centos> <cloud-init> <juju> <juju-core:In Progress by dimitern> <juju-core 1.25:Fix Committed by dimitern> <https://launchpad.net/bugs/1492066>
[07:49] <axw> dimitern: hm ok, I don't really get how you implement DistributeInstances correctly then. I might need to see code to understand
[07:51] <dimitern> axw, I'll send you a link to the PR by mail when I'm done
[07:51] <axw> dimitern: thanks
[07:51] <fwereade> axw, sorry, processing
[08:01] <fwereade> axw, ok, I'm not really managing to form a coherent theory at the moment -- that latest log does rather look like the various workers are coming up? but we've got weird settings revision numbers?
[08:02] <fwereade> axw, but then the leadership errors you posted above make it look like the problem is in
[08:02] <axw> fwereade: it did in CI, but when I repro'd it just continuously restarted manifold workers
[08:02] <fwereade> axw, fuck, I think I might know what's happening
[08:03] <fwereade> axw, we've never had a mechanism for usefully bouncing the workers that live under state
[08:04] <axw> fwereade: I saw the "HackLeadership" thing, which is called when the API server stops handling a client connection
[08:05] <axw> fwereade: isn't that the only time we need to bounce it...?
[08:05] <fwereade> axw, if it encounters some unknown error, it'll stop on its own
[08:06] <fwereade> axw, but then I don't quite get what happens when the apiserver restarts
[08:06] <axw> fwereade: I didn't see any "stopping leadership manager with error", but I'm about to repro again so I'll double check
[08:09] <fwereade> axw, error logging around state/leadership/manager.go:38 would detect that, I think
[08:09] <fwereade> axw, I don't think there's anything in state that'd otherwise notice the error until state itself is closed
[08:10] <fwereade> axw, hopefully we'd log that one, but it could be a long time after it actually happened
[08:10] <axw> fwereade: lol. third time's the charm, didn't repro this time :/
[08:10] <fwereade> goddammit
[08:11] <axw> fwereade: settings version is still screwed though
[08:12] <fwereade> axw, huh
[08:12] <fwereade> axw, so the settings version was right before the upgrade and something overwrote it?
[08:12] <axw> fwereade: not sure yet. I upgraded, ran a "config-set" which triggers "relation-set", remote state sees it... but it's the same as what's in local state already
[08:13] <fwereade> axw, is the relation-set to a unique value?
[08:13] <fwereade> axw, we elide null changes
[08:13] <axw> fwereade: yes
[08:14] <fwereade> axw, just checking :(
[08:15] <fwereade> axw, I guess check the rel-state/mgo-state versions before the upgrade and see which get touched by the upgrade?
[08:15] <fwereade> axw, either something's breaking the versions
[08:15] <axw> fwereade: yeah. gotta got for a bit, I'll try that after
[08:15] <fwereade> axw, or the versions are already broken
[08:15] <fwereade> axw, cheers
[08:18]  * fwereade has to depend on wifi through walls for a few hours, please forgive any spottiness
[08:43] <frankban> menn0 or katco: could you please take a look at http://reviews.vapour.ws/r/2633/ when you have time? proposed against a feature branch
[09:01] <frobware> dimitern, voidspace, jam: standup time if you're coming...
[09:02] <voidspace> frobware: I'm here
[09:02] <jam> frobware: omw
[09:07] <axw> fwereade: welp... before upgrade, api server sent a relation units change with version=3. after upgrade, the initial change event has version=2
[09:13] <axw> fwereade: I think this might be because of the env-uuid migration... txn-revno would reset, because the doc is removed and re-added. but not sure why it hasn't been an issue before.
[09:14] <fwereade> axw, holy hell, that is horrifying
[09:15] <fwereade> axw, ok, if we're going to address this -- as we must -- let's do it right and maintain our own revno
[09:15] <axw> fwereade: yep, SGTM
[09:15] <axw> fwereade: any thoughts on why this wouldn't have been an issue until now?
[09:16] <fwereade> axw, no idea at all
[09:17] <fwereade> axw, ah-ha
[09:17] <fwereade> axw, ...or not
[09:17] <axw> heh :)
[09:18] <fwereade> axw, hmm. did we change from remove/add in one txn to remove-a-bunch then add-a-bunch?
[09:18] <fwereade> axw, I'm pretty sure that if we did a remove/add in one txn it'd preserve revno
[09:19] <fwereade> axw, separate remove then add would be dangerous, I hope we don't do that
[09:19] <axw> fwereade: one big txn, but they're different docs - they have different IDs
[09:20] <fwereade> axw, but I suspect a separate add then remove would be safe *except* for changing the underlying revno
[09:20] <axw> fwereade: https://github.com/juju/juju/blob/master/state/upgrades.go#L1133
[09:21] <axw> fwereade: the doc inserted would have txn-revno in it I guess, but I think that'd be overwritten by the txn code?
[09:21] <fwereade> axw, bugger, I would appear to be completely wrong
[09:21] <fwereade> axw, I have to double-check mgo/txn
[09:22] <axw> fwereade: and I'll do some experiments in that code
[09:38] <fwereade> axw, ha
[09:38] <frobware> dimitern, so I think I see why the number of sockets in close_wait dropped so dramatically: Out of memory: Kill process 4280 (jujud) score 699 or sacrifice child
[09:38] <fwereade> axw, yes, I am completely wrong
[09:40] <axw> fwereade: as in, revno is not preserved?
[09:41] <fwereade> axw, yeah -- I now think all the new docs will be inserted with revno 2
[09:41] <axw> fwereade: that matches what I saw
[09:41] <fwereade> axw, and possibly we do a few subsequent txns on them to bump those up further?
[09:42] <fwereade> axw, but it'll only work right if we bump them far enough?
[09:42] <fwereade> axw, making it more likely to fail (subtly at first) in long-lived environments
[09:42] <axw> fwereade: right... so I have nfi how nobody has found this before
[09:43] <axw> fwereade: sorry gotta go, I'll bbl
[09:43] <fwereade> axw, take care
[09:44] <axw> fwereade: perhaps before we were using !=, now we're using >
[09:44]  * axw actually goes
[09:45] <fwereade> axw, I thought we always used > but could be wrong
[09:54] <voidspace> frobware: dimitern: so the dimensions I have for the juju poster are wrong for the printer - I need to spend a bit of time getting the graphic the right size so I can get it printed for Friday :-/
[09:55] <voidspace> frobware: dimitern: hopefully not too long, but I'm currently derailed onto that as it's a bit urgent (this is for PyCon UK and the Juju poster session I'm doing)
[09:58] <voidspace> ericsnow: ping
[10:30] <frobware> voidspace, ack
[10:38] <voidspace> frobware: dimitern: graphic and printing sorted, collection Friday AM! *phew*
[10:57] <axw> fwereade: confirmed that it is >2 before, and 2 after upgrade. I'll look into an upgrade step / watcher change.
[10:59] <axw> fwereade: and we used to use !=
[11:00] <axw> fwereade: https://github.com/juju/juju/blob/1.25/worker/uniter/relation/livesource.go#L223
[11:00] <fwereade> axw, well, that would explain it
[11:00] <fwereade> axw, and could indeed still show up as missed-hooks on upgrade
[11:01] <fwereade> axw, but be much rarer
[11:01] <fwereade> axw, but still needs to be fixed
[11:01] <axw> fwereade: indeed
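The comparison difference axw and fwereade pin down here can be sketched abstractly. The function names below are hypothetical, not juju code, but they mirror 1.25's `!=` check (livesource.go) versus the newer `>` check, against a txn-revno that resets to 2 when the env-uuid migration re-inserts the settings doc:

```go
package main

import "fmt"

// changedGT is the newer comparison: only fires when the revno grows.
func changedGT(lastSeen, current int64) bool { return current > lastSeen }

// changedNEQ is the 1.25-era comparison: fires on any revno difference.
func changedNEQ(lastSeen, current int64) bool { return current != lastSeen }

func main() {
	lastSeen := int64(5) // revno observed before the upgrade
	current := int64(2)  // revno of the re-inserted doc after migration

	fmt.Println(changedGT(lastSeen, current))  // false: the change is missed
	fmt.Println(changedNEQ(lastSeen, current)) // true: the change is detected
}
```

Maintaining a version number that juju controls itself, rather than leaning on mgo/txn's revno, removes the reset problem entirely, which is the direction the branch linked later takes.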
[11:01] <fwereade> axw, re the other stuff
[11:02] <fwereade> axw, er, I think I'm saying, it looks like there are more bugs
[11:02] <fwereade> axw, but I'm not clear which they are and what's been addressed already in that thread
[11:03] <axw> fwereade: ok. I definitely saw issues with manifolds restarting, and also one with the metrics spool dir not existing
[11:03] <axw> fwereade: I had to manually create it to proceed
[11:03] <axw> fwereade: but... that only happened to me the first couple of times I tested this
[11:04] <fwereade> cmars, thoughts on ^^?
[11:18] <dimitern> voidspace, frobware, I'm back now - reading scrollback..
[11:21] <dimitern> voidspace, nice! I'm sure the poster session will be interesting :)
[11:22] <dimitern> voidspace, incidentally, please have a look at this http://reviews.vapour.ws/r/2673/ ;)
[11:23] <dimitern> dooferlad, frobware ^^
[11:34] <voidspace> dimitern: looking
[12:16] <frobware> dimitern, going to get some lunch whilst that process gobbles up sockets... want to look into this at the top of the hour (14 UTC+1)?
[12:17] <voidspace> dimitern: that's a shed load of test changes!
[12:17] <voidspace> dimitern: LGTM
[12:18] <dimitern> frobware, sure, np
[12:18] <dimitern> voidspace, thanks! yeah - but at least they're manageable  now :)
[12:19] <voidspace> dimitern: yep, good work
[12:19] <frobware> dimitern, voidspace: I was confused by the change largely because I looked at the test changes first. the "other" file makes it quite obvious.
[12:19] <axw> fwereade: I need to go have dinner, but this is what I'm intending to do: https://github.com/juju/juju/compare/master...axw:state-settings-version
[12:19] <fwereade> axw, thanks
[12:19] <axw> fwereade: there's some fallout in state tests I need to look into
[12:19] <voidspace> frobware: yeah, me too :-)
[12:21] <fwereade> axw, that's glorious
[12:21] <fwereade> axw, I have wanted time and/or an excuse to do that forever
[12:23] <dimitern> voidspace, frobware, yeah, but testing the obvious changes was mostly impossible due to the way the tests were written (they accumulated a lot of cruft over time)
[12:30]  * dimitern is gradually building up frustration with helm in emacs - it's helpful for a few things, but it gets increasingly annoying in most common cases
[12:33] <rogpeppe> simple addition to the errors package: https://github.com/juju/errors/pull/21
[12:33] <rogpeppe> reviews appreciated, thanks
[12:40] <axw> fwereade: cool :)  upgrades will be a little bit messy, but worth the pain
[12:53] <rogpeppe> axw: fancy a quick trivial review? :) http://reviews.vapour.ws/r/2674/
[12:54] <axw> rogpeppe: only for you
[12:54] <axw> rogpeppe: shipit
[12:54] <rogpeppe> axw: ta! :)
[12:54] <rogpeppe> axw: you're a love
[12:56] <bogdanteleaga> unit-sleep-0[1244]: 2015-09-16 12:55:45 DEBUG juju.worker.dependency engine.go:438 "metric-sender" manifold worker stopped: failed to open spool directory "C:/Juju/lib/juju/metricspool": GetFileAttributesEx C:/Juju/lib/juju/metricspool: The system cannot find the file specified.
[12:57] <bogdanteleaga> shouldn't this at least try to create that directory if it's not there?
[13:01] <rogpeppe> this is what I wanted it for: http://reviews.vapour.ws/r/2675/
[13:01] <frobware> dimitern,  I enable helm for about a day, get frustrated with it, turn it off for a few months. Then I see the potential in some youtube video, turn it on... lather, rinse, repeat...
[13:02] <rogpeppe> if someone could give me a review of this please, it would make me happy http://reviews.vapour.ws/r/2675/
[13:03] <dimitern> frobware, :) yeah - it's hijacking some commands somewhat haphazardly (latest frustration is compilation-find-file)
[13:10] <ericsnow> voidspace: pong
[13:28] <bogdanteleaga> dimitern, I've been starting with it lately and it seems really nice so far, but I find my usecases are somewhat limited and usually integrate with projectile
[13:36] <dimitern> bogdanteleaga, yeah, the worst of all, helm *really* tries to be "helpful" *everywhere*
[13:41] <bogdanteleaga> dimitern, well, I didn't find it intrusive, at least not yet
[13:42] <bogdanteleaga> dimitern, it's not like it starts showing up if you didn't call for it :P
[13:59] <voidspace> ericsnow: unping...
[13:59] <ericsnow> voidspace: :)
[14:30] <voidspace> dimitern: in my code review fwereade spake thusly "I'm raising a bit of an eyebrow at the global-prefer-ipv6 thing... is this going to be a long term approach?"
[14:30] <voidspace> dimitern: I assume I should reply "no"...
[14:35] <rogpeppe> katco: hiya
[14:35] <katco> rogpeppe: hey
[14:36] <rogpeppe> katco: fancy a simple review? :) http://reviews.vapour.ws/r/2675/
[14:36] <katco> rogpeppe: working my way through them now actually :)
[14:36] <rogpeppe> katco: cool
[14:36] <rogpeppe> katco: this one's really really ickle simple...
[14:36] <katco> rogpeppe: you are 3rd in the queue my friend
[14:36] <rogpeppe> katco: awesome!
[14:37] <rogpeppe> katco: BTW it's targeting the chicago-cubs feature branch
[14:37] <katco> rogpeppe: what is that branch anyway?
[14:38] <rogpeppe> katco: adding macaroon support to juju-core
[14:38] <katco> rogpeppe: why wasn't it just named that lol
[14:38] <rogpeppe> katco: it's the branch we worked on in the recent chicago sprint
[14:38] <rogpeppe> katco: blame mattyw :)
[14:39] <katco> voidspace: re: http://reviews.vapour.ws/r/2634 is this a forward-port of http://reviews.vapour.ws/r/2593/ ?
[14:42] <voidspace> katco: it is...
[14:42] <voidspace> katco: I'm going to close the two forward port reviews and repropose when the original PR is actually done
[14:42] <katco> voidspace: ah ok, wondered if that was what was going on
[14:43] <voidspace> katco: yeah, it was done. And then fwereade looked at it ;-)
[14:43] <katco> haha
[14:43] <voidspace> :-)
[14:43] <voidspace> all good stuff
[14:43] <katco> he keeps us honest :)
[14:46] <dimitern> voidspace, yes I think so
[14:50] <katco> rogpeppe: shipit
[14:50] <rogpeppe> katco: ta!
[15:01] <voidspace> fwereade: ping
[15:03] <fwereade> voidspace, pong
[15:03] <voidspace> fwereade: question about apiserver/client/status
[15:04] <voidspace> fwereade: note as a precursor that machine/unit PublicAddress/PrivateAddress have now changed and the only error they can return *does* indicate that an address has not yet been set
[15:04] <voidspace> fwereade: we fetch PublicAddress for both UnitStatus and MachineStatus
[15:04] <voidspace> fwereade: you were unhappy about not surfacing errors (the old code and my branch set an empty DNSName / PublicAddress)
[15:05] <voidspace> fwereade: as fetching the address can no longer raise arbitrary errors (no secret db write) are you happier with returning an empty address
[15:05] <voidspace> fwereade: if *not*, there's no obvious existing field (maybe InstanceState on MachineStatus) to surface the error information in
[15:06] <voidspace> fwereade: http://reviews.vapour.ws/r/2593/#comment17052
[15:06] <fwereade> voidspace, yeah, and status is bloated enough already
[15:06] <voidspace> fwereade: ok
[15:07] <voidspace> I thought that removing the possibility of arbitrary errors made it less of an issue, but wanted to check
[15:07] <fwereade> voidspace, possibly dump them to stderr as warnings..?
[15:07] <voidspace> as in the diff there will still be an "apparently ignored" error
[15:07] <fwereade> voidspace, I think that once you have an err return you have to assume the possibility of unknown errors
[15:08] <voidspace> heh, hmmm
[15:08] <fwereade> voidspace, (especially when the network is involved, as it is here)
[15:08] <voidspace> I've *documented* the error the method can return
[15:08] <voidspace> and if you can't trust a doc string what *can* you trust...
[15:08] <fwereade> voidspace, I'm laughing both merrily and bitterly over here
[15:08] <voidspace> fwereade: so this is on the apiserver, what do you mean by "dump to stderr"?
[15:08] <voidspace> fwereade: log?
[15:09] <fwereade> voidspace, oh -- ha
[15:09] <fwereade> voidspace, dammit, sorry
[15:10] <fwereade> voidspace, forget stderr then, just log it at WARNING or something :/
[15:10] <voidspace> fwereade: cool, thanks
[15:10] <voidspace> appreciated, sorry to steal your cycles
[15:10] <fwereade> voidspace, was thinking of an older version of status that did much more client-side
[15:10] <fwereade> voidspace, no worries, always a pleasure
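The approach agreed above (status keeps showing an empty address, and the "not yet set" error gets logged at WARNING instead of being surfaced) can be sketched as follows. The function and error names are illustrative, not the actual juju-core code:

```go
package main

import (
	"errors"
	"log"
)

// errNoAddressSet stands in for the single documented error the
// (hypothetical) PublicAddress accessor can now return: the address
// has not yet been set.
var errNoAddressSet = errors.New("public address not yet set")

// publicAddressOrEmpty returns the fetched public address, or an empty
// string if fetching failed. Rather than surfacing the error in the
// status output, it logs it at WARNING, per the discussion above.
func publicAddressOrEmpty(fetch func() (string, error)) string {
	addr, err := fetch()
	if err != nil {
		log.Printf("WARNING: cannot fetch public address: %v", err)
		return ""
	}
	return addr
}

func main() {
	// Simulate a unit whose address has not been set yet: status
	// shows "" and the error only appears in the logs.
	addr := publicAddressOrEmpty(func() (string, error) {
		return "", errNoAddressSet
	})
	log.Printf("status shows public address: %q", addr)
}
```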
[15:26] <wwitzel3> ericsnow: ping
[15:26] <ericsnow> wwitzel3: hi!
[15:30] <wwitzel3> fwereade: ping
[15:30] <fwereade> wwitzel3, pong
[15:32] <katco> is pinging random people a thing now? a meme maybe? ping wwitzel3
[15:34] <wwitzel3> katco: ping
[15:34] <wwitzel3> ;)
[15:34] <katco> wwitzel3: ping, wwitzel3. ping. as your fore-fathers have pung.
[15:34] <wwitzel3> haha
[15:38] <mattyw> I remember when this was a sensible channel
[15:39] <katco> mattyw: ok mr chicago-cubs branch ;)
[15:39] <mattyw> katco, I feel bad - I've ranted a couple of times the last week about how much I hate pet names
[15:40] <mattyw> katco, and there I was sinning the whole time I was ranting
[15:40] <katco> haha
[15:40] <mattyw> katco, also - hello there, not spoken in ages
[15:40] <katco> mattyw: yeah, hiya
[15:41] <katco> mattyw: haven't talked to several folks in a long time who are not in my tz =/
[15:50]  * perrito666 learns that this was once a sensible channel
[15:53] <natefinch> ericsnow: back now
[15:54] <ericsnow> natefinch: k
[16:20] <mup> Bug #1496472 opened: TestRun fails intermittently on ppc64 <ci> <intermittent-failure> <ppc64el> <juju-core:Triaged> <https://launchpad.net/bugs/1496472>
[16:21] <alexisb> sinzui, mgz is this a critical blocker for 1.25 ^^^
[16:21] <alexisb> I saw the cursed runs
[16:22] <mgz> alexisb: nope, it's intermittent, and we've had a selection of different ppc64 test issues
[16:22] <mgz> don't have enough info yet
[16:22] <alexisb> mgz, ack, thanks
[16:23] <mgz> alexisb: bug 1496217 is another failure from the previous run that likely has higher impact
[16:23] <mup> Bug #1496217: panic in cmd/jujud on power <blocker> <ci> <regression> <juju-core:In Progress by cmars> <https://launchpad.net/bugs/1496217>
[16:24] <alexisb> mgz, ack, cmars thanks for picking up that bug!
[16:25] <alexisb> alrighty all I am stepping out for a bit, katco knows how to get a hold of me if anything urgent comes up
[16:25] <katco> i am accepting requests for prank calls
[16:27] <mgz> katco: you are in an amusing mood today :)
[16:27] <katco> mgz: i aim to please :)
[16:29] <mup> Bug #1496472 changed: TestRun fails intermittently on ppc64 <ci> <intermittent-failure> <ppc64el> <juju-core:Triaged> <https://launchpad.net/bugs/1496472>
[16:38] <mup> Bug #1496472 opened: TestRun fails intermittently on ppc64 <ci> <intermittent-failure> <ppc64el> <juju-core:Triaged> <https://launchpad.net/bugs/1496472>
[16:45] <rogpeppe> katco: here's a somewhat more substantial PR if you have a moment or two. it's just code cleanup, but will make moving forward easier. https://github.com/juju/juju/pull/3296
[16:45] <katco> rogpeppe: will tal in a bit
[16:46] <rogpeppe> katco: ta!
[16:46] <rogpeppe> ericsnow, natefinch: you might wanna take a look too https://github.com/juju/juju/pull/3296
[16:46] <ericsnow> rogpeppe: nice
[16:47] <rogpeppe> ericsnow: thanks :)
[16:47] <ericsnow> rogpeppe: I looked into doing a refactor like this last year and chickened out :)
[16:47] <rogpeppe> ericsnow: i couldn't just make things worse...
[16:47] <rogpeppe> ericsnow: the error return stuff is terrible and sadly cannot be fixed.
[16:48] <ericsnow> rogpeppe: yeah, that was one of the major things that hung me up
[16:48] <rogpeppe> ericsnow: i've no idea what whoever it was was thinking
[16:49] <ericsnow> rogpeppe: hopefully I didn't make it worse with all the changes I made in that space for backups
[16:50] <rogpeppe> ericsnow: i'm afraid backups is one of the problem places
[16:50] <ericsnow> rogpeppe: :(
[16:50] <rogpeppe> ericsnow: almost every API call returns an error in the form {Error: {error object}}
[16:50] <rogpeppe> ericsnow: except backups returns the error as just {error object}
[16:52] <ericsnow> rogpeppe: FWIW, that was some of the first code I wrote for Juju (and in Go)
[16:52] <rogpeppe> ericsnow: that's ok - someone else should've been more on the ball
[16:52] <ericsnow> rogpeppe: so no surprises :)
[17:27] <perrito666> is anyone local provider savvy?
[17:31] <ericsnow> katco, mgz, natefinch: I've found the problem (for backward-compatibility the API server converts empty strings into nil in config settings)
[17:32] <ericsnow> katco, mgz, natefinch: this has an impact on the behavior of UpdateConfigSettings but not on directly setting the config settings
[17:33] <ericsnow> katco, mgz, natefinch: I should have a patch up shortly
[17:33] <natefinch> ericsnow: thanks!
[17:35] <katco> ericsnow: wow, nice... i think fwereade categorized this as "spooky action at a distance"
[17:36] <ericsnow> katco: :)
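The backward-compatibility behaviour ericsnow found (the API server treating an empty string in config settings as "unset" and converting it to nil) can be sketched like this. The function name is illustrative, not the actual juju-core code:

```go
package main

import "fmt"

// normalizeSettings converts empty-string values to nil, modelling the
// server-side backward-compatibility conversion described above: an
// empty string means "reset this key to its default".
func normalizeSettings(settings map[string]interface{}) map[string]interface{} {
	out := make(map[string]interface{}, len(settings))
	for k, v := range settings {
		if s, ok := v.(string); ok && s == "" {
			out[k] = nil // empty string becomes "unset"
			continue
		}
		out[k] = v
	}
	return out
}

func main() {
	in := map[string]interface{}{"name": "demo", "title": ""}
	fmt.Println(normalizeSettings(in)) // "title" is now nil
}
```

This conversion only happens on the server side of the API, which is why UpdateConfigSettings behaved differently from setting the config directly.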
[18:24] <mgz> ericsnow: woho!
[18:33] <mgz> ericsnow: wave the branch at me when it's up
[18:33] <ericsnow> mgz: k
[18:58] <ericsnow> mgz: http://reviews.vapour.ws/r/2679/
[18:59] <katco> ericsnow: tal
[18:59] <ericsnow> katco: thanks
[18:59] <mgz> ericsnow: thanks!
[19:04] <ericsnow> mgz: sorry that took so long
[19:04] <mgz> ericsnow: thank you, that branch makes sense to me
[19:04] <mgz> and I can see why we struggled to track it down :)
[19:05] <katco> ericsnow: shipit
[19:05] <ericsnow> mgz: yeah, it was just non-obvious enough :/
[19:05] <katco> ericsnow: and ty for your hard work
[19:05] <ericsnow> katco: hey, this one was a team effort :)
[19:25] <perrito666> bbl gym
[19:32] <ericsnow> katco, mgz, natefinch: the fix has landed
[19:32] <natefinch> ericsnow: thanks for figuring it out.  I'm catching up on the changes you made now
[19:32] <katco> ericsnow: woohoo!
[19:36] <mgz> ericsnow: you have the next run through CI, eta 40mins
[19:37] <ericsnow> mgz: k
[19:47] <natefinch> ericsnow: so your change fixes the problem because it drops all the map entries with a nil value?
[19:48] <ericsnow> natefinch: yep
[19:48] <ericsnow> natefinch: it's what it was doing before
[19:48] <natefinch> ericsnow: yeah... I wish there was a more obvious way to do it... converting in and out of the settings object still qualifies as spooky action IMO.
[19:51] <ericsnow> natefinch: I was aiming for a consistent execution path through the two approaches
[19:51] <ericsnow> natefinch: I agree it isn't optimal
[19:54] <natefinch> I'm still surprised that using the deployer to deploy the bundle with the config had different results than deploying from the command line.
[19:54] <natefinch> I guess the client must strip out the empty config values where the deployer was relying on the server to do so
[19:56] <ericsnow> natefinch: yep, I suspect some trickery from the CLI before it makes the API call
[20:22] <natefinch> ericsnow: I wonder if it wouldn't be more appropriate to have createSettingsOp strip out keys with nil values
[20:24] <ericsnow> natefinch: that would impact other callers of createSettingsOp that currently do not worry about it
[20:24] <ericsnow> natefinch: not that it's necessarily the wrong thing to do
[20:24] <ericsnow> natefinch: I was trying to minimize the potential impact for this patch
[20:30] <natefinch> ericsnow: yeah... I'm just trying to make the code a little more obvious and explicit.
[20:30] <ericsnow> natefinch: fair enough :)
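natefinch's suggestion of making the behaviour explicit by having the settings-creation path drop nil-valued keys, rather than relying on an earlier conversion, could look like the sketch below. This is an illustration of the idea, not the actual createSettingsOp code:

```go
package main

import "fmt"

// stripNilValues drops keys whose value is nil before the settings
// document is written, making explicit the behaviour discussed above:
// a nil value means the key should simply not be stored.
func stripNilValues(settings map[string]interface{}) map[string]interface{} {
	out := make(map[string]interface{}, len(settings))
	for k, v := range settings {
		if v == nil {
			continue // unset keys are not written at all
		}
		out[k] = v
	}
	return out
}

func main() {
	in := map[string]interface{}{"name": "demo", "title": nil}
	fmt.Println(stripNilValues(in)) // only "name" survives
}
```

The trade-off raised above still applies: doing this centrally would affect every caller of the settings-creation path, not just the config-update one, so the minimal patch kept the stripping closer to the API boundary.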
[20:50] <mgz> 1.24 building on CI now.
[21:00] <natefinch> ericsnow: http://reviews.vapour.ws/r/2680/diff/#
[21:01] <natefinch> gotta run to dinner... but I think that's a more obvious codepath (and keeps your new test, which still passes, as do all other tests)
[21:02] <ericsnow> natefinch-afk: I'll take a look
[21:02] <ericsnow> natefinch-afk: note that I already landed that other patch for 1.24
[21:14] <ericsnow> katco: I've sent you my feedback on that email
[21:14] <ericsnow> katco: enjoy! :)
[21:14] <katco> ericsnow: sweet, ty
[21:42] <mup> Bug #1496217 changed: panic in cmd/jujud on power <blocker> <ci> <regression> <juju-core:Fix Released by cmars> <https://launchpad.net/bugs/1496217>
[23:45] <menn0> wallyworld: have you got time for a quick hangout?
[23:57] <alexisb> \o/ 1.24 is blessed
[23:57] <alexisb> ship the puppy!
[23:59] <xwwt> team is on it