[00:52] <axw> wallyworld: I think you misunderstood me about the API
[00:52] <axw> I just meant to combine them in the CLI code
[00:52] <wallyworld> ah, damn
[00:52] <axw> i.e. create an interface that you can satisfy by embedding both the facades into one struct
[00:53] <wallyworld> oh well, what's there is ok i think?
[00:53] <wallyworld> we use that same pattern elsewhere too
[00:53] <axw> wallyworld: I think it's fine
[00:53] <axw> wallyworld: ModelManager makes more sense for this command anyway I think
[00:53] <wallyworld> yeah, agree
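The pattern axw describes above (a narrow interface for the CLI command, satisfied by embedding two facades in one struct) can be sketched in Go roughly as follows. This is only an illustration: the facade types, method names, and `runDestroy` are hypothetical and are not taken from the juju code base.

```go
package main

import "fmt"

// modelManagerFacade and statusFacade are hypothetical stand-ins for two
// API client facades; the method names are illustrative only.
type modelManagerFacade struct{}

func (modelManagerFacade) DestroyModel(name string) string {
	return "destroyed " + name
}

type statusFacade struct{}

func (statusFacade) ModelStatus(name string) string {
	return name + " is alive"
}

// destroyCommandAPI is the narrow interface the CLI command depends on.
type destroyCommandAPI interface {
	DestroyModel(name string) string
	ModelStatus(name string) string
}

// combinedAPI satisfies destroyCommandAPI purely by embedding both facades;
// their methods are promoted to the outer struct.
type combinedAPI struct {
	modelManagerFacade
	statusFacade
}

// Compile-time check that the embedding satisfies the interface.
var _ destroyCommandAPI = combinedAPI{}

// runDestroy is a hypothetical command body that only sees the interface.
func runDestroy(api destroyCommandAPI, model string) string {
	return api.ModelStatus(model) + "; " + api.DestroyModel(model)
}

func main() {
	fmt.Println(runDestroy(combinedAPI{}, "foo"))
}
```

Because the command only depends on the interface, a test can pass in a fake that implements the same two methods without touching either real facade.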
[01:23] <rick_h_> wallyworld: is the model destroy blocking something communicated to other teams?
[01:23] <rick_h_> wallyworld: could that impact things like the gui?
[01:23] <wallyworld> rick_h_: it was a stakeholder bug from OIL I think. I am not sure who else knows
[01:24] <wallyworld> i am not sure about GUI - I'll send an email
[01:24] <wallyworld> rick_h_: but the blocking is only CLI
[01:24] <rick_h_> wallyworld: right, i'm +1000 on the change, but want to make sure we chat with other clients
[01:24] <rick_h_> wallyworld: ah ok
[01:24] <wallyworld> the api behaves the same
[01:24] <rick_h_> wallyworld: nvm then
[01:25] <wallyworld> fair enough question to asl
[01:25] <wallyworld> *ask
[01:48] <menn0> thumper: i've fixed the racy pingerSuite tests: https://bugs.launchpad.net/juju/+bug/1625214
[01:48] <mup> Bug #1625214: pingerSuite.TestAgentConnectionsShutDownWhenAPIServerDies write: broken pipe <ci> <intermittent-failure> <regression> <unit-tests> <juju:In Progress by menno.smits> <https://launchpad.net/bugs/1625214>
[01:48] <menn0> thumper: what I meant to paste: https://github.com/juju/juju/pull/6410/commits/f9ee5dba43e0fa032fe67d14d1c89cdc1c1e1a55
[01:49] <thumper> looking
[01:52] <thumper> approve
[01:52] <menn0> thumper: cheers
[02:01] <menn0> wallyworld: https://github.com/juju/juju/pull/6411 please
[02:02] <wallyworld> menn0: sure, just finishing another
[02:02] <menn0> wallyworld: ignore the first commit. that one is merging as part of another PR now.
[02:02] <wallyworld> ok
[02:19] <wallyworld> menn0: could you take a look at this one by christian? i think your +1 would be valuable. I've already had a look https://github.com/juju/juju/pull/6408
[02:20] <wallyworld> also, your 6411 pr has conflicts
[02:20] <menn0> wallyworld: will do. he emailed me about it too.
[02:20] <wallyworld> ta
[02:21] <menn0> wallyworld: conflicts fixed
[02:21] <wallyworld> ta
[02:53] <axw> menn0: are you investigating that cert issue?
[02:54] <menn0> axw: I was about to take a short break and then get stuck into it
[02:54] <menn0> axw: I still have a controller up where that's happened
[02:54] <menn0> axw: have you seen it too?
[02:54] <axw> menn0: ok. would it be helpful if I looked in parallel?
[02:54] <axw> menn0: no, but it didn't sound hard to repro
[02:55] <menn0> axw: could be. if you could take a look too that would be great.
[02:55] <axw> menn0: ok. I'll let you know if I find anything
[02:55] <menn0> axw: here's what the failure looks like: http://paste.ubuntu.com/23301342/
[02:56] <axw> okey dokey
[02:56] <axw> menn0: and all you did was bounce the controller agent?
[02:57] <menn0> axw: I stopped the lxd machine and started it again
[02:57] <axw> ah yes
[02:57] <axw> ok
[02:57] <menn0> the controller instance
[02:57]  * menn0 will be back in 15 mins
[03:17] <axw> wallyworld: what did you want me to test with add-model? something like this? bootstrap rc2, upgrade to my branch, then add-model foo && add-model bar ap-southeast-2
[03:17] <wallyworld> axw: yeah, check that the endpoint fixing is done for addmodel also
[03:18] <wallyworld> axw: or even first confirm that it is broken without your fix
[03:18] <wallyworld> so we can be sure that the fix is relevant and valid
[03:18] <axw> wallyworld: it would be broken if you tried to upgrade from rc2
[03:18] <axw> wallyworld: but if you added a model to a running rc2 without upgrading, it should work. anyway, I'll try what I described
[03:19] <wallyworld> ta, i think we need to be 10000% sure
[03:40] <axw> menn0: I strongly suspect -> https://github.com/juju/juju/commit/0d64508f8005ea84c141ed392389fbaef0e70b30
[03:40] <axw> menn0: in particular, "info.Cert = existing.Cert" without also setting Key
[03:51] <blahdeblah> Hi everyone, which ppa should I be using to get the RC version of juju 2.0 so I can try https://docs.google.com/document/d/1yT5pvS38g9Z9SviI9NPmhIZrsO8drwHDPNWAMSmCedE/edit# ?
[03:52] <menn0> axw: looks like there was already a ticket for this issue: https://bugs.launchpad.net/juju/+bug/1631145
[03:52] <mup> Bug #1631145: rc2 upgrade to rc3 failed with cannot start api server worker: cannot set initial certificate: cannot create new TLS certificate: crypto/tls: private key does not match public key <juju:New> <https://launchpad.net/bugs/1631145>
[03:52] <menn0> axw: i've just bumped the priority etc
[03:54] <axw> okey dokey
[03:54] <menn0> axw: any luck replicating?
[03:55] <axw> menn0: not yet
[03:55] <axw> blahdeblah: according to release email,  ppa:juju/devel
[03:55] <blahdeblah> axw: cool - thanks
[03:58] <wallyworld> axw: with maas, do you know what happens if you attempt to acquire a node without specifying an architecture? does it just pick some arbitrary machine so long as any other constraints match?
[03:58] <axw> menn0: hrm. I am seeing "getsockopt...connection refused" without the other error though
[03:58] <axw> and status isn't working now
[03:58] <axw> wallyworld: I don't recall, sorry
[03:58] <menn0> axw: that's what you'll see at the client
[03:58] <menn0> axw: what about in the controller's logs?
[03:59] <wallyworld> maybe menn0 recalls?
[03:59] <axw> menn0: this is in the controller machine log
[03:59] <menn0> axw: hmm ok
[03:59] <axw> menn0: it's failing to connect to mongo...
[03:59] <menn0> axw: possibly a different problem?
[03:59] <axw> yes, possibly
[04:00] <anastasiamac> wallyworld: re: maas arch.. yes, i believe so.. at least this is what i've seen
[04:00] <menn0> axw: not ideal
[04:00] <axw> menn0: although... do we use the same cert/key for mongo?
[04:00] <wallyworld> menn0: ta, ok
[04:00] <menn0> wallyworld: sorry, not sure about maas
[04:00] <wallyworld> np, i'll ask the maas qguys
[04:00] <wallyworld> guys
[04:01] <menn0> axw: maybe we do use the same cert/key... worth checking
[04:03] <axw> menn0: it does appear to be the case
[04:04] <axw> when we update the cert, we write out to server.pem
[04:04] <menn0> axw: ok, so probably a different manifestation of the same issue
[04:04] <axw> yup
[04:04] <menn0> axw: i'm tracing through where the cert and key that apiserver.NewServer is given come from
[04:06] <menn0> axw: ok so that's out of StateServingInfo in the agent config
[04:06] <menn0> axw: perhaps one is being updated without the other?
[04:14] <menn0> axw: this could be it: https://github.com/juju/juju/blob/master/cmd/jujud/agent/machine/servinginfo_setter.go#L67-L74
[04:15] <menn0> axw: should existing.PrivateKey get copied too?
[04:19] <menn0> axw: I'd also be a lot happier if most of what the stateservinginfo_setter did happened /inside/ the ChangeConfig
[04:19] <menn0> axw: same for the method which does cert updates for the apiserver
[04:19] <menn0> looks racy otherwise
[04:24] <menn0> axw: the more I look, the more I think this is the bug
[04:24]  * menn0 fixes
[04:25] <axw> menn0: sorry, I just realised you lost connection after I pasted the commit before :/
[04:25] <menn0> axw: did you come to the same conclusion?
[04:25]  * menn0 has been having wifi problems
[04:25] <axw> [11:40:32] <axw> menn0: I strongly suspect -> https://github.com/juju/juju/commit/0d64508f8005ea84c141ed392389fbaef0e70b30
[04:25] <axw> [11:40:57] <axw> menn0: in particular, "info.Cert = existing.Cert" without also setting Key
[04:25] <menn0> axw: ha! did see that. we independently found the same thing
[04:26] <menn0> axw: s/did/didn't/
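The failure mode axw and menn0 converge on above (copying the existing certificate without also copying its private key) can be reproduced in isolation. The sketch below is an assumption-laden illustration, not juju code: `servingInfo`, `newPair`, `mergeCertOnly`, and `mergeCertAndKey` are hypothetical names, and only the standard library's `tls.X509KeyPair` is used to show that a certificate paired with a key from a different pair is rejected.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/tls"
	"crypto/x509"
	"crypto/x509/pkix"
	"encoding/pem"
	"fmt"
	"math/big"
	"time"
)

// servingInfo is a hypothetical, simplified stand-in for the agent's
// serving info; only the two fields relevant to the bug are modelled.
type servingInfo struct {
	Cert       string
	PrivateKey string
}

// newPair returns a fresh self-signed certificate with its matching key,
// both PEM encoded.
func newPair() (servingInfo, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return servingInfo{}, err
	}
	tmpl := &x509.Certificate{
		SerialNumber: big.NewInt(1),
		Subject:      pkix.Name{CommonName: "example"},
		NotBefore:    time.Now().Add(-time.Hour),
		NotAfter:     time.Now().Add(time.Hour),
	}
	der, err := x509.CreateCertificate(rand.Reader, tmpl, tmpl, &key.PublicKey, key)
	if err != nil {
		return servingInfo{}, err
	}
	keyDER, err := x509.MarshalECPrivateKey(key)
	if err != nil {
		return servingInfo{}, err
	}
	return servingInfo{
		Cert:       string(pem.EncodeToMemory(&pem.Block{Type: "CERTIFICATE", Bytes: der})),
		PrivateKey: string(pem.EncodeToMemory(&pem.Block{Type: "EC PRIVATE KEY", Bytes: keyDER})),
	}, nil
}

// mergeCertOnly mimics the suspected bug: the existing cert is kept while
// the incoming key is left in place.
func mergeCertOnly(existing, incoming servingInfo) servingInfo {
	incoming.Cert = existing.Cert
	return incoming
}

// mergeCertAndKey keeps the existing cert and key together as a pair.
func mergeCertAndKey(existing, incoming servingInfo) servingInfo {
	incoming.Cert = existing.Cert
	incoming.PrivateKey = existing.PrivateKey
	return incoming
}

// usable returns nil only if the cert and key form a valid TLS pair.
func usable(info servingInfo) error {
	_, err := tls.X509KeyPair([]byte(info.Cert), []byte(info.PrivateKey))
	return err
}

func main() {
	existing, err := newPair()
	if err != nil {
		panic(err)
	}
	incoming, err := newPair()
	if err != nil {
		panic(err)
	}
	fmt.Println("cert only:   ", usable(mergeCertOnly(existing, incoming)))
	fmt.Println("cert and key:", usable(mergeCertAndKey(existing, incoming)))
}
```

The first merge yields an error from `tls.X509KeyPair`, while the second yields nil, which matches the shape of the "private key does not match public key" failure reported in bug 1631145.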
[04:27] <menn0> axw: do you also agree that much of the logic in stateservinginfo_setter should be /inside/ the ChangeConfig func?
[04:28] <menn0> axw: same for MachineAgent.upgradeCertificateDNSNames
[04:28] <axw> menn0: yes, I think so
[04:28] <axw> anything looking at current config
[04:29] <menn0> axw: ok, let's fix it.
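The point about doing the work inside ChangeConfig is that anything which reads the current config and then writes a derived value should do both under the same lock. A minimal sketch of that idea follows, assuming a much simpler `ChangeConfig` than juju's real one: the `agent` and `config` types and `addDNSName` are hypothetical.

```go
package main

import (
	"fmt"
	"sync"
)

// config is a hypothetical, cut-down agent config.
type config struct {
	DNSNames []string
}

// agent guards its config with a mutex; ChangeConfig is the only way in.
type agent struct {
	mu   sync.Mutex
	conf config
}

// ChangeConfig runs mutate while holding the lock, so the read of the
// current state and the write of the new state cannot interleave with
// another caller.
func (a *agent) ChangeConfig(mutate func(*config) error) error {
	a.mu.Lock()
	defer a.mu.Unlock()
	return mutate(&a.conf)
}

// addDNSName inspects the current names and appends a new one, all inside
// the mutator rather than reading the config beforehand.
func (a *agent) addDNSName(name string) error {
	return a.ChangeConfig(func(c *config) error {
		for _, existing := range c.DNSNames {
			if existing == name {
				return nil
			}
		}
		c.DNSNames = append(c.DNSNames, name)
		return nil
	})
}

// addConcurrently adds n distinct names from n goroutines and reports how
// many ended up in the config.
func addConcurrently(n int) int {
	a := &agent{}
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			_ = a.addDNSName(fmt.Sprintf("host-%d", i))
		}(i)
	}
	wg.Wait()
	count := 0
	_ = a.ChangeConfig(func(c *config) error {
		count = len(c.DNSNames)
		return nil
	})
	return count
}

func main() {
	fmt.Println(addConcurrently(50))
}
```

If the list of names were read outside the mutator and only the final assignment happened inside it, two concurrent callers could each compute a new list from the same stale snapshot and one update would be lost.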
[04:29] <menn0> axw: have you already started?
[04:29] <axw> menn0: nope, eating lunch atm
[04:30] <wallyworld> blahdeblah: i so want to mark your bug as invalid. using firefox as default browser should be invalid :-)
[04:30] <menn0> axw: ok, i'll start on servinginfo_setter.go.
[04:30] <menn0> axw: i'm going to have to stop soon but will be back on later this evening
[04:32] <blahdeblah> wallyworld: Feel like taking the gloves off today, eh? :-)
[04:32] <axw> menn0: ok. let me know where you leave off, I can pick it up
[04:32] <menn0> axw: sounds good
[04:32] <wallyworld> blahdeblah: i used to run firefox until it started to grind everything to a halt with many tabs open and other performance things
[04:33] <blahdeblah> wallyworld: It has never done that for me.  I've heard many claims of such, but never seen it personally, despite having 4 windows with 100-odd tabs open right now.  I thank NoScript for that.
[04:34] <wallyworld> blahdeblah: yeah, that may well be it. sadly lots of sites require js, including lp. well lp doesn't need it but sucks without it
[04:34] <blahdeblah> I use js on lots of sites; I just don't let it do that by default
[04:35] <wallyworld> i find it too hard to whitelist everything. well maybe am too lazy
[04:35] <blahdeblah> anyhew, my bug is valid, and I'll fight you for it! :-P
[04:35] <wallyworld> adblock works a treat though
[04:35] <wallyworld> oh i know it is valid
[04:35] <wallyworld> was just stirring
[04:36] <blahdeblah> I see your stirring, and raise you lp:1587644, which got me out of bed at least once this weekend
[04:37] <anastasiamac> menn0: wallyworld: considering u've re-viewed the fix for bug 1625774
[04:37] <mup> Bug #1625774: memory leak after repeated model creation/destruction <ateam> <eda> <oil> <oil-2.0> <uosci> <juju:In Progress by 2-xtian> <https://launchpad.net/bugs/1625774>
[04:38] <anastasiamac> menn0: wallyworld: is it possible/wanted/needed on 1.25 to solve some of the memory leakage there too?
[04:39] <wallyworld> not sure, depends on when state pool was introduced. not sure if it was for multi model or not
[04:39] <wallyworld> would have to dig in and look
[04:39] <menn0> anastasiamac: it could be applied there but I don't think multi-model was supported without a feature flag was it? the fix would have minimal utility if that's the case.
[04:45] <wallyworld> yes, multi model in 1.25 is ff
[04:46] <anastasiamac> wallyworld: even server-side stuff? we usually ff at cli level...
[04:46] <wallyworld> not for multi model
[04:46] <wallyworld> IIRC
[04:47] <menn0> yeah, I'm pretty sure the feature flag was applied throughout the server too
[04:52] <menn0> axw: I have the ssi setter fixes done, pushing now
[04:52] <axw> cool
[04:52] <menn0> axw: can I leave the MachineAgent.upgradeCertificateDNSNames fix to you?
[04:53] <axw> menn0: yep sure
[04:54] <menn0> axw: https://github.com/juju/juju/pull/6413
[04:55] <axw> menn0: I think the only thing to do there is to not use "si.Cert", but get fresh value inside ChangeConfig?
[04:55] <axw> ta
[04:55] <mup> Bug #1630728 changed: remove user  needs better message that user is made inactive <usability> <juju:Triaged> <https://launchpad.net/bugs/1630728>
[04:57] <menn0> axw: shouldn't cert.NewDefaultServer be called inside the ChangeConfig though?
[04:57] <axw> menn0: yes, I mean everything from "Parse the current certificate to get the current dns names." down should be inside
[04:58] <menn0> axw: +1 that's what I was thinking
[04:58] <menn0> should be easy
[04:58] <menn0> axw: just trying to QA this PR before I get called to help out with other stuff
[04:58] <axw> menn0: except maybe the mongo bit. might want to do that after writing agent conf
[04:58] <axw> menn0: okey dokey, thanks
[04:59] <menn0> axw: maybe... it might be ok to do it in the changeconfig though
[04:59] <menn0> axw: actually, i'm out of time. if you could review that PR, I'll check back in later and do the QA before merging
[05:00] <axw> menn0: approved
[05:01] <axw> menn0: with one late comment
[06:30] <menn0> axw: I made that change and QAed. Merging now.
[06:33] <axw> menn0: thanks
[06:33] <axw> menn0: I'm just QAing my change now
[06:33] <menn0> axw: cool. i can review when it's up. i've got to do dishes etc but will check back in later.
[06:33] <axw> menn0: https://github.com/juju/juju/compare/master...axw:lp1631145-upgradecertificatednsnames?expand=1 if you want to take quick look
[06:33] <axw> ok, no worries
[07:14] <axw> wallyworld: got anything smallish you'd like help with?
[07:16] <wallyworld> axw: not off hand. I am finishing a partial list-clouds fix pending input from rick et al. i think there's one remaining bug on alexis' list to do with agent restart but not sure how big it is
[07:16] <axw> wallyworld: sounded big, I'll take a look
[07:17] <wallyworld> yeah, it did at first look
[07:52] <wallyworld> axw: is your cert fix just because we need to recover from a borked cert caused by rc3?
[07:52] <axw> wallyworld: yes. and also we should be doing stuff inside ChangeConfig in general. doesn't matter in this case because the function is only called at startup
[07:53] <axw> but good to keep it clean, to avoid copy&paste errors
[07:53] <wallyworld> +1
[07:53] <wallyworld> lgtm
[08:05] <hoenir> could someone review my PR ? https://github.com/juju/juju/pull/6414
[08:06] <hoenir> also note that this PR will be the foundations of a new feature I wish to unlock on juju, enabling manual provisioning for windows machines.
[08:07] <hoenir> foundation*
[08:11] <mup> Bug #1420996 opened: Default secgroup reset periodically to allow 0.0.0.0/0 for 22, 17070, 37017 <canonical-is> <juju-core:New> <https://launchpad.net/bugs/1420996>
[08:58] <dooferlad> mgz: is the github jujubot happy? It seems to be ignoring my $$merge$$ on https://github.com/juju/juju/pull/6406
[09:20] <babbageclunk> wallyworld: ping?
[09:20] <wallyworld> yo
[09:20] <babbageclunk> wallyworld: I think I agree with you about Release vs Put - do you think menn0 will sulk if I change it?
[09:21] <wallyworld> babbageclunk: he can sulk, i'll get the popcorn ready :-D
[09:21] <babbageclunk> :)
[09:21] <wallyworld> babbageclunk: or even Done()
[09:25] <babbageclunk> wallyworld: Done isn't verby enough for my taste. Get/Done doesn't seem right.
[09:25] <wallyworld> fair point, i was trying to find an alternative to Release() to alleviate objections :-)
[09:27] <babbageclunk> always the peacemaker
[09:27] <wallyworld> oh, i have been known to enjoy poking the hornet's nest :-D
[09:42] <babbageclunk> wallyworld: do you think I should rip the non-strict model uuid checking out of validateModelUUID?
[09:43] <babbageclunk> wallyworld: There are still a few places that use it.
[09:43] <wallyworld> babbageclunk: i *think* so - maybe in a followup after checking with tim/menno. i am pretty sure it was just to support really old clients
[09:44] <babbageclunk> wallyworld: Ok - I'll do it as a separate change.
[09:44] <wallyworld> sgtm
[10:02] <mup> Bug #1631899 opened: juju show-controller --show-password does not show the password <juju-core:New> <https://launchpad.net/bugs/1631899>
[10:21] <babbageclunk> wallyworld: Want to take another look at that PR, or are you ok for me to merge it?
[10:22] <wallyworld> babbageclunk: i can take a quick look, menno was happy with it
[10:25] <babbageclunk> wallyworld: cool thanks - I think I've gone through all of your comments.
[10:28] <wallyworld> babbageclunk: i think it looks good to go. thanks for doing the fix; was a challenge to get all the bits lined up so you did well
[10:28] <babbageclunk> wallyworld: cheers!
[11:02] <beisner> hi all, last week we started seeing sha256 mismatches when units try to download the charm from the controller (1.25.6).  it's prevalent in openstack ci.  ie. shutting down: ModeInstalling ... failed to download charm ... expected sha256 FOO, got BAR
[11:02] <beisner> is there a known issue or are we special? :)   more detail @ http://pastebin.ubuntu.com/23302619/
[11:11] <mup> Bug #1631899 changed: juju show-controller --show-password does not show the password <juju-core:New> <https://launchpad.net/bugs/1631899>
[11:23] <mup> Bug #1631899 opened: juju show-controller --show-password does not show the password <juju-core:New> <https://launchpad.net/bugs/1631899>
[11:26] <mup> Bug #1631899 changed: juju show-controller --show-password does not show the password <juju-core:New> <https://launchpad.net/bugs/1631899>
[11:26] <mup> Bug #1541482 opened: unable to download local: charm due to hash mismatch in multi-model deployment <2.0-count> <juju-release-support> <juju:Fix Released by menno.smits> <juju-core:New> <https://launchpad.net/bugs/1541482>
[12:17] <mup> Bug #1629951 changed: cannot specify subnet to  create controller in on bootstrap <juju:Triaged> <https://launchpad.net/bugs/1629951>
[12:17] <mup> Bug #1630029 changed: models should inherit vpc-id from controller  <juju:Triaged> <https://launchpad.net/bugs/1630029>
[13:00] <dooferlad> rick_h_: 1:1 today?
[13:01] <rick_h_> dooferlad: /me checks thought he was in it
[13:01] <rick_h_> dooferlad: oh hmm, stuck at "requesting to join the video call"
[14:00] <rick_h_> katco``: dimitern voidspace mgz natefinch ping for standup
[14:00] <mgz> omw
[14:00] <dimitern> omw
[14:00] <rick_h_> oh right, US away so ignore me katco`` and nate
[14:02] <voidspace> rick_h_: omw
[14:49] <frobware> jamespage: do you often run into this issue: https://bugs.launchpad.net/juju/+bug/1600546
[14:49] <mup> Bug #1600546: lxd subnet setup by juju will interfere with openstack instance traffic <2.0> <network> <sts> <juju:Triaged by rharding> <nova-compute (Juju Charms Collection):New> <https://launchpad.net/bugs/1600546>
[14:51] <voidspace> rick_h_: ping
[14:51] <rick_h_> voidspace: pong
[15:21] <rick_h_> dooferlad: ping
[15:21] <dooferlad> rick_h_: hello
[15:21] <rick_h_> dooferlad: have more time to chat?
[15:22] <dooferlad> rick_h_: yes
[15:22] <rick_h_> dooferlad: k, meet you back in the 1-1 room
[15:32] <voidspace> rick_h_: 30 seconds for a bikeshed needed if you have it
[15:34] <voidspace> rick_h_: last comment on bug 1602192 is the output I've implemented
[15:34] <mup> Bug #1602192: when starting many LXD containers, they start failing to boot with "Too many open files" <lxd> <juju:In Progress by rharding> <lxd (Ubuntu):Confirmed> <https://launchpad.net/bugs/1602192>
[15:38] <rick_h_> voidspace: otp
[15:38] <rick_h_> voidspace: will look in a sec
[15:38] <voidspace> rick_h_: ok, np - I'll PR it and the reviewer can bikeshed it
[16:00] <jamespage> frobware, I've never run into that issue
[16:00] <frobware> jamespage: thanks. just trying to understand the severity and whether we address it "now".
[16:00] <jamespage> but apparently trent has hit it a lot
[16:01] <frobware> jamespage: any feeling whether this is just happening for just the training setup?
[16:01] <jamespage> frobware, I'm not quite close enough to know the answer to that
[16:01] <jamespage> sorry
[16:02] <frobware> rick_h_: ^^ fyi
[16:02] <voidspace> dimitern: if you have any ideas on how to test this, I'm all ears
[16:02] <voidspace> dimitern: https://github.com/juju/juju/pull/6419
[16:02] <dimitern> voidspace: looking
[16:02] <voidspace> dimitern: except providing something that stubs out PrintLn.
[16:03] <voidspace> dimitern: the important bit is in provider/common/bootstrap.go
[16:04] <dimitern> voidspace: how about adding an interface with BootstrapMessage() method and then check in common/bootstrap if it's implemented by the Environ ?
[16:04] <dimitern> voidspace: then you could test it with the dummy provider, but no need to change all providers?
[16:05] <voidspace> dimitern: right, but what does that enable me to test?
[16:05] <voidspace> dimitern: I could then pass in a fake environ with a custom message, but what do I test?
[16:05] <voidspace> dimitern: unless I replace the call to fmt.Println with something that can be mocked - which doesn't seem very useful
[16:05] <voidspace> dimitern: I'm kind of arguing that as this is a ui change it needn't be tested
[16:06] <voidspace> dimitern: maybe I can look in featuretests to see if something like this is covered
[16:06] <dimitern> voidspace: only that common.Bootstrap() calls the optional method, when it's there, and checking the ctx.Stdout() to ensure it's there?
[16:06] <rick_h_> voidspace: wfm for now, thank you
[16:06] <voidspace> rick_h_: cool, thanks
[16:06] <rick_h_> frobware: yea, so I'm -1 on that being the biggest issue atm
[16:06]  * rick_h_ goes to get lunchables
[16:06] <voidspace> dimitern: in which case I can test that *already*
[16:07] <voidspace> dimitern: and put a non-empty string in dummy provider to test it
[16:07] <voidspace> dimitern: let me look - thanks
[16:07] <voidspace> dimitern: if we're calling fmt.Println will that be in ctx.Stdout ?
[16:07] <dimitern> voidspace: alternatively, you could go with BootstrapContext.Infof() only called in provider/lxd ?
[16:08] <dimitern> voidspace: since it's very specific and not needed everywhere
[16:09] <dimitern> voidspace: check e.g. PrepareForBootstrap in provider/ec2 and the code around validateBootstrapVPC()
[16:10] <dimitern> voidspace: (bootstrap)ctx.Infof() is already used for similar messages during bootstrap in some providers
[16:10] <voidspace> kk
[16:33] <voidspace> dimitern: using Stderr on the context I can test that the message is output
[16:34] <dimitern> voidspace: \o/ nice!
[17:55] <rogpeppe1> katco``: i've made a bunch of comments and changes in response to your review of https://github.com/juju/juju/pull/6407. PTAL.
[17:55] <rick_h_> rogpeppe1: she's on US holiday today
[17:56] <rick_h_> rogpeppe1: and with the EU folks EOD'ing you'll have to catch her tomorrow it looks like, sorry for the delay
[18:18] <arosales> If any folks have access to s390 we are seeing https://bugs.launchpad.net/juju/+bug/1632030
[18:18] <mup> Bug #1632030: juju-db fails to start -- WiredTiger reports Input/output error <juju> <juju-db> <mongodb> <s390x> <juju:New> <https://launchpad.net/bugs/1632030>
[19:09]  * rick_h_ goes to get the boy home from school, biab
[20:23] <veebers> alexisb: In a normal (test) run, should I see the log message: "ERROR cmd supercommand.go:458 creating API connection: ..."? (I'm seeing this error in the failing tests: ERROR cmd supercommand.go:458 creating API connection: EOF)
[20:24] <alexisb> not on a normal test run you are expecting to pass without error
[20:24] <veebers> I never see "creating API connection" in a passing test run (for grant-revoke)
[20:24] <alexisb> veebers, when did you start seeing the new failures?
[20:24] <veebers> alexisb: what is it a symptom of? One sec will check times
[20:25] <veebers> alexisb: looks to be about Oct 7 (Jenkins time, so UTC?) had 2 passes of 12 runs since then, the failures all include that error message
[20:28] <alexisb> menn0, ^^^
[20:28] <alexisb> I am wondering if this lines up with this merge: https://github.com/juju/juju/pull/6400
[20:32] <menn0> veebers: where are you seeing that message? from the juju client or in agent logs?
[20:33] <menn0> veebers: a certain amount of such messages are expected when controllers come up and when workers restart after new controllers are added
[20:33] <veebers> menn0: um, I'm pretty sure the client? (I'm seeing it in the log output from juju command)  http://reports.vapour.ws/releases/4467/job/functional-grant-revoke/attempt/721 if you search for "connection: EOF"
[20:34]  * menn0 looks
[20:34] <veebers> menn0: this is after a controller comes up, users have been added etc.
[20:40] <veebers> menn0: Would this be related to the ping changes too? http://reports.vapour.ws/releases/4467/job/run-unit-tests-race/attempt/1949 (FAIL: monitor_internal_test.go:69: MonitorSuite.TestLaterPingFails)
[20:41] <menn0> veebers: that's a new test. i'll take a look.
[20:41] <menn0> veebers: the other issue appears to be macaroons related. here's the related bit in the controller's logs:
[20:41] <menn0> 2016-10-10 16:17:39 INFO juju.apiserver request_notifier.go:70 [1E] API connection from 10.0.8.1:33086
[20:41] <menn0> 2016-10-10 16:17:39 INFO juju.apiserver admin.go:102 login failed with discharge-required error: verification failed: no macaroons
[20:41] <menn0> 2016-10-10 16:17:39 INFO bakery service.go:366 server attempting to discharge "eyJUaGlyZFBhcnR5UHVibGljS2V5IjoieVZlUEU3cUNJMm5nZWZuUXo2NlFPWVEwanlMVnVDanMwZGhMa0VUa3dRST0iLCJGaXJzdFBhcnR5UHVibGljS2V5IjoiNzF5NnNIOExNNUU1OHYxaGpGek9JdCtCZGRLRHR6SE5pN0ExQ1dBWHZqND0iLCJOb25jZSI6InpoMllTNnZ1bFFHNUlPdVRucXY4U0hVNW5MQWlSaE43IiwiSWQiOiJlSHM0Y3dERDRiOCs5TXBDUG1RakwvenVFdXZhV3dSbmhkY0tXWmtGQjA5MENaMXYxcnBYZ0RNczRsTTUrc2FSS2l2RGt0OFhlOG5BZTNxVG5
[20:41] <menn0> 4Y1YwMXhIUWZxTXRmWHk4TUdnNmRvSWZlZWpSZmJnby9pdjIxSmZiU2FkQjNwdXR6ZmJ2ekRvbGh1aDNPcGdCbTRQd2pSRGx4cGFEQnZaL0lwZGhoL0lQb3dMdXc9PSJ9"
[20:42] <menn0> 2016-10-10 16:17:39 INFO juju.apiserver request_notifier.go:80 [1E]  API connection terminated after 28.244335ms
[20:42] <menn0> 2016-10-10 16:17:39 INFO juju.apiserver request_notifier.go:70 [1F] API connection from 10.0.8.1:33096
[20:42] <menn0> 2016-10-10 16:17:39 INFO juju.apiserver request_notifier.go:80 [1F] user-admin API connection terminated after 43.24719ms
[20:43] <menn0> veebers: i'll dig a bit more into that one to see if I can figure it out
[20:43] <veebers> menn0: I see a macaroons change about 3 days ago, "remove macaroons on logout"
[20:44] <menn0> veebers: ok, could be related. those messages might also be normal. I'll do some digging.
[20:46] <veebers> menn0: Hmm, it's possible that the test misbehaves with those recent changes. Just looking now and if it's running on lxd (which this test is) it doesn't handle entering the password.
[20:47] <veebers> menn0: Did that behaviour change recently? (i.e. lxd not needing a password)
[20:47] <menn0> veebers: hmm, ok. that could be it.
[20:47] <menn0> veebers: I have no idea.
[20:47]  * veebers builds latest juju to test
[20:47] <veebers> menn0: let me test that and I'll get back to you
[20:47] <rick_h_> veebers: hmm, yea the branch nate did made sure to clear cookies when you logout
[20:47] <rick_h_> veebers: menn0 which meant that you had to actually login again after a logout for a change
[20:48] <veebers> rick_h_: that's probably it then, I should be able to confirm shortly
[20:49] <menn0> veebers: is there a ticket yet for the TestLaterPingFails intermittent failure?
[20:49] <veebers> menn0: not yet, I can make one right now
[20:50] <menn0> veebers: ok thanks. i've just reproed it here
[20:51] <menn0> grrr. even with injected clocks it's still too easy to introduce these intermittent timing issues ....
[20:52] <veebers> menn0: fyi https://bugs.launchpad.net/juju/+bug/1632105
[20:52] <mup> Bug #1632105: Test MonitorSuite.TestLaterPingFails fails  <ci> <regression> <unit-tests> <juju:Confirmed> <https://launchpad.net/bugs/1632105>
[20:53] <menn0> veebers: cheers
[21:14] <menn0> veebers: ok i've got the fix for TestLaterPingFails done... will propose shortly
[21:15] <veebers> menn0: sweetbix
[21:16] <menn0> veebers: thanks for pointing it out
[21:17] <veebers> menn0: heh, no worries, it popped up in the 'unknowns' that I'm wrangling
[21:18] <menn0> good to get on top of these intermittent failures quickly
[21:18] <veebers> agreed
[21:23] <menn0> thumper: easy review pls: https://github.com/juju/juju/pull/6420
[21:23]  * thumper looking
[21:24] <thumper> +1
[21:27] <veebers> menn0: ugh, sorry for the noise earlier, the fix for the grant/revoke/macaroons stuff was in the CI test.
[21:27] <menn0> veebers: ok good to know
[21:40] <menn0> thumper: it's probably too late to expose the HighAvailability facade for controller logins isn't it?
[21:41] <thumper> menn0: yes and no
[21:41] <thumper> you could create a new one
[21:41] <thumper> but still need to support the old one
[21:41] <menn0> thumper: i'm working on the ticket to allow juju enable-ha to just work regardless of the current model
[21:42] <thumper> I guessed
[21:42] <menn0> thumper: the client side work is a piece of cake but you then end up with a controller login
[21:42] <menn0> thumper: the alternative is to do something tricky in the client so it logs into the controller model
[21:42] <thumper> Add the new facade
[21:42]  * thumper thinks
[21:43] <menn0> I don't think that would be hard to do and it's possibly lower risk
[21:43] <thumper> you need to have admin access in the controller model to do it
[21:44]  * menn0 checks what the current story with access is
[21:44] <thumper> or perhaps just write?
[21:45] <menn0> you currently need superuser access
[21:46] <menn0> that probably makes sense TBH
[21:50] <menn0> thumper: i don't *think* a new facade is required, the existing HighAvailability facade just needs to be exposed for controller logins and (controller) model logins (for compatibility)
[21:50] <thumper> ok
[21:50] <menn0> thumper: the client will have to try both approaches (in case a newer client is talking to an older server)
[21:50]  * thumper nods
[21:50] <menn0> thumper: with a preference for a controller login
[21:51] <menn0> ok
[21:51] <menn0> thumper: thanks rubber duck :)
[21:59] <veebers> thumper: you recently worked on some unit tests wrt to certificates/keys etc? Fyi I just filed this bug which may be of interest: https://bugs.launchpad.net/juju/+bug/1632127
[21:59] <mup> Bug #1632127: Unittest "MachineSuite.TestCertificateDNSUpdatedInvalidPrivateKey" fails on multiple archs <ci> <unit-tests> <juju:Confirmed> <https://launchpad.net/bugs/1632127>
[21:59] <thumper> veebers: as much as I love fixing bugs, I'm focusing on a different piece of work this week
[22:01] <veebers> thumper: ack, only wanted to bring it to your attention in case my assumption of you working in that area was correct :-)
[22:21] <menn0> veebers: TestCertificateDNSUpdatedInvalidPrivateKey was added by axw yesterday
[22:21] <menn0> it just failed for me on a merge attempt too
[22:23] <veebers> menn0: ack thanks, oh he's now on leave right?
[22:23] <menn0> veebers: ah yes... could be
[22:23] <menn0> veebers: assign it to me... I was working with him in that area yesterday
[22:23] <anastasiamac> menn0: veebers: Roger filed a bug for the failure.. i've marked Chris's as a duplicate...
[22:24] <menn0> anastasiamac: ok cool. bug number?
[22:24] <veebers> menn0: anastasiamac points out I filed a dupe bug
[22:24] <anastasiamac> menn0: veebers: https://bugs.launchpad.net/bugs/1631990
[22:24] <mup> Bug #1631990: cmd/jujud/agent: sporadic test failure in MachineSuite.TestCertificateDNSUpdatedInvalidPrivateKey <ci> <unit-tests> <juju:Triaged> <https://launchpad.net/bugs/1631990>
[22:24] <veebers> ah, beat me to it :-)
[22:25] <menn0> veebers: yep got that... I was after the original ticket
[22:25] <menn0> anastasiamac, veebers: thanks
[22:28] <jamespage> hey menn0 - I see you picked up bug 1541482
[22:28] <mup> Bug #1541482: unable to download local: charm due to hash mismatch in multi-model deployment <2.0-count> <juju-release-support> <uosci> <juju:Fix Released by menno.smits> <juju-core:Triaged by menno.smits> <https://launchpad.net/bugs/1541482>
[22:29] <menn0> jamespage: yep. I fixed the same/similar issue for 2.0 so I seemed like the right person to deal with it for 1.25.
[22:29] <jamespage> menn0, awesome
[22:29] <jamespage> menn0, it's killing our final release testing atm
[22:29] <jamespage> menn0, any ideas what triggers the race that causes it?
[22:29] <jamespage> we don't see it all of the time
[22:30] <menn0> jamespage: IIRC it's to do with the apiserver's on disk charm cache
[22:30] <menn0> jamespage: the solution is to remove the cache. it's not necessary any more (already done for 2.0)
[22:31] <jamespage> menn0, yeah - I saw the fix you did for 2.0
[22:32] <menn0> jamespage: there might be some other related fixes in that area too. I need to review previous PRs.
[22:32] <jamespage> menn0, ok - I'll leave it with you
[22:32] <menn0> jamespage: actually, I just checked and I've already done that
[22:33] <menn0> jamespage: those changes just haven't been released yet
[22:33] <menn0> jamespage: we need a 1.25.7 release
[22:33] <jamespage> yup!
[22:34] <menn0> jamespage: if it would help I can supply binaries for you to use in the mean time (to validate that the problem is indeed fixed for you)
[22:36] <thumper> jamespage: how urgent is the need for 1.25.7 for your team?
[22:37] <jamespage> thumper, we release thursday; currently we're doing final sweep of amulet testing tidy including disabling old release combos and enabling new ones
[22:38] <jamespage> thumper, but that final sweep has been going for a lot longer than normal as we can't get a full amulet run through consistently atm
[22:38]  * thumper nods
[22:38] <jamespage> thumper, we tried a workaround, but we're having to unpick that now
[22:39] <jamespage> thumper, as that did something quite different to what we expected
[22:39] <jamespage> thumper, so fairly urgent - I'll just see if we can splice in custom binaries to our Charm CI or not
[22:41] <thumper> jamespage: well, there is no way it is happening this week :(
[22:41] <thumper> so, sorry
[22:41] <jamespage> that was my guess
[23:18] <alexisb> menn0, ping
[23:22] <anastasiamac> jamespage: plz email alexisb and rick_h_ re: urgency for 1.25.7 to come out \o/
[23:42] <mwhudson> alexisb: well i started on that bug and now i'm reading the kernel source :)
[23:45] <thumper> menn0: are you working on https://launchpad.net/bugs/1631990 now?
[23:45] <mup> Bug #1631990: cmd/jujud/agent: sporadic test failure in MachineSuite.TestCertificateDNSUpdatedInvalidPrivateKey <ci> <unit-tests> <juju:Triaged by menno.smits> <https://launchpad.net/bugs/1631990>
[23:47] <menn0> thumper: I haven't gotten to it yet
[23:48] <menn0> thumper: did you want to pick it up, or were you just bitten by it?