[00:03] <wallyworld> veebers: do you know if bug 1621576 is in progress?
[00:03] <mup> Bug #1621576: get-set-unset config will be renamed <juju-ci-tools:Triaged> <https://launchpad.net/bugs/1621576>
[00:09] <veebers> wallyworld: it looks like curtis landed the fix in the tests for that. Seems he didn't update the bug. I'll confirm with him that it's finished (but it looks like it is)
[00:09] <wallyworld> veebers: awesome. the reason for asking is that we will be looking to land the code changes to juju to use the new command syntax
[00:09] <wallyworld> and without the ci script changes, ci will break
[00:15] <veebers> wallyworld: ack, I'll have confirmation for you tomorrow :-) But I'm pretty sure that fix is complete
[00:15] <wallyworld> ty
[00:16] <wallyworld> the juju pr needs a little fixing, so that won't be ready until the US comes back online anyway
[00:45] <mup> Bug #1560487 changed: local provider fails to create lxc container from template <canonical-is> <local-provider> <juju-core:Won't Fix> <juju-core 1.25:Triaged by alexis-bruemmer> <OPNFV:New> <https://launchpad.net/bugs/1560487>
[01:56] <menn0> axw: easy one: http://reviews.vapour.ws/r/5646/
[01:57] <axw> menn0: LGTM
[01:59] <menn0> axw: thanks
[02:50] <perrito666> Nites, does anyone happen to know dimitern's mobile?
[03:06] <menn0> wallyworld: another migration cleanup: http://reviews.vapour.ws/r/5648/
[03:07] <wallyworld> ok
[03:11] <anastasiamac> wallyworld: reinstatement (?) of vsphere supported architecture - http://reviews.vapour.ws/r/5649/
[03:12] <wallyworld> ok, on my list
[03:12] <anastasiamac> :)
[03:15] <rick_h_> perrito666: not here, you make it in?
[04:02] <rick_h_> wallyworld: do you have the config changes spec handy? I thought application config was just config now?
[04:02] <wallyworld> rick_h_: it is
[04:02] <wallyworld> but not in beta18
[04:02] <rick_h_> wallyworld: did it not make b18? oh crap
[04:02] <wallyworld> PR up today, will land tonight
[04:02] <rick_h_> ah, thought b18 got all but one commit
[04:02] <rick_h_> ah, gotcha
[04:02] <wallyworld> rick_h_: sorry :-(
[04:03] <wallyworld> we just ran out of time
[04:03] <rick_h_> wallyworld: all good, just working on my slides for tomorrow and checking my thoughts vs reality in the beta
[04:03] <wallyworld> needed to coordinate with CI etc
[04:03] <rick_h_> wallyworld: understand
[04:03] <wallyworld> put an asterisk :=)
[04:03] <rick_h_> yep, will work it out
[04:04] <wallyworld> rick_h_: also, i will have a fix today for machines not reporting as Down when they get killed
[04:04] <wallyworld> just a cosmetic thing, but very annoying
[04:04] <rick_h_> wallyworld: <3
[04:05] <wallyworld> especially if you are an admin trying to script whether to enable ha or not
[04:13] <anastasiamac> ... if CI gets a run without a failure... all landings I've seen today report similar failures :)
[04:17] <perrito666> Rick_h_: just getting out of the airport after 1h or more in the immigration queue. I wanted to message him to get dinner, but I guess I'll be arriving too late
[04:19] <rick_h_> perrito666: gotcha, sucky on the queue fun
[04:21] <perrito666> Happens :) seems I picked an especially busy day
[04:23] <perrito666> Juju is in town so all these people are coming for the charmer summit, evidently :p
[04:27] <menn0> wallyworld, axw: do you know if anyone is looking into all the test timeouts in apiserver/application?
[04:27] <menn0> it's happened to me and to lots of other merge attempts, it seems
[04:27] <axw> don't know
[04:27] <wallyworld> menn0: i'm not, i haven't been monitoring landing bot today
[04:27] <veebers> menn0: you're seeing this in the merge job? (anastasiamac ^^)
[04:27] <menn0> wallyworld: ok... i'll start looking
[04:27] <wallyworld> damn, something broke
[04:28] <wallyworld> menn0: i am fixing the annoying go cookies issue
[04:28] <menn0> veebers: yep, most merge attempts today have failed because of this
[04:28] <menn0> so someone managed to land something which is failing most of the time
[04:28]  * menn0 hopes it wasn't him :)
[04:28] <veebers> menn0: right, I was checking to see if it was CI/infra related. I've changed which machine the next run will happen on in hopes it might help.
[04:29] <menn0> veebers: ok thanks.
[04:29] <menn0> veebers: I can't repro the problem locally of course
[04:30] <veebers> menn0: heh :-\ always the way. FYI the last merge that passed on that job was: "fwereade charm-life" (http://juju-ci.vapour.ws:8080/job/github-merge-juju/9167/)
[04:30] <veebers> menn0: I'll track the next queued up job that will run on the older machine and let you know how it gets on
[04:30] <menn0> wallyworld, axw, anastasiamac: the stuck test appears to be TestAddCharmConcurrently if that rings any bells?
[04:31] <anastasiamac> menn0: no bells but veebers pointed out the commit ^^ that seems to be the culprit :D
[04:32]  * anastasiamac has to get a kid from school, b back l8r
[04:33] <menn0> veebers: cool, I'll start looking at that merge
[04:36] <anastasiamac> wallyworld: m considering removing arch caching from vsphere on the current pr as well.. any idea how heavily supported-architectures retrieval is used?
[04:37] <anastasiamac> wallyworld: it'll be calling simplestreams image retrieval every time the constraints validator is constructed...
[04:37] <wallyworld> in a couple of places
[04:37] <wallyworld> twice in one api call
[04:37] <wallyworld> when adding a machine i think
[04:38] <anastasiamac> wallyworld: k.. i'll leave it cached for now.. let's tackle it later for 2.1 maybe...
[04:38] <wallyworld> that's from memory though
[04:38] <wallyworld> would need to check code again
[04:39] <anastasiamac> wallyworld: k.. i've created a separate bug for it and we'll address separately then
[04:39] <anastasiamac> maybe we'll even have some help with performance benchmarking (veebers :D) to determine how much better/worse we'd do without caching supported architectures :)
[04:40] <veebers> heh :-)
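
As background for the trade-off being weighed above, here is a hedged sketch of what "arch caching" amounts to: memoise the simplestreams lookup so that constructing a constraints validator doesn't refetch image metadata every time. All names are illustrative, not the actual vsphere provider code.

```go
// Illustrative sketch of caching supported architectures; not the
// real vsphere provider code.
package archcache

import "sync"

type archCache struct {
	mu     sync.Mutex
	arches []string
}

// supportedArchitectures returns the cached list, fetching it once
// via the supplied lookup (e.g. a simplestreams image-metadata
// query). Without the cache, every constraints-validator
// construction would pay for that remote lookup.
func (c *archCache) supportedArchitectures(fetch func() ([]string, error)) ([]string, error) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if c.arches == nil {
		arches, err := fetch()
		if err != nil {
			return nil, err
		}
		c.arches = arches
	}
	return c.arches, nil
}
```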
[04:45] <wallyworld> menn0: you're busy, if you get a chance later, here's that status fix http://reviews.vapour.ws/r/5651/. If no time, I can ask elsewhere
[04:47] <menn0> wallyworld: looking now
[04:48] <wallyworld> ta
[04:55] <menn0> wallyworld: good stuff.
[04:55] <menn0> wallyworld: i'm creating a card now as the migrations prechecks will need to use this too
[04:55] <wallyworld> menn0: thanks menno. btw did you book your flights yet? when you arriving/leaving?
[04:57] <menn0> wallyworld: I've sent the email to the agent but haven't heard back yet (unsurprisingly since they're not at work yet)
[04:57] <wallyworld> you looking to arrive the first sat and leave the following sat?
[04:58] <menn0> wallyworld: I'm likely to be leaving on Saturday night, which gets me in on Sunday evening
[04:58] <menn0> wallyworld: leaving the sprint on Saturday morning
[04:58] <wallyworld> you'll miss drinks :-)
[04:58] <menn0> wallyworld: possibly
[04:58] <menn0> wallyworld: sometimes they're a bit later
[04:59] <wallyworld> depending on flights, i'm going to try and arrive sat evening
[05:15] <menn0> wallyworld: so it looks like Will hit the timeout in apiserver/application twice while trying to merge. He assumed it was bug 1596960
[05:15] <mup> Bug #1596960: Intermittent test timeout in application tests <tech-debt> <unit-tests> <juju:Triaged> <https://launchpad.net/bugs/1596960>
[05:15] <menn0> but that one says it's only windows
[05:15] <menn0> I'm guessing his changes have made it more likely to happen
[05:15] <wallyworld> damn, sounds plausible
[05:17] <wallyworld> looks messy as well
[05:54] <axw> wallyworld: will you have a chance to look at that ca-cert issue? I'm trying to stay focused on azure
[05:55] <wallyworld> axw: yeah, i can look
[05:56] <wallyworld> axw: just read emails, so the cert issue is just disabling the validation check IIANM
[05:57] <axw> wallyworld: see uros's latest email, there's also an issue with credentials looking up provider based on controller cloud
[05:57] <axw> which seems wrong...
[05:59] <wallyworld> yeah
[07:26] <blahdeblah> Quick Q: in order for a unit to log to rsyslog on node 0, should there be a rule in the secgroup that allows access to tcp port 6514?  And should juju add this automatically?
[07:52] <wallyworld> urulama: http://reviews.vapour.ws/r/5652/ FYI
[07:54] <urulama> thanks
[07:54] <wallyworld> blahdeblah: units can ask for ports to be opened on a bespoke basis
[07:55] <wallyworld> it's not something we'd do unilaterally
[07:55] <blahdeblah> wallyworld: so it wouldn't be done as part of add-unit when a machine is added via the manual provider?
[07:55] <urulama> wallyworld: been running it with that fix since axw pointed it out :)
[07:57] <wallyworld> blahdeblah: not that i am aware of. manual provider assumes pretty much that everything is in place. juju tends to try not to mess with manual machines
[07:57] <blahdeblah> wallyworld: OK - thanks
[07:58] <wallyworld> urulama: i was hinting for a review from your folks :-)
[07:59] <wallyworld> axw: fyi urulama thinks that add-model issue may be with the controller proxy, so we're off the hook for now
[08:00] <axw> wallyworld urulama: yeah, I think it's most likely due to something around Cloud.DefaultCloud and/or Cloud.Cloud
[08:00] <wallyworld> axw: yep, i traced it to the cli making an api call to client.Cloud() and it's all good in core
[08:00] <wallyworld> but something missing in proxy most likely
[09:55] <voidspace> babbageclunk: https://github.com/juju/juju/compare/master...voidspace:1534103-run-action
[09:57]  * frobware needs to run an errand; back in an hour.
[10:04] <fwereade> voidspace, may I have a 5-second review of http://reviews.vapour.ws/r/5653/ please?
[10:04] <fwereade> voidspace, apparently it has been failing a bunch
[10:05] <voidspace> fwereade: ok
[10:12] <voidspace> fwereade: LGTM
[10:13] <fwereade> voidspace, ta
[11:55] <mup> Bug #1594977 changed: Better generate-image help <helpdocs> <oil-2.0> <v-pil> <juju:Triaged> <https://launchpad.net/bugs/1594977>
[11:55] <mup> Bug #1622581 opened: Cryptic error message when using bad GCE credentials <juju-core:New> <https://launchpad.net/bugs/1622581>
[12:19] <mup> Bug #1622581 changed: Cryptic error message when using bad GCE credentials <juju-core:New> <https://launchpad.net/bugs/1622581>
[13:05] <fwereade> is anyone free for a ramble about cleanups with a detour into refcounting? axw, babbageclunk?
[13:12] <babbageclunk> yup yup
[13:14] <fwereade> babbageclunk, so, the refcount stuff I extracted
[13:14] <fwereade> babbageclunk, short version: it's safe in parallel but not in serial
[13:14] <babbageclunk> babbageclunk: ?
[13:15] <natefinch> fwereade: that is impressive
[13:15] <voidspace> that's impressive
[13:15] <babbageclunk> I didn't think that was a thing we needed to worry about.
[13:15] <voidspace> hard to do
[13:15] <natefinch> voidspace: hi5
[13:15] <voidspace> o/
[13:15] <fwereade> babbageclunk, i.e. refcount is 2; 2 separate transactions decref; one will fail, reread with refcount 1, successfully hit 0 and detect
[13:15] <voidspace> natefinch: :-)
[13:15] <fwereade> voidspace, natefinch: I'm rather proud of it, indeed
[13:15] <natefinch> lol
[13:16] <babbageclunk> but isn't serial just slow parallel?
[13:16] <fwereade> babbageclunk, refcount is 2, one transaction gets composed of separate ops that hit the same refcount: they'll decref it to 0, but won't ever "realise" they did so, so there's no guaranteed this-will-hit-0 detection
[13:17] <babbageclunk> ugh
[13:17] <fwereade> babbageclunk, we're always composing transactions from ops based on a read state from before the txn started
[13:17] <babbageclunk> All the asserts happen before all of the ops?
[13:17] <fwereade> babbageclunk, yeah
[13:17] <fwereade> babbageclunk, that's how it works
[13:18] <babbageclunk> of course. ouch. so each assert passes, but they leave it at 0 with no cleanup
[13:18] <fwereade> babbageclunk, yeah, exactly
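
For readers following along, a minimal Go sketch of the two cases fwereade describes, written against gopkg.in/mgo.v2/txn; the "refcounts" collection and field names are illustrative, not Juju's actual schema.

```go
// Illustrative only: collection and field names are made up.
package refcounts

import (
	"gopkg.in/mgo.v2/bson"
	"gopkg.in/mgo.v2/txn"
)

// decRefOp decrements a refcount that was read as greater than 1.
// The Assert is checked when the transaction runs, but it was
// *chosen* from a read taken before the txn started.
func decRefOp(key string) txn.Op {
	return txn.Op{
		C:      "refcounts",
		Id:     key,
		Assert: bson.D{{"refcount", bson.D{{"$gt", 1}}}},
		Update: bson.D{{"$inc", bson.D{{"refcount", -1}}}},
	}
}

// finalDecRefOp decrements a refcount that was read as exactly 1;
// when it applies, the caller knows the count is hitting 0 and can
// bundle removal/cleanup ops into the same transaction.
func finalDecRefOp(key string) txn.Op {
	return txn.Op{
		C:      "refcounts",
		Id:     key,
		Assert: bson.D{{"refcount", 1}},
		Update: bson.D{{"$inc", bson.D{{"refcount", -1}}}},
	}
}

// Parallel (safe): two separate transactions built against refcount=2
// each contain decRefOp. One commits, leaving 1; the other aborts on
// its assert, re-reads refcount=1, retries with finalDecRefOp, and so
// reliably detects the drop to 0.
//
// Serial (the bug): decoupled code paths compose ONE transaction with
// two decRefOps for the same doc, both built from a refcount=2 read.
// All asserts are checked before any update applies, so both pass,
// the count lands at 0, and neither op ever "realises" it did so.
func buggyCombined(key string) []txn.Op {
	return []txn.Op{decRefOp(key), decRefOp(key)}
}
```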
[13:19] <voidspace> fwereade: you have two days to fix this, right
[13:19] <fwereade> voidspace, perhaps :)
[13:19] <voidspace> :-)
[13:20] <fwereade> voidspace, stateful refcount thingy is one, wanton spamming of possibly-needed cleanups is another
[13:20] <fwereade> voidspace, I'm slightly hopeful that you have a third?
[13:20] <voidspace> I hope so too
[13:21] <fwereade> voidspace, oh
[13:21] <voidspace> oh, no
[13:21] <voidspace> sorry
[13:21] <fwereade> voidspace, days, I thought you said ways
[13:21] <voidspace> fwereade: I hope I have at least a third day
[13:21] <fwereade> voidspace, I would imagine so ;)
[13:21]  * babbageclunk lols sadly.
[13:21] <voidspace> fwereade: unless they purge juju of everyone you know...
[13:21] <voidspace> fresh start and all that
[13:22] <fwereade> voidspace, we have always been at war with...
[13:22] <voidspace> :-)
[13:23] <fwereade> voidspace, babbageclunk: so on the one hand there is this problem with txns
[13:24] <fwereade> voidspace, babbageclunk: and it's one that bites us increasingly hard as we try to isolate and decouple individual changes to the db
[13:25] <fwereade> voidspace, babbageclunk: and I don't really have an answer to either the problem or the increased risk we take on as we further isolate ops-generation
[13:28] <babbageclunk> Why do we compose these operations into one transaction? Shouldn't they be multiple transactions?
[13:28] <mup> Bug #1622136 changed: Interfaces file source an outside file for IP assignment to management interface <juju:Triaged by rharding> <https://launchpad.net/bugs/1622136>
[13:28] <fwereade> babbageclunk, that is basically where I'm going
[13:28] <babbageclunk> Not sure how we could prevent it though
[13:29] <fwereade> babbageclunk, I cannot, indeed, think of a reason that the app remove ops have to be bundled into the final-unit-remove ops
[13:30] <fwereade> babbageclunk, and in fact, that approach is itself vulnerable to that style of bug -- if we wrap up the final *2* unit-removes, we'd miss the app-remove code
[13:30] <babbageclunk> fwereade: But it would be nice if the transaction system could prevent you from combining these transactions together somehow since they're not valid.
[13:32] <fwereade> babbageclunk, that would probably be sensible, but I can't see any non-blunt ways of doing it -- only one op per doc, I guess? but that works *hard* against any prospect of decomposition
[13:33] <fwereade> babbageclunk, the usual escape valve is cleanup ops, ofc -- you can apply a partial change and leave a note to pick it up later, and that's great
[13:33] <babbageclunk> fwereade: can it be more fine-grained than that - one op touching any attribute of a doc in one transaction?
[13:34] <fwereade> babbageclunk, perhaps so, but it sorta sucks not to be able to incref unitcount by 5, for example
[13:34] <babbageclunk> (Not sure how easy that would be to do in the mongo expression lang)
[13:34] <babbageclunk> true
[13:35] <fwereade> babbageclunk, and anything at the txn layer has sort of lost the real context of the ops, so it's likely hard/impossible to DTRT re compressing ops into one
[13:36] <fwereade> babbageclunk, (I heartily support this style of thinking, I just don't think I can do much about it in 2 days, hence cleanups)
[13:37] <babbageclunk> fwereade: yeah, it seems like it would be hard to do that in a generic way - I can see it working for refcounts, but I'm sure the same problem can come from other things harder to reason about.
[13:37] <babbageclunk> so, cleanups!
[13:38] <fwereade> babbageclunk, so, if we simplify unit removal (and relation removal, same latent bug) such that it doesn't even care about app refcounts, and just reads life and drops in a maybe-clean-the-app-up op
[13:38] <fwereade> babbageclunk, the cleanups will run and everyone is happy
[13:39] <fwereade> babbageclunk, except that the time taken to remove a service once its last unit goes away has gone from ~0s to 5-10s
[13:39] <fwereade> babbageclunk, because the cleanups run on the watcher schedule
[13:39] <babbageclunk> fwereade: Oh, 'cause that's when a cleanup will run.
[13:39] <voidspace> so that's the "spam extra cleanup checks" approach
[13:40] <voidspace> but removing the service once the units have gone is *mostly* admin right
[13:40] <fwereade> babbageclunk, yeah -- and the more we do this, the better our decoupling but the more we'll see cleanups spawning cleanups and require ever more generations to actually get where we're going
[13:40] <voidspace> or is there resource removal that only happens at cleanup time too?
[13:40] <fwereade> voidspace, yeah, but you can't deploy another app with the same name, for example
[13:40] <voidspace> right
[13:41] <voidspace> is that a common need?
[13:41] <voidspace> maybe I guess
[13:41] <babbageclunk> do watcher polling more frequently!
[13:41] <babbageclunk> ;)
[13:42] <fwereade> babbageclunk, that is certainly an option, and it does speed things up, but it's also the sort of tuning parameter that I am loath to fiddle with without paying close attention to the Nth-order effects at various scales and so on
[13:43] <babbageclunk> What about rather than dropping a cleanup you drop another txn that does the removal, with an assert that the refcount's 0?
[13:43] <fwereade> babbageclunk, can't guarantee they both apply -- that is the purpose of a cleanup, to queue another txn, really
[13:43] <babbageclunk> Ah, no - the cleanup gets created in the txn, right?
[13:44] <fwereade> babbageclunk, and you can't really write ops for future execution in the general case -- if they fail, there's no attached logic to recreate or forget about them, we can only forget
[13:45] <fwereade> babbageclunk, voidspace: anyway, one watcher-tick delay is not so terrible
[13:46] <babbageclunk> no
[13:47] <fwereade> babbageclunk, voidspace: so I was thinking I could just tweak the cleanup worker: expose NeedsCleanup, and check it in a loop that cleans up until nothing's left
[13:48] <fwereade> babbageclunk, voidspace: which at least gives us freedom to explore more-staggered cleanup ops without macro-visible impact
[13:48] <fwereade> babbageclunk, voidspace: and which I can probably get done fairly quickly
[13:48] <voidspace> sounds reasonable
[13:49] <babbageclunk> +1
[13:49] <fwereade> babbageclunk, voidspace: barring unexpected surprises in trying to separate service-remove from unit-remove
[13:49] <fwereade> babbageclunk, voidspace: excellent, thank you
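
A rough sketch of the cleanup-worker tweak fwereade proposes here; NeedsCleanup doesn't exist yet at this point in the conversation, so its shape below is an assumption rather than current juju/state API.

```go
// Sketch only: NeedsCleanup is the method fwereade proposes exposing;
// its signature here is an assumption.
package cleaner

type cleanupState interface {
	Cleanup() error              // run all currently-queued cleanups
	NeedsCleanup() (bool, error) // any cleanup docs still outstanding?
}

// runCleanups keeps cleaning until no cleanup docs remain, so
// cleanups that spawn further cleanups are drained within a single
// watcher tick instead of one tick per generation.
func runCleanups(st cleanupState) error {
	for {
		if err := st.Cleanup(); err != nil {
			return err
		}
		needed, err := st.NeedsCleanup()
		if err != nil {
			return err
		}
		if !needed {
			return nil
		}
	}
}
```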
[14:11] <voidspace> # github.com/juju/juju/cmd/jujud
[14:11] <voidspace> /usr/lib/go-1.6/pkg/tool/linux_amd64/link: running gcc failed: fork/exec /usr/bin/gcc: cannot allocate memory
[14:12] <mgz> yeah, I suffer a fair bit from that
[14:12] <mgz> linking with 1.6 takes a lot of memory
[14:12] <voidspace> time to switch to 1.7 then I guess
[14:16] <voidspace> I haven't seen it before and now I'm seeing it consistently with master
[14:55] <voidspace_> ok, so a reboot fixed the memory issues
[15:34] <dimitern> frobware: hey, not sure if you've seen my PM
[15:34] <dimitern> frobware: here's the PR I'm talking about: https://github.com/juju/juju/pull/6219
[15:51] <fwereade> babbageclunk, voidspace_: I think I have a happier medium, in case I don't land anything else: http://reviews.vapour.ws/r/5644/
[15:51] <fwereade> babbageclunk, voidspace_: would either of you be free to take a look before EOD?
[15:52] <babbageclunk> fwereade: Sure, looking now
[15:52] <fwereade> babbageclunk, tyvm
[16:01] <redir> morning juju-dev
[16:48] <babbageclunk> fwereade: Sorry, I got distracted - still looking!
[16:51] <voidspace> fwereade: you still here?
[17:00] <fwereade> babbageclunk, voidspace: heyhey
[17:01] <voidspace> fwereade: so this implementation of a failaction operation seems to work and "do the right thing" https://github.com/juju/juju/compare/master...voidspace:1534103-run-action#diff-ae955475ac58e0d2683d2cfd6101b3f7R1
[17:03] <voidspace> fwereade: which is mostly copied from runaction.go
[17:07] <fwereade> voidspace, that certainly looks sane to me
[17:07] <voidspace> fwereade: cool, it seems to fix the bug and behave sanely - so I'll add tests and propose
[17:16] <fwereade> voidspace, cool, tyvm
[17:31] <perrito666> hey, juju restore survives suspending the machine for 10 mins, sweet
[17:55] <perrito666> does anyone know if there is a way to list all models?
[17:55] <perrito666> fwereade: ?
[17:56] <fwereade> perrito666, I thought there was literally a list-models?
[17:56] <perrito666> fwereade: sorry I meant in state :p
[17:56] <fwereade> perrito666, not sure offhand, how does the list-models apiserver do it?
[17:56]  * perrito666 accidentally mixed chai and earl grey and is not happy about the result
[17:57] <perrito666> fwereade: an ugly thing that gets models for a user
[17:57] <perrito666> I was trying to avoid constructing another one of those
[17:57] <perrito666> :p
[17:59] <perrito666> hey, there is an AllModels here
[17:59] <perrito666> nice
[18:01] <fwereade> perrito666, well, the raw collection is pretty terrible
[18:02] <fwereade> perrito666, but, resolved anyway ;p
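
For reference, roughly how the AllModels helper perrito666 found would be used; the signature is inferred from this exchange and should be treated as an assumption rather than a checked API reference.

```go
// Assumed shape of state.AllModels, inferred from the discussion above.
package example

import (
	"fmt"

	"github.com/juju/juju/state"
)

func listAllModels(st *state.State) error {
	// AllModels reads the raw models collection, avoiding another
	// hand-rolled "models for a user" query.
	models, err := st.AllModels()
	if err != nil {
		return err
	}
	for _, m := range models {
		fmt.Printf("%s (%s)\n", m.Name(), m.UUID())
	}
	return nil
}
```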
[18:52] <mbruzek> hmo: http://ppa.launchpad.net/juju/devel/ubuntu/pool/main/j/juju-core/juju-core_2.0-beta15-0ubuntu1~16.04.1~juju1.debian.tar.xz
[20:01] <perrito666> is the message "Contacting juju controller <private ip>" correct here? http://pastebin.ubuntu.com/23170667/
[20:08] <natefinch> perrito666: buh... that can't be right, unless somehow you can connect to the private address of AWS from where you're running the client
[20:09] <perrito666> I cant
[20:12] <natefinch> perrito666: weird then.  probably just posting the first address in whatever list
[20:12] <perrito666> yep, after a restore, juju status will also show that address
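
The fix natefinch is gesturing at would look something like the following: prefer a public-scope address for user-facing output rather than taking the first list entry. The types are simplified stand-ins for juju's network address types, not the real API.

```go
// Simplified stand-in types; juju's real network.Address is richer.
package addrpick

type address struct {
	value string
	scope string // e.g. "public", "local-cloud"
}

// bestAddress prefers a public-scope address for display, falling
// back to the first entry only when no public address exists.
func bestAddress(addrs []address) string {
	for _, a := range addrs {
		if a.scope == "public" {
			return a.value
		}
	}
	if len(addrs) > 0 {
		return addrs[0].value
	}
	return ""
}
```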
[20:39] <mup> Bug #1622738 opened: Multi-series charms failing in 1.25.6 <juju-core:New> <https://launchpad.net/bugs/1622738>
[22:33] <wallyworld> redir: i am free for a bit but need coffee, so give me 5 if you still have a question
[22:33] <redir> cool
[22:33] <redir> I'm here
[22:36] <redir> but going to make tea while you make coffee
[22:55] <perrito666> wallyworld: hey, this https://bugs.launchpad.net/juju/+bug/1595720 is still happening but now it's a big issue since admin users are hitting this :(
[22:55] <mup> Bug #1595720: Problems using `juju ssh` with shared models <ssh> <usability> <juju:Triaged> <https://launchpad.net/bugs/1595720>
[23:12] <wallyworld> perrito666: damn, i'll add to the list of todo items for rc1, yay
[23:16] <wallyworld> thumper: standup?
[23:36] <thumper> review up: http://reviews.vapour.ws/r/5657/
[23:40] <marcoceppi>  rick_h_ https://bugs.launchpad.net/juju/+bug/1622787
[23:40] <mup> Bug #1622787: If you name a credential with an @ Juju barfs <juju:New> <https://launchpad.net/bugs/1622787>
[23:41] <rick_h_> marcoceppi: lol ty for keeping the barf part
[23:43] <thumper> o/ marcoceppi