/srv/irclogs.ubuntu.com/2015/12/04/#juju-dev.txt

=== natefinch-afk is now known as natefinch
cheryljwallyworld: ping?02:23
wallyworldcherylj: hey give me a sec02:24
cheryljnp02:24
wallyworldcherylj: hey. btw do you have a link to the 2.0 todo list that has been started for the sprint?02:26
cheryljwallyworld: I don't know that one has been started yet02:27
wallyworldah, ok, np. next week then02:27
cheryljwallyworld: I know that when you guys were working the bootstack upgrade issues, you came across the problem in this bug: https://bugs.launchpad.net/juju-core/+bug/1459033 02:28
mupBug #1459033: Invalid binary version, version "1.23.3--amd64" or "1.23.3--armhf" <constraints> <maas-provider> <juju-core:Triaged> <https://launchpad.net/bugs/1459033>02:28
wallyworldyes02:28
cheryljbut I cannot find / remember what the cause of that was determined to be02:28
wallyworldthey had bad data in their tools metadata collection in state02:28
wallyworldat some point, juju i think must have returned "" for an unknown series lookup02:29
wallyworldand so when tools were being imported, wily or xenial tools for that bad version02:29
cheryljdid you have to manually recover the db?02:29
wallyworldyeah, i went in and deleted the tools metadata02:29
wallyworldthis causes juju to go out to streams.canonical.com02:30
wallyworldto fetch tools instead of using any cached values02:30
wallyworldand the newly downloaded tools are then cached again02:30
cheryljokay, cool.  Thanks!02:30
wallyworldbut hard to reproduce02:31
wallyworldi don't know how old that "" series issue is02:31
cheryljI can take a look02:31
wallyworldthere's a lot of moving parts02:32
wallyworldmay be hard to pin down a definitive "this is it" release02:32
wallyworldit should be an upgrade check02:32
wallyworldthe upgrader checks for bad metadata and deletes it02:32
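[editor's note] A minimal sketch of the check described above, assuming tools versions have the form number-series-arch as in the bug title ("1.23.3--amd64" has an empty series). The function names and the plain list of version strings are hypothetical, not juju's actual state schema.

```python
def has_empty_series(version):
    """Return True if a number-series-arch version string has an empty series."""
    # Split from the right so numbers such as "1.25-beta2" stay intact.
    parts = version.rsplit("-", 2)
    return len(parts) == 3 and parts[1] == ""


def split_tools_metadata(versions):
    """Partition version strings into (keep, delete) lists."""
    keep, delete = [], []
    for v in versions:
        (delete if has_empty_series(v) else keep).append(v)
    return keep, delete
```

Entries in the delete list would be dropped from the cached metadata, so that the tools are fetched again as described at 02:30.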
wallyworldaxw: here's the pr http://reviews.vapour.ws/r/3317/02:50
axwwallyworld: ta, looking02:50
axwwallyworld: reviewed03:07
wallyworldaxw: good pickups on the other issues; i've explained one point, hopefully it makes sense03:40
axwwallyworld: ok, looking again03:40
axwwallyworld: alternatively just "ServiceOffers" if URL is the canonical identifier03:41
axwwallyworld: your call tho03:41
wallyworldthat might work03:42
wallyworldshit , just saw a typo03:42
wallyworldformatOfferedServiceDetailss03:42
wallyworldwill fix that03:42
axwwallyworld: responded03:58
wallyworldlooking04:00
axwwallyworld: going to go have lunch and start packing, will check back in a little while04:01
wallyworldok, np, i'll push changes04:01
axwwallyworld: under what circumstances? re  "even if the query(filter) succeeds and the Error above is nil, converting the data from a particular query result item may have an error."05:04
axwwallyworld: just wondering if it's actually worthwhile departing from the conventional all-or-nothing per result05:05
wallyworld_axw: it gets the query result and then does things like look up the service and/or charm details (can't recall exactly)05:05
wallyworld_so that op per record could fail05:05
axwwallyworld_: does it make sense to include that in the result at all then? what're you going to do with that? you didn't specifically ask for an item, you just said "give me all the things that this filter matches"05:06
wallyworld_the doc in offered services collection does match. but composing the result errors05:07
wallyworld_we could ignore such errors i guess05:07
axwwallyworld_: yep... why would the user care about that?05:07
wallyworld_and pretend the item doesn't exist05:07
wallyworld_i'll rework it05:08
axwwallyworld_: it's just not clear to me how the user can action on that error05:08
axwwallyworld_: it's not due to an error in input, it's a server-side error05:08
axwwallyworld_: if you like, defer to a follow up05:09
wallyworld_as person making an offer, i would want to know if one of my offers went bad somehow05:09
wallyworld_let's land now and iterate next week? there's a fair bit of other cruft to fix also05:10
wallyworld_this is a good start though05:10
axwwallyworld_: then I think we need to move the error inside the details, rather than outside the details. that would be more like the errors we have in machine status, I think05:10
axwwallyworld_: sure05:10
wallyworld_hmm, ok, i could move inside05:10
wallyworld_i'll land as is and we can think a bit. i need to go pack etc05:11
axwwallyworld_: do it later, I'll take a once over now05:11
wallyworld_ok05:11
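[editor's note] A rough sketch of the "error inside the details" shape axw suggests: a per-item failure is recorded on that item and the other results still come back. OfferDetails, list_offers and lookup_charm are hypothetical names for illustration, not the params structs under review.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class OfferDetails:
    url: str
    charm: Optional[str] = None
    error: Optional[str] = None  # set only when composing this item failed


def list_offers(urls: List[str], lookup_charm: Callable[[str], str]) -> List[OfferDetails]:
    """Compose one result per matching offer, recording failures per item."""
    results = []
    for url in urls:
        try:
            results.append(OfferDetails(url=url, charm=lookup_charm(url)))
        except LookupError as exc:
            # The offer matched the filter, but its details could not be composed.
            results.append(OfferDetails(url=url, error=str(exc)))
    return results
```

The person making the offers can then see which one went bad, which is the case wallyworld raises at 05:09.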
axwwallyworld_: couple small fixes please, then shipit05:15
wallyworldaxw: thanks. my eyes get sore looking at all those params structs. they all start to merge into a big mess05:20
axwwallyworld: :)05:20
=== ashipika1 is now known as ashipika
TheMueHeya, old team. Next week in OAK/SFO?08:29
voidspacedimitern: standup?10:04
dimiternvoidspace, omw - having some HO issues10:04
mupBug #1522484 changed: state package tests no longer run since PR #3806 <blocker> <ci> <regression> <unit-tests> <juju-core:Fix Released by dimitern> <https://launchpad.net/bugs/1522484>10:24
mupBug #1522484 opened: state package tests no longer run since PR #3806 <blocker> <ci> <regression> <unit-tests> <juju-core:Fix Released by dimitern> <https://launchpad.net/bugs/1522484>10:27
mupBug #1522484 changed: state package tests no longer run since PR #3806 <blocker> <ci> <regression> <unit-tests> <juju-core:Fix Released by dimitern> <https://launchpad.net/bugs/1522484>10:30
fwereadedimitern, axw, rogpeppe: you have all contributed to this logic, so you may have insight: ISTM that the broken/closed logic in api/apiclient.go is really rather likely to deadlock on Close(); which would match a bug I've seen; any comments/recollections?10:53
dimiternfwereade, I don't recall much around that code, i have to do a refresher10:55
fwereadedimitern, axw, rogpeppe2: and it STM that http://paste.ubuntu.com/13665970/ would fix it10:55
dimiternfwereade, that looks like it should have been like this to begin with :)10:55
fwereadedimitern, it's just about the interaction of 2 channels (s.closed/s.broken) and how the heartbeatMonitor goroutine doesn't always close broken; but Close always waits for broken to be closed10:56
rogpeppe2fwereade: looking10:56
rogpeppe2fwereade: what's the difference between those two pieces of code?10:57
rogpeppe2fwereade: i can't see that using defer makes a difference10:57
fwereaderogpeppe2, ...goddammit10:57
fwereaderogpeppe2, this is why it's a good idea to talk to you about these things ;)10:57
rogpeppe2fwereade: :)10:57
fwereaderogpeppe2, hadn't picked up that we always ping until we fail10:58
* fwereade wonders if a ping could hang somehow...10:58
rogpeppe2uiteam: support removing multiple entities at once: https://github.com/CanonicalLtd/charmstore-client/pull/150 11:09
dimiternvoidspace, ping11:35
frobwaredooferlad, your python observations on add-juju-bridge.py. I think I'm going to land on master as-is because I'm going to work on the multiple bridge / multiple NICs straight away and I can a) fix them there and b) don't want to invalidate any of the manual testing. Ok?11:40
dooferladfrobware: sure11:41
dimiternfrobware, voidspace, I found out why David is having that issue - I'm about to propose a PR that fixes it by allowing Subnets() to be called without an instanceId (returning all subnets)11:48
frobwaredimitern, great & thanks.11:55
voidspacedimitern: pong11:57
voidspacedimitern: without an instanceId... interesting11:58
dimiternvoidspace, yes - like "gimme all subnets there are"11:58
voidspacedimitern: sure, what will use that?11:58
dimiternvoidspace, with the new api that's easy, and in fact already implemented11:58
dimiternvoidspace, in Spaces()11:58
voidspacedimitern: for maas, yes11:58
dimiternvoidspace, yes, only for maas and until we have import in place11:59
dimiternvoidspace, this will allow David et al to try spaces in constraints11:59
voidspacedimitern: ok, cool11:59
rogpeppe2uiteam: now updated to allow a -n flag: https://github.com/CanonicalLtd/charmstore-client/pull/150 12:24
* rogpeppe2 lunches12:25
voidspacedimitern: frobware: dooferlad: wife ill in bed, I'm looking after the boy12:56
voidspacehopefully she'll be rested and up soon12:56
frobwarevoidspace, ack12:56
dimiternvoidspace, sure - speedy recovery!12:57
dooferladvoidspace: hope things improve soon :-(12:58
frobwaredimitern, dooferlad, voidspace: PTAL @ http://reviews.vapour.ws/r/3319/13:23
dimiternfrobware, looking13:25
dimiternfrobware, LGTM13:27
frobwaredimitern, once that lands on master I plan to rebase maas-spaces as I can do the multi nic / multi bridge with that change in place13:33
dimiternfrobware, great13:34
* dimitern steps out for ~1h13:34
* frobware lunches13:34
fwereaderogpeppe2, re that heartbeatMonitor13:56
rogpeppe2fwereade: yes?13:56
fwereaderogpeppe2, the only reason I can see for it to block on Close is if Ping somehow hangs; and while I can't explain exactly why that would happen, it doesn't seem *intrinsically* implausible if the state server is misbehaving somehow13:57
fwereaderogpeppe2, (I'm mainly asking you because I think you wrote v1? anyway)13:58
=== rogpeppe2 is now known as rogpeppe
rogpeppefwereade: yeah, i'm responsible for the design of most of that code, i think13:58
rogpeppefwereade: have you got a reproducible test case?13:58
fwereaderogpeppe, anything obviously insane about timing the ping out and stopping? AFAICS none of the clients need to block until it's *actually* stopped?13:58
fwereaderogpeppe, sadly not13:58
fwereaderogpeppe, I am experimenting with messing around with connections and it's been flawless for me so far13:59
rogpeppefwereade: is there a bug report?13:59
fwereaderogpeppe, but there's a bug open -- and that includes a panicking state server somewhere -- yeah, https://bugs.launchpad.net/juju-core/+bug/1522544 14:00
mupBug #1522544: workers restart endlessly <juju-core:Triaged by fwereade> <https://launchpad.net/bugs/1522544>14:00
rogpeppefwereade: how would a hung-up heartbeatMonitor cause endless restarts?14:00
rogpeppefwereade: i'd've thought it would have the opposite effect14:00
rogpeppefwereade: i.e. it *can't* restart when needed14:01
fwereaderogpeppe, the worker that wraps it never finishes and is never restarted; so the conn resource doesn't get replaced, and everybody keeps gamely trying to connect with the old one14:02
fwereaderogpeppe, after all, it might just have been some transient error ;)14:02
rogpeppefwereade: hmm14:02
fwereaderogpeppe, (but ofc they all just start up, try to do something, fall down)14:02
fwereaderogpeppe, anyway14:02
rogpeppefwereade: before putting a timeout on the ping, i would make sure that that actually is happening14:04
rogpeppefwereade: for example by getting a *full* stack trace of all goroutines when this is happening14:04
rogpeppefwereade: if the connection to the state machine has gone, the Ping *should* exit14:04
fwereaderogpeppe, yeah, I'm not worried about that14:04
rogpeppefwereade: if it doesn't then it may be a bug in the rpc package14:05
fwereaderogpeppe, I'm worried about the worst case of what a confused state server might induce I guess14:06
fwereaderogpeppe, ehh, just a thought -- the interesting thing I suppose is that the same failure could have happened silently before, I suspect, it'd just manifest as an agent blocked for some reason and when there's no clear cause it's all too easy to bounce it and move on14:07
rogpeppefwereade: all the client request *should* return regardless of the server state14:09
rogpeppefwereade: because we close the connection14:09
fwereaderogpeppe, do you recall the rationale for waiting for a failed ping, rather than just exiting on close? I don't think a successful ping-failure implies anything useful about whether any other clients are using the conn14:09
rogpeppefwereade: and if the connection's closed the response reader should close14:10
rogpeppefwereade: no, i think it would be just fine to return after reading on s.closed14:10
rogpeppefwereade: i don't think that'll make any difference though14:11
rogpeppefwereade: because sending an API request after the client is closed will immediately return an error without actually doing anything14:12
rogpeppefwereade: actually i do see at least one rationale14:12
rogpeppefwereade: which is that State.Close waits on s.broken before returning14:12
fwereaderogpeppe, I'm thinking of scenarios like "a panicking apiserver has somehow deadlocked itself mid-rpc"14:13
rogpeppefwereade: thus ensuring that the heartbeat monitor is cleaned up14:13
rogpeppefwereade: even then, it should cause a client to hang up14:13
rogpeppefwereade: i mean, feel free to experiment14:13
rogpeppefwereade: but i think that if you try making an rpc server that never replies on any request, you should still be able to close the client ok14:14
rogpeppefwereade: (if not it's a bug)14:14
fwereaderogpeppe, yeah, I think you're right14:16
fwereaderogpeppe, cheers :)14:16
rogpeppefwereade: np14:17
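[editor's note] A toy model of the closed/broken interaction discussed above, with threading.Event standing in for the Go channels s.closed and s.broken. It is not the code in api/apiclient.go: the Conn class, the ping callable and the timeout on close are assumptions made for illustration. It shows the one case fwereade worries about: if a ping hangs, nothing ever signals broken, so a close that waits on broken can only give up by timing out.

```python
import threading


class Conn:
    """Toy connection; ping is a caller-supplied callable that raises on failure."""

    def __init__(self, ping, interval=0.01):
        self._ping = ping
        self._interval = interval
        self.closed = threading.Event()  # stands in for s.closed
        self.broken = threading.Event()  # stands in for s.broken
        threading.Thread(target=self._heartbeat, daemon=True).start()

    def _heartbeat(self):
        try:
            # Event.wait returns True once closed is set, False on timeout.
            while not self.closed.wait(self._interval):
                self._ping()
        except Exception:
            pass  # a failed ping means the connection is broken
        finally:
            # Signal broken however the loop ended (the analogue of a defer).
            self.broken.set()

    def close(self, timeout=None):
        """Signal closed, then wait for broken; False means the wait timed out."""
        self.closed.set()
        return self.broken.wait(timeout)
```

With a ping that returns or fails promptly, close returns True. With a ping that never returns, close(timeout=...) returns False, and close() without a timeout would block forever.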
voidspacefrobware: you rebased yet?14:42
mupBug #1522861 opened: Panic in ClenupOldMetrics <juju-core:New> <https://launchpad.net/bugs/1522861>14:51
frobwarevoidspace, just about to. make check works ok on maas-spaces.14:55
frobwarevoidspace, I forget (doh!) what we agreed for rebasing. Just push, or PR and push?14:57
frobwaredimitern, ^^14:57
voidspacefrobware: I'd say just push14:57
voidspacefrobware: you can't merge the PR anyway, so no point (IMO)14:58
frobwarevoidspace, exactly.14:58
frobwarevoidspace, couldn't remember what I did last time...14:58
voidspaceyou PR'd then pushed14:58
frobwarevoidspace, for the record, diff against master looks like: http://pastebin.ubuntu.com/13669884/15:02
dimiternfrobware, +1 for just push15:02
frobwaredimitern, dooferlad, voidspace: rebased. http://pastebin.ubuntu.com/13669966/15:06
dooferladfrobware: cool, thanks!15:07
dimiternfrobware, sweet!15:19
natefinchmgz:  you around?15:44
mgznatefinch: yo15:45
natefinchmgz: something wonky with my feature branch here: http://juju-ci.vapour.ws:8080/job/github-merge-juju-charm/20/console 15:47
natefinch+ /var/lib/jenkins/juju-ci-tools/git_gate.py --project gopkg.in/juju/charm.nate-minver 15:47
natefinchshould be --project gopkg.in/juju/charm.v6-unstable 15:47
natefinchprobably the first time we've ever tried a feature branch on a gopkg.in library15:48
natefinchmgz: not really sure how we're supposed to make that work.  I can understand what the CI code is doing... I don't know how to tell it "pretend this is charm.v6-unstable"15:49
mgzyeah, I don't think gopkg.in and feature branches are compatible15:50
mgzwell, they are, it's just via the go mechanism of rewriting all your imports15:51
mgzyou can't name a branch something other than v6-unstable then import it as that via gopkg.in15:51
natefinchwell, I mean, it works fine using godeps from juju/juju.... because godeps just sets the right commit... but yeah, I guess there's no real way to do that solely from the gopkg.in branch directly15:58
natefinchlike, you can do go get gopkg.in/juju/charm.v6-unstable and then git checkout minver and it'll work fine15:58
natefincher nate-minver15:59
mgzyou can get git to give you any rev it can find in the repo16:00
natefinchright16:00
mgzthat doesn't mean it's in any relevant history16:00
natefinchI think we need feature branch support in CI.  Otherwise we get into the case where a change to one of these versioned branches breaks juju/juju ... just like that email I sent a week ago or so16:01
mgzwe can't really hack around the way gopkg.in works16:02
mgzci on github.com/ stuff is fine16:02
mgzbut we'd need to actually sed imports to make gopkg.in work I think, and that's... not very productive16:02
natefinchsure you can.  I just said how: You do go get gopkg.in/juju/charm.v6-unstable and then git checkout nate-minver16:03
natefinch(and then godeps dependencies.tsv)16:03
natefinchthat's how I do development on my feature branch16:03
natefinchlet me double check that that actually works from scratch16:04
natefinchyep, totally works16:05
mgzhm, the simple case works with just mv on the dir yeah16:05
natefinchyeah, you just need the code from the branch to be in the directory go expects16:06
natefinchso the current CI code could work if we added a simple mv statement... though again, it has to understand what the code "expects" to be called, which now has to live outside the branch name.16:07
mgznatefinch: can we just naming scheme it?16:11
mgzhow many dots is too many dots?16:11
natefinchmgz: whatever is easiest for you guys. *I* don't care what the name of my branch is16:12
natefinchmgz: I'm pretty sure that gopkg.in only cares about that first dot16:12
natefinchmgz: we could do gopkg.in/juju/charm.v6-unstable.nate-minver16:12
natefinchlol almost semantic versioning at that point16:13
mgzcharm.v6-unstable.featurename seems somewhat workabl... right16:13
natefinchmgz: would you like me to poke at the CI code to get something like that working? I need it to get my min juju version stuff tested and landed16:58
mgzlp:juju-ci-tools git_gate.py I'd add a flag to do magic dot handling16:59
natefinchgood lord bzr is slow17:00
natefinchmgz: thanks, I'll look at it17:00
mgzyou should just be able to mangle the directory variable in go_test()17:00
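[editor's note] A small sketch of the "magic dot handling" floated above, assuming the charm.v6-unstable.featurename naming scheme: anything after the second dot-separated field of the last path element is treated as a feature suffix and dropped to recover the directory go expects. gopkg_import_path is a hypothetical helper, not an existing function in git_gate.py.

```python
def gopkg_import_path(project):
    """Strip a trailing feature name from a gopkg.in project path.

    'gopkg.in/juju/charm.v6-unstable.nate-minver'
    becomes 'gopkg.in/juju/charm.v6-unstable'.
    Non-gopkg.in projects are returned unchanged.
    """
    if not project.startswith("gopkg.in/"):
        return project
    head, _, last = project.rpartition("/")
    # Keep only name.version; assumes the version itself contains no dot.
    return head + "/" + ".".join(last.split(".")[:2])
```

The result would be the directory the branch is moved to before running the tests, as in the mv approach mentioned at 16:05.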
natefinchmgz: can you rename my branch?17:03
mgznatefinch: I can17:04
natefinchlol, writing python just broke sublime17:08
natefinchbbiab17:08
=== natefinch is now known as natefinch-afk
dimiternfrobware, voidspace, dooferlad, any of you guys still around?17:41
frobwaredimitern, yep17:44
dimiternfrobware, before I go today I need a review on the PR I'm proposing to unblock David17:44
dimiternfrobware, doing a final live test on maas 1.9 now and I'll publish it17:45
frobwaredimitern, OK17:45
frobwaredimitern, it's on the maas-spaces branch so we're pretty contained17:45
dimiternfrobware, yep17:45
frobwaredimitern, the alternative is to just email a patch to David if you're unsure that you want to land it.17:49
dimiternfrobware, oh boy :/ I've just discovered it won't work because of yet another maas bug17:49
dimiternfrobware, http://paste.ubuntu.com/13673162/17:50
dimiternfrobware, well, I can still propose it and land it, as it's useful but it won't unblock David until this is resolved17:50
frobwaredimitern, why does that fail?17:56
dimiternfrobware, no idea - looking at the maas logs now17:56
dimiternfrobware, PR: http://reviews.vapour.ws/r/3322/17:58
=== natefinch-afk is now known as natefinch
dimiternfrobware, I don't know why it includes already merged commits though :/17:58
dimiternmy changes are only in provider/maas/environ.go and environ_whitebox_test.go17:59
frobwaredimitern, is that because of my rebase?17:59
dimiternfrobware, well, I did a rebase of my origin/maas-spaces onto upstream/maas-spaces17:59
dimiternfrobware, and then rebased the last 2 commits on top of that18:00
dimiternanyway, I need to start packing..18:00
frobwaredimitern, your commit touches a lot of files18:00
frobwaredimitern, that's the single patch for David?18:01
dimiternfrobware, most of the changed files come from voidspace's ProviderId for subnets/spaces18:01
frobwaredimitern, are you merging from his branch?18:01
dimiternfrobware, I guess I'll redo it cleanly starting from fresh upstream/maas-spaces18:01
dimiternfrobware, no, I was rebasing.. well I did something wrong obviously18:02
voidspacedimitern: I'm here, sort of18:03
frobwaredimitern, the review is split over two pages for me18:03
dimiternvoidspace, false alarm it seems :)18:04
dimiternfrobware, yeah, I'll redo it cleanly, but not now18:04
voidspacedimitern: how come that PR has my "already landed" changes in it18:04
dimiternvoidspace, a git late-friday-evening mystery :)18:05
voidspacedimitern: heh18:05
