[00:20] <perrito666> wallyworld: btw I began using quassel, it sucks that they dont support ubuntu indicator for messages
[00:20] <wallyworld> yeah, maybe there's a plugin, not sure
[00:21] <perrito666> doesnt seem to implement plugins
[00:22] <perrito666> but having the server in amazon works better than bip
[00:29] <thumper> wallyworld: re bug 1236471
[00:29] <mup> Bug #1236471: Sporadic test failure w/ bot inside Uniter: FilterSuite.TestUnitRemoval <test-failure> <juju-core:Triaged> <https://launchpad.net/bugs/1236471>
[00:29] <thumper> wallyworld: it seems to be just utopic
[00:29] <thumper> is there someone running utopic that could test it?
[00:29] <wallyworld> oh joy
[00:29] <wallyworld> i can
[00:29] <thumper> works fine here
[00:29] <wallyworld> sigh
[00:29] <wallyworld> i'll try it now
[00:30] <thumper> wallyworld: although a key indicator to the problem could be the comment on line 111
[00:31] <wallyworld> thumper: works fine for me on utopic
[00:31] <thumper> wallyworld: reading the code...
[00:31] <thumper> wallyworld: it seems that sometimes destroy removes the unit, and sometimes it just sets it to dying
[00:32] <thumper> smells like a race condition
[00:32] <thumper> the filter that we check dies seems to not find the unit
[00:32] <wallyworld> yeah, sadly
[00:32] <thumper> so it feels like in that case, the doc was removed
[00:32] <thumper> as opposed to just set to dying
[00:33] <thumper> wallyworld: reading the comment in state/unit.go Destroy method
[00:33] <wallyworld> that makes sense having now read the code
[00:33] <thumper> wallyworld: it seems that once destroy has been called, we can't depend on it being there
[00:33] <thumper> so the test looks fucked
[00:33] <thumper> or the filter is screwed
[00:33] <thumper> as it expects it to be there
[00:33] <thumper> and it has gone
[00:33] <thumper> perhaps the test should handle the missing unit error
[00:34] <thumper> I bet what is happening
[00:34] <thumper> is that the remove is being executed before the filter go routine starts
[00:34] <wallyworld> thumper: this comment
[00:34] <wallyworld> // Ensure we get a signal on f.Dead()
[00:34] <wallyworld> seems to imply the test expects the unit to be Dead
[00:35] <wallyworld> but it's either still dying or already removed as you say
[00:35] <thumper> seems like the test needs to ensure that the filter go routine has started
[00:35] <thumper> and it probably doesn't
[00:35] <thumper> nope
[00:35] <thumper> it doesn't
[00:35] <thumper> there is the race
[00:37] <thumper> filter.go line 281 will be where it errors
[00:38]  * thumper grunts
[00:38] <thumper> wallyworld: perhaps the best way to fix this is to have NewFilter not return until the goroutine for the loop has hit a ready state?
[00:38] <thumper> wallyworld: thoughts?
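The suggestion above, a constructor that does not return until its loop goroutine has reached a ready state, can be sketched roughly as follows. This is a minimal illustration, not the juju filter code: the `Filter` type, its `ready`, `events`, and `done` channels, and the `Stop` method are all hypothetical stand-ins.

```go
package main

import "fmt"

// Filter is a hypothetical stand-in for a worker driven by a loop goroutine.
type Filter struct {
	ready  chan struct{} // closed once loop has finished its setup
	events chan string
	done   chan struct{} // closed when loop exits
	seen   []string
}

// NewFilter starts the loop goroutine and returns only after the loop
// has signalled that it is ready to receive events.
func NewFilter() *Filter {
	f := &Filter{
		ready:  make(chan struct{}),
		events: make(chan string),
		done:   make(chan struct{}),
	}
	go f.loop()
	<-f.ready // block until the loop goroutine has started
	return f
}

func (f *Filter) loop() {
	defer close(f.done)
	// Any setup that callers depend on would happen before this point.
	close(f.ready)
	for ev := range f.events {
		f.seen = append(f.seen, ev)
	}
}

// Stop closes the event stream, waits for the loop to exit, and
// returns the events the loop recorded.
func (f *Filter) Stop() []string {
	close(f.events)
	<-f.done
	return f.seen
}

func main() {
	f := NewFilter()
	f.events <- "unit removed"
	fmt.Println(f.Stop())
}
```

With this shape a test cannot act before the loop exists, which is the ordering problem discussed above; whether that is the right fix for the real filter is a separate question.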
[00:39] <wallyworld> thumper: i'm just reading the filter code and in various places there's this: IsCodeNotFoundOrCodeUnauthorized
[00:39] <wallyworld> so it seems that the filter can expect the unit to not be there
[00:39] <wallyworld> and in that case, it returns an ErrTerminateAgent which is what the test wants
[00:40] <wallyworld> do you see what i mean?
[00:40] <wallyworld> in the loop()
[00:41] <wallyworld> oh, i just read what you posted above
[00:41] <wallyworld> let me look at that
[00:42] <menn0> wallyworld, thumper: fyi I have made little progress with bug 1409827
[00:42] <mup> Bug #1409827: TestSetMembersErrorIsNotFatal fails <ci> <intermittent-failure> <regression> <test-failure> <juju-core:Triaged> <https://launchpad.net/bugs/1409827>
[00:42] <thumper> wallyworld: but you can see from the bug, the error is a "not found" one
[00:42] <menn0> wallyworld, thumper: mainly due to distractions at home
[00:42] <wallyworld> menn0: i hate those tests
[00:43] <thumper> wallyworld: because it is a params.Error not an rpc.Error
[00:43] <menn0> wallyworld, thumper: I am questioning why it's a CI blocker though. it fails far less often than the other ones you're currently looking at
[00:43] <menn0> wallyworld, thumper: I can only find one recent failure
[00:43] <wallyworld> menn0: maybe sinzui can answer
[00:45] <menn0> wallyworld, thumper, sinzui: it obviously needs to be dealt with, I just don't know if it's CI blocker material
[00:45] <menn0> wallyworld, thumper, sinzui: continuing with it now anyway
[00:46] <wallyworld> ty
[00:47] <thumper> wallyworld: I think I know why it isn't being caught
[00:47] <wallyworld> thumper: so the test log shows the CharmURL() api call failing with a params.NotFound error but this error should have been caught in the defer and changed to ErrTerminateAgent
[00:47] <thumper> wallyworld: because if any of the errors are traced or wrapped, then they don't match the call
[00:48] <thumper> maybe...
[00:48] <wallyworld> thumper: because ?
[00:48] <wallyworld> if err, _ := err.(rpc.ErrorCoder); err != nil {
[00:48] <wallyworld> err could be wrapped?
[00:49] <thumper> I'm thinking it is a possibility
[00:49] <thumper> however...
[00:49] <thumper> no...
[00:49] <thumper> because the resulting error is a params.Error
[00:51] <thumper> wallyworld: I don't see how this could fail... :(
[00:51] <wallyworld> sigh, me either, but we might be missing something subtle not being familiar with the code
[00:59] <wallyworld> thumper: if i comment out the deferred error handling to convert a not found to a errterminateagent, i can get it to fail just like in the test
[01:00] <wallyworld> so that does suggest that that error handling being used is critical to making the test pass, and i can't see how the error could be escaping
[01:01] <thumper> looking at the bug test log
[01:02] <thumper> [LOG] 67.77366 DEBUG juju.rpc.jsoncodec -> {"RequestId":12,"Response":
[01:02] <thumper> indicates that line 320 is the return statement in question
[01:02] <thumper> but I agree, can't see why it wasn't changed
[01:02] <wallyworld> thumper: still, i do think we need an errors.Cause() instead of a straight cast to rpc.Error in the params error code stuff
[01:03] <wallyworld> rpc.ErrorCoder i mean
[01:04] <thumper> while I generally agree
[01:04] <thumper> I'm trying to work out this failure
[01:04] <thumper> and this isn't it...
[01:04] <thumper> api/uniter/unit.go line 446
[01:05]  * thumper thinks...
[01:05] <thumper> hang on...
[01:05] <wallyworld> sure, that was a general comment
[01:06] <wallyworld> not for this fix
[01:06] <thumper> fucker
[01:06] <thumper> ...
[01:06] <thumper> damn it
[01:06]  * thumper looks deeper
[01:07] <wallyworld> result.Error won't be caught maybe
[01:07] <wallyworld> i saw that line before and assumed it would be caught
[01:08] <wallyworld> oh wait, yes it will
[01:08] <thumper> wallyworld: I need a reference to a recent failure
[01:09] <thumper> result.Error is a pointer
[01:09] <thumper> and the *params.Error does match the interface
[01:09] <thumper> we need to see the recent failure
[01:09] <thumper> looking at modern code for an old failure is a waste of time
[01:09] <thumper> too much can change
[01:09] <wallyworld> thumper: attached to the bug
[01:09] <wallyworld> https://launchpadlibrarian.net/190620335/filter-failure.log
[01:10] <thumper> wallyworld: the 19th of November isn't recent
[01:10] <wallyworld> thumper: looks like a wrapping issue
[01:10] <thumper> ah, that one is a wrapping issue
[01:10] <wallyworld> obtained *errors.Err = &errors.Err
[01:11] <thumper> haha
[01:11] <thumper> yeah, that is one
[01:11] <wallyworld> so maybe we should do my previous suggestion
[01:11] <thumper> I'll fix this
[01:11] <wallyworld> awesome
[01:15] <menn0> wallyworld, thumper: I think I see a potential data race relating to bug 1409827
[01:15] <mup> Bug #1409827: TestSetMembersErrorIsNotFatal fails <ci> <intermittent-failure> <regression> <test-failure> <juju-core:Triaged> <https://launchpad.net/bugs/1409827>
[01:16] <menn0> mustNext in worker/peergrouper/worker_test.go
[01:16] <menn0> the return value is set in a separate goroutine
[01:17] <menn0> isn't that a no-no?
[01:17] <menn0> thumper: ^
[01:18] <menn0> might not be the source of the test failures but looks fishy
[01:18] <wallyworld> menn0: just in meeting, will look soon
[01:19] <axw> katco: standup?
[01:20] <katco> axw: shoot sorry
[01:21] <thumper> menn0: /me looks
[01:21] <thumper> menn0: which line?
[01:22] <menn0> thumper: from 513 onwards
[01:23] <menn0> thumper: the return value "val" is assigned directly from the goroutine
[01:23] <menn0> thumper: the more I think about it the more I don't think this is the test failure (because if nil was returned we'd see a different kind of failure)
[01:23] <menn0> thumper: but i'm clearing it up anyway
[01:24] <thumper> you could be right
[01:24] <thumper> but we aren't seeing "timeout waiting" are we?
[01:24] <thumper> menn0: it is wrong, but not the source I think
[01:25] <thumper> wallyworld: this bug isn't a critical blocker, but done anyway...
[01:26] <wallyworld> thumper: in meeting, will look in a sec
[01:27] <thumper> wallyworld: https://bugs.launchpad.net/juju-core/+bug/1236471
[01:27] <mup> Bug #1236471: Sporadic test failure w/ bot inside Uniter: FilterSuite.TestUnitRemoval <test-failure> <juju-core:In Progress by thumper> <https://launchpad.net/bugs/1236471>
[01:27] <thumper> gah, wrong paste: http://reviews.vapour.ws/r/723/diff/#
[01:36] <thumper> axw: test added
[01:36] <axw> thumper: thanks
[01:37]  * thumper wonders if the bot will pick this up
[01:47] <wallyworld> thumper: yay, bot got your change. so i think the SetMembers test is now the only remaining blocker
[01:47] <axw> menn0: I took a look at the test yesterday, it just looked like time sensitivity to me
[01:48] <axw> menn0: if you double that last sleep, it'll fail each time
[01:49] <axw> the sleep durations do not leave a lot of margin for error/jitter/whatever
[01:53] <menn0> axw: totally agree. i'm figuring out how to rewrite it to not be so fragile.
[01:55] <menn0> axw: in fact, checking that retries happen with exponential backoff seems somewhat unnecessary
[01:55] <menn0> axw: what do you think about just ensuring that retries are occurring and leaving it at that?
[01:57] <thumper> wallyworld: branch landed, bug updated to fix released
[01:58] <thumper> wallyworld: as it was just a test failure bug
[01:58] <wallyworld> thumper: awesome, thanks
[01:58] <axw> menn0: seems fine to me
[01:58] <axw> menn0: it's a bit of an overkill test
[02:14] <ericsnow> wwitzel3: the patch to fix PortSet was pretty simple
[02:14] <ericsnow> wwitzel3: not that it affects us, but Intersection had the same problem
[02:14] <ericsnow> wwitzel3: anyway, I'm EOD :)
[02:14] <menn0> axw: I have a fix but thought I'd run one other possibility by you
[02:15] <menn0> axw: you don't think there's a possibility that the issue is to do with the way the count variable is being handled?
[02:15] <menn0> axw: it's being updated in another goroutine
[02:15] <axw> menn0: moment
[02:16] <menn0> axw: davecheney tells me that there's no guarantees about how updates will be seen by other goroutines
[02:16] <axw> menn0: yeah, I think that's wrong. it should just return the value on the channel...
[02:16] <axw> menn0: I don't think that's the cause of the failure tho
[02:17] <axw> well, it could be but I think the time sensitivity is more likely
[02:17] <menn0> axw: so do i but i thought i'd run it past you
[02:17] <menn0> axw: anyway, i've removed all the fragile timing checks... i'll have that up for review shortly
[02:18] <axw> cool
[02:18] <axw> menn0: maybe just change the chan bool to a chan struct{bool, interface{}} while you're there? :)
[02:19] <menn0> axw: oh i've done that
[02:19] <axw> great
[02:19] <menn0> axw: but what I meant is the function passed to setErrorFuncFor in TestSetMembersErrorIsNotFatal
[02:19] <menn0> axw: it updates the voyeur with a integer
[02:19] <axw> menn0: oh.. looking
[02:19] <menn0> axw: actually... never mind
[02:20] <menn0> axw: that's fine
[02:20] <menn0> axw: i'm looking too closely
[02:20] <menn0> axw: the variable is only used and updated from one goroutine so there's no issue
[02:20] <axw> yup
[02:26] <menn0> axw: http://reviews.vapour.ws/r/725/
[02:26] <axw> looking
[02:26] <menn0> axw: the channel change is a separate PR
[02:30] <menn0> axw: here's the other one: http://reviews.vapour.ws/r/726/
[02:34] <menn0> axw: all ok?
[02:35] <axw> menn0: reviewed, I'd prefer if we got rid of any explicit sleeps
[02:35] <axw> lemme know what you think - maybe I'm being too pedantic
[02:37] <axw> menn0: alternatively just get rid of the sleep in the test, since mustNext will wait up to LongWait anyway
[02:39] <menn0> axw: I guess I was wanting to see multiple retries
[02:39] <menn0> axw: but that can be done with multiple mustNext calls
[02:39] <menn0> so i'll do that
[02:40] <axw> menn0: hence the loop in my code, but yes, multiple mustNexts will do that too
[02:44] <waigani> ericsnow: is http://reviews.vapour.ws/r/724 really that big (about 40 files, 5000 lines changed) or has RB gotten confused?
[02:44]  * _thumper_ headdesks
[02:44]  * thumper headdesks
[02:44] <ericsnow> waigani: oops, no it's like 15 lines :)
[02:44] <thumper> yay string constants
[02:45] <thumper> go on, search the codebase for "90168e4c-2f10-4e9c-83c2-feedfacee5a9"
[02:45]  * thumper fixes
[02:45] <waigani> ericsnow: phew. I thought, "there goes my afternoon..."
[02:46] <waigani> whoha, that's a lot of feedface
[02:46] <ericsnow> waigani: what you saw is the GCE provider patch (minus +/- 1500 lines of tests we're still writing)
[02:47] <ericsnow> waigani: we'll be splitting that up into multiple review requests though :)
[02:47] <menn0> axw: http://reviews.vapour.ws/r/725/ updated
[02:47] <waigani> ericsnow: that would be good, otherwise it's a hell of a patch to review!
[02:48] <ericsnow> waigani: :)
[02:48] <axw> menn0: lgtm, thanks
[02:48] <menn0> axw: sweet. thanks.
[02:49] <menn0> thumper, wallyworld : fix for bug 1409827 merging. is the policy that I can mark the ticket as Fix Released once it's in b/c it's a test only fix?
[02:49] <mup> Bug #1409827: TestSetMembersErrorIsNotFatal fails <ci> <intermittent-failure> <regression> <test-failure> <juju-core:In Progress by menno.smits> <https://launchpad.net/bugs/1409827>
[02:50] <thumper> menn0: I think so
[02:50] <wallyworld> menn0: not sure, what tim said
[02:52] <menn0> wallyworld: good answer :-p
[02:56] <wallyworld> yep :-)
[03:02] <thumper> waigani: do you remember where we generate the uuid for new environments?
[03:02] <thumper> waigani: I remember we moved it...
[03:02] <thumper> but can't remember where
[03:03] <waigani> thumper: I think it's actually in util?
[03:03] <thumper> waigani: well, where we assign it into the environ config
[03:03] <waigani> thumper: will look in a sec, hangon
[03:05] <thumper> environs.ensureUUID
[03:05] <thumper> which is in environs.prepare
[03:07] <thumper> hmm...
[03:07] <thumper> ok
[03:09] <waigani> thumper: sorry, did you want me to hunt for anything else now?
[03:09] <thumper> waigani: nope
[03:09] <thumper> got it now
[03:10] <waigani> cool
[03:23] <thumper> damn... how long do these tests take to run?
[03:23] <thumper> geez
[03:24] <anastasiamac> waigani: wallyworld: had to change http://reviews.vapour.ws/r/722/
[03:24] <anastasiamac> waigani: wallyworld: could u PTAL again?
[03:24]  * thumper taps his fingers
[03:25] <anastasiamac> waigani: wallyworld: Get at apiserver now takes params.Entities
[03:25] <anastasiamac> waigani: wallyworld: everything else should be the same
[03:25] <anastasiamac> waigani: wallyworld: in fact, it's a revert to original rather than a change...
[03:25] <waigani> anastasiamac: just read review, yep that makes sense to me
[03:26] <anastasiamac> waigani: awesome - so I'll keep ur shipit :)
[03:26] <waigani> anastasiamac: sure :)
[03:26] <wallyworld> anastasiamac: +1
[03:27] <thumper> can someone tell me how long the tests should take in worker/provisioner plz?
[03:27] <anastasiamac> thumper: my last run
[03:27] <anastasiamac> thumper: github.com/juju/juju/worker/provisioner 56.706s
[03:27] <thumper> anastasiamac: ta
[03:28] <anastasiamac> wallyworld: thnx!!
[03:30] <menn0> merges are unblocked people
[03:31] <anastasiamac> menn0: yes there r already changes queued up in jenkins :)
[03:32] <anastasiamac> menn0: axw: thnx for unblocking it :)
[03:32] <menn0> anastasiamac: all those queued merges may or may not be mine
[03:32]  * menn0 ducks
[03:33] <anastasiamac> menn0: and here i was thinking to start the collection for a case of scotch...
[03:33] <menn0> ha
[03:44] <thumper> fark
[03:44] <thumper> fark fark fark
[03:44] <thumper> bitten by this same fucking issue again
[03:45] <thumper> FYI, machine config only has APIInfo structure set for state server machines
[03:46] <thumper> hmm...
[03:46] <thumper> no
[03:46] <thumper> damn
[04:13] <menn0> thumper: i'm trying to get my head around how an API connection for a new environment will be opened
[04:13] <menn0> thumper: the password will always be the same right? it's stored on the user, not the envuser.
[04:14] <menn0> thumper: actually... how do passwords for machines work?
[04:14] <thumper> yes, machines have passwords
[04:14] <thumper> menn0: but that bit isn't working right now
[04:15] <menn0> thumper: fixes in your branch?
[04:15] <thumper> menn0: I've opened a huge pile of worms with this env uuid in the agent config stuff
[04:15] <thumper> broken heaps
[04:15] <thumper> and slowly untangling
[04:15] <thumper> but I'm being called away
[04:15] <thumper> to walk the dog
[04:15] <thumper> so I'm done
[04:15] <thumper> for today
[04:15] <menn0> that sucks
[04:15] <menn0> ok
[04:15] <menn0> i will ignore that part for now
[04:21] <axw> anastasiamac: looks like your branch failed on the bot, but I think it might be an infrastructure issue
[04:21] <axw> take a look, you can probably just retry it
[04:27] <anastasiamac> axw: thnx will look
[05:18] <anastasiamac> is there a comment I can send to bot to not try to $$merge$$?
[05:18] <anastasiamac> like cancel a merge?
[05:20] <wallyworld> anastasiamac: you need to have credentials, i can cancel an in progress landing if you want
[05:21] <anastasiamac> wallyworld: ic
[05:21] <wallyworld> axw: if you have a few minutes, i'd appreciate a review of http://reviews.vapour.ws/r/727/
[05:21] <anastasiamac> wallyworld: no big deal i have qd couple of branches and noticed too late that one of them has unresolved conflict
[05:21] <axw> wallyworld: looking
[05:22] <anastasiamac> wallyworld: it'll just fail.. but thnx :)
[05:22] <wallyworld> ok
[05:22] <wallyworld> anastasiamac: annotations-tags?
[05:23] <anastasiamac> wallyworld: no sync-tools
[05:23] <anastasiamac> wallyworld: annotations tags are about to be backported to 1.22 ;)
[05:23] <anastasiamac> wallyworld: since they've merged well :P
[05:24] <jam> anastasiamac: wallyworld: aren't we in feature freeze for 1.22?
[05:24] <anastasiamac> jam: this is not a feature that's new to 1.22
[05:25] <anastasiamac> jam: it's kind of a bug... that needs to be fixed in 1.22
[05:25] <anastasiamac> jam: trying to get ckient signature right to avoid conflicts later
[05:25] <anastasiamac> jam: client*
[05:26] <anastasiamac> jam: s/conflicts/headaches
[05:26] <jam> k
[05:27] <anastasiamac> jam: thnx for checking :)
[05:36] <axw> wallyworld: reviewed
[05:36] <wallyworld> ty looking
[05:37] <axw> grrrrrr, shitty tests
[05:39] <anastasiamac> wallyworld: cherry picked annotations change http://reviews.vapour.ws/r/728/
[05:42] <wallyworld> axw: i introduced a new bootstrap method to avoid churn on the other providers.  i can quite see how to wrap the finaliser though such that the instance id is available to it, since it's called from bootstrap/bootstrap.go with a machine cfg without the id and is only filled in inside the closure
[05:44] <wallyworld> s/can/can't
[05:45] <axw> wallyworld: maybe I'm wrong, gimme a sec
[05:45] <axw> wallyworld: yep, you're right, sorry
[05:46] <wallyworld> axw: sure, np. you happy with the sig change to avoid churn?
[05:46] <axw> wallyworld: yes that'd be good, thanks
[05:47] <wallyworld> axw: to be clear, I made the sig change to avoid updating all the other providers. but you didn't like it
[05:47] <wallyworld> also, the way i have it avoids the boilerplate error checking
[05:47] <wallyworld> that would otherwise have to be introduced
[05:47] <axw> wallyworld: where would there be extra boilerplate?
[05:48] <wallyworld> axw: what's inside the new Bootstrap() func - those 5 lines or so
[05:49] <axw> wallyworld: I don't follow. the existing code hasn't changed much, and the callers of Bootstrap still need to check an error
[05:49] <wallyworld> since the environs.Bootstrap interface method i would think we'd want to retain
[05:49] <wallyworld> they just return common.Bootstrap() directly since the signature matches that of environs.Bootstrap
[05:50] <axw> wallyworld: ah yeah, that would need to change
[05:50] <axw> ok
[05:50] <axw> forget it
[05:50] <wallyworld> ok, i'll make the err change though
[05:50] <axw> wallyworld: keep it as is. I'll comment on the branch
[05:50] <wallyworld> ok, ta
[05:52] <wallyworld> axw: also, i started looking at bug 1384259. it seems cloud init is directly running the various apt commands it is configured with , and something else on the machine locks apt and then cloud init is sad. but i haven't dug any deeper. not sure if you have any ideas
[05:52] <mup> Bug #1384259: race condition running apt in bootstrap <bootstrap> <ci> <oil> <race-condition> <juju-core:Triaged> <juju-core 1.22:Triaged> <https://launchpad.net/bugs/1384259>
[05:52] <wallyworld> not sure if we want to wrap the cloud init apt commands with a retry
[05:54] <axw> wallyworld: apt is called from the ssh script, not cloud-init (on bootstrap only)
[05:54] <wallyworld> ah doh, yeah. i saw cloud init text in the log on the bug
[05:54] <axw> we could lock... I don't *think* cloud-init does anything like that though
[05:55] <wallyworld> not sure off hand how to solve this one, need to dig into it some more. any suggestions welcome.
[06:01] <axw> wallyworld: only thing I can think of is to add a script that waits for any apt or dpkg processes to stop running before we do anything
[06:02] <wallyworld> yuk, but may have no choice :-(
[06:02] <wallyworld> i've updated the pr too
[06:02] <axw> wallyworld: it'd be good to know what it's conflicting with, that might give us a better approach
[06:03] <wallyworld> i'll ask on the bug
[06:04] <axw> wallyworld: lgtm
[06:04] <wallyworld> ty
[06:06] <wallyworld> axw: in doing that branch i lost so much time due to not realising maas gave back instance ids that were different from the system ids to be passed over the api. sigh. i now know
[06:08] <axw> wallyworld: :(   any idea why we're using the resource_uri instead?
[06:08] <wallyworld> nope :-(
[06:08] <wallyworld> i found a helper function someone wrote to convert
[06:08] <wallyworld> so it must have been a deliberate decision
[06:13] <axw> whee, finally. you can now provision ec2 instances with volumes
[06:30] <wallyworld> axw: whoot! fantastic
[07:12] <axw> wallyworld: I forgot to ask before: is there a way we can flag some manual testing as being  required for the next release?
[07:12] <axw> (e.g. ensuring MAAS 1.7 deployments work well, for non-bootstrap machines)
[07:19] <wallyworld> axw: yes, i plan on raising this with curtis tomorrow
[07:20] <axw> wallyworld: ok, cool
[07:20] <axw> wallyworld: but in general, should we be using launchpad bugs or what...?
[07:21] <wallyworld> for recording testing notes? using lp bugs seems reasonable
[07:21] <wallyworld> this close to release, i wanted to do it more explicitly
[07:21] <axw> wallyworld: not so much a note, as "we should not release unless we know this thing has been tested"
[07:22] <wallyworld> yeah that, sorry, was using the term generically
[07:22] <axw> ok
[07:22] <wallyworld> but we don't have a documented process AFAIK to flag critical testing issues
[07:23] <axw> wallyworld: in my previous job we used to create tasks for every new feature and major bug fix that would block a release. they'd generally need to be done by someone other than the implementer. we had the luxury of having big, dedicated testing teams though :)
[07:24] <wallyworld> axw: yeah, we had something similar previously for me also
[07:25] <wallyworld> we just need to make sure that wes the release manager and the QA team are informed, and that other stakeholders are brought in as needed to help test
[08:33] <dimitern> axw, hey, are you still around?
[08:34] <axw> dimitern: heya, yes I am
[08:34] <TheMue> morning *yawn*
[08:34] <axw> morning
[08:35] <dimitern> axw, a quick storage question: do we plan to mount devices in lxc containers?
[08:35] <dimitern> morning TheMue
[08:36] <axw> dimitern: we want to be able to, yes. it's going to require some changes to lxc templates to allow mounting and so on
[08:36] <axw> dimitern: why do you ask?
[08:37] <dimitern> axw, because due to the networking work I plan to make lxc config file templates more flexible
[08:37] <dimitern> axw, and this should also help for storage
[08:38] <axw> dimitern: I see, yes, that will be helpful
[08:38] <dimitern> axw, sweet, I'll let you know when my changes are in then
[08:39] <axw> dimitern: thanks very much. probably won't be getting to lxc for a little while yet, but that'll be much appreciated
[08:39] <axw> wallyworld: ^^ dimitern is helping with storage now ;)
[08:39] <dimitern> wallyworld, axw, :D more like side-effecting it
[09:31] <dimitern> wallyworld, axw, can any of you have a look at a small goamz PR? https://github.com/go-amz/amz/pull/16 thanks!
[09:43] <voidspace> dimitern: ping
[09:43] <voidspace> dimitern: cannot use parent (type names.Tag) as type names.MachineTag in function argument: need type assertion
[09:44] <dimitern> voidspace, hmm
[09:44] <dimitern> voidspace, yeah?
[09:44] <voidspace> dimitern: just getting the code
[09:44] <voidspace> dimitern: I want to know if it's safe to just do the conversion
[09:45] <voidspace> dimitern: if I actually have the right tag
[09:45] <voidspace> dimitern: just finding the place where I get the tag and where I'm using it
[09:45] <dimitern> voidspace, it is safe if you actually have a names.MachineTag
[09:45] <voidspace> parent := p.authorizer.GetAuthTag()
[09:45] <voidspace> parentTag, err := names.ParseMachineTag(parent)
[09:46] <voidspace> parentMachine, err := p.getMachine(canAccess, parentTag)
[09:46] <voidspace> ah
[09:46] <voidspace> now the error is
[09:46] <voidspace>  cannot use parent (type names.Tag) as type string in function argument
[09:46] <voidspace> dimitern: so just convert then...
[09:47] <dimitern> voidspace, wait a sec
[09:47] <voidspace> dimitern: is the result of GetAuthTag() the machine tag?
[09:47] <dimitern> voidspace, GetAuthTag does return names.Tag, but if authorizer.AuthMachineAgent() is true then it's safe to cast it
[09:48] <voidspace> if it isn't true we shouldn't be running...
[09:48] <voidspace> so I should check I guess
[09:48] <dimitern> voidspace, yeah - have a look at NewProvisionerAPI in apiserver
[09:48] <dimitern> voidspace, the very first check is if !authorizer.AuthMachineAgent() && !authorizer.AuthEnvironManager() { return nil, common.ErrPerm }
[09:49] <dimitern> voidspace, actually, the getAuthFunc defined there is just what you need
[09:49] <dimitern> voidspace, it already checks parent/child relationship
[09:49] <voidspace> ah
[09:50] <voidspace> and I'm using that later anyway
[09:50] <voidspace> so maybe I don't need a separate check
[09:50] <voidspace> I'll look at that, thanks
[09:50] <dimitern> voidspace, yes, *I think* you can just use that getAuthFunc to validate the passed tag
[09:51] <voidspace> cool, thanks
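The compile errors quoted above come from passing an interface value where a concrete type is expected. A checked type assertion is the usual Go answer; the sketch below uses hypothetical `Tag`, `MachineTag`, and `UnitTag` types as stand-ins and does not reproduce the real names package or the provisioner API.

```go
package main

import "fmt"

// Tag is a hypothetical stand-in for an interface like names.Tag.
type Tag interface {
	String() string
}

// MachineTag is a hypothetical stand-in for names.MachineTag.
type MachineTag struct{ id string }

func (t MachineTag) String() string { return "machine-" + t.id }

// UnitTag is a second hypothetical tag kind, used to show the failure path.
type UnitTag struct{ name string }

func (t UnitTag) String() string { return "unit-" + t.name }

// asMachineTag uses the two-value assertion so that a non-machine tag
// produces an error instead of a panic.
func asMachineTag(tag Tag) (MachineTag, error) {
	mt, ok := tag.(MachineTag)
	if !ok {
		return MachineTag{}, fmt.Errorf("expected a machine tag, got %T", tag)
	}
	return mt, nil
}

func main() {
	fmt.Println(asMachineTag(MachineTag{id: "0"}))
	fmt.Println(asMachineTag(UnitTag{name: "mysql-0"}))
}
```

The single-value form `tag.(MachineTag)` would panic on a mismatch, which is why it is only safe once an earlier check has established what kind of agent is calling.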
[10:02] <dimitern> voidspace, standup?
[10:05] <voidspace> dimitern: oops, sorry
[10:56] <dimitern> voidspace, I have a cunning plan :)
[10:56] <dimitern> voidspace, you can't tag IPs, but you can tag NICs
[10:58] <dimitern> voidspace, e.g. we can add tags like "juju:machine-id=<id>", "juju:<mid>:address:<#>=<ip>" to the NIC after calling RunInstances
[11:00] <dimitern> voidspace, so each time we call AttachPrivateIpAddress successfully, we also add a tag "juju:0/lxc/0:address:0"="" (we don't know the address yet), but then when listing all instance IPs we use the tags to decide which goes where
[11:02] <dimitern> voidspace, and the instance updater can set "juju:0/lxc/0:address:0"="<some yet-unassigned ip>" as a tag and also in state
[11:03] <dimitern> anyway.. just thinking out loud - tags can be pretty powerful way of adding metadata accessible via aws api even if apiserver dies/cannot be reached, we can use the tags to intelligently cleanup dependent resources
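The tag naming scheme floated above ("juju:<mid>:address:<#>") can be sketched as a pair of helpers that build and parse such keys. Both functions are hypothetical, exist only to illustrate the format from the conversation, and make no calls to the AWS API or goamz.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// addressTagKey builds a key of the form "juju:<machine-id>:address:<index>".
func addressTagKey(machineID string, index int) string {
	return fmt.Sprintf("juju:%s:address:%d", machineID, index)
}

// parseAddressTagKey splits such a key back into machine id and index.
// It assumes the machine id itself contains no colons.
func parseAddressTagKey(key string) (string, int, error) {
	parts := strings.Split(key, ":")
	if len(parts) != 4 || parts[0] != "juju" || parts[2] != "address" {
		return "", 0, fmt.Errorf("not an address tag key: %q", key)
	}
	index, err := strconv.Atoi(parts[3])
	if err != nil {
		return "", 0, fmt.Errorf("bad address index in %q: %v", key, err)
	}
	return parts[1], index, nil
}

func main() {
	key := addressTagKey("0/lxc/0", 0)
	fmt.Println(key)
	fmt.Println(parseAddressTagKey(key))
}
```

Listing a NIC's tags and parsing the keys this way would be enough to map each address to a container without consulting state, which is the cleanup scenario described above.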
[11:06] <TheMue> dimitern: to stay with the golang naming conventions I would call it AttachPrivateIPAddress()
[11:07] <perrito666> morning
[11:07] <dimitern> TheMue, in goamz it's called AssignPrivateIPAddresses actually
[11:07] <TheMue> perrito666: heya and good morning
[11:07] <dimitern> morning perrito666
[11:07] <TheMue> dimitern: +1
[11:07] <TheMue> dimitern: just took a look at net package ;)
[11:08] <dimitern> TheMue, yeah :)
[11:37] <perrito666> dimitern: TheMue any of you knows what is the status of blocking bugs?
[11:37] <dimitern> perrito666, all resolved
[11:37] <perrito666> dimitern: and merged?
[11:37] <dimitern> perrito666, for now at least, so no longer blocked
[11:37]  * perrito666 looks at the topic hoping it will disappear
[11:47] <perrito666> wallyworld: do you not sleep?
[11:47] <wallyworld> sometimes
[11:48] <wallyworld> like you can talk :-)
[11:49] <perrito666> heh fair
[11:50] <perrito666> anyway your mail makes sense, that is why i added a unit ptr as a member of the unitagent, we can use tag from there
[11:53] <wallyworld> perrito666: i don't think it makes sense to embed the whole unit into unitagent - i thought we talked about having unitagent very lightweight, just doing status get/set
[11:53] <voidspace> dimitern: that's a terrible abuse of tags :-D
[11:54] <dimitern> voidspace, :) oh, I'm just getting started
[11:54] <voidspace> :-)
[11:55] <perrito666> wallyworld: I did not embed it, its just a member
[11:55] <perrito666> wallyworld: you might need some sleep and a couple of drinks
[11:55] <wallyworld> perrito666: the latter is taken care of :-)
[11:55] <perrito666> lol
[11:56] <wallyworld> but i'm still not sure about even referencing unit
[11:56] <wallyworld> we don't need all that baggage inside UnitAgent struct, which for now is just about get/set status
[11:57] <perrito666> I am all ears about Tag then :p
[11:59] <wallyworld> we could invent a new one eg "unitagent-foo/0", or easier, just have SetAgentStatus pass the "unit" tag across the wire and the method knows how to deal with it
[11:59] <wallyworld> the latter seems best, but maybe i'm missing something
[12:01] <perrito666> I am not sure of the implication of the latter, I guess it could work
[12:02] <wallyworld> i think it will be ok, but would need to start coding to see where it ends up
[12:03] <perrito666> well, that is what I am for
[12:03] <perrito666> and tonight I have meetings at 11pm and 00 (its 9am) so I seem to have time ahead of me
[12:24] <voidspace> dimitern: hmmm, I bet you can't set tags atomically though
[12:28] <dimitern> voidspace, well it would appear so.. although you can't set tags on instance/NIC/etc. creation according to the docs, you *can* launch an instance via the AWS web console and add tags to it
[12:28] <dimitern> voidspace, i've enabled the cloudtrail API logging and experimenting now to see how they do it
[12:28] <voidspace> cool
[12:28] <perrito666> mm, on a machine from scratch here and our tests seem to expect a /usr/lib/juju/bin/mongod
[12:29] <perrito666> that is sort of wrong for the tests isn't it?
[13:13] <wwitzel3> perrito666: short answer, yes ;)
[14:25] <jw4> backport PR to remove accidentally added file from 1.22: http://reviews.vapour.ws/r/731/
[14:25] <jw4> OCR PTAL ^^
[14:25] <jw4> :)
[14:25] <dimitern> jw4, ship it! :)
[14:25] <jw4> dimitern: :)
[14:27] <perrito666> that is so close to occipital
[14:28] <jw4> perrito666: http://en.wikipedia.org/wiki/Occipital_bone ?
[14:29] <perrito666> true
[14:29] <perrito666> the OCRPTAL bone
[14:29] <jw4> hehe
[14:29]  * jw4 just got it
[15:05] <TheMue> o/
[15:12] <perrito666> OCR PTAL http://reviews.vapour.ws/r/732/
[15:15] <dimitern> perrito666, we should just start using "occipital" :D
[15:17] <perrito666> dimitern: yes, it was very hard to resist the temptation
[15:35] <katco> one-line change and test; blocking 1.22; up for review: http://reviews.vapour.ws/r/733/
[15:40] <dimitern> katco, wow!
[15:40] <dimitern> katco, a return is all it takes?
[15:41] <katco> dimitern: i told you i had already thought of the possibility, but i ignored my own warning ;)
[15:41] <dimitern> katco, hehe - you've got a review
[15:42] <katco> dimitern: ty, i'll add the bug#
[15:44] <katco> dimitern: would you be able to do a quick test of the code on your environment? or has the opportunity passed?
[15:45] <dimitern> katco, sure, let me pull your branch
[15:45] <katco> dimitern: ty so much :)
[15:51] <dimitern> katco, np - it's bootstrapping now
[15:51] <katco> dimitern: cool ty again
[16:06] <dimitern> katco, ok, so no panic, just a few warnings about dns resolving - http://paste.ubuntu.com/9749392/
[16:06] <katco> dimitern: that's expected; looks good, yes?
[16:06] <dimitern> katco, yes, however isn't the warning message a bit misleading?
[16:07] <katco> dimitern: how so?
[16:07] <dimitern> katco, "Status may be incorrect" ?
[16:07] <katco> dimitern: well, it's showing that you're running on no subnets and utilizing no ports
[16:07] <dimitern> katco, got it, right
[16:07] <dimitern> katco, lgtm then
[16:08] <katco> dimitern: ty for all the help; finding it, reporting it, everything :)
[16:09] <dimitern> katco, np, thanks for fixing it so quickly :)
[16:10] <katco> dimitern: it's much easier to troubleshoot/fix something when you know (almost) everything about it :)
[16:11] <katco> dimitern: and the fact that i could write a unit test sped up the process as well
[16:11] <dimitern> katco, exactly!
[16:43] <voidspace> dimitern: ping
[16:43] <voidspace> dimitern: you still around?
[16:44] <dimitern> voidspace, yep
[16:44] <voidspace> dimitern: state.State supports adding a subnet that doesn't yet exist in state or fetching one that already exists
[16:44] <voidspace> dimitern: what I *want* is "get me this subnet - adding it if it doesn't exist"
[16:44] <voidspace> dimitern: is it better to do that in a State method or just hand-code the logic?
[16:45] <dimitern> voidspace, too many "states" :)
[16:45] <voidspace> hah
[16:45] <dimitern> voidspace, does not exist in which state?
[16:45] <dimitern> voidspace, ah, sorry
[16:45] <voidspace> the stored state
[16:45] <dimitern> voidspace, got you
[16:45] <voidspace> mongo I guess
[16:45] <voidspace> I mean, I know it's mongo
[16:45] <voidspace> but I guess that's a better way of saying it...
[16:46] <dimitern> voidspace, right - we can change AddSubnet to AddOrUpdateSubnet perhaps?
[16:46] <voidspace> dimitern: ok
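A minimal Go sketch of the "get me this subnet, adding it if it doesn't exist" helper discussed above. Everything in it is hypothetical: the `subnetStore` type, its `Subnet` and `AddSubnet` methods, the `SubnetInfo` fields and the two error values stand in for the real state API, which is not shown in this log.

```go
package main

import (
	"errors"
	"fmt"
)

// SubnetInfo is a hypothetical stand-in for state.SubnetInfo.
type SubnetInfo struct {
	CIDR       string
	ProviderId string
}

var (
	errNotFound      = errors.New("subnet not found")
	errAlreadyExists = errors.New("subnet already exists")
)

// subnetStore is an in-memory stand-in for the stored state (mongo).
type subnetStore struct {
	subnets map[string]SubnetInfo
}

func newSubnetStore() *subnetStore {
	return &subnetStore{subnets: make(map[string]SubnetInfo)}
}

// Subnet fetches an already stored subnet by CIDR.
func (s *subnetStore) Subnet(cidr string) (SubnetInfo, error) {
	info, ok := s.subnets[cidr]
	if !ok {
		return SubnetInfo{}, errNotFound
	}
	return info, nil
}

// AddSubnet stores a subnet and fails if the CIDR is already present.
func (s *subnetStore) AddSubnet(info SubnetInfo) error {
	if _, ok := s.subnets[info.CIDR]; ok {
		return errAlreadyExists
	}
	s.subnets[info.CIDR] = info
	return nil
}

// getOrAddSubnet returns the stored subnet for info.CIDR,
// adding it first if it is not there yet.
func getOrAddSubnet(s *subnetStore, info SubnetInfo) (SubnetInfo, error) {
	existing, err := s.Subnet(info.CIDR)
	if err == nil {
		return existing, nil
	}
	if !errors.Is(err, errNotFound) {
		return SubnetInfo{}, err
	}
	err = s.AddSubnet(info)
	if err != nil && !errors.Is(err, errAlreadyExists) {
		return SubnetInfo{}, err
	}
	// Re-read, so an add that lost a race to someone else still succeeds.
	return s.Subnet(info.CIDR)
}

func main() {
	store := newSubnetStore()
	first, _ := getOrAddSubnet(store, SubnetInfo{CIDR: "10.0.0.0/24", ProviderId: "subnet-a"})
	second, _ := getOrAddSubnet(store, SubnetInfo{CIDR: "10.0.0.0/24", ProviderId: "subnet-b"})
	fmt.Println(first.ProviderId, second.ProviderId)
}
```

The second call returns the subnet stored by the first one rather than overwriting it, which is the "get or add" behaviour asked for; an AddOrUpdateSubnet as suggested above would instead replace the stored value.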
[16:47] <voidspace> gah, and there's network.SubnetInfo plus state.SubnetInfo
[16:47] <voidspace> I have a network.SubnetInfo, I need a state.SubnetInfo
[16:47] <dimitern> voidspace, let me have a look
[16:48] <voidspace> dimitern: I wrote the code, I only have myself to blame
[16:48] <dimitern> voidspace, right, so the unfortunate duplication is on purpose
[16:48] <voidspace> dimitern: I'm ok
[16:49] <katco> dimitern: backport of same change to v1.22: http://reviews.vapour.ws/r/734/
[16:49] <dimitern> voidspace, state shouldn't depend on other packages, the same applies to params
[16:49] <dimitern> katco, looking
[16:49] <voidspace> dimitern: although state does depend on network anyway I believe
[16:49] <dimitern> katco, ship it! :)
[16:50] <katco> dimitern: woot! grats on quick turn around on this :)
[16:50] <dimitern> voidspace, well it does for network.Address I think
[16:50] <dimitern> katco, well I've seen it before lol
[16:51] <katco> dimitern: i mean the whole bug :)
[16:51] <katco> dimitern: wouldn't have gotten resolved, nor so quickly w/o your help
[16:51] <dimitern> katco, ah, yeah - one of the fastest fixes lately
[16:52] <dimitern> katco, np, glad to help
[16:53] <dimitern> voidspace, so.. the state documents shouldn't depend on things outside of state, which might change out-of-band and lead to docs getting serialized differently
[16:53] <voidspace> dimitern: fair enough
[16:53] <dimitern> voidspace, we're not entirely depend-less, but let's not make it worse :)
[16:54] <dimitern> voidspace, as for params - same issue - serialization; we shouldn't change the on-the-wire format of the api incompatibly
[16:54] <voidspace> I'm aware of that one
[16:55] <voidspace> for state I don't think it's a *genuine* issue though as we populate a subnetDoc from the SubnetInfo
[16:55] <dimitern> voidspace, sorry :/
[16:55] <voidspace> so we're safe from "out of band changes" anyway
[16:55] <voidspace> as we already have a layer of indirection for the actual serialisation
[16:55] <dimitern> voidspace, yeah, that's right
[16:56] <voidspace> adding SubnetInfo is *two layers* of indirection
[16:56] <voidspace> :-p
[16:56] <dimitern> voidspace, we should consult fwereade here I think
[16:56] <voidspace> dimitern: let me work with the code and see how it feels - I'll just write a "caster function" I guess
[16:57] <dimitern> voidspace, because not depending on packages for the sake of stable serialization format for mongo docs is one thing, but no dependencies at all might be too much
[16:57] <voidspace> ok
[16:57] <voidspace> and Subnet representation (network package) is a low level dependency not a structural dependency
[16:58] <dimitern> voidspace, I think so, yes
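A rough Go sketch of the "caster function" voidspace mentions, converting a network-package subnet description into the state-package one. The two struct types and their fields are assumptions made for illustration; the real network.SubnetInfo and state.SubnetInfo definitions do not appear in this log.

```go
package main

import "fmt"

// networkSubnetInfo is a hypothetical stand-in for network.SubnetInfo.
type networkSubnetInfo struct {
	CIDR       string
	ProviderId string
	VLANTag    int
}

// stateSubnetInfo is a hypothetical stand-in for state.SubnetInfo.
type stateSubnetInfo struct {
	CIDR       string
	ProviderId string
	VLANTag    int
}

// toStateSubnetInfo copies the fields one by one, so the state-side
// type never depends on the layout of the network-side type.
func toStateSubnetInfo(in networkSubnetInfo) stateSubnetInfo {
	return stateSubnetInfo{
		CIDR:       in.CIDR,
		ProviderId: in.ProviderId,
		VLANTag:    in.VLANTag,
	}
}

func main() {
	out := toStateSubnetInfo(networkSubnetInfo{
		CIDR:       "10.0.0.0/24",
		ProviderId: "subnet-a",
		VLANTag:    42,
	})
	fmt.Printf("%+v\n", out)
}
```

Copying field by field, rather than using a direct struct conversion, keeps the mapping explicit: a field added on the network side does not reach the state side until someone deliberately maps it.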
[18:25] <voidspace> dimitern: late ping
[18:27] <dimitern> voidspace, yeah? i'm around on and off
[18:28] <voidspace> dimitern: you added network.InterfaceInfo recently, with the intention it be used by the ProviderAPI api?
[18:29] <dimitern> voidspace, not over the wire though - there's a params.NetworkInfo for that
[18:30] <voidspace> dimitern: I have subnet info and ip address and am wondering how I get the extra information if that's what I'm required to
[18:31] <voidspace> dimitern: from the subnet CIDR I'll have to fetch the NIC info
[18:31] <voidspace> dimitern: it doesn't look like there's a provider method for this (that I can see), can I assume state will have it correctly?
[18:31] <voidspace> for the host machine
[18:33] <dimitern> voidspace, sorry, what extra info?
[18:33] <voidspace> DeviceIndex, MACAddress, NetworkName, InterfaceName
[18:33] <voidspace> etc...
[18:33] <voidspace> everything on NetworkInfo that isn't on SubnetInfo
[18:34] <dimitern> voidspace, hmm
[18:34] <voidspace> let me double check there's no environ.Interfaces
[18:34] <dimitern> voidspace, from state you mean?
[18:34] <voidspace> dimitern: I call environ.Subnets() which returns []network.SubnetInfo
[18:34] <voidspace> dimitern: there is an interface collection in state though I *believe*
[18:35] <voidspace> dimitern: maybe a problem for tomorrow as it's late for me too
[18:36] <voidspace> dimitern: I thought you might know easily... :-)
[18:36] <dimitern> voidspace, yeah - let's call it a day :) I'm a bit dumb now I'm afraid
[18:36] <voidspace> dimitern: it's even later for you than it is for me! Goodnight, see you tomorrow.
[18:36] <voidspace> and goodnight everyone
[19:43] <thumper> morning folks
[19:44] <thumper> geez... you make one small thing required and suddenly heaps of tests break...
[19:45] <perrito666> thumper: what did you break while we were not looking?
[19:45] <thumper> perrito666: I'm needing to add the environment uuid to the agent config
[19:45] <thumper> perrito666: otherwise all the machine and unit agents don't know which environment to connect to
[19:46] <thumper> perrito666: but that opened a world of hurt
[19:46] <thumper> that I've spent about five hours unpicking
[19:46] <thumper> I think I'm almost there
[19:46] <thumper> then I need to write an upgrade step
[19:46] <perrito666> I am sure you said "this should be easy" before starting, that usually complicates things
[19:46] <thumper> I think I did
[19:47] <thumper> I expected it to be a few hours
[19:47] <thumper> not days
[19:47] <perrito666> well, you should never jinx it
[19:52] <thumper> and I seem to have made the provisioner tests never exit
[19:52] <thumper> ...
[19:52] <thumper> not sure how that happened
[19:53]  * thumper looks up and sees two open critical bugs
[19:53] <thumper> WTF
[19:53] <thumper> ok... topic is wrong
[19:54] <perrito666> thumper: yep, I don't know why it's not back to none
[19:54] <perrito666> build is unblocked
[19:58] <perrito666> now, this is unexpected, there is an mtv channel that actually has music
[19:59] <thumper> haha
[20:14] <thumper> menn0: bot is unblocked, land your pending ones if you haven't already
[20:20]  * thumper makes a sad face
[20:20] <thumper> just found the most horrible fragile test
[20:20] <thumper> but don't have time to fix it right now
[20:21] <perrito666> thumper: which is?
[20:21] <thumper> func (*cloudinitSuite) TestWindowsCloudInit(c *gc.C) {
[20:21] <perrito666> ah, oops
[20:22] <thumper> no shit, doing an equality test with an 850-line string
[20:22] <thumper> any change in any cloudinit stuff means the string has to change
[20:22] <menn0> thumper: I landed them all yesterday (I unblocked the bot and got them all in before telling anyone else :-0)
[20:23] <menn0> thumper: that sounds wonderful
[20:23] <menn0> thumper: (that test)
[20:27] <thumper> tech debt item: all cloudinit tests are awful and fragile
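One common way to make a test like this less fragile is to assert on the few fragments the test actually cares about instead of comparing the whole generated script. The Go sketch below is an illustration of that swapped-in technique only, not how the juju tests are written; `missingFragments` and the sample script lines are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// missingFragments returns every fragment from want that does not
// appear anywhere in script, in the order given.
func missingFragments(script string, want []string) []string {
	var missing []string
	for _, fragment := range want {
		if !strings.Contains(script, fragment) {
			missing = append(missing, fragment)
		}
	}
	return missing
}

func main() {
	// Made-up script content, standing in for generated cloudinit output.
	script := "create-dir /var/lib/juju\nwrite-file agent.conf\nstart-service jujud\n"
	missing := missingFragments(script, []string{
		"write-file agent.conf",
		"start-service jujud",
		"remove-dir /tmp/juju",
	})
	fmt.Println(missing)
}
```

A test built this way fails only when a fragment it names disappears, so unrelated changes elsewhere in the script no longer require editing one huge expected string.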
[20:42] <thumper> ah ha...
[20:43] <thumper> I think I found the culprit
[20:52] <wwitzel3> ericsnow: ping
[20:52] <ericsnow> wwitzel3: hey
[21:34] <menn0> waigani: Ship It!
[21:35] <waigani> menn0: sweet, thanks
[21:50] <menn0> ericsnow: I just reviewed your Attempt PR again (Ship It if you like)
[21:51] <ericsnow> menn0: thanks
[22:04] <thumper> by jove I think I may have fixed all the test failures
[22:04] <thumper> ...
[22:04]  * thumper runs full suite again
[22:13] <thumper> menn0: 31 files changed, 185 insertions(+), 87 deletions(-) to get the tests passing on requiring environ uuid
[22:16] <thumper> menn0: do you have a few minutes to chat? I need to talk through an issue
[22:16] <thumper> although I think I know the answer
[22:24] <thumper> menn0, waigani_: beware with upgrade steps landing since 1.22 was branched, we should have 1.23 upgrade steps now
[22:24] <waigani_> thumper: right, noted
[22:24] <thumper> we should check any that have landed since the branch (if any)
[22:25] <thumper> I was just thinking of this now as I'm about to write an upgrade step
[22:39] <menn0> thumper: hi, sorry just noticed this. was deep in thought. chat now?
[22:40] <thumper> 2 minutes, booking a shuttle for tomorrow
[22:43] <thumper> menn0: standup hangout?
[22:43] <menn0> thumper: see you there
[23:35] <sinzui> thumper, wallyworld_ can you get someone to look into the windows regression reported in bug 1411024
[23:35] <mup> Bug #1411024: Win client/agent cannot bug built because of backup deps <ci> <regression> <windows> <juju-core:Triaged> <https://launchpad.net/bugs/1411024>
[23:35] <davecheney> thumper: is there an agenda for our feb sprint ?
[23:46] <davecheney> menn0: so what i'm hearing is "no, don't install the latest kernel update if you want wifi to work"
[23:50] <perrito666> sinzui: I am the culprit, I'll fix it