perrito666 | wallyworld: btw I began using quassel, it sucks that they dont support ubuntu indicator for messages | 00:20 |
---|---|---|
wallyworld | yeah, maybe there's a plugin, not sure | 00:20 |
perrito666 | doesnt seem to implement plugins | 00:21 |
perrito666 | but having the server in amazon works better than bip | 00:22 |
thumper | wallyworld: re bug 1236471 | 00:29 |
mup | Bug #1236471: Sporadic test failure w/ bot inside Uniter: FilterSuite.TestUnitRemoval <test-failure> <juju-core:Triaged> <https://launchpad.net/bugs/1236471> | 00:29 |
thumper | wallyworld: it seems to be just utopic | 00:29 |
thumper | is there someone running utopic that could test it? | 00:29 |
wallyworld | oh joy | 00:29 |
wallyworld | i can | 00:29 |
thumper | works fine here | 00:29 |
wallyworld | sigh | 00:29 |
wallyworld | i'll try it now | 00:29 |
thumper | wallyworld: although a key indicator to the problem could be the comment on line 111 | 00:30 |
wallyworld | thumper: works fine for me on utopic | 00:31 |
thumper | wallyworld: reading the code... | 00:31 |
thumper | wallyworld: it seems that sometimes destroy removes the unit, and sometimes it just sets it to dying | 00:31 |
thumper | smells like a race condition | 00:32 |
thumper | the filter that we check dies seems to not find the unit | 00:32 |
wallyworld | yeah, sadly | 00:32 |
thumper | so it feels like in that case, the doc was removed | 00:32 |
thumper | as opposed to just set to dying | 00:32 |
thumper | wallyworld: reading the comment in state/unit.go Destroy method | 00:33 |
wallyworld | that makes sense having now read the code | 00:33 |
thumper | wallyworld: it seems that once destroy has been called, we can't depend on it being there | 00:33 |
thumper | so the test looks fucked | 00:33 |
thumper | or the filter is screwed | 00:33 |
thumper | as it expects it to be there | 00:33 |
thumper | and it has gone | 00:33 |
thumper | perhaps the test should handle the missing unit error | 00:33 |
thumper | I bet what is happening | 00:34 |
thumper | is that the remove is being executed before the filter go routine starts | 00:34 |
wallyworld | thumper: this comment | 00:34 |
wallyworld | // Ensure we get a signal on f.Dead() | 00:34 |
wallyworld | seems to imply the test expects the unit to be Dead | 00:34 |
wallyworld | but it's either still dying or already removed as you say | 00:35 |
thumper | seems like the test needs to ensure that the filter go routine has started | 00:35 |
thumper | and it probably doesn't | 00:35 |
thumper | nope | 00:35 |
thumper | it doesn't | 00:35 |
thumper | there is the race | 00:35 |
thumper | filter.go line 281 will be where it errors | 00:37 |
* thumper grunts | 00:38 | |
thumper | wallyworld: perhaps the best way to fix this is to have NewFilter not return until the goroutine for the loop has hit a ready state? | 00:38 |
thumper | wallyworld: thoughts? | 00:38 |
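The idea floated above, having the constructor block until the loop goroutine has reached a ready state, can be sketched as follows. This is only an illustration under stated assumptions: the `Filter` type, its fields, and the `Started` and `Stop` methods are hypothetical stand-ins, not the actual juju uniter filter.

```go
package main

import (
	"fmt"
	"sync"
)

// Filter is a hypothetical stand-in for the uniter filter; only the
// start-up handshake is modelled here.
type Filter struct {
	mu      sync.Mutex
	started bool
	done    chan struct{}
}

// NewFilter starts the loop goroutine and does not return until the
// loop has signalled that its set-up is complete.
func NewFilter() *Filter {
	f := &Filter{done: make(chan struct{})}
	ready := make(chan struct{})
	go f.loop(ready)
	<-ready // block until the loop is in a known state
	return f
}

// loop performs its set-up, signals readiness, then waits to be stopped.
func (f *Filter) loop(ready chan<- struct{}) {
	// A real loop would create its watchers here (assumption).
	f.mu.Lock()
	f.started = true
	f.mu.Unlock()
	close(ready)
	<-f.done
}

// Started reports whether the loop has finished its set-up.
func (f *Filter) Started() bool {
	f.mu.Lock()
	defer f.mu.Unlock()
	return f.started
}

// Stop terminates the loop goroutine.
func (f *Filter) Stop() { close(f.done) }

func main() {
	f := NewFilter()
	defer f.Stop()
	fmt.Println("loop started:", f.Started())
}
```

With this shape a caller that destroys the unit right after `NewFilter` returns can no longer run ahead of the loop's set-up. If set-up can fail, the ready channel would need to carry an error instead of simply being closed.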
wallyworld | thumper: i'm just reading the filter code and in various places there's this: IsCodeNotFoundOrCodeUnauthorized | 00:39 |
wallyworld | so it seems that the filter can expect the unit to not be there | 00:39 |
wallyworld | and in that case, it returns an ErrTerminateAgent which is what the test wants | 00:39 |
wallyworld | do you see what i mean? | 00:40 |
wallyworld | in the loop() | 00:40 |
wallyworld | oh, i just read what you posted above | 00:41 |
wallyworld | let me look at that | 00:41 |
menn0 | wallyworld, thumper: fyi I have made little progress with bug 1409827 | 00:42 |
mup | Bug #1409827: TestSetMembersErrorIsNotFatal fails <ci> <intermittent-failure> <regression> <test-failure> <juju-core:Triaged> <https://launchpad.net/bugs/1409827> | 00:42 |
thumper | wallyworld: but you can see from the bug, the error is a "not found" one | 00:42 |
menn0 | wallyworld, thumper: mainly due to distractions at home | 00:42 |
wallyworld | menn0: i hate those tests | 00:42 |
thumper | wallyworld: because it is a params.Error not an rpc.Error | 00:43 |
menn0 | wallyworld, thumper: I am questioning why it's a CI blocker though. it fails far less often than the other ones you're currently looking at | 00:43 |
menn0 | wallyworld, thumper: I can only find one recent failure | 00:43 |
wallyworld | menn0: maybe sinzui can answer | 00:43 |
menn0 | wallyworld, thumper, sinzui: it obviously needs to be dealt with, I just don't know if it's CI blocker material | 00:45 |
menn0 | wallyworld, thumper, sinzui: continuing with it now anyway | 00:45 |
wallyworld | ty | 00:46 |
thumper | wallyworld: I think I know why it isn't being caught | 00:47 |
wallyworld | thumper: so the test log shows the CharmURL() api call failing with a params.NotFound error but this error should have been caught in the defer and changed to ErrTerminateAgent | 00:47 |
thumper | wallyworld: because if any of the errors are traced or wrapped, then they don't match the call | 00:47 |
thumper | maybe... | 00:48 |
wallyworld | thumper: because ? | 00:48 |
wallyworld | if err, _ := err.(rpc.ErrorCoder); err != nil { | 00:48 |
wallyworld | err could be wrapped? | 00:48 |
thumper | I'm thinking it is a possibility | 00:49 |
thumper | however... | 00:49 |
thumper | no... | 00:49 |
thumper | because the resulting error is a params.Error | 00:49 |
thumper | wallyworld: I don't see how this could fail... :( | 00:51 |
wallyworld | sigh, me either, but we might be missing something subtle not being familiar with the code | 00:51 |
wallyworld | thumper: if i comment out the deferred error handling to convert a not found to a errterminateagent, i can get it to fail just like in the test | 00:59 |
wallyworld | so that does suggest that that error handling being used is critical to making the test pass, and i can't see how the error could be escaping | 01:00 |
thumper | looking at the bug test log | 01:01 |
thumper | [LOG] 67.77366 DEBUG juju.rpc.jsoncodec -> {"RequestId":12,"Response": | 01:02 |
thumper | indicates that line 320 is the return statement in question | 01:02 |
thumper | but I agree, can't see why it wasn't changed | 01:02 |
wallyworld | thumper: still, i do think we need an errors.Cause() instead of a straight cast to rpc.Error in the params error code stuff | 01:02 |
wallyworld | rpc.ErrorCoder i mean | 01:03 |
thumper | while I generally agree | 01:04 |
thumper | I'm trying to work out this failure | 01:04 |
thumper | and this isn't it... | 01:04 |
thumper | api/uniter/unit.go line 446 | 01:04 |
* thumper thinks... | 01:05 | |
thumper | hang on... | 01:05 |
wallyworld | sure, that was a general comment | 01:05 |
wallyworld | not for this fix | 01:06 |
thumper | fucker | 01:06 |
thumper | ... | 01:06 |
thumper | damn it | 01:06 |
* thumper looks deeper | 01:06 | |
wallyworld | result.Error won't be caught maybe | 01:07 |
wallyworld | i saw that line before and assumed it would be caught | 01:07 |
wallyworld | oh wait, yes it will | 01:08 |
thumper | wallyworld: I need a reference to a recent failure | 01:08 |
thumper | result.Error is a pointer | 01:09 |
thumper | and the *params.Error does match the interface | 01:09 |
thumper | we need to see the recent failure | 01:09 |
thumper | looking at modern code for an old failure is a waste of time | 01:09 |
thumper | too much can change | 01:09 |
wallyworld | thumper: attached to the bug | 01:09 |
wallyworld | https://launchpadlibrarian.net/190620335/filter-failure.log | 01:09 |
thumper | wallyworld: the 19th of November isn't recent | 01:10 |
wallyworld | thumper: looks like a wrapping issue | 01:10 |
thumper | ah, that one is a wrapping issue | 01:10 |
wallyworld | obtained *errors.Err = &errors.Err | 01:10 |
thumper | haha | 01:11 |
thumper | yeah, that is one | 01:11 |
wallyworld | so maybe we should do my previous suggestion | 01:11 |
thumper | I'll fix this | 01:11 |
wallyworld | awesome | 01:11 |
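The wrapping issue identified above can be reproduced in isolation. The sketch below is not the juju code: `errorCoder` is assumed to have the same shape as `rpc.ErrorCoder`, `codedError` is a hypothetical stand-in for `*params.Error`, and the standard library's `errors.As` is used in place of the `errors.Cause()` call suggested in the discussion.

```go
package main

import (
	"errors"
	"fmt"
)

// errorCoder is assumed to mirror the rpc.ErrorCoder interface.
type errorCoder interface {
	ErrorCode() string
}

// codedError is a hypothetical stand-in for *params.Error.
type codedError struct {
	msg, code string
}

func (e *codedError) Error() string     { return e.msg }
func (e *codedError) ErrorCode() string { return e.code }

// annotate wraps err with extra context, as a tracing helper would.
func annotate(err error, msg string) error {
	return fmt.Errorf("%s: %w", msg, err)
}

// codeDirect uses a straight type assertion, so it only sees the
// outermost error and misses a code hidden behind a wrapper.
func codeDirect(err error) string {
	if coder, ok := err.(errorCoder); ok {
		return coder.ErrorCode()
	}
	return ""
}

// codeUnwrapped walks the wrap chain before giving up.
func codeUnwrapped(err error) string {
	var coder errorCoder
	if errors.As(err, &coder) {
		return coder.ErrorCode()
	}
	return ""
}

func main() {
	base := &codedError{msg: "unit not found", code: "not found"}
	wrapped := annotate(base, "getting charm URL")

	fmt.Printf("direct on base:       %q\n", codeDirect(base))
	fmt.Printf("direct on wrapped:    %q\n", codeDirect(wrapped))
	fmt.Printf("unwrapped on wrapped: %q\n", codeUnwrapped(wrapped))
}
```

The direct assertion works on the bare error but returns nothing once the error has been annotated, which matches the `*errors.Err` seen in the failure log.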
menn0 | wallyworld, thumper: I think I see a potential data race relating to bug 1409827 | 01:15 |
mup | Bug #1409827: TestSetMembersErrorIsNotFatal fails <ci> <intermittent-failure> <regression> <test-failure> <juju-core:Triaged> <https://launchpad.net/bugs/1409827> | 01:15 |
menn0 | mustNext in worker/peergrouper/worker_test.go | 01:16 |
menn0 | the return value is set in a separate goroutine | 01:16 |
menn0 | isn't that a no-no? | 01:17 |
menn0 | thumper: ^ | 01:17 |
menn0 | might not be the source of the test failures but looks fishy | 01:18 |
wallyworld | menn0: just in meeting, will look soon | 01:18 |
axw | katco: standup? | 01:19 |
katco | axw: shoot sorry | 01:20 |
thumper | menn0: /me looks | 01:21 |
thumper | menn0: which line? | 01:21 |
menn0 | thumper: from 513 onwards | 01:22 |
menn0 | thumper: the return value "val" is assigned directly from the goroutine | 01:23 |
menn0 | thumper: the more I think about it the more I don't think this is the test failure (because if nil was returned we'd see a different kind of failure) | 01:23 |
menn0 | thumper: but i'm clearing it up anyway | 01:23 |
thumper | you could be right | 01:24 |
thumper | but we aren't seeing "timeout waiting" are we? | 01:24 |
thumper | menn0: it is wrong, but not the source I think | 01:24 |
thumper | wallyworld: this bug isn't a critical blocker, but done anyway... | 01:25 |
wallyworld | thumper: in meeting, will look in a sec | 01:26 |
thumper | wallyworld: https://bugs.launchpad.net/juju-core/+bug/1236471 | 01:27 |
mup | Bug #1236471: Sporadic test failure w/ bot inside Uniter: FilterSuite.TestUnitRemoval <test-failure> <juju-core:In Progress by thumper> <https://launchpad.net/bugs/1236471> | 01:27 |
thumper | gah, wrong paste: http://reviews.vapour.ws/r/723/diff/# | 01:27 |
thumper | axw: test added | 01:36 |
axw | thumper: thanks | 01:36 |
* thumper wonders if the bot will pick this up | 01:37 | |
wallyworld | thumper: yay, bot got your change. so i think the SetMembers test is now the only remaining blocker | 01:47 |
axw | menn0: I took a look at the test yesterday, it just looked like time sensitivity to me | 01:47 |
axw | menn0: if you double that last sleep, it'll fail each time | 01:48 |
axw | the sleep durations do not leave a lot of margin for error/jitter/whatever | 01:49 |
menn0 | axw: totally agree. i'm figuring out how to rewrite it to not be so fragile. | 01:53 |
menn0 | axw: in fact, checking that retries happen with exponential backoff seems somewhat unnecessary | 01:55 |
menn0 | axw: what do you think about just ensuring that retries are occurring and leaving it at that? | 01:55 |
thumper | wallyworld: branch landed, bug updated to fix released | 01:57 |
thumper | wallyworld: as it was just a test failure bug | 01:58 |
wallyworld | thumper: awesome, thanks | 01:58 |
axw | menn0: seems fine to me | 01:58 |
axw | menn0: it's a bit of an overkill test | 01:58 |
ericsnow | wwitzel3: the patch to fix PortSet was pretty simple | 02:14 |
ericsnow | wwitzel3: not that it affects us, but Intersection had the same problem | 02:14 |
ericsnow | wwitzel3: anyway, I'm EOD :) | 02:14 |
menn0 | axw: I have a fix but thought I'd run one other possibility by you | 02:14 |
menn0 | axw: you don't think there's a possibility that the issue is to do with the way the count variable is being handled? | 02:15 |
menn0 | axw: it's being updated in another goroutine | 02:15 |
axw | menn0: moment | 02:15 |
menn0 | axw: davecheney tells me that there's no guarantees about how updates will be seen by other goroutines | 02:16 |
axw | menn0: yeah, I think that's wrong. it should just return the value on the channel... | 02:16 |
axw | menn0: I don't think that's the cause of the failure tho | 02:16 |
axw | well, it could be but I think the time sensitivity is more likely | 02:17 |
menn0 | axw: so do i but i thought i'd run it past you | 02:17 |
menn0 | axw: anyway, i've removed all the fragile timing checks... i'll have that up for review shortly | 02:17 |
axw | cool | 02:18 |
axw | menn0: maybe just change the chan bool to a chan struct{bool, interface{}} while you're there? :) | 02:18 |
menn0 | axw: oh i've done that | 02:19 |
axw | great | 02:19 |
menn0 | axw: but what I meant is the function passed to setErrorFuncFor in TestSetMembersErrorIsNotFatal | 02:19 |
menn0 | axw: it updates the voyeur with an integer | 02:19 |
axw | menn0: oh.. looking | 02:19 |
menn0 | axw: actually... never mind | 02:19 |
menn0 | axw: that's fine | 02:20 |
menn0 | axw: i'm looking too closely | 02:20 |
menn0 | axw: the variable is only used and updated from one goroutine so there's no issue | 02:20 |
axw | yup | 02:20 |
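The change discussed above, sending the result back over the channel instead of assigning the return value from inside the goroutine, might look roughly like this. It is a sketch only: the `watcher` interface, `nextValue`, and the two fake watchers are hypothetical names, and the real test helper lives in worker/peergrouper/worker_test.go and reports failures through the test framework rather than returning an error.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// watcher is a hypothetical minimal interface for something that
// yields successive values.
type watcher interface {
	Next() bool
	Value() interface{}
}

// nextResult carries both results across the channel, so the caller
// never reads a variable written by another goroutine.
type nextResult struct {
	ok  bool
	val interface{}
}

const (
	shortWait = 50 * time.Millisecond
	longWait  = time.Second
)

var errTimeout = errors.New("timed out waiting for value")

// nextValue waits up to timeout for the watcher's next value.
func nextValue(w watcher, timeout time.Duration) (interface{}, bool, error) {
	// Buffered so the goroutine can finish even after a timeout.
	done := make(chan nextResult, 1)
	go func() {
		ok := w.Next()
		var val interface{}
		if ok {
			val = w.Value()
		}
		done <- nextResult{ok: ok, val: val}
	}()
	select {
	case r := <-done:
		return r.val, r.ok, nil
	case <-time.After(timeout):
		return nil, false, errTimeout
	}
}

// fixedWatcher always has a value ready.
type fixedWatcher struct{ v interface{} }

func (w fixedWatcher) Next() bool         { return true }
func (w fixedWatcher) Value() interface{} { return w.v }

// blockedWatcher never yields until released.
type blockedWatcher struct{ release chan struct{} }

func (w blockedWatcher) Next() bool         { <-w.release; return false }
func (w blockedWatcher) Value() interface{} { return nil }

func main() {
	val, ok, err := nextValue(fixedWatcher{v: 42}, longWait)
	fmt.Println(val, ok, err)

	stuck := blockedWatcher{release: make(chan struct{})}
	_, _, err = nextValue(stuck, shortWait)
	fmt.Println(err)
	close(stuck.release)
}
```

Because the channel receive orders the goroutine's writes before the caller's reads, there is no data race on the returned value, and calling the helper several times in a row observes several retries without any explicit sleep in the test.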
menn0 | axw: http://reviews.vapour.ws/r/725/ | 02:26 |
axw | looking | 02:26 |
menn0 | axw: the channel change is a separate PR | 02:26 |
menn0 | axw: here's the other one: http://reviews.vapour.ws/r/726/ | 02:30 |
menn0 | axw: all ok? | 02:34 |
axw | menn0: reviewed, I'd prefer if we got rid of any explicit sleeps | 02:35 |
axw | lemme know what you think - maybe I'm being too pedantic | 02:35 |
axw | menn0: alternatively just get rid of the sleep in the test, since mustNext will wait up to LongWait anyway | 02:37 |
menn0 | axw: I guess I was wanting to see multiple retries | 02:39 |
menn0 | axw: but that can be done with multiple mustNext calls | 02:39 |
menn0 | so i'll do that | 02:39 |
axw | menn0: hence the loop in my code, but yes, multiple mustNexts will do that too | 02:40 |
waigani | ericsnow: is http://reviews.vapour.ws/r/724 really that big (about 40 files, 5000 lines changed) or has RB gotten confused? | 02:44 |
* _thumper_ headdesks | 02:44 | |
=== _thumper_ is now known as thumper | ||
* thumper headdesks | 02:44 | |
ericsnow | waigani: oops, no it's like 15 lines :) | 02:44 |
thumper | yay string constants | 02:44 |
thumper | go on, search the codebase for "90168e4c-2f10-4e9c-83c2-feedfacee5a9" | 02:45 |
* thumper fixes | 02:45 | |
waigani | ericsnow: phew. I thought, "there goes my afternoon..." | 02:45 |
waigani | whoa, that's a lot of feedface | 02:46 |
ericsnow | waigani: what you saw is the GCE provider patch (minus +/- 1500 lines of tests we're still writing) | 02:46 |
ericsnow | waigani: we'll be splitting that up into multiple review requests though :) | 02:47 |
menn0 | axw: http://reviews.vapour.ws/r/725/ updated | 02:47 |
waigani | ericsnow: that would be good, otherwise it's a hell of a patch to review! | 02:47 |
ericsnow | waigani: :) | 02:48 |
axw | menn0: lgtm, thanks | 02:48 |
menn0 | axw: sweet. thanks. | 02:48 |
menn0 | thumper, wallyworld : fix for bug 1409827 merging. is the policy that I can mark the ticket as Fix Released once it's in b/c it's a test only fix? | 02:49 |
mup | Bug #1409827: TestSetMembersErrorIsNotFatal fails <ci> <intermittent-failure> <regression> <test-failure> <juju-core:In Progress by menno.smits> <https://launchpad.net/bugs/1409827> | 02:49 |
thumper | menn0: I think so | 02:50 |
wallyworld | menn0: not sure, what tim said | 02:50 |
menn0 | wallyworld: good answer :-p | 02:52 |
wallyworld | yep :-) | 02:56 |
thumper | waigani: do you remember where we generate the uuid for new environments? | 03:02 |
thumper | waigani: I remember we moved it... | 03:02 |
thumper | but can't remember where | 03:02 |
waigani | thumper: I think it's actually in util? | 03:03 |
thumper | waigani: well, where we assign it into the environ config | 03:03 |
waigani | thumper: will look in a sec, hangon | 03:03 |
thumper | environs.ensureUUID | 03:05 |
thumper | which is in environs.prepare | 03:05 |
thumper | hmm... | 03:07 |
thumper | ok | 03:07 |
waigani | thumper: sorry, did you want me to hunt for anything else now? | 03:09 |
thumper | waigani: nope | 03:09 |
thumper | got it now | 03:09 |
waigani | cool | 03:10 |
thumper | damn... how long do these tests take to run? | 03:23 |
thumper | geez | 03:23 |
anastasiamac | waigani: wallyworld: had to change http://reviews.vapour.ws/r/722/ | 03:24 |
anastasiamac | waigani: wallyworld: could u PTAL again? | 03:24 |
* thumper taps his fingers | 03:24 | |
anastasiamac | waigani: wallyworld: Get at apiserver now takes params.Entities | 03:25 |
anastasiamac | waigani: wallyworld: everything else should be the same | 03:25 |
anastasiamac | waigani: wallyworld: in fact, it's a revert to original rather than a change... | 03:25 |
waigani | anastasiamac: just read review, yep that makes sense to me | 03:25 |
anastasiamac | waigani: awesome - so I'll keep ur shipit :) | 03:26 |
waigani | anastasiamac: sure :) | 03:26 |
wallyworld | anastasiamac: +1 | 03:26 |
thumper | can someone tell me how long the tests should take in worker/provisioner plz? | 03:27 |
anastasiamac | thumper: my last run | 03:27 |
anastasiamac | thumper: github.com/juju/juju/worker/provisioner 56.706s | 03:27 |
thumper | anastasiamac: ta | 03:27 |
anastasiamac | wallyworld: thnx!! | 03:28 |
menn0 | merges are unblocked people | 03:30 |
anastasiamac | menn0: yes there r already changes queued up in jenkins :) | 03:31 |
anastasiamac | menn0: axw: thnx for unblocking it :) | 03:32 |
menn0 | anastasiamac: all those queued merges may or may not be mine | 03:32 |
* menn0 ducks | 03:32 | |
anastasiamac | menn0: and here i was thinking to start the collection for a case of scotch... | 03:33 |
menn0 | ha | 03:33 |
=== kadams54 is now known as kadams54-away | ||
=== kadams54-away is now known as kadams54 | ||
thumper | fark | 03:44 |
thumper | fark fark fark | 03:44 |
thumper | bitten by this same fucking issue again | 03:44 |
thumper | FYI, machine config only has APIInfo structure set for state server machines | 03:45 |
thumper | hmm... | 03:46 |
thumper | no | 03:46 |
thumper | damn | 03:46 |
menn0 | thumper: i'm trying to get my head around how an API connection for a new environment will be opened | 04:13 |
menn0 | thumper: the password will always be the same right? it's stored on the user, not the envuser. | 04:13 |
=== kadams54 is now known as kadams54-away | ||
menn0 | thumper: actually... how do passwords for machines work? | 04:14 |
thumper | yes, machines have passwords | 04:14 |
thumper | menn0: but that bit isn't working right now | 04:14 |
menn0 | thumper: fixes in your branch? | 04:15 |
thumper | menn0: I've opened a huge pile of worms with this env uuid in the agent config stuff | 04:15 |
thumper | broken heaps | 04:15 |
thumper | and slowly untangling | 04:15 |
thumper | but I'm being called away | 04:15 |
thumper | to walk the dog | 04:15 |
thumper | so I'm done | 04:15 |
thumper | for today | 04:15 |
menn0 | that sucks | 04:15 |
menn0 | ok | 04:15 |
menn0 | i will ignore that part for now | 04:15 |
=== kadams54-away is now known as kadams54 | ||
axw | anastasiamac: looks like your branch failed on the bot, but I think it might be an infrastructure issue | 04:21 |
axw | take a look, you can probably just retry it | 04:21 |
anastasiamac | axw: thnx will look | 04:27 |
=== kadams54 is now known as kadams54-away | ||
anastasiamac | is there a comment I can send to bot to not try to $$merge$$? | 05:18 |
anastasiamac | like cancel a merge? | 05:18 |
wallyworld | anastasiamac: you need to have credentials, i can cancel an in progress landing if you want | 05:20 |
anastasiamac | wallyworld: ic | 05:21 |
wallyworld | axw: if you have a few minutes, i'd appreciate a review of http://reviews.vapour.ws/r/727/ | 05:21 |
anastasiamac | wallyworld: no big deal i have qd couple of branches and noticed too late that one of them has unresolved conflict | 05:21 |
axw | wallyworld: looking | 05:21 |
anastasiamac | wallyworld: it'll just fail.. but thnx :) | 05:22 |
wallyworld | ok | 05:22 |
wallyworld | anastasiamac: annotations-tags? | 05:22 |
anastasiamac | wallyworld: no sync-tools | 05:23 |
anastasiamac | wallyworld: annotations tags are about to be backported to 1.22 ;) | 05:23 |
anastasiamac | wallyworld: since they've merged well :P | 05:23 |
jam | anastasiamac: wallyworld: aren't we in feature freeze for 1.22? | 05:24 |
anastasiamac | jam: this is not a feature that's new to 1.22 | 05:24 |
anastasiamac | jam: it's kind of a bug... that needs to be fixed in 1.22 | 05:25 |
anastasiamac | jam: trying to get ckient signature right to avoid conflicts later | 05:25 |
anastasiamac | jam: client* | 05:25 |
anastasiamac | jam: s/conflicts/headaches | 05:26 |
jam | k | 05:26 |
anastasiamac | jam: thnx for checking :) | 05:27 |
axw | wallyworld: reviewed | 05:36 |
wallyworld | ty looking | 05:36 |
axw | grrrrrr, shitty tests | 05:37 |
anastasiamac | wallyworld: cherry picked annotations change http://reviews.vapour.ws/r/728/ | 05:39 |
wallyworld | axw: i introduced a new bootstrap method to avoid churn on the other providers. i can quite see how to wrap the finaliser though such that the instance id is available to it, since it's called from bootstrap/bootstrap.go with a machine cfg without the id and is only filled in inside the closure | 05:42 |
wallyworld | s/can/can't | 05:44 |
axw | wallyworld: maybe I'm wrong, gimme a sec | 05:45 |
axw | wallyworld: yep, you're right, sorry | 05:45 |
wallyworld | axw: sure, np. you happy with the sig change to avoid churn? | 05:46 |
axw | wallyworld: yes that'd be good, thanks | 05:46 |
wallyworld | axw: to be clear, I made the sig change to avoid updating all the other providers. but you didn't like it | 05:47 |
wallyworld | also, the way i have it avoids the boilerplate error checking | 05:47 |
wallyworld | that would otherwise have to be introduced | 05:47 |
axw | wallyworld: where would there be extra boilerplate? | 05:47 |
wallyworld | axw: what's inside the new Bootstrap() func - those 5 lines or so | 05:48 |
axw | wallyworld: I don't follow. the existing code hasn't changed much, and the callers of Bootstrap still need to check an error | 05:49 |
wallyworld | since the environs.Bootstrap interface method i would think we'd want to retain | 05:49 |
wallyworld | they just return common.Bootstrap() directly since the signature matches that of environs.Bootstrap | 05:49 |
axw | wallyworld: ah yeah, that would need to change | 05:50 |
axw | ok | 05:50 |
axw | forget it | 05:50 |
wallyworld | ok, i'll make the err change though | 05:50 |
axw | wallyworld: keep it as is. I'll comment on the branch | 05:50 |
wallyworld | ok, ta | 05:50 |
wallyworld | axw: also, i started looking at bug 1384259. it seems cloud init is directly running the various apt commands it is configured with , and something else on the machine locks apt and then cloud init is sad. but i haven't dug any deeper. not sure if you have any ideas | 05:52 |
mup | Bug #1384259: race condition running apt in bootstrap <bootstrap> <ci> <oil> <race-condition> <juju-core:Triaged> <juju-core 1.22:Triaged> <https://launchpad.net/bugs/1384259> | 05:52 |
wallyworld | not sure if we want to wrap the cloud init apt commands with a retry | 05:52 |
axw | wallyworld: apt is called from the ssh script, not cloud-init (on bootstrap only) | 05:54 |
wallyworld | ah doh, yeah. i saw cloud init text in the log on the bug | 05:54 |
axw | we could lock... I don't *think* cloud-init does anything like that though | 05:54 |
wallyworld | not sure off hand how to solve this one, need to dig into it some more. any suggestions welcome. | 05:55 |
axw | wallyworld: only thing I can think of is to add a script that waits for any apt or dpkg processes to stop running before we do anything | 06:01 |
wallyworld | yuk, but may have no choice :-( | 06:02 |
wallyworld | i've updated the pr too | 06:02 |
axw | wallyworld: it'd be good to know what it's conflicting with, that might give us a better approach | 06:02 |
wallyworld | i'll ask on the bug | 06:03 |
axw | wallyworld: lgtm | 06:04 |
wallyworld | ty | 06:04 |
wallyworld | axw: in doing that branch i lost so much time due to not realising maas gave back instance ids that were different from the system ids to be passed over the api. sigh. i now know | 06:06 |
axw | wallyworld: :( any idea why we're using the resource_uri instead? | 06:08 |
wallyworld | nope :-( | 06:08 |
wallyworld | i found a helper function someone wrote to convert | 06:08 |
wallyworld | so it must have been a deliberate decision | 06:08 |
axw | whee, finally. you can now provision ec2 instances with volumes | 06:13 |
wallyworld | axw: whoot! fantastic | 06:30 |
axw | wallyworld: I forgot to ask before: is there a way we can flag some manual testing as being required for the next release? | 07:12 |
axw | (e.g. ensuring MAAS 1.7 deployments work well, for non-bootstrap machines) | 07:12 |
wallyworld | axw: yes, i plan on raising this with curtis tomorrow | 07:19 |
axw | wallyworld: ok, cool | 07:20 |
axw | wallyworld: but in general, should we be using launchpad bugs or what...? | 07:20 |
wallyworld | for recording testing notes? using lp bugs seems reasonable | 07:21 |
wallyworld | this close to release, i wanted to do it more explicitly | 07:21 |
axw | wallyworld: not so much a note, as "we should not release unless we know this thing has been tested" | 07:21 |
wallyworld | yeah that, sorry, was using the term generically | 07:22 |
axw | ok | 07:22 |
wallyworld | but we don't have a documented process AFAIK to flag critical testing issues | 07:22 |
axw | wallyworld: in my previous job we used to create tasks for every new feature and major bug fix that would block a release. they'd generally need to be done by someone other than the implementer. we had the luxury of having big, dedicated testing teams though :) | 07:23 |
wallyworld | axw: yeah, we had something similar previously for me also | 07:24 |
wallyworld | we just need to make sure that wes the release manager and the QA team are informed, and that other stakeholders are brought in as needed to help test | 07:25 |
dimitern | axw, hey, are you still around? | 08:33 |
axw | dimitern: heya, yes I am | 08:34 |
TheMue | morning *yawn* | 08:34 |
axw | morning | 08:34 |
dimitern | axw, a quick storage question: do we plan to mount devices in lxc containers? | 08:35 |
dimitern | morning TheMue | 08:35 |
axw | dimitern: we want to be able to, yes. it's going to require some changes to lxc templates to allow mounting and so on | 08:36 |
axw | dimitern: why do you ask? | 08:36 |
dimitern | axw, because due to the networking work I plan to make lxc config file templates more flexible | 08:37 |
dimitern | axw, and this should also help for storage | 08:37 |
axw | dimitern: I see, yes, that will be helpful | 08:38 |
dimitern | axw, sweet, I'll let you know when my changes are in then | 08:38 |
axw | dimitern: thanks very much. probably won't be getting to lxc for a little while yet, but that'll be much appreciated | 08:39 |
axw | wallyworld: ^^ dimitern is helping with storage now ;) | 08:39 |
dimitern | wallyworld, axw, :D more like side-effecting it | 08:39 |
dimitern | wallyworld, axw, can any of you have a look at a small goamz PR? https://github.com/go-amz/amz/pull/16 thanks! | 09:31 |
voidspace | dimitern: ping | 09:43 |
voidspace | dimitern: cannot use parent (type names.Tag) as type names.MachineTag in function argument: need type assertion | 09:43 |
dimitern | voidspace, hmm | 09:44 |
dimitern | voidspace, yeah? | 09:44 |
voidspace | dimitern: just getting the code | 09:44 |
voidspace | dimitern: I want to know if it's safe to just do the conversion | 09:44 |
voidspace | dimitern: if I actually have the right tag | 09:45 |
voidspace | dimitern: just finding the place where I get the tag and where I'm using it | 09:45 |
dimitern | voidspace, it is safe if you actually have a names.MachineTag | 09:45 |
voidspace | parent := p.authorizer.GetAuthTag() | 09:45 |
voidspace | parentTag, err := names.ParseMachineTag(parent) | 09:45 |
voidspace | parentMachine, err := p.getMachine(canAccess, parentTag) | 09:46 |
voidspace | ah | 09:46 |
voidspace | now the error is | 09:46 |
voidspace | cannot use parent (type names.Tag) as type string in function argument | 09:46 |
voidspace | dimitern: so just convert then... | 09:46 |
dimitern | voidspace, wait a sec | 09:47 |
voidspace | dimitern: is the result of GetAuthTag() the machine tag? | 09:47 |
dimitern | voidspace, GetAuthTag does return names.Tag, but if authorizer.AuthMachineAgent() is true then it's safe to cast it | 09:47 |
voidspace | if it isn't true we shouldn't be running... | 09:48 |
voidspace | so I should check I guess | 09:48 |
dimitern | voidspace, yeah - have a look at NewProvisionerAPI in apiserver | 09:48 |
dimitern | voidspace, the very first check is if !authorizer.AuthMachineAgent() && !authorizer.AuthEnvironManager() { return nil, common.ErrPerm } | 09:48 |
dimitern | voidspace, actually, the getAuthFunc defined there is just what you need | 09:49 |
dimitern | voidspace, it already checks parent/child relationship | 09:49 |
voidspace | ah | 09:49 |
voidspace | and I'm using that later anyway | 09:50 |
voidspace | so maybe I don't need a separate check | 09:50 |
voidspace | I'll look at that, thanks | 09:50 |
dimitern | voidspace, yes, *I think* you can just use that getAuthFunc to validate the passed tag | 09:50 |
voidspace | cool, thanks | 09:51 |
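The compile error and the advice above come down to a checked type assertion from the general tag interface to the concrete machine tag. A minimal sketch follows, with simplified stand-ins for `names.Tag` and `names.MachineTag`; the real types live in the juju names package and the `asMachineTag` helper here is hypothetical.

```go
package main

import "fmt"

// Tag is a simplified stand-in for names.Tag.
type Tag interface {
	String() string
}

// MachineTag is a simplified stand-in for names.MachineTag.
type MachineTag struct{ id string }

func (t MachineTag) String() string { return "machine-" + t.id }

// UnitTag is a simplified stand-in for a non-machine tag.
type UnitTag struct{ name string }

func (t UnitTag) String() string { return "unit-" + t.name }

// asMachineTag converts tag to a MachineTag, returning an error
// instead of panicking when the caller is not a machine agent.
func asMachineTag(tag Tag) (MachineTag, error) {
	mt, ok := tag.(MachineTag)
	if !ok {
		return MachineTag{}, fmt.Errorf("expected a machine tag, got %T", tag)
	}
	return mt, nil
}

func main() {
	for _, tag := range []Tag{MachineTag{id: "0"}, UnitTag{name: "foo-0"}} {
		mt, err := asMachineTag(tag)
		if err != nil {
			fmt.Println("rejected:", err)
			continue
		}
		fmt.Println("accepted:", mt)
	}
}
```

The comma-ok form keeps the conversion safe even if an earlier authorization check is later removed or bypassed; an unchecked `tag.(MachineTag)` would panic on any other tag type.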
dimitern | voidspace, standup? | 10:02 |
voidspace | dimitern: oops, sorry | 10:05 |
dimitern | voidspace, I have a cunning plan :) | 10:56 |
dimitern | voidspace, you can't tag IPs, but you can tag NICs | 10:56 |
dimitern | voidspace, e.g. we can add tags like "juju:machine-id=<id>", "juju:<mid>:address:<#>=<ip>" to the NIC after calling RunInstances | 10:58 |
dimitern | voidspace, so each time we call AttachPrivateIpAddress successfully, we also add a tag "juju:0/lxc/0:address:0"="" (we don't know the address yet), but then when listing all instance IPs we use the tags to decide which goes where | 11:00 |
dimitern | voidspace, and the instance updater can set "juju:0/lxc/0:address:0"="<some yet-unassigned ip>" as a tag and also in state | 11:02 |
dimitern | anyway.. just thinking out loud - tags can be pretty powerful way of adding metadata accessible via aws api even if apiserver dies/cannot be reached, we can use the tags to intelligently cleanup dependent resources | 11:03 |
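The tag naming scheme floated above can be made concrete with a small key builder and parser. This only models the string format from the discussion ("juju:<mid>:address:<#>"); the function names are hypothetical and no AWS or goamz call is shown.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// addressTagKey builds the tag key for the index-th address of a
// machine or container, e.g. "juju:0/lxc/0:address:0".
func addressTagKey(machineID string, index int) string {
	return fmt.Sprintf("juju:%s:address:%d", machineID, index)
}

// parseAddressTagKey reverses addressTagKey. It reports ok=false for
// keys that do not follow the scheme.
func parseAddressTagKey(key string) (machineID string, index int, ok bool) {
	parts := strings.Split(key, ":")
	if len(parts) != 4 || parts[0] != "juju" || parts[2] != "address" {
		return "", 0, false
	}
	index, err := strconv.Atoi(parts[3])
	if err != nil || index < 0 {
		return "", 0, false
	}
	return parts[1], index, true
}

func main() {
	// Tags as they might be read back from a NIC; values are empty
	// until an address has been assigned.
	tags := map[string]string{
		addressTagKey("0/lxc/0", 0): "",
		"juju:machine-id":           "0",
	}
	for key, value := range tags {
		if id, idx, ok := parseAddressTagKey(key); ok {
			fmt.Printf("container %s address #%d = %q\n", id, idx, value)
		}
	}
}
```

Machine and container ids use "/" as a separator, so splitting the key on ":" is unambiguous under this scheme.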
TheMue | dimitern: to stay with the golang naming conventions I would call it AttachPrivateIPAddress() | 11:06 |
perrito666 | morning | 11:07 |
dimitern | TheMue, in goamz it's called AssignPrivateIPAddresses actually | 11:07 |
TheMue | perrito666: heya and good morning | 11:07 |
dimitern | morning perrito666 | 11:07 |
TheMue | dimitern: +1 | 11:07 |
TheMue | dimitern: just took a look at net package ;) | 11:07 |
dimitern | TheMue, yeah :) | 11:08 |
perrito666 | dimitern: TheMue any of you knows what is the status of blocking bugs? | 11:37 |
dimitern | perrito666, all resolved | 11:37 |
perrito666 | dimitern: and merged? | 11:37 |
dimitern | perrito666, for now at least, so no longer blocked | 11:37 |
* perrito666 looks at the topic hoping it will disappear | 11:37 | |
perrito666 | wallyworld: do you not sleep? | 11:47 |
wallyworld | sometimes | 11:47 |
wallyworld | like you can talk :-) | 11:48 |
perrito666 | heh fair | 11:49 |
perrito666 | anyway your mail makes sense, that is why i added a unit ptr as a member of the unitagent, we can use tag from there | 11:50 |
wallyworld | perrito666: i don't think it makes sense to embed the whole unit into unitagent - i thought we talked about having unitagent very lightweight, just doing status get/set | 11:53 |
voidspace | dimitern: that's a terrible abuse of tags :-D | 11:53 |
dimitern | voidspace, :) oh, I'm just getting started | 11:54 |
voidspace | :-) | 11:54 |
perrito666 | wallyworld: I did not embed it, its just a member | 11:55 |
perrito666 | wallyworld: you might need some sleep and a couple of drinks | 11:55 |
wallyworld | perrito666: the latter is taken care of :-) | 11:55 |
perrito666 | lol | 11:55 |
wallyworld | but i'm still not sure about even referencing unit | 11:56 |
wallyworld | we don't need all that baggage inside UnitAgent struct, which for now is just about get/set status | 11:56 |
perrito666 | I am all ears about Tag then :p | 11:57 |
wallyworld | we could invent a new one eg "unitagent-foo/0", or easier, just have SetAgentStatus pass the "unit" tag across the wire and the method knows how to deal with it | 11:59 |
wallyworld | the latter seems best, but maybe i'm missing something | 11:59 |
perrito666 | I am not sure of the implication of the latter, I guess it could work | 12:01 |
wallyworld | i think it will be ok, but would need to start coding to see where it ends up | 12:02 |
perrito666 | well, that is what I am for | 12:03 |
perrito666 | and tonight I have meetings at 11pm and 00 (it's 9am) so I seem to have time ahead of me | 12:03 |
voidspace | dimitern: hmmm, I bet you can't set tags atomically though | 12:24 |
dimitern | voidspace, well it would appear so.. although you can't set tags on instance/NIC/etc. creation according to the docs, you *can* launch an instance via the AWS web console and add tags to it | 12:28 |
dimitern | voidspace, i've enabled the cloudtrail API logging and experimenting now to see how they do it | 12:28 |
voidspace | cool | 12:28 |
perrito666 | mm, on a machine from scratch here and our tests seem to expect a /usr/lib/juju/bin/mongod | 12:28 |
perrito666 | that is sort of wrong for the tests isn't it? | 12:29 |
wwitzel3 | perrito666: short answer, yes ;) | 13:13 |
jw4 | backport PR to remove accidentally added file from 1.22: http://reviews.vapour.ws/r/731/ | 14:25 |
jw4 | OCR PTAL ^^ | 14:25 |
jw4 | :) | 14:25 |
dimitern | jw4, ship it! :) | 14:25 |
jw4 | dimitern: :) | 14:25 |
perrito666 | that is so close to occipital | 14:27 |
jw4 | perrito666: http://en.wikipedia.org/wiki/Occipital_bone ? | 14:28 |
perrito666 | true | 14:29 |
perrito666 | the OCRPTAL bone | 14:29 |
jw4 | hehe | 14:29 |
* jw4 just got it | 14:29 | |
TheMue | o/ | 15:05 |
perrito666 | OCR PTAL http://reviews.vapour.ws/r/732/ | 15:12 |
dimitern | perrito666, we should just start using "occipital" :D | 15:15 |
perrito666 | dimitern: yes, it was very hard to resist the temptation | 15:17 |
katco | one-line change and test; blocking 1.22; up for review: http://reviews.vapour.ws/r/733/ | 15:35 |
dimitern | katco, wow! | 15:40 |
dimitern | katco, a return is all it takes? | 15:40 |
katco | dimitern: i told you i had already thought of the possibility, but i ignored my own warning ;) | 15:41 |
dimitern | katco, hehe - you've got a review | 15:41 |
katco | dimitern: ty, i'll add the bug# | 15:42 |
katco | dimitern: would you be able to do a quick test of the code on your environment? or has the opportunity passed? | 15:44 |
dimitern | katco, sure, let me pull your branch | 15:45 |
katco | dimitern: ty so much :) | 15:45 |
dimitern | katco, np - it's bootstrapping now | 15:51 |
katco | dimitern: cool ty again | 15:51 |
dimitern | katco, ok, so no panic, just a few warnings about dns resolving - http://paste.ubuntu.com/9749392/ | 16:06 |
katco | dimitern: that's expected; looks good, yes? | 16:06 |
dimitern | katco, yes, however isn't the warning message a bit misleading? | 16:06 |
katco | dimitern: how so? | 16:07 |
dimitern | katco, "Status may be incorrect" ? | 16:07 |
katco | dimitern: well, it's showing that you're running on no subnets and utilizing no ports | 16:07 |
dimitern | katco, got it, right | 16:07 |
dimitern | katco, lgtm then | 16:07 |
katco | dimitern: ty for all the help; finding it, reporting it, everything :) | 16:08 |
dimitern | katco, np, thanks for fixing it so quickly :) | 16:09 |
katco | dimitern: it's much easier to troubleshoot/fix something when you know (almost) everything about it :) | 16:10 |
katco | dimitern: and the fact that i could write a unit test sped up the process as well | 16:11 |
dimitern | katco, exactly! | 16:11 |
voidspace | dimitern: ping | 16:43 |
voidspace | dimitern: you still around? | 16:43 |
dimitern | voidspace, yep | 16:44 |
voidspace | dimitern: state.State supports adding a subnet that doesn't yet exist in state or fetching one that already exists | 16:44 |
voidspace | dimitern: what I *want* is "get me this subnet - adding it if it doesn't exist" | 16:44 |
voidspace | dimitern: is it better to do that in a State method or just hand code the logic? | 16:44 |
dimitern | voidspace, too many "states" :) | 16:45 |
voidspace | hah | 16:45 |
dimitern | voidspace, does not exist in which state? | 16:45 |
dimitern | voidspace, ah, sorry | 16:45 |
voidspace | the stored state | 16:45 |
dimitern | voidspace, got you | 16:45 |
voidspace | mongo I guess | 16:45 |
voidspace | I mean, I know it's mongo | 16:45 |
voidspace | but I guess that's a better way of saying it... | 16:45 |
dimitern | voidspace, right - we can change AddSubnet to AddOrUpdateSubnet perhaps? | 16:46 |
voidspace | dimitern: ok | 16:46 |
voidspace | gah, and there's network.SubnetInfo plus state.SubnetInfo | 16:47 |
voidspace | I have a network.SubnetInfo, I need a state.SubnetInfo | 16:47 |
dimitern | voidspace, let me have a look | 16:47 |
voidspace | dimitern: I wrote the code, I only have myself to blame | 16:48 |
dimitern | voidspace, right, so the unfortunate duplication is on purpose | 16:48 |
voidspace | dimitern: I'm ok | 16:48 |
katco | dimitern: backport of same change to v1.22: http://reviews.vapour.ws/r/734/ | 16:49 |
dimitern | voidspace, state shouldn't depend on other packages, the same applies to params | 16:49 |
dimitern | katco, looking | 16:49 |
voidspace | dimitern: although state does depend on network anyway I believe | 16:49 |
dimitern | katco, ship it! :) | 16:49 |
katco | dimitern: woot! grats on quick turn around on this :) | 16:50 |
dimitern | voidspace, well it does for network.Address I think | 16:50 |
dimitern | katco, well I've seen it before lol | 16:50 |
katco | dimitern: i mean the whole bug :) | 16:51 |
katco | dimitern: wouldn't have gotten resolved, nor so quickly w/o your help | 16:51 |
dimitern | katco, ah, yeah - one of the fastest fixes lately | 16:51 |
dimitern | katco, np, glad to help | 16:52 |
dimitern | voidspace, so.. the state documents shouldn't depend on things outside of state, which might change out-of-band and lead to docs getting serialized differently | 16:53 |
voidspace | dimitern: fair enough | 16:53 |
dimitern | voidspace, we're not entirely depend-less, but let's not make it worse :) | 16:53 |
dimitern | voidspace, as for params - same issue - serialization; we shouldn't change the on-the-wire format of the api incompatibly | 16:54 |
voidspace | I'm aware of that one | 16:54 |
voidspace | for state I don't think it's a *genuine* issue though as we populate a subnetDoc from the SubnetInfo | 16:55 |
dimitern | voidspace, sorry :/ | 16:55 |
voidspace | so we're safe from "out of band changes" anyway | 16:55 |
voidspace | as we already have a layer of indirection for the actual serialisation | 16:55 |
dimitern | voidspace, yeah, that's right | 16:55 |
voidspace | adding SubnetInfo is *two layers* of indirection | 16:56 |
voidspace | :-p | 16:56 |
dimitern | voidspace, we should consult fwereade here I think | 16:56 |
voidspace | dimitern: let me work with the code and see how it feels - I'll just write a "caster function" I guess | 16:56 |
dimitern | voidspace, because not depending on packages for the sake of stable serialization format for mongo docs is one thing, but no dependencies at all might be too much | 16:57 |
voidspace | ok | 16:57 |
voidspace | and Subnet representation (network package) is a low level dependency not a structural dependency | 16:57 |
dimitern | voidspace, I think so, yes | 16:58 |
=== Viperz28 is now known as Guest82007 | ||
voidspace | dimitern: late ping | 18:25 |
dimitern | voidspace, yeah? i'm around on and off | 18:27 |
voidspace | dimitern: you added network.InterfaceInfo recently, with the intention it be used by the ProviderAPI api? | 18:28 |
dimitern | voidspace, not over the wire though - there's a params.NetworkInfo for that | 18:29 |
voidspace | dimitern: I have subnet info and ip address and am wondering how I get the extra information if that's what I'm required to | 18:30 |
voidspace | dimitern: from the subnet CIDR I'll have to fetch the NIC info | 18:31 |
voidspace | dimitern: it doesn't look like there's a provider method for this (that I can see), can I assume state will have it correctly? | 18:31 |
voidspace | for the host machine | 18:31 |
dimitern | voidspace, sorry, what extra info? | 18:33 |
voidspace | DeviceIndex, MACAddress, NetworkName, InterfaceName | 18:33 |
voidspace | etc... | 18:33 |
voidspace | everything on NetworkInfo that isn't on SubnetInfo | 18:33 |
dimitern | voidspace, hmm | 18:34 |
voidspace | let me double check there's no environ.Interfaces | 18:34 |
dimitern | voidspace, from state you mean? | 18:34 |
voidspace | dimitern: I call environ.Subnets() which returns []network.SubnetInfo | 18:34 |
voidspace | dimitern: there is an interface collection in state though I *believe* | 18:34 |
voidspace | dimitern: maybe a problem for tomorrow as it's late for me too | 18:35 |
voidspace | dimitern: I thought you might know easily... :-) | 18:36 |
dimitern | voidspace, yeah - let's call it a day :) I'm a bit dumb now I'm afraid | 18:36 |
voidspace | dimitern: it's even later for you than it is for me! Goodnight, see you tomorrow. | 18:36 |
voidspace | and goodnight everyone | 18:36 |
=== kadams54 is now known as kadams54-away | ||
=== kadams54-away is now known as kadams54 | ||
thumper | morning folks | 19:43 |
thumper | geez... you make one small thing required and suddenly heaps of tests break... | 19:44 |
perrito666 | thumper: what did you break while we were not looking? | 19:45 |
thumper | perrito666: I'm needing to add the environment uuid to the agent config | 19:45 |
thumper | perrito666: otherwise all the machine and unit agents don't know which environment to connect to | 19:45 |
thumper | perrito666: but that opened a world of hurt | 19:46 |
thumper | that I've spent about five hours unpicking | 19:46 |
thumper | I think I'm almost there | 19:46 |
thumper | then I need to write an upgrade step | 19:46 |
perrito666 | I am sure you said "this should be easy" before starting, that usually complicates things | 19:46 |
thumper | I think I did | 19:46 |
thumper | I expected it to be a few hours | 19:47 |
thumper | not days | 19:47 |
perrito666 | well, you should never jinx it | 19:47 |
thumper | and I seem to have made the provisioner tests never exit | 19:52 |
thumper | ... | 19:52 |
thumper | not sure how that happened | 19:52 |
* thumper looks up and sees two open critical bugs | 19:53 | |
thumper | WTF | 19:53 |
thumper | ok... topic is wrong | 19:53 |
perrito666 | thumper: yep, I don't know why it is not back to none | 19:54 |
perrito666 | build is unblocked | 19:54 |
perrito666 | now, this is unexpected, there is an mtv channel that actually has music | 19:58 |
thumper | haha | 19:59 |
thumper | menn0: bot is unblocked, land your pending ones if you haven't already | 20:14 |
* thumper makes a sad face | 20:20 | |
thumper | just found the most horrible fragile test | 20:20 |
thumper | but don't have time to fix it right now | 20:20 |
perrito666 | thumper: which is? | 20:21 |
thumper | func (*cloudinitSuite) TestWindowsCloudInit(c *gc.C) { | 20:21 |
perrito666 | ah, oops | 20:21 |
thumper | no shit, doing an equality test with an 850-line string | 20:22 |
thumper | any change in any cloudinit stuff means the string has to change | 20:22 |
menn0 | thumper: I landed them all yesterday (I unblocked the bot and got them all in before telling anyone else :-0) | 20:22 |
menn0 | thumper: that sounds wonderful | 20:23 |
menn0 | thumper: (that test) | 20:23 |
thumper | tech debt item: all cloudinit tests are awful and fragile | 20:27 |
=== kadams54 is now known as kadams54-away | ||
thumper | ah ha... | 20:42 |
thumper | I think I found the culprit | 20:43 |
wwitzel3 | ericsnow: ping | 20:52 |
ericsnow | wwitzel3: hey | 20:52 |
=== kadams54-away is now known as kadams54 | ||
menn0 | waigani: Ship It! | 21:34 |
waigani | menn0: sweet, thanks | 21:35 |
menn0 | ericsnow: I just reviewed your Attempt PR again (Ship It if you like) | 21:50 |
ericsnow | menn0: thanks | 21:51 |
thumper | by jove I think I may have fixed all the test failures | 22:04 |
thumper | ... | 22:04 |
* thumper runs full suite again | 22:04 | |
thumper | menn0: 31 files changed, 185 insertions(+), 87 deletions(-) to get the tests passing on requiring environ uuid | 22:13 |
=== kadams54 is now known as kadams54-away | ||
thumper | menn0: do you have a few minutes to chat? I need to talk through an issue | 22:16 |
thumper | although I think I know the answer | 22:16 |
thumper | menn0, waigani_: beware with upgrade steps landing since 1.22 was branched, we should have 1.23 upgrade steps now | 22:24 |
waigani_ | thumper: right, noted | 22:24 |
thumper | we should check any that have landed since the branch (if any) | 22:24 |
thumper | I was just thinking of this now as I'm about to write an upgrade step | 22:25 |
menn0 | thumper: hi, sorry just noticed this. was deep in thought. chat now? | 22:39 |
thumper | 2 minutes, booking a shuttle for tomorrow | 22:40 |
thumper | menn0: standup hangout? | 22:43 |
menn0 | thumper: see you there | 22:43 |
=== ChanServ changed the topic of #juju-dev to: https://juju.ubuntu.com | On-call reviewer: see calendar | Open critical bugs: 1411024 | ||
sinzui | thumper, wallyworld_ can you get someone to look into the windows regression reported in bug 1411024 | 23:35 |
mup | Bug #1411024: Win client/agent cannot bug built because of backup deps <ci> <regression> <windows> <juju-core:Triaged> <https://launchpad.net/bugs/1411024> | 23:35 |
davecheney | thumper: is there an agenda for our feb sprint ? | 23:35 |
davecheney | menn0: so what i'm hearing is "no, don't install the latest kernel update if you want wifi to work" | 23:46 |
perrito666 | sinzui: I am the culprit, I'll fix it | 23:50 |