[00:04] veebers: adding State to cloudContainer would have been the wrong choice - we deliberately are not adding State anywhere else new as it was wrong to do it that way way back when [00:05] veebers: i am not 100% on adding a new method to Unit though - that is bloating the Unit entity. we are in state package, can do stuff without needed to add that method to Unit itself [00:06] the general design principal is to keep the domain entities small and use the onion modelto extend their functionality [00:09] wallyworld: it's not a new method on Unit, it's a new type, func newUnitCloudContainer(st *State, name string) *UnitCloudContainer. It has set/get status [00:10] ah right, that's great then :-) [00:17] anastasiamac: is this another instance of the intermittent failure you're working on? http://ci.jujucharms.com/job/github-check-merge-juju/3328/testReport/junit/github/com_juju_juju_worker_provisioner/TestPackage/ [00:17] * anastasiamac looking [00:18] thanks [00:18] babbageclunk: it was arecently added one... i'll add it to my today's list.. thnx, babbageclunk :D [00:23] wallyworld: when you have a moment, pushed updates to https://github.com/juju/juju/pull/9081. Just need to add tests for UnitCloudContainer, a little apprehensive to add another ConnSuite suite though :-P [00:24] it's only another test to an existing suite i would think [00:25] wallyworld: yeah should be able to wrangle it in. Partly tested by side effect of being used in other tests too [00:26] anastasiamac: ok, thanks! [00:27] veebers: there's still one of these fmt.Sprintf("unable to get events for PVC") [00:27] in an errors.Annotate [00:27] wallyworld: argh no way sorry [00:28] sigh, visiting that buffer that's where my cursor was; must have been distracted before I could fix it and then moved to something else [00:29] all good [00:34] * anastasiamac is impressed that veebers has more than one buffer... i only have one buffer and one cursor [00:35] I have a big screen that I fill with many frames. Means I don't have to remember anything, its all there to see :-P I need a meat-memory upgrade [00:36] ah babbageclunk, that issue is the same one i described in standup - my test is killing worker whenever it wants... i have a fix to do it in more controllled manner... i'll pr and will probably ask u or wallyworld to review... [00:37] anastasiamac: ok, thanks for taking a look. [00:42] veebers: got time to jump in HO? [00:44] is there an ide that provides autocomplete for charm writing? [00:45] Doctor_Nick: you mean for writing Python hooks etc? so auto complete for charm helper and reactive libray functions? any good Pythin IDE will do that [00:45] like PyCharm etc [00:53] wallyworld: that other PR I mentioned? https://github.com/juju/juju/pull/9164 [00:53] ok [00:58] thanks! [01:01] babbageclunk: why can't we remove the core store's ExpireLease method? [01:15] wallyworld: It's part of the lease.Store interface, and the mongo store needs it, since it can't respond to the time updates in the same way. [01:16] (sorry, was grabbing a lunch) [01:17] babbageclunk: ah right ok. i guess invalid err is ok, although we do tend to use NotSupported in these cases I think [01:20] wallyworld: I figured invalid is best here since the lease manager is explicitly expecting that. [01:20] wallyworld: I do now, back from lunch [01:20] ok, i guess the raft side never calls it anyway [01:20] veebers: ok, see you in HO [01:51] a credentials file for google cloud is giving me grief. i need key 'client-id' but when i include it i get the following with add-credential: [01:51] ERROR parsing credentials file: credentials.google.juju-gce-sa.client-id: expected string, got float64(2.0651723337507478e+20) [01:52] the line looks like this: [01:52] client-id: 206517233375074786882 [01:52] when i use interactive method and point to my yaml file it works fine. any ideas? [02:01] pmatulis: you need quotes around it [02:01] client-id: "203..." [02:01] pmatulis: weird - you could try putting quotes around the value. Not sure why it would work in one context and not the other though [02:01] oops [02:24] anastasiamac: the gce/azure/aws invalid credential detection, is that 2.4 or 2.5? [02:27] wallyworld: ^^ do you know? [02:28] 2.5 i think [02:28] some of the work landed in 2.4 [02:28] but only really functional in 2.5 [02:28] ok, ta [02:39] veebers, babbageclunk, ok, that seemed to be it. in the cloud-city configs this key's value is not quoted... [02:40] pmatulis: hmm, interesting. Perhaps jujupy does the quoting when it reads then regurgitates that data into the pre-primed JUJU_DATA [02:40] dunno [02:58] wallyworld: PR for s390x unit test failure: https://github.com/juju/juju/pull/9165 [02:58] gr8 looking [02:59] That one's against develop - it needs to be fixed in 2.4 too, but the test data's been moved, so I'll do a separate PR for it. [02:59] babbageclunk: ty, thought it would be simple like that :-) [03:00] :) [03:00] babbageclunk: nice! [03:31] anastasiamac: Did your fix for the not-a-valid-zip-file failure land? [03:32] I've just seen it on a test run [03:32] http://ci.jujucharms.com/job/github-merge-juju/1087/testReport/junit/github/com_juju_juju_api/TestAll/ [03:46] babbageclunk: wasn't a fix but improvemnt.. i'll take another look.. wasnt easy to repro locally [03:47] anastasiamac: where you able to repro locally at all? [03:50] oh, right [03:52] anastasiamac: the bug probably shouldn't be fix-committed then, should it? https://bugs.launchpad.net/juju/+bug/1788401 [03:52] Bug #1788401: Intermittent test failure: cannot upload charm ... not a valid zip file [03:52] veebers: no [03:52] babbageclunk: no [03:52] ack [03:52] babbageclunk: especially if we now have a new accourence :( [03:54] sorry :( [03:55] babbageclunk: ? [03:55] no need to be sorry, especially if we can figure repro steps \o/ [04:03] wallyworld: I don't think Im' being blind, I can't see a way to ignore the status collection in the migration export tests. I see a feature flag for StrictMigration; but that changes the test as a whole [04:04] babbageclunk: perhaps you know? Is there a way in the migration export tests to ignore a collection that gets left behind? [04:04] veebers: sorry, was caffinateing [04:04] let me check [04:05] no worries, that's always important! [04:05] oh just thinking about it - the test would be expecting some status items to be exported, but we are adding extra ones [04:06] so we can't ignore the entire collection right? [04:06] we may need to check in th test that all is done except for container status values [04:06] which test is it? [04:06] i'm just hand waving atm [04:08] wallyworld: all the MigrationExportSuite tests (well 11, not sure if that's all) [04:08] veebers: what do you mean a collection that gets left behind? [04:08] no, thats just some of the 49-ish tests [04:09] babbageclunk: as in we're not (yet) doing anything with the cloud container status collection (that's newly been added). [04:09] I handwaved over it by doing the extract but doing nothing with the data, but that seems a bit iffy [04:10] veebers: so you export it but don't import it yet? [04:10] veebers: can you pastebin a failure [04:10] wallyworld: https://pastebin.canonical.com/p/zrXypw6XtW/ [04:11] oh, just reminding myself about the StrictMigration flag. [04:12] babbageclunk: this is what I had proposed: https://github.com/juju/juju/pull/9081/files/#diff-d02499513782195ec1b9d18198275e62 [04:13] veebers: right, so there are status entries for this new entity which are being read when the exporter reads status, but that you're not using in the export yet. [04:14] veebers: so for now, i think you could consume the container status values but not add to the juju/description/model [04:14] with a big todo [04:14] babbageclunk:note the test is failing as I removed that change per PR comments [04:14] I don't think there's any kind of "these ones are alright" ignore-list at the moment. [04:14] i guess that's what you were doing before [04:14] wallyworld: so add back what was there [04:14] sorry yeah [04:15] wallyworld: heh cool, I'm on the same page. no worries [04:15] i didn't realise we needed to do that [04:15] i thought we had the mechanism to cope with it [04:15] aye, I'm sure for other migration stuff there is a whitelist/ignore list so was expecting the same here [04:15] but we don't for this scenario [04:15] yeah, we ignore whole collections or entity attrs [04:15] consuming them with a todo sounds like the best bet - we have exclude lists for new collections, but these are status records, so they're always loaded. [04:16] I suspect it'll need a bigger todo than // TODO(caas) - Actually use the exported cloud container details. [04:16] wallyworld: Is this failure the caasfirewaller one you and thumper were looking at? http://ci.jujucharms.com/job/github-check-merge-juju/3331/testReport/junit/github/com_juju_juju_worker_caasfirewaller/TestAll/ [04:17] veebers: how big a TODO have you got? [04:17] babbageclunk: just what I posted just now ^_^ [04:18] babbageclunk: not that one specifically IIANM. it would be watcher related though. how often is it failing? [04:18] In a nutshell, is this grabbing collection data from a source unit and populating the new target unit with this data? [04:18] as part of an application migration stage [04:20] veebers: https://pastebin.ubuntu.com/p/qZzcJQVn99/ [04:20] babbageclunk: hahahah I *love* it [04:20] it's grabbing the status data from the mgo model and populating an in memory description [04:21] someone has too much time on their hands [04:21] wallyworld: seems to be failing after about 5 runs under stress-race [04:21] sigh ok [04:21] wallyworld: I just used artist-mode - more emacs magic! [04:21] wallyworld: I'll have a go if no-one's already working on it. [04:21] how did i know it would be %#!W#!$W emacs [04:22] if you want that would be gr8 [04:22] babbageclunk: there is also T̖͇̠O̵͉D͏O̸:̳ [04:22] (as per the Zalgo thing you showed me the other day) [04:22] lol [04:23] Perhaps we should run all our comments through the zalgo generator [04:54] wallyworld: I'm just doing some manual functional testing now of those changes [04:54] sounds good [04:57] wallyworld: will show-status-log ever show the current status or must it be status leading up to but not including current status [04:57] i thought it ahows the current status [04:58] wallyworld: oh ok, I'll double check [05:26] wallyworld: super-quick one: https://github.com/juju/juju/pull/9167 [05:26] sure [05:27] wallyworld: Thanks! I've made it against 2.4, is someone going to do a merge from 2.4 to develop sometime soon? [05:27] babbageclunk: nice. but is there still arace? [05:27] we normally merge about once a week [05:28] wallyworld: I don't think there's still a race? [05:28] Oh, but I was only running that one test, maybe elsewhere in the suite? [05:28] babbageclunk: oh nevermind, i think it's ok. the test sends 2 app change events, and we match those with 2 errors right? [05:29] yeah, that's right [05:29] i think that;s ok [05:29] It should be - the stub's all locky [05:59] wallyworld: babbageclunk: PTAL https://github.com/juju/juju/pull/9168 - promised provisioner tests [06:00] ok, just finishing off something, then will llook [06:01] nws [06:20] anastasiamac: i had a question about the test coverage i left in the PR [06:39] wallyworld: what coverage do you think is missing? [06:39] that which is in the deleted test [06:40] checks that only transient errors cause retriesl that retries are not done for non transient errors etc [06:40] i couldn't see where the delete test code was replaced [06:41] wallyworld: whether the errors are transient or not is not determined by the task. it's deterimned in apiserver layer [06:41] it has never been covered in this test. [06:41] Data: map[string]interface{}{"transient": true}, [06:41] that line from the deleted test [06:42] set up an error as being transient [06:42] and the test checked that m3, m4 etc were done properly [06:42] wallyworld: that's in setting machine status in JujuCOnnSuite which is read by apiserver not provisioner task in worker package [06:42] right but the test is deletd and there's no replacement test cover is there? [06:43] again, this is a functinality that needs to be covered in apiserver not in worker [06:43] right but it's not [06:43] lemme check if it exists there [06:43] so we are not covered [06:43] unless i can't find it [06:43] maybe i just had a boy look [06:45] wallyworld: https://github.com/juju/juju/blob/develop/apiserver/facades/agent/provisioner/provisioner_test.go#L434 [06:45] etc... [06:45] and even as expected there is one to test with permission at line 489 [06:46] whether the status.error is transient or not was never a concern of worker.provisioner_task but of apisever [06:47] we had to set up the status with error in previous test because we were driving through the whole stack with jujuconnsuite and this was the only way to get machines with transient errors into the task [06:47] so, exisitng coverage is still there [06:48] anastasiamac: makes sense, ty. sorry i was blind freddy. in tnhe newer code, we use MachinesWithTransientErrors() result and check that retires happen on those results i think then? [06:52] wallyworld: yes :) [06:56] in fact, i've run cover against clean develop in this package (78.5% of stmts covered) and against my branch (78.5%) and m happy with the result :) [08:40] is there any documentation around updating aws instance types? [08:41] running "go generate ." inside cloud yeilds no changes [08:46] s/yeilds/yields [09:16] yay - found it [09:17] I'll write a discourse about this [13:37] manadart, can you explain breifly why the worker keeps state for both `completedUnits` and `preparedUnits` in your open PR? [13:38] manadart, I thought we had combined the two concepts... And not knowing any better I thought I'd ask before you EOD [13:39] externalreality: It is a symptom of how we designed the API - rather copying state locally, we ask questions. Two of those are: what units are in the [13:39] "prepare completed" state? and what units are in the "completed" state? [13:40] In practice, the workflow means that only one should ever actually data in it. [13:40] *have data in it. [13:42] externalreality: We *could* use a single collection and output it in the report with a different key based on the machine status. That might be better. [13:42] manadart, ah, I see so you are just sorting them rather than having to attach the status and check it each time. Roger. [13:42] Ja. [14:55] stickupkid: pr 9172 is reviewed/approved. just had a question in the tests. [15:01] hml: i responded with a question [15:04] stickupkid: unfortunately i haven’t played with ec2 image types either. perhaps a bug to look into this later? to get more data. or address this now? [15:06] hml: this is going to land on 2.3.9 so I'd rather address now maybe? [15:06] jam: do you have any time to CR this PR - https://github.com/juju/juju/pull/9172 [15:06] stickupkid: i was on the fence, mostly because it’s been this way for a while. but fixing while we’re here is best [15:06] looking [15:14] jam: i'm concerned our default instance is changing too much [15:14] hence why I'm a bit suspect on landing this one [15:14] stickupkid: heh. that was my comment as well [15:14] jam: ha [15:15] jam: what's the best course of action do you think? [15:16] stickupkid: understand why the default is changing. I thought we looked up the default based off of its specs, but that doesn't seem to be true. [15:23] jam: "At the time of writing, this will choose the cheapest non-burstable instance available in the account/region." [15:23] stickupkid: also checking what the price schedule is, and what the tradeoffs are [15:29] jam: https://pastebin.canonical.com/p/Nf2SFXSC99/ [15:29] jam: interestingly, the m3.medium is deprecated [15:30] stickupkid: isn't it t3 not m3 ? [15:31] jam: so the m3 was the default (the bottom one) and not the t3 has become the default (the top one) [15:32] stickupkid: t2.medium was the old default, I'm pretty sure [15:32] stickupkid: line 1694 of your diff is -t2 + t3 [15:33] jam: ah sorry, you're right [15:33] jam: my bad [15:34] stickupkid: it seems they released a new series of 't' types, and I would be curious to know what the various stats are [15:34] jam: the t series just reduced the cost and that was the winner [15:35] stickupkid: so 't' are all 'burst' types, so you get a number of credits, and that allows your CPU to go fast, until you run out of credits, but they are a lot cheaper. [15:36] but it is a reasonable tradeoff for Juju controllers that aren't heavily loaded. [15:36] but t2 vs t3, I don't know the differences. [15:36] jam: memory seems like a large tradeoff [15:36] jam: do we know what the average controller usage is for a default controller? [15:37] ^ in terms of memory [15:38] stickupkid: so I'm out of time to work on this tonight, but gathering the information of what is available and sending a discourse/email is probably the way forward [15:38] jam: sure, sounds like a plan [20:31] Morning all o/ [20:52] morning veebers! [20:53] hey babbageclunk o/ How's thing? [20:53] things even [20:53] Oh I only have the one, it's good. [20:53] excellent! [20:53] Singing in a concert tomorrow [20:54] babbageclunk: oh wow awesome! Which concert? [20:54] morning [20:54] veebers: https://www.orchestrawellington.co.nz/events/requiem [20:54] babbageclunk: nice, good luck [20:55] cheers" [20:55] ! [20:55] I found a nice choir, but have no time [20:55] so... not doing it [20:55] And they have no tim [20:55] babbageclunk: hey, you'll find this amusing, I have a new song I'm doing [20:55] babbageclunk: You're welcome [20:56] I think radionz is broadcasting it while we're performing [20:56] thumper: oh awesome! [20:56] Such a good song [20:56] yeah, it is [20:56] although I'm singing it at the top of my range, so it isn't comfortable yet [20:57] Yeah I guess those high bits are pretty high [20:57] babbageclunk: you're on fire this morning :-) [20:57] babbageclunk: awesome, all the best for the concert! [20:58] B) [21:46] rick_h_, wallyworld: was talking with jam last night about the windows CI failures on develop [21:47] the path length? [21:47] it looks like the vendoring broke the windows builds due to the path length [21:47] yeah [21:47] go 1.11 fixes this [21:47] so... [21:47] let's move! [21:47] should we move everything? [21:47] yeah, task for sprint [21:48] would be a good time to coordinate [21:48] +1 [21:49] rick_h_, externalreality_: was thinking a bit more about the all watcher and the series upgrade stuff [21:49] if we want the GUI to be able to respond and show the events it needs to be in the all watcher [21:49] and I do think a new delta type probably makes sense there [21:50] thumper: wallyworld so I talked to hml about using her CI day tomorrow to see what does or doesn't compile with 1.11 so we'll have a list to start from next week [21:51] i think it's just windows right? [21:51] sweet [21:51] oh, 1.11 you said [21:51] nevermind [23:59] babbageclunk: remind me, what's the bootstrap syntax for specifiying the lgacy lease feature flag?