[00:55] oh shit [00:55] * thumper sadface [00:56] wallyworld: juju run seems to be broken for hosted environments [00:56] yay [00:56] in master? [00:56] I recall having a different system identity created for the hosted environments [00:57] yes master [00:57] I have a hosted env called to-move [00:57] juju ssh -e to-move 0 [00:57] works [00:57] juju run -e to-move --machine 0 'ls' [00:57] does not [00:59] damn [01:05] wallyworld: probably not the most thorough review I ever did, but your PR is good to go [01:06] axw: tyvm, a lot of it is boilerplate - moving stuff around rather than too many renames. we can fix things if we find them === natefinch-afk is now known as natefinch [02:45] first migration demo is a success [02:45] all seems to work exactly as designed [02:45] as long as you only have simple machines and no services or storage [02:45] or networks [02:45] :) [02:45] and manual hackery to update agents config [02:51] * thumper wonders why the 2UTC call is at 3UTC [02:56] wallyworld: how would you feel about me adding the -o flag at least temporarily? existing bootstrap tests depend on certain config being set (e.g. default-series, agent-stream) [02:57] axw: sure, go for it [02:57] wallyworld: I can back it out once we have a --config [02:57] np [02:57] axw: we'll just let rick know it's a stopgap [02:59] wallyworld: I wasn't planning on documenting it, but can do [02:59] axw: if it's not needed for the demo, then don't :-) [03:00] wallyworld: no, just for the unit tests [03:00] ok, np, no doco needed [03:25] can I get a quick review? http://reviews.vapour.ws/r/3676/ [03:25] It's adding some logging in an attempt to track down some recent CI failures [03:33] * thumper is done done [03:33] laters [03:48] wallyworld: fwiw, you'll probably want to pull that PR into the api-command-rename branch as well, as it is also seeing the "connection is shut down" failures. [03:49] cherylj: i do need to merge master, will do that after the latest stuff lands [03:50] wallyworld: could you take a quick look at that RB? http://reviews.vapour.ws/r/3676/ [03:50] oh [03:50] n/m [03:50] cherylj: i just hit merge :-) [03:50] You'll need to JFDI it [03:50] I can do it [03:50] ah yeah [06:14] wallyworld: finally, http://reviews.vapour.ws/r/3677 [06:15] ok [06:44] axw: one possibly stupid question to answer [06:47] wallyworld: ta [06:47] np [06:48] wallyworld: which one's the possibly stupid question? [06:48] axw: so many to choose from :-) the one about why region is set to cloud name [06:48] wallyworld: comment about setting region? [06:48] ok [06:48] yup [06:48] wallyworld: that's a hack for lxd [06:48] wallyworld: I think we want to detect the region name instead [06:48] region should be local in that case [06:49] localhost [06:49] and only done for lxd and not all clouds [06:49] wallyworld: sure, the lxd provider would return a region of "localhost" when asked to detect. I'll write a TODO to fix this. [06:49] ta [06:50] wallyworld: well, I don't want to go special-casing things like we did for the local provider [06:50] yep, sorry, was just using a general statement [06:50] wallyworld: I think there's value in having detection for other things too, e.g. to pick up OS_REGION... I'll add a TODO and we can go over it in more detail later [06:50] implementation tba sort of thing [06:50] yep [06:51] yeah, todo is good, allows readers of code to grok what is wip etc [06:51] wallyworld: I was running tests and found out why the cmd/juju/status tests are so insanely slow. the status filtering code is doing IP resolution (and way more than it even needs to) [06:52] oh dear [06:52] wallyworld: it shouldn't need to at all, since we store IPs [06:52] let's add a card to fix that [06:52] yep [06:52] ty [07:08] Bug #1539428 opened: cmd/juju/status: status filtering performs IP resolution of patterns
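A minimal sketch of the fix direction axw and wallyworld settle on for bug #1539428: match status filter patterns against the names and addresses Juju already stores instead of resolving each pattern over DNS. The unitStatus type and matchesPattern helper are hypothetical, not the actual cmd/juju/status code.

```go
package main

import (
	"fmt"
	"path"
)

// unitStatus is an illustrative stand-in for the data status already has on
// hand; the addresses come from state, so no DNS lookup is needed.
type unitStatus struct {
	Name      string
	Machine   string
	Addresses []string
}

// matchesPattern reports whether a unit matches a status filter pattern by
// comparing the pattern against the unit name, machine ID, and stored
// addresses. Crucially, it never calls net.LookupHost/LookupAddr.
func matchesPattern(u unitStatus, pattern string) bool {
	candidates := append([]string{u.Name, u.Machine}, u.Addresses...)
	for _, c := range candidates {
		if ok, err := path.Match(pattern, c); err == nil && ok {
			return true
		}
	}
	return false
}

func main() {
	u := unitStatus{Name: "dummy-sink/0", Machine: "1", Addresses: []string{"10.0.3.17"}}
	fmt.Println(matchesPattern(u, "dummy-sink/*")) // true
	fmt.Println(matchesPattern(u, "10.0.3.*"))     // true
}
```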
=== ashipika1 is now known as ashipika [09:21] fwereade_: any idea how a unit might not have a CharmURL (i.e. why does state.Unit.CharmURL return a bool) ? [09:25] axw: does the OpenStack provider have the storage hook support? [09:25] icey: yes, it supports block storage via cinder [09:26] awesome, thanks axw! [09:26] nps [09:30] axw: what version is required for that? [09:31] icey: 1.25 [09:31] icey: something going wrong? [09:32] nope, just discussing what we can do with testing for the ceph charms, wanting to get some more advanced storage stuff tested axw [09:33] icey: okey dokey. FYI, we have storage support in AWS, MAAS, OpenStack, Azure, and GCE [09:33] all from 1.25 onwards [09:33] great, thanks axw! [09:35] dimitern: ping [09:41] voidspace, pong [09:42] dimitern: soooo... [09:42] dimitern: I'm still working on a test in MachineSuite, in cmd/jujud/agent/machine_test.go [09:42] dimitern: and PatchValue isn't working! [09:43] dimitern: I'm doing exactly what other tests are doing (as far as I can see) [09:43] but with the patch in place the original function is still called (if I put a panic in the original function and run just the new test it panics) [09:43] voidspace, hmm - are you doing it in SetUpSuite? [09:43] dimitern: not in SetUpSuite, in the test [09:43] dimitern: https://github.com/juju/juju/compare/maas-spaces...voidspace:maas-spaces-networking-api10#diff-918e88c5445d929c38db9bf4f0a85cc8R1014 [09:43] voidspace, looking [09:44] + s.PatchValue(&newDiscoverSpaces, newDiscoverSpaces) [09:44] it's called from startEnvWorkers in machine.go [09:44] https://github.com/juju/juju/compare/maas-spaces...voidspace:maas-spaces-networking-api10#diff-f5ec9ed405cc8f3a833355afdc629bd3R1257 [09:44] what are you patching? [09:45] dimitern: that's an alias for discoverspaces.NewWorker [09:45] dimitern: this is exactly what many other tests do for patching out worker creation [09:45] as far as I can tell identical anyway, I'm obviously missing something [09:45] voidspace, well, the func you're patching it with has the same name, so aren't you doing a no-op? [09:45] hah [09:46] dimitern: yep [09:46] dimitern: thank you [09:46] voidspace, ;) [09:46] the patch is applying locally, I'm shadowing it [09:46] voidspace, np [09:46] dimitern: needed another set of eyes [09:46] I burned more than an hour on that yesterday [09:46] dimitern: thanks [09:47] voidspace, I'll do that later today for some reviews btw [09:47] dimitern: heh, no problem [09:47] voidspace, I know I said I could use some help yesterday, but it turned out it's simpler to do as one PR than to split it artificially [09:48] hmmmm, no, still doesn't seem like my func is being called [09:48] let me put that panic back in [09:49] ok, it doesn't panic so the fake is in place at least
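dimitern's catch above is easy to reproduce: the test passed the package-level variable as its own replacement, so the PatchValue call was a no-op and the real discoverspaces.NewWorker kept being used. A self-contained sketch of the mistake and the fix; the names and the tiny patchValue stand-in are illustrative, not the real juju/testing helper or machine_test.go code.

```go
package main

import "fmt"

// newDiscoverSpaces stands in for the package-level variable that aliases
// discoverspaces.NewWorker in machine.go.
var newDiscoverSpaces = func() string { return "real worker" }

// patchValue mimics what a test suite's PatchValue does for this variable
// type: swap in a replacement (a real suite would also restore it on teardown).
func patchValue(target *func() string, replacement func() string) {
	*target = replacement
}

func main() {
	// The bug: the replacement IS the current value, so nothing changes.
	patchValue(&newDiscoverSpaces, newDiscoverSpaces)
	fmt.Println(newDiscoverSpaces()) // still "real worker"

	// The fix: supply a genuinely different fake so the test can observe it.
	patchValue(&newDiscoverSpaces, func() string { return "fake worker" })
	fmt.Println(newDiscoverSpaces()) // "fake worker"
}
```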
[09:54] voidspace, why not pass a channel to newDiscoverSpaces and close it when done, instead of a mutex? [09:56] dimitern: I only need to wait for discovery if discovery was actually started [09:56] dimitern: so with a bool, it defaults to false and we only set it to true when discovery starts [09:56] dimitern: not sure there's a clean way of doing the same with a channel [10:00] voidspace, I'm trying to find a good example.. [10:00] dimitern: I'd need a way to tell if a channel has been started but not yet closed [10:00] http://dave.cheney.net/2013/04/30/curious-channels [10:01] dimitern: frobware: dooferlad: standup? [10:01] Friday meeting room [10:01] voidspace, you can make it buffered, so you write true to it once it starts and wait to read from it in a select later [10:01] voidspace, omw [10:01] dimitern: that doesn't sound any simpler than a bool and a mutex
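For reference, the pattern behind the "curious channels" link dimitern posted: a closed channel broadcasts "this happened" to any number of waiters without a mutex, which is also the idea behind the gates mentioned later in the log ("upgrade-steps-gate", the proposed "discover-spaces-started-gate"). A minimal sketch with invented names, assuming one goroutine that may or may not start discovery; whether this is actually simpler than the bool+mutex voidspace ended up with is the judgement call being debated above.

```go
package main

import (
	"fmt"
	"time"
)

// discovery is illustrative only -- not the real discoverspaces worker.
type discovery struct {
	started chan struct{} // closed once discovery begins
	done    chan struct{} // closed once discovery completes
}

func newDiscovery() *discovery {
	return &discovery{started: make(chan struct{}), done: make(chan struct{})}
}

func (d *discovery) run() {
	close(d.started) // broadcast "started" to any number of waiters
	time.Sleep(10 * time.Millisecond)
	close(d.done) // broadcast "finished"
}

// waitIfStarted only blocks on completion if discovery actually began,
// mirroring "I only need to wait for discovery if discovery was started".
func (d *discovery) waitIfStarted() {
	select {
	case <-d.started:
		<-d.done // it started; wait for it to finish
	default:
		// never started; nothing to wait for
	}
}

func main() {
	d := newDiscovery()
	go d.run()
	time.Sleep(time.Millisecond)
	d.waitIfStarted()
	fmt.Println("discovery finished (or never started)")
}
```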
[10:58] rogpeppe1, sorry I missed you [10:59] fwereade_: np, i think i worked it out anyway [10:59] rogpeppe1, a unit doesn't have a charm url until it knows what charm it's actually going to run because it's downloaded and verified [11:00] fwereade_: yeah, i figured that [11:38] morning all [11:46] * perrito666 visits an office and fixes their network [14:07] dimitern: hey, is this meeting still happening this afternoon? [14:14] voidspace, the one in ~15m ? [14:14] I think so [14:15] dimitern: I thought frobware and dooferlad would be invited, but they're not on the list so I assume so too :-) [14:16] dimitern: my test is complete and passes [14:17] dimitern: changing it to use a channel now, which does change the test - but shouldn't take long [14:17] ah, no it doesn't pass [14:17] grrr, it did a minute ago [14:17] nearly there anyway [14:18] ah, admin user can't log in [14:18] how odd [14:18] "invalid entity name or password" [14:18] have to find someone who can log in [14:18] voidspace, right - it's more about the plan for merging into master - it's not necessary to go if you don't want to [14:19] voidspace, ah :) you'd likely need to also create the user you're trying to use [14:19] and set password [14:19] grrr [14:19] this is from s.AdminUserTag(c) [14:19] I assumed that user existed [14:19] it's probably the wrong password [14:20] dimitern: hah, the password is "dummy-secret"!!! [14:20] now it passes... [14:20] voidspace, \o/ sweet! [14:30] dimitern, is there a merge meeting taking place that I should attend? [14:31] frobware: https://plus.google.com/hangouts/_/canonical.com/maas-spaces?authuser=0&hceid=YWxleGlzLmJydWVtbWVyQGNhbm9uaWNhbC5jb20.5fqsbuut5c9fh29v70q11o7vt4 [14:31] frobware, it's scheduled, but I don't know more - I'm joining now [14:44] dimitern: frobware: PR for merge of latest master onto maas-spaces [14:44] http://reviews.vapour.ws/r/3681/ [14:44] dimitern: the NetworkManager job has gone now too, right? [14:45] that came back across in the merge from master and I deleted it [14:46] voidspace, it's not gone, but it's no longer used to decide whether the networker should run or not (as it's gone as well) [14:46] dimitern: interestingly on maas-spaces JobManageNetworking isn't in cmd/jujud/agent/machine.go [14:46] ok [14:46] voidspace, that will take some time to read btw ... will have a look through the 15+ pages a bit later [14:46] dimitern: I deleted it from machine.go after the merge from master which had added it back [14:47] heh [14:47] dimitern: most of it was automatic - except cmd/jujud/agent/machine.go [14:47] and the networker changes which I had to redelete [14:49] voidspace, ok, it should be easier then [14:49] dimitern: if we're dropping support for maas < 1.9 then we can drop some of the legacy codepaths [14:49] and just fail if maas < 1.9 [14:53] dimitern, voidspace: part of me says we should understand why 1.8 doesn't work - CI-wise. [14:53] frobware, agreed [14:53] voidspace, it's too early to do that properly I think [14:54] understanding is good [14:54] but we have until monday to merge... [14:56] voidspace, this could be just ... take maas-spaces and bootstrap on 1.8, even if it's just to say "aha..." [15:02] voidspace, were there any conflicts? [15:02] natefinch: standup? [15:02] frobware: changes to the networker worker and api, which we've deleted (so easy to resolve) [15:02] ericsnow: coming [15:02] frobware: and changes around starting the networker in cmd/jujud/agent/machine.go [15:02] frobware: I just deleted all references, dimitern can check if I did it right. [15:03] frobware: everything alright at home - you still able to go to FOSDEM and charm conf? [15:23] voidspace, have you seen the changes around the machine agent? [15:24] voidspace, it looks like with them it's much simpler to check the order of started workers, as well as add a "discover-spaces-started-gate" like the "upgrade-steps-gate" to signal the import started [15:44] dimitern: I haven't [15:44] dimitern: I guess I should look and I'll have to change my test :-) [15:44] dimitern: maybe have to change how this worker is started [15:45] dimitern: let's get the merge in [15:45] dimitern: I'm taking a late lunch [15:46] voidspace, well, I think it will be worth it - it does look much nicer and easier [15:46] dimitern: great [15:49] voidspace, *whew*.. finished with that monstrous diff - it looks good [15:53] dimitern: thanks [15:57] Bug #1539656 opened: leadership dependency failures in CI deploy tests [16:33] Bug #1539684 opened: storage-get unable to access previously attached devices === marcoc|airplane is now known as marcoceppi === urulama is now known as urulama__ [19:55] natefinch: looks like I've addressed everything except the context patch review (and I'm working on that now) [19:57] ericsnow: cool [20:05] ericsnow: btw, thanks for the pointer on the new doc comments in the names package. That helps clarify things a lot... and it sounds like we agree entirely about how they should be used. I really wish there was only one string representation of a tag... having two just seems like it's asking for trouble, especially when the two functions are called Id() and String(). [20:05] ericsnow: "new" (March of last year ;) [20:05] natefinch: :) [20:44] ericsnow: in the persistence layer, I was using fmt.Sprintf("resource#%s#%s#%s", serviceID, id, unitID) for the unitresource's id... but do you think we really need the service ID? maybe just fmt.Sprintf("resource#%s#%s", unitID, id)? [20:45] natefinch: given the dependence of the unit ID on the service ID, yeah, we probably don't need it [20:48] ericsnow: would it be evil to consolidate SetUnitResource and SetResource into a single SetResource(id, ownerID string, res resource.Resource)? [20:48] ericsnow: since the only difference now is the name of the function and the name of the argument [20:48] natefinch: that would make it too easy to get it wrong [20:49] ericsnow: fair enough, I can make two exported functions, but send them to the same internal functions [20:49] natefinch: the distinction still matters even if they do the same thing [20:49] natefinch: sounds good
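A minimal sketch of where that persistence-layer exchange lands: a composite doc ID without the redundant service segment, and two exported setters that funnel into one internal helper. The resource struct, the helper names, and the "wordpress/0"/"spam" values are invented for illustration; the real resource persistence code may differ.

```go
package main

import "fmt"

// resource stands in for resource.Resource in this sketch.
type resource struct {
	ID string
}

// resourceDocID builds the composite document ID. For a unit resource the
// owner is the unit ID (e.g. "wordpress/0"), which already embeds the service
// name, so a separate service ID segment isn't needed.
func resourceDocID(ownerID, resID string) string {
	return fmt.Sprintf("resource#%s#%s", ownerID, resID)
}

// SetResource and SetUnitResource keep distinct names and argument names so a
// caller can't silently pass a unit ID where a service ID belongs, but both
// delegate to the same internal function.
func SetResource(serviceID string, res resource) error  { return setResource(serviceID, res) }
func SetUnitResource(unitID string, res resource) error { return setResource(unitID, res) }

func setResource(ownerID string, res resource) error {
	fmt.Println("storing doc", resourceDocID(ownerID, res.ID))
	return nil
}

func main() {
	SetUnitResource("wordpress/0", resource{ID: "spam"}) // storing doc resource#wordpress/0#spam
}
```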
[21:02] natefinch: I've responded to all the context-related review comments and addressed nearly all of them [21:02] natefinch: would you mind running through them real quick? [21:02] ericsnow: sure, I'll go look right now [21:02] natefinch: I should be able to land the branch now [21:03] ericsnow: sweet [21:03] natefinch: we can resolve the more philosophical differences once we have time to breathe [21:04] marcoceppi: ping [21:32] natefinch: BTW, just ran through a manual test and it worked correctly on the first try :) [21:43] ericsnow, that sounds encouraging :) [21:44] heh, I've been doing manual tests for a few days... other than those two bugs we found earlier in the week, yeah, it all looks good [21:44] alexisb: it was working already, but we've since cleaned up the code we'd hastily written and added tests :) [21:44] nice that it's surviving refactoring and edits, though :) [21:44] yep [21:52] fwereade_: do you happen to be around? [21:53] natefinch: I'm going to land the patch and leave those extra issues open on the context review as a reminder [21:58] mm, at a status level we seem to lack information regarding whether a relation is a peer or normal relation [21:59] ericsnow: that's fine [22:00] ericsnow: I'll be back tonight and will rebase my patches off of the feature branch and finish up the review changes [22:00] natefinch: k [22:04] Bug #1539785 opened: lxd provider leaks resources [22:20] cherylj: thanks for creating that doc. the latest run has the restore tests passing. as you found, the CI scripts need tweaking and that should address some of the other failures, so that branch is looking good for next week [22:21] wallyworld: I am worried about this bug, though: https://bugs.launchpad.net/juju-core/+bug/1539656 [22:21] Bug #1539656: leadership dependency failures in CI deploy tests [22:22] cherylj: that issue is only seen in the feature branch? [22:23] god I've looked at too many bugs today. Let me check [22:24] there was a note in the doc that said it has been seen elsewhere [22:24] if that's the case, and it's in master, then we shouldn't block the feature branch on that [22:24] ah, glad I wrote it down at some point, then [22:24] :-) you have so much to keep track of [22:25] your brain must be full [22:25] cherylj: there were some leadership changes in master just recently to address another bug i can't recall right now, might be related to that [22:26] wallyworld: I checked that master passed with dave's changes [22:26] and it did [22:26] now I'm wondering if I was mistaken about this leadership one... the other failures with similar symptoms have different logs [22:27] cherylj: ok, will need to look closer then [22:28] cherylj: so it seems that bug above is only on the maas 1.9 deploy according to the doc; the other maas failures are due to the replicaset issue [22:29] wallyworld: the 1.7 maas deploy also fails with that leadership problem [22:30] regardless, it's weird it's only on one substrate.
sort of indicates it's a timing issue [22:30] wallyworld: and I'm thinking the replicaset issue may be an unfortunate "working as designed" [22:30] wallyworld: I think that's the only place this charm is deployed in CI? maybe? [22:30] not sure, will need to check [22:31] sinzui, abentley mgz - is the dummy-sink-0 charm deployed to other substrates in CI tests? or is that just a MAAS test? [22:31] cherylj: every substrate for every series [22:31] hmm [22:31] interesting [22:32] cherylj: the dummies are trivial and consistent for every series, though the windows version is definitely different [22:32] sinzui: what would the other tests be named? [22:32] I want to look at their test logs [22:33] gce-deployer-bundle? [22:33] cherylj: almost every job that is xxx--* [22:33] ok [22:33] cherylj: on the surface, i can't see how this is related to the branch per se, but it could be. the fact it's only one substrate indicates a more general timing or other issue. will need to look closer into it [22:33] cherylj: deployer and quickstart and functional are not dummies [22:34] * wallyworld afk for a bit [22:34] wallyworld: here's an example of a successful deploy of that charm: http://data.vapour.ws/juju-ci/products/version-3555/aws-deploy-trusty-amd64-on-wily-amd64/build-588/unit-dummy-sink-0.log.gz [22:34] for when you get back :) [22:35] cherylj: http://reports.vapour.ws/releases/3557 The CPC and Substrates sections are all using dummy source and sink to verify hooks trade information [22:35] and sinzui, wallyworld, for that manual provider "connection is shut down" bug, I think the changing of the replicaset is valid. It's switching to use the cloud-local address rather than the public address. [22:36] it's unfortunate that it makes mongo drop the connections [22:38] anyways, I need to go pick up the kid [22:38] bbl [23:19] Bug #1539806 opened: [ARM64][LXD Provider][ 2.0-alpha1-0ubuntu1~16.04.1~juju1] kibana 'hook failed: "install"' [23:31] Bug #1539806 changed: [ARM64][LXD Provider][ 2.0-alpha1-0ubuntu1~16.04.1~juju1] kibana 'hook failed: "install"' [23:34] Bug #1539806 opened: [ARM64][LXD Provider][ 2.0-alpha1-0ubuntu1~16.04.1~juju1] kibana 'hook failed: "install"' [23:48] cherylj: with the successful deploy referenced above in build 3555, it also works in build 3557 on maas 1.8 trusty amd64 and all other substrates, but fails only on maas 1.7. and it works on maas 1.9 whereas it failed in a previous run where the only change was to add the restore-backup alias. on the surface, I cannot see how this issue is specific to the feature branch