[01:11] <axw_> thumper: is hosted env destruction going to be merged into master before the jes flag is removed? and CreateEnvironment added?
[01:12] <thumper> axw: CreateEnvironment is already there
[01:12] <thumper> axw: and I hope so
[01:12] <axw> thumper: oh, sweet, thanks
[01:12] <thumper> I've just found that some of my assumptions about how bootstrap works are wrong
[01:12] <thumper> :(
[01:13] <axw> thumper: which branch has CreateEnvironment (not PrepareForCreateEnvironment)?
[01:13] <thumper> master does...
[01:14] <thumper> you mean api, cli, ?
[01:14] <axw> thumper: I mean in EnvironProvider
[01:14] <thumper> ...
[01:14] <thumper> I'm not working on that
[01:15] <axw> thumper: remember that discussion about creating resources for azure?
[01:15] <thumper> yeah...
[01:15] <thumper> ugh...
[01:15] <axw> thumper: ok, I thought it would be part of multi-env
[01:15] <thumper> it has fallen off my todo list
[01:15] <thumper> it should be
[01:15] <thumper> my todo list gained weeks of work yesterday
[01:15] <thumper> that floated to the top
[01:15] <axw> thumper: heh :)
[01:16] <axw> thumper: BTW, I don't know if it'll be useful for other providers, but in Azure it's useful to know the UUID of the controller model
[01:16] <axw> thumper: s/useful/critical/
[01:16] <thumper> :)
[01:17] <axw> thumper: might be worth adding to the env config at some point
[01:17] <thumper> yeah...
[01:17] <axw> atm I'm just adding it to the azure config
[01:17] <thumper> k
[01:17] <axw> as an internal thing
[01:18]  * axw wishes we had somewhere to store internal data that's not really config
[02:04] <wallyworld_> axw: with Offer and ServiceOffer - Offer was from anastasia's branch and it's going away but there's stuff that depends on it. i've added a todo in my next branch but didn't add the todo in the previous branch
[02:05] <wallyworld_> sorry for confusion
[02:05] <axw> wallyworld_: ok, so long as it dies a quick death
[02:05] <wallyworld_> it will
[02:05] <wallyworld_> tomorrow
[02:05] <axw> excellent
[02:05] <axw> :)
[02:05] <bradm> any ideas on why a juju deploy to a container would use the wrong uncompress option?  it's trying to use xz on a .tar.gz cloud image tarball
[02:07] <wallyworld_> bradm: it's the lxc scripts which uncompress the image
[02:07] <bradm> https://pastebin.canonical.com/143978/ is the slightly truncated output
[02:08] <wallyworld_> juju merely downloads the image for lxc to then use as it sees fit
[02:08] <bradm> right.
[02:09] <bradm> it's definitely pulling the tarball from the right-looking location
[02:09] <wallyworld_> bradm: i know there was recent lxc breakage in wily due to upstream issues, but am not across the details
[02:10] <rick_h__> wallyworld_: that was more around the networking
[02:10] <bradm> wallyworld_: this is using trusty though?
[02:10] <wallyworld_> rick_h__: ah yes, you are right
[02:10] <rick_h__> wallyworld_: nothing with the image compression formats that I'm aware of
[02:10] <wallyworld_> yeah, correct, i forgot
[02:11] <wallyworld_> bradm: all i can suggest is to look at the lxc ubuntu-cloud template which is a bash script to see what it is doing
[02:11] <wallyworld_> i think it's in /etc/lxc somewhere
[02:11] <bradm> wallyworld_: would you expect juju to be providing a .tar.gz or a .tar.xz ?
[02:12] <bradm> hmm, no, not in /etc/lxc
[02:12] <wallyworld_> bradm: .tar.gz - we simply download from cloud-images.ubuntu.com. we use the ubuntu-cloudimg-query script to find out what to download for a series
[02:12] <bradm> I'll find it.
[02:12] <wallyworld_> bradm: tl;dr: juju relies on upstream utils
[02:13] <wallyworld_> bradm: /usr/share/lxc/templates
[02:13] <bradm> wallyworld_: right, but this is all on trusty, I'm a bit concerned it just broke
[02:13] <wallyworld_> yeah me too
[02:13] <bradm> hah, thats exactly where I'm looking
[02:13] <wallyworld_> we need to fix obviously :-)
[02:13] <wallyworld_> but we need to look upstream to diagnose
[02:13] <bradm> trying to work out if it's lxc-ubuntu or lxc-ubuntu-cloud
[02:14] <wallyworld_> bradm: i suspect lxc-ubuntu-cloud
[02:14] <wallyworld_> the lxc-create script chooses i think
[02:15] <rick_h__> bradm: this smells like upstream moved to xz for better compression but juju grabbed the tar.gz image
[02:15] <bradm> yeah, it's definitely hard-coded to use xz, according to the script
[02:16] <wallyworld> damn, i'm so sick of this kernel bug killing my network
[02:16] <rick_h__> bradm: right https://github.com/lxc/lxc/commit/27c278a76931bfc4660caa85d1942ca91c86e0bf
[02:17] <rick_h__> bradm: line 334 in the diff seems about the right place
[02:17] <bradm> "lxc-ubuntu-cloud: Replace .tar.gz by .tar.xz and don't auto-generate missing tarballs"
[02:17] <bradm> from the release notes
[02:17] <bradm> rick_h__: hah, snap. :)
[02:17] <bradm> same thing, slightly different direction
[02:18] <rick_h__> bradm: yea, can you file a bug on that and copy myself and cherylj into it, please?
[02:18] <rick_h__> bradm: including your version of lxc, juju, etc?
[02:18] <bradm> rick_h__: sure.  bug on where though?
[02:18] <bradm> ie juju-core or lxc?
[02:18] <rick_h__> bradm: and we'll have to see if that needs to be made more flexible (there's an auto detect the format flag we use in juju-gui tarball for xz) or something else
[02:19] <rick_h__> bradm: on lxc
[02:19] <rick_h__> bradm: we can't get a release of juju out to fix this tomorrow
[02:19] <rick_h__> bradm: so we need to file a backwards-incompatibility bug with them or something. Maybe it'll end up being a bug in how juju is getting the image, in that it's not getting the new ones?
[02:19] <rick_h__> bradm: but let's start there and we'll start working together on it please
[02:20] <bradm> rick_h__: for sure, easy enough to move bugs around.
[02:20] <rick_h__> bradm: ty much and <3 for the catch
[02:20] <wallyworld> rick_h__: bradm: juju uses upstream cloud-images-query to get the image url
[02:21] <rick_h__> wallyworld: right, so something changed there in lxc that it's thinking a different image should be used? I'm not sure tbh.
[02:21] <wallyworld> so if that is telling juju the wrong image, that will need to be fixed too
[02:23] <wallyworld> rick_h__: ubuntu-cloudimg-query trusty released amd64 --format %{url} on my system returns
[02:23] <wallyworld> https://cloud-images.ubuntu.com/server/releases/trusty/release-20151105/ubuntu-14.04-server-cloudimg-amd64.tar.gz
[02:24] <rick_h__> wallyworld: rgr
[02:24] <wallyworld> bradm: ^^^^ what does the above return on yours
[02:25] <bradm> wallyworld: same.
[02:25] <wallyworld> hmmm
[02:26] <wallyworld> that's what i would have expected
[02:26] <bradm> wallyworld: apparently the lxc container creation template script didn't.  :)
[02:26] <wallyworld> that's what juju downloads
[02:26] <wallyworld> damn :-(
[02:27] <wallyworld> so ubuntu-cloudimg-query and lxc are out of sync
[02:27] <wallyworld> how did this not show up in our testing
[02:27] <bradm> dunno
[02:28] <rick_h__> wallyworld: we must not have tested the latest lxc release. This just came out the other day. Is it in backports/etc?
[02:28] <wallyworld> ah i see. not sure where it lives
[02:28] <rick_h__> wallyworld: so this was released 2 days ago
[02:28] <rick_h__> bradm: can you look where you get it from?
[02:28] <wallyworld> well i guess our tests will break soon enough :-)
[02:29] <rick_h__> wallyworld: hah
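(For context, a minimal sketch of the lookup being discussed above - hypothetical code, not juju's actual implementation; it just shells out to ubuntu-cloudimg-query with the exact arguments wallyworld ran:)

    package main

    import (
        "fmt"
        "log"
        "os/exec"
        "strings"
    )

    // cloudImageURL asks ubuntu-cloudimg-query for the image URL, using the
    // same invocation shown above:
    //   ubuntu-cloudimg-query trusty released amd64 --format %{url}
    func cloudImageURL(series, stream, arch string) (string, error) {
        out, err := exec.Command("ubuntu-cloudimg-query",
            series, stream, arch, "--format", "%{url}").Output()
        if err != nil {
            return "", err
        }
        return strings.TrimSpace(string(out)), nil
    }

    func main() {
        url, err := cloudImageURL("trusty", "released", "amd64")
        if err != nil {
            log.Fatal(err)
        }
        // The mismatch discussed here: this returns a .tar.gz URL, while the
        // updated lxc-ubuntu-cloud template hard-codes .tar.xz when unpacking.
        fmt.Println(url)
    }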
[02:29] <axw> wallyworld: any particular reason why we need to model "remote endpoints", rather than just passing charm.Relations around?
[02:30] <bradm> aha!
[02:30] <bradm> we have -proposed enabled
[02:30] <bradm> so it looks like it hasn't made it to the main archive yet
[02:31] <wallyworld> axw: you mean for params across the wire?
[02:31] <axw> wallyworld: yes
[02:32] <axw> wallyworld: I'm reviewing anastasiamac's branch, just wondering whether we need params.RemoteEndpoint, or if we can just use charm.Relation
[02:32] <wallyworld> axw: we want to model wire structs distinct from the domain model. we pass in domain objects to api layer and map to params.*
[02:32] <bradm> rick_h__: theres the answer then, its not out in the wild yet
[02:32] <wallyworld> axw: and on the way out, we map params.* back to domain model
[02:33] <wallyworld> axw: but we currently leak params.* everywhere :-(
[02:33] <wallyworld> because much of our domain model is defined in state
[02:33] <wallyworld> not in a model package
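(A hedged sketch of the layering wallyworld is describing: a wire struct in apiserver/params that mirrors the domain type, with an explicit mapping at the API boundary. The field names follow charm.Relation, but the RemoteEndpoint shape and the FromCharmRelation helper here are illustrative, not the actual definitions in anastasiamac's branch:)

    package params

    import "gopkg.in/juju/charm.v6-unstable"

    // RemoteEndpoint is the wire form. It deliberately does not embed
    // charm.Relation, so the API can evolve independently of the domain model.
    type RemoteEndpoint struct {
        Name      string `json:"name"`
        Role      string `json:"role"`
        Interface string `json:"interface"`
        Limit     int    `json:"limit"`
        Scope     string `json:"scope"`
    }

    // FromCharmRelation maps a domain object to its wire form on the way in;
    // the reverse mapping happens on the way out, as described above.
    func FromCharmRelation(r charm.Relation) RemoteEndpoint {
        return RemoteEndpoint{
            Name:      r.Name,
            Role:      string(r.Role),
            Interface: r.Interface,
            Limit:     r.Limit,
            Scope:     string(r.Scope),
        }
    }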
[02:33] <anastasiamac> axw: this way we could also easily distinguish exported endpoints from native ones, i guess :(... at some stage... if we want.... \o/
[02:33] <wallyworld> bradm: glad you caught that before it escaped :-)
[02:34] <axw> wallyworld anastasiamac: we're using charm.Relation in apiserver/params for non-remote relations, I'm just trying to understand if there's a good reason to have separate ways of serialising them
[02:35] <axw> wallyworld anastasiamac: I'm wondering if it's ever going to be the case that they'll have different information
[02:35] <wallyworld> axw: IMO what's there now then is a mistake, but i could be told otherwise
[02:36] <axw> wallyworld: yeah, I know fwereade doesn't like that we just pass charm.X over the wire, but I don't know that having two ways of doing it is a good thing either
[02:37] <axw> wallyworld: I was thinking the same thing about your argument for using the term "Endpoint" btw. it's true that "Relation" isn't a good name, but I think it will be confusing to have two ways to refer to the same thing in the codebase
[02:37] <wallyworld> that's worthy of consideration for sure
[02:37] <axw> wallyworld: we'll be going from being consistently inconsistent to inconsistently inconsistent
[02:37] <wallyworld> axw: at some point, we need to fix things
[02:37] <wallyworld> and with juju 2.0 we can break stuff
[02:38] <wallyworld> so perhaps this new work is a good place to start
[02:38] <axw> wallyworld: can't wait :)
[02:38]  * axw sharpens the axe
[02:38] <wallyworld> let's do it "the right way" now and fix the other stuff after 2.0
[02:38] <axw> ok
[02:38] <anastasiamac> axw: we live on the edge! what's wrong with " inconsistently inconsistent"? :D
[02:38] <axw> anastasiamac: cognitive overhead
[02:39] <anastasiamac> axw: no sense of adventure :D
[02:39] <axw> anastasiamac: I like to get shit done instead of labouring over what something means :)
[02:39] <anastasiamac> axw: wallyworld: i agree that we'd benefit from doing the right thing now... sorry for muddying the waters
[02:40] <axw> anastasiamac: no apologies required, I just wanted to check what we should be doing
[02:40] <anastasiamac> axw: i prefer to code :D
[02:40] <anastasiamac> axw: and it was not an apology :P
[02:43] <bradm> wallyworld: LP#1515463 if you want to poke at the bug for any reason
[02:43] <wallyworld> axw: that service directory branch should be good to go. a few of the things you mentioned were pre-existing issues, with cleanup pending as stuff is glued together over the next day or two
[02:43] <wallyworld> bradm: ty
[02:44] <bradm> wallyworld: I hope I captured the bug appropriately.
[02:47] <axw> wallyworld: ok, looking
[02:48] <rick_h_> ouch, irc go boom
[02:48] <wallyworld> bradm: looks good. maybe a comment suggesting that ubuntu-cloudimg-query needs to be looked at
[02:48] <rick_h_> bradm: ty you're my hero for catching it in proposed
[02:48] <wallyworld> bradm: +1
[02:49] <axw> wallyworld: LGTM
[02:49] <wallyworld> ty
[02:50] <anastasiamac> wallyworld: \o/ plz land!!! :D
[02:50] <bradm> ah, do I have to file this against a particular lxc version?
[02:52] <wallyworld> bradm: the bug triager will allocate to the right version? maybe?
[02:53] <bradm> I'd just hate for them to miss it and the package to get out into the wild
[02:54]  * rick_h_ runs for the night, evening all
[02:58] <wallyworld_> bradm: stephane has commented on the bug. i've also commented
[03:00] <bradm> wallyworld_: perfect.
[03:01] <bradm> wallyworld_: looks like it's well in hand now then.
[03:02] <wallyworld_> bradm: yeah, it does. we'll keep an eye on it at our end also :-)
[03:02] <bradm> I really don't care who fixes it where, just that I can deploy containers again. :)
[03:04] <wallyworld_> yup :-)
[03:05] <bradm> well, I can by dropping -proposed, that's easy enough for now.
[03:48] <pjdc> i just noticed that "juju run" stopped working in a 1.24.7 environment: https://pastebin.canonical.com/143985/plain/
[03:48] <pjdc> the unit log has the following: https://pastebin.canonical.com/143986/plain/
[03:48] <thumper> hmm
[03:48] <thumper> interesting
[03:49] <pjdc> :(
[03:49] <pjdc> machine-0.log: https://pastebin.canonical.com/143988/plain/
[03:51] <thumper> pjdc: it seems the agent is wedged
[03:52] <thumper> if you restart the agent, it should fix it
[03:52] <pjdc> thumper: on machine 0?
[03:52] <thumper> which machine was the log from?
[03:52] <pjdc> i already restarted the agent on the jenkins unit, which is the first log
[03:52] <thumper> that should fix it
[03:53] <pjdc> it's still spewing leadership failure
[03:53] <thumper> was the log before or after the restart?
[03:53] <pjdc> before
[03:54] <pjdc> here's the restart: https://pastebin.canonical.com/143989/plain/
[03:54] <pjdc> and it's just been logging the dying/stopped lines ever since
[03:54] <thumper> ugh
[03:54] <thumper> is the environment HA?
[03:55] <pjdc> it's not
[03:55] <thumper> can I see the logs from machine 0?
[03:55] <pjdc> from before or after the chunk in https://pastebin.canonical.com/143988/plain/ ?
[03:56] <pjdc> all i have in machine-0.log lining up with the jenkins unit agent restart is this: https://pastebin.canonical.com/143990/plain/
[03:56] <thumper> how much do you have?
[03:57] <thumper> heh
[03:57] <pjdc> i have everything; the environment is only a few hours old
[03:57] <thumper> provider?
[03:57] <pjdc> openstack
[03:57] <thumper> hmm...
[03:58] <thumper> how big is the environment?
[03:58] <thumper> I think the first step is to file a bug for this failure
[03:58] <pjdc> pretty small. three machines, one service each, and a few subordinates deployed to each
[03:58] <thumper> then we'll work out how to fix it
[03:58] <thumper> kk
[03:59] <thumper> can I get you to do this:  `juju set-env logging-config=juju=DEBUG` to change the logging level
[03:59] <thumper> then restart the machine-0 agent
[03:59] <pjdc> done and done
[03:59] <thumper> this should give us more output
[03:59] <thumper> let it settle for 20s or so
[03:59] <thumper> then lets look at the logs
[04:00] <thumper> pjdc: did you want to open the bug or shall I?
[04:00] <pjdc> i can open it
[04:00] <thumper> cheers
[04:02] <pjdc> well, that's annoying
[04:02] <pjdc> the restart seems to have made it work again
[04:04] <thumper> \o/
[04:04] <thumper> right
[04:04] <thumper> the reason juju run was failing was due to the uniter bouncing
[04:04] <thumper> juju run is executed through the uniter worker
[04:05] <thumper> the uniter was bouncing due to leadership issues
[04:05] <thumper> it seems that bouncing the server fixed those issues...
[04:05] <thumper> which it shouldn't have had
[04:06] <thumper> pjdc: to get the logging back to the default, you can say `juju set-env logging-config=juju=WARNING`
[04:07] <pjdc> righto, ta
[04:07] <thumper> rick_h_: ping
[04:07] <rick_h_> thumper: pong
[04:07] <thumper> rick_h_: you moved our meeting to a time I now have busy
[04:07] <rick_h_> thumper: was just about to ping you about moving that meeting
[04:07] <rick_h_> thumper: ah, sorry didn't show a conflict
[04:07] <rick_h_> thumper: what works for you?
[04:07] <thumper> rick_h_: if you can't make the earlier one, I'll change my one
[04:07] <thumper> rick_h_: it is a gym thing :)
[04:07] <rick_h_> thumper: yes sorry, I missed parent-teacher conferences on my personal calendar
[04:07] <thumper> the team lead meeting was moved to my normal gym time
[04:07] <rick_h_> I need some way to combine the two better so I don't schedule work stuff over personal stuff
[04:07] <thumper> so I booked a personal trainer for a session
[04:08] <rick_h_> thumper: hah, ok
[04:08] <pjdc> filed as #1515475, fwiw
[04:08] <mup> Bug #1515475: "juju run" stopped working after a few hours (1.24.7, newly deployed) <juju-core:New> <https://launchpad.net/bugs/1515475>
[04:08] <rick_h_> thumper: well we can move or do tonight or ...
[04:08] <thumper> pjdc: ta
[04:08] <thumper> rick_h_: as in now?
[04:08] <rick_h_> thumper: if you're ok with it?
[04:08] <thumper> sure, I have some questions
[04:08] <rick_h_> thumper: or I can move it forward 4hrs from where it sits now?
[04:09]  * thumper looks
[04:09] <rick_h_> thumper: earlier into the day, not sure if that's too early your time
[04:09] <rick_h_> thumper: around your standup time I guess
[04:09] <thumper> rick_h_: it is 13:30 now
[04:09] <thumper> four hours earlier is 9:30
[04:09] <thumper> which is fine
[04:09] <rick_h_> what questions? want to do that now and still keep it tomorrow?
[04:09] <rick_h_> thumper: or what do we need from here?
[04:10] <thumper> tomorrow... as I think we'll need the hour :)
[04:10] <rick_h_> ok
[04:10] <mup> Bug #1515475 opened: "juju run" stopped working after a few hours (1.24.7, newly deployed) <juju-core:New> <https://launchpad.net/bugs/1515475>
[04:10] <rick_h_> thumper: ah I can't...wife is away. doh
[04:10] <thumper> rick_h_: can't tomorrow?
[04:10] <thumper> rick_h_: we can go fast now if you like
[04:10] <rick_h_> thumper: can we do 8:30am? and bump your standup 30min?
[04:11] <thumper> could do 8am
[04:11] <thumper> and not bump standup
[04:11] <rick_h_> thumper: why fast now? if you're running out are you heading back and we can do the full hour later tonight?
[04:12] <thumper> dinner date :)
[04:12] <thumper> 15 minutes now and some monday?
[04:12] <rick_h_> thumper: maybe, will be in london for customer thing monday
[04:12] <thumper> my questions aren't deep
[04:12] <thumper> really? heading to london?
[04:12] <rick_h_> thumper: k, let's do that and I'll try to get something else
[04:12] <thumper> for how long?
[04:12] <rick_h_> thumper: for 3 days, Tues customer meeting
[04:12] <thumper> \o/
[04:13]  * thumper chuckles to himself
[04:13] <rick_h_> thumper: https://plus.google.com/hangouts/_/canonical.com/rick?authuser=1
[04:13] <thumper> rick_h_: lets go!
[04:28] <mup> Bug #1515475 changed: "juju run" stopped working after a few hours (1.24.7, newly deployed) <juju-core:New> <https://launchpad.net/bugs/1515475>
[04:34] <mup> Bug #1515475 opened: "juju run" stopped working after a few hours (1.24.7, newly deployed) <juju-core:Triaged> <https://launchpad.net/bugs/1515475>
[04:41] <bradm> wallyworld, rick_h_: it's probably totally pointless testing telling us what we already know, but I just downgraded the lxc-related packages, put them on hold and then redeployed, and it's looking good.
[04:42] <wallyworld> yay
[04:42] <bradm> I'll mention it on the bug.
[04:42] <bradm> even though it seems well in hand.
[04:42] <wallyworld> bradm: by testing, i mean that our CI testing should catch this issue also
[04:42] <wallyworld> ty
[04:43] <bradm> wallyworld: right.  has it just not run on proposed, or is there something else going on there?
[04:43] <wallyworld> bradm: we don't test with proposed AFAIK. but i guess we should
[04:44] <wallyworld> bradm: there are so many combinations of series, substrate, juju version etc
[04:44] <wallyworld> adding in proposed adds a whole new axis
[04:44] <bradm> wallyworld: indeed.
[04:45] <bradm> wallyworld: it's just another set of jenkins jobs, right? ;)
[04:46] <wallyworld> bradm: yeah, but we don't have enough hardware. hardware is currently on order AFAIK
[04:47] <bradm> wallyworld: I know that feeling.
[04:47] <wallyworld> :-)
[09:19] <jam> frobware: looks like I'm going to have to miss our standup again... my wife needs me to take her to the Dr today.
[09:24] <frobware> jam: ack. and for the record I might miss tomorrow's as I have a dental appointment.
[10:00] <voidspace> frobware: I assume we're doing juju-core instead of standup?
[10:01] <frobware> voidspace, I vote we do standup
[10:01] <voidspace> frobware: heh
[10:03] <dimitern> voidspace, our call is cooler :)
[10:04] <voidspace> dimitern: our call is way cooler
[10:06] <dimitern> voidspace, so you're coming?
[10:07] <voidspace> dimitern: eh, are you serious? we shouldn't miss juju-core should we?
[10:07] <voidspace> even though not much is happening
[10:07] <dimitern> voidspace, oh c'mon :P
[10:08] <anastasiamac> dimitern: voidspace: :(
[10:10] <voidspace> anastasiamac: o/ :-(
[10:29] <dimitern> nope, they're done
[10:56] <frobware> dimitern, care to HO around maas/spaces?
[10:57] <rogpeppe> wallyworld: ping
[11:07] <perrito666> life after breakfast is much better
[11:07] <dimitern> frobware, hey, yeah - in 10m?
[11:08] <frobware> dimitern, 30m
[11:08] <dimitern> frobware, even better
[11:11] <frobware> dimitern, any chance you could join this https://plus.google.com/hangouts/_/midokura.com/juju_openstack
[11:11] <dimitern> frobware, ok, just a sec..
[11:30] <frobware> dimitern, thanks
[11:30] <dimitern> frobware, I hope it was useful :) should we make the spaces call now?
[11:31] <frobware> dimitern, please
[11:39] <voidspace> dimitern: looks like we can get static ranges from the subnets api in maas 1.9
[11:39] <voidspace> dimitern: we have to fetch all the subnets (1 api call) and then make an additional call per subnet to get the range
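(A rough sketch of that call pattern - the client and its GetSubnets/GetReservedRanges helpers are hypothetical stand-ins for illustration, not real gomaasapi methods:)

    package maasnet // hypothetical package, for illustration only

    type ipRange struct{ Low, High string }

    type subnetInfo struct {
        ID   int
        CIDR string
    }

    // maasClient is a hypothetical wrapper around the MAAS 1.9 subnets API.
    type maasClient struct{}

    func (c *maasClient) GetSubnets() ([]subnetInfo, error)           { return nil, nil } // stub
    func (c *maasClient) GetReservedRanges(id int) ([]ipRange, error) { return nil, nil } // stub

    // fetchSubnetRanges makes 1 API call to list all subnets, then 1 more
    // call per subnet to get its ranges - the pattern described above.
    func fetchSubnetRanges(c *maasClient) (map[string][]ipRange, error) {
        subnets, err := c.GetSubnets()
        if err != nil {
            return nil, err
        }
        ranges := make(map[string][]ipRange)
        for _, sub := range subnets {
            r, err := c.GetReservedRanges(sub.ID)
            if err != nil {
                return nil, err
            }
            ranges[sub.CIDR] = r
        }
        return ranges, nil
    }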
[12:14] <voidspace> dimitern: a trivial one that should have been part of my last branch
[12:14] <voidspace> (oops)
[12:14] <voidspace> http://reviews.vapour.ws/r/3125/
[12:16] <voidspace> dimitern: in terms of the ListSpaces implementation for the maas provider
[12:16] <voidspace> dimitern: will we make it part of the networking environ interface?
[12:17] <voidspace> it will only be needed / used for maas
[12:17] <voidspace> but I think it will have to be part of the interface for autodiscovery to use it
[12:18] <voidspace> or I can provide a helper function that does it in the maas namespace, that casts a given provider to a maasEnviron and calls ListSpaces
[12:21] <dimitern> voidspace, looking
[12:22] <dimitern> voidspace, LGTM
[12:22] <voidspace> dimitern: thanks
[12:23] <dimitern> voidspace, yes, let's add a Spaces() method, taking no arguments for now (until we need filtering)
[12:23] <voidspace> dimitern: add to the interface?
[12:23] <dimitern> voidspace, yes, right after Subnets() should be a good place for it - don't you think?
[12:24] <voidspace> dimitern: well, we're extending the public interface for all providers solely for maas
[12:24] <voidspace> dimitern: so I quite liked the helper function idea
[12:24] <dimitern> voidspace, nope, we'll use that for all providers eventually
[12:24] <voidspace> if we ever get to shared spaces...
[12:25] <dimitern> voidspace, shared or not doesn't matter
[12:25] <dimitern> voidspace, EC2 can list your env spaces by looking at subnet tags + env uuid
[12:25] <voidspace> for EC2 you just check the model
[12:25] <dimitern> voidspace, or, it can list the global "shared" spaces, when we get there
[12:25] <voidspace> the model is the source of truth
[12:25] <voidspace> if we don't have shared spaces then juju is the source of truth about what spaces there are and you don't need to go to the provider
[12:25] <voidspace> it's only once you have shared spaces (as maas does) that you need to ask the provider
[12:26] <voidspace> but fair enough
[12:26] <voidspace> interface method it is
[12:26] <voidspace> in theory we might need it for other providers...
[12:26] <dimitern> voidspace, we can't get away without shared spaces I'm afraid, we're just postponing the moment where we need to deal with them
[12:26] <voidspace> I'm sceptical it will become the highest priority any time soon
[12:26] <voidspace> but time will tell
[12:27] <dimitern> indeed
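(A sketch of the interface change agreed on above - indicative signatures only, not the real definition; the networking environ interface lives in juju's environs package, and network.SpaceInfo is assumed as the natural result type:)

    package environs

    import (
        "github.com/juju/juju/instance"
        "github.com/juju/juju/network"
    )

    // Networking sketches the environ capability under discussion: Subnets()
    // already exists on it, and Spaces() is the proposed addition, placed
    // right after it.
    type Networking interface {
        // Subnets returns basic information about subnets known to the
        // provider for the environment.
        Subnets(inst instance.Id, subnetIDs []network.Id) ([]network.SubnetInfo, error)

        // Spaces returns the provider's view of spaces, taking no
        // arguments for now (until filtering is needed).
        Spaces() ([]network.SpaceInfo, error)
    }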
[12:36] <voidspace> dimitern: my current card (subnet api) may take a bit longer than I imagined (maybe an extra day)
[12:36] <voidspace> dimitern: the code itself is simple, but I'll need to extend the gomaasapi test server again...
[12:39] <dimitern> voidspace, sure
[12:40] <dimitern> voidspace, it needs to work and be tested with both legacy and new APIs, and Subnets() is a big part of "maas spaces (basic) support" (the other main part is what dooferlad is doing after bootstrap)
[12:43] <voidspace> dimitern: gah, the "reserved_ip_ranges" operation on maas lists all the ip ranges *except* the static range
[12:43] <voidspace> so you can deduce it
[12:43] <voidspace> I might just take the whole cidr minus the dynamic range
[12:43] <voidspace> the other ranges are single ips for the gateway and cluster
[12:43] <dimitern> voidspace, that's not entirely correct
[12:43] <voidspace> dimitern: which bit
[12:44] <voidspace> I guess it might not be correct
[12:44] <dimitern> voidspace, static-range != cidr - dynamic-range
[12:44] <voidspace> right
[12:44] <dimitern> voidspace, as there might be IPs in neither of the ranges
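(To make that point concrete with made-up numbers: in a 172.16.0.0/24 CIDR with a dynamic range of 172.16.0.100-172.16.0.200 and a static range of 172.16.0.4-172.16.0.50, "CIDR minus dynamic range" yields 172.16.0.1-99 plus 172.16.0.201-254, which wrongly sweeps in addresses such as 172.16.0.3 and 172.16.0.51-99 that belong to neither configured range.)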
[12:47] <dimitern> voidspace, however, looking at http://maas.ubuntu.com/docs/api.html (development trunk version)
[12:47] <voidspace> dimitern: there's unreserved range too
[12:47] <dimitern> voidspace, there's op=statistics, which claims to include "subnet ranges - the specific IP ranges present in this subnet (if specified)"
[12:47] <dimitern> and even better:
[12:47] <dimitern> Optional arguments: include_ranges: if True, includes detailed information about the usage of this range
[12:48] <dimitern> voidspace, I'll give it a go on my maas now as it has all ranges set
[12:48] <voidspace> dimitern: the static range there is included as "unused" and is the same range as returned by unused_ip_ranges
[12:48] <voidspace> dimitern: I'm going to try reducing the size of the static range (so there are genuinely unused portions of the cidr) and see what happens
[12:49] <dimitern> voidspace, yeah, good idea
[12:50] <voidspace> dimitern: my static range is defined to start at 172.16.0.4 - but the unused range starts at 3
[12:51] <dimitern> voidspace, perfect! see this: $ maas hw-root subnet statistics 2 include_ranges=True -> http://paste.ubuntu.com/13238168/
[12:51] <dimitern> subnet 2 is my pxe subnet - 10.14.0.0/20
[12:51] <perrito666> fwereade_: priv ping me when you are around plz
[12:52] <voidspace> dimitern: nope
[12:52] <voidspace> dimitern: on my maas the static range starts at 172.16.0.4
[12:52] <voidspace> dimitern: however the "unused" range reported by statistics (and by unreserved_ip_range) starts at 172.16.0.3
[12:53] <voidspace> dimitern: I've updated my bug report
[12:53] <dimitern> voidspace, all of the "unused" ranges are part of the static range (10.14.0.100 - 10.14.1.200 as defined on the cluster interface; dhcp range is 10.14.0.30-.90)
[12:54] <voidspace> dimitern: the cluster configuration has "Static IP range low value" set to 172.16.0.4
[12:54] <voidspace> dimitern: are you saying that ignoring that is the correct behaviour?
[12:55] <dimitern> voidspace, what do you mean?
[12:55] <voidspace> dimitern: I'm talking about my maas here
[12:55] <voidspace> dimitern: I have static IP range low value set to 172.16.0.4
[12:55] <dimitern> voidspace, ok
[12:55] <voidspace> dimitern: but the range reported as unused/unreserved returns the low value as 172.16.0.3
[12:55] <voidspace> dimitern: it isn't returning the static range (as defined on the cluster interface)
[12:55] <dimitern> voidspace, is 172.16.0.4 used for anything?
[12:55] <voidspace> dimitern: but is returning the portion of the cidr unused by anything else
[12:56] <voidspace> dimitern: that's the low bounds of the static range
[12:56] <voidspace> it isn't used, but I don't see that it's relevant
[12:56] <dimitern> voidspace, check the web ui for the subnet - e.g. http://10.14.0.1/MAAS/#/subnet/2 in my case
[12:56] <dimitern> voidspace, ah, I see your point
[12:57] <dimitern> voidspace, "unused" includes IPs not part of static range
[12:57] <dimitern> voidspace, but not assigned, cluster, or dynamic ips
[12:58] <voidspace> dimitern: correct
[12:58] <voidspace> dimitern: in #juju on canonical, roaksox is saying that it doesn't matter and we should use the unreserved range anyway
[12:58] <voidspace> dimitern: and in 2.0 the static range is going away
[12:58] <dimitern> voidspace, awesome news!
[12:58] <dimitern> :)
[12:58] <voidspace> presumably it will just be implied from cidr - dynamic range
[12:59] <dimitern> voidspace, then let's just do that and not use the node group interfaces
[12:59] <voidspace> dimitern: yep
[12:59] <dimitern> voidspace, then the bug I asked you to file is moot and can be closed
[12:59] <voidspace> dimitern: it's done
[13:00] <dimitern> voidspace, cheers
[13:07] <voidspace> perrito666: are you moonstone?
[13:07] <perrito666> voidspace: I am not
[13:08] <voidspace> perrito666: ok
[13:27] <voidspace> tests failed because "your quota allows for 0 more running instance(s). You requested at least 1"
[13:27] <voidspace> *sigh*
[13:29] <voidspace> dimitern: hmmmm... the networks api we're currently using for subnets allows filtering by nodeId (which we use)
[13:29] <voidspace> dimitern: I don't think the new subnets api does
[13:30] <voidspace> checking
[13:32] <dimitern> voidspace, well, provided you use static IPs for all your nodes, this seems to work for me: $ maas hw-juju subnet ip-addresses 2 with_nodes=True -> http://paste.ubuntu.com/13238364/
[13:33] <voidspace> dimitern: so, request all subnets, request all ip addresses with node information, then request the unreserved range for every subnet
[13:33] <voidspace> that's hardly simpler than what we currently have :-D
[13:33] <voidspace> but we'll have space information
[13:33] <voidspace> and we do the filtering rather than have maas do it
[13:34] <voidspace> matching allocated addresses to subnets to do the node filtering
[13:34] <voidspace> and that's a bunch of stuff to add to the test server as well :-/
[13:35] <dimitern> voidspace, yeah :/ it seems we still need 3 API calls
[13:35] <dimitern> voidspace, but not every time
[13:35] <voidspace> well, we need 1 plus 1 per subnet
[13:35] <voidspace> and if we're filtering by instance id an extra one
[13:35] <dimitern> voidspace, yeah
[13:36] <voidspace> although for that case we can trim down the number of subnets we need to query
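(Extending the hypothetical maasClient sketch from earlier to the node-filtering flow just described; GetIPAddresses and Contains are likewise stand-ins, not real gomaasapi calls:)

    type ipAddressInfo struct {
        IP     string
        NodeID string
    }

    func (c *maasClient) GetIPAddresses() ([]ipAddressInfo, error) { return nil, nil } // stub
    func (s subnetInfo) Contains(ip string) bool                   { return false }    // stub

    // subnetsForNode lists all subnets (call 1) and all addresses with node
    // info (call 2), then filters client-side by matching allocated
    // addresses to subnets; the per-subnet unreserved-range calls follow.
    func subnetsForNode(c *maasClient, nodeID string) ([]subnetInfo, error) {
        subnets, err := c.GetSubnets()
        if err != nil {
            return nil, err
        }
        addrs, err := c.GetIPAddresses()
        if err != nil {
            return nil, err
        }
        var result []subnetInfo
        for _, sub := range subnets {
            for _, a := range addrs {
                if a.NodeID == nodeID && sub.Contains(a.IP) {
                    result = append(result, sub)
                    break
                }
            }
        }
        return result, nil
    }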
[13:39] <voidspace> dimitern: hmmm... ec2 provider allows subnetIds to be empty (meaning list all subnets)
[13:39] <voidspace> dimitern: and I remember a bug about that
[13:39] <voidspace> dimitern: maas doesn't allow that
[13:41] <voidspace> dimitern: apiserver/subnets/subnets.go calls netEnv.Subnets with an empty slice of subnet ids
[13:41] <voidspace> dimitern: that will fail on maas
[13:41] <dimitern> voidspace, yeah, because we had no way of linking networks to nodes apart from going via the cluster interfaces
[13:42] <voidspace> dimitern: I guess it didn't matter when ec2 was the only platform supporting spaces
[13:42] <voidspace> dimitern: but that needs fixing too
[13:43] <voidspace> I bet "juju subnets list" fails for maas
[13:43] <dimitern> voidspace, well, can't it return an error with empty subnetIDs only for the new api?
[13:43] <voidspace> dimitern: other way round
[13:43] <dimitern> voidspace, nope, it won't fail as it doesn't hit maas at all - just state
[13:43] <voidspace> dimitern: it currently returns an error for empty subnetIds
[13:43] <dimitern> voidspace, yeah, you got me :)
[13:44] <voidspace> apiserver/subnets/subnets.go calls netEnv.Subnets
[13:44] <voidspace> dimitern: in cacheSubnets
[13:44] <dimitern> voidspace, that's for "subnet add" only
[13:45] <voidspace> dimitern: ah, fair enough
[13:45] <voidspace> maybe not an issue then
[13:45] <voidspace> I won't fix it until we need to
[13:45] <dimitern> voidspace, +1
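(For reference, the shape of the mismatch just discussed, as a hypothetical fragment - maasEnviron here is a stub, and allSubnets/subnetsByID are made-up helpers, not the actual provider code:)

    type maasEnviron struct{} // stub for illustration

    func (e *maasEnviron) allSubnets(inst instance.Id) ([]network.SubnetInfo, error) { return nil, nil }
    func (e *maasEnviron) subnetsByID(inst instance.Id, ids []network.Id) ([]network.SubnetInfo, error) {
        return nil, nil
    }

    // EC2's Subnets treats an empty ID slice as "list all subnets"; a MAAS
    // implementation that always errors on empty IDs would break any caller
    // relying on that convention, which is the discrepancy noted above.
    func (e *maasEnviron) Subnets(inst instance.Id, ids []network.Id) ([]network.SubnetInfo, error) {
        if len(ids) == 0 {
            return e.allSubnets(inst) // mirror EC2: empty means "all"
        }
        return e.subnetsByID(inst, ids)
    }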
[13:46] <dimitern> oh dear.. ci's f*cked again - euca-run-instances: error (InstanceLimitExceeded): Your quota allows for 0 more running instance(s). You requested at least 1
[13:46] <voidspace> yep
[13:56] <mgz_> dimitern: hm, the gating job? I'll see what else is up in ec2.
[13:59] <dimitern> mgz_, yeah, and we were seeing some weird unit test failures from a parallel universe :) where state.machineDoc doesn't have a Principals field (added by aram originally IIRC)
[14:01] <mgz_> are you sure the deps are correct? CI builds a completely clean tarball, which is not the same thing as building out of a local GOPATH
[14:02] <cherylj> frobware: I know it's a bit late, but I did verify that your fix resolved the EMPTYCONFIG problem I ran into on maas
[14:02] <dimitern> mgz_, well, something's fishy for sure, as machineDoc has "Principals []string" - no omitempty or anything, so it will be there, unless mongo returns bogus docs from the collection
[14:02] <frobware> cherylj, which fix? setting static IP range, or the fix I committed yesterday?
[14:03] <mgz_> er, that's not good, the jenkins web ui just went down
[14:03] <cherylj> frobware: I just checked with the latest master, since I saw you had already merged http://reviews.vapour.ws/r/3102/
[14:03] <frobware> cherylj, result!
[14:05] <cherylj> frobware: I think we're going to try to cut a 1.25.1 soon.  Should I move the 1.25 milestone for bug 1412621 to 1.25.2?  or do you think you'll get to make the fix for 1.25 in the next day or so?
[14:05] <mup> Bug #1412621: replica set EMPTYCONFIG MAAS bootstrap <adoption> <bootstrap> <bug-squad> <charmers> <cpec> <cpp> <maas-provider> <mongodb> <oil> <juju-core:Fix Committed by frobware> <juju-core 1.24:Won't Fix> <juju-core 1.25:In Progress> <https://launchpad.net/bugs/1412621>
[14:05] <frobware> cherylj, happening now/this afternoon. Was on my list.
[14:05] <cherylj> frobware: oh awesome, thanks!
[14:06] <cherylj> mgz_: do you think you'll get bug 1512399 merged into 1.25 in the next day or so?
[14:06] <mup> Bug #1512399: ERROR environment destruction failed: destroying storage: listing volumes: Get https://x.x.x.x:8776/v2/<UUID>/volumes/detail: local error: record overflow <amulet> <bug-squad> <openstack> <sts> <uosci> <Go OpenStack Exchange:In Progress by gz> <juju-core:Triaged> <juju-core
[14:06] <mup> 1.25:Triaged> <https://launchpad.net/bugs/1512399>
[14:06] <cherylj> mgz_: because we should probably get that into 1.25.1
[14:09] <beisner> cherylj, mgz - yes please :-)   bundletester + openstack provider is in always-false-fail mode atm.
[14:10] <mgz_> cherylj: yeah, I should have that finished this week
[14:10] <cherylj> ok, thanks, mgz_ !
[14:21] <frobware> dimitern, ok to close http://reviews.vapour.ws/r/3088/ as we're not doing 1.24?
[14:35] <dimitern> frobware, yeah, I wanted to keep the branch around until I forward port it, but the PR and RB entries can be closed
[14:37] <dimitern> frobware, done
[14:37] <frobware> dimitern, thanks
[14:54] <mup> Bug #1515647 opened: Upgrade from 1.20.11 to 1.24.7 fails after machine-0 jujud updates <juju-core:New> <https://launchpad.net/bugs/1515647>
[15:00] <mup> Bug #1515647 changed: Upgrade from 1.20.11 to 1.24.7 fails after machine-0 jujud updates <juju-core:New> <https://launchpad.net/bugs/1515647>
[15:03] <frobware> dimitern, could I leverage your expertise ... ?
[15:05] <dimitern> frobware, can it wait for a while? trying to do a few things at once here..
[15:05] <frobware> dimitern, ok I'll pester voidspace. Need some help with the vanguard issue ^^
[15:07] <voidspace> frobware: shouldn't bug squad do it
[15:08] <voidspace> I'm trying to do feature work after two weeks on bug squad
[15:15] <mup> Bug #1515647 opened: Upgrade from 1.20.11 to 1.24.7 fails after machine-0 jujud updates <juju-core:New> <https://launchpad.net/bugs/1515647>
[15:22] <katco> ericsnow: natefinch: sorry ubuntu froze on me. it's going to be a bit of a day i can tell
[15:23] <natefinch> katco: heh no problem, we're just bullshitting about providers
[15:23] <frobware> voidspace, bug squad picked it up; wasn't sure of the process.
[15:33] <frobware> cherylj, replica set issue committed for 1.25 now - https://bugs.launchpad.net/juju-core/+bug/1412621
[15:33] <mup> Bug #1412621: replica set EMPTYCONFIG MAAS bootstrap <adoption> <bootstrap> <bug-squad> <charmers> <cpec> <cpp> <maas-provider> <mongodb> <oil> <juju-core:Fix Committed by frobware> <juju-core 1.24:Won't Fix> <juju-core 1.25:Fix Committed> <https://launchpad.net/bugs/1412621>
[15:33] <cherylj> awesome, thanks, frobware !
[15:51] <voidspace> frobware: cool
[16:01] <cherylj> fwereade_: you around?
[16:23] <fwereade_> cherylj, heyhey
[16:23] <fwereade_> cherylj, sorry I missed you
[16:23] <fwereade_> cherylj, what can I do for you?
[16:30] <cherylj> fwereade_: thanks for the additional info on the instancepoller.  I think that will help simplify some of the work.
[16:31] <cherylj> fwereade_: but, how would we track the instance progress with lxc?
[16:31] <cherylj> fwereade_: do you have any thoughts on that?
[16:32]  * fwereade_ scratches head vaguely -- not sure how granular the info we can get from lxd is -- is .Status() intrinsically limited there?
[16:32] <fwereade_> cherylj, if we use the cloudinit2 report-progress-back, would that help?
[16:32] <cherylj> fwereade_: the problem is that all the "interesting things" happen before we return an instance back from StartInstance
[16:33] <fwereade_> cherylj, ah damn yes ofc
[16:33] <alexisb> voidspace, ping
[16:34]  * fwereade_ reloading context a bit...
[16:34] <fwereade_> cherylj, I think that callback is the cleanest option...
[16:34] <fwereade_> cherylj, so I don't *think* we need additional workers
[16:35] <cherylj> fwereade_: what do you mean by callback?
[16:36] <fwereade_> cherylj, so StartInstanceParams gets something like `StatusCallback func(InstanceStatus, string)`
[16:36] <voidspace> alexisb: popng
[16:36] <voidspace> *pong even
[16:37] <alexisb> voidspace, 1x1?
[16:37] <voidspace> alexisb: ah yes!
[16:37] <voidspace> sorry
[16:37] <alexisb> :)
[16:38] <fwereade_> cherylj, if we need more special tracking after StartInstance I'd hope we could get it via InstancePoller like everything else (with an option on a cloudinit2 alternative/supplement to instancepoller one day)
[16:39] <cherylj> fwereade_: the alternative is that we make creating container asynchronous.  We do enough to get the instance Id, return it to the provisioner, then start a goroutine and go about our merry way
[16:41] <fwereade_> cherylj, I would prefer not to -- that'd imply that the lxd broker had to accept long-term responsibility for completing the deployment in the face of all possible weirdness
[16:42] <cherylj> fwereade_: it would move that retry logic into the container code :)
[16:43] <cherylj> fwereade_: I think even with a callback, we have a chicken and egg problem
[16:43] <fwereade_> cherylj, but also add a bunch of responsibility for maintaining local state, surely?
[16:44] <cherylj> fwereade_: if we report provisioning status on an instance, not a machine
[16:44] <cherylj> fwereade_: we still need that instance back from StartInstance before we can report its status
[16:44] <fwereade_> cherylj, I'm imagining a StatusCallback implementation will be something like
[16:45] <mup> Bug #1515647 changed: Upgrade from 1.20.11 to 1.24.7 fails after machine-0 jujud updates <juju-core:Invalid by cox-katherine-e> <https://launchpad.net/bugs/1515647>
[16:45] <fwereade_> func(status InstanceStatus, info string) error {
[16:45] <fwereade_> if err := machine.SetInstanceStatus(status, info, nil); err != nil {
[16:46] <fwereade_> // etc
[16:46] <cherylj> fwereade_: and we could do that before we associate an instance with the machine?
[16:47] <fwereade_> cherylj, I think so -- model-wise, instance data is just a satellite of the machine entity -- and so is machine status, and so I think can be instance status
[16:47] <cherylj> fwereade_: okay, I can dig more down that path.  Thanks!
[16:48] <mup> Bug #1515647 opened: Upgrade from 1.20.11 to 1.24.7 fails after machine-0 jujud updates <juju-core:Invalid by cox-katherine-e> <https://launchpad.net/bugs/1515647>
[16:48] <fwereade_> so machine has .[Set]Status(), and [Set]InstanceStatus, and there's some AggregateStatus that takes the output of both to build the user-facing status doc
[16:49] <fwereade_> cherylj, (and that way we can represent doing-stuff-but-no-instance-id-yet in status)
[16:49] <cherylj> fwereade_: ahhh, nice
[16:49] <fwereade_> cherylj, a pleasure
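(Pulling fwereade_'s fragments together into a single hedged sketch: the StatusCallback field and the SetInstanceStatus call come straight from the discussion above, while the statusSetter interface and makeStatusCallback helper are illustrative names:)

    package provisioner // hypothetical home for the sketch

    type InstanceStatus string

    // statusSetter is the slice of the machine entity assumed here: instance
    // status hangs off the machine, so it can be set before an instance ID
    // exists - which is what allows status reporting during StartInstance.
    type statusSetter interface {
        SetInstanceStatus(status InstanceStatus, info string, data map[string]interface{}) error
    }

    // StartInstanceParams gains a callback so a provider/broker can report
    // provisioning progress from within StartInstance itself.
    type StartInstanceParams struct {
        StatusCallback func(status InstanceStatus, info string) error
        // ... existing fields elided ...
    }

    // makeStatusCallback is roughly what the provisioner would supply; the
    // user-facing status is later aggregated from machine + instance status.
    func makeStatusCallback(machine statusSetter) func(InstanceStatus, string) error {
        return func(status InstanceStatus, info string) error {
            return machine.SetInstanceStatus(status, info, nil)
        }
    }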
[16:51] <mup> Bug #1515647 changed: Upgrade from 1.20.11 to 1.24.7 fails after machine-0 jujud updates <juju-core:Invalid by cox-katherine-e> <https://launchpad.net/bugs/1515647>
[17:02] <katco> ericsnow: can you have a look at http://reviews.vapour.ws/r/3004/ when you have a chance?
[17:04] <ericsnow> katco: will do
[17:06] <katco> ericsnow: ty
[17:41] <dooferlad> frobware, voidspace: Tiny review if you have a moment: http://reviews.vapour.ws/r/3127/
[18:26] <voidspace> dooferlad: didn't I already do that...
[18:27] <voidspace> dooferlad: it's already on maas-spaces branch
[18:28] <voidspace> dooferlad: it shouldn't land on master
[19:15] <mup> Bug #1515736 opened: juju storage filesystem list  panics and dumps stack trace <juju-core:New> <https://launchpad.net/bugs/1515736>
[19:15] <mbruzek> hi mup that is my bug.
[19:16] <mbruzek> Who is working on the storage feature?  I found a panic that mup just pointed out.
[19:17] <mbruzek> oh I see cherylj already triaged it.
[19:17] <mbruzek> Thank you cherylj
[19:18] <cherylj> mbruzek: np.  I can fix that later this afternoon.   Super simple problem
[19:19] <cherylj> But it is surprising that it landed.  Means no one tried to actually run the command
[19:19] <cherylj> I guess it didn't get hit because of the mocking that happens in our unit tests
[19:20] <cherylj> mbruzek: do you want me to give you a patched juju to run until the bug is fixed?
[19:20] <mbruzek> cherylj: no need
[19:20] <cherylj> k
[19:21] <thumper> rick_h_: we don't need this meeting in 10 minutes do we?
[19:21] <thumper> rick_h_: although we do need to talk about environment users
[19:22] <mbruzek> cherylj:  I am just glad it is triage, no hurry on the fix.  I am trying to document the storage feature
[19:22] <rick_h_> thumper: up to you, I tried to make sure we had a space in case we did need it
[19:22] <cherylj> mbruzek: ah, okay.  Thanks for helping us find these issues ;)
[19:22]  * thumper thinks
[19:25] <thumper> rick_h_: yeah, lets chat
[19:31] <mbruzek> cherylj: who wrote the storage feature?  I have questions.
[19:31] <cherylj> mbruzek: axw
[19:55] <natefinch> sometimes I forget how crazy slow amazon is
[19:56] <perrito666> natefinch: compared to what?
[19:57] <natefinch> perrito666: the local provider, lxd provider... any machine built in the past 5 years
[19:57] <perrito666> lol, well if you count the amount of time I spend fixing my machine after some local provider tests I wouldn't be so sure
[20:00] <natefinch> perrito666: that's why the lxd provider is so awesome.   I wish our providers were plugins, so I could just use the lxd provider on my current bug (which is on 1.24)
[20:18] <mup> Bug #1515401 changed: destroy-environment leaving jujud on manual machines <ci> <destroy-environment> <manual-provider> <juju-core:Triaged> <juju-core series-in-metadata:Triaged> <https://launchpad.net/bugs/1515401>
[20:21] <mup> Bug #1515401 opened: destroy-environment leaving jujud on manual machines <ci> <destroy-environment> <manual-provider> <juju-core:Triaged> <juju-core series-in-metadata:Triaged> <https://launchpad.net/bugs/1515401>
[20:24] <mup> Bug #1515401 changed: destroy-environment leaving jujud on manual machines <ci> <destroy-environment> <manual-provider> <juju-core:Triaged> <juju-core series-in-metadata:Triaged> <https://launchpad.net/bugs/1515401>
[21:55]  * fwereade_ has that unique sinking feeling when he finally finds a strange-looking goroutine at the bottom of the timeout and it leads back to code that... I saw earlier today and annotated with an "I don't think this is right"
[22:31] <cherylj> wallyworld: release standup?
[22:40] <katco> natefinch: looks like master is open
[22:42] <katco> natefinch: kicked off a merge for you
[23:56] <davecheney> thumper: lucky(~/src/github.com/juju/juju/utils) % ls
[23:56] <davecheney> package_test.go  syslog  yaml.go
[23:56] <davecheney> getting there
[23:57] <davecheney> yaml.go is next on the chopping block