[01:11] thumper: is hosted env destruction going to be merged into master before the jes flag is removed? and CreateEnvironment added? === axw_ is now known as axw [01:12] axw: CreateEnvironment is already there [01:12] axw: and I hope so [01:12] thumper: oh, sweet, thanks [01:12] I've just found that some of my assumptions about how bootstrap works are wrong [01:12] :( [01:13] thumper: which branch has CreateEnvironment (not PrepareForCreateEnvironment)? [01:13] master does... [01:14] you mean api, cli, ? [01:14] thumper: I mean in EnvironProvider [01:14] ... [01:14] I'm not working on that [01:15] thumper: remember that discussion about creating resources for azure? [01:15] yeah... [01:15] ugh... [01:15] thumper: ok, I thought it would be part of multi-env [01:15] it has fallen off my todo list [01:15] it should be [01:15] my todo list gained weeks of work yesterday [01:15] that floated to the top [01:15] thumper: heh :) [01:16] thumper: BTW, I don't know if it'll be useful for other providers, but in Azure it's useful to know the UUID of the controller model [01:16] thumper: s/useful/critical/ [01:16] :) [01:17] thumper: might be worth adding to the env config at some point [01:17] yeah... [01:17] atm I'm just adding it to the azure config [01:17] k [01:17] as an internal thing [01:18] * axw wishes we had somewhere to store internal data that's not really config [02:04] axw: with Offer and ServiceOffer - Offer was from anatasia's branch and it's going away but there' stuff that depends on it. i've added a todo in my next branch but didn't add the todo in the previous branch [02:05] sorry for confusion [02:05] wallyworld_: ok, so long as it dies a quick death [02:05] it will [02:05] tomorrow [02:05] excellent [02:05] :) [02:05] any ideas on why a juju deploy to a container would use the wrong uncompress option? its trying to use xz on a .tar.gz cloud image tarball [02:07] bradm: it's the lxc scripts which uncompress the image [02:07] https://pastebin.canonical.com/143978/ is the slightly truncated output [02:08] juju merely downloads the image for lxc to then use as it sees fit [02:08] right. [02:09] its definately pulling the tarball from the right looking location [02:09] bradm: i know there was recent lxc breakage in wily due to upstream issues, but am not across the details [02:10] wallyworld_: that was more around the networking [02:10] wallyworld_: this is using trusty though? [02:10] rick_h__: ah yes, you are right [02:10] wallyworld_: nothing with the image compression formats that I'm aware of [02:10] yeah, correct, i forgot [02:11] bradm: all i can suggest is to look at the lxc ubuntu-cloud template which is a bash script to see what it is doing [02:11] i think it's in /etc/lxc somewhere [02:11] wallyworld_: would you expect juju to be providing a .tar.gz or a .tar.xz ? [02:12] hmm, no, not in /etc/lxc [02:12] bradm: .tar.gz - we simply download from cloud-images.ubuntu.com. we use the clould-image-query script to find out what to download from a series [02:12] I'll find it. [02:12] bradm: tl;dr; juju relies on upstream utils [02:13] bradm: /usr/share/lxc/templates [02:13] wallyworld_: right, but this is all on trusty, I'm a bit concern it just broke [02:13] yeah me too [02:13] hah, thats exactly where I'm looking [02:13] we need to fix obviously :-) [02:13] but we need to look upstream to diagnose [02:13] trying to work out if its lxc-ubuntu or lxc-ubuntu-cloud [02:14] bradm: i suspect lxc-ubuntu-cloud [02:14] the lxc-ctrea script chooses i think [02:14] lxc-create [02:15] bradm: this smells like upstream moved to xz for better compression but juju grabbed the tar.gz image [02:15] yeah, its definately hard coded to using xz, according to the script [02:16] damn, i'm o sick of this kernel bug killing my network [02:16] bradm: right https://github.com/lxc/lxc/commit/27c278a76931bfc4660caa85d1942ca91c86e0bf [02:17] bradm: line 334 in the diff seems about the right place [02:17] "lxc-ubuntu-cloud: Replace .tar.gz by .tar.xz and don't auto-generate missing tarballs" [02:17] from the release notes [02:17] rick_h__: hah, snap. :) [02:17] same thing, slightly different direction [02:18] bradm: yea, can you file a bug on that please and copy myself and cherylj into the bug please? [02:18] bradm: including your version of lxc, juju, etc? [02:18] rick_h__: sure. bug on where though? [02:18] ie juju-core or lxc? [02:18] bradm: and we'll have to see if that needs to be made more flexible (there's an auto detect the format flag we use in juju-gui tarball for xz) or something else [02:19] bradm: on lxc [02:19] bradm: we can't get a release of juju out to fix this tomorrow [02:19] bradm: so weneed to file a backward incompatible bug with them or something. Maybe it'll end up a bug in how juju is getting the image that it's not getting the new ones? [02:19] bradm: but let's start there and we'll start working together on it please [02:20] rick_h__: for sure, easy enough to move bugs around. [02:20] bradm: ty much and <3 for the catch [02:20] rick_h__: bradm: juju uses upstream cloud-images-query to get the image url [02:21] wallyworld: right, so something changed there in lxc that it's thinking a different image should be used? I'm not sure tbh. [02:21] so if that is telling juju the wrong image, that will ned to be fixed too [02:23] rick_h__: ubuntu-cloudimg-query trusty released amd64 --format %{url} on my system returns [02:23] https://cloud-images.ubuntu.com/server/releases/trusty/release-20151105/ubuntu-14.04-server-cloudimg-amd64.tar.gz [02:24] wallyworld: rgr [02:24] bradm: ^^^^ what does the above return on yours [02:25] wallyworld: same. [02:25] hmmm [02:26] that's what i would have expected [02:26] wallyworld: apparently the lxc container creation template script didn't. :) [02:26] that's what juju downloads [02:26] damn :-( [02:27] so ubuntu-clouldimg-query and lxc are out of sync [02:27] how did this not show up in our testing [02:27] dunno [02:28] wallyworld: we must not have tested the latest lxc release. This just came out the other day. Is it in backports/etc? [02:28] ah i see. not sure where it lives [02:28] wallyworld: so this was released 2 days ago [02:28] bradm: can you look where you get it from? [02:28] well i guess our tests will break soon enough :-) [02:29] wallyworld: hah [02:29] wallyworld: any particular reason why we need to model "remote endpoints", rather than just passing charm.Relations around? [02:30] aha! [02:30] we have -proposed enabled [02:30] so it looks like it hasn't made it to the main archive yet [02:31] axw: you mean for params across the wire? [02:31] wallyworld: yes [02:32] wallyworld: I'm reviewing anastasiamac's branch, just wondering whether we need params.RemoteEndpoint, or if we can just use charm.Relation [02:32] axw: we want to model wire structs distinct from the domain model. we pass in domain objects to api layer and map to params.* [02:32] rick_h__: theres the answer then, its not out in the wild yet [02:32] axw: and on the way out, we map params.* back to domain model [02:33] axw: but we currently leak params.* eberywhere :-( [02:33] because much of our domain model is defined in state [02:33] not is a model package [02:33] axw: this way we could also easily distinguish exported endpoints from native ones, i guess :(... at some stage... if we want.... \o/ [02:33] bradm: glad you caught that before it escaped :-) [02:34] wallyworld anastasiamac: we're using charm.Relation in apiserver/params for non-remote relations, I'm just trying to understand if there's a good reason to have separate ways of serialising them [02:35] wallyworld anastasiamac: I'm wondering if it's ever going to be the case that they'll have different information [02:35] axw: IMO what's there now then is a mistake, but i could be told otherwise [02:36] wallyworld: yeah, I know fwereade doesn't like that we just pass charm.X over the wire, but I don't know that having two ways of doing it is a good thing either [02:37] wallyworld: I was thinking the same thing about your argument for using the term "Endpoint" btw. it's true that "Relation" isn't a good name, but I think it will be confusing to have two ways to refer to the same thing in the codebase [02:37] that's worthy of consideration for sure [02:37] wallyworld: we'll be going from being consistently inconsistent to inconsistently inconsistent [02:37] axw: at some point, we need to fix things [02:37] and with juju 2.0 we can vreak stuff [02:38] so perhaps this new work is a good place to start [02:38] wallyworld: can't wait :) [02:38] * axw sharpens the axe [02:38] let's do it "the right way" now and fix the other stuff after 2.0 [02:38] ok [02:38] axw: we live on the edge! what's wrong with " inconsistently inconsistent"? :D [02:38] anastasiamac: cognitive overhead [02:39] axw: no sense of adventure :D [02:39] anastasiamac: I like to get shit done instead of labouring over what something means :) [02:39] axw: wallyworld: i agree that we'd benefit from doing the right thing now... sorry for mudding the mud [02:40] anastasiamac: no apologies required, I just wanted to check what we should be doing [02:40] axw: i prefer to code :D [02:40] axw: and it was not an apology :P [02:43] wallyworld: LP#1515463 if you want to poke at the bug for any reason [02:43] axw: that service directory branch should be good to go. a few of the things you mentioned were prior issues with cleanup pending as stuff is glued together over the next day or two [02:43] bradm: ty [02:44] wallyworld: I hope I captured the bug appropriately. [02:47] wallyworld: ok, looking [02:48] ouch, irc go boom [02:48] bradm: looks good. maybe a comment suggesting that ubuntu-cloudimg-query needs to be loooked at [02:48] bradm: ty you're my hero for catching it in proposed [02:48] bradm: +1 [02:49] wallyworld: LGTM [02:49] ty [02:50] wallyworld: \o/ plz land!!! :D [02:50] ah, do I have to make this against a particular lxc version? [02:52] bradm: the bug triager will allocate to the right version? maybe? [02:53] I'd just hate them to miss it and the package get out to the wild [02:54] * rick_h_ runs for the night, evening all [02:58] bradm: stehpane has commented on the bug. i've also commneted [03:00] wallyworld_: perfect. [03:01] wallyworld_: looks like its well in hand now then. [03:02] bradm: yeah, it does. we'll keep an eye on it at our end also :-) [03:02] I really don't care who fixes it where, just that I can deploy containers again. :) [03:04] yup :-) [03:05] well, I can by dropping -proposed, thats easy enough for now. [03:48] i just noticed that "juju run" stopped working in a 1.24.7 environment: https://pastebin.canonical.com/143985/plain/ [03:48] the unit log has the following: https://pastebin.canonical.com/143986/plain/ [03:48] hmm [03:48] interesting [03:49] :( [03:49] machine-0.log: https://pastebin.canonical.com/143988/plain/ [03:51] pjdc: it seems the agent is wedged [03:52] if you restart the agent, it should fix it [03:52] thumper: on machine 0? [03:52] which machine was the log from? [03:52] i already restarted the agent on the jenkins unit, which is the first log [03:52] that should fix it [03:53] it's still spewing leadership failure [03:53] was the log before or after the restart? [03:53] before [03:54] here's the restart: https://pastebin.canonical.com/143989/plain/ [03:54] and it's just been logging the dying/stopped lines ever since [03:54] ugh [03:54] is the environment HA? [03:55] it's not [03:55] can I see the logs from machine 0? [03:55] from before or after the chunk in https://pastebin.canonical.com/143988/plain/ ? [03:56] all i have in machine-0.log lining up with the jenkins unit agent restart is this: https://pastebin.canonical.com/143990/plain/ [03:56] how much do you have? [03:57] heh [03:57] i have everything; the environment is onyl a few hours old [03:57] provider? [03:57] openstack [03:57] hmm... [03:58] how big is the environment? [03:58] I think the first step is to file a bug for this failure [03:58] pretty small. three machines, one service each, and a few subordinates deployed to each [03:58] then we'll work out how to fix [03:58] it [03:58] kk [03:59] can I get you to do this: `juju set-env logging-config=juju=DEBUG` to change the logging level [03:59] then restart the machine-0 agent [03:59] done and done [03:59] this should give us more output [03:59] let it settle for 20s or so [03:59] then lets look at the logs [04:00] pjdc: did you want to open the bug or shall I? [04:00] i can open it [04:00] cheers [04:02] well, that's annoying [04:02] the restart seems to have made it work again [04:04] \o/ [04:04] right [04:04] the reason juju run was failing was due to the uniter bouncing [04:04] juju run is executed through the uniter worker [04:05] the uniter was bouncing due to leadership issues [04:05] it seems that bouncing the server fixed those issues... [04:05] which it shouldn't have had [04:06] pjdc: to get the logging back to the default, you can say `juju set-env logging-config=juju=WARNING` [04:07] righto, ta [04:07] rick_h_: ping [04:07] thumper: pong [04:07] rick_h_: you moved our meeting to a time I now have busy [04:07] thumper: was just about to ping you about moving that meeting [04:07] thumper: ah, sorry didn't show a conflict [04:07] thumper: what works for you? [04:07] rick_h_: if you can't make the earlier one, I'll change my one [04:07] rick_h_: it is a gym thing :) [04:07] thumper: yes sorry, I missed parent-teacher conferences on my personal calendar [04:07] the team lead meeting was moved to my normal gym time [04:07] I need some way to combine the two better so I don't schedule work stuff over personal stuff [04:07] so I booked a personal trainer for a session [04:08] thumper: hah, ok [04:08] filed as #1515475, fwiw [04:08] Bug #1515475: "juju run" stopped working after a few hours (1.24.7, newly deployed) [04:08] thumper: well we can move or do tonight or ... [04:08] pjdc: ta [04:08] rick_h_: as in now? [04:08] thumper: if you're ok with it? [04:08] sure, I have some questions [04:08] thumper: or I can move it forward 4hrs from where it sits now? [04:09] * thumper looks [04:09] thumper: earlier into the day, not sure if that's too early your time [04:09] thumper: around your standup time I guess [04:09] rick_h_: it is 13:30 now [04:09] four hours earlier is 9:30 [04:09] which is fine [04:09] what questions? want to do that now and still keep it tomorrow? [04:09] thumper: or what do we need from here? [04:10] tomorrow... as I think we'll need the hour :) [04:10] ok [04:10] Bug #1515475 opened: "juju run" stopped working after a few hours (1.24.7, newly deployed) [04:10] thumper: ah I can't...wife is away. doh [04:10] rick_h_: can't tomorrow? [04:10] rick_h_: we can go fast now if you like [04:10] thumper: can we do 8:30am? and bump your standup 30min? [04:11] could do 8am [04:11] and not bump standup [04:11] thumper: why fast now? if you're rnuning out are you heading back and we can do the full hour later tonight? [04:12] dinner date :) [04:12] 15 minutes now and some monday? [04:12] thumper: maybe, will be in london for customer thing monday [04:12] my questions aren't deep [04:12] really? heading to london? [04:12] thumper: k, let's do that and I'll try to get something else [04:12] for how long? [04:12] thumper: for 3 days, Tues customer meeting [04:12] \o/ [04:13] * thumper chuckles to himself [04:13] thumper: https://plus.google.com/hangouts/_/canonical.com/rick?authuser=1 [04:13] rick_h_: lets go! [04:28] Bug #1515475 changed: "juju run" stopped working after a few hours (1.24.7, newly deployed) [04:34] Bug #1515475 opened: "juju run" stopped working after a few hours (1.24.7, newly deployed) [04:41] wallyworld, rick_h_: its probably totally pointless testing telling us what we already know, but I just downgraded lxc related packages, put it on held and then redeployed, and its looking good. [04:42] yay [04:42] I'll mention it on the bug. [04:42] even though it seems well in hand. [04:42] bradm: by testing, i mean that our CI testing should catch this issue also [04:42] ty [04:43] wallyworld: right. has it just not run on proposed, or is there something else going on there? [04:43] bradm: we don't test with proposed AFAIK. but i guess we should [04:44] bradm: there are so many combinations of series, substrate, juju version etc [04:44] adding in proposed adds a whole new axis [04:44] wallyworld: indeed. [04:45] wallyworld: its just another set of jenkins jobs, right? ;) [04:46] bradm: yeah, but we don't have enough hardware. hardware is currently on order AFAIK [04:47] wallyworld: I know that feeling. [04:47] :-) [09:19] frobware: looks like I'm going to have to miss our standup again... my wife needs me to take her to the Dr today. [09:24] jam: ack. and for the record I might miss tomorrow's as I have a dental appointment. [10:00] frobware: I assume we're doing juju-core instead of standup? [10:01] voidspace, I vote we do standup [10:01] frobware: heh [10:03] voidspace, our call is cooler :) [10:04] dimitern: our call is way cooler [10:06] voidspace, so you're coming? [10:07] dimitern: eh, are you serious? we shouldn't miss juju-core should we? [10:07] even though not much is happening [10:07] voidspace, oh c'mon :P [10:08] dimitern: voidspace: :( [10:10] anastasiamac: o/ :-( [10:29] nope, they're done [10:56] dimitern, care to HO around maas/spaces? [10:57] wallyworld: ping [11:07] life after breakfast is much better [11:07] frobware, hey, yeah - in 10m? [11:08] dimitern, 30m [11:08] frobware, even better [11:11] dimitern, any chance you could join this https://plus.google.com/hangouts/_/midokura.com/juju_openstack [11:11] frobware, ok, just a sec.. [11:30] dimitern, thanks [11:30] frobware, I hope it was useful :) should we make the spaces call now? [11:31] dimitern, please [11:39] dimitern: looks like we can get static ranges from the subnets api in maas 1.9 [11:39] dimitern: we have to fetch all the subnets (1 api call) and then make an additional call per subnet to get the range [12:14] dimitern: a trivial one that should have been part of my last branch [12:14] (oops) [12:14] http://reviews.vapour.ws/r/3125/ [12:16] dimitern: in terms of the ListSpaces implementation for the maas provider [12:16] dimitern: will we make it part of the networking environ interface? [12:17] it will only be needed / used for maas [12:17] but I think it will have to be part of the interface for autodiscovery to use it [12:18] or I can provide a helper function that does it in the maas namespace, that casts a given provider to a maasEnviron and calls ListSpaces [12:21] voidspace, looking [12:22] voidspace, LGTM [12:22] dimitern: thanks [12:23] voidspace, yes, let's add a Spaces() method, taking no arguments for now (until we need filtering) [12:23] dimitern: add to the interface? [12:23] voidspace, yes, right after Subnets() should be a good place for it - don't you think? [12:24] dimitern: well, we're extending the public interface for all providers solely for maas [12:24] dimitern: so I quite liked the helper function idea [12:24] voidspace, nope, we'll use that for all providers eventually [12:24] if we ever get to shared spaces... [12:25] voidspace, shared or not doesn't matter [12:25] voidspace, EC2 can list your env spaces by looking at subnet tags + env uuid [12:25] for EC2 you just check the model [12:25] voidspace, or, it can list the global "shared" spaces, when we get there [12:25] the model is the source of truth [12:25] if we don't have shared spaces then juju is the source of truth about what spaces there are and you don't need to go to the provider [12:25] it's only once you have shared spaces (as maas does) that you need to ask the provider [12:26] but fair enough [12:26] interface method it is [12:26] in theory we might need it for other providers... [12:26] voidspace, we can't get away without shared spaces I'm afraid, we're just postponing the moment where we need to deal with them [12:26] I'm sceptical it will become the highest priority any time soon [12:26] but time will tell [12:27] indeed\ [12:36] dimitern: my current card (subnet api) may take a bit longer than I imagined (maybe an extra day) [12:36] dimitern: the code itself is simple, but I'll need to extend the gomaasapi test server again... [12:39] voidspace, sure [12:40] voidspace, it needs to work and be tested with both legacy and new APIs, and Subnets() is a big part of "maas spaces (basic) support" (the other main part is what dooferlad is doing after bootstrap) [12:43] dimitern: gah, the "reserved_ip_ranges" operation on maas lists all the ip ranges *except* the static range [12:43] so you can deduce it [12:43] I might just take the whole cidr minus the dynamic range [12:43] the other ranges are single ips for the gateway and cluster [12:43] voidspace, that's not entirely correct [12:43] dimitern: which bit [12:44] I guess it might not be correct [12:44] voidspace, static-range != cidr - dynamic-range [12:44] right [12:44] voidspace, as there might be IPs in neither of the ranges [12:47] voidspace, however, looking at http://maas.ubuntu.com/docs/api.html (development trunk version) [12:47] dimitern: there's unreserved range too [12:47] voidspace, there's op=statistics, which claims to include "subnet ranges - the specific IP ranges present in ths subnet (if specified)" [12:47] and even better: [12:47] Optional arguments: include_ranges: if True, includes detailed information about the usage of this range [12:48] voidspace, I'll give it a go on my maas now as it has all ranges set [12:48] dimitern: the static range there is included as "unused" and is the same range as returned by unused_ip_ranges [12:48] dimitern: I'm going to try reducing the size of the static range (so there are genuinely unused portions of the cidr) and see what happens [12:49] voidspace, yeah, good idea [12:50] dimitern: my static range is defined to start at 172.16.0.4 - but the unused range starts at 3 [12:51] voidspace, perfect! see this: $ maas hw-root subnet statistics 2 include_ranges=True -> http://paste.ubuntu.com/13238168/ [12:51] subnet 2 is my pxe subnet - 10.14.0.0/20 [12:51] fwereade_: priv ping me when you are around plz [12:52] dimitern: nope [12:52] dimitern: on my maas the static range starts at 172.16.0.4 [12:52] dimitern: however the "unused" range reported by statistics (and by unreserved_ip_range) starts at 172.16.0.3 [12:53] dimitern: I've updated my bug report [12:53] voidspace, all of the "unused" ranges are part of the static range (10.14.0.100 - 10.14.1.200 as defined on the cluster interface; dhcp range is 10.14.0.30-.90) [12:54] dimitern: the cluster configuration has "Static IP range low value" set to 172.16.0.4 [12:54] dimitern: are you saying that ignoring that is the correct behaviour? [12:55] voidspace, what do you mean? [12:55] dimitern: I'm talking about my maas here [12:55] dimitern: I have static IP range low value set to 172.16.0.4 [12:55] voidspace, ok [12:55] dimitern: but the range reported as unused/unreserved returns the low value as 172.16.0.3 [12:55] dimitern: it isn't returning the static range (as defined on the cluster interface) [12:55] voidspace, is 172.16.0.4 used for anything? [12:55] dimitern: but is returning the portion of the cidr unused by anything else [12:56] dimitern: that's the low bounds of the static range [12:56] it isn't used, but I don't see that it's relevant [12:56] voidspace, check the web ui for the subnet - e.g. http://10.14.0.1/MAAS/#/subnet/2 in my case [12:56] voidspace, ah, I see you point [12:57] voidspace, "unused" includes IPs not part of static range [12:57]