axw_ | thumper: is hosted env destruction going to be merged into master before the jes flag is removed? and CreateEnvironment added? | 01:11 |
---|---|---|
=== axw_ is now known as axw | ||
thumper | axw: CreateEnvironment is already there | 01:12 |
thumper | axw: and I hope so | 01:12 |
axw | thumper: oh, sweet, thanks | 01:12 |
thumper | I've just found that some of my assumptions about how bootstrap works are wrong | 01:12 |
thumper | :( | 01:12 |
axw | thumper: which branch has CreateEnvironment (not PrepareForCreateEnvironment)? | 01:13 |
thumper | master does... | 01:13 |
thumper | you mean api, cli, ? | 01:14 |
axw | thumper: I mean in EnvironProvider | 01:14 |
thumper | ... | 01:14 |
thumper | I'm not working on that | 01:14 |
axw | thumper: remember that discussion about creating resources for azure? | 01:15 |
thumper | yeah... | 01:15 |
thumper | ugh... | 01:15 |
axw | thumper: ok, I thought it would be part of multi-env | 01:15 |
thumper | it has fallen off my todo list | 01:15 |
thumper | it should be | 01:15 |
thumper | my todo list gained weeks of work yesterday | 01:15 |
thumper | that floated to the top | 01:15 |
axw | thumper: heh :) | 01:15 |
axw | thumper: BTW, I don't know if it'll be useful for other providers, but in Azure it's useful to know the UUID of the controller model | 01:16 |
axw | thumper: s/useful/critical/ | 01:16 |
thumper | :) | 01:16 |
axw | thumper: might be worth adding to the env config at some point | 01:17 |
thumper | yeah... | 01:17 |
axw | atm I'm just adding it to the azure config | 01:17 |
thumper | k | 01:17 |
axw | as an internal thing | 01:17 |
* axw wishes we had somewhere to store internal data that's not really config | 01:18 | |
wallyworld_ | axw: with Offer and ServiceOffer - Offer was from anastasia's branch and it's going away but there's stuff that depends on it. i've added a todo in my next branch but didn't add the todo in the previous branch | 02:04 |
wallyworld_ | sorry for confusion | 02:05 |
axw | wallyworld_: ok, so long as it dies a quick death | 02:05 |
wallyworld_ | it will | 02:05 |
wallyworld_ | tomorrow | 02:05 |
axw | excellent | 02:05 |
axw | :) | 02:05 |
bradm | any ideas on why a juju deploy to a container would use the wrong uncompress option? it's trying to use xz on a .tar.gz cloud image tarball | 02:05 |
wallyworld_ | bradm: it's the lxc scripts which uncompress the image | 02:07 |
bradm | https://pastebin.canonical.com/143978/ is the slightly truncated output | 02:07 |
wallyworld_ | juju merely downloads the image for lxc to then use as it sees fit | 02:08 |
bradm | right. | 02:08 |
bradm | it's definitely pulling the tarball from the right-looking location | 02:09 |
wallyworld_ | bradm: i know there was recent lxc breakage in wily due to upstream issues, but am not across the details | 02:09 |
rick_h__ | wallyworld_: that was more around the networking | 02:10 |
bradm | wallyworld_: this is using trusty though? | 02:10 |
wallyworld_ | rick_h__: ah yes, you are right | 02:10 |
rick_h__ | wallyworld_: nothing with the image compression formats that I'm aware of | 02:10 |
wallyworld_ | yeah, correct, i forgot | 02:10 |
wallyworld_ | bradm: all i can suggest is to look at the lxc ubuntu-cloud template which is a bash script to see what it is doing | 02:11 |
wallyworld_ | i think it's in /etc/lxc somewhere | 02:11 |
bradm | wallyworld_: would you expect juju to be providing a .tar.gz or a .tar.xz ? | 02:11 |
bradm | hmm, no, not in /etc/lxc | 02:12 |
wallyworld_ | bradm: .tar.gz - we simply download from cloud-images.ubuntu.com. we use the ubuntu-cloudimg-query script to find out what to download for a series | 02:12 |
bradm | I'll find it. | 02:12 |
wallyworld_ | bradm: tl;dr; juju relies on upstream utils | 02:12 |
wallyworld_ | bradm: /usr/share/lxc/templates | 02:13 |
bradm | wallyworld_: right, but this is all on trusty, I'm a bit concerned it just broke | 02:13 |
wallyworld_ | yeah me too | 02:13 |
bradm | hah, that's exactly where I'm looking | 02:13 |
wallyworld_ | we need to fix obviously :-) | 02:13 |
wallyworld_ | but we need to look upstream to diagnose | 02:13 |
bradm | trying to work out if its lxc-ubuntu or lxc-ubuntu-cloud | 02:13 |
wallyworld_ | bradm: i suspect lxc-ubuntu-cloud | 02:14 |
wallyworld_ | the lxc-create script chooses i think | 02:14 |
rick_h__ | bradm: this smells like upstream moved to xz for better compression but juju grabbed the tar.gz image | 02:15 |
bradm | yeah, it's definitely hard-coded to use xz, according to the script | 02:15 |
wallyworld | damn, i'm so sick of this kernel bug killing my network | 02:16 |
rick_h__ | bradm: right https://github.com/lxc/lxc/commit/27c278a76931bfc4660caa85d1942ca91c86e0bf | 02:16 |
rick_h__ | bradm: line 334 in the diff seems about the right place | 02:17 |
bradm | "lxc-ubuntu-cloud: Replace .tar.gz by .tar.xz and don't auto-generate missing tarballs" | 02:17 |
bradm | from the release notes | 02:17 |
bradm | rick_h__: hah, snap. :) | 02:17 |
bradm | same thing, slightly different direction | 02:17 |
rick_h__ | bradm: yea, can you file a bug on that please and copy myself and cherylj into the bug please? | 02:18 |
rick_h__ | bradm: including your version of lxc, juju, etc? | 02:18 |
bradm | rick_h__: sure. bug on where though? | 02:18 |
bradm | ie juju-core or lxc? | 02:18 |
rick_h__ | bradm: and we'll have to see if that needs to be made more flexible (there's an auto detect the format flag we use in juju-gui tarball for xz) or something else | 02:18 |
rick_h__ | bradm: on lxc | 02:19 |
rick_h__ | bradm: we can't get a release of juju out to fix this tomorrow | 02:19 |
rick_h__ | bradm: so we need to file a backward-incompatible bug with them or something. Maybe it'll end up being a bug in how juju is getting the image, in that it's not getting the new ones? | 02:19 |
rick_h__ | bradm: but let's start there and we'll start working together on it please | 02:19 |
bradm | rick_h__: for sure, easy enough to move bugs around. | 02:20 |
rick_h__ | bradm: ty much and <3 for the catch | 02:20 |
wallyworld | rick_h__: bradm: juju uses upstream cloud-images-query to get the image url | 02:20 |
rick_h__ | wallyworld: right, so something changed there in lxc that it's thinking a different image should be used? I'm not sure tbh. | 02:21 |
wallyworld | so if that is telling juju the wrong image, that will need to be fixed too | 02:21 |
wallyworld | rick_h__: ubuntu-cloudimg-query trusty released amd64 --format %{url} on my system returns | 02:23 |
wallyworld | https://cloud-images.ubuntu.com/server/releases/trusty/release-20151105/ubuntu-14.04-server-cloudimg-amd64.tar.gz | 02:23 |
rick_h__ | wallyworld: rgr | 02:24 |
wallyworld | bradm: ^^^^ what does the above return on yours | 02:24 |
bradm | wallyworld: same. | 02:25 |
wallyworld | hmmm | 02:25 |
wallyworld | that's what i would have expected | 02:26 |
bradm | wallyworld: apparently the lxc container creation template script didn't. :) | 02:26 |
wallyworld | that's what juju downloads | 02:26 |
wallyworld | damn :-( | 02:26 |
wallyworld | so ubuntu-cloudimg-query and lxc are out of sync | 02:27 |
wallyworld | how did this not show up in our testing | 02:27 |
bradm | dunno | 02:27 |
rick_h__ | wallyworld: we must not have tested the latest lxc release. This just came out the other day. Is it in backports/etc? | 02:28 |
wallyworld | ah i see. not sure where it lives | 02:28 |
rick_h__ | wallyworld: so this was released 2 days ago | 02:28 |
rick_h__ | bradm: can you look where you get it from? | 02:28 |
wallyworld | well i guess our tests will break soon enough :-) | 02:28 |
rick_h__ | wallyworld: hah | 02:29 |
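The mismatch tracked down above — ubuntu-cloudimg-query handing juju a `.tar.gz` URL while the updated lxc-ubuntu-cloud template hard-codes `.tar.xz` — boils down to a suffix disagreement. A minimal sketch of that check, with an illustrative helper name (not part of juju or lxc) and the suffix the lxc commit switched to:

```python
# Illustrative sketch of the mismatch discussed above: the suffix the
# lxc-ubuntu-cloud template now expects vs. the URL juju actually downloads.
TEMPLATE_SUFFIX = ".tar.xz"  # per the lxc change replacing .tar.gz

def image_matches_template(image_url: str, expected: str = TEMPLATE_SUFFIX) -> bool:
    """True if the downloaded tarball uses the suffix the template expects."""
    return image_url.endswith(expected)

# The URL ubuntu-cloudimg-query returned in the session below:
url = ("https://cloud-images.ubuntu.com/server/releases/trusty/"
       "release-20151105/ubuntu-14.04-server-cloudimg-amd64.tar.gz")
print(image_matches_template(url))  # → False: lxc will try xz on a gzip tarball
```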
axw | wallyworld: any particular reason why we need to model "remote endpoints", rather than just passing charm.Relations around? | 02:29 |
bradm | aha! | 02:30 |
bradm | we have -proposed enabled | 02:30 |
bradm | so it looks like it hasn't made it to the main archive yet | 02:30 |
wallyworld | axw: you mean for params across the wire? | 02:31 |
axw | wallyworld: yes | 02:31 |
axw | wallyworld: I'm reviewing anastasiamac's branch, just wondering whether we need params.RemoteEndpoint, or if we can just use charm.Relation | 02:32 |
wallyworld | axw: we want to model wire structs distinct from the domain model. we pass in domain objects to api layer and map to params.* | 02:32 |
bradm | rick_h__: theres the answer then, its not out in the wild yet | 02:32 |
wallyworld | axw: and on the way out, we map params.* back to domain model | 02:32 |
wallyworld | axw: but we currently leak params.* everywhere :-( | 02:33 |
wallyworld | because much of our domain model is defined in state | 02:33 |
wallyworld | not in a model package | 02:33 |
anastasiamac | axw: this way we could also easily distinguish exported endpoints from native ones, i guess :(... at some stage... if we want.... \o/ | 02:33 |
wallyworld | bradm: glad you caught that before it escaped :-) | 02:33 |
axw | wallyworld anastasiamac: we're using charm.Relation in apiserver/params for non-remote relations, I'm just trying to understand if there's a good reason to have separate ways of serialising them | 02:34 |
axw | wallyworld anastasiamac: I'm wondering if it's ever going to be the case that they'll have different information | 02:35 |
wallyworld | axw: IMO what's there now then is a mistake, but i could be told otherwise | 02:35 |
axw | wallyworld: yeah, I know fwereade doesn't like that we just pass charm.X over the wire, but I don't know that having two ways of doing it is a good thing either | 02:36 |
axw | wallyworld: I was thinking the same thing about your argument for using the term "Endpoint" btw. it's true that "Relation" isn't a good name, but I think it will be confusing to have two ways to refer to the same thing in the codebase | 02:37 |
wallyworld | that's worthy of consideration for sure | 02:37 |
axw | wallyworld: we'll be going from being consistently inconsistent to inconsistently inconsistent | 02:37 |
wallyworld | axw: at some point, we need to fix things | 02:37 |
wallyworld | and with juju 2.0 we can break stuff | 02:37 |
wallyworld | so perhaps this new work is a good place to start | 02:38 |
axw | wallyworld: can't wait :) | 02:38 |
* axw sharpens the axe | 02:38 | |
wallyworld | let's do it "the right way" now and fix the other stuff after 2.0 | 02:38 |
axw | ok | 02:38 |
anastasiamac | axw: we live on the edge! what's wrong with " inconsistently inconsistent"? :D | 02:38 |
axw | anastasiamac: cognitive overhead | 02:38 |
anastasiamac | axw: no sense of adventure :D | 02:39 |
axw | anastasiamac: I like to get shit done instead of labouring over what something means :) | 02:39 |
anastasiamac | axw: wallyworld: i agree that we'd benefit from doing the right thing now... sorry for muddying the waters | 02:39 |
axw | anastasiamac: no apologies required, I just wanted to check what we should be doing | 02:40 |
anastasiamac | axw: i prefer to code :D | 02:40 |
anastasiamac | axw: and it was not an apology :P | 02:40 |
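The layering wallyworld describes above — domain objects handed to the api layer, mapped to params.* wire structs on the way in and back to the domain model on the way out — might look roughly like this. The types and fields are hypothetical stand-ins; the real juju types (charm.Relation, params.RemoteEndpoint) are Go:

```python
from dataclasses import dataclass

# Hypothetical stand-ins for a domain type and its wire struct.
@dataclass
class Relation:            # domain model object
    name: str
    interface: str

@dataclass
class RemoteEndpoint:      # params.* wire struct, serialised over the API
    Name: str
    Interface: str

def to_params(rel: Relation) -> RemoteEndpoint:
    """Map a domain object to its wire struct at the API boundary."""
    return RemoteEndpoint(Name=rel.name, Interface=rel.interface)

def from_params(ep: RemoteEndpoint) -> Relation:
    """Map the wire struct back to the domain model on the way out."""
    return Relation(name=ep.Name, interface=ep.Interface)

rel = Relation(name="db", interface="mysql")
assert from_params(to_params(rel)) == rel  # round-trips cleanly
```

Keeping the wire struct distinct lets the API contract stay stable when the domain model changes, which is the argument against passing charm.X over the wire directly — at the cost of the duplication axw is questioning.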
bradm | wallyworld: LP#1515463 if you want to poke at the bug for any reason | 02:43 |
wallyworld | axw: that service directory branch should be good to go. a few of the things you mentioned were prior issues with cleanup pending as stuff is glued together over the next day or two | 02:43 |
wallyworld | bradm: ty | 02:43 |
bradm | wallyworld: I hope I captured the bug appropriately. | 02:44 |
axw | wallyworld: ok, looking | 02:47 |
rick_h_ | ouch, irc go boom | 02:48 |
wallyworld | bradm: looks good. maybe add a comment suggesting that ubuntu-cloudimg-query needs to be looked at | 02:48 |
rick_h_ | bradm: ty you're my hero for catching it in proposed | 02:48 |
wallyworld | bradm: +1 | 02:48 |
axw | wallyworld: LGTM | 02:49 |
wallyworld | ty | 02:49 |
anastasiamac | wallyworld: \o/ plz land!!! :D | 02:50 |
bradm | ah, do I have to make this against a particular lxc version? | 02:50 |
wallyworld | bradm: the bug triager will allocate to the right version? maybe? | 02:52 |
bradm | I'd just hate for them to miss it and the package to get out into the wild | 02:53 |
* rick_h_ runs for the night, evening all | 02:54 | |
wallyworld_ | bradm: stephane has commented on the bug. i've also commented | 02:58 |
bradm | wallyworld_: perfect. | 03:00 |
bradm | wallyworld_: looks like its well in hand now then. | 03:01 |
wallyworld_ | bradm: yeah, it does. we'll keep an eye on it at our end also :-) | 03:02 |
bradm | I really don't care who fixes it where, just that I can deploy containers again. :) | 03:02 |
wallyworld_ | yup :-) | 03:04 |
bradm | well, I can by dropping -proposed, that's easy enough for now. | 03:05 |
pjdc | i just noticed that "juju run" stopped working in a 1.24.7 environment: https://pastebin.canonical.com/143985/plain/ | 03:48 |
pjdc | the unit log has the following: https://pastebin.canonical.com/143986/plain/ | 03:48 |
thumper | hmm | 03:48 |
thumper | interesting | 03:48 |
pjdc | :( | 03:49 |
pjdc | machine-0.log: https://pastebin.canonical.com/143988/plain/ | 03:49 |
thumper | pjdc: it seems the agent is wedged | 03:51 |
thumper | if you restart the agent, it should fix it | 03:52 |
pjdc | thumper: on machine 0? | 03:52 |
thumper | which machine was the log from? | 03:52 |
pjdc | i already restarted the agent on the jenkins unit, which is the first log | 03:52 |
thumper | that should fix it | 03:52 |
pjdc | it's still spewing leadership failure | 03:53 |
thumper | was the log before or after the restart? | 03:53 |
pjdc | before | 03:53 |
pjdc | here's the restart: https://pastebin.canonical.com/143989/plain/ | 03:54 |
pjdc | and it's just been logging the dying/stopped lines ever since | 03:54 |
thumper | ugh | 03:54 |
thumper | is the environment HA? | 03:54 |
pjdc | it's not | 03:55 |
thumper | can I see the logs from machine 0? | 03:55 |
pjdc | from before or after the chunk in https://pastebin.canonical.com/143988/plain/ ? | 03:55 |
pjdc | all i have in machine-0.log lining up with the jenkins unit agent restart is this: https://pastebin.canonical.com/143990/plain/ | 03:56 |
thumper | how much do you have? | 03:56 |
thumper | heh | 03:57 |
pjdc | i have everything; the environment is only a few hours old | 03:57 |
thumper | provider? | 03:57 |
pjdc | openstack | 03:57 |
thumper | hmm... | 03:57 |
thumper | how big is the environment? | 03:58 |
thumper | I think the first step is to file a bug for this failure | 03:58 |
pjdc | pretty small. three machines, one service each, and a few subordinates deployed to each | 03:58 |
thumper | then we'll work out how to fix it | 03:58 |
thumper | kk | 03:58 |
thumper | can I get you to do this: `juju set-env logging-config=juju=DEBUG` to change the logging level | 03:59 |
thumper | then restart the machine-0 agent | 03:59 |
pjdc | done and done | 03:59 |
thumper | this should give us more output | 03:59 |
thumper | let it settle for 20s or so | 03:59 |
thumper | then lets look at the logs | 03:59 |
thumper | pjdc: did you want to open the bug or shall I? | 04:00 |
pjdc | i can open it | 04:00 |
thumper | cheers | 04:00 |
pjdc | well, that's annoying | 04:02 |
pjdc | the restart seems to have made it work again | 04:02 |
thumper | \o/ | 04:04 |
thumper | right | 04:04 |
thumper | the reason juju run was failing was due to the uniter bouncing | 04:04 |
thumper | juju run is executed through the uniter worker | 04:04 |
thumper | the uniter was bouncing due to leadership issues | 04:05 |
thumper | it seems that bouncing the server fixed those issues... | 04:05 |
thumper | which it shouldn't have had | 04:05 |
thumper | pjdc: to get the logging back to the default, you can say `juju set-env logging-config=juju=WARNING` | 04:06 |
pjdc | righto, ta | 04:07 |
thumper | rick_h_: ping | 04:07 |
rick_h_ | thumper: pong | 04:07 |
thumper | rick_h_: you moved our meeting to a time I now have busy | 04:07 |
rick_h_ | thumper: was just about to ping you about moving that meeting | 04:07 |
rick_h_ | thumper: ah, sorry didn't show a conflict | 04:07 |
rick_h_ | thumper: what works for you? | 04:07 |
thumper | rick_h_: if you can't make the earlier one, I'll change my one | 04:07 |
thumper | rick_h_: it is a gym thing :) | 04:07 |
rick_h_ | thumper: yes sorry, I missed parent-teacher conferences on my personal calendar | 04:07 |
thumper | the team lead meeting was moved to my normal gym time | 04:07 |
rick_h_ | I need some way to combine the two better so I don't schedule work stuff over personal stuff | 04:07 |
thumper | so I booked a personal trainer for a session | 04:07 |
rick_h_ | thumper: hah, ok | 04:08 |
pjdc | filed as #1515475, fwiw | 04:08 |
mup | Bug #1515475: "juju run" stopped working after a few hours (1.24.7, newly deployed) <juju-core:New> <https://launchpad.net/bugs/1515475> | 04:08 |
rick_h_ | thumper: well we can move or do tonight or ... | 04:08 |
thumper | pjdc: ta | 04:08 |
thumper | rick_h_: as in now? | 04:08 |
rick_h_ | thumper: if you're ok with it? | 04:08 |
thumper | sure, I have some questions | 04:08 |
rick_h_ | thumper: or I can move it forward 4hrs from where it sits now? | 04:08 |
* thumper looks | 04:09 | |
rick_h_ | thumper: earlier into the day, not sure if that's too early your time | 04:09 |
rick_h_ | thumper: around your standup time I guess | 04:09 |
thumper | rick_h_: it is 13:30 now | 04:09 |
thumper | four hours earlier is 9:30 | 04:09 |
thumper | which is fine | 04:09 |
rick_h_ | what questions? want to do that now and still keep it tomorrow? | 04:09 |
rick_h_ | thumper: or what do we need from here? | 04:09 |
thumper | tomorrow... as I think we'll need the hour :) | 04:10 |
rick_h_ | ok | 04:10 |
mup | Bug #1515475 opened: "juju run" stopped working after a few hours (1.24.7, newly deployed) <juju-core:New> <https://launchpad.net/bugs/1515475> | 04:10 |
rick_h_ | thumper: ah I can't...wife is away. doh | 04:10 |
thumper | rick_h_: can't tomorrow? | 04:10 |
thumper | rick_h_: we can go fast now if you like | 04:10 |
rick_h_ | thumper: can we do 8:30am? and bump your standup 30min? | 04:10 |
thumper | could do 8am | 04:11 |
thumper | and not bump standup | 04:11 |
rick_h_ | thumper: why fast now? if you're running out, are you heading back and we can do the full hour later tonight? | 04:11 |
thumper | dinner date :) | 04:12 |
thumper | 15 minutes now and some monday? | 04:12 |
rick_h_ | thumper: maybe, will be in london for customer thing monday | 04:12 |
thumper | my questions aren't deep | 04:12 |
thumper | really? heading to london? | 04:12 |
rick_h_ | thumper: k, let's do that and I'll try to get something else | 04:12 |
thumper | for how long? | 04:12 |
rick_h_ | thumper: for 3 days, Tues customer meeting | 04:12 |
thumper | \o/ | 04:12 |
* thumper chuckles to himself | 04:13 | |
rick_h_ | thumper: https://plus.google.com/hangouts/_/canonical.com/rick?authuser=1 | 04:13 |
thumper | rick_h_: lets go! | 04:13 |
mup | Bug #1515475 changed: "juju run" stopped working after a few hours (1.24.7, newly deployed) <juju-core:New> <https://launchpad.net/bugs/1515475> | 04:28 |
mup | Bug #1515475 opened: "juju run" stopped working after a few hours (1.24.7, newly deployed) <juju-core:Triaged> <https://launchpad.net/bugs/1515475> | 04:34 |
bradm | wallyworld, rick_h_: it's probably totally pointless testing telling us what we already know, but I just downgraded the lxc-related packages, put them on hold and then redeployed, and it's looking good. | 04:41 |
wallyworld | yay | 04:42 |
bradm | I'll mention it on the bug. | 04:42 |
bradm | even though it seems well in hand. | 04:42 |
wallyworld | bradm: by testing, i mean that our CI testing should catch this issue also | 04:42 |
wallyworld | ty | 04:42 |
bradm | wallyworld: right. has it just not run on proposed, or is there something else going on there? | 04:43 |
wallyworld | bradm: we don't test with proposed AFAIK. but i guess we should | 04:43 |
wallyworld | bradm: there are so many combinations of series, substrate, juju version etc | 04:44 |
wallyworld | adding in proposed adds a whole new axis | 04:44 |
bradm | wallyworld: indeed. | 04:44 |
bradm | wallyworld: it's just another set of jenkins jobs, right? ;) | 04:45 |
wallyworld | bradm: yeah, but we don't have enough hardware. hardware is currently on order AFAIK | 04:46 |
bradm | wallyworld: I know that feeling. | 04:47 |
wallyworld | :-) | 04:47 |
jam | frobware: looks like I'm going to have to miss our standup again... my wife needs me to take her to the Dr today. | 09:19 |
frobware | jam: ack. and for the record I might miss tomorrow's as I have a dental appointment. | 09:24 |
voidspace | frobware: I assume we're doing juju-core instead of standup? | 10:00 |
frobware | voidspace, I vote we do standup | 10:01 |
voidspace | frobware: heh | 10:01 |
dimitern | voidspace, our call is cooler :) | 10:03 |
voidspace | dimitern: our call is way cooler | 10:04 |
dimitern | voidspace, so you're coming? | 10:06 |
voidspace | dimitern: eh, are you serious? we shouldn't miss juju-core should we? | 10:07 |
voidspace | even though not much is happening | 10:07 |
dimitern | voidspace, oh c'mon :P | 10:07 |
anastasiamac | dimitern: voidspace: :( | 10:08 |
voidspace | anastasiamac: o/ :-( | 10:10 |
dimitern | nope, they're done | 10:29 |
frobware | dimitern, care to HO around maas/spaces? | 10:56 |
rogpeppe | wallyworld: ping | 10:57 |
perrito666 | life after breakfast is much better | 11:07 |
dimitern | frobware, hey, yeah - in 10m? | 11:07 |
frobware | dimitern, 30m | 11:08 |
dimitern | frobware, even better | 11:08 |
frobware | dimitern, any chance you could join this https://plus.google.com/hangouts/_/midokura.com/juju_openstack | 11:11 |
dimitern | frobware, ok, just a sec.. | 11:11 |
frobware | dimitern, thanks | 11:30 |
dimitern | frobware, I hope it was useful :) should we make the spaces call now? | 11:30 |
frobware | dimitern, please | 11:31 |
voidspace | dimitern: looks like we can get static ranges from the subnets api in maas 1.9 | 11:39 |
voidspace | dimitern: we have to fetch all the subnets (1 api call) and then make an additional call per subnet to get the range | 11:39 |
voidspace | dimitern: a trivial one that should have been part of my last branch | 12:14 |
voidspace | (oops) | 12:14 |
voidspace | http://reviews.vapour.ws/r/3125/ | 12:14 |
voidspace | dimitern: in terms of the ListSpaces implementation for the maas provider | 12:16 |
voidspace | dimitern: will we make it part of the networking environ interface? | 12:16 |
voidspace | it will only be needed / used for maas | 12:17 |
voidspace | but I think it will have to be part of the interface for autodiscovery to use it | 12:17 |
voidspace | or I can provide a helper function that does it in the maas namespace, that casts a given provider to a maasEnviron and calls ListSpaces | 12:18 |
dimitern | voidspace, looking | 12:21 |
dimitern | voidspace, LGTM | 12:22 |
voidspace | dimitern: thanks | 12:22 |
dimitern | voidspace, yes, let's add a Spaces() method, taking no arguments for now (until we need filtering) | 12:23 |
voidspace | dimitern: add to the interface? | 12:23 |
dimitern | voidspace, yes, right after Subnets() should be a good place for it - don't you think? | 12:23 |
voidspace | dimitern: well, we're extending the public interface for all providers solely for maas | 12:24 |
voidspace | dimitern: so I quite liked the helper function idea | 12:24 |
dimitern | voidspace, nope, we'll use that for all providers eventually | 12:24 |
voidspace | if we ever get to shared spaces... | 12:24 |
dimitern | voidspace, shared or not doesn't matter | 12:25 |
dimitern | voidspace, EC2 can list your env spaces by looking at subnet tags + env uuid | 12:25 |
voidspace | for EC2 you just check the model | 12:25 |
dimitern | voidspace, or, it can list the global "shared" spaces, when we get there | 12:25 |
voidspace | the model is the source of truth | 12:25 |
voidspace | if we don't have shared spaces then juju is the source of truth about what spaces there are and you don't need to go to the provider | 12:25 |
voidspace | it's only once you have shared spaces (as maas does) that you need to ask the provider | 12:25 |
voidspace | but fair enough | 12:26 |
voidspace | interface method it is | 12:26 |
voidspace | in theory we might need it for other providers... | 12:26 |
dimitern | voidspace, we can't get away without shared spaces I'm afraid, we're just postponing the moment where we need to deal with them | 12:26 |
voidspace | I'm sceptical it will become the highest priority any time soon | 12:26 |
voidspace | but time will tell | 12:26 |
dimitern | indeed | 12:27 |
voidspace | dimitern: my current card (subnet api) may take a bit longer than I imagined (maybe an extra day) | 12:36 |
voidspace | dimitern: the code itself is simple, but I'll need to extend the gomaasapi test server again... | 12:36 |
dimitern | voidspace, sure | 12:39 |
dimitern | voidspace, it needs to work and be tested with both legacy and new APIs, and Subnets() is a big part of "maas spaces (basic) support" (the other main part is what dooferlad is doing after bootstrap) | 12:40 |
voidspace | dimitern: gah, the "reserved_ip_ranges" operation on maas lists all the ip ranges *except* the static range | 12:43 |
voidspace | so you can deduce it | 12:43 |
voidspace | I might just take the whole cidr minus the dynamic range | 12:43 |
voidspace | the other ranges are single ips for the gateway and cluster | 12:43 |
dimitern | voidspace, that's not entirely correct | 12:43 |
voidspace | dimitern: which bit | 12:43 |
voidspace | I guess it might not be correct | 12:44 |
dimitern | voidspace, static-range != cidr - dynamic-range | 12:44 |
voidspace | right | 12:44 |
dimitern | voidspace, as there might be IPs in neither of the ranges | 12:44 |
dimitern | voidspace, however, looking at http://maas.ubuntu.com/docs/api.html (development trunk version) | 12:47 |
voidspace | dimitern: there's unreserved range too | 12:47 |
dimitern | voidspace, there's op=statistics, which claims to include "subnet ranges - the specific IP ranges present in ths subnet (if specified)" | 12:47 |
dimitern | and even better: | 12:47 |
dimitern | Optional arguments: include_ranges: if True, includes detailed information about the usage of this range | 12:47 |
dimitern | voidspace, I'll give it a go on my maas now as it has all ranges set | 12:48 |
voidspace | dimitern: the static range there is included as "unused" and is the same range as returned by unused_ip_ranges | 12:48 |
voidspace | dimitern: I'm going to try reducing the size of the static range (so there are genuinely unused portions of the cidr) and see what happens | 12:48 |
dimitern | voidspace, yeah, good idea | 12:49 |
voidspace | dimitern: my static range is defined to start at 172.16.0.4 - but the unused range starts at 3 | 12:50 |
dimitern | voidspace, perfect! see this: $ maas hw-root subnet statistics 2 include_ranges=True -> http://paste.ubuntu.com/13238168/ | 12:51 |
dimitern | subnet 2 is my pxe subnet - 10.14.0.0/20 | 12:51 |
perrito666 | fwereade_: priv ping me when you are around plz | 12:51 |
voidspace | dimitern: nope | 12:52 |
voidspace | dimitern: on my maas the static range starts at 172.16.0.4 | 12:52 |
voidspace | dimitern: however the "unused" range reported by statistics (and by unreserved_ip_range) starts at 172.16.0.3 | 12:52 |
voidspace | dimitern: I've updated my bug report | 12:53 |
dimitern | voidspace, all of the "unused" ranges are part of the static range (10.14.0.100 - 10.14.1.200 as defined on the cluster interface; dhcp range is 10.14.0.30-.90) | 12:53 |
voidspace | dimitern: the cluster configuration has "Static IP range low value" set to 172.16.0.4 | 12:54 |
voidspace | dimitern: are you saying that ignoring that is the correct behaviour? | 12:54 |
dimitern | voidspace, what do you mean? | 12:55 |
voidspace | dimitern: I'm talking about my maas here | 12:55 |
voidspace | dimitern: I have static IP range low value set to 172.16.0.4 | 12:55 |
dimitern | voidspace, ok | 12:55 |
voidspace | dimitern: but the range reported as unused/unreserved returns the low value as 172.16.0.3 | 12:55 |
voidspace | dimitern: it isn't returning the static range (as defined on the cluster interface) | 12:55 |
dimitern | voidspace, is 172.16.0.4 used for anything? | 12:55 |
voidspace | dimitern: but is returning the portion of the cidr unused by anything else | 12:55 |
voidspace | dimitern: that's the low bounds of the static range | 12:56 |
voidspace | it isn't used, but I don't see that it's relevant | 12:56 |
dimitern | voidspace, check the web ui for the subnet - e.g. http://10.14.0.1/MAAS/#/subnet/2 in my case | 12:56 |
dimitern | voidspace, ah, I see your point | 12:56 |
dimitern | voidspace, "unused" includes IPs not part of static range | 12:57 |
dimitern | voidspace, but not assigned, cluster, or dynamic ips | 12:57 |
voidspace | dimitern: correct | 12:58 |
voidspace | dimitern: in #juju on canonical, roaksox is saying that it doesn't matter and we should use the unreserved range anyway | 12:58 |
voidspace | dimitern: and in 2.0 the static range is going away | 12:58 |
dimitern | voidspace, awesome news! | 12:58 |
dimitern | :) | 12:58 |
voidspace | presumably it will just be implied from cidr - dynamic range | 12:58 |
dimitern | voidspace, then let's just do that and not use the node group interfaces | 12:59 |
voidspace | dimitern: yep | 12:59 |
dimitern | voidspace, then the bug I asked you to file is moot and can be closed | 12:59 |
voidspace | dimitern: it's done | 12:59 |
dimitern | voidspace, cheers | 13:00 |
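The deduction settled on above — treat the usable static range as the CIDR minus the dynamic range, minus the handful of reserved addresses like the gateway and cluster IPs — can be sketched with the stdlib ipaddress module. The network and ranges here are made up for illustration, and note dimitern's caveat: this over-approximates the configured static range, since some addresses may sit in neither range.

```python
import ipaddress

def deduced_static_hosts(cidr, dynamic_lo, dynamic_hi, reserved=()):
    """Hosts in `cidr` outside the dynamic range and not individually reserved.

    Sketch of the "cidr minus dynamic range" approach; it can include
    addresses that are in neither configured range.
    """
    net = ipaddress.ip_network(cidr)
    lo = ipaddress.ip_address(dynamic_lo)
    hi = ipaddress.ip_address(dynamic_hi)
    skip = {ipaddress.ip_address(r) for r in reserved}
    return [h for h in net.hosts()
            if not (lo <= h <= hi) and h not in skip]

# Made-up example: a /28 with a dynamic range of .8-.14 and the gateway at .1.
hosts = deduced_static_hosts("172.16.0.0/28", "172.16.0.8", "172.16.0.14",
                             reserved=["172.16.0.1"])
print(str(hosts[0]), len(hosts))  # → 172.16.0.2 6
```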
voidspace | perrito666: are you moonstone? | 13:07 |
perrito666 | voidspace: I am not | 13:07 |
voidspace | perrito666: ok | 13:08 |
voidspace | tests failed because "your quota allows for 0 more running instance(s). You requested at least 1" | 13:27 |
voidspace | *sigh* | 13:27 |
voidspace | dimitern: hmmmm... the networks api we're currently using for subnets allows filtering by nodeId (which we use) | 13:29 |
voidspace | dimitern: I don't think the new subnets api does | 13:29 |
voidspace | checking | 13:30 |
dimitern | voidspace, well, provided you use static IPs for all your nodes, this seems to work for me: $ maas hw-juju subnet ip-addresses 2 with_nodes=True -> http://paste.ubuntu.com/13238364/ | 13:32 |
voidspace | dimitern: so, request all subnets, request all ip addresses with node information, then request the unreserved range for every subnet | 13:33 |
voidspace | that's hardly simpler than what we currently have :-D | 13:33 |
voidspace | but we'll have space information | 13:33 |
voidspace | and we do the filtering rather than have maas do it | 13:33 |
voidspace | matching allocated addresses to subnets to do the node filtering | 13:34 |
voidspace | and that's a bunch of stuff to add to the test server as well :-/ | 13:34 |
dimitern | voidspace, yeah :/ it seems we still need 3 API calls | 13:35 |
dimitern | voidspace, but not every time | 13:35 |
voidspace | well, we need 1 plus 1 per subnet | 13:35 |
voidspace | and if we're filtering by instance id an extra one | 13:35 |
dimitern | voidspace, yeah | 13:35 |
voidspace | although for that case we can trim down the number of subnets we need to query | 13:36 |
voidspace | dimitern: hmmm... ec2 provider allows subnetIds to be empty (meaning list all subnets) | 13:39 |
voidspace | dimitern: and I remember a bug about that | 13:39 |
voidspace | dimitern: maas doesn't allow that | 13:39 |
voidspace | dimitern: apiserver/subnets/subnets.go calls netEnv.Subnets with an empty slice of subnet ids | 13:41 |
voidspace | dimitern: that will fail on maas | 13:41 |
dimitern | voidspace, yeah, because we had no way of linking networks to nodes apart from going via the cluster interfaces | 13:41 |
voidspace | dimitern: I guess it didn't matter when ec2 was the only platform supporting spaces | 13:42 |
voidspace | dimitern: but that needs fixing too | 13:42 |
voidspace | I bet "juju subnets list" fails for maas | 13:43 |
dimitern | voidspace, well, can't it return an error with empty subnetIDs only for the new api? | 13:43 |
voidspace | dimitern: other way round | 13:43 |
dimitern | voidspace, nope, it won't fail as it doesn't hit maas at all - just state | 13:43 |
voidspace | dimitern: it currently returns an error for empty subnetIds | 13:43 |
dimitern | voidspace, yeah, you got me :) | 13:43 |
voidspace | apiserver/subnets/subnets.go calls netEnv.Subnets | 13:44 |
voidspace | dimitern: in cacheSubnets | 13:44 |
dimitern | voidspace, that's for "subnet add" only | 13:44 |
voidspace | dimitern: ah, fair enough | 13:45 |
voidspace | maybe not an issue then | 13:45 |
voidspace | I won't fix it until we need to | 13:45 |
dimitern | voidspace, +1 | 13:45 |
dimitern | oh dear.. ci's f*cked again - euca-run-instances: error (InstanceLimitExceeded): Your quota allows for 0 more running instance(s). You requested at least 1 | 13:46 |
voidspace | yep | 13:46 |
mgz_ | dimitern: hm, the gating job? I'll see what else is up in ec2. | 13:56 |
dimitern | mgz_, yeah, and we were seeing some weird unit test failures from a parallel universe :) where state.machineDoc doesn't have Principals field (added by aram originally IIRC) | 13:59 |
mgz_ | are you sure the deps are correct? CI builds a completely clean tarball, which is not the same thing as building out of a local GOPATH | 14:01 |
cherylj | frobware: I know it's a bit late, but I did verify that your fix resolved the EMPTYCONFIG problem I ran into on maas | 14:02 |
dimitern | mgz_, well, something's fishy for sure, as machineDoc has "Principals []string" - no omitempty or anything, so it will be there, unless mongo returns bogus docs from the collection | 14:02 |
frobware | cherylj, which fix? setting static IP range, or the fix I committed yesterday? | 14:02 |
mgz_ | er, that's not good, the jenkins web ui just went down | 14:03 |
cherylj | frobware: I just checked with the latest master, since I saw you had already merged http://reviews.vapour.ws/r/3102/ | 14:03 |
frobware | cherylj, result! | 14:03 |
cherylj | frobware: I think we're going to try to cut a 1.25.1 soon. Should I move the 1.25 milestone for bug 1412621 to 1.25.2? or do you think you'll get to make the fix for 1.25 in the next day or so? | 14:05 |
mup | Bug #1412621: replica set EMPTYCONFIG MAAS bootstrap <adoption> <bootstrap> <bug-squad> <charmers> <cpec> <cpp> <maas-provider> <mongodb> <oil> <juju-core:Fix Committed by frobware> <juju-core 1.24:Won't Fix> <juju-core 1.25:In Progress> <https://launchpad.net/bugs/1412621> | 14:05 |
frobware | cherylj, happening now/this afternoon. Was on my list. | 14:05 |
cherylj | frobware: oh awesome, thanks! | 14:05 |
cherylj | mgz_: do you think you'll get bug 1512399 merged into 1.25 in the next day or so? | 14:06 |
mup | Bug #1512399: ERROR environment destruction failed: destroying storage: listing volumes: Get https://x.x.x.x:8776/v2/<UUID>/volumes/detail: local error: record overflow <amulet> <bug-squad> <openstack> <sts> <uosci> <Go OpenStack Exchange:In Progress by gz> <juju-core:Triaged> <juju-core | 14:06 |
mup | 1.25:Triaged> <https://launchpad.net/bugs/1512399> | 14:06 |
cherylj | mgz_: because we should probably get that into 1.25.1 | 14:06 |
beisner | cherylj, mgz - yes please :-) bundletester + openstack provider is in always-false-fail mode atm. | 14:09 |
mgz_ | cherylj: yeah, I should have that finished this week | 14:10 |
cherylj | ok, thanks, mgz_ ! | 14:10 |
frobware | dimitern, ok to close http://reviews.vapour.ws/r/3088/ as we're not doing 1.24? | 14:21 |
dimitern | frobware, yeah, I wanted to keep the branch around until I forward port it, but the PR and RB entries can be closed | 14:35 |
dimitern | frobware, done | 14:37 |
frobware | dimitern, thanks | 14:37 |
mup | Bug #1515647 opened: Upgrade from 1.20.11 to 1.24.7 fails after machine-0 jujud updates <juju-core:New> <https://launchpad.net/bugs/1515647> | 14:54 |
mup | Bug #1515647 changed: Upgrade from 1.20.11 to 1.24.7 fails after machine-0 jujud updates <juju-core:New> <https://launchpad.net/bugs/1515647> | 15:00 |
frobware | dimitern, could I leverage your expertise ... ? | 15:03 |
dimitern | frobware, can it wait for a while? trying to do a few things at once here.. | 15:05 |
frobware | dimitern, ok I'll pester voidspace. Need some help with the vanguard issue ^^ | 15:05 |
voidspace | frobware: shouldn't bug squad do it | 15:07 |
voidspace | I'm trying to do feature work after two weeks on bug squad | 15:08 |
mup | Bug #1515647 opened: Upgrade from 1.20.11 to 1.24.7 fails after machine-0 jujud updates <juju-core:New> <https://launchpad.net/bugs/1515647> | 15:15 |
katco | ericsnow: natefinch: sorry ubuntu froze on me. it's going to be a bit of a day i can tell | 15:22 |
natefinch | katco: heh no problem, we're just bullshitting about providers | 15:23 |
frobware | voidspace, bug squad picked it up; wasn't sure of the process. | 15:23 |
frobware | cherylj, replica set issue committed for 1.25 now - https://bugs.launchpad.net/juju-core/+bug/1412621 | 15:33 |
mup | Bug #1412621: replica set EMPTYCONFIG MAAS bootstrap <adoption> <bootstrap> <bug-squad> <charmers> <cpec> <cpp> <maas-provider> <mongodb> <oil> <juju-core:Fix Committed by frobware> <juju-core 1.24:Won't Fix> <juju-core 1.25:Fix Committed> <https://launchpad.net/bugs/1412621> | 15:33 |
cherylj | awesome, thanks, frobware ! | 15:33 |
voidspace | frobware: cool | 15:51 |
cherylj | fwereade_: you around? | 16:01 |
fwereade_ | cherylj, heyhey | 16:23 |
fwereade_ | cherylj, sorry I missed you | 16:23 |
fwereade_ | cherylj, what can I do for you? | 16:23 |
cherylj | fwereade_: thanks for the additional info on the instancepoller. I think that will help simplify some of the work. | 16:30 |
cherylj | fwereade_: but, how would we track the instance progress with lxc? | 16:31 |
cherylj | fwereade_: do you have any thoughts on that? | 16:31 |
* fwereade_ scratches head vaguely -- not sure how granular the info we can get from lxd is -- is .Status() intrinsically limited there? | 16:32 | |
fwereade_ | cherylj, if we use the cloudinit2 report-progress-back, would that help? | 16:32 |
cherylj | fwereade_: the problem is that all the "interesting things" happen before we return an instance back from StartInstance | 16:32 |
fwereade_ | cherylj, ah damn yes ofc | 16:33 |
alexisb | voidspace, ping | 16:33 |
* fwereade_ reloading context a bit... | 16:34 | |
fwereade_ | cherylj, I think that callback is the cleanest option... | 16:34 |
fwereade_ | cherylj, so I don't *think* we need additional workers | 16:34 |
cherylj | fwereade_: what do you mean by callback? | 16:35 |
fwereade_ | cherylj, so StartInstanceParams gets something like `StatusCallback func(InstanceStatus, string)` | 16:36 |
voidspace | alexisb: popng | 16:36 |
voidspace | *pong even | 16:36 |
alexisb | voidspace, 1x1? | 16:37 |
voidspace | alexisb: ah yes! | 16:37 |
voidspace | sorry | 16:37 |
alexisb | :) | 16:37 |
fwereade_ | cherylj, if we need more special tracking after StartInstance I'd hope we could get it via InstancePoller like everything else (with an option on a cloudinit2 alternative/supplement to instancepoller one day) | 16:38 |
cherylj | fwereade_: the alternative is that we make creating container asynchronous. We do enough to get the instance Id, return it to the provisioner, then start a goroutine and go about our merry way | 16:39 |
fwereade_ | cherylj, I would prefer not to -- that'd imply that the lxd broker had to accept long-term responsibility for completing the deployment in the face of all possible weirdness | 16:41 |
cherylj | fwereade_: it would move that retry logic into the container code :) | 16:42 |
cherylj | fwereade_: I think even with a callback, we have a chicken and egg problem | 16:43 |
fwereade_ | cherylj, but also add a bunch of responsibility for maintaining local state, surely? | 16:43 |
cherylj | fwereade_: if we report provisioning status on an instance, not a machine | 16:44 |
cherylj | fwereade_: we still need that instance back from StartInstance before we can report its status | 16:44 |
fwereade_ | cherylj, I'm imagining a StatusCallback implementation will be something like | 16:44 |
mup | Bug #1515647 changed: Upgrade from 1.20.11 to 1.24.7 fails after machine-0 jujud updates <juju-core:Invalid by cox-katherine-e> <https://launchpad.net/bugs/1515647> | 16:45 |
fwereade_ | func(status InstanceStatus, info string) error { | 16:45 |
fwereade_ | if err := machine.SetInstanceStatus(status, info, nil); err != nil { | 16:45 |
fwereade_ | // etc | 16:46 |
cherylj | fwereade_: and we could do that before we associate an instance with the machine? | 16:46 |
fwereade_ | cherylj, I think so -- model-wise, instance data is just a satellite of the machine entity -- and so is machine status, and so I think can be instance status | 16:47 |
cherylj | fwereade_: okay, I can dig more down that path. Thanks! | 16:47 |
mup | Bug #1515647 opened: Upgrade from 1.20.11 to 1.24.7 fails after machine-0 jujud updates <juju-core:Invalid by cox-katherine-e> <https://launchpad.net/bugs/1515647> | 16:48 |
fwereade_ | so machine has .[Set]Status(), and [Set]InstanceStatus, and there's some AggregateStatus that takes the output of both to build the user-facing status doc | 16:48 |
fwereade_ | cherylj, (and that way we can represent doing-stuff-but-no-instance-id-yet in status) | 16:49 |
cherylj | fwereade_: ahhh, nice | 16:49 |
fwereade_ | cherylj, a pleasure | 16:49 |
mup | Bug #1515647 changed: Upgrade from 1.20.11 to 1.24.7 fails after machine-0 jujud updates <juju-core:Invalid by cox-katherine-e> <https://launchpad.net/bugs/1515647> | 16:51 |
katco | ericsnow: can you have a look at http://reviews.vapour.ws/r/3004/ when you have a chance? | 17:02 |
ericsnow | katco: will do | 17:04 |
katco | ericsnow: ty | 17:06 |
=== sarnold_ is now known as sarnold | ||
dooferlad | frobware, voidspace: Tiny review if you have a moment: http://reviews.vapour.ws/r/3127/ | 17:41 |
voidspace | dooferlad: didn't I already do that... | 18:26 |
voidspace | dooferlad: it's already on maas-spaces branch | 18:27 |
voidspace | dooferlad: it shouldn't land on master | 18:28 |
mup | Bug #1515736 opened: juju storage filesystem list panics and dumps stack trace <juju-core:New> <https://launchpad.net/bugs/1515736> | 19:15 |
mbruzek | hi mup that is my bug. | 19:15 |
mbruzek | Who is working on the storage feature? I found a panic that mup just pointed out. | 19:16 |
mbruzek | oh I see cherylj already triaged it. | 19:17 |
mbruzek | Thank you cherylj | 19:17 |
cherylj | mbruzek: np. I can fix that later this afternoon. Super simple problem | 19:18 |
cherylj | But it is surprising that it landed. Means no one tried to actually run the command | 19:19 |
cherylj | I guess it didn't get hit because of the mocking that happens in our unit tests | 19:19 |
cherylj | mbruzek: do you want me to give you a patched juju to run until the bug is fixed? | 19:20 |
mbruzek | cherylj: no need | 19:20 |
cherylj | k | 19:20 |
thumper | rick_h_: we don't need this meeting in 10 minutes do we? | 19:21 |
thumper | rick_h_: although we do need to talk about environment users | 19:21 |
mbruzek | cherylj: I am just glad it is triage, no hurry on the fix. I am trying to document the storage feature | 19:22 |
rick_h_ | thumper: up to you, I tried to make sure we had a space in case we did need it | 19:22 |
cherylj | mbruzek: ah, okay. Thanks for helping us find these issues ;) | 19:22 |
* thumper thinks | 19:22 | |
thumper | rick_h_: yeah, lets chat | 19:25 |
mbruzek | cherylj: who wrote the storage feature? I have questions. | 19:31 |
cherylj | mbruzek: axw | 19:31 |
natefinch | sometimes I forget how crazy slow amazon is | 19:55 |
perrito666 | natefinch: compared to what? | 19:56 |
natefinch | perrito666: the local provider, lxd provider... any machine built in the past 5 years | 19:57 |
perrito666 | lol, well if you count the amount of time I spend fixing my machine after some local provider tests I wouldn't be so sure | 19:57 |
natefinch | perrito666: that's why the lxd provider is so awesome. I wish our providers were plugins, so I could just use the lxd provider on my current bug (which is on 1.24) | 20:00 |
=== akhavr1 is now known as akhavr | ||
mup | Bug #1515401 changed: destroy-environment leaving jujud on manual machines <ci> <destroy-environment> <manual-provider> <juju-core:Triaged> <juju-core series-in-metadata:Triaged> <https://launchpad.net/bugs/1515401> | 20:18 |
mup | Bug #1515401 opened: destroy-environment leaving jujud on manual machines <ci> <destroy-environment> <manual-provider> <juju-core:Triaged> <juju-core series-in-metadata:Triaged> <https://launchpad.net/bugs/1515401> | 20:21 |
mup | Bug #1515401 changed: destroy-environment leaving jujud on manual machines <ci> <destroy-environment> <manual-provider> <juju-core:Triaged> <juju-core series-in-metadata:Triaged> <https://launchpad.net/bugs/1515401> | 20:24 |
* fwereade_ has that unique sinking feeling when he finally finds a strange-looking goroutine at the bottom of the timeout and it leads back to code that... I saw earlier today and annotated with an "I don't think this is right" | 21:55 | |
cherylj | wallyworld: release standup? | 22:31 |
katco | natefinch: looks like master is open | 22:40 |
katco | natefinch: kicked off a merge for you | 22:42 |
davecheney | thumper: lucky(~/src/github.com/juju/juju/utils) % ls | 23:56 |
davecheney | package_test.go syslog yaml.go | 23:56 |
davecheney | getting there | 23:56 |
davecheney | yaml.go is next on the chopping block | 23:57 |
Generated by irclog2html.py 2.7 by Marius Gedminas - find it at mg.pov.lt!