[00:41] so I am back to: value of (*params.Error) is nil, but a typed nil [00:41] :/ [00:49] nm [01:15] wallyworld: I'm around now, let me know when you want to chat (can wait till 1:1 if you like) [01:15] axw: am just typing in PR, will push in a sec [01:15] thumper: tools migration is going well so far. here's one change - several more on their way: http://reviews.vapour.ws/r/5033/ [01:22] axw: i have not reviewed or tested live yet http://reviews.vapour.ws/r/5034/ we can chat soon [01:23] omfg those internal networking tests are a waste of time and a bitch to fix [01:24] damn, am still missing one apiserver test too [01:34] thumper: thanks [02:33] off to do dinner. bbiab === natefinch-afk is now known as natefinch [02:39] thumper: why is our default log level =WARNING;unit=DEBUG ? [02:40] because it is how we see what the units are doing [02:40] unit logging is the output from hooks [02:40] and always useful [02:40] but you can explicitly turn it off [02:40] ....then why is it at debug? [02:41] also, I thought it was juju.unit ? or is unit special? === Spads_ is now known as Spads [02:42] also, why don't we show info by default? defaulting to warning means we drop a ton of useful context on the floor, and make debugging production systems really difficult [02:46] wallyworld: if you ignore all the hook failures... http://pastebin.ubuntu.com/17163593/ [02:46] natefinch: unit is special [02:46] natefinch: we should probably change to default to INFO [02:47] I have no real good reason why [02:47] thumper: nice, were you going to split the charm url also? [02:47] not in this branch [02:47] lots of people want warning [02:47] info is too verbose for them [02:47] wallyworld: they're welcome to set it to warning, but I think Info is a more reasonable default [02:47] depends who the audience is [02:47] do we cater for developers of devop people [02:48] or [02:48] wallyworld: not really. 
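The "typed nil" problem mentioned at [00:41] is a classic Go pitfall: a nil `*params.Error` stored in an `error` interface makes the interface itself compare non-nil. A minimal sketch (the `Error` type here is an illustrative stand-in for `params.Error`, not juju's actual type):

```go
package main

import "fmt"

// Error mimics a params.Error-style concrete error type.
type Error struct{ Message string }

func (e *Error) Error() string { return e.Message }

// mightFail returns a typed nil: the *Error pointer is nil, but once it
// is stored in the error interface the interface value is non-nil,
// because the interface still carries the concrete type *Error.
func mightFail() error {
	var e *Error // nil pointer
	return e     // interface now holds (type=*Error, value=nil)
}

func main() {
	err := mightFail()
	fmt.Println(err == nil)          // false: typed nil
	fmt.Println(err.(*Error) == nil) // true: the pointer inside is nil
}
```

The usual fix is to return a literal `nil` (of interface type) on the success path instead of returning a nil concrete pointer.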
We limit the amount of logs we store [02:48] wallyworld: and they can turn it down to warning if they want [02:48] and we can turn it up if we want [02:48] devop people i have met do not want lots of verbose logging [02:48] it's not verbose. It's specifically not. [02:48] but i have not talked to lots and lots of them [02:49] it's not debug... except unit, evidently :/ [02:49] but info is noise [02:49] verbose is subjective [02:49] yes it is noise [02:49] they just want warnings [02:49] I have tried working with logs set to warning and it's basically unusable [02:49] they just want to know when things go wrong [02:49] you can't tell WTF is going on [02:49] natefinch: for us, yes [02:49] unusable for you as a dev [02:49] not unusable for a devop person [02:49] usable for anyone who wants to support the server and figure out what is wrong [02:50] and that's the friction that always happens in these cases [02:50] I don't believe the devops people choosing warning know what they're talking about. [02:51] wallyworld: http://reviews.vapour.ws/r/5035/ [02:51] you forgot the IMHO [02:51] looking after i finish current queue [02:51] wallyworld: in all cases, APIPort use by the providers is only in StartInstance. how about we just add it to StartInstanceParams for now? [02:51] hmmm, that would work i think [02:51] .... some of them do. The people (mostly internal to canonical) who have used juju a lot, sure. [02:52] wallyworld: we could do the same for controller-uuid, and then add another method to Environ to destroy all hosted models/resources [02:52] (passing in the controller UUID to that) [02:52] axw: for now, i can just add controller uuid to setconfig params [02:52] and do that next bit later [02:53] wallyworld: yep doesn't have to be in one hit, but I think that's how we can make it a bit cleaner [02:53] +1 [02:53] one step at a time [03:09] thumper: well that's gone a bit better than charms.
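The logging-config default debated above is a semicolon-separated list of `module=LEVEL` pairs (in juju/loggo the root module is written `<root>`). A toy parser sketch of that string format — not the real loggo implementation juju uses:

```go
package main

import (
	"fmt"
	"strings"
)

// ParseLoggingConfig parses a juju-style logging-config string such as
// "<root>=WARNING;unit=DEBUG" into a module->level map. This is a toy
// sketch of the format discussed above, not loggo's actual parser.
func ParseLoggingConfig(spec string) (map[string]string, error) {
	levels := map[string]string{}
	for _, part := range strings.Split(spec, ";") {
		kv := strings.SplitN(part, "=", 2)
		if len(kv) != 2 || kv[0] == "" {
			return nil, fmt.Errorf("malformed logger spec %q", part)
		}
		levels[strings.TrimSpace(kv[0])] = strings.TrimSpace(kv[1])
	}
	return levels, nil
}

func main() {
	levels, err := ParseLoggingConfig("<root>=WARNING;unit=DEBUG")
	if err != nil {
		panic(err)
	}
	fmt.Println(levels["<root>"], levels["unit"]) // WARNING DEBUG
}
```

The WARNING-vs-INFO argument above is purely about which level the `<root>` entry defaults to; users can always override it per module.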
tools migration worked first time once all the required infrastructure was in place. [03:09] menn0: awesome [03:12] wallyworld: I'm looking at breaking out the charm now rather than my normal friday afternoon thing [03:12] rightio, almost starting a review === Spads_ is now known as Spads [03:22] axw: were we going to put region in controllers.yaml? [03:24] wallyworld: I already did [03:24] wallyworld: maybe we want to remove region from there? and just have it on the model? [03:25] yep [03:25] cloud on controller, region on model [03:25] yep [03:26] wallyworld: I added some comments to the diff [03:26] er review comments to your diff [03:26] ty, looking [03:27] axw: what's wrong with embedding that interface? [03:27] it's not what the interface is meant to be doing ... [03:27] wallyworld: its purpose is to get you a state.Model [03:27] not to get a model and model config and controller config [03:27] sure, but i'm extending its behaviour [03:28] wallyworld: which defines its purpose [03:28] an interface can do whatever methods you decide to put on it [03:28] i should change its name i guess [03:28] an Environ i think was from the old days when model was environ [03:29] wallyworld: no, I don't think you should change the name. the checkToolsAvailability function isn't even using the existing method on EnvironGetter AFAICS [03:30] wallyworld: separate responsibilities -> separate interfaces [03:30] axw: it does because it passes it to GetEnviron [03:30] wallyworld: which expects a ConfigGetter, no? [03:31] yes, or an interface that embeds that [03:31] wallyworld: so why would you wrap X in Y, only to pass X through to some other thing? that is pointless [03:32] and makes it unclear what the function really needs [03:32] it doesn't need the Model() method, it only needs the ConfigGetter part [03:32] it means we pass in one param whose behaviour we use in the method body in various places. 
i can do a separate param if you want [03:33] eg we pass in StateInterface in places and don't always use every method [03:33] wallyworld: yeah, that's a smell. we do that so we don't have to pass around a *state.State, which we used to [03:34] but in this case the method being called directly, its logic does use every method on the interface [03:34] less smelly, but still a smell [03:34] wallyworld: checkToolsAvailability doesn't. updateToolsAvailability does [03:35] updateToolsAvailability should take two things: an interface for getting the current config (ConfigGetter), and an interface for updating the model (EnvironGetter) [03:35] checkToolsAvailability only needs a ConfigGetter [03:35] ah, damn, i may have been dyslexic [03:35] i think i was confusing two method names as the same thing [03:35] ffs [03:43] wallyworld: am I making this login thing a critical/blocker to land? [03:43] sure [03:46] thumper, wallyworld: do we really want to repeat the cloud name for each model? they are always going to be the same [03:46] (in status) [03:47] i had read that as cloud region [03:47] axw: that's what was asked for [03:47] damn, dyslexic again [03:47] and it isn't always the same [03:47] if I have different models, they won't necessarily be in the same controller or cloud [03:47] hmm... [03:47] true, for the aggregated case [03:48] thumper: we're going to show models for multiple controllers? [03:48] I don't think so... [03:48] um... [03:48] thumper: OTOH it would be useful to see at a glance from a snapshot of status which cloud [03:48] perhaps I'm no longer clear what you are talking about [03:48] thumper: if I run "juju status", I'm seeing all the models for one controller [03:49] um... [03:49] thumper: ah hm never mind [03:49] if you run juju status, you only see one model [03:49] thumper: yep, forget me. that makes sense [03:52] axw: one of your comments is blank so the ditto beneath it makes no sense [03:53] wallyworld: ignore ditto sorry.
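The interface discussion above boils down to interface segregation: each function should ask only for the capability it actually uses. A hedged sketch using names that mirror the conversation (ConfigGetter, EnvironGetter, checkToolsAvailability) with invented bodies — not the real juju code:

```go
package main

import "fmt"

// Illustrative sketch of "separate responsibilities -> separate
// interfaces". The names mirror the log; everything else is invented.

type Config map[string]string

type Model struct{ Name string }

// ConfigGetter is all checkToolsAvailability needs.
type ConfigGetter interface {
	ModelConfig() (Config, error)
}

// EnvironGetter is the extra capability only updateToolsAvailability
// would need; it is kept as a separate, narrow interface.
type EnvironGetter interface {
	Model() (*Model, error)
}

// checkToolsAvailability asks only for a ConfigGetter, so its callers
// (and readers) can see at a glance what it depends on.
func checkToolsAvailability(g ConfigGetter) (string, error) {
	cfg, err := g.ModelConfig()
	if err != nil {
		return "", err
	}
	return cfg["agent-version"], nil
}

// fakeState satisfies both interfaces, the way *state.State would.
type fakeState struct{}

func (fakeState) ModelConfig() (Config, error) {
	return Config{"agent-version": "2.0-beta9"}, nil
}
func (fakeState) Model() (*Model, error) { return &Model{Name: "controller"}, nil }

func main() {
	v, _ := checkToolsAvailability(fakeState{})
	fmt.Println(v)
}
```

Passing a wide interface through to a function that only needs a narrow one is the "smell" axw is pointing at: the signature overstates the dependency.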
I (tried to) delete a comment after I answered my own question [03:54] ok === Spads_ is now known as Spads [04:29] axw: i've left two issues open but answered the questions.... [04:30] thumper: tools migration done: http://reviews.vapour.ws/r/5036/ [04:31] wallyworld: "no, different models will want to use their own logging levels on the agents" -- the controller agent(s) manage multiple models [04:33] axw: so a machine agent on a worker for model 1 will want to log differently to an agent for model 2 [04:33] model 1 and model 2 should have their own logging-config right? [04:33] wallyworld: I'm talking about the controller [04:33] wallyworld: they are the same agent [04:33] sure, but not on worker nodes [04:33] fair point about other workers tho [04:33] if anyone's feeling ambitious, this is a mostly mechanical change, to drop lxc support and use lxd in its place: http://reviews.vapour.ws/r/5027/ [04:34] wallyworld: I guess we shouldn't constrain it to how it works today anyway. it would be nice if it weren't global. we could have each worker in the controller take a logger with levels configured for the model [04:34] wallyworld: so I'll drop [04:34] natefinch: any progress on the --to lxd issue? [04:35] you have a +1 from eric right? [04:35] wallyworld: I do have a +1 from eric, yes.... do we need 2 +1's now? [04:35] func (fw *Firewaller) flushUnits(unitds []*unitData) error { [04:35] // flushUnits opens and closes ports for the passed unit data. [04:35] not if i have anything to do with it - except for when the reviewer feels like they need a second opinion [04:35] worst, name. ever [04:36] wallyworld: also, no, I don't have an idea about the lxd thing...
my guess is that it's a switch statement that we forgot to add lxd to [04:37] natefinch: so we can land this and then fix the other issue before release [04:38] wallyworld: going for lunch then fixing car, will finish review later [04:39] axw: np, ty [04:39] i'll start on the next bits [04:39] wallyworld: master is blocked, and this doesn't have a bug, AFAIK [04:40] natefinch: either jfdi or create a bug - i have been jfdi [04:40] we need this work for release [04:40] natefinch: ah but wait [04:41] we can't land until deploy --to is fixed [04:41] because it will break QA [04:41] doh [04:41] right, ok. I'll do that first [04:41] ty [04:46] gotta catch up on sleep, will figure it out in the morning. Seems like it's probably something pretty dumb. [05:18] wallyworld: axw whomever pr is in http://reviews.vapour.ws/r/5037/ [05:18] Be back in the local AM. [05:18] ty [05:30] https://github.com/juju/juju/pull/5594 [05:30] ^ anyone experienced with the firewaller, this is a small fix as a prereq for 1590161 [06:07] wallyworld: reviewed [06:33] axw: ty [07:01] wallyworld: you did a half change in your previous PR, you called the doc "defaultModelSettings" but the method is called "CloudConfig" still. shall I change it to DefaultModelConfig? sounds a bit off -- like it's config for a default model. maybe ModelConfigDefaults? [07:02] axw: yeah, sounds good ty === frankban|afk is now known as frankban [07:37] dimitern: ping [08:05] frobware: pong [08:05] dimitern: was just about the resolv.conf issue [08:05] frobware: I was looking at those bugs [08:05] frobware: trying to reproduce now with lxd on 1.9.3 [08:09] dimitern: I can help out in a bit - just trying to stash some stuff but in a meaningful state. [08:09] frobware: ok [08:27] frobware: no luck reproducing this so far :/ [08:27] dimitern: sounds like the whole of my yesterday :/ [08:27] frobware: (that is, if the lxds even come up ok) [08:27] dimitern: oh? 
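natefinch's guess at [04:36] — a switch statement that nobody added an lxd case to — is easy to picture. A toy illustration (not juju's actual code) of how that kind of omission hides, and why a loud default case matters:

```go
package main

import "fmt"

// Toy illustration of the "forgot to add lxd to a switch" guess above.
// ContainerType and the constants are stand-ins, not juju's instance
// package.

type ContainerType string

const (
	LXC ContainerType = "lxc"
	KVM ContainerType = "kvm"
	LXD ContainerType = "lxd" // newly added type
)

// supported forgot "case LXD:", so lxd falls through to the default.
// With an explicit error there the bug surfaces immediately; with a
// silent default it would just half-work, as described in the log.
func supported(t ContainerType) (bool, error) {
	switch t {
	case LXC, KVM:
		return true, nil
	default:
		return false, fmt.Errorf("unknown container type %q", t)
	}
}

func main() {
	_, err := supported(LXD)
	fmt.Println(err) // unknown container type "lxd"
}
```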
[08:28] frobware: I noticed on machine-0 there was an issue and all 3 lxds came up with 10.0.0.x addresses [08:28] dimitern: heh, that caught me out this morning. they are on the LXD bridge. [08:29] dimitern: when we probe for an unused subnet, that's pretty much the default address you'll get as there's not much else, network-wise, running [08:29] frobware: yeah, the issue is due to a race between setting the observed machine config with the created bridges and containers starting to deploy and trying to bridge their nics to yet-to-be-created host bridges [08:29] * frobware notes that his git stash list has grown to a depth of 32... [08:32] dimitern: explain that one to me in standup :) [08:32] frobware: otoh, if the bridges are created ok, lxds come up as expected with all NICs, and /e/resolv.conf has both nameserver and search (i.e. ping node-5 and ping node-5.maas-19 both work) [08:33] frobware: sure :) [09:02] dimitern: standup [09:07] voidspace, http://reviews.vapour.ws/r/5029/ [09:48] dimitern: regarding resolv.conf. we did a change way back to copy the /etc/resolv.conf from the host. is it possible that it is triggering that path but the host has no valid entry (not for you, but the bug reporter) [09:50] frobware: it's very much guaranteed that the containers' resolv.conf will be broken if their host's resolv.conf is also broken [09:51] frobware: btw commented on that bug for '--to lxd' [09:51] mgz: hey [09:52] mgz: are there any places in the CI tests which do the equivalent of 'juju deploy xyz --to lxd' ? [09:53] mgz: if there are any, it should be because there is a machine with hostname 'lxd' that's the intended target [10:13] dimitern: is it actually ambiguous? Can you use a maas-level machine name there instead of a juju-level machine number? [10:13] babbageclunk: of course you can [10:13] babbageclunk: unless your node happens to be called 'lxd' [10:14] dimitern: ok, just thought I'd check. [10:15] babbageclunk: actually... hmm - maybe only on maas I guess?
[10:17] babbageclunk: placement is supposed to work with existing machines (including containers), or new containers on existing machines [10:18] dimitern: So is the bug really that --to lxd (or lxc or kvm) should be an error? [10:18] babbageclunk: it even supports a list when num-units > 1: `juju deploy ubuntu -n 3 --to 0,0/lxd/1,lxd:1` [10:19] babbageclunk: placement for deploy and add-machine/bootstrap is handled slightly differently [10:19] babbageclunk: for the latter you *can* use 'add-machine ... --to lxd' or 'bootstrap --to node-x' (on maas) [10:20] dimitern: yeah, I was getting confused between them - I've interacted with add-machine and bootstrap more. [10:20] babbageclunk: that's an inconsistency though [10:21] babbageclunk: add-machine can do more than that - e.g. add-machine ssh:10.20.30.2 [10:21] babbageclunk: bootstrap --to lxd at least fails with `error: unsupported bootstrap placement directive "lxd"` [10:23] babbageclunk: so it looks like a maas provider issue - it implements PrecheckInstance (called by state at AddMachine time), but apparently not very well [10:24] dimitern: Ok, that seems easy enough to fix. [10:24] babbageclunk: tell-tale comment on line 566 in provider/maas/environ.go: `// If there's no '=' delimiter, assume it's a node name.` [10:25] but doesn't bother to validate it [10:25] fwereade: hey [10:26] fwereade: I think we don't have a clear separation between deploy-time placement and provision-time placement (i.e. deploy --to X vs add-machine X) [10:28] fwereade: I might be wrong, but I think 'deploy ubuntu --to lxd' was never intended to work, unlike '--to lxd:2', '--to 0', or '--to 0/lxd/0' [10:38] frobware: how about if we pass a list of interfaces to bridge explicitly to the script? [10:38] dimitern: sure; can we HO anyway as I have discovered some issues with lxd on aws [10:39] frobware: I was just about to have a quick bite - top of the hour? 
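The placement directives compared above (`--to 0`, `--to 0/lxd/1`, `--to lxd:1`, plain `--to lxd`, `add-machine ssh:user@host`) differ mainly in shape. A simplified classifier, loosely modeled on the idea behind instance/placement.go but not the real parser:

```go
package main

import (
	"fmt"
	"strings"
)

// classifyPlacement sketches how the directive forms discussed above
// differ. It is a simplification for illustration, not juju's
// instance.ParsePlacement.
func classifyPlacement(p string) string {
	switch {
	case p == "lxd" || p == "lxc" || p == "kvm":
		return "container on a new machine"
	case strings.Contains(p, ":"):
		return "scoped directive (e.g. lxd:1, ssh:user@host)"
	case strings.Contains(p, "="):
		return "provider key=value directive (e.g. zone=foo on maas)"
	case strings.Contains(p, "/"):
		return "existing container id (e.g. 0/lxd/1)"
	default:
		return "machine id or provider node name"
	}
}

func main() {
	for _, p := range []string{"lxd", "lxd:1", "0/lxd/1", "zone=foo", "0"} {
		fmt.Printf("%-8s -> %s\n", p, classifyPlacement(p))
	}
}
```

The ambiguity in the bug is the last case: a bare token like `lxd` can be read either as "new container on a new machine" or, on maas, as a node named "lxd".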
[10:39] dimitern: or later if you want more time; that's only 20 mins [10:39] dimitern: let's say ~1 hour and I'll go and eat too [10:40] frobware: I'm trying to add a machine to understand deploying to lxd better, but when I do add-machine it never goes from Deploying to Deployed in MAAS. [10:40] frobware: ok, sgtm [10:41] babbageclunk: for that I think you'll have to dig into the MAAS logs. [10:41] babbageclunk: oh, 2.0? [10:41] babbageclunk: trusty? [10:42] frobware: 2.0, xenial [10:42] babbageclunk: you run 'add-machine lxd' ? [10:42] frobware: just the machine, first - haven't gotten to deploy anything into a container. [10:42] babbageclunk: I don't use 2.0 very much, if at all. Most of the bugs I'm looking at explicitly reference 1.9.x [10:43] frobware: no, add-machine --series=xenial [10:44] frobware: Any idea how I can get onto the machine? I think it's the network that's not coming up. [10:44] babbageclunk: you can get to and see the console? [10:44] frobware: yeah, but I don't know login details. [10:44] babbageclunk: use vmm ? [10:45] if it's a kvm on your machine.. [10:45] dimitern: what username/password though? [10:45] babbageclunk: apply this http://pastebin.ubuntu.com/17167820/ [10:46] babbageclunk: (cd juju/provider/maas; make) [10:46] babbageclunk: none will work; 'ubuntu' but pwd auth is disabled [10:46] babbageclunk: then build juju [10:46] babbageclunk: then either start-over or run upgrade-juju and add another machine [10:47] frobware: hmm, I might try removing all of the vlans from the node first. [10:47] babbageclunk: that's ^^ a useful exercise as it does allow you to login when we bork networking [10:47] frobware: ok, will try it. [11:05] dimitern, hey, sorry [11:06] frobware: what's up? 
[11:06] dimitern, in my understanding `--to lxd` means "hand over deployment to the notional lxd compute provider that spans the capable machines in your model" [11:07] dimitern, "I want it in a container, don't bother me with the details" [11:07] oops, sorry frobware [11:08] fwereade: well, why do we have container=lxd as a constraint then? [11:08] dimitern, hysterical raisins [11:10] fwereade: so 'juju deploy ubuntu --to lxd' is supposed to work exactly like 'juju add-machine lxd && juju deploy ubuntu --to X', where X is the 'created machine X' add-machine reports [11:16] dimitern, yes [11:52] dimitern: hey, I kept working... can we sync after I have some lunch. :) [11:53] frobware: sure :) [12:48] dimitern: ping [12:49] dimitern: a quick sanity check. Every LinkLayerDevice should have a corresponding refs doc with a ref that defaults to 0. If non-zero the references are the number of devices that have this device as a parent (set in ParentName)? [12:50] dimitern: so a quick scan of the linklayerdevices counting parent references should enable me to reproduce it without having to directly migrate it. [13:02] voidspace: sorry, just got back [13:02] voidspace: yes, I think that's correct [13:03] voidspace: ah, well 'quick scan' could work but only if nothing else can add or remove stuff from the db while you do it [13:15] frobware: I tried your patch after trying a few other things, but it seems like passwd -d ubuntu just makes it so that ubuntu can't log in through the terminal. [13:15] frobware: trying it with chpasswd instead. [13:16] babbageclunk: I use that all the time [13:17] frobware: hmm. Definitely didn't let me log in. [13:17] frobware: maybe it's hanging before the bridgescript runs? [13:17] frobware, babbageclunk: the ubuntu account is locked usually [13:17] babbageclunk: if that's the case my patch is either borked, or the bridgescript did not run [13:18] in the cloud images [13:20] frobware, dimitern - trying deploying from maas without juju.
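voidspace's "quick scan" plan above — deriving each LinkLayerDevice's reference count by counting how many devices name it as parent — can be sketched like this (the struct shape is invented for illustration, not the real state document):

```go
package main

import "fmt"

// LinkLayerDevice is an illustrative stand-in for the state doc
// discussed above; only the two fields the scan needs are shown.
type LinkLayerDevice struct {
	Name       string
	ParentName string // empty if the device has no parent
}

// parentRefCounts rebuilds the refs that the per-device refs doc would
// hold: every device gets an entry (defaulting to 0), incremented once
// for each device that names it as parent.
func parentRefCounts(devices []LinkLayerDevice) map[string]int {
	refs := make(map[string]int, len(devices))
	for _, d := range devices {
		refs[d.Name] += 0 // ensure an entry exists, defaulting to 0
		if d.ParentName != "" {
			refs[d.ParentName]++
		}
	}
	return refs
}

func main() {
	devs := []LinkLayerDevice{
		{Name: "br-eth0"},
		{Name: "eth0", ParentName: "br-eth0"},
		{Name: "eth0.100", ParentName: "eth0"},
	}
	fmt.Println(parentRefCounts(devs)) // map[br-eth0:1 eth0:1 eth0.100:0]
}
```

As dimitern notes in the log, a scan like this is only trustworthy if nothing else can add or remove devices while it runs.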
[13:21] frobware, dimitern - how does the bridgescript get run? Juju gives it to maas which runs it via cloud-init? [13:21] babbageclunk: yep [13:21] babbageclunk: yeah, as a runcmd: in cloud-init user data [13:22] dimitern: can we HO? [13:23] frobware: sure - omw [13:24] frobware: joined standup HO [13:24] dimitern: heh, I was in the other one. omw [13:24] frobware, dimitern - ok, I see the same problem deploying with maas-only, so presumably the bridgescript never gets to run. [13:24] babbageclunk: is this with trusty on maas 2.0 ? [13:26] the install's paused for a long time with "Raise network interfaces", then it times out and continues to stop at a login prompt, but it's before cloud-init runs. [13:26] dimitern: xenial on maas 2.0 [13:26] babbageclunk: hmm well that's odd [13:28] dimitern: yeah. I'm going to kill off the vlans, that seems to trigger it. But I don't see why, since they didn't cause a problem before. [13:28] dimitern: Then at least I can try to understand the lxd deploy bug better without this getting in the way. [13:33] babbageclunk: sorry, otp [14:02] dimitern, dooferlad: are you guys looking at the deploy --to lxd issue? I had started looking at that last night, but didn't get very far. I need that to be fixed so I can land my code that removes all the lxc stuff [14:12] natefinch: yeah, I posted updates as well [14:14] natefinch: we may need to reassign if not finished by EOD [14:14] natefinch: deploy --to lxc and --to lxd or --to kvm are equally broken, so it shouldn't block landing your patch [14:17] natefinch: side-note: I'm more concerned with removing the LXC container type as valid; wasn't there a discussion to still allow both 'lxd' and 'lxc' (but treat both the same as 'lxd') for backwards-compatibility with existing bundles? 
[14:27] dimitern: bundles will treat lxc like lxd, yes [14:27] dimitern: it's just everything else that is getting lxc removed [14:28] natefinch: ok then [14:28] dimitern: btw, I swear there used to be help text for --to lxc that said "deploy to a container on a new machine" [14:29] dimitern: but I don't see it now, so maybe I'm crazy [14:29] natefinch: if there was, it was never tested [14:30] dimitern: so are we fixing the bug that it doesn't immediately error out, or are we fixing the bug that it doesn't work? [14:30] natefinch: and I know for sure maas provider is not handling this as it should; not tried others [14:30] hey dimitern, should bug 1590689 be fixed in 1.25.6? [14:30] Bug #1590689: MAAS 1.9.3 + Juju 1.25.5 - on the Juju controller node eth0 and juju-br0 interfaces have the same IP address at the same time [14:31] cherylj: not without backporting the fix I linked to from master [14:31] dimitern: sorry, what I mean is, should we hold off releasing 1.25.6 until that gets done? [14:31] cherylj: oh, sorry not that one [14:32] cherylj: ah, yeah it *is* that one - and FWIW I think we should not release 1.25.6 without it [14:32] dimitern: is the backport already on your (or someone's) to do list? [14:33] Bug #1591225 opened: Generated image stream is not considered in bootstrap on private cloud [14:33] cherylj: not to my knowledge [14:33] cherylj: I could switch to that and propose it (I have too many things in progress..) [14:34] boy I know how that feels. [14:34] dimitern: I think we're still a couple days away from a 1.25.6, so maybe aim to have it in by Tuesday? [14:37] cherylj: that would be great! [14:38] thanks, dimitern! [14:38] bbl [14:39] frobware: guess what? [14:39] its broken [14:39] dimitern: in beta6 [14:39] frobware: nope :) it works just the same with beta6 [14:39] dimitern: sigh [14:39] dimitern: so are we fixing it so that deploy --to lxd errors out the way --to lxc does? 
in my tests --to lxc says: "ERROR cannot add application "ubuntu3": unknown placement directive: lxc" [14:39] (...for a change) [14:40] natefinch: is that on maas btw? [14:40] dimitern: whereas --to lxd doesn't error out (but then never works either) [14:40] frobware: added a comment anyway [14:40] dimitern: thx [14:41] dimitern: no. I never test on maas. don't have one. GCE. but I can try aws if it's not still broken like it was yesterday [14:41] dimitern: it should be provider independent, though [14:42] natefinch: yeah, it *should*, but as it turns out it's not unfortunately [14:42] dimitern: I guess maas has that messed up "if it doesn't match anything else, let's assume it's a node" thing [14:43] natefinch: I'll do a quick test now how deploy --to lxc and lxd is handled on maas, gce, and aws [14:43] dimitern: I did GCE, so you can skip that one === tvansteenburgh1 is now known as tvansteenburgh [14:43] natefinch: ok, I'll try azure then [14:44] dimitern: lxd and kvm behave the same - they both return no error, but then never create a machine either [14:45] natefinch: something just occurred to me.. lxd uses the 'lxd' as the default domain for container FQDNs [14:46] natefinch: it might be the reason why lxd is different [14:46] dimitern: I'm pretty sure a placement directive of just a container type is supposed to work: https://github.com/juju/juju/blob/master/instance/placement.go#L71 [14:48] natefinch: yeah, but there's also the PrecheckInstance from the prechecker state policy, which is called while adding a machine [14:51] natefinch: hmm it looks like only maas is affected [14:52] natefinch: as all other providers expect '=' to be present in the placement or parsing fails [14:54] natefinch: or like joyent simply fails with placement != "" [14:56] cloudsigma doesn't even bother to do anything.. precheckInstance is { return nil }.. why implement it then? 
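dimitern's survey above found that most providers only accept `key=value` placements while maas falls back to "assume it's a node name" without validating. A hedged sketch of the stricter PrecheckInstance-style validation being suggested (illustrative only; not any provider's actual code):

```go
package main

import (
	"fmt"
	"strings"
)

// precheckPlacement sketches the validation discussed above: providers
// that only understand key=value placement should reject anything else
// with a clear error instead of silently assuming a node name. The
// real hook for this in juju is the provider's PrecheckInstance.
func precheckPlacement(placement string, validKeys []string) error {
	if placement == "" {
		return nil // no directive is always acceptable
	}
	kv := strings.SplitN(placement, "=", 2)
	if len(kv) != 2 {
		return fmt.Errorf("unknown placement directive: %v", placement)
	}
	for _, k := range validKeys {
		if kv[0] == k {
			return nil
		}
	}
	return fmt.Errorf("unknown placement key: %v", kv[0])
}

func main() {
	fmt.Println(precheckPlacement("zone=us-east-1a", []string{"zone"})) // <nil>
	fmt.Println(precheckPlacement("lxd", []string{"zone"}))            // unknown placement directive: lxd
}
```

An immediate error like this is what turns the silent half-working `--to lxd` into the explicit failure that `--to lxc` already produces.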
[14:59] dimitern, natefinch: I can see in the add-machine case where the decision to add a new machine with a container is made for lxc, I can't find anything corresponding to that in the deploy code. [15:01] ahh, add machine, that's where it is: juju add-machine lxd (starts a new machine with an lxd container) [15:01] I don't know why deploy would be any different [15:01] dimitern, natefinch: ooh - does State.addMachineWithPlacement need to grow a call to AddMachineInsideNewMachine to do it? [15:02] (in state/state.go:1249) [15:02] natefinch: standup time [15:02] katco: oops, thanks [15:03] babbageclunk: 1275 [15:03] babbageclunk: the actual code deploy uses lives in juju/deploy.go [15:03] dimitern: Yeah, but that will only put a new container in an existing machine. [15:04] dimitern: vs this code from add-machine https://github.com/juju/juju/blob/master/apiserver/machinemanager/machinemanager.go#L158 [15:04] natefinch: on AWS 'deploy ubuntu --to lxd' and --to lxc both appear to work, but neither adds a machine for the unit [15:04] dimitern: yeah, same for GCE [15:06] natefinch: so it looks consistently broken everywhere :) [15:06] I'd vote to reject '--to ' for deploy on its own (i.e. still allow '--to :') [15:07] dimitern: So the code from add-machine will create a new host with a container inside, but the deploy codepath won't because it doesn't call AddMachineInsideNewMachine. [15:07] until we can untangle the mess around it and make add-machine and deploy --to behave the same way [15:08] babbageclunk: yeah, because nobody thought about it too much I guess [15:10] dimitern: I think it's just an extra check in that function - if machineId is "", call AddMachineInsideNewMachine instead of AddMachineInsideMachine. [15:11] dimitern: testing it now [15:11] babbageclunk: that sounds correct [15:12] babbageclunk: but definitely *isn't* the way to fix the bug [15:13] babbageclunk: I mean.. 
this will allow deploy --to lxd to work, but it might also open a whole new can of worms on all providers [15:13] dimitern: I don't see why? (But I haven't been following the discussion closely.) [15:14] babbageclunk: e.g. deploy --to kvm on aws will start an instance but then fail to deploy the unit as kvm won't be supported [15:14] dimitern: Isn't that the same behaviour as add-machine kvm? [15:15] babbageclunk: similarly, --to lxd with 'default-series: precise' will similarly seem to pass initially, then fail as lxd is not supported on precise [15:15] babbageclunk: add-machine is similarly broken in those cases [15:16] dimitern: Isn't it worth doing this fix so add-machine and deploy behave in the same way (although both broken in the cases you describe)? [15:16] babbageclunk: add-machine accepts other things, e.g. ssh:user@hostname [15:17] babbageclunk: they still won't act the same [15:17] babbageclunk: but, at least they will be a step closer [15:18] dimitern: Yeah, it still seems like people expect them to work in the same way in this case. [15:18] they should be as consistent as possible [15:19] babbageclunk: ok, please ignore my previous rants then :) what you suggest is a good fix to have [15:20] * dimitern is just twitchy about changing core behavior before the release.. [15:20] dimitern: :) I mean, I think you're right that those cases are problems. [15:23] we should have a well define format for placement, which allows provider-specific scopes; e.g. deploy --to/add-machine :; where := |; := |=[,..] [15:23] dimitern: in AWS with AA-FF why do we use static addresses and not dhcp? [15:23] dimitern: in containers [15:24] frobware: because the FF [15:24] dimitern: sure, but really asking why static in that case [15:24] frobware: i.e. 
the user asked for static IPs [15:25] frobware: we use dhcp otherwise [15:26] frobware: but the whole point of the FF and now the multi-NIC approach on maas has always been to have static IPs for containers [15:32] dimitern: it was AWS I was questioning; the MAAS I can see because you can ask for static/dhcp there [15:33] frobware: you can on AWS as well [15:33] frobware: AssignPrivateIpAddress [15:33] http://docs.aws.amazon.com/AWSEC2/latest/APIReference/API_AssignPrivateIpAddresses.html [15:34] well not nearly equivalent to what maas offers. === tvansteenburgh1 is now known as tvansteenburgh [16:01] natefinch, when you have five minutes I have a few qs [16:01] frobware: ping [16:01] frobware: here's my patch so far: http://paste.ubuntu.com/17174180/ [16:02] alexisb: sure. [16:02] frobware: now testing on aws w/ && w/o AC-FF (xenial), and on maas-19 (t) / maas-20 (x) [16:03] https://hangouts.google.com/hangouts/_/canonical.com/juju-release [16:03] natefinch, ^^ [16:03] dimitern: it's nuts... all this manual testing we're BOTH doing... Grrr. [16:03] cherylj, feel free to crash the party [16:04] frobware: yeah.. [16:05] dimitern: your patch "so far" - does that mean use or wait? [16:06] frobware: so far only as long as the currently running make check passes [16:07] frobware: or if something comes up from the live tests (will be able to tell you shortly); otherwise I think I covered everything in what I pasted [16:11] frobware: yeah, I've missed a few tests in container/kvm [16:15] babbageclunk, dimitern: what is the consensus for a fix on lp 1590960 ?? [16:16] lp1590960 [16:16] alexisb: maybe bug 1590960? Or is mup sulking? [16:16] Bug #1590960: juju deploy --to lxd does not create base machine [16:17] there we go :) [16:17] I've got a fix, tested manually, just finishing the unit test for it. 
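babbageclunk's fix, as described earlier in the log ([15:10]): when a container placement carries no machine id (plain `--to lxd`), create the host machine and the container together instead of requiring an existing machine. A stand-in sketch whose function names mirror the log (AddMachineInsideNewMachine / AddMachineInsideMachine) but whose bodies are invented:

```go
package main

import "fmt"

// Stand-ins for the two state operations named in the discussion; the
// real methods live on *state.State and take machine templates.
func addMachineInsideMachine(host, containerType string) string {
	return fmt.Sprintf("container %s inside existing machine %s", containerType, host)
}

func addMachineInsideNewMachine(containerType string) string {
	return fmt.Sprintf("container %s inside a new host machine", containerType)
}

// placeContainer dispatches the way the proposed extra check does: an
// empty machineId means "provision a fresh host for the container too".
func placeContainer(machineId, containerType string) string {
	if machineId == "" {
		return addMachineInsideNewMachine(containerType)
	}
	return addMachineInsideMachine(machineId, containerType)
}

func main() {
	fmt.Println(placeContainer("", "lxd"))  // container lxd inside a new host machine
	fmt.Println(placeContainer("1", "lxd")) // container lxd inside existing machine 1
}
```

This mirrors what `add-machine lxd` already does, which is the consistency goal agreed on above.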
alexisb: we can fix deploy to work with --to , but that's not what's blocking natefinch's patch LXC-to-LXD [16:17] dimitern, correct it is not blocking [16:18] Should be up for review in ~10 mins [16:18] but looking at this morning's discussion there seemed to be some different ideas about what should work with --to and what shouldn't [16:18] was just curious what the expected behavior should be [16:20] alexisb, natefinch looks like --to lxc is also a problem on 1.25: https://bugs.launchpad.net/juju-core/+bug/1590960/comments/6 [16:20] Bug #1590960: juju deploy --to lxd does not create base machine [16:20] alexisb: that's the real issue: behavior was neither clearly defined nor tested [16:20] dimitern, exactly [16:21] alexisb: but it's sensible to expect deploy --to X should work like add-machine X does [16:21] dimitern, also agree [16:21] cherylj: an error is a lot better than silently half-working... but yeah, should be fixed to mirror add-machine [16:21] alexisb: and babbageclunk's fix should get us there [16:21] huzzah :) [16:22] but not all the way [16:22] dimitern, though a note to the juju-core team might be good so that we highlight the change and educate the team [16:22] babbageclunk, ^^ [16:22] agreed [16:22] [16:25] alexisb, dimitern: Clarifying - am I sending the note about this change? [16:25] babbageclunk: I'd appreciate it if you do, I can help clarify something or other if you need though [16:26] frobware: so the patch didn't work for aws [16:26] babbageclunk, yeah just to the juju-core launchpad group [16:26] dimitern: what happened? [16:26] frobware: ERROR juju.provisioner provisioner_task.go:681 cannot start instance for machine "0/lxd/0": missing [16:26] container network config [16:27] frobware: it slipped through somewhere.. looking [16:27] dimitern: why do I think that's an existing bug... ? [16:28] dimitern, alexisb: Ok cool - I think I understand the wider issues now.
Basically just that this will still do slightly weird things on clouds that don't support the container type, but at least the add-machine and deploy behaviour will be more consistent. [16:28] babbageclunk, yep [16:29] and we as a team should be clear on what the current behaviour is and the gaps, so we can both explain to users *and* make better decisions on what the behaviour should be [16:30] cmars, cherylj, do we have any progress on https://bugs.launchpad.net/juju-core/+bug/1581157 [16:30] Bug #1581157: github.com/juju/juju/cmd/jujud test timeout on windows [16:41] frobware: nope, it was a warning before [16:41] frobware: I'll need to add a few more tweaks to the patch and will resend [16:50] alexisb: I haven't heard anything from cmars about it === frankban is now known as frankban|afk [17:18] Bug #1591290 opened: serverSuite.TestStop unexpected error [17:21] frobware: fixed patch: http://paste.ubuntu.com/17177501/ [17:23] frobware: should now work ok on AWS (testing again); all unit tests fixed [17:24] frobware: I probably should've proposed it rather than bug you with it :/ [17:26] dimitern: you mention placement strings with = in them in that bug... but placement strings don't use = AFAIK? placement is like --to 0/lxc/0 or --to lxd:4 maybe you mean constraints? [17:28] natefinch: on maas you can do --to zone=foo [17:28] natefinch: and I think most others support zone= as well [17:29] natefinch: see, it's confusing :) [17:37] dimitern: gah, zone should be a constraint :/ [17:38] well... maybe not [17:38] I guess constraints are for all units of a service [17:39] still... weird === benji is now known as Guest73726 [17:40] natefinch: yeah, it can't be useful as a constraint if we're to do automatic zone distribution [17:41] dimitern: right (sometimes you might not want them distributed, but that's the exception). Anyway... many valid placements do not use =...
like specifying containers or machines [17:41] natefinch: there's also a container=lxd constraint btw, hardly tested [17:42] natefinch, dimitern: halp! After state.AddApplication's been called, the units are just staged, is that right? When/how does juju/deploy:AddUnits get called? [17:43] Is it triggered by a watcher of some sort? [17:43] babbageclunk: there's the unitassigner that makes sure units get assigned to machines [17:43] babbageclunk: it goes like this: cmd/juju/application/deploy.go -> api/application/deploy -> apiserver/application/deploy -> juju/deploy -> state [17:43] babbageclunk: it's a worker [17:45] dimitern: yeah, I could follow that, but none of the code in that chain actually ends up calling AssignUnitWithPlacement. [17:45] natefinch: Ah, ok - thanks. [17:46] babbageclunk: yeah, we add a staged assignment during deploy, and then the unitassigner reads those and turns them into real assignments. [17:50] natefinch: ok - that makes sense. I was trying to understand why I didn't see the error I see in my unit test when running deploy manually. [17:52] natefinch: It's because the errors are raised by the unitassigner and logged somewhere, rather than coming back from the api to the command. [17:55] frobware: FYI, proposed it: http://reviews.vapour.ws/r/5040/ [18:10] dimitern, natefinch: review please? http://reviews.vapour.ws/r/5041/ [18:10] babbageclunk: cheers, looking [18:10] dimitern: I mean, you shouldn't now! It's late there! It's kinda late here now! [18:12] dimitern: but thanks! [18:12] * babbageclunk is off home - have delightful weekends everyone! [18:12] babbageclunk: likewise! :) [18:20] dimitern: will take a look [18:47] brb reboot === Spads_ is now known as Spads [19:22] Bug #1588924 opened: juju list-controllers --format=yaml displays controller that cannot be addressed. === Spads_ is now known as Spads === natefinch is now known as natefinch-afk [20:41] cmars: still around?
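The placement forms natefinch and dimitern compare above (--to 0/lxc/0, --to lxd:4, --to zone=foo) can be told apart syntactically: key=value directives use =, container directives use scope:value, and the rest are bare machine or container ids. The sketch below is a deliberately simplified illustration of that split (hypothetical `Placement` type and `parsePlacement` function), not juju's actual instance.Placement parsing:

```go
package main

import (
	"fmt"
	"strings"
)

// Placement is a simplified take on a placement directive: either a
// key=value pair (e.g. zone=us-east-1a), a scoped directive (e.g. lxd:4),
// or a bare machine/container id (e.g. 0 or 0/lxd/0).
type Placement struct {
	Kind  string // "directive", "scoped", or "machine"
	Key   string // set for Kind == "directive" (e.g. "zone")
	Scope string // set for Kind == "scoped" (e.g. "lxd")
	Value string
}

func parsePlacement(s string) Placement {
	if k, v, ok := strings.Cut(s, "="); ok {
		return Placement{Kind: "directive", Key: k, Value: v}
	}
	if scope, v, ok := strings.Cut(s, ":"); ok {
		return Placement{Kind: "scoped", Scope: scope, Value: v}
	}
	return Placement{Kind: "machine", Value: s}
}

func main() {
	fmt.Println(parsePlacement("zone=us-east-1a").Kind) // directive
	fmt.Println(parsePlacement("lxd:4").Kind)           // scoped
	fmt.Println(parsePlacement("0/lxd/0").Kind)         // machine
}
```

This also shows why the = forms feel constraint-like: constraints are key=value too, but apply to all units of a service, while a placement targets one unit.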
[21:19] Bug #1591379 opened: bootstrap failure with MAAS doesn't tell me which node has a problem [21:26] perrito666, yep, what's up? [21:26] * perrito666 deleted what he was writing because he began in Spanish [21:26] cmars: I wanted to ask you about juju/permission [21:26] We are sort of moving in another direction http://reviews.vapour.ws/r/4973/#comment27181 [21:26] leo un poquito (I read a little) [21:27] looking [21:27] don't do that (the Spanish), you just short-circuited my brain badly :p [21:27] it is fun to see your own language and not understand it [21:28] :) [21:28] perrito666, is there a doc or tl;dr for the permissions changes? [21:55] Bug #1591387 opened: juju controller stuck in infinite loop during teardown [21:58] Bug #1591387 changed: juju controller stuck in infinite loop during teardown [22:10] Bug #1591387 opened: juju controller stuck in infinite loop during teardown