/srv/irclogs.ubuntu.com/2016/06/10/#juju-dev.txt

redirso I am back to: value of (*params.Error) is nil, but a typed nil00:41
redir:/00:41
redirnm00:49
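
The "typed nil" that redir mentions above is a general Go behaviour: an interface holding a nil pointer is not itself nil. A minimal sketch, assuming a hypothetical `apiError` type standing in for `params.Error` (this is not the actual juju code):

```go
package main

import "fmt"

// apiError is a hypothetical stand-in for params.Error.
type apiError struct {
	Message string
}

// Error implements the error interface on the pointer type.
func (e *apiError) Error() string {
	if e == nil {
		return "<nil>"
	}
	return e.Message
}

// typedNil returns a nil *apiError wrapped in an error interface.
// The interface value still carries a type, so it does not equal nil.
func typedNil() error {
	var e *apiError
	return e
}

// plainNil checks the pointer first and returns an untyped nil instead.
func plainNil() error {
	var e *apiError
	if e == nil {
		return nil
	}
	return e
}

func main() {
	fmt.Println(typedNil() == nil) // false
	fmt.Println(plainNil() == nil) // true
}
```

Returning an explicit `nil` when the pointer is nil, as `plainNil` does, is one common way to avoid the surprise.
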
axwwallyworld: I'm around now, let me know when you want to chat (can wait till 1:1 if you like)01:15
wallyworldaxw: am just typing in PR, will push in a sec01:15
menn0thumper: tools migration is going well so far. here's one change - several more on their way: http://reviews.vapour.ws/r/5033/01:15
wallyworldaxw: i have not reviewed or tested live yet http://reviews.vapour.ws/r/5034/ we can chat soon01:22
wallyworldomfg those internal networking tests are a waste of time and a bitch to fix01:23
wallyworlddamn, am still missing one apiserver test too01:24
menn0thumper: thanks01:34
rediroff to do dinner. bbiab02:33
=== natefinch-afk is now known as natefinch
natefinchthumper: why is our default log level <root>=WARNING;unit=DEBUG ?02:39
thumperbecause it is how we see what the units are doing02:40
thumperunit logging is the output from hooks02:40
thumperand always useful02:40
thumperbut you can explicitly turn it off02:40
natefinch....then why is it at debug?02:40
natefinchalso, I thought it was juju.unit ?  or is unit special?02:41
=== Spads_ is now known as Spads
natefinchalso, why don't we show info by default?  defaulting to warning means we drop a ton of useful context on the floor, and make debugging production systems really difficult02:42
thumperwallyworld: if you ignore all the hook failures... http://pastebin.ubuntu.com/17163593/02:46
thumpernatefinch: unit is special02:46
thumpernatefinch: we should probably change to default to INFO02:46
thumperI have no real good reason why02:47
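
For reference, the default `<root>=WARNING;unit=DEBUG` is a semicolon-separated list of `module=LEVEL` pairs. The following is only a rough sketch of how such a string could be split into per-module levels; `parseLoggingConfig` is a hypothetical helper written for illustration and is not the parser juju or loggo actually uses:

```go
package main

import (
	"fmt"
	"strings"
)

// parseLoggingConfig splits a spec such as "<root>=WARNING;unit=DEBUG"
// into a map from module name to upper-cased level name.
// Hypothetical helper; it does not validate the level names.
func parseLoggingConfig(spec string) (map[string]string, error) {
	levels := make(map[string]string)
	for _, part := range strings.Split(spec, ";") {
		part = strings.TrimSpace(part)
		if part == "" {
			continue
		}
		kv := strings.SplitN(part, "=", 2)
		if len(kv) != 2 || strings.TrimSpace(kv[0]) == "" {
			return nil, fmt.Errorf("invalid logging spec %q", part)
		}
		levels[strings.TrimSpace(kv[0])] = strings.ToUpper(strings.TrimSpace(kv[1]))
	}
	return levels, nil
}

func main() {
	levels, err := parseLoggingConfig("<root>=WARNING;unit=DEBUG")
	if err != nil {
		panic(err)
	}
	fmt.Println(levels["<root>"], levels["unit"])
}
```

Changing the default to INFO, as discussed above, would amount to changing the `<root>` entry while leaving the `unit` entry alone.
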
wallyworldthumper: nice, were you going to split the charm url also?02:47
thumpernot in this branch02:47
wallyworldlots of people want warning02:47
wallyworldinfo is too verbose for them02:47
natefinchwallyworld:  they're welcome to set it to warning, but I think Info is a more reasonable default02:47
wallyworlddepends who the audience is02:47
wallyworlddo we cater for developers or devop people02:47
wallyworldor02:48
natefinchwallyworld: not really.  We limit the amount of logs we store02:48
natefinchwallyworld: and they can turn it down to warning if they want02:48
wallyworldand we can turn it up if we want02:48
wallyworlddevop people i have met do not want lots of verbose logging02:48
natefinchit's not verbose.  It's specifically not.02:48
wallyworldbut i have not talked to lots and lots of them02:48
natefinchit's not debug... except unit, evidently :/02:49
thumperbut info is noise02:49
wallyworldverbose is subjective02:49
wallyworldyes it is noise02:49
wallyworldthey just want warnings02:49
natefinchI have tried working with logs set to warning and it's basically unusable02:49
wallyworldthey just want to know when things go wrong02:49
natefinchyou can't tell WTF is going on02:49
thumpernatefinch: for us, yes02:49
wallyworldunusable for you as a dev02:49
wallyworldnot unusable for a devop person02:49
natefinchusable for anyone who wants to support the server and figure out what is wrong02:49
wallyworldand that's the friction that always happens in these cases02:50
natefinchI don't believe the devops people choosing warning know what they're talking about.02:50
thumperwallyworld: http://reviews.vapour.ws/r/5035/02:51
wallyworldyou forgot the IMHO02:51
wallyworldlooking after i finish current queue02:51
axwwallyworld: in all cases, APIPort use by the providers is only used in StartInstance. how about we just add it to StartInstanceParams for now?02:51
wallyworldhmmm, that would work i think02:51
natefinch.... some of them do.  The people (mostly internal to canonical) who have used juju a lot, sure.02:51
axwwallyworld: we could do the same for controller-uuid, and then add another method to Environ to destroy all hosted models/resources02:52
axw(passing in the controller UUID to that)02:52
wallyworldaxw: for now, i can just add controller uuid to setconfig params02:52
wallyworldand do that next bit later02:52
axwwallyworld: yep doesn't have to be in one hit, but I think that's how we can make it a bit cleaner02:53
wallyworld+102:53
wallyworldone step at a time02:53
menn0thumper: well that's gone a bit better than charms. tools migration worked first time once all the required infrastructure was in place.03:09
thumpermenn0: awesome03:09
thumperwallyworld: I'm looking at breaking out the charm now rather than my normal friday afternoon thing03:12
wallyworldrightio, almost starting a review03:12
=== Spads_ is now known as Spads
wallyworldaxw: were we going to put region in controllers.yaml?03:22
axwwallyworld: I already did03:24
axwwallyworld: maybe we want to remove region from there? and just have it on the model?03:24
wallyworldyep03:25
axwcloud on controller, region on model03:25
wallyworldyep03:25
axwwallyworld: I added some comments to the diff03:26
axwer review comments to your diff03:26
wallyworldty, looking03:26
wallyworldaxw: what's wrong with embedding that interface?03:27
axwit's not what the interface is meant to be doing ...03:27
axwwallyworld: its purpose is to get you a state.Model03:27
axwnot to get a model and model config and controller config03:27
wallyworldsure, but i'm extending its behaviour03:27
axwwallyworld: which defines its purpose03:28
wallyworldan interface can do whatever methods you decide to put on it03:28
wallyworldi should change its name i guess03:28
wallyworldan Environ i think was from the old days when model was environ03:28
axwwallyworld: no, I don't think you should change the name. the checkToolsAvailability function isn't even using the existing method on EnvironGetter AFAICS03:29
axwwallyworld: separate responsibilities -> separate interfaces03:30
wallyworldaxw: it does because it passes it to GetEnviron03:30
axwwallyworld: which expects a ConfigGetter, no?03:30
wallyworldyes, or an interface that embeds that03:31
axwwallyworld: so why would you wrap X in Y, only to pass X through to some other thing? that is pointless03:31
axwand makes it unclear what the function really needs03:32
axwit doesn't need the Model() method, it only needs the ConfigGetter part03:32
wallyworldit means we pass in one param whose behaviour we use in the method body in various places. i can do a separate param if you want03:32
wallyworldeg we pass in StateInterface in places and don't always use every method03:33
axwwallyworld: yeah, that's a smell. we do that so we don't have to pass around a *state.State, which we used to03:33
wallyworldbut in this case the method being called directly, its logic does use every method on the interface03:34
axwless smelly, but still a smell03:34
axwwallyworld: checkToolsAvailability doesn't. updateToolsAvailability does03:34
axwupdateToolsAvailability should take two things: an interface for getting the current config (ConfigGetter), and an interface for updating the model (EnvironGetter)03:35
axwcheckToolsAvailability only needs a ConfigGetter03:35
wallyworldah, damn, i may have been dyslexic03:35
wallyworldi think i was confusing two method names as the same thing03:35
wallyworldffs03:35
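
The point axw makes above (separate responsibilities, separate interfaces) can be sketched as follows. The interface bodies and the `ModelUpdater` name are assumptions made for illustration; the real `ConfigGetter` and `EnvironGetter` in juju have different methods:

```go
package main

import "fmt"

// ConfigGetter is a simplified, hypothetical version of the interface
// that only exposes the current configuration.
type ConfigGetter interface {
	AgentVersion() string
}

// ModelUpdater is a hypothetical stand-in for the "update the model" role
// that EnvironGetter plays in the discussion.
type ModelUpdater interface {
	SetAvailableTools(version string)
}

// checkToolsAvailability only reads config, so it only asks for a ConfigGetter.
// published is assumed to be sorted oldest to newest.
func checkToolsAvailability(cfg ConfigGetter, published []string) string {
	if len(published) == 0 {
		return ""
	}
	newest := published[len(published)-1]
	if newest == cfg.AgentVersion() {
		return ""
	}
	return newest
}

// updateToolsAvailability needs both roles, so it takes both interfaces.
func updateToolsAvailability(cfg ConfigGetter, m ModelUpdater, published []string) {
	if v := checkToolsAvailability(cfg, published); v != "" {
		m.SetAvailableTools(v)
	}
}

// fakeModel satisfies both interfaces for the purposes of this sketch.
type fakeModel struct {
	current   string
	available string
}

func (f *fakeModel) AgentVersion() string       { return f.current }
func (f *fakeModel) SetAvailableTools(v string) { f.available = v }

func main() {
	m := &fakeModel{current: "2.0-beta8"}
	updateToolsAvailability(m, m, []string{"2.0-beta8", "2.0-beta9"})
	fmt.Println(m.available)
}
```

Each function signature then states exactly what the function needs, which is the clarity axw is asking for.
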
axwwallyworld: am I making this login thing a critical/blocker to land?03:43
wallyworldsure03:43
axwthumper, wallyworld: do we really want to repeat the cloud name for each model? they are always going to be the same03:46
axw(in status)03:46
wallyworldi had read that as cloud region03:47
thumperaxw: that's what was asked for03:47
wallyworlddamn, dsylexic again03:47
thumperand it isn't always the same03:47
thumperif I have different models, they won't necessarily be in the same controller or cloud03:47
thumperhmm...03:47
wallyworldtrue, for the aggregated case03:47
axwthumper: we're going to show models for multiple controllers?03:48
axwI don't think so...03:48
thumperum...03:48
axwthumper: OTOH it would be useful to see at a glance from a snapshot of status which cloud03:48
thumperperhaps I'm no longer clear what you are talking about03:48
axwthumper: if I run "juju status", I'm seeing all the models for one controller03:48
thumperum...03:49
axwthumper: ah hm never mind03:49
thumperif you run juju status, you only see one model03:49
axwthumper: yep, forget me. that makes sense03:49
wallyworldaxw: one of your comments is blank so the ditto beneath it makes no sense03:52
axwwallyworld: ignore ditto sorry. I (tried to) delete a comment after I answered my own question03:53
wallyworldok03:54
=== Spads_ is now known as Spads
wallyworldaxw: i've left two issues open but answered the questions....04:29
menn0thumper: tools migration done: http://reviews.vapour.ws/r/5036/04:30
axwwallyworld: "no, different models will want to use their own logging levels on the agents" -- the controller agent(s) manage multiple models04:31
wallyworldaxw: so a machine agent on a worker for model 1 will want to log different to an agent for model 204:33
wallyworldmodel 1 and model 2 should have their own logging-config right?04:33
axwwallyworld: I'm talking about the controller04:33
axwwallyworld: they are the same agent04:33
wallyworldsure, but not on worker nodes04:33
axwfair point about other workers tho04:33
natefinchif anyone's feeling ambitious, this is a mostly mechanical change, to drop lxc support and use lxd in its place: http://reviews.vapour.ws/r/5027/04:33
axwwallyworld: I guess we shouldn't constrain it to how it works today anyway. it would be nice if it weren't global. we could have each worker in the controller take a logger with levels configured for the model04:34
axwwallyworld: so I'll drop04:34
wallyworldnatefinch: any progress on the --to lxd issue?04:34
wallyworldyou have a +1 from eric right?04:35
natefinchwallyworld: I do have a +1 from eric, yes.... do we need 2 +1's now?04:35
davecheneyfunc (fw *Firewaller) flushUnits(unitds []*unitData) error {04:35
davecheney  // flushUnits opens and closes ports for the passed unit data.04:35
wallyworldnot if i have anything to do with it - except for when the review feels like they need a second opinion04:35
davecheneyworst, name. ever04:35
natefinchwallyworld: also, no, I don't have an idea about the lxd thing... my guess is that it's a switch statement that we forgot to add lxd to04:36
wallyworldnatefinch: so we can land this and then fix the other issue before release04:37
axwwallyworld: going for lunch then fixing car, will finish review later04:38
wallyworldaxw: np, ty04:39
wallyworldi'll start on the next bits04:39
natefinchwallyworld: master is blocked, and this doesn't have a bug, AFAIK04:39
wallyworldnatefinch: either jfdi or create a bug - i have been jfdi04:40
wallyworldwe need this work for release04:40
wallyworldnatefinch: ah but wait04:40
wallyworldwe can't land until deploy --to is fixed04:41
wallyworldbecause it will break QA04:41
wallyworlddoh04:41
natefinchright, ok. I'll do that first04:41
wallyworldty04:41
natefinchgotta catch up on sleep, will figure it out in the morning.  Seems like it's probably something pretty dumb.04:46
redirwallyworld: axw whomever pr is in http://reviews.vapour.ws/r/5037/05:18
redirBe back in the local AM.05:18
wallyworldty05:18
davechen1yhttps://github.com/juju/juju/pull/559405:30
davechen1y^ anyone experienced with the firewaller, this is a small fix as a prereq for 159016105:30
axwwallyworld: reviewed06:07
wallyworldaxw: ty06:33
axwwallyworld: you did a half change in your previous PR, you called the doc "defaultModelSettings" but the method is called "CloudConfig" still. shall I change it to DefaultModelConfig? sounds a bit off -- like it's config for a default model. maybe ModelConfigDefaults?07:01
wallyworldaxw: yeah, sounds good ty07:02
=== frankban|afk is now known as frankban
frobwaredimitern: ping07:37
dimiternfrobware: pong08:05
frobwaredimitern: was just about the resolv.conf issue08:05
dimiternfrobware: I was looking at those bugs08:05
dimiternfrobware: trying to reproduce now with lxd on 1.9.308:05
frobwaredimitern: I can help out in a bit - just trying to stash some stuff but in a meaningful state.08:09
dimiternfrobware: ok08:09
dimiternfrobware: no luck reproducing this so far :/08:27
frobwaredimitern: sounds like the whole of my yesterday :/08:27
dimiternfrobware: (that is, if the lxds even come up ok)08:27
frobwaredimitern: oh?08:27
dimiternfrobware: I noticed on machine-0 there was an issue and all 3 lxds came up with 10.0.0.x addresses08:28
frobwaredimitern: heh, that caught me out this morning. they are on the LXD bridge.08:28
frobwaredimitern: when we probe for an unused subnet, that's pretty much the default address you'll get as there's not much else, network-wise, running08:29
dimiternfrobware: yeah, the issue is due to a race between setting the observed machine config with the created bridges and containers starting to deploy and trying to bridge their nics to yet-to-be-created host bridges08:29
* frobware notes that his git stash list has grown to a depth of 32...08:29
frobwaredimitern: explain that one to me in standup :)08:32
dimiternfrobware: otoh, if the bridges are created ok, lxds come up as expected with all NICs, and /e/resolv.conf has both nameserver and search (i.e. ping node-5 and ping node-5.maas-19 both work)08:32
dimiternfrobware: sure :)08:33
frobwaredimitern: standup09:02
fwereadevoidspace, http://reviews.vapour.ws/r/5029/09:07
frobwaredimitern: regarding resolv.conf. we did a change way back to copy the /etc/resolv.conf from the host. is it possible that it is triggering that path but the host has no valid entry (not for you, but the bug reporter)09:48
dimiternfrobware: it's very much guaranteed that container's resolv.conf will be broken if their host's resolv.conf is also broken09:50
dimiternfrobware: btw commented on that bug for '--to lxd'09:51
dimiternmgz: hey09:51
dimiternmgz: are there any places in the CI tests which do the equivalent of 'juju deploy xyz --to lxd' ?09:52
dimiternmgz: if there are any, it should be because there is a machine with hostname 'lxd' that's the intended target09:53
babbageclunkdimitern: is it actually ambiguous? Can you use a maas-level machine name there instead of a juju-level machine number?10:13
dimiternbabbageclunk: of course you can10:13
dimiternbabbageclunk: unless your node happens to be called 'lxd'10:13
babbageclunkdimitern: ok, just thought I'd check.10:14
dimiternbabbageclunk: actually... hmm - maybe only on maas I guess?10:15
dimiternbabbageclunk: placement is supposed to work with existing machines (including containers), or new containers on existing machines10:17
babbageclunkdimitern: So is the bug really that --to lxd (or lxc or kvm) should be an error?10:18
dimiternbabbageclunk: it even supports a list when num-units > 1: `juju deploy ubuntu -n 3 --to 0,0/lxd/1,lxd:1`10:18
dimiternbabbageclunk: placement for deploy and add-machine/bootstrap is handled slightly differently10:19
dimiternbabbageclunk: for the latter you *can* use 'add-machine ... --to lxd' or 'bootstrap --to node-x' (on maas)10:19
babbageclunkdimitern: yeah, I was getting confused between them - I've interacted with add-machine and bootstrap more.10:20
dimiternbabbageclunk: that's an inconsistency though10:20
dimiternbabbageclunk: add-machine can do more than that - e.g. add-machine ssh:10.20.30.210:21
dimiternbabbageclunk: bootstrap --to lxd at least fails with `error: unsupported bootstrap placement directive "lxd"`10:21
dimiternbabbageclunk: so it looks like a maas provider issue - it implements PrecheckInstance (called by state at AddMachine time), but apparently not very well10:23
babbageclunkdimitern: Ok, that seems easy enough to fix.10:24
dimiternbabbageclunk: tell-tale comment on line 566 in provider/maas/environ.go: `// If there's no '=' delimiter, assume it's a node name.`10:24
dimiternbut doesn't bother to validate it10:25
dimiternfwereade: hey10:25
dimiternfwereade: I think we don't have a clear separation between deploy-time placement and provision-time placement (i.e. deploy --to X vs add-machine X)10:26
dimiternfwereade: I might be wrong, but I think 'deploy ubuntu --to lxd' was never intended to work, unlike '--to lxd:2', '--to 0', or '--to 0/lxd/0'10:28
dimiternfrobware: how about if we pass a list of interfaces to bridge explicitly to the script?10:38
frobwaredimitern: sure; can we HO anyway as I have discovered some issues with lxd on aws10:38
dimiternfrobware: I was just about to have a quick bite - top of the hour?10:39
frobwaredimitern: or later if you want more time; that's only 20 mins10:39
frobwaredimitern: let's say ~1 hour and I'll go and eat too10:39
babbageclunkfrobware: I'm trying to add a machine to understand deploying to lxd better, but when I do add-machine it never goes from Deploying to Deployed in MAAS.10:40
dimiternfrobware: ok, sgtm10:40
frobwarebabbageclunk: for that I think you'll have to dig into the MAAS logs.10:41
frobwarebabbageclunk: oh, 2.0?10:41
dimiternbabbageclunk: trusty?10:41
babbageclunkfrobware: 2.0, xenial10:42
dimiternbabbageclunk: you run 'add-machine lxd' ?10:42
babbageclunkfrobware: just the machine, first - haven't gotten to deploy anything into a container.10:42
frobwarebabbageclunk: I don't use 2.0 very much, if at all. Most of the bugs I'm looking at explicitly reference 1.9.x10:42
babbageclunkfrobware: no, add-machine --series=xenial10:43
babbageclunkfrobware: Any idea how I can get onto the machine? I think it's the network that's not coming up.10:44
frobwarebabbageclunk: you can get to and see the console?10:44
babbageclunkfrobware: yeah, but I don't know login details.10:44
dimiternbabbageclunk: use vmm ?10:44
dimiternif it's a kvm on your machine..10:45
babbageclunkdimitern: what username/password though?10:45
frobwarebabbageclunk: apply this http://pastebin.ubuntu.com/17167820/10:45
frobwarebabbageclunk: (cd juju/provider/maas; make)10:46
dimiternbabbageclunk: none will work; 'ubuntu' but pwd auth is disabled10:46
frobwarebabbageclunk: then build juju10:46
frobwarebabbageclunk: then either start-over or run upgrade-juju and add another machine10:46
babbageclunkfrobware: hmm, I might try removing all of the vlans from the node first.10:47
frobwarebabbageclunk: that's ^^ a useful exercise as it does allow you to login when we bork networking10:47
babbageclunkfrobware: ok, will try it.10:47
fwereadedimitern, hey, sorry11:05
dimiternfrobware: what's up?11:06
fwereadedimitern, in my understanding `--to lxd` means "hand over deployment to the notional lxd compute provider that spans the capable machines in your model"11:06
fwereadedimitern, "I want it in a container, don't bother me with the details"11:07
dimiternoops, sorry frobware11:07
dimiternfwereade: well, why do we have container=lxd as a constraint then?11:08
fwereadedimitern, hysterical raisins11:08
dimiternfwereade: so 'juju deploy ubuntu --to lxd' is supposed to work exactly like 'juju add-machine lxd && juju deploy ubuntu --to X', where X is the 'created machine X' add-machine reports11:10
fwereadedimitern, yes11:16
frobwaredimitern: hey, I kept working... can we sync after I have some lunch. :)11:52
dimiternfrobware: sure :)11:53
voidspacedimitern: ping12:48
voidspacedimitern: a quick sanity check. Every LinkLayerDevice should have a corresponding refs doc with a ref that defaults to 0. If non-zero the references are the number of devices that have this device as a parent (set in ParentName)?12:49
voidspacedimitern: so a quick scan of the linklayerdevices counting parent references should enable me to reproduce it without having to directly migrate it.12:50
dimiternvoidspace: sorry, just got back13:02
dimiternvoidspace: yes, I think that's correct13:02
dimiternvoidspace: ah, well 'quick scan' could work but only if nothing else can add or remove stuff from the db while you do it13:03
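
The scan voidspace describes can be sketched as below, assuming a simplified `linkLayerDevice` struct with only `Name` and `ParentName` fields (the real state documents carry more fields and are keyed differently). As dimitern notes, this is only reliable if nothing else modifies the data while it runs:

```go
package main

import "fmt"

// linkLayerDevice is a simplified, hypothetical view of a device document.
type linkLayerDevice struct {
	Name       string
	ParentName string
}

// countParentRefs returns, for every device, how many other devices
// name it as their parent. Devices with no children get a count of 0.
func countParentRefs(devices []linkLayerDevice) map[string]int {
	refs := make(map[string]int, len(devices))
	for _, d := range devices {
		refs[d.Name] = 0 // every device gets a refs entry defaulting to 0
	}
	for _, d := range devices {
		if d.ParentName != "" {
			refs[d.ParentName]++
		}
	}
	return refs
}

func main() {
	devices := []linkLayerDevice{
		{Name: "br-eth0"},
		{Name: "eth0", ParentName: "br-eth0"},
		{Name: "eth0.100", ParentName: "eth0"},
		{Name: "eth0.200", ParentName: "eth0"},
	}
	fmt.Println(countParentRefs(devices))
}
```
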
babbageclunkfrobware: I tried your patch after trying a few other things, but it seems like passwd -d ubuntu just makes it so that ubuntu can't login through the terminal.13:15
babbageclunkfrobware: trying it with chpasswd instead.13:15
frobwarebabbageclunk: I use that all the time13:16
babbageclunkfrobware: hmm. Definitely didn't let me log in.13:17
babbageclunkfrobware: maybe it's hanging before the bridgescript runs?13:17
dimiternfrobware, babbageclunk: the ubuntu account is locked usually13:17
frobwarebabbageclunk: if that's the case my patch is either borked, or the bridgescript did not run13:17
dimiternin the cloud images13:18
babbageclunkfrobware, dimitern - trying deploying from maas without juju.13:20
babbageclunkfrobware, dimitern - how does the bridgescript get run? Juju gives it to maas which runs it via cloud-init?13:21
frobwarebabbageclunk: yep13:21
dimiternbabbageclunk: yeah, as a runcmd: in cloud-init user data13:21
frobwaredimitern: can we HO?13:22
dimiternfrobware: sure - omw13:23
dimiternfrobware: joined standup HO13:24
frobwaredimitern: heh, I was in the other one. omw13:24
babbageclunkfrobware, dimitern - ok, I see the same problem deploying with maas-only, so presumably the bridgescript never gets to run.13:24
dimiternbabbageclunk: is this with trusty on maas 2.0 ?13:24
babbageclunkthe install's paused for a long time with "Raise network interfaces", then it times out and continues to stop at a login prompt, but it's before cloud-init runs.13:26
babbageclunkdimitern: xenial on maas 2.013:26
dimiternbabbageclunk: hmm well that's odd13:26
babbageclunkdimitern: yeah. I'm going to kill off the vlans, that seems to trigger it. But I don't see why, since they didn't cause a problem before.13:28
babbageclunkdimitern: Then at least I can try to understand the lxd deploy bug better without this getting in the way.13:28
dimiternbabbageclunk: sorry, otp13:33
natefinchdimitern, dooferlad:  are you guys looking at the deploy --to lxd issue?  I had started looking at that last night, but didn't get very far. I need that to be fixed so I can land my code that removes all the lxc stuff14:02
dimiternnatefinch: yeah, I posted updates as well14:12
frobwarenatefinch: we may need to reassign if not finished by EOD14:14
dimiternnatefinch: deploy --to lxc and --to lxd or --to kvm are equally broken, so it shouldn't block landing your patch14:14
dimiternnatefinch: side-note: I'm more concerned with removing the LXC container type as valid; wasn't there a discussion to still allow both 'lxd' and 'lxc' (but treat both the same as 'lxd') for backwards-compatibility with existing bundles?14:17
natefinchdimitern: bundles will treat lxc like lxd, yes14:27
natefinchdimitern: it's just everything else that is getting lxc removed14:27
dimiternnatefinch: ok then14:28
natefinchdimitern: btw, I swear there used to be help text for --to lxc that said "deploy to a container on a new machine"14:28
natefinchdimitern: but I don't see it now, so maybe I'm crazy14:29
dimiternnatefinch: if there was, it was never tested14:29
natefinchdimitern: so are we fixing the bug that it doesn't immediately error out, or are we fixing the bug that it doesn't work?14:30
dimiternnatefinch: and I know for sure maas provider is not handling this as it should; not tried others14:30
cheryljhey dimitern, should bug 1590689 be fixed in 1.25.6?14:30
mupBug #1590689: MAAS 1.9.3 + Juju 1.25.5 - on the Juju controller node eth0 and juju-br0 interfaces have the same IP address at the same time <cpec> <juju> <maas> <sts> <juju-core:Fix Committed> <juju-core 1.25:Triaged> <MAAS:Invalid> <https://launchpad.net/bugs/1590689>14:30
dimiterncherylj: not without backporting the fix I linked to from master14:31
cheryljdimitern: sorry, what I mean is, should we hold off releasing 1.25.6 until that gets done?14:31
dimiterncherylj: oh, sorry not that one14:31
dimiterncherylj: ah, yeah it *is* that one - and FWIW I think we should not release 1.25.6 without it14:32
cheryljdimitern: is the backport already on your (or someone's) to do list?14:32
mupBug #1591225 opened: Generated image stream is not considered in bootstrap on private cloud <juju-core:Incomplete> <https://launchpad.net/bugs/1591225>14:33
dimiterncherylj: not to my knowledge14:33
dimiterncherylj: I could switch to that and propose it (I have too many things in progress..)14:33
cheryljboy I know how that feels.14:34
cheryljdimitern: I think we're still a couple days away from a 1.25.6, so maybe aim to have it in by Tuesday?14:34
dimiterncherylj: that would be great!14:37
cheryljthanks, dimitern!14:38
perrito666bbl14:38
dimiternfrobware: guess what?14:39
frobwareit's broken14:39
frobwaredimitern: in beta614:39
dimiternfrobware: nope :) it works just the same with beta614:39
frobwaredimitern: sigh14:39
natefinchdimitern: so are we fixing it so that deploy --to lxd errors out the way --to lxc does?  in my tests --to lxc says: "ERROR cannot add application "ubuntu3": unknown placement directive: lxc"14:39
dimitern(...for a change)14:39
dimiternnatefinch: is that on maas btw?14:40
natefinchdimitern: whereas --to lxd doesn't error out (but then never works either)14:40
dimiternfrobware: added a comment anyway14:40
frobwaredimitern: thx14:40
natefinchdimitern: no.  I never test on maas. don't have one.  GCE.  but I can try aws if it's not still broken like it was yesterday14:41
natefinchdimitern: it should be provider independent, though14:41
dimiternnatefinch: yeah, it *should*, but as it turns out it's not unfortunately14:42
natefinchdimitern: I guess maas has that messed up "if it doesn't match anything else, let's assume it's a node" thing14:42
dimiternnatefinch: I'll do a quick test now how deploy --to lxc and lxd is handled on maas, gce, and aws14:43
natefinchdimitern: I did GCE, so you can skip that one14:43
=== tvansteenburgh1 is now known as tvansteenburgh
dimiternnatefinch: ok, I'll try azure then14:43
natefinchdimitern: lxd and kvm behave the same - they both return no error, but then never create a machine either14:44
dimiternnatefinch: something just occurred to me.. lxd uses the 'lxd' as the default domain for container FQDNs14:45
dimiternnatefinch: it might be the reason why lxd is different14:46
natefinchdimitern: I'm pretty sure a placement directive of just a container type is supposed to work: https://github.com/juju/juju/blob/master/instance/placement.go#L7114:46
dimiternnatefinch: yeah, but there's also the PrecheckInstance from the prechecker state policy, which is called while adding a machine14:48
dimiternnatefinch: hmm it looks like only maas is affected14:51
dimiternnatefinch: as all other providers expect '=' to be present in the placement or parsing fails14:52
dimiternnatefinch: or like joyent simply fails with placement != ""14:54
dimiterncloudsigma doesn't even bother to do anything.. precheckInstance is { return nil }.. why implement it then?14:56
babbageclunkdimitern, natefinch: I can see in the add-machine case where the decision to add a new machine with a container is made for lxc, I can't find anything corresponding to that in the deploy code.14:59
natefinchahh, add machine, that's where it is: juju add-machine lxd                  (starts a new machine with an lxd container)15:01
natefinchI don't know why deploy would be any different15:01
babbageclunkdimitern, natefinch: ooh - does State.addMachineWithPlacement need to grow a call to AddMachineInsideNewMachine to do it?15:01
babbageclunk(in state/state.go:1249)15:02
katconatefinch: standup time15:02
natefinchkatco: oops, thanks15:02
natefinchbabbageclunk: 127515:03
dimiternbabbageclunk: the actual code deploy uses lives in juju/deploy.go15:03
babbageclunkdimitern: Yeah, but that will only put a new container in an existing machine.15:03
babbageclunkdimitern: vs this code from add-machine https://github.com/juju/juju/blob/master/apiserver/machinemanager/machinemanager.go#L15815:04
dimiternnatefinch: on AWS 'deploy ubuntu --to lxd' and --to lxc both appear to work, but neither adds a machine for the unit15:04
natefinchdimitern: yeah, same for GCE15:04
dimiternnatefinch: so it looks consistently broken everywhere :)15:06
dimiternI'd vote to reject '--to <container-type>' for deploy on its own (i.e. still allow '--to <ctype>:<id>')15:06
babbageclunkdimitern: So the code from add-machine will create a new host with a container inside, but the deploy codepath won't because it doesn't call AddMachineInsideNewMachine.15:07
dimiternuntil we can untangle the mess around it and make add-machine and deploy --to behave the same way15:07
dimiternbabbageclunk: yeah, because nobody thought about it too much I guess15:08
babbageclunkdimitern: I think it's just an extra check in that function - if machineId is "", call AddMachineInsideNewMachine instead of AddMachineInsideMachine.15:10
babbageclunkdimitern: testing it now15:11
dimiternbabbageclunk: that sounds correct15:11
dimiternbabbageclunk: but definitely *isn't* the way to fix the bug15:12
dimiternbabbageclunk: I mean.. this will allow deploy --to lxd to work, but it might also open a whole new can of worms on all providers15:13
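
The extra check babbageclunk describes above can be sketched roughly as follows. `fakeState` and its method signatures are hypothetical stand-ins that only record which call was made; the real `AddMachineInsideMachine` and `AddMachineInsideNewMachine` in juju take machine templates and return machines and errors:

```go
package main

import "fmt"

// fakeState is a hypothetical stand-in for *state.State that records calls.
type fakeState struct {
	calls []string
}

// AddMachineInsideMachine pretends to add a container to an existing machine.
func (s *fakeState) AddMachineInsideMachine(containerType, parentID string) string {
	s.calls = append(s.calls, "AddMachineInsideMachine")
	return parentID + "/" + containerType + "/0"
}

// AddMachineInsideNewMachine pretends to add a new host plus a container on it.
// The host id "0" is hard-coded purely for illustration.
func (s *fakeState) AddMachineInsideNewMachine(containerType string) string {
	s.calls = append(s.calls, "AddMachineInsideNewMachine")
	return "0/" + containerType + "/0"
}

// addMachineWithPlacement sketches the proposed check: with no machine id
// (e.g. "--to lxd"), create a new host instead of targeting an existing one.
func addMachineWithPlacement(s *fakeState, containerType, machineID string) string {
	if machineID == "" {
		return s.AddMachineInsideNewMachine(containerType)
	}
	return s.AddMachineInsideMachine(containerType, machineID)
}

func main() {
	s := &fakeState{}
	fmt.Println(addMachineWithPlacement(s, "lxd", ""))  // like "--to lxd"
	fmt.Println(addMachineWithPlacement(s, "lxd", "2")) // like "--to lxd:2"
	fmt.Println(s.calls)
}
```

This only mirrors the branching; it does nothing about the provider and series support concerns dimitern raises.
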
babbageclunkdimitern: I don't see why? (But I haven't been following the discussion closely.)15:13
dimiternbabbageclunk: e.g. deploy --to kvm on aws will start an instance but then fail to deploy the unit as kvm won't be supported15:14
babbageclunkdimitern: Isn't that the same behaviour as add-machine kvm?15:14
dimiternbabbageclunk: similarly, --to lxd with 'default-series: precise' will seem to pass initially, then fail as lxd is not supported on precise15:15
dimiternbabbageclunk: add-machine is similarly broken in those cases15:15
babbageclunkdimitern: Isn't it worth doing this fix so add-machine and deploy behave in the same way (although both broken in the cases you describe)?15:16
dimiternbabbageclunk: add-machine accepts other things, e.g. ssh:user@hostname15:16
dimiternbabbageclunk: they still won't act the same15:17
dimiternbabbageclunk: but, at least they will be a step closer15:17
babbageclunkdimitern: Yeah, it still seems like people expect them to work in the same way in this case.15:18
natefinchthey should be as consistent as possible15:18
dimiternbabbageclunk: ok, please ignore my previous rants then :) what you suggest is a good fix to have15:19
* dimitern is just twitchy about changing core behavior before the release..15:20
babbageclunkdimitern: :) I mean, I think you're right that those cases are problems.15:20
dimiternwe should have a well-defined format for placement, which allows provider-specific scopes; e.g. deploy --to/add-machine <scope>:<args>; where <scope> := <container-type>|<provider-type>; <args> := <target>|<key>=<value>[,..]15:23
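
A parser for the `<scope>:<args>` format dimitern proposes might look something like the sketch below. The `placement` struct and `parsePlacement` function are hypothetical and written only to illustrate the grammar; they are not juju's existing `instance.ParsePlacement`, and the sketch deliberately rejects a bare scope with no arguments:

```go
package main

import (
	"fmt"
	"strings"
)

// placement is a hypothetical parsed form of "<scope>:<args>".
type placement struct {
	Scope  string            // container type or provider type
	Target string            // set when args is a single target
	Params map[string]string // set when args is key=value[,key=value...]
}

// parsePlacement parses directives such as "lxd:1" or "maas:zone=a,tags=b".
func parsePlacement(directive string) (placement, error) {
	parts := strings.SplitN(directive, ":", 2)
	if len(parts) != 2 || parts[0] == "" || parts[1] == "" {
		return placement{}, fmt.Errorf("placement %q is not of the form scope:args", directive)
	}
	p := placement{Scope: parts[0]}
	args := parts[1]
	if !strings.Contains(args, "=") {
		p.Target = args
		return p, nil
	}
	p.Params = make(map[string]string)
	for _, pair := range strings.Split(args, ",") {
		kv := strings.SplitN(pair, "=", 2)
		if len(kv) != 2 || kv[0] == "" {
			return placement{}, fmt.Errorf("invalid key=value pair %q in %q", pair, directive)
		}
		p.Params[kv[0]] = kv[1]
	}
	return p, nil
}

func main() {
	for _, d := range []string{"lxd:1", "maas:zone=a,tags=b", "lxd"} {
		p, err := parsePlacement(d)
		fmt.Println(p, err)
	}
}
```

Making the scope explicit would remove the need for a provider to guess that an argument without `=` is a node name.
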
frobwaredimitern: in AWS with AA-FF why do we use static addresses and not dhcp?15:23
frobwaredimitern: in containers15:23
dimiternfrobware: because the FF15:24
frobwaredimitern: sure, but really asking why static in that case15:24
dimiternfrobware: i.e. the user asked for static IPs15:24
dimiternfrobware: we use dhcp otherwise15:25
dimiternfrobware: but the whole point of the FF and now the multi-NIC approach on maas has always been to have static IPs for containers15:26
frobwaredimitern: it was AWS I was questioning; the MAAS I can see because you can ask for static/dhcp there15:32
dimiternfrobware: you can on AWS as well15:33
dimiternfrobware: AssignPrivateIpAddress15:33
dimiternhttp://docs.aws.amazon.com/AWSEC2/latest/APIReference/API_AssignPrivateIpAddresses.html15:33
dimiternwell not nearly equivalent to what maas offers.15:34
=== tvansteenburgh1 is now known as tvansteenburgh
alexisbnatefinch, when you have five minutes I have a few qs16:01
dimiternfrobware: ping16:01
dimiternfrobware: here's my patch so far: http://paste.ubuntu.com/17174180/16:01
natefinchalexisb: sure.16:02
dimiternfrobware: now testing on aws w/ && w/o AC-FF (xenial), and on maas-19 (t) / maas-20 (x)16:02
alexisbhttps://hangouts.google.com/hangouts/_/canonical.com/juju-release16:03
alexisbnatefinch, ^^16:03
frobwaredimitern: it's nuts... all this manual testing we're BOTH doing... Grrr.16:03
alexisbcherylj, feel free to crash the party16:03
dimiternfrobware: yeah..16:04
frobwaredimitern: your patch "so far" - does that mean use or wait?16:05
dimiternfrobware: so far only as long as the currently running make check passes16:06
dimiternfrobware: or if something comes up from the live tests (will be able to tell you shortly); otherwise I think I covered everything in what I pasted16:07
dimiternfrobware: yeah, I've missed a few tests in container/kvm16:11
alexisbbabbageclunk, dimitern: what is the consensus for a fix on lp 1590960 ??16:15
alexisblp159096016:16
babbageclunkalexisb: maybe bug 1590960? Or is mup sulking?16:16
mupBug #1590960: juju deploy --to lxd does not create base machine <deploy> <lxd> <juju-core:Triaged by 2-xtian> <https://launchpad.net/bugs/1590960>16:16
alexisbthere we go :)16:17
babbageclunkI've got a fix, tested manually, just finishing the unit test for it.16:17
dimiternalexisb: we can fix deploy to work with --to <container-type>, but that's not what's blocking natefinch's patch LXC-to-LXD16:17
alexisbdimitern, correct it is not blocking16:17
babbageclunkShould be up for review in ~10 mins16:18
alexisbbut looking at this morning's discussion there seemed to be some different ideas about what should work with --to and what shouldn't16:18
alexisbwas just curious what the expected behavior should be16:18
cheryljalexisb, natefinch looks like --to lxc is also a problem on 1.25:  https://bugs.launchpad.net/juju-core/+bug/1590960/comments/616:20
mupBug #1590960: juju deploy --to lxd does not create base machine <deploy> <lxd> <juju-core:Triaged by 2-xtian> <https://launchpad.net/bugs/1590960>16:20
dimiternalexisb: that's the real issue: behavior was neither clearly defined nor tested16:20
alexisbdimitern, exactly16:20
dimiternalexisb: but it's sensible to expect deploy --to X to work like add-machine X does16:21
alexisbdimitern, also agree16:21
natefinchcherylj: an error is a lot better than silently half-working... but yeah, should be fixed to mirror add-machine16:21
dimiternalexisb: and babbageclunk's fix should get us there16:21
natefinchhuzzah :)16:21
dimiternbut not all the way16:22
alexisbdimitern, though a note to the juju-core team might be good so that we highlight the change and educate the team16:22
alexisbbabbageclunk, ^^16:22
dimiternagreed16:22
dimitern 16:22
babbageclunkalexisb, dimitern: Clarifying - am I sending the note about this change?16:25
dimiternbabbageclunk: I'd appreciate if you do it, I can help clarifying something or other if you need though16:25
dimiternfrobware: so the patch didn't work for aws16:26
alexisbbabbageclunk, yeah just to the juju-core launchpad group16:26
frobwaredimitern: what happened?16:26
dimiternfrobware: ERROR juju.provisioner provisioner_task.go:681 cannot start instance for machine "0/lxd/0": missing16:26
dimiterncontainer network config16:26
dimiternfrobware: it slipped through somewhere.. looking16:27
frobwaredimitern: why do I think that's an existing bug... ?16:27
babbageclunkdimitern, alexisb: Ok cool - I think I understand the wider issues now. Basically just that this will still do slightly weird things on clouds that don't support the container type, but at least the add-machine and deploy behaviour is more consistent.16:28
alexisbbabbageclunk, yep16:28
alexisband we as a team should be clear on what the current behaviour is and the gaps, so we can both explain to users *and* make better decisions on what the behaviour should be16:29
alexisbcmars, cherylj, do we have any progress on https://bugs.launchpad.net/juju-core/+bug/158115716:30
mupBug #1581157: github.com/juju/juju/cmd/jujud test timeout on windows <blocker> <ci> <regression> <test-failure> <unit-tests> <windows> <juju-core:Triaged by dave-cheney> <juju-core 1.25:Triaged> <https://launchpad.net/bugs/1581157>16:30
dimiternfrobware: nope, it was a warning before16:41
dimiternfrobware: I'll need to add a few more tweaks to the patch and will resend16:41
cheryljalexisb: I haven't heard anything from cmars about it16:50
=== frankban is now known as frankban|afk
mupBug #1591290 opened: serverSuite.TestStop unexpected error <ci> <intermittent-failure> <regression> <unit-tests> <juju-core:Triaged> <https://launchpad.net/bugs/1591290>17:18
dimiternfrobware: fixed patch: http://paste.ubuntu.com/17177501/17:21
dimiternfrobware: should now work ok on AWS (testing again); all unit tests fixed17:23
dimiternfrobware: I probably should've proposed it rather than to bug you with it :/17:24
natefinchdimitern: you mention placement strings with = in them in that bug... but placement strings don't use = AFAIK?  placement is like --to 0/lxc/0 or --to lxd:4  maybe you mean constraints?17:26
dimiternnatefinch: on maas you can do --to zone=foo17:28
dimiternnatefinch: and I think most others support zone= as well17:28
dimiternnatefinch: see, it's confusing :)17:29
natefinchdimitern: gah, zone should be a constraint :/17:37
natefinchwell... maybe not17:38
natefinchI guess constraints are for all units of a service17:38
natefinchstill... weird17:39
=== benji is now known as Guest73726
dimiternnatefinch: yeah, it can't be useful as a constraint if we're to do automatic zone distribution17:40
natefinchdimitern: right (sometimes you might not want them distributed, but that's the exception).  Anyway... many valid placements do not use =... like specifying containers or machines17:41
dimiternnatefinch: there's also a container=lxd constraint btw, hardly tested17:41
babbageclunknatefinch, dimitern: halp! After state.AddApplication's been called, the units are just staged, is that right? When/how does juju/deploy:AddUnits get called?17:42
babbageclunkIs it triggered by a watcher of some sort?17:43
natefinchbabbageclunk: there's the unitassigner that makes sure units get assigned to machines17:43
dimiternbabbageclunk: it goes like this: cmd/juju/application/deploy.go -> api/application/deploy -> apiserver/application/deploy -> juju/deploy -> state17:43
natefinchbabbageclunk: it's a worker17:43
babbageclunkdimitern: yeah, I could follow that, but none of the code in that chain actually ends up calling AssignUnitWithPlacement.17:45
babbageclunknatefinch: Ah, ok - thanks.17:45
natefinchbabbageclunk: yeah, we add a staged assignment during deploy, and then the unitassigner reads those and turns them into real assignments.17:46
babbageclunknatefinch: ok - that makes sense. I was trying to understand why I didn't see the error I see in my unit test when running deploy manually.17:50
babbageclunknatefinch: It's because the errors are raised by the unitassigner and logged somewhere, rather than coming back from the api to the command.17:52
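The behaviour described above can be sketched roughly in Go. This is a toy model under stated assumptions, not juju's implementation: `store`, `stagedAssignment`, `deploy` and `runAssigner` are hypothetical names, and the `canPlace` check stands in for whatever validation the real worker performs. It only illustrates why a bad placement produces no error at the command line: the deploy path stages the request and returns, and the failure surfaces later inside the worker.

```go
package main

import "fmt"

// stagedAssignment is a hypothetical record of "unit X wants placement Y".
type stagedAssignment struct {
	Unit      string
	Placement string
}

// store stands in for state; it only holds the staged records.
type store struct {
	staged []stagedAssignment
}

// deploy mimics the API path: it stages the assignment and returns.
// The placement is not checked here, so the caller sees no error.
func (s *store) deploy(unit, placement string) error {
	s.staged = append(s.staged, stagedAssignment{Unit: unit, Placement: placement})
	return nil
}

// runAssigner mimics the worker: it drains the staged records and collects
// the errors it would log. Nothing flows back to the caller of deploy.
func (s *store) runAssigner(canPlace func(string) bool) []error {
	var logged []error
	for _, a := range s.staged {
		if !canPlace(a.Placement) {
			logged = append(logged, fmt.Errorf("cannot assign unit %q to %q", a.Unit, a.Placement))
		}
	}
	s.staged = nil
	return logged
}

func main() {
	s := &store{}
	fmt.Println("deploy error:", s.deploy("mysql/0", "bogus"))
	// Hypothetical rule: only "lxd" is an acceptable placement.
	for _, err := range s.runAssigner(func(p string) bool { return p == "lxd" }) {
		fmt.Println("worker log:", err)
	}
}
```

A unit test that calls the assignment step directly sees the error, while a manual deploy only shows it in the worker's log output, which matches the difference noted in the conversation.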
dimiternfrobware: FYI, proposed it: http://reviews.vapour.ws/r/5040/17:55
babbageclunkdimitern, natefinch: review please? http://reviews.vapour.ws/r/5041/18:10
dimiternbabbageclunk: cheers, looking18:10
babbageclunkdimitern: I mean, you shouldn't now! It's late there! It's kinda late here now!18:10
babbageclunkdimitern: but thanks!18:12
* babbageclunk is off home - have delightful weekends everyone!18:12
dimiternbabbageclunk: likewise! :)18:12
frobwaredimitern: will take a look18:20
redirbrb reboot18:47
=== Spads_ is now known as Spads
mupBug #1588924 opened: juju list-controllers --format=yaml displays controller that cannot be addressed. <juju-core:Fix Committed> <juju-deployer:Invalid> <https://launchpad.net/bugs/1588924>19:22
=== Spads_ is now known as Spads
=== natefinch is now known as natefinch-afk
perrito666cmars: still around?20:41
mupBug #1591379 opened: bootstrap failure with MAAS doesn't tell me which node has a problem <v-pil> <juju-core:New> <https://launchpad.net/bugs/1591379>21:19
cmarsperrito666, yep, what's up?21:26
* perrito666 deleted what he was writing because he began in spanish21:26
perrito666cmars: I wanted to ask you about juju/permission21:26
perrito666We are sort of moving in another direction http://reviews.vapour.ws/r/4973/#comment2718121:26
cmarsleo un poquito21:26
cmarslooking21:27
perrito666don't do that (the spanish) you just short circuited my brain badly :p21:27
perrito666it is fun to see your own language and not understand it21:27
cmars:)21:28
cmarsperrito666, is there a doc or tl;dr for the permissions changes?21:28
mupBug #1591387 opened: juju controller stuck in infinite loop during teardown <juju-core:New> <https://launchpad.net/bugs/1591387>21:55
mupBug #1591387 changed: juju controller stuck in infinite loop during teardown <juju-core:New> <https://launchpad.net/bugs/1591387>21:58
mupBug #1591387 opened: juju controller stuck in infinite loop during teardown <juju-core:New> <https://launchpad.net/bugs/1591387>22:10
