redir | so I am back to: value of (*params.Error) is nil, but a typed nil | 00:41 |
---|---|---|
redir | :/ | 00:41 |
redir | nm | 00:49 |
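The typed-nil trap redir mentions is a classic Go gotcha: an interface value holding a nil `*params.Error` is itself non-nil, because the interface still carries the concrete type. A minimal sketch, with an illustrative `Error` type standing in for `params.Error`:

```go
package main

import "fmt"

// Error stands in for params.Error; the body is illustrative only.
type Error struct{ Message string }

func (e *Error) Error() string { return e.Message }

// getError returns a nil *Error through the error interface. The interface
// value then holds the concrete type *Error with a nil pointer inside, so
// comparing the interface itself to nil yields false.
func getError() error {
	var p *Error
	return p
}

func main() {
	err := getError()
	fmt.Println(err == nil) // false: a "typed nil"
	if p, ok := err.(*Error); ok {
		fmt.Println(p == nil) // true: the underlying pointer is nil
	}
}
```

The usual fix is to return a literal `nil` from the function (or check the concrete pointer before converting it to the interface), rather than returning a possibly-nil concrete pointer as an `error`.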
axw | wallyworld: I'm around now, let me know when you want to chat (can wait till 1:1 if you like) | 01:15 |
wallyworld | axw: am just typing in PR, will push in a sec | 01:15 |
menn0 | thumper: tools migration is going well so far. here's one change - several more on their way: http://reviews.vapour.ws/r/5033/ | 01:15 |
wallyworld | axw: i have not reviewed or tested live yet http://reviews.vapour.ws/r/5034/ we can chat soon | 01:22 |
wallyworld | omfg those internal networking tests are a waste of time and a bitch to fix | 01:23 |
wallyworld | damn, am still missing one apiserver test too | 01:24 |
menn0 | thumper: thanks | 01:34 |
redir | off to do dinner. bbiab | 02:33 |
=== natefinch-afk is now known as natefinch | ||
natefinch | thumper: why is our default log level <root>=WARNING;unit=DEBUG ? | 02:39 |
thumper | because it is how we see what the units are doing | 02:40 |
thumper | unit logging is the output from hooks | 02:40 |
thumper | and always useful | 02:40 |
thumper | but you can explicitly turn it off | 02:40 |
natefinch | ....then why is it at debug? | 02:40 |
natefinch | also, I thought it was juju.unit ? or is unit special? | 02:41 |
=== Spads_ is now known as Spads | ||
natefinch | also, why don't we show info by default? defaulting to warning means we drop a ton of useful context on the floor, and make debugging production systems really difficult | 02:42 |
thumper | wallyworld: if you ignore all the hook failures... http://pastebin.ubuntu.com/17163593/ | 02:46 |
thumper | natefinch: unit is special | 02:46 |
thumper | natefinch: we should probably change to default to INFO | 02:46 |
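The default natefinch quotes is a loggo-style configuration string of semicolon-separated module=level pairs, where `<root>` sets the fallback level. A toy parser just to show that shape; the real parsing lives in github.com/juju/loggo, and this sketch is not that implementation:

```go
package main

import (
	"fmt"
	"strings"
)

// parseLogSpec splits a spec like "<root>=WARNING;unit=DEBUG" into
// module->level pairs. Illustrative only.
func parseLogSpec(spec string) map[string]string {
	levels := make(map[string]string)
	for _, entry := range strings.Split(spec, ";") {
		if kv := strings.SplitN(entry, "=", 2); len(kv) == 2 {
			levels[strings.TrimSpace(kv[0])] = strings.TrimSpace(kv[1])
		}
	}
	return levels
}

func main() {
	// "<root>" is the default for all modules; "unit" (hook output) is
	// overridden to DEBUG, which is the default being debated here.
	fmt.Println(parseLogSpec("<root>=WARNING;unit=DEBUG"))
}
```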
thumper | I have no real good reason why | 02:47 |
wallyworld | thumper: nice, were you going to split the charm url also? | 02:47 |
thumper | not in this branch | 02:47 |
wallyworld | lots of people want warning | 02:47 |
wallyworld | info is too verbose for them | 02:47 |
natefinch | wallyworld: they're welcome to set it to warning, but I think Info is a more reasonable default | 02:47 |
wallyworld | depends who the audience is | 02:47 |
wallyworld | do we cater for developers or devop people | 02:47 |
wallyworld | or | 02:48 |
natefinch | wallyworld: not really. We limit the amount of logs we store | 02:48 |
natefinch | wallyworld: and they can turn it down to warning if they want | 02:48 |
wallyworld | and we can turn it up if we want | 02:48 |
wallyworld | devop people i have met do not want lots of verbose logging | 02:48 |
natefinch | it's not verbose. It's specifically not. | 02:48 |
wallyworld | but i have not talked to lots and lots of them | 02:48 |
natefinch | it's not debug... except unit, evidently :/ | 02:49 |
thumper | but info is noise | 02:49 |
wallyworld | verbose is subjective | 02:49 |
wallyworld | yes it is noise | 02:49 |
wallyworld | they just want warnings | 02:49 |
natefinch | I have tried working with logs set to warning and it's basically unusable | 02:49 |
wallyworld | they just want to know when things go wrong | 02:49 |
natefinch | you can't tell WTF is going on | 02:49 |
thumper | natefinch: for us, yes | 02:49 |
wallyworld | unusable for you as a dev | 02:49 |
wallyworld | not unusable for a devop person | 02:49 |
natefinch | usable for anyone who wants to support the server and figure out what is wrong | 02:49 |
wallyworld | and that's the friction that always happens in these cases | 02:50 |
natefinch | I don't believe the devops people choosing warning know what they're talking about. | 02:50 |
thumper | wallyworld: http://reviews.vapour.ws/r/5035/ | 02:51 |
wallyworld | you forgot the IMHO | 02:51 |
wallyworld | looking after i finish current queue | 02:51 |
axw | wallyworld: in all cases, APIPort use by the providers is only used in StartInstance. how about we just add it to StartInstanceParams for now? | 02:51 |
wallyworld | hmmm, that would work i think | 02:51 |
natefinch | .... some of them do. The people (mostly internal to canonical) who have used juju a lot, sure. | 02:51 |
axw | wallyworld: we could do the same for controller-uuid, and then add another method to Environ to destroy all hosted models/resources | 02:52 |
axw | (passing in the controller UUID to that) | 02:52 |
wallyworld | axw: for now, i can just add controller uuid to setconfig params | 02:52 |
wallyworld | and do that next bit later | 02:52 |
axw | wallyworld: yep doesn't have to be in one hit, but I think that's how we can make it a bit cleaner | 02:53 |
wallyworld | +1 | 02:53 |
wallyworld | one step at a time | 02:53 |
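axw's suggestion, in sketch form: thread per-call values like the API port through `StartInstanceParams` instead of reading them from provider config. The struct below is a stub illustrating the proposal, not juju's real `environs.StartInstanceParams`:

```go
package main

import "fmt"

// StartInstanceParams is a stub; the proposed APIPort and ControllerUUID
// fields mirror the discussion, everything else is omitted.
type StartInstanceParams struct {
	APIPort        int
	ControllerUUID string
	// ...the real struct carries constraints, tools, instance config, etc.
}

// startInstance reads the port from the params at the one call site that
// needs it, rather than from global model config.
func startInstance(p StartInstanceParams) string {
	return fmt.Sprintf("agent will dial the controller on port %d", p.APIPort)
}

func main() {
	// 17070 is juju's customary API port, used here only as sample data.
	fmt.Println(startInstance(StartInstanceParams{APIPort: 17070, ControllerUUID: "deadbeef"}))
}
```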
menn0 | thumper: well that's gone a bit better than charms. tools migration worked first time once all the required infrastructure was in place. | 03:09 |
thumper | menn0: awesome | 03:09 |
thumper | wallyworld: I'm looking at breaking out the charm now rather than my normal friday afternoon thing | 03:12 |
wallyworld | rightio, almost starting a review | 03:12 |
=== Spads_ is now known as Spads | ||
wallyworld | axw: were we going to put region in controllers.yaml? | 03:22 |
axw | wallyworld: I already did | 03:24 |
axw | wallyworld: maybe we want to remove region from there? and just have it on the model? | 03:24 |
wallyworld | yep | 03:25 |
axw | cloud on controller, region on model | 03:25 |
wallyworld | yep | 03:25 |
axw | wallyworld: I added some comments to the diff | 03:26 |
axw | er review comments to your diff | 03:26 |
wallyworld | ty, looking | 03:26 |
wallyworld | axw: what's wrong with embedding that interface? | 03:27 |
axw | it's not what the interface is meant to be doing ... | 03:27 |
axw | wallyworld: its purpose is to get you a state.Model | 03:27 |
axw | not to get a model and model config and controller config | 03:27 |
wallyworld | sure, but i'm extending its behaviour | 03:27 |
axw | wallyworld: which defines its purpose | 03:28 |
wallyworld | an interface can do whatever methods you decide to put on it | 03:28 |
wallyworld | i should change its name i guess | 03:28 |
wallyworld | an Environ i think was from the old days when model was environ | 03:28 |
axw | wallyworld: no, I don't think you should change the name. the checkToolsAvailability function isn't even using the existing method on EnvironGetter AFAICS | 03:29 |
axw | wallyworld: separate responsibilities -> separate interfaces | 03:30 |
wallyworld | axw: it does because it passes it to GetEnviron | 03:30 |
axw | wallyworld: which expects a ConfigGetter, no? | 03:30 |
wallyworld | yes, or an interface that embeds that | 03:31 |
axw | wallyworld: so why would you wrap X in Y, only to pass X through to some other thing? that is pointless | 03:31 |
axw | and makes it unclear what the function really needs | 03:32 |
axw | it doesn't need the Model() method, it only needs the ConfigGetter part | 03:32 |
wallyworld | it means we pass in one param whose behaviour we use in the method body in various places. i can do a separate param if you want | 03:32 |
wallyworld | eg we pass in StateInterface in places and don't always use every method | 03:33 |
axw | wallyworld: yeah, that's a smell. we do that so we don't have to pass around a *state.State, which we used to | 03:33 |
wallyworld | but in this case, for the method being called directly, its logic does use every method on the interface | 03:34 |
axw | less smelly, but still a smell | 03:34 |
axw | wallyworld: checkToolsAvailability doesn't. updateToolsAvailability does | 03:34 |
axw | updateToolsAvailability should take two things: an interface for getting the current config (ConfigGetter), and an interface for updating the model (EnvironGetter) | 03:35 |
axw | checkToolsAvailability only needs a ConfigGetter | 03:35 |
wallyworld | ah, damn, i may have been dyslexic | 03:35 |
wallyworld | i think i was confusing two method names as the same thing | 03:35 |
wallyworld | ffs | 03:35 |
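axw's point ("separate responsibilities -> separate interfaces") can be sketched in Go. The interface names echo the discussion, but all bodies here are stubs, not juju's actual apiserver code:

```go
package main

import "fmt"

// Config is a stand-in for model config.
type Config struct{ AgentVersion string }

// ConfigGetter is the narrow interface checkToolsAvailability actually needs.
type ConfigGetter interface {
	ModelConfig() (*Config, error)
}

// EnvironGetter additionally knows how to fetch the model; only callers
// that really need the model (like updateToolsAvailability) should ask
// for it. Demanding it here would wrap X in Y just to pass X through.
type EnvironGetter interface {
	ConfigGetter
	ModelName() string
}

// checkToolsAvailability takes only the ConfigGetter part, making its real
// dependency clear at the signature.
func checkToolsAvailability(g ConfigGetter) (string, error) {
	cfg, err := g.ModelConfig()
	if err != nil {
		return "", err
	}
	return cfg.AgentVersion, nil
}

// fakeState satisfies both interfaces, but callers demand only what they use.
type fakeState struct{}

func (fakeState) ModelConfig() (*Config, error) { return &Config{AgentVersion: "2.0-beta9"}, nil }
func (fakeState) ModelName() string             { return "controller" }

func main() {
	v, _ := checkToolsAvailability(fakeState{})
	fmt.Println(v)
}
```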
axw | wallyworld: am I making this login thing a critical/blocker to land? | 03:43 |
wallyworld | sure | 03:43 |
axw | thumper, wallyworld: do we really want to repeat the cloud name for each model? they are always going to be the same | 03:46 |
axw | (in status) | 03:46 |
wallyworld | i had read that as cloud region | 03:47 |
thumper | axw: that's what was asked for | 03:47 |
wallyworld | damn, dyslexic again | 03:47 |
thumper | and it isn't always the same | 03:47 |
thumper | if I have different models, they won't necessarily be in the same controller or cloud | 03:47 |
thumper | hmm... | 03:47 |
wallyworld | true, for the aggregated case | 03:47 |
axw | thumper: we're going to show models for multiple controllers? | 03:48 |
axw | I don't think so... | 03:48 |
thumper | um... | 03:48 |
axw | thumper: OTOH it would be useful to see at a glance from a snapshot of status which cloud | 03:48 |
thumper | perhaps I'm no longer clear what you are talking about | 03:48 |
axw | thumper: if I run "juju status", I'm seeing all the models for one controller | 03:48 |
thumper | um... | 03:49 |
axw | thumper: ah hm never mind | 03:49 |
thumper | if you run juju status, you only see one model | 03:49 |
axw | thumper: yep, forget me. that makes sense | 03:49 |
wallyworld | axw: one of your comments is blank so the ditto beneath it makes no sense | 03:52 |
axw | wallyworld: ignore ditto sorry. I (tried to) delete a comment after I answered my own question | 03:53 |
wallyworld | ok | 03:54 |
=== Spads_ is now known as Spads | ||
wallyworld | axw: i've left two issues open but answered the questions.... | 04:29 |
menn0 | thumper: tools migration done: http://reviews.vapour.ws/r/5036/ | 04:30 |
axw | wallyworld: "no, different models will want to use their own logging levels on the agents" -- the controller agent(s) manage multiple models | 04:31 |
wallyworld | axw: so a machine agent on a worker for model 1 will want to log differently from an agent for model 2 | 04:33 |
wallyworld | model 1 and model 2 should have their own logging-config right? | 04:33 |
axw | wallyworld: I'm talking about the controller | 04:33 |
axw | wallyworld: they are the same agent | 04:33 |
wallyworld | sure, but not on worker nodes | 04:33 |
axw | fair point about other workers tho | 04:33 |
natefinch | if anyone's feeling ambitious, this is a mostly mechanical change, to drop lxc support and use lxd in its place: http://reviews.vapour.ws/r/5027/ | 04:33 |
axw | wallyworld: I guess we shouldn't constrain it to how it works today anyway. it would be nice if it weren't global. we could have each worker in the controller take a logger with levels configured for the model | 04:34 |
axw | wallyworld: so I'll drop | 04:34 |
wallyworld | natefinch: any progress on the --to lxd issue? | 04:34 |
wallyworld | you have a +1 from eric right? | 04:35 |
natefinch | wallyworld: I do have a +1 from eric, yes.... do we need 2 +1's now? | 04:35 |
davecheney | func (fw *Firewaller) flushUnits(unitds []*unitData) error { | 04:35 |
davecheney | // flushUnits opens and closes ports for the passed unit data. | 04:35 |
wallyworld | not if i have anything to do with it - except for when the reviewer feels like they need a second opinion | 04:35 |
davecheney | worst, name. ever | 04:35 |
natefinch | wallyworld: also, no, I don't have an idea about the lxd thing... my guess is that it's a switch statement that we forgot to add lxd to | 04:36 |
wallyworld | natefinch: so we can land this and then fix the other issue before release | 04:37 |
axw | wallyworld: going for lunch then fixing car, will finish review later | 04:38 |
wallyworld | axw: np, ty | 04:39 |
wallyworld | i'll start on the next bits | 04:39 |
natefinch | wallyworld: master is blocked, and this doesn't have a bug, AFAIK | 04:39 |
wallyworld | natefinch: either jfdi or create a bug - i have been jfdi | 04:40 |
wallyworld | we need this work for release | 04:40 |
wallyworld | natefinch: ah but wait | 04:40 |
wallyworld | we can't land until deploy --to is fixed | 04:41 |
wallyworld | because it will break QA | 04:41 |
wallyworld | doh | 04:41 |
natefinch | right, ok. I'll do that first | 04:41 |
wallyworld | ty | 04:41 |
natefinch | gotta catch up on sleep, will figure it out in the morning. Seems like it's probably something pretty dumb. | 04:46 |
redir | wallyworld: axw whomever pr is in http://reviews.vapour.ws/r/5037/ | 05:18 |
redir | Be back in the local AM. | 05:18 |
wallyworld | ty | 05:18 |
davechen1y | https://github.com/juju/juju/pull/5594 | 05:30 |
davechen1y | ^ anyone experienced with the firewaller, this is a small fix as a prereq for 1590161 | 05:30 |
axw | wallyworld: reviewed | 06:07 |
wallyworld | axw: ty | 06:33 |
axw | wallyworld: you did a half change in your previous PR, you called the doc "defaultModelSettings" but the method is called "CloudConfig" still. shall I change it to DefaultModelConfig? sounds a bit off -- like it's config for a default model. maybe ModelConfigDefaults? | 07:01 |
wallyworld | axw: yeah, sounds good ty | 07:02 |
=== frankban|afk is now known as frankban | ||
frobware | dimitern: ping | 07:37 |
dimitern | frobware: pong | 08:05 |
frobware | dimitern: was just about the resolv.conf issue | 08:05 |
dimitern | frobware: I was looking at those bugs | 08:05 |
dimitern | frobware: trying to reproduce now with lxd on 1.9.3 | 08:05 |
frobware | dimitern: I can help out in a bit - just trying to stash some stuff but in a meaningful state. | 08:09 |
dimitern | frobware: ok | 08:09 |
dimitern | frobware: no luck reproducing this so far :/ | 08:27 |
frobware | dimitern: sounds like the whole of my yesterday :/ | 08:27 |
dimitern | frobware: (that is, if the lxds even come up ok) | 08:27 |
frobware | dimitern: oh? | 08:27 |
dimitern | frobware: I noticed on machine-0 there was an issue and all 3 lxds came up with 10.0.0.x addresses | 08:28 |
frobware | dimitern: heh, that caught me out this morning. they are on the LXD bridge. | 08:28 |
frobware | dimitern: when we probe for an unused subnet, that's pretty much the default address you'll get as there's not much else, network-wise, running | 08:29 |
dimitern | frobware: yeah, the issue is due to a race between setting the observed machine config with the created bridges and containers starting to deploy and trying to bridge their nics to yet-to-be-created host bridges | 08:29 |
* frobware notes that his git stash list has grown to a depth of 32... | 08:29 | |
frobware | dimitern: explain that one to me in standup :) | 08:32 |
dimitern | frobware: otoh, if the bridges are created ok, lxds come up as expected with all NICs, and /e/resolv.conf has both nameserver and search (i.e. ping node-5 and ping node-5.maas-19 both work) | 08:32 |
dimitern | frobware: sure :) | 08:33 |
frobware | dimitern: standup | 09:02 |
fwereade | voidspace, http://reviews.vapour.ws/r/5029/ | 09:07 |
frobware | dimitern: regarding resolv.conf. we did a change way back to copy the /etc/resolv.conf from the host. is it possible that it is triggering that path but the host has no valid entry (not for you, but the bug reporter) | 09:48 |
dimitern | frobware: it's very much guaranteed that container's resolv.conf will be broken if their host's resolv.conf is also broken | 09:50 |
dimitern | frobware: btw commented on that bug for '--to lxd' | 09:51 |
dimitern | mgz: hey | 09:51 |
dimitern | mgz: are there any places in the CI tests which do the equivalent of 'juju deploy xyz --to lxd' ? | 09:52 |
dimitern | mgz: if there are any, it should be because there is a machine with hostname 'lxd' that's the intended target | 09:53 |
babbageclunk | dimitern: is it actually ambiguous? Can you use a maas-level machine name there instead of a juju-level machine number? | 10:13 |
dimitern | babbageclunk: of course you can | 10:13 |
dimitern | babbageclunk: unless your node happens to be called 'lxd' | 10:13 |
babbageclunk | dimitern: ok, just thought I'd check. | 10:14 |
dimitern | babbageclunk: actually... hmm - maybe only on maas I guess? | 10:15 |
dimitern | babbageclunk: placement is supposed to work with existing machines (including containers), or new containers on existing machines | 10:17 |
babbageclunk | dimitern: So is the bug really that --to lxd (or lxc or kvm) should be an error? | 10:18 |
dimitern | babbageclunk: it even supports a list when num-units > 1: `juju deploy ubuntu -n 3 --to 0,0/lxd/1,lxd:1` | 10:18 |
dimitern | babbageclunk: placement for deploy and add-machine/bootstrap is handled slightly differently | 10:19 |
dimitern | babbageclunk: for the latter you *can* use 'add-machine ... --to lxd' or 'bootstrap --to node-x' (on maas) | 10:19 |
babbageclunk | dimitern: yeah, I was getting confused between them - I've interacted with add-machine and bootstrap more. | 10:20 |
dimitern | babbageclunk: that's an inconsistency though | 10:20 |
dimitern | babbageclunk: add-machine can do more than that - e.g. add-machine ssh:10.20.30.2 | 10:21 |
dimitern | babbageclunk: bootstrap --to lxd at least fails with `error: unsupported bootstrap placement directive "lxd"` | 10:21 |
dimitern | babbageclunk: so it looks like a maas provider issue - it implements PrecheckInstance (called by state at AddMachine time), but apparently not very well | 10:23 |
babbageclunk | dimitern: Ok, that seems easy enough to fix. | 10:24 |
dimitern | babbageclunk: tell-tale comment on line 566 in provider/maas/environ.go: `// If there's no '=' delimiter, assume it's a node name.` | 10:24 |
dimitern | but doesn't bother to validate it | 10:25 |
dimitern | fwereade: hey | 10:25 |
dimitern | fwereade: I think we don't have a clear separation between deploy-time placement and provision-time placement (i.e. deploy --to X vs add-machine X) | 10:26 |
dimitern | fwereade: I might be wrong, but I think 'deploy ubuntu --to lxd' was never intended to work, unlike '--to lxd:2', '--to 0', or '--to 0/lxd/0' | 10:28 |
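The directive forms dimitern lists ("0", "0/lxd/0", "lxd:1", bare "lxd") differ in whether a scope prefix appears before a colon. A simplified split, loosely modeled on (but not identical to) juju's `instance.ParsePlacement`:

```go
package main

import (
	"fmt"
	"strings"
)

// splitPlacement separates "lxd:1" into scope "lxd" and directive "1".
// In this simplified split a bare "lxd" has no colon and falls through as
// a plain directive, which mirrors how a provider fallback (like the maas
// "assume it's a node name" path quoted above) can misread it.
func splitPlacement(p string) (scope, directive string) {
	if i := strings.Index(p, ":"); i >= 0 {
		return p[:i], p[i+1:]
	}
	return "", p
}

func main() {
	for _, p := range []string{"0", "0/lxd/1", "lxd:1", "lxd"} {
		s, d := splitPlacement(p)
		fmt.Printf("%-8s scope=%q directive=%q\n", p, s, d)
	}
}
```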
dimitern | frobware: how about if we pass a list of interfaces to bridge explicitly to the script? | 10:38 |
frobware | dimitern: sure; can we HO anyway as I have discovered some issues with lxd on aws | 10:38 |
dimitern | frobware: I was just about to have a quick bite - top of the hour? | 10:39 |
frobware | dimitern: or later if you want more time; that's only 20 mins | 10:39 |
frobware | dimitern: let's say ~1 hour and I'll go and eat too | 10:39 |
babbageclunk | frobware: I'm trying to add a machine to understand deploying to lxd better, but when I do add-machine it never goes from Deploying to Deployed in MAAS. | 10:40 |
dimitern | frobware: ok, sgtm | 10:40 |
frobware | babbageclunk: for that I think you'll have to dig into the MAAS logs. | 10:41 |
frobware | babbageclunk: oh, 2.0? | 10:41 |
dimitern | babbageclunk: trusty? | 10:41 |
babbageclunk | frobware: 2.0, xenial | 10:42 |
dimitern | babbageclunk: you run 'add-machine lxd' ? | 10:42 |
babbageclunk | frobware: just the machine, first - haven't gotten to deploy anything into a container. | 10:42 |
frobware | babbageclunk: I don't use 2.0 very much, if at all. Most of the bugs I'm looking at explicitly reference 1.9.x | 10:42 |
babbageclunk | frobware: no, add-machine --series=xenial | 10:43 |
babbageclunk | frobware: Any idea how I can get onto the machine? I think it's the network that's not coming up. | 10:44 |
frobware | babbageclunk: you can get to and see the console? | 10:44 |
babbageclunk | frobware: yeah, but I don't know login details. | 10:44 |
dimitern | babbageclunk: use vmm ? | 10:44 |
dimitern | if it's a kvm on your machine.. | 10:45 |
babbageclunk | dimitern: what username/password though? | 10:45 |
frobware | babbageclunk: apply this http://pastebin.ubuntu.com/17167820/ | 10:45 |
frobware | babbageclunk: (cd juju/provider/maas; make) | 10:46 |
dimitern | babbageclunk: none will work; 'ubuntu' but pwd auth is disabled | 10:46 |
frobware | babbageclunk: then build juju | 10:46 |
frobware | babbageclunk: then either start-over or run upgrade-juju and add another machine | 10:46 |
babbageclunk | frobware: hmm, I might try removing all of the vlans from the node first. | 10:47 |
frobware | babbageclunk: that's ^^ a useful exercise as it does allow you to login when we bork networking | 10:47 |
babbageclunk | frobware: ok, will try it. | 10:47 |
fwereade | dimitern, hey, sorry | 11:05 |
dimitern | frobware: what's up? | 11:06 |
fwereade | dimitern, in my understanding `--to lxd` means "hand over deployment to the notional lxd compute provider that spans the capable machines in your model" | 11:06 |
fwereade | dimitern, "I want it in a container, don't bother me with the details" | 11:07 |
dimitern | oops, sorry frobware | 11:07 |
dimitern | fwereade: well, why do we have container=lxd as a constraint then? | 11:08 |
fwereade | dimitern, hysterical raisins | 11:08 |
dimitern | fwereade: so 'juju deploy ubuntu --to lxd' is supposed to work exactly like 'juju add-machine lxd && juju deploy ubuntu --to X', where X is the 'created machine X' add-machine reports | 11:10 |
fwereade | dimitern, yes | 11:16 |
frobware | dimitern: hey, I kept working... can we sync after I have some lunch. :) | 11:52 |
dimitern | frobware: sure :) | 11:53 |
voidspace | dimitern: ping | 12:48 |
voidspace | dimitern: a quick sanity check. Every LinkLayerDevice should have a corresponding refs doc with a ref that defaults to 0. If non-zero the references are the number of devices that have this device as a parent (set in ParentName)? | 12:49 |
voidspace | dimitern: so a quick scan of the linklayerdevices counting parent references should enable me to reproduce it without having to directly migrate it. | 12:50 |
dimitern | voidspace: sorry, just got back | 13:02 |
dimitern | voidspace: yes, I think that's correct | 13:02 |
dimitern | voidspace: ah, well 'quick scan' could work but only if nothing else can add or remove stuff from the db while you do it | 13:03 |
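voidspace's quick-scan idea, counting how many devices name each device in ParentName, can be sketched with stub types (these are not juju's actual state documents, and as dimitern notes the scan is only safe while nothing else mutates the collection):

```go
package main

import "fmt"

// linkLayerDevice stubs the two doc fields the scan needs.
type linkLayerDevice struct {
	Name       string
	ParentName string // empty when the device has no parent
}

// countParentRefs rebuilds per-device reference counts from the device list
// itself, so the refs docs need not be migrated directly.
func countParentRefs(devs []linkLayerDevice) map[string]int {
	refs := make(map[string]int)
	for _, d := range devs {
		if d.ParentName != "" {
			refs[d.ParentName]++
		}
	}
	return refs
}

func main() {
	devs := []linkLayerDevice{
		{Name: "eth0"},
		{Name: "br-eth0", ParentName: "eth0"},
		{Name: "eth0.50", ParentName: "eth0"},
	}
	fmt.Println(countParentRefs(devs)) // eth0 is referenced by two devices
}
```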
babbageclunk | frobware: I tried your patch after trying a few other things, but it seems like passwd -d ubuntu just makes it so that ubuntu can't login through the terminal. | 13:15 |
babbageclunk | frobware: trying it with chpasswd instead. | 13:15 |
frobware | babbageclunk: I use that all the time | 13:16 |
babbageclunk | frobware: hmm. Definitely didn't let me log in. | 13:17 |
babbageclunk | frobware: maybe it's hanging before the bridgescript runs? | 13:17 |
dimitern | frobware, babbageclunk: the ubuntu account is locked usually | 13:17 |
frobware | babbageclunk: if that's the case my patch is either borked, or the bridgescript did not run | 13:17 |
dimitern | in the cloud images | 13:18 |
babbageclunk | frobware, dimitern - trying deploying from maas without juju. | 13:20 |
babbageclunk | frobware, dimitern - how does the bridgescript get run? Juju gives it to maas which runs it via cloud-init? | 13:21 |
frobware | babbageclunk: yep | 13:21 |
dimitern | babbageclunk: yeah, as a runcmd: in cloud-init user data | 13:21 |
frobware | dimitern: can we HO? | 13:22 |
dimitern | frobware: sure - omw | 13:23 |
dimitern | frobware: joined standup HO | 13:24 |
frobware | dimitern: heh, I was in the other one. omw | 13:24 |
babbageclunk | frobware, dimitern - ok, I see the same problem deploying with maas-only, so presumably the bridgescript never gets to run. | 13:24 |
dimitern | babbageclunk: is this with trusty on maas 2.0 ? | 13:24 |
babbageclunk | the install's paused for a long time with "Raise network interfaces", then it times out and continues to a login prompt, but that's before cloud-init runs | 13:26 |
babbageclunk | dimitern: xenial on maas 2.0 | 13:26 |
dimitern | babbageclunk: hmm well that's odd | 13:26 |
babbageclunk | dimitern: yeah. I'm going to kill off the vlans, that seems to trigger it. But I don't see why, since they didn't cause a problem before. | 13:28 |
babbageclunk | dimitern: Then at least I can try to understand the lxd deploy bug better without this getting in the way. | 13:28 |
dimitern | babbageclunk: sorry, otp | 13:33 |
natefinch | dimitern, dooferlad: are you guys looking at the deploy --to lxd issue? I had started looking at that last night, but didn't get very far. I need that to be fixed so I can land my code that removes all the lxc stuff | 14:02 |
dimitern | natefinch: yeah, I posted updates as well | 14:12 |
frobware | natefinch: we may need to reassign if not finished by EOD | 14:14 |
dimitern | natefinch: deploy --to lxc and --to lxd or --to kvm are equally broken, so it shouldn't block landing your patch | 14:14 |
dimitern | natefinch: side-note: I'm more concerned with removing the LXC container type as valid; wasn't there a discussion to still allow both 'lxd' and 'lxc' (but treat both the same as 'lxd') for backwards-compatibility with existing bundles? | 14:17 |
natefinch | dimitern: bundles will treat lxc like lxd, yes | 14:27 |
natefinch | dimitern: it's just everything else that is getting lxc removed | 14:27 |
dimitern | natefinch: ok then | 14:28 |
natefinch | dimitern: btw, I swear there used to be help text for --to lxc that said "deploy to a container on a new machine" | 14:28 |
natefinch | dimitern: but I don't see it now, so maybe I'm crazy | 14:29 |
dimitern | natefinch: if there was, it was never tested | 14:29 |
natefinch | dimitern: so are we fixing the bug that it doesn't immediately error out, or are we fixing the bug that it doesn't work? | 14:30 |
dimitern | natefinch: and I know for sure maas provider is not handling this as it should; not tried others | 14:30 |
cherylj | hey dimitern, should bug 1590689 be fixed in 1.25.6? | 14:30 |
mup | Bug #1590689: MAAS 1.9.3 + Juju 1.25.5 - on the Juju controller node eth0 and juju-br0 interfaces have the same IP address at the same time <cpec> <juju> <maas> <sts> <juju-core:Fix Committed> <juju-core 1.25:Triaged> <MAAS:Invalid> <https://launchpad.net/bugs/1590689> | 14:30 |
dimitern | cherylj: not without backporting the fix I linked to from master | 14:31 |
cherylj | dimitern: sorry, what I mean is, should we hold off releasing 1.25.6 until that gets done? | 14:31 |
dimitern | cherylj: oh, sorry not that one | 14:31 |
dimitern | cherylj: ah, yeah it *is* that one - and FWIW I think we should not release 1.25.6 without it | 14:32 |
cherylj | dimitern: is the backport already on your (or someone's) to do list? | 14:32 |
mup | Bug #1591225 opened: Generated image stream is not considered in bootstrap on private cloud <juju-core:Incomplete> <https://launchpad.net/bugs/1591225> | 14:33 |
dimitern | cherylj: not to my knowledge | 14:33 |
dimitern | cherylj: I could switch to that and propose it (I have too many things in progress..) | 14:33 |
cherylj | boy I know how that feels. | 14:34 |
cherylj | dimitern: I think we're still a couple days away from a 1.25.6, so maybe aim to have it in by Tuesday? | 14:34 |
dimitern | cherylj: that would be great! | 14:37 |
cherylj | thanks, dimitern! | 14:38 |
perrito666 | bbl | 14:38 |
dimitern | frobware: guess what? | 14:39 |
frobware | its broken | 14:39 |
frobware | dimitern: in beta6 | 14:39 |
dimitern | frobware: nope :) it works just the same with beta6 | 14:39 |
frobware | dimitern: sigh | 14:39 |
natefinch | dimitern: so are we fixing it so that deploy --to lxd errors out the way --to lxc does? in my tests --to lxc says: "ERROR cannot add application "ubuntu3": unknown placement directive: lxc" | 14:39 |
dimitern | (...for a change) | 14:39 |
dimitern | natefinch: is that on maas btw? | 14:40 |
natefinch | dimitern: whereas --to lxd doesn't error out (but then never works either) | 14:40 |
dimitern | frobware: added a comment anyway | 14:40 |
frobware | dimitern: thx | 14:40 |
natefinch | dimitern: no. I never test on maas. don't have one. GCE. but I can try aws if it's not still broken like it was yesterday | 14:41 |
natefinch | dimitern: it should be provider independent, though | 14:41 |
dimitern | natefinch: yeah, it *should*, but as it turns out it's not unfortunately | 14:42 |
natefinch | dimitern: I guess maas has that messed up "if it doesn't match anything else, let's assume it's a node" thing | 14:42 |
dimitern | natefinch: I'll do a quick test now how deploy --to lxc and lxd is handled on maas, gce, and aws | 14:43 |
natefinch | dimitern: I did GCE, so you can skip that one | 14:43 |
=== tvansteenburgh1 is now known as tvansteenburgh | ||
dimitern | natefinch: ok, I'll try azure then | 14:43 |
natefinch | dimitern: lxd and kvm behave the same - they both return no error, but then never create a machine either | 14:44 |
dimitern | natefinch: something just occurred to me.. lxd uses the 'lxd' as the default domain for container FQDNs | 14:45 |
dimitern | natefinch: it might be the reason why lxd is different | 14:46 |
natefinch | dimitern: I'm pretty sure a placement directive of just a container type is supposed to work: https://github.com/juju/juju/blob/master/instance/placement.go#L71 | 14:46 |
dimitern | natefinch: yeah, but there's also the PrecheckInstance from the prechecker state policy, which is called while adding a machine | 14:48 |
dimitern | natefinch: hmm it looks like only maas is affected | 14:51 |
dimitern | natefinch: as all other providers expect '=' to be present in the placement or parsing fails | 14:52 |
dimitern | natefinch: or like joyent simply fails with placement != "" | 14:54 |
dimitern | cloudsigma doesn't even bother to do anything.. precheckInstance is { return nil }.. why implement it then? | 14:56 |
babbageclunk | dimitern, natefinch: I can see in the add-machine case where the decision to add a new machine with a container is made for lxc, I can't find anything corresponding to that in the deploy code. | 14:59 |
natefinch | ahh, add machine, that's where it is: juju add-machine lxd (starts a new machine with an lxd container) | 15:01 |
natefinch | I don't know why deploy would be any different | 15:01 |
babbageclunk | dimitern, natefinch: ooh - does State.addMachineWithPlacement need to grow a call to AddMachineInsideNewMachine to do it? | 15:01 |
babbageclunk | (in state/state.go:1249) | 15:02 |
katco | natefinch: standup time | 15:02 |
natefinch | katco: oops, thanks | 15:02 |
natefinch | babbageclunk: 1275 | 15:03 |
dimitern | babbageclunk: the actual code deploy uses lives in juju/deploy.go | 15:03 |
babbageclunk | dimitern: Yeah, but that will only put a new container in an existing machine. | 15:03 |
babbageclunk | dimitern: vs this code from add-machine https://github.com/juju/juju/blob/master/apiserver/machinemanager/machinemanager.go#L158 | 15:04 |
dimitern | natefinch: on AWS 'deploy ubuntu --to lxd' and --to lxc both appear to work, but neither adds a machine for the unit | 15:04 |
natefinch | dimitern: yeah, same for GCE | 15:04 |
dimitern | natefinch: so it looks consistently broken everywhere :) | 15:06 |
dimitern | I'd vote to reject '--to <container-type>' for deploy on its own (i.e. still allow '--to <ctype>:<id>') | 15:06 |
babbageclunk | dimitern: So the code from add-machine will create a new host with a container inside, but the deploy codepath won't because it doesn't call AddMachineInsideNewMachine. | 15:07 |
dimitern | until we can untangle the mess around it and make add-machine and deploy --to behave the same way | 15:07 |
dimitern | babbageclunk: yeah, because nobody thought about it too much I guess | 15:08 |
babbageclunk | dimitern: I think it's just an extra check in that function - if machineId is "", call AddMachineInsideNewMachine instead of AddMachineInsideMachine. | 15:10 |
babbageclunk | dimitern: testing it now | 15:11 |
dimitern | babbageclunk: that sounds correct | 15:11 |
dimitern | babbageclunk: but definitely *isn't* the way to fix the bug | 15:12 |
dimitern | babbageclunk: I mean.. this will allow deploy --to lxd to work, but it might also open a whole new can of worms on all providers | 15:13 |
babbageclunk | dimitern: I don't see why? (But I haven't been following the discussion closely.) | 15:13 |
dimitern | babbageclunk: e.g. deploy --to kvm on aws will start an instance but then fail to deploy the unit as kvm won't be supported | 15:14 |
babbageclunk | dimitern: Isn't that the same behaviour as add-machine kvm? | 15:14 |
dimitern | babbageclunk: similarly, --to lxd with 'default-series: precise' will similarly seem to pass initially, then fail as lxd is not supported on precise | 15:15 |
dimitern | babbageclunk: add-machine is similarly broken in those cases | 15:15 |
babbageclunk | dimitern: Isn't it worth doing this fix so add-machine and deploy behave in the same way (although both broken in the cases you describe)? | 15:16 |
dimitern | babbageclunk: add-machine accepts other things, e.g. ssh:user@hostname | 15:16 |
dimitern | babbageclunk: they still won't act the same | 15:17 |
dimitern | babbageclunk: but, at least they will be a step closer | 15:17 |
babbageclunk | dimitern: Yeah, it still seems like people expect them to work in the same way in this case. | 15:18 |
natefinch | they should be as consistent as possible | 15:18 |
dimitern | babbageclunk: ok, please ignore my previous rants then :) what you suggest is a good fix to have | 15:19 |
* dimitern is just twitchy about changing core behavior before the release.. | 15:20 | |
babbageclunk | dimitern: :) I mean, I think you're right that those cases are problems. | 15:20 |
dimitern | we should have a well-defined format for placement, which allows provider-specific scopes; e.g. deploy --to/add-machine <scope>:<args>; where <scope> := <container-type>|<provider-type>; <args> := <target>|<key>=<value>[,..] | 15:23 |
frobware | dimitern: in AWS with AA-FF why do we use static addresses and not dhcp? | 15:23 |
frobware | dimitern: in containers | 15:23 |
dimitern | frobware: because of the FF | 15:24 |
frobware | dimitern: sure, but really asking why static in that case | 15:24 |
dimitern | frobware: i.e. the user asked for static IPs | 15:24 |
dimitern | frobware: we use dhcp otherwise | 15:25 |
dimitern | frobware: but the whole point of the FF and now the multi-NIC approach on maas has always been to have static IPs for containers | 15:26 |
frobware | dimitern: it was AWS I was questioning; the MAAS I can see because you can ask for static/dhcp there | 15:32 |
dimitern | frobware: you can on AWS as well | 15:33 |
dimitern | frobware: AssignPrivateIpAddress | 15:33 |
dimitern | http://docs.aws.amazon.com/AWSEC2/latest/APIReference/API_AssignPrivateIpAddresses.html | 15:33 |
dimitern | well not nearly equivalent to what maas offers. | 15:34 |
=== tvansteenburgh1 is now known as tvansteenburgh | ||
alexisb | natefinch, when you have five minutes I have a few qs | 16:01 |
dimitern | frobware: ping | 16:01 |
dimitern | frobware: here's my patch so far: http://paste.ubuntu.com/17174180/ | 16:01 |
natefinch | alexisb: sure. | 16:02 |
dimitern | frobware: now testing on aws w/ && w/o AC-FF (xenial), and on maas-19 (t) / maas-20 (x) | 16:02 |
alexisb | https://hangouts.google.com/hangouts/_/canonical.com/juju-release | 16:03 |
alexisb | natefinch, ^^ | 16:03 |
frobware | dimitern: it's nuts... all this manual testing we're BOTH doing... Grrr. | 16:03 |
alexisb | cherylj, feel free to crash the party | 16:03 |
dimitern | frobware: yeah.. | 16:04 |
frobware | dimitern: your patch "so far" - does that mean use or wait? | 16:05 |
dimitern | frobware: so far only as long as the currently running make check passes | 16:06 |
dimitern | frobware: or if something comes up from the live tests (will be able to tell you shortly); otherwise I think I covered everything in what I pasted | 16:07 |
dimitern | frobware: yeah, I've missed a few tests in container/kvm | 16:11 |
alexisb | babbageclunk, dimitern: what is the consensus for a fix on lp 1590960 ?? | 16:15 |
alexisb | lp1590960 | 16:16 |
babbageclunk | alexisb: maybe bug 1590960? Or is mup sulking? | 16:16 |
mup | Bug #1590960: juju deploy --to lxd does not create base machine <deploy> <lxd> <juju-core:Triaged by 2-xtian> <https://launchpad.net/bugs/1590960> | 16:16 |
alexisb | there we go :) | 16:17 |
babbageclunk | I've got a fix, tested manually, just finishing the unit test for it. | 16:17 |
dimitern | alexisb: we can fix deploy to work with --to <container-type>, but that's not what's blocking natefinch's patch LXC-to-LXD | 16:17 |
alexisb | dimitern, correct it is not blocking | 16:17 |
babbageclunk | Should be up for review in ~10 mins | 16:18 |
alexisb | but looking at this morning's discussion there seemed to be some different ideas about what should work with --to and what shouldn't | 16:18 |
alexisb | was just curious what the expected behavior should be | 16:18 |
cherylj | alexisb, natefinch looks like --to lxc is also a problem on 1.25: https://bugs.launchpad.net/juju-core/+bug/1590960/comments/6 | 16:20 |
mup | Bug #1590960: juju deploy --to lxd does not create base machine <deploy> <lxd> <juju-core:Triaged by 2-xtian> <https://launchpad.net/bugs/1590960> | 16:20 |
dimitern | alexisb: that's the real issue: behavior was neither clearly defined nor tested | 16:20 |
alexisb | dimitern, exactly | 16:20 |
dimitern | alexisb: but it's sensible to expect deploy --to X to work like add-machine X does | 16:21 |
alexisb | dimitern, also agree | 16:21 |
natefinch | cherylj: an error is a lot better than silently half-working... but yeah, should be fixed to mirror add-machine | 16:21 |
dimitern | alexisb: and babbageclunk's fix should get us there | 16:21 |
natefinch | huzzah :) | 16:21 |
dimitern | but not all the way | 16:22 |
alexisb | dimitern, though a note to the juju-core team might be good so that we highlight the change and educate the team | 16:22 |
alexisb | babbageclunk, ^^ | 16:22 |
dimitern | agreed | 16:22 |
babbageclunk | alexisb, dimitern: Clarifying - am I sending the note about this change? | 16:25 |
dimitern | babbageclunk: I'd appreciate it if you do it; I can help clarify something or other if you need, though | 16:25 |
dimitern | frobware: so the patch didn't work for aws | 16:26 |
alexisb | babbageclunk, yeah just to the juju-core lanchpad group | 16:26 |
frobware | dimitern: what happened? | 16:26 |
dimitern | frobware: ERROR juju.provisioner provisioner_task.go:681 cannot start instance for machine "0/lxd/0": missing container network config | 16:26 |
dimitern | frobware: it slipped through somewhere.. looking | 16:27 |
frobware | dimitern: why do I think that's an existing bug... ? | 16:27 |
babbageclunk | dimitern, alexisb: Ok cool - I think I understand the wider issues now. Basically this will still do slightly weird things on clouds that don't support the container type, but at least add-machine and deploy behaviour will be more consistent. | 16:28 |
alexisb | babbageclunk, yep | 16:28 |
alexisb | and we as a team should be clear on what the current behaviour is and the gaps, so we can both explain to users *and* make better decisions on what the behaviour should be | 16:29 |
alexisb | cmars, cherylj, do we have any progress on https://bugs.launchpad.net/juju-core/+bug/1581157 | 16:30 |
mup | Bug #1581157: github.com/juju/juju/cmd/jujud test timeout on windows <blocker> <ci> <regression> <test-failure> <unit-tests> <windows> <juju-core:Triaged by dave-cheney> <juju-core 1.25:Triaged> <https://launchpad.net/bugs/1581157> | 16:30 |
dimitern | frobware: nope, it was a warning before | 16:41 |
dimitern | frobware: I'll need to add a few more tweaks to the patch and will resend | 16:41 |
cherylj | alexisb: I haven't heard anything from cmars about it | 16:50 |
=== frankban is now known as frankban|afk | ||
mup | Bug #1591290 opened: serverSuite.TestStop unexpected error <ci> <intermittent-failure> <regression> <unit-tests> <juju-core:Triaged> <https://launchpad.net/bugs/1591290> | 17:18 |
dimitern | frobware: fixed patch: http://paste.ubuntu.com/17177501/ | 17:21 |
dimitern | frobware: should now work ok on AWS (testing again); all unit tests fixed | 17:23 |
dimitern | frobware: I probably should've proposed it rather than bug you with it :/ | 17:24 |
natefinch | dimitern: you mention placement strings with = in them in that bug... but placement strings don't use = AFAIK? placement is like --to 0/lxc/0 or --to lxd:4 maybe you mean constraints? | 17:26 |
dimitern | natefinch: on maas you can do --to zone=foo | 17:28 |
dimitern | natefinch: and I think most others support zone= as well | 17:28 |
dimitern | natefinch: see, it's confusing :) | 17:29 |
natefinch | dimitern: gah, zone should be a constraint :/ | 17:37 |
natefinch | well... maybe not | 17:38 |
natefinch | I guess constraints are for all units of a service | 17:38 |
natefinch | still... weird | 17:39 |
=== benji is now known as Guest73726 | ||
dimitern | natefinch: yeah, it can't be useful as a constraint if we're to do automatic zone distribution | 17:40 |
natefinch | dimitern: right (sometimes you might not want them distributed, but that's the exception). Anyway... many valid placements do not use =... like specifying containers or machines | 17:41 |
dimitern | natefinch: there's also a container=lxd constraint btw, hardly tested | 17:41 |
babbageclunk | natefinch, dimitern: halp! After state.AddApplication's been called, the units are just staged, is that right? When/how does juju/deploy:AddUnits get called? | 17:42 |
babbageclunk | Is it triggered by a watcher of some sort? | 17:43 |
natefinch | babbageclunk: there's the unitassigner that makes sure units get assigned to machines | 17:43 |
dimitern | babbageclunk: it goes like this: cmd/juju/application/deploy.go -> api/application/deploy -> apiserver/application/deploy -> juju/deploy -> state | 17:43 |
natefinch | babbageclunk: it's a worker | 17:43 |
babbageclunk | dimitern: yeah, I could follow that, but none of the code in that chain actually ends up calling AssignUnitWithPlacement. | 17:45 |
babbageclunk | natefinch: Ah, ok - thanks. | 17:45 |
natefinch | babbageclunk: yeah, we add a staged assignment during deploy, and then the unitassigner reads those and turns them into real assignments. | 17:46 |
babbageclunk | natefinch: ok - that makes sense. I was trying to understand why I didn't see the error I see in my unit test when running deploy manually. | 17:50 |
babbageclunk | natefinch: It's because the errors are raised by the unitassigner and logged somewhere, rather than coming back from the api to the command. | 17:52 |
dimitern | frobware: FYI, proposed it: http://reviews.vapour.ws/r/5040/ | 17:55 |
babbageclunk | dimitern, natefinch: review please? http://reviews.vapour.ws/r/5041/ | 18:10 |
dimitern | babbageclunk: cheers, looking | 18:10 |
babbageclunk | dimitern: I mean, you shouldn't now! It's late there! It's kinda late here now! | 18:10 |
babbageclunk | dimitern: but thanks! | 18:12 |
* babbageclunk is off home - have delightful weekends everyone! | 18:12 | |
dimitern | babbageclunk: likewise! :) | 18:12 |
frobware | dimitern: will take a look | 18:20 |
redir | brb reboot | 18:47 |
=== Spads_ is now known as Spads | ||
mup | Bug #1588924 opened: juju list-controllers --format=yaml displays controller that cannot be addressed. <juju-core:Fix Committed> <juju-deployer:Invalid> <https://launchpad.net/bugs/1588924> | 19:22 |
=== Spads_ is now known as Spads | ||
=== natefinch is now known as natefinch-afk | ||
perrito666 | cmars: still around? | 20:41 |
mup | Bug #1591379 opened: bootstrap failure with MAAS doesn't tell me which node has a problem <v-pil> <juju-core:New> <https://launchpad.net/bugs/1591379> | 21:19 |
cmars | perrito666, yep, what's up? | 21:26 |
* perrito666 deleted what he was writing because he began in Spanish | 21:26 |
perrito666 | cmars: I wanted to ask you about juju/permission | 21:26 |
perrito666 | We are sort of moving in another direction http://reviews.vapour.ws/r/4973/#comment27181 | 21:26 |
cmars | leo un poquito ["I'll read a little"] | 21:26 |
cmars | looking | 21:27 |
perrito666 | don't do that (the Spanish) you just short-circuited my brain badly :p | 21:27 |
perrito666 | it is fun to see your own language and not understand it | 21:27 |
cmars | :) | 21:28 |
cmars | perrito666, is there a doc or tl;dr for the permissions changes? | 21:28 |
mup | Bug #1591387 opened: juju controller stuck in infinite loop during teardown <juju-core:New> <https://launchpad.net/bugs/1591387> | 21:55 |
mup | Bug #1591387 changed: juju controller stuck in infinite loop during teardown <juju-core:New> <https://launchpad.net/bugs/1591387> | 21:58 |
mup | Bug #1591387 opened: juju controller stuck in infinite loop during teardown <juju-core:New> <https://launchpad.net/bugs/1591387> | 22:10 |
Generated by irclog2html.py 2.7 by Marius Gedminas - find it at mg.pov.lt!