/srv/irclogs.ubuntu.com/2011/08/02/#ubuntu-ensemble.txt

pullies	hi, i'm trying out ensemble but when running `ensemble status` i get an Invalid SSH key message	00:41
pullies	i have specified authorized-keys-path in my environment (that's what let me run bootstrap in the first place)	00:42
pullies	this is when using the ppa, btw.	00:43
pullies	any advice?	00:43
niemeyer	pullies: Hey there	00:51
niemeyer	pullies: Hmmm	00:51
niemeyer	pullies: What's the content of the file you're pointing at with the authorized-keys-path?	00:53
niemeyer	pullies: It usually gets that automatically from your ~/.ssh/id_dsa.pub or ~/.ssh/id_rsa.pub files	00:55
niemeyer	pullies: Did you have these when you were seeing the error earlier?	00:55
RoAkSoAx_	niemeyer: who creates id_{dsa|rsa}.pub	00:58
RoAkSoAx_	in the zookeeper?	00:58
niemeyer	RoAkSoAx_: deploy does	00:58
niemeyer	RoAkSoAx_: It serializes with the environment	00:58
RoAkSoAx_	niemeyer: ok cool, cause I was encountering situations in which the zookeeper complained about the keys but there were no keys	01:02
RoAkSoAx_	niemeyer: err was looking for keys, but there were no keys	01:03
niemeyer	RoAkSoAx_: Ok.. pullies was just reporting a similar issue above	01:03
niemeyer	RoAkSoAx_: Maybe that's the problem	01:03
niemeyer	pullies: Have you deployed something? That could be the issue	01:03
niemeyer	We should handle that bootstrapping phase better in that sense	01:03
RoAkSoAx_	niemeyer: but in my case, after bootstrapping the zookeeper doesn't have any .pub keys created	01:05
=== daker_ is now known as daker
hazmat	g'morning	11:22
fwereade	heya hazmat :)	11:46
pullies	sorry, i disappeared for the night, as you could probably tell. ;-) i have deployed nothing.	12:22
pullies	-----BEGIN RSA PRIVATE KEY-----	12:22
pullies	(a base64 encoded string which i won't paste here)	12:22
pullies	-----END RSA PRIVATE KEY-----	12:22
* hazmat has to give an ensemble presentation tonight, but his laptop does a kernel panic everytime it sleeps		12:26
hazmat	sadness	12:26
fwereade	did something about how we handle $PYTHONPATH change?	12:46
hazmat	fwereade, no.. not in a very long time	12:47
fwereade	hazmat: ok, I'm doing something dumb then :)	12:47
hazmat	fwereade, there is a new bug open about pythonpath being set for hooks which it probably shouldn't be	12:47
hazmat	fwereade, what's the symptom?	12:47
fwereade	I set PYTHONPATH, run bin/ensemble, and it still picks up the system version	12:48
hazmat	fwereade, a system version (ppa install) of ensemble?	12:48
fwereade	hazmat: yep	12:48
hazmat	fwereade, is it ppa or a manual installation via sudo setup.py install?	12:49
fwereade	ppa	12:49
hazmat	fwereade, i'd suggest first removing the package	12:49
hazmat	PYTHONPATH=$PWD python -c "import ensemble; print ensemble"	12:50
hazmat	is a quick verification of the import path for ensemble	12:50
hazmat	er.. PYTHONPATH=$PWD:$PYTHONPATH is probably better	12:51
hazmat	for real usage	12:51
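A minimal sketch of the same import-path check for a Python 3 interpreter (hazmat's one-liner above uses the Python 2 print statement). The helper name `module_origin` is made up for illustration, and `json` is only a stand-in module so the example runs anywhere; in a real checkout you would pass `ensemble` instead.

```python
import importlib.util


def module_origin(name):
    """Return the file a top-level module would be imported from, or None.

    Hypothetical helper: it only reports where the import system would
    look, which is what the PYTHONPATH check is trying to establish.
    """
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


if __name__ == "__main__":
    # "json" is a stand-in; replace with "ensemble" inside a branch checkout.
    print(module_origin("json"))
```

Run it as `PYTHONPATH=$PWD:$PYTHONPATH python3 check.py` (the file name is arbitrary) to see whether the branch or the system package wins.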
fwereade	hazmat: that appears to work	12:51
fwereade	hazmat: anyway, don't worry, all I really needed was verification that it was me being stupid ;)	12:51
hazmat	fwereade, path problems hit all of us one time or another..	12:52
* fwereade is reassured ;)		12:54
hazmat	fwereade, btw, when you're the second reviewer on a branch, you should adjust the merge proposal status (at the top): if both reviews are approve, then approved, else work in progress.	12:58
* hazmat dog walks, bbiab		12:59
fwereade	hazmat: cool, thanks	13:01
fwereade	hazmat: I think I may have thought it happened by magic :p	13:02
pullies	sorry, a little context would probably help. last night i reported an error that `ensemble status` gives an Invalid SSH key message when i specify my authorized-keys-path in environment	13:02
hazmat	pullies, so you have the path to your ssh key in 'authorized-keys-path' in the provider section of the environment?	13:03
hazmat	pullies, and yes the context helps ;-)	13:04
pullies	hazmat, i have a path to my ssh key in environments.sample.authorized-keys-path	13:05
hazmat	pullies, looking at the code, it looks like the path is a misnomer :-( it wants the name of a file in the .ssh directory	13:05
pullies	AHA	13:05
hazmat	pullies, if you're up for it please file a bug that this is misleading/confusing regarding the name and usage of this setting	13:05
pullies	hazmat, where's the bug tracker?	13:06
pullies	launchpad?	13:06
hazmat	http://launchpad.net/ensemble	13:06
hazmat	pullies, you're going to need to shutdown and bootstrap again, we need the ssh key active for doing any connection from the cli to the ensemble setup	13:07
hazmat	pullies, so you're saying you were able to bootstrap with an invalid key?	13:07
hazmat	that's also a problem/bug	13:07
hazmat	if so	13:07
pullies	hazmat, running the instance was successful. i believe connecting to it was not	13:07
pullies	this launchpad dashboard is confusingly worded about openid	13:08
pullies	:-)	13:08
hazmat	pullies, suggestions welcome on #launchpad ;-)	13:08
hazmat	pullies, thanks for filing a bug, i'm heading out for a few minutes, let us know how it goes	13:09
pullies	still the same error message	13:10
pullies	i'm assuming "the ssh directory" is ~/.ssh	13:11
pullies	?	13:11
_mup_	Bug #819803 was filed: authorized-keys-path is actually a filename, not a path. <Ensemble:New> < https://launchpad.net/bugs/819803 >	13:14
pullies	hazmat, it's possible that ssh access hasn't been enabled for the security group, poking around at ec2 docs. can you confirm that this is a necessary precondition that ensemble doesn't take care of?	13:17
hazmat	pullies, ensemble does indeed take care of ec2 security groups	13:31
hazmat	pullies, it takes a minute or two for the instance to be up and responding	13:31
hazmat	pullies, actually it does look like it will try either a full path or a name	13:31
pullies	i've skipped the ensemble part.	13:32
hazmat	pullies, it looks like you would get a LookupError	13:32
pullies	i'm trying to use the keypair itself	13:32
hazmat	not an invalid ssh key error, if it couldn't find the key	13:32
pullies	to make sure i can login	13:32
pullies	and i can't.	13:32
pullies	ssh -i ~/.ssh/mykey.pem ubuntu@ec2-ip.compute-1.amazonaws.com	13:33
pullies	that should succeed, yes?	13:33
pullies	ssh -v is telling me that it reads the rsa private key	13:33
pullies	"authentications that can continue: public key" is issued twice	13:34
pullies	i'm a bit confused why i can't ssh directly in	13:34
hazmat	pullies, the ssh key specified for ensemble is the public key, not the private key	13:35
hazmat	pullies, do you have an id_[dsa|rsa].pub in ~/.ssh ?	13:35
pullies	amazon only gave me a .pem file to download	13:36
hazmat	pullies, ensemble doesn't use that	13:36
hazmat	pullies, try ssh-keygen -t rsa	13:36
pullies	will generate a local key and try that	13:36
pullies	:-) duh.	13:36
pullies	thanks	13:37
hazmat	pullies, you can remove the authorized-keys-path as well, ensemble picks up default keys automatically if none are specified	13:37
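A rough sketch of the behaviour hazmat describes: the setting is tried as a full path or as a file name under ~/.ssh, a missing file raises LookupError, and the file has to hold a public key rather than a private one. This is not Ensemble's actual code; the function names and the `ssh_dir` parameter are assumptions made for illustration.

```python
import os
import tempfile


def resolve_key_file(value, ssh_dir=None):
    """Hypothetical lookup: accept an absolute path or a name under ~/.ssh."""
    ssh_dir = ssh_dir or os.path.expanduser("~/.ssh")
    expanded = os.path.expanduser(value)
    # Absolute paths are used as given; bare names are looked up in ssh_dir.
    candidate = expanded if os.path.isabs(expanded) else os.path.join(ssh_dir, value)
    if os.path.isfile(candidate):
        return candidate
    raise LookupError("no key file found for %r" % value)


def looks_like_public_key(text):
    """True for lines such as 'ssh-rsa AAAA... user@host', False for PEM private keys."""
    parts = text.split()
    return len(parts) >= 2 and parts[0] in ("ssh-rsa", "ssh-dss")


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "id_rsa.pub")
        with open(path, "w") as f:
            f.write("ssh-rsa AAAAexample user@host\n")
        found = resolve_key_file("id_rsa.pub", ssh_dir=d)
        with open(found) as f:
            print(found, looks_like_public_key(f.read()))
```

A check like `looks_like_public_key` would have flagged the .pem file from Amazon, which begins with a `-----BEGIN RSA PRIVATE KEY-----` header.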
fwereade	difference between orchestra and ec2: we can't easily tell whether an orchestra machine is running	13:37
fwereade	so get_zookeeper_machines is problematic on orchestra	13:38
fwereade	because it can't verify the sanity of the state it gets from FileStorage	13:38
fwereade	in orchestra, should we be (say) trashing state in shutdown, or should we figure out some way to query the machines and thereby match ec2 better?	13:39
hazmat	fwereade, orchestra doesn't know if machines it setup are running?	13:41
hazmat	i thought it was doing a tftp/dhcp setup, maybe that's not exposed via the api?	13:42
fwereade	hazmat: nope, you can theoretically query power status	13:42
fwereade	but that's acted weird for me	13:43
fwereade	and that still doesn't tell us whether they're actually running, or if something went wrong part way through install (for example)	13:43
hazmat	fwereade, it looks like the examples specify remote power management but not status	13:43
fwereade	the api includes a "status" command, which AFAICT acts like an "off" command	13:44
hazmat	fwereade, that seems like an upstream bug if that's the case	13:45
hazmat	fwereade, not having any orchestra/cobbler experience, i'm not sure what the options are. but if the zk pointer file is invalid, the whole thing basically breaks down.	13:46
niemeyer	Hey guys!	13:46
fwereade	hazmat: I'm assuming for now that it's something I'm doing wrong, I tend to defer to RoAkSoAx_ for the final word on these things	13:46
hazmat	niemeyer, top of the morning	13:46
fwereade	heya niemeyer	13:47
fwereade	hazmat: it's fine if provider-state is borked, on ec2, because we can check machine status	13:47
fwereade	hazmat: we bootstrap if there's no state, or if the state is nonsensical => probably already shut down	13:47
hazmat	fwereade, well on ec2 we always intersect the provider state against machine status, because we need the ip resolution	13:47
niemeyer	hazmat: balance to you my friend	13:48
fwereade	hazmat: maybe that's the intent, I'm just telling you what I could infer from the code :)	13:48
hazmat	fwereade, yup, indeed that's the case.. orchestra is a different beast a bit	13:48
pullies	hazmat, now this is progress	13:49
pullies	2011-08-02 09:48:45,857 ERROR SSH forwarding error: Agent admitted failure to sign using the key. Permission denied (publickey).	13:49
niemeyer	hazmat, fwereade Got the conversation mid-way through, but it sounds sensible to trash state on shutdown	13:49
niemeyer	I'd rather rename shutdown to destroy-environment, but that's another topic	13:49
hazmat	pullies, if you modify/change the key, you'll need to ensemble shutdown && ensemble bootstrap	13:49
pullies	this is after doing that	13:49
pullies	and i've removed the path from environments.yaml	13:50
niemeyer	pullies: Have you run deploy already?	13:51
hazmat	pullies, so if you do ec2-describe-instances do you see the instance running (the security group should match the environment name prefixed with 'ensemble-')	13:51
fwereade	niemeyer: cool, that feels like it would make life easier on my side and do no harm to ec2	13:51
niemeyer	Well, I guess it doesn't really matter actually	13:51
niemeyer	fwereade: Agreed	13:51
hazmat	niemeyer, yeah.. failure to connect precludes deploy	13:51
niemeyer	hazmat: I was thinking that the key is serialized with the env, which happens at deploy	13:52
niemeyer	hazmat: But that's something else.. we need a key there to connect to zk in the first place	13:52
hazmat	pullies, you should be able to ssh into the machine directly using ssh ubuntu@ec2-host-name	13:52
hazmat	niemeyer, the public key is sent at launch time via cloud-init	13:52
hazmat	niemeyer, its not stored in zk	13:52
pullies	hazmat, the dashboard shows the machine.	13:53
niemeyer	hazmat: It is stored in zk during deploy	13:53
hazmat	niemeyer, the environment is yes	13:53
niemeyer	hazmat: Otherwise how would it be in cloud-init for the other machines	13:53
niemeyer	hazmat: The keys	13:53
hazmat	niemeyer, totally	13:53
hazmat	but not for the bootstrap	13:53
niemeyer	hazmat: Yes, I guess that's what I said above?	13:53
hazmat	yup	13:53
hazmat	:-)	13:53
hazmat	pullies, so what i'd like to verify is that from the cli you can log into that machine via ssh; if you didn't rename the ssh key, it should just pick up the private side of the same default	13:55
hazmat	ssh will try a few from what it finds in ~/.ssh	13:56
pullies	i generated the key twice. it's possible something is cached in either my client or theirs. will have to log out and try again	13:58
pullies	will attempt it tonight	13:58
pullies	thanks for the help. will definitely focus on the ssh portion, i don't think it's ensemble at this point	13:58
niemeyer	statik: Morning	14:11
statik	morning niemeyer	14:12
RoAkSoAx_	fwereade: howdy!!	14:16
fwereade	RoAkSoAx_: heyhey!	14:17
fwereade	ow's it going?	14:17
RoAkSoAx_	fwereade: pretty good, you?	14:17
fwereade	RoAkSoAx_: pretty good thanks :)	14:17
fwereade	RoAkSoAx_: and I got netboot 9% working on my cobbler, too	14:17
fwereade	shadow-trunk is up to date, and might even work for you now ;)	14:18
RoAkSoAx_	fwereade: cool, where are you stuck?	14:18
fwereade	er, that should have been a 99% up there :)	14:18
fwereade	the ubuntu-orchestra-client install	14:18
RoAkSoAx_	fwereade: cool, I'm actually pulling your branch to test now	14:18
RoAkSoAx_	fwereade: you mean the variable?	14:18
RoAkSoAx_	on the preseed?	14:19
fwereade	(1) it asks for an rsyslog server, and then fails with "cannot stat /var/something/puppet"	14:19
RoAkSoAx_	fwereade: show me the line in the preseed	14:19
fwereade	RoAkSoAx_: yeah, I remember you telling me to "just comment it out for now" a while ago, so that's what I did	14:19
fwereade	can't copy from VM, but it's the pkgsel one as copied from your mail	14:20
RoAkSoAx_	fwereade: is the creation of the cloud-init data fixed?	14:20
fwereade	RoAkSoAx_: I think so	14:20
fwereade	RoAkSoAx_: I now generate something that actually looks like a working EC2 one	14:21
RoAkSoAx_	fwereade: ok, gonna test now then ;)	14:21
fwereade	RoAkSoAx_: the precise details of how I screwed it up the first time are far too embarrassing to relate :p	14:21
fwereade	RoAkSoAx_: sweet, tyvm	14:21
RoAkSoAx_	fwereade: hehe its all good	14:22
fwereade	RoAkSoAx_: hm, I seem to be getting "204 No Content"s from webdav, which I wasn't before, but it all works (apart from the error, heh)	14:31
RoAkSoAx_	fwereade: on the orchestra server, what's in /var/lib/webdav	14:32
fwereade	RoAkSoAx_: the right stuff	14:32
RoAkSoAx_	fwereade: formulas dir and provider-state?	14:33
fwereade	RoAkSoAx_: no but yes (I haven't got a formulas dir at the moment, but the right content was written to provider-state)	14:33
RoAkSoAx_	fwereade: mkdir -p /var/lib/webdav/formulas && chown -R www-data:www-data /var/lib/webdav/formulas	14:35
RoAkSoAx_	fwereade: ok, so bootstrapping works, ensemble status doesn't	14:36
fwereade	awesome! I haven't even thought about what status does, so that's the progress I wanted :)	14:37
RoAkSoAx_	fwereade: ok, in orchestra means that was related to having @property def _machines:	14:38
fwereade	RoAkSoAx_: indeed, and my understanding was that that was something you wanted to defer until the sprint	14:38
RoAkSoAx_	fwereade: but anyways, what's the last thing merged there and what was left to "separate" from the old bootstrap-orchestra branch?	14:38
RoAkSoAx_	fwereade: i wanna have it working though	14:38
fwereade	RoAkSoAx_: sounds good to me	14:39
fwereade	RoAkSoAx_: were we going with "stick it into ks_meta" for now?	14:39
fwereade	last thing merged into shadow-trunk is cobbler-launch-machine	14:39
RoAkSoAx_	fwereade: so what's missing, the shutdown stuff?	14:39
fwereade	cobbler-kill-machine is WIP	14:39
fwereade	bootstrap-verify-storage is an unrelated bug I picked up lest I spin my wheels on monday, and that should be good soon	14:40
RoAkSoAx_	fwereade: alright, so I'll re-read your branch and try to identify what's missing from the things I wanted to do	14:40
fwereade	RoAkSoAx_: I plan to make one more change -- to treat 204 as success (as I think is correct: processed successfully, doesn't feel it needs to return any content)	14:41
RoAkSoAx_	fwereade: I haven't seen that, where did you see that?	14:42
RoAkSoAx_	or in what situation	14:42
fwereade	I seem to be getting that every time I PUT provider-state to webdav	14:42
RoAkSoAx_	fwereade: i haven't seen anything	14:42
RoAkSoAx_	fwereade: make sure the formulas dir is there and restart apache2 and see if it keeps throwing that error	14:43
RoAkSoAx_	fwereade: is the default storage-url also in?	14:43
fwereade	well, it's not an error, it seems like a perfectly legitimate response: "yep, cool, I've handled your request and I have nothing more to tell you, but here's a fresh etag maybe"	14:43
fwereade	but twisted getPage seems to consider "not 200" == "something happened, raise an exception"	14:44
fwereade	I'll bounce apache anyway, but I think what'll fix it is deleting provider-state, I'll let you know in 5	14:45
RoAkSoAx_	fwereade: nah nothing will delete provider-state	14:45
RoAkSoAx_	fwereade: you'd have to do it manually in order to be able to bootstrap again	14:45
fwereade	RoAkSoAx_: what I'm doing is setting it to {} on shutdown	14:46
fwereade	RoAkSoAx_: and, yes, if I overwrite I get 204, if I trash it manually I get 200	14:47
fwereade	RoAkSoAx_: overwrite is perfectly reasonable behaviour, I'll just make sure ensemble understands that	14:47
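A small sketch of the fix fwereade is describing: count any 2xx response to the provider-state PUT as success, so that 204 No Content on an overwrite is handled like the 200 seen on a fresh write. The function name is hypothetical, and nothing here reflects how Ensemble or twisted actually expose the status code.

```python
from http import HTTPStatus


def put_succeeded(status_code):
    """Hypothetical check: any 2xx status means the PUT was processed."""
    return 200 <= int(status_code) < 300


if __name__ == "__main__":
    # 200 on a fresh write and 204 on an overwrite are both fine; 404 is not.
    for code in (HTTPStatus.OK, HTTPStatus.NO_CONTENT, HTTPStatus.NOT_FOUND):
        print(int(code), put_succeeded(code))
```

Accepting the whole 2xx range rather than listing 200 and 204 also covers 201 Created, which some WebDAV servers return when a new resource is written.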
RoAkSoAx_	fwereade: i don't think we would need to delete provider-state on shutdown	14:47
RoAkSoAx_	fwereade: remember that we are dealing with physical hw	14:48
RoAkSoAx_	and it is expensive	14:48
RoAkSoAx_	to be installing every time we want a zookeeper	14:48
RoAkSoAx_	when we already have one	14:48
=== RoAkSoAx_ is now known as RoAkSoAx
fwereade	hm, I thought that ensemble shutdown was intended to wipe out the whole environment -- inverse of bootstrap	14:48
RoAkSoAx	fwereade: that's one of the things I'm also planning to address.	14:49
fwereade	that's what it seems to do on EC2 anyway :)	14:49
RoAkSoAx	fwereade: yes, on ec2 is non-expensive because you can fire up instances or destroy them on demand	14:49
fwereade	RoAkSoAx: heh, ok	14:49
RoAkSoAx	fwereade: but in real hardware is not the same approach	14:49
fwereade	RoAkSoAx: I've been working under the assumption that I should mirror ec2 behaviour as closely as possible	14:50
fwereade	RoAkSoAx: at least for now ;)	14:51
RoAkSoAx	fwereade: yes, but I think things like that	14:51
RoAkSoAx	can be avoided for now	14:51
RoAkSoAx	fwereade: I mean, wiping out provider-state is a super minor change	14:51
RoAkSoAx	and I don't think it is necessary	14:51
fwereade	RoAkSoAx: well, keeping a zookeeper around is quite a major difference, it seems to me :)	14:52
fwereade	RoAkSoAx: well... it's very convenient for me :)	14:52
RoAkSoAx	fwereade: indeed, but again, we are dealing with real hardware in this case	14:52
RoAkSoAx	fwereade: sysadmins won't install zookeepers every week to deploy environments but rather, they would keep one zookeeper up and running at all times	14:53
RoAkSoAx	fwereade: it is expensive in many ways, 1: downtime 2. network bandwidth wasted 3. hardware is useless 4. reinstallations at all times are expensive	14:54
RoAkSoAx	fwereade: why this works on ec2? simply because i can fire up/destroy instances on demand and it costs 2 cents	14:54
RoAkSoAx	fwereade: where there's a prebuilt image	14:54
fwereade	RoAkSoAx: heh, got you, it's the system install cost not the zookeeper install cost (right?)	14:55
RoAkSoAx	fwereade: yes	14:55
fwereade	RoAkSoAx: ...but we still pay the system install cost for every other machine, right?	14:55
RoAkSoAx	fwereade: right, but the idea is now to figure out a way of re-using the machines instead	14:55
fwereade	RoAkSoAx: and if we have a local mirror it's not going to be that big a difference is it?	14:55
fwereade	RoAkSoAx: ha -- I see :)	14:56
fwereade	RoAkSoAx: that goal has escaped me	14:56
fwereade	RoAkSoAx: sorry :)	14:56
RoAkSoAx	fwereade: hehe but yeah having a mirror is still big difference when deploying a services	14:56
RoAkSoAx	cause it still uses bandwdith	14:57
RoAkSoAx	and multiplied by lots of servers	14:57
RoAkSoAx	it is huge	14:57
RoAkSoAx	fwereade: but yes, that's another thing that I was gonna bring up during the sprint	14:57
fwereade	RoAkSoAx: you make a lot of sense	14:57
* RoAkSoAx better starts writing down all this stuff otherwise he'll forget :)		14:58
fwereade	RoAkSoAx: it just doesn't precisely fit with what I'd understood our goals to be -- I thought we were aiming for parity with ec2 for now, and figuring out the tricky stuff at the sprint	14:58
* fwereade would appreciate that :)		14:58
RoAkSoAx	fwereade: yeah we can do that if you want too	14:59
RoAkSoAx	fwereade: dealing with VM's is as inexpensive as ec2	14:59
niemeyer	I seem to remember the wiki sent us to the right page after authenticating	14:59
niemeyer	It doesn't look like that's the case anymore	14:59
fwereade	RoAkSoAx: well, that's my justification for what I've been doing	15:00
RoAkSoAx	fwereade: but right now, what I was personally looking for is having it bootstrapping, deploying, etc, working (not really exactly the same as ec2, but close), so that during the sprint we could address these issues and differences with ec2	15:00
fwereade	RoAkSoAx: I feel it's currently useful, towards that goal, even if things change as the plans firm up	15:00
RoAkSoAx	fwereade: you don't need to justify as we didn't set any boundaries about stuff like this when we started	15:01
fwereade	RoAkSoAx: that's my idea too, with the added condition of "on my local VM network"	15:01
RoAkSoAx	fwereade: but my concern is that you might end up writing code that might be later dismissed :)	15:01
fwereade	RoAkSoAx: deleting code is one of the great joys in life ;)	15:01
RoAkSoAx	fwereade: hehehe alright	15:02
RoAkSoAx	fwereade: again I don't mind you doing that, seriously, just giving you a broad view of what I have in my mind at the moment :)	15:02
fwereade	RoAkSoAx: cool, I was worried I was going off into the weeds :)	15:02
fwereade	RoAkSoAx: good to resync ;)	15:02
RoAkSoAx	fwereade: nah.. either way, these things are gonna have to be discussed next week so my thoughts may change given input from others	15:03
fwereade	RoAkSoAx: cool -- anyway, I'll handle 204s on .put() and propose launch-machine and bootstrap-verify-storage	15:04
RoAkSoAx	fwereade: cool	15:04
fwereade	RoAkSoAx: and that'll probably be my day, but I might be able to check in later when cath's gone to bed	15:04
fwereade	RoAkSoAx: I'll make sure shadow-trunk is up to date with whatever I've proposed	15:05
RoAkSoAx	i'll work on reviewing what would be missing from shadow-trunk in comparison to bootstrap's branch	15:05
fwereade	fantastic	15:05
=== daker is now known as daker_
niemeyer	<RoAkSoAx_> fwereade: i don't think we would need to delete provider-state on shutdown	15:54
niemeyer	<RoAkSoAx_> fwereade: remember that we are dealing with physical hw	15:54
niemeyer	<RoAkSoAx_> and it is expensive	15:54
niemeyer	RoAkSoAx: destroy-environment should really destroy it..	15:54
niemeyer	RoAkSoAx: I agree with you that physical hardware may make the admin act differently	15:55
niemeyer	RoAkSoAx: E.g. not destroying the environment	15:55
niemeyer	RoAkSoAx: It should be possible to terminate services and take them off the machines so that we can reuse not only the bootstrap machine but all of them	15:55
fwereade	everyone: I need to be away sharpish, I'm afraid	15:55
niemeyer	RoAkSoAx: But that's about _using_ the env	15:55
niemeyer	RoAkSoAx: Without destroying it	15:55
niemeyer	RoAkSoAx: Having ensemble destroy-environment not destroying it for reuse would be awkward	15:56
fwereade	but I have a couple of new mps, and I would appreciate reviews from one and all, either on those or on their various prerequisites :)	15:56
fwereade	enjoy your afternoons :)	15:56
niemeyer	I'm stepping out as well, but for lunch.. biab	15:57
RoAkSoAx	niemeyer: right, but from my point of view, destroy an environment should destroy everything, but leave the information from the zookeeper, so next time someone will like to bootstrap, it can detect "hey there's already a zookeeper, if it is sleeping, let's wake it up, if it is awake, let's use it"	16:04
RoAkSoAx	niemeyer: and that way we save ourselves from reinstalling a machine again	16:05
niemeyer	RoAkSoAx: zk is part of the environment	16:47
niemeyer	RoAkSoAx: It's actually a key part of it	16:47
niemeyer	RoAkSoAx: If one wants to save the time to redeploy zk, just don't destroy the environment	16:48
niemeyer	RoAkSoAx: It's a "doctor, it hurts!" case :)	16:48
_mup_	ensemble/states-with-principals r303 committed by kapil.thangavelu@canonical.com	16:48
_mup_	statebase retry topology change respects change functions which yield control.	16:48
niemeyer	bcsaller: How's it going with the local dev stuff?	17:55
bcsaller	niemeyer: I'm working on trying to add flexibility to how machine assignment is done in deploy/add_unit. Those both use state.service.assign_to_unassigned_machine which clearly isn't always what we want.	17:57
bcsaller	niemeyer: but specifying machines in deploy/add-unit is a little at odds with the co-location spec. It's a different axis to plot unit placement on	17:58
niemeyer	bcsaller: Don't worry about co-location for the moment..	17:58
niemeyer	bcsaller: This is really a different angle of the problem	17:58
bcsaller	just keeping it in mind	17:58
niemeyer	bcsaller: Cool, that's nice	17:58
niemeyer	bcsaller: Hmm.. but we do have specific assignment, right?	17:59
niemeyer	bcsaller: assign_to_unassigned is just one method we have	17:59
bcsaller	its the only one used	17:59
bcsaller	in the cli	17:59
bcsaller	so really it becomes about providing access to other means for placement (as a starting point)	18:00
bcsaller	I know there is a desire down the road to say things like `ensemble add-unit -n <num> service`	18:01
bcsaller	but if deploy and add-unit grow syntax to support machine assignment I want it to be future friendly	18:01
niemeyer	bcsaller: Have you seen assign_to_machine?	18:02
bcsaller	yes	18:02
bcsaller	niemeyer: I think the issue comes in at the cli level to be clear	18:03
niemeyer	bcsaller: That's why I don't get the problem you're describing.. sure, we have assign_to_unassigned_machine, which is the hard one..	18:03
niemeyer	bcsaller: We also have an explicit one	18:03
niemeyer	bcsaller: Which is easy to use	18:03
bcsaller	niemeyer: its literally an issue of cli syntax I'm talking about, not a coding hurdle	18:03
niemeyer	bcsaller: Ahh, ok	18:03
bcsaller	I don't want to blindly add new syntax that isn't friendly to the other efforts we have in mind	18:04
niemeyer	bcsaller: 100% with you	18:04
niemeyer	bcsaller: Hmmm	18:05
niemeyer	bcsaller: Here is an idea..	18:07
SpamapS	How is this at all relevant to local dev?	18:07
SpamapS	There's only one machine in local dev.	18:07
niemeyer	bcsaller: Let's introduce a command named "set-devel-flag"	18:07
niemeyer	SpamapS: Let's cover this in a moment..	18:07
SpamapS	Which would be "available" because it can add containers.	18:07
niemeyer	bcsaller: Or even better, "set-devel"	18:08
bcsaller	SpamapS: thats an important part of the change, but now the cli tools only look for unassigned machines so its a little more pervasive	18:08
niemeyer	bcsaller: Takes a json blob	18:08
niemeyer	bcsaller: and stores it in zookeeper, within the topology in a "devel" key	18:09
SpamapS	So to me, the current way, "find me an available machine" should just find you machine 0 .. your local machine. For EC2, since they can't do containers, they are unavailable as soon as they have 1 thing on them.	18:09
niemeyer	bcsaller: So we can experiment with different settings	18:09
niemeyer	SpamapS: Don't worry about it.. we're just splitting development in logical steps	18:10
niemeyer	SpamapS: We'll eventually give you the feature you want.	18:10
niemeyer	bcsaller: Or maybe it should really be "set-flag"	18:10
niemeyer	bcsaller: So that we can use that later	18:10
niemeyer	bcsaller: (rather than being specific to "development")	18:11
niemeyer	bcsaller: This way you can create an alternative path within the logic by consulting specific flags	18:11
niemeyer	bcsaller: Without altering the standard operation	18:11
niemeyer	bcsaller: Thoughts?	18:11
bcsaller	niemeyer: we could easily add arguments to deploy/add-unit that were conceptually --placement <strategy_or_plan> where it could be a machine id or the name of an available planner which could choose local, reuse, etc	18:11
bcsaller	as a counter proposal	18:11
niemeyer	bcsaller: Yes, we could, .. we'd also have to worry about getting it right.. you already spent a day thinking about it and didn't get to a good plan, so my suggestion is to get unblocked and	18:12
bcsaller	std ops through the code paths would all have to check those flags, which is fine, we want something like that anyway	18:12
niemeyer	bcsaller: have the actual goal in mind for the moment.. we can worry about neat placement strategies down the road	18:12
niemeyer	bcsaller: The problem we have at hand right now doesn't depend on this	18:12
bcsaller	I don't need to build those now, that wasn't the point	18:12
niemeyer	bcsaller: That's my point! ;-0	18:13
bcsaller	+1	18:13
bcsaller	I find that syntax better than talking about setting development flags in a json bucket, but under the hood it will play out much the same from the internals of those tools	18:14
niemeyer	bcsaller: So every time you do deploy wordpress/mysql/etc, you'll have --placement ?	18:14
cole	roadmap question: I get that ~/.ensemble/environments.yaml can be very easily modified to scale an app. is there a framework for allowing this to be done based on some performance threshold? like memory consumed or cpu utilization / overall cluster throughput etc… ??	18:14
niemeyer	bcsaller: In local development there can't be anything besides --placement=local	18:15
niemeyer	bcsaller: So where do you store the fact placement _has_ to be local?	18:15
bcsaller	niemeyer: it would just default to doing what it does, "unassigned" which points to the existing method	18:15
bcsaller	`local` is a method that says return machine<0>	18:16
niemeyer	bcsaller: Ok.. that sounds good as well.. can you please describe the semantics end-to-end?	18:16
niemeyer	cole: We'll be with you in a sec	18:16
bcsaller	`ensemble deploy --placement local mysql`	18:17
bcsaller	`ensemble deploy --placement local wordpress`	18:17
bcsaller	would place two units and assigned them to the machine returned by the policy, in this case machine 0 which is the local box	18:17
bcsaller	internally this would replace the code in deploy and add unit that maps/finds machines and does unit assignment with a callout to policy by name. If that option is an int, the machine id is resolved and used with a different policy function doing specific assignment	18:19
bcsaller	add-unit -n <num> --placement xxx could still be strange, with a policy it could work, with a machine id... ?	18:19
bcsaller	but that doesn't seem to be a blocker to me	18:19
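The `--placement` semantics bcsaller outlines above can be sketched as a lookup of named policies, with an integer value falling through to specific assignment. Everything here is hypothetical (the function names, the `POLICIES` table, and modelling machines as a dict of machine id to assigned unit names); it is not the code in deploy/add-unit or in state.service.

```python
def place_unassigned(machines):
    """Return the lowest machine id with no units, or None if all are in use."""
    free = [mid for mid, units in sorted(machines.items()) if not units]
    return free[0] if free else None


def place_local(machines):
    """Always return machine 0, the local box in local development."""
    return 0


# Hypothetical registry mapping --placement names to policy functions.
POLICIES = {"unassigned": place_unassigned, "local": place_local}


def choose_machine(placement, machines):
    """Resolve a --placement value: a machine id, or the name of a policy."""
    if placement.isdigit():
        machine_id = int(placement)
        if machine_id not in machines:
            raise ValueError("unknown machine id: %d" % machine_id)
        return machine_id
    if placement not in POLICIES:
        raise ValueError("unknown placement policy: %r" % placement)
    return POLICIES[placement](machines)


if __name__ == "__main__":
    machines = {0: ["mysql/0"], 1: []}
    print(choose_machine("local", machines))       # machine 0, even though it is in use
    print(choose_machine("unassigned", machines))  # first machine without units
```

A table of named policies keeps the CLI syntax stable while new strategies are added, and a provider could later supply the default name so that `--placement` is not needed on every call.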
hazmat	niemeyer, bcsaller unrelated to current discussion, i was looking over the co-location stuff on the ML, and was wondering if this isn't easier with the relation qualification co-located or a new relation type container, the distinguishing characteristic is that the physical placement, its odd indeed for a local co-located service to talk to an opposite end remote service. its more of a local either p2p relations between those units deployed in the same container, or a bus/ring container relation containing only the local units.	18:21
hazmat	bcsaller, that sounds good if default placement policy can derive from provider	18:21
niemeyer	bcsaller: We don't have to address specific assignment for the moment	18:21
hazmat	thus obviating the need for specifying it in the common case	18:21
niemeyer	bcsaller: I want to avoid the "I want this in machine X" feature for now	18:21
niemeyer	bcsaller: Because it blocks other characteristics we're interested in	18:21
bcsaller	niemeyer: I prefer that as well	18:22
niemeyer	hazmat: Sorry, I'll be with you soon.. let me unroll the stack of questions	18:22
niemeyer	bcsaller: Ok	18:22
bcsaller	niemeyer: a couple of named policies that map cli stuff to the service assignment code then is pretty simple and seems future aware	18:22
niemeyer	bcsaller: So --placement local sounds fine to bootstrap.. the local provider can somehow determine the default policy down the road	18:23
bcsaller	hazmat: it makes total sense that providers can carry code for specific policies	18:23
hazmat	cole, its definitely something we're thinking about, but its probably a ways out, we're currently working out how to get things like default service monitoring onto systems. in future with monitoring, and a remote api for ensemble, a user could provide scaling logic, its probably a while till ensemble provides it as a generic feature.	18:23
hazmat	bcsaller, not that they should per se have code, ideally it could be generic, just that they specify a default named policy	18:24
niemeyer	bcsaller: The point was more that we need to tweak default policy according to backend	18:24
bcsaller	ok	18:24
niemeyer	bcsaller: We don't want --placement switches on every single call on a local dev	18:24
cole	hazmat: thanks! I figured as much. I think we might be able to help in that area. project looks like it's coming along nicely!	18:25
bcsaller	right	18:25
bcsaller	got it	18:25
niemeyer	bcsaller: But I see your overall plan, it's a good idea, +1	18:25
bcsaller	cool, I can work on a branch for that today, sounds pretty simple	18:25
hazmat	cole, fwiw, as is though ensemble cli already enables the ability to scale a service and automatically reconfigure clusters for the additional capacity, just not as the automated scaling bit in response to service conditions.	18:26
niemeyer	SpamapS: So..	18:27
cole	hazmat: yep, got it!	18:27
niemeyer	SpamapS: The way the work is being structured is this:	18:27
niemeyer	1) Make multiple units work on a single machine across the board (no LXC)	18:28
niemeyer	2) Make local deployments work with one or multiple units (no LXC)	18:28
niemeyer	3) Make LXC work to deploy units locally (doesn't matter if EC2 can't do it yet)	18:28
niemeyer	SpamapS: bcsaller is working on step (1) still (he started yesterday :-)	18:28
SpamapS	Cool, I had a branch that did 1 with --machine $machine_id .. tho it was failing tests last I checked.	18:29
niemeyer	SpamapS: That's exactly the context of the conversation.. I don't want to nail the problem of specific assignment for the moment.. there are other approaches we can take for that (resource interest, service proximity, etc), and it's really unrelated to the core problem we're solving for local development	18:31
niemeyer	SpamapS: So I had one suggestion, and bcsaller has a better suggestion which we'll go down with.. --placement local..	18:31
niemeyer	SpamapS: This is a trivial bootstrap process that keeps the complex problems for later	18:32
bcsaller	SpamapS: where using a local provider would change the default placement policy for you	18:32
SpamapS	ACK	18:36
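The plan agreed above (a few named placement policies, a `--placement` switch that picks one, and a default that depends on the provider) can be sketched roughly as follows. This is only an illustration under stated assumptions: the function names, the `"unassigned"` policy name, and the provider-to-default table are hypothetical and not taken from the Ensemble codebase.

```python
# Hypothetical sketch; none of these names come from the Ensemble source.

# Assumed mapping from provider type to its default named policy, so a
# local dev environment never needs --placement on every call.
PROVIDER_DEFAULT_POLICY = {
    "local": "local",
    "ec2": "unassigned",  # assumed name for "one fresh machine per unit"
}


def place_local(machines):
    """Reuse the first known machine id, or machine 0 if none exist yet."""
    return machines[0] if machines else 0


def place_unassigned(machines):
    """Ask for a new machine id, one past the highest id in use."""
    return max(machines) + 1 if machines else 0


# Named policies that a CLI switch such as --placement could select.
POLICIES = {"local": place_local, "unassigned": place_unassigned}


def choose_machine(provider_type, machines, placement=None):
    """Pick a machine id using an explicit policy or the provider default."""
    name = placement or PROVIDER_DEFAULT_POLICY[provider_type]
    try:
        policy = POLICIES[name]
    except KeyError:
        raise ValueError("unknown placement policy: %r" % name)
    return policy(machines)
```

With this shape, `choose_machine("local", machines)` needs no explicit switch, while `choose_machine("ec2", machines, placement="local")` overrides the provider default for a single call. Specific assignment ("I want this on machine X") is deliberately left out, matching the decision to defer it.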
hazmat	niemeyer, although placement considering the service to be deployed (resource interest, service proximity) will need to receive it as part of the placement api	18:37
niemeyer	hazmat: Ok, re. co-location.. I agree the flag on the relation is probably all we need	18:37
niemeyer	hazmat: I don't see it as being special, though	18:37
niemeyer	hazmat: These relations still need well defined interfaces	18:38
hazmat	niemeyer, yeah.. well its not clear that a local co-located service needs to have any access to the remote units	18:38
niemeyer	hazmat: They don't _have_ to	18:38
hazmat	er. its opposite end	18:38
niemeyer	hazmat: But they should be _able_ to	18:38
niemeyer	hazmat: re. the placement point above, yes, I'm not trying to define how that's going to work right now	18:39
niemeyer	hazmat: Was rather just mentioning there are additional things we'll want to talk about and understand when sorting this actual issue	18:39
niemeyer	hazmat: The problem we have at hand right now is much simpler, though	18:39
hazmat	indeed	18:39
hazmat	okay.. i did some reviews and security work today, switching tracks i'm going to do a presentation tonight at a local python user group, going to prep for that	18:42
niemeyer	hazmat: Cool.. I'll switch to reviews.. is there something blocking you on that front?	18:43
niemeyer	I'd like to sort all of William's branches today, hopefully	18:43
hazmat	niemeyer, nope.. i've just been going through william's branches.. on the security front, the integration work is coming along, i've reworked the interfaces a few times, most recently to enable us to turn off security by default for tests (default for now is enabled), still a little bit of refactoring to do on the policy.. i'm trying to finish the end to end so i can fix up policy-rules branch based on better knowledge of its application.	18:45
niemeyer	hazmat: Cool	18:45
niemeyer	Huge wind storm here today	19:08
jcastro	how do you move between VTs in the tmux thing when you're in debug mode?	19:11
hazmat	jcastro, ctrl-a	19:17
hazmat	is the escape sequence, tmux config in debug-mode is set up to emulate screen	18:18
jcastro	ah, been spoiled by byobu I guess, heh	19:19
* jcastro finishes up his ensemble screencast		19:19
niemeyer	jcastro: We hope to use byobu again at some point	19:27
jcastro	easy to forget how spoiled I was	19:27
niemeyer	jcastro: kirkland is working on a set of configs for tmux, and hopefully we can also bring screen back in the future	19:27
hazmat	niemeyer, any progress on the repo work?	20:18
hazmat	just using the principia-tools to setup a demo.. and thinking ick	20:18
niemeyer	hazmat: None..	20:20
niemeyer	hazmat: Stuck on reviews, interviews, conversations, etc	20:21
niemeyer	hazmat: Hoping to get to it this week still	20:21
SpamapS	Ok I just uploaded txzookeeper 0.8.0 to oneiric.. and will upload trunk shortly as well.	20:59
SpamapS	hazmat: If there's anything minor I can do to make principia less "ick" .. let me know. I've tried to make it a little better of late. :-P	21:00
SpamapS	hazmat: don't want to spend much time on it though.. :)	21:00
hazmat	SpamapS, i appreciate the work on it, just wishing for a repository to obviate the need for additional tools to deploy	21:01
SpamapS	hazmat: exactly	21:01
SpamapS	hazmat: I'd like a better repo too.. principia is.. well a nice experiment. :()	21:02
SpamapS	hazmat: note that there's a 'principia update' command now.. which pulls a new list of formulas	21:02
SpamapS	hazmat: and some of the commands have --help	21:02
jcastro	SpamapS: what!	21:03
jcastro	where?	21:03
hazmat	SpamapS, if i had to capture in one line the three things to make the dev story better.. it would be "local dev, no formula revs, pre-allocate machines"	21:03
SpamapS	jcastro: in the ppa	21:03
jcastro	oh man	21:03
SpamapS	jcastro: sudo apt-get install principia-tools	21:03
jcastro	I totally missed that	21:03
hazmat	SpamapS, getall by itself seems to do the trick of updating (mr seems to do it)	21:03
jcastro	also, check it out: http://www.youtube.com/watch?v=4Rl7wTlUqkY	21:03
SpamapS	hazmat: getall calls update :)	21:03
hazmat	well of grabbing new formulas	21:03
hazmat	nice :-)	21:03
hazmat	SpamapS, was that a good summation of things? or are there others that get top billing?	21:04
SpamapS	hazmat: yeah definitely.. though I have to say, the formula dev story is already pretty damn good.. our standards just keep going up. :)	21:04
hazmat	SpamapS, indeed, but precious seconds get lost, and turned into minutes.. we keep getting busier ;-)	21:05
hazmat	i think i figured out a quick way to pre-allocate machines, but the allocation doesn't take place till the first formula is deployed	21:06
hazmat	which is kind of a bummer, it's more like a delayed pre-allocation	21:06
SpamapS	yeah I think it actually makes sense to enable it as its own command	21:06
hazmat	SpamapS, like add-machines 5 ?	21:06
SpamapS	ensemble bootstrap && ensemble allocate-machines --ec2.instance-type=m1.small 10	21:07
SpamapS	It would be cool to have every aspect of the environment available as --env.x=foo	21:07
SpamapS	Would solve a lot of the "need a way to specify X at runtime"	21:08
SpamapS	jcastro: cool video	21:08
hazmat	hmmm.. that sounds good re allocate-machines.. the env.x syntax is likely problematic.. it's kind of redundant in that the cli is already targeting an env for any op, so the qualification is odd	21:10
SpamapS	hazmat: it's to prevent namespace collision	21:11
SpamapS	hazmat: doesn't have to be ec2. .. could be --envset instance-type=m1.small	21:11
SpamapS	hazmat: or just bury it in the positional args	21:12
SpamapS	hazmat: just seems like a good idea to be able to override settings at runtime, that's all	21:12
hazmat	SpamapS, ic.. i was thinking just allocate-machines --provider-size=m1.small 4	21:17
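To compare the two spellings floated above, here is a minimal argparse sketch of a hypothetical `allocate-machines` subcommand that accepts both `--provider-size=m1.small` and repeated `--envset KEY=VALUE` overrides. No such subcommand or flags exist in Ensemble at this point in the log; everything here is an assumption for illustration only.

```python
import argparse


def build_parser():
    """Build a parser for the hypothetical 'allocate-machines' subcommand."""
    parser = argparse.ArgumentParser(prog="ensemble")
    sub = parser.add_subparsers(dest="command")
    alloc = sub.add_parser("allocate-machines")
    # hazmat's variant: one dedicated, provider-neutral option.
    alloc.add_argument("--provider-size", default=None)
    # SpamapS's variant: generic runtime overrides of environment settings.
    alloc.add_argument("--envset", action="append", default=[],
                       metavar="KEY=VALUE")
    alloc.add_argument("count", type=int, help="number of machines")
    return parser


def parse_overrides(pairs):
    """Turn ['instance-type=m1.small'] into {'instance-type': 'm1.small'}."""
    overrides = {}
    for pair in pairs:
        key, sep, value = pair.partition("=")
        if not sep or not key:
            raise ValueError("expected KEY=VALUE, got %r" % pair)
        overrides[key] = value
    return overrides
```

Parsing `["allocate-machines", "--provider-size=m1.small", "4"]` yields a `count` of 4 with the size set, and `--envset instance-type=m1.small` lands in the overrides dict. The generic form avoids adding one option per setting, at the cost of less discoverable `--help` output.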
=== robbiew is now known as robbiew-afk
niemeyer	Ugh.. almost 8	22:43
niemeyer	Time flew by today	22:43
niemeyer	ALRIGHT!	23:02
niemeyer	We have an almost empty review queue!	23:02
niemeyer	It's been a while..	23:02
niemeyer	But!	23:02
niemeyer	We still need a hand on this one:	23:02
niemeyer	https://code.launchpad.net/~fwereade/ensemble/webdav-storage	23:02
niemeyer	It lacks a second review	23:02
niemeyer	Any takers?	23:02
_mup_	Bug #820107 was filed: Ensemble should enable flexible unit placement <Ensemble:In Progress by bcsaller> < https://launchpad.net/bugs/820107 >	23:35