pullieshi, i'm trying out ensemble but when running `ensemble status` i get an Invalid SSH key message00:41
pulliesi have specified authorized-keys-path in my environment (that's what let me run bootstrap in the first place)00:42
pulliesthis is when using the ppa, btw.00:43
pulliesany advice?00:43
niemeyerpullies: Hey there00:51
niemeyerpullies: Hmmm00:51
niemeyerpullies: What's the content of the file you're pointing at with the authorized-keys-path?00:53
niemeyerpullies: It usually gets that automatically from your ~/.ssh/id_dsa.pub or ~/.ssh/id_rsa.pub files00:55
niemeyerpullies: Did you have these when you were seeing the error earlier?00:55
RoAkSoAx_niemeyer: who creates id_{dsa|rsa}.pub00:58
RoAkSoAx_in the zookeeper?00:58
niemeyerRoAkSoAx_: deploy does00:58
niemeyerRoAkSoAx_: It serializes with the environment00:58
RoAkSoAx_niemeyer: ok cool, cause I was encountering situations in which the zookeeper complained about the keys but there were no keys01:02
RoAkSoAx_niemeyer: err was looking for keys, but there were no keys01:03
niemeyerRoAkSoAx_: Ok.. pullies was just reporting a similar issue above01:03
niemeyerRoAkSoAx_: Maybe that's the problem01:03
niemeyerpullies: Have you deployed something?  That could be the issue01:03
niemeyerWe should handle that bootstrapping phase better in that sense01:03
RoAkSoAx_niemeyer: but in my case, after bootstrapping the zookeeper doesn't have any .pub keys created01:05
=== daker_ is now known as daker
fwereadeheya hazmat :)11:46
pulliessorry, i disappeared for the night, as you could probably tell.  ;-)  i have deployed nothing.12:22
pullies-----BEGIN RSA PRIVATE KEY-----12:22
pullies(a base64 encoded string which i won't paste here)12:22
pullies-----END RSA PRIVATE KEY-----12:22
* hazmat has to give an ensemble presentation tonight, but his laptop does a kernel panic everytime it sleeps12:26
fwereadedid something about how we handle $PYTHONPATH change?12:46
hazmatfwereade, no.. not in a very long time12:47
fwereadehazmat: ok, I'm doing something dumb then :)12:47
hazmatfwereade, there is a new bug open about pythonpath being set for hooks which it probably shouldn't be12:47
hazmatfwereade, what's the symptom?12:47
fwereadeI set PYTHONPATH, run bin/ensemble, and it still picks up the system version12:48
hazmatfwereade, a system version (ppa install) of ensemble?12:48
fwereadehazmat: yep12:48
hazmatfwereade, is it ppa or a manual installation via sudo setup.py install?12:49
hazmatfwereade, i'd suggest first removing the package12:49
hazmatPYTHONPATH=$PWD python -c "import ensemble; print ensemble"12:50
hazmatis a quick verification of the import path for ensemble12:50
hazmater.. PYTHONPATH=$PWD:$PYTHONPATH  is probably better12:51
hazmatfor real usage12:51
fwereadehazmat: that appears to work12:51
fwereadehazmat: anyway, don't worry, all I really needed was verification that it was me being stupid ;)12:51
hazmatfwereade, path problems hit all of us one time or another..12:52
* fwereade is reassured ;)12:54
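A minimal sketch of the import-path check hazmat suggests above, written for Python 3 (the one-liner in the log is Python 2). The helper name `module_origin` is illustrative and not part of ensemble; it only reports where Python would load a module from under the current PYTHONPATH.

```python
import importlib.util


def module_origin(name):
    """Return the file Python would import `name` from, or None if not found."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


if __name__ == "__main__":
    # Run from the branch root, e.g.:
    #   PYTHONPATH=$PWD:$PYTHONPATH python check_import.py
    # A path under the system site-packages means the packaged copy still wins.
    print(module_origin("ensemble"))
```

If the printed path points at the system installation rather than the working tree, removing the package first (as suggested above) is the simpler fix.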
hazmatfwereade, btw, when you're the second reviewer on a branch, you should adjust the merge proposal status (at the top): if both the reviews are approve, then approved, else work in progress.12:58
* hazmat dog walks, bbiab12:59
fwereadehazmat: cool, thanks13:01
fwereadehazmat: I think I may have thought it happened by magic :p13:02
pulliessorry, a little context would probably help.  last night i reported an error that `ensemble status` gives an Invalid SSH key message when i specify my authorized-keys-path in environment13:02
hazmatpullies, so you have the path to your ssh key in 'authorized-keys-path' in the provider section of the environment? 13:03
hazmatpullies, and yes the context helps ;-)13:04
pullieshazmat, i have a path to my ssh key in environments.sample.authorized-keys-path13:05
hazmatpullies, looking at the code, it looks like the path is a misnomer :-( it wants the name of a file in the .ssh directory13:05
hazmatpullies, if you're up for it please file a bug that this is misleading/confusing regarding the name and usage of this setting13:05
pullieshazmat, where's the bug tracker?13:06
hazmatpullies, you're going to need to shutdown and bootstrap again, we need the ssh key active for doing any connection from the cli to the ensemble setup13:07
hazmatpullies, so you're saying you were able to bootstrap with an invalid key?13:07
hazmatthat's also a problem/bug13:07
hazmatif so13:07
pullieshazmat, running the instance was successful.  i believe connecting to it was not13:07
pulliesthis launchpad dashboard is confusingly worded about openid13:08
hazmatpullies, suggestions welcome on #launchpad ;-)13:08
hazmatpullies, thanks for filing a bug, i'm heading out for a few minutes, let us know how it goes13:09
pulliesstill the same error message13:10
pulliesi'm assuming "the ssh directory" is ~/.ssh13:11
_mup_Bug #819803 was filed: authorized-keys-path is actually a filename, not a path. <Ensemble:New> < https://launchpad.net/bugs/819803 >13:14
pullieshazmat, it's possible that ssh access hasn't been enabled for the security group, poking around at ec2 docs.  can you confirm that this is a necessary precondition that ensemble doesn't take care of?13:17
hazmatpullies, ensemble does indeed take care of ec2 security groups13:31
hazmatpullies, it takes a minute or two for the instance to be up and responding13:31
hazmatpullies, actually it does look like it will try either a full path or a name13:31
pulliesi've skipped the ensemble part.13:32
hazmatpullies, it looks like you would get a LookupError13:32
pulliesi'm trying to use the keypair itself13:32
hazmatnot an invalid ssh key error, if it couldn't find the key13:32
pulliesto make sure i can login13:32
pulliesand i can't.13:32
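A rough sketch of the lookup behaviour hazmat describes above (try the value as a full path, then as a file name inside the ssh directory, and raise LookupError if neither exists). This is not ensemble's actual code; `resolve_key_file` and its `ssh_dir` parameter are hypothetical names used only for illustration.

```python
import os


def resolve_key_file(value, ssh_dir=None):
    """Resolve a key setting given either as a path or as a name in ~/.ssh."""
    # Assumption: the ssh directory defaults to ~/.ssh, as guessed in the log.
    ssh_dir = ssh_dir or os.path.expanduser("~/.ssh")
    candidates = [
        os.path.expanduser(value),      # value used as a (possibly ~-relative) path
        os.path.join(ssh_dir, value),   # value used as a bare file name
    ]
    for candidate in candidates:
        if os.path.isfile(candidate):
            return candidate
    # A missing file is a lookup failure, distinct from an "invalid key" error.
    raise LookupError("no key file found for %r" % (value,))
```

Under this sketch a wrong file name would surface as a LookupError, which matches the point above that an Invalid SSH key message suggests the file was found but its content was rejected.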
pulliesssh -i ~/.ssh/mykey.pem ubuntu@ec2-ip.compute-1.amazonaws.com13:33
pulliesthat should succeed, yes?13:33
pulliesssh -v is telling me that it reads the rsa private key13:33
pullies"authentications that can continue: public key" is issued twice13:34
pulliesi'm a bit confused why i can't ssh directly in13:34
hazmatpullies, the ssh key specified for ensemble is the public key, not the private key13:35
hazmatpullies, do you have an id_[dsa|rsa].pub in ~/.ssh ?13:35
pulliesamazon only gave me a .pem file to download13:36
hazmatpullies, ensemble  doesn't use that13:36
hazmatpullies, try ssh-keygen -t rsa13:36
pullieswill generate a local key and try that13:36
pullies:-)  duh.13:36
hazmatpullies, you can remove the authorized-keys-path as well, ensemble picks up default keys automatically if none are specified13:37
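The confusion above (a .pem private key where a public key was expected) can be caught with a quick look at the first line of the file. The sketch below is an illustration only, not ensemble's validation; `classify_key_text` and the `PUBLIC_PREFIXES` list are assumed names, and the list covers just a few common OpenSSH key types.

```python
# Key-type prefixes that start a one-line OpenSSH public key (not exhaustive).
PUBLIC_PREFIXES = ("ssh-rsa", "ssh-dss", "ssh-ed25519")


def classify_key_text(text):
    """Return 'private', 'public', or 'unknown' for the given key file content."""
    stripped = text.strip()
    first_line = stripped.split("\n", 1)[0] if stripped else ""
    # PEM private keys begin with a header such as -----BEGIN RSA PRIVATE KEY-----
    if first_line.startswith("-----BEGIN") and "PRIVATE KEY" in first_line:
        return "private"
    # Public keys look like: "<type> <base64> [comment]"
    if first_line.split(" ", 1)[0] in PUBLIC_PREFIXES:
        return "public"
    return "unknown"
```

For the file pasted earlier in the log, the header line alone is enough to classify it as a private key.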
fwereadedifference between orchestra and ec2: we can't easily tell whether an orchestra machine is running13:37
fwereadeso get_zookeeper_machines is problematic on orchestra13:38
fwereadebecause it can't verify the sanity of the state it gets from FileStorage13:38
fwereadein orchestra, should we be (say) trashing state in shutdown, or should we figure out some way to query the machines and thereby match ec2 better?13:39
hazmatfwereade, orchestra doesn't know if machines it setup are running? 13:41
hazmati thought it was doing a tftp/dhcp setup, maybe that's not exposed via the api?13:42
fwereadehazmat: nope, you can theoretically query power status13:42
fwereadebut that's acted weird for me13:43
fwereadeand that still doesn't tell us whether they're actually running, or if something went wrong part way through install (for example)13:43
hazmatfwereade, it looks like the examples specify remote power management but not status13:43
fwereadethe api includes a "status" command, which AFAICT acts like an "off" command13:44
hazmatfwereade, that seems like an upstream bug if that's the case13:45
hazmatfwereade, not having any orchestra/cobbler experience, i'm not sure what the options are. but if the zk pointer file is invalid, the whole thing basically breaks down.13:46
niemeyerHey guys!13:46
fwereadehazmat: I'm assuming for now that it's something I'm doing wrong, I tend to defer to RoAkSoAx_ for the final word on these things13:46
hazmatniemeyer, top of the morning13:46
fwereadeheya niemeyer13:47
fwereadehazmat: it's fine if provider-state is borked, on ec2, because we can check machine status13:47
fwereadehazmat: we bootstrap if there's no state, or if the state is nonsensical => probably already shut down13:47
hazmatfwereade, well on ec2 we always intersect the provider state against machine status, because we need the ip resolution13:47
niemeyerhazmat: balance to you my friend13:48
fwereadehazmat: maybe that's the intent, I'm just telling you what I could infer from the code :)13:48
hazmatfwereade, yup, indeed that's the case.. orchestra is a different beast a bit13:48
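The EC2 behaviour discussed above (bootstrap when there is no state, or when the recorded state does not match any running machine) can be sketched as follows. This is a simplification under stated assumptions: the `zookeeper-instances` key and the function name `should_bootstrap` are illustrative and are not taken from the ensemble source.

```python
def should_bootstrap(provider_state, running_machine_ids):
    """Decide whether a fresh bootstrap is needed.

    provider_state: dict loaded from the stored provider-state (or None).
    running_machine_ids: ids the provider currently reports as running.
    """
    # Assumption: the state lists zookeeper machines under this key.
    recorded = (provider_state or {}).get("zookeeper-instances", [])
    # Intersect the recorded ids with what the provider says is running.
    live = [machine_id for machine_id in recorded
            if machine_id in running_machine_ids]
    # No state, or state that matches nothing alive: bootstrap again.
    return not live
```

On orchestra the second argument is the hard part, since, as noted above, there is no reliable way to ask whether a machine is actually running.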
pullieshazmat, now this is progress13:49
pullies2011-08-02 09:48:45,857 ERROR SSH forwarding error: Agent admitted failure to sign using the key. Permission denied (publickey).  13:49
niemeyerhazmat, fwereade Got the conversation mid-way through, but it sounds sensible to trash state on shutdown13:49
niemeyerI'd rather rename shutdown to destroy-environment, but that's another topic13:49
hazmatpullies, if you modify/change the key, you'll need to ensemble shutdown && ensemble bootstrap13:49
pulliesthis is after doing that13:49
pulliesand i've removed the path from environments.yaml13:50
niemeyerpullies: Have you run deploy already?13:51
hazmatpullies, so if you do ec2-describe-instances do you see the instance running (the security group should match the environment name prefixed with 'ensemble-')13:51
fwereadeniemeyer: cool, that feels like it would make life easier on my side and do no harm to ec213:51
niemeyerWell, I guess it doesn't really matter actually13:51
niemeyerfwereade: Agreed13:51
hazmatniemeyer, yeah.. failure to connect precludes deploy13:51
niemeyerhazmat: I was thinking that the key is serialized with the env, which happens at deploy13:52
niemeyerhazmat: But that's something else.. we need a key there to connect to zk in the first place13:52
hazmatpullies, you should be able to ssh into the machine directly using ssh ubuntu@ec2-host-name 13:52
hazmatniemeyer, the public key is sent at launch time via cloud-init13:52
hazmatniemeyer, its not stored in zk13:52
pullieshazmat, the dashboard shows the machine.13:53
niemeyerhazmat: It is stored in zk during deploy13:53
hazmatniemeyer, the environment is yes13:53
niemeyerhazmat: Otherwise how would it be in cloud-init for the other machines13:53
niemeyerhazmat: The keys13:53
hazmatniemeyer, totally 13:53
hazmatbut not for the bootstrap13:53
niemeyerhazmat: Yes, I guess that's what I said above?13:53
hazmatpullies, so what i'd like to verify is from the cli you can log into that machine via ssh, if you didn't rename the ssh key, it should just pick up the private side of the same default13:55
hazmatssh will try a few from what it finds in ~/.ssh13:56
pulliesi generated the key twice.  it's possible something is cached in either my client or theirs.  will have to log out and try again13:58
pullieswill attempt it tonight13:58
pulliesthanks for the help.  will definitely focus on the ssh portion, i don't think it's ensemble at this point13:58
niemeyerstatik: Morning14:11
statikmorning niemeyer14:12
RoAkSoAx_fwereade: howdy!!14:16
fwereadeRoAkSoAx_: heyhey!14:17
fwereadehow's it going?14:17
RoAkSoAx_fwereade: pretty good, you?14:17
fwereadeRoAkSoAx_: pretty good thanks :)14:17
fwereadeRoAkSoAx_: and I got netboot 9% working on my cobbler, too14:17
fwereadeshadow-trunk is up to date, and might even work for you now ;)14:18
RoAkSoAx_fwereade: cool, where are you stuck?14:18
fwereadeer, that should have been a 99% up there :)14:18
fwereadethe ubuntu-orchestra-client install14:18
RoAkSoAx_fwereade: cool, I'm actually pulling your branch to test now14:18
RoAkSoAx_fwereade: you mean the variable?14:18
RoAkSoAx_on the preseed?14:19
fwereade(1) it asks for an rsyslog server, and then fails with "cannot stat /var/something/puppet"14:19
RoAkSoAx_fwereade: show me the line in the preseed14:19
fwereadeRoAkSoAx_: yeah, I remember you telling me to "just comment it out for now" a while ago, so that's what I did14:19
fwereadecan't copy from VM, but it's the pkgsel one as copied from your mail14:20
RoAkSoAx_fwereade: is the creation of the cloud-init data fixed?14:20
fwereadeRoAkSoAx_: I *think* so14:20
fwereadeRoAkSoAx_: I now generate something that actually looks like a working EC2 one14:21
RoAkSoAx_fwereade: ok, gonna test now then ;)14:21
fwereadeRoAkSoAx_: the precise details of how I screwed it up the first time are far too embarrassing to relate :p14:21
fwereadeRoAkSoAx_: sweet, tyvm14:21
RoAkSoAx_fwereade: hehe its all good14:22
fwereadeRoAkSoAx_: hm, I seem to be getting "204 No Content"s from webdav, which I wasn't before, but it all works (apart from the error, heh)14:31
RoAkSoAx_fwereade: on the orchestra server, what's in /var/lib/webdav14:32
fwereadeRoAkSoAx_: the right stuff14:32
RoAkSoAx_fwereade: formulas dir and provider-state?14:33
fwereadeRoAkSoAx_: no but yes (I haven't got a formulas dir at the moment, but the right content was written to provider-state)14:33
RoAkSoAx_fwereade: mkdir -p /var/lib/webdav/formulas && chown -R www-data:www-data /var/lib/webdav/formulas14:35
RoAkSoAx_fwereade: ok, so bootstrapping works, ensemble status doesn't14:36
fwereadeawesome! I haven't even thought about what status does, so that's the progress I wanted :)14:37
RoAkSoAx_fwereade: ok, in orchestra means that was related to having @property def _machines:14:38
fwereadeRoAkSoAx_: indeed, and my understanding was that that was something you wanted to defer until the sprint14:38
RoAkSoAx_fwereade: but anyways, what's the last thing merged there and what was left to "separate" from the old bootstrap-orchestra branch?14:38
RoAkSoAx_fwereade: i wanna have it working though14:38
fwereadeRoAkSoAx_: sounds good to me14:39
fwereadeRoAkSoAx_: were we going with "stick it into ks_meta" for now?14:39
fwereadelast thing merged into shadow-trunk is cobbler-launch-machine14:39
RoAkSoAx_fwereade: so what's missing, the shutdown stuff?14:39
fwereadecobbler-kill-machine is WIP14:39
fwereadebootstrap-verify-storage is an unrelated bug I picked up lest I spin my wheels on monday, and that should be good soon14:40
RoAkSoAx_fwereade: alright, so I'll re-read your branch and try to identify what's missing from the things I wanted to do14:40
fwereadeRoAkSoAx_: I plan to make one more change -- to treat 204 as success (as I think is correct: processed successfully, doesn't feel it needs to return any content)14:41
RoAkSoAx_fwereade: I haven't seen that, where did you see that?14:42
RoAkSoAx_or in what situation14:42
fwereadeI seem to be getting that every time I PUT provider-state to webdav14:42
RoAkSoAx_fwereade: i haven't seen anything14:42
RoAkSoAx_fwereade: make sure the formulas dir is there and restart apache2 and see if it keeps throwing that error14:43
RoAkSoAx_fwereade: is the default storage-url also in?14:43
fwereadewell, it's not an error, it seems like a perfectly legitimate response: "yep, cool, I've handled your request and I have nothing more to tell you, but here's a fresh etag maybe"14:43
fwereadebut twisted getPage seems to consider "not 200" == "something happened, raise an exception"14:44
fwereadeI'll bounce apache anyway, but I think what'll fix it is deleting provider-state, I'll let you know in 514:45
RoAkSoAx_fwereade: nah nothing will delete provider-state14:45
RoAkSoAx_fwereade: you'd have to do it manually in order to be able to bootstrap again14:45
fwereadeRoAkSoAx_: what I'm doing is setting it to {} on shutdown14:46
fwereadeRoAkSoAx_: and, yes, if I overwrite I get 204, if I trash it manually I get 20014:47
fwereadeRoAkSoAx_: overwrite is perfectly reasonable behaviour, I'll just make sure ensemble understands that14:47
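A small sketch of the change fwereade describes: treating any 2xx response to a PUT (including 204 No Content on overwrite) as success rather than only 200. The helper name `put_succeeded` is hypothetical; how the status code is obtained from the HTTP client is left out, since that depends on the client in use.

```python
from http import HTTPStatus


def put_succeeded(status_code):
    """Return True for any 2xx status, e.g. 200 OK, 201 Created, 204 No Content."""
    return 200 <= int(status_code) < 300


# HTTPStatus members are integers, so they can be passed directly.
assert put_succeeded(HTTPStatus.OK)
assert put_succeeded(HTTPStatus.NO_CONTENT)
```

Checking the whole 2xx range, rather than listing 200 and 204 explicitly, also covers a server that answers 201 Created when the file did not exist before.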
RoAkSoAx_fwereade: i don't think we would need to delete provider-state on shutdown14:47
RoAkSoAx_fwereade: remember that we are dealing with physical hw14:48
RoAkSoAx_and it is expensive14:48
RoAkSoAx_to be installing every time we want a zookeeper14:48
RoAkSoAx_when we already have one14:48
=== RoAkSoAx_ is now known as RoAkSoAx
fwereadehm, I thought that ensemble shutdown was intended to wipe out the whole environment -- inverse of bootstrap14:48
RoAkSoAxfwereade: that's one of the things I'm also planning to address.14:49
fwereadethat's what it seems to do on EC2 anyway :)14:49
RoAkSoAxfwereade: yes, on ec2 is non-expensive because you can fire up instances or destroy them on demand14:49
fwereadeRoAkSoAx: heh, ok14:49
RoAkSoAxfwereade: but in real hardware is not the same approach14:49
fwereadeRoAkSoAx: I've been working under the assumption that I should mirror ec2 behaviour as closely as possible14:50
fwereadeRoAkSoAx: at least for now ;)14:51
RoAkSoAxfwereade: yes, but I think things like that14:51
RoAkSoAxcan be avoided for now14:51
RoAkSoAxfwereade: I mean, wiping out provider-state is a super minor change14:51
RoAkSoAxand I don't think it is necessary14:51
fwereadeRoAkSoAx: well, keeping a zookeeper around is quite a major difference, it seems to me :)14:52
fwereadeRoAkSoAx: well... it's very convenient for me :)14:52
RoAkSoAxfwereade: indeed, but again, we are dealing with real hardware in this case14:52
RoAkSoAxfwereade: sysadmins *won't* install zookeepers every week to deploy environments but rather, they would keep one zookeeper up and running at all times14:53
RoAkSoAxfwereade: it is expensive in many ways, 1: downtime 2. network bandwidth wasted 3. hardware is useless 4. reinstallations at all times are expensive14:54
RoAkSoAxfwereade: why this works on ec2? simply because i can fire up/destroy instances on demand and costs 2 cents?14:54
RoAkSoAxfwereade: where there's a prebuilt image14:54
fwereadeRoAkSoAx: heh, got you, it's the system install cost not the zookeeper install cost (right?)14:55
RoAkSoAxfwereade: yes14:55
fwereadeRoAkSoAx: ...but we still pay the system install cost for every other machine, right?14:55
RoAkSoAxfwereade: right, *but* the idea is now to figure out a way of *re-using* the machines instead14:55
fwereadeRoAkSoAx: and if we have a local mirror it's not going to be *that* big a difference is it?14:55
fwereadeRoAkSoAx: ha -- I see :)14:56
fwereadeRoAkSoAx: that goal has escaped me14:56
fwereadeRoAkSoAx: sorry :)14:56
RoAkSoAxfwereade: hehe but yeah having a mirror is still a big difference when deploying services14:56
RoAkSoAxcause it still uses bandwidth14:57
RoAkSoAxand multiplied by lots of servers14:57
RoAkSoAxit is huge14:57
RoAkSoAxfwereade: but yes, that's another thing that I was gonna bring up during the sprint14:57
fwereadeRoAkSoAx: you make a lot of sense14:57
* RoAkSoAx better starts writing down all this stuff otherwise he'll forget :)14:58
fwereadeRoAkSoAx: it just doesn't precisely fit with what I'd understood our goals to be -- I thought we were aiming for parity with ec2 for now, and figuring out the tricky stuff at the sprint14:58
* fwereade would appreciate that :)14:58
RoAkSoAxfwereade: yeah we can do that if you want too14:59
RoAkSoAxfwereade: dealing with VM's is as inexpensive as ec214:59
niemeyerI seem to remember the wiki sent us to the right page after authenticating14:59
niemeyerIt doesn't look like that's the case anymore14:59
fwereadeRoAkSoAx: well, that's my justification for what I've been doing15:00
RoAkSoAxfwereade: but right now, what I was personally looking for is having it bootstrapping, deploying, etc, working (not really exactly the same as ec2, but close), so that during the sprint we could address these issues and differences with ec215:00
fwereadeRoAkSoAx: I feel it's currently useful, towards that goal, even if things change as the plans firm up15:00
RoAkSoAxfwereade: you don't need to justify as we didn't set any boundaries about stuff like these when we started15:01
fwereadeRoAkSoAx: that's my idea too, with the added condition of "on my local VM network"15:01
RoAkSoAxfwereade: but my concern is that you might end up writing code that might be later dismissed :)15:01
fwereadeRoAkSoAx: deleting code is one of the great joys in life ;)15:01
RoAkSoAxfwereade: hehehe alright15:02
RoAkSoAxfwereade: again I don't mind you doing that, seriously, just giving you a broad view of what I have in my mind at the moment :)15:02
fwereadeRoAkSoAx: cool, I was worried I was going off into the weeds :)15:02
fwereadeRoAkSoAx: good to resync ;)15:02
RoAkSoAxfwereade: nah.. either way, these things are gonna have to be discussed next week so my thoughts may change given input from others15:03
fwereadeRoAkSoAx: cool -- anyway, I'll handle 204s on .put() and propose launch-machine and bootstrap-verify-storage15:04
RoAkSoAxfwereade: cool15:04
fwereadeRoAkSoAx: and that'll probably be my day, but I might be able to check in later when cath's gone to bed15:04
fwereadeRoAkSoAx: I'll make sure shadow-trunk is up to date with whatever I've proposed15:05
RoAkSoAxi'll work on reviewing what would be missing from shadow-trunk in comparison to bootstrap's branch15:05
=== daker is now known as daker_
niemeyer<RoAkSoAx_> fwereade: i don't think we would need to delete provider-state on shutdown15:54
niemeyer<RoAkSoAx_> fwereade: remember that we are dealing with physical hw15:54
niemeyer<RoAkSoAx_> and it is expensive15:54
niemeyerRoAkSoAx: destroy-environment should really destroy it..15:54
niemeyerRoAkSoAx: I agree with you that physical hardware may make the admin act differently15:55
niemeyerRoAkSoAx: E.g. not destroying the environment15:55
niemeyerRoAkSoAx: It should be possible to terminate services and take them off the machines so that we can reuse not only the bootstrap machine but all of them15:55
fwereadeeveryone: I need to be away sharpish, I'm afraid15:55
niemeyerRoAkSoAx: But that's about _using_ the env15:55
niemeyerRoAkSoAx: Without destroying it15:55
niemeyerRoAkSoAx: Having ensemble destroy-environment not destroying it for reuse would be awkward15:56
fwereadebut I have a couple of new mps, and I would appreciate reviews from one and all, either on those or on their various prerequisites :)15:56
fwereadeenjoy your afternoons :)15:56
niemeyerI'm stepping out as well, but for lunch.. biab15:57
RoAkSoAxniemeyer: right, but from my point of view, destroy an environment should destroy everything, but leave the information from the zookeeper, so next time someone will like to bootstrap, it can detect "hey there's already a zookeeper, if it is sleeping, let's wake it up, if it is awake, let's use it"16:04
RoAkSoAxniemeyer: and that way we save ourselves from reinstalling a machine again16:05
niemeyerRoAkSoAx: zk is part of the environment16:47
niemeyerRoAkSoAx: It's actually a key part of it16:47
niemeyerRoAkSoAx: If one wants to save the time to redeploy zk, just don't destroy the environment16:48
niemeyerRoAkSoAx: It's a "doctor, it hurts!" case :)16:48
_mup_ensemble/states-with-principals r303 committed by kapil.thangavelu@canonical.com16:48
_mup_statebase retry topology change respects change functions which yield control.16:48
niemeyerbcsaller: How's it going with the local dev stuff?17:55
bcsallerniemeyer: I'm working on trying to add flexibility to how machine assignment is done in deploy/add_unit. Those both use state.service.assign_to_unassigned_machine which clearly isn't always what we want. 17:57
bcsallerniemeyer: but specifying machines in deploy/add-unit is a little at odds with the co-location spec. It's a different axis to plot unit placement on 17:58
niemeyerbcsaller: Don't worry about co-location for the moment..17:58
niemeyerbcsaller: This is really a different angle of the problem17:58
bcsallerjust keeping it in mind17:58
niemeyerbcsaller: Cool, that's nice17:58
niemeyerbcsaller: Hmm.. but we do have specific assignment, right?17:59
niemeyerbcsaller: assign_to_unassigned is just one method we have17:59
bcsallerits the only one used17:59
bcsallerin the cli17:59
bcsallerso really it becomes about providing access to other means for placement (as a starting point)18:00
bcsallerI know there is a desire down the road to say things like `ensemble add-unit -n <num> service`18:01
bcsallerbut if deploy and add-unit grow syntax to support machine assignment I want it to be future friendly 18:01
niemeyerbcsaller: Have you seen assign_to_machine?18:02
bcsallerniemeyer: I think the issue comes in at the cli level to be clear18:03
niemeyerbcsaller: That's why I don't get the problem you're describing.. sure, we have assign_to_unassigned_machine, which is the hard one..18:03
niemeyerbcsaller: We also have an explicit one18:03
niemeyerbcsaller: Which is easy to use18:03
bcsallerniemeyer: its literally an issue of cli syntax I'm talking about, not a coding hurdle 18:03
niemeyerbcsaller: Ahh, ok18:03
bcsallerI don't want to blindly add new syntax that isn't friendly to the other efforts we have in mind 18:04
niemeyerbcsaller: 100% with you18:04
niemeyerbcsaller: Hmmm18:05
niemeyerbcsaller: Here is an idea..18:07
SpamapSHow is this at all relevant to local dev?18:07
SpamapSThere's only one machine in local dev.18:07
niemeyerbcsaller: Let's introduce a command named "set-devel-flag"18:07
niemeyerSpamapS: Let's cover this in a moment..18:07
SpamapSWhich would be "available" because it can add containers.18:07
niemeyerbcsaller: Or even better, "set-devel"18:08
bcsallerSpamapS: thats an important part of the change, but now the cli tools only look for unassigned machines so its a little more pervasive 18:08
niemeyerbcsaller: Takes a json blob18:08
niemeyerbcsaller: and stores it in zookeeper, within the topology in a "devel" key18:09
SpamapSSo to me, the current way, "find me an available machine" should just find you machine 0 .. your local machine. For EC2, since they can't do containers, they are unavailable as soon as they have 1 thing on them.18:09
niemeyerbcsaller: So we can experiment with different settings18:09
niemeyerSpamapS: Don't worry about it.. we're just splitting development in logical steps18:10
niemeyerSpamapS: We'll eventually give you the feature you want.18:10
niemeyerbcsaller: Or maybe it should really be "set-flag"18:10
niemeyerbcsaller: So that we can use that later18:10
niemeyerbcsaller: (rather than being specific to "development")18:11
niemeyerbcsaller: This way you can create an alternative path within the logic by consulting specific flags18:11
niemeyerbcsaller: Without altering the standard operation18:11
niemeyerbcsaller: Thoughts?18:11
bcsallerniemeyer: we could easily add arguments to deploy/add-unit that were conceptually --placement <strategy_or_plan> where it could be a machine id or the name of an available planner which could choose local, reuse, etc18:11
bcsalleras a counter proposal 18:11
niemeyerbcsaller: Yes, we could, .. we'd also have to worry about getting it right.. you already spent a day thinking about it and didn't get to a good plan, so my suggestion is to get unblocked and18:12
bcsallerstd ops through the code paths would all have to check those flags, which is fine, we want something like that anyway18:12
niemeyerbcsaller: have the actual goal in mind for the moment.. we can worry about neat placement strategies down the road18:12
niemeyerbcsaller: The problem we have at hand right now doesn't depend on this18:12
bcsallerI don't need to build those now, that wasn't the point18:12
niemeyerbcsaller: That's my point! ;-018:13
bcsallerI find that syntax better than talking about setting development flags in a json bucket, but under the hood it will play out much the same from the internals of those tools18:14
niemeyerbcsaller: So every time you do deploy wordpress/mysql/etc, you'll have --placement ?18:14
coleroadmap question:  I get that ~/.ensemble/environments.yaml can be very easily modified to scale an app.  is there a framework for allowing this to be done based on some performance threshold? like memory consumed or cpu utilization / overall cluster throughput etc… ??18:14
niemeyerbcsaller: In local development there can't be anything besides --placement=local18:15
niemeyerbcsaller: So where do you store the fact placement _has_ to be local?18:15
bcsallerniemeyer: it would just default to doing what it does, "unassigned" which points to the existing method 18:15
bcsaller`local` is a method that says return machine<0>18:16
niemeyerbcsaller: Ok.. that sounds good as well.. can you please describe the semantics end-to-end?18:16
niemeyercole: We'll be with you in a sec18:16
bcsaller`ensemble deploy --placement local mysql`18:17
bcsaller`ensemble deploy --placement local wordpress`18:17
bcsallerwould place two units and assigned them to the machine returned by the policy, in this case machine 0 which is the local box18:17
bcsallerinternally this would replace the code in deploy and add unit that maps/finds machines and does unit assignment with a callout to policy by name. If that option is an int, the machine id is resolved and used with a different policy function doing specific assignment 18:19
bcsalleradd-unit -n <num> --placement xxx could still be strange, with a policy it could work, with a machine id... ?18:19
bcsallerbut that doesn't seem to be a blocker to me 18:19
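A minimal sketch of the named-policy idea bcsaller outlines above: the CLI passes a policy name and a lookup table maps it to a function that picks a machine. All names here (`POLICIES`, `choose_machine`, the shape of `machines`) are hypothetical; the real assignment lives in the state API, and specific machine ids are deliberately left out.

```python
def place_local(machines):
    """'local' policy: always use machine 0, the local box."""
    return 0


def place_unassigned(machines):
    """'unassigned' policy: pick the lowest-numbered machine with no units."""
    for machine_id in sorted(machines):
        if not machines[machine_id]:
            return machine_id
    raise LookupError("no unassigned machine available")


# Policy name (as given to a hypothetical --placement option) -> function.
POLICIES = {"local": place_local, "unassigned": place_unassigned}


def choose_machine(machines, placement="unassigned"):
    """machines maps machine id -> list of unit names already assigned to it."""
    try:
        policy = POLICIES[placement]
    except KeyError:
        raise ValueError("unknown placement policy: %r" % (placement,))
    return policy(machines)
```

With this shape, a provider could supply the default value of `placement`, so the common case would not need the option spelled out on every call.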
hazmatniemeyer, bcsaller unrelated to current discussion, i was looking over the co-location stuff on the ML, and was wondering if this isn't easier with the relation qualification co-located or a new relation type container, the distinguishing characteristic is that the physical placement, its odd indeed for a local co-located service to talk to an opposite end remote service. its more of a local either p2p relations between those units deployed in the same container, or a bus/ring container relation containing only the local units.18:21
hazmatbcsaller, that sounds good if default placement policy can derive from provider18:21
niemeyerbcsaller: We don't have to address specific assignment for the moment18:21
hazmatthus obviating the need for specifying it in the common case18:21
niemeyerbcsaller: I want to avoid the "I want this in machine X" feature for now18:21
niemeyerbcsaller: Because it blocks other characteristics we're interested in18:21
bcsallerniemeyer: I prefer that as well 18:22
niemeyerhazmat: Sorry, I'll be with you soon.. let me unroll the stack of questions18:22
niemeyerbcsaller: Ok18:22
bcsallerniemeyer: a couple of named policies that map cli stuff to the service assignment code then is pretty simple and seems future aware18:22
niemeyerbcsaller: So --placement local sounds fine to bootstrap.. the local provider can somehow determine the default policy down the road18:23
bcsallerhazmat: it makes total sense that providers can carry code for specific policies 18:23
hazmatcole, its definitely something we're thinking about, but its probably a ways out, we're currently working out how to get things like default service monitoring onto systems. in future with monitoring, and a remote api for ensemble, a user could provide scaling logic, its probably a while till ensemble provides it as a generic feature.18:23
hazmatbcsaller, not that they should per se have code, ideally it could be generic, just that they specify a default named policy18:24
niemeyerbcsaller: The point was more that we need to tweak default policy according to backend18:24
niemeyerbcsaller: We don't want --placement switches on every single call on a local dev18:24
colehazmat: thanks!  I figured as much.  I think we might be able to help in that area.  project looks like it's coming along nicely!18:25
bcsallergot it18:25
niemeyerbcsaller: But I see your overall plan, it's a good idea, +118:25
bcsallercool, I can work on a branch for that today, sounds pretty simple18:25
hazmatcole, fwiw, as is though ensemble cli already enables the ability to scale a service and automatically reconfigure clusters for the additional capacity, just not as the automated scaling bit in response to service conditions.18:26
niemeyerSpamapS: So..18:27
colehazmat: yep, got it!18:27
niemeyerSpamapS: The way the work is being structured is this:18:27
niemeyer<niemeyer> 1) Make multiple units work on a single machine across the board (no LXC)18:28
niemeyer<niemeyer> 2) Make local deployments work with one or multiple units (no LXC)18:28
niemeyer<niemeyer> 3) Make LXC work to deploy units locally (doesn't matter if EC2 can't do it yet)18:28
niemeyerSpamapS: bcsaller is working on step (1) still (he started yesterday :-)18:28
SpamapSCool, I had a branch that did 1 with --machine $machine_id .. tho it was failing tests last I checked.18:29
niemeyerSpamapS: That's exactly the context of the conversation.. I don't want to nail the problem of specific assignment for the moment.. there are other approaches we can take for that (resource interest, service proximity, etc), and it's really unrelated to the core problem we're solving for local development18:31
niemeyerSpamapS: So I had one suggestion, and bcsaller has a better suggestion which we'll go down with.. --placement local..18:31
niemeyerSpamapS: This is a trivial bootstrap process that keeps the complex problems for later18:32
bcsallerSpamapS: where using a local provider would change the default placement policy for you18:32
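The named-policy idea discussed above (a `--placement` name mapped to assignment code, with the provider supplying the default) could look roughly like the following. This is only a sketch: every name here (`PLACEMENT_POLICIES`, `PROVIDER_DEFAULT_POLICY`, `resolve_policy`, the `"unassigned"` policy name) is hypothetical and not taken from the Ensemble codebase.

```python
# Hypothetical sketch of named placement policies; not Ensemble's real API.

def pick_unassigned(machines):
    """Return the first machine id with no units, or None to ask for a new machine."""
    for machine_id, units in sorted(machines.items()):
        if not units:
            return machine_id
    return None


def pick_local(machines):
    """Co-locate every unit on machine 0, as a local dev setup might."""
    return 0


# Policy name (what the CLI would pass via --placement) -> assignment code.
PLACEMENT_POLICIES = {
    "unassigned": pick_unassigned,
    "local": pick_local,
}

# Assumed per-provider defaults, so local dev needs no switch on every call.
PROVIDER_DEFAULT_POLICY = {
    "ec2": "unassigned",
    "local": "local",
}


def resolve_policy(provider_type, cli_placement=None):
    """An explicit --placement value wins; otherwise use the provider default."""
    name = cli_placement or PROVIDER_DEFAULT_POLICY.get(provider_type, "unassigned")
    try:
        return name, PLACEMENT_POLICIES[name]
    except KeyError:
        raise ValueError("unknown placement policy: %r" % name)
```

With this shape, a deploy against a local provider resolves to `"local"` with no flag, while passing a placement name explicitly overrides the default for any provider.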
hazmatniemeyer, although placement considering the service to be deployed (resource interest, service proximity)  will need to receive it as part of the placement api18:37
niemeyerhazmat: Ok, re. co-location.. I agree the flag on the relation is probably all we need18:37
niemeyerhazmat: I don't see it as being special, though18:37
niemeyerhazmat: These relations still need well defined interfaces18:38
hazmatniemeyer, yeah.. well it's not clear that a local co-located service needs to have any access to the remote units18:38
niemeyerhazmat: They don't _have_ to18:38
hazmater. its opposite end18:38
niemeyerhazmat: But they should be _able_ to18:38
niemeyerhazmat: re. the placement point above, yes, I'm not trying to define how that's going to work right now18:39
niemeyerhazmat: Was rather just mentioning there are additional things we'll want to talk about and understand when sorting this actual issue18:39
niemeyerhazmat: The problem we have at hand right now is much simpler, though18:39
hazmatokay.. i did some reviews and security work today, switching tracks i'm going to do a presentation tonight at a local python user group, going to prep for that18:42
niemeyerhazmat: Cool.. I'll switch to reviews.. is there something blocking you on that front?18:43
niemeyerI'd like to sort all of William's branches today, hopefully18:43
hazmatniemeyer, nope.. i've just been going through william's branches.. on the security front, the integration work is coming along, i've reworked the interfaces a few times, most recently to enable us to turn off security by default for tests (default for now is enabled), still a little bit of refactoring to do on the policy.. i'm trying to finish the end to end so i can fix up policy-rules branch based on better knowledge of its application.18:45
niemeyerhazmat: Cool18:45
niemeyerHuge wind storm here today19:08
jcastrohow do you move between VTs in the tmux thing when you're in debug mode?19:11
hazmatjcastro, ctrl-a19:17
hazmatis the escape sequence, tmux config in debug-mode is setup to emulate screen19:18
jcastroah, been spoiled by byobu I guess, heh19:19
* jcastro finishes up his ensemble screencast19:19
niemeyerjcastro: We hope to use byobu again at some point19:27
jcastroeasy to forget how spoiled I was19:27
niemeyerjcastro: kirkland is working on a set of configs for tmux, and hopefully we can also bring screen back in the future19:27
hazmatniemeyer, any progress on the repo work?20:18
hazmatjust using the principia-tools to setup a demo.. and thinking ick20:18
niemeyerhazmat: None..20:20
niemeyerhazmat: Stuck on reviews, interviews, conversations, etc20:21
niemeyerhazmat: Hoping to get to it this week still20:21
SpamapSOk I just uploaded txzookeeper 0.8.0 to oneiric.. and will upload trunk shortly as well.20:59
SpamapShazmat: If there's anything minor I can do to make principia less "ick" .. let me know. I've tried to make it a little better of late. :-P21:00
SpamapShazmat: don't want to spend much time on it though.. :)21:00
hazmatSpamapS, i appreciate the work on it, just wishing for a repository to obviate the need for additional tools to deploy21:01
SpamapShazmat: exactly21:01
SpamapShazmat: I'd like a better repo too.. principia is.. well a nice experiment. :()21:02
SpamapShazmat: note that there's a 'principia update' command now.. which pulls a new list of formulas21:02
SpamapShazmat: and some of the commands have --help21:02
jcastroSpamapS: what!21:03
hazmatSpamapS, if i had to capture in one line the three things to make the dev story better.. it would be "local dev, no formula revs, pre-allocate machines"21:03
SpamapSjcastro: in the ppa21:03
jcastrooh man21:03
SpamapSjcastro: sudo apt-get install principia-tools21:03
jcastroI totally missed that21:03
hazmatSpamapS, getall by itself seems to do the trick of updating (mr seems to do it)21:03
jcastroalso, check it out: http://www.youtube.com/watch?v=4Rl7wTlUqkY21:03
SpamapShazmat: getall calls update :)21:03
hazmatwell of grabbing new formulas21:03
hazmatnice :-)21:03
hazmatSpamapS, was that a good summation of things? or are there others that get top billing?21:04
SpamapShazmat: yeah definitely.. though I have to say, the formula dev story is already pretty damn good.. our standards just keep going up. :)21:04
hazmatSpamapS, indeed, but precious seconds get lost, and turned into minutes.. we keep getting busier ;-)21:05
hazmati think i figured out a quick way to pre-allocate machines, but the allocation doesn't take place till the first formula is deployed21:06
hazmatwhich is kind of a bummer, it's more like a delayed pre-allocation21:06
SpamapSyeah I think it actually makes sense to enable it as its own command21:06
hazmatSpamapS, like add-machines 5 ?21:06
SpamapSensemble bootstrap && ensemble allocate-machines --ec2.instance-type=m1.small 1021:07
SpamapSIt would be cool to have every aspect of the environment available as --env.x=foo21:07
SpamapSWould solve a lot of the "need a way to specify X at runtime"21:08
SpamapSjcastro: cool video21:08
hazmathmmm.. that sounds good re allocate-machines.. the env.x syntax is likely problematic.. it's kind of redundant in that the cli is already targeting an env for any op, so the qualification is odd21:10
SpamapShazmat: it's to prevent namespace collision21:11
SpamapShazmat: doesn't have to be ec2. .. could be --envset instance-type=m1.small21:11
SpamapShazmat: or just bury it in the positional args21:12
SpamapShazmat: just seems like a good idea to be able to override settings at runtime, that's all21:12
hazmatSpamapS, ic.. i was thinking just allocate-machines --provider-size=m1.small  421:17
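The runtime-override idea floated above (`--envset instance-type=m1.small` in front of a machine count) can be sketched with argparse. Note that `allocate-machines` and `--envset` are only proposals in this conversation, not existing Ensemble commands or flags, and the helper names below are made up for illustration.

```python
# Hypothetical sketch of runtime environment overrides; not an Ensemble command.
import argparse


def parse_overrides(pairs):
    """Turn ["key=value", ...] into a dict, rejecting malformed entries."""
    overrides = {}
    for pair in pairs:
        key, sep, value = pair.partition("=")
        if not sep or not key:
            raise ValueError("expected key=value, got %r" % pair)
        overrides[key] = value
    return overrides


def build_parser():
    parser = argparse.ArgumentParser(prog="allocate-machines")
    # Repeatable flag; each use contributes one key=value override.
    parser.add_argument("--envset", action="append", default=[], metavar="KEY=VALUE")
    parser.add_argument("count", type=int, help="number of machines to allocate")
    return parser


def effective_settings(environment, argv):
    """Overlay command-line overrides on the settings read from the environment."""
    args = build_parser().parse_args(argv)
    settings = dict(environment)  # copy so the stored environment is untouched
    settings.update(parse_overrides(args.envset))
    return settings, args.count
```

Keeping overrides as generic key=value pairs sidesteps the namespace-collision concern without needing a provider prefix such as `--ec2.`, at the cost of no per-key validation in the parser itself.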
=== robbiew is now known as robbiew-afk
niemeyerUgh.. almost 822:43
niemeyerTime flew by today22:43
niemeyerWe have an almost empty review queue!23:02
niemeyerIt's been a while..23:02
niemeyerWe still need a hand on this one:23:02
niemeyerIt lacks a second review23:02
niemeyerAny takers?23:02
_mup_Bug #820107 was filed: Ensemble should enable flexible unit placement <Ensemble:In Progress by bcsaller> < https://launchpad.net/bugs/820107 >23:35
