/srv/irclogs.ubuntu.com/2011/12/06/#juju.txt

SpamapS	Ryan_Lane: w00t!	00:00
SpamapS	Ryan_Lane: m_3 is on holiday all week.. best email him to be sure he gets the message. :)	00:00
Ryan_Lane	ah. ok. cool ;)	00:02
hazmat	jimbaker, figured that one out	00:22
hazmat	re connection error, it was the chain aware deferred thingy on errors	00:22
hazmat	jimbaker, i'm wondering about resurrecting your last rewrite	00:22
SpamapS	config-get needs the same --format=shell that relation-get has	00:23
SpamapS	config-changed runs ridiculously slow with 40 execs .. :-P	00:24
akgraner	ok Silly question but do you all have weekly IRC meetings and if so where can I find the wiki link to the agenda and meeting logs etc - I wanted to include juju in with the development teams weekly meetings section in UWN	00:34
SpamapS	No there's no weekly IRC meeting.	00:35
akgraner	well that explains why I couldn't find anything - thanks :-)	00:35
jimbaker	SpamapS, sounds good	00:35
SpamapS	akgraner: we do a google hangout weekly, and it definitely needs an agenda and minutes..	00:36
akgraner	ahh ok when you all formalize that can you drop the link into the news channel so we can include it please :-)	00:36
jimbaker	hazmat, i hope you have better luck in that rewrite	00:36
akgraner	SpamapS, looking forward to the juju webinar...:-)	00:37
jimbaker	but it makes perfect sense that it was the chain aware stuff - that was the obvious suspect	00:37
SpamapS	akgraner: yeah should be good. Hopefully we can finish our slides tomorrow and have time to practice. :)	00:39
_mup_	Bug #900560 was filed: after upgrade-charm, new config settings do not all have default values <juju:New> < https://launchpad.net/bugs/900560 >	01:20
SpamapS	jimbaker: I just used juju scp btw. :)	01:40
SpamapS	jimbaker: ty!	01:41
negronjl	SpamapS: I just uploaded hadoop-mapreduce charm ... It's a charm that allows you to decouple your mapreduce job from your hadoop cluster.	01:54
SpamapS	negronjl: w00t!	01:54
SpamapS	negronjl: would love to see a web interface for that.. isn't that kind of what HIVE does?	01:54
negronjl	SpamapS: hive is more of a layer on top of hadoop, this charm just takes care of all the setup and cleanup that is supposed to happen before and after a job is run	01:55
negronjl	SpamapS: I'll work on hive, pig, etc. eventually :)	01:55
negronjl	SpamapS: hive lets you use more SQL like commands to query hadoop	01:55
SpamapS	that would be nice for demos.. dump a giant chunk of data in and start doing ad-hoc queries	01:56
negronjl	SpamapS: I'll work on that soon ...	01:56
negronjl	SpamapS: this new hadoop-mapreduce charm shows users how they can use their own custom mapreduce jobs, make a charm out of it and run it against a hadoop cluster that's already deployed via juju	01:57
negronjl	SpamapS: so, it allows you to juju deploy it all :)	01:57
negronjl	SpamapS: I'll work on hive next ( as time permits )	01:57
Skaag	can I make my own ubuntu servers juju compatible so that my apps do not use amazon's ec2 cloud?	01:59
SpamapS	sounds great tho	02:00
Skaag	so it's not possible?	02:00
SpamapS	Skaag: you can, I was responding to negronjl ;)	02:02
Skaag	ah, very cool	02:02
SpamapS	Skaag: right now you only can do that using the 'orchestra' provider	02:02
Skaag	which costs money?	02:02
SpamapS	nope	02:02
SpamapS	just time.. ;)	02:02
SpamapS	it's kind of raw	02:02
SpamapS	and you only get one service per machine	02:02
Skaag	yes that sucks	02:02
Skaag	I assume it's going to be fixed...	02:03
SpamapS	Yeah there are two features which will help with that.. subordinate services and placement constraints.	02:03
SpamapS	subordinate services will let you marry two things to basically create like a super charm.	02:03
SpamapS	placement constraints will let you define where you want services to be placed based on things like cpu, ram, or just arbitrary tags that you assign to machines	02:04
SpamapS	Skaag: so depending on what you want, your needs should be served	02:04
SpamapS	cripes, it's late.. I need to go	02:04
* SpamapS disappears		02:04
Skaag	sounds awesome!	02:05
Skaag	thanks SpamapS	02:05
shang	hi all, anyone experienced issue when deploying ganglia to EC2? (using "bzr checkout lp:charm/ganglia ganglia" command?)	02:53
=== Skaag_ is now known as Skaag
* hazmat yawns		05:00
jimbaker	SpamapS, glad juju scp proved to be useful	05:20
niemeyer	Good morning!	11:10
fwereade	morning niemeyer!	11:11
niemeyer	rog: I've mixed things up in your review, sorry about that	11:15
niemeyer	That's what I get for reviewing late at night	11:17
TheMue	moo	11:18
niemeyer	rog: You have a saner review	11:38
niemeyer	TheMue: Yo	11:38
rog	niemeyer: thanks, i'm on it	12:06
rog	niemeyer, fwereade, TheMue: mornin' everyone BTW	12:07
fwereade	heya rog	12:07
TheMue	yo rog and fwereade	12:08
niemeyer	rog: Btw, gocheck has a FitsTypeOf checker that does what you wanted there	12:17
niemeyer	Assert(x, FitsTypeOf, (*environ)(nil))	12:18
rog	niemeyer: thanks. all done. https://codereview.appspot.com/5449065/	12:23
Daviey	why isn't this review handled on launchpad?	12:29
niemeyer	Daviey: It is..	12:29
niemeyer	Daviey: https://code.launchpad.net/~rogpeppe/juju/go-juju-ec2-region/+merge/84256	12:30
niemeyer	Daviey: But codereview does inline code reviews	12:30
niemeyer	Daviey: So we use both	12:30
Daviey	seems a quirky workflow, but ok. :/	12:32
niemeyer	Daviey: http://blog.labix.org/2011/11/17/launchpad-rietveld-happycodereviews	12:33
niemeyer	Daviey: The quirky workflow is significantly simpler and faster to deal with	12:34
TheMue	Do we have a complete paper for the used tools, conventions, workflows and standards?	12:35
niemeyer	TheMue: Nope	12:35
niemeyer	TheMue: You can check out the Landscape one for some background	12:36
TheMue	something like golangs "Getting Started" would be fine. nothing big.	12:36
niemeyer	TheMue: This blog post is a Getting Started pretty much	12:37
TheMue	landscape one?	12:37
rog	niemeyer: next: https://codereview.appspot.com/5449103/	12:37
niemeyer	TheMue: Nah, nevermind.. reading it again I notice it's of little use for us	12:39
TheMue	niemeyer: ok	12:40
rog	niemeyer: one mo, i need to merge trunk	12:40
fwereade	niemeyer: consider a unit agent coming up to discover that the unit workflow is in state "stop_error"	12:40
TheMue	niemeyer: maybe I'll write down my "starter experiences" ;)	12:41
niemeyer	TheMue: Sounds like a plan	12:41
niemeyer	fwereade, rog: ok	12:41
fwereade	niemeyer: it seems to me that that could only have happened as a result of a failed attempt to shut down cleanly at some stage	12:41
niemeyer	fwereade: Hmm.. yeah	12:42
fwereade	niemeyer, and that in that case it would be reasonable to retry into "stopped" without hooks, and then to transition back to "started" as we would have done had the original stop worked properly	12:42
fwereade	niemeyer, please poke holes in my theory	12:43
niemeyer	fwereade: What do you mean by `to retry into "stopped"`	12:43
niemeyer	?	12:43
niemeyer	fwereade: The user is retrying? The system? etc	12:43
fwereade	niemeyer, I was referring to the "retry" transition alias	12:44
fwereade	niemeyer, may as well just explicitly "retry_stop"	12:44
niemeyer	fwereade: I'm missing something still	12:45
niemeyer	fwereade: Probably because I don't understand it well enough	12:45
fwereade	niemeyer, I think I've misthought anyway	12:45
niemeyer	fwereade: Does stop transition back onto "started" when it works?	12:45
fwereade	niemeyer, a clean shutdown of the unit agent puts the workflow into "stopped", and that will go back into "started" when it comes up again	12:46
niemeyer	fwereade: Aha, ok	12:46
niemeyer	fwereade: That makes sense	12:46
niemeyer	fwereade: So the unit shuts down, and goes onto stop_error because stop failed..	12:46
fwereade	niemeyer: so I think the general idea, that coming up in "stop_error" should somehow be turned into a "started" state, remains sensible	12:47
niemeyer	fwereade: You mean automatically?	12:47
rog	niemeyer: all clear now.	12:47
fwereade	niemeyer, I do, but I may well be wrong	12:47
niemeyer	fwereade: I'm not sure	12:48
niemeyer	fwereade: I mean, I'm not sure that doing this automatically is a good idea	12:48
niemeyer	fwereade: I'd vote for explicitness until we understand better how people are using this	12:48
fwereade	niemeyer, that's perfectly reasonable	12:48
niemeyer	fwereade: The reason being that a stop_error can become an exploded start and/or upgrade soon enough	12:48
niemeyer	fwereade: I'd personally appreciate knowing that stop failed, so I can investigate what happened on time, rather than blowing up in cascade in later phases, which will be harder to debug	12:49
fwereade	niemeyer, that definitely makes sense as far as it goes	12:49
* fwereade thinks a moment		12:50
niemeyer	fwereade: resolved [--retry] enables the workflow of ignoring it	12:50
fwereade	niemeyer, but it leaves the unit in a "stopped" state without an obvious way to return to a "started" state	12:51
fwereade	niemeyer, I'll need to check whether bouncing the unit agent will then do the correct switch to "started"	12:51
fwereade	niemeyer, which I guess would be sort of OK, but it's not necessarily the obvious action to take to get back into "started" once you've resolved the stop error	12:52
niemeyer	fwereade: What about "resolved"?	12:53
niemeyer	fwereade: "resolved" should transition it to stopped rather than started	12:53
niemeyer	fwereade: It always moves it to the next intended state, rather than the previous one	12:54
fwereade	niemeyer, yes, but "stopped" is not a useful state for a running unit to be in	12:54
niemeyer	fwereade: Indeed.. but I don't know what you mean by that	12:55
niemeyer	fwereade: The stop hook has run to transition it onto stopped.. if you resolved a stop_error, it should be stopped, not running	12:56
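The rule niemeyer describes here, that "resolved" moves an errored unit forward to the state its failed transition was aiming for, can be sketched roughly as below. The state names come from this discussion; the `INTENDED_STATE` table and the `resolve` function are hypothetical illustrations, not juju's actual workflow code.

```python
# Hypothetical table: each error state mapped to the state that the
# failed transition was trying to reach. Not taken from juju's source.
INTENDED_STATE = {
    "install_error": "installed",
    "start_error": "started",
    "stop_error": "stopped",
}


def resolve(state):
    """Return the state a unit lands in once an error is marked resolved."""
    if state not in INTENDED_STATE:
        raise ValueError("%r is not an error state" % (state,))
    # Move forward to the intended state, never back to the previous one.
    return INTENDED_STATE[state]


print(resolve("stop_error"))  # prints: stopped
```

Under this reading, a resolved stop_error leaves the unit "stopped", which is exactly the situation fwereade is worried about: nothing in the table brings it back to "started".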
fwereade	niemeyer: it should, yes, but the purpose of that stopped state is to signal that it's in a nice clean state for restart (if the restart ever happens ofc)	12:56
fwereade	niemeyer: (it could just have gone down for ever, but I don't think that's relevant here)	12:57
fwereade	niemeyer: we end up past the point at which we should detect "ok, we were shut down cleanly so we can be brought up cleanly, let's bring ourselves back up"	12:57
niemeyer	fwereade: You're implying that the stop hook is only called for starting, which is awkward to me	12:59
fwereade	"without painful considerations like 'we're in charm_upgrade_error state, reestablish all watches and relation lifecycles but keep the hook executor stopped watches'"	12:59
fwereade	(er, lose the trailing "watches" above)	12:59
fwereade	niemeyer: I'm implying, I think, that the stop hook is called for stopping, and that that can either happen because the whole unit is going away forever or because we've been killed cleanly for, say, machine reboot	13:01
niemeyer	fwereade: It can also happen for any other reason, like a migration, or cold backup, etc	13:02
fwereade	niemeyer: and that when the agent comes up again, "stopped" is considered to be a sensible state from which to transition safely to "started", just as None or "installed" would be	13:02
fwereade	niemeyer: I'm not sure I see the consequences there, expand please?	13:03
niemeyer	fwereade: That's fine, but the point is that some of these actions are not safe to execute if stopped has actually blown up, I think	13:03
fwereade	niemeyer: I'm comfortable with the decision not to automatically start after stop_error	13:04
fwereade	niemeyer: I'm not confident that we have a sensible plan for transitioning back to started once the user has fixed and resolved	13:05
niemeyer	fwereade: It feels like there are two situations:	13:06
niemeyer	fwereade: 1) The unit was stopped temporarily, without the machine rebooting	13:06
niemeyer	fwereade: 2) The unit was stopped because the machine is rebooting or is being killed	13:07
niemeyer	fwereade: Are you handling only 2) right now?	13:07
niemeyer	fwereade: Or is there some situation where you're handling 1)?	13:07
niemeyer	rog: Review delivered	13:08
rog	niemeyer: on it	13:08
fwereade	niemeyer, I'm handling both, I think -- how do we tell which is the case?	13:08
niemeyer	fwereade: Is there a scenario where the unit stops without a reboot? How?	13:09
fwereade	niemeyer: not a planned situation	13:10
niemeyer	fwereade: What about unplanned situations.. how could it happen?	13:10
fwereade	niemeyer: but what about, say, an OOM kill?	13:10
niemeyer	fwereade: OOM kill of what?	13:11
fwereade	niemeyer, the unit agent... could that never happen?	13:11
fwereade	niemeyer, even, a poorly written charm that accidentally kills the wrong PID	13:11
niemeyer	fwereade: Wow, hold on. The unit agent dying and the unit agent workflow are disconnected, aren't they?	13:12
niemeyer	Sorry	13:12
niemeyer	fwereade: Wow, hold on. The unit agent dying and the unit workflow are disconnected, aren't they?	13:12
fwereade	niemeyer: the workflow transitions to "stopped" when the agent is stopped	13:12
niemeyer	fwereade: WHAAA	13:12
niemeyer	fwereade: How can it possibly transition when it dies?	13:13
fwereade	niemeyer: ok, as I understand it, when we're killed normally, stopService will be called	13:14
fwereade	niemeyer: it's only a kill -9 that will take us down without our knowledge	13:14
niemeyer	fwereade: and an OOM, and a poorly written charm, and ...	13:14
fwereade	niemeyer: indeed, I'm considering that there are two scenarios, respectively equivalent to kill and kill -9	13:15
rog	niemeyer: response made.	13:15
fwereade	niemeyer: in neither case do we know for what reason we're shutting down	13:15
fwereade	niemeyer: (and in only the first case do we know it's even happened)	13:16
niemeyer	fwereade: So why are we taking a kill as a stop? Didn't you implement logic that enables the unit agent to catch up gracefully after a period of being down?	13:17
fwereade	niemeyer: we were always taking a kill as a stop	13:17
fwereade	http://paste.ubuntu.com/761597/	13:17
fwereade	(given that a "friendly" kill will stopService)	13:18
niemeyer	fwereade: ok, the question remains	13:18
fwereade	niemeyer: ...sorry, I thought you were following up your last message	13:22
fwereade	niemeyer: I have implemented that logic, or thought I had, but have discovered a subtlety in response to hazmat's latest review	13:23
fwereade	niemeyer: well, in hindsight, an obviousty :/	13:23
hazmat	g'morning	13:23
fwereade	heya hazmat	13:23
hazmat	i ended up rewriting the ssh client stuff last night	13:24
hazmat	fwereade, i'm heading back to your review right now	13:24
fwereade	hazmat: might be better to join the conversation here	13:24
* hazmat catches up with the channel log		13:24
fwereade	hazmat: I'm pretty sure the initial version of the unit state handling is flat-out wrong :(	13:24
fwereade	niemeyer: ok, stepping back a mo	13:26
fwereade	niemeyer: when the unit agent comes up, the workflow could be in any state, and we need to make sure the lifecycles end up in the correct state	13:27
niemeyer	fwereade: Right	13:27
niemeyer	fwereade: What I'm questioning is the implicit stop	13:27
fwereade	niemeyer: cool, we're on the same page then	13:28
niemeyer	fwereade: We should generally not kill the unit unless we've been explicitly asked to	13:28
niemeyer	fwereade: In the face of uncertainty, attempt to preserve the service running	13:28
fwereade	niemeyer: I am very sympathetic to arguments that we should not explicitly go into the "stopped" when we're shut down	13:28
fwereade	niemeyer: cool	13:28
fwereade	niemeyer: am I right in thinking there's some bug where the stop hook doesn't get run anyway?	13:29
niemeyer	fwereade: I agree that we should transition stop_error => started when we're coming from a full reboot, though	13:29
niemeyer	fwereade: The stop story was mostly uncovered before you started working on it	13:30
fwereade	niemeyer: heh, I think I'm more confused now :(	13:32
fwereade	niemeyer: the discussion of start hooks on reboot seems to me to be unconnected with the actual state we're in	13:32
fwereade	niemeyer: by it's nature, that uarantee is an end-run around whatever normal workflow we've set up, isn't it?	13:33
* fwereade resolves once again to lrn2grammar, lrn2spell		13:33
hazmat	fwereade, why would it be an end run around to what the state of the system is?	13:34
hazmat	errors should be explicitly resolved	13:34
hazmat	i see the point re stop being effectively a final transition for resolved	13:35
niemeyer	fwereade: Hmm	13:35
niemeyer	fwereade: I agree with hazmat in the general case	13:35
niemeyer	fwereade: What I was pointing at, though, is that there's one specific case where that's not true: reboots	13:35
niemeyer	fwereade: Because we're sure the unit was stopped, whether it liked it or not	13:36
fwereade	hazmat, niemeyer: surely there are only a few "expected" states in which to run the start hook?	13:36
niemeyer	fwereade: That's a special case where any transition should be transitioned onto "stopped", and then the normal startup workflow should run	13:36
fwereade	niemeyer: so we should set up state transitions to "stopped" for every state?	13:37
niemeyer	fwereade: That's a bit of a truism (sure, we don't run start arbitrarily)	13:38
hazmat	i'm wondering if the reboot scenario is better handled explicitly	13:38
fwereade	niemeyer: ...including, say, "install_error"?	13:38
niemeyer	hazmat: Right, that's my thinking too	13:38
hazmat	via marking the zk unit state, rather than implicit detection	13:38
niemeyer	hazmat: Nope	13:38
niemeyer	hazmat: Machines do crash, and that's also a reboot	13:38
hazmat	true	13:39
niemeyer	fwereade: No.. why would we?	13:39
niemeyer	fwereade: You seem to be in a different track somehow	13:39
niemeyer	fwereade: Maybe I'm missing what you're trying to understand/point out	13:39
fwereade	niemeyer: I think the discussion has broadened to cover several subjects, each with several possible tracks ;)	13:40
* fwereade marshals his thoughts again		13:41
niemeyer	fwereade: There are two cases:	13:41
niemeyer	1) Reboot	13:41
niemeyer	In that case, when the unit agent starts in this scenario, it should reset the state to stopped, and then handle the starting cleanly.	13:42
niemeyer	2) Explicit stop + start (through a command to be introduced, or whatever)	13:43
niemeyer	In this scenario, a stop_error should remain as such until the user explicitly resolves it	13:43
niemeyer	resolving should transition to stopped, and not attempt to start it again	13:43
niemeyer	Then,	13:43
niemeyer	There's a third scenario	13:43
niemeyer	3) Unknown unit agent crashes	13:43
niemeyer	Whether kill or kill -9 or any other signal, the unit agent should not attempt to transition onto stopped, because the user didn't ask for the service to stop.	13:44
niemeyer	Instead, the unit agent should be hooked up onto upstart	13:44
niemeyer	So that it is immediately kicked back on	13:45
niemeyer	Even if the user says "stop unit-agent".. we should not stop the service	13:46
hazmat	3) sounds good, the stop dance needs explicit coordination with non container agents, getting explicit transitions removed from implicit process actions is a win.	13:47
fwereade	niemeyer: ok, so I guess at the moment we don't have anything depending on the existing stop-on-kill behaviour	13:47
hazmat	2) isn't really a use case we have atm, but sounds good, 1) the handling needs some exploration, what does an install_error mean in this context	13:48
niemeyer	hazmat: Good point, agreed	13:48
hazmat	are we resetting to a null state, or are we creating guide paths to start from every point in the graph	13:48
niemeyer	hazmat: Even on reboots, it should take into account which states are fine to ignore	13:49
niemeyer	stop_error is one of them	13:49
niemeyer	start_error is another	13:50
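The three cases above, together with the caveat that only some states are safe to ignore after a reboot, might be sketched as the small decision function below. This is a hedged illustration only: `state_after_agent_start`, the `machine_rebooted` flag, and the `RESET_ON_REBOOT` set are hypothetical names, and including "started" in that set is an assumption drawn from "reset the state to stopped", not something juju is confirmed to do.

```python
# Assumption: states that a reboot makes safe to reset. stop_error and
# start_error are the two named in the chat; "started" and "stopped" are
# added on the assumption that a rebooted unit is in fact stopped.
RESET_ON_REBOOT = frozenset(
    ["started", "stopped", "stop_error", "start_error"])


def state_after_agent_start(recorded, machine_rebooted):
    """Pick the workflow state the unit agent should begin from."""
    if machine_rebooted and recorded in RESET_ON_REBOOT:
        # Case 1: the reboot stopped the unit whether it liked it or
        # not, so begin from "stopped" and run the normal startup path.
        return "stopped"
    # Case 2 (explicit stop that failed) and case 3 (agent was killed):
    # keep the recorded state. An error waits for an explicit resolve,
    # and a mere agent restart must not stop the service.
    return recorded


print(state_after_agent_start("stop_error", True))   # prints: stopped
print(state_after_agent_start("stop_error", False))  # prints: stop_error
```

States such as install_error are deliberately left untouched here, since what they should mean after a reboot is the open question hazmat raises.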
fwereade	niemeyer: so indeed I'm very happy with (3) and only a little confused about (2): because I'm now not sure when we should be in a "stopped" state	13:50
hazmat	niemeyer, right.. by ignore you mean ignoring resolved action, and just reset the workflow to a point where we can call start, and possibly install again	13:50
niemeyer	fwereade: Imagine we introduce three fictitious commands (we have to think, let's not do them now): juju start, juju stop, and juju restart.	13:51
hazmat	juju transition <state>	13:51
niemeyer	fwereade: Do you get the picture?	13:51
niemeyer	hazmat: Ugh, nope.. mixing up implementation and interface	13:51
hazmat	;-)	13:51
fwereade	niemeyer: yes, but I'm worried about here-and-now; I see https://bugs.launchpad.net/juju/+bug/802995 and https://bugs.launchpad.net/juju/+bug/872264	13:52
_mup_	Bug #802995: Destroy service should invoke unit's stop hook, verify/investigate this is true <juju:New> < https://launchpad.net/bugs/802995 >	13:52
_mup_	Bug #872264: stop hook does not fire when units removed from service <juju:Confirmed> < https://launchpad.net/bugs/872264 >	13:52
hazmat	juju coffee, bbiam	13:52
niemeyer	fwereade: Both of these look like variations of the 1) scenario.. why would we care about any erroring states if the unit is effectively being doomed?	13:53
fwereade	niemeyer: it seems to me that if we don't stop on stopService, we will never go into the "stopped" state	13:56
fwereade	niemeyer: I'm keen on making that change	13:56
niemeyer	fwereade: Why? Stop can (and should, IMO) be an explicitly requested action.	13:57
niemeyer	fwereade: Stop is "Hey, your hundred-thousand dollars database server is going for a rest"	13:57
fwereade	niemeyer: back up again: you're saying that we shouldn't go into "stopped" just because the unit agent is shutting down	13:58
niemeyer	fwereade: It doesn't feel like the kind of thing to be done without an explicit "Hey, stop it!" from the admin	13:58
fwereade	niemeyer: so it's a state that we won't ever enter until some hypothetical future juju start/stop command is implemented?	13:58
niemeyer	fwereade: Define "shutting down"	13:58
niemeyer	fwereade: Shutting down can mean lots of things.. if it means "kill", yes, that's what I mean	13:59
niemeyer	fwereade: If it means, the unit was notified that it should stop and shut down, no, it should execute the stop hook	13:59
fwereade	niemeyer: ok, that makes perfect sense, it just seems that we don't actually do that now	14:00
niemeyer	fwereade: No, it's not a hypothetical future.. a unit destroy request can execute stop	14:00
niemeyer	fwereade: Because it's an explicit shut down request from the admin	14:00
niemeyer	fwereade: But that's not the same as twisted's stopService	14:01
fwereade	niemeyer: ok, but in that case we'll never be in a position where we're coming back up in "stopped" or "stop_error", which renders the original question moot	14:01
niemeyer	fwereade: That's right	14:01
niemeyer	fwereade: it's a problem we have, and that will have to be solved at some point, but it doesn't feel like you have it now	14:02
niemeyer	Alright, and I really have to finish packing, get lunch, and leave, or will miss the bus	14:02
fwereade	niemeyer: sorry to hold you up, didn't realise :(	14:02
fwereade	niemeyer: take care & have fun :)	14:03
niemeyer	fwereade: No worries, it was a good conversation	14:03
niemeyer	Cheers all!	14:03
hazmat	niemeyer, have a good trip	14:06
hazmat	fwereade, so do you feel like you've got a clear path forward?	14:06
fwereade	hazmat: still marshalling my thoughts a bit	14:06
fwereade	hazmat: the way I see it right now, I have to fix the stop behaviour and see what falls out of the rafters, and then move forward from there again	14:07
* hazmat checks out the mongo conference lineup		14:07
hazmat	fwereade, fixing the stop behavior is a larger scope of work	14:08
hazmat	fwereade, i'd remove/decouple stop state from process shutdown and move forward with the restart work	14:09
fwereade	hazmat: ...then my path is not clear, because I have to deal with coming up in states that are ...well, wrong	14:09
hazmat	fwereade, a writeup on stop stuff.. http://pastebin.ubuntu.com/761640/	14:11
hazmat	it feels like mixing up different things though	14:12
hazmat	fwereade, got a moment for g+?	14:12
fwereade	hazmat, yeah, sounds good	14:12
niemeyer	hazmat: Thanks!	14:47
niemeyer	Heading off..	14:47
niemeyer	Will be online later from the airport..	14:47
rog	quick query: anyone know of a way to get bzr, for a given directory, to add all files not previously known and remove all files not in the directory that were previously known? a kind of "sync", i guess.	15:05
rog	i can do "bzr rm $dir" then "bzr add $dir" but that's a bit destructive	15:06
mpl	rog: I'm still a bit lost between the different gozk repos. what's the difference (in terms of import) between launchpad.net/gozk and launchpad.net/gozk/zookeeper? is one of them a subpart of the other or are they really just different versions of the same thing overall?	15:21
rog	mpl: the latter is the later version	15:21
rog	mpl: the former is the currently public version	15:21
rog	mpl: i just found a pending merge that needs approval from niemeyer, BTW	15:21
mpl	rog: ok, thx. and with which one of them do you think I should work?	15:23
rog	mpl: launchpad.net/gozk/zookeeper	15:23
mpl	rog: good, because gotest pass with that one for me, while they don't with the public one. also, how come there is no Init() in launchpad.net/gozk/zookeeper? has it been moved/renamed to something else?	15:24
rog	mpl: let me have a look	15:25
rog	mpl: Init is now called Dial	15:27
mpl	yeah looks like it, thanks for the confirmation.	15:28
lynxman	hazmat: ping	15:31
hazmat	lynxman, pong	15:31
rog	mpl: seems like my merge has actually gone in recently	15:33
lynxman	hazmat: quick question for you, we're trying to see how to do a remote deployment with juju	15:33
* hazmat nods		15:34
lynxman	hazmat: so we were thinking about a "headless" juju deployment in which it connects to zookeeper once the formula has been deployed	15:34
lynxman	hazmat: part of the formula being setting up a tunnel or route to the zookeeper node	15:34
lynxman	hazmat: what are your thoughts about that? :)	15:34
hazmat	lynxman, what do you mean by remote in this context?	15:35
lynxman	hazmat: let's say I have a physical platform and I want to extend it by deploying nodes on another platform which is not on the same network	15:35
lynxman	hazmat: but it's just an extension of the first platform, in N nodes as necessary	15:35
hazmat	lynxman, like different racks different or like different data centers different	15:36
lynxman	hazmat: different data centers different :)	15:36
hazmat	ie. what's the normal net latency	15:36
lynxman	hazmat: latency would be a bit higher than usual, let's say around 200ms	15:37
hazmat	and increased risk of splits	15:37
lynxman	hazmat: exactly :)	15:37
hazmat	lynxman, so in that case, i'd say it's probably best to model two different juju environments, and have proxies for any inter-data center relations	15:38
lynxman	hazmat: hmm could you point me towards proper documentation for proxying? is Juju so far okay with that?	15:38
hazmat	lynxman, hmm.. maybe its worth taking a step back, what sort of activities do you want that coordinate across the data centers	15:39
lynxman	hazmat: we want to extend a current deployment, let's say a wiki app, set up the appropriate tunnels for all the necessary db backend comms	15:39
lynxman	hazmat: just to support periods of higher traffic	15:40
lynxman	hazmat: so I'd extend my serverfarm on certain hours of the day, for example :)	15:42
hazmat	lynxman, well not exactly that which is well known.. extending it across the world during certain hours of the day on architecture that wants connected clients is a different order	15:43
hazmat	lynxman, so proxies are probably something that would be best provided by juju itself, it's possible to do so in a charm, but in cases like this you'd effectively be forking the charm you're proxying	15:43
lynxman	hazmat: pretty much yeah	15:44
lynxman	hazmat: we want to use the colocation facility of juju to add a proxy charm under the regular one and such	15:44
hazmat	lynxman, atm we're working on making it so that juju agents can disconnect for extended periods and come back in a sane state	15:44
lynxman	hazmat: hmm interesting, any ETA for that or is it in the far future?	15:45
hazmat	lynxman, its for 12.04	15:45
lynxman	hazmat: neat :)	15:45
hazmat	lynxman, its not clear how the remote dc is exposed via the provider in this case which i assume is orchestra	15:46
lynxman	hazmat: the idea is to add either a stunnel or a route to a vpn concentrator, which will be deployed by a small charm or orchestra itself as necessary	15:46
hazmat	lynxman, right, but it wouldn't be a single orchestra endpoint controlling machines at each data center, they would be separate	15:47
lynxman	hazmat: exactly	15:47
hazmat	lynxman, so i'm still thinking its better to try and model this as separate environments	15:48
lynxman	hazmat: it'd be configured by cloud-init	15:48
lynxman	hazmat: hmm I see	15:48
lynxman	hazmat: so any docs you can point me at on how to connect two zookeeper instances?	15:48
hazmat	it's not a single endpoint we're talking to, and even just for redundancy, we'd want each data center to be functional in the case of a split	15:49
* hazmat ponders proxies		15:50
hazmat	so in this case you'd want to have the tunnel/vpn as a subordinate charm, and a proxy db, that you can communicate with.	15:51
hazmat	hmm.. lynxman i think the nutshell is cross dc isn't on the roadmap for 12.04, we will support different racks, different availability zones, etc. but i don't think we have the bandwidth to do cross-dc	15:52
hazmat	well	15:52
lynxman	hazmat: well we're trying to investigate the options on that basically	15:52
lynxman	hazmat: our first idea was a headless juju that could deploy a charm and as part of the charm connect itself back to the zookeeper	15:53
lynxman	hazmat: just to keep it as atomic as possible	15:53
hazmat	lynxman, fair enough, lacking support in the core, the options are if you have a single provider endpoint, you can try it anyways and it might work. or you'll be doing custom charm work to try and cross the chasm.	15:53
hazmat	headless juju is not a meaningful term to me	15:54
lynxman	hazmat: headless as the head being zookeeper :)	15:54
hazmat	lynxman, still not meaningful ;-)	15:54
hazmat	its like saying a web browser without html	15:54
lynxman	hazmat: hmm the idea is to deploy a charm through the juju client and once the charm is setup let it connect through a tunnel to zookeeper to report back	15:55
lynxman	hazmat: does that make more sense?	15:55
hazmat	less	15:55
hazmat	it would make more sense to register a machine for use by juju	15:55
=== amithkk is now known as sansui12
hazmat	since a single provider is lacking here	15:56
hazmat	and then it would be available to the juju env, and you could deploy particular units/services across it with the appropriate constraint specification	15:56
lynxman	hazmat: so I'd need to do the tunneling part as a pre-deployment before juju using another tool, be it cloud-init or such, right?	15:56
hazmat	but the act of registration starts up a zk connected machine agent	15:56
=== sansui12 is now known as amithkk
lynxman	hazmat: then just tell juju to deploy into that machine in particular using a certain charm	15:57
hazmat	lynxman, what's controlling machines in the other dc?	15:57
hazmat	lynxman, are they two dcs with two orchestras ?	15:57
lynxman	hazmat: best case scenario cloud-init	15:57
lynxman	hazmat: not necessarily	15:57
lynxman	hazmat: but cloud-init is also integrated into orchestra so...	15:57
hazmat	lynxman, yup	15:58
lynxman	hazmat: it's a good single point	15:58
hazmat	lynxman, the notion of connecting back on charm deploy isn't really the right one.. juju owns the machine since it created it, and the machine is connected to zk independent of any services deployed to it	16:00
lynxman	hazmat: that's why I wanted to pass the idea through you to know what you thought :)	16:00
hazmat	hence the notion of registering the machine to make it available to the environment, but thats something out of core as it violates any notion of interacting with the machine via the provider	16:00
lynxman	hazmat: exactly, it does violate the model somehow	16:02
hazmat	as far as approaching this in a way that's supportable in an on-going fashion, i think it's valuable to try and model the different dcs as different juju environments that are communicating	16:02
hazmat	then you could deploy your vpn charm as a subordinate charm throughout one environment to provide the connectivity to the other env	16:03
hazmat	the lack of cross env relations and no support for the core is problematic, but it sounds more like a solvable case of a one-off deployment	16:04
hazmat	via custom charms	16:04
hazmat	actually maybe even generically if its done right..	16:05
lynxman	hazmat: but that's the idea, the other dc can be used on and off, different machines, different allocations	16:05
lynxman	hazmat: that's why I was opting for an atomic solution	16:05
hazmat	a proxy charm would be fairly generic	16:05
mpl	rog: which merge were you talking about?	16:53
rog	mpl: update-server-interface (revision 24 in gozk/zookeeper trunk)	16:54
mpl	rog: bleh, I find launchpad interface to view changes and browse files really awkward :/	17:24
rog	mpl: i use bzr qdiff when codereview isn't available	17:25
rog	bzr: (apt-get install qbzr)	17:26
mpl	rog: anyway, what is this merge about. are you pointing it out because it is relevant to the Init -> Dial change we talked about?	17:26
rog	mpl: it had lots of changes in it	17:30
rog	mpl: and quite possibly that change included, i can't remember	17:31
mpl	ok	17:32
hazmat	fwereade, its not clear that coming up with an installed state would result in a transition.	18:32
hazmat	to started	18:32
hazmat	re ml	18:32
_mup_	Bug #900873 was filed: Automatically terminate machines that do not register with ZK <juju:New> < https://launchpad.net/bugs/900873 >	18:38
=== lamal666 is now known as lamalex
hazmat	jimbaker, incidentally this is my cleanup of sshclient.. http://pastebin.ubuntu.com/761938/	19:07
jimbaker	hazmat, you have a yield in _internal_connect, but no inlineCallbacks	19:09
hazmat	jimbaker, sure	19:09
jimbaker	hazmat, so i like the intent (the inline form is much better for managing errors imho), but is that supposed to work as-is?	19:11
hazmat	jimbaker, do you see a problem with it?	19:12
hazmat	jimbaker, there's a minor accompanying change to sshforward	19:12
jimbaker	hazmat, i just wonder why it's not decorated with @inlineCallbacks, that's all	19:12
marcoceppi	So, can a charm run config-set and set config options?	19:20
SpamapS	no	19:21
SpamapS	well	19:21
SpamapS	maybe?	19:22
SpamapS	marcoceppi: it's worth trying 'juju set' from inside a charm.. but my inclination would be that it couldn't because it wouldn't have AWS creds to find the ZK server.	19:22
marcoceppi	Ah, okay.	19:22
SpamapS	marcoceppi: I do think it's an interesting idea to be able to adjust the whole service's settings from hooks.	19:22
marcoceppi	For things like blowfish encryption key, I'd like to randomly generate it and have it set in the config so juju get <service> will show it	19:23
* marcoceppi writes a bug		19:23
SpamapS	marcoceppi: not sure ZK is a super safe place for private keys	19:23
marcoceppi	Neither is plaintext files, but that's what I'm working with	19:24
niemeyer	Yo	19:24
SpamapS	marcoceppi: what you want is the ability to feed data back to the user basically, right?	19:25
marcoceppi	more or less	19:25
marcoceppi	yes	19:25
SpamapS	marcoceppi: yeah there's a need for that, Not sure if a "config-set" would be the right place	19:25
marcoceppi	mm, it's just the first thing that came to mind for me	19:26
SpamapS	marcoceppi: bug #862418 might encompass what you want	19:26
_mup_	Bug #862418: Add a way to show warning/error messages back to the user <juju:Confirmed> < https://launchpad.net/bugs/862418 >	19:26
marcoceppi	thanks	19:27
marcoceppi	Ugh, this is probably the wrong channel, but I can't get this sed statement to work. Surely someone has had to escape a variable that was a path to work with sed before	19:30
marcoceppi	I started with s/\//\\\//g because that seems like it would logically work, but it doesn't.	19:30
SpamapS	marcoceppi: you can use other chars than /	19:30
marcoceppi	For a file path?	19:30
SpamapS	s,/etc/hosts,/etc/myhosts,g	19:31
marcoceppi	ohhh	19:31
marcoceppi	that actually helps a lot.	19:31
marcoceppi	I completely forgot about that	19:31
SpamapS	Yeah, I get all backslash crazy sometimes too then remember	19:31
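The delimiter trick above can be sketched as follows. This is a minimal illustration, assuming the paths are held in shell variables; the variable names and the sample input line are made up for the example. Because the paths contain `/`, using `,` as the `s` delimiter avoids any escaping (it would fail in turn if a path contained a comma).

```shell
#!/bin/sh
# Hypothetical values, for illustration only.
old_path="/etc/hosts"
new_path="/etc/myhosts"
line="include /etc/hosts"

# Use ',' instead of '/' as the delimiter so the slashes in the
# paths do not need to be backslash-escaped.
result=$(printf '%s\n' "$line" | sed "s,$old_path,$new_path,g")
printf '%s\n' "$result"
```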
nijaba	hello. Quick question: if I want to pass parameters to my charm (retrieved by config-get), do I need to explicitly mention the yaml file on the deploy command? I could not find doc about this. Did I miss it?	19:42
SpamapS	nijaba: you can pass a full configuration via yaml in deploy, or use 'juju set' after deploy.	19:55
nijaba	SpamapS: but do I have to specify the yaml or will charmname.yaml be automatically used?	19:55
SpamapS	nijaba: if you want deploy to use a yaml file you have to mention it	19:56
nijaba	SpamapS: ok, thanks a lot :)	19:56
nijaba	SpamapS: another question. in my config.yaml, how do I set my default to be an empty string. I get: "expected string, got None" error	20:08
marcoceppi	nijaba: default: ""	20:12
nijaba	SpamapS: nm, found the doc finally :)	20:12
nijaba	marcoceppi: thanks :)	20:16
SpamapS	IMO the default for type: string should be ""	20:18
SpamapS	None comes out as "None" from config-get	20:18
SpamapS	or empty, I forget actually	20:19
SpamapS	hrm	20:19
nijaba	"" works as expected	20:22
marcoceppi	SpamapS: I've got a couple of improvements for charm-helper-sh (mainly fixes to wget) should I just push those straight to the branch or would a review still be a good thing?	20:30
SpamapS	marcoceppi: let's do reviews for everything.. we'll pretend people are actually using these and we don't want to break it. :)	20:30
marcoceppi	<3 sounds good	20:31
nijaba	you'd better, cause I do now :)	20:31
marcoceppi	\o/	20:31
marcoceppi	It's actually a bug fix that would result in an install error :\	20:31
hazmat	SpamapS, do you have ideas on how to reproduce bug 861928	20:35
_mup_	Bug #861928: provisioning agent gets confused when machines are terminated <juju:New for jimbaker> < https://launchpad.net/bugs/861928 >	20:35
SpamapS	hazmat: yes, just terminate a machine using ec2-terminate-instances	20:35
hazmat	SpamapS, ah	20:35
hazmat	SpamapS, thanks	20:35
SpamapS	hazmat: I don't know if there's a race and you have to do it at a certain time.	20:36
hazmat	SpamapS, i've been playing around with juju terminate-machine .. haven't been able to reproduce, i'll try triggering it externally with ec2-terminate-instances	20:37
SpamapS	hazmat: right, because juju terminate-machine cleans up ZK first. :)	20:38
hazmat	jimbaker, have you tried reproducing this one?	20:45
jimbaker	hazmat, not yet, i've been working on known_hosts actually	20:45
nijaba	another stupid question: can I do a relation-get in a config-changed hook? how do I specify which relation I am talking about?	20:47
niemeyer	Flight time..	20:49
niemeyer	Laters	20:49
hazmat	nijaba, not at the moment	20:51
nijaba	hazmat: harg. So if I have a config file that takes stuff from the relation, the logic to implement config-changed is going to be quite complicated...	20:52
hazmat	nijaba, you can store it on disk in your relation hooks, and use it in config-changed	20:52
hazmat	nijaba, or just configure the service in the relation hook	20:52
nijaba	hazmat: ah, coool. thanks a lot	20:53
hazmat	nijaba, don't get me wrong, that is a bug	20:53
nijaba	hazmat: I don't, I just like the workaround	20:53
SpamapS	nijaba: one way I've done what you're dealing with is to just have the hooks feed into one place, and then at the end, try to build the configuration if possible.. otherwise exit 0	21:00
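The workaround described above (relation hooks save what they learn to disk, and every hook then tries to build the configuration, exiting quietly if pieces are missing) could be sketched like this. It is only a sketch under assumptions: `STATE_DIR`, the helper names `store_setting` and `build_config`, and the keys `db_host` and `db_user` are all hypothetical, and the `relation-get` call appears only in a comment as an example of where the value would come from.

```shell
#!/bin/sh
# Sketch only: STATE_DIR, the helper names and the key names are hypothetical.
STATE_DIR="${STATE_DIR:-/tmp/charm-state}"
mkdir -p "$STATE_DIR"

# Called from a relation hook, for example:
#   store_setting db_host "$(relation-get host)"
store_setting() {
    printf '%s' "$2" > "$STATE_DIR/$1"
}

# Called at the end of every hook, including config-changed.
# Prints the configuration once all pieces are present; otherwise
# returns 0 without output so the hook still succeeds.
build_config() {
    for key in db_host db_user; do
        [ -s "$STATE_DIR/$key" ] || return 0
    done
    printf 'host=%s\nuser=%s\n' \
        "$(cat "$STATE_DIR/db_host")" "$(cat "$STATE_DIR/db_user")"
}
```

With this shape, config-changed never needs to call relation-get itself; it only reads what the relation hooks have already stored.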
_mup_	txzookeeper/trunk r45 committed by kapil.foss@gmail.com	21:21
_mup_	[trivial] ensure we always attempt to close the zk handle on client.close, stops background connect activity associated to the handle.	21:21
SpamapS	nijaba: tsk tsk.. idempotency issues in limesurvey charm. ;)	21:36
nijaba	SpamapS: can you be a bit clearer?	21:37
SpamapS	nijaba: I've got a diff now.. :)	21:37
SpamapS	nijaba: if you remove the db relation and add it back in.. fail abounds ;)	21:37
SpamapS	nijaba: or more correctly, if you try to relate it to a different db server	21:37
nijaba	SpamapS: ah. let me finish the current test and will fix	21:37
SpamapS	nijaba: no I have a fix already	21:38
SpamapS	nijaba: I'll push the diff up	21:38
nijaba	SpamapS: ok thanks, will look	21:38
SpamapS	mv admin/install admin/install.done	21:38
SpamapS	chmod a-rwx admin/install.done	21:38
SpamapS	This bit is rather perplexing tho	21:39
nijaba	SpamapS: for "security" reasons, install procs are to be moved away or admin interface will complain	21:39
nijaba	SpamapS: they recommend to completely remove, but that's my way of doing it	21:40
nijaba	SpamapS: just so that it could be reused	21:40
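Since the idempotency concern above is about hooks running more than once, a guarded version of the two commands might look like the sketch below. This is not the fix that was pushed to the charm; the function name `disable_installer` and its argument are hypothetical, and only the `admin/install` paths come from the commands quoted above.

```shell
#!/bin/sh
# Sketch only: the function name and its argument are hypothetical.
disable_installer() {
    app_root="$1"
    # Move the installer aside only if it is still in place and has not
    # been moved already, so running the hook again is harmless.
    if [ -e "$app_root/admin/install" ] && [ ! -e "$app_root/admin/install.done" ]; then
        mv "$app_root/admin/install" "$app_root/admin/install.done"
    fi
    # Strip permissions from the moved copy; repeating this is a no-op.
    if [ -e "$app_root/admin/install.done" ]; then
        chmod a-rwx "$app_root/admin/install.done"
    fi
}
```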
robbiew	negronjl: you ever see this type of java error with the hadoop charms? -> https://pastebin.canonical.com/56816/	21:41
negronjl	robbiew: checking	21:41
SpamapS	nijaba: ok, makes sense	21:41
negronjl	robbiew: bad jar file ...	21:42
negronjl	robbiew: Where is the jar file itself in the system ?	21:42
robbiew	yeah...but the same example worked when I deployed using default 32bit oneiric system...this is a 64bit	21:42
SpamapS	nijaba: seems rather silly to remove perfectly good tools.. ;)	21:42
robbiew	negronjl: /usr/lib/hadoop	21:43
robbiew	i can unzip it	21:43
negronjl	robbiew: did you do this all with juju ?	21:43
negronjl	robbiew: if so, was it in AWS ? Do you have the AMI so I can re-create ? I normally use 64-bit oneiric ones and I haven't seen that error ..	21:44
negronjl	robbiew: using the following .... default-image-id: ami-7b39f212	21:44
negronjl	default-instance-type: m1.large	21:44
negronjl	default-series: oneiric	21:44
negronjl	robbiew: in my environment.yaml file	21:45
* robbiew double checks his		21:45
robbiew	hmm	21:45
robbiew	negronjl: default-image-id: ami-c162a9a8	21:46
negronjl	robbiew: I normally use the latest m1.large, oneiric AMI that I can find in http://uec-images.ubuntu.com/oneiric/current/	21:46
negronjl	robbiew: Today's latest one ( for oneiric, m1.large ) is: ami-c95e95a0	21:48
negronjl	robbiew: I am currently testing that one just to be sure	21:48
robbiew	negronjl: do we do them for m1.xlarge?	21:48
* robbiew was playing around with types		21:49
negronjl	robbiew: I haven't tried them but, now is as good a time as any so, trying now :)	21:49
negronjl	robbiew: give me a sec	21:49
robbiew	lol	21:49
SpamapS	nijaba: http://bazaar.launchpad.net/~charmers/charm/oneiric/limesurvey/trunk/revision/10	21:51
* SpamapS should have pushed that to a review branch.. bad bad SpamapS		21:51
negronjl	robbiew: no xlarge images	21:51
* SpamapS runs off to dentist appt that starts in 8 minutes		21:51
negronjl	robbiew: I see cc1.4xlarge though	21:51
negronjl	robbiew: test that one ?	21:51
robbiew	negronjl: yeah...so I'm clearly off in the weeds :)	21:52
robbiew	eh...nah	21:52
robbiew	let me use the right settings first ;)	21:52
robbiew	I had it working with the defaults	21:52
negronjl	robbiew: k ... let me know if I can break anything for you :)	21:52
robbiew	so I figured it was user error	21:52
robbiew	lol...sure	21:52
nijaba	SpamapS: thanks :) but this means that the install proc may run multiple times even for the same db. does this work?	21:54
nijaba	SpamapS: just proposed a merge for you	22:07
fwereade	hazmat, you're right; I don't think a unit would automatically transition from installed to started, in the current trunk	22:16
fwereade	hazmat, but the only valid starting state for the original code was None	22:16
fwereade	hazmat, and given the clearly-expressed intent of the install transition's success_transition of "start", it seemed like a clear and obvious thing to do :)	22:18
fwereade	gn all	22:18
marcoceppi	SpamapS: I'm still writing tests for the hook, so I'll just push that up as a different merge request later	22:42
marcoceppi	test for the helper*	22:46
robbiew	hazmat: any chance we could get some workitems listed on https://blueprints.launchpad.net/ubuntu/+spec/servercloud-p-juju-roadmap? at least around the features we want to deliver? I need to confirm, but i think we can attach any bugs we want to fix.	22:47
* robbiew is sure SpamapS would love to help you guys with this (wink wink)		22:48
* hazmat takes a look		22:50
robbiew	hazmat: doesn't necessarily have to be you....if there's a list of who's doing what, I can put the info in myself...is it the kanban?	22:52
hazmat	robbiew, its in other blueprints	22:52
robbiew	oh	22:53
robbiew	hazmat: juju project blueprints?	22:53
* robbiew can look there		22:53
hazmat	robbiew, https://blueprints.launchpad.net/juju	22:54
Daviey	upstream vs distro blueprints.	22:54
robbiew	hazmat: cool, thx	22:54
robbiew	negronjl: so I'm still getting the error...maybe I'm missing something	22:57
robbiew	do I need to use the hadoop-mapreduce charm now?	22:58
robbiew	I was simply deploying hadoop-master, hadoop-slave, and ganglia	22:58
robbiew	then relating the master and slave services...and ganglia and slave services	22:59
negronjl	robbiew: you shouldn't ....	22:59
negronjl	robbiew: the hadoop-mapreduce is there for convenience ... what are the steps of what you are doing ?	22:59
robbiew	relate master slave....relate ganglia slave	23:00
robbiew	pretty much m_3's blog post	23:00
negronjl	robbiew: let me deploy a cluster ... give me a sec	23:01
robbiew	it worked with default type and oneiric today....only difference is the 64bit types.	23:01
robbiew	ok	23:01
_mup_	juju/upgrade-config-defaults r428 committed by kapil.thangavelu@canonical.com	23:03
_mup_	ensure new config defaults are applied on upgrade	23:03
negronjl	robbiew: deploying now .. let me wait for it to complete and I'll see what it does	23:04
robbiew	cool	23:05
robbiew	negronjl: I wonder if it's a weird java bug	23:06
robbiew	negronjl: was following the steps here: http://cloud.ubuntu.com/2011/11/monitoring-hadoop-benchmarks-teragenterasort-with-ganglia-2/	23:10
robbiew	fyi	23:10
robbiew	gotta run and pick up kids...and do the dad thing...will be back on later tonight. If you find anything just update here or shoot me a mail	23:10
negronjl	robbiew: ok ... I'll email it to ya	23:11
robbiew	cool...good luck! :P	23:11
SpamapS	nijaba: yes install just exits if it is run a second time on the same DB	23:38
nijaba	SpamapS: k, good :)	23:39
nijaba	SpamapS: do you code your charms local or on ec2?	23:40
SpamapS	nijaba: ec2	23:45
SpamapS	the ssh problem with the local provider makes it basically unusable for me.	23:45
SpamapS	nijaba: I was using canonistack for a while but it also became unreliable. :-P	23:45
SpamapS	EC2 is still the most reliable way to get things done with juju. :)	23:45
nijaba	SpamapS: which one? canonistack or yours	23:46
SpamapS	us-east-1 ;)	23:46
nijaba	was talking about openstack	23:46
SpamapS	nijaba: pinged you back on the merge proposal.. I think the right thing to do is just re-delete the "already configured" check	23:46
nijaba	sounds good to me	23:47
SpamapS	nijaba: I only ever tried against canonistack.	23:47
nijaba	k	23:47
SpamapS	nijaba: I really like limesurvey. It's a lot more fun than wordpress for testing things. :)	23:47
SpamapS	just have to convince it to stop using myisam. :)	23:48
nijaba	SpamapS: hehe, it's been my learning project for both packaging and juju now :)	23:48
nijaba	SpamapS: I still can't get the package in though	23:48
nijaba	SpamapS: been ready for 2 years, lack sponsor for both debian and ubuntu	23:49
nijaba	SpamapS: my merge should help you use InnoDB, it's now in the config	23:49
SpamapS	nijaba: packaging PHP apps is so pre-JUJU	23:50
nijaba	SpamapS: true, true	23:51
