00:00 <SpamapS> Ryan_Lane: w00t!
00:00 <SpamapS> Ryan_Lane: m_3 is on holiday all week.. best email him to be sure he gets the message. :)
00:02 <Ryan_Lane> ah. ok. cool ;)
00:22 <hazmat> jimbaker, figured that one out
00:22 <hazmat> re connection error, it was the chain-aware deferred thingy on errors
00:22 <hazmat> jimbaker, i'm wondering about resurrecting your last rewrite
00:23 <SpamapS> config-get needs the same --format=shell that relation-get has
00:24 <SpamapS> config-changed runs ridiculously slow with 40 execs .. :-P
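SpamapS's point can be illustrated with a small sketch: relation-get's `--format=shell` emits every value as a shell assignment in a single exec, which a hook can then `eval`, instead of running the tool once per key. The formatter below is our own illustration of the idea, not juju's actual implementation.

```python
# Sketch of the "--format=shell" idea SpamapS is asking for in
# config-get (relation-get already had it). All names here are our own
# illustration, not juju's code: one exec emits every setting as a
# shell assignment, so a hook can `eval` the lot instead of running
# config-get once per key (the "40 execs" complaint above).
def format_shell(settings):
    def quote(value):
        # single-quote for the shell, escaping embedded single quotes
        return "'" + str(value).replace("'", "'\\''") + "'"
    return "\n".join(
        "%s=%s" % (key.upper().replace("-", "_"), quote(value))
        for key, value in sorted(settings.items()))

print(format_shell({"port": 8080, "admin-password": "s3cret"}))
```

A hook would then do something like `eval "$(config-get --format=shell)"` in one exec.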
00:34 <akgraner> ok, silly question, but do you all have weekly IRC meetings, and if so, where can I find the wiki link to the agenda and meeting logs etc.? I wanted to include juju in the development teams' weekly meetings section in UWN
00:35 <SpamapS> No, there's no weekly IRC meeting.
00:35 <akgraner> well that explains why I couldn't find anything - thanks :-)
00:35 <jimbaker> SpamapS, sounds good
00:36 <SpamapS> akgraner: we do a google hangout weekly, and it definitely *needs* an agenda and minutes..
00:36 <akgraner> ahh ok, when you all formalize that can you drop the link into the news channel so we can include it please :-)
00:36 <jimbaker> hazmat, i hope you have better luck in that rewrite
00:37 <akgraner> SpamapS, looking forward to the juju webinar...:-)
00:37 <jimbaker> but it makes perfect sense that it was the chain-aware stuff - that was the obvious suspect
00:39 <SpamapS> akgraner: yeah, should be good. Hopefully we can finish our slides tomorrow and have time to practice. :)
01:20 <_mup_> Bug #900560 was filed: after upgrade-charm, new config settings do not all have default values <juju:New> <https://launchpad.net/bugs/900560>
01:40 <SpamapS> jimbaker: I just used juju scp btw. :)
01:41 <SpamapS> jimbaker: ty!
01:54 <negronjl> SpamapS: I just uploaded the hadoop-mapreduce charm ... it's a charm that allows you to decouple your mapreduce job from your hadoop cluster.
01:54 <SpamapS> negronjl: w00t!
01:54 <SpamapS> negronjl: would love to see a web interface for that.. isn't that kind of what Hive does?
01:55 <negronjl> SpamapS: hive is more of a layer on top of hadoop; this charm just takes care of all the setup and cleanup that is supposed to happen before and after a job is run
01:55 <negronjl> SpamapS: I'll work on hive, pig, etc. eventually :)
01:55 <negronjl> SpamapS: hive lets you use more SQL-like commands to query hadoop
01:56 <SpamapS> that would be nice for demos.. dump a giant chunk of data in and start doing ad-hoc queries
01:56 <negronjl> SpamapS: I'll work on that soon ...
01:57 <negronjl> SpamapS: this new hadoop-mapreduce charm shows users how they can take their own custom mapreduce jobs, make a charm out of them, and run them against a hadoop cluster that's already deployed via juju
01:57 <negronjl> SpamapS: so, it allows you to juju deploy it all :)
01:57 <negronjl> SpamapS: I'll work on hive next (as time permits)
01:59 <Skaag> can I make my own ubuntu servers juju compatible so that my apps do not use amazon's ec2 cloud?
02:00 <SpamapS> sounds great tho
02:00 <Skaag> so it's not possible?
02:02 <SpamapS> Skaag: you can, I was responding to negronjl ;)
02:02 <Skaag> ah, very cool
02:02 <SpamapS> Skaag: right now you can only do that using the 'orchestra' provider
02:02 <Skaag> which costs money?
02:02 <SpamapS> just time.. ;)
02:02 <SpamapS> its kind of raw
02:02 <SpamapS> and you only get one service per machine
02:02 <Skaag> yes that sucks
02:03 <Skaag> I assume it's going to be fixed...
02:03 <SpamapS> Yeah, there are two features which will help with that.. subordinate services and placement constraints.
02:03 <SpamapS> subordinate services will let you marry two things to basically create like a super-charm.
02:04 <SpamapS> placement constraints will let you define where you want services to be placed based on things like cpu, ram, or just arbitrary tags that you assign to machines
02:04 <SpamapS> Skaag: so depending on what you want, your needs should be served
02:04 <SpamapS> cripes, it's late.. I need to go
02:04 * SpamapS disappears
02:05 <Skaag> sounds awesome!
02:05 <Skaag> thanks SpamapS
02:53 <shang> hi all, has anyone experienced issues when deploying ganglia to EC2? (using the "bzr checkout lp:charm/ganglia ganglia" command)
=== Skaag_ is now known as Skaag
05:00 * hazmat yawns
05:20 <jimbaker> SpamapS, glad juju scp proved to be useful
11:10 <niemeyer> Good morning!
11:11 <fwereade> morning niemeyer!
11:15 <niemeyer> rog: I've mixed things up in your review, sorry about that
11:17 <niemeyer> That's what I get for reviewing late at night
11:38 <niemeyer> rog: You have a saner review
11:38 <niemeyer> TheMue: Yo
12:06 <rog> niemeyer: thanks, i'm on it
12:07 <rog> niemeyer, fwereade, TheMue: mornin' everyone BTW
12:07 <fwereade> heya rog
12:08 <TheMue> yo rog and fwereade
12:17 <niemeyer> rog: Btw, gocheck has a FitsTypeOf checker that does what you wanted there
12:18 <niemeyer> Assert(x, FitsTypeOf, (*environ)(nil))
12:23 <rog> niemeyer: thanks. all done. https://codereview.appspot.com/5449065/
12:29 <Daviey> why isn't this review handled on launchpad?
12:29 <niemeyer> Daviey: It is..
12:30 <niemeyer> Daviey: https://code.launchpad.net/~rogpeppe/juju/go-juju-ec2-region/+merge/84256
12:30 <niemeyer> Daviey: But codereview does inline code reviews
12:30 <niemeyer> Daviey: So we use both
12:32 <Daviey> seems a quirky workflow, but ok. :/
12:33 <niemeyer> Daviey: http://blog.labix.org/2011/11/17/launchpad-rietveld-happycodereviews
12:34 <niemeyer> Daviey: The quirky workflow is significantly simpler and faster to deal with
12:35 <TheMue> Do we have a complete paper for the tools, conventions, workflows, and standards we use?
12:35 <niemeyer> TheMue: Nope
12:36 <niemeyer> TheMue: You can check out the Landscape one for some background
12:36 <TheMue> something like golang's "Getting Started" would be fine. nothing big.
12:37 <niemeyer> TheMue: This blog post is a Getting Started, pretty much
12:37 <TheMue> landscape one?
12:37 <rog> niemeyer: next: https://codereview.appspot.com/5449103/
12:39 <niemeyer> TheMue: Nah, nevermind.. reading it again I notice it's of little use for us
12:40 <TheMue> niemeyer: ok
12:40 <rog> niemeyer: one mo, i need to merge trunk
12:40 <fwereade> niemeyer: consider a unit agent coming up to discover that the unit workflow is in state "stop_error"
12:41 <TheMue> niemeyer: maybe I'll write down my "starter experiences" ;)
12:41 <niemeyer> TheMue: Sounds like a plan
12:41 <niemeyer> fwereade, rog: ok
12:41 <fwereade> niemeyer: it seems to me that that could only have happened as a result of a failed attempt to shut down cleanly at some stage
12:42 <niemeyer> fwereade: Hmm.. yeah
12:42 <fwereade> niemeyer, and that in that case it would be reasonable to retry into "stopped" without hooks, and then to transition back to "started" as we would have done had the original stop worked properly
12:43 <fwereade> niemeyer, please poke holes in my theory
12:43 <niemeyer> fwereade: What do you mean by `to retry into "stopped"`?
12:43 <niemeyer> fwereade: The user is retrying? The system? etc
12:44 <fwereade> niemeyer, I was referring to the "retry" transition alias
12:44 <fwereade> niemeyer, may as well just explicitly "retry_stop"
12:45 <niemeyer> fwereade: I'm missing something still
12:45 <niemeyer> fwereade: Probably because I don't understand it well enough
12:45 <fwereade> niemeyer, I think I've misthought anyway
12:45 <niemeyer> fwereade: Does stop transition back onto "started" when it works?
12:46 <fwereade> niemeyer, a clean shutdown of the unit agent puts the workflow into "stopped", and that will go back into "started" when it comes up again
12:46 <niemeyer> fwereade: Aha, ok
12:46 <niemeyer> fwereade: That makes sense
12:46 <niemeyer> fwereade: So the unit shuts down, and goes onto stop_error because stop failed..
12:47 <fwereade> niemeyer: so I think the general idea, that coming up in "stop_error" should somehow be turned into a "started" state, remains sensible
12:47 <niemeyer> fwereade: You mean automatically?
12:47 <rog> niemeyer: all clear now.
12:47 <fwereade> niemeyer, I do, but I may well be wrong
12:48 <niemeyer> fwereade: I'm not sure
12:48 <niemeyer> fwereade: I mean, I'm not sure that doing this automatically is a good idea
12:48 <niemeyer> fwereade: I'd vote for explicitness until we understand better how people are using this
12:48 <fwereade> niemeyer, that's perfectly reasonable
12:48 <niemeyer> fwereade: The reason being that a stop_error can become an exploded start and/or upgrade soon enough
12:49 <niemeyer> fwereade: I'd personally appreciate knowing that stop failed, so I can investigate what happened in time, rather than it blowing up in cascade in later phases, which will be harder to debug
12:49 <fwereade> niemeyer, that definitely makes sense as far as it goes
12:50 * fwereade thinks a moment
12:50 <niemeyer> fwereade: resolved [--retry] enables the workflow of ignoring it
12:51 <fwereade> niemeyer, but it leaves the unit in a "stopped" state without an obvious way to return to a "started" state
12:51 <fwereade> niemeyer, I'll need to check whether bouncing the unit agent will then do the correct switch to "started"
12:52 <fwereade> niemeyer, which I guess would be sort of OK, but it's not necessarily the obvious action to take to get back into "started" once you've resolved the stop error
12:53 <niemeyer> fwereade: What about "resolved"?
12:53 <niemeyer> fwereade: "resolved" should transition it to stopped rather than started
12:54 <niemeyer> fwereade: It always moves it to the next intended state, rather than the previous one
12:54 <fwereade> niemeyer, yes, but "stopped" is not a useful state for a running unit to be in
12:55 <niemeyer> fwereade: Indeed.. but I don't know what you mean by that
12:56 <niemeyer> fwereade: The stop hook has run to transition it onto stopped.. if you resolved a stop_error, it should be stopped, not running
12:56 <fwereade> niemeyer: it should, yes, but the purpose of that stopped state is to signal that it's in a nice clean state for restart (if the restart ever happens ofc)
12:57 <fwereade> niemeyer: (it could just have gone down for ever, but I don't think that's relevant here)
12:57 <fwereade> niemeyer: we end up past the point at which we should detect "ok, we were shut down cleanly so we can be brought up cleanly, let's bring ourselves back up"
12:59 <niemeyer> fwereade: You're implying that the stop hook is only called for starting, which is awkward to me
12:59 <fwereade> "without painful considerations like 'we're in charm_upgrade_error state, reestablish all watches and relation lifecycles but keep the hook executor stopped'"
13:01 <fwereade> niemeyer: I'm implying, I think, that the stop hook is called for stopping, and that that can either happen because the whole unit is going away forever *or* because we've been killed cleanly for, say, a machine reboot
13:02 <niemeyer> fwereade: It can also happen for any other reason, like a migration, or cold backup, etc
13:02 <fwereade> niemeyer: and that when the agent comes up again, "stopped" is considered to be a sensible state from which to transition safely to "started", just as None or "installed" would be
13:03 <fwereade> niemeyer: I'm not sure I see the consequences there, expand please?
13:03 <niemeyer> fwereade: That's fine, but the point is that some of these actions are not safe to execute if stopped has actually blown up, I think
13:04 <fwereade> niemeyer: I'm comfortable with the decision not to automatically start after stop_error
13:05 <fwereade> niemeyer: I'm not confident that we have a sensible plan for transitioning back to started once the user has fixed things and run resolved
13:06 <niemeyer> fwereade: It feels like there are two situations:
13:06 <niemeyer> fwereade: 1) The unit was stopped temporarily, without the machine rebooting
13:07 <niemeyer> fwereade: 2) The unit was stopped because the machine is rebooting or is being killed
13:07 <niemeyer> fwereade: Are you handling only 2) right now?
13:07 <niemeyer> fwereade: Or is there some situation where you're handling 1)?
13:08 <niemeyer> rog: Review delivered
13:08 <rog> niemeyer: on it
13:08 <fwereade> niemeyer, I'm handling both, I think -- how do we tell which is the case?
13:09 <niemeyer> fwereade: Is there a scenario where the unit stops without a reboot? How?
13:10 <fwereade> niemeyer: not a *planned* situation
13:10 <niemeyer> fwereade: What about unplanned situations.. how could it happen?
13:10 <fwereade> niemeyer: but what about, say, an OOM kill?
13:11 <niemeyer> fwereade: OOM kill of what?
13:11 <fwereade> niemeyer, the unit agent... could that never happen?
13:11 <fwereade> niemeyer, even, a poorly written charm that accidentally kills the wrong PID
13:12 <niemeyer> fwereade: Wow, hold on. The unit agent dying and the unit workflow are disconnected, aren't they?
13:12 <fwereade> niemeyer: the workflow transitions to "stopped" when the agent is stopped
13:12 <niemeyer> fwereade: WHAAA
13:13 <niemeyer> fwereade: How can it possibly *transition* when it *dies*?
13:14 <fwereade> niemeyer: ok, as I understand it, when we're killed normally, stopService will be called
13:14 <fwereade> niemeyer: it's only a kill -9 that will take us down without our knowledge
13:14 <niemeyer> fwereade: and an OOM, and a poorly written charm, and ...
13:15 <fwereade> niemeyer: indeed, I'm considering that there are two scenarios, respectively equivalent to kill and kill -9
13:15 <rog> niemeyer: response made.
13:15 <fwereade> niemeyer: in neither case do we know for what reason we're shutting down
13:16 <fwereade> niemeyer: (and in only the first case do we know it's even happened)
13:17 <niemeyer> fwereade: So why are we taking a kill as a stop? Didn't you implement logic that enables the unit agent to catch up gracefully after a period of being down?
13:17 <fwereade> niemeyer: we were always taking a kill as a stop
13:18 <fwereade> (given that a "friendly" kill will stopService)
13:18 <niemeyer> fwereade: ok, the question remains
13:22 <fwereade> niemeyer: ...sorry, I thought you were following up your last message
13:23 <fwereade> niemeyer: I have implemented that logic, or thought I had, but have discovered a subtlety in response to hazmat's latest review
13:23 <fwereade> niemeyer: well, in hindsight, an obviousty :/
13:23 <fwereade> heya hazmat
13:24 <hazmat> i ended up rewriting the ssh client stuff last night
13:24 <hazmat> fwereade, i'm heading back to your review right now
13:24 <fwereade> hazmat: might be better to join the conversation here
13:24 * hazmat catches up with the channel log
13:24 <fwereade> hazmat: I'm pretty sure the initial version of the unit state handling is flat-out wrong :(
13:26 <fwereade> niemeyer: ok, stepping back a mo
13:27 <fwereade> niemeyer: when the unit agent comes up, the workflow could be in *any* state, and we need to make sure the lifecycles end up in the correct state
13:27 <niemeyer> fwereade: Right
13:27 <niemeyer> fwereade: What I'm questioning is the implicit stop
13:28 <fwereade> niemeyer: cool, we're on the same page then
13:28 <niemeyer> fwereade: We should generally not kill the unit unless we've been explicitly asked to
13:28 <niemeyer> fwereade: In the face of uncertainty, attempt to preserve the service running
13:28 <fwereade> niemeyer: I am very sympathetic to arguments that we should *not* explicitly go into "stopped" when we're shut down
13:28 <fwereade> niemeyer: cool
13:29 <fwereade> niemeyer: am I right in thinking there's some bug where the stop hook doesn't get run anyway?
13:29 <niemeyer> fwereade: I agree that we should transition stop_error => started when we're coming from a full reboot, though
13:30 <niemeyer> fwereade: The stop story was mostly uncovered before you started working on it
13:32 <fwereade> niemeyer: heh, I think I'm more confused now :(
13:32 <fwereade> niemeyer: the discussion of start hooks on reboot seems to me to be unconnected with the actual state we're in
13:33 <fwereade> niemeyer: by its nature, that guarantee is an end-run around whatever normal workflow we've set up, isn't it?
13:33 * fwereade resolves once again to lrn2grammar, lrn2spell
13:34 <hazmat> fwereade, why would it be an end-run around what the state of the system is?
13:34 <hazmat> errors should be explicitly resolved
13:35 <hazmat> i see the point re stop being effectively a final transition for resolved
13:35 <niemeyer> fwereade: Hmm
13:35 <niemeyer> fwereade: I agree with hazmat in the general case
13:35 <niemeyer> fwereade: What I was pointing at, though, is that there's one specific case where that's not true: reboots
13:36 <niemeyer> fwereade: Because we're *sure* the unit was stopped, whether it liked it or not
13:36 <fwereade> hazmat, niemeyer: surely there are only a few "expected" states in which to run the start hook?
13:36 <niemeyer> fwereade: That's a special case where *any* state should be transitioned onto "stopped", and then the normal startup workflow should run
13:37 <fwereade> niemeyer: so we should set up state transitions to "stopped" for every state?
13:38 <niemeyer> fwereade: That's a bit of a truism (sure, we don't run start arbitrarily)
13:38 <hazmat> i'm wondering if the reboot scenario is better handled explicitly
13:38 <fwereade> niemeyer: ...including, say, "install_error"?
13:38 <niemeyer> hazmat: Right, that's my thinking too
13:38 <hazmat> via marking the zk unit state, rather than implicit detection
13:38 <niemeyer> hazmat: Nope
13:38 <niemeyer> hazmat: Machines do crash, and that's also a reboot
13:39 <niemeyer> fwereade: No.. why would we?
13:39 <niemeyer> fwereade: You seem to be on a different track somehow
13:39 <niemeyer> fwereade: Maybe I'm missing what you're trying to understand/point out
13:40 <fwereade> niemeyer: I think the discussion has broadened to cover several subjects, each with several possible tracks ;)
13:41 * fwereade marshals his thoughts again
13:41 <niemeyer> fwereade: There are two cases:
13:41 <niemeyer> 1) Reboot
13:42 <niemeyer> In that case, when the unit agent starts, it should reset the state to stopped, and then handle the starting cleanly.
13:43 <niemeyer> 2) Explicit stop + start (through a command to be introduced, or whatever)
13:43 <niemeyer> In this scenario, a stop_error should remain as such until the user explicitly resolves it
13:43 <niemeyer> resolving should transition to *stopped*, and not attempt to start it again
13:43 <niemeyer> There's a third scenario
13:43 <niemeyer> 3) Unknown unit agent crashes
13:44 <niemeyer> Whether kill or kill -9 or any other signal, the unit agent should *not* attempt to transition onto stopped, because the user didn't ask for the service to stop.
13:44 <niemeyer> Instead, the unit agent should be hooked up to upstart
13:45 <niemeyer> So that it is immediately kicked back on
13:46 <niemeyer> Even if the user says "stop unit-agent".. we should not stop the service
13:47 <hazmat> 3) sounds good, the stop dance needs explicit coordination with non-container agents; getting explicit transitions removed from implicit process actions is a win.
13:47 <fwereade> niemeyer: ok, so I guess at the moment we don't have anything depending on the existing stop-on-kill behaviour
13:48 <hazmat> 2) isn't really a use case we have atm, but sounds good; 1) the handling needs some exploration: what does an install_error mean in this context?
13:48 <niemeyer> hazmat: Good point, agreed
13:48 <hazmat> are we resetting to a null state, or are we creating guide paths to start from every point in the graph?
13:49 <niemeyer> hazmat: Even on reboots, it should take into account which states are fine to ignore
13:49 <niemeyer> stop_error is one of them
13:50 <niemeyer> start_error is another
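The recovery policy niemeyer describes here (reset ignorable error states after a reboot, otherwise wait for an explicit `resolved`) can be condensed into a tiny sketch. All names below are invented for illustration; this is not juju's code.

```python
# Toy sketch of the startup policy being discussed (names invented):
# after a reboot, error states left over from the interrupted shutdown
# are safe to ignore and reset to "stopped", so the normal start
# workflow can then run; without a reboot, an error state stays put
# until the admin explicitly resolves it.
IGNORABLE_ON_REBOOT = {"stop_error", "start_error"}

def state_after_startup(state, rebooted):
    if rebooted and state in IGNORABLE_ON_REBOOT:
        return "stopped"   # reset; normal startup then moves it to started
    return state           # otherwise leave it for explicit resolution

print(state_after_startup("stop_error", rebooted=True))   # -> stopped
print(state_after_startup("stop_error", rebooted=False))  # -> stop_error
```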
13:50 <fwereade> niemeyer: so indeed I'm very happy with (3) and only a little confused about (2): because I'm now not sure when we *should* be in a "stopped" state
13:50 <hazmat> niemeyer, right.. by ignore you mean ignoring the resolved action, and just resetting the workflow to a point where we can call start, and possibly install again
13:51 <niemeyer> fwereade: Imagine we introduce three fictitious commands (we have to think, let's not do them now): juju start, juju stop, and juju restart.
13:51 <hazmat> juju transition <state>
13:51 <niemeyer> fwereade: Do you get the picture?
13:51 <niemeyer> hazmat: Ugh, nope.. mixing up implementation and interface
13:52 <fwereade> niemeyer: yes, but I'm worried about the here-and-now; I see https://bugs.launchpad.net/juju/+bug/802995 and https://bugs.launchpad.net/juju/+bug/872264
13:52 <_mup_> Bug #802995: Destroy service should invoke unit's stop hook, verify/investigate this is true <juju:New> <https://launchpad.net/bugs/802995>
13:52 <_mup_> Bug #872264: stop hook does not fire when units removed from service <juju:Confirmed> <https://launchpad.net/bugs/872264>
13:52 <hazmat> juju coffee, bbiam
13:53 <niemeyer> fwereade: Both of these look like variations of the 1) scenario.. why would we care about any error states if the unit is effectively being doomed?
13:56 <fwereade> niemeyer: it seems to me that if we don't stop on stopService, we will *never* go into the "stopped" state
13:56 <fwereade> niemeyer: I'm keen on making that change
13:57 <niemeyer> fwereade: Why? Stop can (and should, IMO) be an explicitly requested action.
13:57 <niemeyer> fwereade: Stop is "Hey, your hundred-thousand-dollar database server is going for a rest"
13:58 <fwereade> niemeyer: back up again: you're saying that we shouldn't go into "stopped" just because the unit agent is shutting down
13:58 <niemeyer> fwereade: It doesn't feel like the kind of thing to be done without an explicit "Hey, stop it!" from the admin
13:58 <fwereade> niemeyer: so it's a state that we won't ever enter until some hypothetical future juju start/stop command is implemented?
13:58 <niemeyer> fwereade: Define "shutting down"
13:59 <niemeyer> fwereade: Shutting down can mean lots of things.. if it means "kill", yes, that's what I mean
13:59 <niemeyer> fwereade: If it means the unit was notified that it should stop and shut down, no, it should execute the stop hook
14:00 <fwereade> niemeyer: ok, that makes perfect sense, it just seems that we don't actually do that now
14:00 <niemeyer> fwereade: No, it's not a hypothetical future.. a unit destroy request can execute stop
14:00 <niemeyer> fwereade: Because it's an explicit shutdown request from the admin
14:01 <niemeyer> fwereade: But that's not the same as twisted's stopService
14:01 <fwereade> niemeyer: ok, but in that case we'll never be in a position where we're coming back up in "stopped" or "stop_error", which renders the original question moot
14:01 <niemeyer> fwereade: That's right
14:02 <niemeyer> fwereade: it's a problem we have, and that will have to be solved at some point, but it doesn't feel like you have it now
14:02 <niemeyer> Alright, I really have to finish packing, get lunch, and leave, or I will miss the bus
14:02 <fwereade> niemeyer: sorry to hold you up, didn't realise :(
14:03 <fwereade> niemeyer: take care & have fun :)
14:03 <niemeyer> fwereade: No worries, it was a good conversation
14:03 <niemeyer> Cheers all!
14:06 <hazmat> niemeyer, have a good trip
14:06 <hazmat> fwereade, so do you feel like you've got a clear path forward?
14:06 <fwereade> hazmat: still marshalling my thoughts a bit
14:07 <fwereade> hazmat: the way I see it right now, I have to fix the stop behaviour and see what falls out of the rafters, and then move forward from there again
14:07 * hazmat checks out the mongo conference lineup
14:08 <hazmat> fwereade, fixing the stop behavior is a larger scope of work
14:09 <hazmat> fwereade, i'd remove/decouple stop state from process shutdown and move forward with the restart work
14:09 <fwereade> hazmat: ...then my path is not clear, because I have to deal with coming up in states that are ...well, wrong
14:11 <hazmat> fwereade, a writeup on stop stuff.. http://pastebin.ubuntu.com/761640/
14:12 <hazmat> i feel like it's mixing up different things though
14:12 <hazmat> fwereade, got a moment for g+?
14:12 <fwereade> hazmat, yeah, sounds good
14:47 <niemeyer> hazmat: Thanks!
14:47 <niemeyer> Heading off..
14:47 <niemeyer> Will be online later from the airport..
15:05 <rog> quick query: anyone know of a way to get bzr, for a given directory, to add all files not previously known and remove all files not in the directory that *were* previously known? a kind of "sync", i guess.
15:06 <rog> i can do "bzr rm $dir" then "bzr add $dir" but that's a bit destructive
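The "sync" rog describes is two set differences: files on disk that bzr doesn't know about get added, and known files no longer on disk get removed. A minimal sketch of that plan in plain Python (the file lists here are hard-coded; a real implementation would get them from bzr's own unknown/missing file reports):

```python
# rog's "sync" in set terms (sketch only; the lists would come from
# bzr itself, which we don't model here):
def sync_plan(on_disk, versioned):
    to_add = sorted(set(on_disk) - set(versioned))      # unknown files
    to_remove = sorted(set(versioned) - set(on_disk))   # vanished files
    return to_add, to_remove

add, remove = sync_plan(
    on_disk=["a.go", "b.go", "new.go"],
    versioned=["a.go", "b.go", "old.go"])
print(add)     # -> ['new.go']
print(remove)  # -> ['old.go']
```

Unlike the `bzr rm $dir; bzr add $dir` approach, this only touches files that actually changed state.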
15:21 <mpl> rog: I'm still a bit lost between the different gozk repos. what's the difference (in terms of import) between launchpad.net/gozk and launchpad.net/gozk/zookeeper? is one of them a subpart of the other, or are they really just different versions of the same thing overall?
15:21 <rog> mpl: the latter is the later version
15:21 <rog> mpl: the former is the currently public version
15:21 <rog> mpl: i just found a pending merge that needs approval from niemeyer, BTW
15:23 <mpl> rog: ok, thx. and with which one of them do you think I should work?
15:23 <rog> mpl: launchpad.net/gozk/zookeeper
15:24 <mpl> rog: good, because gotest passes with that one for me, while it doesn't with the public one. also, how come there is no Init() in launchpad.net/gozk/zookeeper? has it been moved/renamed to something else?
15:25 <rog> mpl: let me have a look
15:27 <rog> mpl: Init is now called Dial
15:28 <mpl> yeah, looks like it, thanks for the confirmation.
15:31 <lynxman> hazmat: ping
15:31 <hazmat> lynxman, pong
15:33 <rog> mpl: seems like my merge has actually gone in recently
15:33 <lynxman> hazmat: quick question for you, we're trying to see how to do a remote deployment with juju
15:34 * hazmat nods
15:34 <lynxman> hazmat: so we were thinking about a "headless" juju deployment in which it connects to zookeeper once the formula has been deployed
15:34 <lynxman> hazmat: part of the formula being setting up a tunnel or route to the zookeeper node
15:34 <lynxman> hazmat: what are your thoughts about that? :)
15:35 <hazmat> lynxman, what do you mean by remote in this context?
15:35 <lynxman> hazmat: let's say I have a physical platform and I want to extend it by deploying nodes on another platform which is not on the same network
15:35 <lynxman> hazmat: but it's just an extension of the first platform, in N nodes as necessary
15:36 <hazmat> lynxman, like different-racks different, or like different-data-centers different?
15:36 <lynxman> hazmat: different data centers different :)
15:36 <hazmat> ie. what's the normal net latency?
15:37 <lynxman> hazmat: latency would be a bit higher than usual, let's say around 200ms
15:37 <hazmat> and increased risk of splits
15:37 <lynxman> hazmat: exactly :)
15:38 <hazmat> lynxman, so in that case, i'd say it's probably best to model two different juju environments, and have proxies for any inter-data-center relations
15:38 <lynxman> hazmat: hmm, could you point me towards proper documentation for proxying? is juju so far okay with that?
15:39 <hazmat> lynxman, hmm.. maybe it's worth taking a step back. what sort of activities do you want to coordinate across the data centers?
15:39 <lynxman> hazmat: we want to extend a current deployment, let's say a wiki app, and set up the appropriate tunnels for all the necessary db backend comms
15:40 <lynxman> hazmat: just to support periods of higher traffic
15:42 <lynxman> hazmat: so I'd extend my server farm on certain hours of the day, for example :)
15:43 <hazmat> lynxman, well, not exactly that which is well known.. extending it across the world during certain hours of the day on an architecture that wants connected clients is a different order
15:43 <hazmat> lynxman, so proxies are probably something that would be best provided by juju itself. it's possible to do in a charm, but in cases like this you'd effectively be forking the charm you're proxying
15:44 <lynxman> hazmat: pretty much, yeah
15:44 <lynxman> hazmat: we want to use the colocation facility of juju to add a proxy charm under the regular one and such
15:44 <hazmat> lynxman, atm we're working on making it so that juju agents can disconnect for extended periods and come back in a sane state
15:45 <lynxman> hazmat: hmm interesting, any ETA for that or is it in the far future?
15:45 <hazmat> lynxman, it's for 12.04
15:45 <lynxman> hazmat: neat :)
15:46 <hazmat> lynxman, it's not clear how the remote dc is exposed via the provider in this case, which i assume is orchestra
15:46 <lynxman> hazmat: the idea is to add either an stunnel or a route to a vpn concentrator, which will be deployed by a small charm or orchestra itself as necessary
15:47 <hazmat> lynxman, right, but it wouldn't be a single orchestra endpoint controlling machines at each data center, they would be separate
15:47 <lynxman> hazmat: exactly
15:48 <hazmat> lynxman, so i'm still thinking it's better to try and model this as separate environments
15:48 <lynxman> hazmat: it'd be configured by cloud-init
15:48 <lynxman> hazmat: hmm, I see
15:48 <lynxman> hazmat: so any docs you can point me at on how to connect two zookeeper instances?
15:49 <hazmat> it's not a single endpoint we're talking to, and even just for redundancy, we'd want each data center to be functional in the case of a split
15:50 * hazmat ponders proxies
15:51 <hazmat> so in this case you'd want to have the tunnel/vpn as a subordinate charm, and a proxy db that you can communicate with.
15:52 <hazmat> hmm.. lynxman, i think the nutshell is cross-dc isn't on the roadmap for 12.04. we will support different racks, different availability zones, etc., but i don't think we have the bandwidth to do cross-dc
15:52 <lynxman> hazmat: well, we're trying to investigate the options on that basically
15:53 <lynxman> hazmat: our first idea was a headless juju that could deploy a charm and, as part of the charm, connect itself back to the zookeeper
15:53 <lynxman> hazmat: just to keep it as atomic as possible
15:53 <hazmat> lynxman, fair enough. lacking support in the core, the options are: if you have a single provider endpoint, you can try it anyway and it might work, or you'll be doing custom charm work to try and cross the chasm.
15:54 <hazmat> headless juju is not a meaningful term to me
15:54 <lynxman> hazmat: headless as in the head being zookeeper :)
15:54 <hazmat> lynxman, still not meaningful ;-)
15:54 <hazmat> it's like saying a web browser without html
15:55 <lynxman> hazmat: hmm, the idea is to deploy a charm through the juju client and, once the charm is set up, let it connect through a tunnel to zookeeper to report back
15:55 <lynxman> hazmat: does that make more sense?
15:55 <hazmat> it would make more sense to register a machine for use by juju
=== amithkk is now known as sansui12
15:56 <hazmat> since a single provider is lacking here
15:56 <hazmat> and then it would be available to the juju env, and you could deploy particular units/services across it with the appropriate constraint specification
15:56 <lynxman> hazmat: so I'd need to do the tunneling part as a pre-deployment before juju, using another tool, be it cloud-init or such, right?
15:56 <hazmat> but the act of registration starts up a zk-connected machine agent
=== sansui12 is now known as amithkk
15:57 <lynxman> hazmat: then just tell juju to deploy onto that machine specifically using a certain charm
15:57 <hazmat> lynxman, what's controlling machines in the other dc?
15:57 <hazmat> lynxman, are they two dcs with two orchestras?
15:57 <lynxman> hazmat: best case scenario, cloud-init
15:57 <lynxman> hazmat: not necessarily
15:57 <lynxman> hazmat: but cloud-init is also integrated into orchestra so...
15:58 <hazmat> lynxman, yup
15:58 <lynxman> hazmat: it's a good single point
16:00 <hazmat> lynxman, the notion of connecting back on charm deploy isn't really the right one.. juju owns the machine since it created it, and the machine is connected to zk independent of any services deployed to it
16:00 <lynxman> hazmat: that's why I wanted to pass the idea through you to know what you thought :)
16:00 <hazmat> hence the notion of registering the machine to make it available to the environment, but that's something out of core, as it violates any notion of interacting with the machine via the provider
16:02 <lynxman> hazmat: exactly, it does violate the model somehow
16:02 <hazmat> as far as approaching this in a way that's supportable in an on-going fashion, i think it's valuable to try and model the different dcs as different juju environments that are communicating
16:03 <hazmat> then you could deploy your vpn charm as a subordinate charm throughout one environment to provide the connectivity to the other env
16:04 <hazmat> the lack of cross-env relations and no support in the core is problematic, but it sounds more like a solvable case of a one-off deployment
16:04 <hazmat> via custom charms
16:05 <hazmat> actually, maybe even generically if it's done right..
16:05 <lynxman> hazmat: but that's the idea, the other dc can be used on and off, different machines, different allocations
16:05 <lynxman> hazmat: that's why I was opting for an atomic solution
16:05 <hazmat> a proxy charm would be fairly generic
16:53 <mpl> rog: which merge were you talking about?
16:54 <rog> mpl: update-server-interface (revision 24 in gozk/zookeeper trunk)
17:24 <mpl> rog: bleh, I find the launchpad interface for viewing changes and browsing files really awkward :/
17:25 <rog> mpl: i use bzr qdiff when codereview isn't available
17:26 <rog> bzr: (apt-get install qbzr)
17:26 <mpl> rog: anyway, what is this merge about? are you pointing it out because it is relevant to the Init -> Dial change we talked about?
17:30 <rog> mpl: it had lots of changes in it
17:31 <rog> mpl: and quite possibly that change included, i can't remember
hazmatfwereade, its not clear that coming up with an installed state would result in a transition.18:32
hazmatto started18:32
hazmatre ml18:32
_mup_Bug #900873 was filed: Automatically terminate machines that do not register with ZK <juju:New> < https://launchpad.net/bugs/900873 >18:38
=== lamal666 is now known as lamalex
hazmatjimbaker, incidentally this is my cleanup of sshclient.. http://pastebin.ubuntu.com/761938/19:07
jimbakerhazmat, you have a yield in _internal_connect, but no inlineCallbacks19:09
hazmatjimbaker, sure19:09
jimbakerhazmat, so i like the intent (the inline form is much better for managing errors imho), but is that supposed to work as-is?19:11
hazmatjimbaker, do you see a problem with it?19:12
hazmatjimbaker, there's a minor accompanying change to sshforward19:12
jimbakerhazmat, i just wonder why it's not decorated with @inlineCallbacks, that's all19:12
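jimbaker's question is easy to demonstrate with plain generators: a generator function's body does not run until something drives it, which is exactly what the missing `@inlineCallbacks` decorator would do (real code would use `twisted.internet.defer.inlineCallbacks`; the sketch below is stdlib-only, and `_internal_connect` is just the name borrowed from the pastebin):

```python
# A bare "yield" with no @inlineCallbacks is suspicious: calling the
# generator function executes none of its body.

def _internal_connect():          # hypothetical body, name from the pastebin
    client = yield "connecting"   # stands in for yielding a Deferred
    return client

gen = _internal_connect()         # nothing has run yet -- no connect attempt
first = next(gen)                 # @inlineCallbacks would do this driving
print(first)                      # -> connecting
```

This is why the decorator matters: without it the function returns a dormant generator instead of a Deferred, and the connect logic silently never executes.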
marcoceppiSo, can a charm run config-set and set config options?19:20
SpamapSmarcoceppi: it's worth trying 'juju set' from inside a charm.. but my inclination would be that it couldn't because it wouldn't have AWS creds to find the ZK server.19:22
marcoceppiAh, okay.19:22
SpamapSmarcoceppi: I do think its an interesting idea to be able to adjust the whole service's settings from hooks.19:22
marcoceppiFor things like blowfish encryption key, I'd like to randomly generate it and have it set in the config so juju get <service> will show it19:23
* marcoceppi writes a bug19:23
SpamapSmarcoceppi: not sure ZK is a super safe place for private keys19:23
marcoceppiNeither is plaintext files, but that's what I'm working with19:24
SpamapSmarcoceppi: what you want is the ability to feed data back to the user basically, right?19:25
marcoceppimore or less19:25
SpamapSmarcoceppi: yeah there's a need for that, Not sure if a "config-set" would be the right place19:25
marcoceppimm, it's just the first thing that came to mind for me19:26
SpamapSmarcoceppi: bug #862418 might encompass what you want19:26
_mup_Bug #862418: Add a way to show warning/error messages back to the user <juju:Confirmed> < https://launchpad.net/bugs/862418 >19:26
marcoceppiUgh, this is probably the wrong channel, but I can't get this sed statement to work. Surely someone has had to escape a variable that was a path to work with sed before19:30
marcoceppiI started with s/\//\\\//g because that seems like it would logically work, but it doesn't.19:30
SpamapSmarcoceppi: you can use other chars than /19:30
marcoceppiFor a file path?19:30
marcoceppithat actually helps a lot.19:31
marcoceppiI completely forgot about that19:31
SpamapSYeah, I get all backslash crazy sometimes too then remember19:31
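SpamapS's tip in one line: `sed`'s `s` command accepts almost any character as its delimiter, so picking one that cannot appear in the pattern (here `|`) removes the need to backslash-escape every `/` in a path (the path below is made up for illustration):

```shell
# Substituting a path variable without escaping slashes:
prefix="/srv/limesurvey"                       # hypothetical path
echo "DocumentRoot __PREFIX__" | sed "s|__PREFIX__|$prefix|"
# prints: DocumentRoot /srv/limesurvey
```

The `s/\//\\\//g` approach marcoceppi started with also works, but an alternate delimiter is far easier to read and audit.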
nijabahello.  Quick question: if I want to pass parameters to my charm (retrieved by config-get), do I need to explicitly mention the yaml file on the deploy command?  I could not find doc about this.  Did I miss it?19:42
SpamapSnijaba: you can pass a full configuration via yaml in deploy, or use 'juju set' after deploy.19:55
nijabaSpamapS: but do I have to specify the yaml or will chamname.yaml be automatically used?19:55
SpamapSnijaba: if you want deploy to use a yaml file you have to mention it19:56
nijabaSpamapS: ok, thanks a lot :)19:56
nijabaSpamapS: another question.  in my config.yaml, how do I set my default to be an empty string.  I get: "expected string, got None" error20:08
marcoceppinijaba: default: ""20:12
nijabaSpamapS: nm, found the doc finally :)20:12
nijabamarcoceppi: thanks :)20:16
SpamapSIMO the default for type: string *should* be ""20:18
SpamapSNone comes out as "None" from config-get20:18
SpamapSor empty, I forget actually20:19
nijaba"" works as expected20:22
marcoceppiSpamapS: I've got a couple of improvements for charm-helper-sh (mainly fixes to wget) should I just push those straight to the branch or would a review still be a good thing?20:30
SpamapSmarcoceppi: lets do reviews for everything.. we'll pretend people are actually using these and we don't want to break it. :)20:30
marcoceppi<3 sounds good20:31
nijabayou'd better, cause I do now :)20:31
marcoceppiIt's actually a bug fix that would result in an install error :\20:31
hazmatSpamapS, do you have ideas on how to reproduce  bug 86192820:35
_mup_Bug #861928: provisioning agent gets confused when machines are terminated <juju:New for jimbaker> < https://launchpad.net/bugs/861928 >20:35
SpamapShazmat: yes, just terminate a machine using ec2-terminate-machines20:35
hazmatSpamapS, ah20:35
hazmatSpamapS, thanks20:35
SpamapShazmat: I don't know if there's a race and you have to do it at a certain time.20:36
hazmatSpamapS, i've been playing around with juju terminate-machine .. haven't been able to reproduce, i'll try triggering it externally with ec2-terminate-instances20:37
SpamapShazmat: right, because juju terminate-machine cleans up ZK first. :)20:38
hazmatjimbaker, have you tried reproducing this one?20:45
jimbakerhazmat, not yet, i've been working on known_hosts actually20:45
nijabaanother stupid question: can I do a relation-get in a config-changed hook?  how do I specify which relation I am talking about?20:47
niemeyerFlight time..20:49
hazmatnijaba, not at the moment20:51
nijabahazmat: harg.  So if I have a config file that takes stuff from the relation, the logic to implement config-changed is going to be quite complicated...20:52
hazmatnijaba, you can store it on disk in your relation hooks, and use it in config-changed20:52
hazmatnijaba, or just configure the service in the relation hook20:52
nijabahazmat: ah, coool.  thanks a lot20:53
hazmatnijaba, don't get me wrong, that is a bug20:53
nijabahazmat: I don't, I just like the workaround20:53
SpamapSnijaba: one way I've done what you're dealing with is to just have the hooks feed into one place, and then at the end, try to build the configuration if possible.. otherwise exit 021:00
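The workaround hazmat and SpamapS describe can be sketched as follows (all file names and hook names are hypothetical): relation hooks stash what they learn on disk, and every hook, including config-changed, ends by trying to build the configuration, bailing out quietly when the data isn't there yet.

```shell
DATA=/tmp/demo-relation-data       # a real charm would use its own state dir

save_relation() {                  # body of a db-relation-changed hook
    echo "dbhost=$1" > "$DATA"
}

configure_if_ready() {             # shared tail, also called by config-changed
    if [ ! -f "$DATA" ]; then
        echo "not ready yet"       # a real hook would simply exit 0 here
        return 0
    fi
    . "$DATA"                      # load dbhost saved by the relation hook
    echo "writing config for $dbhost"
}

rm -f "$DATA"
save_relation db.example.com
configure_if_ready
# prints: writing config for db.example.com
```

This keeps config-changed simple even though it cannot call relation-get itself.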
_mup_txzookeeper/trunk r45 committed by kapil.foss@gmail.com21:21
_mup_[trivial] ensure we always attempt to close the zk handle on client.close, stops background connect activity associated to the handle.21:21
SpamapSnijaba: tsk tsk.. idempotency issues in limesurvey charm. ;)21:36
nijabaSpamapS: can you be a bit clearer?21:37
SpamapSnijaba: I've got a diff now.. :)21:37
SpamapSnijaba: if you remove the db relation and add it back in.. fail abounds ;)21:37
SpamapSnijaba: or more correctly, if you try to relate it to a different db server21:37
nijabaSpamapS: ah.  let me finish the current test and will fix21:37
SpamapSnijaba: no I have a fix already21:38
SpamapSnijaba: I'll push the diff up21:38
nijabaSpamapS: ok thanks, will look21:38
SpamapSmv admin/install admin/install.done21:38
SpamapSchmod a-rwx admin/install.done21:38
SpamapSThis bit is rather perplexing tho21:39
nijabaSpamapS: for "security" reason, install procs are to be moved away or admin interface will complain21:39
nijabaSpamapS: they recommend to completely remove, but that's my way of doing it21:40
nijabaSpamapS: just so that it could be reused21:40
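The `install.done` rename above is one idempotency guard; a slightly more general sketch (names are hypothetical) records which database the install ran against, so re-adding a relation or relating to a different db server, the failure SpamapS hit, becomes a no-op or a fresh run as appropriate:

```shell
STATE=/tmp/demo-install.done       # a real charm would keep this in its own dir

install_once() {
    db_id="$1"
    if [ -f "$STATE" ] && grep -qx "$db_id" "$STATE"; then
        echo "skip: $db_id already installed"
        return 0
    fi
    echo "$db_id" >> "$STATE"
    echo "install: $db_id"         # the real work would happen here
}

rm -f "$STATE"
install_once db1                   # first run does the work
install_once db1                   # re-run (relation re-added) is a no-op
```

Making hooks safe to re-run is the general fix for "remove the relation and add it back in.. fail abounds".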
robbiewnegronjl: you ever see this type of java error with the hadoop charms? -> https://pastebin.canonical.com/56816/21:41
negronjlrobbiew: checking21:41
SpamapSnijaba: ok, makes sense21:41
negronjlrobbiew: bad jar file ...21:42
negronjlrobbiew: Where is the jar file itself in the system ?21:42
robbiewyeah...but the same example worked when I deployed using default 32bit oneiric system...this is a 64bit21:42
SpamapSnijaba: seems rather silly to remove perfectly good tools.. ;)21:42
robbiewnegronjl:  /usr/lib/hadoop21:43
robbiewi can unzip it21:43
negronjlrobbiew: did you do this all with juju ?21:43
negronjlrobbiew: if so, was it in AWS ?  Do you have the AMI so I can re-create ?  I normally use 64-bit oneiric ones and I haven't seen that error ..21:44
negronjlrobbiew: using the following ....      default-image-id: ami-7b39f21221:44
negronjl    default-instance-type: m1.large21:44
negronjl    default-series: oneiric21:44
negronjlrobbiew: in my environment.yaml file21:45
* robbiew double checks his21:45
robbiewnegronjl:  default-image-id: ami-c162a9a821:46
negronjlrobbiew: I normally use the latest m1.large, oneiric AMI that I can find in http://uec-images.ubuntu.com/oneiric/current/21:46
negronjlrobbiew: Today's latest one ( for oneiric, m1.large ) is: ami-c95e95a021:48
negronjlrobbiew: I am currently testing that one just to be sure21:48
robbiewnegronjl: do we do them for m1.xlarge?21:48
* robbiew was playing around with types 21:49
negronjlrobbiew: I haven't tried them but, now is as good a time as any so, trying now :)21:49
negronjlrobbiew: give me a sec21:49
SpamapSnijaba: http://bazaar.launchpad.net/~charmers/charm/oneiric/limesurvey/trunk/revision/1021:51
* SpamapS should have pushed that to a review branch.. bad bad SpamapS21:51
negronjlrobbiew: no xlarge images21:51
* SpamapS runs off to dentist appt that starts in 8 minutes21:51
negronjlrobbiew: I see cc1.4xlarge though21:51
negronjlrobbiew: test that one ?21:51
robbiewnegronjl: yeah...so I'm clearly off in the weeds :)21:52
robbiewlet me use the right settings first ;)21:52
robbiewI had it working with the defaults21:52
negronjlrobbiew: k ... let me know if I can break anything for you :)21:52
robbiewso I figured it was user error21:52
nijabaSpamapS: thanks :) but this means that the install proc may run multiple times even for the same db.  does this work?21:54
nijabaSpamapS: just proposed a merge for you22:07
fwereadehazmat, you're right; I don't think a unit would automatically transition from installed to started, in the current trunk22:16
fwereadehazmat, but the only valid starting state for the original code was None22:16
fwereadehazmat, and given the clearly-expressed intent of the install transition's success_transition of "start", it seemed like a clear and obvious thing to do :)22:18
fwereadegn all22:18
marcoceppiSpamapS: I'm still writing tests for the hook, so I'll just push that up as a different merge request later22:42
marcoceppitest for the helper*22:46
robbiewhazmat: any chance we could get some workitems listed on https://blueprints.launchpad.net/ubuntu/+spec/servercloud-p-juju-roadmap?  at least around the features we want to deliver?  I need to confirm, but i think we can attach any bugs we want to fix.22:47
* robbiew is sure SpamapS would love to help you guys with this (wink wink)22:48
* hazmat takes a look22:50
robbiewhazmat: doesn't necessarily have to be you....if there's a list of who's doing what, I can put the info in myself...is it the kanban?22:52
hazmatrobbiew, its in other blueprints22:52
robbiewhazmat: juju project blueprints?22:53
* robbiew can look there22:53
hazmatrobbiew, https://blueprints.launchpad.net/juju22:54
Davieyupstream vs distro blueprints.22:54
robbiewhazmat: cool, thx22:54
robbiewnegronjl: so I'm still getting the error...maybe I'm missing something22:57
robbiewdo I need to use the hadoop-mapreduce charm now?22:58
robbiewI was simply deploying hadoop-master, hadoop-slave, and ganglia22:58
robbiewthen relating the master and slave services...and ganglia and slave services22:59
negronjlrobbiew: you shouldn't ....22:59
negronjlrobbiew: the hadoop-mapreduce is there for convenience ... what are the steps of what you are doing ?22:59
robbiewrelate master slave....relate ganglia slave23:00
robbiewpretty much m_3's blog post23:00
negronjlrobbiew: let me deploy a cluster ... give me a sec23:01
robbiewit worked with default type and oneiric today....only difference is the 64bit types.23:01
_mup_juju/upgrade-config-defaults r428 committed by kapil.thangavelu@canonical.com23:03
_mup_ensure new config defaults are applied on upgrade23:03
negronjlrobbiew: deploying now .. let me wait for it to complete and I'll see  what it does23:04
robbiewnegronjl: I wonder if it's a weird java bug23:06
robbiewnegronjl:  was following the steps here: http://cloud.ubuntu.com/2011/11/monitoring-hadoop-benchmarks-teragenterasort-with-ganglia-2/23:10
robbiewgotta run and pick up kids...and do the dad thing...will be back on later tonight.  If you find anything just update here or shoot me a mail23:10
negronjlrobbiew: ok ... I'll email it to ya23:11
robbiewcool...good luck! :P23:11
SpamapSnijaba: yes install just exits if it is run a second time on the same DB23:38
nijabaSpamapS: k, good :)23:39
nijabaSpamapS: do you code your charms local or on ec2?23:40
SpamapSnijaba: ec223:45
SpamapSthe ssh problem with the local provider makes it basically unusable for me.23:45
SpamapSnijaba: I was using canonistack for a while but it also became unreliable. :-P23:45
SpamapSEC2 is still the most reliable way to get things done with juju. :)23:45
nijabaSpamapS: which one?  canonistack or yours23:46
SpamapSus-east-1 ;)23:46
nijabawas talking about openstack23:46
SpamapSnijaba: pinged you back on the merge proposal.. I think the right thing to do is just re-delete the "already configured" check23:46
nijabasounds good to me23:47
SpamapSnijaba: I only ever tried against canonistack.23:47
SpamapSnijaba: I really like limesurvey. Its a lot more fun than wordpress for testing things. :)23:47
SpamapSjust have to convince it to stop using myisam. :)23:48
nijabaSpamapS: hehe, it's been my learning project for both packaging and juju now :)23:48
nijabaSpamapS: I still can't get the package in though23:48
nijabaSpamapS: been ready for 2 years, lack sponsor for both debian and ubuntu23:49
nijabaSpamapS: my merge should help you use InnoDB, it's now in the config23:49
SpamapSnijaba: packaging PHP apps is so pre-JUJU23:50
nijabaSpamapS: true, true23:51

Generated by irclog2html.py 2.7 by Marius Gedminas - find it at mg.pov.lt!