[00:14] nice fixed debug-hooks to work for early in the agent lifecycle [00:14] er. charm [00:17] hazmat: Ohhh.. that's sweet [00:18] hazmat: unix modes in zip is in upstream, btw.. we'll just need tip for the moment [00:18] niemeyer, cool, i've got a tip build i can utilize [00:18] niemeyer, its going to be a little while till the next release? [00:18] hazmat: I've upgraded the PPA, but failed to put the series in the version number [00:18] since they just released [00:18] hazmat: There's a bug in the build procedure with the colon in the path that I'll have to check out when I get a moment [00:19] hazmat: Yeah, but it shouldn't be a big deal for us for the server side [00:19] hazmat: We can deploy with tip [00:19] hazmat: Well.. and that'll be in the weekly in a couple of days [00:19] niemeyer, cool [01:30] <_mup_> juju/status-with-unit-address r403 committed by kapil.thangavelu@canonical.com [01:30] <_mup_> debug hooks works with unit address, also address deficiency around debugging early unit lifecycle, by allowing debug-hooks command to wait for the unit to be running [01:31] niemeyer, i think i'm going to go ahead and try to address the placement stuff after the lxc merge [01:31] hazmat: Sound ok.. but I also think it's going to be surprisingly trivial to handle it as suggested [01:31] Sounds [01:32] hazmat: It's indeed fine either way, though [01:32] niemeyer, i know.. i'm just fading, and want to get merges in.. this stuff needs to go upstream... maybe i should hold off till i get full night's rest [01:32] anyways.. last of the branches is ready for review ( cli with unit address) [01:33] hazmat: Yeah, get some rest [01:33] hazmat: I'll probably do the same to get up early tomorrow [01:33] niemeyer, there are places where i think placement on the cli is useful, and placement is a global provider option.. [01:34] some of the discussion from earlier w/ bcsaller.. [01:34] the cross-az stuff in particular is of interest to me [01:34] hazmat: I think it's overthinking the problem a bit [01:34] hazmat: This is well beyond what we need for the feature at hand [01:35] hazmat: I'd rather keep it simple and clean until practice shows the need for the cli [01:35] i'm concerned that placement is going to get out of hand on responsibilities on the one hand, and on the other i see it as being very convienent for implementing features like deploy this unit in the a differrent az [01:36] hazmat: I feel uncertain about that [01:36] ic cross az as something required for production on ec2.. i'm not sure where else we can put this sort of decision [01:36] hazmat: We're implementing one feature, and imagining something else without carefully taking into account the side effects [01:37] fair enough [01:37] hazmat: It's not required really.. [01:37] hazmat: cross az can be done with a single cluster [01:37] niemeyer, sure it can.. but how do we place it such that is [01:37] er.. place such that it is [01:37] hazmat: Yeah, good question.. I don';t think the placement branch answers it [01:38] hazmat: So I'd rather keep it in a way we're comfortable rather than creeping up without properly figuring what we're targeting at [01:38] it doesn't but cli placement is easy facility for it.. i agree there are ramifications there given provider validation that bear more thought, but it works pretty simply afaics [01:39] hazmat: I'm not sure, and given that it really won't work either way right now, I'd rather not do it for now. 
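A rough illustration of the "wait for the unit to be running" behaviour the debug-hooks commit above describes: poll the unit's state and only attach once it reports running. The helper and the state-reading callable are hypothetical stand-ins, not juju's actual API.

import time

def wait_for_unit_running(get_unit_state, timeout=300, interval=2):
    """Poll a unit's state until it reports 'running' or the timeout expires.

    get_unit_state is a hypothetical callable returning the unit's current
    workflow state (e.g. read from ZooKeeper); it stands in for whatever the
    real agent/state API exposes.
    """
    deadline = time.time() + timeout
    while time.time() < deadline:
        if get_unit_state() == "running":
            return True
        time.sleep(interval)
    raise RuntimeError("timed out waiting for the unit to reach 'running'")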
[01:40] hazmat: If nothing else, we're offering a visible interface to something that makes no sense to the user, with some intermangling in the implementation that we're not totally comfortable with. [01:40] hazmat: Feels like a perfect situation to raise KISS and YAGNI [01:42] * hazmat ponders [01:43] i'll sleep on it... i still think cross-az stuff is very important.. and that this is probably the simplest way to offer it to users. [01:44] but perhaps its a red herring... much else to do for internal failure scenario recovery [01:44] reconnects, restarts, etc [01:45] hazmat: That's not even the point.. no matter if it's the implementation we want or not, it doesn't work today, and won't work for quite a while. [01:45] niemeyer, i could implement this cross-az with via cli placement in a day i think. [01:45] hazmat: I'd rather not have this stuff creeping up in the code base until we figure it out. [01:45] tomorrow even [01:45] hazmat: Heh [01:46] ;-) [01:46] hazmat: I suggest we KISS and you suggest doing even more.. get some sleep. :) [01:46] indeed [01:52] <_mup_> Bug #859308 was filed: Juju commands (ssh/status/debug-hooks) should work with unit addresses. < https://launchpad.net/bugs/859308 > [10:48] Hello! [10:49] niemeyer: hiya! [10:50] rog: Hey! [10:56] niemeyer: what's the best way for me to update to your merged version? [10:56] (of gozk) [10:57] is it now in a new repository? [10:57] rog: It's a new branch.. just branch from lp:gozk/zk [10:57] rog: Which is an alias for lp:~juju/gozk/zk [10:58] rog: In the future it'll go back to being lp:~juju/gozk/trunk, once we kill launchpad.net/gozk [10:58] I mean, kill as in not support this import path [10:58] ok [11:16] <__lucio__> hi! is there a way to compose to formulas so i can say, for example, deploy a database server + a monitoring agent to this node? [11:22] * rog finds lots of documentation bugs. oops. [11:27] __lucio__: Absolutely [11:27] <__lucio__> niemeyer, how? (hello!) [11:27] __lucio__: Hey! :) [11:28] __lucio__: Charms (previously known as formulas) interconnect via relations that follow a loose protocol [11:29] __lucio__: We give a name to the interface between them so that we can distinguish the protocols [11:29] __lucio__: So, you can define in one of the formulas that it requires (consumes) a given relation interface, and in the other side that it provides (serves) the given relation interface [11:30] __lucio__: This way both sides can be interconnected at runtime [11:30] __lucio__: Using the "juju add-relation" command [11:30] __lucio__: The charms will be notified when such a relation is established via the hooks [11:31] rog: Hm? [11:31] __lucio__: Does that make sense? :) [11:32] <__lucio__> niemeyer, not exactly what i mean. imagine i get the mysql charm and want to deploy it. get machine 1 with mysql. then i want to deploy some agent to monitor the system stats there. i want to create a new charm and say "deploy this charm to this machine that already exists" [11:32] <__lucio__> is that the "placement policy"? 
[11:32] __lucio__: Ah [11:32] __lucio__: I see [11:32] <__lucio__> the key part in here would be that those charms should know nothing of each other [11:33] __lucio__: This will be supported in the coming future through what we're calling co-located charms [11:33] __lucio__: In practice it'll be just a flag in the relation [11:33] __lucio__: and juju will put the charms together based on that [11:33] __lucio__: It's not implemented yet, though [11:33] __lucio__: and it's not the placement policy [11:34] hazmat: See? :) [11:34] __lucio__: Yeah, exactly [11:34] __lucio__: Re. knowing nothing about each other [11:34] __lucio__: They will use exactly the same interface for communication that normal charms use [11:34] <__lucio__> niemeyer, ack. nice to see you guys thought about it :) [11:35] __lucio__: Despite them being in the same machine [11:35] __lucio__: Yeah, there's a lot of very nice stuff to come.. just a matter of time [12:33] niemeyer: ping [12:34] fwereade: Hey! [12:34] niemeyer: thanks for the review :) [12:34] niemeyer: how's it going? [12:34] fwereade: np [12:34] fwereade: Going in a roll! [12:35] niemeyer: sweet :D [12:35] fwereade: I was wondering about charm id/url/collection/etc terminology [12:35] fwereade: Ok [12:35] niemeyer: and wanted to know what your theoughts were re: the hash at the end [12:36] niemeyer: I see it as not really part of the *id* so much as just a useful bit of verification [12:36] niemeyer: but... well, it's an important bit of verification :) [12:36] fwereade: Which hash? [12:37] niemeyer: lp:foo/bar-1:ry4xn987ytx984qty498tx984ww [12:37] when they're stored [12:37] Howdy folks .. did the LXC work land already? [12:37] seeing lots of cool comments [12:37] fwereade: It must be there [12:37] commits I mean [12:37] fwereade: For storage, specifically [12:37] (yes that was a keyboard-mash, not a hash, but close enough ;)) [12:38] fwereade: The issue isn't verification, but uniqueness [12:38] kim0: Heya! [12:38] kim0: It's on the way [12:38] niemeyer: ...ha, good point, hadn't internalised the issues with revision uniqueness [12:38] niemeyer: except, wait, doesn't the collection-revision pair guarantee uniqueness? [12:39] niemeyer: I know rvisions and names wouldn't be enough [12:39] fwereade: Define "guarantee" [12:39] fwereade: ;_) [12:40] cool, can't wait to tell the world about this .. It's such a nice feature [12:40] fwereade: A hash is a reasonable "guarantee", even if it's not 100% certain. Trusting the user to provide a unique pair isn't very trustworthy. [12:40] * kim0 compiles regular juju progress report .. shares with planet earth [12:40] kim0: It is indeed! And we're almost there [12:41] niemeyer: ok, it feels like the bad assumption is that a collection + a name will uniquely identify a (monotonically increasing) sequence of revisions [12:41] niemeyer: confirm? [12:42] fwereade: I'd say more generally that the tuple (collection, name, id) can't be proven unique [12:43] fwereade: If we were the only ones in control of releasing them, we could make it so, but we're not [12:43] niemeyer: hm, indeed :) [12:43] niemeyer: ok, makes sense [12:43] niemeyer: in that case, I don't see where we'd ever want the revision without the hash [12:44] fwereade: That seems a bit extreme [12:44] mm .. 
the juju list is not on https://lists.ubuntu.com/ [12:45] fwereade: The revision number is informative [12:46] fwereade: and in the store it will uniquely identify the content [12:46] fwereade: FWIW, the same thing is true for packages [12:48] lunch [12:48] niemeyer: ok... but if we ever have reason to be concerned about uniqueness of coll+name+rev, in what circumstances *can* we assume that that alone is good enough to identify a charm? [12:49] niemeyer: (ok: we can if it came from the store, probably (assuming *we* don't screw anything up) but it doesn't seem sensible to special case that [12:50] niemeyer: ) [12:50] fwereade: Pretty much in all cases we can assume it's unique within the code [12:51] niemeyer: if we want the bundles to be stored with keys including the hash, why would we eschew that requirement for the ZK node names? [12:51] niemeyer: um, "pretty much in all cases" == "not in all cases" :p [12:52] fwereade: Sure, you've just found one case where we're concerned about clashes [12:52] fwereade: Maybe we can change that logic, actually.. hmm [12:53] fwereade: The real problem there is that it's very easy for the user to tweak a formula and ask to deploy it, and then deploy something else [12:54] fwereade: The question is how to avoid that situation [12:54] niemeyer: sorry, are we talking about upgrades, or just normal deploys? [12:55] fwereade: I'm happy for us to remove the hash from the name if we can find a way to avoid surprising results in these scenarios [12:55] fwereade: Both [12:55] niemeyer: heh, I was more in favour of making the hash a required part of the ID throughout (at least internally) [12:56] niemeyer: my issue was that it *wasn't* included in the ZK node name at the moment [12:56] niemeyer: that seemed like a problem :) [12:56] fwereade: That will mean we'll get two things deployed with the same name-id [12:56] fwereade: Not a situation I want to be around for debugging ;-) [12:57] fwereade: HmM! [12:57] fwereade: What about revisioning local formulas automatically based on the latest stored version in the env? [12:58] fwereade: Effectively bumping it [12:58] fwereade: The spec actually already suggests that, despite that problem [12:58] fwereade: This way we can remove the hash.. but we must never overwrite a previously existing charm [12:59] niemeyer: I'm confused [12:59] fwereade: Ok, let me unconfuse you then [12:59] fwereade: What's the worst point in the above explanation? :) [13:00] niemeyer: can we agree that (1) we can't guarantee that a (coll, name, rev) doesn't necessarily uniquely identify a charm [13:01] (2) therefore, we need something else to guarantee uniqueness [13:01] fwereade: My suggestion is to guarantee uniqueness "at the door" [13:01] fwereade: We never replace a previous (coll, name, rev) [13:02] fwereade: If we detect the user is trying to do that, we error out [13:02] fwereade: To facilitate development, though, we must give people a way to quickly iterate over versions of a charm [13:03] fwereade: Which means we need to bump charm revisions in the local case based on what was previously deployed [13:03] fwereade: Makes sense? [13:04] niemeyer: I think so [13:04] * fwereade thinks... 
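A minimal sketch of the two rules being agreed here: guarantee uniqueness "at the door" by refusing to overwrite an existing (collection, name, revision), and on local upgrades derive the next revision from whatever the environment already has rather than from the local revision file. The store interface and function names below are hypothetical illustrations, not juju's actual code.

class CharmAlreadyExists(Exception):
    """Raised when a (collection, name, revision) tuple is already taken."""


def publish_charm(store, collection, name, revision, charm_bundle):
    # "Uniqueness at the door": refuse to replace a previously stored charm
    # instead of trusting (collection, name, revision) to be globally unique.
    key = (collection, name, revision)
    if store.exists(key):          # `store` is a hypothetical storage facade
        raise CharmAlreadyExists("%s:%s-%d already deployed" % key)
    store.put(key, charm_bundle)


def next_local_revision(env_revision):
    # For local repositories on upgrade, bump from the revision currently
    # deployed in the *environment*, ignoring the local revision number.
    return (env_revision or 0) + 1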
[13:04] fwereade: This way we can remove the hash [13:04] fwereade: But you'll have to review logic around that a bit so that we're sure we're not naively replacing a previous version [13:05] fwereade: It shouldn't be hard, IIRC [13:05] niemeyer: I don't remember it being exceptionally complex [13:05] fwereade: Because we consciously store the charm in zk after uploading [13:05] fwereade: So if the charm is in zk, it must be in the storage [13:05] niemeyer: I have a vague feeling it'lll already complain if we try to overwrite a charm in zk [13:05] fwereade: and thus we shouldn't replace [13:06] fwereade: I think upgrade is a bit more naive [13:06] fwereade: But I'm not sure [13:06] fwereade: Perhaps my memory is failing me [13:06] niemeyer: I know less about the code than you might think, I was working most of last week with about 3 mental registers operating properly :/ [13:08] niemeyer: CharmStateManager calls client.create with a hashless ID, so that should explode reliably already [13:09] fwereade: Not sure really.. but please review it.. it'll be time well spent [13:09] fwereade: Then, we'll need to implement the revision bumping that is in the spec [13:09] fwereade: For the local case, that is [13:10] niemeyer: there was atlk a little while ago about allowing people to just ignore revisions locally [13:10] niemeyer: which seems to me to be quite nice for charm authors [13:11] fwereade: Exactly.. that's a way to do exactly that [13:12] fwereade: The user will be able to ignore it, because we'll be sorting out automatically [13:12] fwereade: Please see details in the spec [13:12] Will get a bite.. biab [13:12] niemeyer: by overwriting the revision file in the local repo? (the spec seems to me to be talking about how the formula store should work, not local repos) === med_out is now known as medberry [13:28] fwereade: CTRL-F for "local formula" within "Formula revisions" [13:28] fwereade: Sorry.. [13:28] fwereade: CTRL-F for "local deployment" within "Formula revisions" [13:29] niemeyer: hm, I see it now, sorry [13:29] fwereade: np [13:29] niemeyer: for some reason I'm not very happy with us writing into a local repo though [13:30] fwereade: That's why the revision is being taken out of the metadata [13:30] niemeyer: ...and it seems to say we should bump on every deploy, which feels rather aggressive [13:31] niemeyer: just a suggestion: if the revision and the hash don't match, we blow up as expected [13:31] fwereade: You have the context for why this is being done now.. I'm happy to take suggestions :) [13:31] fwereade: The hash of what? [13:31] fwereade: Directories have no hashe [13:31] niemeyer: don't they? [13:31] niemeyer: ok, it's the hash of the bundle [13:32] but they do have the appropriate method [13:32] fwereade: Yeah. 
it's a hack really [13:32] fwereade: Plus, not updating means we'll force users to bump it manually [13:32] fwereade: Effectively doing the "rather aggressive" part manually, which sucks [13:34] niemeyer: what if we treat a revision file that exists as important -- so if you change a revisioned formula but don't change the rev, you blow up -- but allow people to just delete the revision file locally, in which case we identofy purely by hash and treat the has of the current local version as "newer" than any other hashes that might be around [13:34] fwereade: I don't get what's the problem you're solving with that behavior [13:35] niemeyer: the failure-to-upgrade-without-manually-tweaking-revision [13:35] fwereade: The solution in the spec solves that without using hashes [13:36] fwereade: Why is your suggestion better? [13:36] niemeyer: but at the cost of repeatedly uploading the same formula every time it's deployed whether or not it's required [13:36] fwereade: Hmm [13:37] niemeyer: I'm also a bit suspicious of requiring write access to local repos, just to deploy from them [13:37] niemeyer: feels icky ;) [13:38] fwereade: That's trivial to solve.. but let's see, your earlier point is a good one [13:43] fwereade: Hmm.. I think we can specialize the behavior to upgrade [13:44] fwereade: and make deploy consistent for local/remote [13:44] fwereade: In deploy, if there's a charm in the env, use it no matter what [13:44] fwereade: Well, assuming no revision was provided, which is always true nowadays [13:45] fwereade: In upgrade, if it is local, bump the revision to the the revision currently deployed (in the *env*) + 1 [13:46] niemeyer: so we *might* still needlessly upload, but less frequently... not entirely unreasonable, I guess :p [13:47] fwereade: Sure, which gets us back to the original issue.. we need a method that: [13:47] 1) Does not needlessly bump the revision [13:47] 2) Does not require people to bump the revision manually [13:47] That's one solution [13:48] fwereade: I don't want to get into the business of comparing the hash of an open directory with a file in the env [13:48] fwereade: At least not right now.. to solve the problem we'd need to create a unique way to hash the content that doesn't vary with different bundlings [13:49] niemeyer: hm, I wasn't aware we had different bundlings to deal with..? [13:49] fwereade: Well.. [13:49] fwereade: There's a file in the env.. there's an open directory in the disk [13:49] fwereade: How do we compare the two? [13:51] niemeyer: well, at the moment, we zip up the env and hash the zipfile; I understand you think that's a hack, but I don't understand how it makes the situation any worse [13:51] niemeyer: hm, I wasn't aware we had different bundlings to deal with..? [13:51] fwereade: So you do understand we have different bundlings to deal with [13:51] niemeyer: we have different representations of charms, but the hashing is the same [13:51] fwereade: Why is it the same? [13:52] fwereade: Where's the zipping algorithm described that guarantees that zipping the same directory twice necessarily produces the same hash? [13:52] niemeyer: because we convert dirs to bundles to hash them? [13:52] niemeyer: and we *also* convert dirs to bundles to deploy them [13:52] fwereade: Where's the zipping algorithm described that guarantees that zipping the same directory twice necessarily produces the same hash? [13:52] niemeyer: ah-ha [13:53] zip files hold modification times... [13:54] niemeyer, rog: hmm. 
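The objection being raised: a zip archive records modification times and has no specified member ordering, so hashing the bundle is not a stable function of the directory's contents. A deterministic alternative, along the lines rog goes on to describe (walk in canonical order, ignore metadata such as mtime), might look like the sketch below; it is illustrative only, not what juju ends up doing — the thread concludes by dropping content hashes instead.

import hashlib
import os

def hash_charm_dir(path):
    """Hash a directory's contents deterministically.

    Walks files in sorted (canonical) order and feeds relative paths plus
    file contents into the digest, so modification times, archive ordering,
    and other zip-level metadata never influence the result.  File modes
    could be folded in as well if they need to be significant.
    """
    digest = hashlib.sha256()
    for root, dirs, files in os.walk(path):
        dirs.sort()                      # fix the traversal order
        for name in sorted(files):
            full = os.path.join(root, name)
            rel = os.path.relpath(full, path)
            digest.update(rel.encode("utf-8"))
            with open(full, "rb") as f:
                digest.update(f.read())
    return digest.hexdigest()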
[13:54] niemeyer: i did something like this before [13:54] rog: modification times can be preserved.. but there are other aspects like ordering that are entirely unspecified [13:54] yup [13:55] niemeyer: my fs file traversal thing (which later became alphabet) solved this by always archiving in canonical order [13:55] So, there are two choices: either we define a directory/content hashing algorithm, or we don't take the content into account [13:56] and i added a filter for canonicalising metadata we don't care about (e.g. mtime, atime) [13:56] oh yes, permissions were a problem too. [13:56] it worked very well in the end though [13:56] rog: Sure, I'm not saying it's not possible.. I'm just saying that it requires diving into the problem more than "hash the zip files" [13:56] sure [13:57] zip files aren't canonical [13:58] niemeyer: as a side note: what's the trivial solution to my discomfort with requiring write access to local repos? [13:58] fwereade: bundle the revision dynamically [13:59] niemeyer: so we'd have local repos with different revs to the deployed versions? that feels like a pain to debug, too [13:59] fwereade: That may be the case either way, and there's absolutely nothing we can do to prevent it [14:00] fwereade: The prove being that the local version is user modifiable [14:01] fwereade: Either way, the normal case is writing the revision.. so let's not worry about the read-only case for now [14:02] niemeyer: ok then :) [14:02] fwereade: local: is really targeting development.. [14:03] niemeyer: true [14:03] fwereade: Again, please note that the local revision bumping must take the revision from the env + 1, rather than taking the local revision number in consideration [14:03] fwereade: On upgrade, specifically.. [14:04] fwereade: I believe we can handle the deploy case exactly the same for local/remote [14:04] niemeyer: understood, I just feel that "local newer than env" is easily comprehensible, while "env newer than local (from which it was deployed" is a touch confusing [14:04] niemeyer: agree on deploy: just use the one already deployed if it exists [14:05] niemeyer: (I know I'm still talking about the magic non-writing case, I'll try to forget about that) [14:06] fwereade: I don't understand the first comment in this series [14:06] niemeyer: sorry, I was still wittering on about the non-writing case, it's not relevant ATM [14:07] fwereade: The local namespace is flat.. [14:07] fwereade: Ponder for a second what happens if both of us start deploying the same "local" formula on the env [14:08] fwereade: and what the revision numbers mean in that case [14:08] niemeyer: I've been having quiet nightmares about that, actually ;) [14:09] fwereade: There's no nightmare that, if you acknowledge that local: is targeting development most importantly [14:09] s/that,/there,/ [14:09] niemeyer: I think the only sensible thing we can say is Don't Do That [14:09] fwereade: It's fine actually.. the last deployment will win [14:09] fwereade: Which is a perfect valid scenario when development is being done [14:09] perfectly [14:09] Can't write today [14:10] fwereade: "local:" is _not_ about handling all non-store cases.. [14:10] fwereade: We'll eventually have a "custom store" people will be able to deploy in-house [14:11] niemeyer: ok, a separate piece fell into place, part of my brain was conflating services and charms [14:12] niemeyer: I'm happy about that now [14:12] fwereade: Ah, phew, ok :-) [14:14] niemeyer: so... 
we trash hashes, then, and double-check that we'll explode if we try to overwrite a (coll, name, rev) in ZK [14:14] fwereade: Yeah, "explode" as in "error out nicely".. :-) [14:15] niemeyer: quote so ;) [14:15] gaah, I can't write either :/ [14:16] niemeyer: tyvm, very illuminating discussion [14:16] fwereade: It's been my pleasure.. have been learning as well [14:16] niemeyer: cheers :) [14:17] fwereade: Btw, the critical piece to review is whether we might overwrite the storage content or not [14:17] fwereade: We have some protection from zk that create(...) won't work if it already exists [14:17] fwereade: But we have none from the storage [14:17] fwereade: So if the logic is not as we think it is, it'll blindly overwrite and we'll figure later [14:18] fwereade: The hash protected us from that, even if not in an ideal way as you pointed out [14:18] niemeyer: yes indeed, I'll need to be careful but it's not insoluble [14:19] fwereade: I _think_ the original logic had "store + put in zk" for exactly that reason [14:19] niemeyer: btw, really quick lazy question: what would cause a zk charm node to be deleted? [14:19] fwereade: The ordering means that if an upload breaks mid-way, we still retry and overwrite [14:19] fwereade: Nothing, IIRC [14:20] fwereade: We debated a bit about garbage collecting it [14:20] niemeyer: ok, I thought I saw some logic to deal with that case, and was a bit surprised [14:20] fwereade: and we can do it at some point [14:20] fwereade: but I don't recall supporting it ATM [14:22] niemeyer: cool, I won't fret too much about that [14:30] Man.. empty review queue.. I'll run and do some addition server-side work on the store [14:30] additional.. [14:47] * hazmat catches up on the backlog [14:48] <_mup_> juju/go-store r14 committed by gustavo@niemeyer.net [14:48] <_mup_> Bootstrapping store package. [14:49] fwereade, niemeyer interesting about col/name/rev uniqueness.. one of the bugs/useability things for charm authors, is being able to do away with constant rev increments for iteration and just relying on hash [14:49] hazmat: morning! [14:49] its something that bites pretty much every charm author [14:49] hazmat: indeed, but niemeyer has convinced me that auto-incrementing on upgrade from local repos should solve that [14:50] hazmat: Yeah.. there are other ways to handle this without relying on hash, though.. read through :) [14:50] * hazmat continues the backlog [14:50] long conversation indeed [14:57] m_3: howdy .. please ping me hwen you're up [14:57] <_mup_> juju/go-store r15 committed by gustavo@niemeyer.net [14:57] <_mup_> Imported the mgo test suite setup/teardown from personal project. [15:06] niemeyer, so the conclusion is, for local repositories, always increment the version on deploy regardless of any change to the formula? [15:06] hazmat: Not quit [15:06] e [15:07] fwereade: Hmm.. I think we can specialize the behavior to upgrade [15:07] fwereade: and make deploy consistent for local/remote [15:07] fwereade: In deploy, if there's a charm in the env, use it no matter what [15:07] fwereade: Well, assuming no revision was provided, which is always true nowadays [15:07] fwereade: In upgrade, if it is local, bump the revision to the the revision currently deployed (in the *env*) + 1 [15:07] hazmat: ^ [15:09] hmm. also we should log at info level the formula we're using on deploy (already in env, vs uploaded) [15:09] hazmat: True [15:09] that's part of what bites people, lack of discovery into the problem till they go inspecting things [15:09] hmm. 
also we should log at info level the formula we're using on deploy (already in env, vs uploaded) [15:09] fwereade: ^ [15:10] LOL [15:10] hazmat: True [15:10] hmm. also we should log at info level the formula we're using on deploy (already in env, vs uploaded) [15:10] that's part of what bites people, lack of discovery into the problem till they go inspecting things [15:10] fwereade: ^^^ [15:10] niemeyer, hazmat: sounds sensible [15:11] auto increment on upgrade sounds good [15:12] the upgrade implementation is pretty strict on newer versions, which is why i punted on a hash based approach, it was hard to maintain that notion [15:14] hazmat: Agreed. The hash stuff sounds interesting to detect coincidences for sure, but the detail is that it won't really solve the problems we have.. we need to consider larger versions anyway, and need to be able to update the previous deployment [15:14] ... without manual interaction [15:14] So for now it feels like the auto-increment upgrade is enough [15:16] fwereade: When do you think the new CharmURL & CharmCollection abstractions will be available? [15:16] fwereade: Just want to sync up because I'd like to have a look at them before mimicking in Go [15:16] fwereade: So we match logic [15:16] niemeyer: hopefully EOmyD, but I'm not quite sure when that will be [15:17] fwereade: Cool, thks [15:17] niemeyer: certainly before strot of your day tomorrow though [15:17] fwereade: Ok [15:17] gaah *start* of your day [15:17] fwereade: Are you planning on doing any modifications to the suggested API? [15:18] I think I'm happy with everything you proposed [15:18] fwereade: Awesome, I'll get started on it then [15:19] niemeyer: I'll let you know ASAP if I come up with any problems [15:19] fwereade: Superb, cheers [15:19] fwereade: Will do the same on my end [15:37] kim0: hey man... what's up? [15:42] <_mup_> juju/config-juju-origin r358 committed by jim.baker@canonical.com [15:42] <_mup_> Merged trunk [15:47] fwereade, do you know if the orchestra machines generally have useable fqdns? [15:48] hazmat: better check with roaksoax, but I don't think you can guarantee it [15:48] hazmat: context? [15:48] fwereade, niemeyer, re the delta with local/lxc vs orchestra on address retrieval.. with local the fqdn isn't resolvable, but the ip address is routable and there is a known interface. with orchestra the number of nics on a machine isn't knowable, but i was hoping we could say fqdns are resolvable [15:49] hazmat: IIRC the dns_name should work from other machines, but I don't think we have any guarantees about how it works from outside that network [15:49] this also per SpamapS comments on the original implementation that we should favor fqdn over ip address, and neatly sidesteps ipv4 vs ipv6 behind dns [15:49] hazmat: We can't guarantee it ATM [15:49] hazmat: Most of the tests I recall were done with IP addresses [15:50] niemeyer, on the address branch its all just a popen... local with ip, ec2 and orchestra with fqdn hostnames [15:50] hazmat: The fully qualified domain will also not resolve the problem.. it may have multiple nics despite the existence of a fqdn [15:51] niemeyer, multiple nics is fine if the fqdn is resolvable [15:51] hazmat: I believe it's not.. it'll resolve to an arbitrary ip address [15:51] hazmat: Which may not be the right one if a machine has multiple ips [15:51] hazmat: ec2 is a different case.. [15:51] hazmat: We know what we're doing there [15:52] niemeyer ? 
hostname -f returns the fqdn of the host regardless of multiple nics [15:52] For multiple NIC's, the FQDN should resolve to the NIC that you wish the host to be externally reachable on... [15:52] which is what we do for orchestra [15:52] hazmat: hostname -f returns *a* name, that may be resolvable or not, and that may map to the right ip or not [15:53] I *can* see a situation where you have a management NIC, and a service NIC .. each needing different handling. [15:53] SpamapS, we've got separation of public/private addresses for units, but getting those addresses on on orchestra deployments is the question [15:54] doesn't seem like we can do that apriori [15:54] Indeed. DNS is the only reliable way, IMO, to handle something so loosely coupled. [15:55] hazmat: I suggest checking with smoser and RoAkSoAx then [15:55] hazmat: If they're happy, I'm happy :) [15:57] hi all [15:58] koolhead11: Hey! [15:58] hello niemeyer [15:58] niemeyer: one merge proposal sent your way: https://code.launchpad.net/~rogpeppe/gozk/update-server-interface/+merge/77009 [15:58] rog: Woohay, cheers! [15:58] SpamapS: i got some idea how not to use dbconfig-common :) [15:58] niemeyer: (ignore the first one, i did the merge the wrong way around) [15:59] rog: The first one? [15:59] I think IP's grokked from the network provider are usable... EC2 knows which one is externally available vs. internal, and the provider has full network control, so you can take that IP and use it confidently. Orchestra has no such guarantees, so the hostname that we gave to the DHCP server and that we built from its DNS settings is the only meaningful thing we can make use of. [15:59] koolhead11: progress is good. :) [16:00] SpamapS: yeah. :D [16:01] * koolhead11 bows to Daviey [16:01] For servers with multi-NIC, the only real thing we can do is use a cobbler pre-seed template that selects the most appropriate one. Making use of multiples for mgmt/service seems like something we'll have to do as a new feature. [16:01] niemeyer: hold on, i think i mucked up. too many versions flying around. [16:01] rog: No worries [16:02] gozk/zk vs gozk vs gozk/zookeeper [16:02] niemeyer: no, it's all good i think [16:03] rog: Coolio [16:03] niemeyer: i just did a dud directory rename, but i don't think it affects what you'll see [16:04] RoAkSoAx: We were just talking about ips vs hostnames in the context of orchestra units [16:04] RoAkSoAx: hazmat has more details [16:04] hello robbiew RoAkSoAx [16:04] I'm going to step out for lunch and leave you guys with trouble! [16:04] niemeyer: ok [16:04] niemeyer: im on a sprint atm [16:04] hazmat: ^^ [16:04] RoAkSoAx: It's quick [16:04] RoAkSoAx: But important [16:04] * niemeyer biab [16:06] RoAkSoAx, just trying to determine if on an orchestra launched machine we can assume either a routable hostname (fqdn) or nic for recording an address to the machine [16:06] ie. if something like hostname -f is useable to reach the machine from another machine in the orchestra environment [16:07] i assume the orchestra server is just tracking mac addresses on the machine [16:07] hazmat: hazmat yes the orchestra server is tracking the MAC address [16:07] hazmat: we always have to track it [16:08] hazmat: though, we were making sure hostnames was fqdn as an standard and that it was set correctly [16:08] hazmat: via could-init [16:08] smoser: ^^ [16:09] hazmat: the idea is to use a DNS reacheable name for each machine that's fqdn [16:09] RoAkSoAx, if thats the case that's perfect.. 
fqdn == hostname that is [16:11] hazmat: yes that's what we are trying to standarize last couple weeks. Give me a few minutes till I get a hold on a few people here [16:11] hazmat: and discuss the approach [16:11] hazmat: its fair to say that we should take a look at other strategies for addressing services and machines as we get deeper in to the hardware deployment story... [16:12] hazmat: for this primary pass, making it work "a lot like the cloud" is the simplest approach. [16:13] for what its worth, you really shoul dnot expect that 'hostname --fqdn' gives an addressable hostname [16:13] smoser: we have no other reliable source of data about what this machine's name is. [16:13] i believe we've fixed it so that will be the case under orchestra, and in EC2 (and we're fixing that for single nic guests in nova). [16:14] The fact that it wasn't happening was a bug. [16:14] no. [16:14] in those limited cases, that is the case. [16:14] but 'hostname --fqdn' is just not reliable. [16:14] read the man page if you disagree. [16:14] it basically says not to use it [16:15] so i would really suggest against telling charms that the right way to do something is something that documents itself as the wrong way [16:15] :) [16:15] i dont have a solution [16:15] smoser: Indeed, this is the first time I've actually read this.. I wonder how recently this changed. :-/ [16:16] I don't know if I agree with the man page's reasoning or with the mechanics of --all-fqdns [16:16] "Don't use this because it can be changed" vs. "Rely on reverse DNS instead" ... [16:16] if you're depending on cloud-init (which you are for better or worse), we can put something in it , or an external command that would basically query the metadata provided by the cloud provider to give you this. [16:17] i would i guess suggest making a ensemble command "get-hostname" or something [16:17] smoser: Its something we can control (since we control the initial boot of the machine) which ripples through and affects everything else on the machine. [16:17] I believe the plan is to have some sort of "unit info" command for charms to use. [16:17] you do not control the initial boot of the machine. [16:17] you do not control the dns. [16:18] so how could you possibly control resolution of a name to an IP? [16:18] smoser: We do control what we've told the provisioner to do .. which is to name that box "X" [16:18] no you do not [16:18] not on ec2 [16:18] cobbler does [16:18] right. [16:19] but stay out of that [16:19] that would mean that ensemble is acting as the cloud provider in some sense when it talks to cobbler [16:19] which is just yucky. [16:19] we don't put the hostname in the metadata for the nocloud seed? [16:19] not any more [16:19] cobbler does [16:19] ensembel does not [16:19] which is much cleaner [16:20] s/ensemble/juju/ [16:20] Can we ask cobbler what it put there? [16:20] or s/cleaner/more ec2-or-nova-like/ [16:20] you *can*, but you should not. [16:20] oh [16:20] Ok.. where then should we get the address for the machine? [16:20] wait [16:20] yes [16:20] you can ask cobbler what it put there [16:20] sorry [16:20] can and should I think [16:20] yes [16:20] :) [16:20] sorry [16:21] i thought you were saying "Can we tell cobbler what to put there" [16:21] I'm not enthralled with hostname --fqdn. It is, however, the only common tool we have between all environments at the moment. [16:21] well its easy enough to add a tool [16:21] that lives on the nodes [16:22] I think it might actually be quite trivial to write a charm tool ... 
'machine-info --hostname' which gives us the hostname the provider wants us to be contacted with. [16:22] the other thing, i think might be reasonable to consider, if you're only interested in single-cloud systems, would be to have juju run a dns server. [16:22] SpamapS, right. that is what i'm suggesting is fairly easy. [16:22] Too tightly coupled to juju at that point [16:23] right [16:23] If an environment can't provide reliable DNS then it should just give us network addresses when we ask for the hostname. [16:23] i agree with this. [16:23] I believe thats the direction the local provider has gone [16:24] why do you care about a hostname ? [16:24] just curious [16:24] would it not be superior to always be IP ? [16:24] definitely not [16:24] (assuming that the IP would not change) [16:24] why? [16:24] IP can vary from your perspective [16:25] a hostname provides the appropriate level of indirection [16:25] somewhat. [16:25] but in all cases you are aaware of so far, the IP address of the system is what you want. [16:26] ie, in all of cobbler, nova, ec2, 'ifconfig eth0' returns an internally addressable IPv4 address. [16:26] IPv4 or IPv6? internal or external? [16:26] you are interested in IPv4 internal [16:26] usually [16:26] you're 100% only interested in internal if you're using hostname --fqdn [16:26] so that leaves you only ipv4 and ipv6 [16:26] I'm not saying we can't use IP's, I'm saying we need to talk about *hosts* [16:27] ec2 has no ipv4 [16:27] so now you're down to nova (which i know you've not tested ipv6 of) and cobber, which i highly doubt you have [16:27] machine-info --hostname [16:27] You're getting all pragmatic on me. [16:27] just return ipv4 internal ip address. [16:28] no [16:28] so this is below the level of charm [16:28] Like what, you want to ship something *now* ? [16:28] i dont understand the question [16:28] juju is going to prepopulate and store the address, we just need to know how to get it on an orchestra machine [16:28] no [16:28] i was hoping hostname -f would do.. seems like it won't [16:28] do not do that hazmat [16:28] that is broken [16:28] juju should *NOT* prepopulate the address. [16:29] juju is not orchestra [16:29] it can query, it does not set or own. [16:29] smoser, sorry wrong context.. juju was going to store the address from the provider for the charm [16:29] smoser: I'm being a bit sarcastic. Yes, all currently known use cases are satisfied with IP's. But all of them also *should* have hostnames, and we shouldn't ignore the need for hostnames just because we can. [16:29] smoser, the question is how to get the address [16:29] i'm fine with wanting to have hostnames [16:29] you can hide that cleanly behind a command [16:29] in which right now, you're assuming that command is 'hostname --fqdn' [16:30] which is documented as broken [16:30] so i'm suggesting adding another command [16:30] which does the sambe general thing, but works around a bug or two [16:30] and may, in some cases, return an ipv4 address. [16:30] smoser, that command is? [16:30] 'machine-info --hostname' [16:31] hazmat: we've talked about a "machine info" or "unit info" script before. [16:31] I think you want unit info, not machine info. [16:31] which you add as a level of abstraction into the node [16:31] fine [16:31] SpamapS, that doesn't answer the question of how that command gets the info [16:31] ie. 
how do we implement machine-info's retrieval of the address [16:31] hazmat, right now, it does this: echo $(hostname --fqdn) [16:32] that makes it 100% bug-for-bug compatible with what you have right now [16:32] but is fixable in one location [16:32] hazmat: it queries the provider (or, more likely, queries the info we cached in the zk topology when the machine/unit started) [16:32] SpamapS, is right. [16:32] so for local and ec2 providers, we have known solutions, its the orchestra case that its not clear what we should od [16:32] in the orchestra provider 'hostname --fqdn' works [16:33] and i thought we had (or i think we should) assume that the machine's "hostname" in cobbler is fqdn internal address. [16:33] s/assume/insist/ [16:33] so ensembel can just query that from cobbler [16:33] afaik, the only place broken right now is in nova [16:34] due to bug 854614 [16:34] smoser, does cobbler have any notion of external/public addresses? or just hostnames for a given mac addr [16:34] which will be fixed [16:34] <_mup_> Bug #854614: metadata service local-hostname is not fqdn < https://launchpad.net/bugs/854614 > [16:34] RoAkSoAx would know more, but whatever it is, you assert that in some portion of the machines's metadata, a fqdn exists for internal address. [16:34] and you use it [16:35] i dont have cobbler in front of me to dump machine data. but i think it is a reasonable assertion. [16:35] wow, --all-fqdns /win 24 [16:35] doh [16:35] so --all-fqdns is pretty new [16:36] Appeared just before 9.10 I think [16:36] its really all messed up. [16:36] and it doesn't help you [16:36] as it doesn't sort them in any order (how could it?) [16:36] yeah its not useful [16:36] so how can you rely on its output [16:37] providers need to tell us how a machine they're responsible for is addressable [16:37] right. [16:37] and we just assert at the moment that cobbler stores that in (i think) 'hostname' [16:38] And then the external and internal IP's are both the result of querying DNS for that hostname. [16:38] i dont folow that. [16:38] i didn't know external ip was something that was being discussed. [16:39] Just thinking of analogs for ec2's metadata [16:39] Its needed [16:39] for expose [16:39] i agree it would be needed... [16:39] For orchestra, all the firewall stuff is noop'd though [16:39] i really have to look at nova to find a good place for this. [16:39] but basically i think we just need to store it there and assert that it is configured sanely. [16:40] I believe there's a desire to mvoe that FW management to the agents managing ufw/iptables .. but for now, providers have to do it, and orchestra can't. [16:40] yes, hostname in cobbler is the canonical source of the machine's hostnanme [16:40] and Mavis Beacon is the canonical source of my bad typing [16:40] i think for our near term purposes cobbler no op is fine for firewall [16:40] agreed [16:41] SpamapS, so hostname -? is fine for cobbler for the private address.. and hopefully the public address? [16:41] almost certainly not the public address. [16:41] smoser, its not clear what a public address means in orchestra.. its outside the purview of the provider [16:42] hazmat, well, sort of [16:42] clearly orchestra could have that data [16:42] and could provide it to you [16:42] but i dont think we have a place where we assert it is stored now. [16:43] orchestra does not imply whether it has public/private networks. [16:44] Its really not all that interesting, just return hostname for anything wanting to address the machine. 
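The thread converges on a layer of indirection: ask the provider for the machine's address and treat `hostname -f` only as the current stop-gap. A hypothetical sketch of that indirection; `provider_address` stands in for whatever juju records from the provider (EC2 metadata, cobbler's hostname field, the local bridge IP) when the machine is started.

import subprocess

def get_unit_address(provider_address=None):
    """Return the address other machines should use to reach this unit.

    Prefer the address the provider recorded when it launched the machine;
    only fall back to `hostname -f`, which the discussion above notes is not
    a reliable source in general.
    """
    if provider_address:
        return provider_address
    return subprocess.check_output(["hostname", "-f"]).decode().strip()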
[16:45] good enough for me. [16:45] so i do suggest the layer of indirection over 'hostname --fqdn' [16:45] And I'll open up a bug for the desired charm tool [16:45] smoser: agreed, will open up that bug now [16:45] SpamapS, the common use for that is going away [16:46] SpamapS, the relations be prepopulated with the info [16:46] although we still need a way to query it agreed [16:46] at the unit level [16:46] Right, is there a bug for that then? [16:47] or will it be a reserved variable in relation-get ? [16:47] SpamapS, not yet.. but the units-with-addresses branch does the work of storing it directly on the units (pub/private) address in provider specific manner [16:47] SpamapS, just a prepopulated one [16:47] I like that [16:48] i just needed to verify that hostname --fqdn does something sane w/ orchestra [16:48] and it seems like thats what we should use use for now [16:48] which is nice, since that's whats implemented for orchestra [16:48] Wow.. long thread [16:50] hazmat: since all the charms currently rely on it, its been made to work that way. But as we've discussed here, its not really robust as a long term solution. [16:50] smoser, RoAkSoAx does that mean that bug 846208 is fixed? [16:50] <_mup_> Bug #846208: Provisioned nodes do not get a FQDN < https://launchpad.net/bugs/846208 > [16:50] wrt to orchestra [16:51] SpamapS, agreed, but getting it out of the charms, goes a long way to giving us the flexibility to fix it [16:51] niemeyer: yeah, when you get Me, the tire kicker, and smoser, Mr. Meh, talking about something.. the threads tend to go back and forth with a lot of "NO, no, no NO No, no, ahh, yes." [16:51] SpamapS: That's a nice way to get something proper in place.. [16:52] adam_g, probably knows aboug 846208 but i would have thought yes. [16:52] speaking of long term and short term... I'm hoping to file the FFE tomorrow.. where are we at? [16:52] SpamapS, this is probably the closest bug 788992 [16:52] <_mup_> Bug #788992: example formulas refer to providing the hostname in ensemble itself < https://launchpad.net/bugs/788992 > [16:53] at very least, i'm fairly sure that 'hostname -f' should do the right thing there now. [16:53] smoser, cool [16:55] yeah that bug was fixed already ##846208 will verify now that im here [16:55] <_mup_> Bug #846208: Provisioned nodes do not get a FQDN < https://launchpad.net/bugs/846208 > [16:56] SpamapS, we're very close on local dev. [16:56] bcsaller, how's it going? [16:56] hazmat: I was just reading back the channel actually [16:56] Awesome [16:57] hazmat: have you tried the branch yet? [16:57] bcsaller, not yet.. i'll do so now [17:02] bcsaller, what's the url for the stats on apt-cacher-ng? [17:03] http://localhost:3142/acng-report.html [17:04] hazmat: btw did those tests get fixed? [17:04] SpamapS, which tests? [17:04] hazmat: lxc tests IIRC [17:05] the ones that were blatantly broken last week in trunk [17:05] SpamapS, oh yeah.. the breakage, indeed their fixed.. trunk is green [17:06] cool [17:06] I've been doing regular uploads to my PPA with the distro packaging, which runs the test suite... that was blocking those from working. :p [17:06] bcsaller, i'm seeing some oddities around namespace passing which is breaking lxc-ls, but the units are up and running [17:06] hazmat: I'll need details ;) [17:07] bcsaller, i'll dig into it [17:07] bcsaller, but it appears to be working [17:07] hazmat: in an older version its wasn't setting the ns to qualified name and created images with out a prefix, but that was fixed [17:08] bcsaller, ah.. 
that looks like the problem [17:08] sounds like [17:08] hazmat: you didn't pull? [17:08] bcsaller, i probably need to remerge your branch [17:08] sounds like [17:08] bcsaller, i've been pulling your branch and looking over the diff, but i don't think i've remerging into the rest of the pipeline [17:09] then I'm surprised it worked. I expect the services in the container didn't actually start for you [17:09] hazmat: that 'conf' change was missing too I expect [17:11] bcsaller, does the template machine get the namespace qualifier? [17:11] s/machine/container [17:11] no, there are some advantages and disadvantages there [17:12] I expect there will be debate around that point in the review [17:13] I guess it *should* though, I can think of many ways it can go wrong for people [17:13] vs being a cost savings for the well behaved. It should also have things like series name in it I expect [17:17] bcsaller, the question is can we get this stuff landed today for push to the repos tomorrow, is there anything i can help with? [17:18] i think all my branches are approved at this point, i've got one last minor to update the provider name, and prepopulate the relations with the unit address [17:21] bcsaller, latest revno is 404 on omega? [17:21] Idk, can't find it [17:21] ;) [17:21] yeah, thats it [17:30] bcsaller, getting pty allocation errors, just had a kernel upgrade going to try a reboot [17:30] unit agents aren't running [17:30] conf file looks fine === koolhead11 is now known as koolhead11|bot [18:00] <_mup_> juju/config-juju-origin r359 committed by jim.baker@canonical.com [18:00] <_mup_> Add support for get_default_origin [18:01] * rog is done for the day. see y'all. [18:01] rog: Cheers! [18:12] bcsaller, the container unit agents never start, and i get pty allocation errors trying to login manually [18:12] hazmat: sounds like what you were having at the sprint [18:12] hazmat: what was the resolution to that? [18:12] bcsaller, upgrading to oneiric [18:12] i don't think that works twice ;-) [18:13] darn [18:13] currently on lxc == 0.7.5-0ubuntu8 [18:13] same [18:19] hazmat: the lxc-library tests do or don't trigger this issue for you? [18:31] bcsaller, are you specing the origin somehow? [18:31] bcsaller, the lxc lib tests fail in omega for me [18:41] SpamapS: What is the set of valid charm names we're going to support? [18:41] SpamapS: foo(-bar)*? [18:41] Or, more properly "^[a-z]+([a-z0-9-]+[a-z])*$" [18:42] fwereade, bcsaller, hazmat, anyone: ^^^? [18:42] niemeyer: yes that looks exactly right [18:42] basically the hostname spec. ;) [18:43] but no capitals [18:43] +1 [18:43] niemeyer: looks fine to me, might need [-_] [18:43] sounds good [18:43] bcsaller: It contains - already [18:43] no _'s [18:43] one visual separator is fine [18:43] ahh 0-9-, ic [18:44] bcsaller, do you have some delta in your omega branch that's not pushed? [18:44] hazmat: no [18:44] bcsaller, i get test failures.. it looks like around juju package install [18:44] origin should be ppa at this point, I think thats what it says in the code, I'll check again [18:45] fwereade: In case you are around, these will be useful: [18:45] var validUser = regexp.MustCompile("^[a-z0-9][a-zA-Z0-9+.-]+$") [18:45] var validSeries = regexp.MustCompile("^[a-z]+([a-z-]+[a-z])?$") [18:45] var validName = regexp.MustCompile("^[a-z]+([a-z0-9-]+[a-z])?$") [18:45] bcsaller, http://paste.ubuntu.com/697431/ [18:48] hazmat: so either the origin isn't ppa, the networking isn't working or... 
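Since the point of quoting those charm-name regular expressions is to keep the Python and Go sides in sync, a Python mirror of the same patterns might look like the sketch below; it is for comparison only, not the code fwereade landed.

import re

# Mirrors of the Go patterns quoted above.
VALID_USER = re.compile(r"^[a-z0-9][a-zA-Z0-9+.-]+$")
VALID_SERIES = re.compile(r"^[a-z]+([a-z-]+[a-z])?$")
VALID_NAME = re.compile(r"^[a-z]+([a-z0-9-]+[a-z])?$")

def valid_charm_name(name):
    """True for lowercase names such as 'mysql' or 'rabbitmq-server':
    letters, digits and hyphens, starting and ending with a letter."""
    return VALID_NAME.match(name) is not None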
[18:48] bcsaller, the networking is working at least packages are being installed [18:49] hazmat: and you said you can't ssh into the container? I'd try to run the juju-create script, it will be some /tmp/xxxxx-juju-create script in the container and follow the output [18:50] bcsaller, also when the tests fail they leave an orphan container [19:17] jimbaker: any chance of getting env-origin landed today? [19:18] niemeyer, i'm working on the mocks for this. once done, it will be ready for review [19:18] jimbaker: Ugh.. [19:18] niemeyer, so pretty close i would say [19:18] jimbaker: "working on the mocks" gives me bad feelings nowadays, for some reason [19:19] niemeyer, well as i understand i need to mock out apt-cache policy for the various cases [19:20] jimbaker: Not really.. that's a pretty side-effects free problem to solve [19:21] niemeyer, how we would test in the case of being on a distro vs one where it was installed from the ppa? or in the case of being installed from a branch? [19:23] jimbaker: origin, source = parse_juju_policy(data) [19:24] niemeyer, but we still need to run apt-cache policy in order to collect the necessary data. isn't this the role for the mock, to intercept this call with some variations of what it could return? [19:25] jimbaker: There's a single test needed for actually calling apt-cache, and that's also trivial to automate without mocking by putting an executable in the path. [19:25] jimbaker: I won't fight if you decide to mock this one [19:25] jimbaker: But mocking every single iteration of parse_juju_policy is making our lives more painful without a reason [19:26] jimbaker: It's a side-effects free process [19:26] jimbaker: and it's idempotent [19:26] jimbaker: If you need mocker for that I'll take the project page down! :-) [19:27] niemeyer, i will rewrite it according to what you have described, it's not a problem [19:50] bcsaller, are you sure you dont have something in /var/cache/lxc that makes it work for you? [19:50] bcsaller, i just blew away my cache and its still failing on the tests [19:51] hazmat: I'll try to clean that out and check again [19:51] take a few minutes [20:02] bcsaller, did it work? [20:02] bootstrap is still going, w/o cache. [20:03] so for me it hit the test timeout [20:04] but I'm let it build the cache outside the test now [20:05] bcsaller, you on dsl? [20:05] it didn't hit the test timeout for me.. but still failed [20:05] cable [20:05] the unpacking phase took too long oddly [20:10] hazmat: I am seeing errors now, I'll look into it more [20:10] bcsaller, cool, thanks [20:10] bcsaller, as far as i can see ppa is selected across the board [20:10] looked that way to me as well [20:11] oh wait its wrong archive [20:11] haha [20:11] i thought that got fixed in this branch, but you had it cached [20:11] bcsaller, niemeyer pointed it out to me in a review [20:12] bcsaller nevermind that looks sane for the ppa [20:13] * hazmat grabs some lunch [20:13] er.. snack [20:13] yeah, I didn't know what you were talking about there :) [20:33] hazmat: pushed changes to both lxc-lib and omega, it was a missing dep that was cached for me :( [20:33] * bcsaller looks for a brown paper bag [20:34] bcsaller, cool, just glad its fixed [20:34] * hazmat retries [20:49] Hmm.. getting sporadic failures of one test.. 
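The testing argument about env-origin above is that running apt-cache and parsing its output can be kept separate, so only the former ever needs a fake executable. A hedged sketch of a side-effect-free parser; the function name, return values, and heuristics are illustrative, not the branch's actual rules.

def parse_juju_origin(policy_output):
    """Guess where the running juju came from, given the text printed by
    `apt-cache policy juju`.

    Pure parsing: no subprocess calls, so it can be exercised with canned
    strings instead of mocks.
    """
    if "Installed: (none)" in policy_output:
        return "branch"
    if "ppa.launchpad.net/juju" in policy_output:
        return "ppa"
    return "distro"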
[20:49] https://launchpadlibrarian.net/81106645/buildlog_ubuntu-oneiric-i386.ensemble_0.5%2Bbzr361-0ubuntu1~ppa1_FAILEDTOBUILD.txt.gz [20:49] juju.agents.tests.test_unit.UnitAgentTest.test_agent_executes_config_changed_hook [21:07] Hi, I have a problem tyring to get juju to connect to EC2. I described it here http://ubuntuforums.org/showthread.php?t=1849913 but also with the new version today it is tsill the same. I can bootstrap, a new instance is created in EC2, but in juju status the connection is refused [21:08] Cannot connect to machine i-48751428 (perhaps still initializing): could not connect before timeout after 2 retries 2011-09-26 14:03:34,431 ERROR Cannot connect to machine i-48751428 (perhaps still initializing): could not connect before timeout after 2 retries [21:10] jrings: hey, the key that juju uses by default is $HOME/.ssh/id_(rsa|dsa) [21:11] How can I tell juju to use the .pem from EC2? [21:12] jrings: you don't need to [21:12] jrings: it installs your key in the instances [21:12] Well my key is in $HOME/.ssh [21:12] and the juju bootstrap works [21:13] why can't juju status connect then? [21:13] bootstrap complets w/o ssh [21:13] its possible your key didn't make it into the instance for some reason [21:14] jrings: can you pastebin ec2-get-console-output ? [21:16] if i couldn't find a key during bootstrap it will raise an exception [21:17] Is that the same as the log for the instance in the EC2 webconsole? [21:19] If so, here: http://pastebin.com/4c78GVC9 [21:25] jrings: heh, i takes a few minutes to get the full log .. so you might have to wait a bit longer. [21:25] Or maybe there's a limit to the size.. I've never checked [21:25] (that would suck if the limit was applied to the top.. and it wasn't updated like a ring buffer [21:28] hmm.. this line 2011-09-25 10:24:11,882 ERROR SSH forwarding error: bind: Cannot assign requested address [21:28] is interesting [21:28] that's what I get from the juju status [21:29] we pick a random open port on localhost to setup a port forward over ssh [21:30] conflicting with desktop-couch ? [21:30] it looks like that fails, although for it to fail persistently suggests something else is going on [21:30] which does the same thing [21:30] Yeah true [21:30] hazmat: does it definitely do 127.0.0.1 ? [21:30] Yes I can see it trying different ports. [21:30] jrings: can ou paste the output of 'ifconfig -a' ? [21:31] Wait, I set up a single node hadoop locally and had to change something to localhost [21:31] eth1 Link encap:Ethernet HWaddr f0:4d:a2:5f:5c:09 UP BROADCAST MULTICAST MTU:1500 Metric:1 RX packets:0 errors:0 dropped:0 overruns:0 frame:0 TX packets:0 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:1000 RX bytes:0 (0.0 B) TX bytes:0 (0.0 B) Interrupt:41 Base address:0xa000 lo Link encap:Local Loopback inet addr:127.0.0.1 Ma [21:31] ugh [21:31] wait [21:31] Here: http://pastebin.com/Vpp3hJPt [21:35] hrm [21:35] jrings: ufw running? [21:35] can't imagine that would break it tho [21:36] Just did a ufw disable and tried again, same result [21:39] Oh shit I got it [21:39] I had Rstudio installed [21:39] it had a server on 127.0.01:8787 [21:39] just uninstalled it, juju status works [21:40] no wait [21:40] actually it doesn't [21:40] argh [21:40] that doesn't make sense. :-/ [21:40] weird [21:40] I got [21:40] 2011-09-26 14:39:06,972 DEBUG Spawning SSH process with remote_user="ubuntu" remote_host="ec2-174-129-58-110.compute-1.amazonaws.com" remote_port="2181" local_port="58376". 
2011-09-26 14:39:08,981:6112(0x7f2eadf27720):ZOO_INFO@log_env@658: Client environment:zookeeper.version=zookeeper C client 3.3.3 2011-09-26 14:39:08,981:6112(0x7f2eadf27720):ZOO_INFO@log_env@662: Client environment:host.name=vavatch 2011-09-26 14:39:08,981:6112(0x7f2 [21:40] jrings: can you do 'strace -e trace=listen,bind,connect -f juju status' and paste that? (note that the command 'pastebinit' is really nice for this) [21:40] one time [21:41] and then the next juju status failed again [21:41] SpamapS, it picks the open port from all interfaces but binds to it on localhost [21:41] although i recently added an SO_REUSEADDR flag .. it should still be random each run [21:41] hazmat: literally looks up 'localhost' or uses 127.0.0.1 ? [21:41] it does a bind socket.bind("", 0) [21:42] wait, isn't it an ssh forward? [21:42] SpamapS, ah.. yeah.. for the port forward it explicitly uses localhost [21:42] 'localhost' [21:43] jrings: pastebin 'ping -c 1 localhost' [21:45] Here is the strace http://pastebin.com/Q0CPnDBr [21:46] And the ping works http://pastebin.com/cwsep2NK [21:47] <_mup_> juju/go-charm-url r14 committed by gustavo@niemeyer.net [21:47] <_mup_> Implemented full-blown charm URL parsing and stringification. [21:50] <_mup_> Bug #860082 was filed: Support for charm URLs is needed in Go < https://launchpad.net/bugs/860082 > [21:58] connection on port 49486 worked [21:59] is there a way to fix the port? [21:59] jrings: it should work on pretty much any port thats not already used [22:00] bcsaller, hazmat: How to build the base to review lxc-omega? [22:01] Ugh [22:01] txaws.ec2.exception.EC2Error: Error Message: Not authorized for images: [ami-852fedec] [22:01] Have seen this before... [22:01] stale image.. doh [22:15] Does this try to use IP6? [22:16] bcsaller, hazmat: I'm pushing it back onto Work in Progress.. there are multiple bases and no mention of what they are in the summary [22:16] bcsaller: I've added an item about the file lock implementation there already [22:24] niemeyer, its lxc-library-clone->file-lock and local-provider-config [22:25] <_mup_> juju/config-juju-origin r360 committed by jim.baker@canonical.com [22:25] <_mup_> Unmocked tests in place [22:26] <_mup_> juju/config-juju-origin r361 committed by jim.baker@canonical.com [22:26] <_mup_> Added files to bzr [22:32] hazmat: file-lock is not even in the kanban [22:33] lxc-omega also changed since I last pulled it [22:33] I'm going to hold off a bit since this is getting a bit wild [22:34] niemeyer, https://code.launchpad.net/~bcsaller/juju/filelock/+merge/75806 [22:34] the change was a one liner to address a missing package dep [22:34] that i found while trying it out [22:37] That's fine, but things are indeed a bit wild.. missing branches in the kanban.. branch changing after being pushed, multiple pre-reqs that are not mentioned [22:39] The file-lock branch should probably be dropped, unless I misunderstand what is going on there [22:39] It's not really a mutex.. it'll explode if there are two processes attempting to get into the mutex region [22:40] There's an implementation in Twisted already [22:40] niemeyer, its mean to error if another process tries to use it, but yeah the impl in twisted is probably a better option [22:41] hazmat: It feels pretty bad.. 
telling the user "Can't open file" with a traceback wouldn't be nice [22:46] <_mup_> juju/config-juju-origin r362 committed by jim.baker@canonical.com [22:46] <_mup_> PEP8, docstrings [23:13] bcsaller, the lxc.lib tests pass but there's still some errors getting units to run [23:14] hazmat: what are you seeing? [23:14] bcsaller, one quickie the #! header on the juju-create is wrong, missing "/" before bin/bash [23:15] bcsaller, it looks like add-apt-repo still isn't installed on the container.. perhaps i had a leftover machine-0-template .. [23:15] cause juju isn't installed which i assume causes the problem [23:16] I suspect thats the case [23:17] because its missing a prefix its not getting killed i assume has to be done by hand, its going to cause problems as well if someone wants to use the series option [23:19] hmm we should passing origin down from the provider to the machine agent [23:27] hmm.. the clone interface makes it rather hard to put in console and container logs [23:27] i guess just stuff the attrs back on [23:31] <_mup_> juju/lxc-omega-merge r398 committed by kapil.thangavelu@canonical.com [23:31] <_mup_> enable container logs, and trivial juju script header fix [23:41] <_mup_> juju/config-juju-origin r363 committed by jim.baker@canonical.com [23:41] <_mup_> Setup origin policy for affected EC2, Orchestra provider tests [23:56] <_mup_> juju/env-origin r360 committed by jim.baker@canonical.com [23:56] <_mup_> Reversed to r357