[00:02] niemeyer, that fixes one test, thanks for the tip
[00:02] still seeing a problem on ensemble.control.tests.test_upgrade_formula.ControlFormulaUpgradeTest.test_upgrade_formula_service_using_latest
[00:03] (was that fixed by the trivial earlier?)
[00:04] kirkland: I'm testing deploying all of principia w/o ifconfig
[00:05] looks like line 18 should have been fixed in that trivial earlier, http://paste.ubuntu.com/627679/
[00:06] hazmat, ^^^
[00:15] SpamapS: cool
[00:15] SpamapS: point me to the diff?
[00:18] kirkland: yeah when its ready. Have to fix some stuff in memcached
[00:19] kirkland: and this is bringing to light the fact that the munin formula needs to not just copy/paste everything. ;)
[00:19] well.. the munin formula needs to be a machine formula.. but thats another matter entirely. :-P
[00:25] jimbaker, yeah.. i think that
[00:25] is a fallout from a trivial fix i did earlier today
[00:25] jimbaker, it should be a one liner to fix it with the merge if your game
[00:25] hazmat, yeah, just didn't have this one additional change of adding the % id
[00:26] hazmat, absolutely i can make that change, i will work on it in next 20 min or so
[00:31] <_mup_> ensemble/expose-provision-service-hierarchy r297 committed by jim.baker@canonical.com
[00:31] <_mup_> Removed debugging
[01:17] kim0: hey I made a post tagged ubuntu-cloud (and cloud) and its not showing up on cloud.ubuntu.com
[01:18] bcsaller, hazmat, niemeyer - i need a trivial on lp:~jimbaker/ensemble/trivial-test-upgrade-formula; the diff is here: http://paste.ubuntu.com/627698/
[01:18] this fixes the broken test in trunk
[01:18] jimbaker`: hahaha
[01:19] jimbaker`: That assertion is great :-)
[01:19] jimbaker`: +1!
[01:19] niemeyer, yes, it's kind of funny seeing that %r in the test, it explains a number of rather trivial things ;)
[01:19] looks good
[01:24] <_mup_> ensemble/trunk r257 committed by jim.baker@canonical.com
[01:24] <_mup_> [trivial] Fixes broken test introduced by trivial in r254 [r=bcsaller,niemeyer]
[01:25] <_mup_> ensemble/standardize-log-testing r261 committed by jim.baker@canonical.com
[01:25] <_mup_> Merged trunk
[01:29] <_mup_> ensemble/trunk r258 committed by jim.baker@canonical.com
[01:29] <_mup_> merge standardize-log-testing [r=niemeyer,hazmat][f=795233]
[01:29] <_mup_> Removed usage and definition of save_logging, reset_logging, and
[01:29] <_mup_> assertInDefaultLog from codebase in favor of standard log testing.
[01:30] kirkland: re working w/o IP.. all of the formulas had at least one commit, some two.. the latest formulas from principia have everything except munin removed from using ifconfig. Works *perfectly*
[01:30] bbl
[02:10] <_mup_> ensemble/expose-provision-service-hierarchy r298 committed by jim.baker@canonical.com
[02:10] <_mup_> Removed unnecessary import from test_provision
[02:27] <_mup_> ensemble/set-transitions r241 committed by bcsaller@gmail.com
[02:27] <_mup_> merge trunk
=== almaisan-away is now known as al-maisan
=== al-maisan is now known as almaisan-away
=== almaisan-away is now known as al-maisan
[11:03] <_mup_> Bug #798115 was filed: Ensemble is too slow to startup < https://launchpad.net/bugs/798115 >
=== al-maisan is now known as almaisan-away
=== almaisan-away is now known as al-maisan
[14:32] Mornings!
[14:35] morning :)
[14:36] kim0: Hey, how're things going there?
[14:37] hey .. going fine .. how about yourself
[14:38] Waking up from a late night
[14:39] hope it late fun .. :)
[14:39] it was*
[14:40] I just discovered why status takes 5 seconds .. it isn't ssh'ing around the globe .. ensemble itself needs time
[14:40] morning gang
[14:40] I filed a bug
[14:40] m_3: morning :)
[14:41] kim0: Time for what?
[14:41] start up
[14:41] check Bug #798115
[14:41] <_mup_> Bug #798115: Ensemble is too slow to startup < https://launchpad.net/bugs/798115 >
[14:43] kim0: This is not startup.. this is status working!?
[14:43] kim0: Ensemble is certainly doing more than establishing an ssh connection.. :-)
[14:43] hmm
[14:44] would reading the remote data require that much time
[14:44] kim0: Reaching it several times to communicate over zookeeper with a high-latency connection, yes
[14:44] a ha
[14:45] perhaps the protocol is chatty :)
[14:46] if there's any optimizations you guys can do .. that's be great
[14:46] that'd*
[14:46] kim0: Agreed :)
[14:47] I've been feeling that one too... we've been on mobile broadband the past few days. Cable guy due at the new apartment this morning!
[14:47] es has even been timing out on me
[14:49] status is particularly sensitive, as it obtains information about everything
[14:49] my RTT is 100ms .. it must need 50 round trips :)
[14:49] and only the bootstrap node was up
[14:51] kim0: Assuming non-existing of the world, yes
[14:51] non-existence
[14:51] how so ?
[14:51] kim0: It takes 100ms to send an empty packet and doing nothing with it
[14:52] heh yeah, all I'm saying is it's perhaps worth a good look into where that time is spent, and if there's some quick wins
[14:54] kim0: Making status fast isn't a priority right now.. if we don't make Ensemble more useful, no one will be interested in knowing the status.
[14:54] kim0: So let's say this is a problem we're keen on having, for the moment ;-)
[14:55] hehe :) I suppose status is not the only thing that's slow .. but yeah, it sure is low priority
[14:56] kim0: It's likely the slowest
[14:56] kim0: By a significant margin
[14:56] user perception thing... you expect bootstrap and deploy to be slow...
and status to be fast
[14:56] yeah, my uneducated guess was about the opposite :)
[14:56] but it's the opposite
[14:56] kim0: After removing the constant factors, of course (ssh, etc)
[14:56] m_3: exactly
[14:56] I agree, it's not high priority right now, but good to mention
[14:57] m_3: Well.. it's _really_ perception, because they _are_ the slowest
[14:57] m_3: The difference is that the command line isn't hooked up
[14:57] yup
[14:58] Which means there isn't a guy bashing the enter key and screaming "FASTER! FASTER!" on the other side.
[14:58] niemeyer: would having an interactive ensemble tool help
[14:59] kim0: A bit.. not much
[14:59] kim0: Reducing the number of roundtrips is the real win
[14:59] yeah
[15:03] it certainly seems like it'd be cool to have an 'ensh' interactive shell
[15:04] but I don't know the real utility of that other than blocking commands to give the user feedback and maybe sense of control
[15:05] maybe it could have a replica of the zk environment ? would that help a lot overcome high latency networks ?
[15:05] not important
[15:05] * kim0 never used zk, and is getting ready for random things to be thrown at him :)
[15:06] zk docs even call them zk "ensembles"
[15:06] hehe
[15:10] kim0: Brilliant stuff in the tutorial
[15:11] yeah more exposure to docs basically
[15:30] jimbaker`: ping
[15:31] niemeyer, hi
[15:31] jimbaker`: Hey man
[15:31] jimbaker`: What's up there?
[15:32] niemeyer, looks like it's another nice day here in colorado
[15:33] jimbaker`: Nice :-)
[15:33] jimbaker`: I'm review the expose branches, and would just like to bring up an idea about function naming
[15:34] niemeyer, sounds good
[15:35] definitely would like to hear any ideas on such things, the provision code had to come up with a number, and they probably can be improved
[15:35] jimbaker`: If I tell you something like.. hmmmm.. "Can you please check the schedule?", what kind of assumption can you make?
[15:36] we are probably dealing with issues around time, or more generally events
[15:36] jimbaker`: Right, but it's pretty hard to figure what I'm trying to figure, right?
[15:37] jimbaker`: Same thing as.. "Hey, can you please go down and check the street?"
[15:37] jimbaker`: This an "empty" request, if you see what I mean
[15:37] It's an
[15:37] niemeyer, correct, the work "check" is one of those words that feels like a crutch word
[15:38] extremely vague
[15:38] jimbaker`: Yeah..
[15:38] jimbaker`: It'd be much easier to say, "Is there traffic in the street?"
[15:38] so ideally it would not be used. so the question is, what replaces it to be more precise
[15:39] jimbaker`: Or, "Ensure our slot is booked in the schedule"
[15:39] jimbaker`: This is detailing the intended _outcome_, rather than how one will do it
[15:39] jimbaker`: check_firewall_settings, watch_service_changes, ... they have that feeling
[15:40] jimbaker`: open_close_ports() is a much better name than check_firewall_settings, as an example
[15:41] niemeyer, sounds like a great suggestion for that function name
[15:41] watch_service_changes of course parallels the existing watch_machine_changes
[15:42] jimbaker`: They are both bad as well
[15:42] jimbaker`: For the same reasons
[15:42] so there was some convention already existing, but as you mention, bad
[15:42] jimbaker`: and worse, they look like a request to watch..
[15:42] jimbaker`: Which isn't the case
[15:43] jimbaker`: Agreed.. consistency is better than nothing
[15:43] niemeyer, agreed, going outside of the narrow scope of this particular file, the larger convention is watch means to actually *watch*
[15:43] jimbaker`: But we can as well change both, consistently :)
[15:43] jimbaker`: Right, exactly
[15:43] jimbaker`: and for the same reasons above, this is vague
[15:44] jimbaker`: Watch for what? Will have to read the doc/code to know
[15:44] niemeyer, sure, i can definitely make that change
[15:44] in both names
[15:46] jimbaker`: E.g.
watch_expose_flag is a good name for watch_service_changes
[15:46] the first is to at least use our cb_ prefix to indicate callbacks
[15:47] jimbaker`: I'm ambivalent about it.. they are generally good hints when we can't figure something better that actuall describes the intention
[15:47] jimbaker`: Otherwise, twisted is all about callbacks.. we'll go crazy :)
[15:48] jimbaker`: Note that we generally use cb_, if I remember correctly
[15:48] niemeyer, certainly twisted is always about the callback, fortunately inlineCallbacks can make it more linear
[15:48] but that's another line of thought
[15:48] jimbaker`: Which means we have the same problem as above.. the function name has no hints about its intention
[15:49] jimbaker`: Perhaps a good way to put the distinction is that one is "this is why someone is calling me" and the other is "this is what I'm going to do"
[15:49] jimbaker`: The latter is generally more useful when reading the code
[15:50] niemeyer, indeed
[15:51] my 10 year old daughter just got into this in her robotics camp last week. i looked at the function names she was writing, and they were very explicit on what the function was going to do
[15:51] (this was in C, she told me on the 2nd day she wanted to be a programmer. i told her that in the future, all professionals will be programming in some way ;) )
[15:52] jimbaker`: I'm not entirely sure.. I had that feeling in the early 90s, but nowadays it feels like it's getting harder to get people interested in the details of the problem
[15:53] Just too easy to be a user, I guess
[15:54] i consider building a spreadsheet model to be programming, or similar tasks. it can be done visually or with code, or both
[15:55] but again the aspect of functions that describe concrete functionality, and that we can effectively reason about them because we have good names, that was very clear in my daughter's code and ideally in any code we all write
[15:55] jimbaker`: COol
[15:56] jimbaker`: Re.
[3], can you extend a bit on why you think it's not doable?
[15:56] niemeyer, this is the collapsing together of both dictionaries
[15:56] jimbaker`: Yeah
[15:57] so certainly doable, the question was whether it would make things harder to read
[15:57] jimbaker`: Agreed.. I had the impression it'd make them easier
[15:57] jimbaker`: So I'm interested in your perspective
[15:59] i need to track two distinct things here. whether or not a watched service is exposed or not. if it is exposed, we then start a watch on its service units
[15:59] and then what watches have been established for each service unit
[16:00] self._watched_services tracks the first; self._watched_service_units the second
[16:00] jimbaker`: Yes, as I understand it, you need to track: a) Whether a service is exposed or not; b) What units in this service are being watched
[16:00] jimbaker`: Is that correct?
[16:01] niemeyer, not quite
[16:01] jimbaker`: Ok.. that may be the detail I'm missing then.
[16:01] niemeyer, because i need to track two watches for a service, because that's the api i'm working with
[16:01] 1) watching the service's exposed flag; 2) watching its service units
[16:02] jimbaker`: Isn't that a and b above?
[16:03] niemeyer, sorry, it was a little ambiguous
[16:03] niemeyer, for each service unit, we need to watch its ports
[16:03] so that's the watch per service unit
[16:05] jimbaker`: Ok, so.. it feels like we're talking about the same thing
[16:05] jimbaker`: So, objectively..
[16:05] jimbaker`: self._watched_services tracks which services are watched..
and it's really a set rather than a dictionary
[16:06] jimbaker`: It's values aren't used for anything
[16:06] so _watched_services manages watches at the granularity of one watch per service (watch_exposed_flag, watch_service_units); _watched_service_units tracks at the granularity of a service unit
[16:06] jimbaker`: self._watched_service_units is a defaultdict, which means the information of whether a key exists or not is not being used in any good way
[16:07] niemeyer, actually _watched_services value is used to indicate whether the watch_service_units watch has been started
[16:07] jimbaker`: Make the latter a real dict, and use presence information in a good way
[16:07] jimbaker`: This way you don't need two dictionaries, nor the clean up you have in other functions
[16:08] False - only watch_exposed_flag; True - also started watch_service_units
[16:09] jimbaker`: I understand.. can you please point out the place in the code that invalidates the suggested design?
[16:10] niemeyer, i think what you are suggesting is something like the following:
[16:12] the presence of keys in _watched_service_units indicates that it is exposed; the corresponding value may be None (or whatever value is useful) if there is no watch on its service units; when that watch is established, change to a set, which collects all the watches on the corresponding service units' ports
[16:12] niemeyer, i can certainly implement such a design. it just seemed more complicated
[16:13] niemeyer, if you have another design in mind, i don't see it
[16:14] jimbaker`: Maybe it's more complicated, but I'm trying to understand why..
[16:14] jimbaker`: It feels like you have double bookkeeping, e.g.
[16:14] niemeyer, alternatively i can use a multi-level dict or equivalently an object to represent this mapping
[16:14] + self._watched_services[service_name] = False
[16:14] + self._watched_service_units.pop(service_name, None)
[16:15] niemeyer, agreed, that is an implication of the design
[16:15] jimbaker`:
[16:15] + self._watched_services.pop(service_name, None)
[16:15] + self._watched_service_units.pop(service_name, None)
[16:15] on the other hand, there is no need to do any conditionals on what the value of _watched_service_units is
[16:16] jimbaker`: Yeah, and I'm trying to understand why the complication is necessary.
[16:16] jimbaker`: Sure, because you're always doing it in a different dictionary that you maintain sideways :-
[16:16] )
[16:17] so you make that one tradeoff in a couple of places where the parent key (so to speak) is deleted and one's maintaining referential integrity (so to speak). but arguably resulting in simpler code
[16:18] jimbaker`: I don't see the tradeoff.. you're effectively discarding presence information in one and using only presence information in the other
[16:18] jimbaker`: Either way, never mind
[16:18] jimbaker`: I should have just tried out.. it'll be easier
[16:19] jimbaker`: You may well be right.. I'm just missing the "This is a better design, because ..." sentence.
[16:19] niemeyer, no worries, just wanted you to know i thought through the implications of your question
[16:20] and looked at the implementation. it seemed more complicated, at least at what i tried, vs the relative simplicity of what's being done now
[16:20] but again at the cost of two dicts
[16:20] jimbaker`: If it's simpler, let's keep it.
[16:21] jimbaker`: I'm just missing the "This is simpler, because ..." explanation.. but it's just me failing to get it.
[16:22] again, i think the lack of conditionals makes it simpler.
instead we trade that for invariants (hence the use of code like self._watched_service_units.pop(service_name, None) - it may be there, it may not, it doesn't matter)
[16:23] my concern was that there was enough logic in this code already, i didn't want to increase its level
[16:24] jimbaker`: Yeah, no conditionals is simpler.. why would we need conditionals? (rhetoric question)
[16:24] niemeyer, ;)
[16:24] jimbaker`: This is rhetoric in the sense that this is the kind of thing that can make a long conversation about design short..
[16:25] jimbaker`: "I need to track the foobar of the boodoom." finishes such conversations in no time.
[16:25] niemeyer, sure, that's absolutely right
[16:25] jimbaker`: "Code without conditionals is simpler." gives no hints.
[16:26] also having dictionaries state this is what i track, and my invariant is maintained, that's nice too
[16:27] naively, it would be nice if it would nice if it were possible to avoid all such extra state, it is possible to introspect for a watch, but that's not really going to work as i understand it
=== al-maisan is now known as almaisan-away
[16:50] Man..
[16:50] This whole inlineCallback thing is tricky..
[16:51] While it brings back the nice feeling of straight code back, it also introduces concurrency issues which are very easy to ignore.
[16:58] jimbaker`: This is what I mean
[16:58] jimbaker`: http://paste.ubuntu.com/628037/
[16:58] jimbaker`: Untested..
[16:58] jimbaker`: But theoretically it shouldn't break the tests, if they are not deeply dependent on the implementation
[16:59] jimbaker`: There are also a couple of points inlined in the diff worth noting
[16:59] jimbaker`: I'll step out for lunch.. please let me know what you think
[16:59] jimbaker`: About the approach, not lunch ;-)
[17:00] niemeyer, enjoy your lunch.
i certainly have food opinions, but they are to express remotely
[17:00] hard to express
[17:00] niemeyer, i'll take a look at the diff thanks
[17:34] niemeyer, still taking a look at you diff to see why it doesn't work
[17:38] jimbaker`, i was looking at the zookeeper test setup, and was curious how the contextmanager and generator here works with the test setup... it looks like a nice way for us to do fixtures
[17:38] jimbaker`, if you have a few minutes to discuss i'd like to do an audio chat
[17:40] hazmat, i'm looking at that code again, one moment
[17:42] hazmat, this is a basic example of a context manager. sure, we can talk now if you'd like. mumble? skype?
[17:42] jimbaker`: Thinking over lunch, I understand why you preferred the other approach.. the first dictionary is actually a three-state one
[17:42] niemeyer, correct
[17:43] jimbaker`: So _watched_services[name] = False is actually not entirely correct
[17:43] jimbaker`: The service _is_ being watched
[17:43] sure, it's not just being watched for its service units. but it does correspond to being exposed or not, hence the boolean
[17:44] jimbaker`: Yeah, booleans are lovely.. :-)
[17:45] jimbaker`: I'm happy with either approach, but it needs clarification.. I'll see if I can make tests pass with this and compare
[17:45] niemeyer, yes, they say exactly what i mean them to be ;) i hope
[17:45] jimbaker`: _watched_services[foo] being False means the service is not watched.
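
A minimal sketch of the single-dict, three-state bookkeeping discussed above (not watched, watched but not exposed, exposed with a set of watched units). The class name, method names, and the NOT_EXPOSED sentinel are hypothetical and are not taken from the Ensemble branch or the pasted diffs; it only illustrates how key presence and the stored value can together replace a boolean dict plus a defaultdict.

```python
# Hypothetical illustration only; none of these names come from the Ensemble code.
NOT_EXPOSED = object()  # sentinel value: the service is watched but not exposed


class ServiceWatchTable(object):
    def __init__(self):
        # service name -> NOT_EXPOSED, or a set of unit names whose ports are watched.
        # A missing key means the service is not watched at all.
        self._watched_services = {}

    def watch(self, service_name):
        # Start tracking a service; it is considered unexposed until told otherwise.
        self._watched_services.setdefault(service_name, NOT_EXPOSED)

    def set_exposed(self, service_name, exposed):
        if service_name not in self._watched_services:
            return  # unknown service, nothing to update
        if not exposed:
            # Dropping the set also drops every per-unit entry in one step.
            self._watched_services[service_name] = NOT_EXPOSED
        elif self._watched_services[service_name] is NOT_EXPOSED:
            self._watched_services[service_name] = set()

    def add_unit(self, service_name, unit_name):
        # Returns False when the service is unwatched or unexposed.
        units = self._watched_services.get(service_name, NOT_EXPOSED)
        if units is NOT_EXPOSED:
            return False
        units.add(unit_name)
        return True

    def unwatch(self, service_name):
        self._watched_services.pop(service_name, None)

    def state(self, service_name):
        if service_name not in self._watched_services:
            return "unwatched"
        if self._watched_services[service_name] is NOT_EXPOSED:
            return "not exposed"
        return "exposed"
```

With this shape, a False value can no longer be misread as "not watched": absence of the key carries that meaning, and the sentinel carries "watched, not exposed".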
[17:45] niemeyer, yeah, obviously there's still one ref to watched_service_units in your diff, but there's more to it than that
[17:45] jimbaker`: No other way to interpret it
[17:46] jimbaker`: Yeah, leave that with me
[17:46] niemeyer, so maybe it should mean - _watching_service_units or something like that and _watching_service_unit_ports, although this seems too wordy
[17:46] but again a tweak on the names can make this clearer
[17:47] jimbaker`: If the single design doesn't work, we can come up with something like _watched_services[foo] = Exposed/Unexposed
[17:47] jimbaker`: Erm.. single dict design
[17:47] niemeyer, that's also a good choice. maybe they can define nonzero too
[17:48] ;)
[17:48] jimbaker`: nonzero?
[17:48] niemeyer, i believe it's __nonzero__ to be precise, as called by bool
[17:48] new objects that act just like booleans!
[17:49] jimbaker`: Heh
[17:50] biab, i'm going to get some coffee
[17:51] Enjoy
[17:54] niemeyer: where is the ensemble documentation? I used to have it but, can't find it now.
[17:55] negronjl: Should be in https://ensemble.ubuntu.com/docs
[17:55] niemeyer: perfect! thanks!
[17:55] negronjl: np!
[17:58] niemeyer: does ensemble take care of the security groups in AWS? ie: if I have a service that requires port 8112 open, would it take care of opening that port and closing whatever is left ?
[17:59] negronjl: We're working on that _right now_, literally :-)
[17:59] negronjl: Both me and jimbaker` are hacking on it as we speak
[17:59] niemeyer: perfect.
thx
[17:59] negronjl: The idea will work like this:
[17:59] negronjl: The formula declares what ports it needs open
[17:59] negronjl: Via a call to, say, open-port 80/tcp
[18:00] negronjl: That doesn't actually _open_ the port directly, though
[18:00] negronjl: The admin is in control of which services are exposed
[18:00] negronjl: So you can do something like
[18:00] negronjl: ensemble expose myblog
[18:00] negronjl: This command will make Ensemble check which ports the formula declared as open, and will open the firewall for them, specifically
[18:01] niemeyer: that's great
[18:01] negronjl: So the ports that are punched through the firewall are those that both: a) Have been declared as open-port by the formula; and b) Have been exposed
[18:02] negronjl: What we have in trunk right now is "everything is open party"
[18:02] niemeyer: I noticed. hence the question :P
[18:02] negronjl: One jimbaker` finished the last few bits, we'll have the full feature
[18:02] niemeyer: perfect.
[18:02] negronjl: I'm just assisting jimbaker`
[18:03] negronjl: Once jimbaker` finishes the last few bits, we'll have the full feature
[18:03] (that was the real sentence :-)
[18:22] I wonder if "open port" means open to the general Internet .. is it possible to open to some other service only ? like mysql being accessible from mediawiki only
[18:24] kim0, that's certainly a fair question - for now open-port doesn't mean that, it's the exposed setting that means open to the internet (and interprets opened ports accordingly)
[18:24] kim0, hope that makes sense
[18:24] * kim0 scratches head
[18:25] so it's not possible today to open to a certaing sg right ?
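
The rule described above (a port is punched through the firewall only if the formula declared it with open-port and the admin exposed the service) can be sketched as a small function. This is an illustration under assumptions: the function name and the data shapes are hypothetical, and it does not reflect how Ensemble stores or applies this state.

```python
def ports_to_open(declared_ports, exposed_services):
    """Return, per service, the ports the firewall should let through.

    declared_ports: dict mapping service name -> set of (port, protocol)
        tuples the formula declared via open-port (hypothetical shape).
    exposed_services: set of service names the admin has exposed.
    """
    result = {}
    for service, ports in declared_ports.items():
        if service in exposed_services:
            result[service] = set(ports)  # declared and exposed
        else:
            result[service] = set()  # declared only: stays closed
    return result


declared = {
    "myblog": {(80, "tcp")},
    "mysql": {(3306, "tcp")},
}
firewall = ports_to_open(declared, exposed_services={"myblog"})
```

In this example `firewall["myblog"]` contains `(80, "tcp")` while `firewall["mysql"]` is empty, because mysql declared a port but was never exposed.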
[18:25] niemeyer just went over how these two pieces fit together with negronjl, fwiw
[18:25] kim0, there is no current open design work to do what you ask
[18:25] yeah got it
[18:26] kim0, but i can imagine that we could leverage open-port as you describe to get at different security zones along the lines of what SGs in general can do
[18:26] kim0: In a way, open-port actually means exactly that.. it tags which port _should_ be open to whoever is consuming that service
[18:27] kim0: As jimbaker` points out, we're not working on inter-service-unit handling of that now, though
[18:27] yeah got it .. thanks
[18:27] Yeah, what jimbaker` says
[18:28] kim0, we should certainly capture this in a bug
[18:35] <_mup_> txzookeeper/session-event-handling r44 committed by kapil.foss@gmail.com
[18:35] <_mup_> managed zk cluster api
=== daker is now known as daker_
[18:49] <_mup_> txzookeeper/session-event-handling r44 committed by kapil.foss@gmail.com
[18:49] <_mup_> managed zk cluster api
[18:56] ok guys. so I have hadoop-master set up in ensemble but, I have a question before I move forward with the slave nodes. Any way of changing the instance type to m1.large ( or anything else for that matter )?
[18:58] negronjl, not at the moment
[18:58] hazmat: thx. no worries. I'll deal with it.
[18:59] hazmat: Hmm
[18:59] hazmat: What about default-instance-type?
[18:59] we've talked before about using a separate ebs volume per unit instead of an ebs instance, which would allow for some sort of expansion.. but that's really post lxc isolation
[18:59] negronjl, as niemeyer points out you can switch the default instance type for the entire environment if you wish, but not per unit/machine atm
[18:59] hazmat: where would I change that ?
[18:59] or service for that matter..
[19:00] negronjl, in ~/.ensemble/environments.yaml
[19:00] hazmat: perfect! thx.
[19:01] negronjl, https://ensemble.ubuntu.com/docs/provider-configuration-ec2.html
[19:01] hazmat: perfect! thx.
[19:02] hazmat++ for actually documenting it :)
[19:02] niemeyer, it was getting hard to remember ;-)
[19:05] Yo compiz! Gimme my cursor back!
[19:06] No game :(
[19:07] I'll give unity another chance by 11.10 :)
[19:07] kim0, when it comes to compiz its underlying unity and classic.. so its hard to escape from its bugs..
[19:08] oh yeah .. I do run gnome without unity .. rock solid
[19:08] without compiz I mean
[19:08] jimbaker`: http://paste.ubuntu.com/628096/
[19:08] jimbaker`: All tests pass
[19:09] kim0, how do you set that up.. if i login under classic, i still have compiz as the window manager afaics
[19:09] jimbaker`: Sorry, let me add a proper comment on the dict
[19:09] i'm still averaging a reboot about every 6hrs.. or some sort of re-login after the xsession crashes
[19:10] niemeyer, ok, i like that much better
[19:11] hmm .. I am indeed running metacity .. no idea how to configure it though
[19:11] niemeyer, it reads better than my version, so thanks
[19:12] hazmat: perhaps gconf-editor → desktop -> gnome -> applications -> window_manager and set to metacity
[19:14] jimbaker`: No problem
[19:14] jimbaker`: Here is the version with the comment: http://paste.ubuntu.com/628102/
[19:15] kim0, awesome thanks.. i'll try that out
[19:16] jimbaker`: I was glad to dive in as well.. the overall logic feels good, even though I'm still a bit concerned with concurrency issues
[19:16] niemeyer, comment is also good, and i like how the dict is maintained
[19:16] niemeyer, i understand your concern, but after being in that code for a while, i think the concurrency aspects are solid
[19:17] jimbaker`: I'm not entirely sure.. I'm concerned in general, not just with that one piece of code
[19:18] jimbaker`: We haven't been considering locks, etc, very often. Twisted makes us lazy in that regard, but every yield in a function is putting control away from the function, and the world can change at that point.
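
The concern above, that state checked before a yield may no longer hold after it, can be reproduced in a few lines. The sketch below uses Python's asyncio rather than Twisted's inlineCallbacks so that it runs with the standard library alone; `await` plays the role of `yield` here. All function names and the data are hypothetical and are not taken from the Ensemble code.

```python
import asyncio


async def fetch_open_ports(unit_name):
    # Stands in for a remote read; control returns to the event loop here.
    await asyncio.sleep(0)
    return {80}


async def open_ports_unsafe(exposed, service, unit, opened):
    if service not in exposed:
        return
    ports = await fetch_open_ports(unit)  # other tasks may run at this point
    opened[unit] = ports  # acts on a check that may be stale by now


async def open_ports_rechecked(exposed, service, unit, opened):
    if service not in exposed:
        return
    ports = await fetch_open_ports(unit)
    if service not in exposed:  # re-validate after regaining control
        return
    opened[unit] = ports


async def unexpose(exposed, service):
    exposed.discard(service)


async def run_demo(worker):
    exposed = {"myblog"}
    opened = {}
    # The worker starts first and suspends inside fetch_open_ports;
    # unexpose then runs to completion before the worker resumes.
    await asyncio.gather(
        worker(exposed, "myblog", "myblog/0", opened),
        unexpose(exposed, "myblog"),
    )
    return opened


unsafe_result = asyncio.run(run_demo(open_ports_unsafe))
rechecked_result = asyncio.run(run_demo(open_ports_rechecked))
```

The unsafe variant records ports for a service that was unexposed while it was suspended; the rechecked variant records nothing. Re-validating after each suspension point is one option; another is to make the later action harmless when the state has changed.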
[19:18] niemeyer, yeah, that's definitely what i think whenever i have a yield
[19:18] jimbaker`: As an example, in that same diff, look at the original cb_check_service_units function
[19:18] jimbaker`: You were making assumptions about the state of the world within it
[19:19] jimbaker`: because you've _tested_ it
[19:19] jimbaker`: But every time you "yield", the world can change
[19:19] niemeyer, the world may change. not completely arbitrary
[19:19] jimbaker`: Concrete example:
[19:19] - if unit_name not in self._watched_service_units[service_name]:
[19:19] jimbaker`: What guarantees that the service wasn't unexposed on the yield within the for loop?
[19:20] jimbaker`: Not arbitrary, but hard to keep the world state in mind..
[19:20] any way to ssh into the instances to get a better view of what's going on ?
[19:21] negronjl: ensemble ssh
[19:21] negronjl: Or,
[19:21] niemeyer, thanks that is in fact an invalid assumption
[19:21] negronjl: ensemble ssh $UNIT_NAME/$N
[19:21] ahh...perfect! thanks niemeyer
[19:21] jimbaker`: It's hard.. I don't blame you
[19:21] negronjl: np!
[19:22] negronjl: You may also be interested in "ensemble debug-hooks"
[19:22] negronjl: It's quite fun
[19:22] niemeyer: I'll make sure to check it out
[19:22] negronjl: A bit like "gdb for hooks" :-)
[19:22] niemeyer: perfect! exactly what I'm looking for
[19:23] negronjl: check out https://ensemble.ubuntu.com/docs/write-formula.html#debugging-hooks and let me know how to improve it :)
[19:23] niemeyer, in your new code you ignore this transition from the set to NotExposed (or deleted), because it's safe to setup such watches, they will simply terminate immediately
[19:23] niemeyer, so keeps things simple
[19:23] niemeyer, and correct
[19:24] niemeyer: hadoop-master and hadoop-slave is done. Would you point me to how to submit this for you guys?
[19:24] niemeyer: I pressume that I would submit this to principia ??
[19:25] niemeyer: now that I have a better understanding, I plan on porting all of the other orchestra-modules to ensemble.
[19:25] negronjl: check out https://ensemble.ubuntu.com/Principia
[19:26] kim0: perfect! thx
[19:27] negronjl: Woah, that's awesome!
[19:27] thx niemeyer
[19:28] negronjl: Man, can't wait to deploy my first hadoop cluster with Ensemble
[19:34] hazmat: Have we missed the weekly meeting timing?
[19:34] hazmat: Or did I get it wrong?
=== almaisan-away is now known as al-maisan
[19:42] kim0: Have you been merging your branches?
[19:42] nope
[19:42] kim0: Ok, please let me know when you do the suggested FAQ tweaks then, and I'll comit
[19:42] commit
[19:43] niemeyer: that was the branch https://code.launchpad.net/~kim0/ensemble/updating-faq
[19:43] niemeyer: don't know if it's merged or not
[19:43] kim0: I know, I just reviewed it
[19:44] kim0: https://code.launchpad.net/~kim0/ensemble/updating-faq/+merge/64679
[19:44] ah ok .. will update the branch
[19:51] Is there any scientific explanation to a browser tab spinning/loading for 30mins! I mean tcp should either timeout, or retry right
[19:53] kim0: Long polling is one of them
[19:54] kim0: I.e. its constantly retrying to wait for updates from the server
[19:54] it's
[19:54] like gmail .. I dont think it keeps spinning in that case
[19:55] kim0: There are different implementations possible
[19:55] kim0: http://tools.ietf.org/html/rfc6202
[19:56] thanks!
[19:57] although I'm actually inclined to think it's more of a bug than a website feature .. since refreshing the page, would finish loading in a couple of seconds
[19:59] any way nvm
[20:04] kim0: Possibly
[20:54] kim0: did you get my msg about my post not syndicating onto cloud.ubuntu.com ?
=== al-maisan is now known as almaisan-away
[21:08] yeah just replied
[21:08] * kim0 mostly afk
[21:43] niemeyer: Filed bugs #798421 (hadoop-master) and #798422 ( hadoop-slave). Feedback is well appreciated.
[21:43] <_mup_> Bug #798421: new-formula (hadoop-master) < https://launchpad.net/bugs/798421 >
[21:43] <_mup_> Bug #798422: new-formula (hadoop-slave) < https://launchpad.net/bugs/798422 >
[21:44] negronjl: Awesome, thanks a lot!
[21:44] niemeyer: np. I'll be working on the rest of 'em shortly.
[21:58] negronjl, this is the hdfs name node and job tracker re hadoop-master?
[21:59] hazmat: I don't quite get your question
[21:59] negronjl, i'm just trying to understand what bits are managed by the hadoop-master formula
[21:59] * hazmat digs through the source
[22:00] hazmat: I have to go to a meeting now. Do you mind if we talked about this in a while ( 30 minutes or so )
[22:00] negronjl, sounds good
[22:00] interesting
[22:00] 3 interfaces, does that work?
[22:03] negronjl: have you deployed these yet?
[22:03] SpamapS: I have
[22:04] SpamapS, any reason you thought they wouldn't work?
[22:04] The only confusing part is the 3 interfaces
[22:04] ensemble-log "Point your browser to http://${public-hostname}:50070"
[22:04] Also the variable public-hostname isn't ever set
[22:04] hmm.. yeah.. three interfaces under one name is a bit strange
[22:04] not sure that's valid
[22:04] Well there's really no reason for 3 interfaces
[22:05] negronjl: can you explain what you were trying to accomplish there?
[22:05] negronjl: (BTW this is *really* cool)
[22:05] SpamapS: let me get through my meeting and I'll work with you guys on this
[22:05] m_31: ping, we're discussing hadoop, you should be paying attention. :)
[22:05] negronjl: ahh yes! when you have time then, just ping us
[22:06] negronjl, only the last interface will survive, its over-writing a duplicate key else
[22:06] negronjl, ttyl
[22:06] ahh more abuse of the "what is my IP" paradigm. I feel quite personally responsible for that.
[22:06] negronjl, and i agrew with SpamapS this is very cool
[22:07] SpamapS, almost as much i ;-)
[22:07] I'm sure hostnames will suffice for the places where IP is being used.
[22:07] SpamapS, i thought you'd switched out principia to getting things off ifconfig? [22:07] The one place where I'm resolving them in the formula into IP's, is memcached because I don't want web requests blocking on DNS. [22:07] instead of the md server.. but i guess the dns name still needs the md server or external actor providing this info [22:08] hazmat: just yesterday I switched almost everything to getting things from DNS [22:08] SpamapS, sure... but they get cached typically so the per request overhead should still be neglible [22:08] I'm looking into whether or not to also include running a local caching name server as well. [22:09] hazmat: they don't get cached by PHP very effectively. :-P [22:09] hazmat: it does the equivilent of nscd .. cache it "forever" [22:10] hazmat: but yeah, it may not be worth it to resolve in formula even in that case. [22:14] I think I'll add a principia proof warning about querying the metadata service [22:17] negronjl: just branched them... awesome! [22:20] <_mup_> ensemble/status-dot-output r259 committed by bcsaller@gmail.com [22:20] <_mup_> fix for Bug #792448, unsafe labels in dot graph [22:21] <_mup_> ensemble/status-dot-output r260 committed by bcsaller@gmail.com [22:21] <_mup_> example of previously bad input, used by tests [22:21] ahh...meeting is going long guys. I've been reading SpamapS about the three interfaces and I think I can get it all done with just the one interface. give me a few minutes to fix [22:26] <_mup_> ensemble/relation-get-eval r228 committed by bcsaller@gmail.com [22:26] <_mup_> remove shell variable prefix, its now implicit [22:29] Ok I just added a check in principia's proof command that warns if you use the metadata service directly... [22:29] * SpamapS now starts cleaning all of those up [22:32] SpamapS: can you elaborate on using metadata service directly? 
[22:32] SpamapS: I'm new to this hence all the stupid questions [22:33] negronjl: I feel a lot more stupid trying to grok hadoop than you do asking about my cryptic communication style. ;) [22:33] negronjl: Basically that would mean the formula would be undeployable outside of EC2 [22:34] negronjl: in addition, that information is available in DNS. The one thing that isn't is the public hostname, but thats really something we need to make ensemble provide. [22:34] SpamapS: ahh...it all makes sense now. I remember the very "nice" conversation on one of the channels about ipaddr [22:34] negronjl: "nice" :) [22:35] negronjl: the appropriate thing to do is send around hostnames, and if you *MUST* resolve it to an IP, resolve it where you are going to use it, not where you are sending it from. [22:35] SpamapS: I can always ifconfig ...... awk .... echo .. .sed ..etc. :) [22:35] That way you can take into account whether or not you have IPv6 .. and any search domains [22:35] negronjl: nooo thats the next evil hack I'm going to add a warning for. :) [22:35] negronjl: hostname -f gives you the FQDN of your machine. Send that. [22:35] SpamapS: I knew that would get a kick out of ya :) [22:54] SpamapS: Would it be better to use hostname instead of IP? Also, where do I get the Public DNS from if not from the meta-data ? [22:55] negronjl: yes, hostname -f should always be resolvable from any node that can reach you directly.. [22:56] negronjl: for the public hostname, you can get it from 'whatismyip.com' .. [22:56] negronjl: just kidding btw.. I think we need Ensemble to provide a machine info tool that works accross any machine provider. [22:57] SpamapS: so, for now, I'll use the hostname -f for internal and meta-data for external ( until I get something better ) [22:57] negronjl: when do you *actually* need the public hostname, programattically? 
[22:57] negronjl: the only space I see you attempting to use it is to print, into the debug log, the hostname people should use to access the server. [22:57] negronjl: which you can get from 'ensemble status' [22:58] SpamapS: I'll do that instead. [22:58] SpamapS: taking it out [22:58] negronjl: \o/ [22:58] Now if we only had an LXC machine provider... [23:00] SpamapS: I pushed all of the changes we discussed here (one interface instead of three, hostname -f instead of meta-data, remove the public-dns bit ) [23:01] pulling [23:01] SpamapS, want to discuss bug 766317 ? [23:01] <_mup_> Bug #766317: debug-log should show relation settings changes < https://launchpad.net/bugs/766317 > [23:02] SpamapS: deploying ( and praying a bit ) [23:02] negronjl: principia proof reports this: W: all formulas should provide at least one thing [23:02] basically this addresses observability of formulas. obviously i can readily write a utility to grab this zk info and show how it changes [23:02] negronjl: for hadoop-slave .. the reason is, if it doesn't provide anything, how can other services consume what it does? [23:02] w/in the constraints of any applied security on zk, of course [23:02] Spamaps: you don't consume anything out of the hadoop-slave nodes [23:03] jimbaker: Right, I'd prefer that we not log all that data (even though we are now.. thats something I think we should change and not log these credentials. [23:03] SpamapS: ... your own wordpess formula (/usr/share/ensemble/examples/wordpress ) doesn't provide anything either :D [23:03] SpamapS, debug-log is certainly not intended for actual logging [23:03] negronjl: yes, thats a bug, it should provide 'website' [23:04] of course as it is right now, it's just going through ZookeeperHandler, so it has the potential to leak to other handlers i suppose [23:04] negronjl: interesting though.. these essentially just contact the master and "help" it.. 
[23:04] SpamapS: I guess I can provide some dummy interface if I have to but, you are correct...they just help the master do it's thing [23:04] jimbaker: the log file that I'm concerned about is the formula log, which seems to be related to debug-log [23:05] negronjl: well it may be that this is an exception worth making. [23:05] negronjl: can the master do anything useful without the slaves? [23:05] SpamapS: not really [23:06] negronjl: so maybe flip them.. master requires: hadoop-slave [23:06] SpamapS, correct as i recall debug-log is effectively collating what goes into agent logs like the formula log, through the standard log handler mechanism in python [23:06] and provides .. whatever it is that other services consume from it. [23:06] negronjl: the reason I wrote that "everything must provide one thing" is because conceptually (while maybe not pragmatically), its true. [23:07] SpamapS: let me see if I understand it. Let me play with it for a bit [23:07] SpamapS, i don't have enough log-fu to know if it is possible to ensure that some things are never written to a specific handler [23:07] negronjl: before you go too far down that rabbit hole.. [23:07] SpamapS: yeah? [23:08] negronjl: This seems like just the beginning. How do other services utilize the master? [23:08] SpamapS, but we could make it such that by default it is not written to such formula logs, that should be doable [23:08] SpamapS: they upload files to it (jar files and such) for hadoop to process [23:09] negronjl: also one other mistake your formulas have, that is not always obvious, is that sometimes your 'relation-get' won't return anything, because the other side won't have done its 'relation-set' just yet... you have to test that the values are actually set. [23:09] SpamapS, anyone who could overwrite this presumably can sub in arbitrary code in anyway [23:09] SpamapS, so maybe that will address your concern? 
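The point about `relation-get` coming back empty can be sketched roughly as follows. This is an illustration in Python, not code from any formula: `read_setting` is a hypothetical stand-in for whatever wraps `relation-get`, the setting names `hostname` and `port` are made up, and it assumes the hook is invoked again once the other side has set its values.

```python
from typing import Callable, Dict, Iterable, Optional


def collect_settings(
    read_setting: Callable[[str], str],
    required: Iterable[str],
) -> Optional[Dict[str, str]]:
    """Return all required settings, or None if any is still empty.

    read_setting is a hypothetical wrapper around relation-get; it is
    assumed to return "" for a value the remote side has not set yet.
    """
    values = {}
    for name in required:
        value = read_setting(name).strip()
        if not value:
            # The other side has not done its relation-set yet.
            return None
        values[name] = value
    return values


def changed_hook(read_setting: Callable[[str], str]) -> int:
    """Exit status for a -changed hook; 0 covers both 'done' and 'not ready yet'."""
    settings = collect_settings(read_setting, ["hostname", "port"])
    if settings is None:
        return 0  # Not an error: wait for the next invocation.
    # Real configuration work would go here (omitted in this sketch).
    return 0


# A fake relation where only one of the two values has been set so far.
partial = {"hostname": "master.example.internal"}
print(collect_settings(lambda key: partial.get(key, ""), ["hostname", "port"]))
```

With only `hostname` present, the helper returns `None` and the hook exits cleanly instead of configuring the service with a blank port.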
[23:10] SpamapS: should I trust that ensemble will re-run the script when the relation-set part gets done? [23:10] jimbaker: I'd rather just never see values logged by ensemble. Its far simpler for formula authors to choose when they do that, if somebody needs more observability, they should use debug-hooks or just alter the formula. [23:10] negronjl: yes [23:11] SpamapS, well at some point there's going to be a utility written for this because it's rather useful in my experience [23:11] negronjl: if you look, a lot of the -changed formulas do relation-get, and if all the values aren't set, just exit 0 [23:11] SpamapS: ok. I can do that [23:11] jimbaker: useful and secure are often at odds. :) [23:11] given the ease of access to ZK, even if it's simply third party [23:12] SpamapS, agreed on how security is always getting in the way ;) [23:12] jimbaker: I don't mind that the data is in zookeeper at the time. But I'm considering instances where people may exchange something like tokens or encryption keys and then delete them, but they might be useful for decoding sniffed traffic later. [23:13] jimbaker: http://dev2ops.org/storage/WallOfConfusion.png [23:13] SpamapS, yeah, that's a good one for scenarios like this [23:13] negronjl: I *love* that the heavy lifting is all done in debconf [23:15] SpamapS: we have iamfuzz to thank for that :) [23:15] jimbaker: Maybe if the agents didn't log this *by default* (they seem to log at DEBUG level right now) I'd be more inclined to support it. [23:16] negronjl: so why aren't these in the Ubuntu archive? 
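The worry about agents logging relation values at DEBUG by default can be illustrated with Python's standard `logging` module. The sketch below is not Ensemble's actual logging setup: the logger name `formula`, the two in-memory streams, and the sample messages are all invented for the example. One handler keeps everything, the other is set to INFO and so never receives the DEBUG record that carries the settings.

```python
import io
import logging

debug_stream = io.StringIO()    # stands in for an opt-in debug view
default_stream = io.StringIO()  # stands in for the log kept by default

log = logging.getLogger("formula")  # hypothetical logger name
log.setLevel(logging.DEBUG)         # the logger itself lets everything through
log.propagate = False               # keep the example isolated from the root logger

debug_handler = logging.StreamHandler(debug_stream)
debug_handler.setLevel(logging.DEBUG)

default_handler = logging.StreamHandler(default_stream)
default_handler.setLevel(logging.INFO)  # DEBUG records are dropped by this handler

log.addHandler(debug_handler)
log.addHandler(default_handler)

log.debug("relation settings changed: token=%s", "s3cret")
log.info("relation changed")

print(default_stream.getvalue())
```

The filtering happens per handler, so the sensitive record can still reach a handler that explicitly asks for DEBUG while staying out of the default one.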
[23:16] SpamapS, for stuff like that it's just a matter of choosing appropriate levels and handlers [23:16] SpamapS: they currently depend on [sun|oracle]-java [23:16] SpamAps: currently working with cloudera to fully support openjdk [23:16] negronjl: multiverse would allow for that [23:17] SpamapS, so we could simply have the policy that debug-log captures debug, but the default level is INFO or higher, something like that [23:17] SpamapS: I think it's going into partner but, I'm not sure. [23:17] jimbaker: Right. the problem is that i'm fairly certain I have no way of changing the debug log because of the way unit agents are started. ;) [23:18] negronjl: partner has the added benefit of being turned on by default. :) [23:18] SpamapS: true that :) [23:18] negronjl: ok, so people "upload" stuff to these as jars. Is there a standard way to do that? [23:19] SpamapS, i need to run, ttyl [23:19] like, WebDAV, scp, ftp? [23:19] SpamapS: not really...you have to create a directory then, change permissions, then change user, then ( and only then ) run your "job" [23:19] SpamapS: normally I have done it using scp [23:19] SpamapS: and ssh [23:19] negronjl: Ok, because thats what the master should end up "providing" [23:19] looks like there's a website too [23:21] negronjl: so provides website: interface: http .. and then just set the hostname / port. [23:24] SpamapS: The master has two websites [23:24] SpamapS: one on port 50030 and another one ( for a different purpose but just as important ) on port 50070 [23:25] SpamapS: so, so far your suggestion is for the slave to provide hadoop-master and for the master to provide website: interface http ? [23:26] SpamapS: If so, can I provide both pages (50030 and 5070)? [23:30] negronjl: you don't have to call it 'website' [23:30] negronjl: you can do 'website-foo' and 'website-bar' [23:31] negronjl: what are the two ports' purposes? [23:31] negronjl: the slave should provide hadoop-slave .. 
the master should require hadoop-slave, and provide those two websites. [23:31] SpamapS: If the protocol is the same, both interfaces should be named the same way [23:32] niemeyer: same interface, different relation name [23:32] SpamapS: Right [23:32] SpamapS: The interface should be "website", right? [23:32] SpamapS: That's what we agreed yesterday, at least [23:33] http://paste.ubuntu.com/628196/ [23:33] the interface is just http [23:33] Or did I forget something? [23:33] SpamapS: Yesterday we agreed to use 'website' as the interface [23:33] Oh [23:34] for what exactly? [23:34] SpamapS: For an interface which had only "url" as a relation setting [23:35] Oh, I wasn't part of that discussion. Interesting. [23:36] do we want to separate out DFS-type services from mapreduce-type services? [23:36] or interpret "website" as "webservice" [23:36] Makes a lot of sense tho. I like the idea of specifying the actual protocol though. Some url handlers don't handle FTP... [23:37] SpamapS: You actually were part of it [23:37] SpamapS: We just didn't understand each other [23:38] SpamapS: I think "http" is too much detail about what the service provides [23:38] SpamapS: Most clients that support "http" will also support "https" [23:38] Yeah that I recall [23:38] I didn't remember that we had settled on URL, but I do like it. [23:38] SpamapS: So "website" feels like a good name for a user-oriented view that can be both http or https in a "url" setting [23:39] SpamapS: That reminds me, we need to put these in the wiki [23:39] I'd almost say that interface should actually be called "url" .. the website name is just a standard convention for relation name I've been using for web apps. [23:39] SpamapS: https://ensemble.ubuntu.com/Interface/ [23:39] ? [23:39] SpamapS: That may be too much [23:39] SpamapS: "url" could be "mongodb://..." 
[23:40] SpamapS: To make sense it needs some additional sense to make auto-resolving work [23:40] yeah website at least implies "web" [23:40] SpamapS: "website" provides the semantic meaning, without binding to the specific protocol [23:41] For the interface docs.. I've been thinking about it too. I was trying to think if there's a way we can express it with a testing framework that could actually verify if something that says it "provides" "website" does. [23:42] SpamapS: That's pretty interesting.. I think we can do something about that [23:42] niemeyer: one reason I've been doing http specifically is that haproxy and IPVS don't care about urls.. they care about host and port only. [23:43] SpamapS: Well.. they do care about whether it's http or not, IIRC [23:43] SpamapS: haproxy, at least [23:43] but I suppose I can just parse that out relatively easily. [23:43] SpamapS: Agreed [23:44] url, and 'check_url' would be a good optional thing to be able to set.. so that load balancers know the specific url to hit for health checking. [23:45] SpamapS: Definitely.. the interface page in the wiki could document optional settings as well [23:46] niemeyer: I'd like to have the interface docs in revision control.. not sure if the wiki's history is enough. [23:47] so maybe .rst for the interfaces [23:47] SpamapS: Hmm [23:48] SpamapS: I thought about the wiki to more easily allow the community to contribute/debate [23:52] Yeah.. I can see both sides. [23:53] I think as long as we point to one as "the canonical source of documentation for that interface", it will work. [23:53] Just feels like .rst would be more authoritative. [23:53] at the expense of community members needing to jump through more hoops to document their interfaces. [23:56] SpamapS: Sounds good to me [23:57] SpamapS: We should go with whatever you feel most comfortable with [23:57] SpamapS: and we can change, of course [23:57] SpamapS: But this is an area that will need your attention for sure
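For a load balancer that only cares about host and port, pulling them out of a `url` relation setting is straightforward with the standard library. This is a rough sketch and not an agreed interface definition: the function name, the default-port table, and the example URLs are assumptions made for illustration.

```python
from urllib.parse import urlsplit

# Assumed defaults for the two schemes a "website" url might carry.
DEFAULT_PORTS = {"http": 80, "https": 443}


def host_and_port(url):
    """Split a website url into (hostname, port), rejecting other schemes."""
    parts = urlsplit(url)
    if parts.scheme not in DEFAULT_PORTS:
        raise ValueError("not a website url: %r" % url)
    if not parts.hostname:
        raise ValueError("url has no host: %r" % url)
    port = parts.port if parts.port is not None else DEFAULT_PORTS[parts.scheme]
    return parts.hostname, port


print(host_and_port("http://master.example.internal:50070/"))
```

A value such as `mongodb://...` raises `ValueError` here, which matches the point that a bare "url" says nothing about the protocol on its own.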