[00:39] <jimbaker> hazmat, this looks provocative. the last line in the formula.log for wordpress/0 - 2011-04-21 23:30:08,060: twisted@ERROR: TypeError: 'NoneType' object is not callable
[00:40] <hazmat> jimbaker, hmm
[00:40] <jimbaker> so maybe the relation workflows are running, but they are going into a bad state?
[00:40] <hazmat> jimbaker, possibly
[00:41] <hazmat> surprised we don't have anything nicer in the traceback. that's unfortunate
[00:41] <hazmat> jimbaker, the status looks right
[00:41] <jimbaker> hazmat, no, it's missing the relation status info
[00:42] <hazmat> jimbaker, one option is to have a look at the zkshell.. ensemble ssh 0 && /usr/share/zookeeper/bin/zkCli.sh
[00:42] <hazmat> jimbaker, hmmm..
[00:42] <hazmat> jimbaker, i'll have a look in the morning
[00:43] <jimbaker> compare it against this output: http://pastebin.ubuntu.com/597192/
[00:43] <jimbaker> (from an earlier run)
[00:44] <jimbaker> hazmat, sounds good
[00:54] <jimbaker> hazmat, for later consumption - shouldn't this be more than two? zk: localhost:2181(CONNECTED) 26] ls /units --- [unit-0000000000, unit-0000000001]
[01:00] <jimbaker> hmmm... maybe not that part after all - i was looking at zk_workflow_identity
[01:00] <jimbaker> it looks like it uses the same path for both ServiceUnitState and UnitRelationState
[01:01] <jimbaker> but those are not the same paths
[01:01] <jimbaker> based on looking at those specific classes
[01:01] <jimbaker> either i'm confused or ensemble is confused ;)
[01:46] <hazmat> jimbaker, they are at the same path
[01:46] <hazmat> all the workflows for a unit-agent are managed on a single node
[02:16] <jimbaker> hazmat, thanks for the clarification
[02:16] <jimbaker> this would have caused more issues if it were not the case, i guess
[02:16] <hazmat> jimbaker np.. sorry i had to run out
[02:17] <hazmat> jimbaker, yeah.. it seems strange that the workflows on the unit are initialized and showing as running, but the units weren't up or in an error state
[02:17] <hazmat> i can't think of any reason why that would be the case.
[02:18] <jimbaker> anyway, just curious it's happening in us-west now - one more thing to try is in eu-west (if that's the other region completely set up)
[02:18] <hazmat> i'll take a look at it tomorrow
[02:18] <hazmat> jimbaker, it is
[02:18] <jimbaker> hazmat, have a good night, ttyl
[06:24] <_mup_> ensemble/refactor-to-yamlstate r197 committed by bcsaller@gmail.com
[06:24] <_mup_> set not taking any random data, but insisting on a dict in tests (for YAMLState)
[08:52] <kim0> Morning everyone
[13:06] <kim0> anyone around o/
[15:02] <hazmat> kim0, g'morning 
[15:02] <hazmat> or hello is probably more appropriate
[15:30] <kim0> hazmat: hey o/
[15:33] <kim0> team on vacation huh :)
[15:59] <_mup_> ensemble/merged-alt-region-logging r210 committed by kapil.thangavelu@canonical.com
[15:59] <_mup_> merge ensemble-log-level
[16:00] <_mup_> ensemble/merged-alt-region-logging r211 committed by kapil.thangavelu@canonical.com
[16:00] <_mup_> merge ensemble-log-crash
[16:07] <jimbaker> kim0, hi
[16:08] <jimbaker> kim0, we have been trying out the alternative region branch. it's not worked for me with deploying our example formulas. you want to give it a try too?
[16:08] <jimbaker> it did work w/ kapil however
[16:16] <kim0> ec2 has recovered it seems .. I'm writing a small user level tutorial now, but if you need someone else to test that branch, sure
[16:17] <jimbaker> kim0, i think it would be useful for sure
[16:17] <jimbaker> it's also a good way to play w/ the environments.yaml file
[16:18] <kim0> jimbaker: cool, any instructions on using that branch ? I'll try it in a few hours though, not right now
[16:20] <kim0> let me know how to use it, thanks
[16:27] <_mup_> Bug #769030 was filed: Enable one control bucket to be used for multiple regions <Ensemble:New> < https://launchpad.net/bugs/769030 >
[16:29] <jimbaker> you just need to configure two things in your environments.yaml file: region - us-east-1, us-west-1, eu-west-1; and ensemble-branch - https://code.launchpad.net/~hazmat/ensemble/ensemble-alternate-regions 
[16:30] <kim0> got it
[16:30] <hazmat> kim0, i also have one with the logging stuff merged in.. lp:~hazmat/ensemble/merged-alt-region-logging
[16:30] <jimbaker> hazmat, sounds good, then we don't use an earlier formula set
[16:30] <jimbaker> have to use
[16:30] <hazmat> jimbaker, it didn't work entirely
[16:31] <hazmat> jimbaker, i'm trying out trunk in east atm to verify the delta
[16:31] <jimbaker> hazmat, sounds good, less mystifying then
[16:31] <jimbaker> we prefer our failures to be consistent ;)
[16:31] <hazmat> jimbaker, what's odd is that the unit relation state in zk (in west) is good, but status isn't reporting it, and wordpress isn't running
[16:32] <hazmat> also ensemble-log seems to return 0 even when it fails/errors.
[16:33] <jimbaker> hazmat, i just got this weird output in a log
[16:33] <jimbaker> 2011-04-22 09:20:52,519 unit:mysql/0: twisted ERROR: TypeError: 'Port' object is not callable
[16:34] <hazmat> jimbaker, can you paste the full log to pastebin
[16:34] <jimbaker> hazmat, will do
[16:34] <hazmat> thanks
[16:35] <jimbaker> http://pastebin.ubuntu.com/597493/
[16:35] <hazmat> jimbaker, is this on the open-port/close-port branches?
[16:36] <jimbaker> hazmat, no, the alternative region branch, running with trunk r200 formulas
[16:36] <jimbaker> i'm going to try the new alt region branch w/ logging merge in now
[16:36] <hazmat> jimbaker, hmm.. there are no port classes in ensemble, only in twisted.
[16:36] <jimbaker> hazmat, indeed
[16:37] <jimbaker> are we sure we got good versions of python, twisted, etc built in these amis?
[16:37] <jimbaker> it's as if we got some version skew going on here
[16:38] <_mup_> Bug #769035 was filed: Need a top level decorator on all independent callbacks to do nice error printing <Ensemble:New> < https://launchpad.net/bugs/769035 >
[16:39] <hazmat> jimbaker, just stock natty
[16:41] <_mup_> Bug #769036 was filed: Ensemble hook cli api needs to do correct exit codes  <Ensemble:New> < https://launchpad.net/bugs/769036 >
[16:44] <hazmat> jimbaker, so trunk works for me
[16:45] <jimbaker> hazmat, trunk formulas, or using trunk with us-east-1?
[16:45] <hazmat> trunk and us-east-1
[16:45] <hazmat> trying alt-region with us-east-1
[16:46] <hazmat> there's nothing in the branch remotely related to units or hooks..
[16:46] <jimbaker> hazmat, exactly, that's what is so strange here. some other unexpected dependency seemingly
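The bare `TypeError: ... is not callable` lines above, and bug #769035, suggest wrapping independent callbacks so failures are logged with a full traceback. The following is only a minimal sketch of that idea in plain Python; `log_errors` and the logger name are hypothetical and are not taken from the Ensemble code base, and a Twisted-based version would also need to handle Deferred errbacks.

```python
import functools
import logging

# Hypothetical logger name, chosen for illustration only.
log = logging.getLogger("callbacks")


def log_errors(func):
    """Wrap a callback so any exception is logged with its traceback."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except Exception:
            # log.exception records the message and the active traceback.
            log.exception("unhandled error in callback %s", func.__name__)
            # Re-raise so callers still see the failure.
            raise

    return wrapper
```

Decorating a callback with `@log_errors` leaves its behavior unchanged and only adds the traceback to the log before the exception propagates.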
[17:37] <hazmat> jimbaker, the merged-alt-region-logging branch seems to work okay
[17:37] <hazmat> in us-east-1
[17:43] <jimbaker> hazmat, trying it out now
[17:44] <hazmat> jimbaker, cool, i'm trying it out in a different region and then i do see a problem
[17:49] <jimbaker> hazmat, doesn't work for me this try. speaking of round trip overhead, now doing "watch ensemble status" ;)
[17:50] <jimbaker> i had proposed building that in for ensemble status with actual watches, but repeatedly polling like this is the poor man's approach for sure
[17:51] <jimbaker> and the interesting thing is seeing the relation service state disappear... crazy
[17:57] <bcsaller>  jimbaker: the plan with status is to have a mode where it blocks on a topo watch and then reissues the status in a loop
[17:57] <bcsaller> unlike watch it knows when things change
[17:57] <jimbaker> bcsaller, yes, intelligent watches :)
[17:57] <jimbaker> bcsaller, good to know it is in the works
[17:58] <jimbaker> bcsaller, doing "watch ensemble status" is still useful right now
[17:58] <bcsaller> good
[17:59] <jimbaker> i think once we have the relation settings added to ensemble status, that's going to be pretty awesome
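A rough sketch of the "poor man's" polling approach discussed above, refined so output is reissued only when the status text changes. This is still plain polling, not the blocking topology watch bcsaller describes; `watch_changes`, `get_status` and the other parameter names are hypothetical.

```python
import time
from typing import Callable, Iterator, Optional


def watch_changes(
    get_status: Callable[[], str],
    interval: float = 2.0,
    max_polls: Optional[int] = None,
    sleep: Callable[[float], None] = time.sleep,
) -> Iterator[str]:
    """Poll get_status() and yield its result only when it differs from the last one."""
    previous = None
    polls = 0
    while max_polls is None or polls < max_polls:
        current = get_status()
        polls += 1
        if current != previous:
            yield current
            previous = current
        # Skip the final sleep when the poll budget is used up.
        if max_polls is None or polls < max_polls:
            sleep(interval)
```

In practice `get_status` could shell out to `ensemble status` and return its output; a watch-based version would block on a change notification instead of sleeping.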
[18:42] <hazmat> aha
[18:42] <hazmat> unit agents are dead
[18:43] <jimbaker> hazmat, that would make sense
[18:43] <hazmat> jimbaker, the fact there is nothing in the log is rather frightening
[18:44] <jimbaker> hazmat, indeed. i'm just about to testing us-east-1 w/ trunk at r200, which is the last good one i observed
[18:44] <jimbaker> try testing
[18:45] <hazmat> jimbaker, i was able to get trunk latest from merge-alt-region-logging working on us-east-1
[18:46] <jimbaker> hazmat, i was unable to get that - i was just getting "it works" plus empty relation service states
[18:46] <jimbaker> probably because of dead unit agents
[18:46] <hazmat> jimbaker, no.. i actually got populated relation states with dead unit agents
[18:46] <jimbaker> hazmat, crazy
[18:46] <hazmat> jimbaker, the variations i'm using are trunk, ensemble-alternate-region, merge-alt-region-logging
[18:47] <hazmat> all with formulas that are the equivalent of the trunk versions

[18:47] <hazmat> i've seen trunk and merge-alt-region-logging working on us-east-1
[18:47] <jimbaker> hazmat, i have tried all of those, both with us-east-1 and us-west-1
[18:47] <jimbaker> nothing is working end-to-end for me today
[18:48] <jimbaker> everything starts off fine... then it just mysteriously fails
[18:49] <jimbaker> hazmat, maybe i should rebuild my buckets, don't know if that's an issue based on bug 769030
[18:49] <_mup_> Bug #769030: Enable one control bucket to be used for multiple regions <Ensemble:New> < https://launchpad.net/bugs/769030 >
[18:49] <hazmat> jimbaker, yeah.. i switch my buckets when changing regions atm
[18:50] <hazmat> they should recover fine
[18:50] <hazmat> ie. detect dead instance stale file, and create new one
[18:50] <jimbaker> although given how the control bucket works, i wouldn't expect it to impact
[18:50] <hazmat> which is what they normally do, or we'd be cleaning it all the time
[19:00] <jimbaker> bcsaller, hazmat - standup?
[19:00] <hazmat> jimbaker, sounds good
[19:00] <bcsaller> I have little to report, but sure
[19:01] <jimbaker> then it will go fast :)
[19:09] <_mup_> Bug #769120 was filed: Ensemble status shouldn't report dead units based solely on state, but also on presence. <Ensemble:New> < https://launchpad.net/bugs/769120 >
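The idea behind bug #769120 can be sketched as follows: a recorded workflow state is trusted only while the unit agent is actually present. This is an illustration under assumptions, not Ensemble's status code; the function name and the "down" and "pending" labels are hypothetical.

```python
from typing import Optional


def reported_unit_state(workflow_state: Optional[str], agent_present: bool) -> str:
    """Combine the stored workflow state with agent presence for status output."""
    if not agent_present:
        # A dead agent cannot be "running", whatever the stored state says.
        return "down"
    if workflow_state is None:
        # Agent is alive but has not recorded a state yet.
        return "pending"
    return workflow_state
```

With this rule, the situation seen above (populated states, dead unit agents) would be reported as down rather than as running.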
[19:19] <hazmat> http://dtrace.org/blogs/bmc/2010/08/30/dtrace-node-js-and-the-robinson-projection/
[19:20] <hazmat> http://wiki.joyent.com/display/node/Using+Cloud+Analytics
[19:20] <hazmat> http://dtrace.org/blogs/dap/2011/03/01/welcome-to-cloud-analytics/
[19:30] <hazmat> allergies miserable.
[19:39] <jimbaker> hazmat, you really should try hot yoga. i found it really helps clear sinuses and it would seem prevent allergic symptoms too
[20:15] <hazmat> jimbaker, sadly hot yoga isn't my thing
[20:16] <hazmat> does anyone understand apport handling of core files?
[20:28] <hazmat> jimbaker, can you give a look at this trivial patch for trunk.. https://pastebin.canonical.com/46611/
[20:28] <hazmat> not sure if argparse version changed, but i currently have these two tests failing for me on trunk
[20:32] <hazmat> bcsaller, ^ if you have a moment and could look at the trivial.. i'm waiting on that before doing some merges.
[20:33] <bcsaller> hazmat: the change to generation happens outside the patch?
[20:33] <hazmat> bcsaller, yeah.. the cause of the error output change is not clear, i just matched the tests to what is currently produced
[20:34] <bcsaller> seems like it should have been caught when the change happened and the tests would have broken.
[20:34] <bcsaller> I'm fine with the change, but want to understand how it happeded
[20:34] <bcsaller> happened
[20:37] <hazmat> bcsaller, yeah.. i'm bisecting the last 5revs now to double check
[20:40] <hazmat> bcsaller, just went back a month in history, still getting the errors, i'd have to guess it's an argparse change and rev increment
[20:42] <bcsaller> maybe, yeah
[20:43] <bcsaller> thanks for checking
[20:43] <hazmat> bcsaller, seems to be a change between pypi version of argparse and the builtin 2.7 version
[20:43] <bcsaller> ahh, natty on 2.7
[20:43] <bcsaller> makes sense now
[20:45] <hazmat> not sure if that's it.. also happens with python 2.6 using the distro argparse
[20:45] <hazmat> but it does work with the pypi argparse 1.2.1
[20:46] <hazmat> whereas the distro version (for 2.6) is 1.1-1
[20:47] <hazmat> no.. actually it didn't work with 1.2.1
[20:47] <hazmat> i had used a patched trunk to test that one
[20:48] <_mup_> ensemble/trunk r205 committed by kapil.thangavelu@canonical.com
[20:48] <_mup_> argparse error output seems to have changed, match tests to match current output [trivial][r=bcsaller]
[20:54] <_mup_> ensemble/trunk r206 committed by kapil.thangavelu@canonical.com
[20:54] <_mup_> merge ensemble-log-level [a=niemeyer][r=niemeyer][f=767364]
[20:54] <_mup_> This fixes a problem with the ensemble-log hook CLI API,
[20:54] <_mup_> not correctly taking a -lLOG_LEVEL option.
[20:57] <_mup_> ensemble/trunk r207 committed by kapil.thangavelu@canonical.com
[20:57] <_mup_> merge ensemble-log-crash [a=niemeyer][r=kapil][f=767391]
[20:57] <_mup_> This fixes a traceback when attempting to use the ensemble-log
[20:57] <_mup_> hook CLI API from hooks.
[21:36] <hazmat> jimbaker, the principia trunk seems to work with trunk, but the trunk formulas don't on natty.
[21:44] <hazmat> still seeing unit agents die though
[21:48] <hazmat> hmm
[23:18] <hazmat> jimbaker, txzookeeper unit tests are segfaulting with default natty it appears.
[23:19] <hazmat> jimbaker, bcsaller do you run with the package zk or a local build?
[23:19] <bcsaller> local
[23:20] <hazmat> yeah.. we've been getting away with not having our own packages
[23:20] <hazmat> even though there were known issues with the lucid one, it still worked for our uses.
[23:20] <hazmat> doesn't appear to be the case with natty, we're going to need to package trunk (3.4) or backport perhaps
[23:31] <jimbaker> hazmat, curiously i'm reinstalling stuff now
[23:32] <jimbaker> is everyone running on python 2.7 at this point?
[23:32] <hazmat> jimbaker, i am
[23:32] <hazmat> jimbaker, i test with 2.6 occasionally as well
[23:32] <jimbaker> i'm now getting some test errors on trunk, just building a new virtualenv to try 2.7 out
[23:36] <hazmat> jimbaker, what does ./test need in a ZOOKEEPER_PATH ? just the zkServer.sh  script?
[23:36] <jimbaker> hazmat, iirc, it doesn't use zkServer.sh
[23:37] <hazmat> hmm
[23:37] <hazmat> jimbaker, looks like i just need a directory with the jar
[23:38] <jimbaker> hazmat, sounds about right
[23:38] <jimbaker> it looks for both dev and prod installs
[23:38] <jimbaker> which are laid out differently
[23:39] <hazmat> hmm.. yeah.. it doesn't work just pointing to a directory of jars.. which is what the deb does for install into /usr/share/java
[23:39] <jimbaker> makes sense
[23:40] <jimbaker> fortunately easy enough to change in ensemble.tests.common.ManagedZooKeeper
[23:40] <jimbaker> basically a variant of zkServer.sh
[23:40] <jimbaker> was that adjusted in the deb package?
[23:43] <hazmat> jimbaker, just had to fix the test/common get class path to not hardcode src release stuff
[23:44] <hazmat> jimbaker, debian uses /usr/share/java for java libs
[23:44] <jimbaker> which was the whole point of that classpath property, so that's cool
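The classpath fix discussed above can be sketched as a lookup that accepts either a flat directory of jars (as the Debian package provides in /usr/share/java) or a tree with jars in subdirectories. This is not the code in ensemble.tests.common; `zookeeper_classpath` is a hypothetical name, and the `lib` and `build/lib` subdirectories are assumed layouts chosen for illustration.

```python
import glob
import os


def zookeeper_classpath(zk_path):
    """Build a Java classpath string from the jars found under zk_path."""
    patterns = [
        os.path.join(zk_path, "*.jar"),                  # flat jar directory
        os.path.join(zk_path, "lib", "*.jar"),           # assumed release layout
        os.path.join(zk_path, "build", "lib", "*.jar"),  # assumed dev layout
    ]
    jars = []
    for pattern in patterns:
        # Sort so the resulting classpath is stable between runs.
        jars.extend(sorted(glob.glob(pattern)))
    return os.pathsep.join(jars)
```

An empty result would mean no jars were found, which a test runner could treat as a misconfigured ZOOKEEPER_PATH rather than silently starting Java with no classpath.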