/srv/irclogs.ubuntu.com/2012/03/15/#juju.txt

jcastroSpamapS, next time you do want to do a review though, ping me, I can at least pick up the easy ones to prescreen for ya.00:03
SpamapSjcastro: I need to get the python charm helpers into charm-tools actually..  that's the current priority00:10
jcastroI mean an opportunistic "whenever"00:12
hazmatadam_g, if you have the provisioning agent log that would be helpful to diagnose.. is that against maas or orchestra?00:19
hazmatooh.00:19
hazmatbaremetal that is00:19
hazmatthat is odd, its not even showing the unit00:20
adam_ghazmat: yeah, i watched the logs and there was nothing odd, let me go see if i can grep out that deployment00:24
adam_gits since been working00:24
adam_gthis is an orchestra provider00:25
adam_ghttp://paste.ubuntu.com/884140/00:28
hazmatadam_g, with no units like that, it would appear the units were destroyed via juju remove-unit00:28
hazmatadam_g, that's a fragment of the log00:29
adam_ghazmat: on ec2, ive seen juju get trigger happy and start taking out nodes that i've manually added to the security group. is it capable of doing similar things with the orchestra provider?00:30
adam_ghazmat: how much context would you like? the log is big00:30
hazmatadam_g, yes.. it owns the security group on ec2, and will treat things it doesn't know about on ec2 as runaways and clean them up.. that behavior is also present on orchestra00:31
hazmatadam_g, but again something would have to have removed the rabbitmq unit00:31
hazmatie.. juju remove-unit00:32
hazmatand even then juju wouldn't kill the machine.. because it knows about it00:32
hazmatand if the machine were dead out of band, the unit would still show00:32
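The cleanup rule hazmat describes above can be sketched as a set difference. This is only an illustration and not juju's actual code: `find_runaways` and the instance ids are hypothetical names made up for the example. Anything the provider reports in juju's group that juju has no machine record for counts as a runaway, while a machine juju knows about is left alone.

```python
def find_runaways(instances_in_group, known_machines):
    """Return instance ids that sit in juju's group but that juju has no record of."""
    return sorted(set(instances_in_group) - set(known_machines))


# Hypothetical ids: "i-manual" was added to the security group by hand.
in_group = ["i-aaa", "i-bbb", "i-manual"]
known = ["i-aaa", "i-bbb"]

# Only the hand-added instance is a cleanup candidate.
runaways = find_runaways(in_group, known)
print(runaways)
```

Under this model a machine that juju itself launched is never a candidate, which matches the point that a missing unit has to come from an explicit command rather than from the cleanup pass.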
hazmatadam_g, i'll take as much context as you have00:32
adam_gright00:33
adam_gsure one sec00:33
adam_ghttp://paste.ubuntu.com/884146/00:33
adam_g^ that is from the teardown of the previous deployment through till the deployment following the failure00:34
adam_ghttp://paste.ubuntu.com/884147/ <- thats the whole thing00:34
SpamapShazmat: does zk have transactions of any kind, or could it be a transient thing caused by a timeout of some kind between client and zk?00:35
hazmatSpamapS, it has atomic operations we use, it has a limited tx in 3.4.. as for the cause of this issue, i haven't seen anything in the logs that shows me its a bug00:38
hazmatversus just acting on executed command00:38
hazmatadam_g, how are you tearing down the env?00:38
hazmathmm.. it would be nice to get a dump of00:39
hazmatzk00:39
hazmatas is i see the unit was destroyed explicitly, and the machine to which it was assigned was removed as well00:40
hazmata service with no units, looks like the original status output00:40
adam_ghazmat: i keep the bootstrap node in place, and do something like: destroy all services, terminate all machines but the bootstrap, usually sleeping for some seconds between terminate-machine to allow power unit to catch up with requests00:40
adam_ghazmat: 'as i see the unit was destroy explicitly'... which unit? the rabbitmq that is missing its machine?00:41
SpamapSahh so remove-unit won't clean up an empty service00:41
hazmatadam_g, its missing any  units00:42
hazmatSpamapS, yes00:42
SpamapSadam_g: does add-unit resolve things?00:42
* hazmat tries to come up with a remote dump zk script00:42
hazmatyeah.. that would verify00:43
adam_gSpamapS: i can try next time i hit this...00:43
hazmatadam_g, do you have this teardown automated?00:44
hazmatadam_g, you should try the charmrunner tools00:44
hazmathmm00:44
hazmatactually i guess the snapshot/restore assumes a local provider00:44
hazmateasy to fix though00:45
adam_ghazmat: yea, teardown is automated. i'd definitely like to combine efforts and standardize on whatever tools you guys are using at some point00:47
adam_gFWIW, i'd never seen this issue until recently though, last 1.5 week or so00:47
hazmatadam_g, pls keep that env alive for a few minutes more if not already dead00:50
adam_ghazmat: still in place00:50
hazmatadam_g, i'm almost done with a remote dump zk script00:50
hazmatadam_g, the tools are a bit split.. i've got a few useful ones in charmrunner (charm test thingy), and there are some in jujujitsu00:54
hazmatSpamapS, btw. nice name00:54
hazmatadam_g, here's the script http://paste.ubuntu.com/884162/00:54
hazmatyou can just python dumpzk.py -f filen.zip -e env_name00:54
SpamapShazmat: name?00:55
hazmatSpamapS, the jujujitsu name00:55
SpamapSOh, hah, yeah, I love it. :)00:55
SpamapSI do hope others like the idea and want to dump more things into it.00:56
adam_ghazmat: people.canonical.com/~agandelman/zk.zip  this is from the current deployment in the same environment. the failed unit in that pastebin is gone by now. ill hang onto that script and dump it next time i run into the issue00:59
hazmatadam_g, cool00:59
=== Guest18667 is now known as jrgifford
hazmatadam_g, till then afaics from looking at status code, the rabbitmq unit was removed explicitly with juju remove-unit, and then the machine removed with juju terminate-machine01:04
hazmatadam_g, but that seems odd, since i assume you're just using destroy-service and terminate-machine for cleanup01:06
adam_ghazmat: thats strange. nowhere in any of the automation we use is remove-unit called01:06
adam_gright01:06
hazmatanyways.. if you can run that script if it happens again that would be helpful01:06
adam_gfor sure01:06
_mup_Bug #955677 was filed: provisioning agent crashes when deploying to a maas node <juju:New> < https://launchpad.net/bugs/955677 >03:34
=== almaisan-away is now known as al-maisan
=== al-maisan is now known as almaisan-away
=== almaisan-away is now known as al-maisan
=== Leseb_ is now known as Leseb
=== tobin is now known as Guest24410
_mup_Bug #955576 was filed: 'local:' services not started on reboot <juju:New> <juju (Ubuntu):Confirmed> < https://launchpad.net/bugs/955576 >10:14
=== hspencer is now known as hspencer[afk]
=== asavu_ is now known as asavu
=== TheMue_ is now known as TheMue
=== al-maisan is now known as almaisan-away
=== medberry is now known as med_
=== fjlacoste is now known as flacoste
=== elmo_ is now known as elmo
=== almaisan-away is now known as al-maisan
=== Guest24410 is now known as otbin
=== otbin is now known as tobin
=== tobin is now known as Guest10799
_mup_juju/local-survive-restart r477 committed by kapil.thangavelu@canonical.com16:11
_mup_upstartify local provider zk16:11
jamespage\o/16:31
=== al-maisan is now known as almaisan-away
SpamapShazmat: my hero! :)16:45
SpamapSlxc and the local provider have gotten much better of late16:45
hazmatSpamapS, its mostly unchanged outside of the upstartification of some bits16:49
hazmatSpamapS, there's still some love needed for the whole failure scenario around lxc-wait16:50
SpamapShazmat: yeah thats being looked at upstream... apparently you can only have one lxc-wait running at a time, and that is the krux of the problem16:51
SpamapScrux even .. :-P16:51
hazmatSpamapS, well.. we're not properly passing it a bit mask around multiple states, we're just waiting for it to get to started, and on error it never does. but yeah.. the ability to ask it multiple times is also nice16:51
hazmater. concurrently16:51
=== lifeless_ is now known as lifeless
SpamapShazmat: apparently it listens for a signal from lxc-start on a private socket so only one lxc-wait can be listening at one time16:53
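The bit-mask point above can be sketched roughly as follows: wait on several states at once, so that a container which fails to start still ends the wait. The helper names are hypothetical and this is not the local provider's code. It assumes that `lxc-wait` accepts states joined with `|` for `-s` and a `-t` timeout in seconds, which may not hold for every lxc version.

```python
import subprocess


def lxc_wait_command(name, states=("RUNNING", "STOPPED", "ABORTING"), timeout=60):
    """Build an lxc-wait command line that returns on any of the given states."""
    state_mask = "|".join(states)  # e.g. "RUNNING|STOPPED|ABORTING"
    return ["lxc-wait", "-n", name, "-s", state_mask, "-t", str(timeout)]


def wait_for_container(name):
    """Run lxc-wait; needs lxc installed, so it is not called in this sketch."""
    return subprocess.call(lxc_wait_command(name)) == 0


# "unit-0" is a made-up container name.
print(" ".join(lxc_wait_command("unit-0")))
```

The caller would still need to check which state was actually reached after the wait returns, since the mask only says that one of them was.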
_mup_Bug #956183 was filed: Support suspending environment <juju:New> < https://launchpad.net/bugs/956183 >16:53
SpamapShazmat: I'm pretty sure that master-customize also seems to not error on failure of any of its commands.16:53
=== medberry is now known as Guest35857
adam_ghazmat: around?17:22
hazmatadam_g, yes17:25
hazmatheadless chicken17:25
adam_gsame here heh17:28
SpamapSmaybe try a tourniquet to stop the bleeding?17:29
adam_ghazmat: so there seem to be some issues ATM w/ juju + essex, which i think are security group related.  i was going to see if you had a script/doc around that mimics the boto calls juju runs in the ec2 provider. i was having trouble recreating using euca2ools. i can extract it all myself if you're bogged down, but figured id check first17:30
hazmatun momento17:33
hazmatadam_g, http://paste.ubuntu.com/885124/17:35
hazmatthose are all the calls, but re security groups, there is one for the environment, and then one per machine17:36
SpamapSadam_g: one thing.. juju uses txaws, not boto17:36
hazmatthe environment has a rule to allow for internal group access17:36
hazmatand then the ones per machine are manipulated to allow for external access as the services with units on a given machine are exposed17:36
hazmatthe environment group is also used to help identify which machine in the provider juju has responsibility for, ie as a form of tagging17:38
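The group layout described above can be modelled in a few lines. This is a sketch for recreating the rules by hand, not the ec2 provider's code: the `SecurityGroup` class, the helper functions, and the `juju-<env>` / `juju-<env>-<machine id>` naming are assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class SecurityGroup:
    name: str
    rules: set = field(default_factory=set)


def environment_group(env_name):
    """One group per environment, with a rule that references the group itself."""
    group = SecurityGroup(f"juju-{env_name}")
    # Self-referential rule: members of the group may reach each other.
    group.rules.add(("group", group.name))
    return group


def machine_group(env_name, machine_id):
    """One group per machine; it starts with no rules."""
    return SecurityGroup(f"juju-{env_name}-{machine_id}")


def expose_port(group, port, proto="tcp"):
    """Open a port to the outside when a service on that machine is exposed."""
    group.rules.add((proto, port, "0.0.0.0/0"))


env = environment_group("sample")
m1 = machine_group("sample", 1)
expose_port(m1, 80)
print(env)
print(m1)
```

The self-referential rule in the environment group is the kind of group-to-group reference that the discussion suspects is misbehaving on nova.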
adam_ghazmat: thanks, ill check those. id like to be able to recreate the same security groups + rules manually on ec2 and nova. i think theres something screwy going on with rules that reference other groups17:43
adam_ggreat idea>> iptables-save for ec2 security groups17:45
hazmatadam_g, yeah.. it was a bit wonky last cycle as well for self-referential security group rules, ie. the metadata looked suspect, i think it worked well because effectively the enforcement wasn't in place.17:47
=== Guest35857 is now known as med12345
=== med12345 is now known as med___
jcastroSpamapS, I've got two incoming charms that need a round 2 review18:08
jcastroand m_3 is chilling at some ruby conference18:08
jcastrobut these will be easy. :)18:08
SpamapScool18:08
jamespagejcastro, I can prob pickup some review later tomorrow if that would help?18:09
jcastrosubway IRC and Alice IRC.18:09
jcastrojamespage, actually what would help is you monitoring the incoming queue on occasion18:09
jcastrolet me get you a link18:09
jamespagejcastro, sure18:09
jamespagemaybe we should try doing something pilot'ish like we do for Ubuntu dev?18:10
jcastroyeah18:10
jcastrofor now though:18:10
jcastrohttps://bugs.launchpad.net/charms/+bugs?field.tag=new-charm18:10
jcastroany of the new ones18:10
jamespagelike saltmaster or gearman?18:11
jcastroand Fix Committed18:11
jcastrosaltmaster is incomplete, updated18:11
jcastrofix committed is when the person was incomplete then wants another review18:11
jamespagerightoh18:13
jamespageand New is up for first review?18:13
jcastroright18:13
_mup_juju/refactor-machine-agent r461 committed by jim.baker@canonical.com18:21
_mup_Merged trunk & resolved conflict18:21
=== med___ is now known as med__
=== med__ is now known as med_
=== marcoceppi_ is now known as marcoceppi
=== dvestal is now known as dvestal|away
_mup_juju/relation-reference-spec r6 committed by jim.baker@canonical.com18:53
_mup_Initial commit18:53
_mup_juju/relation-hook-commands-spec r6 committed by jim.baker@canonical.com19:34
_mup_Initial commit19:35
_mup_juju/relation-info-command-spec r6 committed by jim.baker@canonical.com19:42
_mup_Initial commit19:42
_mup_Bug #956352 was filed: Enable relation hook commands to work with arbitrary relations. <juju:In Progress by jimbaker> < https://launchpad.net/bugs/956352 >19:42
_mup_juju/juju-status-changes-spec r6 committed by jim.baker@canonical.com19:45
_mup_Initial commit19:45
_mup_Bug #956357 was filed: Fix `juju status` bug when working with multiple relations for a service. <juju:In Progress by jimbaker> < https://launchpad.net/bugs/956357 >19:47
_mup_Bug #956372 was filed: Add `relation-info` to list relation ids associated with a service <juju:New> < https://launchpad.net/bugs/956372 >19:52
_mup_Bug #956377 was filed: Enable unambiguous reference to relations by using a relation id <juju:In Progress by jimbaker> < https://launchpad.net/bugs/956377 >19:56
jcastroSpamapS, this might be more of an m_3 question but21:15
jcastroif I want to see a big list of what charms are currently failing tests and that I should be looking to fix I go to .... ?21:16
SpamapShm, why does yaml.dump have to make such ugly yaml?21:17
SpamapSjcastro: charmtests.markmims.com is what I've been looking at21:17
SpamapSLooks dead tho21:18
jcastrobummer21:19
SpamapSjcastro: Its a single charm, so you can also just deploy it.. ;)21:20
SpamapSahh.. default_flow_style=False helps21:21
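A small illustration of the difference, assuming PyYAML is installed. The `data` dict is made up for the example. `default_flow_style=None` is passed explicitly for the first dump because the library default differs between PyYAML versions.

```python
import yaml

data = {"ports": [80, 443]}

# None lets PyYAML use inline "flow" style for collections of scalars.
mixed = yaml.dump(data, default_flow_style=None)

# False forces block style everywhere, one item per line.
block = yaml.dump(data, default_flow_style=False)

print(mixed)
print(block)
```

Block style is usually the easier one to read and hand-edit in something like an environments file.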
SpamapSjcastro: how much would you love a juju-jitsu subcommand called 'setup-environment' that did Q&A to fill in the blanks?21:22
jcastroI would have a party21:22
SpamapSjcastro: polishing it off now21:22
jcastrohey is this in the PPA yet?21:22
SpamapSno21:23
SpamapSstill pretty raw.. so.. bzr branch and play..21:23
jcastrooh dude21:23
jcastroyou put the gource thing in here21:23
SpamapSjcastro: yes!21:24
SpamapSjcastro: just run it.. you get a gourcer on your default environment. :)21:24
m_3jcastro SpamapS charmtests back up21:51
m_3hit by the overly-strict type checking across the whole repo21:52
SpamapSm_3: did you pull the latest changes? I fixed most of them over the last week.21:52
m_3SpamapS: I did... essentially `charm list | grep lp:charms`21:53
SpamapSm_3: thats part of why I added the new --fix stuff to 'charm update'21:54
m_3I thought that was for existing local repos.. this wipes and clean branches21:55
m_3win 1721:56
SpamapSjcastro: there's a little surprise waiting for you on your blog ;)23:53
m_3hazmat: lots of progress!  charmtests.markmims.com23:58
m_3looks like most of them are completing the graph runs without hanging23:58
hazmatm_3, nice23:59
