/srv/irclogs.ubuntu.com/2012/04/03/#launchpad-yellow.txt

bachi gmb11:15
gmbHi bac.11:23
bacgmb: hi there.  have you any experience getting an 8 instance ec2 up and running?  i've configured per gary's email.  do you then just do the normal thing, deploying a master and slave or do you have to deploy multiple slaves?11:24
gmbbac: I'm just deploying the one slave. I haven't actually tried to make it build anything yet, though.11:26
gmbBut the one slave, plus testr, plus multiple cores, should be all we need.11:26
bacok, cool.  i'm trying to test new steps for the slave i've added to the lpbuildbot master.cfg11:26
bacoriginally i had a slave start error...so it didn't get to the part i was interested in11:27
gary_posterhey gmb.  it would have been fine if you had started the zope.testing thing.  Better than fine, great. ;-)  Did you?  I'll arrange the cards if so11:59
gary_posterI'm assuming you completed the automation script?11:59
gary_posterrestarting post upgrade...12:00
bacwhat is the trick for resetting the apt cache?  from ec2 i'm getting hash errors.12:08
bacbenji: do you know ^^ ?12:09
benjilet me check my notes12:09
bacthanks12:10
benjibac: apt-get clean12:11
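
A minimal sketch of the cache reset benji suggests, assuming the hash errors come from stale package data on the instance. `apt-get clean` is from the log; following it with `apt-get update` is an assumption (bac runs that command later in this session). The `reset_apt_cache` function and the `APT_RUNNER` variable are hypothetical names. By default the commands are only printed; set `APT_RUNNER=sudo` to run them.

```shell
# Hypothetical helper: clear apt's cache, then refresh the package indexes.
reset_apt_cache() {
    # Drop cached package files (the command suggested in the log).
    ${APT_RUNNER:-echo} apt-get clean
    # Re-download the package indexes (assumed follow-up step).
    ${APT_RUNNER:-echo} apt-get update
}

# Dry run: prints the two commands instead of executing them.
reset_apt_cache
```
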
gary_posterhullo12:11
gary_posterthrough a series of misadventures I'm on the mac side12:12
gary_posterhttps://talkgadget.google.com/hangouts/_/extras/canonical.com/goldenhorde12:12
gary_posterI'm waiting here should anyone care to join me ^^^12:12
gary_posterbac benji gmb12:12
gmbgary_poster: Not yet, but I will (on both counts). I've spent my morning trying to get a working desktop environment, but I've given up and switched to OSX+SSH for the afternoon.12:12
gary_posterok gmb12:13
gary_posterbenji, I'm collecting, not filing yet, but I'll pastebin what I find here as I go12:38
gary_posterhttp://pastebin.ubuntu.com/912936/ is first12:38
gary_posterhttp://pastebin.ubuntu.com/912940/ is second, looks essentially the same (first line is mistake, should not have been copied)12:40
* benji looks.12:41
gary_posteraaaand that looks familiar... http://pastebin.ubuntu.com/912942/12:41
gary_postersame http://pastebin.ubuntu.com/912944/12:42
benjihmm, it looks like our previous fix for the locale whining of perl might have been good to keep12:42
gary_posteryeah maybe so12:42
gary_postersame http://pastebin.ubuntu.com/912948/12:43
benjimy first tack will be to treat it as a test bug, if the test doesn't care about locales then we should ignore this warning when comparing output12:43
gary_postersame: http://pastebin.ubuntu.com/912950/12:44
* gmb -> late lunch12:44
gary_posterthat would be nice, if possible12:44
gary_posterwhee, same http://pastebin.ubuntu.com/912951/12:45
gary_posterI see several of these sorts of errors but they are not connected clearly with a test (they break subunit formatting).  http://pastebin.ubuntu.com/912955/ wgrant talked about these memcache things on the list recently.  I thought the resolution was that it was caused by a newer version of some package and we were going to try and roll back12:47
gary_posterbzr locale again http://pastebin.ubuntu.com/912957/12:48
gary_posteragain http://pastebin.ubuntu.com/912959/12:49
gary_posterOK, I will announce if this is *not* the same bzr locale bug.  If I don't say anything, assume it is bzr locale12:49
gary_posterhttp://pastebin.ubuntu.com/912961/12:49
benjiI wonder why this is the first time we're seeing this error.12:50
gary_posterhttp://pastebin.ubuntu.com/912962/12:50
gary_posterme too12:50
gary_posterthis is first time on Precise beta 2. could be cause12:50
gary_posterhttp://pastebin.ubuntu.com/912965/12:51
gary_posterhttp://pastebin.ubuntu.com/912967/12:52
gary_posterhttp://pastebin.ubuntu.com/912968/12:52
gary_posterThat's it12:52
gary_posterSo I only saw three sorts of errors12:52
gary_posterbzr locale12:52
gary_posterthe Twisted reactor thing that benji fixed yesterday12:53
gary_posterand the connection errors that wgrant said are happening on the main buildbot12:53
benjicool12:53
benjigary_poster: do you want me to file the bug about the locale thing?12:54
gary_posterbenji, sure thank you13:28
benjik13:28
gary_posterbenji, if you give me the bug number I'm happy to add all the affected tests we found, or at least the pastebins.  Or something14:02
benjigary_poster: 97245614:02
gary_posterack thanks benji14:02
gary_posterbug 97245614:03
_mup_Bug #972456: Tests can fail when bzr emits an unexpected "unsupported locale setting" warning <Launchpad itself:In Progress by benji> < https://launchpad.net/bugs/972456 >14:03
gary_postergmb, fwiw, I was able to get the zope.testing tests to run by doing it in a lucid lxc (with python 2.6.5).  Maybe that's the trick.  I stashed the current failures at this pastebin: http://pastebin.ubuntu.com/913071/  .  I saw four, though I thought frankban said we were down to three.  I'm pretty tempted to dig into the subunit ones, since subunit support is so important to us, and the source of the regression.14:33
gary_posterThe other three (testrunner-edge-cases.txt, testrunner-debugging.txt, testrunner-debugging-layer-setup.test) I'm planning on continuing to ignore14:34
gmbgary_poster: Noted, thanks. I'm not yet in a position to pick up the zope.testing work, so if you want to forge ahead with it and then (if you're not done with it by your EOD, which you may well be) send me a handoff email, I'll be happy to carry the baton tomorrow AM.14:35
gary_postersounds perfect gmb, thx14:35
gmb(Or Precise might stop being a pain before then. Who knows?)14:35
gary_poster:-)14:35
bacgary_poster: juju expose for the master did not work for me.  can you give me details of how you fixed it via the AWS console?15:14
gary_posterbac, sure (and how weird!)15:14
gary_posterbac, I'll do it at same time15:14
gary_postergo to http://aws.amazon.com/15:14
bacthere15:14
gary_posterMy Account/Console -> console15:15
gary_postersign in15:15
gary_posterclick on ec215:15
gary_posterclick on [N] Security Groups on right side15:15
bac(make sure you aren't in region 'singapore'!  )15:15
gary_poster:-)15:15
gary_posterlook for group representing machine15:16
bacok, now i have a ton of groups15:16
gary_posterjuju-[name of the environment you used]-[number of the machine from juju status]15:16
gary_posterSo for instance my environment was big-ec215:16
gary_posterand my machine was 215:17
gary_posterso I clicked on juju-big-ec2-215:17
bac2 for the bb master, no?15:17
gary_posterOn bottom click on "Inbound" tab15:17
gary_posterbac, depends on the order that you started, but yes, that's the way it is for me too15:17
gary_posterjuju status is authoritative15:17
bacok15:18
gary_posterIn Port range on "Inbound" tab type 801015:18
bacnow i already have (to the right) 8010 0.0.0.0/015:18
bacbut no one is answering15:18
gary_posterok, so this is not the problem15:18
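
The security group naming that gary_poster describes above (juju-[environment]-[machine number]) can be sketched as a one-line helper. The function name `juju_security_group` is hypothetical; the naming pattern and the example values `big-ec2` and `2` are taken from the log.

```shell
# Hypothetical helper: build the EC2 security group name for a juju machine.
# $1: environment name; $2: machine number as reported by `juju status`.
juju_security_group() {
    printf 'juju-%s-%s\n' "$1" "$2"
}

# gary_poster's example: environment big-ec2, machine 2.
juju_security_group big-ec2 2
```
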
gary_posterbac, next possibility is that the master is not actually up15:19
bacjuju status shows the master as started15:19
gary_posterjuju ssh 215:19
gary_postergo to /var/lib/buildbot/masters/master and look at twistd.log15:19
gary_posterAs of now, bac, I have never encountered the situation you describe, fwiw15:20
bacuh oh15:20
baci have no /var/lib/buildbot15:20
gary_posterheh15:20
gary_posteryeah, that doesn't sound so good15:20
baci ran the hooks/install and start manually and they seemed happy15:20
gary_posteryou are sure--you are looking as root?15:20
bacyes, no /var/lib/buildbot15:21
gary_posterum15:22
gary_posterbac, I'm afraid I have no idea whatsoever.  How did you deploy the master?15:28
bacjuju deploy --config=/home/bac/juju/oneiric/buildbot-master/examples/lpbuildbot.yaml --repository=~/juju local:buildbot-master15:28
bacthat failed due to the non-existent apt repo15:28
baci fixed /etc/apt/sources.list15:28
bacand then did an 'apt-get update' / 'apt-get upgrade'15:29
bacthen i ran hooks/install and hooks/start15:29
gary_posteroh, manually?15:29
bacyes15:29
bachow should i have done it?15:29
gary_posterbac, I think the right thing to have done would have been to do "juju resolved --retry buildbot-master/0"15:30
gary_posterthere are environment variables that are not around when you run the hooks by hand15:31
gary_posterI described doing this in my second "Starting..." email yesterday15:31
bacgary_poster: i'm a bit confused15:31
baci thought the "resolved" command was done before using 'juju debug-hooks', which i did15:31
bacso are you saying:15:32
bacjuju deploy15:32
bacsee it fail15:32
bacjuju ssh and then fix /etc/apt15:32
bacthen juju resolved -- and all should carry on happily?15:32
gary_poster*if* you do a retry15:33
gary_poster"resolved" alone means "I handled it; don't retry"15:33
gary_posterunless you add --retry15:33
gary_posterthat means "I resolved the problem you encountered; please retry"15:33
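
A rough sketch of the two forms of `juju resolved` that gary_poster contrasts above. The `juju resolved` command, the `--retry` flag, and the unit name `buildbot-master/0` come from the log; the `mark_resolved` function and the `JUJU_RUNNER` variable are hypothetical. By default the command is only printed; set `JUJU_RUNNER=env` to run it against a real environment.

```shell
# Hypothetical helper: tell juju that a failed hook has been dealt with.
# $1: unit name; $2: "retry" to have juju re-run the failed hook.
mark_resolved() {
    if [ "${2:-}" = "retry" ]; then
        # "I resolved the problem you encountered; please retry."
        ${JUJU_RUNNER:-echo} juju resolved --retry "$1"
    else
        # "I handled it; don't retry."
        ${JUJU_RUNNER:-echo} juju resolved "$1"
    fi
}

# Dry run for the failed master unit discussed in the log.
mark_resolved buildbot-master/0 retry
```

With the retry form, juju runs the hook itself, so the hook sees the environment variables that are missing when it is run by hand.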
baci always used --retry, even if i was going to use debug-hooks15:33
bacok, i'll shoot this environment and try again after lunch15:34
* bac argh15:34
gary_posterso...when you said "i fixed /etc/apt/sources.list" that means you did it within debug-hooks?15:34
bacyes15:34
gary_posteroh15:34
bacbad, huh?15:34
gary_posterno15:35
gary_posterthat sounds fine on the face of it, just not what I did15:35
gary_posterI thought you meant that you had done it with juju ssh15:35
bacwell, something i did caused it to remain unhappy15:35
gary_posterright15:35
gary_postermm15:35
gary_posteryou could try deploying another master?15:35
baci don't see the benefit.  i'd rather clean house and try again from fresh15:36
gary_posteryou could try killing this master?  I don't remember if they said there was a "die die die" for a machine, or if the only option is to redo an existing machine15:36
gary_posterbac, the only benefit is that you've already paid the price for the slave setup15:37
gary_posterand it is unrelated to the master15:37
gary_posterbecause you hadn't gotten that far yet15:37
gary_posterso if you got a working master15:37
gary_posterthen you'd be able to connect your existing slave15:37
bacyeah, but i did the same lame-o dance with the slave, so i probably screwed it up15:37
gary_posterbut I'm just brainstorming15:37
gary_posterI don't think it was necessarily all that lame-o15:38
gary_posterdid you run install during the install step and start during the start step?15:38
gary_posterIf so, AFAIK, you did everything the way it was supposed to be done, for some story15:39
bacok, on the slave machine i do have /var/lib/buildbot/slaves/slave -- so it seems to be happier15:39
bacyes, that's what i did15:39
gary_posteryeah15:39
gary_posterIt's not clear to me that you did anything wrong :-/15:40
bacok, i destroyed the master15:40
gary_posterok15:40
baci'll now try to redeploy15:40
gary_postertry again, yeah.  master is fast15:41
bacok, he's up and 'pending'15:42
gary_posterok15:42
bacnow 'installed'15:42
gary_posterhuh15:42
bacoh, it is the same machine, so the apt problem is pre-fixed15:42
gary_posterright15:42
bacand 'started'15:42
bacwhee15:42
gary_posterso is there a buildbot?15:43
bacyes15:43
gary_posteruh, great, I'm so glad we figured out this problem! :-D15:43
baci will now add-relation15:43
gary_postercool15:44
bacand it is available on the web.  glad i didn't blow it all away!15:45
gary_postergreat15:45
bacok, so do i have to manually kick off a build?15:45
gary_posteryeah bac15:45
bacok, so it tried to run my script 'init_testr.sh' but it was not there15:48
bacbut it is there...in /var/lib/buildbot/slaves/slave/lucid-devel/build15:48
gary_posterbac, right mode?15:49
bac-r-xr--r-- 1 buildbot buildbot 395 Apr  3 15:46 init_testr.sh*15:49
gary_posterlooks reasonable15:49
bacso perhaps it thinks it should be somewhere else?15:49
gary_posterI don't think so...lemme look at working example.  Could you also give me url of web interface to master?15:50
bacgary_poster: yeah, the ./ made it work15:58
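
A small sketch of why the `./` prefix presumably mattered, assuming the build step invoked the script by its bare name. A command name without a slash is looked up on `PATH`, which normally does not include the current directory, while `./init_testr.sh` names the file directly. The script body and the temporary directory here are stand-ins, not bac's actual script.

```shell
# Create a throwaway directory holding a stand-in init_testr.sh.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '#!/bin/sh\necho "testr initialised"\n' > init_testr.sh
chmod 544 init_testr.sh   # same mode as in the log's listing: -r-xr--r--

# Bare name: searched for on PATH, so the local file is normally not found.
init_testr.sh 2>/dev/null || echo "bare name: not found on PATH"

# Explicit relative path: runs the file in the current directory.
./init_testr.sh
```
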
gary_postercool bac.15:58
bacso i need to let the build finish and then restart another to ensure i have all of the data in the .testrepository?15:58
gary_posteryes bac16:08
gary_posterbac, it should take no time at all16:08
gary_posterit should be done already, in fact16:08
bacroot@ip-10-82-27-185:/var/lib/buildbot/slaves/slave/lucid-devel/.testrepository# ls16:08
bac0  1  failing  format  next-stream  times.dbm16:08
baci think we have a wiener16:08
gary_posterlooks perfect bac :-)16:08
bacwith that i lunch and bike16:08
gary_postercool!  talk to you in a bit16:09
benjigary_poster: is the kanban update bot running again?  and do we have to do anything to get it to update a card?16:22
gary_posterbenji, I think it is running (80% sure), though it gets confused by bugs that do not have LP tasks (and maybe other situations)16:23
benjigary_poster: cool, thanks16:23
gary_posterwe don't need to do anything to get it to update except wait and have an LP task for the bug (and maybe it must be the only one?)16:23
gary_posterwelcome16:23
benjigary_poster (or bac): any thoughts on a next task?  I think the main lanes are full so helping with one of those cards would seem best.17:35
gary_posterbenji, looking17:35
gary_posterbac's is almost done; he just needs to wrap it up and send it off17:35
gary_posterI'm doing frightening, embarrassing things.  Consider my most recent check-in message for our zope.testing fork:17:36
gary_posterbzr commit -m 'Consider this check-in suspect: I reviewed the test failures in the file and they seemed innocuous, rather than correct.'17:36
gary_posterI would have you join in with me but my internet connection is supposed to die for a bit soon17:37
gary_postermaybe we should do that anyway.  You can make progress while I'm disconnected.  You up for a call benji?17:38
benjigary_poster: I think so.17:38
gary_posterk17:39
gary_posterbenji https://talkgadget.google.com/hangouts/_/extras/canonical.com/goldenhorde17:40
gary_posterbenji left in a huff17:51
gary_posteror a cough17:52
gary_posteror a sneeze17:52
benjigary_poster: my machine crashed17:53
gary_posterbenji, I figured as much.  Come on by when things are normal again17:53
* bac wraps things up17:57
bacgary_poster, benji: could one of you gentlemens please review https://code.launchpad.net/~bac/lpbuildbot/remember-the-testr/+merge/100664 ?18:11
gary_posterbac, I'll look in a sec, sure18:12
gary_posterapproved bac.  thank you18:22
bacthanks18:23
* bac looks for a card18:35
bacgary_poster: did you have your francis call?18:36
gary_posterbac, 4PM18:36
gary_poster"lunch"18:54
bacbenji: are you about to grab a spot in the coding lane?18:56
bacif so, anything i can help with?18:57
benjibac: nope, I'm helping Gary with his card18:57
bacokey doke.  i'll throw a dart at my monitor then18:57
gary_posterbenji, I suspect you are almost done?19:41
benjigary_poster: totally done: lp:~benji/zope.testing/3.9.4-fork/19:42
benjiI don't know if we're MP-ready or not.19:42
gary_posterawesome benji.  Are you making an MP?  Shockingly, I'd be happy to review...oh ok.  What's up?19:42
benjigary_poster: I'll do an MP.  I just wasn't sure if there was more you wanted to do.19:43
gary_posterno, cool.   We fixed the bug, and did some other good stuff on the way.19:43
gary_posterYou'll be happy to know that my MIL's low-end FIOS Internet connection is close to double my maxed-out U-verse connection19:44
gary_posterI need to get her an AppleTV or Roku or something19:44
gary_posterBecause she should be able to take advantage of this easily19:45
gary_posterMaybe I'll just run CAT5 down the interstate from NJ to NC19:46
gary_posterI'm sure that will work out perfectly19:46
bacgary_poster: hey, could i tap off it?19:46
gary_posterheh, sure bac19:47
benjigary_poster: https://code.launchpad.net/~benji/zope.testing/3.9.4-fork/+merge/10067919:47
gary_posterack benji, on it19:47
bacfwiw, i used to have a roku but it looks ugly and ham fisted next to the appletv19:47
gary_posterok19:48
gary_posteryeah I like my appletv19:48
gary_posterbenji, wrong merge target19:48
gary_posterso conflicts and other bad stuff19:48
benjidarn19:48
gary_postermake merge target of lp:~launchpad/zope.testing/3.9.4-fork19:48
gary_posterbenji, ^^19:49
benjialready done19:49
benjigary_poster: https://code.launchpad.net/~benji/zope.testing/3.9.4-fork/+merge/10068119:50
gary_postercool benji, much better thanks :-)19:50
gary_posterbenji, line 8 of diff is my fault.  I regarded it as a hack and I didn't mean to check it in.  I suggest removal, but if you like it for some reason that's fine19:51
benjilooking19:52
benjino that's evil, I'll remove it19:52
benjipushed19:53
gary_posterthanks19:53
gary_posterbenji, funny that there was already a test of the odd behavior of list tests19:54
gary_posterbenji, approved19:54
benjigary_poster: cool, I'll merge it into ~launchpad/zope.testing/trunk, then19:56
benjier, lp:~launchpad/zope.testing/3.9.4-fork, rather19:56
gary_posterbenji, you mean https://code.launchpad.net/~launchpad/zope.testing/3.9.4-fork ?19:56
gary_posteryeah ok19:56
benji:)19:56
gary_poster:-)19:56
benjigary_poster: I just posted my March EC2 expenses19:58
gary_posterhow bad were they benji?19:58
benjiit wasn't nearly as bad as yours: $162.1219:59
gary_posterStill, a new high?19:59
benjigary_poster: by a (base 2) order of magnitude20:02
benjiok, branch merged and pushed20:02
benjiI guess saying "twice as much" would be easier.20:02
bacgary_poster: is the /etc/apt/sources.list problem on ec2 going to be something we have to deal with long term?  do we want a card to fix it?20:03
gary_posterbac, I don't know.  Maybe we should. :-/20:04
baci'll add it.  deleting later is cheap20:04
gary_posterbenji, btw, please make a lp branch for getting the new egg in the tree20:14
benjigary_poster: ah, will do20:15
gary_posterthx20:15
gary_posterbenji + card ;-)20:15
benjik20:15
bacgary_poster: any suggestions on a task to pick?20:27
gary_posterbac on call20:28
benjidarn, my setuplxc-created dependencies directory has HTTP checkouts20:40
gary_posterbac, the most helpful task would be to investigate the memcache ConnectionError failures that wgrant described on the list ("[Launchpad-dev] memcache errors in ec2 and buildbot -- newer python-memcached to blame?").  They are affecting the main buildbot and us as well, so it's a generic "green buildbot" thing.21:21
gary_posteralternatively...21:21
gary_poster"teach buildbot to understand subunit in test results to properly report failure numbers in waterfall" card is interesting21:22
gary_posteralternatively "Fix /etc/apt/sources.list on ec2" from you.  It's not that the debian repo isn't there, it's that it has the wrong hash.  Not sure how that would happen21:24
benjigary_poster: I give up.  I have the new p6 egg in the dependencies repo but I can't for the life of me get a LP build to work to make sure it's used and does the right thing.  I think my cold has infected my brain.  I'll try again tomorrow if someone else hasn't done it.21:26
gary_posterbenji, ok.  how weird!  so you've pushed the change?21:27
gary_posterto the download-cache I mean?21:27
gary_posterthe addition of the p6 egg21:27
benjigary_poster: yep21:31
gary_postercool benji.  feel better.  I will be out tomorrow it looks like.21:32
gary_posterI'll send a note to yellow list21:32
benjiok; enjoy yourself21:32
gary_posterwith suggested tasks21:32
gary_posterthanks21:32
