/srv/irclogs.ubuntu.com/2012/06/15/#launchpad-yellow.txt

bacbenji: i think the problem with our zope.testing fork tests is that now, for subunit, we are writing directly to __stdout__, so the testrunner actually running the tests cannot capture the output.  it's all going directly to the screen and the tests fail11:38
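A minimal sketch of the capture problem bac describes, assuming a runner that captures output by replacing sys.stdout (the helper names are hypothetical, and this is not the zope.testing fork's actual code). Anything written to sys.__stdout__ bypasses the replacement and goes straight to the terminal:

```python
import contextlib
import io
import sys


def write_via_sys_stdout(text):
    # Looks up sys.stdout at call time, so a redirect is honoured.
    sys.stdout.write(text)


def write_via_dunder_stdout(text):
    # sys.__stdout__ is the stream saved at interpreter startup;
    # replacing sys.stdout does not affect it.
    sys.__stdout__.write(text)


# Stand-in for an outer test runner that wants to capture output.
captured = io.StringIO()
with contextlib.redirect_stdout(captured):
    write_via_sys_stdout("seen by the outer runner\n")
    write_via_dunder_stdout("goes straight to the screen\n")

print(repr(captured.getvalue()))
```

Only the first write ends up in the capture buffer, which is consistent with the outer runner seeing no output and the tests failing.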
* gmb -> lunch11:48
benjibac: I'm over here now. :)11:59
gary_posterbac benji frankban gmb, as a reminder, we will have the daily & weekly call in 2 hours and 4 minutes12:06
benjigary_poster: thanks, I had forgotten about the time move12:07
gary_posterwelcome12:07
benjifrankban: I had some problems with lpsetup yesterday afternoon. Once I have my EC2 instance up and configured correctly will you have a few minutes to help me?12:09
frankbanbenji: sure12:09
benjithanks12:09
benjifrankban: is it best to use lpsetup from a checkout or from the PPA?12:31
gary_posterI need to restart.  back soon hopefully12:32
frankbanbenji: same revision, so no real difference. in general, PPA is better12:32
benjifrankban: the hangout is open when you're ready: https://plus.google.com/hangouts/_/e5a32b64b2b26fda880db39e37125e9f6733ae7512:55
frankbanbenji: joining12:56
gary_posterfrankban, thank you for great lpsetup analysis.  I (optimistically?) made cards for them on the board.  This might help us in the discussion; also, I only made blocks where I thought they were absolutely necessary, which I think is a bit more flexible than your three steps, so we can talk about that on the call also.  Thank you!13:04
frankbangary_poster: cool! thank you13:05
gary_posterbac benji frankban gmb, call in 18 (early warning since it is an unusual time for us)13:52
gmbk13:52
benjiForewarned is forearmed13:52
frankbanbenji: could you please paste the output of `locale` in your ec2 instance? gary_poster: could you do the same on the slave if you are running parallel tests?13:53
gary_posterfrankban, so in the host?13:53
benjifrankban: sure, one sec13:53
frankbangary_poster: yes13:53
frankbangary_poster: ah, no, in the lxc13:53
benjifrankban: http://paste.ubuntu.com/1042422/13:54
frankban(so, between two runs, no rush)13:54
benjifrankban: note, that is from the host13:54
gary_posterfrankban, oh ok.  this is the host, cause I already did it. http://pastebin.ubuntu.com/1042421/ .  I'll get it from a container as soon as I can13:54
frankbanbenji: sorry, I need it in the lxc: ssh `sudo lp-lxc-ip -i eth0 -n lptests` locale13:57
benjifrankban: ok, I'll do it after lpsetup finishes; I don't want to spook it13:58
benjifrankban: so, lpsetup has stopped, I don't see an error message but it also didn't have a parade about how everything went fine; will you look at termbeamer and see if it looks good to you?14:08
frankbanbenji: lpsetup finished, and it seems without error, but I believe that if you start the container, ssh into it and run make schema in devel we will see an error14:08
gary_posterbac benji frankban gmb https://plus.google.com/hangouts/_/9fc6f87349a45167b19d7b96789e769a23e20c1c?authuser=1&hl=en-USin 214:08
gary_posterurg14:08
gary_posterhttps://plus.google.com/hangouts/_/9fc6f87349a45167b19d7b96789e769a23e20c1c?authuser=1&hl=en-US14:08
gary_posterand...how odd...my camera is lit as if it is being used, but I am not visible...14:09
* benji tries14:09
frankbanbenji: because I've seen that launchpad-database-dependencies has found postgres 8.4... that's the weird behavior I encountered locally, and it seems to be related to LC_ALL settings.14:10
benji:(14:10
benjifrankban: here is the locale output from inside the container http://paste.ubuntu.com/1042454/14:12
bacthe "madness" quip has been superseded by "have you tried turning it off and on again"14:19
frankbanbenji: so make schema fails: could you please try to run:16:08
frankban$ LC_ALL=C sudo pg_createcluster 9.1 main --start --encoding UNICODE16:08
frankbanthen: utilities/launchpad-database-setup ubuntu16:09
frankbanand finally: `make schema` again16:09
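The three steps frankban lists can be sketched as a small script. This is only an illustration under stated assumptions: the command lines are copied from the chat, the function names are hypothetical, and LC_ALL=C is applied only to the first command because that is the only place it appears above. The commands are meant to be run from the Launchpad tree inside the container.

```python
import os
import subprocess

# Commands quoted in the conversation, in order.
STEPS = [
    ["sudo", "pg_createcluster", "9.1", "main",
     "--start", "--encoding", "UNICODE"],
    ["utilities/launchpad-database-setup", "ubuntu"],
    ["make", "schema"],
]


def locale_env(base=None):
    # Copy the environment and force the C locale (hypothetical helper).
    env = dict(os.environ if base is None else base)
    env["LC_ALL"] = "C"
    return env


def rebuild_cluster(run=subprocess.check_call):
    # Only the first command was prefixed with LC_ALL=C in the chat.
    run(STEPS[0], env=locale_env())
    for cmd in STEPS[1:]:
        run(cmd)


def dry_run(cmd, **kwargs):
    # Print instead of executing, so the sketch is safe to try anywhere.
    prefix = "LC_ALL=C " if "env" in kwargs else ""
    print(prefix + " ".join(cmd))


rebuild_cluster(run=dry_run)
```

Calling rebuild_cluster() with no argument would execute the commands for real; the dry run only prints them.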
benjifrankban: I'll try after lunch.  I'll email you the results if you're not around.16:15
frankbancool, thanks beji16:17
frankbanhem, thanks benji16:17
benjifrankban: (if you're still around) that appears to have worked17:12
frankbanthanks benji, on Monday I will try to fix lpsetup to use LC_ALL . have a good weekend!17:15
gary_posterbye17:15
gary_posterhey benji, I'm watching a slave while tests are running17:16
gary_posterit is very interesting17:16
benjifrankban: cool; enjoy your weekend17:16
benjihow so?17:16
gary_posterso, first CPU is 77% idle (this is on a 24-container instance on a 32 thread/16 core machine)17:16
gary_posterthere is 0 wait time17:16
gary_posterfor io17:16
gary_posteraccording to vmstat17:17
gary_postermemory has 33638M free all the time17:17
gary_posterI mean, around there17:17
gary_posterentropy varies widely but never seems to get beneath the 1000s17:17
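For reference, a small sketch of how the entropy figure could be sampled on Linux. It assumes the kernel exposes the value at /proc/sys/kernel/random/entropy_avail; the 1000 floor is taken from gary_poster's remark, and the function names are hypothetical.

```python
from pathlib import Path

# Linux-specific location of the kernel's available-entropy estimate.
ENTROPY_PATH = Path("/proc/sys/kernel/random/entropy_avail")


def read_entropy(path=ENTROPY_PATH):
    # The file holds a single integer followed by a newline.
    return int(Path(path).read_text().strip())


def is_low(value, floor=1000):
    # "Low" here just means below the floor mentioned in the chat.
    return value < floor


if ENTROPY_PATH.exists():
    current = read_entropy()
    print(current, "low" if is_low(current) else "ok")
```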
benjigary_poster: what is the % utilization for io?  (iostat -x 10) would be a good way to see it; ignoring the first output which is supposed to be "recent history" but I don't have much faith in17:18
gary_posterthere are 8.49 writes/sec and a wrqm/s of 29.64 which is the only thing that looks suspicious so far17:19
gary_postertrying that17:19
gary_posterI was doing a watch rather than a -x 10...17:20
gary_posterbut benji it never goes over 3.84% and gets as low as 0.4%17:21
benjiwow; that's quite good17:21
gary_posternow cpu is up to 33% idle17:21
gary_posterwell it ought to be!  remember, as far as we know, we are writing and reading to memoru17:22
gary_postermemory17:22
benjiyep17:22
gary_posterI'm not entirely sure what we are writing tbh, unless it is just the testr recordings17:22
benjiis there ever a non-trivial %steal?17:22
gary_posterit's always 017:22
gary_poster0.0017:22
gary_posterNow back up to 50.82% idle17:23
gary_posterI don't see what the hang up is, unless it's something like reading and writing memory or something crazy like that17:24
benjioh, wait... that will be 0 on the host; thinko on my part17:24
benjiso did one of the containers take a long time to start in this scenario?17:24
gary_posterbenji, no more than 3 minutes, but will look one sec17:26
benjiit would be interesting to run an "iostat -x 10" while they start to see if there is much resource contention, then run a pidstat on any stragglers to see why they are being slow17:28
gary_posterbenji, first one was ready @ 16:57:42, last one reported for duty @ approx 17:00:0517:28
gary_posterThat can only explain up to 3 minutes though17:29
gary_posterof 10-ish17:29
benjiright, but the non-loadbalancing could explain the rest17:29
benjiI would be interested in seeing if, say, a 14 container run on the 16 core machine did or did not have any stragglers17:30
gary_posterThe last worker to run was worker-7, which started work at 17:00:55 (so I was wrong about the last one reporting)...17:31
gary_posterI mean, that was the last one to stop; and it may have been the last one to start17:32
gary_posterworker-10 was penultimate to finish, and started @ 17:00:56; worker-17 started @ 17:00:53 and was antepenultimate to finish...17:36
gary_posterok, doing this systematically.17:37
gary_posterworker-0: 17:00:57 - 17:25:2817:37
gary_posterworker-1: 16:57:48 - 17:25:3917:38
gary_posterworker-2: 17:00:56 - 17:24:3517:39
gary_posterworker-3: 16:57:51 - 17:21:3217:39
gary_posterworker-4: 17:00:56 - 17:26:3017:40
gary_posterworker-5: 17:00:57 - 17:25:4517:42
gary_poster(Note that this was a particularly fast run, at 32 mins, 45 secs; and this was a round-robin-assigned version17:43
gary_poster)17:43
gary_posterwait, something is wrong...17:43
gary_posterthat one lost a worker17:44
gary_posterSo 24 is too much17:45
gary_posterDuring building, hard drive is getting up to 57.44 %util17:46
gary_posterok now starting tests...17:47
gary_posterwell, listing tests...17:47
gary_posterlow %util17:47
gary_poster23% idle cpu17:48
gary_posterplenty of free memory17:48
gary_posterentropy still never below 100017:48
gary_posterbenji ^^ what else could it be?17:49
* benji reads the backlog.17:49
gary_posterbenji you only need to go back 10 or so17:50
gary_posterno starting tests17:50
gary_poster"now startin tests..."17:50
benjigary_poster: what is the load?17:51
gary_posterbenji, 18.99, 11.93, 7.7417:53
gary_posternot sure if we should regard 16 or 32 as the expected top load17:53
benjihow many containers did you run?  24 again?17:53
gary_posterbenji, yes17:53
gary_posterand 23.2% idle17:54
gary_posterat highest17:54
gary_posterusually a lot more17:54
gary_posterwell, often a lot more17:54
benjiinteresting, so over the last five minutes there have been on average 19 (rounding up) processes that were runable; it seems significant to me that the number is so much less than 2417:55
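A short sketch for reading the three load figures quoted above. On Linux the triple reported by uptime and /proc/loadavg is conventionally the 1-, 5-, and 15-minute average, so under that assumption 18.99 would describe the most recent minute and 11.93 the last five. The function name and the container count constant are illustrative.

```python
import os


def parse_load(text):
    # Accepts "18.99, 11.93, 7.74" or the space-separated form.
    one, five, fifteen = (float(p) for p in text.replace(",", " ").split())
    return {"1min": one, "5min": five, "15min": fifteen}


CONTAINERS = 24  # number of containers in this run, per the chat

quoted = parse_load("18.99, 11.93, 7.74")
shortfall = CONTAINERS - quoted["1min"]
print(f"1-minute load is {shortfall:.2f} below the container count")

# On Unix-like systems the live values are available from the stdlib.
if hasattr(os, "getloadavg"):
    print(os.getloadavg())
```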
gary_posterwell, there was a 14.35 % idle, but still17:55
gary_posterbenji, fwiw, that was relatively near the beginning of a test17:56
gary_posterright now our 1 minute time is 21.something17:56
benjigood, that's much closer to what I would expect17:57
gary_poster22.53 now even17:57
gary_poster23.53! 23.81! whee!17:57
benji:)17:57
benjigiven that each test is non-parallel (even if something is running in another process, like a DB query, the other process is waiting on the result), I would expect that perfect utilization would mean that load == # containers17:58
gary_poster%idle still in the 30s17:58
gary_posterI guess that makes sense-ish17:58
gary_postergiven 24 test runs on 32 "cores"17:58
gary_posterso that is 75% usage17:59
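The arithmetic behind the 75% figure, as a sketch. It assumes benji's model that each container keeps exactly one hardware thread busy and that the machine exposes 32 threads, as stated earlier in the log; the function name is hypothetical.

```python
def expected_busy_fraction(runnable, hardware_threads):
    # With one busy thread per runnable task, utilization is capped at 100%.
    return min(runnable / hardware_threads, 1.0)


busy = expected_busy_fraction(24, 32)
idle = 1.0 - busy
print(f"expected busy: {busy:.0%}, expected idle: {idle:.0%}")
```

Under that model the floor for %idle is about 25%, so readings in the 30s suggest the containers are not all runnable at once.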
gary_posterbut where's the slow-down?17:59
benji"the slow-down" as in the variation in start up times?17:59
gary_posteryeah18:00
benjigary_poster: want to hang out?18:00
gary_posterwe are well past that now of course, in this test run18:00
gary_postersure18:00
gary_posterbenji https://plus.google.com/hangouts/_/4ff44d4a05bb2e19cabdfe6963ad1235f6d40fc6?authuser=1&hl=en-US18:01
gary_posterI am still blue mushroom head18:01
benjigary_poster: my browser seems to be mid-crash, one secon18:02
benjid18:02
gary_posterok18:02
benjihmm, maybe it is my OS18:03
gary_posteruh oh18:03
benjirebooting18:03
gary_posterbenji http://pastebin.ubuntu.com/1042973/20:48
bacbenji, those ebs instructions look great.  i'm confused, though, as to what makes the ebs volume.  is there something magic about /dev/xvdf?21:00
benjibac: I made it by hand.  If we end up productizing it we will use the AWS API to make them and the snapshot and associate the volumes with the instances, etc.21:01
bacbenji, so that part is not shown in your instructions?21:01
benjibac: I think I mentioned it but didn't give step-by-step instructions21:02
bacyeah, you said create them and make sure they are in the same zone21:02
bacok, i was just going to be really confused if there wasn't more to it21:03
