[12:04] benji, frankban gmb, my older son lost more than a cup of blood today from a mongo nose bleed. (On Sunday a friend of his pushed him off a trampoline and he landed on his nose, which may be related.) I'm not sure what my schedule is yet--the bleed has stopped, but we're trying to get a dr appt for him asap. In any case, whether it's with me or not, the people who are available for a call in a bit over an hour
[12:04] should have a call
[12:06] gary_poster: will do; good luck with the nose bleed
[12:08] thank you
[12:58] frankban, I'm going to file a bug for the testrepository unicode issue and then you can offer your branch as a possible solution. Have you or gmb started a juju instance or shall I?
[12:59] gary_poster, I haven't; I'm having juju bootstrap problems after upgrading.
[12:59] (again)
[12:59] gmb, the one I mentioned yesterday, with the booleans?
[13:00] I ended up removing all charms except the buildbot ones from the local repo
[13:00] gary_poster, No, bootstrap just hangs, at least for ec2.
[13:00] ugh
[13:00] I need to actually poke it with a stick to find out what's going on.
[13:00] s/stick/strace/
[13:00] :-)
[13:00] gmb, I'll see if I dupe
[13:03] gmb, so, you mean, you say "juju bootstrap" and the command never exits? If so, that's not happening for me
[13:04] gary_poster, Yeah, that's what I mean. Hmm. I'll break out strace after a call.
[13:04] s/a/the
[13:04] ok
[13:09] gmb and frankban: the horde awaits: https://talkgadget.google.com/hangouts/extras/canonical.com/goldenhorde
[13:10] gary_poster: you probably aren't here, but you can join too if you are
[13:10] thank you all. I'll be able to attend too. ignore the sounds of baby crying in the background
[13:10] dr appt is 10:45
[13:14] gary_poster, Firefox went bye-bye.
[13:16] In fact it looks like a lot of things have gone a bit sideways...
[13:16] ack gmb :-/
[13:47] Yay, power socket.
[13:54] * gmb -> lunch
[14:11] gary_poster: have you seen this in any of your test runs?
[14:11] **********************************************************************
[14:11] Could not communicate with subprocess
[14:11] **********************************************************************
[14:12] benji I don't think so but will search through tee'd results from yesterday, one sec
[14:12] gary_poster: oops, it looks like the OOM killer got me
[14:12] benji, oh :-/
[14:13] benji, actually, yes
[14:13] benji, twice in my run, not in yours
[14:14] hmm
[14:15] benji, http://pastebin.ubuntu.com/883280/
[14:15] benji, I'm making a card for that
[14:15] benji, could be xvfb issue
[14:17] benji, it would be very interesting to have you run list 1 locally *not* in an lxc
[14:17] to see if that shows up
[14:18] gary_poster: yep
[14:18] gary_poster: I may have an opportunity to do that soonish (I might have just fixed the read-only problem (with a one-line change to a test tearDown))
[14:18] go you, benji :-)
[14:19] :)
[14:19] though the read-only thing is a run list 2 issue
[14:20] gary_poster: right, I meant that once I fix this (list 1) issue, I could do a run of list 2
[14:20] There was something else like that I saw yesterday that I did not put a card in for...looking through results again
[14:21] er, swap 1 and 2 above
[14:22] cool :-)
[14:22] gary_poster: should I handle the subtle differences in leases path between lucid and oneiric/precise in setuplxc? Or can I just assume the lxc is lucid?
[14:24] frankban...it would be nice to handle precise because that will be what we use within a few months. If it is not easy now, put a card in slack time for the precise version
[14:24] Maybe that's the right thing to do either way; I'll leave it up to you
[14:25] gary_poster: ok thanks
[14:25] np
[14:26] gmb, I hain't fergotten ya.
[14:26] (that's in "random silly accent Gary made up" to be clear)
[14:26] Almost ready, hopefully
[14:28] gary_poster, Okay.
I've not finished lunch yet though :)
[14:28] gmb, lol, ok
[14:34] benji, this happened on my testlist 2 run (yours) locally, but did *not* show up in the buildbot failure output from two days ago. Did you see anything like this? http://pastebin.ubuntu.com/883307/
[14:34] * benji looks
[14:34] s/ my testlist 2 run (yours)/ my testlist 2 run (the one you ran first)/
[14:35] gary_poster: I haven't seen that failure.
[14:35] ok, benji, I'll say it was unique to something on my system then. Thanks
[14:51] barry's absence will make it difficult to pick his brain
[14:58] gmb, I'm doing a bzr branch lp:~gmb/charm-tools/add-charm-helpers on the assumption that this is the right thing to do
[14:59] gary_poster, It is; you'll also need to get lp:~gmb/ubuntu/precise/charm-tools/new-packaging for the debian stuff that I'm currently using
[14:59] (Since charm-tools has separate source and packaging branches)
[14:59] ah ok
[15:00] https://code.launchpad.net/~gmb/+recipe/charm-tools-daily has some of the details.
[15:00] ack
[15:01] * gary_poster decides he's still sick enough to be allowed to sit down while staring at build logs
[15:05] The review season begins.
[15:05] * gary_poster doesn't understand why a "successful build" would contain something that failed to build
[15:06] gary_poster, Because the recipe built successfully - creating the source package - but the binary didn't :)
[15:06] gmb, bah :-)
[15:07] :)
[15:22] benji frankban gmb, I added miscellaneous, high priority, dated cards to represent the work each of us needs to do for the reviews. We should look at this on every call to make sure we are making sufficient and timely progress on it.
[15:22] k
[15:22] We now, relatively suddenly, have a *lot* of miscellaneous cards. :-) To get rid of one...
[15:23] gmb, what's up with lp2kanban? Did you get that running, or is Brad still doing it, or...?
[15:23] gary_poster, I got it running; let me check it.
[15:23] thanks
[15:23] the "..."
cards from Francesco ought to have had their titles filled afaik
[15:26] frankban, I'm trying to decide whether I feel qualified reviewing your branch. ;-) Meanwhile though, I did see one thing that I learned this week. In this line...
[15:26] while [ $delay -gt 0 -a ! -s {leases1} -a ! -s {leases2} ]
[15:26] I suggest quoting "$delay"
[15:26] If you do not, in some edge cases the "[" command will get upset and confused, in my experience and per some advice I've read recently.
[15:27] (in particular, the case in which $delay is empty...which will probably never happen here since I assume 0 is not empty, but it is supposedly just good practice to quote substitutions because of this kind of fragility)
[15:27] gary_poster: humm, maybe when $delay is something starting with "-"?
[15:27] yeah maybe that too
[15:28] gary_poster: ok thanks for the hint
[15:29] sure
[15:29] gary_poster, I've run it manually; looks like it's not running on cron for some reason; I'll poke around.
[15:29] thanks gmb
[15:29] Anyway; /me -> afk for a short while
[15:29] (Bus home)
[15:31] What the heck, I'll claim this review frankban :-)
[15:31] :-D
[15:31] frankban, what does this mean: "truncate -c -s0 {leases1}"
[15:32] gary_poster: if the file exists, truncate it at size 0, otherwise, do nothing
[15:32] gotcha
[15:32] back in a few
[16:25] * gary_poster is lunching now
[16:57] gary_poster, Okay, so I'll catch you when you get back, but interesting aside: if I build the package manually using Brad's steps, I get something that actually installs the python files. Not to anywhere useful, but it does _actually_ install them instead of ignoring them.
[16:57] s/you/I/
[16:58] gmb, I'm here, but reviewing and emailing and stuff
[16:58] and, interesting
[16:58] gary_poster, Ah, okay. I'm going to be heading out shortly anyway. Haven't spoken to Barry yet; I'll ping him now and see if he'll be free for a chat later.
[16:59] gmb, cool.
barry doesn't seem to be around today
[16:59] gary_poster, Ah, yes, I just noticed. Darnit.
[16:59] gmb, so what should our plan be?
[17:00] gary_poster, My first choice is to get the python stuff out of charm tools and package that. This other way is proving unrewarding.
[17:00] and in particular, what are our goals for the handover, and what are our goals for when I'm working.
[17:00] gmb, that's fine with me, but Clint is the maintainer of the charm helpers project
[17:00] so I'd like to convince him
[17:01] I'm happy to give him an ultimatum, of sorts:
[17:01] either focus on helping us get his preferred approach to work, or let us do what you propose
[17:02] we've spent (via you primarily and brad secondarily) a *lot* of time on this
[17:02] and we should wrap it up one way or the other
[17:02] I'm happy to speak managerially about this to Clint ;-)
[17:03] also, what is your schedule for the next few hours work-wise, again?
[17:03] gary_poster, Okay. I agree with you. My thought was that having a separate package that works would be a nice convincer :).
[17:03] heh
[17:03] gary_poster, I'll be afk until about 20:30 UTC, and will then finish my day ~1h.
[17:03] gmb, afk starting when? now-ish?
[17:04] But TBH I'll likely be around quite late; Sarah's got reports to write, so she pretty much needs me to shut up and get on with other things.
[17:04] gary_poster, Yes.
[17:04] :-)
[17:04] ok gmb, I'll see if I can corral SpamapS while you are gone
[17:04] Thanks :).
[17:04] :-) welcome. ttyl
[17:04] * gmb -> exeunt to divers alarums
[17:04] heh
[17:11] frankban, bash-as-programming-language frightens me more and more as I learn more and more about it ;-) but what you've done looks very good. There are some constructs that I understand from context but will simply trust you on (case in point: why the initial dollar sign in "delay=$(( $delay - 1 ))"? Why can't it be "delay=( $delay - 1 )"?)
[17:12] So, will approve :-)
[17:13] Done
[17:13] gary_poster: thanks.
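The shell constructs raised in this review (quoting "$delay", the -s file test, "truncate -c -s0", and the "$(( ))" arithmetic) can be tried in isolation. The following is a minimal sketch under stated assumptions, not the actual setuplxc code: the temporary files are hypothetical stand-ins for the {leases1} and {leases2} placeholders, the starting delay of 3 is made up, and the single "[ ... -a ... ]" test from the branch is swapped for separate tests joined with "&&". On the dollar-sign question: "$(( ... ))" is arithmetic expansion, whereas in bash "delay=( $delay - 1 )" would assign a three-element array rather than compute anything.

```shell
#!/bin/sh
# Hypothetical stand-ins for the {leases1} and {leases2} template placeholders.
tmpdir=$(mktemp -d)
leases1="$tmpdir/leases1"
leases2="$tmpdir/leases2"

# Leave one stale lease file behind so truncate has something to empty.
echo "stale lease" > "$leases1"

# -c: do not create files that are missing; -s0: cut existing files to zero bytes.
truncate -c -s0 "$leases1" "$leases2"

delay=3        # assumed starting value, for illustration only
iterations=0

# Quoting "$delay" keeps the "[" command well-formed even if the variable is
# empty or contains whitespace. -s is true only for an existing, non-empty file.
while [ "$delay" -gt 0 ] && [ ! -s "$leases1" ] && [ ! -s "$leases2" ]
do
    # $(( ... )) evaluates the expression and substitutes the numeric result.
    # A real polling loop would sleep between checks; that is left out here
    # so the sketch finishes immediately.
    delay=$(( delay - 1 ))
    iterations=$(( iterations + 1 ))
done

# Record whether truncate -c created the file that was missing.
leases2_created=no
[ -e "$leases2" ] && leases2_created=yes

rm -rf "$tmpdir"
echo "delay=$delay iterations=$iterations leases2_created=$leases2_created"
```

Neither lease file ever gains content in this sketch, so the loop runs until the delay counter reaches zero, and the file that did not exist beforehand is still absent afterwards.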
that block was taken as is from start-ephemeral... so, no idea either, and I am not curious...
[17:13] lol
[17:13] cool
[17:14] I'm somewhat amazed that people still program in this stuff :-)
[17:14] yes, only perl is worse than this...
[17:15] ;-)
[17:15] heh
[17:23] gary_poster: landing, and, I am not a buildout expert, so maybe you could take a quick look at https://code.launchpad.net/~frankban/lpsetup/add-buildout/+merge/97466
[17:27] frankban, will do.
[17:27] ty gary_poster
[17:54] frankban, approved buildbot branch conditionally
[17:54] ./bin/test doesn't work
[17:55] other ideas in my reply
[17:56] gary_poster: we still don't have unit tests in lpsetup: it is actually the next step. Having test, bin/test should work
[17:56] frankban, :-) ok cool
[17:59] gary_poster: buildout was suggested by benji (basically to have a test runner for free I think). I think pip is already supported (setup.py), once the project will be registered to PyPI
[18:00] frankban, ok cool
[18:00] ty gary_poster, EOD, have a nice evening
[18:00] you too frankban, bye
[18:20] ok, my list is down to 36 failures and 126 errors; submitting MP for read-only fix now
[19:16] awesome benji!
[19:16] very large number
[19:16] but not as large as 3000 :-)
[19:16] gary_poster: I'm pursuing making bugs for all of the test isolation failures, but I wonder if that's the right thing to do. Thoughts?
[19:17] benji, that's fine as long as it doesn't take too long.
I've identified four "high" priority cards
[19:17] you and frankban have addressed two
[19:18] the other two on board are, as you'd expect, the next things I'd like to see
[19:18] because they prevent us from getting a true full run of test
[19:18] s
[19:19] I am in a state of kanban conflict: the board is now not over the limit (with me moving my branch into landing) but we can't add anything else without going back over
[19:19] benji, if there is a high card, there is clear visible reason for why we would do so
[19:20] I don't normally encourage this
[19:20] but the high cards, and the reasoning/concern behind them, warrant it in my opinion
[19:20] ok, that sounds reasonable to me; I'll take one of the two remaining high cards
[19:20] cool
[19:31] benji, does test list 2 still hang at the end even with your isolation fix? It would have been nice if... :-)
[19:31] gary_poster: if test list 2 is tmpXctd5i, then no
[19:32] so, you're suggesting that bug 954384 may already be fixed?
[19:32] <_mup_> Bug #954384: test teardown can hang < https://launchpad.net/bugs/954384 >
[19:32] benji, it doesn't hang? yes! that is what I am suggesting! With glee in my heart!
[19:33] * benji watches as gary_poster skips through a meadow and whistles with songbirds.
[19:33] :-)
[19:33] yeah, I've been kind of a ball of tension this week for one reason and another.
[19:33] this relieves a big reason
[19:34] I can say definitively that it didn't hang, but I can't say for sure that it hung before my fix.... hmm, or can I; let me look at something.
[19:34] right, it could still be lxc
[19:35] unless it hung for you before
[19:36] nope, I don't have any evidence that it hung before
[19:37] I can do another test run, maybe late tonight (because it takes forever, 199 minutes last time) to try to get it to hang (running the pre-fix code).
[19:37] all this casts doubt on whether or not I can or should work on 954384 (which I planned on doing)
[19:38] I could try to reproduce the hang with a smaller test subset. That seems like a smart thing to do.
[19:38] I also really need to get a precise lxc or vm up.
[19:41] benji, why don't you do the precise vm/lxc
[19:42] If we don't have the hang anymore then trying to repro doesn't make a lot of sense to me
[19:42] oh
[19:42] right
[19:42] you don't have any evidence that it hung before
[19:42] which means it could still be an lxc thing
[19:43] right, since I haven't provoked it, we can't be sure that me not seeing it is evidence that it is gone
[19:44] but the not having a vm thing is really hurting me too, for example, I can't land the fix for the read-only bug until I have one. I have to depend on the kindness of strangers, hint hint.
[19:45] speaking of, is lpsetup supposed to work? I get TypeError: unsupported operand type(s) for +: 'NoneType' and 'str' when I run it
[19:49] benji, lpsetup is not supposed to work
[19:50] benji, kindness of strangers, lol, ok
[19:50] ok, good, I guess :)
[19:50] thanks!
[19:50] benji, I am setting up my juju buildbot to run your branch, then will land
[19:51] I mean, I will land it once I have started the tests, not after they have run
[19:51] well
[19:51] oh heck, yes
[19:52] lxc-wait is really a drag
[19:54] heh
[20:03] ok... it is apparently both required that we develop on precise (because we can't land otherwise) and impossible to set LP up on precise (email thread started by
[20:04] Deryck
[20:04] bang! bang!
[20:04] * benji shoots the ferral tabs that got in here.
[20:05] * benji is orthographically challenged, but can still spell "orthographically challenged"
[20:38] benji, lxc works :-)
[20:39] gary_poster: I don't follow.
[20:39] benji, which is what I do (precise host, lucid lxc instances)
[20:39] and that appears to work
[20:39] benji, on another note
[20:39] lxc-start-ephemeral is adding to the number of days of my life that I have lost this week from stress :-P
[20:39] heh
[20:40] wait, you develop on lucid? I thought we were supposed to use precise?
[20:40] benji, I edit my code on precise. I run my tests and my dev tools on lucid
[20:41] which is AOK
[20:41] benji, back on my stress-inducing topic...take a glance at http://ec2-50-16-1-238.compute-1.amazonaws.com:8010/builders/lucid_lp/builds/2/steps/shell_8/logs/stdio
[20:41] oh! I needed that bit of info! I'll try to set up a lucid lxc container then.
[20:42] notice the __init__ complaint, #1. Where the heck did that come from?
[20:42] hmm "exit code 3"
[20:42] #2, I try to run the given command on the slave
[20:42] This fails, complaining that xvfb fails
[20:43] #3, I try to run the "start a ephemeral container and then I'll log into it". This works. Then I log in, become buildbot, and run the other part of the command. This succeeds
[20:43] ooh, I didn't see that. That is odd.
[20:43] So...I'm about to try duping the ssh command
[20:45] And all I *really* want to do is verify that frankban's fix from yesterday fixes our problem, and verify that your fix from today fixes the majority of our tests, and might or might not fix the hang problem
[20:45] but instead I keep getting "one step forward two steps back" issues
[20:45] ok, venting over. thanks benji.
;-) now I'll go back to trying to do the ssh
[20:46] good luck :)
[20:46] ;-) thanks
[20:50] benji argh:
[20:50] ssh -n -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -i /var/lib/buildbot/.ssh/launchpad_lxc_id_rsa buildbot@10.0.3.104 -- 'xvfb-run --error-file=/var/tmp/xvfb-errors.log --server-args=-screen 0 1024x768x24 -a /var/lib/buildbot/slaves/slave/lucid-devel/build/bin/test --subunit'
[20:50] notice the lack of quotes that we so carefully added
[20:50] pfft
[20:50] benji, what did we do to fix that again? I forgot
[20:51] I'm trying to remember.
[20:53] gary_poster: I think this was the very long line and regex to remove extra spaces in setuplxc.
[20:54] benji, this is what we are producing now, from the setuplxc script:
[20:54] lxc-start-ephemeral -u buildbot -S /var/lib/buildbot/.ssh/launchpad_lxc_id_rsa -o lptests -- xvfb-run --error-file=/var/tmp/xvfb-errors.log --server-args='-screen 0 1024x768x24' -a /var/lib/buildbot/slaves/slave/lucid-devel/build/bin/test --subunit
[20:54] IOW, it is good
[20:54] afaict
[20:54] I'm going to try random quoting in lxc-start-ephemeral...
[20:57] gary_poster: I don't think that's good. Remember that ssh handles the arguments incorrectly, we need to put everything after -- in double-quotes.
[20:58] benji, well...
[20:59] benji, maybe that's what we did. I don't remember, but we could dig it up. BUT...
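The quoting loss described here can be reproduced without ssh or lxc. The sketch below relies only on how ssh treats a remote command: it joins the command arguments with spaces and hands the resulting string to the remote shell, which splits it into words again. "sh -c" stands in for the remote shell, and the shortened "bin/test" path and the variable names are illustrative assumptions, not taken from setuplxc or lxc-start-ephemeral.

```shell
#!/bin/sh
# Four arguments, one of which contains spaces (grouped by the local shell).
set -- xvfb-run '--server-args=-screen 0 1024x768x24' -a bin/test
local_count=$#

# ssh-style handling: join the arguments with single spaces, losing the grouping.
joined="$*"

# A stand-in "remote" shell re-parses the string and counts the words it sees.
naive_count=$(sh -c "set -- $joined; echo \$#")

# Workaround: send one string that carries its own quotes for the remote shell.
quoted="xvfb-run '--server-args=-screen 0 1024x768x24' -a bin/test"
quoted_count=$(sh -c "set -- $quoted; echo \$#")

echo "local=$local_count naive=$naive_count quoted=$quoted_count"
```

The naive path turns four arguments into six because the --server-args value is split at its spaces, while the string that carries its own inner quotes arrives as four again.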
[20:59] If I call this
[20:59] lxc-start-ephemeral -u buildbot -S /var/lib/buildbot/.ssh/launchpad_lxc_id_rsa -o lptests -- xvfb-run --error-file=/var/tmp/xvfb-errors.log --server-args='-screen 0 1024x768x24' -a /var/lib/buildbot/slaves/slave/lucid-devel/build/bin/test --subunit
[20:59] then ISTM that lxc-start-ephemeral ought to be able to quote everything itself
[21:00] however
[21:00] if you run that command
[21:00] $@ (in lxc-start-ephemeral) is this
[21:01] xvfb-run --error-file=/var/tmp/xvfb-errors.log --server-args=-screen 0 1024x768x24 -a /var/lib/buildbot/slaves/slave/lucid-devel/build/bin/test --subunit
[21:01] that is, the quotes have been lost already
[21:01] before we even get to ssh
[21:02] well, not exactly, the "--server-args=-screen 0 1024x768x24" bit is in a single argument, but $@ formats the arguments as a string, losing the grouping information which is then re-interpreted by ssh (incorrectly)
[21:03] setuplxc on devel looks like it should do the right thing to me. I suggest looking at the setuplxc on your box to be sure it has the latest code
[21:06] benji, what I believe is supposed to be the fix is there:
[21:06] http://pastebin.ubuntu.com/883900/
[21:07] gary_poster: yeah, that looks right to me... hmm, at one point we also put double-quotes around the $@ passed to ssh inside lxc-start-ephemeral; I can't remember if that was part of the final fix or not, but that'd be the next thing I try
[21:09] this is irritating, reasoning about this stuff shouldn't be this hard
[21:09] benji, on late team lead call
[21:21] benji, I'm still on call
[21:21] but
[21:21] oh
[21:21] I bet you have stopped :-)
[21:22] gary_poster: I'm trying to stop. ;)
[21:22] benji, :-) ok. So, I think that might have been a red herring. I'll explain last :-)
[21:23] later I mean
[21:42] gary_poster, So, did you hear anything back from Spamaps?
[21:43] gmb, ugh, no, I've been deep into my private hell of buildbot no longer working, sorry.
Let me see if he is still around
[21:43] gary_poster, No worries. Seems like this week is a week of private hells.
[21:43] yeah :-/
[21:43] If we all had a hangout, at least we'd be toasty together.
[21:43] heh, yeah :-)
[22:00] gmb, you on #juju?
[22:01] gary_poster, yes; I was looking at the wrong #juju :)
[22:01] :-)
[22:01] gary_poster, Awesomesauce.
[22:02] Also, never saying "awesomesauce" again.
[22:02] lol
[22:03] gmb, do you have any champagne?
[22:05] * gary_poster steps away
[22:11] gary_poster, No, but I'll sleep better tonight :)
[22:11] gmb, cool :-)