[11:41] :-)
[11:50] frankban, approved your MP, with a small typo change.
[11:50] frankban, oh, after you make the change let me know and I'll approve again, or you can approve.
[11:51] (and after that, change the MP to Approved)
[11:55] gary_poster: thanks
[11:55] welcome, looked great
[12:03] gary_poster: typo fixed.
[12:04] reapproved frankban
[12:04] cool, approving the MP
[12:04] gary_poster: doing another approve vote doesn't tickle tarmac. only changing the top status of the MP does
[12:04] nm
[12:04] :)
[12:04] :-)
[12:04] gary_poster, What's the secret sauce for forcing the buildbot-slave to have a specific number of workers?
[12:05] gmb, look at master.cfg on master, at --concurrency option in testr call
[12:05] currently forced at 20
[12:05] gary_poster: do note the second vote was not necessary
[12:05] gary_poster, Awesome, ta.
[12:06] bac, I thought it was in order to have tarmac not complain that there was a revision that had not been approved?
[12:06] i just don't want people thinking they have to revisit the MP
[12:06] gary_poster: no, the approved revision gets noted when the top-level status goes to 'Approved'
[12:06] your reviews are just votes
[12:06] bac, oh! ok cool
[12:07] I voted twice. Watch out for the chads.
[12:07] tarmac uses the rule 'approves >=1, disapproves == 0' so one bad vote will keep it from landing
[12:07] that's why we need picture IDs for voting
[12:09] bac benji frankban (gmb) https://plus.google.com/hangouts/_/6bdd991b10f6aa1b4f28efb196620126c5f443ec?authuser=1&hl=en-US
[12:22] gmb, still allhands problems?
[12:24] gary_poster, I signed in this morning but can't get to my goals... I'm going to talk to Sarah about it this afternoon; ISTR that the last time we had this problem she was happy with an email to you and her with my update goals - it's the record that matters.
[12:31] bac, allhands is ready for your final handshake
[12:31] gary_poster: will do now
[12:32] gmb, ok. the deadline is Friday, so please take some time aside today and get it sorted out for good one way or another.
[12:32] Right.
[12:32] gmb, also, replying to your python 2.7/parallel email now
[12:32] * bac still hates google authenticator. i wonder if it got hacked along with the RSA fob
[12:32] Thanks.
[12:32] :-)
[13:20] gary_poster, We have a working parallel test instance on Py2.7 / Precise: http://ec2-23-22-155-23.compute-1.amazonaws.com:8010/builders/lucid_lp/builds/0. Limited to 8 workers.
[13:20] gary_poster, Tests are running, which is new :)
[13:20] But we're seeing "unknown worker" tests.
[13:20] Which is sad-panda-making.
[13:24] gmb, will review in a sec. I got this error when starting up the charm:
[13:24] 2012-06-27 11:53:30,606: hook.output@ERROR: Command '['su', 'buildbot', '-c', "bzr branch 'http://bazaar.launchpad.net/~launchpad-pqm/launchpad/devel' /var/lib/buildbot/slaves/slave/devel"]' returned non-zero exit status 3
[13:24] seems transient
[13:24] Was just going to retry
[13:24] Is that what you had before?
[13:24] gary_poster, No, I had problems with adding the buildbot group.
[13:24] weird
[13:24] destroy-environment and bootstrapping again worked.
[13:24] yeah I think I'll just retry also
[13:27] gmb, what size machine is the slave?
[13:27] it looks too small for 8
[13:28] you are losing three of eight workers still
[13:29] I'd also guess that the first of the two errors you have in the summary so far is because the machine is underpowered
[13:29] gary_poster, Yeah, it's an m1.small. My mistake; I forgot to make it bigger. Yesterday's was a 32-core machine.
[13:29] ow
[13:30] Yeah.
[13:30] But, weirdly, this works (ish).
[13:30] Yesterday's didn't.
[13:30] gary_poster, So at this point, I'm going to see what else breaks on our woefully underpowered machine
[13:30] gmb, you could try shrinking concurrency down further, but that's smaller than I've ever used
[13:30] Right.
[13:31] gmb, as far as unknown worker goes, I have no idea. That's something going wrong within testtools, probably, within the test aggregation code orchestrated by testr --parallel
[13:31] Hmm.
[13:31] gary_poster, So, here's what I'm going to do:
[13:32] the stdout doesn't give me any immediate clues beyond that
[13:32] Let this finish (because we've come this far)
[13:32] Kill it
[13:32] Start a new big slave
[13:32] Run it again
[13:32] Ahaha.
[13:32] Now it's OOMing
[13:32] Time to die.
[13:32] yeah. I was going to say :-)
[13:32] gmb, is this a known py 2.7 failure?
[13:32] raise TypeError("Level not an integer or a valid string: %r" % level)
[13:32] TypeError: Level not an integer or a valid string: None
[13:32] gary_poster, Yes. mgz has just submitted a branch to fix that
[13:32] cool
[13:33] It's dead. Now, let's try this again :)
[14:37] all: tarmac is now running on canonistack. let me know when you have a MP ready to be approved.
[14:49] great, bac.
[14:52] I'm going for early lunch, because it is so lovely out and will be hot again in next few days. biab
[15:11] gary_poster, For when you get back: jam pointed out something interesting. The unknown worker output looks different from the worker-N output; it looks like the output from bin/test without --subunit.
[15:11] gary_poster, http://ec2-23-22-155-23.compute-1.amazonaws.com:8010/builders/lucid_lp/builds/1/steps/shell_9/logs/unknown%20worker%20%28bug%20in%20our%20subunit%20output%3F%29
[15:22] gary_poster, benji, bac: So, here's an interesting thing: http://ec2-23-22-155-23.compute-1.amazonaws.com:8010/builders/lucid_lp/builds/1/steps/shell_9. We're running with --concurrency=8 and yet we have 9 workers. The "unknown worker" appears to contain the normal output from bin/test... Are we maybe seeing some crap on stdout from somewhere again?
[15:49] benji: with grep, when do you need to dereference the count modifiers? i.e. if i wanted to make the trailing '=' optional would i end it with =* or =\* in
[15:49] grep -c "^ssh-[dr]sa [a-zA-Z0-9: .\/=+-]\+= " "$1"
[15:49] not dereferencing works. but why is the + escaped?
[15:49] s/derefernece/escape
[16:02] bac: if you use -P (perl-like regular expressions) you don't have to escape the *
[16:03] I don't know about the other types, I always use -P
[16:28] gmb, hey. That looks like a subset of the stdout bits.
[16:28] In particular, :setUp and :tearDown are patterns the subunit stuff uses for the fake subunit layers, I believe
[16:29] gary_poster, Hum. okay. I've replied to the thread about it all - we're about to take off for dinner.
[16:29] More poking around to be done tomorrow then.
[16:29] We'll try running against 2.7 under lucid.
[16:29] (if that's even possible)
[16:30] Anyway.
[16:30] * gmb -> exeunt, in search of victuals
[16:31] bye
[18:00] benji, for your lpsetup changes, have you just been running the command every now and then to make sure things work as you expect, given the minimal tests of the subcommands?
[18:00] * gary_poster is wondering what to do himself
[18:01] I'm tempted to run the command on an ec2 instance
[18:01] gary_poster: yeah; it is quite sub-optimal, especially given how long some of the steps take
[18:01] ok, thanks benji
[18:02] After I finish this card I hope to work on the testing cards a bit; I suspect testing will be quite hard given the subject matter and that there is so much code already.
[18:04] benji, you mean the slack time card?
[18:05] no... I thought we had a "real" card for testing; I guess not. I won't be doing that, in that case. :)
[18:22] gary_poster: nm
[18:22] bac, ?
[18:22] gary_poster: i accidentally pinged you in #launchpad
[18:23] bac, lucky I'm not over there then :-)
[18:23] benji, if you want to explore for a bit, you are welcome. I thought I did what we had agreed to do last Friday
[18:24] gary_poster: you probably did; that's why cards are good, they help keep our (my) memory straight
[18:24] :-) cool
[18:25] you are still welcome to explore for a bit :-) . We don't really love the story we have right now, so I'm hoping someone will explore. We may not have a real reason to go into slack time until gmb returns though
[18:53] gary_poster: i've added lpqabot to ~yellow so she can access private branches
[18:53] she, like a ship, bac?
[18:53] sounds good :-)
[18:53] gary_poster: http://mrl.nyu.edu/~perlin/experiments/rosie/rosie.jpg
[18:54] lol
[18:56] bac, I guess you need to document for us what you did to set up the canonistack instance, including the credentials?
[18:56] gary_poster: yes
[18:56] cool
[18:56] gary_poster: it *would* have been quite straightforward but for a couple of problems with software in lucid
[18:57] bac, why/where do we have lucid?
[18:57] in lucid ssh-import-lp-id failed for 4/5 of the yellows, rejecting our ssh keys as invalid
[18:57] bac, feel free to postpone that answer for whatever time is more appropriate, like writing up your docs
[18:58] uh
[18:58] huh
[18:58] gary_poster: the recipe jamesw gave me uses a lucid instance on canonistack. i tried using a precise ami but the version of puppet there was not compatible with his scripts
[18:58] who was lucky 1/5
[18:58] oic
[18:58] benji
[18:58] :-)
[18:58] you had a trailing line feed
[18:58] one of my keys had lots of line feeds
[18:58] and gmb and frankban had the misfortune of not having any base64 padding.
[18:59] * benji polishes his SSH key collection.
[18:59] none of those things make a key invalid, but the script wrongly thought it did
[19:02] heh
[19:20] hmm, the frameworky bits in lpsetup seem to be getting in my way; I /guess/ I should add more framework
[19:21] This is how Zope started. ;P
[19:26] benji please record reminders of pain points for Friday
[19:27] gary_poster: ooh, good point
[19:28] (this isn't /entirely/ a "pain" point but more like something that needs to be handled that isn't handled by the structure there now)
[19:28] gary_poster: did you see G+ has events now? i just created one for our hangout tomorrow as an experiment
[19:28] looks like they are only one-offs right now
[19:28] bac, what does that mean? Where should I look?
[19:29] G+
[19:29] (which in my ever-so-slightly humble opinion is why frameworks (things that call your code) are often inferior to libraries (things that you call))
[19:29] gary_poster: 'events' on left vertical bar
[19:29] i guess you can't have a left horizontal bar
[19:30] what's with the dripping pancakes? :)
[19:30] bac, cool thanks
[19:30] i see benji found it
[19:31] I love those HD-quality, looped animated GIFs.
[19:31] benji: you don't like pancakes?
[19:31] I love pancakes. Especially non sequitur pancakes.
[19:32] gary_poster: see if you can ssh tarmac@10.55.60.129
[19:34] bac worked
[19:34] (on call)
[19:34] cool
[19:34] better than not working
[19:34] :-)
[19:41] gary_poster: i've added tarmac landing for zope.testing/3.9.4-fork too. it's easy peasy now: http://pastebin.ubuntu.com/1063107/
[19:47] bac, you rock
[19:47] gary_poster: other requests?
[19:48] of course, none of these are tested as no branches have landed
[19:48] bac, whirled peas.
[19:48] IOW, no, not right now. Docs on what to do will be fabulous.
[19:48] i do wonder how to communicate the IP address of the instance when it spins up. does canonistack have any dns abilities?
[19:49] gary_poster: most of it is in a branch, so it is self documenting. i'll write up what i've got
[19:49] canonistack dns abilities: Not last I tried, unfortunately. You could use a frozen ip, whatever those are called
[20:13] bac ^^
[20:14] bac, we are allocated three of those IPs per person in theory
[20:14] in reality you have to beg for them, IME
[20:15] oh, really? i saw they have public addresses but we don't need that...just a fixed one in our internal numberspace would be fine
[20:15] it's just a PITA when it hops around
[20:35] gary_poster: https://dev.launchpad.net/yellow/TarmacOnCanonistack
[20:35] bac, yeah meant public address. Why not?
[20:38] bac, great! "You'll also need to install fabric" -> locally? or on canonistack? I'd guess the latter but the text reads as if it is the former
[20:38] gary_poster: public is not necessary as long as our team can access it via chinstrap
[20:38] fabric is installed on your local machine
[20:38] i'll make that clear
[20:38] bac, not necessary, but it gives us the "non hopping" characteristic...if it's not that big of a cost, I think it is worth it
[20:39] bzc, oh, fabric drives juju locally?
[20:39] bac, I mean
[20:39] so it is kind of like your runparallel script?
[20:39] gary_poster: no juju
[20:40] ih
[20:40] oh
[20:40] fabric launches a canonistack instance and deploys tarmac using puppet
[20:40] got it
[20:40] mildly embarrassing to company that this is not juju, IMO
[20:41] in terms of being a public resource
[20:41] gary_poster: from the wiki page: Public IP addresses are a limited resource. Please be considerate when reserving and using public IP addresses for you instances and release them when you're not needing them
[20:41] yeah saw that before
[20:42] so, from that i figured we don't really need them. ssh tunneling via chinstrap is easy and not noticeable. since none of the services are public facing it doesn't make sense to me
[20:42] if we were running a django site then i'd grab one
[20:42] so what's the annoyance of the IP jumping then?
[20:42] there is no annoyance
[20:42] tarmac used to have an idea of a web site
[20:42] you just have to set up your .ssh/config properly and then it is transparent
[20:43] so you have to do that every time, and that is the annoyance
[20:43] "there is no annoyance" bah humbug! :-)
[20:44] gary_poster: james said he was working on a juju charm but it isn't ready. i chose to go with what he had as the quickest way to deploy.
[20:44] ah excellent
[20:44] i spent about half the time on the dumb ssh-import problem
[20:44] +1
[20:44] +1 on dumb problems? :)
[20:52] :-P +1 on the quickest way to deploy
[21:17] bac, can you move your card to Done now? Would be nice for frakban tomorrow
[21:17] frankban
[21:23] bac or benji, if you want a quick review, https://code.launchpad.net/~gary/lpsetup/paralleltweaks/+merge/112438 is for you
[21:23] talk to you all tomorrow
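
Note on the ssh key thread above (15:49 grep question, 18:57 ssh-import-lp-id failure): in GNU grep's default basic regular expressions, `*` is a quantifier without a backslash while `\+` is the escaped one-or-more form, so ending the pattern with `=*` would make the trailing '=' optional. A pattern that insists on a trailing '=' rejects keys whose base64 happens to need no padding. The following is only a rough sketch of a more tolerant check, not what ssh-import-lp-id actually does; the function name `looks_like_ssh_public_key` is hypothetical. It assumes the usual OpenSSH public key line layout (type, base64 blob, optional comment), decodes the blob, and compares the type name embedded in the blob with the first field.

```python
import base64
import binascii
import struct


def looks_like_ssh_public_key(line):
    """Return True if `line` is a plausible OpenSSH public key line.

    Hypothetical helper for illustration only; it checks structure,
    not cryptographic validity.
    """
    # strip() tolerates trailing line feeds; split on any whitespace.
    fields = line.strip().split(None, 2)
    if len(fields) < 2:
        return False
    key_type, blob = fields[0], fields[1]
    try:
        # Accepts blobs with or without '=' padding, as long as the
        # base64 itself is well formed.
        raw = base64.b64decode(blob, validate=True)
    except (binascii.Error, ValueError):
        return False
    if len(raw) < 4:
        return False
    # The blob starts with a 4-byte big-endian length, then the key type name.
    (name_len,) = struct.unpack(">I", raw[:4])
    name = raw[4:4 + name_len]
    return len(name) == name_len and name.decode("ascii", "replace") == key_type
```

With a check like this, a key line followed by stray line feeds and a key whose blob ends without '=' are both accepted, which matches the point made at 18:59 that neither makes a key invalid.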