[00:33] Wow, congrats lifeless.
[00:33] yes, indeed
[00:33] hi wgrant
[00:34] Morning poolie.
[00:34] i think robert's new role will have benefits for you in particular
[00:35] Oh?
[00:35] being another step towards openness and ease of contribution
[00:35] "think or hope" i suppose
[00:40] leonardr: did you read jml's lep about api access to authentication tokens?
[00:41] thumper: thanks
[00:42] wgrant: thanks; poolie: thanks
[00:42] i miss you lifelesS!
[00:45] already? I'm not gone till Monday :)
[00:45] well, i did briefly
[00:45] and even then, I think its appropriate to think of it as expanding - like my waistline - the bzr team is definitely able to call on me to discuss stuff :)
[00:45] :)
[00:46] oh, or did you mean as an office-mate
[00:46] yes, that's what i meant
[00:46] righto ! - 5 hours sleep last night, but boy, better sleep than I had been getting
[00:46] have seen allergy doctor this morning
[00:47] so; finally; no medical stuff to do for a month; no frenzy; a feeling of peace surrounds me :)
[00:48] lifeless: I'm sure we can invoke an architecture frenzy for you :P
[00:59] hmm, are we in testfix mode I wonder
[00:59] restricted_families = archive_arch_set.getRestrictedfamilies(self)
[00:59] ForbiddenAttribute: ('getRestrictedfamilies', )
[01:00] seems unrelated to oops infrastructure changes
[01:01] ah yes.
[01:51] lifeless: a feeling of peace except for the frantic travelling coming?
[01:53] thumper: compared to the last three months, its quiet
[02:01] lifeless: great, so I can ask you all my LP questions now? :)
[02:02] yes
[02:02] Not that that would seem to be any different
[02:03] heh
[02:38] Can I coerce someone into ec2landing https://code.edge.launchpad.net/~wgrant/launchpad/faster-and-more-general-getBuildQueueSizes/+merge/28476 ?
[02:40] wgrant: does that fix henninge's patch ?
[02:41] rev 11093 seems to break everything, or perhaps its just my branch it breaks.
[02:41] if not, then no, I can't be coerced.
[02:41] You mean jelmer's?
[02:41] ah yeah
[02:41] the one reviewed by
[02:41] That is a bit confusing.
[02:42] I wonder why PQM doesn't use --author.
[02:42] lifeless: Which is the test that breaks?
[02:43] wgrant: pqm doesn't use --author because its not the author.
[02:43] log -n0 shows the people totally accurately.
[02:43] alias it on.
[02:43] lifeless: Doesn't help for commit notifications.
[02:43] wgrant: we should fix that, then.
[02:44] Is there a testfix in progress?
[02:44] bzr has the real data; this is a presentation issue IMO: fudging it by feeding PQM an approximation you want to see on the outside just leads to harder to work with data.
[02:45] wgrant: I'm not sure how to look that up yet.
[02:45] wgrant: and today is a bzr day :)
[02:45] lifeless: Well, if you tell me which test fails, I will fix it. It's just an s/f/F/.
[02:46] sure
[02:47] you have mail
[02:48] Thanks.
[02:48] de nada
[02:54] Huh.
[02:54] Did that get *CPed* without being EC2'd?
[02:54] Crazy.
[02:59] da wtf
[02:59] lifeless: i don't know the state of the committed/released distinction
[03:00] there are various contradictory bugs
[03:00] i don't think it's useful
[03:00] in that it makes noise while simplifying reality too much to be useful
[03:00] but maybe that's just me
[03:00] I really don't like the noise when a release happens
[03:00] The 'it is available to the user to use now' aspect is good though.
[03:00] perhaps doing daily releases would address it ?
[03:00] (e.g. edge :))
[03:01] that's one element of my dissatisfaction
[03:01] 'fix released' does not aiui reliably mean you will see the fix, even on edge
[03:02] doing daily releases would be good
[03:02] it wouldn't reduce the amount of noise
[03:03] whats one of the bug numbers you saw
[03:03] istm that a lean view of the process is: it's waiting, it's in progress, it's done
[03:03] I'm wondering if I'm missing a subscription somewhere
[03:03] bug 592792
[03:03] yeah, I need to tweak my subscriptions
[03:03] \o/ more mail.
[03:04] poolie: You're saying that's not fixed on edge?
[03:04] poolie: perhaps there are two clients here - the feature requestor and the dev asking 'do I have more to do' ?
[03:04] wgrant: that was a bug that sent me mail, i'm not saying it is or isn't fixed on edge
[03:05] lifeless: there are
[03:05] the feature requester wants to know "is it worth me testing whether it's really fixed"
[03:05] or "bothering to try that again" etc
[03:05] if we try we can probably make both deeply unhappy
[03:05] heh
[03:05] oops, thats the wrong way around :)
[03:06] istm that a state change is not enough
[03:06] without saying "this is live on edge now" or "this will be in bzr2.2b4"
[03:07] I need a volunteer at the epic, with a 2 minute 'lightning talk' screen
[03:09] lifeless: in a certain light, isn't the point of --author to say who the author of the code is?
[03:09] lifeless: a merge isn't really code
[03:09] so pqm could be the committer, and set the author
[03:10] which is what I think many people think like
[03:10] thumper: if you squint real real hard, but both Aaron and I have argued that this is working around a bug rather than fixing the root cause.
[03:10] :)
[03:10] thumper: if we were sending in *patches* it would be trivially correct to use --author
[03:11] * thumper cranks up the muzak
[03:11] code in an elevator?
[03:12] matrix soundtrack
[03:12] nice
[03:57] thumper: So, how do I get this testfix merged?
[03:57] * thumper looks up
[03:57] wgrant: eh?
[03:57] what's broken?
[03:57] Is there an easy way to run all the windmill tests in a given JS file? I know you can bin/test PATH/TO/PYTHON/FILE.py but that doesn't work for a JS file
[03:57] prod_lp, at least, but devel should be too.
[03:57] wgrant: oh right
[03:57] wgrant: whats the branch, I'll ec2land it
[03:58] lifeless: Is that going to work with [testfix]?
[03:58] I don't know the process yet :)
[03:58] thumper: how does one land a testfix fixing branch
[03:58] pqm-submit
[03:58] https://code.edge.launchpad.net/~wgrant/launchpad/testfix-getRestrictedfamilies is the branch. Trivial -- there were just some references in db-devel that were merged in while the branch was landing.
[03:58] thumper: directly, not ec2land ?
[03:59] for a test fix, normally yes, but...
[03:59] it depends on the extent of the fix
[03:59] since buildbot is broken anyway
[03:59] adding an extra ec2 run doesn't add anything
[04:00] All the archive and PPA tests pass fine.
[04:00] Which is, I believe, all that failed in lifeless' run.
[04:00] Although I haven't worked out how to get a list from the gzipped thing yet.
[04:01] wgrant: gunzip -c | subunit-filter | subunit-ls
[04:01] wgrant: or
[04:01] gunzip -c | testr load; testr failing
[04:02] Fancy.
[04:02] nah
[04:02] its all about the onions
[04:02] if you want fancy
[04:02] gunzip -c | tribunal -
[04:02] now thats sexy
[04:03] I'd just like the failing tests back in the email body :(
[04:03] thumper: they are
[04:03] thumper: +1
[04:03] aren't they ?
[04:03] no
[04:03] oh right, I mailed bac and mars when jml forwarded me mail
[04:03] its very shallow. Lets do it at the epic.
[04:04] * thumper nods
[04:04] thumper: can you pqm-submit wgrants branch? the change is a three-liner, one char per line ;)
[04:04] thumper: (given you're all setup for it)
[04:04] I've got ec2land setup, not pqm-submit yet.
[04:04] Hmm, I should make hydrazine talk lp targets to
[04:05] too
[04:05] I only have devel and db-devel set up
[04:05] I should be able to do it without too much problem though
[04:05] it should land on devel fine.
[04:05] in devel if I do a missing on his branch I see just one commit
[04:05] I'm not sure if this will work on production-devel, since I can't see it.
[04:05] it seems the error isn't on devel is it?
[04:05] But it should be fine.
[04:05] devel is broken.
[04:05] devel is naffed too
[04:05] ah
[04:05] I'm not sure if there's a failure yet, but it is broken.
[04:06] the devel failure was a weird twisted one
[04:06] I'll land wgrants
[04:06] thumper: Thanks
[04:06] lifeless: How do I run tests with testrepository?
[04:07] quickstart doesn't say.
[04:08] testr run
[04:08] Ah, I presumed I'd have to pipe 'testr failing' in or something.
[04:09] wgrant: you can (testr load - in the quickstart)
[04:09] but you can also run tests directly if there is a .testr.conf
[04:09] please file a bug though
[04:09] that should be polished more
[04:09] :(
[04:09] bzr crashed
[04:10] bzr pqm-submit -m "[testfix][r=thumper] getRestrictedfamilies camel case fixup." --public-location=lp:~wgrant/launchpad/testfix-getRestrictedfamilies --ignore-local --submit-branch=../devel
[04:10] that is what I tried
[04:10] and it crashed
[04:10] What did it complain about?
[04:10] Does anyone know anything about: createlang: language installation failed: ERROR: could not access file "$libdir/plpython": No such file or directory ? It happens in an updated default install of 10.04 server.
[04:10] NoPQMSubmissionAddress
[04:11] basically
[04:11] It happens during "make schema".
[04:11] Muscovy: Make sure you have postgresql-plpython-8.3 or postgresql-plpython-8.4 installed, depending on which version of PostgreSQL you're using.
[04:11] Thanks, I'll try that.
[04:12] lifeless: any idea why my pqm-submit line doesn't work?
[04:13] thumper: does your ../devel have a submit_thingy setting ?
[04:13] wgrant: I'll pull your branch into my testfix
[04:13] lifeless: yes
[04:13] bzr info on it should say
[04:13] thumper: then no
[04:13] thumper: Sounds reasonable.
[04:13] I'd like to delete pqm-submit soon
[04:13] But it needs to go to p-d too.
[04:13] wgrant: the magic of auto merge should do that, no ?
[04:13] lifeless: I really hope that production-devel isn't auto-merging anything.
[04:13] devel being the top of the input tree
[04:13] oh right
[04:14] I meant after it percolates to db-devel
[04:14] lifeless: but not production devel
[04:14] wgrant: mmm, I disagree on that hope; but there are some necessary conditions to make it safe that aren't true today.
[04:15] lifeless: Well, OK, pre-MergeWorkflowDraft.
[04:16] Is that still happening, what with our new overlord?
[04:17] wgrant: branch in the pipe
[04:17] dunno, you'll need to ask him
[04:17] lifeless: Is it still happening?
[04:17] wgrant: :P I was referring to jml, he of the hacking-like-an-evil-overlord talk
[04:18] Heh, indeed.
[04:18] wgrant: not that he can answer the question, just for fun.
[04:18] anyhow
[04:18] I think that reducing developer friction is important
[04:19] that process change should do that, and if there is an available hacker doing it, great.
[04:19] I would like to look a little deeper at how concept -> production all works
[04:20] possibly up the dial on risk mitigation and down on risk avoidance
[04:20] then push rate-of-change up
[04:21] I hope thats not uselessly vague. I haven't dug into the precise details of the process in some time - before buildbot - so Monday I'll be *very* busy indeed :)
[04:22] Heh.
[04:23] concretely, I think we court risk by doing big rollouts
[04:23] this is observation over years
[04:24] so we put a lot of effort into making sure nothing can go wrong in the rollout : but because its always a big change, something always does.
[04:24] FSVO always
[04:24] 30 man-days of changes, more or less.
[04:24] smaller rollouts would have less risk.
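The claim that smaller rollouts carry less risk can be illustrated with a toy model. This is only a minimal sketch: it assumes each change fails independently with the same probability, which nothing in the discussion establishes. The function name, the 2% failure rate, and the mapping of one change per man-day are hypothetical; only the figure of roughly 30 man-days per rollout comes from the conversation.

```python
def rollout_failure_probability(num_changes: int, p_change_fails: float) -> float:
    """Probability that at least one change in a rollout fails.

    Assumes (hypothetically) that changes fail independently, each with
    probability p_change_fails.
    """
    if num_changes < 0:
        raise ValueError("num_changes must be non-negative")
    if not 0.0 <= p_change_fails <= 1.0:
        raise ValueError("p_change_fails must be between 0 and 1")
    # P(at least one failure) = 1 - P(every change succeeds)
    return 1.0 - (1.0 - p_change_fails) ** num_changes


if __name__ == "__main__":
    # Hypothetical numbers: ~30 changes in a big rollout versus 5 in a small one.
    for size in (30, 5):
        risk = rollout_failure_probability(size, 0.02)
        print(f"{size} changes: {risk:.1%} chance that something goes wrong")
```

Under these assumptions the chance that any single rollout hits a problem drops as the rollout shrinks, although more rollouts are then needed to ship the same set of changes.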
[04:25] and if the risk handling stuff is non-linear, then smaller rollouts may be better than just the ratio of sizes-included.
[04:25] Yep.
[04:25] But this requires the ability to handle a crisis effectively at any time.
[04:25] it may be easier to figure out what *can* go wrong looking at a smaller rollout, and so be more prepared for what-ifs.
[04:26] Which is certainly not the case now.
[04:26] wgrant: not quite true. It requires the ability to handle a crisis starting within some time window T of the rollout
[04:27] you could, for instance, have an automerge-wait-for-losa-to-hit-a-red-button situation
[04:27] and every losa at start of shift could assess the risk, and hit the button.
[04:27] if the relevant people are around-and-will-be-for-say-2-hours
[04:27] True.
[04:27] * lifeless handwaves furiously
[04:28] if you're interested in helping shape something like that, say so
[04:28] I can chat about it with folk at the epic and see if we can give you a set of constraints and requirements
[04:31] I mean to say: this is something that anyone interested should be able to work on.
[04:32] and while I'm interested, I'm unlikely to have personal directed time on it immediately. But I'd love to see things become easier for everyone :)
[04:58] wgrant: I'm applying your fix to the prod-devel branch too
[04:58] thumper: Thanks.
[05:07] fwiw, recognising it is handwaving, we don't look after just LP. we have 3 or 4 (depending on how you count it) major systems. so the implicit assumption that we can treat LP as special/above the others is ... hrm... unwise is harsher than I want, but you get the idea. So 'start of shift, make a choice' stuff would not be great I'm thinking
[05:11] spm: Your name says you only have two services, so any of the others are figments of your imagination. QED.
[05:13] * spm ponders if suspending wgrants account in retribution would 1. be overkill 2. pointless as he'd hack his way back in anyway 3. seen to be an excessive response 4.. there is no 4.
[05:14] Heh.
[05:17] lifeless: fwiw#2, yesterdays rollout was incredibly smooth. there were internal comments that a crisis needed to be manufactured so tom'd believe we were actually working. Having said that. 1hr 15 in the actual rollout portion of really quite intense very procedural Do X, then Y; isn't a walk in the sunshine either. :-)
[05:18] Why does it take so long?
[05:18] DB upgrades?
[05:19] that's part of it, but by no means all.
[05:20] a basic breakdown: 10-15 mins to breakout to R/O and be able to do the DB updates. This includes shutting down the services that currently can't be kept up. code*, soyuz* etc
[05:20] DB updates themselves, which can vary widely. 20-30 mins'd be the norm
[05:21] re-enable the DB to be live again; restart all the app servers; verify; restart all the shutdowns.
[05:22] Hm, OK.
[05:23] And also ignores about 60 minutes of moderately intense prep to get to that stage :-)
[05:24] Prebuilding the new trees on all machines and such?
[05:25] crontabs, nagios, irc topics, etc yup
[05:25] verifying the cowboys that are live are known cowboys and are included
[05:42] spm: I know you have massive widespread responsibilities
[05:42] spm: for start of shift, read 'appropriate time' : I find dwelling too much on the minutiae distracts from the concept.
[05:42] heh
[05:43] spm: I'd _love_ it if everything that makes a rollout slow (e.g. shutting down services to go ro) had bugs. And I knew the bug numbers.
[05:43] spm: do you know if that is the case?
[05:43] not sure tbh. some parts we have raised in the past. whether they became bugs, probably not.
[05:44] I'm a data monster, really ;)
[05:44] I hadn't noticed? This is very much news to me!
[05:44] damn. left my sarcasm meter on. it just exploded.
[05:44] :)
[05:46] I guess what I'm not saying above - ideally I (personally!) don't want us to be yay/nay a rollout except in *exceptional* circumstances. like edge. it is assumed it will work; if it doesn't we handle said failure gracefully, and that's the point we'd get involved - but the failure is not critical. Not a respond now event. More a respond soonish.
[05:48] that makes sense to me
[05:48] the nuance here, is that I was proposing you assess the surrounding support for dealing with stuff - controlling the timing, not the do/not do.
[05:48] or perhaps: it's your code, your system (tho recognising that is an artificial distinction with ownership there...); if you want X, we'll make it so, but pls retain 'ownership' of the responsibility. kinda thingy.
[05:48] because you are the response team. You know if you're insanely busy, or just flat out busy.
[05:48] ahh I see
[05:49] we together provide launchpad.net - its lp-devs + lp-managers, not lp-devs alone or lp-managers alone
[05:49] (which you know)
[05:49] :P
[05:49] I try to forget....
[05:51] back to the example - if there was a buildd change, you might want lamont available
[05:51] so you'd say 'delayed to $his tz'
[05:52] but most changes are relatively shallow and would just be 'doit'
[05:52] I'd, again very much personally!!!, be very keen if lp prod rollouts were much like edge is now. just a matter of course and all done purty much automagically. The key reason being a somewhat selfish one - is that makes our life much easier. if the rollout is so smooth that simplish scripting can do; then problems are also equally trivial to solve
[05:52] lifeless: hey, as your new role as TA, you can give out rc's for production-devel
[05:52] lifeless: want to give me an RC for wgrant's fix?
[05:52] thumper: What are the implications here :) - its day -2 :P
[05:53] not much in this case, I'm just dotting the i's
[05:53] actually - if there's a buildd change, I don't believe we have too many options but to wait for him. tbh, not sure of the fine detail there.
[05:53] spm: so you see the point :)
[05:53] oh yes.
[05:53] lifeless: I could equally go release-critical=thumper, but I thought you might like your name there :)
[05:53] spm: FTR, I'm a huge fan of automation. I was surprised by Tom's apparent disinterest in the recent thread about detecting stale trees or whatever it was.
[05:54] thumper: use yours ;)
[05:54] :)
[05:54] ok
[05:54] thumper: When I'm doooog tired after that terrible hotel, iz not a good time to do new things ;)
[05:54] which thread was that, don't recall seeing it myself?
[05:54] heh
=== Ursinha is now known as Ursinha-zzz
[08:13] good morning
[08:46] Yay, buildbot loves me.
[08:46] nah, that was me forcing a build :-)
[08:47] thumper: wgrant: edge1: canonical.database.revision.InvalidDatabaseRevision: patch-2207-56-0.sql has been applied to the database but does not exist in this source code tree. You probably want to run 'make schema'. <== is that you guys causing that?
[08:48] spm: stable is out of date.
[08:48] oh bah.
[08:48] Since it broke soon after db-stable was merged.
[08:48] buildbot is happy now, though.
[08:48] So it should pull soon.
[08:49] we can hop. I've reverted the edge update; I guess we'll find out this timish tomorrow if edge is happy again
[08:49] hope too
[08:51] spm: Heh, it's just pulled now.
=== lionel_ is now known as lionel
=== almaisan-away is now known as al-maisan
[11:16] wgrant, do you know of any recipe builds that should work on either dogfood or staging so we can Q/A your change to launchpad-buildd?
[11:17] jtv: I don't know if staging works yet, but I may be able to get something working on dogfood.
[11:17] ... except that I forget how it interacts with codehosting.
[11:17] Does it use staging or production codehosting?
[11:18] It just doesn't have any. So we may have to fake database records or something.
[11:18] Actually, if it's just for reading from a branch, it uses production codehosting.
[11:19] Right.
[11:19] Is buildd-manager running?
[11:19] * jtv checks
[11:20] Oh good, I can't log in on DF.
[11:20] how jolly
[11:20] (OOPS-1650DF10)
[11:21] On the bright side, buildd-manager is indeed running
[11:26] bigjools: Any idea why DF won't let me log in?
[11:47] wgrant: checking
[11:47] bigjools: Thanks.
[11:48] "HTTP Response status from identity URL host is not 200. Got status 404"
[11:48] awesome
[11:48] No provider set up?
[11:48] some config must have changed somewhere
[11:48] no idea what that might be
[11:59] Can someone please land https://code.edge.launchpad.net/~wgrant/launchpad/faster-and-more-general-getBuildQueueSizes/+merge/28476 ?
[11:59] Morning, all.
[12:01] wgrant: nice, I'll land it
[12:02] bigjools: Thanks.
[12:23] wgrant: try again
[12:25] bigjools: Failure continues.
[12:25] gnargh
[12:25] OOPS 20
[12:26] same problem
[12:26] the oops report doesn't do anything useful and print the url it's trying to use
[12:26] that would be too useful
[12:26] Heh.
[12:26] sigh
[12:27] What's the OpenID host it's configured to use?
[12:27] fuck nose
[12:27] the config is so convoluted it's hard to work out
[12:27] Module canonical.launchpad.webapp.login, line 183, in render
[12:27] allvhosts.configs[openid_vhost].rooturl)
[12:27] but rooturl doesn't exist
[12:29] bigjools: You need launchpad/openid_provider_vhost set.
[12:29] it is
[12:29] To one of your vhosts.
[12:29] And that vhost has its hostname set?
[12:29] yes
[12:30] Yay...
[12:30] 404 means it's talking to something at least
[12:30] Something, yes.
[12:30] but what....
[12:37] wgrant: please try again, I added some more info to the exception that goes in the oops
[12:37] bigjools: Done. 22.
[12:38] GAR
[12:39] Oh?
[12:39] I picked the wrong egg to hack
[12:39] Heh.
[12:41] try again...
[12:46] nm
[12:52] Any luck?
[12:52] no
[12:53] it's trying to access https://login.dogfood.launchpad.net/ which 404s
[12:53] Right.
[12:53] You could just tell it to use login.launchpad.net...
[12:53] I *could*
[12:54] Unless you want c-i-p fun...
[12:54] trying one last thing before I give up
[12:54] bah no fair, I have to travel on Sunday when the British GP is on
[12:54] wgrant: cip?
[12:55] lifeless: canonical-identity-provider.
[12:55] Crazy Django thing.
[12:55] yup
[12:55] I remembers
[12:56] * bigjools shovels more coal into dogfood
[12:59] night
[12:59] Night.
[13:00] grar
=== matsubara-afk is now known as matsubara
[13:03] still no dice
=== salgado-afk is now known as salgado
[13:05] bigjools: What're you trying to do?
[13:06] point df at staging's openid
[13:07] Hm. It's not working? Firewalled, maybe?
[13:07] changing the config made zero difference
[13:07] There are three OpenID vhosts -- you changed the right one(s)?
[13:08] I've managed to get the wrong one before.
[13:08] there's only one in the DF config
[13:08] FINALY
[13:08] and finally too
[13:09] Yay, it works. Thanks.
[13:10] now, food
[13:10] Is buildd-manager alive?
[13:11] It apparently was before.
[13:11] But it's not dispatching a recipe build now.
[13:11] Ah, there it is.
[13:17] rubidium really is unbelievably slow.
=== Ursinha-zzz is now known as Ursinha
[13:25] jtv: Around?
[13:26] wgrant: Around.
[13:27] https://code.dogfood.launchpad.net/~wgrant/+recipe/ivle-test/+build/145 just worked. The i386 build seems to be working fine too.
[13:27] jtv: ^^
[13:27] wgrant: \o/
[13:27] I'll tell lamont that we can roll out
[13:28] Great.
=== stub1 is now known as stub
=== deryck is now known as deryck[lunch]
=== mordred_ is now known as mtaylor
=== mtaylor is now known as mordred_
=== mordred_ is now known as mtaylor-away
=== mtaylor-away is now known as mtaylor
=== mtaylor is now known as mtaylor|away
=== mtaylor|away is now known as mtaylor
=== Ursinha is now known as Ursinha-nom
=== deryck[lunch] is now known as deryck
=== matsubara is now known as matsubara-lunch
=== Ursinha-nom is now known as Ursinha
=== matsubara-lunch is now known as matsubara
=== al-maisan is now known as almaisan-away
=== salgado is now known as salgado-dr
[20:46] morning
[20:53] morning lifeless
[20:55] hi mtaylor
[20:56] 'sup ?
[20:57] lifeless: hanging out in dallas. enjoying sitting by the pool hacking
[20:58]