[01:29] bryceh: just fyi, your blog has some trouble with openid ;)
[01:29] anyway, thats a neat new feature :)
[02:07] nigelb, it does? howso?
[02:07] I didn't even know it could do openid
[02:07] nigelb, although I did notice nobody wrote comments :-/
[02:11] bryceh: heh, I tried to login with LP
[02:11] failed
[02:11] you want me to grab the error?
[02:13] yeah
[02:13] I upgraded my web server to lucid, and drupal got upgraded, however it seems there is some bug that causes upgraded drupals to break in lucid
[02:13] (which is why I haven't been blogging for a long time)
[02:14] maybe that bug is why user registration doesn't work either
[02:17] yeah
[02:17] hold on, doing it again
[02:19] is the drupal issue that it's using deprecated functions?
[02:19] it probbaly is
[02:19] lucid only has 5.3
[02:19] http://pastebin.com/r3fTKwmQ
[02:20] ajmitch, no it looks like just a bug in the upgrade script, doesn't set fields to autoincrement
[02:20] lucid does have a drupal6 package, fwiw, but it's even OT for here :)
[02:21] yeah, autoincrement thing :)
[02:21] nigelb, ok yeah I've reproduced that error
[02:21] that explains the comments :)
[02:22] why don't you move to something like wordpress for ease of use?
[02:29] nigelb, why?
[02:30] drupal is nice, but just "complicated" ;)
[02:31] nigelb, I develop on launchpad and x.org, I shouldn't be put off by "complicated" ;-)
[02:31] nigelb, is wordpress open source?
[02:31] bryceh: hahahaaha
[02:32] and yes, its open source
[02:32] also complicated for visitors is what I meant :)
[02:36] well, if I wanted something simple for visitors I'd just post to facebook
[02:37] hahah
[02:37] but that still needs signup
[02:38] btw, you're still in Prague?
[02:38] nope
[02:38] got home last night
[02:41] :)
[02:41] someday I gotta start hacking on LP.
[02:44] nigelb: JOIN US!
[02:44] wgrant: I'm just worried
[02:44] I use this system for work
[02:44] nigelb, it might be interesting for the LP folks to know what is holding you back, so they can work to eliminate that as an issue
[02:44] so, can't have my apache dying on me
[02:45] nigelb: LP doesn't usually eat systems alive, and if you're really worried you can use a VM.
[02:45] the instructions say that the settings used for working on LP means usual web development cannot happen
[02:45] nigelb, do you not have another system you could use for development on?
[02:45] rocketfuel-setup will try to remove PHP (because it installs apache2-mpm-worker), but that can be worked around.
[02:46] launchpad-database-setup obliterates all PostgreSQL databases, but that can also be worked around if required.
[02:46] bryceh: not yet. My own system is in the shop awaiting motherboard replacement
[02:46] But apart from that, it coexists fine.
[02:46] Just adds some vhosts to Apache.
[02:46] * nigelb developes on PHP :(
[02:47] VM seems like a good idea.
[02:47] nigelb: that just means you have to tweak rocketfuel-setup to not install apache2-mpm-worker.
[02:47] It's not actually required.
[02:47] nigelb, well, like wgrant says, running in a VM works too. that's how brian does it
[02:47] But yes, a VM works too.
[02:47] I'll let a VM run rocketfuel overnight today :)
[02:48] * nigelb gets sucked in LP
[02:51] wgrant, maybe the newbie intro docs should just be written from the start to nudge the user towards setting up launchpad in a VM? it'd sidestep so many issues and worries
[02:52] bryceh: Possibly. I'm not sure who manages those docs these days, though, since Karl disappeared.
[02:53] disappeared?
[02:53] there was someone who'd stepped up to that role I think
[03:20] there is a volunteer, yes.
[03:29] who?
[03:31] it was announced to the list
[03:31] heh this is a bad sign if no one knows
[03:31] public I think.
[03:32] I know, it just not paged in.
[03:32] lajjr
[03:32] also, I present, another victem of the heat.
[03:35] nigelb, anyway I think you may be right I should just use wordpress. I had visions of using drupal for some more sophisticated website stuff, but these days I hardly have time to mess around with updating my website anyway
[03:35] bryceh: heh :)
[03:35] just be sure to update regularly
[03:36] unfortunately that means losing all my old blog posts
[03:36] nigelb, does something bad happen if you don't update regularly?
[03:37] grrr jmkuhn - thanks mwhudson.
[03:37] bryceh: vulnerabilities get reported often and get security updates
[03:37] if you're lazy, you get owned :D
[03:38] that sounds bad
[03:38] bryceh: wordpress.com ++
[03:38] nigelb, is updating as simple as apt-get upgrade wordpress or is more to it?
[03:38] there would be a zip file i guess which you would use
[03:38] ick
[03:38] I've never done it. I just use wordpress.com ;)
[03:38] there is also a bzr branch you can just pull
[03:39] yay, bzr++
[03:39] lifeless: that should be a snap then :)
[03:40] lifeless: I just coped the planet.ubuntu.com idea of launchpad team + bzr branch of config for adding to blog syndication
[03:40] lifeless: total win - should be a launchpad feature...
[03:40] cool
[03:40] hmm
[03:40] interesting folk are here
[03:40] here is a crazy idea I had.
[03:41] what if every object was [crudely] available as a 1-item-long rss feed (using accept-encoding negotiation)
[03:41] interesting
[03:41] To what end?
[03:41] pubsubhubbub
[03:42] * mtaylor hates it that lifeless answer word is real and that I know what it is
[03:42] we would pingthesemanticweb on a change to any object [excluding private in the first instance]
[03:42] that is monitored by at least one hub
[03:42] Hmm.
[03:42] and you can get callbacks when the object has changed.
[03:43] would make it easier to monitor things like lists-of-available-merge-requests
[03:43] want to know when a build has finished on arm + i386?
subscribe to the build
[03:43] any thing that gets me callbacks will make me happy as far as those things go
[03:44] bryceh: I stand corrected. There is a press-button upgrade feature.
[03:45] nigelb, mm
[04:16] mwhudson: Hmmm, so you're actually going to fix LP and modularise it (and the model classes)?
[04:57] wgrant: well
[04:57] wgrant: to some extent or other
[05:06] rewrite from scratch. in php.
[05:07] ewwww
[05:07] * wgrant stabs spm repeatedly.
[05:07] * mtaylor pokes spm until he stops
[05:07] spm: that's the official word from what the LOSAs want then?
[05:07] ajmitch: um....
[05:08] next he'll be saying that PHP 5 is too new, let's go for 4
[05:08] tho I will concede my troll has been matched, and the trolling stacks raised higher than I'm prepared to go.
[05:08] Oooo! didn't think if that! ta! yes! php4 ftw.
[05:10] * ajmitch shouldn't continue down this path
[05:10] fuck that... cobol forms are much more powerful
[05:11] spm: I just got a message that said that ~swift-bugs is subscribed to openstack - but I cannot for the life of me see _why_ that would be the case
[05:12] mtaylor: hrm. looking
[05:12] spm: also, I think we're clear to delete all of those -private things
[05:12] mtaylor: ahh. sweet. I'll do that first. it's possible one is causing the other.
[05:12] mmm. good point
[05:13] openstack-private gone
[05:14] woot
[05:14] swift-bugs-private swift-core-private gone
[05:15] gah. swift-private is being recalcitrant, per that answers request you have. been looking at that this morning to no joy
[05:15] glad to know I haven't stopped causing you grief :)
[05:16] actually.. thinking about that - I suspect it may have been self inflicted, by myself.one I need to chase.
[05:17] mtaylor: I assume the ozone ones can go too?
[05:18] spm: yes. all ozone can die
[05:18] kk
[05:31] Is there some DB load issue at the moment?
[05:31] For the past 24-36 hours, my script that heavily uses getBuildRecords has timed out every time.
[05:31] It previously only timed out one every few hours.
[05:31] that's not cool...
[05:32] 1 min load is higher than usual, but 5 and 15 is normal. light even.
[05:34] wgrant: you wouldn't happen to know what table that references?
[05:34] spm: BinaryPackageBuild, PackageBuild and BuildFarmJob, mainly.
[05:36] is that talking to edge? I'd suggest not but... ?
[05:37] It may be... let's see.
[05:37] It moves around depending on where the API is least broken.
[05:37] heh
[05:37] It's currently using edge.
[05:38] I'm seeing a few timeouts for queries with BinaryPackageBuild in them, all via edge
[05:38] These should be 12-15 minutes past the hour.
[05:39] ta! that helps hugely.
[05:40] hrm. these are all almost uniformly on the hour; so possibly not you.
[05:41] Hmm.
[05:42] wine, firefox, kdm,?
[05:42] ubiquity is another
[05:42] My script requests everything with a build failure
[05:42] nigelb, http://www2.bryceharrington.org:8080/wordpress/?p=3
[05:45] wgrant: are you getting an oops from these? the commonish query that somewhat matches is taking around 10secs based on a quick random grab. cop enough of those...
[05:46] spm: I'm getting OOPSes, but this old launchpadlib doesn't display them.
[05:46] Maybe I should upgrade that host to Lucid at some point.
[05:46] bleh
[05:48] Hm, wow, nearly a year since the open sourcing.
=== mtaylor is now known as mtaylor|beach
[06:06] Argh.
[06:06] https://code.edge.launchpad.net/~ted/+recipe/indicator-applet-dx/+build/177 is redispatching constantly.
[06:06] Presumably because there is no longer a warty/i386 chroot.
[06:06] Ah, and there are hoary and breezy builds too.
[06:08] bryceh: w00t w00t!
[08:15] wgrant: timeouts ?
[08:16] lifeless: Yes. spm reminded me that you reduced the timeout last week.
[08:17] there is a bug script hammering one bug on prod with trouble
[08:17] the hourly edge graph is clean though
[08:17] what issue are you having?
[08:18] wgrant: ^
[08:18] lifeless: distroseries.getBuildRecords is frequently timing out for me.
[08:18] got an oops id ?
[08:19] This old launchpadlib doesn't display them, so no :/
[08:19] can you get one, please ?
[08:19] OK. You can't easily find OOPSes for a particular method?
[08:20] * wgrant turns httplib2 debugging on.
[08:21] i wonder... if oopses record the remote IP, lifeless, you should be able to fgrep -R on devpad and hunt thru them all? brute force, but...
[08:22] It's hopefully reproducible.
[08:22] wgrant: no, not yet.
[08:22] It normally happens within the first five minutes of the script, and it's running with debug logging on now...
[08:22] I'll file a bug on finding oops by method
[08:23] spm: seeing ip addressess is a bug
[08:23] we need to fix that - and seeing private urls : both should be limited to losa and gsa
[08:24] https://bugs.edge.launchpad.net/oops-tools/+bug/607087
[08:24] <_mup_> Bug #607087: enable 'search by method'
[08:30] wgrant: any sign ?
[08:31] lifeless: It seems to be an excellent Heisenbug.
[08:31] arrrgh.
[08:31] Just died now.
[08:31] cool
[08:31] But there's no X-Lazr-Oops header...
[08:31] its not in the oops report either FWIW
[08:33] wgrant: :(
[08:33] what is there ?
[08:33] pastebin ?
[08:34] Hm. Maybe timeouts don't create an X-Lazr-Oops header :(
[08:34] wgrant: this is what edge was last night -
[08:34] === Top 10 Time Out Counts by Page ID ===
[08:34] Hard / Soft Page ID
[08:34] 63 / 618 DistroSeries:EntryResource
[08:34] 12 / 71 Person:+participation
[08:34] 9 / 13 ScopedCollection:CollectionResource
[08:34] 8 / 29 BugTask:+index
[08:34] 5 / 50 Sprint:+temp-meeting-export
[08:34] 5 / 6 Archive:+copy-packages
[08:34] 4 / 2 DistroSeries:+index
[08:34] 3 / 3 BugTask:+editstatus-page
[08:34] 2 / 3 Milestone:+index
[08:34] 2 / 1 Person:+map
[08:34] wgrant: we do get timeout oops on edge apis
[08:34] It's probably the first one.
[08:34] OOPS-1660EB2408
[08:34] It would come under the first one, that is.
[08:35] The request URL should have ws.op=getBuildRecords
[08:35] !pastebin
[08:35] QUERY_STRING: build_state=Failed+to+build&ws.op=getBuildRecords&ws.start=200&ws.size=50
[08:35] StevenK: !sanity
[08:35] That's the right kind of request.
[08:35] GET /1.0/ubuntu/maverick?build_state=Failed+to+build&ws.op=getBuildRecords&ws.start=1750&ws.size=50
[08:35] That's the one that failed just now.
[08:36] lifeless: You should know better. :-P
[08:36] I'd only expect 24 hard OOPSes a day from this, though.
[08:36] StevenK: 12 lines in a totally dedicated channel is not a problem
[08:36] StevenK: when there are no other discussions taking place.
[08:36] StevenK: we are, after all, here to *communicate*
[08:37] lifeless: I can has that OOPS?
[08:37] lifeless: Ah, thanks for the review.
[08:38] SQL time: 12978 ms
[08:38] Non-sql time: 1499 ms
[08:38] Aha.
[08:38] What're the big queries?
[08:38] Or are there lots?
[08:39] http://pastebin.com/DsDmLqLk
[08:39] one repeated query is the culprit I think
[08:39] Forgive me -- I've no idea what those columns mean.
[08:41] I'd really like to know where that query is coming from.
[08:41] Hmm.
[08:41] Maybe current_source_publication, I guess.
[08:41] Although that query should be pretty quick.
[08:43] lifeless: Do you have sufficient powers yet to land that branch?
[08:47] wgrant: probably but EBUSY
[08:47] wgrant: please grab someone else
[08:47] lifeless: OK.
[08:47] wgrant: oh, the 37 is repetitions
[08:48] row reps time avg time-avg db-id query
[08:48] there are also two very slow queries
[08:49] 12 seconds total
[08:49] http://pastebin.com/VsQpzqUq
[08:49] wgrant: ^
[08:49] wgrant: thats actually the issue, probably.
[08:49] WTF is it doing a count for.
[08:49] the rest is inefficient but broadly noise.
[08:50] Thanks.
[08:50] Those are indeed the problems.
[08:50] wgrant: I'd like to lower the timeout again today; is that ok with you, given you're about to propose a fix ?
:)
[08:50] lifeless: My other scripts are reasonably happy, so go ahead.
[08:52] I fully support any measures to make LP less slow.
=== almaisan-away is now known as al-maisan
[09:01] Hey up
[09:01] Can I get an EXPLAIN ANALYZE executed, or is that a stub thing?
[09:03] wgrant: any of the losas can do it for you. fire away.
[09:04] spm: Can you EXPLAIN ANALYZE the second query on http://pastebin.com/VsQpzqUq, please?
[09:09] wgrant: http://paste.ubuntu.com/465800/
[09:10] spm: Which DB was that? main master?
[09:11] one of the slaves
[09:11] Ah. Thanks.
[09:41] spm: it was running on main
[09:42] spm: you might want to check its the same?
[09:42] actually nevermind
[09:42] 7.7secs is 7.7secs
[09:42] It's about a second slower, so it's pretty simialr.
[09:43] heh, i'd be impressed if that made a difference, and no, looks near as identical.
[09:43] 7.9 vs 7.7; within the bounds of load/server abuse
[09:58] There are some recipe builds redispatching constantly, since they are targetted to series without chroots.
[09:59] https://code.edge.launchpad.net/~ted/+recipe/indicator-applet-dx
[10:04] rockstar: can you fix that?
[10:05] * bigjools sighs at people making builds for breezy, hoary and warty
[10:05] the UI should have blocked that
[10:05] It does.
[10:05] I bet the API doesn't, though.
[10:06] wgrant, why are they targetted to series without chroots?
[10:06] But we should fail builds if the chroot's missing, I think.
[10:06] rockstar: see the series I mentioned
[10:06] rockstar: Because Ted thought it was fun, I guess.
[10:06] bigjools, the UI does block that, so I guess it was an API thing?
[10:06] validation in browser code is largely useless now we have the API :/
[10:07] yay for fat models
[10:07] rockstar: yeah it must be
[10:07] bigjools: Also, we have a whole lot of impossible builds being created in the primary archive when security updates with failed builds are copied.
[10:07] bigjools, truthfully, that should probably take place in the model anyway.
[10:07] wgrant, can you file a bug?
[10:07] They get marked as failed immediately, but they clutter up build listings and probably shouldn't exist at all.
[10:08] * rockstar does not have a shortage of recipe bugs to fix
[10:08] heh
[10:08] rockstar: One for preventing creation, and another for dealing with broken ones?
[10:08] (since they can become broken after creation)
[10:08] wgrant, I think the broken ones is probably a buildmaster issue, right?
[10:09] rockstar: No. You pick the chroot in lp.code somewhere, and that'd be what's crashing.
[10:09] Also, what's going on with porting SPRBs and TTBJs to the new model?
[10:09] it needs to fail the build
[10:10] in progress
[10:10] Excellent.
[10:10] the latter is pronounced as "Titty Bee Jay" BTW
[10:11] Heh.
[10:11] rockstar: so those builds that won't ever complete, you should poke some SQL at the LOSAs to remove them
[10:11] and add a "cancel" button to the UI
[10:11] raising the tone
[10:11] There's a cancel branch in progress, right?
[10:11] bigjools, there is a cancel button already for admins.
[10:11] I thought I saw it floating around last week.
[10:11] ummm
[10:11] losa ping
[10:12] rockstar: why only admins?
[10:12] rockstar: hi
[10:12] should be the user, and buildd-admins
[10:12] bigjools, because that's what I wrote the code to do.
[10:12] :)
[10:12] What does it do when the build is in progress or finished?
[10:12] wgrant, where are these builds?
[10:12] wgrant, i thought might please you
[10:12] rockstar: the three pending builds on https://code.edge.launchpad.net/~ted/+recipe/indicator-applet-dx
[10:13] poolie: Indeed, I was pleased to see that.
[10:13] mthaddon, I have some builds that probably need to be cancelled.
[10:13] rockstar: what does the cancel button do BTW? mark them with a new status or delete them?
[10:14] mthaddon, https://code.edge.launchpad.net/~ted/+recipe/indicator-applet-dx/+build/175 https://code.edge.launchpad.net/~ted/+recipe/indicator-applet-dx/+build/176 and https://code.edge.launchpad.net/~ted/+recipe/indicator-applet-dx/+build/177
[10:14] * rockstar makes note to head punch ted
[10:14] I'm somewhat concerned that Code is doing things that are impossible.
[10:14] bigjools, I originally was deleting the build, but abentley thumped me, so I'm just changing the status now and deleting the buildqueue record if there is one.
[10:14] Like deleting branches related to recipes related to builds.
[10:14] rockstar: what is "wrong" with what ted has done here?
[10:14] And cancelling builds that are in progress.
[10:15] mthaddon, requesting builds against unsupported distros.
[10:15] rockstar: great, sounds fine, with the caveat that wgrant is raising
[10:15] rockstar: should he not be allowed to do that? if so is that a bug that we let him do that?
[10:16] mthaddon: Yes, bug #607125
[10:16] <_mup_> Bug #607125: Forbid creation of recipe builds for obsolete series through the API
[10:16] I see your caveat, and raise a "well, it's either that, or assume the buildmaster will fix it, which has been a bad assumption so far"
[10:16] thx wgrant
[10:16] * mthaddon adds to losa watched bugs
[10:16] rockstar: Well, cancelling an in progress build will at the moment either not work, or not work and knock the builder out until somebody notices that it has been idle for weeks.
[10:16] wgrant, there should be a cancel button on there now though.
[10:17] wgrant, this is why we left it up to the admins. The reason it's here is because builds keep getting re-tried even though they are always failing.
[10:17] we have some work to do to cancel in-progress builds
[10:17] rockstar: those have been cancelled
[10:17] rockstar: can you change that permission to buildd-admins please
[10:17] mthaddon, great, thanks.
[10:17] bigjools, you can...
:)
[10:18] rockstar: I would but we have all these buildd-manager bugs you keep exposing ;)
[10:18] * rockstar refrains from "it's open source" comment.
[10:18] * bigjools might do the same with the next buildd-manager bug
[10:18] bigjools, this is a hell of a pissing match... "My code is more broken than yours!" "Nuh uh!"
[10:19] lol
[10:19] Ah, it already raises a CannotBuild when the chroot is missing.
[10:19] But that just causes the dispatch to do nothing.
[10:19] So it's going to be retried on the next run.
[10:20] wgrant, ...unless we've cancelled the build.
[10:20] rockstar: Right.
[10:20] wgrant, "Cancel build" is there so that we don't have to make the losas run SQL.
[10:20] I speak of the general case.
[10:20] It shouldn't have to exist at all.
[10:20] one of my b-m bugs is to detect failures and work out whether to fail the job or the builder
[10:20] Luckily, spr is only available on edge.
[10:21] * bigjools cringes at the 2nd meaning of SPR now :(
[10:21] bigjools, maybe just see if the builder failed the last 5 builds before the builder commits seppuku.
[10:21] rockstar: that's the exact plan
[10:21] bigjools, my use of SPR is more important now.
[10:21] it can see if it's the same builder or the same job
[10:21] * rockstar only works on important things
[10:21] bigjools: We also need to fire the reset trigger if a builder is being stupid.
[10:21] haha
[10:21] bigjools: eg. during an abort.
[10:21] Or when we get a
[10:22] I have a branch that does that, but I backed it out
[10:22] :(
[10:22] wgrant, also, recipe builds are about to be able to be re-scored as well.
[10:22] wgrant: it disabled builders that were not resetting quick enough :/
[10:22] bigjools: Oh, right.
[10:24] bigjools: Once we no longer need to preserve ddebs on the buildd filesystem, is there any need for the virt/non-virt distinction? I think the architecture restriction magic can handle it all.
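The 10:20-10:21 exchange describes deciding whether to fail the job or the builder by checking whether the same builder, or the same job, keeps failing. A minimal Python sketch of that idea follows. The class name, method names, and verdict strings are assumptions for illustration and are not taken from buildd-manager; only the threshold of five comes from the discussion.

```python
from collections import deque

FAILURE_THRESHOLD = 5  # "the last 5 builds" mentioned in the discussion


class FailureTracker:
    """Hypothetical tracker of recent build outcomes per builder and per job."""

    def __init__(self, threshold=FAILURE_THRESHOLD):
        self.threshold = threshold
        self._by_builder = {}
        self._by_job = {}

    def _record(self, table, key, ok):
        # Keep only the most recent `threshold` outcomes for each key.
        history = table.setdefault(key, deque(maxlen=self.threshold))
        history.append(ok)

    def record(self, builder, job, ok):
        """Note one dispatch outcome (ok=True means the build succeeded)."""
        self._record(self._by_builder, builder, ok)
        self._record(self._by_job, job, ok)

    def _all_failed(self, history):
        # True only when we have a full window and every entry is a failure.
        return len(history) == self.threshold and not any(history)

    def verdict(self, builder, job):
        """Return 'fail-builder', 'fail-job', or 'retry'."""
        if self._all_failed(self._by_builder.get(builder, ())):
            return "fail-builder"
        if self._all_failed(self._by_job.get(job, ())):
            return "fail-job"
        return "retry"


# One builder failing five different jobs points at the builder.
bad_builder = FailureTracker()
for i in range(5):
    bad_builder.record("builder-a", "job-%d" % i, False)

# One job failing on five different builders points at the job.
bad_job = FailureTracker()
for i in range(5):
    bad_job.record("builder-%d" % i, "job-x", False)
```

Checking the builder first is a design choice of this sketch; the log does not say which should take precedence when both windows are full of failures.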
[10:25] well yes
[10:25] It would also mean we wouldn't have to lie about ivy and kaylaberry.
[10:26] we still need a split between non-virt and virtual i386/and64
[10:26] amd*
[10:26] Why?
[10:26] Once our queue discipline doesn't suck, there's no reason to.
[10:26] bigjools, it'd be nice if we could return the spr builds to be "anything but arm" instead of i386 only.
[10:26] archive restriction won't work there
[10:26] bigjools: Why not?
[10:26] rockstar: agreed
[10:27] wgrant: because we don't want PPA builds ending up on distro builders
[10:27] bigjools: But the distro builders don't need to be special.
[10:27] yes, they do
[10:27] Whyso?
[10:27] they're not PPA builders
[10:27] distro has its own hardware
[10:28] Ah, so it is political. Yay.
[10:28] besides, not resetting the builder shaves 30 seconds off the build time
[10:30] Anyway, regarding the "anything but arm" thing: we could just leave them on i386 and let amd64 builders build i386 and lpia as well, as has been planned for years.
[10:31] yeah there's a bug about that
[10:32] it would need a bit of work server-side :)
[10:32] Not really.
[10:32] The slave side is done, but not merged.
[10:32] And the master is simple enough.
[10:37] Although it might make /builders a little odd.
[10:45] mthaddon: hi
[10:45] lifeless: hi
[10:45] I has a new config CP on the wiki page
[10:46] lifeless: you've added it to ones already done rather than requested ones
[10:46] oh darn
[10:46] fixing
[10:46] lifeless: and since this is an edge one, it's not really a cherry pick - just a request for an edge rollout
[10:47] mthaddon: ok
[10:47] mthaddon: can you please rollout edge then? Sorry about not getting the process right :(
[10:48] wgrant: the hard part is the build dispatch ETA
[10:51] bigjools: Yeah. But arch-agnostic builds completely break that too. I think we need a new way to do things.
[10:51] lifeless: ugh, we don't have a new commit to "stable" since the last edge rollout, so it'd make a rollout currently a very manual process
[10:51] wgrant: they don't break it if they end up in the same arch-indep builders each time
[10:52] it's when the arch is indeterminate, mixed with determinate builds that you have an issue
[10:52] queueing theory is a bitch
[10:52] Yes :(
[10:53] mthaddon: here's what I'd like to do: I want to drop the edge timeout to 12 seconds. The ppr suggests 10 might be safe, but I don't totallly trust it.
[10:54] mthaddon: I'd like to find out during our work day if that change is an issue so we can rollback in a controlled fashion.
[10:55] lifeless: the next auto rollout would be tomorrow morning (assuming there's a new landing to stable between now and then), so how about that?
[10:58] mthaddon: it would make me sad. But if thats the best we can do, its the best we can do.
[10:58] mthaddon: also, its a bit risky.
[10:58] mthaddon: because if the auto rollout happens, and its bad, we're in the poo, and I explicitly want to avoid that.
[10:59] lifeless: there are LOSAs around during the auto-rollout, or if someone lands a change to stable between now and then, kicking off a rollout is fairly trivial
[10:59] mthaddon: could you just do an empty commit then, from praseodymium ?
[11:00] checkout lp:launchpad/stable; bzr commit --allow-unchanged ?
[11:00] lifeless: I *could* but this is more "working around our process rather than fixing our process" which makes me sad
[11:00] mthaddon: it makes me sad too
[11:00] mthaddon: I don't understand why our process has this limitation.
[11:00] Can you tell me about the limitation ?
[11:01] lifeless: the limitation is that we want to be able to push and build code without taking down the current app server - we do that by pushing to a new directory
[11:01] lifeless: the name of the new directory is determined by the revno of the top level bzr dir in the code tree
[11:01] lifeless: so unless we have a commit to that, we'd be pushing over the top of the existing directory
[11:02] lifeless: which our deployment scripts would stop from allowing
[11:03] right
[11:03] s/would stop from allowing/wouldn't allow/
[11:03] that would be bad.
[11:03] so, a few things occur
[11:03] we could make the top level number be a composite - lprevno,configrevno
[11:03] we could use a datestamp - lprevno,$time
[11:03] or even $time
[11:04] do you have any other ideas?
[11:04] lifeless: that would be problematic in terms of how the deployment scripts handle other projects (we don't just use it for LP), but the lprevno,$time would work
[11:04] ok, where is the script, so that I can put a patch together
[11:04] I mean the lprevno,configrevno wouldn't work, but lprevno,$time would work
[11:04] lifeless: lp:~canonical-losas/deployment-manager/trunk
[11:06] mthaddon: RepositoryFormatKnitPack1 - you could benefit by upgrading that :)
[11:07] bigjools: Is there a reason that external_dependencies are between the archive itself and the other archive dependencies in sources.list? Would the world explode if I moved it to the end (it would make some refactoring simpler)?
[11:07] I think so, let me remember
[11:08] It's possible that people are relying on it. But that's a bit crazy :/
[11:08] * mthaddon does it through the UI
[11:08] I think they are - it was for OEM's migration
[11:08] I guess I could just insert them after the first line.
[11:08] mthaddon: is that a bumper sticker? :)
[11:08] wgrant: what are you changing?
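The 11:01-11:04 exchange explains that the rollout directory is named after the tree's revno, so a config-only rollout with no new commit would collide with the existing directory, and proposes `lprevno,$time` as a fix. A minimal Python sketch of that naming scheme follows. The function name, the hyphen separator, the timestamp format, the revno 11000, and the dates are all assumptions for illustration; the real deployment-manager script is not shown in this log.

```python
from datetime import datetime, timezone


def release_dir_name(revno, now=None):
    """Hypothetical helper: build a directory name from a revno plus a UTC timestamp.

    Two rollouts of the same revno at different times (for example a
    config-only change) get distinct directory names.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    return "{}-{}".format(revno, now.strftime("%Y%m%d%H%M%S"))


# Same (made-up) revno rolled out twice on the same (made-up) day.
first = release_dir_name(11000, datetime(2010, 7, 19, 11, 4, 0, tzinfo=timezone.utc))
second = release_dir_name(11000, datetime(2010, 7, 19, 12, 0, 0, tzinfo=timezone.utc))
print(first)
print(second)
```

Putting the revno first keeps directories grouped by code revision when listed, while the timestamp suffix removes the need for an empty commit just to get a fresh name.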
[11:09] bigjools: no, that's "LOSAs do it through the UI" :)
[11:09] bigjools: https://code.edge.launchpad.net/~wgrant/launchpad/bug-598345-restrict-dep-contexts/+merge/30203
[11:09] mthaddon: :D
[11:10] wgrant: yay!
[11:10] btw I have a new buildd-manager that I finished (sorta)} last week
[11:10] wgrant: so, please tell me you're going to fix the soyuz api :)
[11:10] Yeah, I saw that. Looks good.
[11:11] lifeless: Hm? The timeouts?
[11:11] wgrant: are you stalking me? I didn't even make an MP yet!
[11:11] yes, in particular
[11:11] bigjools: I guess I'll just change it to insert the external deps after the first real dep entry.
[11:11] bigjools: I stalked lots of LP branches over the weekend :P
[11:12] lifeless: we need a way of jobbing copy-packages before we get rid of the major source of timeouts ;)
[11:12] bigjools: you could use the jobs system
[11:12] no ?
[11:12] yes - but it would be a lot of work
[11:12] how is it done now ?
[11:12] all synchronously in the appserver
[11:12] aiieeee
[11:13] it issues a shitload of queries
[11:13] the copy checking is very complicated
[11:13] lifeless: This is Launchpad. Launchpad's motto is "Let's do everything in the appserver until it times out, then ignore it for a while after it starts to."
[11:13] but it is well-factored so it's just a case of UI changes
[11:14] bigjools: But isn't a lot of the time taken in verifying that the copy can actually happen successfully?
[11:14] wgrant: yes, that's exactly what I am talking about
[11:14] wgrant: that was last month.
[11:14] the copy itself is trivial
[11:14] the checking - not so much
[11:14] rabbit is now available.
[11:14] Hmm. Doing that asynchronously sounds a bit awkward.
[11:14] the jobs system is pretty good even without rabbit.
[11:15] lifeless: I hope you saw my comment on that bug about more daemons
[11:15] lifeless: There isn't really a jobs "system"...
[11:15] bigjools: I did, but I don't have an answer other than 'yes, we should fix it so we don't monkey with apache, pgsql, etc/hosts, etc etc etc'
[11:15] lifeless: the problem is endemic
[11:16] agreed
[11:16] this is why I develop lp in a VM
[11:16] that's not a cure, it's pain relief
[11:16] If you have suggestions for how to develop software using rabbit without having rabbit present, I'm all ears.
[11:16] rabbit ears?
[11:16] Bwahaha
[11:16] we just need a way of selectively starting this stuff when needed
[11:17] during make run, or during the tests
[11:17] in this case, perhaps file a bug on rabbitmq in debian
[11:17] bigjools: Is buildd-manager really so slow at the moment because of process-upload.py, or is the candidate selection query also taking a while?
[11:17] the package being how it is isn't really my fault :)
[11:17] I am really fed up of more daemons running in the background :(
[11:17] bigjools: we shouldn't/can't use the system rabbit for tests anyhow.
[11:17] lifeless: it's not unreasonable that it starts when you install it
[11:18] bigjools: please don't think I'm advocating having it running all the time: I'm not.
[11:18] but the lp packages should turn it off
[11:18] wgrant: most of it is upload
[11:18] bigjools: that would worry me; too much chance of turning it off on prod.
[11:18] the next slowest part is dispatching (it takes 10 seconds to send all the files to the builders)
[11:18] candidate selection is very fast
[11:19] bigjools: Hmm. But there are only 59 builders... and it gets out to 15 minutes sometimes.
[11:19] lifeless: make start should start it
[11:19] no
[11:19] .......
[11:19] we should not use the system instance at all
[11:19] make start should run a private instance using the system binary
[11:19] wgrant: yeah that's upload time
[11:19] Hm, I guess that is Hm, I guess that only means 15s of upload time, yeah.
[11:19] which may be what you mean
[11:19] Crazy.
[11:19] lifeless: yep
[11:20] althought, again, make start on prod - we have to be very careful not to mess up there.
[11:20] lifeless: But we do the same sort of thing with the librarian, so there's precedent.
[11:20] wgrant: the way we use the librarian causes lots of headaches, testing included.
[11:21] lifeless: True.
[11:21] mthaddon: I am admiring of your unit tests :P
[11:22] lifeless: er.... yeah
[11:23] I'd really like "make start" to start everything it should do, given the LPCONFIG
[11:30] I'd like to not start Launchpad using make targets.
[11:31] jml: what would you like to start using make targets?
[11:31] lifeless, are you trolling?
[11:32] only a little
[11:32] lifeless, well, I don't understand the question. But to elaborate on what I meant, I'd like to not have make targets for running Launchpad or its bits, and instead use actual scripts.
[11:32] maybe even the same scripts mthaddon uses!
[11:33] we should talk about this later
[11:34] lifeless, yes.
[11:34] lifeless, after you review my branch, for example :P
[11:34] my point was that I don't want to think about what to start
[11:34] I was thinking after LP is fast
[11:34] lifeless, please review my branch before then.
[11:34] haha
[11:35] seriously though, if someone else reviews it its all good too.
[11:35] uhm, those logouts shouldn't be needed now though, should they ?
[11:36] lifeless: are you lowering hard or soft timeouts on edge? or both?
[11:37] lifeless: also have you notified launchpad-users?
[11:37] lifeless, LP is fast???
[11:47] what they did for couchdb was make a couchdb-bin package with all the binaries, and a couchdb package with the init script.
[11:47] doing a similar thing would be l-dev-deps could depend on rabbitmq-bin
[11:48] +1
[11:48] james_w: while you're there! can you check out the LEP for derived distros please?
[11:49] in particular please make sure we've captured all your requirements === al-maisan is now known as almaisan-away [11:58] Can someone please land lp:~wgrant/launchpad/remove-obsolete-buildd-junk? [11:59] wgrant, sure, if it's approved. [11:59] jml: It is indeed approved. [11:59] bigjools: prod can't go lower until we deploy edge to prod [12:00] bigjools: because its got many more timeouts as a baseline [12:00] wgrant, running ec2 land now. [12:00] jml: Thanks. [12:00] np [12:00] lifeless: that's not what I asked! :) you know soft vs hard? [12:00] yes [12:01] sorry, got ELOCAL for a sec [12:01] anyhow, edge - dropping both [12:01] bryceh: that is the goal. [12:05] deryck: hai [12:05] lifeless, hi [12:06] deryck: If you guys have any slack, or the ability to deal with oops and near-oops, I have a few things for you :) [12:06] lifeless, yes, we can deal with oops and near-oops. [12:07] deryck: so, I sent you a mail listing 4 pages that are near to timing out on prod [12:07] but also [12:07] Is stub around this week? [12:07] on friday night there was a big boom [12:08] wgrant: once he gets home, yes. [12:08] lifeless: Thanks. [12:08] deryck: on friday night, one API call went 8boom8 [12:08] lifeless, re: mail. I marked it to look at today, and will respond with comments. But generally, making those a priority is welcome and not a problem for me. [12:08] 4000 times [12:08] lifeless, I saw that. The message stuff, right? Same API stuff bryceh was asking about at the epic, right? [12:08] poolie and jtv have done some analysis [12:09] I guess, yes. [12:09] lifeless, yeah. gmb is testing a fix as we type. :-) [12:09] awesome [12:09] lifeless, haha [12:10] bryceh: everyone wants it to be fast. [12:10] so we shall be fast. [12:10] deryck: https://lp-oops.canonical.com/oops.py/?oopsid=1659B1002 for instance [12:10] deryck: poolie reckons its just the calculation of the canonical url per item [12:10] hey, there's bryceh :-) go to bed. 
;) [12:11] * deryck is reading the OOPS [12:11] deryck, yeah, 4am == well past bedtime === salgado is now known as salgado-lunch [12:17] lifeless, so having processed all the info now, I think gmb and I can chase up a fix today for this. And I'll respond in depth to the timeout email before EOD for me. [12:18] awesome [12:18] we're getting up some momentum on performance [12:18] if we can just get a bit of a cadence going we'll be at 5 seconds in no time [12:18] deryck, lifeless: Have fix will travel: http://pastebin.ubuntu.com/465875/ [12:18] lifeless: Crank the timeout down to 1s. [12:18] Then see how quickly things get fixed :P [12:19] wgrant: that would just fuck everyone over [12:19] :( [12:19] There's already tonnes of test coverage. [12:19] gmb: isn't messages.first() or something even cheaper? [12:19] lifeless, It might be; I can never remember the Storm runes. Let me try it. [12:20] its not first, I don't remember them either [12:20] food time [12:20] bigjools: Am I OK to go through with https://code.edge.launchpad.net/~wgrant/launchpad/refactor-_dominateBinary/+merge/29667? It seems to be a good cleanup, and is somewhat necessary for ddeb stuff. [12:21] my eyes [12:22] what have you changed? [12:22] Oh, right, I haven't added a description yet. Oops. [12:22] You know the atomic arch-indep domination stuff? [12:22] vaguely [12:23] I've never done anything in that code [12:23] lifeless, FTR, first() appears to shave ~2s off test run time, though that could be the weather or the proximity of chickens to the developer for all I know. [12:23] it's headfuck territory [12:23] bigjools: Yes. Exactly. I'm trying to unheadfuck it. [12:23] At the moment, publications have a supersede() method. [12:24] It basically just sets datesuperseded and the status. [12:24] wgrant: in tandem with my b-m changes, we might start unheadfucking soyuz [12:24] The Dominator handles setting supersededby, and it also handles atomic domination rules. 
[12:24] (atomic domination means that if an arch-indep publication is superseded, other publications of the same binary immediately get superseded too) [12:25] This makes for exceedingly ugly code, and even uglier tests. [12:25] So I've moved all of that into the model. [12:25] gmb, so like lifeless I think use .first, but otherwise this looks good to me. [12:25] So Dominator just calls publication.supersede(dominant), and the model works out what else needs to die, sets supersededby, etc. Dominator just finds the candidates and tells them to do their stuff. [12:26] deryck, Cool, preparing a branch now. [12:26] A subsequent branch replaces the tests with some that suck less. [12:26] s/branch/mp [12:26] *Then* i can extend it to dominate ddebs properly, which is the whole aim of this. [12:26] gmb, I wonder if we should look at bug 606911 while we're here, and then cowboy to staging and hit with an api script to see if we're better? [12:26] <_mup_> Bug #606911: Bug.indexed_messages unused? [12:26] gmb, especially if that is just dropping unused code. [12:26] wgrant: ok seems like a rational thing to do [12:27] wgrant: my concerns are: performance and performance [12:27] bigjools: No performance changes here. [12:27] It just shuffles methods into the model and wraps them up. [12:28] ok - I'd still like to do testing pre and post-branch on DF to make sure [12:28] deryck, Yeah, I'm thinking that too. [12:28] * bigjools doesn't apologise for being paranoid [12:29] Heh, OK. [12:30] I hate finding my own code when searching on google for stuff [12:30] Haha. [12:30] deryck, Ahaa. Except that indexed_messages is exported as 'messages'. [12:30] So that would be a big bag of fail. [12:30] I hate finding all this Soyuz code that hasn't been touched by anyone who's still around, and is revolting. [12:31] wgrant, That's *why* they're not still around. [12:31] gmb: Heh. [12:31] deryck, I suggest we just try cowboying the initial_message fix for now. 
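Editor's note: the refactoring wgrant describes above (the model, not the Dominator, decides what else gets superseded) can be sketched roughly as below. This is a minimal illustration under stated assumptions, not Launchpad's actual code: `BinaryPublication`, the plain-list `archive`, the `dominate` helper and the string statuses are all hypothetical stand-ins; only `supersede`, `datesuperseded` and `supersededby` are names taken from the conversation.

```python
from datetime import datetime, timezone

PUBLISHED = "PUBLISHED"
SUPERSEDED = "SUPERSEDED"


class BinaryPublication:
    """Hypothetical stand-in for a binary publication record."""

    def __init__(self, archive, name, version, arch, arch_indep):
        self.archive = archive  # plain list shared by all publications (assumption)
        self.name = name
        self.version = version
        self.arch = arch
        self.arch_indep = arch_indep
        self.status = PUBLISHED
        self.datesuperseded = None
        self.supersededby = None
        archive.append(self)

    def _mark_superseded(self, dominant):
        # Set the date, the status and the record of what replaced us.
        self.status = SUPERSEDED
        self.datesuperseded = datetime.now(timezone.utc)
        self.supersededby = dominant

    def _other_published_copies(self):
        # Same binary name and version, still published, on any architecture.
        return [
            p for p in self.archive
            if p is not self
            and p.name == self.name
            and p.version == self.version
            and p.status == PUBLISHED
        ]

    def supersede(self, dominant=None):
        """Supersede this publication; the model applies the atomic rule."""
        self._mark_superseded(dominant)
        if self.arch_indep:
            # Atomic domination: an arch-indep binary dies everywhere at once.
            for other in self._other_published_copies():
                other._mark_superseded(dominant)


def dominate(candidates, dominant):
    """The caller only finds candidates and tells them to supersede."""
    for publication in candidates:
        if publication.status == PUBLISHED:
            publication.supersede(dominant)
```

With this shape the caller never touches `supersededby` directly, which is the point made in the discussion: the atomic rule lives in one place and can be tested on the model alone.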
[12:32] gmb, that's fine with me. [12:33] gmb, I can investigate the other bug today after I settle in. Then, you can go back to subscriptions hacking after this fix. [12:35] deryck, Okay. I'll work up an mp for the performance fix anyway. [12:36] gmb, yes, do. I wasn't suggesting dropping this fix. Just that you don't have to worry about the other bug for now. [12:37] deryck, Right. I managed to put the word "anyway" in there without meaning to. What I meant was "now" [12:37] heh, fair enough [12:54] gmb: is it first () ? [12:54] lifeless, Yes [12:54] hah, who knew :) [12:54] lifeless, Well intuited :) [13:02] gmb: you win some, you win some more [13:03] wgrant, oops. forgot to update download-cache. trying again. [13:03] wait. no. [13:05] ec2 land has been broken by recent changes to Launchpad. [13:05] gmb: https://code.edge.launchpad.net/~gmb/launchpad/bug-606914/+merge/30259 === mrevell is now known as mrevell-lunch [13:06] jml: Awesomeness. [13:07] lifeless, Thanks. Will remove the whitespace; bad habit. [13:07] wgrant, I'm going to fix those before landing your branch. [13:07] gmb: my overly general rule of thumb is 'if it needs vws in a function, it probably needs two functions' [13:08] jml: what changes broke ec2land? [13:08] lifeless, Good rule :) [13:08] lifeless, a change to the webservice. [13:08] jml: .next() [13:08] lifeless, AttributeError: 'Entry' object has no attribute 'queue_status' [13:08] grah [13:08] thanks [13:09] lifeless, I think jkakar has a patch for better testing support with launchpadlib [13:09] jml: a bit, yes. [13:10] he's trying to get it landed [13:10] but launchpadlib is a bit single-writer-gated atm [13:11] hmm. [13:11] jam OOPS-1661ED1057 [13:11] hi there deryck [13:11] thanks for looking at the attachment bug [13:11] queue_status is still being exported :\ [13:12] benji, good morning. [13:12] good morning, jml; how goes it? 
[13:12] gmb: when you finish with that, I have a question on your reset watches stuff [13:12] benji, alright. [13:13] fixing bugs. not taking prisoners. [13:13] hi poolie. np. glad the fix was easy. [13:13] gmb did the heavy lifting :-) [13:15] benji, there are a few pending patches from Launchpadders to zope.testing. would you be interested in reviewing & landing them? [13:16] (if not, I'll try to find sidnei) === almaisan-away is now known as al-maisan [13:16] lifeless, Sure, ask. [13:16] glad to take a look [13:16] benji, https://code.edge.launchpad.net/zope.testing/+activereviews [13:16] who will be looking for bad attempts to reset bug watches [13:17] lifeless, wgrant: false alarm. it's my bug. [13:17] lifeless, I'm not sure what you mean. Do you mean "who will be looking to see which bug trackers need their watches resetting?" or "who will be looking to make sure no-one maliciously resets watches?" [13:18] Or something else? [13:18] the second [13:19] deryck, i'd be interested to review the result when it's up for review, for my own education [13:19] lifeless, Any UserCannotResetWatchesErrors should appear in the OOPS report, shouldn't they? So we should spot them fairly quickly. [13:19] poolie, ok, it's up. let me find the link. [13:19] lifeless, Only LP Devs and Admins are able to do the resetting or for that matter see the reset button at all. [13:20] poolie, https://code.edge.launchpad.net/~gmb/launchpad/bug-606914/+merge/30259 [13:20] got it thanks [13:20] gmb: who is going to be *looking* for that though [13:20] its going to show up in 'exceptions' [13:21] gmb: and how is it different to any attempt to circumvent permissions - e.g. creating a new private branch, viewing a private bug etc. [13:21] lifeless, I'm not sure that it is different... Ah, but, instead of the custom Exception it would make more sense to raise the same kind of exception as the other attempts to circumvent permissions do. 
[13:22] I don't know if anyone actively looks for those errors, but I think it's a safe bet that someone does because they get reported fairly quickly. [13:22] Let me do some digging. [13:23] gmb then i suspect bug 424671 won't be fixed by this [13:23] <_mup_> Bug #424671: attachments: accessing attachment's message is very slow [13:23] but it's similar [13:23] as you probably already know [13:24] gmb: I think the normal permission denied behaviour is really all that is needed [13:24] gmb: unless you have some particularly special horror story we're expecting [13:25] lifeless, No, I just wrote that stuff about half an hour before getting on a plane and didn't think about it very hard :) [13:25] :) [13:25] I'll change it appropriately. [13:27] poolie, I agree that they're similar but I think they should be fixed separately (if only because I knew exactly how to fix the first bug but wasn't sure I grokked a solution for the second). [13:27] if it fixes the timeout, I'm happy [13:27] there are 4 other near-timeouts that can queue between 'better' and 'perfect' for this API, IMHO :) [13:28] vocabularies.txt is 1428 lines long. [13:29] jml: Don't look at archive.txt... [13:29] I probably will eventually. [13:30] I'm on a quest to remove all the imports in canonical.launchpad.interfaces.__init__ [13:30] ahahah. [13:31] my current plan is to do it one import at a time [13:31] but to fix all of the imports in every file I touch [13:31] so that it gets easier as I go along [13:31] Right. [13:36] wgrant: your VWS in supersede doesn't really gel for me - please remove or comment [13:37] * gmb -> lunch [13:39] lifeless: What's the issue with it? [13:39] its not clear why these lines need special attention drawn to them [13:40] Should I remove all blank lines from both methods? [13:41] I generally follow pep8 here [13:41] by which I mean 'yes, unless you really want special attention' [13:41] Hm, OK. 
[13:41] http://www.python.org/dev/peps/pep-0008/ [13:42] " Use blank lines in functions, sparingly, to indicate logical sections. [13:42] " [13:42] They are, in my opinion, indicating logical sections. [13:42] It's unfortunate that one of those sections is a super() call, and that it has another section on top of it. [13:43] thus my request to add a comment :) [13:44] Like what? "# Set date and status."? [13:49] wgrant: if its self evident, I don't understand why its really a different section; if its not self evident, surely it should have a comment? [13:49] wgrant: however, its up to you, this isn't a blocking issue. === mrevell-lunch is now known as mrevell [13:53] lifeless: I don't think it deserves its own section. But the segments of code above and below it do. [13:54] lifeless: Thanks for the reviews. [13:57] de nada === Ursinha-afk is now known as Ursinha === salgado-lunch is now known as salgado [14:07] lifeless: Are you landing those, or must I coerce others? [14:07] coerce [14:07] k === vednis is now known as mars === beuno_ is now known as beuno [14:27] Can someone please ec2 land lp:~wgrant/launchpad/bug-598345-restrict-dep-contexts, lp:~wgrant/launchpad/refactor-_dominateBinary and lp:~wgrant/launchpad/really-publish-ppa-ddebs? [14:49] deryck: I like your team strat [14:49] lifeless, for testing? [14:49] and cycle time [14:50] ah, right. Cool. [14:51] jml: busy? [14:54] jpds: if you need a hand on https://code.edge.launchpad.net/~jpds/launchpad/fix_116279/+merge/17940 gimme a shout [15:08] james_w, ping [15:13] lifeless, hi [15:13] jml: where could I find you? [15:13] lifeless, community room [15:14] gary_poster, mars: hello. there's a doctest issue? [15:14] jml, looking [15:14] jml: hey. Did you see my discussion about the number of doctests in #launchpad-code [15:14] With mars [15:14] gary_poster, it's gone down, right? [15:14] That would be the fastest way to catch up [15:14] right [15:14] ok. 
[15:14] and I think I know why, and that it is ok [15:14] oh good. :) [15:14] but would love confirmation [15:14] looking now [15:15] yeah, cool [15:15] simple summary compare [15:15] gary_poster, can you tell me why you think why? [15:15] rockstar, I'll catch up w/ you re private branch lookup after this. [15:15] jml: "buildbot went from reporting 26777 tests to 8490 tests when jml landed the branch to use stdlib doctest. It looks like it might be OK: the amount of time didn't change, so I think this is because Zope's doctest counts each block as a test, while the normal doctest code counts the whole thing as one. " [15:16] yeah, that sounds right. [15:17] gary_poster, mars, I don't really know how I'd confirm that, other than comparing the fulltext buildbot output. [15:18] yep, doing so [15:18] test lines before: 8941 [15:18] test lines after: 8976 [15:19] so yes, a) zope counts doctests multiple times, and b) our test suite runtime sucks three times more than I thought it did [15:20] it's not quite that zope's doctest counts tests multiple times, its that it counted tests slightly differently than the stdlib doctest [15:21] yep [15:21] didn't notice that before [15:27] by the way, I found that a good sized part of our testrunner time is being spent parsing doctests [15:29] which is a pity because it happens in the test discovery phase, and after the doctest is discovered, then the new layer's subprocess means that the work has to be done again [15:29] that surprises me; doctests should be very cheap to parse [15:30] I tried profiling the layer setup, and doctest parsing was a big part of it [15:30] it was volume more than anything [15:31] \o/ [15:31] benji: 'parse' + 'cheap' does not exist in python. [15:32] rockstar, I've commented on the bug for private branches. I figured that was the best way. I'm available for face-to-face convo if you'd like one. [15:34] lifeless, that is part of that mysterious uncounted timesink when you first fire up the runner. 
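Editor's note: the counting difference discussed above can be illustrated with the standard library alone. The sketch below shows that stdlib `doctest` wraps a whole text in a single test case even though it contains several examples, so a runner that counted per example (or per block) would report a much larger number for the same work. How Zope's doctest actually counted is not confirmed in the conversation; `SAMPLE` and the two helper functions are made up for illustration.

```python
import doctest

# A made-up doctest file body with two blocks and three examples.
SAMPLE = '''
First block:

    >>> 1 + 1
    2

Some prose in between.

    >>> x = 3
    >>> x * 2
    6
'''


def count_as_stdlib_test_cases(text):
    """Number of unittest test cases stdlib doctest makes from one text."""
    parser = doctest.DocTestParser()
    test = parser.get_doctest(text, {}, "sample", None, 0)
    # One DocTest becomes one DocTestCase, however many examples it holds.
    return doctest.DocTestCase(test).countTestCases()


def count_examples(text):
    """Number of individual '>>>' examples in the same text."""
    return len(doctest.DocTestParser().get_examples(text))
```

Here `count_as_stdlib_test_cases(SAMPLE)` gives 1 while `count_examples(SAMPLE)` gives 3, which is the kind of gap that would make a reported test total drop without any change in runtime.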
I couldn't isolate .pyc cleanup due to disk caching issues. [15:34] btw, mars: remember last week when we were wondering if the lower-memory EC2 instances would work for ec2 test? I watched one of the instances when running the tests and it ended up using 2.6G of RAM, of which 1.3G was cache/buffers [15:34] mars: nice [15:35] I also suspect the importfascist may be sinking some time, again due to volume of use [15:36] benji, what does 50% memory use as cache/buffer percent mean? [15:36] that's primarily disk cache [15:36] That the kernel is using it as a disk buffer [15:36] salgado, ping for review of https://code.launchpad.net/~leonardr/launchpad/true-anonymous-access/+merge/30088 [15:37] poolie, salgado says it's ok [15:37] sweet [15:37] i just need to change a comment and i'll land it [15:37] fire one! [15:37] thanks for doing it [15:38] poolie, yeah, did that over IRC some time ago. just updated the MP, though [16:18] jtv1: hi [16:18] hi lifeless [16:18] jtv1: https://code.edge.launchpad.net/~ursinha/launchpad/add-ec2land-rules-orphaned-branches/+merge/29255 needs a follow up review === jtv1 is now known as jtv [16:18] per your request to urshina [16:18] lifeless: cool, thanks for letting me know [16:18] bah [16:18] lifeless, I haven't finished that yet [16:18] doing that today [16:18] irsinha [16:18] oh [16:18] Ursinha: oh, ok. [16:18] I'll put it back to wip for now hten [16:23] \o/ [16:23] no outstanding reviews [16:23] -> chemist [16:26] lifeless, ok then, I'm finishing details to resubmit, what I couldn't finish last week :) [16:28] and I have a hilight for urshina here, the mispelling happens quite often I consider myself that too [16:29] Ursinha: I'm sorry ;) [16:29] no problem :) === Ursinha is now known as Ursinha-nom [16:51] bigjools: hi [16:51] hello [16:52] bigjools: if you feel particularly strongly either way about admins & PPA creation via the API, please do what you think is right. 
[16:52] bigjools: I'm trying to express a logic chain in the review not to mandate that it go either way. [16:52] lifeless: well I don't feel strongly enough to cry if it's not done, as I said on the review [16:52] bigjools: the underlying reason why I am arguing against a special case here is that the admins have said they want *less* interrupts. [16:53] and given that any user can make a PPA via the web ui already, I don't understand why we'd need an admin override facility here. [16:53] bigjools: sure; I'm being explicit that its your choice. [16:54] I *feel* that we should default one way; however its always changable, and its in your part of the product :) [16:54] lifeless: I understand your concern. Mine is that one day we'll need it and the facility is not there. I don't expect LOSAs to routinely get asked to make PPAs! [16:55] I'm also still feeling moderately crap and I was communicating badly in the review which is why I've upgraded to real-time ;) [16:55] * bigjools hears ya [16:55] bigjools: thinking through it more I have another reason/condition around admins making arbitrary PPA's [16:55] we require a CoC signature from PPA users. [16:56] would that be done on the admin, or on the user the admin is making the PPA on [in the current code]? If on the admin, then that would be a bug ? [16:59] interesting point [16:59] lifeless: When you're done with this discussion, I have a question for you - In https://answers.launchpad.net/launchpad-code/+question/117193 you asked me to file a bug for the "class of problem" of requiring LOSA intervention to make code imports work. I don't understand why. Shouldn't bugs be about concrete issues that we can fix? [17:07] gary_poster: hi [17:07] hey lifeless. [17:07] gary_poster: I want to talk cronscripts with you in a bit [17:07] I've just realised I should answer maxb, and get dinner first. 
[17:07] gary_poster: the theme will be 'making them safer to upgrade' [17:08] I will be around later, I would hate to keep someone from their dinner :-) [17:08] maxb: the class of problem is 'needing to run ssh
and accept the key' [17:08] lifeless: BTW, I intend to write up the PQM/tarmac requirements; I realized this weekend that there are some bits that I still need to clarify [17:08] gary_poster: awesome [17:08] lifeless: cronscripts: cool [17:08] gary_poster: I saw the split to a subspec [17:08] lifeless: oh, ok I see what you meant now [17:08] maxb: that class of issue is actionable and automatable. [17:08] maxb: :) [17:09] ok, awol for a little bit [17:09] by "actionable", presumably you don't mean that we can sue. [17:09] jml: who knows.... the phantom knows [17:09] shadow. [17:09] jml: I'm grabbing a club sandwich if you want to join me [17:09] phantom doesn't know anything. [17:09] lifeless, sounds good [17:10] lifeless, once this ec2 land detaches :) === al-maisan is now known as almaisan-away [17:17] jml: I has a table [17:17] lifeless, k. still detaching. [17:17] done === flacoste is now known as flacoste_lunch === danilos is now known as daniloff === beuno is now known as beuno-lunch === Ursinha-nom is now known as Ursinha-afk === Ursinha-afk is now known as Ursinha === flacoste_lunch is now known as flacoste === beuno-lunch is now known as beuno === mtaylor is now known as mtaylor|lunch [19:33] gary_poster: hi, so if you're still @ work [19:33] Hey lifeless. Yeah [19:34] to change to the 'release features when they are ready' workflow [19:34] we need to drastically simplify and reduce the downtime windows involved in rollouts [19:34] james troup and I did a mental audit [19:34] lifeless: we release edge every night [19:34] and I'm going to file a bunch of bugs related to this [19:35] isn't that the model we want to pursue? [19:35] gary_poster: we do, but - and this is one of the problems with the current setup - that is only a small fraction of the lp instances around. 
[19:35] ah, ok [19:35] gary_poster: so we need to do what we've done for edge and solve the issues [19:35] for instance [19:35] ok, fair enough [19:35] we have a lot of cronscripts [19:35] (BTW: https://dev.launchpad.net/Foundations/Proposals/SimplifyMergeMachinery) [19:35] great, queued it up to read [19:36] cool [19:36] so in particular, we have a lot of cronscripts [19:36] which run 'prod' [19:36] so we will be updating their code more frequently [19:36] run 'prod': that means, there is no "edge" equivalent [19:36] I'm wondering if there is something we can do to make it safer to do [19:36] ? [19:36] right [19:37] so at the moment their 'deploy frequency' is 1/month [19:37] Eh, yeah, I hate the cronscript story. Touching them is terrifying. [19:37] My long term preference is that they disappear [19:37] agreed [19:37] however we have to break the deadlock somewhere ;) [19:38] right [19:38] 170 cronscripts! [estimated] [19:38] yeah [19:38] so, why is touching them terrifyingL [19:38] : [19:38] so the bug is going to be a bit generic [19:38] it is terrifying because the tests for them largely suck, AFAIK [19:38] but in general we want to aim at being able to automatically upgrade them [19:39] Is there a problem other than the one I perceive? [19:39] in the short term that could be manual but smoother than today. [19:39] gary_poster: yes, several. [19:39] cool, should I wait for the bug report? :-) [19:39] - they run from cron, so either they try to run while we're upgrading, or they may still be running while we upgrade [19:39] ah [19:40] - they run from the 'current' symlink, not the 'active revno dir', so they could in principle do late buggy imports (but this is rare, we can ignore for now) [19:40] - some of them have terrible knock on effects if interrupted, and take 15-20 minutes every hour. [19:41] * gary_poster tries to look at the cron scripts so they feel bad about themselves [19:41] ...not much effect... 
[19:42] I have some vague ideas on how to address this, that have to do with trying to come up with cheap machinery and/or patterns for cronscript devs to follow [19:43] I feel constrained both because I don't have any knowledge or even oversight of the cronscripts, and because I don't want to spend too much time on them, knowing that I want them to die. [19:43] Perhaps it would be reasonable to discuss the machinery that we want to have to replace them in the long term [19:44] perhaps [19:44] to see if it would inform the short term steps we might take [19:44] Its late for me right now [19:44] Fair enough [19:44] * benji wonders if gary_poster is thinking of zc.async [19:44] I wanted to frame the discussion [19:44] broadly, I think we need something to wrap them [19:44] to quiesce them when we're preparing to deploy [19:44] benji, sure, but we won't be using that here, I don't think :-) [19:44] to alert us when important ones are running still [19:45] and to insulate us mid-upgrade [19:45] I assume some sort of LOSA gesture like "I plan to upgrade things in half an hour so let's start quieting things down" would be reasonable [19:46] https://wiki.canonical.com/InformationInfrastructure/OSA/LaunchpadRollout [19:46] for now, absolutely [19:46] https://wiki.canonical.com/InformationInfrastructure/OSA/LaunchpadRollout#Comment out Cronjobs and Kill DB Backups (approx 30-40 mins before rollout) [19:46] thats terrifying [19:46] we have a tool to comment out cronjobs!
[19:46] sigh [19:47] Another aspect to this is that the cronscripts themselves have had no constraints on them to my knowledge [19:47] indeed [19:47] But again, I suspect that we want to retain that freedom for now [19:47] so I plan to conduct an audit [19:47] Or else we will be bogged down [19:47] etc [19:47] but yeah [19:47] yeah, cool [19:48] my basic thought is to have a single constant wrapper around all the scripts [19:48] so that rather than commenting them out and hoping they stop etc [19:48] the wrapper can check a flag somewhere [19:48] you make some flag to make it a no-op [19:48] and not run [19:48] yeah [19:48] right [19:48] for now, I'm going to file a bug on each area of concern for the new workflow [19:49] OK, sounds reasonable [19:49] this one is the least well formed [19:49] which is why I wanted to kick off some neuron [19:49] s [19:49] Cool [19:49] Oh, I'll just choose one :-) [19:49] :) [19:50] \o/ lsprof has landed [19:50] thanks mars [19:50] great news, mars [19:50] and thanks to you too lifeless [19:50] de nada [19:50] cool, one step closer [19:52] ...so, yeah, such a cron script wrapper could be fairly complex. I think the trick will be to come up with a plan for one that is as simple and failsafe as possible. I'l think about it, and I'll hope to brainstorm with you about it later. [19:52] great === mtaylor|lunch is now known as mtaylor [20:10] elmo: around? [20:10] elmo: I'm trying to remember whats on loganberry that is rollout-sensitive [20:13] what's the normal turn-around for a (relatively simple) RT? [20:13] yes [20:14] more precisely, its either in the priority queue, or 'sometime over the horizon' - like us, they have more incoming than outgoing. [20:14] so you should talk to francis [20:16] rockstar: what's the reasoning behind the need for a merge request to have a commit message? 
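Editor's note: the "single constant wrapper" idea floated above (check a flag, and turn the run into a no-op while a rollout is being prepared) might look something like this minimal sketch. Everything here is an assumption for illustration: the function name, the use of a flag file on disk, and the choice to return `None` when skipped. It does not cover the other concerns raised in the discussion, such as alerting when an important script is still running.

```python
import os
import subprocess


def run_unless_quiesced(argv, flag_path):
    """Run the wrapped cron command unless the quiesce flag file exists.

    Returns the command's exit code, or None if the run was skipped.
    """
    if os.path.exists(flag_path):
        # A rollout is being prepared: do nothing rather than start new work.
        return None
    return subprocess.call(argv)
```

In a crontab, each entry would then invoke the wrapper with the real script as its argument, so that creating one file quiets every job instead of commenting entries out by hand.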
[20:17] lifeless: not sure it's relevant, but on my openstack dev deployment, I'm using a hudson master and scheduled jobs instead of cron to do repetitive tasks [20:22] losa ping [20:22] abentley: hey, can I ask you a question about the importd's [20:30] flacoste: rt 40477 [20:35] lifeless: before or after staging PQM deployment? [20:36] (assessing priority) [20:36] RT 39150 [20:36] yeah [20:36] staging pqm will only be useful when I get some cycles to do the thing for gary [20:36] so before I think [20:37] I was crook in the weekend, so didn't get anything interesting done. [20:37] lifeless, pong [20:37] hey abentley [20:37] lifeless: ok [20:37] leonardr: is RT 39922 still relevant? [20:38] I wanted to know about the interactions between upgrades and the importd service [20:41] flacoste, humor me and tell me the summary while i look that rt up [20:42] flacoste: got it [20:42] flacoste: no, the performance test has been run and there is no need to do anything else right now [20:44] flacoste: also rt 40480 [20:45] abentley: so if I wanted to upgrade the code on the importds, does it imply downtime, and interruption to long running tasks, and if so how much downtime and how much interruption. [20:56] leonardr: thanks, i'll mark it as obsolete or invalid, whatever their status is [20:56] lifeless: while I look at that other RT, mind looking at my follow-up to your review :-) [20:56] sure [20:58] lifeless: bumped above PQM staging deployment [20:59] thanks [21:00] gnight [21:01] lifeless, I don't know the deployment story of the importds. [21:02] abentley: who would? [that knows the code too, so we can talk about how to make it slick] [21:02] lifeless, mwhudson. [21:02] thanks [21:06] gary_poster: do you remember if l-d-d is updated every buildbot run? [21:06] flacoste: I think so, but things may have changed. want me to look?
[21:07] gary_poster: i don't think it would have changed, i remember not being the case [21:07] so if we fixed that at some point, i'm pretty sure it works now [21:08] is the staging server getting an update? [21:08] looking' [21:08] it breaks when using API [21:08] flacoste: ah, no, it does *not* update l-d-d [21:09] Ursinha: have you or matsubara asked for the daily staging setup yet ? [21:09] in RT I mean, if not, I will. [21:09] (matsubara is out of office) [21:09] m4n1sh: yes, getting update [21:09] gary_poster, thanks. does it happen daily? [21:09] yes, m4n1sh [21:10] gary_poster, Thanks for the info [21:10] np [21:10] rockstar: tags don't seem to be propgating through tarmac [21:11] lifeless: can you remove your Needs fixing? [21:11] vote [21:11] so that ec2 land doesn't stomp me on the fingers [21:12] morning [21:12] gary_poster, is there any attribute on launchpad API's me object which is private to all? [21:12] me [21:12] lifeless, no [21:13] gary_poster, I mean, I want to use it as an auto password... that attribute should not be visible in any case on launchpad.net/~foo [21:13] Ursinha: ok, I've filed RT 40482 [21:13] flacoste: that seems buggy [21:13] flacoste: like, it should trust that when I say approve, I'm not being schizotic. [21:14] flacoste: I've fixed it but perhaps you could make a sane bug report up [21:14] ? [21:14] i will [21:14] thanks! [21:14] I'm really going to sleep now. [21:14] Later on, all.... [21:14] the bug is that it looks at the approver to determine the r= [21:15] and since there are no reviewers approving... [21:15] flacoste: right, but *my* vote as a reviewer has to be treated as approve [21:15] flacoste: given I approved the merge. [21:15] ciao [21:15] chao, sleep weel [21:15] well [21:17] m4n1sh: mmm, I stared at your questions and I think I understand sort of. You want to hack/abuse an attribute in some way, right? 
[21:17] gary_poster, actually i want to use launchpadlib in a django app [21:17] cool [21:17] but django needs a password in authentication module [21:18] but i want to use launchpad OAuth [21:18] so when the user returns after giving the permissions [21:18] i dont want him to ask the password [21:18] but there should be some field which is private to the user [21:18] oh! so...if the person is authenticated by launchpadlib, he's ok by you? [21:18] which i can use as the implicit password [21:18] yes [21:18] it is [21:18] everything is done.. [21:19] OK [21:19] but just to satisfy django's authentication module i need a password [21:19] You can't override it? In the abstract, that feels like the right thing to do, but I don't know Django's auth stuff so I'm just waving my hands. [21:20] same here. I thought there should be way around [21:20] django has a openid module [21:21] gary_poster, have a look here [21:21] http://docs.djangoproject.com/en/1.2/topics/auth/#authentication-in-web-requests [21:21] esp in authenticate() method [21:21] Well, it sounds like you are in hack land [21:21] If you are in hack land... === almaisan-away is now known as al-maisan [21:21] the thing wont keep quiet without a password [21:21] http_etag would suffice? [21:22] no [21:22] if i am in a hack land then? [21:23] gary_poster, i am not able to think any way to overcome this password part. Something needs to be done [21:23] can you just say that the password for *everyone*is an empty string, or some other hardcoded blather, but make sure that it is not actually usable? If you have a launchpadlib object for the person, then you stuff the hardcoded password in; otherwise, you make sure that the machinery doesn't work? [21:23] To be clear, this sounds absolutely horrible to me :-) [21:23] gary_poster, same here. 
and i have no idea about the security aspect too [21:24] if i go this way [21:24] If I were you I'd be banging around on Django docs or code or mailing lists or something to figure out a way to subclass the auth machinery [21:24] stashing the password... [21:24] ...is hackland. :-/ [21:25] I'll look in a little bit if you really want me to (I don't know of an answer offhand) [21:25] but I have to hope Django lets you substitute this bit [21:25] gary_poster, I think so [21:25] I found django-oauth too, but launchpadlib is so darn easy that I don't want more stacked up modules [21:25] sure [21:26] probably I would go by the hack you suggested. Let's see. [21:26] probably I might shoot off a mail to django mailing lists. I know lots of flames might return, but still I will learn something new [21:27] gary_poster, well looks like you are busy. I don't want to disturb you anymore [21:28] Thanks a lot for your help [21:30] m4n1sh: sorry it wasn't more, but yeah, busy. :-/ http://docs.djangoproject.com/en/dev/topics/auth/ seems to imply that looking at set_unusable_password might help; and that trying to see how LDAP support works might give you a pattern to follow [21:31] gary_poster, I think you got exactly what I was looking for. [21:32] I am an idiot. Looked at the whole page, but couldn't find out that very line [21:32] :-) [21:32] it is like a fine print :) [21:32] takes another person sometimes [21:32] gary_poster, I think the last hurdle also solved [21:32] awesome! [21:32] now I can go ahead and start working on the app [21:33] thank you very much (no this is not sarcasm, I mean it) [21:34] lol [21:34] np glad I could help [21:41] mars, gary_poster: i got errors compiling numpy!?! [21:41] on ec2 land [21:41] flacoste, that's odd. I have not heard of that. 
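[editor's note] The approach discussed above (accept a user once launchpadlib has authenticated them, and mark the local password unusable rather than inventing one) could look roughly like the sketch below. It is framework-free on purpose so it runs without Django: the `User` class, the in-memory store, and the `launchpad_name` keyword are all hypothetical stand-ins. In a real Django project the user would be Django's own `User` model, the marking would be done with its `set_unusable_password()` method, and the name would come from a launchpadlib object after the OAuth step (for example `launchpad.me.name`), none of which is exercised here.

```python
from dataclasses import dataclass
from typing import Dict, Optional

# Stand-in marker for "no usable password". Django's set_unusable_password()
# stores a marker of its own; this constant is only an assumption for the sketch.
UNUSABLE_PASSWORD = "!"


@dataclass
class User:
    """Hypothetical minimal user record (not Django's User model)."""
    username: str
    password: str = UNUSABLE_PASSWORD

    def has_usable_password(self) -> bool:
        return not self.password.startswith("!")

    def check_password(self, raw: str) -> bool:
        # An unusable password never matches, not even the empty string.
        # Toy comparison only; real code would compare password hashes.
        return self.has_usable_password() and raw == self.password


class LaunchpadBackend:
    """Shape of a Django-style auth backend: authenticate() plus get_user()."""

    def __init__(self) -> None:
        # Hypothetical in-memory store standing in for the user table.
        self._users: Dict[str, User] = {}

    def authenticate(self, request=None, launchpad_name: Optional[str] = None):
        # The caller is assumed to have completed the Launchpad OAuth flow
        # already and to pass in the verified Launchpad name.
        if not launchpad_name:
            return None
        user = self._users.get(launchpad_name)
        if user is None:
            # First login: create a local user that cannot log in by password.
            user = User(username=launchpad_name)
            self._users[launchpad_name] = user
        return user

    def get_user(self, user_id: str) -> Optional[User]:
        return self._users.get(user_id)
```

In Django itself, a backend class of this shape would be listed in the `AUTHENTICATION_BACKENDS` setting, so no shared or hardcoded password is needed to satisfy the auth machinery.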
[21:42] mars: actually, that's not the real error [21:42] never mind [21:42] ok [21:42] i forgot to add commit the dep in download-cache === al-maisan is now known as almaisan-away [22:53] Can someone please ec2 land lp:~wgrant/launchpad/bug-598345-restrict-dep-contexts, lp:~wgrant/launchpad/refactor-_dominateBinary and lp:~wgrant/launchpad/really-publish-ppa-ddebs? [23:02] urgh. Do we really need a rabbitmq server just to develop any bits of launchpad? [23:04] You will soon. [23:04] But it will probably be started by make run soon. [23:43] mwhudson: Could you be convinced to land the three branches I referenced above?