[00:04] sinzui: @##@@@# mic acting up again :-(
[00:07] hi all
[00:07] lifeless, anyone, can you suggest how (if at all) i should qa the librarian-mail-naming change?
[00:08] Project devel build #788: STILL FAILING in 6 hr 13 min: https://lpci.wedontsleep.org/job/devel/788/
[00:08] Project devel build #789: STILL FAILING in 5.6 sec: https://lpci.wedontsleep.org/job/devel/789/
[00:08] i guess by sending it mail, syncing the logs, then peeking in there?
[00:08] send mail to qas, sget someone with qas db access to check the .raw attribute and follow that back to make sure its accessible in the librarian
[00:09] by handcrafting a url from the database row?
[00:18] more-or-less :)
[00:18] rather more than less
[00:22] wgrant: wallyworld, StevenK bug 1334, bug 80902
[00:26] not 1337 ?
[00:27] :-P
[00:27] night
[00:30] wallyworld: are you saying you fixed bug 86861
[00:30] <_mup_> Bug #86861: SinglePopupWidget only works with vocabulary registered by name < https://launchpad.net/bugs/86861 >
[00:30] sinzui: it appears so - i would have to read the bug details to be sure
[00:46] o/ wallyworld
[00:47] ?
[00:58] Hm. buildbot has failed to check out devel -- I suspect since it tried during the rollout
[00:58] Do I need to force a build to get it to start again?
[00:59] StevenK: I already did.
[01:00] 17 seconds before you asked, it would seem.
[01:00] Haha
[01:30] jelmer: Are bzr-svn imports meant to be taking ages now?
[01:30] jelmer: Perhaps it is only the first run with the new version, but it is still slightly worrying.
[01:30] eg. https://code.launchpad.net/~vcs-imports/ljcode/trunk and https://code.launchpad.net/~vcs-imports/wxwidgets2.6/trunk
[01:32] 30 seconds to an hour? Orsum.
[01:54] They do seem to eventually finish.
[01:55] Let's see if the second run is faster.
[02:04] Project windmill-devel build #191: STILL FAILING in 1 hr 6 min: https://lpci.wedontsleep.org/job/windmill-devel/191/
[02:38] lifeless: Erk.
[02:38] https://lp-oops.canonical.com/oops.py/?oopsid=1985CE168
[02:38] Trigger contention on bugsummary?
[02:39] I hope not
[02:39] As do I, but what else?
[02:41] so, a bug on lp is fine
[02:42] gcc-linaro is a similar size
[02:44] wgrant: its possible; the ubuntu row specifically will be fairly high volume, but 8 seconds -- argh
[02:44] so, we need to make write transactions faster
[02:44] for bugs
[02:44] ><
[02:44] s/argh/oh shit shit shit/'
[02:44] I will talk with stub this evening
[02:45] We'll have to see how bad this gets.
[02:45] I think it will survive until then
[02:45] But it may be pretty dire.
[02:45] we may be have to disable the triggers which can be done pretty quickly
[02:45] and then rollback my use of bugsummary.
[02:45] 20 / 1 Bug:EntryResource:subscribe
[02:45] but I hope not
[02:45] That was in half an hour?
[02:46] subscribing to public bugs won't lock any rows
[02:46] (in theory)
[02:46] but we're going on theory to blame bugsummary
[02:46] It's a pretty reasonable theory.
[02:47] Hm, OK, only 5 subscribe timeouts today so far.
[02:48] yesterday - 3 / 0 Bug:EntryResource:newMessage
[02:48] May be tolerable for now.
[02:49] that new task
[02:49] was on a product
[02:49] 258
[02:49] Yes.
[02:50] which is gcc
[02:50] choose-affected-product is 2.16 99th percentile historically
[02:50] ... are there enough triggers involved here?
[02:51] to get 8 seconds due to contention you'd need simultaneous edits of 4 gcc bugs serialised
[02:52] the new task won't affect the ubuntu rows in bugsummary
[02:52] now, we *may* have a problem where we don't shortcircuit no-op changes and we need to
[02:53] -- Grab a suitable lock before we start calculating bug summary data
[02:53] -- to avoid race conditions. This lock allows SELECT but blocks writes.
[02:53] LOCK TABLE BugSummary IN ROW EXCLUSIVE MODE;
[02:54] bugsummary_tasks deals with a bug, not a bugtask.
[02:54] It will be locking all tasks.
[02:54] yes
[02:54] I was just looking at that
[02:54] == we are fucked
[02:54] Pretty much
[02:55] we're going to have higher than desirable contention on the primary row in Ubuntu
[02:55] == 9s insert queries == aaaaaaaaa
[02:56] so what we need to do is capture the bug rows (rather than summarise), capture after, take a diff, and then apply the diff
[02:56] No, what we need to do is drop the triggers.
[02:56] And fix them later.
[02:56] this is non trivial to implement
[02:57] we need to wait for stub, to determine the right way to disable them with slony
[02:58] First thing, ident.ica and irc notices
[02:58] 16:55 < stub> wgrant: That one is a 'possibly, but very problematic'. We need to drop and recreate a trigger, which is fast enough if we manage to grab a lock but grabbing the lock is problematic. And then we need to apply a db patch *to the slaves only* during the next rollout that updates the trigger.
[02:58] So it looks like it needs a lock and drop on the master.
[02:58] well there are two ways
[02:58] we can redefine the trigger function
[02:58] or we can drop the use of the trigger
[02:59] lifeless: Do you know why it uses all the tasks?
[03:00] I'm not really keen on reading 500 lines of PL/pgSQL...
[03:00] wgrant:
[03:00] its pretty simple code
[03:00] see bugtask_maintain_bug_summary
[03:00] Yes, but it's PL/pgSQL.
[03:00] IF TG_OP = 'INSERT' THEN
[03:00] IF TG_WHEN = 'BEFORE' THEN
[03:00] PERFORM unsummarise_bug(bug_row(NEW.bug));
[03:00] ELSE
[03:00] PERFORM summarise_bug(bug_row(NEW.bug));
[03:00] END IF;
[03:00] RETURN NEW;
[03:00] wgrant: thus its clearer than python :P
[03:01] lifeless: Yes, that says how it calls it, but not why.
[03:01] Is it because it's lazy and just updates everything?
[03:01] By removing the bug from everywhere and then readding it?
[03:01] yes, remove the bug from the table, let the task be added, summarise the bug into the table
[03:02] 13:56 < lifeless> so what we need to do is capture the bug rows (rather than summarise), capture after, take a diff, and then apply the diff
[03:02] Looks somewhat buggy to me.
[03:02] if you consider undue contention buggy
[03:02] When unsummarising it uses the new rows.
[03:02] no
[03:02] So I don't think adding a task will actually increment...
[03:02] this is an insert
[03:02] it only has the NEW bug id
[03:03] Oh, this is a before. Right.
[03:04] http://www.postgresql.org/docs/8.4/static/trigger-datachanges.html
[03:04] So, are we going to wake up stub or hope he arrives in a timely manner or wing it?
[03:04] we're not winging it
[03:04] its bad but its not disastrous
[03:05] (its not too far off disastrous)
[03:05] It needs exclusive locks on either side :/
[03:06] It's only not disastrous because most people have EODed already.
[03:06] Yippie, build fixed!
[03:06] Project db-devel build #622: FIXED in 6 hr 24 min: https://lpci.wedontsleep.org/job/db-devel/622/
[03:06] no, its not disastrous because it hasn't gone -completely- dead
[03:07] what do you mean by exclusive locks either side ?
[03:07] Both sides of each bugtask INSERT require an exclusive lock on BugSummary.
[03:07] How ugly.
[03:07] are you worried about deadlock ?
[03:08] Possibly.
[03:08] the lock once taken applies through to commit
[03:08] its per-row
[03:08] LOCK TABLE BugSummary IN ROW EXCLUSIVE MODE;
[03:08] That's not per-row.
[03:08] Oh.
[03:09] ... iz not?
[03:09] I missed the "ROW" because the comment is slightly misleading.
[03:09] Ahem.
[03:10] LOCK TABLE only deals with table-level locks, and so the mode names involving ROW are all misnomers. These mode names should generally be read as indicating the intention of the user to acquire row-level locks within the locked table. Also, ROW EXCLUSIVE mode is a sharable table lock.
[03:10] How odd.
[03:11] yeah
[03:11] I am refreshing this too
[03:12] I'm still not sure how this isn't disastrous.
[03:12] An exclusive row-level lock on a specific row is automatically acquired when the row is updated or deleted
[03:14] its 8am for stub
[03:14] Oh, true, it is later than I had thought.
[03:19] I'm considering raising the default timeout to 20 seconds
[03:20] the contention is self limiting bounded on permitted transaction time
[03:21] if the change rate is below some N (unknown) then we won't backoff indefinitely, it will just spike up to some number of seconds at high concurrency changes
[03:21] if the change rate is above N, it will cascade and push back past whatever timeout we set
[03:21] Yes, that's my worry.
[03:23] ok, the lock mode is wrong I think
[03:23] 18 subscribe timeouts so far.
[03:24] On single-task Ubuntu bugs, too :/
[03:24] oh hell
[03:25] subscribe changes the bug last-updated field doesn't it.
[03:25] lalalalaalalaala
[03:25] Hee hee so it does.
[03:25] 00186-09119@SQL-launchpad-main-master INSERT INTO BugSubscription (bug, bug_notification_level, date_created, subscribed_by, person) VALUES (317370, 40, CURRENT_TIMESTAMP AT TIME ZONE 'UTC', 690731, 690731) RETURNING BugSubscription.id
[03:25] Hmm.
[03:25] But it's a single-task bug.
[03:25] That shouldn't be that bad.
[03:25] shouldn't matter:
[03:25] IF TG_OP = 'UPDATE' THEN
[03:25] IF OLD.duplicateof IS DISTINCT FROM NEW.duplicateof
[03:25] OR OLD.private IS DISTINCT FROM NEW.private THEN
[03:26] and hah - where is status?
[03:27] oh, bugsubsctiption might not be fast-pathing
[03:28] yes, thats it
[03:28] whats the bug #f or this ?
[03:28] Bug #794802?
[03:28] <_mup_> Bug #794802: OOPS-1986EA9 trying to add 'linux' task to a bug < https://launchpad.net/bugs/794802 >
[03:31] no answer on either phone
[03:45] trying phone again
[03:45] I have a fixed bugsubscription trigger
[03:46] which is likely to be a rather huge component.
[03:46] It was updating even for public bugs?
[03:46] yes
[03:48] [un]summarise doesn't know that it could skip for subscription triggered summarisation
[03:51] stub!
[03:51] he's on
[03:52] I think that is what wgrant was commenting on
[03:52] yeah, I'm just saying....
[03:52] lifeless: Around?
[03:52] stub: hi
[03:52] yo
[03:52] stub: can I nab you for a moderately urgent voice call about fallot from bugsummary?
[03:52] k
[03:53] stub: (and sorry for repeatedly trying your mobile, all will become clear)
[03:53] https://bugs.launchpad.net/launchpad/+bug/794802 and the topic in -ops
[03:53] <_mup_> Bug #794802: many bug activities timing out due to contention on bugsummary < https://launchpad.net/bugs/794802 >
[03:53] https://code.launchpad.net/~lifeless/launchpad/bug-794802
[03:54] oh, and http://www.postgresql.org/docs/8.4/static/explicit-locking.html#LOCKING-ROWS is going to be referenced
[03:54] stub: skype or your mobile?
[03:54] ah, skype. :)
[03:54] lifeless: Is there going to be a test?
[04:02] stub: http://bazaar.launchpad.net/~lifeless/launchpad/bug-794802/revision/13176
[04:04] stuart and I are talking through it
[04:04] the table level lock is deliberate due to edge cases
[04:04] we'll need to consider that in future
[04:04] we are doing 0.19 busummary updates / second
[04:05] * StevenK suspects the topic here is out of date
[04:05] this may be contention (5 second transactions) or it may be that we're mostly filtered through
[04:06] we're updating the bugsubscription trigger
[04:07] stub: http://pastebin.com/RZi7gfkq
[04:11] ok, stuart has applied the new subscription filter
[04:29] wgrant: whats the oops frequency looking like ?
[04:29] lifeless: Only 5 subscription timeouts in the last hour.
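
The 02:56 proposal (capture the bug's summary rows before the change, capture them again after, take a diff, apply only the diff) can be sketched roughly as follows. This is an illustration in Python, not the PL/pgSQL trigger that was actually deployed. The row key `(target, status, tag)`, the dict shape used for a bug, and the names `summary_rows`, `summary_diff` and `apply_diff` are all hypothetical.

```python
from collections import Counter


def summary_rows(bug):
    """Return the summary rows one bug contributes to, with counts.

    `bug` is a hypothetical dict such as
    {"tasks": [("ubuntu", "New")], "tags": ["regression"]}.
    Each task contributes an untagged row plus one row per tag.
    """
    rows = Counter()
    for target, status in bug["tasks"]:
        rows[(target, status, None)] += 1
        for tag in bug["tags"]:
            rows[(target, status, tag)] += 1
    return rows


def summary_diff(before, after):
    """Per-row deltas between two snapshots, dropping rows that did not change."""
    delta = Counter(after)
    delta.subtract(before)
    return {row: n for row, n in delta.items() if n != 0}


def apply_diff(summary, diff):
    """Apply only the changed rows to the shared summary table (a dict here)."""
    for row, n in diff.items():
        total = summary.get(row, 0) + n
        if total:
            summary[row] = total
        else:
            summary.pop(row, None)
```

Under these assumptions, adding a gcc task to a bug that already has an Ubuntu task yields a diff that touches only the gcc rows, and an operation that changes nothing the summary depends on (such as a subscription on a public bug) yields an empty diff, so the busy Ubuntu rows would not need to be written at all.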
[04:30] lifeless: one idea for lp people in Dublin is to to a bit of work towards turning loggerhead into a service, and then making it be hooked into the main lp web app
[04:30] poolie: its already a service
[04:31] poolie: I think hooking it in better and expanding its web service to be more useful to LP would be great
[04:31] well, "used as a service"
[04:31] is it already?
[04:31] do you think this would be realistic to stab at in a one week sprint?
[04:31] its json api is used to show diffs / revs when yo uclick on expanders
[04:32] poolie: I wouldn't want the lp app servers doing callouts to loggerhead today:
[04:32] - we are not in position to parallelise
[04:32] - without parallelism it will make the appservers tiiiime out
[04:32] wgrant: what was the last one ?
[04:33] poolie: and loggerhead isn't consistently fast yet, and we don't have reliable timeout reports for it yet
[04:33] ok, so the only path would be to have the browser pull data from it directly?
[04:34] right, which it does now but from a different domain
[04:34] so calculating the right url to pass and putting that in the lp pages, and making those urls available under the main hostnames to avoid SSL - thats tractable
[04:34] lifeless: 2011-06-09T03:01:31.659696+00:00
[04:34] wgrant: thanks
[04:38] "under the main hostname" meaning having apach proxy them or something?
[04:39] Why is the Ubuntu branches celebrity being removed?
[04:39] cody-somerville: Ubuntu's owners and uploaders are fulfilling the role instead.
[04:40] LP #524173 - There is a need for a 'bot' to have write access to the branches but not upload permissions. Would it make sense to create a 'bot account' and make that a celebrity?
[04:40] There will be no more celebrities.
[04:41] so iow if we did this, we would model it by having an acl-type slot on the ubuntu distribution for "people who can write to branches but not upload"?
[04:41] So whats the recommended way for a process like package-import to get elevated privileges?
[04:41] cody-somerville: Give it upload privs, I suppose.
[04:41] cody-somerville: It will effectively have them anyway.
[04:41] that was francis's approach
[04:42] wgrant, How?
[04:42] cody-somerville: How what?
[04:42] wgrant, That is, how will it effectively have upload permissions if it has write permissions to the branch?
[04:42] cody-somerville: BFBIP
[04:43] If you can alter the branch, you can compromise it.
[04:43] The next person to use it will get your exploit into the primary archive.
[04:44] stub: https://bugs.launchpad.net/ubuntu/+bugtarget-portlet-tags-content
=== stub1 is now known as stub
[04:45] stub: https://bugs.launchpad.net/ubuntu/+bugtarget-portlet-tags-content
[04:45] wgrant, lots of teams in Debian maintain packages in branches. Folks can have write privs to the branch but not upload. Its a legitimate use case by its self. The fact that the upload could potentially not review changes others have made before uploading does not mean having write privs is the same as having upload privs.
[04:46] cody-somerville: Launchpad official branches operate under the rule that upload permissions == edit permissions
[04:46] To change that would be a redesign.
[04:46] i'll propose this on the tb
[04:47] ok, we're rolling forward
[04:47] poolie: To the TB, during their meeting, or on the TB's mailing list?
[04:47] i think moving from james to a role account would be a step forward
[04:47] on the list
[04:48] poolie: yes, apache rewrite rules ftw
[04:48] I imagine The Developer Membership Board is the appropriate body to approach as the TB has delegated controlling upload permissions to that body.
[04:49] However, the TB has not delegated the branch permission, so I think it should go before the TB, not the DMB.
[04:49] poolie: what are you proposing to the tb ?
[04:49] Well, LP is kinda special: if LP decides to implement an internal role uploader, it doesn't really match the Ubuntu processes.
[04:49] persia: package-import isn't part of LP.
[04:49] to change the package importer from impersonating james to having its own account, per bug 524173
[04:49] <_mup_> Bug #524173: package-import uses james_w credentials < https://launchpad.net/bugs/524173 >
[04:50] LP has more opportunity to be malicious with Ubuntu contents than any uploader.
[04:50] wgrant, Should it be?
[04:50] persia, +1
[04:50] persia: Depends on your definition of "part of LP".
[04:50] so, I would love to participate in this
[04:50] but I have a critical regression to fix.
[04:50] This has been discussed with at least various TB members already
[04:50] you can reply to my mail or on the bug
[04:50] it will not be settled today
[04:51] I'd consider a role account running services in a LOSA-controlled environment to be "part of LP" for the purposes of this discussion.
[04:51] cjw at least has commented there
[04:51] i just want to get it unblocked
[04:51] at the moment you have access to operate as james_w
[04:51] so you can be as malicious as you like
[04:51] so, generally, lp will perhaps move into less-trusted interacting services
[04:51] a dedicated account will mitigate that
[04:52] by making it clear who uploaded etc
[04:52] TB can easily have a bot that watches this account and makes sure it has no gpg key :)
[04:52] What about the principle of least privilege?
[04:53] cody-somerville, See comment #8 on the bug: there isn't really any semantic difference between commit-to-branch and upload.
[04:53] it's a good principle
[04:53] within the year push to the branch will do the build
[04:53] this moves us closer towards it
[04:53] so the principle will say 'this is the least privilege'
[04:53] (modulo various details about how BFBIP all works)
[04:53] if the role account is a member of Ubuntu Core Developers team then it still has tons of permissions it does not legitimately require
[04:54] Agreed.
[04:54] so,
[04:54] we can always add a dedicated role in future if desired...
but note that *right now* its running as jamesw
[04:54] cody-somerville, Easier is to have the role account just have upload to everything, rather than making it a core-dev.
[04:54] so it has those permissions,.
[04:54] Moving to a dedicated losa administered account is an improvement.
[04:54] you're welcome to argue that more improvement is needed. But that is a separate discussion.
[04:54] nothing is *regressing*
[04:55] (and I am sure that me, poolie, wgrant etc are all open to such an argument)
[04:55] i completely agree
[04:55] this is a step forward and not a step back
[04:56] And there are more steps, but they need to be thought about more, and not having thought about them yet shouldn't block this.
[04:57] Who would have access to the account?
[04:58] canonical staff who maintain the importer
[04:58] (that is, the people that currently have access to james_w's account)
[04:58] If you say only LOSAs then I'd be alot happier about this
[04:58] (as they can do anything they want already)
[04:58] right
[04:58] that is not true at the moment but it will become so
[04:58] cody-somerville: there is an RT to move it to losa-only.
[04:58] there is an RT asking for it in the queeu
[04:58] Eventually it should only be the LOSAs, yes.
[04:58] cody-somerville: its also in-progress.
[04:59] And the role account should only have ArchivePermissions for the primary archive, not any others.
[05:08] http://pastebin.ubuntu.com/622270/
[05:09] ^ draft email; comments welcome
[05:10] poolie, Enough of TB has read access to RT that it may be worth mentioning the ticket number.
[05:10] ^ great, more mail to ignore ;-)
[05:10] :)
[05:10] ok
[05:10] good idea persia
[05:11] poolie, Also, it's probably worth phrasing the request differently. The TB acts as a deliberative body, but is intentionally not expected to be a bottleneck. The expectation is that people do stuff, and the TB provides observance and guidance.
[05:11] so "i plan to do this next week"?
[05:12] I'd suggest either announcing to the TB that LP plans to do this, and asking for feedback, *OR* structuring it as a request for the TB to grant the appropriate permissions to the robot account.
[05:13] In either case, I recommend cc: bryceh as the representative stakeholder
[05:15] ok, ka-thunk
=== jtv is now known as jtv-eat
=== mwhudson_ is now known as mwhudson
[06:12] Project devel build #790: STILL FAILING in 5 hr 56 min: https://lpci.wedontsleep.org/job/devel/790/
[06:13] Hmmmm
[06:26] and it passes tests. --woot--
[06:26] maybe not all :P
[06:31] wgrant: lp:~lifeless/launchpad/bug-794802
[06:31] if you're interested
=== almaisan-away is now known as al-maisan
=== al-maisan is now known as almaisan-away
[07:48] Project windmill-devel build #192: STILL FAILING in 1 hr 6 min: https://lpci.wedontsleep.org/job/windmill-devel/192/
[08:08] wgrant: AWOL for ~ 40; if stub turns up point him at his email.
[08:08] lifeless: Sure.
[08:15] good morning
[08:21] lifeless: FWIW we're still seeing some +choose-affected-product timeouts, but no subscribe ones.
[08:22] And lots of BugTask:EntryResource timeouts now :/
[08:22] Mostly status changes on Ubuntu bugs.
[08:22] Particularly linux.
[08:23] Almost entirely on a single bug. Perhaps lots of retries.
[08:26] Tag changes taking 5s...
=== wgrant changed the topic of #launchpad-dev to: https://dev.launchpad.net/ | On call reviewer: - | Critical bugs:214 - 0:[######=_]:256
[08:30] Project windmill-devel build #193: STILL FAILING in 41 min: https://lpci.wedontsleep.org/job/windmill-devel/193/
[08:38] wgrant: yeah
[08:39] (back)
=== wgrant changed the topic of #launchpad-dev to: https://dev.launchpad.net/ | On call reviewer: - | Critical bugs:209 - 0:[######=_]:256
[08:44] \o/
[08:44] Yay!
[08:44] Still well above where we were three weeks ago :(
[08:45] need to stop adding bugs
[08:45] We've only had like one new critical today.
[08:46] 3 ?
[08:46] bugsummary
[08:46] Did I miss some?
[08:46] francis escalated the accessibility-in-bugtask bug
[08:46] and I think there was another
[08:46] :(
[08:47] certainly 2
[08:48] another day w/no microservice :(
[08:52] wgrant: thanks
[08:52] lifeless: Huh?
[08:52] closing off the bugs
[08:52] Oh, right.
[08:53] I left the subs ones open for Yellow to close or not.
[09:09] https://bugs.launchpad.net/ubuntu/+bugtarget-portlet-tags-content is still fast.
[09:09] I'm having to pinch myself :)
[09:10] Yes, but it made everything else slow :(
[09:10] gotta pick your battles
[09:10] we should chang that to json
[09:11] We should change a lot of things to JSON.
[09:11] yes
[09:11] this is one of them
=== almaisan-away is now known as al-maisan
[09:20] Morning
[09:21] * jelmer waves
[09:21] morning mrevell, jelmer
[09:22] hi bigjools
[09:23] wgrant: it's odd, some bzr-svn imports seem a lot slower; locally I don't see that effect though
[09:23] jelmer: Some got fast again.
[09:23] After a few tries.
[09:24] I retried one 4 times.
[09:24] wgrant: which one?
[09:24] First was 40 minutes, second 25ish, third 10, fourth 1.
[09:24] I can't remember... chromium crashed.
[09:24] Let me see if I can find it in history.
[09:24] ahh alpha software
[09:25] wgrant: it looks to me like it's all bzr-svn imports that are affected - have you seen any bzr-git or bzr-hg imports becoming slower?
[09:25] jelmer: https://code.launchpad.net/~vcs-imports/flylinkdc/trunk
[09:25] jelmer: No, only bzr-svn.
[09:25] jelmer: ANd only bzr-svn has been eating swap.
[09:25] AFAIK
[09:26] ah, so it's a memory thing?
[09:26] Well, pear was swapping heavily.
[09:26] One import eating 40% of the RAM.
[09:26] \o/
[09:26] But it had vanished before the next ps.
[09:26] So we don't know which it is.
[09:26] I thought we had ulimit on it
[09:32] even with some swapping, it seems like there shouldn't be a 10 second vs 1 hour difference
[09:42] wgrant: how did you get to the ps output, losa interaction?
[09:47] jelmer: Yes, LOSA.
[10:06] wgrant: ahh, I think it's got to do with the fix for bug 309682
[10:06] <_mup_> Bug #309682: tags are copied but their revisions may not be < https://launchpad.net/bugs/309682 >
[10:09] jelmer: It looks rather like that, yes.
[10:09] the fact that they become increasingly faster (rather than just being slow once) is really weird though
[10:09] jelmer: It's not going to try to pull all the revs mentioned by tags in the whole repo, is it? :)
[10:09] Yes...
[10:11] wgrant, I'm wondering if we should make that an optional thing, there are situations in which it's not correct and causes a lot of extra data
[10:25] jelmer: Possibly. Anyway, it should settle down eventually, I guess.
[10:36] Project windmill-devel build #194: STILL FAILING in 1 hr 9 min: https://lpci.wedontsleep.org/job/windmill-devel/194/
=== al-maisan is now known as almaisan-away
=== Daviey is now known as Da
=== Da is now known as Daviey
[11:36] is there any reason why a "bzr pull" would update loads of files and then finish with "bzr: ERROR: [Errno 13] Permission denied" ?
[11:37] Yippie, build fixed!
[11:37] Project devel build #791: FIXED in 5 hr 25 min: https://lpci.wedontsleep.org/job/devel/791/
[11:47] Project parallel-test build #22: STILL FAILING in 1 hr 6 min: https://lpci.wedontsleep.org/job/parallel-test/22/
[12:03] * henninge lunches
[12:12] Project windmill-db-devel build #375: STILL FAILING in 1 hr 16 min: https://lpci.wedontsleep.org/job/windmill-db-devel/375/
=== almaisan-away is now known as al-maisan
[12:45] bigjools: O hai. You haz many lots QA to do.
[12:46] me personally?
[12:46] bigjools: Yes, four revisions are marked as assigned to you on the deployment report.
[12:47] StevenK: and they were all submitted with --no-qa,
[12:47] I see.
[12:47] rvba: O hai. You haz many lots QA to do.
[12:58] StevenK: indeed.
[13:00] StevenK: please don't forget to take a look at the multiple parents init stuff when you get a chance :)
[13:07] Project windmill-devel build #195: STILL FAILING in 1 hr 6 min: https://lpci.wedontsleep.org/job/windmill-devel/195/
[13:31] lifeless: it might have to be next week. sorry.
=== matsubara-afk is now known as matsubara
[13:59] Morning, all.
[14:11] morning folks.
[14:11] morning
[14:12] Morning, jcsackett
[14:20] heya henninge: did the branch you merged my stuff into land?
[14:20] jelmer: hi
[14:21] jml: Hey
[14:21] huwshimi: hello
[14:21] huwshimi: let's have a call.
[14:21] jml: Sure
[14:21] jml: hi
[14:21] jelmer: was going to talk to you about BFBIP, but need to talk to huwshimi first.
[14:22] jml: ah, ok. you know where to find me :)
=== Ursula is now known as Ursinha
=== mrevell is now known as mrevell-lunch
[14:33] jcsackett: yes, it did, your branches say so, too. ;-)
[14:34] jcsackett: it's on the next buildbot ride
[14:35] henninge: cool. do you know if qa-tagger will be able to keep track of this, or should i keep an eye on your branch to qa my related bugs?
[14:35] jcsackett: I linked my branch to bug 787595, too, so it should update it.
[14:35] <_mup_> Bug #787595: person picker could have a link to choose myself when I am a valid choice < https://launchpad.net/bugs/787595 >
[14:36] ah, cool. thanks, henninge.
[14:36] np
[14:36] and sorry for the confusion regarding my branch names.
=== mrevell-lunch is now known as mrevell
[15:21] Project windmill-devel build #196: STILL FAILING in 1 hr 8 min: https://lpci.wedontsleep.org/job/windmill-devel/196/
=== al-maisan is now known as almaisan-away
[15:45] deryck: in YUI widgets the "initializer" method is the right place to parse and deal with passed in config stuff, right?
[15:46] jcsackett, yup
[15:46] cool. wanted to make sure i wasn't embarking on something goofy (as has so always been the case). :-P
[15:47] jcsackett, if you need to, of course.
the widget infrastructure does all of the default values and changing values based on passed in config for you.
[15:47] jcsackett, but if you need something based on the value of two config options, for example, then the initializer would be the place to do that.
[15:47] deryck: yeah, this is dealing with non-default config options.
[15:48] right
[15:48] yup, you're on the right track
[15:48] cool. thanks, deryck.
[15:48] np
[15:54] henninge: test_picker_displays_empty_list does not pass in devel
[15:54] I am debugging this now.
[15:55] sinzui: oh
[15:55] sinzui: what do you mean?
[15:56] The test fails in my browser and in the YUI test layer I just fixed
[15:56] ah
[15:56] strange, it passed for me
[15:56] We expect an empty string, but get undefined
[15:57] henninge: does it still pass in devel for you? I think this may be a merge compliation
[15:57] let me try
[15:58] henninge: I tested with fireforx and chromium
[15:58] The test runner will use webkit
[16:00] sinzui: all tests pass
[16:00] how many tests passed?
[16:00] I have 9 pass and 1 fail
[16:00] 10 pass
[16:01] have you pulled devel in that last hour?
[16:01] sinzui: what does your line 22 to 25 of lib/lp/app/javascript/widgets.js look like?
[16:01] I just pulled right now
[16:03] henninge: It looked like the line in the diff
[16:03] if (data.title === undefined) {
[16:03] // Display an empty element if data is empty.
[16:03] return li_title;
[16:03] ok
[16:03] }
[16:04] that should pass
[16:04] the failure you described looked like those lines were missing.
[16:04] Project parallel-test build #23: STILL FAILING in 1 hr 6 min: https://lpci.wedontsleep.org/job/parallel-test/23/
[16:05] sinzui: 10 pass, both in firefox and chromium.
[16:05] I can see the for the innerHTML for yui3-picker-results, yet the innerTEXT is undefined
[16:05] oh.
[16:05] maybe *I* need to rebuild
[16:06] yes, this must be something to do with your local setup
[16:09] deryck: I see that tests spend more than 1/5 of the time setting individual YUI test layers for each app. Do we really want a BugsYUITestLayer if we can run all the tests in 2 minutes
[16:09] sinzui, no, we don't need app specific yui layers.
[16:09] app-specific anything is old school anyway ;)
[16:10] I will remove them.
[16:10] especially with yui tests it makes less sense anyway.... say I change the picker.... I need to know all uses of the picker are safe, not just one app's version.
[16:14] henninge: I think the merge is the issue. Your block of code is never entered. I need to review the test setup I think.
[16:27] Project windmill-db-devel build #376: STILL FAILING in 1 hr 6 min: https://lpci.wedontsleep.org/job/windmill-db-devel/376/
[16:34] Project db-devel build #623: FAILURE in 5 hr 57 min: https://lpci.wedontsleep.org/job/db-devel/623/
=== matsubara is now known as matsubara-lunch
[17:07] sorry it's been such a brief day folks, but I have to go. back tomorrow for Action Friday.
[17:11] Project windmill-devel build #197: STILL FAILING in 1 hr 6 min: https://lpci.wedontsleep.org/job/windmill-devel/197/
=== deryck is now known as deryck[lunch]
=== salgado is now known as salgado-lunch
[17:27] I need help with an ec2 image. I am following the steps in https://dev.launchpad.net/EC2Test/Image
[17:27] I have uploaded the image and I see it in S3. I do not know how to make it public
[17:28] sinzui: is that a new thing? I don't remember having to do that
[17:31] I added my name to lib/devscripts/ec2test/account.py, but I do not see my ami listed. I assumed maybe that was because it is private.
[17:31] maybe
[17:31] you probably need the AWS console
[17:32] right click the AMI and select "edit permissions"
[17:32] I have a public/private radio selection there
[17:33] bigjools: I do not see that.
I see a bucket in s3 with with image name, but it does not say it is an ami
[17:34] sinzui: click on "AMIs" on the left
[17:34] My browser says there is no "AMI" on the page :(
[17:34] https://console.aws.amazon.com/ec2/home?region=us-east-1#s=Images
[17:35] bigjools: thank you. aws started in s3, not ec2
[17:35] heh
[17:35] it's a wall of jargon on that page
[17:39] hi abentley. Without having to do any research, can you confirm that we do not support importing private github branches? If so, please do.
[17:54] gary_poster: I can't be 100% sure without research, but I think it's very unlikely.
[17:54] ok, thanks abentley
=== deryck[lunch] is now known as deryck
=== salgado-lunch is now known as salgado
=== Ursinha is now known as Ursinha-lunch
=== matsubara-lunch is now known as matsubara
[19:14] morning y'all
[19:56] There seems like there used to always be an on call reviewer but now every time I've checked the topic its just '-'. : (
[19:57] what do you need reviewed?
[19:58] https://code.launchpad.net/~timrchavez/launchpad/set_ppa_private_from_api_724740-2/+merge/63950
[19:59] lifeless, I notice that the associated bug (LP #724740) is listing the superseded MP instead of the new one. Is there something that needs to be done to update that or is that a bug?
[19:59] <_mup_> Bug #724740: setting a ppa private cannot be done over the API < https://launchpad.net/bugs/724740 >
[19:59] cody-somerville: file a bug, include a screenshot and the url to the old mp and the new mp
[19:59] that should get reviewed today, unf mid-crisis still
[20:01] and of course for some reason it now shows the new mp, lol. page must have been cached in my browser.
[20:02] cody-somerville: I am OCR, it would seem my morning topic change didn't take.
=== jcsackett changed the topic of #launchpad-dev to: https://dev.launchpad.net/ | On call reviewer: jcsackett | Critical bugs:209 - 0:[######=_]:256
[20:05] cody-somerville: i'm looking at it now, and sorry about the topic confusion.
[20:11] jcsackett, No problem! :) Kudos! === elmo__ is now known as k [20:21] k: special? === k is now known as Guest82647 === Guest82647 is now known as elmo === Ursinha-lunch is now known as Ursinha [20:25] Argh. [20:25] Just found a bug in LP. [20:25] Well, possible bug. [20:25] no, really? [20:26] lifeless: I headdesked for an hour trying to figure this out :-) [20:26] https://launchpad.net/~ubuntumembers/+members [20:26] There are 521 direct members of the "Ubuntu Members" team, and 695 people [20:26] the 695 is not entirely true. [20:26] So, if you have 10 teams [20:27] teams are members too [20:27] if counts the unique members in those 10 teams [20:27] lifeless: ah. [20:27] this could be presented better [20:27] Yes [20:27] and a case can be made that a team shouldn't count as a member [20:27] exactly [20:27] I was trying to figure out why my script wasn't catching all the members [20:28] however our db doesn't know the difference today in teamparticipation [20:28] so its not cheap to do differently [20:28] So, doing this would be expensive. [20:28] on your script shouldn't be iterating members [20:28] Right. [20:28] it should be interating participants [20:28] Project parallel-test build #24: STILL FAILING in 1 hr 15 min: https://lpci.wedontsleep.org/job/parallel-test/24/ [20:28] members was my use case. I wanted to get a list of all the members including sub teams. [20:29] I can just exclude the teams out of it and I could get a clean list. [20:29] or does participants do that for me? [20:30] https://launchpad.net/api/1.0/~launchpad/participants [20:30] [yes] [20:30] hi stub [20:31] still fighting the perf regression? [20:31] yo [20:31] SpamapS: yeah [20:31] SpamapS: did you see atimeout ? 
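The difference between direct members and participants discussed above can be sketched roughly as follows. This is a toy model only: the team names and the DIRECT_MEMBERS mapping are made up, and dropping teams from the result is an assumption, not a confirmed description of how Launchpad builds teamparticipation.

```python
# Hypothetical data: direct members of each team (a member may itself be a team).
DIRECT_MEMBERS = {
    "parent-team": ["alice", "sub-team"],
    "sub-team": ["bob", "carol", "alice"],
}


def participants(team, direct_members=DIRECT_MEMBERS):
    """Return the unique people reachable from `team` through nested teams.

    Assumption: teams themselves are left out of the result.
    """
    seen_teams = set()
    people = set()
    stack = [team]
    while stack:
        current = stack.pop()
        if current in seen_teams:
            continue  # guard against teams that contain each other
        seen_teams.add(current)
        for member in direct_members.get(current, []):
            if member in direct_members:
                stack.append(member)  # a sub-team: recurse into it
            else:
                people.add(member)  # a person: count once
    return people
```

In this toy data "parent-team" has two direct members (one of them a team) but three participants, which is the kind of mismatch between the two numbers on the +members page.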
[20:31] It renders the sru-accept script we use in ubuntu-archive-tools unusable [20:32] what does the script do [20:32] lifeless: "members = team.participants" heh, I was already using participants :) [20:32] nigelb: then that is recursing into subteams for you [20:32] yup \m/ [20:32] nigelb: transitive closure, but I think it excludes teams. IMBW, check the Person interface and source ;) [20:32] stub: I have some feedback on the -journal case [20:33] lifeless: looks through the tasks of given bug #'s for the given package name/series, and marks it to Fix Committed, then tags the bug verification-needed and comments [20:33] stub: and am looking into the test failure [20:33] lifeless: Thanks! [20:33] cool. It didn't seem obvious, but looks legit. [20:33] lifeless: I'm not sure which operation is timing out [20:33] Only seems the one failure though, but I couldn't see what is different between the bugtag case and others [20:33] stub: is it safe to chang the python stuff around person merge without adding the references metadata to the db ? [20:34] lifeless: yes, that is fine [20:34] lifeless: but I have the HTML of the failure [20:34] stub: oh, bugsummary.txt is blowing up for me just now [20:34] SpamapS: got the oops ? [20:34] lifeless: I think it is one failure (off by one, as if a bugtag didn't get created), and fallout from that [20:35] OOPS-1986AO121 [20:35] stub: I'm concerned that we're going to journal a complete-remove and complete-add to the journal even when nothing interesting changed - 50% of the rows or more could be noise [20:36] lifeless: My assumption is that just spitting inserts to the journal is faster than trying to do updates and avoids the locking issues. 
[20:36] and OOPS-1986DY97 [20:36] stub: the temp journal won't have locking issues [20:36] lifeless: Using the temp file seems a good idea though [20:36] temp table I mean [20:36] stub: ok, I have a patch that does that, let me commit and push for your hilarity [20:37] lifeless: only look into it if it helps with the solution. If not, I can manually accept bugs until the problem is solved. :) [20:37] btw. there are no mixed case tablenames - FooBar is cast to lowercase. "FooBar" is a mixed case table name. [20:37] * lifeless headdesks [20:38] stub: lp:~lifeless/launchpad/bug-794802 [20:39] SpamapS: it hasn't synced yet; did you get a backtrace in the html ? [20:40] stub: there are two reasons I can see slow queries - contention, or we're just adding too much work [20:40] stub: we're kindof betting its contention [20:40] the queries I've checked are only attempting to update a handful of rows, so I'm betting on contention [20:40] stub: argh, *really* not awake [20:40] lifeless: no sorry just the oops number. [20:41] stub: yeah, if its slow on the db, its contention [20:41] (now that the seq scans are gone) [20:41] stub: they are, aren't they? [20:41] stub: as in the slow occurences you see when explained show index use ? [20:41] My scenario is some dude triaging a big and clicking three of four things - change status, target, add a comment. Each is a separate ajax request being launched around the same time and wanting to lock the same rows in BugSummary. [20:42] jcsackett: thanks for the review [20:42] stub: so status, target would serialise normally [20:42] benji: you're welcome. [20:42] stub: (same row in bug) [20:42] Yes. We have fixed index use. Queries are fast, except when they are real slow. Looking at the plans they seem fine. 
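A rough way to picture the contention theory: requests that need the same summary row run one after another, so a fast request can spend most of its time waiting. The sketch below is a deterministic toy model, not a measurement; the request names, row keys and millisecond figures are all invented for illustration.

```python
def finish_times(requests):
    """Simulate per-row serialisation.

    `requests` is a list of (name, row_key, start_ms, work_ms).
    Returns {name: (waited_ms, finished_ms)}.
    """
    row_free_at = {}  # row_key -> time at which the row lock is released
    result = {}
    for name, row, start, work in sorted(requests, key=lambda r: r[2]):
        begin = max(start, row_free_at.get(row, start))  # wait for the lock
        end = begin + work
        row_free_at[row] = end
        result[name] = (begin - start, end)
    return result


# Hypothetical triage session: two edits hit the same row, one hits another.
SESSION = [
    ("change-status", "row-A", 0, 4000),
    ("change-target", "row-A", 100, 200),
    ("unrelated-edit", "row-B", 100, 200),
]
```

Here finish_times(SESSION) reports that "change-target" needs only 200 ms of work but waits 3900 ms behind "change-status", while "unrelated-edit" on a different row is not delayed at all. If the wait happens inside a trigger, it would show up as query time for the blocked request.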
[20:42] stub: tag + status would serialise on bugsummary
[20:42] stub: as would tag + target
[20:43] commenting won't change the aggregates so won't lock anything
[20:43] So if one takes 4 seconds, the other is blocked for 4 seconds before it gets to issue its query.
[20:43] well, before the flush gets the row its trying to update
[20:43] And this blocking is happening inside the trigger, so counts as SQL evaluation time.
[20:43] yeah
[20:44] so this failure
[20:44] we're looking for tag is null
[20:44] Best theory I've come up with anyway :)
[20:44] (line 136 of the doctest)
[20:45] 3 new bugs should result in an increase of 3 in the tag=null rows
[20:45] but we're seeing a total of three
[20:46] if I comment out line 128 I see the same thing
[20:47] So given how similar the current branch is to the previous, wtf is failing now and not before?
[20:47] I'm trying with the rollup() call commented out
[20:48] I guess it is in the combinedbug view... maybe rollup
[20:48] which results in the right raw data
[20:48] so the journal is capturing everything
[20:48] the rollup appears to be discarding a row
=== benji is now known as Guest88900
[20:50] I'm going to toss a few sample bugs into launchpad_dev and look at the rollup by hand
=== benji___ is now known as benji
[20:55] yeah its messed up I think
[20:55]  id | sum | product | productseries | distribution | distroseries | sourcepackagename | viewed_by | tag | status | milestone
[20:55] ----+-----+---------+---------------+--------------+--------------+-------------------+-----------+-----+--------+-----------
[20:55]     |   2 |      21 |               |              |              |                   |           |     |     10 |
[20:55]     |   1 |      21 |               |              |              |                   |           | moo |     10 |
[20:55] (2 rows)
[20:55] launchpad_dev=# select * from bugsummaryjournal;
[20:55]  id | count | product | productseries | distribution | distroseries | sourcepackagename | viewed_by | tag | status | milestone
[20:55] ----+-------+---------+---------------+--------------+--------------+-------------------+-----------+-----+--------+-----------
[20:55]   1 |     1 |      21 |               |              |              |                   |           |     |     10 |
[20:55]   2 |     1 |      21 |               |              |              |                   |           |     |     10 |
[20:55]   3 |    -1 |      21 |               |              |              |                   |           |     |     10 |
[20:55]   4 |     1 |      21 |               |              |              |                   |           | moo |     10 |
[20:55]   5 |     1 |      21 |               |              |              |                   |           |     |     10 |
[20:55] (5 rows)
[20:55] no, actually, that sums to 3
[20:55] and we get three
[20:56] though adding the tag should really have only journalled one row
[20:57] but the rollout output was
[20:57]  65 |     1 |      21 |               |              |              |                   |           |     |     10 |
[20:57]  66 |     1 |      21 |               |              |              |                   |           | moo |     10 |
[20:57] which is clearly wrong
[20:57] because we expand the tag to the null and the tag - two rows.
[20:57] stub: no
[20:57] because _dec and _inc only update by 1
[20:57] they don't use the vector
[20:58] stub: UPDATE BugSummary SET count = count - 1
[20:58] easy fix
[20:58] Yer
[20:59] I see. count might be 2 or -2, but we dec or inc by one only
[20:59] 'Cause I was a smart arse and wanted to minimize updates :)
[20:59] You fixing on your branch?
[20:59] yes
[20:59] test running now
[21:01] worked for the doctest
[21:02] pushing now
[21:02] we're running on the temp journal atm right ?
[21:02] stub: :!bzr push
[21:02] Using saved push location: lp:~lifeless/launchpad/bug-794802
[21:02] Pushed up to revision 13189.
[21:02] Nothing in test_bugsummary is invoking rollup, so all those tests passed.
[21:02] thank mercy for doctests :P
[21:02] lifeless: Yes.
The first MP is what is running right now, with patches -1 and -2 [21:03] so I think we're busy corrupting by off-by-ones just now [21:03] anyhow, my branch is pushed [21:04] So in patch-2208-75-0.sql we should delete * from bugsummary and reinitialize using the SQL from 63-0 [21:04] http://bazaar.launchpad.net/~lifeless/launchpad/bug-794802/revision/13188 and http://bazaar.launchpad.net/~lifeless/launchpad/bug-794802/revision/13189 are my incremental changes over your code [21:04] stub: nah - we've got those new dimensions to add [21:04] stub: do the reinit in that patch [21:05] lifeless: The bug is in rollup, and that isn't what is running atm. So data should be fine. [21:06] stub: no, the bug was in _inc and _dec, which the temp journal code uses [21:06] oh... your journal has the same problem [21:06] yer. [21:06] so how should we proceed; its late for you I know. [21:06] lifeless: ahh... but is there anyway to get anything but -1 or +1 in the count atm? [21:07] stub: there is [21:07] distro bug tasks and bugs with tags [21:07] erm, scratch tags. [21:07] distro source package bug tasks [21:07] invoking multiple triggers in one transaction [21:07] actually no I think its fine [21:08] because each _row_ changed flushes separately to the table [21:08] that that had damn well better be unary or we are so screwed. [21:08] yes, I think that is right. [21:09] which is why the doctest used to pass with the same bug [21:09] yer [21:10] lifeless: So count is positive when we call _dec? Yay. [21:10] *blink* [21:10] + UPDATE BugSummary SET count = count - $1.count [21:10] can't be [21:11] thats a thinko - and uncovered by tests [21:11] we only call _dec from rollup now [21:11] lifeless: Might want to invoke rollout in the test_bugsummary helper just after the flush and invalidate, like the doctest. 
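The off-by-one in _inc and _dec can be reproduced with a small model of the rollup. This is only a Python sketch of the idea, not the actual PL/pgSQL: the function names are invented, and the rows are the five journal entries from the psql paste above, reduced to (count, product, tag, status).

```python
from collections import defaultdict

# Journal rows from the paste: (count, product, tag, status).
JOURNAL = [
    (1, 21, None, 10),
    (1, 21, None, 10),
    (-1, 21, None, 10),
    (1, 21, "moo", 10),
    (1, 21, None, 10),
]


def net_deltas(journal):
    """Sum the journal counts per (product, tag, status) key, dropping no-ops."""
    deltas = defaultdict(int)
    for count, product, tag, status in journal:
        deltas[(product, tag, status)] += count
    return {key: delta for key, delta in deltas.items() if delta != 0}


def rollup_buggy(summary, journal):
    """Mimic the bug: move each summary row by exactly 1, whatever the delta."""
    out = dict(summary)
    for key, delta in net_deltas(journal).items():
        out[key] = out.get(key, 0) + (1 if delta > 0 else -1)
    return out


def rollup_fixed(summary, journal):
    """Apply the full signed delta, so a net change of 2 or -2 is not lost."""
    out = dict(summary)
    for key, delta in net_deltas(journal).items():
        out[key] = out.get(key, 0) + delta
    return out
```

With an empty summary, rollup_buggy gives 1 for the untagged row and 1 for "moo", matching the wrong output pasted above, while rollup_fixed gives 2 and 1, which sums to the expected 3. Adding the signed delta also sidesteps the question of whether the count passed to a decrement is positive or negative.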
[21:12] it has much more coverage [21:12] ok [21:12] perhaps we should run it twice [21:12] once without and once with [21:12] that can wait [21:13] I'm more interested in seeing if there are other lurking edge cases [21:14] running all the tests with the rollup call added and the buggy _dec [21:14] I want to see if we have coverage [21:15] So in theory, something should fail now or the laws of mathematics are being warped. [21:15] non euclidean geometry FTW [21:15] SpamapS: your oops is [21:15] SELECT bug_summary_dec( $1 )"\\nPL/pgSQL function "bug_summary_flush_temp_journal" line 9 at PERFORM\\nSQL statement "SELECT bug_summary_flush_temp_journal [21:16] SpamapS: e.g. the contention we're working on [21:18] yeah, we get fail [21:20] stub: all tests pass with that patch [21:20] woot [21:20] Push again? [21:21] 13191 available at your local bzr server. === almaisan-away is now known as al-maisan [21:25] stub: I'm going to fry up some brekky [21:25] k [21:25] stub: while you review [21:25] Yup [21:26] Don't worry about me, I'm already stuff full of dead pig. [21:30] yum [21:31] stub: so, this safe to progress? [21:32] the tag portlets will freeze until we get the new view code deployed [21:32] but thats fine IMO [21:32] You cook too fast [21:32] they are for most projects pretty static [21:32] <- optimiser [21:32] I thought views on this were still behind a feature flag? [21:33] no [21:33] bac: now that we have IServiceUsage, shouldn't we kill ILaunchpadUsage? [21:33] stub: but there is only one using it [21:33] stub: https://bugs.launchpad.net/ubuntu/+bugtarget-portlet-tags-content [21:34] flacoste: looking [21:34] stub: I'm going to delete patch 75 from the branch, so that we can land on devel [21:34] lifeless: DELETE FROM bugsummary_temp_journal; is better spelt TRUNCATE bugsummary_temp_journal (but only with temporary tables until we modernize our slony) [21:34] lifeless: Sure. 
[21:34] bac: flacoste: i seem to recall there being a uses_malone use case that buggered our desire to completely kill ILaunchpadUsage [21:34] i may be confusing things though. [21:36] jcsackett: we could have trim the interface though to only keep the uses_malone attribute in that case? [21:36] 13192 pushed (just deletes that patch) [21:36] flacoste: if i'm remembering correctly, yes. [21:36] flacoste, jcsackett: i have the same recollection. they certainly look like they could be combined [21:36] lifeless: Seems fine. Extra points if you switch to the truncate for probably unnoticable performance boost, and subclass the tests in test_bugsummary so they get run twice, once with rollup, once without. [21:36] bac, jcsackett: anyway, just wondered when I saw both, thanks for the clarifications [21:36] stub: s/subclass/parameterise :P [21:36] might yak shave it at some point [21:37] flacoste: probably worth filing a bug against it. [21:37] * flacoste only files yak shaving bug when actually doing the shave... [21:37] stub: I would like to do that separately as a tech debt issue, because we've run them both with and without [21:37] flacoste: seems like a reasonable policy. :-) [21:37] stub: I will change to the truncate right now [21:37] I'd just put the rollup into a method, subclass, and override it with a noop. But whatever. [21:37] cool. [21:38] stub: well theres a bunch of small classes [21:38] I thought all the tests were on one class? never mind then. [21:38] stub: so all of them need to subclass too - testscenarios is designed to multiply it out [21:38] ahm you are right [21:38] they are [21:38] so its easy and I'll do that [21:40] So... lets see if I can get these patches applied live to staging. 
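The "put the rollup into a method, subclass, and override it with a noop" idea might look roughly like this with plain unittest. FakeSummary and the test names are hypothetical stand-ins, not the real test_bugsummary code; testscenarios would be the other way to multiply the tests out.

```python
import unittest


class FakeSummary:
    """Hypothetical stand-in for the summary table plus its journal."""

    def __init__(self):
        self.summary = 0
        self.journal = []

    def record(self, delta):
        self.journal.append(delta)

    def rollup(self):
        # Fold the journal into the summary and empty the journal.
        self.summary += sum(self.journal)
        self.journal.clear()

    def total(self):
        # Combined view: rolled-up count plus anything still journalled.
        return self.summary + sum(self.journal)


class TestWithRollup(unittest.TestCase):
    def maybe_rollup(self, store):
        store.rollup()

    def test_three_new_bugs(self):
        store = FakeSummary()
        for _ in range(3):
            store.record(1)
        self.maybe_rollup(store)
        self.assertEqual(store.total(), 3)


class TestWithoutRollup(TestWithRollup):
    def maybe_rollup(self, store):
        pass  # noop: same tests, but reading straight from the journal
```

Every test defined on the base class then runs twice, once with the rollup and once without, so a rollup that drops or miscounts rows should fail in one variant only.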
[21:40] I guess qastaging first since it isn't replicated, then confirm I can add the table to the existing replication set on staging, then push the big red button [21:41] those changes are done, testing [21:41] \o/ big red buttons [21:41] Yell when you have pushed so I don't get confused versions of stuff on things [21:42] * stub loves technical jargon like 'stuff' and 'things' [21:43] sure thing [21:44] lifeless: production is blocked until the new replica is built, unless I kill it (which is a pita to clean up) [21:44] but we can get staging etc. done [21:44] so I believe there is a dsd script to run thats also blocked === al-maisan is now known as almaisan-away [21:45] perhaps we should kill it, run that scrpit, do this on prod, and then start the replica over [21:45] yes, but it was declared non urgent [21:45] right, it isn't urgent, but this is [21:46] stub: and see -ops, the script may be urgent [21:46] ok tests passed [21:46] 13193 pushing [21:46] statik: I think your cal update to weekly went awol :) [21:46] statik: pushed [21:47] bah stub: pushed [22:08] jcsackett: do you have time to mumble about yui tests? [22:08] sinzui: sure. [22:09] i am in mumble now. [22:16] Yippie, build fixed! [22:16] Project db-devel build #624: FIXED in 5 hr 42 min: https://lpci.wedontsleep.org/job/db-devel/624/ [22:33] sinzui: you broke up on me. [22:40] Project devel build #793: FAILURE in 5 hr 32 min: https://lpci.wedontsleep.org/job/devel/793/ === Ursinha is now known as Ursula === matsubara is now known as matsubara-afk === Ursula is now known as Ursinha [23:31] hi all [23:42] SpamapS: please try your script now [23:44] lifeless: good timing I delayed an accept just in case you fixed things. :) [23:52] SpamapS: hows it looking ? [23:52] lifeless: unfortunately, the queue is full of stuff to reject today.. I haven't found something to accept just yet.. moment. [23:57] Ok running now [23:57] \o/ [23:57] lifeless: fixed. :) [23:58] woot