/srv/irclogs.ubuntu.com/2010/09/11/#launchpad-dev.txt

wgrantlifeless: Are the appserver -> restricted librarian firewall rules completely sorted?00:27
wgrantWe are having 502s which could be caused by them.00:27
=== almaisan-away is now known as al-maisan
lifelesswgrant: I don't know02:03
lifelessabel said he was still seeing a failure if he pushed past 5 concurrent uploads, so I assume that we haven't figured it all out.02:03
lifelesswgrant: gather OOPSes!02:03
wgrantlifeless: There are no OOPSes.02:04
lifelesshttps://edge.launchpad.net/sprints/uds-karmic/+temp-meeting-export <- why is this being hit :<02:04
wgrantThey're proxy timeouts.02:04
lifelessrestricted librarian isn't proxied02:04
wgrantYay, c.l.security is finally being split.02:04
wgrantlifeless: Appserver connection timeouts, these are.02:04
wgrant"Sorry, we couldn't connect to the Launchpad server."02:05
wgrantOn an action that would be accessing the restricted librarian.02:05
wgrantAnd it's intermittent.02:05
lifelessAIUI that error, that can't be related.02:05
lifelesshowever, I may not understand the error02:05
lifelessWhat server group ?02:05
wgrantHm.02:05
lifelessedge/lpnet ?02:05
lifelessfile a bug, lets gather data.02:06
lifelessit may well be related, but no assumptions02:06
lifelesswtf02:06
lifelessBugTask LEFT JOIN Bug02:06
lifelessmakes no sense02:06
wgrantLooks like prod.02:06
wgrantlifeless: If there are no timeouts on librarian connections, and the connections are being dropped instead of rejected, why couldn't it be related?02:07
lifelesswell02:08
lifelesswhat does the error actually mean?02:08
lifelessdoes it mean 'got no SYN-ACK'02:08
lifelessor does it mean 'got no HTTP response in X time' ?02:08
wgrantI understand that it means the proxy didn't get a response from the appserver in a timely manner.02:09
wgrantWhich probably means the appserver was waiting for something.02:09
wgrantWhich, given last week's happenings, and the fact that other stuff times out, is quite possibly the librarian.02:09
lifelessif it means no HTTP response in X time, then yes, it can be related.02:09
lifelessbut it also means we should be seeing OOPSes02:09
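The distinction being drawn here (no SYN-ACK versus no HTTP response within some time) can be sketched in Python. This is only an illustration and assumes nothing about the real Launchpad proxy or appserver configuration; `probe` and its return strings are hypothetical names. A connect timeout corresponds to packets being dropped, a refusal to a rejected connection, and a read timeout to a server that accepted the connection but did not answer in time.

```python
import socket


def probe(host, port, connect_timeout=2.0, response_timeout=2.0):
    """Classify how a TCP/HTTP request to host:port fails (hypothetical helper)."""
    try:
        conn = socket.create_connection((host, port), timeout=connect_timeout)
    except socket.timeout:
        # No SYN-ACK arrived in time: typical of a firewall dropping packets.
        return "connect-timeout"
    except ConnectionRefusedError:
        # The connection was actively rejected (RST) rather than dropped.
        return "connect-refused"
    with conn:
        conn.settimeout(response_timeout)
        conn.sendall(b"HEAD / HTTP/1.0\r\n\r\n")
        try:
            data = conn.recv(1)
        except socket.timeout:
            # Connected fine, but no response within response_timeout.
            return "response-timeout"
    # An empty read means the peer closed the connection without replying.
    return "responded" if data else "closed"
```

Running such a probe from an appserver against the librarian ports would show which of the three failure modes is in play, assuming the probe host sits behind the same firewall rules.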
lifelesswhat pageids ?02:10
wgrantEven if there was no SQL executed afterwards?02:10
wgrantUm, it was on bug submission.02:10
wgrantSo possibly BugTarget:+filebug-guided or something like that.02:10
lifelesswgrant: yes, soft oops are generated if the request is > $time02:10
wgrantlifeless: Ah, I didn't know if that also depended on SQL statements.02:11
lifelessso02:11
lifelessthere's lazr.restful.utils.timeout or whatever it is02:11
lifelesswhich does a thread based timeout enforcer02:11
lifelessand there is the check in the storm tracer02:11
lifelessI plan to move all these checks to requesttimeline.02:11
lifelessor possibly something separate but connected.02:12
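For readers unfamiliar with the idea, a thread-based timeout enforcer can be sketched roughly as follows. This is not the lazr.restful code; `call_with_timeout` and `TimeoutExceeded` are hypothetical names, and the sketch only shows the general technique of running the call in a worker thread and giving up on it after a deadline.

```python
import threading


class TimeoutExceeded(Exception):
    """Raised when the wrapped call does not finish in time (hypothetical)."""


def call_with_timeout(func, timeout, *args, **kwargs):
    """Run func in a worker thread and wait at most `timeout` seconds."""
    result = {}

    def worker():
        try:
            result["value"] = func(*args, **kwargs)
        except BaseException as exc:
            # Keep the exception so it can be re-raised in the caller.
            result["error"] = exc

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    thread.join(timeout)
    if thread.is_alive():
        # The worker is abandoned, not killed: it keeps running in the background.
        raise TimeoutExceeded("call did not finish within %s seconds" % timeout)
    if "error" in result:
        raise result["error"]
    return result["value"]
```

One limitation of this approach is that Python cannot forcibly stop the worker thread, so a call stuck on a blocked socket still holds its resources after the caller has moved on.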
lifelessgandwana02:15
wgrantIt's having lots of +filebug timeouts ?02:17
lifelessfirst one is sql02:17
lifelessdeath-by-a-thousand-LFA lookups02:17
lifelesspotassium looks similar02:18
lifelessit's awful o'clock to be calling the escalation phone just now02:19
wgrantWhat needs escalating?02:19
lifelessthis issue02:19
lifelessif it's not fixed02:19
lifeless771 queries for +filebug02:20
lifelesswith apport data02:20
wgrantJust tried some other restricted download stuff.02:21
wgrantGot a failure from one prod appserver -- not sure which.02:21
lifelessdownload or upload02:21
wgrantDownload.02:21
lifelesswe only had upload enabled on the firewall02:21
lifelessthis might explain it02:21
lifelesswell02:21
lifelessmaybe not02:21
wgrantDownload has been used for ages, though.02:22
lifelesswe only *corrected a missing rule* for upload02:22
wgrantAh.02:22
wgrantSo, since StreamOrRedirectLibraryFileAlias failed at least once, the firewall is probably the problem.02:26
lifelesshave you seen that ?02:27
lifelesswas there an oops?02:28
wgrantNo OOPS. Just a plaintext "There was a problem fetching the contents of this file. Please try again in a few minutes."02:28
lifelessoh, feng shui ?02:28
wgrantNo.02:28
wgrantThis is displayed by the appserver proxy view.02:29
wgrantWhen LibrarianServerError is raised by getFileContents.02:29
lifelessI have to go02:30
lifelessplease - file a bug02:30
lifelesslets get all the data we can02:30
wgrantOK.02:30
wgrantThanks.02:30
lifelessalso it sounds like LibrarianServerError should be filing OOPSes02:30
lifelessif you wanted to fix that we could CP it to get more data.02:30
wgrantIt sounds like it might be better to just not catch it at all.02:31
lifelessit should generate oops, if the best way to do that is to not catch it - fine.02:35
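The two options discussed here (record the error and re-raise it, or simply stop catching it) can be illustrated with a toy sketch. Everything in it is an assumption made for illustration: the `LibrarianServerError` class is a local stand-in, and `oops_reports` and `fetch_contents` are hypothetical names, not Launchpad's actual error-reporting API.

```python
class LibrarianServerError(Exception):
    """Local stand-in for the real exception so the sketch runs on its own."""


# Hypothetical stand-in for whatever records OOPS reports.
oops_reports = []


def fetch_contents(get_file_contents):
    """Call get_file_contents, recording any librarian failure before re-raising."""
    try:
        return get_file_contents()
    except LibrarianServerError as error:
        # Record the failure instead of silently turning it into a
        # "please try again" page, then let it propagate.
        oops_reports.append(repr(error))
        raise
```

If the generic error handling further up the stack already records unhandled exceptions, dropping the `try`/`except` entirely would have the same effect with less code.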
* lifeless is gone, back in a few hours.02:36
wgrantsinzui: Is OOPS-1714K1846 another of the openid_identity_url LocationErrors?03:22
* sinzui looks03:22
wgrantThe user has OpenID issues.03:23
wgrantBut it may be unrelated.03:23
sinzuiYes it is03:23
wgrantIt works fine on edge, oddly.03:23
wgrantAnd I don't see what's changed on edge.03:23
sinzuiI see two views definitely provide the attr03:23
wgrant(in this case, post-rollout the SSO account mapped to the wrong account)03:24
wgrants/wrong account/wrong person/03:24
sinzuiwgrant, that may be the case03:24
sinzuiwgrant, this is the TB: http://pastebin.ubuntu.com/491936/03:24
wgrantHuh.03:25
sinzuiah we hit the XRDS code03:26
wgrantOh, right.03:26
wgrantThat's why it's only on prod.03:26
wgrantOf course.03:26
sinzuiThis is something that the foundations team may need to explain03:26
wgrantNow, there were some changes relating to OpenID on account merges last cycle.03:27
wgrantAnd the diff is huge, so I didn't even skim it. /me reads.03:27
wgrantGrrrrar.03:28
wgrantBranch is private.03:28
* wgrant diffs manually.03:28
=== al-maisan is now known as almaisan-away
lifelessback05:02
lifelesswgrant: how goes it, any more data?05:02
wgrantlifeless: Nothing.05:20
wgrantAnd I didn't file a bug, since if all goes well that view will disappear soon.05:20
wgrant(once your stuff is active)05:21
wgrantOr do you want a bug about the probably-not-bug +filebug issue?05:21
lifelessthe upload and download ports to the appserver need to be open regardless05:40
lifelessbecause in-appserver stuff uses the restricted librarian to get at content sometimes05:41
wgrantThey do, yes.05:41
wgrantBut it's not a bug.05:41
wgrantIt's an operational issue.05:41
lifelessand uploads of all sorts are proxied via the appserver05:41
lifelesswgrant: 'meh'05:41
wgrantOOPS-1715S30208:02
wgrantlifeless: You're not still around?08:05
lifelesssigh, context manager fail08:06
lifelessyes08:06
wgrantWhat's the OOPS?08:06
wgrantI got that the first couple of times before the "Please try again" started appearing on staging.08:07
lifelessLaunchpadTimeoutError: Statement: 'SELECT DISTINCT SourcePackagePublishingHistory.archive, SourcePackagePublishingHistory.component, SourcePackagePublishingHistory.datecreated,08:10
lifelessQueryCanceledError('canceling statement due to statement timeout\n',)08:10
lifelessSQL time: 10494 ms08:10
lifelessNon-sql time: 175 ms08:10
lifelessTotal time: 10669 ms08:10
lifelessStatement Count: 4308:10
wgrantHm, so probably unrelated.08:10
lifelessit's on staging08:10
lifelessdifferent librarian08:11
wgrantIt is.08:11
wgrantBut I still got the same error later.08:11
wgrantSo it's not prod-specific.08:11
wgrantIs the staging librarian also on asuka, or not?08:11
lifelessI think so08:11
wgrantUrgh.08:11
lifelesslet me check08:11
wgrantSo... not firewall, in that case.08:11
wgrantI could try dogfood, which I know is the one machine.08:11
lifelessyes, asuka08:12
wgrantIf the failed request caused an OOPS, it should have been just after OOPS-1715S304.08:12
wgrantIs it obvious?08:12
lifeless LaunchpadTimeoutError: Statement: 'SELECT BinaryPackagePublishingHistory.archive, BinaryPackagePublishingHistory.binarypackagerelease, BinaryPackagePublishingHistory.component,08:13
lifelessthats 508:13
wgrantI didn't think I caused a third, but maybe I did.08:13
lifeless  LaunchpadTimeoutError: Statement: '(SELECT "_259ce".name, Person.displayname, EmailAddress.email FROM Person JOIN Account ON Account.id = Person.account JOIN EmailAddress ON EmailAddress.person = Person.id JOIN TeamParticipation ON08:13
lifelessthats 608:13
lifelessanon08:13
wgrantProbably not, then (but that looks like an auth query... how would that be timing out so early?)08:14
wgrantlifeless: The proxy timeouts go away if I remove most of the attachments from the uploaded blob, or if I file it against a project with only a couple of subscribers.09:18
lifelessheh09:19
wgrantNext test: Specifying a biggish team as the initial assignee, to emulate the lots of subscribers that Ubuntu has.09:19
lifelessthought so09:19
wgrantBut that should still be an SQL timeout :/09:20
lifelessand they all have been that I've seen, so far.09:20
wgrantOh look.09:20
wgrantSetting assignee=ubuntumembers when filing the bug also makes it die like that.09:21
wgrantBut that should still be an SQL timeout. So why does it not appear as one...09:21
* wgrant creates a few hundred people locally.09:22
wgrantUh.09:30
wgrantWould you like some queries?09:30
wgrantThat request has plenty.09:30
lifelessheh09:38
lifelessjames_w: https://edge.launchpad.net/python-fixtures/trunk/0.210:16
james_wthanks lifeless14:56
=== Ursinha-afk is now known as Ursinha
lifelessjames_w: please let me know how you like/dislike it.20:19
james_wI'll give it a go now20:20
james_wI assume testresources will become a layer on top of fixtures now?20:20
lifelessyeah20:21
lifelessgoing to look at jmls remaining testrepository patches20:21
lifelessthen package up fixtures20:21
lifelessthen start working back along the stack, harmonising things20:22
james_wexcellent20:22
lifelessI was surprised, 0.1 had 49 downloads.20:22
* jelmer cheers on lifeless20:23
james_wthe existence of fixtures fixture and testfixtures is unfortunate20:24
lifelessyes20:24
lifelessI thought hard before wedging in there20:24
lifelessI also looked at their designs20:25
lifelessprobably want to subsume fixture functionality wise in a couple of releases20:26
lifelessand testfixtures, ah yes20:28
lifelesssugar but not AFAICT fundamentally solving it20:28
lifelessactually, revisiting, testfixtures is pretty neat20:31
lifelessbut the API for compare isn't quite disconnected enough for little ol me20:31
=== almaisan-away is now known as al-maisan
=== al-maisan is now known as almaisan-away
