/srv/irclogs.ubuntu.com/2010/09/06/#launchpad-dev.txt

thumperbet that farmer is pissed00:02
thumperhow's he going to have a straight line of trees now?00:02
lifelesswgrant: anyhow, I'm not asking you to decide, I'm just going for as much input as I can get00:03
wgrantlifeless: I know.00:03
wgrantI don't really like the idea of rolling out with a known hole.00:03
wgrantPlus it's easily CPable to the single host once the request path is fixed.00:04
wgrantSo I don't see the urgency, since the appserver code is fine.00:04
wgrant(librarian updates can be done without downtime now, yeah?)00:05
lifelesshandwave00:12
lifelessin principle yes00:12
lifelessspm: do you have a few minutes?00:12
lifelessspm: I want to talk RT 41202 and some related bits00:13
spmha! no. not atm. need about 30+ mins. 11+ reds to deal with.00:14
lifelessok00:14
lifelessI'll list out the bits here for when you get time00:14
lifeless - check the cert order can be done00:14
lifeless - check host headers getting to the backend librarian00:15
lifeless - check we can deploy librarian updates w/out downtime00:15
lifelessand separately00:15
lifeless update edge once we figure out whats up with bb again00:15
wgrantSo, I suggest the following:00:16
wgrant - Hook in the new view, but initially FF'd out.00:17
wgrant - Get the hack in r11506 controlled by FF.00:17
wgrant - Release.00:17
wgrant - Get request path sorted.00:18
wgrant - Get domain match fix CPed to librarian.00:18
wgrant - Test.00:18
wgrant - Flip the two FF flags00:18
lifelesssame flag, surely ?00:18
wgrantCould be.00:18
lifelessit hits the same view00:19
lifelessso when the view is controlled by the ff, it will affect both, no ?00:20
lifelessbbiab00:20
lifelessspm: OOPS reporting on edge is naffed; just needs a redeploy. Add to the bottom of your reds list ?00:21
* lifeless awols00:21
spmsnort00:21
wgrantlifeless: Yeah, but I'd like to have a way out.00:21
wgrantOr Platform might kill someone :)00:21
wgrantlifeless: Is IBuilder:+history featuring on the timeout reports?00:26
lifelesswgrant: well I haven't filed a bug for it yet01:08
lifelesswgrant: its gotta be close though01:08
lifeless17.68 99% completion time01:09
lifelesswasn't in the top oops report in the weekend though01:09
wgrantlifeless: Yeah, I've had a few reports of it today.01:09
lifelessok01:09
lifelessdid anyone give you an OOPS?01:10
lifelessnot that they are any use till edge is redeployed01:10
wgrantOOPS-1709A1140 is one.01:10
lifelesswgrant: FWIW https://bugs.edge.launchpad.net/launchpad-project/+bugs?field.tag=timeout is my master list01:10
lifelesswgrant: no such oops, well not yet anyhow01:11
wgrantReally?01:12
wgrantIt should exist by now, surely.01:12
wgrantIt was 8 hours ago.01:12
lifelessgot any from different server?01:13
wgrantOOPS-1709F116901:13
lifelessthat works01:14
lifelessso, 'A' may not be rsyncing right01:14
lifelesswgrant: https://bugs.edge.launchpad.net/soyuz/+bug/63120601:16
_mup_Bug #631206: Builder:+history timeouts <timeout> <Soyuz:New> <https://launchpad.net/bugs/631206>01:16
wgrantIt's not fixed on edge.01:17
wgrantOOPS-1709EB1925, for example.01:17
lifelesswgrant: if you want to let me know what constants should be in there, or how to determine them, I'll do an explain analyze on staging01:20
lifelesspoolie: hi01:21
pooliehi there lifeless01:21
lifelesspoolie: feature flags; whats the best place to look for a howto ?01:21
pooliethe docstring01:21
pooliewhich i think is now up on the web01:21
lifeless[and I hope you had a good weekend etc etc]01:22
poolieyou too01:23
lifelessit was interesting01:32
lifelesspoolie: I'm trying to match up what sinzui did in commit 11470 and the docstring01:33
poolier11470 of devel?01:33
lifelesspoolie: lp.services.features.__init__ is the docstring I'm looking at01:33
lifelesspoolie: yes01:33
poolieheh, that diff is interesting as an example of the kind of tests we should make it easier to write01:34
poolieit's not all that bad, but it could be smaller01:34
pooliei'm really sorry it caused disruption01:35
lifelesspoolie: meh, don't worry about it01:35
lifelessour process makes lp fragile like that01:35
poolie+        def in_scope(value):01:37
poolie+            return True01:37
poolie+01:37
poolie+        return FeatureController(in_scope).getFlag('gmap2') == 'on'01:37
pooliethis is a bit strange01:37
pooliei would have just used the default controller, and relied on only trusting the 'default' scope for now01:37
poolieit may be that api wasn't there in the initial landing to db-devel, which only had the sql and model code01:38
lifelesshttp://itsnotthecoffin.blogspot.com/2010/09/christchurch-earthquake-new-zealand.html01:38
pooliei mean, two orthogonal things:01:38
poolie1- that might have been the api skew that bit him01:38
poolie2- it's not wrong but it's not how i would have written it01:38
lifelesshow would you have written it? (So I can do the same)01:38
wgrantlifeless: SELECT id FROM builder WHERE name='adare';01:39
wgrantThat's the first one.01:39
lifeless301:40
poolielifeless, return getFeatureFlag('gmap2') == 'on'01:40
pooliealso perhaps we should add a thing for casting to bools, etc, so we don't have "== 'on'", "== 'yes'" etc all over01:40
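poolie's idea of a bool-casting helper could look roughly like the sketch below. The name `flag_is_enabled` and the set of truthy strings are hypothetical, not part of `lp.services.features`; the `lookup` argument stands in for `getFeatureFlag` so the example runs without Launchpad.

```python
# Hypothetical helper for illustration; not Launchpad's actual API.
TRUTHY_VALUES = frozenset(['on', 'yes', 'true', '1'])


def flag_is_enabled(lookup, flag_name):
    """Return True if the flag's string value reads as 'enabled'.

    `lookup` is any callable mapping a flag name to a string or None
    (in Launchpad this would presumably be getFeatureFlag).
    """
    value = lookup(flag_name)
    if value is None:
        # An unset flag is treated as disabled.
        return False
    return value.strip().lower() in TRUTHY_VALUES


# Stand-in for the real flag store, for illustration only.
fake_flags = {'gmap2': 'on', 'other': 'off'}
gmap2_enabled = flag_is_enabled(fake_flags.get, 'gmap2')
```

Call sites would then read `flag_is_enabled(getFeatureFlag, 'gmap2')` instead of repeating `== 'on'` comparisons.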
wgrantlifeless: The two for archive privacy are false.01:41
wgrantlifeless: And use just about anyone for the person.01:41
poolielifeless, i'm going to ask spm to send the error output from process-mail.py to a log file synced to devpad01:41
wgrantI forget my ID.01:41
poolierather than as at present being sent as mail and being ignored01:41
pooliecan i get a t-a rs for this?01:42
lifelesspoolie: theres an ongoing effort to rearrange things along those lines01:42
lifelessI understand it to need code changes01:42
lifelessrather than being a deployment issue01:42
pooliecode changes might help01:43
lifelessmy understanding from mthaddon is that the goal is:01:43
lifeless - must-see remain in email01:43
pooliebut here we just have a cron job and i want to put a '>file' into it01:43
lifeless - activity, details, etc go to a log01:43
lifelessand we have the script-last-run mechanism to cope with as well01:43
lifelessI haven't looked into how that works yet01:44
pooliehm01:45
pooliei wonder if anyone is reading them now01:45
lifelessI was speaking with thumper about it this morning01:45
pooliei suspect they get a mail for every mail received by launchpad01:45
poolieso should i file this, or not?01:46
lifelesspoolie: anyhow, I'm ok in principle with any change to make the thing more diagnosable and useful; I have two concerns here: there are automated things that look for scripts running/not running and I don't know if changing mails will affect that01:46
lifelesssecondly I don't know where mthaddons change to make only errors be emailed is at either01:46
lifelessoh, and thirdly disk space - in the current setup the logs are archived in the mail archive system, devpad isn't setup to scale to huge numbers of logs (e.g. we have to prune OOPS reports to conserve space)01:47
lifelesspoolie: I'll happily put my stamp to a change once those things are looked into01:47
pooliemaybe i'll just ask spm to pull things out of mail for me one-off then01:48
lifelessI wish I knew this bit of the system better to be able to just say 'yeah, doit', but sadly I don't01:48
lifelessI wonder if you can get a filtered mbox from mailman01:49
wgrantlifeless: Hm, edge has 40x more non-SQL time?01:51
wgrantIt'd be handy if the timeline showed when we were GIL-blocked :(01:51
lifelesswgrant: that's extremely hard to do01:52
wgrantlifeless: Of course.01:53
lifelesswheee that query is slow01:54
lifeless22 seconds01:55
wgrantOw.01:55
lifelessplan in the bug01:57
wgrantYep, already reading.01:57
lifeless171MB disk merge.01:57
lifeless\o/01:57
wgrantI wonder why it decided to do that.01:59
wgrantI can't see where the final 20s went, though.02:03
wgrantUnless it was in that disk merge, and didn't show up in the Sort times.02:03
lifelessqTime: 1811.852 ms02:05
lifelessthats my rearranged one02:05
wgrantWhat did you do?02:05
lifelessput the condition in the join02:06
lifelessrather than bringing back a metric tonne of unrelated data and filtering02:06
wgrantInteresting.02:06
lifelessno guarantee I got it right02:06
lifelessbut the first few rows certainly look the same02:07
lifelessconceptually we want:02:07
lifelessoh and here02:07
wgrantThat is slightly confusing.02:07
lifelessI had one bit I wasn't totally happy with02:08
lifelesstweaking now02:08
lifelessso the archive left join team-p thing brings back one row from team participation per archive, not *all the rows of the team*02:10
wgrantAh, good.02:10
lifelessbecause (person, team) is unique in teamparticipation02:10
lifelessarchive.owner will be correlated against archive02:11
lifelessbut the planner has some choice there02:11
lifelesswe end up with an archive, teamparticipation table with one row per archive02:11
lifelessand we then join that to packagebuild where packagebuild has an archive set02:11
lifelesshmm, I think we can cut a left join here02:15
wgrantWhere?02:16
* wgrant looks.02:16
lifelesspackagebuild - archive02:16
lifelesswe always want to look at archive02:16
lifelessif bfj.packagebuild is set02:16
wgrantWe do.02:16
wgrantBut how does that let us eliminate a left join?02:16
lifeless(packagebuild inner join archive)02:17
wgrantOh, one of your compound joiny thingies which I haven't seen widely applied before.02:17
wgrantTrue.02:17
lifeless2.6 seconds. heh02:19
lifelessfiddling at this level is risky02:19
lifelessahh02:20
lifelessget rid of both, its much happier02:20
lifeless930ms02:20
lifelessspm: how goes the meltdown ?02:21
lifelesswgrant: left outer join means you have to iterate both sides rather than iterating one side and doing specific lookups on the other side02:22
spmlifeless: more in a semi solid state atm; just kicked off a u1 staging DB whatsits; so should be able to spare you some attention shortly02:22
lifelessgetting rid of the packagebuild query may help a great deal02:22
lifelessspm: ok, say in 40 ?02:22
spmlifeless: that should be fine; hopefully sooner... but I was ever optimistic.02:23
lifelessbrb02:29
lifelesswgrant: you may wish to read http://www.postgresql.org/docs/8.4/static/explicit-joins.html02:31
wgrantlifeless: Ah.02:33
lifelessbasically the goal of the planner is to do the most effective work first02:45
lifelessas soon as we explicitly constrain things it can't02:45
lifelessand left outer joins explicitly constrain things simply by being used.02:45
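The rewrite lifeless describes (replacing a left outer join plus a later filter with an inner join carrying the condition) can be sketched as below. This is only an illustration under assumptions: the two-table schema is a simplified, hypothetical stand-in for the real `archive` and `packagebuild` tables, and it uses SQLite rather than PostgreSQL, so it shows that the two forms return the same rows, not how the PostgreSQL planner would cost them.

```python
import sqlite3

conn = sqlite3.connect(':memory:')
conn.executescript("""
    CREATE TABLE archive (id INTEGER PRIMARY KEY, private INTEGER);
    CREATE TABLE packagebuild (id INTEGER PRIMARY KEY, archive INTEGER);
    INSERT INTO archive VALUES (1, 0), (2, 1), (3, 0);
    INSERT INTO packagebuild VALUES (10, 1), (11, 2), (12, 3), (13, NULL);
""")

# Outer join first, filter afterwards: every packagebuild row is
# carried through the join, then most are thrown away by WHERE.
filtered_after = conn.execute("""
    SELECT packagebuild.id FROM packagebuild
    LEFT OUTER JOIN archive ON archive.id = packagebuild.archive
    WHERE archive.private = 0
    ORDER BY packagebuild.id
""").fetchall()

# Inner join with the condition in the join: rows without a matching
# public archive never enter the result in the first place.
condition_in_join = conn.execute("""
    SELECT packagebuild.id FROM packagebuild
    JOIN archive ON archive.id = packagebuild.archive
        AND archive.private = 0
    ORDER BY packagebuild.id
""").fetchall()
```

Because the WHERE clause already rejects the NULL-extended rows, the outer join adds nothing to the result here, which is what makes the inner-join form a safe substitute in this case.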
lifelessspm: hi03:14
lifelessspm: thats 40 and a bit :)03:14
lifelesswgrant: if you're tweaking soyuz queries03:19
lifelesswgrant: bug 629921 may be of interest03:19
_mup_Bug #629921: Archive:+packages with empty name search does like '%%' search. <timeout> <Soyuz:Triaged> <https://launchpad.net/bugs/629921>03:19
=== almaisan-away is now known as al-maisan
lifelessspm: tap tap tap03:23
spm:-)03:23
spmlifeless: so aiui, 41202 is predominantly a GSA thing. Also; you've set the pri to 89, but haven't given us any indication of timing needs around this? do you need this for the rollout this week? or later this month? or Mid next. ??03:27
lifelessspm:03:31
lifelessblah03:31
lifelesssoon as we can, before rollout if possible03:31
lifelessspm: but first, can you please trigger an edge redeploy03:32
lifelessspm: to fix OOPSes03:32
wgrantlifeless: So, what do you think of the plan I outlined?03:33
wgrantFor the librarian stuff.03:33
wgrantAnd has kees looked at the thing yet?03:33
lifelesswgrant: sure, something like that03:33
lifelessno response from him yet03:34
wgrant:(03:34
spmlifeless: does that mean we *need* it for this rollout? we're operating *really* short staffed this week, so "I want" vs "I *need*" is really necessarily separated atm.03:34
lifelesscritical path is knowing that *we can get them*03:36
lifelesscan't land the code till thats acked.03:36
lifelessThe code is intended to be able to be turned on at will, so if we get the certs a few days later, thats fine.03:36
lifelessWhen I say can't I mean 'it would be a bit odd to land something we're really not sure if we can do'03:37
lifelessspm: the code proposal https://code.edge.launchpad.net/++oops++/~lifeless/launchpad/private-librarian/+merge/3102003:38
lifelessspm: so, short story - this has had plenty of eyeballs.03:39
lifelessspm: there's still hair and fine tuning, but less than we had live 3 months back for bug attachments03:39
lifelessspm: requests for a file will allocate a token; the token will expire; folk can copy the token to e.g. wget if they want, content secured in this way is partitioned off from all other content by browser security rules03:40
lifelessspm: to make this happen we need:03:41
lifeless - the certs03:41
lifeless - to check Host headers reach the librarian (we need to cross-check the domain)03:41
lifeless - various small code changes over and above the current patch03:42
lifelessspm: the first two need your assistance03:44
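A minimal sketch of the expiring-token idea lifeless outlines above, assuming an in-memory store. `TokenStore`, its method names and the injectable clock are all hypothetical; the branch under discussion may work quite differently. The one-day lifetime is the figure lifeless quotes later in this log.

```python
import secrets
import time

# Lifetime quoted later in the log ("current code says 1 day").
TOKEN_LIFETIME_SECONDS = 24 * 60 * 60


class TokenStore(object):
    """Hypothetical in-memory store of time-limited file tokens."""

    def __init__(self, clock=time.time):
        self._clock = clock
        self._tokens = {}  # token -> (file_id, expiry timestamp)

    def allocate(self, file_id):
        """Create a fresh random token granting access to file_id."""
        token = secrets.token_urlsafe(16)
        expires_at = self._clock() + TOKEN_LIFETIME_SECONDS
        self._tokens[token] = (file_id, expires_at)
        return token

    def is_valid(self, token, file_id):
        """True only for an unexpired token issued for this file."""
        entry = self._tokens.get(token)
        if entry is None:
            return False
        stored_file_id, expires_at = entry
        if self._clock() >= expires_at:
            # Expired tokens are dropped on first use after expiry.
            del self._tokens[token]
            return False
        return stored_file_id == file_id
```

Because the token travels in the URL, it can be pasted into a tool such as wget until it expires, as described in the message above.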
lifelessspm: so (ping)03:49
lifelessspm: I get GSA on the certs; should I ask in is, or just wait :)03:50
spmyup. been looking at the mp03:50
lifelessspm: on the host headers side03:50
lifelesswe need to figure out if the requests squid is making to the librarian (for launchpadlibrarian.net requests) are preserving the Host header.03:50
lifelessI'm not sure how to do that offhand ;)03:50
spmI'll chase the vg and see if we can get some traction03:50
lifelessthank you03:51
spmhmm. except I'm not sure who the vg is today...03:51
lifeless'-'03:51
lifelessbrb03:51
spmgawd. I'm yak shaving again. need email to figure that; but home server (which holds email) is kaput. Need monitor on that server to WTF it; but desk is full of other crud and needs (mild) cleaning for room for a monitor. sigh.03:52
=== al-maisan is now known as almaisan-away
lifeless:|04:00
wallyworld_thumper: you having trouble with an unresolvable z3c lib? - "Getting distribution for 'z3c.recipe.scripts==1.0.1'"04:04
wallyworld_thumper: this project doesn't seem to exist in launchpad? i tried to view the revision history of buildout.cfg but am getting a bzr error:04:05
wallyworld_bzr: ERROR: exceptions.TypeError: 'bzrlib._known_graph_pyx._MergeSortNode' object is not iterable04:05
lifeless\o/04:05
wallyworld_this was after running utilities/update-source-code04:05
spivOoh, I haven't seen that one before.04:06
jtvhi folks04:08
lifelesshi04:12
spmheya jtv04:14
jtvhi lifeless, hi spm04:14
lifeless\o/ oops are sane now04:14
lifelesshttps://lp-oops.canonical.com/oops.py/?oopsid=1710ED23704:14
jtvlifeless: thanks for fixing that oops problem in the weekend!04:14
lifelessjtv: de nada04:14
lifelessnow we has sensible oopses04:14
lifelesswith librarian stats :)04:15
jtvde algo… it was above and beyond.04:15
lifelessthumper: ^ have a look at that one04:15
lifelessthumper: items 98 and 99 in the 'sql' log04:15
* thumper has to run kids to art class04:15
lifelessthumper: later will do, its user created04:15
lifelessno idea why the analyser thinks the librarian stuff is repeated though04:16
jtvUnfortunately I screwed up a little bit in my TranslationGroup fix.  I prefetched a lot of objects in queries that I moved to the slave store, when they're supposed to come from the default store.  So fetch the page from the master store and you've got lots of icon queries back.04:16
lifelesssomething is naffed there04:16
lifelessjtv: ahh04:16
jtvThat's why we still got a timeout on edge.  :-(04:16
lifelessStore.of() might help too04:17
jtvWell yes, whatever gets the default store.04:17
jtvOr I use ISlaveStore(object).icon instead of object.icon04:17
jtvlifeless: so librarian interaction is now also logged in the oops?  I took the page from 1050 db queries to 303 actions, and if those 303 actions actually count more than just queries, that's extra-great.04:18
lifelessjtv: librarian download connects and reads are now logged yes04:18
lifelesswgrant: I have an Idea04:24
lifelesswhen you get back, tell me what you think:04:26
lifeless - i123.restricted... - must match domain and LFA04:26
lifeless - https?://launchpad-librarian.net/....  - also supports tokens04:27
lifelessthe tests will work04:27
lifelessI can test once, directly, that when restricted is in the domain they must match04:27
lifelessthis will prevent injecting content into the security context of a restricted file04:27
jtvlifeless: we've got a pretty annoying problem—there's a db column that was supposed to become obsolete ages ago but was still in use.  We landed a branch that stops initializing it and stops using it.  Can you guess the problem?04:28
lifelessother than db-devel going red because of conflicts?04:29
lifelessor skew between two things modelling the same content04:29
jtvedge vs production.04:31
lifelessprod reads it back04:32
lifelesswith the old queries04:32
jtvyup04:33
jtvedge produces them without the data04:33
jtv*I* *hate* so-called single-sign on for oopses!04:38
jtvOpen a dozen oopses in as many tabs: each and every one needs to go through the "single" sign-on page, and only the first one succeeds.  The rest just forwards to a different "single" sign-on page, which just loops back to itself.04:39
lifelessplease file a bug on that04:39
lifelessand/or an RT04:39
jtvOh, and then there's a few that just fail with "invalid transaction"04:39
thumperlifeless: what exactly about that oops should I be looking at?04:39
jtvYes, I will thanks.04:39
thumperlifeless: apart from the general inefficiencies04:39
lifelessthumper: search for librarian in it04:40
lifelessor go to the row index in the sql statements that I pointed out to you04:40
thumperlifeless: ah, nice04:40
thumper0ms  librarian-read04:41
thumperthat's pretty fast04:41
lifelessthumper: it will be in the socket buffer04:41
lifelessthumper: so effectively a noop04:41
wallyworld_solved: TypeError: 'bzrlib._known_graph_pyx._MergeSortNode' object is not iterable04:44
wallyworld_historycache plugin is bad04:44
wallyworld_thumper: figured out the other problem - had to explicitly update the download cache. builds working again :-)05:19
thumperwallyworld_: :)05:22
wallyworld_thumper: bzr 101 question - can't recall, but i'm sure i did a bzr merge (and maybe also update) at the top level. is it sop to have to explicitly update the download cache?05:24
thumperwallyworld_: the download cache is likely to be a heavyweight checkout05:25
thumperwallyworld_: so it is the only thing you do bzr update one05:25
thumpers/one/on/05:25
wallyworld_thumper: thanks05:25
thumperwallyworld_: *I* do an explicit update05:25
thumperbut I don't use the rocketfuel scripts05:25
wallyworld_thumper: i don't use them either atm. is there anything in the workflow to indicate when one should update the d/l cache or should it be done say once per day?05:26
thumperwallyworld_: I update the download cache when I pull devel/db-devel;05:27
wallyworld_thumper: cool, i'll do the same. i was thinking one could also look for changes in the buildout.cfg file or other 3rd party dependency cfg file05:29
lifelesswgrant: and its pushing05:50
lifelesswgrant: I'm happy with this now, moving onto polish and integration05:50
jtvspm: I'm landing an RC fix that's only needed on edge, and only until the rollout.  What can I do to ensure that it hits edge soon?05:57
thumperjtv: cowboy hat?05:58
spmjtv: get it landed in the normal manner asap. we're really unkeen on cowboying patches onto edge (aka prod-lite) without an incident report05:58
thumperjtv: how do I find a product series that has a translation link for the branch?05:58
thumperjtv: I want to test deletion (on staging of course)05:59
jtvthumper: Translations takes an interest in 2 branch links on a productseries:05:59
wgrantlifeless: I'm still a little wary that we're allowing users to shoot themselves in the foot without noticing.05:59
thumperlifeless: any idea how to record in oopses what other app server threads were doing?05:59
wgrantlifeless: If launchpadlibrarian.net itself doesn't work, then people know not to use it.05:59
thumperlifeless: I'm thinking of oopses caused by other long running requests06:00
jtvthumper: 1: the development branch (it imports files that appear there, but it can also fire off build farm jobs based on changes there)06:00
thumperlifeless: causing database locks06:00
jtvthumper: 2: the translations_branch, which is where it can write snapshots of the series' translations.06:00
thumperjtv: I guess I could just do a staging query now I have the power06:00
jtvthumper: Finding cases of 2 is easy: WHERE ProductSeries.translations_branch IS NOT NULL06:00
thumpermwa ha ha06:00
* thumper waits for staging to come back up06:01
* jtv looks up the enum for 106:01
wgrantlifeless: What's the benefit of allowing it?06:01
wgrantlifeless: If you have to craft a URL to test, you might as well craft it to the restricted librarian.06:01
jtvthumper: you find cases of 1 with WHERE ProductSeries.translations_autoimport_mode <> 1 AND branch IS NOT NULL06:02
jtvspm: I'm landing in the normal manner asap.  It undoes one line of change from an earlier branch.06:03
thumperhow the hell am I supposed to submit someone else's work to pqm without a local copy of it?06:14
wallyworld_lifeless: bug 631010? any eta on a fix to allow lp tests to run again? i've upgraded to maverick and can't run any tests06:16
_mup_Bug #631010: ProgrammingError: operator does not exist: text = bytea <database> <maverick> <storm> <Launchpad Foundations:New> <https://launchpad.net/bugs/631010>06:16
thumperphew06:22
thumperfinally06:22
wgrantwallyworld_: You could try downgrading to Lucid's python-psycopg206:31
wallyworld_wgrant: thanks, i'll give that a try. been having a few issues with maverick and kde :-(06:32
wgrantI should upgrade this week.06:32
wgrantI normally upgrade around alpha 1...06:32
wallyworld_you running kde or gnome?06:32
wgrantGNOME.06:33
wallyworld_they skipped the alpha this time i think06:33
wgrantNo, there were the usual alphas.06:33
wallyworld_i mean they skipped the last one?06:33
wallyworld_went to beta early06:33
* thumper off to get munchkins06:33
wallyworld_i really hope kde gets fixed with maverick. me and gnome don't get on very well :-(06:34
lifelesshi06:40
lifelesswgrant: I do craft the right url06:40
wallyworld_wgrant: thanks, downgrading to lucid's psycopg2 fixes it for now06:40
wgrantwallyworld_: Great.06:40
lifelesswgrant: the benefit is that we don't need wildcard dns on developers machines with https certs and -all that stack-06:40
wgrantlifeless: We can't make the dev config use restricted librarian URLs?06:41
lifelesswallyworld_: no info on the eta for it06:41
lifelesswgrant: restricted librarian urls are different06:41
wgrantOr at least only activate the tokens-on-launchpadlibrarian.net mode in the dev config?06:41
wallyworld_lifeless: as per wgrant suggestion i downgraded to lucid's copy and it seems to be ok for now. thanks06:41
lifelesswgrant: we do want to delete the proxy code06:42
lifelesswgrant: so that isn't a tenable goal; as a stop gap maybe, but I don't see that its better or worse06:42
lifelesswgrant: as for people foot-bulleting themselves; who are you thinking of ?06:42
wgrantlifeless: What's not a tenable goal?06:43
lifelesshaving the dev environment run the current mode06:43
wgrantNot the current mode.06:43
wgrantEither linking directly to the restricted librarian (which is presumably accessible from localhost...), or having the local public librarian accept tokens on its primary name.06:43
lifelesswgrant: the latter is what I've implemented06:44
wgrantBut only in dev mode.06:44
wgrantNot on launchpadlibrarian.net.06:44
lifelessI can add an if to turn it off, but I don't understand why06:44
wgrantI can't think of a good reason to allow it.06:44
wgrantAnd if there's not a good reason to allow access to private data through a second mechanism, can we please not do it?06:45
wgrantSame-origin is a useful safetynet.06:45
lifelessright ...06:45
wgrantWe probably want private files to be protected by it.06:46
lifelessyou're not joining the dots here; its the same mechanism06:46
wgrantSo let's not introduce a way to work around it.06:46
wgrantEven if we can't immediately think of any attacks.06:46
lifelesswe'll want the apache front ends to be filtering tokens on http anyway06:46
lifelessno harder to have them filter on subdomains too06:46
wgrantHm?06:46
lifelessbut adding another conditional in that code adds to the complexity there for no good reason06:46
lifelessis anyone looking at the testfix ?06:47
wgrantIt's a workaround that's only required for dev installations. It is probably a single line of code to restrict it to that context, and it means that private content is forced to live in its own domain. That has to be a good thing.06:48
wgrantDespite the slight increase in complexity.06:48
lifelessso lets not add the workaround06:48
lifelessthat seems simpler to me06:49
wgrantThen dev systems break.06:49
lifelessThe vector you are talking about is 'public content happens to know the url and token for some private content'...06:50
lifelessthey can just damn well load that directly06:50
wgrantOr someone notices that omitting the 'i3532523.' works.06:51
wgrantThey proceed to do that.06:51
wgrantThere's nothing telling them that it's dangerous.06:51
lifelessI just said above the frontends can enforce that if we want06:51
lifelesstrivially06:51
wgrantAh, I see.06:51
wgrantI guess.06:51
lifelesswe have to have the front ends enforce httpS anyway06:52
lifelessbecause if someone is going to disclose a private file it shouldn't be us. :)06:52
wgrantTrue.06:52
wgrantOK, well, as long as they're changed to do that, my objection is retracted.06:52
wgrantAnd the plan seems good.06:52
lifelesswgrant: folk can't omit the ix bit anyway06:53
wgrantWhy not?06:53
lifelesswgrant: because the only way they get urls to use is via the appserver proxy service.06:53
wgrantTrue.06:53
lifelessthey can, for the short period a token is active, manually edit and change the url06:53
lifelessbut honestly, really?06:53
wgrantWhat is the limit?06:53
wgrantAn hour?06:53
wgrantI forget.06:53
lifelesscurrent code says 1 day06:54
wgrantI think paranoia is appropriate here.06:54
lifelessthats arbitrary; no reasoning at all has gone into it.06:54
wgrantBut OK.06:54
lifelessI wanted it to be longer than an ISO download to south africa06:54
wgrantRight.06:55
* wallyworld_ off to pick up Martin Pool from the airport07:01
=== jtv changed the topic of #launchpad-dev to: Launchpad Development Channel | Week 3 of 10.09 | PQM is CLOSED | firefighting: lp builds broken, db_lp buildbot offline | https://dev.launchpad.net/ | Get the code: https://dev.launchpad.net/Getting | On-call review in irc://irc.freenode.net/#launchpad-reviews
adeuringgood morning08:44
jtvhi adeuring08:57
jtvadeuring: to start the week off with a happy note, most buildbots are broken and even though you're probably innocent, you're on the blamelist.  :)08:58
jtvMe, I suspect lifeless of landing python2.6 code before all buildbots support it08:58
=== jtv changed the topic of #launchpad-dev to: Launchpad Development Channel | Week 3 of 10.09 | PQM is CLOSED | firefighting: non-lucid builds are broken, db_lp buildbot offline | https://dev.launchpad.net/ | Get the code: https://dev.launchpad.net/Getting | On-call review in irc://irc.freenode.net/#launchpad-reviews
adeuringjtv: thanks for the heads up ;)09:01
lifelessadeuring: hi09:08
adeuringhi lifeless09:08
lifelessjtv: I'm -very- sure I haven't landed any 2.6 only code09:08
lifelessjtv: for starters ec2 is still 2.5, and I run through ec2 religiously09:09
jtvlifeless: ok, just trying to get your attention for this problem.  :)09:09
lifelessadeuring: lp:~lifeless/launchpad/private-librarian is much closer to being done09:09
jtvA lot of the failures are 2.6-isms afaict09:09
adeuringlifeless: yes, I've seen your mp09:10
lifelessadeuring: I mean 2 minutes ago :P I just pushed up more09:10
adeuringlifeless: sounds great!09:10
lifelessjtv: mmmm, not the best way to get my attention.09:10
lifelessjtv: anyhow, I can see you and noodles775 looking into it; you're both good folk, I'm sure it will come good rapidly.09:11
jtvlifeless: I wouldn't trust me—I merely spotted a 2.6-ism in propertycache breaking one  of the builds09:11
jtvI wouldn't have any idea how to fix it09:12
lifelesswhat line is it ?09:12
jtvlifeless: it'll take me a moment to look that up, but it was a "next(counter)"09:12
lifelessin devel ?09:13
jtvlifeless: yes, it was in the "lp" buildbot log09:13
noodles775jtv: I thought its only the lucid_db_lp buildbot that is critical (and it's a different error)09:15
noodles775(by critical, I mean, stopping landings)09:15
jtvnoodles775: I just found I'm well out-of-date with what breaks what; we don't have hardy servers any more then?09:16
lifelessnoodles775: any of the main buildbots failing is testfix09:16
lifelesswe have hardy servers09:16
lifelessif 'lp' breaks testfix is triggered09:16
noodles775OK. I'm looking at the lucid_db_lp one then.09:16
wgrantlifeless: The hostname restriction code is... um... But apart from that, your branch looks fine to me now.09:18
lifelessoh, in the doctest09:18
lifelesswgrant: buggy?09:18
wgrantlifeless: Ugly.09:19
lifelesswgrant: patches considered09:19
wgrantI'm not sure there's a better way.09:19
wgrantI just know that this way is ugly :)09:19
wgrantlifeless: Why use a regex rather than just "hostname == 'i%d.restricted.%s' % (self.aliasID, netloc)"?09:22
lifelesswgrant: good point, I should09:23
lifelessI realised half way through that there was an attack09:23
lifelessI was going to be fairly relaxed on the check09:23
lifelessbut09:23
wgrantI also don't really get what you're doing with netloc and :. Shouldn't urlparse already do that?09:24
lifelessi1234.restricted.iunderattack.restricted.launchpad-librarian.net would be bad09:24
wgrantOh, port.09:24
wgrantRight.09:24
wgrantlifeless: I wouldn't mind a comment saying that you're stripping off the port there.09:25
lifelessjtv: so, if next(iterator) advances it09:25
wgrantIt's not blindingly obvious.09:25
wgrantOr maybe I'm just tired.09:25
lifelesswgrant: I'll add one09:25
wgrantThanks.09:25
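The exact-match check wgrant suggests, including the port stripping he asks to have commented, might look like the sketch below. The function name and parameters are hypothetical and this is not the code in lifeless's branch; it also assumes a plain `host[:port]` header and does not handle IPv6 literals.

```python
def restricted_host_matches(host_header, alias_id, librarian_host):
    """Check that a request's Host header names exactly the restricted
    subdomain for one library file alias.

    Hypothetical helper for illustration only.
    """
    # The Host header may carry a ":port" suffix; strip it so only
    # the hostname is compared.
    hostname = host_header.split(':', 1)[0]
    expected = 'i%d.restricted.%s' % (alias_id, librarian_host)
    # Exact equality, rather than a loose prefix/suffix or regex match,
    # rejects hosts that merely contain the expected pieces.
    return hostname.lower() == expected.lower()
```

A check that only required the `i1234.restricted.` prefix and the librarian domain as a suffix would accept the attack hostname lifeless gives above; exact equality does not.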
jtvlifeless: python2.5 doesn't seem to know about "next"09:25
wgrantNow the only bad bit is the '.restricted.' check, but there's not much that can be done about that yet.09:26
=== henninge_ is now known as henninge
lifelessjtv: next(i) == i.next()09:27
lifelessjtv: whats the next glitch09:28
lifelessjtv: the difference is the ability to say "next(i, 42)"09:28
jtvlifeless:     NameError: global name 'next' is not defined09:28
jtvthe line was         return next(counter)09:28
lifelessright09:28
lifelessreplace it with counter.next()09:29
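The builtin `next()` only arrived in Python 2.6, which is why the 2.5 builders raise `NameError`. Where the default-value form is also wanted, a small shim along these lines would work on both old and new interpreters; `compat_next` is a hypothetical name, not something in the Launchpad tree.

```python
import itertools


def compat_next(iterator, *default):
    """Advance an iterator without relying on the builtin next().

    Mirrors next(iterator) and next(iterator, default).
    """
    try:
        advance = iterator.next      # Python 2 spelling
    except AttributeError:
        advance = iterator.__next__  # Python 3 spelling
    try:
        return advance()
    except StopIteration:
        if default:
            # Same behaviour as the two-argument builtin.
            return default[0]
        raise


counter = itertools.count(1)
first = compat_next(counter)
fallback = compat_next(iter([]), 42)
```

For the propertycache case in the log, the plain `counter.next()` that lifeless suggests is enough, since no default is needed there.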
jtvBut it's only one of so many failures that I'm now trying to figure out what bigger picture I'm missing.  I'm told the failures on lp are not the cause of the testfix mode; lucid_lp_db is broken with a different failure.09:30
lifelessits 8:30 here; I need to go remind my wife what I look at.09:30
lifelesss/at/like/09:31
lifelessI'm sure there will be multiple failures09:31
lifelessI'm also sure that lp failing will cause testfix, because bb is watching *both*09:31
lifeless(or all 6 actually, anyhow)09:31
lifelessjtv: I suggest you, or someone you get to agree to it, puts together a branch to fix all the known devel issues; send that to devel (labelled testfix). then look at db-devel.09:32
jtvlifeless: noodles775 has been looking at db-devel already, and I've been trying to do the other thing for the past half day09:32
jtvso yeah, I agree with the approach basically :)09:33
jtvGo show your face to your family.  :)09:33
lifelessjtv: doing so; at least to the extent of not staring balefully at the laptop09:39
* jtv would reply if it weren't for the expected consequence of a certain person staring at laptop even longer09:39
lifelessadeuring: stub is reviewing the branch, but I don't know if all tests pass (the ones for code I've been directly changing do, of course)09:47
lifelessadeuring: so perhaps you'd like to : throw it at ec2; start dealing with any fallout it has, and I'll finish up tomorrow with whatever you push (as long as you tell me what you've pushed ;P)09:48
adeuringlifeless: ok, sounds good09:48
lifelessadeuring: it is in principle feature complete though09:49
adeuringok09:49
lifelessall cleanups etc deferred until we have successfully migrated09:49
lifelessgmb: there is another release-critical thing up for you09:49
gmblifeless, Go ahead09:50
lifelessgmb: its in your mail already09:50
lifelessgmb: this is high risk high reward09:50
* gmb looks at his mail client09:50
* gmb sees a greyed-out window09:50
gmbHmm.09:50
* gmb switches to the web interface09:50
gmblifeless, Okay, I get - and like - the rewards. Spell out the risks for me since I'm under-caffeined this morning.09:59
lifelessgmb: its a change to a fairly magical part of the system09:59
lifelessanything could happen09:59
lifelesswe'll be able to QA that private bugs are no -worse- on staging09:59
lifelessit will be hard to check the new stuff until we get the certs (but not impossible)10:00
gmblifeless, Do we have any alternative solutions that will fix the private attachment problem and will be ready before tomorrow evening (UK time)?10:01
lifelessin principle the private attachment problem is fixed for a limited time via the firewall hole (though I may be out of date)10:01
gmbadeuring, Can you confirm lifeless's statement ^^?10:01
lifelessbut we need to deliver a permanent fix fairly promptly10:02
adeuringgmb: yes, the temporary fix for the retracers works10:02
adeuringbut we should get rid of it quite soon10:02
gmblifeless, I agree. Since adeuring's temporary fix is in place, go ahead and land yours - I'm less worried about backing it out if we've got a kludge for the problem already.10:03
gmbI'll update the merge proposal.10:03
wgrantAnd this solution is the first one that doesn't make me cry :)10:03
gmbWell, naturally we wouldn't want that.10:03
wgrantI mean, it is actually a good, effective long-term solution this time. Which is really nice.10:04
gmbAgreed.10:04
lifelesswell, we can hope.10:05
wgrantWell, there are no obvious fatal flaws in this one.10:07
wgrantSo I think it should be good.10:07
wgrantbigjools: Is there a known issue with non-virt dispatches failing sometimes?10:08
wgrantbigjools: I had some complaints about amd64 distro builders doing it this morning.10:08
wgrant(the build restarted a few times)10:08
bigjoolshave you got examples so I can check the log?10:09
wgrantbigjools: kdeedu on amd64, reported a little under 10 hours ago.10:10
wgrantNot sure when it actually happened, though.10:10
wgrantLooks like it was not long before it was reported.10:10
wgrantSo look around 10 hours ago.10:10
bigjoolsit looks like temporary network brownouts10:11
bigjoolsor at least a lack of response from builders10:12
wgrantHmm.10:13
bigjoolsah I see what it is10:13
bigjoolsslow reset10:13
wgrantBut it's non-virt.10:13
wgrantThere is no reset.10:13
bigjoolsgood point10:13
wgrantOtherwise, yes, slow reset is the obvious thing.10:14
bigjoolswell, it's still timing out anyway10:14
bigjoolsthe log has "User timeout caused connection failure."10:14
bigjoolsfollowed by "reset failure"10:14
bigjoolswhich is odd10:14
bigjools(for crested)10:14
wgrantEr.10:14
wgrantWTF?10:14
bigjoolsI'd say that it's all working fine from a software PoV, it does what it's supposed to under the conditions10:15
bigjoolsthat log message is a little odd though10:15
wgrantExcept that it shouldn't actually be trying to reset a non-virt builder.10:15
wgrantOr is the log lying?10:15
bigjoolsI would not worry too much10:16
bigjoolsI've completely re-written the failure handling for the next release10:16
wgrantYeah, I saw that.10:17
wgrantLooks good.10:17
bigjoolsit won't catch build failures, just dispatch failures10:17
bigjoolswe need to add that feature at some point to stop bad jobs jumping around builders10:17
bigjoolsmy blood levels are dangerously high in my caffeine stream, I need to go fix that10:19
noodles775voidspace: re. your oops - one of the LP registry team will be interested to look at fixing the bug, but for what its worth, the exception is being raised while it is trying to tell you that the email address is already in use.10:20
wgrantFor which we have ShipIt to blame :(10:20
wgrantI suspect.10:21
noodles775wgrant: Ah, does it create email addresses where email.person is None?10:21
wgrantnoodles775: ShipIt does Accounts, not Persons.10:21
wgrantSo, yes :(10:21
wgrantAnd for no very good reason it still haunts the same DB as LP.10:22
lifelessadeuring: bunch of incremental changes from stuarts review pushed now; I expected librarian.txt to fail, fixing that now.10:46
lifelessadeuring: anything other than that failing is unexpected (but sadly probably predictable :P)10:46
adeuringlifeless: ok, I'll see what ec2 will tell us ;)10:46
lifelessnight y'all11:01
wgrantNight.11:02
=== gmb changed the topic of #launchpad-dev to: Launchpad Development Channel | Week 4 of 10.09 | PQM is CLOSED | firefighting: non-lucid builds are broken, db_lp buildbot offline | https://dev.launchpad.net/ | Get the code: https://dev.launchpad.net/Getting | On-call review in irc://irc.freenode.net/#launchpad-reviews
thumpergmb: sha'ping12:22
thumpergmb: I'm aware of the outstanding QA for code12:23
thumpergmb: but staging has been futzed all day12:23
thumpergmb: so I'll be looking at it tomorrow morning12:23
thumpergmb: just letting you know12:23
gmbthumper, Yep, I expected as much. No worries, and thanks for the update.12:24
* thumper -> cuppa12:24
thumpergmb: np12:24
bigjoolswgrant: got a sec to help work out WTF is going on with bug 62983512:33
_mup_Bug #629835: cannot delete 'linux' from the ubuntu-security-proposed ppa <oops> <Soyuz:New> <https://launchpad.net/bugs/629835>12:33
wgrantbigjools: Sure.12:44
wgrantIs that the delayed copy one?12:44
wgrantYes.12:44
wgrantbigjools: You need to look for the initial error.12:45
wgrantIt's possible that that's it.12:45
wgrantBut I doubt it.12:45
bigjoolswgrant: it's allowed a 2nd copy of the same thing somewhere, somehow12:45
bigjoolswhich is scary12:45
wgrantbigjools: Or it's just continuing to process the same copy, because it failed somehow the first time.12:46
wgrantThis has happened once before.12:46
wgrantWe couldn't work out what it was.12:46
bigjoolshmmm12:46
bigjoolshe's done a pocket copy in the same archive12:46
wgrantI wish we tracked where they were copied from.12:46
wgrantCan you see when that delayed copy was initially processed?12:47
bigjoolsI wonder if the delayed copy checks are broken12:47
bigjoolsI can grep logs12:47
wgrantThere is a race in the delayed copy mechanism.12:47
wgrantBut it's pretty unlikely.12:47
bigjoolslogs don't go back far enough :/12:49
bigjoolsit's been doing this since 1st Sep12:49
wgrantEr.12:49
wgrantWhat?12:49
bigjoolsat least12:49
wgrantYeah, it's been doing it since the 26th.12:49
bigjoolswe only keep 5 days of publisher logs12:49
wgrant...12:49
wgrant26th, 23:01Z12:49
wgrantEr, 25th, 23:01Z.12:49
wgrantWhy so short?12:50
bigjoolsthat's the default log rotation12:50
wgrantBaaah.12:50
wgrantCan you check for any other delayed copies of that source?12:50
bigjoolssigh, loads of PPA publisher OOPSes getting logged but not reported12:51
bigjoolsPoolFileOverwriteError12:51
wgrantPool file overwrites, mostly, I guess?12:51
wgrantYeah.12:51
bigjoolsthe same file, over and over12:51
* bigjools wonders how that can happen12:51
wgrantI used to know.12:52
* wgrant hunts.12:53
bigjoolsso, the inconsistent state error first happened 2010-08-25T23:05:39.797695+00:0012:53
bigjoolswhich tallies12:53
wgrantAh, you have oopses?12:54
bigjoolsyes12:54
wgrantAwesome.12:54
bigjoolslots of 'em12:54
wgrantSo, that's interesting.12:54
bigjoolsall the same as the one I put in the bug12:54
wgrantThere's nothing four minutes earlier?12:55
wgrantThat's when it was processed first.12:55
wgrantI expect a different error.12:55
* bigjools hunts12:56
=== lindbohm.freenode.net changed the topic of #launchpad-dev to: Launchpad Development Channel | Week 3 of 10.09 | PQM is CLOSED | firefighting: - | https://dev.launchpad.net/ | Get the code: https://dev.launchpad.net/Getting | On-call review in irc://irc.freenode.net/#launchpad-reviews
wgrantbigjools: Are any of the PoolFileOverwriteErrors for files published after May?12:57
wgrantBefore then the copy checker didn't actually check for contents conflicts.12:57
bigjoolshard to tell12:58
wgrant(devel r10701)12:58
wgrantWe can query for that reasonably easily.12:58
wgrantFind Pending PPA publications older than not very old.12:59
bigjoolsthere are two13:01
bigjools2010-07-27 and 2010-08-2513:01
wgrantbigjools: Actually, can you tell (from the repetitive PFOE oopses?) if 23:01 was the same publisher run as 23:05?13:02
bigjoolsand 9 more from 2010-09-0313:02
wgrantHm...13:02
wgrantThese are Pending, in undisabled PPAs?13:02
wgrantEr, undisabled, and with the published flag set, too.13:02
bigjoolsACCEPTED, not sure about the status, hang on13:02
wgrantOh, these are the delayed copies of that source?13:02
bigjoolsall enabled apart from the 07-27 one13:03
wgrantAre these the delayed copy PUs?13:03
wgrantIf so, that's an awful lot.13:04
bigjools2 of them are delayed13:05
bigjoolsone is the disabled PPA one is u-s-p13:05
bigjoolsdated 2010-08-2513:06
* bigjools scratches head13:06
wgrantCan you tell by the oopses if the 23:00 publisher finished in time, making the 23:05 OOPS the second run?13:07
bigjoolsI can't tell13:08
wgrantNot even by looking for the repeated PFOEs?13:09
bigjoolsoh I am looking at staging which is out of date, which explains the load of 09-0313:14
wgrantAh.13:16
bigjoolsI'm going to set that upload to rejected13:20
bigjoolsto remove this OOPS13:20
wgrantSounds reasonable.13:21
wgrantCan you also increase log retention, or make the OOPSes less easy to ignore?13:22
bigjoolsI'm working on the latter13:22
bigjoolsthe former is a good idea13:22
bigjoolsI'd love to know how this happens though :/13:23
wgrantWe'll find out next time :)13:23
bigjoolsJamie said he copied it from their private PPA in hardy to maverick in the public one13:23
wgrantI don't think I've tried a cross-series delayed copy.13:23
bigjoolsI wonder if the change in series has anything to do with it13:23
bigjoolsand then he deleted it13:24
bigjoolscould have been before or after it was published in the new archive13:24
wgrantYou can't tell?13:24
bigjoolswgrant: http://pastebin.ubuntu.com/489219/13:31
bigjoolsummm interesting13:31
wgrantThe one that was never published is a little odd.13:32
wgrantI noticed it before, but thought little of it.13:32
wgrantHowever, there's something really odd there.13:33
bigjoolsthat's the most recent one13:33
wgrantNo, I mean the second one in that list.13:33
bigjoolsoh right, missed that13:33
wgrantLooks like he deleted it 25 seconds after it was published.13:33
wgrantWhich seems unlikely.13:33
wgrantHowever, that's still well after everything initially broke.13:34
bigjoolsdeleting 25 seconds after sounds reasonable to me13:35
wgrantAfter an out-of-band delayed copy?13:35
bigjoolsit's not delayed at that point13:35
wgrantIt isn't?13:35
bigjoolsthe first one will make it instant for future copies13:36
wgrantWasn't that publication just created by the retry?13:36
wgrantYes.13:36
wgrantBut AIUI p-a is doing the copying.13:36
bigjoolsretry?13:36
bigjoolsonly for delayed copies13:36
bigjoolsaegh, I wonder if this is the "hit copy twice" bug13:36
bigjoolsI bet it is13:37
wgrantThere'd be a DONE PU in that case.13:37
wgrantThere appears to not be.13:37
wgrantYeah.13:38
wgrantThat source has only ever been in Maverick in that PPA.13:38
wgrantAnd there's only one delayed copy to Maverick.13:38
wgrantAnd it's the Accepted one.13:38
wgrant206969613:38
wgrantSo.13:40
wgrantThe bug suggests that the delayed copy keeps being reprocessed.13:40
wgrantBut the OOPS suggests that it's failing.13:40
wgrantI wonder if it stops OOPSing when the source is deleted.13:41
wgrantOh look, yes it does.13:42
wgrantSo there will be some publisher runs where it doesn't OOPS.13:42
wgrantThat's immediately after jdstrand deleted the last publication.13:42
wgrantSo the next p-a reprocesses it successfully, creating new publications.13:42
wgrantBut somehow still fails to set the status.13:42
wgrantMaybe custom uploads.13:43
wgrantThere must be more OOPSes there that we are missing.13:43
wgrantIf you can't see them, delete it again and watch the next p-a run.13:43
wgrantSomething has to show up.13:43
bigjoolsyeah13:43
bigjoolsI'll grab jamie later, I can't delete it13:43
* bigjools is desperate for food now13:44
wgrantbigjools: The last publication is from the third.13:44
wgrant00:12Z13:44
wgrantYou should have logs for then.13:44
wgrantThe copy was processed, but not set to Done.13:44
bigjoolsare you querying on the API or using my sql output?13:44
wgrantAPI and web UI.13:45
bigjoolsk13:45
bigjoolswgrant: so it started publishing it again at 2010-09-03 00:00:4513:46
bigjoolsthe publisher run at 00:10 has no errors13:47
bigjoolsthe next one goes bang13:47
wgrantDid 00:10 actually run at all?13:47
bigjoolsyes13:47
wgrantOr was it skipped because 00:00 overran twice?13:48
wgrantHmm.13:48
bigjoolsoh wait13:48
bigjoolsno13:48
bigjoolsdamn this log file13:48
bigjoolsthe next run was at 00:15 and it failed on the same queue item13:49
bigjools2010-09-03 00:00:45 DEBUG   Publishing source linux/2.6.24-28.77 to ubuntu/maver13:49
bigjoolsso it ignored the bad PU for that run only, it fails for the run before and after13:50
wgrantbigjools: What's the extent of its logging of the 00:00 publication?13:50
wgrantAny errors?13:50
wgrantAny success?13:50
wgrantI expect to see something there.13:50
wgrantPreferably an error.13:51
bigjoolsargh13:51
bigjoolsI didn't scroll down enough13:51
wgrantHeh.13:51
bigjoolsso it published it for ubuntu/maverick/amd6413:51
bigjoolsthen failed for that queue item again13:51
=== Ursinha-afk is now known as Ursinha
wgrantHmmm.13:53
wgrantI wonder.13:53
wgrantI wonder.....13:53
wgrantbigjools: There's nothing about lpia?13:55
wgrantPublishing that build could blow everything up.13:55
wgrantBadly.13:55
bigjoolsnup13:55
wgrantOr hppa?13:56
wgrantI expect a NotFoundError when publishing either of those.13:56
wgrantAhem.13:57
wgrantThat would explain why amd64 is the only thing published there, if it's doing it alphabetically, which is plausible.13:57
bigjoolsyep13:57
bigjoolsthen bails out with an exception13:58
wgrantCan you see anything like that in the log?13:58
bigjoolslike what?13:59
bigjoolsNFE?13:59
bigjoolsit does amd64, then bails out on the dodgy PU14:00
wgrantWell, I want to see where it tries to accept the hppa build.14:00
bigjoolsit doesn't get that far14:00
wgrantHuh.14:00
wgrantIs there only one PUB on the PU?14:00
=== matsubara-afk is now known as matsubara
wgrantHm, no.14:03
wgrantIt has all of them, AFAICT.14:03
wgrantIncluding hppa, which is probably killing it.14:03
wgrantBut it should be logged!14:03
wgrantDAmmit.14:03
bigjoolsoO14:03
bigjoolshuh?14:04
wgrantSo. The 00:00 publisher. Can you show me the lines between where it starts publishing the source, and where it starts publishing the next item?14:04
wgrantI'm now reasonably sure that it's dying due to an NFE when attempting to realise the hppa build.14:05
wgrantBut it should be logging that somewhere.14:05
=== almaisan-away is now known as al-maisan
bigjoolsthis is your lot: http://pastebin.ubuntu.com/489241/14:07
* bigjools has to run for 30 mins14:07
wgrantWhat's OOPS-1707PPA1?14:08
wgrantI doubt it's the usual.14:08
wgrantOK.14:08
* bigjools checks quickly14:08
bigjoolsWTF14:09
bigjoolsFatalUploadError: Signing key C51BC47D4C80ED00E96B4DC1839AEABCCF5C0A1F not registered in launchpad.14:09
wgrantuH.14:09
wgrantI doubt it, really.14:09
wgrantMust be a different location.14:10
bigjoolsthat can't be right14:10
wgrantIt's not.14:10
wgrantThey're throwing oopses in different directories.14:10
wgrantSo we have conflicts.14:10
wgrantYayyyyy.14:10
bigjoolsthe reporting tool is crack14:11
bigjoolsthe real error is:14:11
bigjoolsNotFoundError: u'Unknown architecture hppa for ubuntu maverick'14:11
bigjools:)14:11
wgrantAHA14:11
wgrantAs expected.14:11
wgrantPhew.14:11
bigjoolsso, we've copied builds that can't be published14:11
wgrantExactly.14:11
* bigjools high-5s wgrant14:11
wgrantSo it was the cross-seriesness.14:11
wgrantBut with added complexity.14:12
bigjoolsthanks for helping wgrant14:13
wgrantI am glad we no longer have a copy corruption mystery.14:16
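The failure mode worked out above can be sketched as follows. This is an illustration under stated assumptions, not Launchpad's code: the function names, the NotFoundError stand-in, and the per-series architecture sets are all hypothetical. It shows why a cross-series copy publishes amd64 and then dies on hppa when builds are processed alphabetically.

```python
class NotFoundError(Exception):
    """Stand-in for the error seen in the OOPS; not Launchpad's actual class."""


# Illustrative data only: which architectures each series is assumed to support.
SERIES_ARCHES = {
    "hardy": {"amd64", "hppa", "i386", "lpia"},
    "maverick": {"amd64", "i386"},
}


def publish_builds(build_arches, target_series, published):
    """Publish builds one architecture at a time, in alphabetical order.

    Successfully published architectures are appended to `published`,
    so the caller can see how far the run got before any failure.
    """
    for arch in sorted(build_arches):
        if arch not in SERIES_ARCHES[target_series]:
            raise NotFoundError(
                "Unknown architecture %s for ubuntu %s" % (arch, target_series))
        published.append(arch)


def publishable_builds(build_arches, target_series):
    """One possible guard (an assumption): skip builds the target cannot publish."""
    return sorted(set(build_arches) & SERIES_ARCHES[target_series])


published = []
try:
    publish_builds(SERIES_ARCHES["hardy"], "maverick", published)
    error = None
except NotFoundError as exc:
    error = str(exc)

safe = publishable_builds(SERIES_ARCHES["hardy"], "maverick")
```

Because the exception escapes before the copy is marked as done, every later run would retry the same item, which matches the repeated OOPSes described in the channel. Whether filtering at copy time is the right fix is a design decision the log does not settle.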
=== bigjools is now known as bigjools-lunch
=== bigjools-lunch is now known as bigjools
wgrantbigjools: Just to be sure, can you confirm that the NotFoundError also occurs at the time of the initial publication?15:11
wgrantWould be nice to be absolutely positive that it was the root.15:11
bigjoolsok15:11
=== Ursinha is now known as Ursinha-afk
=== Ursinha-afk is now known as Ursinha
benjigmb: good afternoon; I noticed that one of my branches was reverted with the commit message of "Revert the merge of the check-in-WADL branch, which was causing the build to break."  I'm looking for more info on the breakage so I can fix it.  Any hints?16:14
=== Ursinha is now known as Ursinha-lunch
stubbenji: Something about breaking the launchpad/stable -> launchpad/db-devel auto merge, because it would always conflict.16:18
gmbbenji, Sure, let me dig out the failure messages. There's also an ML thread about it between lifeless and I.16:19
benjihmm16:19
gmbbenji, "WADL test will break db-devel merges regularly (was) Re: buildbot failure in Launchpad on lucid_db_lp" is the thread you're looking for.16:19
gmbbenji, And this is the build that failed: https://lpbuildbot.canonical.com/builders/lucid_db_lp/builds/17616:20
benjigmb: thanks; what list was that thread on?  I don't think I saw it.16:21
gmbbenji, canonical-launchpad.16:21
gmbbenji, I'll find the thread in the archive for you.16:21
benjimuch appreciated16:21
gmbbenji, https://lists.ubuntu.com/mailman/private/canonical-launchpad/2010-September/060017.html is the top message in the thread; everything else is under that.16:22
benjithanks16:22
=== matsubara is now known as matsubara-lunch
=== salgado is now known as salgado-lunch
=== beuno is now known as beuno-lunch
=== al-maisan is now known as almaisan-away
=== matsubara-lunch is now known as matsubara
=== Ursinha-lunch is now known as Ursinha
=== benji is now known as benji-lunch
=== salgado-lunch is now known as salgado
=== almaisan-away is now known as al-maisan
=== beuno-lunch is now known as beuno
=== benji-lunch is now known as benji
lifelessadeuring1: hi19:53
lifelessbenji: hi19:54
benjihello there19:54
adeuring1hi lifeless19:58
=== lifeless changed the topic of #launchpad-dev to: Launchpad Development Channel | Performance Tuesday! | Week 3 of 10.09 | PQM is CLOSED | firefighting: - | https://dev.launchpad.net/ | Get the code: https://dev.launchpad.net/Getting | On-call review in irc://irc.freenode.net/#launchpad-reviews
lifelessadeuring1: how did it go?20:05
lifelessbenji: see the thread, I think we covered all the salient points there20:05
adeuring1there were basically two failures, one in test_db20:05
adeuring1(fixed)20:05
lifelessbenji: there was one extra thing not really covered which is that as a way of preventing API incompatibilities, I think it will need to be ----way---- easier.20:05
lifelessadeuring1: \o/20:05
lifelessadeuring1: so its landed?20:06
adeuring1lifeless: no, I could not figure out how to fix the other one, in lib/canonical/launchpad/ftests/../doc/librarian.txt20:06
lifelessgmb: up still ?20:06
adeuring1lifeless: AttributeError: 'thread._local' object has no attribute 'features'20:06
lifelessadeuring1: ok, I'll fix the other. where is your branch with the db fix ?20:06
lifelessadeuring1: that will be because a test has no request object live but is using a view.20:07
benjilifeless: as soon as I get access to the list archives, I'm sure your comments will make total sense :)20:07
adeuring1lifeless: lp:~adeuring/launchpad/private-librarian20:07
lifelessbenji: ah -20:07
lifelessbenji: ok so20:07
adeuring1lifeless: I also moved some ZCML stuff for SafeStreamOrRedirectLibraryFileAliasView to l/c/l/zcmnl/librarian.zcml20:08
lifelessadeuring1: great stuff20:08
gmblifeless: Yes, I'm still around.20:11
lifelessrc stampy stampy needed20:11
lifelesshttps://code.edge.launchpad.net/~gary/launchpad/bug627442/+merge/3470120:11
gmblifeless: Ah, yes. Looking at that now.20:11
lifelessthanks20:11
gmbgary_poster, lifeless : rc=me20:11
gary_posterthank you gmb20:12
gmbnp20:12
lifelessgary_poster: the new librarian stuff should land today20:12
gary_posterawesome!20:12
lifelessgary_poster: stub did a full review, there were two test failures abel found overnight, and he fixed one, I'm fixing the other now.20:13
gary_posterfantastic20:13
lifelessI'll queue up the RT tickets later and get flumper to hi-prio them20:13
gary_poster:-) k20:15
lifelessgary_poster: also the memcache/email/librarian client stuff seems happy20:15
gary_posterI didn't see that, but from the list I'm guessing that's timeout related?20:16
lifelesslogging those actions in OOPS20:16
lifelessas part of the request timeline20:16
gary_posterah ok20:16
gary_posteryeah I did see that actually20:16
gary_postercool20:17
lifelessoopstools are getting a -little- confused on some of them20:17
lifelessso we've got some fine tuning to do20:17
lifelessbut nothing major, probably just a stray \n or something20:18
gary_posterk20:18
lifelesshah20:18
lifelessand the 99999 thing20:18
lifelessoverflows oopstools :P20:18
gary_poster:-(20:18
lifelesshttps://lp-oops.canonical.com/oops.py/?oopsid=OOPS-1710EC143020:19
lifelessI'll change it to 020:19
lifelessactually20:19
lifelessI'll change it to consider it finished 'now' for that code path.20:19
lifelessif you look in that oops, its all sensible20:19
lifelessthe last one is (6x9)ms as expected20:20
gary_posterhee hee, I do like the overflow20:20
* gary_poster hasn't had lunch yet, and is starting to feel a bit...out-of-body20:20
lifelessshoo20:20
* gary_poster should rectify that20:20
gary_posterbiab20:20
* gmb -> afk for a while; will check back periodically20:35
=== Ursinha is now known as Ursinha-brb
lifelessman, this race condition on librarian startup is really annoying20:43
jelmerlifeless: TacException: Unable to start /home/jelmer/lp/daemons/librarian.tac. Content of /var/tmp/librarian.log: ?21:06
lifelessjelmer: ?21:06
jelmerlifeless: Is that one of the symptoms of that race condition?21:07
lifelessno, I don't think so21:07
lifelessrunning just a librarian using test and having them deadlock is21:07
lifelessthe client has sent the request, the librarian is still reading it21:07
lifelessor something21:07
jelmerok, that's different from what I've seen that21:08
jelmer*then21:08
lifelessit may be21:09
lifelessI've seen that too, but very rarely21:10
jelmerit's quite rare here as well, I'd say about once every two dozen test runs21:10
lifeless\o/21:26
lifelessnow I just need an incremental review21:26
lifelesshi poolie21:33
lifelessI imagine you're running out how; will you be working today?21:33
thumpergmb: ping when you are ready21:37
gmbHi thumper; Do you want to have a call or just a chat?21:38
thumpergmb: call is fine21:38
thumpermy skype is running21:38
thumpermumble doesn't like nz21:38
lifelessthose things used to be synonyms21:38
gmbthumper: Okay, give me a minute to shift locations.21:38
thumperlifeless: there is a different local isp here that offers SDSL21:38
thumperlifeless: who doesn't get affected by telecom's shaping21:39
lifelessthumper: ooo21:39
lifelessthumper: who?21:39
thumperlifeless: I'm wanting to try mumble with it21:39
thumperwicked networks21:39
lifelessI signed up for a year with telecom, just to get settled, its tolerable21:39
thumperI know an office that has it all set up21:39
thumperthey offered for me to go and try it out21:39
lifelessbut yeah, I would totally pay as much as I was paying in .au for /actual internet/21:39
thumperit is only 2 meg up/down21:40
thumperas opposed to 8 down and .5 up that I get now21:40
lifelesssame average :P21:40
thumperyeah, but I get a lot more down than up21:40
gmbthumper: Okay, calling now.21:41
=== Ursinha-brb is now known as Ursinha
=== salgado is now known as salgado-afk
mwhudsonmy up/down ratio at home is stupid22:12
mwhudson18 meg up, 0.5 down22:13
lifelessnice22:13
lifelessI'm getting 4MB here22:13
lifelessbut I suspect its the line quality as much as anything22:13
mwhudsonyeah, i think i'm very close to the nearest cabinet22:15
ajmitchthumper: we use that isp at work, it seems OK most of the time. sometimes has some poor international connectivity, but generally not too bad22:15
lifelessmwhudson: there are 5 phone points in the house22:15
lifelessbenji: did you get clarity?22:18
lifelessbenji: or would you like me to expound on the issue here?22:18
benjinot yet; I haven't gotten a response on my subscription request to the list22:18
benjiif you know who to pester, I'd appreciate to know22:19
lifelesswe really should move that to LP22:19
lifelessuhm, its possibly out of date; #is would be a good place to ask - the gsas.22:19
lifelessanyhow, let me recap22:19
lifelesswe have two branches22:19
lifelessdb-devel and devel both feeding into a single tree - the 'db-devel buildbot test tree'22:20
lifelessif both branches have had API changes made directly on them, then every single time that devel changes the API, the merge to db-devel will alter - correctly - the WADL, but the test will fail.22:20
lifelessas gmb and I understand the goal of the test to be fixing the apidoc issue primarily, we rolled it back as the simplest way to unblock the release.22:21
lifelessbenji: but there are a couple of extra quirks to bear in mind when retackling it22:22
lifeless - as a way of checking for API regressions, examining a big bytestring is very human intensive. I fully expect the human response to be 'oh, I need to run X and commit it' - that is, it won't prevent incompatible changes at all.22:22
=== gary_poster_ is now known as gary_poster
lifeless - its generally a bad idea to check in the output of build processes; VCS's can interact badly with that, and its wasteful to store the duplicate/derived data in the VCS.22:26
benjiI didn't intend it as a way of checking for API regressions; I had the goal of speeding up make.  The tests were a way of being sure the files checked in were current.  The tests also provide a guard against unintentional API changes (if you make a change and the WADL changes and you didn't know you were impacting the web service, then you've been warned).22:26
lifelessbenji: so, the pragmatic issue is: this must reliably work when two branches with API changes are merged, with no human intervention.22:30
lifelessbenji: *I* strongly suspect that means that making wadl generation faster is an easier approach.22:30
lifelessnaively, I don't see why it would be more than a second or so's processing.22:31
thumperlifeless: do you know if we test against the ubuntu-bug script?22:32
thumperlifeless: or which version of the api it uses?22:32
lifelessthumper: there is something done with apport yes.22:32
lifeless1.0 I believe, just because the distro did an audit-and-sweep for beta->1.0 a while back.22:33
gary_posterhey poolie.  I'd like to briefly show off bzr in a talk I'm giving.  I was having trouble with bzr-git until I just upgraded to the new packages (https://launchpad.net/~bzr/+archive/ppa)--my example works now, yay!  why isn't bzr-hg in that ppa though?  In a perfect world, I'd show that too.22:48
thumperlifeless: I'd like to talk to you at some stage to talk about how to understand some oopses I'm seeing22:49
mwhudsongary_poster: bzr-hg is not nearly as polished as bzr-git22:50
mwhudsoni don't know if that's why22:50
lifelessthumper: sure thing22:50
gary_postermwhudson: ok, thank you.22:50
lifelessthumper: skype?22:50
thumperlifeless: ack22:51
thumperhttps://lp-oops.canonical.com/oops.py/?oopsid=1698XMLP10822:51
thumperhttps://lp-oops.canonical.com/oops.py/?oopsid=1698XMLP11022:52
thumperhttps://lp-oops.canonical.com/oops.py/?oopsid=1698XMLP11522:52
thumperhttps://lp-oops.canonical.com/oops.py/?oopsid=1698XMLP11822:52
gary_postermwhudson, I've actually come to trust bzr-svn so much that I use it to commit to public svn repos.  I didn't always feel that way.  Do you happen to know if bzr-git is similarly polished?22:54
mwhudsongary_poster: it's pretty close22:55
gary_postercool22:55
gary_posterthanks again22:55
wgrantApart from the whole push vs dpush thing.22:55
mwhudsonit's less polished than bzr-svn probably, but it's a bit less of a model change so its job is a bit easier22:55
mwhudsonoh right yeah, and it doesn't roundtrip22:55
jelmerwgrant: roundtripping support is on the way and will be in the next non-bugfix release22:56
james_w`\o/22:56
gary_posterawesome :-)22:56
=== matsubara is now known as matsubara-afk
gary_posterI see this page with caveats, and links to more http://doc.bazaar.canonical.com/migration/en/foreign/bzr-on-git-projects.html22:57
wgrantjelmer: Ooh, nice.22:57
wgrantjelmer: Where are you storing the data?22:57
jelmerwgrant: a special file in the tree and the commit message22:58
wgrantjelmer: The file stores file IDs?22:58
jelmerwgrant: yes, for file ids introduced by bzr23:00
lifelessthumper: https://devpad.canonical.com/~stub/ppr/lpnet/latest-daily-pageids.html23:00
wgrantjelmer: Isn't that going to be merge conflict fun?23:01
jelmerwgrant: yes, there might be some issues in that regard23:11
wgrantrockstar: FWIW, the lp-buildd chroots have nothing to do with security.23:12
jelmerwgrant: fortunately it will only happen when people merge from multiple others who have used bzr to introduce new files23:12
wgrantchroots are useless when you have root.23:12
wgrantjelmer: Yep.23:12
wgrantStill not ideal, but such is git...23:12
jelmerwgrant: It's the best I could come up with that was scalable, reliable and practical.23:12
SnorlaxAnyone here, that has deployed a launchpad install on a private server?23:25
lifelessjelmer: wgrant: merge is hookable. Hook and win.23:25
jelmerlifeless: we were talking about git though23:26
lifelessits hookable too :P23:26
jelmerlifeless: fair enough :-)23:26
wgrantSo Git really doesn't have any custom metadata support?23:26
wgrant*At all*?23:27
jelmerwgrant: well, there is notes, which are basically branches with metadata about e.g. commits23:27
wgrantBut I hear they don't propagate.23:27
jelmerthey are used to make after-the-fact modifications to existing fields in git23:27
jelmerso we can't use them - their contents would still show up in the git ui23:28
wgrantHah.23:28
=== al-maisan is now known as almaisan-away
lifelessmars: around ?23:39
lifelessOOPS-1710EC915, OOPS-1710ED88023:39
Ursinhalifeless, I guess he's out for Labor Day23:50
lifelessah yes23:52
