[00:02] <huwshimi> lifeless: Morning
[00:02] <huwshimi> lifeless: I started to take a look at the caching
[00:02] <huwshimi> lifeless: Is this what you meant? https://code.launchpad.net/~huwshimi/launchpad/avatars-everywhere-712894/+merge/81430
[00:03] <huwshimi> (I've pushed changes which will be in the diff)
[00:03] <huwshimi> lifeless: Is there more that I need to do?
[00:05] <huwshimi> lifeless: You'll have to bear with me, as I really don't know what I'm doing here :)
[00:07] <lifeless> heh, sure
[00:07] <lifeless> huwshimi: thats pretty close
[00:08] <lifeless> I'll give a longer answer in the review
[00:08] <huwshimi> lifeless: Thanks :)
[00:13] <lifeless> done, -> optometrist, bbiab
[00:54] <wgrant> %#$@
[00:54] <wgrant> die, postgres, die
[00:54] <lifeless> ?
[00:54] <StevenK> What about it?
[00:55] <wgrant> The planner is making stupid decisions.
[00:55]  * wgrant pastes.
[00:57] <wgrant> lifeless: http://paste.ubuntu.com/731544/
[00:57] <wgrant> First plan, for a single policy.
[00:58] <wgrant> Very sensible.
[00:58] <wgrant> Straight double nested loop down three indices.
[00:59] <wgrant> Second one, also a single policy but not specifying the ID directly (this is how I initially tried it): terrible terrible plan doing a manual person sort afterwards. It should have known it would get at most one row, but perhaps the policy ID I hardcoded is in the stats so it gets special treatment.
[00:59] <wgrant> Third one, hardcoding the three IDs.
[00:59] <wgrant> Even worse plan.
[00:59] <wgrant> (these are Ubuntu's three access policies, on real data)
[00:59] <lifeless> wgrant: a request for these pastes in future
[00:59] <lifeless> wgrant: please wrap the queries.
[01:00] <wgrant> lifeless: I normally do, but was too grumpy this time.
[01:00] <wgrant> I tried a CTE, but it doesn't work. Even if it returns just the one row.
[01:00] <wgrant> I guess because it plans beforehand.
[01:01] <wgrant> It's probably seeing that the 97822 policy is special and planning accordingly, I guess.
[01:02] <lifeless> so id=97822 == policy=1 ?
[01:02] <wgrant> But if it knows it's going to dominate, then why does it ignore it when I specify 3 IDs...
[01:02] <wgrant> Ah, yes, sorry.
[01:02] <wgrant> 97822 == distro 1, policy 1
[01:02] <wgrant> Which is Ubuntu private non-security.
[01:02] <lifeless> is policy an enum ?
[01:02] <wgrant> Yes.
[01:03] <wgrant> It may be better to explode it here.
[01:03] <wgrant> But only because the planner is being obtuse.
[01:03] <wgrant> It's perfectly feasible to do the same simple nested loop down three indices, just with an ANY on the apg filter.
[01:03] <lifeless> look at the estimates
[01:03] <lifeless> its estimating 2, then 45
[01:04] <lifeless> 2nd plan
[01:04] <wgrant> Yeah.
[01:04] <lifeless> first one its estimating 57K rows - and considering 1.2M
[01:04] <lifeless> but the actual execution is fast
[01:05] <lifeless> this is on df ?
[01:05] <wgrant> Yes.
[01:05] <wgrant> I have the queries to set up the data in a minute or two, if you want.
[01:06] <lifeless> the second plan has a lower cost estimate
[01:06] <lifeless> and the third lowest of all
[01:06] <wgrant> Yeah, because the row count estimates are off.
[01:06] <lifeless> so the interesting question is why
[01:06] <lifeless> have you analyzed ?
[01:06] <wgrant> By a factor of 3.
[01:06] <wgrant> Er, 3 orders of magnitude.
[01:06] <wgrant> Yes.
[01:06] <wgrant> VACUUM ANALYZE, ANALYZE, etc.
[01:06] <wgrant> Several times.
[01:06] <wgrant> No change after the initial one.
[01:06] <lifeless> scale 1000 ?
[01:06] <lifeless> or the 100 df is running ?
[01:06] <wgrant> 100
[01:07] <lifeless> care to try
[01:07] <wgrant> I guess.
[01:07] <wgrant> Worth a try.
[01:07] <lifeless> its a thing to rule out
[01:09] <lifeless> so the first one is potentially a bad plan too
[01:09] <lifeless> but we expect the data sizes to stay modest
[01:09] <lifeless> what does it do if you say offset 500 to it ?
[01:09] <lifeless> [curiosity]
[01:09] <wgrant> I'm not sure that's important, but sure.
[01:10] <wgrant> 680ms, so it's pretty linear.
[01:10] <wgrant> As expected from the plan.
[01:12] <wgrant> Doing a sort-key-based offset is very fast, also as expected.
[01:12] <wgrant> That plan is perfect for the main disclosure management view.
[01:13] <wgrant> (unless the set of observers is small)
[01:15] <wgrant> person_sorting_idx is only 57MB, and presumably very hot, so it may even be fast enough when there aren't many.
[01:16] <wgrant> I suspect the first batch is actually slower than the rest.
[01:17] <wgrant> Because it's going to have to walk through all the usernames that start with digits.
[01:17] <wgrant> Roughly none of which are going to be relevant, because sensible people don't generally have usernames that start with digits.
[01:17] <StevenK> UPDATE person set name = '8wgrant' where name = 'wgrant'; commit;
[01:18] <wgrant> I actually mean displayname.
[01:18] <wgrant> But yes.
[01:31] <lifeless> rofl
[01:31] <lifeless> https://github.com/juuso/BozoCrack
[01:35] <wgrant> Heh
[01:35] <wgrant> It is a useful strategy sometimes.
[01:40] <wgrant> lifeless: So, any query suggestions?
[01:41] <lifeless> sorry was sidetracked
[01:41] <lifeless> uhm
[01:41] <lifeless> (distribution, policy) is unique
[01:42] <lifeless> but id is what shows up on the other tables
[01:42] <wgrant> Yes.
[01:42] <lifeless> where are your data generating scripts?
[01:42] <lifeless> (and do they work on temp tables ;))
[01:44] <wgrant> They are temp tables, yes.
[01:44] <lifeless> wgrant: try this http://paste.ubuntu.com/731583/
[01:44] <wgrant> Just dropped the fks and it all works.
[01:44] <lifeless> bah
[01:44] <lifeless> not limit 0
[01:44] <lifeless> offset 0
[01:45] <wgrant> http://paste.ubuntu.com/731585/ <- forgive the awful code to create artifact-specific permissions. There's probably a better way to do it, but everything else I tried was unbelievably slow on DF.
[01:50] <wgrant> lifeless: Erm, explicit offset 0?
[01:50] <wgrant> Isn't that redundant?
[01:50] <wgrant> (also, that's still 3.5s)
[01:51] <lifeless> optimisation barrier
[01:51] <wgrant> I think that's the problem.
[01:52] <wgrant> I suspect it chooses the index-scan-only plan because the ID it's looking for dominates the stats.
[01:52] <wgrant> Except not.
[02:06] <lifeless> what do you mean by 'dropped the fks and it all works'
[02:06] <wgrant> The original table definitions had foreign keys to person/bug/product/etc.
[02:06] <wgrant> But temp tables can't have fks to permanent tables.
[02:06] <wgrant> So I dropped the constraints.
[02:06] <lifeless> oh right, of course ;)
[02:07] <lifeless> I presumed that (bugsummary experience most recently ;P)
[02:08] <lifeless> hah
[02:08] <lifeless> Time: 27666.227 ms
[02:09] <wgrant> Oh?
[02:09] <lifeless> first run of second plan
[02:09] <wgrant> Ah
[02:10] <lifeless>  SELECT DISTINCT ON (person_sort_key(person.displayname, person.name)) person.id FROM accesspolicygrant JOIN accesspolicyuse ON accesspolicyuse.id = accesspolicygrant.policy JOIN person ON person.id = accesspolicygrant.person WHERE accesspolicyuse.distribution = 1 AND accesspolicyuse.id=97822 ORDER BY person_sort_key(person.displayname, person.name) LIMIT 75;
[02:10] <lifeless>  id
[02:10] <lifeless> ----
[02:10] <lifeless> (0 rows)
[02:10] <wgrant> Your IDs will be different :)
[02:10] <lifeless> ah, doh
[02:11] <wgrant> SELECT id, policy FROM accesspolicyuse WHERE distribution=1;
[02:11] <lifeless> we'll need to iterate on abels code
[02:11] <lifeless> to paginate on a function sort
[02:12] <wgrant> Yes.
[02:12] <wgrant> I had already looked at that a little.
[02:12] <lifeless> did you look at nuking the functional index?
[02:13] <wgrant> Just using eg. person.name for these queries? Yes, didn't help.
[02:14] <huwshimi> wgrant: Is this what you meant for fixing that XSS problem? http://paste.ubuntu.com/731598/
[02:15] <wgrant> huwshimi: !!! someone did it right the first time. Yes, that's what I was meaning.
[02:15] <huwshimi> wgrant: Hah! It might look right, but it provides a mangled output
[02:16] <wgrant> Oh?
[02:16] <wgrant> Ah
[02:17] <wgrant> You may need to return .escapedtext, depending on how you're using it.
[02:17] <huwshimi> wgrant: It returns this: http://paste.ubuntu.com/731600/
[02:17] <wgrant> If it's in a TALES formatter, it probably will need .escapedtext.
[02:17] <huwshimi> wgrant: Ah
[02:17] <wgrant> Yeah.
[02:17] <wgrant> Sorry, forgot to mention that.
[02:18] <huwshimi> wgrant: That's fine :)
[02:19] <huwshimi> wgrant: Perfect, working now :)
[02:19] <wgrant> Great!
[02:21] <lifeless> wgrant: .name wouldn';t help here, but for abels work...
[02:22] <lifeless>  registrant                         |          1 |         4 |          0
[02:22] <lifeless> we might want to drop that
[02:22] <wgrant> ?
[02:23] <lifeless> totally unused colum on person
[02:23] <wgrant> s/drop/fix/
[02:23] <lifeless> also language
[02:24] <lifeless> 'meh'
[02:28] <lifeless> wgrant: can you paste me
[02:28] <lifeless> SELECT attname, most_common_vals, most_common_freqs FROM pg_stats WHERE tablename='accesspolicyuse';
[02:29] <lifeless> actually, nvm
[02:29] <wgrant> http://paste.ubuntu.com/731614/ anyway.
[02:31] <lifeless> thats very interesting
[02:31] <lifeless> I wonder why distribution is blank
[02:31] <wgrant> Distributions are extremely rare.
[02:31] <wgrant> 40 vs 33000
[02:32] <lifeless>  SELECT attname, most_common_vals, most_common_freqs FROM pg_stats WHERE tablename='accesspolicyuse' and attname='distribution';
[02:32] <lifeless>    attname    |                                         most_common_vals                                          |                                                                                                                                                                                                           most_common_freqs
[02:32] <lifeless> --------------+---------------------------------------------------------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
[02:32] <lifeless>  distribution | {1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35} | {3.05157e-05,3.05157e-05,3.05157e-05,3.05157e-05,3.05157e-05,3.05157e-05,3.05157e-05,3.05157e-05,3.05157e-05,3.05157e-05,3.05157e-05,3.05157e-05,3.05157e-05,3.05157e-05,3
[02:32] <wgrant> Even a stats target of 1000 might not catch a distribution.
[02:32] <lifeless> ...
[02:32] <lifeless> e.g. its not blank for me
[02:32] <wgrant> You probably still have a huge target.
[02:32]  * wgrant tries.
[02:33] <lifeless> 105 apu entries with no product
[02:33] <lifeless> 98K where it is
[02:33] <lifeless> 0.1%
[02:33] <lifeless> 1-0.001^1000 chance of not seeing one
[02:34] <lifeless> IIRC
[02:34] <wgrant> WTF
[02:34] <wgrant> Now product is empty.
[02:34] <wgrant> distribution has lots.
[02:36] <lifeless> 1000 is the stats it keeps, not the rows it consults
[02:36] <lifeless> to determine most common it has to consult more rows than it keeps
[02:36] <wgrant> Oh.
[02:36] <wgrant> Still doesn't explain it.
[02:36] <lifeless> we're looking at the rows it keeps, and if that was random over the population we could estimate by frequency
[02:37] <lifeless> but this is explicitly the most common values per attribute
[02:37] <lifeless> so yeah, I dunno
[02:37] <wgrant> And on APU it's 3 tuples for each non-NULL value.
[02:39] <lifeless> tell me something, why does your script generate 98K policy uses
[02:39] <lifeless> is that because all products get security ?
[02:40] <wgrant> It currently just creates public, private, security for every pillar.
[02:40] <wgrant> Until we have a sensible way to manage comment editing, we probably have to allow private bugs for all projects.
[02:44] <wgrant>         4 / 1346  RootObject:+login
[02:47] <lifeless>   EXPLAIN ANALYZE SELECT person.id FROM person WHERE id IN (SELECT person FROM accesspolicygrant where policy IN (SELECT id from accesspolicyuse where distribution=1 and policy=1)) ORDER BY person_sort_key(person.displayname, person.name) LIMIT 75;
[02:47] <lifeless> 780ms
[02:50] <lifeless> wgrant: I think your first plan is a fluke the more I look at it
[02:50] <wgrant> Oh?
[02:50] <lifeless> wgrant: the indices are not in the same order
[02:50] <wgrant> It may be a fluke, but the others are crap :)
[02:50] <lifeless> one is person sort
[02:51] <lifeless> the other is person
[02:51] <wgrant> Yes.
[02:51] <lifeless> it can't dual iterate
[02:51] <wgrant> No.
[02:52] <lifeless> it can?
[02:52] <wgrant> But it happens that (in this case) the coverage is good enough that it's not a problem.
[02:52] <wgrant> It can't, AFAIK.
[02:52] <lifeless> double negs :)
[02:52] <lifeless> so
[02:52] <lifeless> I think a better question isn't 'why are the other plans slow'
[02:52] <lifeless> its 'what do we want to actually query'
[02:53] <lifeless> thus the query above
[02:53] <lifeless> or
[02:54] <lifeless> a less extreme
[02:54] <lifeless> EXPLAIN ANALYZE SELECT person.id FROM person WHERE id IN (SELECT person FROM accesspolicygrant, accesspolicyuse WHERE accesspolicygrant.policy=accesspolicyuse.id AND  distribution=1 and accesspolicyuse.policy=1) ORDER BY person_sort_key(person.displayname, person.name) LIMIT 75;
[02:54] <wgrant> Which is 1180ms
[02:57] <lifeless> interestingly
[02:57] <lifeless> you realise that you are querying outside of both your partial indices
[02:58] <wgrant> Yes.
[02:58] <wgrant> I added a new non-partial one.
[02:58] <lifeless> is NULL -> 4ms
[02:58] <wgrant> Which would possibly be better done by making another one non-partial, but perhaps not.
[02:58] <lifeless> is not NULL -> 2000ms
[02:58] <wgrant> On what?
[02:58] <lifeless>  EXPLAIN ANALYZE SELECT DISTINCT ON (person_sort_key(person.displayname, person.name)) person.id FROM accesspolicygrant JOIN accesspolicyuse ON accesspolicyuse.id = accesspolicygrant.policy JOIN person ON person.id = accesspolicygrant.person WHERE accesspolicyuse.distribution = 1 AND accesspolicyuse.policy = 1 AND artifact is NOT NULL ORDER BY person_sort_key(person.displayname, person.name) LIMIT 75;
[02:58] <wgrant> artifact?
[02:59] <wgrant> Well, yes.
[02:59] <wgrant> There are far fewer where artifact is NOT NULL
[02:59] <wgrant> That's not so much the indices as the volume of data.
[02:59] <wgrant> Your thing now has to sort a total of 1 person.
[02:59] <wgrant> Which is not a significant challenge.
[02:59] <lifeless> different plans
[03:00] <lifeless> not unsurprising
[03:01] <lifeless> ok so to rephrase the problem
[03:01] <lifeless> 'there are 57K people with explicit access within Ubuntu and we want to sort them by fields only on Person'
[03:01] <wgrant> Yes.
[03:02] <lifeless> by definition we need to either read *all people* in sort order filtering against the set of grants - which will for random names mean we read ~ all people
[03:02] <lifeless> or we need to read all grants (57K) and then cherry pick people
[03:02] <wgrant> Yes.
[03:02] <lifeless> which is much more selective and what its doing
[03:02] <lifeless> so why is it slow?
[03:03] <wgrant> I don't know. I wouldn't expect materialising 57k rows from such a narrow table to be slow at all.
[03:03] <lifeless>                ->  Nested Loop  (cost=3.39..66.16 rows=13 width=25) (actual time=14.369..1910.529 rows=57946 loops=1)
[03:03] <wgrant> Which plan is this?
[03:03] <lifeless> EXPLAIN ANALYZE SELECT DISTINCT ON (person_sort_key(person.displayname, person.name)) person.id FROM accesspolicygrant JOIN accesspolicyuse ON accesspolicyuse.id = accesspolicygrant.policy JOIN person ON person.id = accesspolicygrant.person WHERE accesspolicyuse.distribution = 1 AND accesspolicyuse.policy = 1 AND artifact is NOT NULL ORDER BY person_sort_key(person.displayname, person.name) LIMIT 75;
[03:03] <lifeless>                      
[03:03] <lifeless> but any of them start to look like this
[03:03] <wgrant> eparse
[03:04] <lifeless>  explain analyze SELECT DISTINCT ON (person_sort_key(person.displayname, person.name)) person.id FROM accesspolicygrant JOIN accesspolicyuse ON accesspolicyuse.id = accesspolicygrant.policy JOIN person ON person.id = accesspolicygrant.person WHERE accesspolicyuse.distribution = 1 AND accesspolicyuse.id=98234 ORDER BY person_sort_key(person.displayname, person.name) LIMIT 75;
[03:04] <lifeless>                       
[03:04] <lifeless> is weird
[03:04] <wgrant> What about it is weird? The plan?
[03:05] <lifeless> yes
[03:05] <wgrant> Hmm.
[03:06] <wgrant> I guess the non-indexed sort of 57k people is not going to be terribly fast, because person_sort_key is PL/Python.
[03:06] <wgrant> But I'm pretty sure it happened with a straight string sort too.
[03:07] <lifeless> I'll just reply to your mail then test ;)
[03:07] <lifeless> I think materialising the sort column may well help
[03:07] <wgrant> Did anything useful come from the discussion this morning?
[03:07] <lifeless> yes, we cleared things up
[03:07] <wgrant> FSVO
[03:07] <wgrant> And for now.
[03:07] <StevenK> Hah
[03:08] <StevenK> Does that mean I can continue with my branch to hide private teams in subteam of?
[03:08] <wgrant> ANd probably by trebling the scope of the project.
[03:08] <lifeless> StevenK: yes, the confusion was that that page shows super and sub
[03:08] <lifeless> StevenK: see bug 405277 which really covers the visibility rules
[03:08] <_mup_> Bug #405277: Private teams are not able to join other teams (public or private) <disclosure> <lp-registry> <privacy> <teams> <Launchpad itself:Triaged> < https://launchpad.net/bugs/405277 >
[03:09] <StevenK> wgrant: If it's a death march until March, what difference does December next year make? :-(
[03:09] <lifeless> StevenK: you probably also want to hide proposed memberships from private teams, unless the private team has signed off already.
[03:10] <StevenK> I really want to finish this branch that kills strict_component from {check,verify}Upload since it's close to done, but wgrant is ignoring me. :-P
[03:10] <wgrant> Bah, that could have been said here rather than in PM.
[03:11] <StevenK> You were busy here :-P
[03:11] <wgrant> StevenK: I don't understand what you're saying.
[03:11] <StevenK> So, there is two things
[03:12] <StevenK> One test failure is test_checkUpload_without_strict_component. I'm concerned that is a buggy test since strict_component is dead and is possibily relying on behaviour that is removed.
[03:14] <StevenK> Hm. It's expecting checkUpload() to pass, but it now raises NoRightsForComponent.
[03:14] <StevenK> s/raises/returns/
[03:14] <StevenK> Since returning exceptions is sexy or something
[03:15] <wgrant> It's no longer a useful or possible test.
[03:16] <wgrant> So it should be removed.
[03:17] <StevenK> The other test failure is from the upload processor, which is not rejecting a package that should be.
[03:17] <StevenK> I think it is because checkUpload() is returning an exception and we do not reject because the package is NEW.
[03:18] <wgrant> pastebin the failure
[03:18] <wgrant> And diff
[03:18] <StevenK> AssertionError: !=:
[03:18] <StevenK> reference = "Not permitted to upload to the RELEASE pocket in a series in the 'CURRENT' state."
[03:18] <StevenK> actual = ''
[03:18] <wgrant> Ah, indeed.
[03:18] <wgrant> So, we need to catch the no privileges exception.
[03:19] <wgrant> Well, "catch"
[03:19] <wgrant> Check for it.
[03:19] <StevenK> Yes.
[03:19] <wgrant> If checkUpload has failed due to a lack of permissions, then do the NEW check.
[03:19] <StevenK> This strikes me as ugly
[03:19] <wgrant> Welcome to Soyuz.
[03:21] <StevenK> wgrant: Any exception from checkUpload is lack of perms?
[03:21] <wgrant> No.
[03:21] <wgrant> eg. the pocket error that you quoted
[03:21] <StevenK> NoRightsForArchive() == lack of permissions
[03:22] <StevenK> NoRightsForComponent() == lack of permissions
[03:22] <wgrant> NoRightsForArchive is a lie and should probably die, or be moved into NascentUpload.
[03:22] <wgrant> NoRightsForComponent is stupid and too specific.
[03:23] <StevenK> CannotUploadToPPA is arguably permissions
[03:23] <StevenK> However, I agree ArchiveDisabled is not.
[03:23] <wgrant> It's not a matter for agreement.
[03:24] <StevenK> So, how should this code be structured?
[03:24] <StevenK> And do you still need to see the diff?
[03:27] <wgrant> I would probably rework verifyUpload to raise a single exception when there are insufficient upload rights.
[03:27] <wgrant> Except perhaps preserving NoRightsForArchive, but implementing it in a non-buggy fashion.
[03:27] <wgrant> NoRightsForComponent has no right to exist.
[03:28] <StevenK> Isn't it used for, say, partner?
[03:29] <wgrant> Partner doesn't need to behave well.
[03:29] <wgrant> As long as it vaguely works for 6 months.
[03:30] <StevenK> Hm, partner is a seperate Archive anyway
[03:30] <wgrant> In some ways.
[03:30] <StevenK> It isn't Archive.id 1, is what I mean
[03:30] <StevenK> Which means it has its own ArchivePermission entries
[03:30] <wgrant> Yes.
[03:30] <wgrant> But that's almost all it means.
[03:32] <StevenK> NoRightsForComponent is only checked due to checkArchivePermission(person, component), is that check pointless too?
[03:33] <wgrant> That would be cheating.
[03:35] <StevenK> I don't really want to rewrite verifyUpload() in this branch
[03:35] <wgrant> It's only like 10 lines.
[03:35] <StevenK> It's currently 30, so that sounds like an improvment
[03:36] <StevenK> I'm tempted to say 'prove it' :-P
[03:40] <lifeless> wgrant: ok, the call
[03:40] <lifeless> wgrant: so the two bits we were getting stuck/confused on were:
[03:40] <lifeless>  Person:+index shows sub and super; this was just confusion. super should not be shown. sub should always be shown.
[03:40] <lifeless> StevenK: ^
[03:40] <lifeless> and bug subscriptions should always be shown too
[03:41] <lifeless> no spying!
[03:41] <lifeless> wallyworld_: ^
[03:41] <wgrant> That was my understanding.
[03:42] <wallyworld_> lifeless: so we don't *ever* want to hide private subscribers?
[03:42] <lifeless> super should not be shown because the teams a person are in should not be disclosed just because you know the person -> visibility is not granted to non-participants of a Person against their memberships.
[03:43] <lifeless> subteams should always be shown (if you can see the team they are a member of) because the subteam could otherwise give you an invisible project owner. e.g. public project, public owning team, no members visible. But really there is a private team lurking inside the public team.
[03:43] <lifeless> we don't want that-> we show the existence of the private team.
[03:44] <lifeless> wallyworld_: the rule is, if you can see the bug, you can see the name, displayname, url, of its subscribers (including structural subs)
[03:44] <wallyworld_> ok. but if you click on the link, it will 404?
[03:44] <wgrant> lifeless: aaaaaaa
[03:44] <lifeless> wallyworld_: there are a few drivers for this. One is the guiding principle that interacting with a public object discloses private objects.
[03:44] <wgrant> lifeless: We're changing privileges based on admin status?
[03:44] <lifeless> wallyworld_: it will give a 200 and a little details.
[03:45] <lifeless> wallyworld_: which will require separate work.
[03:45] <wallyworld_> ok,so the normal +index view with a bunch of stuff not rendered
[03:45] <lifeless> wallyworld_: the second is that if anyone replies to the bug via mail, and they are subscribed via a private team, they are very likely to disclose the private team anyhow due to the footer we include, which 'Reply' in many modern mail clients includes.
[03:45] <lifeless> wallyworld_: yes
[03:45] <wgrant> lifeless: How do you propose that Person:+index knows if you have partial visibility to it?
[03:46] <lifeless> wgrant: theres a something like 10 clause EXISTS union we need to write.
[03:46] <lifeless> wgrant: AFAICT it should be fast.
[03:46] <wgrant> I was about to say that that was not an answer.
[03:47] <wallyworld_> only 10 clauses? surely we could try for 20 :-P
[03:47] <wgrant> We would have to join bugsubscription->bug->accesspolicyartifact->accesspolicyuse->accesspolicygrant->teamparticipation, and that's just for one part of it.
[03:48] <lifeless> wgrant: its not trivial unfeasible.
[03:48] <wgrant> It is, however, trivial silly.
[03:48] <lifeless> I have been thinking about this for a year or so
[03:49] <lifeless> and I disagree with you :)
[03:49] <wgrant> So.
[03:49] <lifeless> 9
[03:49] <lifeless> bah
[03:49] <wgrant> Say that one day in future we decide that being a monolithic webapp is actually a stupid idea, and SOA is cool.
[03:49] <lifeless> how about that huh
[03:50] <lifeless> \o/
[03:50] <wgrant> How does that work now?
[03:50] <wgrant> We call out to 50 services?
[03:50] <lifeless> in parallel, yes
[03:50] <wgrant> ...
[03:50] <StevenK> Like fun that scales
[03:50] <lifeless> (and cancel the calls when one answers)
[03:50] <wgrant> This also puts the LP services in a strongly privileged position.
[03:50] <lifeless> StevenK: it actually scales very very well
[03:51] <lifeless> StevenK: because each service can cache answers to these questions when appropriate
[03:51] <lifeless> StevenK: 'can user X see object Y' is a -stock- query any service that owns data has to be able to answer efficiently
[03:51] <StevenK> lifeless: But 50 parallel calls? Times 50 users?
[03:51] <lifeless> StevenK: where do you get times 50 users ?
[03:51]  * wallyworld_ gets out the popcorn
[03:51] <StevenK> Out of the air
[03:52] <lifeless> StevenK: so, this is the sequence:
[03:52] <lifeless> traversal
[03:52] <lifeless> oh this is private
[03:52] <lifeless> check rules
[03:52] <lifeless> -> denied
[03:52] <lifeless> -> full
[03:52] <lifeless> -> restricted
[03:52] <lifeless> where denied is a failure to meet full or restricted
[03:52] <lifeless> full depends on the object, say for Person, it is granted by: is_member or is_owner
[03:53] <lifeless> restricted is granted by 'common ground', so we need to find *a* object where:
[03:53] <lifeless> user.expand_through_team_participation() can see && object subscribed-to, owns, etc.
[03:54] <lifeless> so one call to the team membership service, to get all the teams of user
[03:54] <lifeless> thats fast, we know it is.
[03:54] <lifeless> and then for each data store that might hold an object the Person we are looking at owns [note we don't need TP expansion of it]
[03:55] <lifeless> we ask for exists(discloses=Person, viewer=TP_expanded_user)
[03:56] <lifeless> will it be dead trivial to write? Probably not. Is it beyond our current schema, or an SOA - not at all.
[03:56] <lifeless> its also very cachable if we need to.
[03:56] <lifeless> though that might be terrifyingly large.
[04:11] <lifeless> wallyworld_: StevenK: I've commented on both your bugs.
[04:11] <wallyworld_> thanks
[04:11] <wallyworld_> nice to see it clarified :-)
[04:11] <lifeless> hopefully it makes sense
[04:12] <wallyworld_> i'll read soon. don't want to context switch right at the minute
[04:12] <lifeless> if it doesn't, I'm certain sinzui and I are in sync (we were very close to as it turned out), and I can clarify further as needed.
[04:12] <lifeless> thats fine
[04:12] <lifeless> tis EOD here, so I'm going to be in and out for a hiwle
[04:12] <wallyworld_> i'll grok before tomorrow's standup
[04:12] <StevenK> wgrant: I still can't see how verifyUpload() can be only ten lines :-(
[04:13] <wgrant> 20, maybe.
[04:14] <StevenK> wgrant: If it makes it easier and simplier, then I'm for it, I just can't envision where to start ripping out bits
[04:15] <lifeless> wgrant: http://www.postgresonline.com/journal/archives/209-Manually-setting-table-column-statistics.html
[04:15] <lifeless> wgrant: may help
[04:15] <lifeless> wgrant: e.g. upgrade to pg 9
[04:15] <StevenK> Oh sure, because the upgrade to 8.4 went so well :-P
[04:15] <wgrant> We should really do that eventually.
[04:15] <lifeless> stub and I began discussing logistics for it yesterday, with fdt nearly done
[04:18] <lifeless> wgrant:  EXPLAIN ANALYZE SELECT person.id FROM person WHERE id IN (SELECT person FROM accesspolicygrant, accesspolicyuse WHERE accesspolicygrant.policy=accesspolicyuse.id AND  distribution=1 and accesspolicyuse.policy=1) ORDER BY person.displayname LIMIT 75;
[04:18] <lifeless> Time: 351.454 ms
[04:19] <lifeless> wgrant: compare the plans
[04:19] <lifeless>                ->  Index Scan using person_pkey on person  (cost=0.00..6.39 rows=1 width=14) (actual time=0.012..0.013 rows=1 loops=17313)
[04:21] <lifeless> wgrant: its a bit variable, but as low as 200ms
[04:21] <lifeless> wgrant: I suspect staging is getting a hammering just now ;)
[04:22] <lifeless> wgrant: smaller sort too
[04:22] <stub> It will be faster if you ORDER BY lower(displayname)
[04:22] <stub> Or at least, it might be able to use an index then
[04:23] <wgrant> lifeless: Hm, does that win for any reason beyond the DISTINCT being done first, and not using person_sort_key?
[04:23] <wgrant> The DISTINCT only cuts it by a factor of 4.
[04:23] <wgrant> The rest is pretty similar, apart from the sort key.
[04:23] <stub> ORDER BY person_sort_key(displayname, name) might be good too
[04:23] <lifeless> stub: thats twice as slow
[04:23] <lifeless> stub: functional index
[04:24] <lifeless> stub: wgrant can fill you in on the gory details :)
[04:24] <stub> There used to be bugs with functional indexes, where the function would be called unnecessarily. Not sure if that has been fixed yet.
[04:25] <stub> But if the index is usable, lower() should be fast since we have an index on lower(displayname) and the functional index bug does not occur for magic internal functions like lower
[04:26] <lifeless> stub: its not using the index
[04:26] <lifeless> http://paste.ubuntu.com/731585/
[04:26] <lifeless> will get you the datatructure in temp tables, to play with
[04:26] <lifeless> http://paste.ubuntu.com/731544/ is the problem wgrant put to me
[04:26] <stub> With lower() it is not using the index?
[04:26] <wgrant> It can't really use the index.
[04:27] <lifeless> it could
[04:27] <wgrant> Except for that one special case.
[04:27] <wgrant> Well.
[04:27] <wgrant> It could.
[04:27] <lifeless> load it all up and walk
[04:27] <lifeless> but its going to be inefficient
[04:27] <wgrant> But it would have to walk the whole thing.
[04:27] <wgrant> Right.
[04:27] <wgrant> Like the first plan I provided.
[04:27] <wgrant> Which lifeless may have just linked.
[04:27] <lifeless> because we're likely to get a wide spread
[04:27] <wgrant> Yes
[04:27] <lifeless> only special cases will walk successfully
[04:27] <wgrant> For Ubuntu it works very well.
[04:27] <wgrant> For everyone else, probably not so much.
[04:28] <lifeless> we're pulling back 17K / 1437K peoples
[04:28] <lifeless> <2%
[04:28] <stub> But anyway, if there is slowness with a person_sort_key query, it might be due to PG bugs and try a lower instead as there is an index on that too.
[04:28] <lifeless> index walks will hit data pages, so an index walk becomes roughly equivalent to a table scan at this size of random-spread data
[04:28] <lifeless> stub: yeah, got that
[04:29] <lifeless> stub: I think there is (because the function is called per row, and there are [depending on query structure] 56K or 17K rows,.. and python is slow
[04:29] <lifeless> I might put a extra column on person and give it a spin, I'm curious
[04:29] <wgrant> I'm not quite sure why it's done in Python.
[04:30] <wgrant> Possibly because why not.
[04:30] <wgrant> In 2005.
[04:30] <lifeless> because we didn't comprehend the cost of functional indices when they were introduced.
[04:30] <wgrant> Right.
[04:30] <stub> Its designed to be used to order results, so slowness in theory only matters on insert
[04:30] <stub> Cause ordering would hit the index
[04:30] <lifeless> stub: and in sorting
[04:30] <wgrant> Ordering can hit the index.
[04:30] <wgrant> But it's only effective if you're sorting *lots* of people.
[04:30] <lifeless> ordering by (foo) requires foo to be calculated
[04:30] <lifeless> wgrant: other way around
[04:30] <lifeless> its only useful if you're doing an index walk
[04:31] <lifeless> which requires the index to -very very- closely match the query
[04:31] <wgrant> Right, which is only efficient if you're wanting lots of those people...
[04:31] <wgrant> Yes.
[04:31] <wgrant> Isn't that what I said?
[04:31] <lifeless> intersecting sets
[04:31] <lifeless> lots of people
[04:31] <lifeless> or subsections of the index that are well defined can do it too
[04:31] <wgrant> Walking the index is only efficient if you want a significant chunk of its values.
[04:34] <lifeless> or a small subsection
[04:34] <lifeless> with deep indices
[04:34] <lifeless> we can go around this all day :)
[04:34] <mwhudson> heh are you still talking about query plans?  i've reinstalled my machine and restored ~ from backup since you started on that
[04:34] <wgrant> OK, "walking a portion of an index is only efficient if you want a significant chunk of the values from that portion"
[04:35] <lifeless> yes, though really we should qualify that to 'in pg' - because other systems can walk on key prefixes ('loose scans')
[04:38] <StevenK> mwhudson: Why?
[04:38] <lifeless> qastaging may timeout for you for a little bit
[04:39] <huwshimi> lifeless: If you happen to get a chance to have another look at that avatar branch it would be much appreciated.
[04:39] <lifeless> huwshimi: is thursday ok ? I have tomorrow off for baby stuff, and doubt my ability to page it in properly tonight
[04:39] <mwhudson> StevenK: mixture of really strange acpi bugs and wanting the disk space win7 was taking up
[04:39] <huwshimi> lifeless: Yeah sure, no problems at all
[04:39] <lifeless> huwshimi: perhaps you can get the OCR (wgrant?) to review it, and take my comments into consideration.
[04:40] <huwshimi> lifeless: Sure, thanks :)
[04:40] <wgrant> I started at 6am, so I was actually about to wander off. But if it's short...
[04:40] <lifeless> mwhudson: my new intel 320 ssd is coming ... :)
[04:40]  * wgrant looks.
[04:40] <lifeless> mwhudson: I got the 300GB model
[04:40] <huwshimi> wgrant: It's not a problem. Another time is fine
[04:42] <mwhudson> lifeless: yeah, that's a bit tempting, but the 140 gigs i now have will take me a while to fill :)
[04:42] <mwhudson> i don't keep photos or much music on this machine anyway
[04:42] <lifeless> mwhudson: funny win7 stry
[04:42] <lifeless> mwhudson: my desktop came with win7 - its a 16GB ram machine
[04:42] <mwhudson> (about 1/3 of my disk space consumption is in ~/VirtualBox\ VMs/ ...)
[04:42] <lifeless> mwhudson: I kept win7 for games and just-in-cases
[04:43] <lifeless> mwhudson: gave it, I think 100GB or something.
[04:43] <lifeless> mwhudson: imagine my surprise when it creates a 16GB swapfile.
[04:43] <StevenK> Hah
[04:43] <lifeless> wgrant: stub: 40 seconds to do:
[04:43] <lifeless> select count(distinct person_sort_key(person.displayname, person.name)) from person;
[04:43] <wgrant> lifeless: Swapfile or hiberfile?
[04:44] <lifeless> wgrant: pagefile.sys
[04:44] <wgrant> Huh
[04:44] <wgrant> Odd
[04:44] <lifeless> no, unchanged since NT 3.5
[04:44] <lifeless> its daft, because more ram == more slowness from page file and less need
[04:45] <wgrant> I thought it by default used a dynamically scaling one.
[04:45] <lifeless> yes, thats the default size heuristic
[04:45] <StevenK> Linux still wants swap with a large amount of RAM
[04:45] <lifeless> StevenK: lots of sites that care about perf disable it these days for linux.
[04:45] <StevenK> Sure, the kernel wants it, doesn't mean it gets it. :-)
[04:46] <lifeless> StevenK: you still get VM backed shared pages etc; but there is precious little use for swap per se
[04:46] <stub> launchpad_prod_3=# explain select count(distinct(person_sort_key(displayname,name))) from person;
[04:46] <stub>                               QUERY PLAN
[04:46] <stub> -----------------------------------------------------------------------
[04:46] <stub>  Aggregate  (cost=51296.04..51296.30 rows=1 width=21)
[04:46] <stub>    ->  Seq Scan on person  (cost=0.00..47617.83 rows=1471283 width=21)
[04:46] <stub> (2 rows)
[04:46] <stub> Time: 534.701 ms
[04:46] <mwhudson> lifeless: hah
[04:46] <lifeless> stub: garh what
[04:46] <mwhudson> lifeless: that might have happened to me, i guess
[04:46] <mwhudson> (8 gigs in this machine)
[04:47] <stub> lifeless: Warm index
[04:47] <lifeless> mwhudson: http://support.microsoft.com/kb/314482
[04:47] <StevenK> I thought Windows had an option to twiddle for no swap
[04:47] <lifeless> The default paging file size is equal to 1.5 times the total RAM. However, this default configuration may not be optimal in all cases
[04:47] <wgrant> StevenK: It does.
[04:47] <lifeless> so, yes, not 16. 24GB
[04:47] <mwhudson> rofl
[04:47] <stub> lifeless: actually, not hitting the index. just warm and gobs of ram I guess to hold the immutable result of the function :)
[04:48] <wgrant> StevenK: Hm?
[04:48] <wgrant> Er.
[04:48] <wgrant> stub: That was an EXPLAIN.
[04:48] <stub> ahaha
[04:48] <wgrant> stub: Not an EXPLAIN ANALYZE...
[04:48] <lifeless> hah yes
[04:48] <lifeless> ECOFFEE
[04:48] <wgrant> So how was it 500ms.
[04:48] <wgrant> 1000?
[04:48] <wgrant> But surely not on such a trivial query.
[04:48] <wgrant> I mean, what planning is there to do...
[04:49] <mwhudson> does pg 9.1 work for lp dev?
[04:49] <lifeless> mwhudson: maybe ;)
[04:49] <wgrant> mwhudson: Unlikely.
[04:49] <wgrant> But possibly.
[04:49] <stub> mwhudson: probably
[04:49] <stub> mwhudson: Might need some tweaking of the various setup scripts
[04:49] <lifeless> person update still going... probably should have chunked it
[04:49]  * mwhudson chuckles
[04:49] <mwhudson> i guess i'll find out sooner or later
[04:49] <lifeless> at least its not waiting for anything
[04:49] <wgrant> Well, yes, if you want to tweak stuff then of course it will work :P
[04:49] <stub> lifeless: So the python method is about 3x slower than the builtin lower()
[04:50] <lifeless> stub: yes :)
[04:50] <lifeless> stub: which will be slower than an actual column
[04:50] <lifeless> I'm making one on qastaging's Person table.
[04:50] <lifeless> just because
[04:51] <stub> lifeless: I'm thinking of whitelist for scripts we know have long running transactions, so we need to add them explicitly and open bugs.
[04:51] <wgrant> stub: Hmm, but we often have to kill them early.
[04:51] <lifeless> stub: so there are three cases I think
[04:52] <wgrant> Because eg. karma sticks around for 10 minutes after we kill it.
[04:52] <lifeless> stub: readonly safe to kill
[04:52] <lifeless> stub: readonly unsafe (mutates disk unrecoverably)
[04:52] <lifeless> stub: readwrite unsafe (mutates disk unrecoverably)
[04:52] <lifeless> first case should be readonly/readwrite safe to kill (atomic, no external changes)
[04:52] <stub> lifeless: safe to kill, but should we do it without making note that this task needs to be fixed?
[04:53] <lifeless> stub: I like making a note
[04:53] <lifeless> stub: I worry that you mean 'have the deploy abort when it finds a new one'
[04:53] <stub> wgrant: But it won't stick around for 10 minutes if the full-update.py terminates the backend
[04:53] <stub> lifeless: if we find a new one, we should abort and investigate.
[04:54] <wgrant> stub: Are you sure?
[04:54] <lifeless> stub: why do you say that? we know what code we have...
[04:54] <wgrant> pg_cancel_backend isn't effective. I thought we tried pg_terminate_backend once too...
[04:54] <stub> lifeless: sticking our head in the sand, crossing our fingers and full steam ahead will *normally* work
[04:55] <stub> lifeless:  I mean defining EVIL_USERS = [], similar to fragile users, where we list the badly behaved db users along with a bug number. These users we ignore their long running transactions and let preflight.py terminate things.
[04:55] <lifeless> wow, person is sparse on id
[04:55] <stub> erm... full-update.py terminate things
[04:56] <mwhudson> lifeless: all those shipit accounts that got deleted?
[04:56] <lifeless> mwhudson: p'rhaps
[04:56] <StevenK> lifeless: Out of interest, did you see my analysis on bug 618399?
[04:56] <_mup_> Bug #618399: DistroSeries:+nominations timeouts <lp-blueprints> <timeout> <Launchpad itself:Triaged> < https://launchpad.net/bugs/618399 >
[04:56] <lifeless> StevenK: yes, sounded plausible
[04:56] <wgrant> They actually mostly didn't get deleted.
[04:56] <wgrant> There's still a tonne around.
[04:56] <lifeless> wgrant: check a histogram of person ids on 100K intervals
[04:57] <lifeless> stub: what about WARNING in the script (so losas see it) if something isn't in EVIL, but not aborting.
[04:57] <stub> lifeless: we used to be at >10million rows
[04:57] <stub> lifeless: Yes, so we proceed into downtime hoping everything is working fine.
[04:57] <stub> I'd rather not hope. I'd rather we paused and thought about things a little bit.
[04:57] <wgrant> stub: But the highest person ID is only like 3.6m
[04:57] <wgrant> And account ID is 4.8m.
[04:58] <StevenK> DROP TABLE Account;
[04:58] <stub> Could have sworn we got to 10million. Might be thinking of account?
[04:58] <wgrant> 2.2m people no longer exist.
[04:58] <stub> Graphs will have the answers if anyone wants to wait for them to load ;)
[04:58] <wgrant> I suspect lots of the remaining ones shouldn't either.
[04:59] <wgrant> Account retains 4.6m of its 4.8m rows :/
[04:59] <wgrant> I suspect 3.4m of those are unused.
[04:59] <wgrant> Now that shipit is deceased.
[05:02] <StevenK> How can we tell, and can we just delete them?
[05:02] <lifeless> stub: so you're placing a premium on us having *unidentified* code that is both (bad) and (harmful to interrupt).
[05:03] <lifeless> stub: I'm placing a premium on failing to do a FDT being disruptive to velocity and our squad leads having identified (harmful to interrupt) cases already.
[05:04] <lifeless> stub: how can we better refine the expected outcome of e.g. me being wrong and something horribly harmful and bad coming along
[05:09] <lifeless> stub: one way would be to list everything not fragile as evil, going from the list of db users
[05:09] <lifeless> anything we think interrupts could be harmful to wouldn't be evil.
[05:09] <lifeless> (and would cause a stop)
[05:09] <lifeless> but this seems, to me, to be equivalent to fragile
[05:16] <stub> lifeless: I think the rare occurrence of delaying a FDT a day or two (without any downtime) is worth us not guessing about what is safe to terminate or not. We have done a lot of FDT now, and identified two processes that are a problem.
[05:18] <stub> If it is not rare that some new is badly behaved or an existing process starts misbehaving, then we have a different problem.
[05:21] <stub> Why has this process changed? Why is this new process behaving badly? Has the system become unstable enough that we shouldn't risk further changes?
[05:22] <stub> Have the assumptions we used when designing fastdowntime deploys become invalid?
[05:25] <lifeless> wgrant: 200ms is as fast as I can get it tonight; stub may do better
[05:26] <stub> I haven't seen the query yet
[05:26] <lifeless> stub: the pastebins ;)
[05:26] <stub> I don't see the query we are trying to optimize in the pastebin I have.
[05:27] <lifeless> stub: separately: functional index - 35s, cached sort column, 8s - count distinct over person (1.4M rows) - hot index
[05:27] <stub> oic. found it
[05:28] <lifeless> stub: two pastebins: http://paste.ubuntu.com/731585/ and http://paste.ubuntu.com/731544/
[05:28] <lifeless> bah, yes
[05:29] <stub> My first question is why on earth would we want to do this query? Being able to hide from security reports by changing your name to match someone elses doesn't seem a good idea.
[05:29] <lifeless> stub: they can't do that :) - name is unique
[05:29] <lifeless> stub: its the url component
[05:30] <stub> So the distinct on is redundant
[05:30] <lifeless> yes, but wgrants first query needs the distinct to eliminate 40K duplicate rows
[05:30] <lifeless> my rephrased one focused on person only finds the 17K interesting rows inthe first place
[05:31] <lifeless> this is the fastest I have (so far):
[05:31] <lifeless> EXPLAIN ANALYZE SELECT person.id FROM person WHERE id IN (SELECT distinct person FROM accesspolicygrant, accesspolicyuse WHERE accesspolicygrant.policy=accesspolicyuse.id AND  distribution=1 and accesspolicyuse.policy=1) ORDER BY person.temp_sort_key LIMIT 75;
[05:31] <lifeless> where temp_sort_key is just a materialised person_sort_key
[05:31] <lifeless> it jumps around but hovers aorund 300ms
[05:32] <lifeless> stub: i'm thinking about the other stuff; need to cook dinner now
[05:32] <stub> yup
[05:34] <stub> Bah. setup pastebin missing a bracket
[05:37] <stub> Or gnome ate it
[05:48] <stub> I can't see any obvious ways of improving those queries more. I suspect we don't care enough to materialize the person_sort_key and should just use displayname or lower(displayname)
[05:49] <stub> (although lower(displayname) won't give unique ordering, blergh)
[05:50] <stub> EXPLAIN ANALYZE SELECT person.id FROM person WHERE id IN (SELECT DISTINCT ON (person) person FROM accesspolicygrant, accesspolicyuse WHERE accesspolicygrant.policy=accesspolicyuse.id AND  distribution=1 and accesspolicyuse.policy=1) ORDER BY lower(displayname),name LIMIT 75;
[05:50] <stub> Seems a little faster actually... probably cache artifacts.
[05:52] <stub> yer - performs the same as using temp_sort_key
[05:56] <jtv> Who can help out with an openid-and-account-merging problem?  https://answers.launchpad.net/launchpad/+question/177539
[06:02] <jtv> Come on people, not everybody at once please.  :)
[06:03] <jtv> stub, lifeless: any ideas what to do about that one?
[06:59] <lifeless> jtv: they need to merge them separately in SSO AIUI
[07:32] <wgrant> stub: I was initially doing DISTINCT ON (Person.name). But that wasn't fast, so I tried using the other indexed values that we normally use (person_sort_key). Those aren't real queries, but they're close :)
[07:39] <stub> wgrant: yer. surprised the standard join version comes in at twice lifeless' version though. But selecting 55k rows, distilling & ordering looks like it takes 2-300ms.
[07:39] <wgrant> stub: Yeah :/
[07:44] <lifeless> I was surprised at the slow too
[08:33] <adeuring> good morning
[09:11] <mrevell> Hi
[09:14] <allenap> Hello.
[11:19] <mrevell> hiw bigjools
[11:19] <mrevell>  or rather, "hi"
[11:20] <bigjools> mrevell: yello
[11:21] <bigjools> ok, how much do you know about copying packages in PPAs?
[11:21] <mrevell> bigjools, Relatively little.
[11:21] <bigjools> mrevell: ok I'll show you an example page
[11:22] <bigjools> mrevell: https://launchpad.net/~launchpad/+archive/ppa/+copy-packages
[11:22]  * mrevell looks
[11:22] <mrevell> Righto.
[11:23] <bigjools> mrevell: so at the moment, people click the boxes and the buttons and it tries to do all the copying inside a webapp request
[11:23] <mrevell> Right, I see.
[11:23] <bigjools> this frequently times out
[11:23] <mrevell> I can imagine :)
[11:23] <bigjools> because there's a million checks to make when copying
[11:24] <bigjools> we now have an infrastructure to do copying in the background in a job
[11:24] <bigjools> done as part of derived distros
[11:24] <bigjools> so I'd like to switch over to using that
[11:24] <bigjools> but there are two problems:
[11:25] <bigjools> 1. some people still want to do "instant" copying
[11:25] <bigjools> 2. if we do the background copying, we have no way of providing feedback if it fails
[11:25] <bigjools> so my question is, do you feel up to providing some input on that?
[11:25] <mrevell> I certainly do.
[11:26] <mrevell> I'd like to ask some questions, first.
[11:26] <bigjools> sure
[11:26] <mrevell> wrt the first point: how important do you judge this need to be? We can look at it in more detail etc but what's your feeling right now?
[11:27] <bigjools> sort of.  people are used to it, it would be hard to take away. I think wgrant had some strong opinions?
[11:28] <maxb> Also, there are scripts which invoke API methods to copy packages, and may legitimately (IMO) assume that a successful method return indicates a successful copy
[11:28] <bigjools> ah yes, I forgot about that
[11:28] <maxb> Copying a single source package generally does not time out
[11:29] <mrevell> bigjools, Would it be naive of me to ask if we can just hand the job to the longpoll infrastructure and then have a live status box on the page?
[11:29] <bigjools> maxb: there's a new API call for background copying
[11:29] <maxb> Just so long as the synchronous one doesn't go away
[11:29] <bigjools> it won't
[11:29] <mrevell> As for instant copying, I can imagine it might be fairly straightforward to have a UI that allowed for instant copying of a single package
[11:30] <mrevell> but explained it'd be a batch job for anything more.
[11:30] <bigjools> mrevell: longpoll would complement this for sure, but we'd need it to work without as well
[11:30] <mrevell> Huw can help us design that.
[11:30] <mrevell> bigjools, Forgive me for asking, but why does it need to work without it?
[11:30] <bigjools> mrevell: basically lifeless is insistent that we don't depend on rabbit
[11:31] <bigjools> I sort of agree
[11:31] <mrevell> Okay, fair enough.
[11:31] <mrevell> bigjools, So, we could we email people if there's a failure?
[11:31] <bigjools> it's an option, but one that I dislike
[11:32] <mrevell> Why?
[11:32] <bigjools> email indicates you have a UI failure
[11:32] <bigjools> we all get too much email already
[11:33] <wgrant> Sorry, back.
[11:33] <mrevell> Okay, fair points; particularly the second about getting too much email already.
[11:33] <bigjools> now it *could* be justified in that we already get email notifications for uploads anyway
[11:33] <wgrant> So, email is probably wrong.
[11:33] <wgrant> We should provide an API method to poll.
[11:33] <mrevell> bigjools, This is something we could fix with a personal dashboard.
[11:33] <wgrant> And longpoll for the normal case.
[11:33] <bigjools> wgrant: longpoll
[11:33] <wgrant> Right.
[11:34] <bigjools> we already have API to see if a package is present
[11:34] <wgrant> We should identify syncSource callers and ask known ones to migrate.
[11:34] <mrevell> bigjools, There'd be a notifications area. We could even push out a Twitter notification. "@bigjools Your package copy failed. See pad.lv/fail1234 for more detail."
[11:34] <bigjools> well we can keep syncSource
[11:34] <wgrant> Of course, there's no reason that reasonably sized copies can't be done in the webapp.
[11:34] <wgrant> It's just that our schema and code are terrible.
[11:34] <mrevell> wgrant, Do we have any idea of what is reasonable?
[11:34] <wgrant> Mostly the code is terrible.
[11:34] <wgrant> mrevell: Not really, no.
[11:35] <bigjools> mrevell: I think these are all great - there are Rabbit → Twitter etc. gateways
[11:35] <wgrant> mrevell: Nobody has spent much time optimising the copier.
[11:35] <mrevell> Are we safe to say that, generally, a single package copy will work in the webapp?
[11:35] <mrevell> bigjools, But we need a solution that doesn't rely on Rabbit, right?
[11:35] <bigjools> however much it's optimised, it'll always fail given too many packages
[11:36] <mrevell> And is a single copy the primary use case?
[11:36] <bigjools> mrevell: for throw-away notifications I think it's fine. what's important is that the *only* way doesn't involve rabbit
[11:36] <bigjools> single copy can fail actually
[11:36] <StevenK> mrevell: Generally. And then the kernel team tries to copy 'linux' and it constantly times out.
[11:36] <mrevell> bigjools, So, we need a notification area on a personal dashboard.
[11:36] <bigjools> that --^
[11:36] <lifeless> bigjools: mrevell: a bit of nuance on rabbit
[11:36] <mrevell> Blimey, okay
[11:36] <bigjools> mrevell: or a notification area on the PPA page
[11:36] <lifeless> I'm saying we can't depend on it not being rebooted etc
[11:37] <bigjools> mrevell: which is better in this case as others will want to see the copy
[11:37] <lifeless> I think its fine for something *we can accept occasional loss on* to be rabbit only
[11:37] <lifeless> but we need to explicitly make that choice.
[11:37] <bigjools> lifeless: right.  Like twitter notifications
[11:37] <lifeless> right
[11:37] <mrevell> bigjools, The beginnings of a wall for the PPA, effectively
[11:37] <lifeless> the reality is that such loss will be extremely rare
[11:38] <lifeless> because a graceful reboot will keep persistent rabbit messages
[11:38] <bigjools> mrevell: I was thinking of a notification area in the PPA page that shows in-progress and failed copy jobs
[11:38] <lifeless> its only if ackee shits itself :) that we'll lose in-flight messages.
[11:38] <mrevell> Thanks for the clarification lifeless.
[11:38] <bigjools> and you can ack the failed ones which makes them go away
[11:38] <mrevell> bigjools, Sounds ideal.
[11:38] <mrevell> bigjools, Let's involve Huw.
[11:39] <mrevell> bigjools, Is the Twitter thing realistic?
[11:39] <mrevell> bigjools, Within the scope of a maintenance task, I mean.
[11:39] <bigjools> mrevell: excellent - if he can design something that'd be fab.  I am away for 2 weeks from Thurs, but I can send an email describing what we need
[11:39] <lifeless> so you could, for instance, do rabbit only notifications, and if we have a hardware failure on rabbit, we lose some
[11:39] <bigjools> mrevell: sorta, depends on how hard it is
[11:39] <StevenK> I don't like the idea of Twitter
[11:39] <mrevell> bigjools, Please do. Huw can get something done this week.
[11:39] <bigjools> we also want an RSS feed based on Rabbit notifications
[11:39] <lifeless> mrevell: twitter - I'll need to talk to elmo about how we should interact with twitter, but in principle doable
[11:39] <mrevell> StevenK, Or identi.ca. I'm looking for a way to do push notifications that doesn't involve email.
[11:40] <bigjools> there's a million things that soyuz does that we need notifications for
[11:40] <lifeless> mrevell: (firewall aspects basically)
[11:40] <mrevell> lifeless, Sounds good. I'm not totally sold on it but "Why doesn't LP send me a message on Twitter when X happens?" seems to come up every now and then.
[11:40] <StevenK> mrevell: I don't like it for the concept, not the use of Twitter.
[11:40] <lifeless> mrevell: I'm not arguing for or against, just letting you know :)
[11:40] <mrevell> StevenK, Not even as a back-up to a notifications area on the PPA npage?
[11:41] <lifeless> mrevell: wearing my TA hat I want a scalable secure pub-sub to the internet facility for all of LP, which would let folk external to lp do interesting things like twitter notifications
[11:41] <mrevell> lifeless, Yeah, thanks :) I mean, if there are mountains to be moved to get Twitter notifications, then I'd really rather spend to time ... and research ... looking at whether we really want to do this.#
[11:41] <lifeless> mrevell: but we haven't built that
[11:41] <lifeless> mrevell: I have a design in mind though, FWIW :)
[11:41] <mrevell> lifeless, Right, precisely. I'd rather do it properly than just tack something on for this one use case.
[11:41] <bigjools> lifeless: Rabbit has all sorts of useful gateways :)
[11:41] <StevenK> mrevell: I seriously want notifications about stuff I'm doing within Launchpad to stay within Launchpad.
[11:42] <mrevell> StevenK, There's no reason we couldn't make it opt-in.
[11:42] <bigjools> yes there is the privacy issue
[11:42] <lifeless> bigjools: its a likely component, I'm actually thinking pshb as the primary protocol, rabbit can be used to shove notifications around internally, and pingthesemanticweb.com externally; I have a rough idea for handling private objects even.
[11:42] <mrevell> Like I say, this is something that people ask for but that I havne't necessarily given all that much thought.
[11:43] <StevenK> The fact that I *despise* Twitter/identi.ca might be shining through ...
[11:43] <mrevell> However, I'm generally in favour of Launchpad pushing notifications out to where people want to get them, rather than forcing them to come to LP.
[11:43] <bigjools> #luddite
[11:43] <mrevell> StevenK, In which case, don't opt-in
[11:43] <mrevell> bigjools, Haha!
[11:43] <mrevell> StevenK, I think it's right not to force people to use it.
[11:44] <bigjools> +1 t o push
[11:46] <wgrant> Does Twitter do private tweets?
[11:46] <wgrant> We have lots of private notifications :)
[11:47] <bigjools> DMs
[11:47] <mrevell> wgrant, There's direct messaging but ... whether we consider that private or not is another matter.
[11:54] <bigjools> ok thanks for the input everyone
[12:17] <stub> I think it would be awesome if people could choose from a variety of notification methods, making email addresses in Launchpad optional.
[12:18] <stub> email/jabber/twitter/identi.ca probably covers the bulk of our userbase.
[12:20] <lifeless> facebook.
[12:20] <lifeless> would cover nearly all :P
[12:31] <allenap> jelmer: Can you help me with mardy in #launchpad. He's having problems doing a git import, http://launchpadlibrarian.net/84700393/mardy-webcredentials-libaccounts-glib-trunk.log
[12:31] <allenap> ?
[12:39] <bigjools> wgrant: did you ever look at the debtags bug?
[12:45] <jelmer> allenap: looking now
[12:46] <allenap> jelmer: Ta, thanks.
[12:46] <stub> lifeless: Facebook can be accessed via email
[12:50] <wgrant> bigjools: I mostly ignored it.
[12:50] <bigjools> wgrant: I'm giving it an initial analysis - I'm a little confused as to the aims of the bug
[12:50] <bigjools> bug 57418
[12:50] <_mup_> Bug #57418: Support debtags in Packages.gz <escalated> <feature> <lp-soyuz> <soyuz-publish> <Launchpad itself:Triaged> < https://launchpad.net/bugs/57418 >
[12:52] <bigjools> seems like we need another import :/
[12:52] <wgrant> bigjools: Well, yes, we need to work out what they want. I strongly suspect that using Debian's DB directly will be unacceptable, so we'll need to DBify it or have a custom Ubuntu one.
[12:52] <bigjools> indeed
[12:52] <wgrant> (because we have lots of local package)
[12:52] <wgrant> +s
[12:52] <bigjools> which brings the question of merging upstream's
[12:54] <wgrant> And that is the stage at which I play the midnight card and disappear :)
[12:54] <bigjools> heh
[12:54] <wgrant> For it is a mess.
[12:54] <bigjools> it's an escalated mess
[12:54] <wgrant> Very similar to P-a-s.
[12:54] <bigjools> hence I am trying to clear it up
[12:54] <bigjools> I am not doing another P-a-s
[12:54] <wgrant> Yeah, but escalated Soyuz issues are all your problem now :)
[12:54] <wgrant> Heh.
[12:54] <wgrant> Well.
[12:54] <wgrant> It'd be easy to.
[12:55] <bigjools> that's the problem
[12:55] <bigjools> it's soooooo tempting
[12:57] <wgrant> As is sleep.
[12:57] <wgrant> So I shall leave you to temptation.
[13:04] <bigjools> jml: are you involved with the request to support debtags?
[13:04] <bigjools> wgrant: good night
[13:05] <jml> bigjools: yes. I asked Francis when we could expect it and whether there were any plans / specs in case we wanted to do the work.
[13:06] <bigjools> jml: I am doing some analysis
[13:07] <bigjools> jml: I wondered if the requirement was to use Debian's debtags database or to start a new one for Ubuntu?
[13:07] <bigjools> it's not specified
[13:07] <jml> bigjools: cool. If you need requirements from us, mvo & james_w are the best to ask.
[13:07] <jml> bigjools: I *think* we'd need to start a new one
[13:07] <jml> bigjools: but mvo would know for sure.
[13:07] <bigjools> jml: ok cool - do you guys have an IRC channel of your own?
[13:08] <jml> bigjools: oh boy, do we ever! #ubuntu-app-devel, #software-center on Freenode
[13:08] <bigjools> ok I'll pop in after lunch
[13:08] <bigjools> cheers
[13:08] <jml> np
[14:31] <sinzui> mrevell, Did you ever send StevenK the text for the team PPA emails?
[14:32] <mrevell> StevenK, I did, yes, this morning.
[14:32] <sinzui> fab
[14:32] <nigelb> Gmail thinks every mail bigjools writes is very important :P
[14:32] <sinzui> mrevell, did you see the email I sent to the list about open/closed teams. Maybe Dan can help solve the language issue?
[14:34] <mrevell> sinzui, I did; I'm hoping to reply later. Dan'll be back tomorrow.
[14:34] <sinzui> thanks
[14:35] <bigjools> nigelb: gmail is clever
[14:43] <nigelb> bigjools: :)
[14:48] <sinzui> jcsackett, ping
[14:48] <jcsackett> sinzui: pong.
[14:48] <sinzui> jcsackett, do you have a few minutes to mumble?
[14:48] <jcsackett> of course.
[14:48]  * jcsackett fires up mumble.
[14:57] <deryck> abentley, adeuring -- you guys see any reason we shouldn't turn on new listings for ourselves now?
[14:58] <adeuring> deryck: sounds fine
[14:58] <abentley> deryck: we might get timeouts, and we can't turn them off.
[14:59] <deryck> abentley, are you opposed to trying, seeing what it's like, and if arise, disabling the flag?
[15:00] <abentley> deryck: no.
[15:00] <deryck> great, thanks!
[15:00] <abentley> deryck: Just be sure to disable wallyworld's stuff, if it's enabled for us.
[15:00] <mrevell> hey abentley, I just need to grab a coffee; time ran away with me. Five minutes.
[15:01] <deryck> abentley, ok, will do.  I'll make the rules match qastaging.
[15:01] <abentley> mrevell: no problem.
[15:03]  * bigjools just deleted a bugtask.. Awesome
[15:08] <abentley> deryck: dynamic bug listings are currently disabled on qastaging for me.
[15:08] <deryck> abentley, ok.  I think you need to be in the demo group.  I'll look here in a sec.
[15:10] <abentley> deryck: Also, the rule for disabling ajax.batch_navigator.enabled is set for orange squad, when it should be set for custom-buglisting-demo
[15:11] <deryck> abentley, ok.  And I don't need that rule on lpnet if it's not turned on at all on lpnet, right?
[15:11] <abentley> deryck: right.
[15:11] <deryck> ok, cool.
[15:16] <deryck> abentley, I added us back to the demo team.  Things seem to work ok, until I can change the flag for the other feature to the demo team.
[15:17] <abentley> deryck: great.  Thanks.
[18:03] <mrevell> Night all
[18:21] <bryceh> deryck, quick question (raised by one of our upstreams) - do you know if the bugzilla comment importer is enabled for GNOME?
[18:21] <deryck> bryceh, hmmm, I don't actually know.  I recall it being raised some time ago about wanting to get it working....
[18:22] <deryck> bryceh, I may have dropped the ball on getting it enabled.
[18:22] <bryceh> deryck, ok thanks
[18:23] <deryck> bryceh, I think it would be better for one of the maintenance squads to check into it... perhaps yellow, which has gmb on it, since he knows a lot about that area.
[18:24] <deryck> bryceh, I think that's why I dropped it before, just caught in feature work like I am now.
[18:24] <bryceh> deryck, ok good to know.  Yeah at this point it's not a request, just question about the status of it
[18:24] <deryck> ok, cool.
[22:36] <deryck> abentley, you around still?
[22:42] <bac> hi lifeless, the test i added to the branch you reviewed yesterday is causing other worker tests that run after it to fail.  can you see anything i have bungled?  http://pastebin.ubuntu.com/732539/
[22:56] <lifeless> bac: getMirrorFailureForException is a helper, not a test itself
[22:56] <lifeless> bac: what is the test that fails (and does it fail when run on its own ?)
[22:57] <lifeless> bac: one possibility is that the test isn't twisted safe - the call to worker.mirror() may be assuming synchronous behaviour, which isn't a guarantee in any twisted environment.
[22:58] <lifeless> bac: if that is the case, the helper will need to be reworked to use deferred and so forth
[22:58] <lifeless> :(
[22:58] <abentley> deryck: kinda.
[22:59] <deryck> abentley, one question.... I assume I need to call change_fields with the right config for my integration work.  true?
[23:00] <abentley> deryck: Right.
[23:00] <deryck> abentley, awesome, thanks!  Have a good night.
[23:00] <abentley> deryck: The list of fields is around browser/bugtask.py:2241
[23:01] <deryck> abentley, thanks
[23:02] <abentley> deryck: kp
[23:02] <abentley> deryck: np