[00:00] the 503 page change was the one that uses flash for xdr?
[00:00] mwhudson: Yes.
[00:00] boggle
[00:00] mwhudson: To get at identi.ca/twitter feeds.
[00:00] oh
[00:01] Yes, the Web is broken... but Flash is not a solution :)
[00:11] mwhudson: wgrant: it uses flash as a fallback, not by default
[00:11] modern browsers with javascript enabled will just work without the flash
[00:11] however failing tests is a good reason to rollback :)
[00:17] I wonder if those profiling changes weren't meant to land at all.
[00:17] They seem somewhat insane and irrelevant.
[00:17] * wgrant rolls it all back.
[00:19] wgrant: How did you figure out which js was using mochikit? Or did you just remember?
[00:20] huwshimi: I remembered seeing some bits of it around... grepping for it is going to be difficult :/
[00:22] heh
[00:23] explain select * from oops_oops;
[00:23] QUERY PLAN
[00:23] ------------------------------------------------------------------------
[00:23] Seq Scan on oops_oops (cost=0.00..1922179.42 rows=27723142 width=464)
[00:23] 27M rows.
[00:23] Hmm.
[00:23] ah, i was wondering why you needed to see the query plan for that statement :-)
[00:24] mwhudson: faster than count(*) :)
[00:24] different answer too, i guess?
[00:25] statistics vs actual
[00:25] the stats at last update know the total live row count
[00:25] vs instantaneous row count
[00:25] * mwhudson slams head into desk about interactions in lp's tests
[00:25] mwhudson: welcome !
[00:26] Sensible interfaces? What are they?
[00:26] mwhudson: what got you onto this?
[00:27] https://bugs.launchpad.net/launchpad/+bug/631884 basically
[00:27] <_mup_> Bug #631884: multiple variables pointing to FeatureController get out of sync easily

< https://launchpad.net/bugs/631884 >
[00:27] ah yes
[00:27] there is another one about get_request_timeline
[00:27] same issue
[00:28] yeah, features are a bit simpler, more localized
[00:28] so attacking them first
[00:28] I like the sound of that
[00:29] and my change isn't directly related to the problem i was head desking about, or at least my change doesn't make any difference
[00:29] but for example, it doesn't seem we make any systematic effort to ensure that an interaction doesn't outlive a testcase
[00:30] also, TestCaseWithFactory sets up an interaction, TestCase doesn't
[00:30] which is a bit ... dunno, something
[00:30] anyway!
[00:30] doesn't suprise me
[00:30] if I may make a suggestion
[00:30] a fixtures.Fixture to push and pop interactions would be very useful
[00:31] yes probably
[00:31] s/suggestion/observation/
[00:31] i think also a pithy email and wiki page along the lines of "10 things every launchpad developer absolutely positively has to know about interactions and participations or else" is forthcoming
[00:32] \o/
[00:34] https://code.launchpad.net/~lifeless/python-oops/docs/+merge/77849
[01:02] wgrant: mwhudson: can has review? ^
[01:20] oh right, profiling tests are/were broken on devel, right?
[01:21] yes
[01:21] and guess what!
[01:21] thread local variables are involved
[01:21] orly
[01:22] yrly
[01:22] def assertCleanProfilerState(self, message='something did not clean up'):
[01:22] """Check whether profiler thread local is clean."""
[01:25] * mwhudson chucks his feature flag thing into ec2
[01:29] ok pgsql, thats just daft
[01:30] or rather, like queries really really suck
[01:30] Howso?
[01:31] explain select pathname from oops_oops where pathname like '/srv/launchpad.net-logs/production/wampee/2010-05-24/%' order by pathname;
[01:31] QUERY PLAN
[01:31] ------------------------------------------------------------------------------------------------------
[01:31] Sort (cost=2000294.79..2000301.89 rows=2841 width=68)
[01:31] Sort Key: pathname
[01:31] -> Seq Scan on oops_oops (cost=0.00..2000131.82 rows=2841 width=68)
[01:31] Filter: ((pathname)::text ~~ '/srv/launchpad.net-logs/production/wampee/2010-05-24/%'::text)
[01:31] possibly too-few statistics for the table size
[01:31] Is there an index?
[01:31] yes
[01:31] "oops_oops_pathname_key" UNIQUE, btree (pathname)
[01:32] of course, its 5.5GB due to bloat.
[01:35] wow, disk sorts. funky.
[01:40] explain select pathname from oops_oops where pathname like '/srv/launchpad.net-logs/production/wampee/2010-05-24/%' and date between '2010-05-24'::timestamp - '1 day'::interval and '2010-05-24'::timestamp + '1 day'::interval order by pathname;
[01:40] seems to do whats needed.
[01:41] we may be better offer with an IN statement or a temp table.
[01:41] though they can degrade to seq scans horribly quickly.
[01:43] temp table is at leat reasonabl
[01:43] e
[01:44] any reviewers in the house ? 8 liner ...
[01:52] lifeless: ah yes, my machine crashed after you pasted the link
[01:52] lifeless: link me up again?
[01:53] https://code.launchpad.net/~lifeless/python-oops/docs/+merge/77849
[01:53] thanks!
[01:55] done
[01:55] er, lunch time
[02:23] lifeless: So, do we have any plans to fix all these timeouts?
[02:23] wgrant: yes.
[02:42] lifeless: Any *specific* plans? :)
[02:43] wgrant: yes :)
[02:44] * StevenK waits for "* wgrant shakes lifeless until some plans fall out."
[02:44] Pretty much!
[02:45] 'cause we haven't made any progress in quite a while.
[02:45] Apart from the batchnav thing.
[02:45] bug search is a prime candidate for solr
[02:45] And that's at least 18 months away.
[02:45] thats a different discussion
[02:46] Is it?
[02:46] broad strokes: - tune implementations now (e.g. denorm, search schemas etc), and do a dedicated search engine if/when we reach the limits of that approach
[02:47] We have some debilitating, embarrassing timeouts. We're making little to no progress on them.
=== wgrant changed the topic of #launchpad-dev to: https://dev.launchpad.net/ | On call reviewer: - | Critical bugs: 253 - 0:[#######=]:256
[02:48] we have no engineers currently hacking on them
[02:48] thats the proximate cause of no progress
[02:48] francis is analysis *why* thats happening
[02:48] That's been happening for months now.
[02:48] analysing.
[02:48] We have not made critical bug progress for 4.5 months now.
[02:49] We adopted the new critical bug process 8.5 months ago.
[02:49] More than half the time the new process has been working, it has been failing,.
[02:49] .
[02:50] yes, thats the evidence.
[02:50] now, do you have an analysis of causes?
[02:51] Well, the evidence points to one.
[02:51] It's not all of it.
[02:51] But it is something.
[02:51] too many new criticals?
[02:52] No.
[02:52] <...>
[02:52] The maintenance squad configuration.
[02:52] please expand on that :)
[02:52] In particular, the week Teal stopped taking criticals, the downward trend stopped.
[02:52] http://webnumbr.com/launchpad-critical-bugs
[02:52] We stopped taking criticals just after the middle of May.
[02:53] the problem started earlier
[02:53] Certainly.
[02:53] so, that simply speaks to our ability to tread water
[02:53] Sure, but we're no longer even close to treading water.
[02:53] right
[02:53] the key question is where are all the criticals coming from
[02:54] It was clear we were in a terrible situation 3 months.
[02:54] 3 months ago
[02:54] until we rule out maintenance efforts, we can't even thing about things like 'add more maintenance folk'
[02:54] How long is this going to take?
[02:54] We are getting into a steadily worse situation.
[02:54] And I've seen no acknowledgement that we actually have a crisis here.
[02:55] I don't think its a run around in circles crisis
[02:55] its a significant issue
[02:55] What is a run around in circles crisis, then?
[02:55] and francis is working on the analysis
[02:55] wgrant: one where stopping and thinking will not help the outcome
[02:55] The new definition and process are clearly not working. Reasonable functioning of the new structure of the LP team depends entirely on the critical queue being exhausted.
[02:56] The critical queue is being the opposite of exhausted.
[02:56] wgrant: yes, yes. Yes.
[02:56] Therefore the new structure of the LP team will not provide reasonable response and functionality.
[02:56] It's been nearly a year.
[02:56] uhm
[02:56] Something needs to be done.
[02:56] so, lets detangle some things
[02:57] firstly, the delta from old structure to new is *still*, even with this issue, positive: we -are- progressing both maintenance and development efforts with less scheduling headaches.
[02:57] Hmm? For some definitions of maintenance, perhaps.
[02:57] we haven't solved /all/ the issues we hoped to, but we have some.
[02:57] I would posit that maintenance is, on the whole, substantially worse than it was.
[02:58] Unless you restrict "maintenance" to "fixing timeouts and OOPSes"
[02:58] wgrant: and escalated bugs, which previously were a real headache, we are doing them more reliably and with less delay than previously.
[02:58] In some cases, yes.
[02:59] wgrant: welcome to statistics.
[02:59] now,
[02:59] as for doing something.
[02:59] Now, I'm not saying that the new structure is bad. I'm saying that the current implementation is objectively not working.
[02:59] Do you propose we go off half cocked without analysing the situation ?
[02:59] No, we need analysis.
[02:59] But we needed analysis 3 months ago.
[02:59] There is no analysis :(
[02:59] francis is doing one.
[02:59] Whats your beef with it ?
[03:01] I posit that you want it complete yesterday.
[03:01] Is it going to come to a conclusion?
[03:01] of course
[03:01] It has been several weeks.
[03:01] of course
[03:01] its not a trivial undertaking, and the person doing it has other things (like .e.g qbr prep) that are higher priority
[03:01] Sure.
[03:02] it took 4 or so months to do the analysis around the restructure
[03:02] But it's something that needs to be done quickly, so perhaps the terribly busy person isn't the ideal person to do it :)
[03:02] We have a serious problem here, and the longer we take to analyse it, the worse it is getting.
[03:03] It took months for people to admit that we had a problem.
[03:03] the scale is changing slowly, the structure isn't.
[03:03] And now it is going to take months to analyse.
[03:03] well, it may not take months, but its going to take more than a couple of days :)
[03:03] It has already taken months.
[03:04] What is your objective in this discussion? To convince me its drop-dead urgent? It doesn't seem to be to me. Important - yes. Important enough that its being done right now? Yes. Run-with-knives urgent? no.
[03:05] the situation we're in is broadly 'too much to do and not enough hands to do it with'; thats -normal-. Whats interesting here is that we have things we believe we -must- do, which we don't have enough hands to do.
[03:06] And the number of hands is decreasing :)
[03:06] And that looks abnormal when you consider the definition of the things we -must- do is pretty modest [plug regressions, plug scaling issues, do escalated tasks]
[03:06] Indeed.
[03:07] the stuff francis has noted so far is that its mainly self inflicted
[03:07] IIRC > 1/2 the bugs analysed so far have been caused by feature work
[03:08] but he hasn't analysed enough to be confident that its indicative in the big picture
[03:08] *IF* it is, then the thing to point at is cost of change: particularly hidden costs.
[03:08] Hm, is this analysis progress visible somewhere?
[03:09] And that will come back to code structure, which gets back to what reviews are [perhaps failing] to do. For instance.
[03:09] I'd not heard any of this.
[03:09] I don't know where his working notes are
[03:09] It was not clear that any progress had been made whatsoerver.
[03:09] I talk with him weekly
[03:10] StevenK: ++
[03:10] Hm?
[03:10] nuking sampledata. Love it.
[03:11] I'd remove all of it if I think it had any hope in hell of passing ec2.
[03:12] oops-tools is truely painful to hack on
[03:12] slow tests + no way (that I can see) to run just one.
[03:20] wgrant: StevenK: I need a brief teddy bear about removing oops prefixes and the impact on fauly analysis
[03:20] could either of you spare 10 minutes?
[03:20] lifeless: Sure.
[03:20] Public holiday and about to brave the daystar
[03:21] wgrant: skype ?
[03:56] wallyworld: Do you know much about the names of the permission levels on the +managedisclosure page (e.g. Observer, Restricted Observer)? Are they used elsewhere in Launchpad?
[03:57] huwshimi: no, this is new terminology afaik
[03:59] wallyworld: OK, I'm just trying to figure out if there are any better labels we could come up with. Something a bit more descriptive. E.g. Restricted Observer can only view subscribed so they could be labelled "Subscriber"
[03:59] wallyworld: Any suggestions?
[04:00] huwshimi: i think we are trying to get away from conflating visibility with subscriptions
[04:01] wallyworld: OK, that makes sense. I think I mostly have reservations about the word "observer".
[04:01] it's hard to come up with a one or two word label that conveys the intended meaning
[04:01] yeah
[04:01] Same with "disclosure"
[04:02] i think though that once the terminology becomes socialised, it will be understood
[04:02] We're very good at using bad terminology.
[04:02] you could just not label them.
[04:02] and instead describe their rights
[04:02] perhaps we need good inline or popup help to explain what it means
[04:02] [full access]\n* person\n* person ...
[04:03] [access to listed objects only]\n* person\n* person
[04:03] labels are needed if we want to do a choice list selection widget though
[04:03] wallyworld: Yeah I was working on that when I decided I really wanted to think about the labels
[04:03] wallyworld: Yeah that's right
[04:03] wallyworld: do we need a label for 'class of role' really?
[04:03] wallyworld: Or for things like "Add an Observer" links
[04:03] "Grant access"?
[04:03] wallyworld: or just a phrase that explains
[04:03] right
[04:04] i think we need 3 things: 1. a good label, 2. a short sentence that can go in brackets after, and 3. verbose popup help text
[04:06] there are 2 role classes that can see stuff because of access being explicitly granted, vs one access class that can implicitly see everything because of the project role
[04:06] so that makes it a bit harder to come up with concise labels
[04:06] there are?
[04:07] "Project role => access" is a myth that I hope to dispell soon.
[04:07] sorry, thats opaque
[04:07] observer and restricted observer are granted access roles
[04:07] wallyworld: so is e.g. driver, and owner.
[04:07] but if you are a project owner, then you can see stuff
[04:07] wallyworld: for performance we almost certainly want the owner etc in the one system, with checks that they are in sync.
[04:07] but if you register a project, you are the owner without it being granted
[04:08] ie visibility is implicit in that case
[04:08] wallyworld: there is a difference between 'the user does not click to grant it' and 'it is not granted'
[04:08] i know ownership can be assigned
[04:08] I think it should be the default, but it should not be implicit.
[04:08] lifeless: agree on the difference
[04:09] but if i register a project, i want to implicitly be able to see everything by default
[04:09] wallyworld: its my understanding that owners etc won't be checked during lookup: if its not in the table, access denied.
[04:09] wallyworld: no, you want to see everything by default. The word implicit isn't needed for the use case :)
[04:09] wallyworld: I am asserting that your needs are met by:
[04:10] - defaulting project owner to full access (much like we default team owners to administrator status)
[04:10] lifeless: the balsamiq mock ups imply implicit visibility due to project roles
[04:10] - possibly having a rule that the owner must retain full access
[04:10] Old mockups are not gospel :/
[04:11] sure, but they are the current starting point
[04:11] if we decide to do things differently so be it
[04:11] Mm, I think they are hints.
[04:11] Nothing more.
[04:12] sure
[04:13] now we have an interactive demo based on the mockups which can be used to try and help clarify these sorts of requirements
[04:13] Yep.
[04:13] anyhow, my point is this: you are saying its complex to describe because you have multiple implicit accesses + granted access.
[04:14] yes, that's my current understanding
[04:14] one way to address that, and a way I would encourage for reliability and performance in the system, is to drop implicit access. Instead use business rules to populate granted access when something should be automatic.
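Editor's sketch of the proposal just above, added for illustration and not taken from Launchpad's code: access lookup consults only a table of explicit grants, and a business rule writes the owner's row when the project is registered. The class names, the level labels and the `reason` text are all assumptions made for this example.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Grant:
    """One explicit row in a hypothetical access-grant table."""
    grantee: str  # person or team name
    level: str    # "full" or "restricted" (labels assumed for this sketch)
    reason: str   # why the row exists, e.g. which rule added it


@dataclass
class Project:
    """Hypothetical project whose visibility comes only from explicit grants."""
    name: str
    owner: str
    grants: dict = field(default_factory=dict)  # grantee name -> Grant

    def __post_init__(self):
        # Business rule: registering a project writes an explicit
        # full-access row for the owner (a default, not an implicit right).
        self.grant(self.owner, "full", "defaulted for project owner")

    def grant(self, grantee, level, reason):
        self.grants[grantee] = Grant(grantee, level, reason)

    def revoke(self, grantee):
        self.grants.pop(grantee, None)

    def access_for(self, person, teams=()):
        # Only the grant table is consulted; the owner attribute is
        # never checked at lookup time.
        return [self.grants[n] for n in (person, *teams) if n in self.grants]

    def can_view(self, person, teams=()):
        # No row for the person or any of their teams means access denied.
        return bool(self.access_for(person, teams))


project = Project("demo", owner="alice")
project.grant("bar-team", "full", "granted by alice")
print(project.can_view("alice"))                    # owner row was populated
print(project.can_view("bob"))                      # no row for bob
print(project.can_view("bob", teams=("bar-team",))) # access through a team
```

In this sketch, revoking the owner's row removes their visibility even though they are still the owner, which is the difference between a default and an implicit right that the discussion turns on. A rule that the owner must retain full access would be an extra check in `revoke`.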
[04:14] no problem with that, i agree
[04:14] but we may still want to distinguish to the user that is the case
[04:15] ie someone has access due to automatic propogation based on business rules
[04:15] rather than an explicit grant of access by an authorative user
[04:15] thats an interesting case
[04:16] so that if someone needs to be removed from the bug supervisors team, they ca nbe'
[04:16] I think that the authoritative user still exists (they added someone to the role)
[04:16] what about the case where a team is assigned as bug supervisor.
[04:16] and someone in the team should be cut iff
[04:16] cut off
[04:16] and the grant is a consequence of being added to the role - but that doesn't imply revoking when removed from the role
[04:16] but not the whole team
[04:17] wallyworld: I don't think we should support that
[04:17] wallyworld: let me be more clear: supporting that will be extremely hard.
[04:17] You can't just remove them from the team?
[04:17] wallyworld: either remove them from the team, or use a team that accurately models the supervisors desired.
[04:17] wgrant: yes, but you need to see why and how someone has acess so you know what you need to do to remove visibility
[04:17] lifeless: see above
[04:18] wallyworld: so in this situation,it would show 'user FOO has access because they are in team BAR which has FULL ACCESS'
[04:18] wallyworld: Sure, you'll see that they're a member of Ubuntu Bug Control, and that team is granted access.
[04:18] I presume there is a free text field related to each observer, for explanations ?
[04:18] not that i am aware, but i could be wrong
[04:18] e.g. 'fred is auditing us for iso9008:2000 status'
[04:19] that may be something to add
[04:19] if it is not already there
[04:19] that would supply a way to explain business rule added access too - 'BAR added when set as bug supervisor'
[04:20] lifeless: with your example above, it may be that we want to say ".. in team BAR which has FULL ACCESS because team BAR is the bug supervisor"
[04:20] Of course, ideally all access control would be managed through this view...
[04:20] that isn't enough to support automatically removing BAR when changing bug supervisor (but that may be undesirable anyway)
[04:20] So it would show the bug supervisor privilege for that team as well.
[04:20] wallyworld: understood, I can't comment about whether we need that or not :)
[04:21] wallyworld: I will comment that for teams we started with implicit stuff for owners
[04:21] wallyworld: and have slowly, to great rejoicing, removed it.
[04:21] wallyworld: so my *hunch* is that it will be a problem.
[04:21] As Curtis and I discussed last week about bug supervisors.
[04:22] i agree 100% that the model will be much simpler to have one access lookup meachanism, and that owners etc should be put there via business rules
[04:22] but i think we need to record that, is all
[04:22] so we can show it to the users
[04:22] to explain *why*
[04:22] someone has access
[04:24] You might recall that we discussed this on maybe Wednesday's call. The thought was that ideally the roles would not be conflated at all, but the initial project setup wizard would guide you through setting probably the same people into each role.
[04:24] The explanation only needs to exist if the roles are configured in separate places and somewhat conflated.
[04:24] Ideally they would be separated.
[04:24] And manageable from the same place.
[04:24] So it's obvious what's what.
[04:25] Rather than having the bug supervisor on one page, security contact on another, and then a huge list of (restricted) observers.
[04:25] All separated.
[04:25] yes, i recall. but relying on the setup wizard is not enough.
[04:25] With no single view of access control.
[04:25] they would be separated
[04:25] not
[04:25] all users with access would be shown on one page
[04:26] regardless of how they got that access
[04:26] I speak of more than just project-wide observer access
[04:26] The same page really should list everyone with bug supervisor access, security contact membership, etc.
[04:26] It probably won't in the first iteration.
[04:27] But we should certainly not design ourselves into a corner.
[04:27] you mean for a given project though?
[04:27] Yes.
[04:27] ... and this is why I should just JFD things instead of asking questions on IRC :P
[04:27] and that's what the mockups do now
[04:27] wallyworld: Hm?
[04:27] wallyworld: Is "member" a confusing word to use?
[04:27] wallyworld: They list everyone *with observer access through those roles*.
[04:27] wallyworld: They don't take into account the privileges that those roles actually exist to grant.
[04:28] wallyworld: Actually on second thoughts I guess it probably is
[04:28] huwshimi: i can't think of a better term now
[04:28] wgrant: what page mockup are you looking at?
[04:28] wallyworld: Better than "Observer"?
[04:28] wallyworld: None.
[04:29] huwshimi: maybe i don't have the context - you talking about team members?
[04:29] wgrant: when you said "they list everyone.....", what does "they" refer to then?
[04:30] wallyworld: The last version of the mockups that I saw.
[04:30] wallyworld: definitely confusing then... I'll try and come up with something else
[04:30] They only deal with observer access.
[04:30] Not any other kind of access, like, say, bug supervisor write privileges.
[04:30] wgrant: and restricted observer, and access via a project role
[04:31] They're just different paths to the same privilege.
[04:31] not really? restricted observer != observer
[04:31] Restricted observer is the same sort of privilege, just over a smaller scope.
[04:31] The privilege is visibility.
[04:31] There are other privilegs.
[04:32] like ability to assign/triage bugs?
[04:32] you mean?
[04:32] Right.
[04:32] Which is what bug supervisor really means.
[04:32] sure, but that stuff we didn't do for this first cut
[04:32] Indeed.
[04:33] But it's very relevant for if you want to describe why someone has a permission.
[04:33] Because then their access management page will say that they are the owner and an observer.
[04:33] yes, so there is a project role filter
[04:33] Rather than just an observer because they are the owner.
[04:34] i think we sort of agree
[04:35] So, your explanation suggestion is only needed because A) I haven't convinced you yet that project roles have nothing to do with observer privileges, except for being sensible defaults during the setup wizard; and B) the UI that is in scope right now will only cover observer privileges.
[04:36] If A becomes no longer the case, then "PRIVILEGE because ROLE" is no longer the case.
[04:37] If B becomes no longer the case, then the team's access management page will list ROLE anyway.
[04:43] wallyworld: So is an Observer part of a team or are they just directly connected to a project somehow?
[04:44] huwshimi: an observer is a person or a team who have been granted access to view a project's private artefacts
[04:44] so a person may be an Observer via their team membership
[04:45] huwshimi: a Restricted Observer can just see private bugs and branches
[04:45] huwshimi: an Observer can also see milestones, blueprints etc etc
[04:45] make sense/
[04:45] ?
[04:46] wallyworld: thats temporary
[04:46] wallyworld: But an Observer can also see a projects private artefacts without being part of a team that gives them that permission right?
[04:46] wallyworld: long term restricted will be able to be granted milestones blueprints etc
[04:46] wallyworld: there is no intention to arbitrarily limit the things restricted observers can be given access to see
[04:47] ok
[04:47] huwshimi: yes, a person may be individually made an observer
[04:47] wallyworld: OK great thanks
[04:47] huwshimi: or a team they are in (standard LP idiom there)
[04:48] lifeless: i said that above :-)
[04:48] ah ll
[04:48] kk
[04:49] lifeless: i assume the long term then is to allow projects to define what 'restricted' means
[04:53] wallyworld: I don't know what you mean
[04:53] lifeless: so you said "restricted = bugs and branches only" is temporary
[04:54] and that later
[04:54] restricted would be gratned access to blueprints etc
[04:54] restricted would *permit* granting access to blueprints etc
[04:54] the limitation to bugs and branches is an implementation limit
[04:54] yes, that's what I was asking, thanks
[04:55] just wanted to be sure i understood
[04:55] AFAIK there is no intent for restricted to ever change its meaning: you will always need restricted *to a concrete thing*
[04:55] its just that the things we can put in the concrete list will expand as we implement more
[04:55] yes, that's how i understood it from what you said
[04:56] kk
[04:56] we're synced then :)
[04:56] thanks
[04:56] for now until more people start to discuss it :-)
[04:58] * wallyworld runs off tp pick up kid from school
[04:58] wallyworld: haha
[05:07] can has review? https://code.launchpad.net/~lifeless/oops-tools/nomoreprefix/+merge/77858
[05:09] wgrant: diff is there, if I can tempt you :)
[05:13] lifeless: prefix.upper?
[05:13]