[00:06] wallyworld_: Don't know if you want to chat, but I'm back if you want to. Hackerspace meeting ran long.
[00:06] jcsackett: sure, just give me a minute
[00:07] Anytime. Going to have to be irc tho. Too loud here for anything else. PM me when you want.
[00:12] jcsackett: tl;dr; we will be adding a (+)Team link to the person picker, will transition picker content to simplified add team form like we do for confirmation stuff in the picker
[00:12] this outcome is different to what you discussed with sinzui yesterday
[00:13] Yeah. Y'all convinced him?
[00:14] jcsackett: we also need to look at simplifying the open/closed team choices from 4 radio buttons down to 2 choices (open/close) with an option checkbox for each (thus providing the 4 scenarios)
[00:14] we talked it through
[00:14] was a good discussion
[00:14] jcsackett: i believe you will be asked to look at the open/closed policy choices as per the above
[00:15] Cool. I'll bug sinzui about it tomorrow then. :-)
[00:15] and then we can combine that work with the new picker/overlay form
[00:15] sounds good
[00:24] StevenK: were there no other soyuz vocab tests you could add to? ie did you have to create a new test module just for that one test?
[00:30] wgrant: we have a small workflow issue. share all es with aaa. subscribe aaa to es bug. no aag created because apg exists so bug is visible. change sharing permission for es for aaa to some. bug not visible and no aag created
[00:39] wallyworld_: Yeah, that's where conflating subscription and visibility continues to be awkward..
[00:39] wallyworld_: I suspect we should just create the usual revocation job and remove the subscription.
[00:40] wgrant: not sure that's the best thing to do. i mean the user intent is to be subscribed to that bug still, but only see that and not all other private things
[00:42] wallyworld_: Doesn't that same argument apply to the team's members?
[00:43] in which scenario?
[00:43] eg.
foo a member of bar, foo is subscribed to an artifact on a policy on which bar has an APG
[00:43] bar's APG is revoked.
[00:43] Possibly changing to Some.
[00:43] Or to None.
[00:44] yes, the case for apg revocation converting existing subscriptions to aag should be transitive
[00:44] maybe
[00:45] no, i mean the team gets the aag
[00:45] and then the members can see the bug while they remain in the team
[00:46] That's illegal
[00:46] Since the team doesn't have a subscription.
[00:46] It can't have an AAG.
[00:46] wallyworld_: There are no other Soyuz vocab tests that I could find.
[00:47] really? that's not good
[00:47] not even in doc tests?
[00:48] wallyworld_: I grepped the entire tree for the vocab names
[00:48] wgrant: my assumption was the team did have a subscription, but if not, then i think what you say sounds ok
[00:49] StevenK: r=me
[00:51] wgrant: so everything looks ok so far on qas with the triggers removed, except for an oops in the job created when a team member is removed.
[00:51] wallyworld_: Thanks, I've tossed it at ec2.
[00:51] wallyworld_: What's that OOPS?
[00:51] OOPS-96532e891268827ddb456c0488c160ef
[00:51] i haven't looked at it yet, created a card
[00:59] * wallyworld_ goes to get key cut, bbiab
[01:08] wgrant: So, query?
[01:12] StevenK: The query itself is probably fine.
[01:12] But we likely want an index on access_policy
[01:13] There is one?
[01:14] The EXPLAIN ANALYZE mentions an index scan
[01:14] It was probably an index scan on information_type.
[01:16] However
[01:16] There's few enough private branches that that might not be a problem.
[01:18] wgrant: It's worse with an index on access_policy
[01:19] wgrant: http://pastebin.ubuntu.com/1050184/ both hot, top is without the index on access_policy.
[01:20] StevenK: Try now
[01:22] wgrant: Same as the top one
[01:22] StevenK: Also, what's the actual query you're using?
[01:23] wgrant: Oh, sorry. http://pastebin.ubuntu.com/1050188/
[01:25] So, that's not quite the actual query.
[01:25] The real thing will have a LIMIT foo at the end
[01:37] wgrant: http://pastebin.ubuntu.com/1050199/
[01:38] StevenK: It seems to be fine without the extra index.
[01:38] Not marvellously fast.
[01:38] But the public branches aren't clustered badly enough that it will be a problem.
[01:43] Is utilities/prioritygrid.py in use any more? It looks obsolete since I don't believe those wiki prioritisation tables are being used any more (?). Maybe 146 easy LoC for somebody.
[01:47] nuke from orbit
[01:48] wgrant: Right, so do you want to approve the MP, then?
[01:49] lifeless: righto, will do
[01:50] StevenK: Done
[01:50] cjwatson: classic NIH stuff that
[01:51] cjwatson: https://dev.launchpad.net/VersionThreeDotO/Soyuz/Inputs - last edit 2009
[01:53] def __getitem__(self, name):
[01:53]     for series in self.series:
[01:53]         if series.name == name:
[01:53]             return series
[01:53]     raise NotFoundError(name)
[01:53] wtf
[03:21] wallyworld_, StevenK: Extremely challenging review for one of you: https://code.launchpad.net/~wgrant/launchpad/bug-1015289/+merge/111139
[03:22] wgrant: looks ok to me, i assume we have an approximatedate tal formatter
[03:22] there are no tests needing updating?
[03:22] wallyworld_: wgrant: if you guys have FDT pending stuff, I'm willing to grant exceptions to the one-patch-per rule; I'll be updating docs shortly to reflect that.
[03:23] needs to route through me or flacoste on a case by case basis.
[03:23] wallyworld_: All the tests use sampledata, so they're a bit awkward to test this with. It's the formatter we use for dates basically everywhere, so I don't think tests are very useful.
[03:23] lifeless: We deployed the last of the two week-long batches on Monday.
[03:24] wgrant: ok, r=me
[03:25] wallyworld_: Thanks.
[03:25] np
[03:31] lifeless: Oh, because I don't do DB patches?
:-P
[03:31] StevenK: because you haven't been whinging about it :)
[03:32] StevenK: (actually, I forgot to ping you as well, mea culpa)
[03:32] lifeless: wgrant is very good at whinging, so I've been letting him play to his strengths ...
[03:33] LOL
[03:36] Heh
[04:04] Ah, 8-way UNION ALLs, my favourite.
[04:14] wgrant: the openid thing that's terrible, or something else ?
[04:19] lifeless: My bugsummary recalculator.
[04:19] lifeless: The OpenID thing being the name-generation timeout you pointed out yesterdayish?
[04:19] funky chicken
[04:19] wgrant: is that what it was?
[04:20] lifeless: Yeah
[04:20] email address started with launchpad@
[04:20] wgrant: win.
[04:20] So it tried to generate a nick from that
[04:20] Conflict, so add number
[04:20] Conflict, so increment number
[04:20] Conflict, so increment number
[04:20] Conflict, so increment number
[04:20] haha
[04:21] Although we may have another bug, since there aren't *that* many, I don't think.
[04:21] perhaps it should select count(*) from person where name like 'prefix-%' instead.
[04:24] and docs updated
[04:24] -> to buy a PC powersupply for his reprap
[04:25] My god
[04:25] The algorithm isn't actually that stupid
[04:25] It's even worse
[04:25] I don't even...
[04:25] what the heck is it trying to do
[04:25] does the name blacklist get in on the fun too?
[04:25] (for those playing at home, it gets from 'launchpad' to '78luphr0rnk2nuqimstywepozxn9kl19tqh0tx66b5dki1xxsh5mkz9gl21a5rlwfnr8jn6ln0m3jxne2k9x1ohg85w3jabxlrqbgs-launchpad-a811i2i3ytqlsztthjth0svbccw8inm65tmkqp9sarr553jq53in4xm1m8wn3o4rlwaer06ogwvqwv9mrqoku2x334n7di44o65qze')
[04:26] actual lol
[04:26] (must be tired)
[04:26] Oh my ...
[04:26] So it tries to add a random suffix.
[04:26] Then a random prefix
[04:26] Then mutate a character
[04:26] Then a prefix and a suffix
[04:27] Then all three
[04:27] And somehow it gets 78luphr0rnk2nuqimstywepozxn9kl19tqh0tx66b5dki1xxsh5mkz9gl21a5rlwfnr8jn6ln0m3jxne2k9x1ohg85w3jabxlrqbgs-launchpad-a811i2i3ytqlsztthjth0svbccw8inm65tmkqp9sarr553jq53in4xm1m8wn3o4rlwaer06ogwvqwv9mrqoku2x334n7di44o65qze
[04:27] Still
[04:27] Not as bad as it could have been
[04:27] It does up to 1000 attempts
[04:28] Each of which adds a character to the prefix and suffix, and mutates the name further
[04:28] wgrant: how did it get to the number it got to ?
[04:28] So if we didn't have the timeout...
[04:28] I suspect it's because it contains launchpad
[04:28] So is blacklisted
[04:28] hah
[04:28] Hm
[04:28] so, just plain fail.
[04:28] Except then the mutation should pass
[04:28] So maybe not
[04:29] Hmmmmmmmmmmmmmmmm
[04:29] mmmmmmm
[04:29] mmmmmmm
[04:29] mmmmmm
[04:29] The mutation it ends up with is cdpq7ln4t
[04:29] Which is already the name of a person
[04:29] Surely we don't actually start with a predictable seed.
[04:29] ROFL
[04:29] ROFL
[04:29] whatever could go wrong with that ?
[04:29] back soon, to hear the end of the story
[04:30] # We seed the random number generator so we get consistent results,
[04:30] # making the algorithm repeatable and thus testable.
[04:30] random_state = random.getstate()
[04:30] random.seed(sum(ord(letter) for letter in generated_nick))
[04:30] Hahahahahahah
[04:30] ah uh what
[04:30] Because that's a good idea.
[04:31] i presume this means there are some truly comedy nicks already
[04:31] So because so many people have a local user part of 'launchpad'...
[04:31] yes
[04:31] that
[04:31] Checking for just what we have...
[04:31] That must mean that 78luphr0rnk2nuqimstywepozxn9kl19tqh0tx66b5dki1xxsh5mkz9gl21a5rlwfnr8jn6ln0m3jxne2k9x1ohg85w3jabxlrqbgs-cdpq7ln4t-a811i2i3ytqlsztthjth0svbccw8inm65tmkqp9sarr553jq53in4xm1m8wn3o4rlwaer06ogwvqwv9mrqoku2x334n7di44o65qze already exists
[04:31] Or it's too long
[04:32] http://launchpad.net/~78luphr0rnk2nuqimstywepozxn9kl19tqh0tx66b5dki1xxsh5mkz9gl21a5rlwfnr8jn6ln0m3jxne2k9x1ohg85w3jabxlrqbgs-cdpq7ln4t-a811i2i3ytqlsztthjth0svbccw8inm65tmkqp9sarr553jq53in4xm1m8wn3o4rlwaer06ogwvqwv9mrqoku2x334n7di44o65qze :)
[04:32] https://launchpad.net/~78luphr-lannyuwgd-a811i2i
[04:32] hahahahahahahahahahahahaha
[04:32] ahah
[04:33] * mwhudson runs out of breath
[04:33] I'd seen crazy stuff like this before
[04:33] But assumed it was just users being strange
[04:33] given
[04:33] "This should be impossible to trigger unless some twonk has "
[04:33] "registered a match everything regexp in the black list.")
[04:33] i'd be inclined to blame SteveA
[04:33] Nah
[04:33] Not really
[04:33] (probably not for the seeding though)
[04:33] Back in his day it included the domain
[04:33] ah
[04:34] But we stopped including the first segment of the domain because people complained it gave away their email address.
[04:34] heh
[04:34] sha256 the address instead? :)
[04:34] launchpad_dogfood=# SELECT COUNT(*) FROM person WHERE name LIKE '78lup%';
[04:34] count
[04:34] -------
[04:34] 306
[04:34] And that's less than half of them.
[04:35] length(name) > 100 ?
[04:35] or whatever the sql is for that
[04:36] 275
[04:36] oof
[04:36] Lots of those are deactivatedaccount, though
[04:37] Lots is actually only 14
[04:37] Lots, in troll.
[04:37] Most of the rest are the 78luphr things
[04:37] Except for the ones that are generated from ubuntu@ instead of launchpad@
[04:37] eg. mrzx4l98d4tp89jab6giohdrjqysbyjs4npz2ccq25kvjmf5h8u4cmidcko7s4tfr6ur1teuv4ju1af4k-bp84z2-hwbqs6tox1bv6csee9psn5309v7488f3dugifm692db2xfq8n1fsz7l87835tr0q36m2p3ftwpoqoy6v6
[04:38] Memorable.
[04:38] admin@ and info@ are also not uncommon
[04:39] https://launchpad.net/~mrzx4l98d4tp89jab6giohdrjqysbyjs4npz2ccq25kvjmf5h8u4cmidcko7s4tfr6ur1teuv4ju1af4k-bp84z2-hwbqs6tox1bv6csee9psn5309v7488f3dugifm692db2xfq8n1fsz7l87835tr0q36m2p3ftwpoqoy6v6 -> "Are you Ubuntu?"
[04:39] Heh
[04:39] "we are all ubuntu!"
[04:41] Ubuntu does not use Launchpad. This page was created on 2012-05-03 when the landscape-api_12.04+bzr4143-0ubuntu1.8.04+jenkins547 package was uploaded to hardy/RELEASE.
[04:41] * StevenK cackles
[04:45] So
[04:45] Things that shouldn't be deterministic:
[04:45] - Collision resolution algorithms
[04:53] indeed.
[05:18] steven@undermined:~/launchpad/lp-branches/destroy-old-privacy-ui% bzr di | diffstat -s
[05:18] 20 files changed, 72 insertions(+), 425 deletions(-)
[05:19] I am disappoint.
[05:20] StevenK: Must have missed some stuff.
[05:22] wgrant: But where?
[05:23] Yes.
[05:23] steven@undermined:~/launchpad/lp-branches/destroy-old-privacy-ui% bzr grep show_information_type_in_ui
[05:23] steven@undermined:~/launchpad/lp-branches/destroy-old-privacy-ui% bzr grep show_information_type_in_branch_ui
[05:23] steven@undermined:~/launchpad/lp-branches/destroy-old-privacy-ui%
[05:25] I'm going to toss it at ec2 and see how much carnage there is.
[05:25] wgrant: did you file a bug about the appserver src ip visibility ?
[05:29] lifeless: No.
[05:31] wgrant: will you?
[05:33] lifeless: Not unless there's a good argument for it :)
[05:35] the law isn't good enough?
[05:37] The DPA doesn't, AFAIK, say "Launchpad engineers aren't allowed to see source IP addresses in request logs to diagnose operational issues"
[05:38] That may be implied somehow, but I haven't seen an argument to that effect.
[05:38] Perhaps there is a wiki page on this.
[05:42] Huh. So I accidentally landed my branch by using bzr lp-land instead of ec2 land yesterday, and reverted when this was noticed due to buildbot failure. I then ran ec2 land as intended. And it landed, and buildbot is green.
[05:44] The errors on buildbot yesterday looked like they should consistently fail too...
[05:44] stub: What was the buildbot failure like?
[05:44] stub: It was only on the non-real buildbot.
[05:44] So it might have been using 8.4, for example
[05:45] ohh.... that would explain it
[05:45] xx-dbpolicy thinking the slave was the master etc.
[05:45] I don't know what gary's setup is these days.
[05:45] But it's reasonably plausible that it's still on 8.4, since we haven't used 9.1-specific stuff yet.
[05:45] Hm
[05:46] Except I think StevenK did last week
[05:46] Real buildbot is green so the landing is good.
[05:46] 2209-16-6 uses ORDER BY in an aggregate, so that should have weeded out any 8.4 installations
[05:46] Do you recall the traceback?
[05:47] It must have been some other buildbot - can't see red on the board
[05:48] It was an ephemeral parallel buildbot test from gary.
[05:48] It's gone now.
[05:48] Anyway, let's migrate staging to streaming replication now :)
[05:48] yer, looks like an ec2 instance in my history
[05:48] Right? :)
[05:49] yes, that is what this landing is. when it is on staging and qastaging, I can convert the staging dbs.
[05:49] wgrant: it's the conclusion of IS, after discussion with legal etc.
[05:50] Unfortunate side effect - at the moment, we have a single cluster so all the RAM on sourcherry can be dedicated to it. With the new setup, we have to run two clusters so ram usage will be less optimal because we will have two copies of the hot set in shared_buffers, and probably should reduce shared_buffers too.
[05:50] no, that isn't quite right.
[05:51] but it will still be less optimal, as load isn't split 50/50.
[05:51] lifeless: Is this insanity documented anywhere? :/
[05:52] wgrant: rt 52417
[05:52] wgrant: I wanted a solid ruling, so I sought clarity.
[05:52] Doesn't this also mean that non-EU sysadmins shouldn't be able to see it?
[05:52] Since that would involve transferring the data to somewhere outside the EU
[05:53] wgrant: I'm not going to go seeking trouble; elmo has presumably walked through all of that
[05:53] wgrant: the issue for our purposes is sysadmin/non-sysadmin/reason-for-access
[05:56] brb, starting new planet
[05:58] I guess sysadmins will have to do all further analysis of load issues.
[05:58] did you remember the Genesis device?
[05:58] yes
[05:58] wgrant: I think that's an extreme proposition to take
[05:59] Nah, the extreme (but probably correct) position here is the EU should be involved purely because of batshit insane laws like this.
[05:59] s/involved/dissolved/
[05:59] actually, if we operated in NZ, we'd have similar constraints
[05:59] EU is in no way a special snowflake
[06:00] hey, some of us like batshit insane laws protecting privacy :)
[06:01] wgrant has a point - the EU should be dissolved for plenty of reasons :)
[06:01] stub: Filtering source IP addresses from the engineers operating the service is hardly protecting privacy in any useful manner :)
[06:03] Assuming benevolent engineers
[07:41] Ah
[07:42] The initial bugsummary population query counted assignees in viewed_by, but the triggers to maintain it did not.
[07:43] That explains some of the drift.
=== frankban changed the topic of #launchpad-dev to: http://dev.launchpad.net/ | Welcome our new intern: ivory | On call reviewer: frankban* (rvba) | Firefighting: - | Critical bugs: 3.47*10^2
[08:03] good morning
[08:21] wgrant: Thanks for your review of https://code.launchpad.net/~cjwatson/launchpad/gina-deb-vendor-debian/+merge/111120. I made that correction; could I have "Status: Approved"?
[08:22] cjwatson: Sorry, probably had too many MPs open so the request didn't make it through before I closed the tab. Fixed.
[08:22] Thought so, thanks.
[08:31] wgrant: \o/
[08:32] wgrant: (finding a/the bug)
[08:32] lifeless: It's one of the bugs, but there's at least one other that there's probably no hope of ever finding.
[08:32] I suspect it's gone now.
[08:54] frankban: could you review this? https://code.launchpad.net/~ivo-kracht/launchpad/bug-425934/+merge/110786
[08:55] ivory: sure
[08:56] frankban: thx
[09:39] Dear qastaging, would you care to be any slower? I'm sure it's possible.
[09:39] * czajkowski hands cjwatson a baseball bat
[10:47] wgrant: Re the pcj-reupload / PermissiveSecurityPolicy discussion from last night: the test that fails if I ditch that removeSecurityProxy is lib/lp/soyuz/stories/ppa/xx-ppa-files.txt:274. Is it even possible to run PermissiveSecurityPolicy stuff within pagetests?
[10:47] It certainly isn't obvious how to e.g. change the layer to something zopeless.
[10:49] I suppose I could change that code to run Archive.copyPackage rather than running do_copy directly, but ISTM that that would just move the problem to one of how to run the job in-place with a permissive security policy
=== almaisan-away is now known as al-maisan
[11:11] "Visible only to users with whom the project has shared embargoed but you've stopped reading by this point anyway."
[11:11] :)
[11:18] (reported as bug 1015509)
=== G is now known as Nigel
[12:25] cjwatson: Yeah, pagetests will be a problem there.
[12:45] frankban: could you please review this MP: https://code.launchpad.net/~adeuring/launchpad/bug-29713/+merge/111212 ?
[12:46] adeuring: on it in a minute
[12:46] frankban: thanks!
[12:47] stub: since this MP adds a DB patch file, you might want to have a look too, though it's just a change of ftq() and _ftq(): https://code.launchpad.net/~adeuring/launchpad/bug-29713/+merge/111212
[12:53] adeuring: Ta. I'll look at it later. Not sure if this counts as a db patch or a code patch really with all that python.
[12:53] stub: yeah, but formally it's a DB patch ;)
[12:54] adeuring: I think the old code duplication between _ftq and ftq was because we were generating it from a single source. Making one depend on the other now is a good move.
[12:54] thanks :)
[13:01] stub: Can you tag the bug related to r15444 as 'qa-bad bad-commit-15444' ?
[13:02] Hmmm, and then it lands in r15448.
=== benji is now known as Guest14462
=== benji___ is now known as benji
[13:50] What happened to http://lpqateam.canonical.com/qa-reports/deployment-stable.html?
[13:51] (404)
[13:52] looks like maybe the vhost is now served by something else on the same machine?
[13:52] http://lpqateam.canonical.com now looks suspiciously like https://lpbuildbot.canonical.com/
[13:53] Yes, quite
[13:54] (Only without HTTPS/openid)
[13:55] webops: lpqateam.canonical.com appears broken - see above
[13:55] james_w: weirdly, those two aren't on the same IP address
[13:55] I'll take a look
[13:56] cjwatson: interesting, I suspect that's me - checking
[13:57] mm, wrong VirtualHost in /etc/apache2/sites-enabled/lpbuildbot.canonical.com perhaps
[13:57] cjwatson: sorry about that, fixed
[13:57] thanks
[14:04] adeuring, https://plus.google.com/hangouts/_/698ee8339ef1f2f3cbdce4185991264b12204423?authuser=0&hl=en
[14:05] StevenK: Did I break qatagger?
[14:12] jcsackett, we have a busy day today. I think we have a check point in 20 minutes and I need to inform you of the revelations about creating a team
[14:16] sinzui: yeah, i talked a bit with wallyworld_ about it last night, but i def need more info.
[14:17] you want to talk post-checkpoint?
[14:17] yes.
[14:17] cool. long round of meetings then. i better go make more coffee. :-P
[14:17] Maybe we can hijack the hangout
[14:17] i'm fine with that.
=== skaet_ is now known as skaet
=== al-maisan is now known as almaisan-away
=== frankban changed the topic of #launchpad-dev to: http://dev.launchpad.net/ | Welcome our new intern: ivory | On call reviewer: - | Firefighting: - | Critical bugs: 3.47*10^2
=== salgado is now known as salgado-lunch
[16:40] stub, I suspect (hope for your sake) that you are long gone, but just in case, http://bazaar.launchpad.net/~launchpad-pqm/launchpad/devel/revision/15448 is causing failures on the parallel test machine. the default flavor of stores that we are getting in the failing tests is master, not slave, which appears to be the source of the errors. I can't duplicate locally, but I duplicate every time, including across new
[16:40] machines, on EC2.
[16:40] Failures look like this: http://ec2-184-72-208-228.compute-1.amazonaws.com:8010/builders/lucid_db_lp/builds/0/steps/shell_9/logs/summary
[16:40] I can duplicate on the EC2 machine trivially, just by running the first test in the lxc container in isolation.
[16:41] the store provides IMasterStore, not ISlaveStore.
[16:41] I'm digging in, but three reliably failing tests is a big step back for us
[16:41] so help would be appreciated.
[16:42] I'll send you a note at my EoD to let you know where we are with this, one way or the other
[16:42] gary_poster: yo
[16:43] oh, hey stub! I'm sorry you are still around :-)
[16:43] so it went through ec2 and buildbot fine. I'm not sure why it fails there, but a start would be to confirm PG 9.1
[16:44] stub, yeah, I saw it passed everywhere else fine, and I have not been able to dupe locally at all
[16:44] and yes, just verified the verification: the machine is running 9.1
[16:44] well, to be precise--
[16:45] 9.1.4-1~lucid4
[16:46] xx-dbpolicy.txt is the interesting one. Others will be fallout I assume.
[16:46] We seem to be in master only mode
[16:46] right
[16:46] lp.services.webapp.tests.test_dbpolicy.LaunchpadDatabasePolicyTestCase.test_defaults is really just the same thing
[16:47] when I put a pdb in there, it says that the store implements master
[16:47] getting timeouts on launchpad...
[16:47] yeah, me too, but it eventually loaded
[16:48] gary_poster: It is valid to get the master store if you are asking for a slave store. We need to trace that to see why it is deciding to do that. For this, it would mean the lag is too high and it considers the slave useless.
[16:48] oh
[16:48] ok
[16:49] + streaming_lag = slave_store.execute(
[16:49] + "SELECT now() - pg_last_xact_replay_timestamp()").get_one()[0]
[16:49] I think that is the only relevant line added
[16:51] stub, in dbpolicy.py, in getStore, self.default_flavor == 'master'. Is that consistent?
[16:51] (this is in a pdb, I mean)
[16:52] test_dbpolicy.py is probably better to debug this I guess.
[16:53] yeah, that's where I am
[16:53] I got the lp.services.webapp.dbpolicy.LaunchpadDatabasePolicy
[16:54] which in says the default_flavor is master, in code and in pdb, so I must have zigged when I should have zagged earlier, because it makes sense that the store should be a master once I'm here.
[16:55] s/which in/which/
[16:55] gary_poster: So we attach the interface to the object we return from getStore? How could this happen then?
[16:57] stub, I'm currently exploring whether lib/lp/services/webapp/adapter.py(848)get() should not have a LaunchpadDatabasePolicy from StoreSelector.get_current()
[16:57] because once it does, afaict, we're on the wrong path
[16:57] gary_poster: and in that test, yes, it should be default of slave because we are testing the Slave policy
[16:58] stub, so the following result is the wrong one--we should see some other policy class, right?
[16:58] ((Pdb)) p StoreSelector.get_current()
[16:58]
[16:59] def setUp(self):
[16:59]     if self.policy is None:
[16:59]         self.policy = SlaveDatabasePolicy()
[16:59]     super(SlaveDatabasePolicyTestCase, self).setUp()
[17:00] so yeah, should be a slave policy
[17:00] huh
[17:00] ok, I'll do my pdb in that setUp
[17:01] I'd watch SlaveDatabasePolicyTestCase.setup
[17:01] It will not set the policy if it is not None, to avoid stomping on subclasses having already set it.
[17:01] yeah that's what I meant; trying
[17:03] whoa!
[17:03] self.policy is not None
[17:03] it is LaunchpadDatabasePolicy.
[17:03] Who did that and why!
[17:04] oh
[17:04] it is a confusing test assembly
[17:05] stub, notice the failing test is actually lp.services.webapp.tests.test_dbpolicy.LaunchpadDatabasePolicyTestCase
[17:05] which has its own setUp
[17:05] oh
[17:05] which sets the policy to LaunchpadDatabasePolicy
[17:05] So now I'm wondering why it is working everywhere else
[17:06] DEFAULT_FLAVOR is decided on by the policy, it might be slave, might be master
[17:07] line 246 of dbpolicy.py
[17:07] That line should be hit, and I bet it isn't
[17:07] the getReplicationLag call above it is the new code I landed.
[17:08] stub, the code I saw in pdb was getStore in the base class, not install
[17:09] line 100 (with line 92) were the pertinent bits
[17:09] oh
[17:09] gary_poster: Can you run "SELECT now() - pg_last_xact_replay_timestamp()" on one of your test dbs if you have one available from a paused test?
[17:09] but then we set
[17:09] k
[17:10] It should always be returning NULL since the database isn't replicated
[17:10] Unless I've misunderstood something, and your dbs are being setup differently
[17:13] stub, I should be connecting to a db that looks something like launchpad_ftest_2391, yes
[17:13] ?
[17:14] gary_poster: yes, that looks fine
[17:14] stub, I get 02:17:48.970632
[17:14] 2 hours, 17 minutes.
[17:14] Bum.
[17:15] So how are you setting your databases up?
[17:15] using lpsetup, which calls launchpad-database-setup or whatever it is called. Lemme get that code for us...
[17:15] nah, that is fine. it is the standard stuff.
[17:15] Hmm...
[17:16] I get null, lucid staging database 9.1.4...
[17:17] But I think this means my landing needs to be reverted, and I need to detect if the slave database is using pg 9.1 replication before doing that check.
[17:17] I'll double check the docs first.
[17:20] stub, fwiw, the database is setup in line 222 of http://bazaar.launchpad.net/~canonical-launchpad-branches/lpsetup/trunk/view/head:/lpsetup/subcommands/install.py
[17:20] gary_poster: but parallel testing hasn't changed this?
[17:20] hasn't changed what, stub? We're using the same LP tree as everyone else, if that's what you mean
[17:20] database setup.
[17:21] If you start with 'make schema', it should all be fine.
[17:22] which is utilities/launchpad-database-setup, and that should be fine :-/
[17:22] stub, we do run make schema. results of all of that per-build setup are here:
[17:22] http://ec2-184-72-208-228.compute-1.amazonaws.com:8010/builders/lucid_db_lp/builds/0/steps/shell_8/logs/stdio
[17:22] Unless perhaps your database crashed at some point and started up in recovery mode? Which would mean it replayed logs, and that function would return something.
[17:22] We've gotten the LANG complaints forever. We now know the cause but we haven't bothered to change anything
[17:23] because it's been working fine up to now
[17:23] yer, I think I need to revert and not assume pg_last_xact_replay_timestamp() returns NULL on unreplicated systems.
[17:24] stub, where would I look to see if we have the crash you expect? or do you not want to bother?
[17:24] Don't bother.
=== salgado-lunch is now known as salgado
[17:24] ok cool
[17:24] If it can happen, it can happen.
[17:24] ok
[17:25] This code needs to cope.
[17:25] gotcha
[17:29] stub, fwiw, I see this: http://pastebin.ubuntu.com/1051255/ .
That seems to align with the time frame that we are talking about. I bet we are shutting down the lxc containers harder than postgres expects, so that's what it is complaining about.
[17:29] * gary_poster likes tidy explanations
[17:29] gary_poster: Yes. So replication works the same way as recovery after a hard shutdown.
[17:30] The wal files are replayed to ensure the data files are correct.
[17:30] gotcha
[17:30] Which makes my replication lag check return a value
[17:30] Bum.
[17:30] gotcha. :-/
[17:31] I'll revert that again now
[17:31] ok thanks stub, and thank you very much for helping so far past EoD
[17:31] np. I had a few hours in the middle off so am catching up :)
[17:33] :-) cool
=== deryck is now known as deryck[lunch]
[18:12] stub: good $midnight?
[18:13] 1am?
[18:13] yowser :>
[18:13] gary defeated my cunning plans
[18:13] I see
[18:14] is launchpad the team that never sleeps!
[18:14] this is the team that goes on and on
[18:30] gary_poster: If you haven't messed with your evil postgresql installation, can you run 'SELECT pg_is_in_recovery(), pg_last_xact_replay_timestamp()' for me?
[18:31] stub, sorry, you missed the window by seconds
[18:31] doh.
[18:31] stub, although...
[18:31] I'll try and break my local install then :)
[18:31] unless...
[18:32] (1) I got trunk after your branch and things were fine again, and (2) if you give me about five minutes, I can log into a running instance (with tests running) and do stuff there if that helps
[18:33] or (3) I could kill the running tests and do stuff then.
[18:33] any of those of interest, stub?
[18:33] Nah, I think I've reproduced it here (shutdown immediate with changes made but before checkpoint).
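The query stub asks for at 18:30 pairs the replay timestamp with `pg_is_in_recovery()`. A minimal sketch of how a lag check might use both values follows. It is an assumption about one possible guard, not the code that was landed or reverted, and `replication_lag` is a hypothetical helper fed with the results of `SELECT pg_is_in_recovery(), pg_last_xact_replay_timestamp(), now()`.

```python
from datetime import datetime, timedelta


def replication_lag(in_recovery, last_replay, now):
    """Return the streaming lag as a timedelta, or None if not a live standby.

    in_recovery -- value of pg_is_in_recovery()
    last_replay -- value of pg_last_xact_replay_timestamp(), may be None
    now         -- value of now() from the same query
    (Hypothetical helper; the name and signature are illustrative only.)
    """
    if not in_recovery:
        # A standalone server that replayed WAL after a hard shutdown keeps
        # reporting the timestamp of the last replayed transaction, so the
        # timestamp alone does not prove the database is a replica.
        return None
    if last_replay is None:
        # In recovery, but no transaction has been replayed yet.
        return None
    return now - last_replay


if __name__ == "__main__":
    now = datetime(2012, 1, 1, 12, 0, 0)  # arbitrary example value
    stale = now - timedelta(hours=2, minutes=17)
    print(replication_lag(False, stale, now))  # crashed-and-recovered primary
    print(replication_lag(True, stale, now))   # genuine standby
```

With this shape, the unreplicated test database that reported a lag of over two hours would be treated as having no lag to measure, so the policy would not fall back to the master on that basis.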
[18:33] cool
=== deryck[lunch] is now known as deryck
=== salgado is now known as salgado-afk
[22:31] sinzui: wgrant: StevenK: jcsackett: link works now
[22:34] sinzui: http://launchpad.net/~78luphr0rnk2nuqimstywepozxn9kl19tqh0tx66b5dki1xxsh5mkz9gl21a5rlwfnr8jn6ln0m3jxne2k9x1ohg85w3jabxlrqbgs-cdpq7ln4t-a811i2i3ytqlsztthjth0svbccw8inm65tmkqp9sarr553jq53in4xm1m8wn3o4rlwaer06ogwvqwv9mrqoku2x334n7di44o65qze
[22:34] (that's where 'launchpad' gets you)
[22:34] That's still the best URL ever
[22:34] The 'cdpq7ln4t' is 'launchpad'
[22:34] Mutated deterministically to avoid collisions.
[22:41] wgrant: My theory is that Launchpad is trying to beat bug 255161 for the funniest bug ever.
[22:41] <_mup_> Bug #255161: I am unable to print from open office, I tried reinstalling open office but it did not work. I use a brother mfc240c printer and I am running Hardy. Printing from other apps has not been an issue. < https://launchpad.net/bugs/255161 >
[22:49] cjwatson: Ah yes, I remember that one.
[22:49] wgrant: Does http://paste.ubuntu.com/1051761/ look about right as a gina hack to make it not commit anything?
[22:49] For dogfood
[22:49] It has commits all over the place so it's a little awkward
[22:51] cjwatson: I think that's correct.
[22:52] I won't try it until after I next regain consciousness, so there's a while to catch idiocies :-)
[22:53] Yep.
[23:28] wgrant: so, why don't we just count the matching things and add -(N+1) and be done with it ?
[23:31] lifeless: Because seeding an RNG deterministically is more fun.
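The suggestion at 23:28 can be sketched in a few lines. This is a minimal illustration, not Launchpad's actual nick generator: `seeded_candidates` and `next_free_nick` are hypothetical names, and the `taken` set stands in for the `person` table. The first function reuses the seeding expression quoted at 04:30 to show that every registrant with the same local part walks the same candidate sequence; the second counts the existing `base-` names and starts from N+1.

```python
import random
import string


def seeded_candidates(base, attempts=5):
    """Candidates from an RNG seeded by the base nick (illustrative only).

    A private Random instance is used instead of the global one; the point
    is the same: an identical base always yields identical candidates.
    """
    rng = random.Random(sum(ord(letter) for letter in base))
    alphabet = string.ascii_lowercase + string.digits
    return [
        "%s-%s" % (base, "".join(rng.choice(alphabet) for _ in range(6)))
        for _ in range(attempts)
    ]


def next_free_nick(base, taken):
    """Hypothetical alternative: count matching names, then try -(N+1)."""
    if base not in taken:
        return base
    # Rough equivalent of: SELECT count(*) ... WHERE name LIKE 'base-%'
    n = sum(1 for name in taken if name.startswith(base + "-")) + 1
    # The count is only a starting point; keep going if that name is taken.
    while "%s-%d" % (base, n) in taken:
        n += 1
    return "%s-%d" % (base, n)


if __name__ == "__main__":
    # Two different users with the same local part get the same sequence,
    # so the second one collides on every candidate the first one used.
    print(seeded_candidates("launchpad") == seeded_candidates("launchpad"))
    taken = {"launchpad", "launchpad-1", "launchpad-2"}
    print(next_free_nick("launchpad", taken))
```

The counting version needs one query plus, at worst, a few existence checks, and the resulting name stays short however many people register from a `launchpad@` address.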