/srv/irclogs.ubuntu.com/2012/12/17/#launchpad-dev.txt

StevenKAh00:00
StevenKSo the two tests are terrible?00:00
StevenKEffectively, they are getUtility(IPackageUploadSet).getAll(distroseries, name=u"'") with and without exact_match=True00:00
wgrantHuh?00:00
wgrantNo00:01
wgrantThe tests have revealed a bug in your code00:01
wgrantIf API input can cause a crash, then our code is buggy00:01
StevenKThat's not API input, that's calling the method directly00:02
wgrantSure00:02
wgrantBut I can pass in a name of "'" through the API00:02
StevenKwgrant: But name = name.replace("'", "").replace('"', '') probably makes you hate me ...00:21
wgrantThat's very PHPesque of you00:22
wgrantYou should escape the string according to the various syntaxes that are involved00:22
wgrantNot just blindly remove characters until it works :)00:23
StevenKwgrant: IE, remove everything that isn't alphanumeric?00:24
wgrantThat's precisely the opposite of what I suggested00:25
StevenKwgrant: Hold on00:25
StevenKEscaping in the ' or " case doesn't work00:25
wgrantWhy not?00:25
StevenKBecause as you said earlier, it isn't a valid tsquery syntax00:26
wgrantConfused...00:26
wgrantA plain ' is not valid tsquery syntax00:26
wgrantBecause ' is special in tsquery syntax00:26
wgrantLike in SQL00:27
wgrantHow do we resolve this in SQL?00:27
wgrant(and Python, and JavaScript, and...)00:27
* StevenK is clearly missing something00:28
wgrant' delimits a quoted string in Python, SQL, and a tsvector00:30
wgrantHow do we normal handle ', and other special characters, in a Python or SQL string?00:31
StevenKBy escaping them00:31
wgrantRight00:31
wgrantSo how might we handle a ' in a tsvector string?00:31
StevenKBy escaping it00:31
wgrant:)00:31
StevenK'''' isn't already escaped, or it needs to be escaped twice?00:32
wgrantHow's that escaped?00:32
wgrant'''' is the SQL equivalent of Python's "'"00:32
StevenKAh00:32
wgrantIt's escaped in SQL terms00:33
StevenKwgrant: So, I have two MPs -- I'm happy to self-review the drop-garbo one, it just deletes 182 lines of code. What about the index MP?01:01
wgrantThat should only be landed once the query branch is ready01:03
wgrantIn case it requires changes01:03
StevenKMmmmm01:03
StevenKI should drop my hacks on DF, add two rows to LDR and update it01:04
wgrant(none of this sort of thing should ever land until the whole thing is confirmed to work, really)01:04
StevenKWe had confirmed it was working on Friday01:04
wgrantWe hadn't ever done a full test01:05
* StevenK executes on his plan01:06
StevenKwgrant: The lib/lp/registry/model/person.py cowboy on DF is yours?01:08
StevenKAnd lib/lp/soyuz/model/binarypackagebuild.py, I guess01:08
wgrantIndeed01:11
wgrantWell, person.py is me, binarypackagebuild.py is cjwatson01:11
StevenKwgrant: Do you want your change preserved?01:12
wgrantNo01:12
StevenKwgrant: No objections to landing the death of garbo branch?01:52
wgrantStevenK: As long as it's done on all four DBs01:52
StevenKYeah, all five -- dev, dogfood, qas, staging and prod01:52
* StevenK stabs dogfood02:52
StevenKDamn it, load02:53
StevenKwgrant: DF's front page is unhappy, but +queue looks snappy02:58
StevenKHmmm, I think JS is still broken02:59
wgrantYeah03:00
wgrantNo convoy, remember03:00
StevenKThen remove the flag?03:00
wgrantWon't help03:00
wgrant3.5 won't work without convoy03:00
StevenKBleh03:01
wgrantThough we still haven't removed 3.3, don't rely on that for much longer :)03:01
wgrantWhy do you want JS?03:01
StevenKI can't see render times03:01
StevenKAnd I'd like to03:01
wgrantAh03:01
wgrantview source :)03:01
StevenKAt least 255 queries/external actions issued in 9.53 seconds03:01
wgrantThat doesn't sound indexed03:02
wgrantWhat's the query?03:02
StevenKHowever, that's 'xorg' for Quantal done03:02
wgrant(or it could just be a really really cold cache)03:02
StevenKLet me resubmit03:02
StevenKAt least 255 queries/external actions issued in 3.97 seconds03:02
wgrantRight03:03
wgrantCan you track down the main search query and see how long it took?03:03
StevenKCause an OOPS?03:03
wgrantOr look at the query log that's inline in the page03:03
wgrant(you'll probably need to poke with firebug, or view source)03:04
StevenKSELECT DISTINCT PackageUpload.* FROM PackageUpload WHERE PackageUpload.distroseries = 110 AND PackageUpload.status IN (3) AND PackageUpload.archive IN (1, 534) AND ((searchable_names::tsvector @@ 'xorg:*')) ORDER BY PackageUpload.id DESC LIMIT 31 OFFSET 0 == 54 ms03:05
StevenKLet me cause an OOPS, I think we're missing some preloading03:07
StevenKOOPS-48338103ccc22a3accd4c8c2476a610703:08
StevenKHmmm, and why isn't that OOPS ID on neem? :-(03:10
lifelessyou did a grep, not a find, right ?03:10
wgrantStevenK: DF doesn't transmit to neem03:10
wgrant/srv/launchpad.net/dogfood-logs03:11
lifelessThe OOPS id isn't used for the file name, because that would lead to trusting untrusted data.03:11
StevenKlifeless: No, I did a load up the website and put the id in the field03:11
StevenKI was hoping for it to be formatted nicely, not in a file on DF :-(03:12
wgrantlifeless: How is trusting that data problematic, as long as you verify it's safe for use as a filename?03:17
wgrantIf someone wants to collide, they already can03:17
wgrantIt searches by ID, not filename03:17
* StevenK finishes writing a novel for CA03:18
* StevenK glares at this OOPS on mawson03:20
StevenKHoly crap it's slow the first time03:23
StevenKboost is still pathetic at 8.4 seconds03:23
wgrantBut which part is slow?03:28
wgrantIf boost is slow, it probably just means that the loading of the binaries is slow03:28
wgrantNot that the search is03:28
StevenKMmmm03:30
StevenKwgrant: http://wedontsleep.org/~steven/f.html is the query log03:31
lifelesswgrant: mmm true03:31
StevenKThere is no one query that is terribly slow03:31
wgrantC IS FOR COOKIE03:31
wgrantBad StevenK03:31
StevenKwgrant: So hurry up and load it so I can rm it03:32
wgrantI've loaded it and erased your cookie03:33
wgrantWe can see from the log that the search query is below 50ms03:35
wgrantSo for the purposes of this change it is fine03:35
wgrantWhat's not fine is... the entire rest of the page03:35
StevenKIndeed03:37
wgrantBut we already knew that :)03:37
StevenKwgrant: So looks like I'll be sticking to DistroSeries:+queue after this work lands ...03:40
StevenKwgrant: Anyway, index MP?03:40
wgrantSeveral decades ago I had a branch which mostly fixed DistroSeries:+queue03:40
wgrantI'm not sure if it's still around, and it's probably changed enough now that it's not useful03:41
wgrantI didn't propose it at the time because cachedproperty on model objects was banned03:41
StevenKI think the first step is loading everything related in one query, rather than this looping garbage it's doing03:42
wgrantWell03:42
wgrantYou need to preload things03:42
wgrantIt's pretty simple03:42
wgrantThere's just a number of different sorts of things you'll need to preload03:43
StevenKwgrant: Are you satisified about the working-ness of this change?03:49
lifelesslifeless's contribution to lp: unbanning cachedproperty.03:49
wgrantlifeless: Yep03:50
wgrantStevenK: Well, your app code is wrong, but otherwise it seems to be good03:50
StevenKwgrant: What about it?03:51
wgrantStevenK: It'll probably still go horribly wrong if I include a : in my search term, for example03:53
wgrantThe precise parsing rules don't seem to be very well documented03:55
StevenKwgrant: I need to escape : in the same way?03:56
StevenKParsing rules for tsquery seem to not be documented at all03:56
wgranthttp://www.postgresql.org/docs/9.1/static/textsearch-controls.html#TEXTSEARCH-PARSING-QUERIES goes some way03:58
wgrantAh, http://www.postgresql.org/docs/9.1/static/datatype-textsearch.html is what I was looking for03:59
StevenKSo, ' ", : * look special04:00
StevenKWithout that errant comma04:01
wgrantWell04:02
stubabel was the last person to look hard at this04:02
wgrantI'd just put the whole thing in single quotes, backslashing any ' or \04:02
wgrantGiven we want the whole thing treated as a single token04:02
StevenKwgrant: Which doesn't handle your : case04:03
wgrantDoesn't it?04:03
wgrant: isn't special when inside a quoted string04:03
StevenKAh04:03
wgrant(and " and * aren't ever special)04:03
StevenKDoes that mean it turns into @@ 'foo':* for substring?04:04
wgrantRight04:04
wgrantEr04:04
wgrantNo04:04
wgrant@@ '''foo'':*'04:04
wgrantOr @@ E'\'foo\':*' if you're using that syntax04:05
wgrantOr @@ E'\'foo\\\'\\\\\':*' if you want to search for foo'\ :)04:05
stubraise StupidUserSearchError(wtf)04:06
StevenKHaha04:06
stubWhat is all this for btw? Someone working on the antique database text search?04:07
StevenKstub: Searching on DistroSeries:+queue04:08
StevenKUsing tsvector because we like fast queries04:08
StevenKAnd not an eight table join and four likes04:08
StevenKOh god, doesn'04:09
StevenKBleh04:09
StevenKOh god, doesn't this turn into .replace(r'\\', r'\\\\')04:09
stubIn that case, why have you got odd characters you need to match? For word search, you want to normalize the data in the index.04:13
stubOh... this is for the search04:13
stubI don't know what the data looks like you are searching. But if odd characters are coming in, you have bigger problems with fuzzy matching like tsvector/tsquery such as case sensitivity, stop words, stemming04:18
wgrantstub: Well04:19
wgrantstub: stemming only happens if you use to_tsvector04:19
wgrantSo it's not a problem here04:20
wgrantWe just skip stemming and stopwords etc04:20
wgrantCreating the tsvector and tsquery directly with exactly the content we desire04:20
wgrant(I only opted to use ts2 because otherwise we'd need a custom GIN operator to do prefix search on an array of strings)04:21
StevenKGod damn I'm confused04:21
stublovely that this is all documented now. i might be able to understand that old code now.04:21
wgrantStevenK: What's confusing?04:21
StevenKIf I paste this, I may get murdered04:22
wgrantLet me sharpen my knives04:22
StevenK:-(04:22
StevenKWaiting for an error, anyway04:22
StevenK            name = name.replace("'", "\\'").replace(r'\\', r'\\\\')04:23
StevenK            name = "''%s''" % name04:23
StevenKAnd then we enclose it in '' in the SQL condition04:23
wgrantEw04:23
StevenKProgrammingError: syntax error in tsquery: "''':*"04:23
StevenKLINE 1: ...chive IN (16) AND ((searchable_names::tsvector @@ '''\''':*'...04:23
wgrantOne level at a time pls :)04:23
wgrantCreate the string as you desire ::tsquery to see it04:24
wgrantThen wrap that in sqlvalues() or similar, to the do the SQL escaping04:24
wgrant(also, you need to escape the backslashes first, or you'll be escaping escape characters04:25
StevenKHmm, that may not be helping04:26
StevenKwgrant: The index and NOT NULL MPs are up04:27
StevenKThe branch I'm working on depends on the indexes, but the NOT NULL is optional04:27
StevenKI think sqlvalues isn't helping me04:28
StevenKSearch string: ''unique-from-factory-py-line3348-101271''04:31
StevenKsqlvalues escaped: '''''unique-from-factory-py-line3348-101271'''''04:31
wgrantOh right04:32
wgrantsqlvalues still uses the horrid old non-standard escaping syntax04:32
StevenKI'm tempted to ignore it04:33
StevenKAnd just build the string by hand04:33
wgrantWhy?04:33
StevenKMainly because it's annoying me04:34
StevenKHowever, I win04:34
wgrantWhat is your code?04:34
StevenKwgrant: http://pastebin.ubuntu.com/1444536/04:35
wgrantLine 3 is odd04:35
wgrantAre you sure you really want a raw string?04:36
wgrantthat'll replace \\ with \\\\04:36
wgrantNot \ with \\04:36
StevenKAh04:36
StevenKNow '\\', '\\\\'04:36
StevenKI seem to not be stabbed, so that's a good thing ...04:36
wgrantIn a raw string, a backslash causes the following character to be interpreted literally, but the escaping backslash is retained in the final string04:36
wgrantWell04:37
wgrantYou're still using sqlvalues04:37
wgrantWhich you should stop doing04:37
wgrantBut it's better than manually doing the SQL escaping04:37
StevenKOh?04:37
wgrantSQL("whatever @@ ?", params=('blah',))04:37
StevenKThen search_string turns into "''%s''" ?04:38
wgrantConfused04:39
wgrantWhy?04:39
StevenKBecause you had ''' in your earlier example04:39
wgrantparams handles escaping for you04:39
wgrantThat's why it exists04:39
wgrantYou give it a normal string, it will inject it into the SQL safely04:40
StevenKwgrant: sqlvalues gutted from name and version searching04:43
wgrantLovely04:43
wgrantsqlvalues is thoroughly deprecated; you should avoid it except in legacy code04:43
StevenKwgrant: http://pastebin.ubuntu.com/1444550/04:46
wgrantStevenK: searchable_versions can be done with a native Storm expression04:52
wgrantThere are Array and ArrayContains (though the latter is @> rather than <@, I'm sure you can flip your args :)) bits in lp.services.database.stormexpr04:52
wgrantAnd the parens in the searchable_names SQL are likely redundant04:53
wgrantAnd "if name is not None and name != ''" sounds suspiciously close to "if name"04:53
StevenKOld code04:54
wgrantI'm an equal opportunity complainer04:55
StevenKwgrant: http://pastebin.ubuntu.com/1444568/05:01
wgrantIndentation!05:02
wgrantThat looks like you're appending an ArrayContains and an Array05:02
StevenKFixed05:03
StevenKAny other complaints?05:03
* StevenK waits for the flood05:03
wgrantSadly not05:07
StevenKwgrant: Should I put up an MP for this branch, then? Then you can happily nitpick more sections of the code.05:08
wgrantyep05:09
StevenKwgrant: That MP is now up05:20
StevenKwgrant: Thanks. Shall I wait for stub about the index branch, or which?05:28
wgrantLooks good, but let me just test something05:33
StevenKwgrant: Ah, seeing if a composite helps?05:41
wgrantYeah05:42
wgrantIt still doesn't05:42
StevenK:-(05:42
StevenKIt's still fast enough05:42
wgrantSure05:42
StevenKAnd beats the pants off the 15 seconds it was :-)05:43
wgrant(also, it's "indices", not "indicies")05:43
StevenKRight, index branch landing05:44
StevenKWhen stub reappears and has applied them both to qas and prod, I'll land the use branch05:45
StevenKAnd when he approves the NOT NULL branch, I'll land that.05:45
wgrantRight05:49
wgrantHmm05:50
StevenKHah05:52
StevenKDRS is already used for DistroSeries:+queue05:52
wgrantYes05:52
wgrantIIRC it only preloads PU*05:52
wgrantNot everything else05:52
StevenKBPF, SPRF, SPR05:53
StevenKIt may be trying to add in BPB05:54
StevenKBut it at least needs BPB and BP05:54
StevenK*PB05:54
StevenKstub: O HAI. Can you have a look at 40-2 in devel and apply the indicies to prod and qas?09:27
* stub looks09:30
StevenKstub: https://code.launchpad.net/~stevenk/launchpad/index-searchables-for-pu/+merge/140120 is the MP09:31
stubStevenK: Backups still running, so it needs to wait a bit longer09:32
stubI'll do staging and qas09:32
stubStevenK: It is on qas, staging already had it from nightly updates09:42
cjwatsonStevenK: you can drop my binarypackagebuild.py cowboy if you like09:44
StevenKcjwatson: It was two lines, and didn't hurt to restore.10:02
StevenKI certainly committed more evils to DF10:02
StevenKstub: Surprised the backups are still going10:03
stubyeah10:03
stub9 hours so far, only got as far as translationmessage10:08
stubless than 30 mins into that one10:08
stuboh, only one minute into translationmessage10:09
StevenKI thought they only took six?10:10
wgrantwildcherry's backup would normally have finished about 20 minutes ago10:12
stubyeah, suspect high io. Running at just over 10% iowait atm.10:12
wgrantIt's possible that allocate-revision-karma is causing excessive load as it catches up with 15 million revisions of backlog10:12
stubrevisionkarma catchup10:12
stubyer10:12
wgrantIt's about half done10:12
stuband our io utilization has been creeping up anyway as things get optimized and we become less cpu bound.10:13
wgrantYeah10:13
wgrantI've been noticing more and more IO issues in explain analyzes lately.10:14
wgrantBlocks just don't read as quickly as they used ot10:14
stubwgrant: production or dogfood?10:18
wgrantProd10:18
wgrantdogfood's always been dreadful :)10:19
wgrantReading in cold blocks on prod is just not as fast as it probably should be10:19
stubI'll blame contention on production, but we have a few knobs to twiddle if it looks like it needs it10:19
stubAt the moment, we are tuned so pg thinks random access is pretty darn fast due to almost always having our data in disk cache. That might be a lie now.10:20
wgrantYeah10:20
wgrantWith the DB this big...10:20
stubits always been big, it is the size of the hot set of data that is important.10:21
wgrantIt wasn't all that long ago that the DB was 250GB10:21
wgrantNow it's more than 350GB10:21
wgrantWe've gone from 2x RAM to 3x RAM10:21
stubyeah, but ram is 128GB10:21
StevenKHmm, that's a bit horrible11:01
* StevenK glares at buildbot11:02
StevenKAnyway, this is wgrant's fault.11:02
StevenKHe told me to Storm-ify it.11:02
nigelbStevenK: When in doubt, blame wgrant?11:07
StevenKIt works with other tests11:08
StevenKSo maybe the doctest is just bong11:08
wgrantStevenK: The doctest is probably calling it with a str, not a unicode11:11
StevenKYes11:12
StevenKwgrant: The previous call has unicode, but the one that errors doesn't.11:12
wgrantRight11:13
wgrantEasy fix11:13
StevenKYeah, landing it in a few11:13
=== benji changed the topic of #launchpad-dev to: http://dev.launchpad.net/ | On-call reviewer: benji | Firefighting: - | Critical bugs: <150
=== slank_away is now known as slank
gary_postersinzui, hey.  Am I right that https://code.launchpad.net/~chrisjohnston/launchpad/fdl/+merge/140216 will not need a db change, because 330 is all we have in the DB?  On a related note, can you or benji get that landed soon?  I don't have an LP dev environment ATM15:00
sinzuicorrect15:02
cjohnstonmornin sinzui15:03
gary_posterthanks sinzui15:18
sinzuinp, hi cjohnston15:18
gary_posterbenji do you have an LP dev environment to land cjohnston's branch, above, sometime today?15:18
rick_h_gary_poster: I'll get it going. cc/ benji15:19
gary_posterthank you much rick_h_15:19
cjohnstonthanks rick_h_15:19
rick_h_cjohnston: went ahead and sent it off to ec2 just in case it hits some stray tests. See you in the afternoon.15:26
cjohnstonthanks15:27
cjohnstonrick_h_: I spoke with gary_poster more about my issue with lpsetup, I guess it's because I'm on quantal15:29
gary_posterthat's my hope anyway :-/ was working for us on precise15:30
rick_h_cjohnston: ah the lxc issue? I have lxc working on here, but I did things more the hard way to customize the setup so not sure15:30
rick_h_cjohnston: oh right, benji started to help me with some issues on quantal and I ended up just manually creating an lxc the way I wanted and doing more manual setup in there15:31
rick_h_back in budapest15:31
cjohnstonGotcha.. It would be really nice if the barrier to help for the community as low as possible15:32
=== teknico_ is now known as teknico
=== yofel_ is now known as yofel
cjohnstonrick_h_: ping21:47
rick_h_cjohnston: pong21:56
rick_h_cjohnston: what's up? around just a couple min21:56
rick_h_cjohnston: reforcing the build on buildbot, spurious failure21:57
rick_h_hopefully that's what you were looking for21:57
rick_h_will check back in later tonight21:58
cjohnstonyup21:58
cjohnstonso its all good then21:58
cjohnstonthanks21:58
rick_h_cjohnston: yea, ec2 came back clean so just need to smack build-not around a bit21:59
=== benji changed the topic of #launchpad-dev to: http://dev.launchpad.net/ | On-call reviewer: - | Firefighting: - | Critical bugs: <150
cjohnstoncool22:00

Generated by irclog2html.py 2.7 by Marius Gedminas - find it at mg.pov.lt!