[00:00] <Bert_2> wgrant: but logging in as root with the root password doesn't work either
[00:00] <Bert_2> or does it only copy the password of the one user it's created with ?
[00:01] <wgrant> Bert_2: If asked to bind ~ (the '-b $USER' option) it will copy the relevant lines from 'getent passwd' and 'getent shadow'. If -b is omitted, it will use ubuntu:ubuntu
[00:01] <Bert_2> I see
[00:02] <Bert_2> so I'll have to recreate the container then I guess ?
[00:02] <wgrant> You'll probably have to leave out -b, or manually 'chroot /var/lib/lxc/$CONTAINER/rootfs passwd $USER'
[00:02] <Bert_2> wgrant: okey, thanks :D
[00:13] <wallyworld_> StevenK: yeah, i've been otp for an hour sorry. doing it now
[00:28] <wallyworld_> StevenK: is done
[01:01] <StevenK> wallyworld_: Thanks
[01:02] <wallyworld_> StevenK: sorry it took a while, i was otherwise engaged for a bit
[01:02] <StevenK> wallyworld_: Tis all good.
[01:13] <StevenK> wgrant: So I've cribbed get_bug_privacy_filter for get_branch_privacy_filter, but I'm not sure how to collapse down the policy grant filter
[01:28] <StevenK> 2012-07-04 01:15:01 DEBUG2  [PopulateBranchAccessArtifactGrant] Done. 30536 items in 304 iterations, 625.594790 seconds, average size 100.449373 (48.8121224706/s)
[01:29] <StevenK> wgrant: ^ DF
[01:32] <wgrant> StevenK: Rather than saying access_policies && foo, you can say access_policy = ANY(foo)
[01:54] <StevenK> wgrant: Does Storm export an Any, or do I need to screw around with SQL() more?
[01:58] <wgrant> StevenK: I'm not sure. At worst you'll need to add a three-liner to lp.services.database.stormexpr
[02:08] <lifeless> pretty sure it does
[02:08] <lifeless> and please add obvious things like that to storm itself
[02:08] <lifeless> we have reviewers in-team, no need to make the techdebt higher than it already is
[02:10] <StevenK> lifeless: I've been looking and I can't see it
[02:12] <lifeless> ah, probably doesn't then. its weak on scalars
[02:12] <lifeless> still, as wgrant says, easy to add.
[02:15] <wgrant> lifeless: You mean !scalars?
[02:16] <lifeless> maybe :P
[02:27] <wgrant> Bug #1020785
[02:27] <_mup_> Bug #1020785: lp.services.apachelogparser.base.get_files_to_parse doesn't like gzipped files over 4GiB <ppa> <Launchpad itself:Triaged> < https://launchpad.net/bugs/1020785 >
[02:28] <wgrant> Can anyone see a better solution than checking if the read size is >4GiB and gzip isize <4GiB, then ignore the file if read size % 2**32 == gzip isize?
[02:32] <wgrant> lifeless: Can I have a One Patch Policy exception for http://bazaar.launchpad.net/~launchpad-pqm/launchpad/db-stable/view/head:/database/schema/patch-2209-20-1.sql and http://bazaar.launchpad.net/~launchpad-pqm/launchpad/db-stable/view/head:/database/schema/patch-2209-25-1.sql?
[02:32] <lifeless> well, thats also going to fail.
[02:33] <lifeless> why not keep a signature of the last bytes in the file, or the mtime, or something? What are the constraints.
[02:33] <wgrant> lifeless: It'll fail in roughly 1 in a couple of billion files.
[02:33] <wgrant> Which is certainly less than ideal.
[02:34] <wgrant> But not fatal.
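A minimal sketch of the check proposed at 02:28, assuming the parser has stored the number of uncompressed bytes it has read so far. The gzip trailer's ISIZE field holds the uncompressed length modulo 2**32 (RFC 1952), so it wraps for inputs over 4GiB. The function names are hypothetical and are not taken from lp.services.apachelogparser.

```python
import struct

GZIP_ISIZE_MODULUS = 2 ** 32


def read_isize(gzip_bytes):
    """Return ISIZE, the last four bytes of a gzip member (little-endian).

    Per RFC 1952 this is the uncompressed length modulo 2**32.
    """
    return struct.unpack("<I", gzip_bytes[-4:])[0]


def already_fully_read(bytes_read, isize):
    """Guess whether a file was read to the end, tolerating ISIZE wraparound.

    For files under 4GiB this is a plain equality check. For larger files
    it can give a false positive when the true length differs from
    bytes_read by an exact multiple of 2**32.
    """
    return bytes_read % GZIP_ISIZE_MODULUS == isize
```

The false positive described in the docstring is the rare failure case mentioned at 02:33: a partially read file whose stored read size happens to agree with ISIZE modulo 2**32.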
[02:35] <lifeless> wgrant: +1
[02:35] <wgrant> Thanks.
[02:35] <lifeless> on the file thing
[02:36] <lifeless> what are the constraints, why do we need to check that way?
[02:36] <lifeless> would a logstash consumer be a better answer ?
[02:36] <wgrant> OK
[02:36] <wgrant> So
[02:36] <wgrant> We need a solution this decade
[02:36] <wgrant> => logstash is not an option
[02:37] <wgrant> gzip seeking is expensive.
[02:37] <wgrant> So we can't just seek to the end and see what is there.
[02:38] <lifeless> so these are problems not constraints.
[02:38] <lifeless> they are obvious, the constraints aren't.
[02:38] <lifeless> when do we need it fixed.
[02:39] <lifeless> Does it have cpu limits? latency limits?
[02:39] <lifeless> imagine you're telling a particularly annoying dev about the problem for the first time :)
[02:39] <wgrant> We need to be able to progressively parse logs
[02:40] <wgrant> Ideally without gunzipping 50GB of data we've already read
[02:40] <wgrant> Every time
[02:40] <spm> wgrant: I've done ... similar, but if I understand this code I'm looking at correctly, basically pull in a buffer of gzipped data, and pass that to gzread to decode and give back.
[02:41] <spm> of course, this is C, and doing it the hard way is expected.
[02:41] <lifeless> wgrant: well, that seems unsubstantiated; really not trying to be difficult. but "why not" ?
[02:41] <wgrant> notsureifserious
[02:41] <lifeless> am
[02:41] <lifeless> also you can fingerprint a compressed file trivially
[02:41] <lifeless> so not sure why you aren't proposing that
[02:42] <lifeless> as a short term stop gap at least
[02:42] <wgrant> We could store and check the CRC, but that's probably only marginally better than the filesize here, and it involves a schema change.
[02:43] <lifeless> is that a constraint?
[02:43] <wgrant> It's a very undesirable thing, as it means I can't fix it before lunch.
[02:44] <lifeless> anyhow, short term hack - just store the gzip file length.
[02:44] <lifeless> if the files are append only
[02:44] <wgrant> How's that better than my initial suggestion?
[02:45] <wgrant> It corrupts the database
[02:45] <wgrant> And requires SQL to unbreak the history.
[02:46] <lifeless> this seems like a fraught discussion
[02:46] <lifeless> I'd rather move it to voice, or pass; We're trying to solve the same stuff, so the contention is wholly unneeded.
[02:46] <wgrant> Storing the bad size seems strictly worse than changing the check -- I can't see any benefit.
[02:46] <wgrant> Can you explain?
[02:47] <lifeless> you want to optimise handling of files, avoiding rereads; I'd do that by fingerprinting the primary source of data, not by depending on a documented-as-broken field.
[02:48] <wgrant> Oh
[02:48] <wgrant> By "gzip file length" you mean the length of the compressed file?
[02:48] <lifeless> yes
[02:48] <wgrant> Not the file length from gzip.
[02:48] <lifeless> right
[02:48] <wgrant> Can't do that, since we need to know where to seek to within the file.
[02:48] <wgrant> So we know which line to start at.
[02:48] <lifeless> yes, it would then be different for existing things in the db, but depending on how you handle repeated reads that might not be an issue.
[02:48] <wgrant> So that requires a schema change too
[02:49] <lifeless> personally, I'd do the schema change; yes it means another day or two to get the fix deployed, but its much less likely to bite you in the bum later.
[02:49] <lifeless> alternatively, use memcache to store out-of-file fingerprints and have fallback code for when that's missing, but - gnnngh, I don't really like that approach except perhaps as a stopgap.
[02:50] <wgrant> memcache is only sensible when it's not hideously expensive to recalculate.
[02:50] <wgrant> In this case we store logs for months/years
[02:50] <wgrant> Worth hundreds of gigabytes
[02:50] <wgrant> So the loss of the data is a huge problem.
[02:51] <lifeless> what seems odd to me is that we have code even attempting to process old logs; theres no high water mark ?
[02:51] <lifeless> if there isn't we have quadratic overheads anyhow
[02:51] <wgrant> It's linear over a couple of thousand files and DB rows in a single run.
[02:52] <wgrant> There's no watermark because we don't assume anything about the filenames.
[02:52] <lifeless> if you want to do a quickiefixie, I've no objection. But I do think its worth considering how to make it deal with a couple of OOM of growth.
[02:52] <wgrant> So ops can do as they please with logrotation etc.
[02:52] <lifeless> wgrant: you assume a single file can get altered though
[02:52] <StevenK> lifeless: Out of memory?
[02:52] <wgrant> lifeless: It can
[02:52] <lifeless> StevenK: order of magnitude
[02:52] <StevenK> OH! Orders of magnitude
[02:52] <wgrant> lifeless: It gets appended to throughout the day.
[02:52] <StevenK> Yeah, just got it
[02:53] <lifeless> StevenK: cool
[02:53] <wgrant> lifeless: Also, not just that it can get appended to.
[02:53] <wgrant> lifeless: We parse 10 million lines per run
[02:53] <wgrant> lifeless: Logs have well over 10 million lines.
[02:54] <wgrant> So we'd have to only store the size of the gzip file once we reached the end.
[02:54] <wgrant> Storing the physical location in the file isn't going to work, because of the trailer.
[02:55] <lifeless> sure, because thats how you're handling repeated reads (and optimising for finding-what-to-process-next).
[02:56] <wgrant> lifeless: How would you propose to do it?
[02:56] <lifeless> I'd suck stuff off of the logs as it happens and put it in a persistent queue, process it just in time from there.
[02:57] <wgrant> If I had a list of things that were good ideas, queueing up thousands of writes a second on any data store we have today would probably not be on that list.
[02:58] <lifeless> wgrant: 3K per second isn't high for amqp.
[02:58] <lifeless> wgrant: we've got several OOM headroom. Particularly for small things, which this is.
[02:58] <lifeless> well, at least two anyhow.
[02:58] <wgrant> It also prevents us from ever disabling the process, or reading historical logs.
[02:58] <lifeless> And we can shard the queues (and servers) if needed.
[02:59] <lifeless> it doesn't discard the logs, we can indeed do historical logs if needed, but they become a special case, not the general case.
[02:59] <wgrant> Anyway
[02:59] <lifeless> Disabling the process can be done a few ways; have a stateful sucker-upper; disable the consumer (and just rotate queues if we have too many 10's of GB of logs pending).
[02:59] <wgrant> That rewrite is at least 10 years away
[03:00] <wgrant> How would you do it this century?
[03:00] <wgrant> You seem to have a strong opinion that there's a better way than storing these offsets.
[03:00] <lifeless> I think that when you're re-reading append only files, storing where you got to in it is fine.
[03:01] <wgrant> 12:55:13 < lifeless> sure, because thats how you're handling repeated reads (and optimising for finding-what-to-process-next).
[03:01] <StevenK> Hmm
[03:01] <wgrant> That suggests you think it's not fine :)
[03:01] <lifeless> I think for incremental and startup performance you don't want to read the file content at all unless you have work to do, so storing a stat fingerprint along with the amount read is important.
[03:02] <wgrant> lifeless: Ah, perhaps once the number of files is about 3 orders of magnitude higher, yeah.
[03:02] <lifeless> I think that having linear growth of files to handle, which is going to jump by three in the short term, suggests that considering all files each run may have a shorter lifespan than you think.
[03:02] <wgrant> s/files/files per parser instance/
[03:02] <wgrant> lifeless: It *should* have a shorter lifespan than I think.
[03:02] <wgrant> lifeless: But it won't.
[03:03] <lifeless> so, right now, I'd do a schema patch, adding a string. I'd use bzrlibs stat finger print function to fingerprint files, and I'd set that for files where we processed to the end of the file as we saw it.
[03:04] <lifeless> I'd nag lifeless about getting near-realtime log shipping in place and prepare a cunning plan for doing it better later this year.
[03:05] <wgrant> The fingerprinting thing is completely unhelpful for the problem we've encountered here.
[03:05] <wgrant> It's an unrelated optimisation.
[03:06] <wgrant> The problem is that we have this file.
[03:06] <wgrant> We haven't completely parsed it yet.
[03:06] <wgrant> But it hasn't changed since we last tried.
[03:06] <lifeless> can you explain why that makes it unhelpful ?
[03:07] <wgrant> We can't skip based on the fingerprint check unless we only set the fingerprint once we hit EOF.
[03:07] <wgrant> Otherwise we'll skip files that we haven't completely parsed.
[03:07] <StevenK> My Any addition doesn't work. :-(
[03:07] <wgrant> And we haven't hit EOF here; that's the entire problem.
[03:07] <wgrant> So the fingerprint check is unrelated.
[03:07] <wgrant> StevenK: Oh?
[03:08] <lifeless> wgrant: I don't follow. Consider this logic:
[03:08] <StevenK> wgrant: Can you prod at http://pastebin.ubuntu.com/1074183/ after you and lifeless are done. Time for lunch.
[03:08] <lifeless> does the fingerprint match the file (no fp matches no file)
[03:10] <wgrant> StevenK: +class Any(CompoundOper):
[03:10] <wgrant> StevenK: ANY is not a compound operator
[03:10] <lifeless> if yes, open, ungzip and seek to the last recorded offset and process from there. On halt, if we finished the file, store the fingerprint we obtained at the beginning of this.
[03:10] <wgrant> StevenK: It's a function.
[03:10] <StevenK> wgrant: It's a NamedFunc ?
[03:10] <wgrant> Right
[03:11] <wgrant> lifeless: That's precisely what I described, right.
[03:12] <StevenK> wgrant: That looks better, but I think the SQL is still bong: SELECT Branch.id FROM Branch WHERE Branch.id = 77 AND Branch.information_type IN (1, 2) AND COALESCE((Branch.access_grants)&&(SELECT ARRAY_AGG(TeamParticipation.team) FROM TeamParticipation WHERE TeamParticipation.person = 243654), false) AND (Branch.access_policy) = ANY((SELECT AccessPolicyGrant.policy FROM AccessPolicyGrant JOIN TeamParticipation ON TeamParticipation.team = Acce
[03:13] <wgrant> StevenK: Where'd the array_agg go from the APG bit? If you can remove that easily, as it seems, then just say access_policy.is_in(apgs). No need for ANY
[03:14] <wgrant> lifeless: Oh. I missed the "and I'd set that for files where we processed to the end of the file as we saw it" you said earlier as I was dying of optimism poisoning from the following line.
[03:15] <wgrant> False optimism poisoning, that is.
[03:25] <lifeless> well
[03:25] <lifeless> you can call it that
[03:25] <lifeless> anyhow
[03:25] <lifeless> whats the issue with that algorithm? How will it not work ?
[03:25] <lifeless> I'd hate to be suggesting something bong
[03:26] <wgrant> With the crucial "only set on EOF bit" that I missed you saying, it should be fine.
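The algorithm agreed on above can be sketched roughly as follows. This is an illustration under assumptions, not the Launchpad parser: a plain dict stands in for the database row, `os.stat` size and mtime stand in for bzrlib's stat fingerprint, and all function names are hypothetical.

```python
import gzip
import os


def stat_fingerprint(path):
    """Cheap fingerprint of the compressed file (stand-in for bzrlib's)."""
    st = os.stat(path)
    return "%d:%d" % (st.st_size, st.st_mtime_ns)


def parse_incrementally(path, state, max_lines, handle_line):
    """Parse up to max_lines new lines; return how many were handled.

    state holds 'offset' (uncompressed bytes consumed so far) and
    'fingerprint' (set only once we reached the end of the file as we
    saw it).
    """
    fingerprint = stat_fingerprint(path)  # taken before reading anything
    if state.get("fingerprint") == fingerprint:
        return 0  # unchanged since we last hit EOF: skip without opening

    offset = state.get("offset", 0)
    handled = 0
    hit_eof = False
    with gzip.open(path, "rb") as log:
        log.seek(offset)  # decompresses and discards up to the offset
        while handled < max_lines:
            line = log.readline()
            if not line.endswith(b"\n"):
                # Empty read (EOF) or a partial last line: stop here and
                # leave any partial line for a later run.
                hit_eof = True
                break
            handle_line(line)
            offset += len(line)
            handled += 1

    state["offset"] = offset
    # The crucial part: record the fingerprint only when we reached EOF.
    state["fingerprint"] = fingerprint if hit_eof else None
    return handled
```

A file that stopped changing before it was completely parsed keeps a fingerprint of None, so later runs continue from the stored offset instead of skipping it.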
[03:26] <lifeless> there are other ways to fpr files, but the bzr one is shall we say battle tested.
[03:27] <wgrant> Heh
[04:10] <StevenK> wgrant: http://pastebin.ubuntu.com/1074238/
[04:13] <wgrant> StevenK: Indeed,
[04:18] <StevenK> wgrant: Indeed?
[04:18] <wgrant> StevenK: That looks roughly correct. Does it work?
[04:18] <StevenK> It returns []
[04:18] <StevenK> I think the public query should not be ANDed in
[04:18] <wgrant> Ah
[04:19] <wgrant> Yes
[04:19] <wgrant> That should be an or
[04:19] <wgrant> public OR artifact grant OR policy grant
[04:19] <wgrant> I meant the complex bits of the query looked correct :P
[04:19] <StevenK> Haha
[04:20] <StevenK> wgrant: Do you think returning the branch id is okay? I'm really interested if one of those three is True for branch.id == self.id
[04:20] <StevenK> Can't COALESCE against three, I guess
[04:21] <wgrant> StevenK: Why would you want to COALESCE against three?
[04:21] <wgrant> You basically want to SELECT 1 FROM Branch WHERE id = $BRANCH_ID AND $VISIBLE
[04:21] <wgrant> In this case you're selecting branch.id, but that's just as good
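The shape of that query (public OR artifact grant OR policy grant, wrapped in an existence check) can be sketched with the standard library's sqlite3 rather than Storm and PostgreSQL. The schema below is a simplified, hypothetical stand-in for the real Launchpad tables, and it uses IN and EXISTS subqueries instead of arrays and ANY.

```python
import sqlite3


def make_demo_db():
    """Build an in-memory database with a toy, hypothetical schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE branch (id INTEGER, information_type INTEGER,
                             access_policy INTEGER);
        CREATE TABLE teamparticipation (team INTEGER, person INTEGER);
        CREATE TABLE artifact_grant (branch INTEGER, grantee INTEGER);
        CREATE TABLE policy_grant (policy INTEGER, grantee INTEGER);
        -- Branch 1 is public (types 1 and 2); 2 and 3 are private.
        INSERT INTO branch VALUES (1, 1, NULL), (2, 5, 10), (3, 5, 11);
        -- Person 7 is in team 100; everyone participates in themselves.
        INSERT INTO teamparticipation VALUES (100, 7), (7, 7), (8, 8);
        INSERT INTO artifact_grant VALUES (3, 100);
        INSERT INTO policy_grant VALUES (10, 8);
    """)
    return conn


VISIBLE_SQL = """
    SELECT 1 FROM branch
    WHERE branch.id = :branch AND (
        branch.information_type IN (1, 2)
        OR EXISTS (
            SELECT 1 FROM artifact_grant
            JOIN teamparticipation
              ON teamparticipation.team = artifact_grant.grantee
            WHERE artifact_grant.branch = branch.id
              AND teamparticipation.person = :person)
        OR branch.access_policy IN (
            SELECT policy_grant.policy FROM policy_grant
            JOIN teamparticipation
              ON teamparticipation.team = policy_grant.grantee
            WHERE teamparticipation.person = :person)
    )
"""


def user_can_view(conn, branch_id, person_id):
    """True if any of the three visibility conditions holds."""
    row = conn.execute(
        VISIBLE_SQL, {"branch": branch_id, "person": person_id}).fetchone()
    return row is not None
```

Only the existence of a row matters here, which matches the point that it is irrelevant what the query actually selects.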
[04:22] <StevenK> Hm, still returns []
[04:23] <StevenK> Oh, but it's supposed to. I think this is the pre-subscribe check
[04:23] <wgrant> What's it meant to return?
[04:23] <wgrant> Ah
[04:24] <StevenK> wgrant: So my next question is I want something more elegant than "return Store.of(self).find((Branch.id,), conds) == [self.id]
[04:24] <StevenK> With an ending double quote
[04:25] <StevenK> And putting a list around the store, but handwave
[04:26] <wgrant> StevenK: More like 'return not store.find(1, Branch.id == self.id, $VISIBLE_FILTER).is_empty()'
[04:29] <wgrant> (Storm might not like the 1 very much, but it doesn't at all matter what it is)
[04:30] <StevenK> (1,), even?
[04:30] <StevenK> AttributeError: 'int' object has no attribute '__dict__'
[04:30] <wgrant> Yeah, it's probably trying to autotable it
[04:30] <wgrant> Which won't work well
[04:31] <StevenK> Same error with (1,) :-(
[04:31] <wgrant> Might as well just say Branch, then
[04:32] <StevenK> Yeah, we don't give a toss about what it returns
[04:40] <StevenK> wgrant: http://pastebin.ubuntu.com/1074270/
[04:41] <wgrant> StevenK: Does it work?
[04:41] <StevenK> wgrant: Nope
[04:42] <StevenK> Returns 0 rows before subscription which is fine, and 0 rows after which isn't.
[04:42] <wgrant> Have you checked the contents of AAG?
[04:48] <StevenK> No AAG created :-(
[04:49] <wgrant> That would be the issue. Missing FF?
[04:49] <StevenK> IBranch.subscribe() does not use a FF to guard
[04:50] <StevenK> Hm, maybe the sharing service still thinks subscription confers visibility
[04:53] <StevenK> Sort of. getUtility(IAllBranches).visibleByUser() needs porting
[04:53] <StevenK> Which I'm trying to work out without my brain dribbling out of my ears.
[05:14] <StevenK> wgrant: http://pastebin.ubuntu.com/1074301/
[05:16] <wgrant> StevenK: You'll want to remove the helper methods that you've eliminated references to, but indeed.
[05:17] <wgrant> StevenK: It may be worth seeing how the query looks if you reimplement the Branch method in terms of BranchCollection, as I did with Bug.userCanView
[05:21] <StevenK> % bzr damage
[05:21] <StevenK> Using submit branch /home/steven/launchpad/lp-branches/devel
[05:21] <StevenK> Healing: 4 lines of code
[05:21] <StevenK> I approve
[05:27] <StevenK> Hm, neat. permission denied on APG
[07:03] <wgrant> stub: Hi, I've got a couple of DB reviews up for you when you have some time
[07:06] <huwshimi> This is the appropriate way to run a single test right? 'testr run -- -t lib/lp/bugs/tests/bug.py'
[07:07] <huwshimi> It should fail, but it's passing
[07:08] <stub> wgrant: k
[07:08] <wgrant> huwshimi: For Python modules you want to specify something like '-t lp.bugs.tests.bug'
[07:08] <wgrant> huwshimi: But lp.bugs.tests.bug doesn't have any tests in it. It's just a helper module.
[07:08] <wgrant> Perhaps you mean lp.bugs.tests.test_bug
[07:10] <huwshimi> wgrant: Oh, I see where I've gone horribly wrong. Thanks.
[07:16] <huwshimi> I've changed a table to a list and in lib/lp/bugs/tests/bug.py the print_bugfilters_portlet_unfilled() helper wants to grab the table and print its contents with print_table(). Any suggestions on how to print it now that it's a ul?
[07:17] <StevenK> print_table is probably defined somewhere in that test
[07:18] <huwshimi> StevenK: It's defined in lp.testing.pages
[07:18] <huwshimi> StevenK: And there doesn't seem to be print_ul or anything
[07:19] <StevenK> huwshimi: BeautifulSoup?
[07:19] <huwshimi> StevenK: You mean, just print out the contents?
[07:21] <StevenK> huwshimi: You could use a doctest matcher, but no, I don't mean just print out the contents.
[07:22] <StevenK> huwshimi: BeautifulSoup will parse the output of print_bugfilters_portlet_unfilled() and you can then inspect it with asserts.
[07:23] <wgrant> huwshimi: This is in a doctest, right?
[07:23] <huwshimi> wgrant: No, this is in the helper
[07:24] <wgrant> huwshimi: Well, yes, but it's probably eventually used by a doctest. The function prints it out, so you don't want to directly make assertions. I'd try the usual extract_text(find_tag_by_id(...)) pattern, see if that yields usable output.
[07:26] <huwshimi> I see so I just need to get it to extract the same data in the same format as the table printer was doing. I think I'm on it.
[07:27] <wgrant> Right, hopefully that's easy enough.
[07:27] <wgrant> If you can't get it exactly the same, get something that looks sensible and change the callsites :)
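As a rough illustration of printing a list's contents the way a table printer would, here is a sketch that uses only the standard library's html.parser rather than the lp.testing.pages helpers mentioned above. The `print_ul` name and the `bugfilters` id are hypothetical, and nested lists are not handled.

```python
from html.parser import HTMLParser


class _ListItemCollector(HTMLParser):
    """Collect the whitespace-normalised text of each <li> in one <ul>."""

    def __init__(self, list_id):
        super().__init__()
        self.list_id = list_id
        self.items = []
        self._in_list = False
        self._current = None  # text chunks of the <li> being read

    def handle_starttag(self, tag, attrs):
        if tag == "ul" and dict(attrs).get("id") == self.list_id:
            self._in_list = True
        elif self._in_list and tag == "li":
            self._current = []

    def handle_endtag(self, tag):
        if tag == "li" and self._current is not None:
            text = " ".join("".join(self._current).split())
            self.items.append(text)
            self._current = None
        elif tag == "ul":
            self._in_list = False

    def handle_data(self, data):
        if self._current is not None:
            self._current.append(data)


def extract_list_items(html, list_id):
    """Return the text of every item in the <ul> with the given id."""
    collector = _ListItemCollector(list_id)
    collector.feed(html)
    return collector.items


def print_ul(html, list_id):
    """Print one list item per line, for use in doctest-style output."""
    for item in extract_list_items(html, list_id):
        print(item)
```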
[07:29] <huwshimi> Thanks :)
[07:53] <huwshimi> Well, that wasn't too bad
[07:55] <adeuring> good morning
[08:11] <wgrant> :(
[08:11] <wgrant> postgres upsets me
[08:11] <wgrant> It won't use an index including 'foo IS NULL' to satisfy 'foo IS NOT NULL';
[08:18] <wgrant> czajkowski: Ew
[08:19] <czajkowski> wgrant: morning to you too
[08:19] <wgrant> Not really. But we like OpenStack, so we might be able to arrange something with a sufficient degree of evil.
[08:19] <wgrant> Indeed, morning.
[08:19] <czajkowski> wgrant: wrong channel
[08:20] <wgrant> IMO it belongs here, but I guess it's arguable.
[08:20] <czajkowski> it's early
[08:21] <lifeless> wgrant: whats the specific index its not using ?
[08:22] <wgrant> lifeless: bugsummary(distribution, sourcepackagename, tag IS NULL) WHERE distribution IS NOT NULL
[08:22] <wgrant> lifeless: It'll use that for a tag IS NULL query
[08:22] <wgrant> And if I s/tag IS NULL/tag IS NOT NULL/ on both it'll work
[08:24] <lifeless> tag is not null where distribution is null ?
[08:25] <wgrant> lifeless: Confused.
[08:25] <lifeless> wgrant: may be a stats thing not an index thing; e.g. many rows predicted
[08:25] <wgrant> lifeless: That's pretty much disproven by the fact that inverting the last element of the index causes the query to use it.
[08:25] <lifeless> oh hangon.
[08:26] <lifeless> wgrant: you say:
[08:26] <lifeless> 20:22 < wgrant> lifeless: It'll use that for a tag IS NULL query
[08:26] <lifeless> 20:22 < wgrant> And if I s/tag IS NULL/tag IS NOT NULL/ on both it'll work
[08:26] <lifeless> so both ways around work.
[08:26] <lifeless> Or you mistyped somewhere.
[08:26] <wgrant> Yes. Both ways work.
[08:26] <jml> hello
[08:26] <wgrant> But only with the matching index.
[08:26] <lifeless> so
[08:26] <wgrant> But both are satisfiable from either.
[08:26] <lifeless> bugsummary(distribution, sourcepackagename, tag IS NULL) WHERE distribution IS NOT NULL looking for 'tag IS NOT NULL' is what doesn't work ?
[08:27] <wgrant> Yes.
[08:27] <lifeless> wgrant: I presume you have a distribution in the query as well ?
[08:27] <wgrant> lifeless: Naturally.
[08:27] <wgrant> Otherwise it wouldn't use that index :)
[08:28] <lifeless> well, as it isn't using the index, I felt I should check.
[08:28] <lifeless> :>
[08:28] <lifeless> jml: elloh
[08:28] <wgrant> jml: Re. your email, that excludes test deps, I guess?
[08:28] <lifeless> wgrant: so why index tag IS NULL; looking for the specific sub summary?
[08:29] <wgrant> lifeless: We currently maintain separate indices on tag IS NULL and tag IS NOT NULL
[08:29] <wgrant> Was trying to avoid that.
[08:29] <lifeless> hmm, I forget the detail
[08:29] <lifeless> have you tried DISTINCT FROM ?
[08:30] <lifeless> wgrant: and NOT tag IS NULL ?
[08:30] <jml> wgrant: yes, the output. the script is just a Pythonically aware grep for 'import'
[08:31] <jml> wgrant: I'm probably going to buff it up so it recurses directories (or maybe packages)
[08:32] <jml> and ignores internal imports
[08:33] <jml> and then maybe consults some kind of db to figure out which things are modules and which aren't
[08:34] <jml> but, you know
[08:36] <wgrant> lifeless: NOT (tag IS NULL) has the same behaviour as tag IS NOT NULL
[08:36] <wgrant> Which doesn't really make much sense.
[08:36] <stub> wgrant: It never will, because an IS NULL index contains no information about rows without a null in that column.
[08:36] <wgrant> stub: Not a partial on tag IS NULL
[08:36] <wgrant> stub: But an index with the boolean "tag IS NULL" as one of its elements.
[08:37] <lifeless> computed indices are evil
[08:37] <lifeless> wgrant: why doesn't NOT (tag is NULL) behaving the same as tag IS NOT NULL, not make sense ?
[08:38] <wgrant> lifeless: While you're here, what do you think about soren's question in #launchpad? We'd need to set archive.signing_key on the new PPA fairly soon after creation.
[08:38] <wgrant> lifeless: Well, it was possible that it didn't consider IS NULL and IS NOT NULL to be inversions of the same operator.
[08:38] <wgrant> But it clearly does.
[08:39] <lifeless> I was hoping it might :)
[08:39] <lifeless> what about DISTINCT FROM ?
[08:40] <lifeless> uhm, so I think doing what soren asks would be good; be better to automate it (at least to the extent of sysadmin only apis).
[08:40] <lifeless> or something.
[08:41] <lifeless> I'm not a huge fan of manual fiddling, but this is a case where our model makes decisions more stressful for users, its largely incumbent on us to deal with the side effects.
[08:41] <wgrant> Exactly.
[08:42] <wgrant> Although APIing this sounds beyond perilous.
[08:43] <stub> I think NOT( ) has to assume the contents of the parenthesis is true, false or null. IS NOT NULL is a single operator and it is known it returns true or false.
[08:44] <stub> wgrant: You go into detail about bugsummary journaling performance, but isn't that irrelevant for writes atm?
[08:45] <wgrant> stub: Not at all. It's the most expensive bit of big bug updates apart from the usual Storm n+1 query thing.
[08:46] <stub> wgrant: Yes, and are big bug updates a performance problem?
[08:46] <wgrant> stub: We have about 3 timeout bugs about them.
[08:46] <wgrant> eg. approving Linux SRUs
[08:46] <stub> Do you know what sort of number of bugs are being done in a batch?
[08:46] <wgrant> Can create about 50 tasks in one transaction, because the kernel team are insane.
[08:47] <wgrant> We end up with bugs with 80ish tasks, 2/3 of them created in one transaction.
[08:47] <wgrant> If the bug has 8 tags, which is fairly common, that's a lot of journal rows.
[08:47] <wgrant> The last task will unjournal 79*8*afew rows, journal 80*8*afew rows.
[08:47] <stub> Hmm... So I'm not against performance improvements, just improving performance doesn't really solve scalability problems.
[08:48] <wgrant> It stops bulk task addition from being quadratic.
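The quadratic growth can be made concrete with a small calculation. It assumes that adding the k-th task unjournals rows for the previous k-1 tasks and journals rows for all k, once per tag, as in the 79*8 and 80*8 figures above. The unspecified "afew" factor is left as a placeholder parameter defaulting to 1, and every task is treated as added one at a time.

```python
def journal_rows_for_task(k, tags, rows_per_pair=1):
    """Journal rows written when the k-th task is added to a bug.

    rows_per_pair is a placeholder for the unspecified "afew" factor.
    """
    unjournalled = (k - 1) * tags * rows_per_pair
    journalled = k * tags * rows_per_pair
    return unjournalled + journalled


def total_journal_rows(tasks, tags, rows_per_pair=1):
    """Total journal rows written while adding tasks one at a time."""
    return sum(
        journal_rows_for_task(k, tags, rows_per_pair)
        for k in range(1, tasks + 1))
```

Because the sum of (2k - 1) for k from 1 to n is n squared, the total is tags * rows_per_pair * n**2, so doubling the number of tasks quadruples the journal rows under this model.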
[08:48] <stub> We can fix it so creating 80 tasks works on our hardware in a single transaction, then some sod will try and create 81.
[08:51] <stub> And things like dropping foreign key constraints mean we should have DELETE triggers too (they are there already I think?)
[08:51] <wgrant> stub: Well, the FKs really shouldn't have been on these tables at all. They actually cause correctness bugs.
[08:52] <stub> We normally don't bother with them in this situation. Can't recall why we had them.
[08:52] <stub> I'm wondering if we rely on ON DELETE CASCADE behaviour somewhere.
[08:52] <wgrant> We don't.
[08:52] <wgrant> Even if we did, it would be wrong.
[08:53] <wgrant> Since the deletion process would also remove the references in the original tables.
[08:53] <wgrant> Which would create journal entries.
[08:53] <wgrant> So you'd end up with negative counts in bugsummary after journal rollup.
[08:54] <wgrant> Negative counts are bad :)
[08:54] <wgrant> Though we have some :/
[08:55] <wgrant> 289 bugsummaryrows with count < 0 :/
[08:56] <stub> That would be a logic flaw somewhere in the triggers, wouldn't it?
[08:57] <wgrant> There's a lot of corruption in Ubuntu's kernel bugs. I've identified a couple of bugs in the initial population code that cause some of them, but it doesn't explain all of it.
[08:57] <wgrant> The triggers themselves weren't significantly buggy AFAICT. Just the initial population query.
[08:58] <wgrant> lifeless: Given https://pastebin.canonical.com/69406/  I propose https://pastebin.canonical.com/69407/
[09:04] <stub> wgrant: I wonder if we should dump doing this in triggers entirely, but I guess that would require reliable async job processing.
[09:05] <jam> mgz: did you see that there is one more failing test on python-2.7? It looks like a doctest that is (again) overly sensitive.
[09:05] <wgrant> stub: I came close to doing that when doing BugTaskFlat, but before we can sensibly do that we need both reliable async job processing and something resembling a sensible data access layer in the app.
[09:10] <stub> r=stub
[09:13] <wgrant> stub: Thanks.
[09:14] <wgrant> stub: Other disclosure stuff (that doesn't touch the existing bugs model so directly) is using celery-based eventual consistency fairly extensively.
[09:16] <stub> Yeah, just wondering if we are investing too much in rabbit. Complex beast, and more complexities to ensure we don't drop tasks.
[09:17] <stub> Using PG as the task queue for celery could suit us much better, and I think celery supports that out of the box.
[09:18] <wgrant> Indeed, although it hasn't been problematic yet, and we're not using it for absolutely critical stuff.
[09:19] <stub> IIRC We haven't chosen it for some things because of this problem, or chosen more convoluted implementations.
[09:25] <wgrant> stub: Right, the recent port of the classical job system to celery involves a regular task to reschedule jobs that remain pending but without a rabbitmq job.
[09:25] <wgrant> We don't have pure-rabbit jobs.
[09:25] <stub> Yer, that extra complexity.
[09:25] <stub> Is that generic, or do we have to do this every time?
[09:26] <stub> Guess you get it if you use the lp job system.
[09:27] <wgrant> stub: It currently only supports branch scan jobs, I believe, but it's pretty simple to genericise. I imagine it will be once it's fully deployed and tested for branch jobs.
[09:30] <lifeless> wgrant: +1
[09:31] <wgrant> lifeless: On the signing_key SQL?
[09:31] <lifeless> yes, caveat being to follow the normal procedure :)
[09:31] <wgrant> Of course, thanks.
[10:36] <jam> lifeless: I thought we didn't run tests for code that wasn't launchpad proper. I'm getting a test failure on python-2.7 inside the launchpadlib test suite (in the egg)
[10:43] <jam> and the plot thickens. As near as I can tell, you run the lplib tests by running "python setup.py test", but that suite doesn't load the doctests... so they are only being run by launchpad bin/test, and it is in an egg, so we need a new version of launchpadlib, a new release, and then updating the versions.cfg file
[10:43] <jam> bac: I noticed you are prominent in launchpadlib development. Can I ask you some questions?
[10:44] <czajkowski> jam: dont think he's on today
[10:44] <jam> czajkowski: thanks
[10:44] <jam> I imagine neither is benji, or anyone else in the US
[10:44] <czajkowski> jam: as is most of USA
[10:53] <wgrant> jam: Some launchpadlib tests only run as part of the Launchpad test suite, since they need to test against Launchpad.
[10:54] <Bert_2> Hi, I'm about to do some changes to my newly rockfuel'ed launchpad, now I am to make some config changes to prevent launchpad from going into dev mode, I was hoping to use this page: https://dev.launchpad.net/WorkingWithProductionConfigs but while it says it describes editing configuration it actually doesn't seem to explain anything else apart from repo and branch stuff (which are not as important as getting out of devmode), so help ?
[10:57] <jam> wgrant: so if I have a fix for a launchpadlib test, I have to get it committed into launchpadlib trunk, then officially packaged and uploaded, then update versions.cfg, right?
[10:58] <jam> but I can't run that test outside of launchpad itself...
[10:58] <jam> wgrant: do you just hack the code in the egg directory directly?
[10:59] <wgrant> jam: You're correct on all counts.
[11:00] <jam>  /cry
[11:02] <wgrant> Bert_2: That page mostly documents how Canonical LP engineers should work with our production configs. The closest thing to a public production config example is the old configs from before the open-sourcing, in r8666. Are you aware of the trademark and copyright restrictions that apply to the Launchpad name and images?
[11:03] <Bert_2> wgrant: I'm not fully aware of that no, I thought I should only remove the launchpad logo
[11:04] <wgrant> https://dev.launchpad.net/LaunchpadLicense
[11:04] <Bert_2> k, so basically if I lose the logo and icons and ask permission to use the name launchpad I'm fine ?
[11:05] <jam> and the hole deepens. The specific contents of the config file is set by lazr.restful.authorize_auth.OAuthAuthorizer...
[11:05] <wgrant> Bert_2: You'd be best off changing the name.
[11:05] <wgrant> jam: You need to change the config file?
[11:05] <jam> wgrant: we have a doc test that asserts the exact bytes of the content
[11:06] <jam> and in py2.7 it changed 2 things (a blank entry now has a trailing space, and the order is changed)
[11:06] <jam> I'm guessing the ordering is defined by iter(dict)
[11:06] <jam> and that 2.7 just does it differently.
[11:06] <jam> It was a very brittle test, waiting to break
[11:07] <Bert_2> wgrant: and can that be done through the configfile or am I forced to change the name by hand in every python file ?
[11:07] <wgrant> jam: Oh, lovely. Can you just change the test to be a bit more flexible?
[11:08] <jam> wgrant: I was thinking to put a "sorted()" around it, yeah.
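A minimal sketch of the order-insensitive check jam describes, assuming Python 3's `configparser` rather than the Python 2 `ConfigParser` module named in the log. The section name, the keys, and the helper `normalized_items` are hypothetical; the real doctest and the file written by lazr.restful are not shown here.

```python
import configparser


def normalized_items(config_text, section):
    """Return the section's (key, value) pairs sorted, with values stripped.

    Sorting removes any dependence on dict iteration order, and stripping
    removes the trailing-space difference on blank entries.
    """
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    return sorted((key, value.strip()) for key, value in parser.items(section))


# Hypothetical outputs: same data, different key order, and a blank entry
# that gains a trailing space in the second version.
old_output = "[1]\nconsumer_key = app\nconsumer_secret =\naccess_token = tok\n"
new_output = "[1]\naccess_token = tok\nconsumer_secret = \nconsumer_key = app\n"

# A byte-for-byte comparison fails, the normalized comparison does not.
assert old_output != new_output
assert normalized_items(old_output, "1") == normalized_items(new_output, "1")
```

Comparing parsed, sorted pairs instead of raw bytes keeps the test about the stored values rather than about how a particular Python version happens to serialize them.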
[11:08] <wgrant> Bert_2: There's no config option. The one other production instance I know of basically ran sed over all the python files and templates.
[11:08] <jam> I'm just struggling to even run it in isolation :). I think I at least have a plan of attack.
[11:08] <wgrant> jam: Heh
[11:09] <Bert_2> wgrant: alright, that can be done, and how about going from development to production mode, how's that done, cause the wikipage isn't all that clear :S
[11:10] <wgrant> Bert_2: You'll see the existing public configs under configs/
[11:10] <wgrant> You select a config to use with the LPCONFIG environment variable. It defaults to 'development'.
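The selection rule wgrant states can be sketched as follows. This only illustrates "read LPCONFIG, fall back to 'development'" and is not Launchpad's actual lookup code; the helper names and the instance name `myprod` are hypothetical, and the `configs/` directory comes from wgrant's previous message.

```python
import os


def selected_config(environ=None):
    """Return the config name chosen via LPCONFIG, defaulting to 'development'."""
    env = os.environ if environ is None else environ
    return env.get("LPCONFIG", "development")


def config_dir(environ=None):
    """Return the relative directory holding the selected config."""
    return os.path.join("configs", selected_config(environ))


# With LPCONFIG unset, the development config is used.
assert selected_config({}) == "development"
# Setting LPCONFIG picks another directory under configs/.
assert config_dir({"LPCONFIG": "myprod"}) == os.path.join("configs", "myprod")
```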
[11:10] <jam> wgrant: so I think I'm going to monkey patch ConfigParser.SafeConfigParser in part of a doc test. That sounds sane, right? :)
[11:10] <cjwatson> Any objection to me updating the copyright year in LaunchpadLicense?
[11:10] <wgrant> cjwatson: I was going to do that myself, but feel free :)
[11:11] <cjwatson> Done
[11:11] <jam> (no monkeys will be patched, I swear)
[11:11] <wgrant> Bert_2: I'd copy the development config, and fix up the paths and domains. In launchpad.conf of your new config, remove the 'devmode on' line. Most of the settings are in launchpad-lazr.conf.
[11:11] <wgrant> jam: If it works...
[11:12] <Bert_2> wgrant: alright, so if I change paths, domains, the devmode line and then the usual email addresses and numbers for question deprecation etc. I'm fine ?
[11:14] <ivory> gmb: could you take a look at this? https://code.launchpad.net/~ivo-kracht/launchpad/bug-921901/+merge/113234
[11:15] <gmb> ivory, Sure - though the topic's out of date :)
[11:18] <wgrant> Bert_2: If you want it to send email you'll also have to reconfigure a couple of things. zcml/package-includes/mail-configure-normal.zcml redirects all email to root@localhost, and you might want to enable the immediate_mail.send_email in launchpad-lazr.conf as well. You might also want to do away with the sampledata, and instead create a clean DB using the scripts in https://code.launchpad.net/~wgrant/launchpad/bootstrap-db-from-scratch
[11:19] <Bert_2> wgrant: awesome, I thought going out of devmode would give me a clean db, so thx for saving me the frustration :D
[11:27] <wgrant> Bert_2: Changing an appserver configuration file fortunately won't erase your database :)
[11:43] <Bert_2> wgrant: I thought the database was generated by a python script that read the config, but I didn't really take the time to read the source code. I've barely been out of exams and this development platform is rather important for our student body
[11:52] <jam> wf
[11:52] <jam> wgrant: do you have any experience rolling out an updated EC2 AMI?
[11:56] <jam> I found EC2Test on the wiki, I'll try following that one
[11:58] <wgrant> jam: Why do you need to?
[11:58] <wgrant> But yes, that page is correct.
[11:59] <jam> wgrant: we want to get the test suite running w/ python-2.7
[11:59] <jam> so that we can get production moved to precise afterwards
[11:59] <wgrant> jam: Not until we've scheduled it on production we don't.
[11:59] <jam> wgrant: I'm working with flacoste to do... 1) upgrade db-devel to run py2.7 along with ec2-test running py 2.7
[11:59] <jam> and then leave devel running 2.6
[12:00] <jam> once we have both passing
[12:00] <jam> then we can upgrade prod
[12:00] <jam> but we can't upgrade prod until we know it works, and have a clean test suite.
[12:01] <jam> he wants ec2-test to fail early rather than going into testfix on db-devel, and buildbot will run on devel w/ 2.6 before it goes to qastaging
[12:01] <wgrant> I'm not sure it's a great idea to upgrade ec2 unless we know we're going to be upgrading production within a very few weeks, or we're going to be in trouble when people start using 2.7isms and breaking buildbot.
[12:01] <jam> wgrant: so we need a migration path, which I was asked to work on. Certainly each step gets reviewed, etc.
[12:02] <wgrant> Sure.
[12:02] <wgrant> Let's not just actually deploy the 2.7 image to everyone until we have a schedule worked out.
[13:06] <flacoste> wgrant: the plan is to upgrade quickly, really not more than a few weeks
[13:30] <rick_h_> adeuring: ivory https://plus.google.com/hangouts/_/3dd64d7a6ec2e029e3e9abfc99a72a95ff0bef6c?authuser=0&hl=en-US
[14:07] <jam> mgz: did you see my python-upgrade bug? I submitted a fix for it to launchpadlib
[14:08] <mgz> I saw the mp
[14:08] <jam>  it will take a bit to finish that one off, but that was the last failure after manually merging your branches.
[14:08] <jam> mgz: did they all land in trunk?
[14:08] <mgz> guess I can review that? not sure what the funny lp reviews system applies to
[14:09] <mgz> ^jam, yup landed now and tickets moved horizontally with lp2kanban
[14:54] <jam> mgz: what is the approval process for launchpadlib code?
[14:54] <mgz> jam: no idea, my experience has been "bug someone after mp has been forgotten by everyone"
[14:54] <mgz> or was that a lazr thing...
[14:55] <jam> mgz: I think it is a more-than-one-projects thing
[14:55] <jam> but everyone I've seen in launchpadlib on lp's "active in this project" is asleep or on holiday. so I might try later.
[14:55] <jam> eg, tomorrow
[14:55] <bac> jam: have a look at https://dev.launchpad.net/HackingLazrLibraries
[14:56] <jam>  /wave bac, not doing July-4 stuff yet?
[14:56] <bac> jam: it has good stuff, especially if you need to make a release
[14:56] <bac> bah, our founding fathers would be embarrassed by our slothful coworkers.  :)
[14:56] <mgz> we probably don't need a release, if we can just bump the lp dep to a specific revno
[14:57] <jam> mgz: it is an egg, not a sourcecode
[14:57]  * mgz is not sure if that's a done thing though
[14:58] <jam> bac: so who actually commits stuff to launchpadlib trunk?
[14:59] <bac> jam: i think the short answer is 1) make a dev branch, don't work in your copy of trunk, 2) make a MP and get it reviewed, 3)  merge to trunk and commit with a good message including [r=reviewer], 4) push.  you'll then need to decide on whether to make a release, create an egg, and bump versions.cfg in a LP branch.  probably best to just read that long wiki page, now that i think about it.
[14:59] <jam> (a member of the lazr-developers team)
[14:59] <jam> bac: oh, I know all about that stuff :)
[14:59] <jam> I've done 1-3
[14:59] <jam> "merge to trunk", you just do it directly?
[14:59] <jam> or it goes through pqm (the page says pqm)
[14:59] <bac> jam: yep
[14:59] <bac> no bots
[15:00] <bac> oh really?  ok, my memory is faulty then
[15:00]  * bac defers to the wiki
[15:00] <jam> I'm not sure I'm in lazr-developers
[15:00] <jam> I'm in launchpad
[15:00] <jam> (launchpad-dev? whatever that group is)
[15:02] <jam> bac: looks like I can commit straight to lp:launchpadlib
[15:02] <bac> jam, you should be in this group, indirectly: https://launchpad.net/~lazr-developers
[15:02] <bac> yep
[15:03] <jam> bac: it looks like 'bin/test' doesn't run the doctests either. They only run via "bin/test" in launchpad itself, not in launchpadlib
[15:04] <bac> yes, iirc getting the test to run is done via a LP branch
[15:08] <ivory> abentley: could you review this for me? https://code.launchpad.net/~ivo-kracht/launchpad/bug-562532/+merge/113414
[15:08] <abentley> ivory: sure.
[15:09] <abentley> ivory: How does that remove a page request?
[15:10] <ivory> abentley: all the other edit buttons are sprite icons, only those were <img> tags
[15:13] <abentley> ivory: I'm used to sprites looking like '<span class="add sprite">'.  Why is this one different?
[15:14] <rick_h_> abentley: it's removing the request to: https://launchpad.net/@@/edit via the src of the <img> tag. and referencing the sprite in the CSS.
[15:16] <abentley> rick_h_: ivory already answered that.  Do you know what's up with this <button class="lazr-btn yui3-activator-act">?  I'm used to sprites looking like '<span class="add sprite">'.
[15:16] <jam> bac: I cannot register with pypi, as my user doesn't have perms. I've done the upload and changes to trunk.
[15:16] <jam> bac: do you have rights and can 'bin/buildout setup . register' for me?
[15:17] <ivory> abentley: i just copied it, i can't really tell you why it looks like that ...
[15:17] <rick_h_> abentley: ah no, I'm guessing he copied it from somewhere.
[15:17] <rick_h_> ivory: where did you grab it from?
[15:17] <ivory> rick_h_: from the other implementations of that icon
[15:18] <jam> bac: I'm leaving for end-of-day now. If you get a chance, please do. Or poke benji to do so? I'll try to address it tomorrow if it hasn't happened yet.
[15:19] <abentley> ivory: Okay, if there's no reason not to use "sprite edit", I think we should do it that way.
[15:21] <rick_h_> abentley: adeuring heads up, afk for a sec while I pick up the family from the parade. brb
[15:21] <adeuring> rick_h_: ack
[15:35] <bac> jam: ok, i'll look at it
[15:37] <czajkowski> bac: thought you weren't around today ?
[15:43] <bac> czajkowski: i decided to work today and bank a swap day.
[15:46] <beuno> how independent of you
[15:47] <bac> beuno: yeppers
[15:50] <czajkowski> bac: cool
[16:31] <ivory> abentley: wouldn't the icon be on the other side if "sprite edit" is used?
[16:34] <ivory> abentley: additionally i should mention that the <button class="lazr-btn yui3-activator-act"> is already used in this template
[16:35] <abentley> ivory: I wouldn't think so.  In any case, "lazr-btn yui3-activator-act" does not mean "edit icon" to me, and I doubt it would for most LP developers.
[17:13] <bac> jam: lplib 1.10.1 uploaded to pypi.  thanks for the fix!
[17:13] <czajkowski> bac: abentley would one of you mind looking at https://bugs.launchpad.net/launchpad/+bug/1020983  I don't know about recipes. please.
[17:13] <_mup_> Bug #1020983: Recipe default text versioning is wrong <Launchpad itself:New> < https://launchpad.net/bugs/1020983 >
[17:43] <bac> czajkowski: he may have a point in that bug.  i'd mark it triaged/low.
[17:46] <czajkowski> bac: ack
[18:03] <lifeless> jam: wgrant: pass --develop to buildout to point it at the launchpadlib source tree and you can edit it in situ - it won't use the egg.
[18:04] <abentley> czajkowski: I know about bzr, but not about packaging.  IMO, he's wrong-- the version number is unique, can't be confused with a release version number, and that's all that's required.
[18:07] <lifeless> abentley: there is another constraint you have missed, and that the bug filer hasn't called out either.
[18:08] <lifeless> abentley: but I think it's what motivates the filing; the packages produced by the recipe should version higher than the package they are derived from but lower than the next version of that package.
[18:09] <lifeless> I'm not entirely sure his proposals are much better at that, though they will handle some cases better.
[18:10] <abentley> lifeless: I think this is a distributed numbering problem that can't be completely solved.
[18:12] <abentley> lifeless: It's quite possible that a point release "should" version lower, and I think it's possible for reasonable people to disagree on which should version lower.
[18:13] <abentley> lifeless: Depending on the nature of the branch, of course.
[18:17] <lifeless> yah
[18:18] <lifeless> I guess I'm mainly advocating that we get a clear understanding of what the default intends to achieve and document that somewhere discoverable.
[18:18] <lifeless> then, if the default is mismatched with our intent we can change it, and bugs that want to change our intent we can actually reason about
[18:27] <abentley> lifeless: that makes sense.
[18:37] <abentley> lifeless: So if we grant that we want to sort later than the version of the packaging we're based on, that suggests we should use debversion, not debupstream.
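The ordering constraint under discussion can be checked with a small comparator. This is a simplified sketch of the Debian version ordering rules (epoch, then upstream version, then revision, with `~` sorting before everything), written for illustration; it is not the code Launchpad or dpkg uses. The version strings are hypothetical examples of a build based on upstream only versus a build based on the full packaging version.

```python
import re


def _order(ch):
    """Sort weight of a non-digit character: '~' first, then letters, then the rest."""
    if ch == "~":
        return -1
    if ch.isalpha():
        return ord(ch)
    return ord(ch) + 256


def _cmp_part(a, b):
    """Compare two upstream or revision strings, alternating text and number runs."""
    while a or b:
        text_a = re.match(r"\D*", a).group()
        text_b = re.match(r"\D*", b).group()
        a, b = a[len(text_a):], b[len(text_b):]
        for i in range(max(len(text_a), len(text_b))):
            # A missing character (end of run) weighs 0, so '~' sorts below it.
            wa = _order(text_a[i]) if i < len(text_a) else 0
            wb = _order(text_b[i]) if i < len(text_b) else 0
            if wa != wb:
                return -1 if wa < wb else 1
        num_a = re.match(r"\d*", a).group()
        num_b = re.match(r"\d*", b).group()
        a, b = a[len(num_a):], b[len(num_b):]
        na, nb = int(num_a or "0"), int(num_b or "0")
        if na != nb:
            return -1 if na < nb else 1
    return 0


def deb_cmp(v1, v2):
    """Return -1, 0 or 1 as v1 sorts before, equal to, or after v2."""
    parts = []
    for v in (v1, v2):
        epoch, rest = v.split(":", 1) if ":" in v else ("0", v)
        upstream, revision = rest.rsplit("-", 1) if "-" in rest else (rest, "")
        parts.append((int(epoch), upstream, revision))
    (e1, u1, r1), (e2, u2, r2) = parts
    if e1 != e2:
        return -1 if e1 < e2 else 1
    return _cmp_part(u1, u2) or _cmp_part(r1, r2)


# Built from the upstream part only: sorts *below* the packaging it came from.
assert deb_cmp("1.0-0~42", "1.0-1") < 0
# Built from the full packaging version: above 1.0-1, below the next upload 1.0-2.
assert deb_cmp("1.0-1", "1.0-1+bzr42") < 0
assert deb_cmp("1.0-1+bzr42", "1.0-2") < 0
```

With these example strings, only the variant derived from the full packaging version lands between the base package and its next revision, which is the property lifeless describes.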
[18:38] <lifeless> I agree.
[18:38] <lifeless> I wonder if we wrote our original intent down, perhaps in a doctest or something.
[18:39] <abentley> lifeless: It was me and james_w flailing around, IIRC.
[18:39] <lifeless> hah :)
[18:40] <abentley> lifeless: I believe we were using debversion originally, so there may be a related MP.
[18:44] <lifeless> or something in bzr log
[18:45] <lifeless> ok, time to prep for my joy of joy 3 1/2 hours of meetings.
[18:51] <abentley> lifeless: bug 652109 inspired this self-reviewed proposal: lp:~thumper/launchpad/recipe-default-version into lp:launchpad and I don't think there was any discussion of whether debversion would be more appropriate as a default.
[18:51] <_mup_> Bug #652109: Default Recipe text should contain {debupstream} <lp-code> <qa-ok> <recipe> <Launchpad itself:Fix Released by thumper> < https://launchpad.net/bugs/652109 >
[18:52] <abentley> lifeless: sorry, proposal url: https://code.launchpad.net/~thumper/launchpad/recipe-default-version/+merge/41663
[18:52] <lifeless> abentley: cool archaeology :)
[18:53] <lifeless> so, we're doing a drunkard's walk more or less; might be time to take another step :)
[18:53] <lifeless> anyhow, I have to run - sorry.
[19:16] <jam> bac, lifeless: Unfortunately, it appears that someone
[19:17] <jam> made a launchpadlib 1.10.0 which was not landed in launchpad itself
[19:17] <jam> because it breaks the test suite
[19:17] <jam> (issues with doctests expecting a reply_token? to be valid and not None)
[19:18] <jam> so we'll need a 1.10.2 after I can sort it out. It looks like benji did the 1.10.0, I wonder if he meant to land it, but found out late it broke the doctest suite because it doesn't run via 'bin/test' in launchpadlib alone.
[19:18] <jam> lifeless: i also haven't tracked down how you specify --develop to bin/buildout, since it seems to want a command first.
[19:27] <abentley> lifeless: https://dev.launchpad.net/Code/RecipeBuildDebversioning
[19:38] <lifeless> jam: https://dev.launchpad.net/HackingLazrLibraries?highlight=(buildout:develop)
[19:38] <jam> lifeless: thanks
[19:38] <lifeless> applies to LP itself too
[19:52] <cjwatson> I noticed launchpadlib test failures somewhere a few days ago, which then went away.  Is it possible 1.10.0 landed briefly and then was reverted?
[19:52] <cjwatson> (I'm afraid I don't remember the failures.)
[21:44] <lifeless> cjwatson: is sbsign intended to be 64-bit only? what about arm...?
[22:24] <cjwatson> lifeless: not this month's problem
[22:25] <lifeless> cjwatson: is that 'no its not' ?
[22:25] <cjwatson> the interface doesn't mandate 64-bit-only, it's just that that's all that's implemented right now because that's the urgent bit
[22:25] <cjwatson> or rather amd64-only
[22:25] <lifeless> gotchya
[22:26] <cjwatson> (I didn't write sbsign, just did the integration - but it's all the same binary format for UEFI so I'd be fairly surprised if it weren't basically trivial to adjust)
[22:27] <cjwatson> I'm told Steve has a patch to make it handle i386, at least, but I didn't have that to hand when doing the QA
[22:27] <lifeless> I'm just being nosy
[22:27] <lifeless> 100% curiosity
[23:31] <StevenK> wgrant, wallyworld: https://code.launchpad.net/~stevenk/launchpad/drop-populate-branch-aag/+merge/113467
[23:57] <StevenK> wgrant: And it seems BranchCollection makes use of ClassAlias, so access_{policy,grants} might have to be added to the model.