=== yofel_ is now known as yofel | ||
jimis | which algorithm does bzr use to compress .pack files? | 21:52 |
---|---|---|
jimis | is it gzip/bzip2 or something? | 21:57 |
jimis | does it spawn an external process to decompress, or does it use a C module within the same process? | 21:58 |
jimis | it bogs down my CPU and I must find out why... | 21:59 |
wgz | it's zlib, in process | 22:04 |
wgz | compression just takes work, there aren't magic fixes | 22:04 |
jimis | hmmm so I guess it could be related | 22:05 |
jimis | wgz: there are fixes, not magic of course | 22:05 |
jimis | wgz: I assume the .pack files contain many small compressed chunks. About how large is each? | 22:06 |
jimis | I'm asking because I notice too much seeking too | 22:06 |
jimis | I've already filed a bug report, I'm just looking a bit more into it | 22:07 |
wgz | right, most of the obvious issues are much higher level than the compression step | 22:07 |
jimis | probably true, but if it reads through the *whole* pack files then it would take some time to decompress .5GB | 22:08 |
bob2 | ? | 22:09 |
bob2 | there's an index file next to it | 22:09 |
jimis | bob2: saw it, but still when stracing it I see too many reads, I think it's seeking through most of the file... | 22:09 |
jimis | I should measure exactly how much it reads from disk | 22:10 |
wgz | decompressing an entire pack when only part of it is needed may be a bug, that decompressing a big chunk of data takes cpu isn't :) | 22:10 |
jimis | wgz: that's why I was asking about the compression | 22:10 |
jimis | first, if it was spawning separate gzip processes then it would use a second core | 22:11 |
bob2 | except that if the decompression is the bottleneck like you say, that's of no use | 22:11 |
jimis | second, there are faster algorithms than gzip, with other tradeoffs of course | 22:11 |
bob2 | have you read the pack format docs? | 22:11 |
jimis | true have to find out how much it reads :-p | 22:11 |
wgz | I'd suggest that's the wrong end to tackle things from | 22:12 |
bob2 | +1 | 22:12 |
jimis | bob2: only a generic doc is what I found | 22:12 |
bob2 | no idea what that means | 22:12 |
jimis | pointer to the pack docs? | 22:12 |
wgz | for something like bug 1006194 I'd start by seeing if it's opening the repo twice | 22:13 |
ubot5 | Launchpad bug 1006194 in Bazaar "bzr diff too slow (cpu intensive) on large projects" [Undecided,New] https://launchpad.net/bugs/1006194 | 22:13 |
wgz | rather than assuming zlib can be improved on or trying to launch multiple workers | 22:13 |
jimis | wgz: how can I see it? | 22:13 |
jimis | and why would it be opening it twice? | 22:14 |
jimis | bob2: http://doc.bazaar.canonical.com/latest/developers/packrepo.html is the doc I found | 22:14 |
wgz | because the bug states that it's an issue when two different branches are involved | 22:15 |
jimis | but it doesn't mention anything about the compression algorithm | 22:15 |
jimis | wgz: even with the same branch but -r-2, the overhead is the same | 22:16 |
wgz | also see the mailing list archives about log performance | 22:16 |
jimis | latest quarter? | 22:17 |
wgz | jimis: eg. <https://lists.ubuntu.com/archives/bazaar/2011q4/073729.html> | 22:20 |
jimis | thanks wgz, seems relevant | 22:22 |
jimis | Even though these points may apply, bzr log -r-2 is instantaneous in my case. | 22:27 |
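
A rough way to sanity-check the "decompressing .5GB takes time" estimate from the discussion is to time in-process zlib decompression over many small chunks. The sketch below is only an illustration: `make_chunks`, `time_decompress`, the chunk size and the synthetic data are all assumptions made up for this example, and none of it reflects bzr's actual pack layout or record sizes.

```python
import time
import zlib


def make_chunks(n_chunks, chunk_size):
    """Build independently zlib-compressed blobs (hypothetical stand-ins for pack records)."""
    chunks = []
    for i in range(n_chunks):
        # Repetitive text, so it compresses far better than real source code would.
        line = "line %d of some file\n" % i
        raw = (line * (chunk_size // 20 + 1)).encode("ascii")[:chunk_size]
        chunks.append(zlib.compress(raw))
    return chunks


def time_decompress(chunks):
    """Decompress every chunk in this process; return (bytes_out, seconds)."""
    start = time.perf_counter()
    total = 0
    for blob in chunks:
        total += len(zlib.decompress(blob))
    return total, time.perf_counter() - start


if __name__ == "__main__":
    # Chunk count and size are arbitrary assumptions, not measured from a real repo.
    chunks = make_chunks(n_chunks=2000, chunk_size=64 * 1024)
    total, seconds = time_decompress(chunks)
    print("decompressed %.1f MB in %.3f s" % (total / 1e6, seconds))
```

If the measured throughput suggests that decompressing the whole pack would take far less time than `bzr diff` actually spends, that would point toward the higher-level causes wgz mentions (for example repeated work) rather than zlib itself.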