[15:52] chris2: abentley: i think we talked past each other... i was talking about storing a new revision. i need to compute a lcs-linewise-diff against the previous revision first, no?
[15:57] abentley: chris2: Yes, in order to store a new revision, you'd need to reconstruct the previous.
[15:57] abentley: chris2: Unless you want to do it inefficiently.
[15:57] chris2: i know how to checkout a revision, i think :)
[15:58] chris2: but when i want to store a minimal delta, i need to have both contents beforehand i guess
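Editorial sketch of the step chris2 describes: with both the previous and the new contents in hand, a line-wise delta can be computed and later replayed. This is a minimal illustration, not bzr's code. It uses Python's `difflib.SequenceMatcher`, which finds matching blocks with its own heuristic rather than a strict LCS, so the delta is small but not guaranteed minimal. The names `make_delta` and `apply_delta` are hypothetical.

```python
from difflib import SequenceMatcher


def make_delta(old_lines, new_lines):
    """Return edits (start, end, replacement) that turn old_lines into new_lines.

    start/end index into old_lines; replacement is the list of new lines
    that should take the place of old_lines[start:end].
    """
    matcher = SequenceMatcher(a=old_lines, b=new_lines, autojunk=False)
    delta = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":  # only store what changed
            delta.append((i1, i2, new_lines[j1:j2]))
    return delta


def apply_delta(old_lines, delta):
    """Rebuild the new revision from the old lines plus the stored edits."""
    out = []
    pos = 0
    for start, end, replacement in delta:
        out.extend(old_lines[pos:start])  # copy the unchanged run
        out.extend(replacement)           # splice in the changed lines
        pos = end
    out.extend(old_lines[pos:])
    return out


old = ["a", "b", "c"]
new = ["a", "x", "c", "d"]
delta = make_delta(old, new)
rebuilt = apply_delta(old, delta)
```

Note that `make_delta` needs the full text of the previous revision, which is the reconstruction cost abentley points out.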
[15:59] abentley: chris2: You might be able to treat the instructions for constructing the old revision as if they were instructions for creating the new revision, correct them, and then use the corrections to generate instructions for generating the new revision. That's hypothetical.
[16:00] chris2: that's what i was playing at
[16:02] abentley: chris2: May I ask why you're interested in weavediffs?
[16:02] chris2: sure. i'm thinking about a new vcs and wondered if sccs-style weave wouldnt be a good storage for text files
[16:02] chris2: but i wanted append-only
[16:04] chris2: otoh git-style packfiles seem to solve the issue easily and more generally with brute force :)
[16:04] chris2: and i currently dont really care for merging, so...
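Editorial sketch for readers unfamiliar with the SCCS-style weave mentioned here: all revisions of a file live interleaved in one sequence, with control markers saying which revision inserted or deleted each run of lines. The representation below (tuples such as `("{", rev)` for markers, plain strings for lines) and the function name `extract` are assumptions made for illustration; this is not the on-disk format of SCCS or bzr.

```python
def extract(weave, included):
    """Return the lines of the revision whose ancestry is the set `included`.

    Markers: ("{", r) / ("}", r) open and close lines inserted by revision r,
             ("[", r) / ("]", r) open and close a range deleted by revision r.
    The whole weave is scanned once, so the cost is linear in its total size.
    """
    out = []
    insert_stack = []     # revisions of the enclosing insertion blocks
    open_deletes = set()  # included revisions whose delete range is open
    for item in weave:
        if isinstance(item, tuple):
            op, rev = item
            if op == "{":
                insert_stack.append(rev)
            elif op == "}":
                insert_stack.pop()
            elif op == "[" and rev in included:
                open_deletes.add(rev)
            elif op == "]":
                open_deletes.discard(rev)
        else:
            # A line is live if its inserting revision is included and no
            # included revision has deleted it.
            if insert_stack and insert_stack[-1] in included and not open_deletes:
                out.append(item)
    return out


# Revision 0 wrote a, b, c; revision 1 deleted b and inserted x in its place.
weave = [
    ("{", 0), "a",
    ("[", 1), "b", ("]", 1),
    ("{", 1), "x", ("}", 1),
    "c", ("}", 0),
]
rev0 = extract(weave, {0})
rev1 = extract(weave, {0, 1})
```

Because new lines are woven into the middle of the sequence, adding a revision rewrites the file, which is at odds with the append-only storage chris2 wants.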
[16:07] fullermd: Wasn't the knit format sorta reminiscent of an append-only weave?
[16:07] chris2: its the evolution of weavediff afaiu
[16:07] chris2: but i couldnt find details on how it works :)
[16:08] fullermd: Probably aren't many, outside of the code... Of course, an alternate answer would be "poorly enough that they were superseded for a reason" ;)
[16:10] chris2: needing to read the file backwards probably is not nice for rotating disks
[16:11] fullermd: Eh, for reasonably sized files, you probably just slurp up the whole thing anyway. Heck, for files up to a certain size, that probably happens between hardware and OS prefetch whether you try to or not.
[16:12] chris2: if you have one weavediff per file, yeah
[16:13] fullermd: Not so much if it's a hundred megs, to be sure. But then (unless the reconstructed file is most of that anyway) you're probably gonna hit some of the nasty weave cases trying to work with it anyway.
[16:13] chris2: a naive weave needs linear time to extract, i guess i want to avoid that anyway
[16:13] chris2: so you need to store snapshots somehow
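Editorial sketch of the snapshot idea: an append-only store that writes a full copy every few revisions and line deltas in between, so extraction replays a bounded chain of deltas instead of the whole history. The class `DeltaStore` and the `snapshot_every` parameter are hypothetical, and nothing here reflects how knits or bzr pack files are laid out. The deltas again come from `difflib.SequenceMatcher`, so they are small but not guaranteed minimal.

```python
from difflib import SequenceMatcher


class DeltaStore:
    """Append-only revision store with a full snapshot every N revisions."""

    def __init__(self, snapshot_every=4):
        self.snapshot_every = snapshot_every
        self.records = []  # entries: ("full", lines) or ("delta", edits)

    def add(self, lines):
        """Append a revision and return its number."""
        lines = list(lines)
        if len(self.records) % self.snapshot_every == 0:
            self.records.append(("full", lines))
        else:
            # Storing a delta requires reconstructing the previous revision.
            prev = self.get(len(self.records) - 1)
            ops = SequenceMatcher(a=prev, b=lines, autojunk=False).get_opcodes()
            edits = [(i1, i2, lines[j1:j2])
                     for tag, i1, i2, j1, j2 in ops if tag != "equal"]
            self.records.append(("delta", edits))
        return len(self.records) - 1

    def get(self, rev):
        """Rebuild revision `rev` from the nearest earlier snapshot."""
        base = rev - rev % self.snapshot_every
        lines = list(self.records[base][1])
        for _kind, edits in self.records[base + 1:rev + 1]:
            out, pos = [], 0
            for start, end, replacement in edits:
                out.extend(lines[pos:start])
                out.extend(replacement)
                pos = end
            out.extend(lines[pos:])
            lines = out
        return lines


store = DeltaStore(snapshot_every=3)
history = [["a"], ["a", "b"], ["a", "c", "b"], ["c", "b"], ["c", "b", "d"]]
ids = [store.add(rev) for rev in history]
```

With `snapshot_every=3`, reading any revision replays at most two deltas, at the price of storing a full copy every third revision.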
[16:16] abentley: chris2: So there were two advantages of weaves: 1. weave merging, 2. annotation. Knits provided annotation, and we actually found three-way merge worked very well most of the time. These days, with pack files, we do annotation the expensive way, but it's cheap enough if you're doing it locally.
[16:16] fullermd: Yeah, experience hasn't been kind to weaves. Seems like they have great properties in reasoning and performance for the things you hardly ever do, while being difficult and compoundingly expensive for the things you wind up doing all the time.
[16:16] chris2: i mostly have git experience. and annotation is slowish, but usually good enough
[16:16] chris2: but the store is one of the fastest and most compact i know
[16:17] chris2: and 3-way-merge works Good Enough, exactly
[16:17] fullermd: bzr knit formats were certainly a lot better than weave (my daily bzr.dev 'pull's took >20 minutes with weaves), but they still lost the race by a long way...
[16:18] chris2: so what does bzr use currently? this 2a format? is it explained somewhere?
[16:18] fullermd: It was once called something like "brisbane-core" while it was in dev. I know there were a number of docs written under that name at the time.
[16:18] abentley: fullermd: I think my weavediffs would have been better than weaves, but likely worse than knits.
[16:20] chris2: how does it compare to git packs?
[16:22] fullermd: I vaguely recall that git uses xdelta or something?
[16:22] chris2: git is sorting all blobs by size and then by date i think, and then doing xdelta chains
[16:22] abentley: fullermd: I don't think it uses an existing delta format. I remember lifeless invented it on the plane because he didn't have internet access.
[16:23] fullermd: I think bzr packs are moderately different in implementation, using something more like straight entropy coding, though there's some deltaing on top?
[16:23] fullermd: Or underneath. Whichever.
[16:23] fullermd: But that's way out of my depth. I just commit and log stuff :p
[16:24] fullermd: There may be some comments in the code giving an overview.
[16:24] chris2: i'll have a look
[16:25] fullermd: groupcompress.py is probably the place to start.
[16:26] abentley: chris2: You also might want to look at docs/groupcompress-design.txt
[16:29] chris2: very good. thanks
