[12:21] <grimboy_uk> Hey, is there a way to get bzr branch to get the latest revision first, copy it out so it's a working copy, *then* get the history?
[12:23] <abentley> grimboy_uk: No.  That would mean fetching a bunch of data twice.
[12:26] <grimboy_uk> But it would also mean starting quicker.
[12:26] <grimboy_uk> Since sometimes history takes ages.
[12:29] <NfNitLoop> grimboy_uk: the only way to get the latest revision is to build it from the entire history, afaik.
[12:30] <NfNitLoop> so your first step would be: 1) fetch all the history, 2) make a working copy from it.
[12:30] <NfNitLoop> then your second step would be: 3) fetch all the history.
[12:30] <NfNitLoop> It works as it currently does for a reason.  :)
[12:31] <NfNitLoop> It may work differently if/when horizon branches/checkouts are implemented.
[12:31] <NfNitLoop> But my guess is that that would require a smart server.
[12:32] <abentley> grimboy_uk: Fetching partial history is something that jelmer's working on.  But fetching the working tree, and then fetching all history as one operation doesn't make sense.  It takes longer.
[12:33] <NfNitLoop> abentley: plus, not all repositories will have a working tree.
[12:33] <NfNitLoop> and you can't rely on the working tree being up to date.
[12:38] <abentley> NfNitLoop: Oh, fetching from the remote working tree would be total crack.
[12:39] <NfNitLoop> aah, ok.  :)
[12:39] <NfNitLoop> abentley: any idea how you'll end up with a working tree with a partial history, then?
[12:40] <NfNitLoop> abentley: am I right in assuming you'll either need a smart remote server, or you'll still have to fetch the whole history?
[12:40] <abentley> I was talking about fetching only the data to build the working tree from the repository, and then fetching all the data.
[12:40] <abentley> NfNitLoop: No, there's no reason you'd need a smart server.  You just fetch what you want.
[12:41] <NfNitLoop> abentley: wouldn't you possibly need the whole history of a file to rebuild it in its current state?
[12:41] <NfNitLoop> Sorry if I'm way off... I don't actually program vcs back-ends.  :)
[12:41] <abentley> No.  The storage format has periodic snapshots of the complete text.
[12:42] <NfNitLoop> Oh really.   As it is now?  Or as it will be with horizon support?
[12:42] <abentley> As it is now.
[12:42] <abentley> It's a performance thing.  We must not scale with the length of history.
[12:43] <NfNitLoop> Yeah, I wondered about that.
[12:43] <NfNitLoop> Hmm.  What's the (rough) algorithm for deciding when a new snapshot should be stored?
[12:44] <NfNitLoop> is it per file, or per the entire repo?
[12:44] <abentley> I don't know the current status.  Originally, it was every 25 revisions, but it might be "every time the size of deltas exceeds the size of the fulltext".
[12:44] <abentley> It is per file.
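The two snapshot policies abentley mentions could be sketched like this (a hypothetical illustration; the function and argument names are invented here and are not bzrlib's API):

```python
# Hypothetical sketch of the per-file snapshot policies abentley
# mentions (names invented for illustration, not bzrlib code).

def should_snapshot_fixed(revisions_since_snapshot, period=25):
    """The original policy: a new fulltext snapshot every 25 revisions."""
    return revisions_since_snapshot >= period

def should_snapshot(delta_sizes, fulltext_size):
    """The size-based policy: snapshot once the deltas accumulated since
    the last fulltext outweigh simply storing the fulltext again."""
    return sum(delta_sizes) > fulltext_size

print(should_snapshot([300, 300], 500))  # True: 600 bytes of deltas > 500
```

Either way, bounding the delta chain is what keeps text reconstruction from scaling with the length of history.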
[12:46] <abentley> As it turned out, we do scale with the length of history, due to the indexes.  Which is the problem keir is working on.
[12:54] <NfNitLoop> Hmm.  How so?   because the entire index must be read for an operation?
[12:54] <NfNitLoop> And that's O(n) and not O(log(n))?
[12:57] <abentley> NfNitLoop: Yes.
[12:58] <abentley> The new pack format is designed to make recent data fast, at the expense of making old data slow.
[12:59] <marcus> hi.  when I use a shared repository, is there anything I need to pay attention to when removing branches?  or should I just use rm -fR branch?
[12:59] <abentley> rm -R branch should do fine.
[01:00] <abentley> No need for the -f.
[01:00] <abentley> There aren't any caveats, except that removing the branch doesn't remove the revisions you've committed.
[01:01] <abentley> Just a pointer to them.
[01:01] <NfNitLoop> abentley: I was talking about this earlier with someone.   Isn't there a command that will prune your shared repo?
[01:01] <NfNitLoop> or was it a plugin?
[01:02] <abentley> Yeah, there's a plugin.
[01:02] <NfNitLoop> marcus: worst case, though, you create a new shared repo, branch your remaining branches over there, and delete your old/bloated repo.  (if its size becomes too big.)
[01:03] <NfNitLoop> but that should only happen if you're working with lots of patches that you're throwing away.  (ie: rarely)
[01:03] <NfNitLoop> lots of *big* patches.  :p
[01:13] <marcus> NfNitLoop: you mean that is semantically what the pruning plugin does?
[01:14] <marcus> thanks, guys.  I didn't check out all plugins.  It makes sense for a pruning operation to exist.
[01:15] <abentley> marcus: Yeah, it does, but only for security or if you've accidentally committed an ISO.
[01:19] <marcus> abentley: maybe I was just confused because I don't have the right mental model about internal details.  from a newbie point of view, it's more comforting if the operation that removes a branch is available through bzr, or at least documented that rm -R is safe.
[01:19] <marcus> regardless of pruning.
[01:22] <abentley> Yeah, we should probably make it clearer that rm -R is safe.
[01:24] <marcus> while I am here.  I recently imported an SVN repository, not via bzr-svn but using a different standalone tool (svn2bzr.py).  Trying to push bzr changes back to svn failed with a "no merge base revision specified" error, but I couldn't figure out how to set a merge base revision.
[01:25] <abentley> marcus: Because you didn't use bzr-svn, you can't push back.
[01:32] <abentley> Oh, netsplit.  Haven't seen that for a while.
[01:33] <Peng_> 24.2 second lag. Wheee.
[01:33] <Peng_> Usually I'm not on the wrong end of one.
[01:33] <Peng_> 46.1-second lag! 18.2.
[01:55] <igc> morning
[02:07] <lifeless> hi abentley
[02:07] <lifeless> abentley: I'm not here today :) - just whipping past
[02:08] <abentley> lifeless: Okay.  I may have a CommitBuilder.abort patch for you in a bit.
[02:34] <Basic> my osx has even stumped the gurus (for now)
[02:34] <Basic> was hoping it was something specific to my box
[02:36] <abentley> Are you the guy with the WoW?
[02:36] <Basic> no comment :-P ... yes
[02:37] <Basic> all my guild osx people are poking fun at me and how this line-x thing doesn't "work" under osx
[02:39] <Basic> will work through some of the ulimit things I saw today. will do the selftest and get with ulimit on open files at 4096 and see what happens.
[02:40] <abentley> Well, I'm the TreeTransform guru and I'm hardly stumped.
[02:41] <Basic> oh? What's John's guru-ness? :-)
[02:41] <abentley> John's guruness?  The patience diff code  and checkouts are the first things to come to mind.
[02:42] <abentley> But I am the principal creator of TreeTransform.
[02:42] <abentley> Anyhow, the first thing to do is disable TreeTransform.finalize().  It's masking the real error.
[02:44] <abentley> Just edit transform.py and make finalize return early.
[02:44] <abentley> Then rerun the operation that usually fails.
[02:47] <Basic> I'll give it a try, thank you for the advice, rid here, I'll work on it on the commute home.
[02:58] <keir> lifeless, ping!
[03:02] <Peng> Speaking of patiencediff, <Peng> Hmm, easy way to run patiencediff on a directory? python -m bzrlib.patiencediff only accepts files.
[03:02] <Peng> :(
[03:04] <igc> keir: lifeless may not be around (much) today - public holiday
[03:04] <keir> oh, i thought that was yesterday
[03:04] <keir> aah timezones
[03:15] <abentley> keir: I am serious when I say if you implement your graph index in C, we will not merge it.  We write in python first, and optimize in pyrex if justified.  Look how long it's taking your patience diff to get merged.
[03:16] <keir> i did not write patience diff
[03:17] <keir> and i'm writing it in python
[03:17] <keir> so please, chill!
[03:19] <abentley> You said you were very tempted to write it in C.  I wanted to be clear about this, because I didn't want you to write it in C and then be upset when it didn't get merged.
[03:20] <keir> there has to be a python version because of all the funky transports
[03:20] <keir> but i will write a C version for local disk, just not immediately.
[03:21] <abentley> For compiled code, I think we have a preference for Pyrex over C.
[03:22] <keir> well, the issue is that i can see this bit of code useful outside of bzr
[03:22] <abentley> And it may well be that only a small section is performance critical.
[03:22] <keir> in which case pyrex is out
[03:22] <keir> but whatever, python first
[03:24] <abentley> Cool.  Thank you for working on this.
[03:24] <keir> i'm writing this up for the ML as we speak
[03:24] <keir> i have a few questions before i send though
[03:25] <keir> which you can prob answer
[03:25] <keir> in the pack-based repo, what different graphs are stored?
[03:25] <keir> and what is the size / type of each bit of data?
[03:26] <keir> rather, average number of edges out from a node, in from a node
[03:26] <keir> i'm not looking for exact numbers, just trying to get a feel
[03:26] <abentley> Well, there are four things we store: revisions, inventories, signatures, and file texts.
[03:26] <keir> i've been working on the graphindex without *really* understanding the rest of bzr, because i enjoy these types of fast compact data structure problems
[03:27] <abentley> revisions and inventories have the same graph layout and revision-ids.
[03:27] <keir> ok
[03:28] <abentley> file-texts are a subset of the revision graph.
[03:28] <abentley> You have one for each file, and it only gets a new revision when the file is modified.
[03:29] <abentley> So given two nodes in a file graph, they have the same relationship as their corresponding nodes in the revision graph.
[03:29] <keir> hmm, ok
[03:29] <abentley> Take the NEWS file for an example.
[03:30] <abentley> If NEWS is modified in revision A and revision B, there will be entries in its graph for those revisions.
[03:30] <keir> i see
[03:30] <abentley> If B is descended from A in the file graph, B will descend from A in the revision graph.
[03:31] <keir> ok, i understand. so file texts are the revision graph if you eliminate all other files from the revision graph
[03:31] <keir> or rather, the subset of the revision graph in which that file was modified
[03:32] <abentley> I don't follow you.  Many files may be altered in the same revision.
[03:32] <keir> so in the rev graph, the storage is revID->???, and edges are more revID's
[03:32] <abentley> There is a graph for NEWS, another for builtins.py, another for README, etc.
[03:33] <abentley> Yes, revision ids are the same in all of our graphs.
[03:34] <keir> abentley, as in, if you imagine a graph is k->v and k->k*, for the rev graph, what are the values?
[03:34] <abentley> For the revision graph A -> B -> C, let's say NEWS is only modified in A and C.  This means that the NEWS graph will be A -> C.
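abentley's NEWS example can be sketched in Python. This is an illustration of the relationship, not bzrlib code; `file_graph` and its arguments are invented names. A per-file graph keeps only the revisions that modified the file, each linked to its nearest modified ancestors:

```python
# Illustrative sketch (not bzrlib code): derive a per-file graph from
# the revision graph by keeping only revisions that touched the file.

def file_graph(rev_parents, modified):
    """rev_parents maps revision -> list of parent revisions;
    modified is the set of revisions that modified the file."""
    def nearest_modified(rev):
        found, stack, seen = set(), list(rev_parents.get(rev, [])), set()
        while stack:
            r = stack.pop()
            if r in seen:
                continue
            seen.add(r)
            if r in modified:
                found.add(r)          # stop at the first modified ancestor
            else:
                stack.extend(rev_parents.get(r, []))
        return sorted(found)
    return {rev: nearest_modified(rev) for rev in sorted(modified)}

# abentley's example: revision graph A -> B -> C, NEWS modified in A and C.
parents = {'A': [], 'B': ['A'], 'C': ['B']}
print(file_graph(parents, {'A', 'C'}))  # {'A': [], 'C': ['A']}
```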
[03:34] <keir> yes, i understand
[03:35] <keir> so the rev graph has no actual values in each node? just the references to other revids?
[03:35] <abentley> What are the values associated with nodes in the revision graph?
[03:35] <keir> yes
[03:35] <abentley> They are the commit metadata.
[03:36] <keir> basically, I think my current graphindex is so cool we may want to store more than just an index in it :)
[03:36] <abentley> Commit message, timestamp, committer, etc.
[03:36] <abentley> The cat-revision command should show you exactly what is stored for a revision.
[03:37] <abentley> e.g. bzr cat-revision -r -1
[03:37] <keir> ah, neat
[03:38] <keir> woah, xml
[03:38] <keir> do you actually store XML internally?
[03:38] <keir> or lzw'd xml or something?
[03:38] <abentley> Yes, we do.
[03:38] <abentley> Don't hate us for it :-)
[03:39] <abentley> I should draw your attention to the properties list; this is arbitrary data.
[03:39] <keir> i'm usually with jwz on the quote 'some people, when faced with a problem, think "I know, lets use xml.". Now you have two problems.'
[03:39] <keir> but whatever, this is fine
[03:39] <abentley> See the header of xml_serializer.py for a similar sentiment.
[03:40] <keir> ok, then my next question is, how is the rev graph stored on disk?
[03:40] <abentley> It is stored in kndx files.
[03:42] <abentley> The specifics of the knit index format are outside my area of expertise.  But you're working on replacing it, so it can't be altogether perfect.
[03:42] <abentley> The implementation is KnitIndex in knit.py
[03:43] <keir> ok
[03:43] <abentley> The particular kndx is  revisions.kndx
[03:46] <abentley> So revisions.kndx serves dual purposes: describe how to extract the revision XML for a given revision, and describe the revision graph.
[03:47] <keir> ok
[03:49] <abentley> You can call repository.get_revision_vf() to get an object to play with.  Or you can call revision.get_revision_graph to get a dict-style graph.
[03:50] <abentley> Another thing about the revision graph is that it applies to all projects in a shared repository.
[03:50] <abentley> Whereas the per-file graphs are almost always for a single project.
[03:52] <keir> even though they are stored in the same file
[03:53] <abentley> The per-file graphs and revision graph are not stored in the same file.
[03:54] <abentley> (I'm not sure what "they" referred to)
[04:04] <abentley> keir: When you come back, just say "abentley: ping"
[04:04] <keir> (on phone 3 mins)
[04:06] <keir> ok, back
[04:07] <keir> as in, there is one big file-text file, even when the per-file graphs are for a single project
[04:07] <keir> in a shared repo
[04:08] <keir> abently: ping!
[04:13] <abentley> keir: oops.  Needs "ey" at the end.  Anyhow, I'm back.
[04:13] <keir> ah, heh
[04:13] <keir> now i see that knitindex and graphindex are parallel :)
[04:14] <abentley> There is not one big file-text file.  There is one knit & knit index per file.
[04:15] <keir> aaah, ok
[04:15] <abentley> They are identified by their file-id, which is used elsewhere in bzr to uniquely identify files.
[04:16] <abentley> And because the file-id is semirandom, two projects containing a file named README will have different ids and knits for it.
[04:16] <keir> even with the same contents?
[04:17] <abentley> Yes.  Bazaar doesn't care about content the same way git and hg do.
[04:17] <keir> is this a weakness or a strength?
[04:17] <abentley> We consider it a strength, because it allows us to import from other systems losslessly.
[04:21] <keir> and why would it be considered a weakness?
[04:23] <abentley> To an extent, other systems will identify some tree-shapes as "the same" even when they were produced separately, and so bzr would consider them different.  This approach also provides obvious ways to validate a tree.
[04:29] <keir> ok
[04:32] <abentley> Anyhow, our current system stores revisions, inventories, signatures and file texts separately.  Packs centralize them.
[04:34] <abentley> We still need different indexes, because our code expects them to be separate.
[04:35] <keir> ok
[04:35] <keir> why?
[04:35] <keir> more accurately, what does the code need that can't be put in one file?
[04:36] <abentley> In theory, a single file could store all four indices.
[04:37] <abentley> It's the in-memory representation that needs to treat them differently.
[04:38] <keir> the way i'm writing my stuff, the in memory representation can be (almost) the same as the on disk format
[04:40] <abentley> In the current pack format, there's an adaptor that converts the index into an index for just a particular file.
[04:40] <abentley> GraphIndexPrefixAdapter
[04:41] <abentley> I think one reason for splitting them up is that they have different access patterns.
[04:41] <keir> that's reasonable
[04:41] <keir> and relevant to me; are they mostly repeated traversals?
[04:41] <keir> right now, my stuff optimizes for that
[04:41] <keir> aka follow links repeatedly
[04:42] <abentley> Following the same set of links?
[04:42] <keir> because in the 'dumb' implementation, that's exactly what's slow
[04:42] <keir> as in, you are on file ref A, then you want to go to its successor
[04:42] <keir> then that nodes' successor, and so on and so forth
[04:42] <keir> and yes, same 'set' of links
[04:43] <keir> in my version, there is only one graph represented (aka not multiple adjacency lists)
[04:43] <keir> because then you can't order things nicely for repeated traversals
[04:43] <abentley> Well, there are two kinds of access patterns: tree-wide and history-deep.
[04:44] <abentley> "bzr checkout" would be tree-wide.  We want one copy of the text, for each text in the tree.
[04:44] <keir> which is repeated DFS type of thing?
[04:45] <abentley> So we build each text by finding all the deltas until we hit a fulltext.  Then we apply the deltas to the fulltext to get the version we need.
[04:45] <abentley> I'm not sure what DFS is.
[04:45] <keir> depth first search
[04:45] <keir> in my current code, that should be _very_ fast
[04:46] <keir> one question: how do you decide which link to follow if there are multiple links?
[04:46] <abentley> Yes.  Really, it's depth-only search, because only the lefthand ancestry must be traversed.
[04:46] <keir> ok, i don't quite understand that
[04:47] <abentley> Okay, do you know that Bazaar treats node parents differently depending on their position?
[04:48] <keir> no! good to know
[04:48] <abentley> You get multiple parents in merge situations, right?
[04:48] <keir> i apologize for not knowing all this stuff
[04:48] <keir> yes
[04:48] <keir> OH! i think i get it
[04:49] <abentley> So the lefthand parent is always the revision you had before you did the merge.
[04:49] <abentley> That's how Bazaar does it, anyhow.
[04:49] <keir> you create a new edge from before the 'branch' which contains the merge diff
[04:49] <keir> or am i talking crazy
[04:49] <abentley> No, we don't do it like that.
[04:50] <keir> say there's a file that branches, then on each branch there's a few revs, then it merges
[04:50] <abentley> Okay, let's draw a quick graph: A -> B, A-> C, A->D, [C, D]  -> E
[04:51] <keir> when you go backwards, how do you know which edges to traverse?
[04:52] <keir> ok, i drew the graph
[04:52] <abentley> Oops.
[04:52] <abentley> I goofed on it.
[04:52] <keir> :)
[04:52] <abentley> Ignore A->B
[04:53] <keir> and B itself?
[04:53] <abentley> Yes.
[04:53] <abentley> So the story is I committed revision A.  You branched me, and committed revision D.  At the same time, I committed revision C.
[04:54] <abentley> Then I merged your branch, and committed the result as E.
[04:54] <abentley> Because I did the merge, C is the lefthand parent of E.
[04:54] <abentley> If you had done the merge, D would be the lefthand parent.
[04:54] <keir> i see
[04:55] <keir> and if there was another branch, F, taking the same path A->F->E
[04:55] <keir> and you _also_ merged F, then it would be the same
[04:55] <abentley> Yes.
[04:55] <keir> even though there are edges c->e and f->e
[04:55] <keir> ok
[04:55] <keir> that's really neat
[04:55] <abentley> Yes, heads are ignored in the parent list.
[04:55] <abentley> Sorry.
[04:56] <abentley> Only heads are *listed* in the parents list.
[04:56] <keir> and that's to prevent repeated merging
[04:56] <abentley> So [C, F, E]  becomes [C, E] 
[04:56] <abentley> Oh, repeated merging still happens.
[04:57] <keir> ok
[04:57] <abentley> Anyhow, deltas also follow the lefthand ancestry.
[04:58] <keir> so, to be clear, the left hand edge c->e implicitly contains the deltas a-> and a->f
[04:58] <abentley> So for my graph, there is no E -> D delta, only an E -> C delta.
[04:58] <keir> yes, exactly
[04:58] <keir> the link exists but there is no delta change
[04:58] <abentley> Right.  We could record it, but don't.
[04:59] <keir> so in a pack, why not store all this consecutively?
[04:59] <keir> for a single node, you store refs to your children, and some arbitrary edge-value
[05:00] <abentley> So if A is a fulltext, we get A, apply A -> C, apply C->E.
[05:00] <keir> it sounds like the new packs store the deltas in a big list, and then those are indexed in the GraphIndex
[05:00] <keir> abentley, yes, i understand now. i was confused about the merging.
[05:00] <abentley> That's the only possible recipe to produce E.
[05:01] <abentley> So that's why *for building texts*, it's a depth-only search.
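The recipe just described (walk lefthand compression parents back to a fulltext, then replay the deltas forward) could look roughly like this. The `records` layout and `build_text` name are invented for illustration and are not bzrlib's storage API:

```python
# Illustrative sketch of the depth-only text-building recipe: follow
# lefthand compression parents back to a fulltext, then replay deltas.

def build_text(records, version):
    """records maps version -> ('fulltext', text) or ('delta', parent, apply_fn)."""
    chain, v = [], version
    while records[v][0] == 'delta':
        chain.append(v)
        v = records[v][1]            # step to the lefthand (compression) parent
    text = records[v][1]             # reached a fulltext snapshot
    for v in reversed(chain):        # replay deltas oldest-first
        text = records[v][2](text)
    return text

# abentley's graph: A is a fulltext; C deltas against A, E against C.
records = {
    'A': ('fulltext', 'base\n'),
    'C': ('delta', 'A', lambda t: t + 'c-change\n'),
    'E': ('delta', 'C', lambda t: t + 'e-merge\n'),
}
print(build_text(records, 'E'))  # base / c-change / e-merge
```

Note the E -> D edge from the merge plays no part here: only the lefthand chain carries deltas.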
[05:01] <keir> great
[05:01] <keir> and that search would happen in the fulltext graph, which is a subset of the revision graph
[05:01] <abentley> I would expect that packs do store entries topologically-sorted, but not necessarily adjacent.
[05:02] <keir> i imagine in almost all cases, the number of edges is < 5
[05:02] <keir> in which case toposort would do pretty well
[05:03] <abentley> Yes.  about 75% have only 1 parent.  I would guess less than 2% have more than 2 parents.
[05:03] <keir> ok. are there many cases where you want to traverse the graph _without_ also grabbing the related diffs?
[05:04] <abentley> We would do that when running "log".
[05:04] <abentley> i.e. log FILE
[05:05] <abentley> That's pretty rare.
[05:05] <keir> ok
[05:05] <abentley> We are still just talking about files here.
[05:05] <keir> yes
[05:05] <abentley> For revisions, we use the graph quite differently.
[05:05] <keir> i am starving, i'm going to go grab something to eat; will you be around in 15 mins?
[05:05] <abentley> Sure.
[05:05] <abentley> Ping me.
[05:05] <keir> great! ttyi15
[05:28] <keir> abentley, ping!
[05:29] <abentley> Hey.
[05:29] <keir> look at that, xchat can spell better than me
[05:29] <keir> so: revision graph usage
[05:29] <abentley> Right.
[05:30] <abentley> We use the revision graph for log all the time.  I think breadth-first is the right term for the "log" access pattern.
[05:30] <abentley> Because we show all parents of each revision.
[05:31] <abentley> Sorted topologically.
[05:31] <keir> ok
[05:31] <keir> actually i think that's a DFS
[05:31] <keir> rather than a left-child-traversal
[05:31] <keir> err, hmm. nevermind, you are right
[05:31] <abentley> Okay.  I never studied graph theory.
[05:32] <abentley> We also use the revision graph for merge base selection.
[05:32] <abentley> This is done by merge, pull, push, update.
[05:33] <abentley> This is again breadth-first traversal, but we stop when we hit an LCA.
[05:33] <keir> LCA?
[05:33] <abentley> Least Common Ancestor.
[05:33] <keir> ah, ok
[05:34] <abentley> So for this access pattern, incremental reads are a big win.
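The merge-base search described here might be sketched as a breadth-first walk from both tips that stops at the first revision reached from both sides. This is an illustration only (`find_lca` is an invented name, and bzrlib's real implementation handles many more cases):

```python
# Illustrative breadth-first LCA (merge base) search: tag each revision
# with the side(s) that reached it; the first doubly-tagged revision wins.
from collections import deque

def find_lca(parents, a, b):
    seen = {a: {'a'}, b: {'b'}}
    queue = deque([a, b])
    while queue:
        rev = queue.popleft()
        for p in parents.get(rev, []):
            tags = seen.setdefault(p, set())
            tags.update(seen[rev])
            if tags == {'a', 'b'}:
                return p          # first ancestor reached from both sides
            queue.append(p)
    return None

# abentley's merge example: C and D both descend from A.
graph = {'E': ['C', 'D'], 'C': ['A'], 'D': ['A'], 'A': []}
print(find_lca(graph, 'C', 'D'))  # A
```

Because the walk proceeds a generation at a time from the tips, newest-first incremental index reads serve it well, which is the point abentley is making.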
[05:34] <keir> yes
[05:34] <abentley> Ideally, the data is topologically sorted with newest data first
[05:34] <keir> yes
[05:35] <keir> this is what i have now
[05:35] <keir> however, topological sort is ambiguous
[05:35] <keir> there are many topological sorts for a given graph
[05:35] <keir> my current plan is to take the left-link when doing the topo sort
[05:36] <abentley> We do already have a topo-sort routine.
[05:36] <abentley> (tsort.py)
[05:37] <abentley> But yes, there may be gains in favouring left-parent ancestry.
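A newest-first topological sort that favours the lefthand parent chain, as keir proposes, could be sketched like this (illustrative only; bzrlib's real routine lives in tsort.py):

```python
# Sketch of a newest-first topological sort preferring the lefthand
# (first-listed) parent chain.  Illustrative, not bzrlib's tsort.py.

def left_first_toposort(parents, tips):
    """parents maps revision -> parent list (lefthand parent first)."""
    # Count how many children each ancestor has within the walked graph.
    child_count, seen, stack = {}, set(tips), list(tips)
    while stack:
        r = stack.pop()
        for p in parents.get(r, []):
            child_count[p] = child_count.get(p, 0) + 1
            if p not in seen:
                seen.add(p)
                stack.append(p)
    # Emit a node once all of its children are emitted; a LIFO ready
    # list with the lefthand parent pushed last keeps us on its chain.
    order, ready = [], list(tips)
    while ready:
        r = ready.pop()
        order.append(r)
        for p in reversed(parents.get(r, [])):
            child_count[p] -= 1
            if child_count[p] == 0:
                ready.append(p)
    return order

# abentley's merge example: E merges D into C; all descend from A.
graph = {'E': ['C', 'D'], 'C': ['A'], 'D': ['A'], 'A': []}
print(left_first_toposort(graph, ['E']))  # ['E', 'C', 'D', 'A']
```

As keir says, the sort is ambiguous; breaking ties toward the lefthand parent keeps each delta chain contiguous on disk.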
[05:39] <keir> ok
[05:39] <keir> so, the tradeoff of storing all the data (diffs, etc) in with these graphs
[05:39] <keir> is that in the merge case, you only want ancestry
[05:40] <keir> it looks like my design still makes a bunch of sense.
[05:40] <keir> i'm going to send it out to the list
[05:40] <abentley> Okay.
[05:41] <abentley> I was only describing part of the merge process.
[05:41] <abentley> The other parts have the 'checkout' access pattern.
[05:42] <abentley> Making checkout as fast as it can be is definitely a goal.
[05:42] <keir> oh :)
[05:43] <abentley> keir: The first part of merge is you find a base.  Then you get three inventories and compare them.
[05:43] <keir> three?
[05:43] <abentley> Then you get selected file texts from two of the inventories.
[05:44] <abentley> Yes, three: My tree, the tree I am merging, and the merge base tree.
[05:45] <keir> ok
[05:45] <abentley> Oops, actually, you get selected file texts from *three* of the inventories.
[05:47] <keir> and then you do a three-way merge?
[05:47] <abentley> Right.
[05:48] <abentley> Though if only one side changed the file, we take that side's version without doing the equivalent of diff3.
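The per-file decision abentley describes could be sketched as follows. This is a hypothetical helper (the names are invented; the real logic lives in bzr's merge code):

```python
# Hypothetical per-file three-way merge decision: take a side's text
# outright when only that side changed relative to the base, and only
# fall back to a diff3-style text merge when both sides changed.

def merge_file(base, this, other, diff3):
    if this == base:
        return other          # only OTHER changed (or neither did)
    if other == base or this == other:
        return this           # only THIS changed, or both made the same change
    return diff3(base, this, other)   # both changed: real three-way merge

print(merge_file('v1\n', 'v1\n', 'v2\n', None))  # prints v2: only one side changed
```

Skipping the diff3 step in the common one-side-changed case is a significant saving, since most files are untouched or touched on only one side of a merge.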
[05:49] <keir> ok
[05:51] <lifeless> abentley: note that I've already replaced knit indices with index.py; keir is working on a better implementation of index.py
[05:51] <abentley> lifeless: I know, but good point.
[05:52] <lifeless> keir: It's nice that the generalised interface I crafted for bzr may be useful elsewhere; that's a good property. But the important thing to note is that we are doing the work for bzr, so bzr's needs and process are paramount.
[05:52] <lifeless> keir: nothing stops you later rewriting it to your hearts content :)
[05:52] <keir> lifeless, i'm about 5 mins from sending out my design
[05:52] <lifeless> re topo sort grouping
[05:53] <lifeless> I would wager breadth first grouping is most suitable
[05:53] <abentley> keir: So in your version of the file graph, you will have different subgraphs, one for each file.
[05:53] <lifeless> but if its parameterisable that would allow benchmarking
[05:53] <keir> lifeless, well, it's easy to test different strategies; we just re-sort and benchmark
[05:53] <keir> ok. back to the email. i'll be back in a sec
[05:53] <lifeless> abentley: also are you aware that in packs the compression tree and the ancestry graph are decoupled ?
[05:54] <abentley> lifeless: I wasn't sure whether that was exposed.
[05:54] <lifeless> abentley: so we can without changing the index format have arbitrary compression parents
[05:54] <lifeless> abentley: it's not something that knit.py is yet able to take advantage of, no. But it is exposed for folk that want to work directly on e.g. text_all_index for operations
[05:55] <lifeless> s/format/logic/ two lines up
[05:55] <lifeless> ok, I'm off again, drive by commenting done :)
[05:56] <keir> hmm
[05:56] <abentley> keir: Because the checkout pattern requires fast access to a lot of recent versions of a lot of texts, you would probably want the sorting to pay attention to where snapshots happen.
[05:56] <abentley> So that it can switch files when it hits a snapshot.
[05:58] <keir> so the fulltexts link back to previous revisions even though a checkout would stop at the fulltext?
[05:59] <abentley> Yes, AIUI.
[05:59] <abentley> In fact, the revision knit is all fulltexts.
[05:59] <abentley> Yet we can use its graph.
[06:00] <keir> well, the revisions only store metadata, not actual file content, don't they?
[06:01] <abentley> The XML you saw earlier is stored in the revisions knit the same way file texts are stored in file knits.
[06:02] <keir> ok
[06:11] <keir> sent!
[07:59] <keir> we require python 2.4 right?
[08:05] <AfC> keir: impressive post about Graphs to the mailing list the other day.
[08:12] <keir> AfC thanks!
[08:12] <keir> that was just a couple hours ago
[08:24] <Peng> keir: Some bits (maybe just tests) require 2.4.
[08:24] <Peng> keir: I assume it's mostly 2.3.
[08:32] <keir> ok
[09:40] <dato> Peng: afraid not.
[09:40] <dato> keir: 2.4 is a hard requirement.
[10:02] <Peng> dato: Oh, really? Okay.
[10:22] <igc> night all
[10:23] <dato> Peng: yeah :)
[01:46] <dato> running `bzr upgrade` in a lightweight checkout makes a backup of the light .bzr, not of the real .bzr. would that be considered a bug?
[01:50] <lifeless> no, because it's upgrading the lightweight checkout
[01:51] <dato> but it changes the "remote" branch as well
[01:51] <lifeless> ok, that's a bug
[01:51] <lifeless> it should only alter the lightweight checkout
[01:52] <dato> okay, I'll double-check it does that, and file one
[01:53] <abentley> lifeless: It would be nice to have a bit more API for handling branch references.
[01:53] <lifeless> yea
[01:53] <abentley> "bzr switch" has some pretty gross aspects.
[03:29] <joe99> Is the Cygwin client broken?
[03:40] <ubotu> New bug: #137976 in bzr "`bzr upgrade` in a lightweight checkout upgrades the parent branch" [Undecided,New]  https://launchpad.net/bugs/137976
[03:41] <mwhudson> er
[03:41] <mwhudson> oh right
[04:09] <pygi> hello folks
[04:10] <lifeless> joe99: not AFAIK
[04:43] <alfborge> I've been using bzr for my module of a project for some time now, and finally the project as a whole has adopted bzr.  Is there a way I can merge my existing bzr repos with the project repos so that I keep my history?
[04:43] <alfborge> I've got an existing Products/Timeline/.bzr which is the one I've been using.
[04:44] <alfborge> Now there's a .bzr in Products...
[04:45] <radix> alfborge: yeah, there is, it might be in a plugin
[04:47] <radix> hrmph
[04:48] <radix> I can't find it :\
[04:49] <alfborge> Me neither.
[04:49] <radix> I think jam wrote it, but I guess he's not around.
[04:54] <alfborge> join                 Combine a subtree into its containing tree.
[04:54] <alfborge> Might be it.
[04:57] <alfborge> I don't really understand the helptext to join.  Could someone take a look at it and help me out?
[04:58] <alfborge> Seems I need to create a branch and then join into this branch.  But it's not really clear to me exactly how to go about it, and where I should run the commands from (the target tree or the source tree) and so on.
[04:59] <radix> alfborge: I don't think join is the one I used
[05:00] <radix> alfborge: to be clear, you have two branches and you want to merge them into one, such that all the files in one are in a directory of the other?
[05:01] <alfborge> I'm not really very familiar with bzr.  I did "bzr init" on the part of a code tree that I've been working on.
[05:01] <fullermd> Is there overlap between what you've got in bzr and what's in upstream bzr now?
[05:02] <alfborge> no overlap.
[05:02] <fullermd> You can just use merge, then.
[05:02] <alfborge> Now some other people have done "bzr init" on the whole tree.  So I can no longer push my stuff upstream, since my .bzr in that tree has been removed.
[05:02] <fullermd> Do a big 'mv' in your branch so that the files are in $BRANCH/sub/dir where they should be in the main branch, then you can just 'merge' them.
[05:02] <radix> huh, ok
[05:02] <radix> maybe that's why that command I used isn't there any more
[05:02] <radix> because it's obsoleted by 'merge' handling the case?
[05:03] <fullermd> Well, I think John had some plugin once that basically did that, just automated.
[05:03] <fullermd> Can't remember what it was called.
[05:03] <radix> yeah, I used it
[05:03] <radix> I can't either :)
[05:03] <fullermd> I think it's mostly set aside because the nested trees stuff will take over handling that case.
[05:04] <alfborge> I'm confused by the word branch.
[05:04] <fullermd> (of which 'join' is to be a part, but it's currently hidden because it's not really done)
[05:04] <alfborge> I'm used to CVS, so I've learned to avoid branches like the plague.
[05:04] <fullermd> And good training, in that environment   ;)
[05:05] <alfborge> So, if I get things correctly, I've now got the root/Products/Timeline (upstream bzr) and I've got my Timeline (which is its own root).
[05:06] <alfborge> Should I remove the Timeline-dir from the upstream and replace it by my Timeline-dir and then do a merge, or did I misunderstand completely?
[05:06] <fullermd> Mmm.  What's in the upstream Timeline dir, and what's in yours?
[05:06] <alfborge> Only difference is that mine has a .bzr dir in it.
[05:07] <alfborge> Oh, and I have a few local changes that weren't committed when they made the repos.
[05:07] <fullermd> Ah.  So you lied earlier   :)
[05:07] <fullermd> There is overlap.
[05:07] <fullermd> That makes things harder.
[05:08] <fullermd> Your problem now is that you have two different branches that have the same files in them, but they're different files (See?  That's clear, right?)
[05:08] <alfborge> Yeah, pretty much.
[05:09] <fullermd> So, conceptually, what probably needs to happen is for your changes to get replayed against the upstream files.
[05:09] <alfborge> fullermd: Let's keep this simple.  I can roll back my repos to match the version in the upstream.
[05:09] <fullermd> You _could_ delete the upstream files, and stick yours in their place.  That's one solution.  You lose the upstream history though, and set up trouble if anybody else is working on them before merging your changes.
[05:09] <alfborge> I'm the only one working on that part of the project.
[05:09] <alfborge> And the upstream repos was created about an hour ago.
[05:10] <fullermd> Hm.  Has anything changed in the upstream repo since then?
[05:10] <alfborge> nope
[05:10] <fullermd> Was that a conversion from an existing VCS, or just a fresh 'init ; add ; commit'?
[05:10] <alfborge> I believe they did an init; add; commit
[05:10] <fullermd> That has promise.  If we can arrange to throw away that new upstream repo and recreate it, we can probably slip by nicely.
[05:11] <alfborge> upstream is at revno 7, I'm at revno 60
[05:11] <fullermd> Mmm.  Harder.
[05:11] <fullermd> Well, you can rm the files in the upstream and replace them with yours.  Not much history to lose there, if any.
[05:11] <fullermd> Or you can manually apply your changes to those upstream files and commit it, then discard (or archive) your branch.
[05:12] <fullermd> Hard to say which would be "better", really.
[05:12] <alfborge> I don't see how I lied earlier.  If I uncommit my last commit, my files are identical to the upstream files.
[05:13] <fullermd> Oh, it's just one commit?
[05:13] <alfborge> So I can uncommit that, make a patch file and revert my repos to the state it was in when the upstream made a snapshot of it.
[05:13] <fullermd> Heck, I'd just redo that in the upstream then.  You can tar up your old branch somewhere for back-reference if you need it sometime, and otherwise just ignore it and move on in the upstream.
[05:14] <alfborge> Yeah, throwing away my history is a solution.  But it's sad. :)
[05:14] <fullermd> Well, the other option is the "rm those files upstream ; merge".  That keeps your history there.
[05:14] <fullermd> If nobody else is editing those files now, it won't cause conflicts.
[05:15] <alfborge> fullermd: Ah, there we go.  That's what I'll do then. :)
[05:15] <fullermd> The only real downside is the kinda "uncleanliness" of having those files from the first 7 revs be different files than in the future.  Probably only of interest to archaeologists.
[05:17] <fullermd> So, in your branch, make a commit with all the files moved into a throwaway temporary dir (myfiles.tmp/ or something), just to have them conveniently out of the root for the merge.  (you could skip that if it's just a few files, but it can make it easier if there's a lot)
[05:17] <fullermd> Then rm away the files in the upstream, merge your branch in, and bzr mv them into place.
[05:23] <alfborge> Is there a difference between bzr copy and bzr co?
[05:23] <Odd_Bloke> alfborge: Yeah, 'co' is short for 'checkout'.
[05:24] <alfborge> Which is not a branch?
[05:24] <alfborge> Or rather, it doesn't create a branch?
[05:25] <Odd_Bloke> alfborge: It creates a branch which is bound to the checkout location.
[05:26] <Odd_Bloke> alfborge: That is to say, some operations will affect the checkout location (commit, for example), unless specifically directed not to.
[05:26] <alfborge> Good to know.  I think I'll stick with branch. :)
[05:26] <alfborge> And push my changes when I'm ready.
[05:27] <Odd_Bloke> alfborge: The only difference between 'checkout' and 'branch' is that the former creates a bound branch and the latter an unbound one.
[05:27] <Odd_Bloke> You can switch between the two using 'bzr bind' and 'bzr unbind'.
[05:36] <mwhudson> fullermd: do you have better ideas?
[05:37] <fullermd> Not referring to a checkout as a branch.
[05:41] <mwhudson> it is, though
[05:41] <fullermd> Not really relevant   :p
[06:16] <Radtoo> Hi. I just installed bzr and used tailor to migrate my svn repository (contains many blobs) to bazaar. But that inflated the repository from 360MB to 1.5GB. Is that normal or did I do something wrong?
[06:17] <NfNitLoop> Hmm, I'm not familiar with tailor.
[06:18] <keir> Radtoo, have you tried bzr-svn?
[06:19] <Radtoo> keir: that doesn't even compile here, so no :/
[06:19] <Radtoo> tailor appears to pull from svn version by version and then commit it into bazaar... but I can't be entirely sure :D
[06:19] <NfNitLoop> if bzr-svn works, it sounds like the ideal solution.  If you "import" (branch) with it, you can easily keep it in sync with the svn repo and even commit back to it.
[06:19] <keir> we're working on pack repositories, which i believe will actually make bzr repos smaller than svn's
[06:19] <keir> but they're not ready just yet
[06:19] <Radtoo> I see, atm this is normal then?
[06:21] <NfNitLoop> *shrug*
[06:23] <fullermd> Probably, especially with big files getting changed a lot.
[06:23] <jelmer> NfNitLoop: can you try the 0.4 branch again, I've fixed some http bugs.
[06:24] <NfNitLoop> jelmer: *heart*
[06:24] <NfNitLoop> trying now. :)
[06:26] <NfNitLoop> If I get disconnected, it means my box is too busy thrashing to answer TCP packets.   (Whee, memory leaks!)
[06:26] <jelmer> (-:
[06:26] <jelmer> a --limit option to svn-import may be a good idea
[06:26] <NfNitLoop> Yeah... I was looking for one.
[06:27] <NfNitLoop> are you saying it's a good idea for me to use?
[06:27] <NfNitLoop> or for someone to implement?  :)
[06:27] <jelmer> rather for someone to implement, it's not there yet
[06:28] <jelmer> ideally, I should just fix the memory leaks...
[06:28] <jelmer> but tracking them down is hard and there's at least one in the python subversion bindings
[06:28] <NfNitLoop> I need to get a job that doesn't suck my will to live (and program) so that when I get home I'll actually want to code on some of this.  :)
[06:29] <Radtoo> I see. Well I'd love to try the packed variant once it's ready :D
[06:31] <Radtoo> BTW, I have tried Git and it also doesn't get that part quite right ;)
[06:31] <NfNitLoop> jelmer: I still get:  bzr: ERROR: Invalid revision-id {None} in SvnRepository(...)
[06:32] <NfNitLoop> this branch?  http://people.samba.org/bzr/jelmer/bzr-svn/0.4/
[06:34] <NfNitLoop> It seems weird to me that using svn-import, I still have to do svn+https://     (wouldn't svn+ be implied?)
[06:34] <keir> Radtoo, what do you mean? the packed repos don't work well for git?
[06:35] <NfNitLoop> keir: I think he means that git repos are much larger than svn ones.
[06:35] <dato> Radtoo: I guess the repository contains a fair amount of binary data?
[06:35] <keir> whaa?
[06:36] <keir> i hear that git repos were really tiny
[06:36] <NfNitLoop> keir: maybe I guessed wrong, then. :)
[06:36] <keir> that the entire gcc history was a 300mb git repo, but 17 gigs in svn
[06:36] <dato> maybe he didn't repack
[06:37] <Radtoo> keir: yep, as compared to svn.
[06:38] <Radtoo> dato: yes. many (said that before tho :P)
[06:38] <dato> Radtoo: ah, I missed it, sorry
[06:39] <Radtoo> keir: I did run git-gc and stuff, and nothing got it below 810MB, still a lot larger than the 360MB from SVN.
[06:40] <Radtoo> I suppose such repositories are quite the challenge to handle :D
[06:42] <keir> actually, they are not so hard, but the SCM must recognize and support binary deltas
[06:42] <keir> old svn used to store entire copies of binary files with each rev
[06:42] <keir> not sure if they fixed that now
[06:42] <NfNitLoop> I'm pretty sure nowadays svn treats all files as binary and does binary diffs.
[06:42] <Radtoo> I think they deltify almost everything against everything or something like that
[06:43] <Radtoo> Otherwise I doubt that huge mess of a repository I have would be so small
[06:43] <dato> NfNitLoop: yeah, I believe that's it
[06:44] <NfNitLoop> which is one of the reasons we chose it for our current project.  (Lots of binary files that needed revision control.)
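The binary-delta idea discussed above can be sketched with Python's zlib preset-dictionary support. This is not how svn or bzr actually store deltas; it's just an illustration (with made-up sizes) that storing a revision relative to its predecessor can be far smaller than storing a full copy:

```python
import os
import zlib

# Two "revisions" of a binary file: v2 is v1 plus a small change.
v1 = os.urandom(20000)          # incompressible binary blob
v2 = v1 + b"small appended change"

# Storing v2 outright: random data barely compresses at all.
full_copy = zlib.compress(v2)

# Delta-style storage: compress v2 with v1 as a preset dictionary,
# so matches against the previous revision cost only a few bytes.
comp = zlib.compressobj(zdict=v1)
delta = comp.compress(v2) + comp.flush()

# Rebuilding v2 needs v1 (the dictionary) plus the small delta --
# which is why partial-history fetches are tricky, as discussed above.
decomp = zlib.decompressobj(zdict=v1)
restored = decomp.decompress(delta)

print(len(full_copy), len(delta))
```

On a typical run the "delta" is a few hundred bytes against a ~20 KB full copy, which is the whole argument for binary deltas when big files change often.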
[06:45] <keir> what is the 'signature index'?
[06:45] <Radtoo> NfNitLoop: So SVN worked best for you out of all you tried?
[06:46] <dato> keir: revisions can be gpg signed, so I guess it's that?
[06:46] <keir> ok
[06:47] <NfNitLoop> Radtoo: well, svn was really the only thing that met our needs.   Free, works well with binary files, supports file locking, has a friendly GUI for teh n00bs.
[06:48] <keir> svn does have the benefit of 5+ years of robustness hacking
[06:48] <Radtoo> NfNitLoop: I don't need the friendly part, but otherwise I guess I'll have that particular repos stay on svn.
[06:49] <NfNitLoop> For personal projects I've migrated to bzr.
[06:49] <Radtoo> True. I do expect bzr to be better in 5+ years tho. ;)
[06:49] <keir> yes, i am quite confident the choice will be easy in 5 years :)
[06:50] <keir> though i suspect bzr will never have locking
[06:50] <NfNitLoop> keir: heh.  I wouldn't expect it to.   Hard to do that w/ a distributed development model.
[06:51] <keir> if bzr got really solid binary support, i could see someone building a locking plugin which would talk to some central server for lock status
[06:51] <keir> the reality is, you NEED locking for working with binary files (photoshop files, etc)
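The locking plugin keir imagines could, very roughly, be built around a central registry like the sketch below. This is purely illustrative: no such bzr plugin or protocol exists, and the class and method names are invented. A real implementation would talk to this registry over the network.

```python
# Hypothetical central lock registry: a server-side table mapping
# file paths to the user currently holding the lock.
class LockRegistry:
    def __init__(self):
        self._locks = {}  # path -> holder

    def acquire(self, path, holder):
        """Grant the lock if it's free (or already ours); True on success."""
        current = self._locks.get(path)
        if current is None or current == holder:
            self._locks[path] = holder
            return True
        return False

    def release(self, path, holder):
        """Release only if `holder` actually owns the lock."""
        if self._locks.get(path) == holder:
            del self._locks[path]
            return True
        return False

registry = LockRegistry()
print(registry.acquire("art/logo.psd", "alice"))  # alice gets the lock
print(registry.acquire("art/logo.psd", "bob"))    # bob is refused
registry.release("art/logo.psd", "alice")
print(registry.acquire("art/logo.psd", "bob"))    # now bob can lock it
```

As the rest of the discussion points out, the hard part isn't the registry itself but that a single authoritative lock server reintroduces exactly the centralization a distributed VCS avoids.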
[06:51] <fullermd> And then I can see large hordes of people storming that someone's house in the night, and parading their intestines up main street   ;p
[06:51] <NfNitLoop> not really.
[06:51] <NfNitLoop> Good communication > locks.  :)
[06:51] <Radtoo> I personally don't, tbh :D
[06:52] <NfNitLoop> though, again... hard to do that in a distributed environment.
[06:53] <fullermd> Well, that's a fundamental limit case.  If your environment is too distributed to manage the communication necessary for project management, then your environment is too distributed.
[06:54] <Radtoo> What if it's not "a project"? ;)
[06:54] <fullermd> Hard-coding the communication into the VCS tool doesn't help that.
[06:54] <NfNitLoop> yeah, I'm thinking of some OSS tool that has a repo published via HTTP.
[06:54] <NfNitLoop> Two new people just download it and implement changes that happen to touch the same file...  then there's no way to merge them back together.
[06:55] <NfNitLoop> I guess there, your "communication" comes in at the upstream maintainer.  :)
[06:55] <Radtoo> Yeah, if the content isn't known to anything but a proprietary tool... then either that tool can merge it, or no one can?
[06:55] <NfNitLoop> Radtoo: eh?
[06:55] <fullermd> Locking wouldn't help that, though.  Even if magic occurred and somehow it were all distributed locked...   all that would happen is that only one person could make their change.
[06:56] <fullermd> Which is equivalent to the case that you only accept one of those person's changes (except that you get to choose the better one, rather than just the first   ;)
[06:56] <NfNitLoop> fullermd: right.  I'm not the one proposing locking. :)
[06:57] <NfNitLoop> I'm just saying that very distributed projects don't always have good communication, since anyone can take your code and start modifying it.
[06:57] <NfNitLoop> it doesn't make the project bad.
[06:57] <NfNitLoop> or "too distributed".
[06:58] <fullermd> That's not quite what I meant...
[06:58] <fullermd> My assertion is that if your project (for whatever real or imagined reason) needs tight coordination, you shouldn't be spreading it around like that.  By doing that spreading, you're asserting that you don't need lockstep among developers.
[06:59] <fullermd> So, yes, a project _can_ be "too distributed", if the development environment they've set up isn't consistent with the constraints on the development.
[07:00] <NfNitLoop> ah, yes. agreed.
[07:01] <fullermd> (and of course, a project can be too centralized, if the development environment is twisted into lockstep when no real constraint requires that)
[07:01] <NfNitLoop> which is what my current project feels like. :p
[07:01] <NfNitLoop> (work project, that is.)
[07:02] <NfNitLoop> the binary files require this very centralised development.    But we've got quite a bit of source code in there too.
[07:02] <NfNitLoop> But I guess I should just be glad I got them away from VSS.  :p
[07:03] <jelmer> re
[07:03] <fullermd> I'm sorry, my mental gag filter blocked out that last line you sent...
[07:03] <NfNitLoop> Yes.  That's right.  Until I arrived, they were using The VCS That Shall Not Be Named.
[07:03] <jelmer> NfNitLoop: are you specifying any additional arguments to svn-import?
[07:04] <NfNitLoop> and it took me *months* to convert them.
[07:04] <NfNitLoop> jelmer: nope.   Though I am running it inside a shared repo.  Should I not?
[07:04] <jelmer> NfNitLoop: did you rm -rf the target repository after updating today?
[07:04] <fullermd> There once was a time when I held fast to the belief that using _any_ VCS, no matter how crap, was preferable to using none at all.  Said system (through second-hand experience, thankfully) showed me the error of my ways.
[07:04] <NfNitLoop> the old, broken attempt at an import?  yes.  :)
[07:05] <jelmer> NfNitLoop: what version of bzr are you using?
[07:05] <NfNitLoop> fullermd: Yeah.  I love a VCS where users can *permanently* delete history from it.  :p
[07:05] <NfNitLoop> Bazaar (bzr) 0.90.0
[07:05] <dato> a
[07:06] <dato> oops
[07:07] <jelmer> NfNitLoop: trying to reproduce..
[07:07] <jelmer> abentley: ayt?
[07:09] <jelmer> NfNitLoop: this seems fixed in bzr.dev somehow, I'm currently trying to figure out why
[07:09] <NfNitLoop> jelmer: This is against the repo that I sent you.
[07:10] <NfNitLoop> jelmer: Hmmm.   but you reproduced it with 0.90?
[07:11] <jelmer> NfNitLoop: I reproduced it with 0.90 against the repo mentioned in bug 137710 (which is the same issue you're seeing) but can import fine using bzr.dev
[07:11] <NfNitLoop> jelmer: interesting.
[07:17] <NfNitLoop> jelmer: well, how close is 0.91 to coming out?  Maybe you should just make it a requirement for bzr-svn if it's coming out soon.  :)
[07:42] <jelmer> re
[07:42] <jelmer> NfNitLoop: I don't think 0.91 is all that close yet. I'm trying to figure out if there is some way to work around the issue in bzr-svn itself.
[07:43] <NfNitLoop> jelmer: aah, that would be cool.  :)
[07:44] <dato> I thought the RC was next tuesday.
[07:48] <asabil> hi all
[08:19] <keir> wth? zlib only does 1mb/s on my laptop?
[08:21] <keir> ahah
[08:23] <keir> right, it's decompression speed that matters
[08:23] <keir> on my laptop, which is a fairly moderate 1.7ghz centrino, i can decompress at 75mb/s
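keir's measurement is easy to reproduce in Python; this is a rough sketch, and the absolute numbers depend entirely on the machine and on how compressible the test data is (the repetitive payload here compresses far better than real repository contents would):

```python
import time
import zlib

# ~10 MB of moderately repetitive data (30 bytes * 350,000).
payload = b"moderately compressible text, " * 350000

start = time.perf_counter()
compressed = zlib.compress(payload)
compress_secs = time.perf_counter() - start

start = time.perf_counter()
restored = zlib.decompress(compressed)
decompress_secs = time.perf_counter() - start

mb = len(payload) / (1024 * 1024)
print("compress:   %.1f MB/s" % (mb / compress_secs))
print("decompress: %.1f MB/s" % (mb / decompress_secs))
```

Decompression should come out much faster than compression, which is keir's point: for reading a repository, it's the decompression rate that matters.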
[09:05] <asabil> is there any good bzr web interface out there ?
[09:12] <sabdfl> asabil: i think the best one is loggerhead
[09:13] <asabil> so there is nothing better ? and something maintained ?
[09:13] <asabil> :/
[09:14] <sabdfl> asabil: https://code.launchpad.net/loggerhead
[09:14] <sabdfl> check it out, there's activity there as we speak :-)
[09:14] <asabil> oh great
[09:15] <asabil> the weblink in the list of 3rd party tools links to quite old stuff I think
[09:15] <asabil> thanks a lot :)
[09:16] <sabdfl> sure
[09:16] <sabdfl> nice way to see what's going on in a project, that
[09:17] <asabil> which branch you suggest I check ?
[09:17] <asabil> the devel or production ?
[09:19] <keir> wow, your two names are impossible to disambiguate at a glance
[09:20] <keir> i thought mark shuttleworth was talking to himself
[09:20] <asabil> lol
[09:20] <dato> there should be an irssi plugin that would use alternate colors for adjacent nicks with the same length
[09:23] <Basic_py> When you do a bzr checkout --lightweight, does this mean you cannot go back to this local repository and do a bzr pull to get updates from a remote repository?
[09:24] <dato> Basic_py: you can run `bzr pull` inside a lightweight checkout (in fact, you can run all operations there). it will update the "parent" branch, though.
[09:33] <keir> phew, that was a long email
[10:04] <mdke> hiya, I'm quite new to bzr and am feeling my way around still. I'm merging from a branch and there is a conflict, is there a good doc about how to resolve conflicts?
[10:10] <beuno> mdke, the bzr docs has a good one, are you using 0.18?
[10:10] <mdke> beuno: 0.90.0
[10:11] <beuno> mdke, then in the bzrdir/doc/en/user-guide
[10:11] <beuno> there's a file called conflicts.txt
[10:11] <beuno> I'm sure it's on the bazaar webpage somewhere, but I can never find it
[10:12] <dato> http://doc.bazaar-vcs.org/latest/en/user-guide/index.html
[10:12] <mdke> I'm reading http://doc.bazaar-vcs.org/latest/en/user-guide/tutorial.html at the moment, it has some stuff on this
[10:12] <beuno> that's it  :D
[10:12] <beuno> http://doc.bazaar-vcs.org/bzr.0.90/en/user-guide/conflicts.html
[10:12] <beuno> to be precise
[10:13] <mdke> I was hoping bzr would do the merging for me, it looks like it doesn't from that page
[10:13] <radix> mdke: it does do most merging for you
[10:13] <beuno> it just needs a bit of help when two lines of code conflict
[10:13] <radix> mdke: you have to resolve any *conflicts* that happen during merge
[10:13] <beuno> dato, thanks btw  :D
[10:13] <dato> mdke: if there are changes to the very same lines, bzr can't know how to solve such conflict
[10:14] <dato> short of reading your mind
[10:14] <dato> beuno: you're welcome
[10:14] <mdke> radix: I mean, I was hoping it would resolve conflicts for me, given a bit of input from me
[10:14] <dato> ah, and that reminds me
[10:14] <radix> not sure what that means
[10:14] <dato> mdke: what kind of input?
[10:15] <mdke> in this case it's a very simple file
[10:15] <mdke> I want to keep both my changes and the changes in the branch
[10:17] <dato> mdke: bzr tried to do that, but failed because your branch and the other branch *change the very same lines in different ways*
[10:18] <mdke> ok, bzr knows best I guess. I am surprised because the file is so simple and I thought that the changes I made were just to add lines rather than change some. I may be misremembering though
[10:20] <mdke> yeah, i did just add lines to the end of the file, annoying
[10:21] <beuno> mdke, that's weird then, it shouldn't complain
[10:21] <beuno> it's actually saying there are conflicts?
[10:21] <beuno> "bzr conflicts"  complains?
[10:21] <mdke> yeah
[10:23] <beuno> mdke, did bzr generate a conflictedfile.BASE?
[10:23] <mdke> yeah
[10:24] <beuno> mdke, the conflicted text should be listed there
[10:25] <mdke> yes
[10:25] <beuno> does it make sense?
[10:25] <beuno> :D
[10:25] <mdke> well, no :) not to worry, I've resolved it manually
[10:26] <beuno> heh, ok ok, bzr isn't that random normally
[10:28] <mdke> :)
[10:47] <entoll> Hello
[10:48] <entoll> Is that normal that under Windows, bzr selftest fails on a few dozen tests?
[11:53] <lifeless> keir: 1mb/s ? what were you using GzipFile ?
[11:57] <lifeless> keir: also, I got your reply, but you didn't cc the list - do you mind if I forward it back to the list ?
[11:59] <keir> lifeless, i just noticed that and sent it to the list
[11:59] <keir> lifeless, let me know if it shows up for you
[12:00] <keir> lifeless, any thoughts?
[12:01] <keir> lifeless, also it was 10mb/s compression; i dropped a 0
[12:01] <lifeless> heh
[12:01] <lifeless> so
[12:01] <lifeless> were you testing using python?
[12:01] <keir> yes, with zlib
[12:01] <lifeless> The GzipFile interface ?
[12:01] <keir> import zlib; zlib.compress(big_100mb_string_here)
[12:01] <keir> with a time set/check around it
[12:02] <keir> not sure what GzipFile is
[12:02] <lifeless> ok
[12:03] <lifeless> have you seen the timeit module?
[12:03] <lifeless> http://bazaar-vcs.org/RobertCollins has some examples showing its use
[12:03] <lifeless> basically
[12:04] <lifeless> python -m timeit -s 'one_hundred_million_bytes= "A"*1024*1024*100' -s 'import zlib' 'zlib.compress(one_hundred_million_bytes)'
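The same measurement can be done from Python with the timeit module's function API; this sketch uses a 1 MiB buffer rather than lifeless's 100 MB example so it finishes quickly:

```python
import timeit

# Setup runs once and is excluded from the measured time,
# just like the -s flags in the command-line form above.
setup = (
    "import zlib\n"
    "data = b'A' * (1024 * 1024)"
)

# Total wall-clock time for 10 calls to zlib.compress(data).
seconds = timeit.timeit("zlib.compress(data)", setup=setup, number=10)
print("%.4f s for 10 runs" % seconds)
```

Dividing `seconds` by `number` gives per-call time, from which a MB/s figure like keir's falls out directly.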