[00:00] <lifeless> jam-laptop: ok, reviewed.
[00:00] <lifeless> basically I think what you have done is the wrong approach, though the code is reasonable.
[00:01] <lifeless> jam-laptop: and I've suggested an alternative approach
[00:05] <jam-laptop> lifeless: I think that might end up returning data out of order
[00:06] <jam-laptop> coalesce will re-order to get better packing
[00:06] <lifeless> jam-laptop: so some options.
[00:07] <lifeless> we could coalesce knowing that we'll submit 200 at a time
[00:07] <jam-laptop> I suppose you could just not collapse them?
[00:07] <lifeless> so we have a window of 200 and we pack linearly out-of-order up to 200, then a new window etc
[00:08] <lifeless> another option is to do what I suggest but use that to accumulate the result rather than to return it
[00:08] <lifeless> that is, coalesce optimally, then do all the reads, then start yielding in-order.
[00:08] <lifeless> (also note that we have an out-of-order-is-ok option for readv)
[00:09] <abentley> Can't we just join the gaps between ranges until we have a suitable number of ranges?
[00:09] <jam-laptop> abentley: lifeless is saying that it ends up downloading too much
[00:09] <lifeless> abentley: that results in reading vast amounts of data
[00:09] <jam-laptop> that was my approach
[00:09] <lifeless> its exactly the bug that I filed
[00:10] <lifeless> now it turned out that the case I filed was trivially solved, but when it gets more complex we will still trigger bad behaviour
[00:10] <lifeless> the cost of a new round trip, for nearly anyone on the planet is < 2 seconds
[00:10] <abentley> Right.
[00:11] <lifeless> a 512kbps line - that's 1Mbit in those 2 seconds, or 128K
[00:11] <jam-laptop> I *do* think that multiple GETs is the right approach
[00:11] <jam-laptop> this was mostly expedient
[00:11] <jam-laptop> (Default fudge is 128 bytes, by the way)
[00:11] <lifeless> so any expansion over 128K is a net loss in this scenario.
[00:11] <abentley> Well, I'm going over to the dark side to see if I can't get the case-insensitivity problems fixed.
[00:11] <lifeless> faster links expand the amount we can tolerate, but also tend to reduce latency, shrinking it.
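The arithmetic behind lifeless's point (a ~2 s round trip at 512 kbps moves about 1 Mbit, i.e. roughly 128 KB, so coalescing that drags in more than ~128 KB of unwanted data loses to simply issuing another request) can be checked in a couple of lines. The numbers are the ones quoted in the conversation, not measurements, and 512 kbps is treated as 512,000 bits/s.

```python
def breakeven_waste_bytes(bandwidth_bps, rtt_seconds):
    """Bytes of extra (unwanted) download that cost as much as one
    more round trip: beyond this, splitting the request wins."""
    return bandwidth_bps * rtt_seconds / 8  # bits -> bytes

# 512 kbps line, ~2 s round trip: ~128 KB break-even, as quoted.
print(breakeven_waste_bytes(512_000, 2.0))  # -> 128000.0
```

As the log notes, faster links raise this break-even but also tend to have lower latency, which pushes it back down.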
[00:12] <lifeless> abentley: may the force be with you
[00:12] <abentley> tx.  laters.
[00:12] <jam-laptop> abentley: good luck, later
[00:25] <Peng> Hmm, anything happen since Thursday?
[00:32] <lifeless> code
[00:32] <Peng> Wow, pulling bzr.dev has been going for almost 5 minutes now.
[00:32] <fullermd> Well, it took me a minute and a half, and I pulled yesterday, so...
[00:33] <Peng> What did you do, update to Python 3k?
[00:34] <Peng> Rewrite it in Perl? :D
[00:34] <Peng> Just finished. 7 minutes.
[00:34] <Peng> It only downloaded r3011-3038.
[00:35] <lifeless> you're probably experiencing the http readv coalescing bug
[00:35] <lifeless> that we fixed:)
[00:37] <Peng> Were 'bzr cat' and bzr+ssh fixed for packs?
[00:38] <lifeless> yes
[00:38] <Peng> Yay.
[00:38] <Peng> 'Course, I'd have to use bzr.dev..
[00:38] <lifeless> though mixed packs and knits over bzr*:// may do bad things
[00:39] <Peng> lifeless: Mixed?
[00:39] <lifeless> its being investigated right now
[00:39] <Peng> Like having a pack repo locally and pushing to a knit repo?
[00:39] <lifeless> bzr pull bzr+ssh://someknitrepo local-pack
[00:39] <lifeless> pushing will be fine
[00:39] <Peng> Okay, good.
[00:39] <lifeless> pull may insert incorrectly formatted data.
[00:39] <lifeless> look at bugs tagged packs in launchpad
[00:39] <Peng> Um. How bad is that?
[00:40] <Peng> Is that like rm-the-repo bad or reconcile-will-whine-some-more bad?
[00:40] <lifeless> Peng: *if* it doesn't handle it correctly, you'll get raw knit hunks that are annotated in a pack repo; or vice versa.
[00:41] <lifeless> if the error checking is there (I think it is), it will just error.
[00:41] <lifeless> Peng: and this *only* affects pull.
[00:41] <Peng> Oh.
[00:41] <Peng> Does it affect clone?
[00:41] <lifeless> cloning is pulling
[00:41] <Peng> Or merge?
[00:41] <Peng> Okay.
[00:41] <lifeless> merging does a pull
[00:42] <lifeless> basically, if you have a pack repo today, don't pull/merge/whatever from a knit repository over bzr+ssh using bzr.dev until https://bugs.edge.launchpad.net/bzr/+bug/165304 has some comments on it
[00:42] <ubotu> Launchpad bug 165304 in bzr "smart server data streams not used across repository representations" [Undecided,New]
[00:44] <Peng> will*
[00:45] <Peng> lifeless: Like I said, is it rm-the-repo bad or reconcile-will-whine-some-more bad?
[00:45] <lifeless> rm the repo
[00:45] <fullermd> IWBNI(tm) the SS data were totally divorced from the repo formats.  That way the client wouldn't even have to be able to read or know anything about the server format (or vice versa), as long as it could store the requisite data.
[00:45] <fullermd> I'd also like a pony.
[00:45] <lifeless> fullermd: it is
[00:46] <lifeless> fullermd: the question is if the client code DTRT
[00:46] <fullermd> I'm pretty sure it isn't, as broadly as I mean it.
[00:47] <Peng> Hmm.
[00:47] <Peng> This affects 0.92 too, right?
[00:47] <lifeless> Peng: 0.92 is safe because bzr+ssh and packs will just die
[00:47] <Peng> Haha. That's right.
[00:48] <Peng> I think I'll stick with using sftp just to be safe.
[00:48] <fullermd> My bzr 1.0 client Should(tm) be perfectly able to read over bzr*:// a branch9/pack5 branch, as long as my local data format can store all the same info as the remote.
[00:48] <Peng> Even though I won't pull much.
[00:48] <Peng> Well, I don't expect to pull over bzr+ssh at all, but just to be safe...
[00:48] <lifeless> fullermd: yes, for the current smart server verbs, thats exactly what can be done
[00:49] <lifeless> fullermd: we haven't made all operations smart server verbs, so it won't actually /work/ but that's being picky :)
[00:49] <Peng> Hey, wait a minute. Isn't this a data loss bug? Or are they okay in bzr.dev?
[00:49] <lifeless> Peng: bzr.dev is fine
[00:49] <lifeless> Peng: sorry, I don't get what you mean quite;
[00:50] <fullermd> Well, in much the same way that we're totally faster than git; it's just incompletely implemented   ;p
[00:50] <Peng> The homepage claims you've never had a data loss bug or something.
[00:50] <lifeless> Peng: in a release
[00:50] <Peng> Heh.
[00:50] <fullermd> Riding dev head on anything is wild and wooly.
[00:50] <Peng> How many data loss bugs have snuck into bzr.dev? :X
[00:51] <lifeless> right now, there is this *potential* one. And its not necessarily a bug: we're checking.
[00:51] <fullermd> (says the guy running bzr head under his window manager head+ on his OS head...)
[00:51] <lifeless> its 'There might be a problem, please dont do X until we have checked'
[00:51] <lifeless> not 'There is a problem, dont do X'
[00:53] <Peng> Both mean "don't do X".
[00:53] <lifeless> right.
[00:53] <lifeless> and in a few hours we'll know if this is an actual bug or not
[00:54] <lifeless> I suspect its not, it will just error. But safety first.
[01:00] <lifeless> hi jelmer
[01:00] <Peng> Man, my cat sat down on my lap literally 30 seconds before I was planning to get up.
[01:01] <fullermd> That's pretty bad.  Most cats have much better timing than that; 30 seconds is long enough that you might think it a coincidence.
[01:02] <Peng> Heh.
[01:02] <Peng> I've been away for a few days too.
[01:02] <Peng> Anyway, bye.
[01:02] <lifeless> ciao
[01:03] <Peng> To help decide which VCS to use, are you guys dog or cat people? :)
[01:03]  * Peng wanders over to #mercurial to ask, and then leaves.
[01:03] <fullermd> Probably merits a FAQ entry...
[01:04] <lifeless> cat person
[01:05] <lifeless> is what I am
[01:33] <poolie> looking at bug 164443
[01:33] <ubotu> Launchpad bug 164443 in bzr "can't branch bzr.dev from codehost over bzr+ssh" [Critical,Confirmed] https://launchpad.net/bugs/164443
[01:34] <lifeless> poolie: I'm waiting for pqm to quiesce to reconcile it for you
[01:34] <poolie> the bzr.dev branch
[01:34] <poolie> you can kill my 'post-review cleanups' branch if that would be easier
[01:36] <lifeless> hmm,  I might do that.
[01:52] <lifeless> reconcile started
[01:58] <poolie> ok
[01:59] <poolie> do i need to resubmit that, or will you reenqueue it when it's done?
[01:59] <lifeless> you will need to resubmit
[01:59] <lifeless> it's in the replay set already
[02:00] <jelmer> the FAQ mentions tailor, svn2bzr and bzr-svn for migrating from svn (in that order)
[02:01] <jelmer> I think tailor shouldn't be listed first because it's going to give the worst results of the three
[02:01] <lifeless> agreed
[02:01] <lifeless> surely bzr-svn is the best
[02:01] <fullermd> That's so that by the time the user gets to bzr-svn, they really appreciate it   ;)
[02:02] <poolie> what is a replay set?
[02:07] <lifeless> poolie: the hash from the gpg signature is noted
[02:08] <lifeless> poolie: if we get the same again it is rejected
[02:08] <lifeless> stops replay attacks
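The replay set lifeless describes (note the hash of each GPG signature, reject any resubmission carrying the same signature) is essentially this. The class and its in-memory storage are a hypothetical sketch of the idea, not PQM's actual code.

```python
import hashlib

class ReplaySet:
    """Reject submissions whose GPG signature has been seen before."""

    def __init__(self):
        self._seen = set()

    def accept(self, signature_bytes):
        digest = hashlib.sha1(signature_bytes).hexdigest()
        if digest in self._seen:
            return False  # replay: the same signed request resubmitted
        self._seen.add(digest)
        return True

rs = ReplaySet()
print(rs.accept(b"-----BEGIN PGP SIGNATURE-----..."))  # True first time
print(rs.accept(b"-----BEGIN PGP SIGNATURE-----..."))  # False on replay
```

This is why poolie has to re-sign and resubmit rather than have the old request re-enqueued: the old signature's hash is already in the set.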
[02:15] <lifeless> Peng: bzr+ssh is safe
[02:15] <lifeless> Peng: it will just error
[02:17] <abentley> Phew!  Made it back with my soul intact.  Turns out the last version of bzr I had on Windows was 0.8
[02:17] <lifeless> rofl
[02:18] <lifeless> poolie: we're safe to change the default
[02:18] <abentley> bialix isn't kidding when he says it's messy.  I fixed four test cases for transform while I was there.
[02:18] <lifeless> https://bugs.edge.launchpad.net/bzr/+bug/165304
[02:18] <ubotu> Launchpad bug 165304 in bzr "smart server data streams not used across repository representations" [Undecided,New]
[02:18] <lifeless> abentley: cool
[02:19] <abentley> lifeless: I know that WorkingTree4 has a much stricter locking policy, but I wonder if we should have the same policy for is_executable on *nix and win32.
[02:19] <lifeless> abentley: I'm not sure what you mean
[02:19] <lifeless> could you phrase it as a change
[02:19] <abentley> *nix doesn't need a lock and win32 needs a lock.
[02:20] <lifeless> I said in a review to bialix that I thought requiring a lock on unix was fine
[02:20] <abentley> We could wrap is_executable in a lock decorator, or we could make it require a lock even on *nix.
[02:20] <lifeless> he had some massive cruft in his 'make things be known failures' patch
[02:21] <abentley> Okay, maybe that's a saner way forward.  Otherwise, we'll just introduce new lock failures without noticing.
[02:22] <lifeless> I think most things should require a lock; because it will help track adhoc api abuse
[02:22] <lifeless> that can cause errors and performance bugs
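The "wrap is_executable in a lock decorator" option abentley and lifeless discuss could look like the toy below: requiring the lock on *nix too, so ad-hoc unlocked API use fails loudly everywhere instead of only on win32. `needs_read_lock` and the `_lock_count` attribute are illustrative; bzrlib's real decorators (in `bzrlib.decorators`) differ in detail.

```python
import functools

def needs_read_lock(method):
    """Fail fast if the tree is not locked, on every platform."""
    @functools.wraps(method)
    def wrapper(self, *args, **kwargs):
        if self._lock_count == 0:
            raise RuntimeError("%s requires a lock" % method.__name__)
        return method(self, *args, **kwargs)
    return wrapper

class Tree:
    def __init__(self):
        self._lock_count = 0

    def lock_read(self):
        self._lock_count += 1

    def unlock(self):
        self._lock_count -= 1

    @needs_read_lock
    def is_executable(self, path):
        # Stub: real code would stat() the file or consult dirstate.
        return False
```

Calling `is_executable` on an unlocked tree raises immediately, which is how new lock-discipline violations get noticed in tests rather than shipped.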
[02:24] <abentley> I miss the days when locks were almost implicit.
[02:27] <lifeless> it was easy
[02:27] <lifeless> but it performed terribly
[02:30] <ubotu> New bug: #172483 in bzr "progress bars on http push behave badly" [Undecided,New] https://launchpad.net/bugs/172483
[02:32] <Peng> lifeless: Great.
[02:33] <abentley> lifeless: One thing I noted on win32 is that our config directory is still *\Bazaar\2.0\
[02:33] <abentley> I don't know if we *can* fix it, but it seems like it would be good.
[02:34] <lifeless> well, we'll reach 2.0 eventually :)
[02:36] <dewd> btw, I'm updating to trunk latest to give pack a new try with bzr+ssh push. :P hopefully it's going to work. hehe. exciting.
[02:46] <lifeless> dewd: okies
[02:46] <lifeless> dewd: it should packs->packs
[02:48] <dewd> i thought it was the default format already. looks like dirstate-tags is the one still
[02:52] <dewd> something wrong? I created a new packs repo on the remote server, and when branching it on the local machine with bzr+ssh, the local format says "dirstate-tags".
[02:55] <abentley> dewd: No, that's a bug.
[02:56] <lifeless> I'm just looking it up in fact
[02:56] <lifeless> bug 164628
[02:56] <ubotu> Launchpad bug 164628 in bzr "branching from hpss doesn't preserve repository format" [High,Confirmed] https://launchpad.net/bugs/164628
[02:57] <abentley> You can work around it by creating a branch in pack format, then pulling into it.  Or by not using the smart server for the branch command.
[02:58] <dewd> ok. thx.
[03:10] <spiv> Hello everyone.
[03:10] <spiv> I'm back from honeymooning.
[03:10] <lifeless> poolie: check passed; pushed to a packs branch, then pushed that to a new one gave the same size pack, now running check in the resulting branch
[03:10] <spiv> lifeless: good afternoon
[03:10] <lifeless> yo spiv
[03:10] <poolie> hello spiv
[03:10] <lifeless> you working or slumming?
[03:10] <spiv> poolie: hello.
[03:10] <spiv> lifeless: working a half day, starting now.
[03:10] <poolie> i should really go and see "Keating!"
[03:11] <spiv> poolie: I nearly saw that, but the night I had tickets for got cancelled due to a performer having a cold :(
[03:11] <poolie> lifeless, is pqm running? the web ui is stuck on the output from my killed job
[03:11] <poolie> it's still running
[03:11] <poolie> he's not very kind to spivs though :)
[03:11] <lifeless> poolie: yes, pqm is running, dunno why the UI shows that
[03:11] <lifeless> poolie: oh, I know
[03:11] <spiv> Heh.
[03:12] <poolie> lifeless, you're right, using a reconciled branch in the same way works fine
[03:12] <lifeless> poolie: it's got jobs, so the status file looks like the end of the run I canned
[03:12] <poolie> ?
[03:17] <poolie> spiv, quick call?
[03:18] <spiv> poolie: jml just got in first
[03:18] <spiv> poolie: I'll ping you when we're done?
[03:18] <poolie> just call me
[03:18] <poolie> lifeless, you were right it just needed reconcile
[03:19] <spiv> poolie: will do
[03:20] <Peng> Wow, almost 1 MB pack from just the last 6 days.
[03:21] <Peng> (... of changes to bzr.dev)
[03:22] <dewd> wow. packs rocks
[03:22] <abentley> lifeless: if you've got a branch where packs are the default, I'd be interested in running bzrtools against it.
[03:22] <lifeless> abentley: I posted a patch to the list
[03:23] <lifeless> just change the format name to pack-0.92 and it will work
[03:23] <lifeless> dewd: glad you like them
[03:24] <poolie> spiv, the streaming hpss changes don't seem to be mentioned in NEWS; can you please add something appropriate?
[03:24] <spiv> poolie: ack
[03:24] <thumper> hi spiv
[03:24] <abentley> spiv: welcome back.
[03:24] <thumper> spiv: welcome back
[03:24] <thumper> abentley: snap
[03:25] <abentley> heh.
[03:26] <dewd> lifeless: packs seems much more network friendly for me now in my sample tests. congrats. :-)
[03:26] <lifeless> dewd: we have quite a few future enhancements planned
[03:26] <lifeless> that will help even more
[03:27] <spiv> abentley, thumper: thanks!
[03:27] <dewd> it's a winner in my book. nice decision to go for it sooner. much more pleasant
[03:29] <poolie> great
[03:33] <lifeless> poolie: ok, it passed, migrating to escuderi
[03:40] <Peng> ISTM that if one is going to use packs, one should use bzr.dev.
[03:48] <lifeless> Peng: seems fair to me.
[03:48] <lifeless> in 0.92 they were explicitly tagged experimental
[03:48] <Peng> True.
[03:49] <Peng> I didn't expect that "experimental" would mean "half-broken", though.
[03:51] <Peng> Honestly, I thought that they were marked experimental because they weren't fully optimized and because you're planning to change the format a couple times in the future. I didn't know that there was more work to be done for them to work fully.
[04:01] <Peng> Hmm. "bzr diff -c X.Y" seems to be a lot slower than "bzr diff -c Z".
[04:10] <Peng> Woah, you made bzr.dev packs already?
[04:16] <lifeless> Peng: we'll issue more versions of packs as we change it.
[04:17] <Peng> Mm-hmm.
[04:17] <Peng> That's what I meant.
[04:17] <lifeless> ok.
[04:18] <lifeless> Peng: diff -c x..y will have to extract two full inventories vs 1, so if that's slow (many files) then yes.
[04:19] <Peng> lifeless: Not two dots, just one. The crazy revnos from branching and merging.
[04:19] <lifeless> oh, hmm..
[04:20] <Peng> It's not like I did real scientific testing. But when reading through the log of new things in bzr.dev and diffing some, it felt slower.
[04:21] <jamesh> Peng: do the two revisions change similar numbers of files?
[04:21] <Peng> jamesh: Didn't pay attention.
[04:21] <jamesh> i.e. could it be that the time taken is proportional to the size of the change rather than whether it is on mainline
[04:22] <Peng> Since I'm using packs, would you recommend using bzr.dev? I mean, I'm afraid of data loss and stuff, so I don't even always use the release candidates.
[04:22] <jamesh> the other check would be to see whether "bzr diff -c revid:whatever" differs in speed to the dotted lookup
[04:23] <lifeless> abentley: please resubmit your patch
[04:23] <abentley> Okay.
[04:23] <lifeless> pqm is missing some lock correctness stuff itself
[04:23] <poolie_> Peng, if you're concerned about data loss i'd generally recommend using releases
[04:23] <lifeless> right now bzr.dev looks solid to me
[04:24] <lifeless> Peng: we called it experimental cause it was; it was supported in 0.92, so if you had encountered corruption or something we'd have happily spent time fixing your data
[04:25] <lifeless> Peng: and we needed more feedback on it; I wouldn't call it half-broken. It was literally 4 days work to fix all the glitches we encountered making packs be able to be default
[04:25] <Peng> jamesh: Argh, not sure.
[04:25] <lifeless> packs took way more than 8 days to make :)
[04:25] <lifeless> poolie_: you too, please resubmit.
[04:25] <lifeless> I'll monitor the first one to go through
[04:25] <poolie_> again?! :)
[04:26] <lifeless> yup
[04:26] <Peng> jamesh: Um, yes. Using the revid for 3009.2.29 changes it from taking 8.5 seconds to 0.5.
[04:26] <lifeless> tree.lock_write() FTW
[04:26] <poolie_> done
[04:26] <jamesh> Peng: I guess you've found the reason then
[04:27] <jamesh> the dotted revnos are the sort of thing that could probably be cached in the branch though
[04:27] <jamesh> so it is something that could be fixed
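jamesh's suggestion that dotted revnos could be cached in the branch is just memoizing an expensive graph walk. The sketch below fakes the walk with a counter to show the shape of the fix; none of these names are bzrlib API.

```python
class Branch:
    def __init__(self):
        self._revno_cache = {}
        self.walks = 0  # counts the expensive graph traversals

    def _walk_graph_for(self, dotted):
        # Stand-in for the real traversal that makes `diff -c 3009.2.29`
        # take seconds while a plain revid lookup is near-instant.
        self.walks += 1
        return "revid-for-%s" % dotted

    def revid_for_dotted(self, dotted):
        if dotted not in self._revno_cache:
            self._revno_cache[dotted] = self._walk_graph_for(dotted)
        return self._revno_cache[dotted]

b = Branch()
b.revid_for_dotted("3009.2.29")
b.revid_for_dotted("3009.2.29")
print(b.walks)  # -> 1: the second lookup hits the cache
```

With the cache, only the first lookup of a given dotted revno pays the 8.5 seconds Peng measured; repeats cost a dict hit.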
[04:27] <Peng> lifeless: <kinda quiet voice>Well if it only took 4 days, you could've delayed the release.</>
[04:28] <Peng> (</> is valid SGML, right?)
[04:28] <lifeless> heh
[04:28] <lifeless> ' needs &amp; though
[04:28] <jamesh> Peng: depends on your sgml definition
[04:29] <jamesh> or whatever they call it
[04:29] <Peng> :D
[04:29] <lifeless> well, the release hasn't happened yet has it? And at 0.92 time that was a month ago, there was UDS and allhands, and time based releases
[04:29] <i386> lifeless: ping
[04:29] <lifeless> we didn't know at 0.92's timeframe how long the piece of string was, or what issues would arise
[04:29] <lifeless> i386: yo
[04:29] <lifeless> i386: you in SF?
[04:29] <i386> yo
[04:29] <i386> Yeah I am
[04:30] <Peng> lifeless: Well, maybe not half-broken, but 'bzr cat' and pushing over bzr+ssh didn't work, and there seems to have been a *lot* of work in bzr.dev to fix things.
[04:30] <jamesh> the SGML declaration is what I was thinking of
[04:31] <Peng> Will bzr.dev being packs make pulling a lot faster?
[04:31] <lifeless> Peng: try it ;), using bzr.dev because 0.92 has the http range combining bug
[04:33] <Peng> I was just about to try it, then I discovered I have two extra copies of bzr.dev from November 2, at r2954. I wonder why?
[04:33] <Peng> Both are dirstate-tags.
[04:33] <Peng> One is dirstate.
[04:33]  * Peng shrugs.
[04:34] <Peng> Using packs over bzr+ssh, is there any reason to upgrade the server to bzr.dev?
[04:34] <Peng> Or is 0.92 okay?
[04:35] <lifeless> you'll need bzr.dev, there were a few broken methods
[04:35] <Peng> Ooh.
[04:35] <Peng> That sounds unappealing.
[04:35] <Peng> Especially since I'll never remember to keep it up-to-date.
[04:35] <lifeless> just the one update
[04:35] <lifeless>  then at our next release you can switch back to releases
[04:36] <Peng> Should I run check and reconcile now? Or will it still report millions of non-error errors and take gigs of RAM?
[04:37] <lifeless> check and reconcile in bzr.dev are a-ok
[04:37] <Peng> Last time I tried a couple weeks ago it used 1.6 GB of RAM.
[04:37] <lifeless> they will check but not reconcile packs.
[04:37] <lifeless> they will check and reconcile knits
[04:37] <Peng> Blaah.
[04:37] <Peng> Never mind, then.
[04:37] <lifeless> pack reconcile is on my laptop and my top priority tomorrow
[04:38] <Peng> :)
[04:39] <lifeless> it got sidelined with this 'get packs ready for default'
[04:39] <Peng> Wow.
[04:40] <Peng> Err. Not wow at that.
[04:40] <Peng> With 0.92 and http://bazaar-vcs.org/bzr/bzr.dev/ being dirstate, pulling from 3010-3038 took some 7 minutes. With bzr.dev and the remote being packs, 3010-3040 took 1m15s. Nice.
[04:42] <Peng> Wait. The 1m15s test was also without a working tree.
[04:42] <Peng> But that couldn't take more than about 5 seconds.
[04:42] <lifeless> hmm, 1 minute is still slow
[04:43] <lifeless> I bet you its in graph traversal
[04:43] <lifeless> what speed adsl do you have?
[04:43] <lifeless> 1M is 16 seconds on 512
[04:43] <Peng> 1.5 down.
[04:48] <lifeless> so, 6 seconds + figuring out what to pull
[04:48] <lifeless> 4 round trips there minimum, should be able to do this in say 12 seconds total
[04:48] <lifeless> i.e. we still suck
[04:49] <Peng> Yikes.
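lifeless's "say 12 seconds total" is transfer time plus serial round trips. A simple cost model under his stated assumptions (about 1 MB of payload, a 1.5 Mbps downlink, 4 round trips, and an assumed ~1.5 s per round trip, which is not given explicitly in the log) reproduces the estimate:

```python
def pull_time_estimate(payload_bytes, bandwidth_bps, rtt_seconds, round_trips):
    """Lower bound on a pull: serial round trips plus raw transfer time."""
    transfer = payload_bytes * 8 / bandwidth_bps  # bytes -> bits
    return round_trips * rtt_seconds + transfer

# ~1 MiB over a 1.5 Mbps link with 4 round trips at ~1.5 s each:
# ~6 s transfer + ~6 s latency, versus the 75 s Peng actually observed.
print(round(pull_time_estimate(1_048_576, 1_500_000, 1.5, 4)))  # -> 12
```

The sixfold gap between the model and the measured 1m15s is the "we still suck" part: time spent in graph traversal and extra round trips, not on the wire.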
[04:49]  * jml takes his inner pedant out for a coffee
[04:49] <Peng> lifeless: Are you going to abandon the knit version of your repository branch now?
[04:51] <lifeless> Peng: I'll probably make it a packs version, so I can hack on the next experimental format
[04:52] <Peng> repository will be the new format, repository.knits will be the current format. Okies.
[04:54] <lifeless> Peng: well, I'll rename it too, for clarity :)
[04:55] <Peng> Right.
[05:00] <lifeless> ok, pqm is happy now
[05:00] <lifeless> so I'm off
[05:03] <Peng> Pushing packs over sftp, is bzr.dev any faster than 0.92?
[05:03] <Peng> (Creating a new branch.)
[05:04] <Peng> Wait, never mind. I am using bzr.dev.
[05:05] <Peng> (I was asking because if it was faster, I might want to kill a push I'm running and start over with bzr.dev. But since I am using bzr.dev, doesn't matter. Still might be interesting to know though.)
[05:07] <lifeless> should be the same
[05:07] <lifeless> if its a shared repo it should be really fast
[05:08] <lifeless> if its not really fast and is a shared repo, please file a bug, tagged packs.
[05:22] <jml> why does addCleanup enforce uniqueness of cleanup callables?
[05:23] <poolie_> jml, no good reason
[05:23] <jml> ok
[05:24] <poolie_> if you're motivated to do it, send a separate patch removing that check
[05:24] <jml> poolie_: the motivation is low.
[05:31] <ubotu> New bug: #172499 in bzr "Bazaar knit corruption "Not a gzipped file" " [Undecided,New] https://launchpad.net/bugs/172499
[05:36] <jamesh> ^^ sounds like the web server helpfully adding Content-Encoding or Transfer-Encoding headers
[05:48] <Peng> lifeless: I dunno. It took 25 minutes to push into an empty shared repo, but it was a large branch (bzr.dev), so that might not be bad.
[05:48] <lifeless> Peng: oh, new content has to upload; It's 50MB
[05:48] <lifeless> Peng: whats your upload rate
[05:50] <Peng> lifeless: I'm not sure. I thought it was 256k.
[05:50] <lifeless> so, thats 32 seconds/MB
[05:50] <lifeless> or 2MB/minutes - so 25 minutes
[05:50] <Peng> Okay.
[05:50] <lifeless> Peng: sounds like we got wire rate total time
[05:50] <Peng> And that's with sftp's overhead.
[05:51] <lifeless> not too shabby
[05:51] <lifeless> when we halve the size of storage it will get faster again
[05:51] <Peng> Halve? How?
[05:52] <lifeless> new storage level diff algorithms
[05:52] <Peng> New delta algorithm?
[05:53] <lifeless> yah
[05:53] <lifeless> ciao
[05:54] <Peng> Which one?
[05:56] <poolie_> lifeless, can you please chmod the .knits branch?
[06:21] <ubotu> New bug: #172506 in bzr "allow users to force an empty commit message" [Low,Confirmed] https://launchpad.net/bugs/172506
[06:34] <lifeless> wb
[06:34] <lifeless> poolie_: sure; will do shortly
[06:49] <lifeless> poolie_: done
[06:49] <poolie_> dude go home!
[06:50] <lifeless> I am :)
[06:50] <lifeless> just got dinner
[07:14] <spiv> lifeless: the memory consumption of reconcile is much better now; thanks!
[07:14]  * spiv stops work for the day
[07:18] <Peng> Definitely.
[07:18] <Peng> Wait, I only ran check.
[09:05] <Peng> 0.92's new "bzr help formats" is nice. :)
[09:20] <Odd_Bloke> Have I missed an email regarding a new release schedule (as bugs are now being targeted for alpha releases I've never heard of :p)?
[09:21] <Peng> Huh, Ctrl+Cing pushing from packs to dirstate tracebacked.
[09:21] <lifeless> Peng: file a bug please
[10:08] <mwhudson> when did 'bzr viz' become so much more awesome?
[10:08] <mwhudson> six months ago if i fired it up on launchpad it produced a window about 5000 pixels wide
[10:30] <Peng> Uh-oh. Using a copy of bzr not installed in the system means the plugins aren't available.
[10:42] <Olberd> Is there some way to find out what revision I had before a bzr pull
[10:42] <Olberd> A log or something
[10:43] <Peng> No.
[11:11] <Odd_Bloke> What can cause 'bzr shelve' to fail to remove changes from the working tree?
[13:01] <ubotu> New bug: #172567 in bzr "Per-file log is *very* slow with packs" [Undecided,New] https://launchpad.net/bugs/172567
[14:38] <Peng> lifeless: Seems my upload speed is 512k (1/3 of the downstream? huh). That would mean bzr over sftp is half the wire speed.
[14:47] <Odd_Bloke> Peng: I assume you're using ADSL, in which the A stands for asymmetric (that is to say, they use more lines for down than up as most people use more down than up).
[14:55] <Peng> Odd_Bloke: I know that. I thought it was lower than 1/3, though.
[14:56] <vila> jam-laptop: ping
[14:57] <jam> vila: <dodge>
[14:57] <vila> lol
[14:57] <jam> what's up?
[14:57] <vila> looking at doing multiple GETs for readv; it will be simpler if we don't have to respect the provided order for the offsets
[14:58] <jam> vila: you can look at the sftp readv code
[14:58] <jam> it does out-of-order fetching
[14:58] <jam> but only buffers what it cannot return
[14:58] <jam> all we really need is to change how we parse the values
[14:58] <vila> hooo
[14:58] <jam> I've done all that work
[14:58] <jam> I just haven't had time to rewrite the http response parser
[14:59] <vila> I think we can avoid that rewriting for now, but I have to look at sftp first
[15:00] <jam> well, I think all the actual parsing is fine
[15:00] <jam> it is just written to parse a complete response
[15:00] <jam> rather than parsing as you go
[15:05] <vila> hmmm, that will buffer quite a lot of data... well, the price to pay to respect the order while minimizing network use; I suppose you did the analysis at the time
[15:05] <vila> Do you think it's still valid with packs handling megs of data ?
[15:15] <jam> vila: in my testing the amount of out-of-order data was all very small
[15:16] <vila> so reading as you go buffered only small amounts of data, ok, let's say that ;-)
[15:16] <jam> vila: also if adjust_for_latency = True
[15:16] <jam> we can return things out of order
[15:17] <jam> But I think that may be taken care of in the base readv() implementation
[15:17] <jam> (it sorts them so the individual implementations don't know it is out of order)
[15:18] <jam> vila: And I believe most pack operations use that
[15:18] <vila> soooo, I don't have to worry
[15:18] <vila> phone
[15:46] <radix> Is there a way to get "bzr shelve" to be "tighter" in what it considers a single hunk?
[15:47] <radix> And by "tighter hunks", I mean what "bzr diff --diff-options=-U0" might give.
[16:02] <elmo> GGGGGGGGGGGGGGGGGGAR
[16:02] <elmo> bzr add precious; bzr rm precious
[16:02] <elmo> oh look, 'precious' is no more
[16:02]  * elmo stabs himself repeatedly
[16:03] <fullermd> Looooost!  Precious is lost!
[16:04]  * fullermd hugs his rm --keep alias.
[16:05] <luks> it doesn't actually remove anything that isn't in the branch history
[16:06] <elmo> luks: except in the example I just posted
[16:07] <elmo> I just did this IRL and lost like 70 files as a result
[16:07] <luks> what version of bzr?
[16:07] <luks> mine says "bzr: ERROR: Can't safely remove modified or unknown files:"
[16:07] <elmo> luks: 0.92
[16:10] <elmo> ah, it's a bit more involved
[16:10] <elmo> bzr init; bzr add precious; bzr ignore precious; bzr rm precious
[16:12] <luks> ouch
[16:12] <luks> that sounds like a pretty nasty bug
[16:13] <luks> it probably doesn't check the modified status for ignored files
[16:15] <warren> Will bzr upstream ever reconsider the decision to not track all permission bits in bzr?
[16:16] <warren> (not just execute)
[16:16] <warren> bzr is *really* close to being suitable to track system config files, for example.
[16:16] <elmo> warren: AFAIK, it's not that they don't want to do that, they just haven't decided quite how to
[16:16] <warren> elmo, oh
[16:16] <warren> elmo, yeah, I guess a solid plan would be necessary.
[16:17] <warren> there are other permission bits than the chmod XXXX
[16:17] <warren> there are also filesystem acl's and xattrs now
[16:17] <elmo> warren: lifeless mailed me just the other day, asking for details of how we use bzr for managing /etc and how we'd like permission management to work
[16:19] <jam> elmo: with the 'bzr ignore' involved I do see the file being deleted
[16:19] <elmo> jam: yeah, I filed it as https://bugs.launchpad.net/bzr/+bug/172598
[16:19] <ubotu> Launchpad bug 172598 in bzr "bzr rm <file> loses data under some circumstances" [Undecided,New]
[16:19] <jam> otherwise I see the "cannot safely remove modified files"
[16:20] <warren> elmo, ok, looking forward to a future of true permissions management =)
[16:23] <fullermd> I'd think the most likely near-term solution would be plugins using a per-file properties system (as might come about for line-ending and such support)
[16:25] <ubotu> New bug: #172598 in bzr "bzr rm <file> loses data under some circumstances" [Undecided,New] https://launchpad.net/bugs/172598
[16:46] <ddaa> What is the shellyscript way to find that a bzr tree has uncommitted changes?
[16:47] <LeoNerd> Something around bzr di   maybe?
[16:48] <ddaa> looks like it would work...
[16:48] <Peng> bzr st?
[16:48] <Peng> bzr modified
[16:48] <Peng> ?
[16:48] <ddaa> I was looking at bzr st, but it does not have non-zero exit status if uncommitted changes are found
[16:48]  * Peng shrugs.
[16:48] <ddaa> bzr di has
[16:49] <Peng> (modified would only show files that had changed, not other things like added files, of course.)
[16:49] <Peng> Something with python -c "import bzrlib..." ! :D
[16:49] <ddaa> "if ! bzr diff > /dev/null" it is
[16:50] <jam> ddaa: bzr status -q ?
[16:50] <jam> no, it isn't that quite
[16:50] <ddaa> jam: has zero exit status in all cases
[16:50] <jam> quiet
[16:51] <jam> diff seems to be the way to go
[16:51] <jam> I thought we had something different
[16:51] <ddaa> yup
[16:51] <ddaa> a bit wasteful, but no biggie
[16:51] <ddaa> jam: maybe we can have "bzr diff --no-diffs"? :)
[16:52] <fullermd> Maybe something with version-info --check-clean?  Think that triggers on unknowns, though...
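ddaa's `if ! bzr diff > /dev/null` works because `bzr diff` follows diff(1)'s convention: exit 0 for no changes, 1 for changes (higher codes signal errors). A scripted version of the same check might look like this; the exit-code interpretation is the point, and is split out so it can be stated precisely.

```python
import subprocess

def interpret_diff_exit(code):
    """Map a `bzr diff` exit code to a dirty/clean verdict.

    Follows diff(1)'s convention: 0 = no changes, 1 = changes;
    anything else is treated as an error rather than a verdict.
    """
    if code == 0:
        return False
    if code == 1:
        return True
    raise RuntimeError("bzr diff failed with exit code %d" % code)

def tree_has_changes(tree_dir):
    """True if the working tree at tree_dir has uncommitted changes."""
    result = subprocess.run(
        ["bzr", "diff"], cwd=tree_dir,
        stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL,
    )
    return interpret_diff_exit(result.returncode)
```

As noted in the log, this still generates the full diff just to throw it away, which is wasteful but harmless for scripting.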
[16:54] <ddaa> if I upgrade to packs, will "commit after a big crazy merge" stop being slow?
[16:54] <ddaa> It used to be fast, but it regressed quite badly in recent releases.
[16:54] <mwhudson> hmmmmmm
[16:54] <mwhudson> does pull with packs have a progress bar yet?
[16:55] <ddaa> mwhudson: ask linus, you do not need progress bars when you're fast ;)
[16:55] <ddaa> about "commit after big merge", it's a real issue for me
[16:56] <Peng> ddaa: Committing merges with packs can be/is slow. There's been recent work to make it much faster, but it didn't solve what made it slower than with knits in the first place.
[16:56] <Peng> ddaa: You could try it.
[16:56] <ddaa> Peng: not really
[16:56] <ddaa> upgrading the launchpad repo to packs requires something like 20GB RAM
[16:57] <ddaa> waiting for that to be fixed
[16:57] <mwhudson> ddaa: eh, no
[16:57] <mwhudson> ddaa: i pulled launchpad/devel into a packs repo and it was perfectly smooth
[16:57] <ddaa> am I confuzzled?
[16:58] <mwhudson> it's reconcile that takes/took megagigabytes of ram
[16:58] <ddaa> wasn't there this discussion about "bzr upgrade" eating crazy memory for the launchpad repo?
[16:58] <ddaa> ah right
[16:58] <ddaa> but upgrade won't work if it's not reconciled, right?
[16:58]  * mwhudson is confused too now
[16:59]  * ddaa holsters his confuzzlerizer
[17:00] <Peng> Some repos need to be reconciled, some don't.
[17:00] <ddaa> mh
[17:01] <ddaa> maybe some launchpad repos have this problem and others don't
[17:01] <ddaa> so maybe it's worth it for me to let bzr check run on mine
[17:18] <keir> is there any way to do something like bzr push --uncommitted? this is not a good description; what i want to do is push uncommitted changes in my working tree to the branch i'm bound to
[17:19] <keir> use case: i suddenly have to leave while working on my laptop. i'm 80% done a feature, but it doesn't even compile at the moment. i want to push my working tree to the branch i'm bound to, without actually committing.
[17:20] <keir> this happens to me a fair amount
[17:20] <keir> what i do now, is have many commits of 'junk state'
[17:20] <keir> but i'm trying to move to a more sanitized mainline
[17:20] <mwhudson> well, there's merge --uncommitted . other-location isn't there?
[17:21] <mwhudson> ah, merge --uncommitted -d other-location .
[17:21] <keir> oh cool
[17:21] <keir> but this won't work with a bound branch; so i'd have to unbind first?
[17:22] <mwhudson> though i don't know if that'll update a working tree over sftp or whatever
[17:22] <mwhudson> hm not sure, i don't use bound branches
[17:22] <mwhudson> keir: but hang on a moment
[17:22] <mwhudson> keir: there's another answer here: don't commit to mainline
[17:22] <keir> i didn't either until recently; they are pretty handy for my laptop/desktop combo!
[17:22] <keir> mwhudson, yes, i understand that option; i still don't like it because e.g. bisect becomes much less useful
[17:23] <fullermd> A third option would be to uncommit away the dirty rev if it bothers you that much.
[17:23] <keir> but then it'd be uncommitted on my desktop when i get back, but still committed on my laptop
[17:23] <fullermd> Well, you have to 'update' on the laptop anyway, so what's the difference?
[17:23] <keir> unless i made a script commit, unbind, uncommit, ssh into desktop, uncommit
[17:24] <keir> i suppose what i'm looking for is bzr synctree
[17:24] <fullermd> No, you just update.
[17:25] <keir> if i 'update' on my laptop after uncommitting on the desktop, does that uncommit the changes on the laptop?
[17:25] <fullermd> Pretty sure I've done it more'n once.
[17:26] <fullermd> update's pretty much like pull after all; move the head to the new rev, merge any outstanding WT changes.
[17:26] <fullermd> Just because the new head isn't a descendant of the old one doesn't change anything material there.
[17:26] <Odd_Bloke> keir: You might have to 'pull --overwrite', but I don't use bound branches much so I don't know.
[17:27] <keir> interesting
[17:28] <keir> i recently turned one of the machine learning researchers here at U of Toronto onto bzr; he generally likes it
[17:28] <keir> but had some problems: http://article.gmane.org/gmane.comp.version-control.bazaar-ng.general/33525
[17:28] <jam> ddaa, Peng: The commit after merge has been fixed for both knits and packs
[17:28] <jam> The problem was searching the global revision graph
[17:28] <jam> versus the per-file ones
[17:28] <jam> when finding common ancestor, etc.
[17:28] <keir> i didn't have a good answer to his issue, so i asked if he could post to the list, but no one has responded.
[17:28] <jam> Both are *valid*, just the global graph has more nodes to search
[17:28] <Peng> jam: Oh, cool.
[17:28] <jam> There is more work to be done (to make searching the graph faster)
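jam's point is that the common-ancestor answer is the same on either graph, but the per-file graph has far fewer nodes to walk. A toy sketch of the search involved (hypothetical names and a plain dict graph, not bzrlib's API):

```python
def ancestors(parent_map, rev):
    """All ancestors of rev (inclusive) in a {rev_id: (parent_ids,)} graph."""
    seen = set()
    pending = [rev]
    while pending:
        r = pending.pop()
        if r in seen or r not in parent_map:
            continue
        seen.add(r)
        pending.extend(parent_map[r])
    return seen

def common_ancestors(parent_map, a, b):
    """Revisions reachable from both a and b.

    The search cost scales with the graph walked, which is why a
    per-file graph (few nodes) beats the global revision graph
    (a node per commit) for the same answer.
    """
    return ancestors(parent_map, a) & ancestors(parent_map, b)
```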
[17:29] <ddaa> jam: thanks, I'll wait for my lpdev update, then :)
[17:29] <jam> ddaa: yeah, I don't recommend converting your personal one to packs until the upstream is packs.
[17:30] <jam> Oh, and the reconcile memory consumption has also been fixed
[17:30] <jam> I'm not sure what it would be for LP now
[17:30] <ddaa> cool
[17:30] <mwhudson> is b.control_files._lock.peek()['user'] really the easiest way to find out who has locked a branch?
[17:30] <jam> but for bzr.dev it dropped quite a bit
[17:30] <ddaa> that means we'll probably switch devpad to packs shortly after the next bzr release.
[17:31] <ddaa> I appreciate how keeping memory use in control is difficult in Python
[17:31] <keir> uncommit after a commit from a bound branch is not putting the changes back into the working copy
[17:31] <ddaa> it's much easier when you can use custom allocators.
[17:31] <ddaa> that's probably the only thing I think really sucks with Python
[17:33] <jam> ddaa: well, having an algorithm that reads all inventories and buffers them in memory
[17:33] <jam> is just a poor algorithm
[17:33] <jam> It stored a slightly more efficient form
[17:33] <jam> but still had a record for every file in every version
[17:34] <ddaa> Sure, but you'd have found out earlier if you had a bounded cache mechanism with a custom allocator, so you could have tested it with different memory limits.
[17:34] <ddaa> It's a recurring problem in bzr.
[17:35] <ddaa> "If we do not cache, it's slow, if we cache it eats tons of memory."
[17:35] <Peng> LRUCache?
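Peng's suggestion is the standard answer to ddaa's dilemma: cache, but bound the cache. A minimal LRU cache sketch (just the idea, not bzrlib's actual LRUCache class):

```python
from collections import OrderedDict

class LRUCache:
    """Bounded cache that evicts the least-recently-used entry.

    Keeps memory under a fixed limit while still caching the hot
    entries -- the middle ground between "no cache, slow" and
    "unbounded cache, eats tons of memory".
    """
    def __init__(self, max_size):
        self._max_size = max_size
        self._data = OrderedDict()

    def __getitem__(self, key):
        value = self._data[key]
        self._data.move_to_end(key)  # mark as recently used
        return value

    def __setitem__(self, key, value):
        if key in self._data:
            self._data.move_to_end(key)
        self._data[key] = value
        while len(self._data) > self._max_size:
            self._data.popitem(last=False)  # drop the oldest entry

    def __contains__(self, key):
        return key in self._data
```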
[17:35] <fullermd> Well, Ubuntu used to send out those free CD's.  Can't we send out free DIMM's to bzr users?   ;)
[17:36] <ddaa> (I realize this is a blanket statement and that there are many bad reasons for things to be slow or eat tons of memory, I am just trying to make a point about memory management in general)
[17:36] <Peng> fullermd: Sign me up!
[17:36] <ddaa> if we do that, people will complain that those DIMMs do not have free VHDL.
[17:37] <ddaa> So we balanced the PR implications, and decided not to ;)
[17:37] <fullermd> "Lightsabres aren't open source!"
[17:41] <ubotu> New bug: #172612 in bzr "new commit is overly verbose" [Undecided,New] https://launchpad.net/bugs/172612
[18:43] <Peng> Wow, bzr 0.12 is only 1 year old. You've been busy.
[18:44] <lifeless> vila: packs make an effort to request in-order for .pack
[18:44] <lifeless> vila: and same for indices - but with indices they set the flag saying out-of-order-ok
[18:45] <lifeless> elmo: and are you going to mail me back :)
[18:45] <lifeless> ddaa: yes packs commit merge is faster
[18:46] <lifeless> ddaa: reconcile is fast now, in .dev
[18:47] <Peng> You were just adding dotted revnos and bzr+http.
[18:48] <Peng> (I wonder if bzr+http worked then. Grumble grumble.)
[19:03] <lifeless> hi jam, how is the missing-compression-parent patch coming ?
[19:04] <lifeless> ddaa: actually, we knew it was bad on memory in 0.92
[19:04] <lifeless> ddaa: I don't think the actual developer realised how bad memory issues become in the sort of scaling environments we encounter
[19:08] <lifeless> tap tap tap is this thing on?
[19:14] <keir> lifeless, i think your tapping knocked ddaa's connection out
[19:15] <lifeless> :)
[19:26] <gander> Hi. Does anyone here know anything about these bzr irc logs? <http://bzr.arbash-meinel.com/irc_log/bzr/2007/>
[19:28] <james_w> gander: that would be jam
[19:29] <gander> Why would he do that? It's a spammer's delight
[19:30] <james_w> why?
[19:30] <jelmer> gander: why's that? hostmask accidently matches email address?
[19:31] <gander> Because anyone can trawl for email addresses off it. Can't they?
[19:33] <james_w> yes, but the percentage of working email addresses you would get would be tiny.
[19:33] <fullermd> It'd be pretty poor trawling, I'd think.  The density of addresses is miniscule.
[19:33] <lifeless> most users' email and hostmask don't match the reverse lookup of the machine they IRC from.
[19:33] <lifeless> that said, sanitising them should not be hard. jam: ^
[19:34] <fullermd> Hey, the more nonexistent hosts spammers try, the less time they'll spend rapping on my servers   :p
[19:34] <gander> Looks like I need to mask mine then
[19:35] <lifeless> you'll likely find other channels you are in are also logged
[19:36] <Peng> If you participate in FOSS development, your email address is screwed anyway. Bug trackers, VCS author information, mailing lists ...
[19:36] <gander> Pah! :(
[19:36] <lifeless> Peng: yes, but gander may be just using bzr :)
[19:36] <Peng> Right.
[19:36] <lifeless> luks: ping
[19:36] <luks> pong
[19:36] <lifeless> hi
[19:37] <lifeless> nice performance regression
[19:37] <lifeless> well, not nice, but nice catch noticing it
[19:37] <luks> the file log?
[19:37] <lifeless> yes
[19:37] <lifeless> I've replied with an analysis of the callgrind
[19:37] <lifeless> we can probably trivially drop it to 7 minutes by removing the duplicate call to get_revision_graph
[19:38] <luks> 7 seconds, I hope :)
[19:38] <lifeless> the other two will take a little more work; if you're interested in hacking on this code ...
[19:39] <luks> well, I have generally very mixed feelings about packs, so I'm probably not motivated enough :)
[19:40] <luks> maybe it's just me, but it feels a lot slower on many local operations, which is what I'm primarily interested in
[19:40] <lifeless> luks: file bugs, lots of bugs.
[19:40] <luks> it's all caused by parsing the big indexes
[19:40] <lifeless> luks: I know for absolute fact that its faster on many local operations; but we're likely not looking at the same data set, nor the same operations.
[19:41] <luks> yep, simple things like diff or st are faster
[19:41] <luks> historical diff and st, I mean
[19:41] <luks> but for example just generating a bundle for 'send' is slower
[19:42] <lifeless> luks: this is why it was experimental in 0.92; and I was expecting a little more time to find performance regressions (where 'all history' became more expensive, but 'partial history' became possible) and fix them
[19:42] <lifeless> bug 165309 will help with the index performance
[19:42] <ubotu> Launchpad bug 165309 in bzr "pack index has no topological locality of reference" [Medium,Confirmed] https://launchpad.net/bugs/165309
[19:42] <lifeless> but changing all-history API usage to partial-history API usage is the key thing needed to improve performance
[19:43] <lifeless> knits are incapable of partial-history operations
[19:43] <lifeless> but they were tuned to do all-history very fast
[19:43] <lifeless> it turned out that this was not a good enough solution
[19:43] <luks> knits with better indexes would be, IMO
[19:44] <luks> the current text index for packs probably needs some kind of subindex to make it fast
[19:45] <lifeless> luks: the current text index for packs is /way/ faster than knit indices at extracting say 1% of keys
[19:46] <lifeless> anyhow, bug 165309 will deliver a subindex of sorts
[19:46] <ubotu> Launchpad bug 165309 in bzr "pack index has no topological locality of reference" [Medium,Confirmed] https://launchpad.net/bugs/165309
[19:46] <Peng> I think uncommit is slower with packs.
[19:46] <luks> lifeless, which is why diff -c X and st -c X are fast
[19:46] <lifeless> luks: anyhow, here's what I'd *love*. File a bug anytime a pack operation is slower than a knit operation.
[19:47] <luks> ok
[19:47] <lifeless> they will nearly all involve commands accessing too much history
[19:47] <fullermd> That's interesting.  I wouldn't expect uncommit to need to do any repository actions at all.
[19:47] <lifeless> Peng: please file a bug.
[19:47] <lifeless> I'll mail the list right now.
[19:47] <fullermd> I guess to find the parent, maybe.
[19:48] <lifeless> luks: can I quote you in my mail? I'd like to give people context about what I'm asking
[19:48] <luks> sure :)
[19:49] <Peng> Shrug. Uncommit takes a couple seconds with packs, and I don't remember it being non-instant before.
[19:49] <Peng> I'll confirm it in a few minutes.
[19:49] <Peng> But now to brush my teeth!
[19:52] <buxy> Hi, I'd like to convert an existing CVS repository to bzr
[19:53] <buxy> I tried using tailor but it fails: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=439124
[19:53] <ubotu> Debian bug 439124 in tailor "tailor: Fails to convert debian-admin CVS repository to bzr" [Important,Open]
[19:53] <Peng> lifeless: Hey, what's the status of pack reconcile?
[19:53]  * Peng goes back to tooth-brushing.
[19:53] <fullermd> buxy: Have you tried using the cvsps-import plugin?  IME it's a lot more reliable than tailor, as well as _much_ faster.
[19:54] <buxy> fullermd: no, do you have a pointer?
[19:54] <ksmith99> Hi folks, I have a share-repository I have been checking  out of but the dns name got changed... what do I do to use the new name? thxs
[19:54] <buxy> oh, found at http://bazaar-vcs.org/CVSPSImport
[19:55] <buxy> I'll look into it, thanks
[19:57] <marianom> ksmith99: I know you can use --remember with "pull" to make it change the default location
[19:58] <marianom> do you guys know when bzr-gtk 0.92.x will be in the repository http://bazaar-vcs.org/releases/debs ?
[19:58] <ksmith99> marianom: thanks, looks like a good start
[19:59] <marianom> ksmith99: this is not as clean (and maybe it's bad advice) but you can change the parent branch address by editing .bzr/branch/parent
[20:00] <lifeless> Peng: I'm hacking on it as we speak
[20:01] <lifeless> marianom: we're not currently building bzr-gtk; next monday I should get time to do repository stuff
[20:02] <marianom> thanks lifeless
[20:06] <ksmith99> marianom: thanks, I might try that if remember doesn't work
[20:10] <Peng> Is there a command to make a repository default to having working trees or not?
[20:11] <fullermd> The bzr install hacks your system to add those capabilities to the 'rm' and 'touch' commands   ;)
[20:12]  * Peng wanders off.
[20:12] <lucasv1> hi
[20:13] <lifeless> hi
[20:13] <lifeless> Peng: you can rm .bzr/repository/no-working-trees; I don't think we have a UI for it yet
[20:13] <lucasv1> I have moved a bzr repository to a different server, now it give me the following error: http://pastebin.org/9773
[20:14] <lifeless> Peng: there is a python API to change it
[20:14] <lifeless> luks: use a modern bzr
[20:14] <lifeless> meh
[20:14] <lifeless> lucasv1: use a modern bzr
[20:14] <lifeless> 0.11 is very old
[20:14] <lifeless> we're at 0.92 today
[20:15] <buxy> Similar question, I have a git repository I'd like to convert to bzr, I found bzr-git but there's no doc on how to use it
[20:15] <lifeless> bzr-git is incomplete
[20:16] <lifeless> jam: was most recently working on it
[20:17] <lucasv1> lifeless: it's debian stable  I think
[20:17] <lucasv1> that's what my hoster gives me
[20:17] <lifeless> lucasv1: you've been using a more recent bzr
[20:17] <lucasv1> possible, I have an ubuntu workstation
[20:17] <lifeless> lucasv1: the one you have switched to cannot handle kind changes properly
[20:17] <lucasv1> ah ok
[20:18] <lifeless> so, you need to get them to upgrade; newer bzr's are in debian backports
[20:19] <luks> lifeless, damn, now you got me interested in packs again :)
[20:19] <lifeless> luks: my email ?
[20:19] <luks> I'm looking at the code right now, but is there some limit when a pack is considered "final" and won't be repacked?
[20:19] <fullermd> aleph-null?   ;)
[20:19] <luks> yes + the irc discussion
[20:19] <luks> _max_pack_count seems kind of random
[20:19] <lifeless> luks: no; we could consider that though. Some things to consider - each item is moved log10(total commits) times.
[20:20] <lifeless> luks: its arbitrary yes, but not random :)
[20:20] <luks> well, for 100000 revisions you get only 1 pack, for 99 revisions you get 18 packs
[20:20] <luks> not very good distribution, IMO
[20:20] <lifeless> at 100000 revisions, the things committed in the first commit will have been copied 5 times.
[20:21] <lifeless> luks: and you'll need to do 900000 commits to get them copied again
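The pack counts being quoted (99 revisions → 18 packs, 999 → 27, 100000 → 1) fit a digit-sum rule. A sketch of what `_max_pack_count` appears to compute, reconstructed from the numbers in this conversation rather than copied from bzr's source:

```python
def max_pack_count(total_revisions):
    """Target number of packs for a repository with this many revisions.

    Reconstruction (not bzr's actual code): the pack count is the sum
    of the decimal digits of the revision count.  Each commit adds a
    pack; when a size tier fills up (ten 1-revision packs, ten
    10-revision packs, ...), that tier is combined one level up, so
    any given item is recopied about log10(total commits) times.
    """
    if total_revisions == 0:
        return 1
    return sum(int(digit) for digit in str(total_revisions))
```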
[20:21] <luks> well, let's say I have 999 revisions, so I have 27 packs
[20:22] <lifeless> right
[20:22] <luks> now I do a commit and all of them get repacked
[20:22] <lifeless> at that point you're paying quite a high latency price
[20:22] <luks> which means I have to reupload the whole repository
[20:22] <lifeless> so we correct that by dropping it to 1
[20:22] <lifeless> luks: with the smart server, once streaming push is implemented, you won't upload the repository again; instead the server itself will do the repack.
[20:22] <luks> I think lucene-like repacking would work better
[20:23] <luks> well, but dumb servers are a big advantage of bzr :)
[20:23] <lifeless> do you have a link for that ?
[20:23] <luks> umm, no I don't think so
[20:23] <lifeless> if you can find one I'd be delighted to read up on it.
[20:23] <luks> but the point is that you have a ~ logarithmic distribution of index sizes
[20:24] <luks> and you combine smaller indexes into bigger ones
[20:24] <lifeless> luks: so far this is exactly what we do
[20:24] <luks> but almost never touch the "top" indexes
[20:24] <luks> well, 27 -> 1 is not exactly that :)
[20:24] <lifeless> luks: thats precisely that. You're sampling only the times that we do touch the top indices.
[20:24] <luks> lucene would take for example the lower 5 ones and pack those
[20:24] <lifeless> luks: again, thats exactly what we do.
[20:25] <luks> nope
[20:25] <lifeless> say you have 899 revisions, with 26 packs
[20:25] <luks> it always maintains a certain number of indexes
[20:25] <luks> it will never drop to 1
[20:25] <luks> unless you manually merge them
[20:26] <lifeless> well, we could do that easily; just change the distribution logic
[20:26] <lifeless> you could do that with a plugin in fact :). The code that takes a distribution already chooses to only move around the smallest amount of revisions possible
[20:27] <lifeless> there are two problems that it has though, that seem to me to need serious consideration
[20:27] <lifeless> one is scaling
[20:27] <lifeless> if you have a really big repository its reasonable to pay more latency
[20:28] <lifeless> so a constant number of indices is really bad for small repositories (too much latency), or really bad for big repositories (too much repacking)
[20:29] <luks> the number is not constant, it depends on the whole index size in lucene
[20:29] <lifeless> ok
[20:29] <lifeless> the other thing is that if you are going to touch two indices its cheapest to just combine them together rather than in some other form
[20:29] <lifeless> this isn't quite represented correctly in the autopack.
[20:30] <lifeless> its something I realised after doing the initial code and before coming back to it.
[20:30]  * lifeless files a bug
[20:32] <jelmer> lifeless: is there some way to tell the code how often to repack?
[20:33] <jelmer> lifeless: I'm thinking of mass-imports that could potentially be speeded up if the number of repacks could be reduced
[20:34] <lifeless> jelmer: well, you'll trade index latency off against repacking.
[20:35] <lifeless> jelmer: if mass imports are to be improved, improve the autopack logic.
[20:35] <lifeless> jelmer: (but I do understand the desire to control this through code; I'm entirely open to a disable_autopack() api call.)
[20:36] <lifeless> luks: so I've filed a bug about that. if we're going to read N packs, only ever bother writing 1 new pack.
[20:36] <lifeless> I don't know if lucene does this.
[20:37] <lifeless> but basically its silly to aim for a constant number. Because - to keep it constant, you have to shuffle data *between* your indices[packs]
[20:37] <lifeless> and as each pack is write-once.
[20:38] <lifeless> this means reading N packs and writing M packs, where N=M+1 once you hit your constant number
[20:38] <luks> well, the point is that you shuffle only the small ones or in small incremental steps
[20:38] <lifeless> (each new commit adds one pack, then you put it into your constant-sized collection)
[20:38] <lifeless> luks: but thats what we do? We only shuffle the smallest ones
[20:39] <lifeless> luks: I feel like we're talking past each other here.
[20:40] <luks> yes, but you drop the number of packs radically
[20:40] <luks> that's the only thing I don't like -- I don't want to reupload most of my repository if I really don't have to
[20:40] <lifeless> its at power of 10 intervals
[20:40] <ubotu> New bug: #172644 in bzr "autopacking should just combine all packs it touches" [Undecided,New] https://launchpad.net/bugs/172644
[20:41] <lifeless> reuploading most of your repo occurs extremely rarely.
[20:41] <lifeless> and mathematically I can't see that it happens any more often than lucene would do it
[20:42] <lifeless> for instance, bzr is at 15K commits now after 3ish years.
[20:42] <luks> in terms of total bandwidth used, probably not
[20:42] <lifeless> our next full-upload is going to be at 100K commits
[20:42] <luks> but in the case of lucene, it would be done in small steps
[20:42] <luks> here it's a "all or nothing" situation
[20:43] <lifeless> luks: then you have described lucene badly; because logarithmic expansion of size will behave identically
[20:43] <lifeless> at some point lucene will decide to touch a top index
[20:43] <lifeless> and make a bigger one
[20:43] <luks> yes, and it would be more often than in case of bzr
[20:43] <lifeless> that bigger one will have 10 times what the previous top had in it (or whatever logN they are using)
[20:44] <luks> but the distribution of pack sizes would be different
[20:44] <luks> so you will never be in a situation where you have to reupload/redownload the *whole* repository
[20:44] <lifeless> I think you are wrong about the impact and frequency; rather than debating - point me at code/docs.
[20:44] <Peng> With bzr.dev working on a fresh copy of bzr.dev, "bzr uncommit --dry-run" uses 0.41 user seconds in dirstate, 0.86 in dirstate-tags, and 3.84 in packs.
[20:44] <lifeless> Peng: please file a bug as per my email
[20:45] <luks> can't find anything except API docs, which doesn't explain it at all
[20:45] <Peng> I haven't seen your email, but I will.
[20:45] <lifeless> Peng: thanks!
[20:45] <lifeless> Peng: my email contains a request for a callgrind file
[20:45] <lifeless> Peng: just of the packs version
[20:46] <Peng> Just of the packs version? Okay.
[20:46] <lifeless> yah
[20:46] <Peng> Will the time spent waiting for me to hit enter to exit it hurt callgrind?
[20:46] <lifeless> rarely need a knit version, its usually obvious where packs are slow
[20:46] <lifeless> Peng: not at all
[20:46] <lifeless> Peng: but you can probably enter in advance
[20:47] <lifeless> luks: so for clarity; I'm open to improving autopacks user experience.
[20:47] <lifeless> luks: It will need serious sum-of-operations and frequency-of-size-impact analysis.
[20:47] <Peng> I'll file a bug in a second.
[20:48] <lifeless> luks: I did this for the current code, and I'm very happy to help with the analysis of a proposed replacement algorithm.
[20:49] <jam> Peng: you can also use "bzr uncommit --force" to avoid the prompt.
[20:49] <lifeless> luks: for instance, here's a replacement algorithm: pack the smallest packs we want to get rid of into a new pack bigger than any current pack.
[20:49] <luks> hmm
[20:49] <lifeless> luks: this would guarantee that we always leave at least the current top in place; and only do this when the sum of the smaller packs is bigger than the current top, so autopack would be about choosing a 'top' to use
[20:50] <Peng> jam: Oh, good. That even works with --dry-run.
[20:50] <lifeless> (so if we are leaving, say, the 10K packs alone, we might choose a 1K pack as the 'top' to spin around. Then all the < 1K packs become a new ~10K pack.)
[20:50] <lifeless> this of course has other problems. We currently have the situation that most data most operations want is in the smallest packs.
[20:51] <lifeless> because we strictly shuffle small->next size up.
[20:51] <jam> lifeless: don't forget that pull just creates 1 new pack of all streamed data, right?
[20:51] <lifeless> and that lets us optimise the order in which we attempt to retrieve index entries as well as providing high locality of data
[20:51] <lifeless> jam: right
[20:52] <jam> do we have 'push' doing the same thing?
[20:52] <lifeless> jam: it still tends to preserve this property.
[20:52] <luks> oh, even pull over http?
[20:52] <lifeless> luks: yes, pull over http -> one new pack.
[20:52] <lifeless> jam: yes push does the same thing
[20:52] <luks> so it autopacks everything into a single pack on the client side?
[20:52] <lifeless> luks: single *new* pack.
[20:52] <jam> luks: basically it just figures out what it will need, starts streaming the data to the pack, and then finishes once it got everything.
[20:52] <Peng> What's the pack bzr+ssh thing that corrupts the repo?
[20:52] <jam> (into a new pack)
[20:52] <lifeless> luks: then after the transaction it checks to see if it needs to autopack the whole repo.
[20:52] <lifeless> Peng: there isn't one.
[20:53] <lifeless> Peng: it was a concern that was all.
[20:53] <Peng> lifeless: Okay, right. Just an error.
[20:54] <luks> er, I actually didn't realize you can "pick" only some revisions from pack even over http
[20:54] <lifeless> luks: so here are the parameters: We want adjacent texts to combine into the same pack as we combine packs, so that they can be efficiently delta compressed. We want to provide a natural cap on latency. We always have to replace an entire pack - we can't delete from a pack, nor add to one. We never want a user to have to worry about the internals.
[20:55] <lifeless> luks: over http we: perform bisection in 64K chunks on the .*ix indices. Then we plan a readv for each pack, and readv the data we want precisely out of the pack file.
[20:56] <lifeless> luks: we do one readv per pack for the revisions we need. Then one readv per pack for the inventories we need. Another for the text deltas. And finally one for digital signatures.
[20:56] <lifeless> luks: each readv is done in ascending order to make range combining as efficient as possible.
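That ascending-order planning is what makes range combining effective. A rough sketch of the coalescing step (a hypothetical helper, not bzr's actual readv code; the fudge value is illustrative):

```python
def coalesce_ranges(ranges, fudge=128):
    """Combine nearby (offset, length) read requests into larger ones.

    Requests are sorted ascending, and gaps of up to `fudge` bytes
    are absorbed into the previous range, so fewer HTTP range
    requests / seeks are needed at the cost of reading a few extra
    bytes in the gaps.
    """
    combined = []
    for offset, length in sorted(ranges):
        if combined and offset <= combined[-1][0] + combined[-1][1] + fudge:
            start, old_len = combined[-1]
            end = max(start + old_len, offset + length)
            combined[-1] = (start, end - start)  # absorb into previous range
        else:
            combined.append((offset, length))
    return combined
```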
[20:56] <luks> yep, that makes my concerns about redownloading completely invalid
[20:57] <lifeless> we're very strong about downloading; your concerns about re*uploading* are valid.
[20:57] <luks> I thought people will have to download whole packs if I repack something on the server
[20:57] <lifeless> but a reupload does not screw readers over :)
[20:57] <lifeless> luks: ah! any suggestion about where you got that impression?
[20:58] <luks> from the fact that packs are write-once and then only for reading
[20:58] <luks> so I thought bzr would treat it as one big blob
[20:58] <lifeless> luks: ah; you didn't join the 'better at partial operations' thing to that
[20:58] <lifeless> luks: patches for doco appreciated :)
[20:58] <luks> lifeless, I did, but only for indexes, not for the packs themselves :)
[20:59] <lifeless> brb
[21:10] <ubotu> New bug: #172649 in bzr "uncommit is slow with packs" [Undecided,New] https://launchpad.net/bugs/172649
[21:11] <Peng> :D
[21:16] <lifeless> thanks pen
[21:17] <lifeless> *Peng*
[21:17] <Peng> Once someone called me Feng a bunch of times.
[21:17] <Peng> I should change my nick.
[21:17] <lifeless> heh
[21:17] <lifeless> to Shui?
[21:18] <Peng> Heh.
[21:18] <Peng> That might work.
[21:18] <Peng> Aww, already taken.
[21:18] <lifeless> jam: ping
[21:18] <Peng> Actually, registered 2 years 33 weeks ago and last seen 2 years 33 weeks ago. The opers might delete it if I ask.
[21:19] <jam> lifeless: pong
[21:19] <lifeless> jam: how is the missing-compression-parent patch coming
[21:19] <jam> ok, I'm still at a bit of a loss for how to force corruption into the repository
[21:19] <jam> that, and I got side-tracked reporting a performance impact for "bzr status" after a merge
[21:19] <lifeless> there's a BrokenRepository test base class somewhere
[21:19] <jam> get_ancestry(revision_id) is a lot slower now
[21:19] <lifeless> what I do with such things is note them and move on
[21:20] <lifeless> yah, thats a typical size_history vs partial_history tradeoff
[21:20] <jam> well, you had asked for a callgrind, and I have trouble not looking at it myself
[21:20] <lifeless> jam: :)
[21:20] <lifeless> I was just saying what I do :)
[21:20] <jam> all the time is spent in _lookup_keys_bylocation
[21:20] <lifeless> yup
[21:20] <lifeless> the next index layer will remove bisection
[21:21] <lifeless> so I am not interested in tuning this
[21:22] <jam> is there any way to upload a file as part of the bug submission?
[21:22] <lifeless> if you use the web ui
[21:22] <jam> yeah, but I wanted to "just send an email"
[21:22] <jam> so I don't have to click 10 times to set the status and the tags
[21:22] <jam> but then I don't get a bug number for a long time
[21:23] <jam> to upload the callgrind too
[21:23] <lifeless> https://bugs.launchpad.net/malone/+bug/30225
[21:23] <ubotu> Launchpad bug 30225 in malone "Attach files via email" [High,Confirmed]
[21:23] <jam> to
[21:30] <ubotu> New bug: #172657 in bzr "bzr status after merge slower with packs" [Undecided,Triaged] https://launchpad.net/bugs/172657
[21:44] <lifeless> jam: ping; can I ask for an insta-review?
[21:44] <lifeless> jam: my trivial progress bar fix - I'd like to flush that
[21:44] <jam> I'll give it a shot
[21:45] <lifeless> its 3 lines or so :)
[21:45] <jam> I don't see it in BB or in my email
[21:46] <jam> lifeless: you can paste-bin if you want
[21:46] <lifeless> grah
[21:46] <lifeless> http://rafb.net/p/ArhA5197.html
[21:48] <jam> lifeless: you are changing that function into a generator
[21:48] <jam> which with python2.4
[21:48] <jam> will fail
[21:48] <jam> try/finally doesn't work with a yield there
[21:48] <jam> bb:reject
[21:48] <jam> sorry
[21:48] <lifeless> oh foo
[21:48] <jam> python2.5 only
[21:48] <lifeless> fuckity
[21:48] <jam> Since people have troubles even getting 2.4, I don't think we want to depend on 2.5 just yet :)
[21:49] <jam> I know we discussed it a long time ago, but it really seems like at least RHEL is taking *forever* to get past 2.3
[21:49] <Peng> 2.6 is close to alpha, right? What about it?
[21:49] <lifeless> try: except ?
[21:49] <lifeless> I don't even have 2.4 to test on now
[21:49] <jam> lifeless: try/except works (I believe), but I'm not sure what you would catch.
[21:49] <lifeless> Exception: and raise
[21:50] <jam> and then a plain "pb.finished()" as well?
[21:50] <jam> (outside the exception)
[21:50] <jam> we can go for it
[21:50] <lifeless> try:except Exception: finished();raise() else: finished()
[21:50] <jam> lifeless: I don't really like it, but BB:approve anyway
[21:50] <jam> not much else we can do, unless we want to allocate the pb in a non-generator
[21:51] <lifeless>         try:
[21:51] <lifeless>             for result in self._do_copy_nodes_graph(nodes, index_map, writer,
[21:51] <lifeless>                 write_index, output_lines, pb):
[21:51] <lifeless>                 yield result
[21:51] <lifeless>         except Exception:
[21:51] <lifeless>             pb.finished()
[21:51] <lifeless>             raise
[21:51] <lifeless>         else:
[21:51] <lifeless>             pb.finished()
[21:51] <lifeless> aka what try:finally: stands for.
[21:51] <lifeless> bastards.
[21:51] <lifeless> ^ ok with that ?
[21:51] <jam> lifeless: except it isn't guaranteed to finish
[21:51] <jam> because someone in 2.4 may not consume the whole generator
[21:51] <lifeless> right
[21:51] <jam> 2.5 actually guarantees it will finish
[21:52] <lifeless> thats ok with me
[21:52] <lifeless> we control all the consumers
[21:52] <jam> anyway, BB:approve, though the "correct" fix is to move pb = ui.ui_factory.nested_progress_bar() up a level
[21:52] <lifeless> it gets fugly fast if we do that
[21:52] <lifeless> there are multiple callers of this
[21:53] <lifeless> and that sort of cruft is what we want pb's to avoid
[21:53] <jam> lifeless: go ahead
[21:53] <jam> lifeless: maybe with a comment so people understand the bit of cruftiness
[21:58] <lifeless> done
[22:19] <warren> bzr diff has -r, but is there an equivalent for a tag instead of revision number?
[22:19] <fullermd> -rtag:FOO
[22:20] <fullermd> (see 'bzr help revisionspec' for the full list of possibilities)
[22:21] <lifeless> bbs
[22:22] <warren> fullermd, thx =)
[22:53] <poolie> hello
[22:59] <lifeless> poolie: please save me from elevator music
[22:59] <poolie> jam, lifeless, igc, spiv : ping
[23:00] <jam-laptop> conference time, right poolie ?
[23:00] <poolie> yep
[23:00] <poolie> i'm in
[23:11] <lifeless> jam-laptop: to setup a repo where a fetch will find a missing text parent, here is an easy way
[23:11] <lifeless> jam-laptop: repo a: do two commits, A and B that add and alter a file foo, with id foo-id
[23:11] <lifeless> repo b: do one commit, A, that does not add the file foo
[23:12] <lifeless> then fetching into b from a will only copy the deltas for B, and the text parent for foo-id will be absent.
[23:12] <lifeless> long term this test has some flaws, but it will ensure that the sanity checking we need today is in place; and defers the other work until we have code that will need it, which is a good thing.
[23:13] <jam-laptop> lifeless: why wouldn't fetching into 'repo b' copy both A and B
[23:13] <jam-laptop> and thus fetch the basis
[23:13] <lifeless> jam-laptop: because B already has an 'A'
[23:13] <jam-laptop> or are you giving them the same revision id
[23:13] <jam-laptop> ok
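A toy model of the scenario just described (plain dicts standing in for repositories; none of the names below are bzrlib API) makes the failure mode concrete: because repo b already has a revision id 'A', fetch copies only B's delta, leaving foo-id's text parent absent.

```python
# Toy model, not bzrlib API: each "repo" maps a revision id to the
# file texts that revision introduced or changed.
repo_a = {
    'A': {'foo-id': 'foo v1'},   # commit A adds foo (file id foo-id)
    'B': {'foo-id': 'foo v2'},   # commit B alters foo
}
repo_b = {
    'A': {},                     # repo b's commit A never added foo
}

def fetch(target, source, stop_revision):
    """Copy only the revisions (and their texts) the target lacks."""
    for rev in ['A', 'B']:       # ancestry order, oldest first
        if rev not in target:
            target[rev] = dict(source[rev])
        if rev == stop_revision:
            break

fetch(repo_b, repo_a, 'B')
# Only B was copied, so foo-id's text parent (A's version) is absent:
assert 'foo-id' in repo_b['B']
assert 'foo-id' not in repo_b['A']
```

After the fetch, any code that tries to build foo's full text from its delta chain will look for the parent text from 'A' and not find it, which is exactly the sanity-checking case the test is meant to exercise.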
[23:14] <lifeless> alternatively
[23:14] <lifeless> commit A and B
[23:14] <lifeless> then
[23:14] <lifeless> repo_b.add_inventory(repo_a.get_inventory(A))
[23:14] <lifeless> repo_b.add_revision(repo_a.get_revision(A))
[23:14] <lifeless> repo_b.fetch(repo_a, 'B')
[23:14] <lifeless> *boom*
[23:14] <jam-laptop> adding the inv and revision without adding the texts....
[23:14] <jam-laptop> ok
[23:15] <jam-laptop> thanks
[23:17] <lifeless> np :)
[23:23] <lifeless> poolie: yes pqm is sending mails AFAIK
[23:25] <poolie> lifeless, yes, it just seemed to be delayed somewhere
[23:25] <lifeless> abentley: diff is snappy now; thanks
[23:25] <lifeless> abentley: in particular no-op diff is 1/2 sec
[23:32] <lifeless> poolie: I want time before going on leave to fix bzr-email to be fast
[23:32] <lifeless> poolie: it's by far the slowest part of commit
[23:35] <lifeless> abentley: ping; are you on packs yet? :)
[23:43] <jam-laptop> well, I've proven that Knit1 repositories *don't* fail under those conditions
[23:43] <jam-laptop> because they always copy all missing revisions in the ancestry
[23:43] <jam-laptop> we probably knew that already
[23:44] <lifeless> good;
[23:44] <lifeless> knit->pack will today do the same thing; a test that it stays like that would be good
[23:44] <lifeless> pack->pack will explode
[23:44] <lifeless> well
[23:44] <lifeless> fail to explode
[23:44] <jam-laptop> right
[23:44] <jam-laptop> :)
[23:44] <lifeless> where's the ka-boom.
[23:45] <lifeless> there's meant to be an earth shattering KA-BOOM
[23:45] <poolie> lifeless, well, performance of plugins is less important than performance of the core
[23:47] <lifeless> poolie: depends on the plugin though; if you were evaluating a VCS, would you setup email on commits ?
[23:48] <poolie> probably not in my initial evaluation
[23:48] <poolie> (tbh)
[23:48] <lifeless> poolie: I think the core has to support high performing plugins; this is likely a case where core changes are needed
[23:49] <poolie> it's important, just imo not as important as the stability and performance of the core tool
[23:49] <lifeless> 10:48 < lifeless> poolie: I think the core has to support high performing plugins; this is likely a case where core changes are needed
[23:49] <lifeless> because deployments will eventually expand out to need the whole ecosystem for a given project to be fast
[23:49] <lifeless> anyhow, after friday