/srv/irclogs.ubuntu.com/2008/02/11/#bzr.txt

lifeless	abentley: ok, bit of a missive, but its on its way	00:19
=== kiko-afk is now known as kiko-zzz
abentley	lifeless: At first blush, it looks like violent agreement.	00:29
abentley	I've just skimmed so far.	00:29
=== mw|out is now known as mw|out|out
lifeless	abentley: I had the sense from you that other things than bytes would be on your 'storage' object, but if not then it was just mutual confusion I think	00:35
abentley	Well, graph data also, I guess.	00:36
lifeless	thats something I'm explicitly rejecting at this point	00:36
lifeless	I don't know if I'm right too	00:37
abentley	Well, the build graph seems like it must be there.	00:37
lifeless	abentley: thats under the wraps	00:37
lifeless	rephrasing, you don't need to expose the fact compression has happened at all outside the byte store	00:38
abentley	Okay, I guess we're in disagreement, then.	00:38
lifeless	thats good; it means there's more to explore :)	00:39
abentley	There are several things we can't implement without that.	00:39
lifeless	I'm going to go for a walk and get lunch and think about those things, if you could list them now :)	00:39
abentley	One of them is Goffredo's inventory hack.	00:39
abentley	Another is fetching via the stream thing that Andrew and you added.	00:40
lifeless	it doesn't depend on exposing deltas in the api	00:40
lifeless	it depends on exposing 'the set of lines from all these texts'	00:40
abentley	Another is iter_changes based on a semantic inventory delta.	00:41
abentley	That's all I can think of at the moment.	00:41
lifeless	I can't parse that last one.	00:41
abentley	You know your journalled inventory stuff?	00:41
lifeless	oh right, well that doesn't do deltas within the byte store, its a form of fulltext always	00:42
lifeless	its a layer up	00:42
abentley	Is that the right layer?	00:42
lifeless	but the stream-fetching one I will cogitate on, I think I didn't /quite/ have my ducks in a row	00:42
abentley	If it's a layer up, does that mean we can't get a comprehensive inventory directly from the multi-versionedfile?	00:43
abentley	lifeless: Oh, sorry plenty more.	00:44
lifeless	I don't think inventory should be in a versioned file abstraction at all eventually. We will want an index for the inventory tree, and then read the lot.	00:44
abentley	Anything that tries to accelerate test comparisons using knit deltas.	00:44
lifeless	'test comparisons' ?	00:44
abentley	s/test/text	00:45
abentley	Currently, annotate and send use that.	00:45
lifeless	how does that work?	00:45
abentley	We derive the SequenceMatcher.get_matching_blocks output from the knit delta and a pair of fulltexts.	00:46
abentley	The fulltexts are used to fix up the eol bogosities.	00:47
lifeless	sounds like what you want is 'byte_store.get_matching_blocks(from_key, to_key)' ?	00:48
lifeless	lets not detail it now	00:48
lifeless	lets just keep thinking of issues	00:48
abentley	lifeless: At least some of the time, you want the fulltexts as well as the matching_blocks.	00:49
abentley	I think the new "knit merge" also uses that information.	00:49
abentley	ie the matching_blocks.	00:50
lifeless	abentley: to come back - sure; we can handle that variation I think	00:51
abentley	Yeah.	00:52
lifeless	-> to think. (inventory hack, journalled_inv, stream_fetch, get_matching_blocks)	00:52
abentley	So I think the use cases for raw compression artifacts are 1. transmission between repos, 2. use of partial data, 3. acceleration of comparisons.	00:52
poolie	lifeless, good point about acting as an ssh agent	00:54
=== BasicMac is now known as BasicOSX
lifeless	poolie: thanks ;)	01:14
poolie	so	01:14
poolie	really this is a broader reason not to use builtin ssh, more than anything else	01:14
lifeless	Yes	01:14
lifeless	if you want ssh credential caching, use an ssh agent, kthxbye. IMNSHO	01:14
poolie	lifeless, quick call?	01:15
lifeless	sure	01:16
=== jamesh__ is now known as jamesh
cr3_	are there any plans to have a dapper .deb on the ppa?	03:02
=== cr3_ is now known as cr3
Solarion	is there a way to import the revision history of an RCS directory?	03:47
lifeless	sure	04:14
lifeless	cvs init a repo	04:14
lifeless	copy the RCS files into a subdir there	04:14
lifeless	use cvsps-import	04:15
abentley	lifeless: The other reason I wanted access to raw records was repository stacking.	04:19
lifeless	abentley: that would interact badly with different delta formats cross-repo	04:20
abentley	lifeless: that depends how stupid we are.	04:21
lifeless	lol	04:21
abentley	I had the impression that we were going to explore the possibility of multiple delta formats per repository anyhow.	04:22
lifeless	I think that for stacking we basically do iter_file_texts on the repos we're stacked against, for every component we can't create ourselves	04:22
lifeless	annotation is the other thing not really covered	04:23
abentley	Producing unnecessary fulltexts takes CPU time.	04:24
abentley	So if we can treat the foreign repo raw entries the same as local ones, that can aid performance.	04:25
abentley	John discovered that we were wasting time inserting lines in the middle of lists building fulltexts from knits.	04:27
abentley	That's why MPDiff can generate a fulltext from the top down without generating intermediate fulltexts.	04:27
=== aadis_ is now known as aadis^
lifeless	I think this is an undesirable abstraction violation in the general case; the cost of getting data from a remote repo in the first place dwarfs local cpu cost	04:28
=== aadis^ is now known as aadis
lifeless	secondly, the number of stacked components is going to be small - I don't imagine many getting more than 3-4 steps	04:29
abentley	Which is the abstraction being violated? delta compression?	04:30
lifeless	and that will mean we have 3-4 points where we enforce a full text basis	04:30
abentley	I think that we have enough use cases for compression deltas that it's reasonable to question whether that abstraction is a helpful one.	04:33
lifeless	I'm refining it now, the great thing about strawmen is they get comments	04:33
lifeless	I really want something that can be consistent across repositories sensibly, and I don't think something exposing deltas itself can; not the way we handle deltas today	04:34
abentley	lifeless: I think just tagging deltas with their format goes a long way.	04:37
igc	lifeless: re hg-import, the guts of install_revision in there does diff from the same in repository.py ...	04:47
igc	is that the per-file graph issue ?	04:48
lifeless	you're a little opaque in that sentence	04:48
igc	in repository.py, that bit of code has a "FIXME: TODO:" from yourself and abentley fwiw	04:48
igc	sorry, I'll try again ...	04:49
lifeless	I haven't looked at hg-import in great detail; but as I know of no reason for it to differ from bzr-hg, I'm suspicious from the get-go	04:49
* igc looks up line numbers		04:49
=== asac_ is now known as asac
igc	lifeless: in repository.py, the for loop beginning with a comment on line 1973 is the bit of interest	04:54
igc	in the hg-import plugin, that routine is largely repeated but that inner loop is different - perhaps a copy of some older code at a quick guess	04:55
lifeless	meh	04:55
lifeless	so why does hg-import exist?	04:56
lifeless	what does it do differently to 'bzr pull' with bzr-hg installed ?	04:56
igc	http://rafb.net/p/drJQkk16.html is the code btw	04:56
igc	hg-import exists because bzr-hg doesn't work ...	04:57
igc	and Lukas found it easier to write it that fix bzr-hg	04:57
igc	s/that/than/	04:57
lifeless	its definitely buggy	04:57
abentley	Lukáš also thinks that bzr-hg does its topo sorting wrong.	04:58
lifeless	If I was spending time on this I would be updating bzr-hg, because its more generally useful, and can have the bzr repository conformance tests run against it	04:58
lifeless	which if taken to completion would give -real- confidence in it	04:58
igc	lifeless: my actual focus right now is the git->bzr converter. I'd like to see hg-import merged into bzr-hg and whatever issues bzr-hg has fixed. I'm ok to do that once other stuff is off my plate	05:02
igc	I'm raising this now simply because ...	05:03
igc	users are using hg-import in the absence of a working bzr-hg so if it's broken ...	05:03
lifeless	meh, I hope its got a different namespace	05:03
lifeless	if it doesn't I'll be seriously miffed	05:03
igc	then we ought to be sure key people like you know about it	05:03
lifeless	its an incompatible converter	05:04
igc	the namespace prefixes as hg: (bzr-hg) vs hg- (bzr-hgimport)	05:06
lifeless	good	05:07
igc	s/as/are/	05:07
jamesh	lifeless: by the way, you can replace the code in http://www.advogato.org/person/robertc/diary/78.html with "python -mtrace -t program.py"	05:29
johnny	mwhudson, what is the current state of the loggerhead_dev branch?	05:31
abentley	igc: You mean hg-import doesn't use a colon in its prefix?	05:37
jml	is there an API in Bazaar to make all non-existing directories above a directory?	05:38
spiv	jml: there's a "_create_prefix" function in bzrlib.builtins	05:40
spiv	jml: not exactly a public API, but you could crib from it I guess.	05:40
jml	spiv: thanks	05:40
mwhudson	johnny: it works better than anything else	05:42
mwhudson	johnny: i have a few plans for further improvements but they're a ways off realistically	05:42
johnny	just wondering why there wasn't a newer release on the loggerhead page	05:42
mwhudson	ah, yes, i should make a release	05:42
mwhudson	i'm lazy when it comes to releases :)	05:43
johnny	i noticed that you were still messing with it recently via the launchpad page	05:43
johnny	also, the demo of loggerhead seems not to work	05:43
johnny	it kinda hangs	05:43
johnny	same for bzr-webserve too strangely enough	05:43
johnny	proxy error i think	05:43
lifeless	jamesh: thanks. (Groan at wheel invention)	05:46
lifeless	jml: I would use _create_prefix directly.	05:47
lifeless	jml: not public just means 'wont get deprecated'	05:47
lifeless	:)	05:48
jml	lifeless: yeah, that's what I'm doing	05:48
lifeless	jamesh: when was the trace module added ?	05:50
jamesh	lifeless: not sure. It has been around for a while though	05:50
jamesh	lifeless: http://svn.python.org/view/python/trunk/Lib/trace.py <- it has been in the standard library since 2003, and was in Tools/scripts before that	05:51
lifeless	rotfl	05:51
lifeless	thanks	05:51
bob2	-mtrace has worked since 2.4	05:51
spiv	Yeah, the "-m" was new in 2.4 IIRC.	05:54
igc	abentley: hg-import's bzr_revision_id(node) does this: return 'hg-' + mercurial.hg.hex(node)	05:58
jamesh	the module was in 2.3 too, but you couldn't use the "python -m" syntax, yes	05:58
abentley	igc: Well, I can't say I'm surprised.	05:58
lifeless	igc: (thats not namespaced in our terms, so they could conceptually collide more easily)	05:59
abentley	But that is an entirely legal revision-id for bzr itself to generate.	05:59
lifeless	I'm glad he didn't use hg: though, because colliding with bzr-hg would have been hilarious	06:00
abentley	lifeless: It used to. I asked him to change it.	06:00
lifeless	abentley: thank you!	06:00
abentley	np	06:00
igc	abentley, lifeless: so what is the convention here?	06:01
igc	bzr-git is using git-experiemental-r:	06:01
igc	as the prefix	06:01
lifeless	igc: if you are generating random ids, use bzr to create them	06:01
spiv	I think ideally the prefix ought to be (or at least include) the name of the plugin, so you know who to blame ;)	06:02
igc	as an experiment, I changed that to git-r: in one of my test runs and it saved a fair amount of space	06:02
lifeless	igc: if they are deterministic, namespace them with CONVERTER:, and change CONVERTER whenever the algorithm changes	06:02
abentley	Revision ids that include a ':' will never be generated by bazaar-- the ':' is a namespace separator.	06:02
abentley	Revision ids ending with ':' are reserved.	06:03
igc	thanks	06:03
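The convention described above (a deterministic converter puts its ids under a `CONVERTER:` prefix, Bazaar itself never generates ids containing ':', and ids ending in ':' are reserved) can be sketched roughly as follows. This is only an illustration: the helper names and the `hg-import-v1` converter name are hypothetical and are not part of bzrlib or any plugin.

```python
def make_revision_id(converter, native_id):
    """Build a deterministic, namespaced revision id (hypothetical helper)."""
    # The converter name is the namespace; it must not contain the separator.
    if not converter or ':' in converter:
        raise ValueError('converter name must be non-empty and contain no colon')
    # An id ending with ':' would fall into the reserved space.
    if not native_id or native_id.endswith(':'):
        raise ValueError('native id must be non-empty and must not end with a colon')
    return '%s:%s' % (converter, native_id)


def is_namespaced(revision_id):
    """True if the id contains the ':' namespace separator."""
    return ':' in revision_id
```

Under these assumptions, make_revision_id('hg-import-v1', 'abc123') yields an id with a colon in it, while an id such as 'hg-abc123' is not namespaced, so Bazaar could in principle generate the same string itself. Changing the converter name whenever the algorithm changes keeps ids from different algorithm versions apart.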
jamesh	igc: so doing the same import twice produces the same results? (same file IDs, revision testaments, etc?)	06:18
igc	jamesh, yes, that's the point of deterministic ids as I understand it	06:18
jamesh	what do you use as file IDs?	06:19
jamesh	(out of interest)	06:19
igc	the different converters all do something slightly different it seems	06:19
jamesh	yep. It depends on what sort of file identity rules the source VCS has	06:20
igc	jamesh: the git does this: file_id.replace('_', '__').replace(' ', '_s')	06:20
igc	which looks a little suspect to me	06:21
jamesh	where file_id is what?	06:21
igc	path.encode('utf-8')	06:22
lifeless	igc: thats ok, brittle, but ok.	06:22
igc	ah good	06:22
igc	I was concerned about any existing sequences of _s	06:23
igc	that couldn't be mapped back the other way uniquely IIUIC	06:23
lifeless	git is a rename-free system	06:23
jamesh	igc: "_s" would get encoded to "__s"	06:23
lifeless	so paths are fine but project specific	06:23
lifeless	things like svn and hg that support some form of rename are much more complex	06:24
lifeless	you need to find the tail of the per-file graph	06:24
lifeless	and assign a unique id (using the path of the name at that point is reasonable)	06:24
igc	ah - ok	06:24
spiv	igc: that escaping scheme is unambiguous, although it'd be easy for the unescaping to be buggy...	06:24
lifeless	spiv: what unescaping :)	06:25
igc	:-)	06:25
spiv	Ah, good point. Problem solved, then ;)	06:25
jamesh	spiv: right. Doing it in two passes (like the escaping is done) would be buggy.	06:25
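The points made above about the escaping scheme can be checked with a small sketch. The escape expression is the one quoted in the discussion; both unescape functions are hypothetical and exist only to show why a single left-to-right pass works where two global replaces do not. The sketch works on str for simplicity, whereas the quoted code operated on a UTF-8 encoded path.

```python
def escape_path(path):
    """Escape a path using the expression quoted in the discussion."""
    # Order matters: double the underscores first, then encode spaces.
    return path.replace('_', '__').replace(' ', '_s')


def unescape_single_pass(escaped):
    """Hypothetical inverse: read left to right, one escape pair at a time."""
    out = []
    i = 0
    while i < len(escaped):
        ch = escaped[i]
        if ch == '_':
            pair = escaped[i:i + 2]
            if pair == '__':
                out.append('_')
            elif pair == '_s':
                out.append(' ')
            else:
                raise ValueError('invalid escape at offset %d' % i)
            i += 2
        else:
            out.append(ch)
            i += 1
    return ''.join(out)


def unescape_two_pass(escaped):
    """Naive inverse using two global replaces; wrong for some inputs."""
    return escaped.replace('_s', ' ').replace('__', '_')
```

A literal "_s" in a path escapes to "__s", which the single-pass reader decodes back to "_s". The two-pass version misreads the same input, because its first replace cannot tell that the "_s" it sees is the tail of an escaped underscore.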
lifeless	other fugly thing is paths are long	06:25
jamesh	that was a problem for bzr-svn in the past, right?	06:25
lifeless	I'd probably use the revision-id at time of file creation + a serial within the tree for number of files added in that revision numbering via alpha-sort, or something like that	06:26
jamesh	git does have some idea of tracing a file's history over renames, so simply using the path as a file ID will give a different view of history	06:27
lifeless	jamesh: pickaxe you mean? Thats always derived	06:28
lifeless	(its history mining. lolz. hahaha)	06:28
igc	lifeless: so IIUIC, a repo converted from other tool may well have a different (usually bigger?) size than a vanilla bzr repo and might also perform differently	06:29
igc	I wonder how different on a large repo like the OOo one	06:29
igc	s/other/another/	06:30
jamesh	lifeless: I was thinking of "git-log -M"	06:30
jamesh	I don't know if that's the same thing as pickaxe	06:31
lifeless	igc: if you use something like tailor, no. Because its non-deterministic and equivalent to serially doing bzr commits	06:31
lifeless	jamesh: looks like - note the 'detect renames'.	06:31
lifeless	igc: if you use something designed primarily as a foreign repository interface, then yes, because we're thunking across to the native metadata.	06:31
igc	makes sense to me	06:32
=== aadis_ is now known as aadis
lifeless	abentley: updated proposal sent	06:41
abentley	Cool. Good night.	06:42
=== aadis__ is now known as aadis^^
=== aadis__ is now known as aadis
=== doko_ is now known as doko
=== aadis_ is now known as aadis
=== AnMaster_ is now known as AnMaster
ubotu	New bug: #190832 in bzr-svn "PROPFIND exception during check out of Subversion branch behind https" [Undecided,New] https://launchpad.net/bugs/190832	09:24
ubotu	New bug: #190843 in bzr-svn "Attempting a lightweight checkout raises KeyError exception" [Undecided,New] https://launchpad.net/bugs/190843	09:24
appcine	What am I doing wrong here?	10:02
appcine	My previous workflow was (using svn): client: commit, server: update -- restart web server. done.	10:02
appcine	My new workflow: client: bzr commit, bzr push .. server: bzr merge, bzr commit -- restart web server. done.	10:03
appcine	If I do not run the bzr commit on server, it complains the next time I merge about having uncommitted changes	10:03
appcine	Am I doing something wrong, or is this my new life? :)	10:06
luks	appcine: you can do exactly the same as with svn	10:07
luks	that is, server: update	10:07
luks	that is, if you push directly to the published branch on the server	10:08
luks	otherwise you want pull instead of merge and commit	10:08
garyvdm	Or - client: bzr commit , server: bzr pull	10:08
garyvdm	luks - you beat me :-)	10:09
luks	:)	10:09
appcine	hmm	10:11
appcine	so I get this: bzr: ERROR: These branches have diverged. Use the merge command to reconcile them.	10:12
luks	because you did commit	10:12
appcine	then I merge, but can't because I have uncommitted changes.	10:12
luks	so the branches don't match anymore	10:12
appcine	So I commit, then merge again	10:12
appcine	Hehe. I guess this wasn't made to be used like this :)	10:13
luks	nope	10:13
luks	merge is for merging branches :)	10:13
garyvdm	or server: bzr pull --overwrite	10:13
luks	but be sure you have no local changes with that	10:13
appcine	Perfect.	10:14
appcine	:)	10:14
appcine	I'm not changing anything on server .. It's my way of updating the server source	10:14
luks	right, so you want pull	10:14
garyvdm	--overwrite will only be necessary this first time. From now on, just bzr pull	10:15
fullermd	Or push into it and update.	10:16
appcine	garyvdm: Yeah, testing it now. Neat! :)	10:16
appcine	fullermd: I couldn't get bzr+ssh working on the server .. besides, I want three repositories.. one backup, one working copy and one live version on server	10:17
appcine	working copy = development copy	10:17
fullermd	Well, but if the live running version is always just a duplicate of one of the others, is there really a need for it to have a separate copy?	10:18
fullermd	(which isn't intended as a rhetorical "Why, of course there isn't", but it's not obvious that there is)	10:18
appcine	fullermd: Well, I want all my code in one place. I'm running several projects.	10:19
appcine	fullermd: If I'm not on my computer and the server for project #1 screws up, I can still access the code	10:20
appcine	fullermd: And I can burn it to dvd from just one location	10:20
appcine	Gives me some kind of freedom. I always know where all my code is. If that server gets borked, I re-push from my personal computers or the servers where I've distributed the code.	10:21
appcine	And it's the perfect test-server .. if I need to test something on a machine that's accessible from outside of our office lan, I can just launch a project from the intermediary server :)	10:22
appcine	So. What I've done now is translate my svn work flow into bzr. Removed the need for a cumbersome process of adding a new project on the svn server, I am more agile. Given myself the ability to commit locally (something I rarely do though). bzr ignore is soo much easier than anything that svn has :) I haven't tested branching yet, but I hear that's a lot easier as well.	10:29
appcine	What else should I be considering? :D	10:29
=== RichardL_ is now known as rml
johnny	hmm fun.. trying to get loggerhead running under lighttpd	11:16
=== aadis_ is now known as aadis
weigon__	johnny: works as planned ?	11:54
johnny	not yet	11:54
johnny	sadly	11:54
igc	jelmer: ping	11:56
jelmer	igc: pong	11:56
johnny	getting cherrypy.msg: : Page handler: "The path '/loggerhead' was not found."	11:57
igc	jelmer: how well does svn-import scale?	11:57
johnny	i could be wrong on how to set it up on the lighttpd side	11:57
igc	I'm about to try the OOo repo ...	11:58
jelmer	igc: As well as the rest of bzr-svn	11:58
jelmer	igc: Ah..	11:58
=== weigon__ is now known as weigon
igc	it's 76K files, 506K revisions!	11:58
jelmer	igc: I'm not sure :-)	11:58
weigon	johnny: let's walk through it in #lighttpd	11:58
igc	the svn dump file is ~ 85G	11:58
jelmer	whoa	11:58
igc	so I'm wondering whether ...	11:58
jelmer	igc: A few things that may help are:	11:59
igc	to load the dump file or ...	11:59
igc	run directly on the dump file if I can	11:59
jelmer	igc: Load the dumpfile into a Svn repository first (otherwise bzr-svn will have to do it for you)	11:59
igc	ok	11:59
jelmer	igc: Use python-subversion with the memory leak patch (should already be in Ubuntu Hardy)	11:59
igc	is the one in gutsy good enough?	12:00
igc	I had a quick go at building subversion 1.5 but the toolchain dependencies seem long	12:01
jelmer	igc: No, the gutsy one doesn't have it yet	12:01
rolly	Any place I can see loggerhead in action?	12:01
igc	I got as far as autoconf, swig and a few others	12:01
jelmer	igc: You should be able to rebuild the hardy one on gutsy easily	12:02
igc	as in just the python-subversion bit or all of subversion?	12:02
igc	rolly: launchpad uses loggerhead	12:03
igc	so go to any lp branch and click on 'browse code'	12:03
rolly	ah thanks :p	12:03
jelmer	igc: All of subversion	12:04
jelmer	igc: How much experience do you have with building Debian packages?	12:04
igc	jelmer: I'll give it another go. Very little experience building debian packages but ...	12:05
igc	there's no time like now to learn :-)	12:05
fullermd	All the patches and such will be in 1.5.0, right?	12:05
jelmer	igc: when you download the .orig.tar.gz and the .diff and apply the diff, it should be a matter of running 'debuild' in the resulting directory	12:06
jelmer	fullermd: yep	12:06
fullermd	Oh, good. All these problems should be in the past in only 18 months or so then ;)	12:06
dato	igc: to apply the diff, just download the .dsc as well, and do `dpkg-source -x $foo.dsc`	12:06
igc	jelmer: so I should build subversion trunk before bothering to load the dump file right?	12:07
igc	is subversion 1.4 compatible with 1.5 w.r.t repos or does it want a dump-load cycle?	12:07
jelmer	fullermd: Well, the 1.5 release is only 3 months away. Always has been.	12:08
jelmer	When I started on bzr-svn it was 3 months away and it still is now :-)	12:08
jelmer	igc: 1.4 should work fine as well for loading the repository	12:08
igc	jelmer: it's a shame we don't have a ppa for subversion 1.5 given we require it	12:10
jelmer	igc: 1.5 isn't required per se	12:10
jelmer	Ubuntu Feisty, Gutsy and Hardy have 1.4 with the required patches backported	12:10
jelmer	However, only Hardy has a memory leak fix for python-subversion which you will really want given the size of your repository	12:11
igc	ah	12:11
fullermd	Ah, just think of it as a good excuse to buy more RAM. A lot more RAM.	12:11
* igc wonders whether he should upgrade to hardy tonight		12:12
jelmer	fullermd: right, somewhere in the range of, uhm, 250 to 500 Gb...	12:13
jelmer	:-)	12:13
fullermd	Well, see? After the import's done, he can even install Vista!	12:13
igc	:-)	12:15
igc	jelmer: do you have any rough metrics w.r.t. import speed, e.g. revisions per minute?	12:21
jelmer	it should be delta-dependent	12:21
igc	bzr-git takes around 18 secs per revision btw	12:21
jelmer	wow	12:22
igc	it appears to be limited on the bzr import side, not the git read side	12:22
igc	that's for the OOo repo -with 76K files ...	12:22
igc	it's much faster on smaller code bases	12:22
jelmer	the same is probably true for bzr-svn	12:23
igc	the trouble is, at 3 revs per minute, OOo will take 3 months to import	12:23
jelmer	whoops	12:23
jelmer	Samba has ~3000 files and ~25000 revisions and takes only a couple of hours to import	12:24
igc	cool	12:24
igc	it will be interesting to see how my import compares	12:25
fullermd	Well, those above stats are about 13dB over on files and revs. That adds up a tad...	12:25
igc	yes, 76K is 25X higher than 3K; 506K revs is 20X more than 25K	12:26
igc	so 400 times "a couple of hours" ought to cover it :-)	12:27
igc	assuming everything scales linearly, of course	12:27
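The back-of-envelope figures above can be reproduced with a short sketch. The function names are hypothetical, and the model is the same deliberately crude one used in the discussion: a constant cost per revision, and import time scaling linearly in both file count and revision count.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def import_days(revisions, seconds_per_revision):
    """Days needed to import at a constant per-revision cost."""
    return revisions * seconds_per_revision / SECONDS_PER_DAY


def linear_scale_factor(files, revisions, ref_files, ref_revisions):
    """How many times larger a job is than a reference import,
    assuming time grows linearly in both files and revisions."""
    return (files / ref_files) * (revisions / ref_revisions)
```

With the numbers quoted in the channel, import_days(506000, 18) comes to roughly 105 days, in line with the "3 months" estimate. linear_scale_factor(76000, 506000, 3000, 25000) comes to a little over 500, somewhat more than the "400 times" quoted, though at this level of precision the difference hardly matters.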
fullermd	So the real question is, which will finish first; converting the history, or compiling the program ;)	12:27
igc	sounds like a close race :-)	12:28
jelmer	igc: is this all in one branch?	12:29
igc	jlemer: I believe the branch count is 500	12:32
igc	see http://wiki.services.openoffice.org/mwiki/index.php?title=SCM_Migration#Clean_up	12:32
igc	the git repo has a 2.4G pack file	12:32
igc	which I thought was large until I bunzip'ed the 85G svn dump file :-)	12:33
igc	jelmer: ^^^	12:34
=== mrevell is now known as mrevell-lunch
jelmer	igc: In that case, you should be able to run several processes in parallel, one for each branch	12:35
igc	jelmer: I don't think so? I meant branches as in 'branches of the one code base', not branches as in separate modules	12:36
jelmer	ah	12:37
igc	I think getting the revisions in is the bottleneck	12:37
igc	jelmer: it looks like they've been bitten by having too many modules and so they want to explicitly go to a monolithic repo	12:37
igc	that's repo as in 'bzr branch'	12:38
appcine	ok, so I've pushed my code using sftp to the server. On the server, should I run "bzr update ." in the directory containing the .bzr directory?	12:40
=== kiko-zzz is now known as kiko
jelmer	igc: Ahh	12:43
jelmer	igc: I wonder how much time the initial step of bzr-svn is going to take	12:43
jelmer	igc: (analysing the repository history)	12:44
igc	jelmer: when I tried with bzr-git, I used your branch btw which ...	12:45
igc	was based on ddaa's which was based on ...	12:45
igc	jam's, etc.	12:45
jelmer	true distributed development :-)	12:45
igc	it created a git-cache directory which was 100G today :-)	12:45
jelmer	how did the bzr-git run go?	12:45
igc	it got to 12K revisions converted when I killed it earlier today	12:46
igc	it had been running since Friday night	12:46
jelmer	ah	12:46
igc	4K revisions/day is around the 18 secs per rev I mentioned	12:46
jelmer	I'm convinced bzr-svn should be able to do better	12:46
igc	it looks like the git-cache stuff was copied from bzr-svn	12:47
igc	so I was wondering how big ...	12:47
appcine	If I push my code to a server, I get a .bzr directory on the server. How do I make that .. code? :)	12:47
igc	the similar thing gets with bzr-svn	12:47
dato	appcine: `bzr checkout .` in the server	12:47
jelmer	igc: The framework was, but it caches different things	12:48
igc	it's true that bzr-git is more experimental of course	12:48
appcine	dato: Perfect	12:48
igc	ah - good	12:48
appcine	dato: And the next time I push? still checkout?	12:48
jelmer	igc: The Samba svn cache is only 69 Mb	12:48
dato	appcine: update	12:48
appcine	dato: sweet	12:49
igc	jlemer: that's sounds much better	12:49
igc	s/jlemer/jelmer/ - damn	12:49
igc	that's twice now	12:49
jelmer	:-)	12:49
awilkins	How does bzr-svn get the revision data? By asking for the file from each revision?	13:09
jelmer	awilkins: It retrieves the delta for the revision	13:10
awilkins	e.g. is it doing the equivalent of svn log ; foreach(changedfile in revisionLog) { svn cat file@revision } ; bzr commit ?	13:10
jelmer	no	13:10
awilkins	That's good :-)	13:10
jelmer	it's equivalent to "svn update -r$(R-1):$(R)"	13:10
jelmer	for each revision	13:11
awilkins	Where's the bottleneck?	13:12
jelmer	in bzr writing the revisions and in bzr-svn processing of the revisions	13:13
jelmer	awilkins: is speed being an issue?	13:16
awilkins	It's slow enough to put me off using it more, if that's enough to fret about :-)	13:17
awilkins	But you can say the same about SVK, which theoretically should be a lot faster (since it uses SVN at the back)	13:17
awilkins	Want a bzr-svn test log for revision 926?	13:18
awilkins	(win32)	13:18
jelmer	awilkins: Don't use the 0.4 branch if you want performance, use 0.4.7	13:18
awilkins	Are you saying that r877 performs better than r 926?	13:20
jelmer	yes, there is refactoring going on in the 0.4 branch, that has degraded performance temporarily	13:21
awilkins	I think my assessment was probably on 0.4.7, but I tell you what, I'll wind back and see how it copes with our big, nasty repository	13:22
awilkins	Lots of binary Visio files, etc	13:23
awilkins	And some multi-megabyte access databases :-)	13:23
awilkins	To be honest, the performance on our SVN server has sucked hugely since they virtualised it ; I have this theory that they have the storage on a SAN somewhere and SVN doesn't like it.	13:26
jelmer	what is the size of the repository (num revisions, num files)?	13:26
awilkins	Hang on, I'll get some stats for you.	13:27
=== mrevell-lunch is now known as mrevell
awilkins	13k revisions, 39,344 files comprising 699 MB at HEAD, and 1.5GB of revision data in the repository	13:33
jelmer	I think bzr itself would be the main bottleneck there	13:35
jelmer	given the size of the tree	13:35
awilkins	It's actually chugging along quite nicely ATM	13:37
awilkins	Doing 1 or 2 revisions per second (highly unscientific measurement)	13:38
=== cprov is now known as cprov-afk
awilkins	It seems a lot faster than git-svn was (although that's also highly subjective)	13:39
awilkins	I believe the svn cache for this repo runs to about 58MB	13:39
awilkins	Ah, it must be getting to some meatier revisions now :-)	13:40
awilkins	If I put this into a repo-tree and branch all the branches in this SVN repo do they share packs?	13:42
jelmer	awilkins: yes	13:47
awilkins	Do you need to set up the branching scheme for this to happen?	13:47
=== cprov-afk is now known as cprov
jelmer	possibly, if it's a repository that doesn't use the usual svn conventions	13:48
awilkins	Oh, it doesn't :-)	13:48
appcine_	Can you do selective branching in bzr? Like, the template authors can branch "templates" and editors branch "templates/static" without any extra setup? :)	13:48
awilkins	Not simply anyway	13:48
awilkins	appcine_: You'd just both take a branch and merge them	13:49
appcine_	awilkins: Aye, I was just curious if I could remove the "overhead" of making them browse my source tree to the specific part where they may update stuff	13:49
awilkins	appcine_: Which OS... if it's a *nix, they can just have a link to the lower folder :-)	13:51
awilkins	Hell, even on win32, they can have a shortcut	13:51
appcine_	awilkins: OS X, and yeah .. i could create a link :)	13:52
Leonidas	is there a way to merge a treeless branch into another one? I get an error because there are no working trees	14:00
abentley	Leonidas: No, because after you merge, you need to commit.	14:03
Leonidas	hmm, indeed.	14:03
awilkins	jelmer: It dropped dead before it finished :-(	14:07
jelmer	awilkins: How?	14:08
=== mw|out|out is now known as mw
awilkins	bzr: ERROR: bzrlib.errors.KnitCorrupt: Knit <bzrlib.knit._PackAccess object at 0x0176A330> corrupt: While reading {svn-v	14:08
awilkins	3-trunk0:97052673-6ba5-7c4e-b85a-d09b8cc4c1f0:trunk:779} got MemoryError()	14:08
jelmer	awilkins: Ah, it ran out of memory	14:09
jelmer	awilkins: You should be able to resume it	14:09
jelmer	awilkins: perhaps you're not using a version of python-subversion with the memory leak fixes	14:09
awilkins	That the cd/branch ; bzr init ; bzr pull <url> ?	14:09
awilkins	The page that I got them from claims to have rolled that fix into them	14:10
jelmer	yeah, you should be able to just run bzr pull again now	14:10
jelmer	are there any big files in the repository?	14:10
awilkins	Yes	14:10
jelmer	how big?	14:10
awilkins	Up to 20-30MB I think	14:10
Leonidas	abentley: it would be cool if it could create lightweight checkouts on the fly and commit afterwards provided there are no conflicts. This is what I do at the moment.	14:11
jelmer	mwhudson: is there any chance loggerhead is going to support being used inside of apache?	14:11
jelmer	awilkins: hmm, that shouldn't be a problem	14:12
awilkins	I'm trying a resume now	14:14
awilkins	It is just cd branch ; bzr init ; bzr pull <url> isn't it?	14:14
jelmer	this time you should only have to run the bzr pull bit	14:14
awilkins	It says "not a branch" AFAIK if you do that	14:15
jelmer	if you're running init again it wouldn't be resuming anything	14:15
awilkins	I didn't run init to start with	14:16
awilkins	I started it with a bzr branch	14:16
jelmer	oh, ok.	14:16
jelmer	In that case it won't be resuming	14:16
awilkins	Bum	14:16
jelmer	unless you're inside a shared repository	14:16
* awilkins issues expletives		14:16
awilkins	Pack-0.92 not compatible with bzr-svn?	14:18
jelmer	awilkins: No, you need rich-root-pack	14:18
* awilkins suggests that should be in the error message		14:19
jelmer	yeah, there's already an open bug about that	14:20
abentley	Leonidas: Autocommits are dangerous. Just because there are no text conflicts doesn't mean the merge was successful. We encourage people to have a test suite and run it.	14:20
Leonidas	abentley: I see your point. How about an option like --i-am-absolutely-sure-that-this-will-merge-properly-and-take-all-the-responsibility?	14:22
awilkins	Heavens, my powershell script is running slowly	14:23
* fullermd sighs.		14:23
abentley	Well, I'm not going to write such a thing.	14:23
fullermd	I really with irssi would stop chopping wrapped lines :(	14:23
fullermd	And sometimes, I even wish...	14:23
jelmer	Leonidas: perhaps a plugin with a command with that behaviour	14:24
jelmer	?	14:24
Leonidas	jelmer: Would be fine, indeed.	14:26
* Leonidas takes a look on how bzr plugins look like		14:26
awilkins	Ouch, python is eating 550MB now	14:28
awilkins	jelmer: 1.2GB now :-{ 1.3 .... oh, finally, the GC kicks in, still 955 MB though.....	14:36
awilkins	You can just branch something into a repository tree to convert it from standalone to repo-tree, yes?	14:37
jelmer	yes	14:38
awilkins	jelmer: For what it's worth, the UI for "bzr branch svn+http://" is much more reassuring than that for bzr pull ; the former tells you how many SVN revisions it's got through, the latter just sits at "Pull phase 0/2" for a looong time.	14:39
awilkins	jelmer: Do you think it might go faster if it suppressed repacking as it went, or slower?	15:20
=== kiko is now known as kiko-fud
jelmer	awilkins: Not sure	15:25
awilkins	I guess it's not easy to work it out without profiling - do you know any good Python profiler?	15:26
awilkins	The ultimate goal would be to get the speed network-limited on a typical desktop machine :-)	15:27
awilkins	Although I think it might be disk I/O limited here, it's running between 80-100% CPU utilisation.	15:28
* awilkins finds the profilers in the std python library and is humbled		15:30
jelmer	there is lsprof support in bzr I think	15:33
awilkins	There's even a pre-prepared output in the wiki :-)	15:36
awilkins	Why am I not surprised to find XML processing eating a lot of time .....	15:37
awilkins	Looks like the most could be gained from improving find_longest_match though (which is probably really hairy-scary)	15:40
abentley	awilkins: The thing is that repacking does reduce seek time, so it really is a tough call.	15:50
=== kiko-fud is now known as kiko
awilkins	Oh yes, I would guess at it, but it's not an improvement unless you measure it.	16:11
awilkins	Does the API provide for suppressing packs temporarily?	16:11
abentley	awilkins: I'm not sure whether you mean pack creation or repacking, but both are controlled in the API.	16:13
awilkins	It's repacking, I've been watching the folder while bzr-svn pulls - packs vanish, old packs get bigger	16:14
abentley	awilkins: What command are you executing?	16:14
awilkins	bzr pull	16:16
awilkins	Might not be true that old packs are getting bigger	16:16
abentley	Packs should only get bigger when they're being created, before they're renamed into place.	16:17
awilkins	I think it's just my bad interpretation	16:18
awilkins	Old packs are disappearing and being replaced with bigger ones in the same ordinal place in the list	16:18
awilkins	I'm just watching explorer sorted by mod time	16:18
abentley	awilkins: The code that copies revisions from an svn repo to a bazaar one does one revision at a time. I believe it could do more than one at a time, though it probably wouldn't make sense to do them all at once.	16:22
jelmer	abentley: How could it do more than one? That would just mean keeping more data in memory and waiting with writing it out to disk.	16:33
abentley	As long as you don't close the write group, the data is still written to disk, but the pack isn't finished and renamed into place.	16:36
awilkins	There _are_ a lot of dinky little 2k packs here.	16:37
awilkins	I'm guessing it's ending up with 1 pack-per-revision, until it repacks	16:39
awilkins	Well, it's now pulled nigh on 700MB from an SVN repo of 1.5GB, the rate at which it's increasing has slowed tremendously.	16:51
awilkins	The trunk accounts for 9000 out of 13000 revisions, but I can't tell where it's got to in terms of those 9000	16:52
=== awilkins is now known as awilkins_train
ubotu	New bug: #191001 in bzr "checkout doesn´t work" [Undecided,New] https://launchpad.net/bugs/191001	17:16
=== mrevell is now known as mrevell-dinner
sistpoty\|work	hi, how can I remove a stale lock? (it says s.th. like "Unable to obtain lock file:///srv/revu.repo/.bzr/repository/lock")	18:28
dato	sistpoty\|work: bzr break-lock	18:28
sistpoty\|work	ah, thanks	18:28
mwhudson	jelmer: um, it does?	19:12
jelmer	mwhudson: what did I say exactly?	19:12
mwhudson	<jelmer> mwhudson: is there any chance loggerhead is going to support being used inside of apache?	19:25
=== asak_ is now known as asak
jelmer	mwhudson: Ah	19:45
jelmer	mwhudson: That should be "easily" supported	19:46
mwhudson	jelmer: what about it is not 'easy'?	19:46
mwhudson	jelmer: you set up mod_proxy/mod_rewrite and set server.webpath in the conf file	19:46
mwhudson	i mean, documentation is lacking, but other than that?	19:47
jelmer	You have to run an extra daemon	19:47
mwhudson	so you'd rather a cgi like setup?	19:48
jelmer	yeah	19:48
jelmer	bitlbee is using hgweb atm and we were considering migrating, but it's just too much trouble atm	19:49
jelmer	that, and the dependencies (but I think that's been brought up before)	19:49
mwhudson	loggerhead currently caches way too much at branch object creation time for that to really work	19:49
mwhudson	though i guess for small projects it could work	19:50
mwhudson	abentley and i were talking about making loggerhead (or something a bit like it) into a more of a library for generating html describing a branch	19:53
mwhudson	and decoupling it more from the publishing side	19:53
jelmer	BitlBee is probably too big for that	19:53
jelmer	we're currently looking into alternatives for what we're using atm (hgweb and trac with trac-bzr)	19:53
jelmer	the size of our revision history tends to bring trac down occasionally	19:54
mwhudson	oomph :)	19:54
mwhudson	how many revisions?	19:55
jelmer	ABOUT 1.1K, SO NOT TOO MANY	19:56
jelmer	sorry for shouting	19:56
jelmer	so not too many	19:57
jelmer	mwhudson: I think that would be a good idea actually, splitting out a library that can generate HTML representations of Bazaar data	19:58
mwhudson	ok, in my testing i've been using launchpad (5k files, 20k revisions) as a "large project"	19:58
jelmer	ah, ok	19:59
jelmer	it's probably trac's fault then, it already feels really slow for BitlBee for simple operations (and runs as a separate daemon so it can do caching)	19:59
mwhudson	jelmer: yeah	20:10
mwhudson	loggerhead.bitlbee: built revision graph cache: 0.021812915802001953 secs	20:10
mwhudson	certainly, loggerhead seems pretty quick on bitlbee	20:12
jelmer	hgweb is pretty quick too, but it's unmaintained and has regressed recently	20:13
jelmer	building the revision cache is the most inefficient step?	20:13
mwhudson	it depends	20:14
mwhudson	for launchpad, the pain point is extracting inventories	20:14
mwhudson	also, computing the files changed in a revision can be slow	20:15
mwhudson	(but you can cache that)	20:15
jelmer	that depends on the size of the tree I guess?	20:16
mwhudson	yeah	20:17
=== Gwaihir_ is now known as Gwaihir
lifeless	moin	21:10
=== mrevell-dinner is now known as mrevell
hsn_	any big projects migrated to BZR after 1.0 rel?	21:24
johnny	mwhudson, is there a reason you don't use wsgi in loggerhead?	21:25
johnny	i've been trying to get loggerhead to work with lighttpd, but the simple proxy method won't work with 1.4.x	21:26
mwhudson	johnny: i don't know, is there any reason why i would use wsgi in loggerhead?	21:26
mwhudson	johnny: but i should point out that this side of loggerhead is very much Not My Fault :)	21:26
johnny	?	21:26
weigon	johnny: WSGI should be a feature of turbogears, it is one for all	21:27
johnny	atm it seems like your script has to be modified?	21:27
johnny	maybe i'm wrong	21:27
mwhudson	johnny: loggerhead runs happily enough behind a proxy	21:27
mwhudson	you need to set server.webpath in the config	21:28
johnny	i did	21:28
johnny	maybe i set it wrong	21:28
weigon	mwhudson: can you tell loggerhead to strip a path-segment from the URL?	21:28
johnny	i bet that's possible within turbogears	21:28
jelmer	mod_proxy can IIRC	21:28
johnny	i just don't know how yet :)	21:28
weigon	mwhudson: so if the URL is /foobar/baz/loggerhead/... that you strip the first part and loggerhead only sees its part	21:28
mwhudson	johnny: i guess "won't work" isn't a good bug report :)	21:28
mwhudson	weigon: i hear a rumour that this is possible yes	21:29
johnny	hmm.. now that i'm more awake,i'll go look it up	21:29
weigon	johnny: you need that strip-prefix feature and lighttpd+mod_proxy will happily work for you	21:30
=== zmanuel is now known as z-man
johnny	mwhudson, do you happen to know off the top of your head on how to strip it?	21:44
mwhudson	johnny: no	21:46
johnny	hmm.. back to my cvsps import, what is the proper procedure to get the head branch of a module out of the repository and use that as the base for another shared repository?	21:49
johnny	just branch it directly?	21:49
bob2	a	21:53
lifeless	b	21:54
bob2	oops	21:54
=== lnxtech is now known as brokentux
abentley	Yeah, that "a" revision was a bit of a goof :-)	22:22
reggie	anyone seen bzr-svn give a xxx not a branch error?	22:22
jelmer	reggie: yes	22:23
reggie	I have a svn fsfs repo that appears to convert ok to about 25%	22:23
reggie	and then I get a not a branch error which I don't really understand. I think the folder it shows is a branch	22:24
jelmer	you're running "bzr svn-import" ?	22:24
reggie	yes	22:24
jelmer	that would be bug 183361	22:25
ubotu	Launchpad bug 183361 in bzr-svn "bzr-svn on a branches not working" [Medium,Triaged] https://launchpad.net/bugs/183361	22:25
reggie	so branches don't work at all?	22:25
reggie	we have someone here that got it to work	22:26
reggie	perhaps it's intermittent	22:26
jelmer	it works, but there's a bug if something strange happened in the history of a branch	22:27
jelmer	I haven't quite worked out what causes it to break	22:28
foom	but it works if you set a custom branching scheme correct for your project, i seem to recall	22:29
reggie	I assumed that auto would determine I'm using trunk (which I am)	22:30
jelmer	foom: it will never fail halfway through a svn-import though	22:30
jelmer	reggie: yes, it will. You're just hitting a bug in bzr-svn caused by some oddness in your repository	22:30
reggie	and fighting my own ignorance of bzr. I've just started using it	22:31
reggie	what does --standalone do and is it the default? seems like it is trying to convert all branches but I didn't give --standalone	22:31
jelmer	standalone determines whether it should use a bzr shared repository or not	22:32
jelmer	it will by default	22:32
reggie	so, use the svn repo as the parent?	22:32
reggie	which I don't want	22:32
jelmer	no	22:33
jelmer	a bzr shared repository	22:33
reggie	oh ok	22:33
reggie	I understand	22:33
reggie	sorry	22:33
jelmer	reggie: no worries	22:35
reggie	so I'm pretty much left with --prefix or just doing a bzr branch on the branches I care about?	22:35
jelmer	reggie: Any chance you can add a comment to that bug about the issue you're hitting?	22:35
jelmer	in particular, the "svn log" for the revision that's problematic could be useful	22:36
reggie	sure, let me figure out how (and I'm not sure what i would say other than I hit it too)	22:36
reggie	hmm. don't think it's a revision.	22:36
jelmer	It's the changes in a particular revision that are problematic	22:36
jelmer	you can figure out what revision is problematic by running "bzr -Dtransport svn-import ..." and looking at the last few lines in .bzr.log before it crashes	22:37
reggie	be happy to help just have no idea how to determine what revision that is based on what bzr is saying	22:37
reggie	ahh thanks	22:37
jelmer	the bit that would be useful then would be the "svn log -v" output for that particular revision (commit message/author, etc shouldn't matter)	22:37
jelmer	reggie: or, if this repository is public, just mention the repository URL	22:38
reggie	hmm. that reminds me we do have a public repo. maybe I'll try to convert that one	22:38
reggie	jelmer, .bzr.log shows a svn update and a svn revprop-list -r on 689 and then the crash	22:46
reggie	so is it 689 or 690 that caused it?	22:46
jelmer	689	22:46
jelmer	the output of "svn log -v -r688:690 <url>" would be useful	22:46
=== kiko is now known as kiko-afk
reggie	jelmer, comment attached	22:53
reggie	now I'll try our public repo	22:53
igc	morning	22:54
jelmer	reggie: Thanks!	22:54
reggie	np	22:54
reggie	jelmer, got a sec?	23:14
reggie	I did a bzr svn-import --prefix=trunk on my url and it ran to completion but I don't see any files other than a .bzr folder	23:14
jelmer	reggie: Run "bzr checkout" inside that directory	23:15
reggie	oh. bzr log shows some info	23:15
reggie	hmm. I made a shared repo inside a shared repo. I did mkdir tmp; cd tmp; bzr init-repo .; bzr svn-import <url> trunk	23:16
reggie	and now I have tmp/trunk/trunk/.bzr	23:16
reggie	jelmer, ok seems to be working. how are svn tags handled? as native bzr tags?	23:19
jelmer	no, they're converted into branches at the moment	23:21
jelmer	there's an open bug about it	23:21
* Peng wonders why bzr decided to think the submit branch is ".".		23:22
Peng	At least I happened to notice that before sending an empty patch. :\	23:23
=== jamesh__ is now known as jamesh
reggie	jelmer, so if I import a few of my branches and then someone fixes the tag bug with bzr-svn, can I then somehow get my svn tag info into my branches (even though I've been using the branches)?	23:43
reggie	can I merge two branches into a tag?	23:43
jelmer	reggie: Yes, once that bug is fixed you will see the svn tags as bzr tags	23:44
jelmer	I'm not sure what you mean by merging two branches into a tag	23:44
reggie	for example with svn I have branches labeled 5.0, 5.1, 5.2 ( for each version) and I would have the same for bzr	23:45
reggie	but there are also tags in those like 5.1.1 and 5.1.2 and 5.1.3. These should not be branches since I never go back and commit code to them	23:46
reggie	can I convert the 5.1 branch, start using bzr to commit code to it, and then later add the tag info once that bug is fixed?	23:46
reggie	maybe it's just easier for me to recreate the tags. just take a couple of hours	23:47
reggie	just do bzr tag -r for each tagged revision in svn	23:48
jelmer	I think that's probably the easiest solution	23:52
reggie	yup.	23:53
reggie	thanks for your patience. I really appreciate it	23:53
