/srv/irclogs.ubuntu.com/2008/02/11/#bzr.txt

lifelessabentley: ok, bit of a missive, but it's on its way00:19
=== kiko-afk is now known as kiko-zzz
abentleylifeless: At first blush, it looks like violent agreement.00:29
abentleyI've just skimmed so far.00:29
=== mw|out is now known as mw|out|out
lifelessabentley: I had the sense from you that other things than bytes would be on your 'storage' object, but if not then it was just mutual confusion I think00:35
abentleyWell, graph data also, I guess.00:36
lifelessthat's something I'm explicitly rejecting at this point00:36
lifelessI don't know if I'm right too00:37
abentleyWell, the build graph seems like it must be there.00:37
lifelessabentley: thats under the wraps00:37
lifelessrephrasing, you don't need to expose the fact compression has happened at all outside the byte store00:38
abentleyOkay, I guess we're in disagreement, then.00:38
lifelessthats good; it means there's more to explore :)00:39
abentleyThere are several things we can't implement without that.00:39
lifelessI'm going to go for a walk and get lunch and think about those things, if you could list them now :)00:39
abentleyOne of them is Goffredo's inventory hack.00:39
abentleyAnother is fetching via the stream thing that Andrew and you added.00:40
lifelessit doesn't depend on exposing deltas in the api00:40
lifelessit depends on exposing 'the set of lines from all these texts'00:40
abentleyAnother is iter_changes based on a semantic inventory delta.00:41
abentleyThat's all I can think of at the moment.00:41
lifelessI can't parse that last one.00:41
abentleyYou know your journalled inventory stuff?00:41
lifelessoh right, well that doesn't do deltas within the byte store, its a form of fulltext always00:42
lifelessits a layer up00:42
abentleyIs that the right layer?00:42
lifelessbut the stream-fetching one I will cogitate on, I think I didn't /quite/ have my ducks in a row00:42
abentleyIf it's a layer up, does that mean we can't get a comprehensive inventory directly from the multi-versionedfile?00:43
abentleylifeless: Oh, sorry plenty more.00:44
lifelessI don't think inventory should be in a versioned file abstraction at all eventually. We will want an index for the inventory tree, and then read the lot.00:44
abentleyAnything that tries to accelerate test comparisons using knit deltas.00:44
lifeless'test comparisons' ?00:44
abentleys/test/text00:45
abentleyCurrently, annotate and send use that.00:45
lifelesshow does that work?00:45
abentleyWe derive the SequenceMatcher.get_matching_blocks output from the knit delta and a pair of fulltexts.00:46
abentleyThe fulltexts are used to fix up the eol bogosities.00:47
lifelesssounds like what you want is 'byte_store.get_matching_blocks(from_key, to_key)' ?00:48
lifelesslets not detail it now00:48
lifelesslets just keep thinking of issues00:48
abentleylifeless: At least some of the time, you want the fulltexts as well as the matching_blocks.00:49
abentleyI think the new "knit merge" also uses that information.00:49
abentleyie the matching_blocks.00:50
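For reference, the get_matching_blocks output abentley describes has the shape sketched below. This is a minimal illustration using the standard library's difflib on two small fulltexts; it is not bzr's code, it does not derive the blocks from a knit delta, and the sample lines are made up.

```python
import difflib

# Two hypothetical fulltexts, held as lists of lines.
old_lines = ["a\n", "b\n", "c\n"]
new_lines = ["a\n", "x\n", "c\n"]

matcher = difflib.SequenceMatcher(None, old_lines, new_lines)

# Each block is (index in old, index in new, length of the matching run).
# The list always ends with a zero-length sentinel block.
blocks = [tuple(block) for block in matcher.get_matching_blocks()]

for old_index, new_index, size in blocks:
    print(old_index, new_index, size)
```

The point under discussion is that the same list of triples can be computed much more cheaply when a stored delta between the two texts is already available.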
lifelessabentley: to come back - sure; we can handle that variation I think00:51
abentleyYeah.00:52
lifeless-> to think. (inventory hack, journalled_inv, stream_fetch, get_matching_blocks)00:52
abentleySo I think the use cases for raw compression artifacts are 1. transmission between repos, 2. use of partial data, 3. acceleration of comparisons.00:52
poolielifeless, good point about acting as an ssh agent00:54
=== BasicMac is now known as BasicOSX
lifelesspoolie: thanks ;)01:14
poolieso01:14
pooliereally this is a broader reason not to use builtin ssh, more than anything else01:14
lifelessYes01:14
lifelessif you want ssh credential caching, use an ssh agent, kthxbye. IMNSHO01:14
poolielifeless, quick call?01:15
lifelesssure01:16
=== jamesh__ is now known as jamesh
cr3_are there any plans to have a dapper .deb on the ppa?03:02
=== cr3_ is now known as cr3
Solarionis there a way to import the revision history of an RCS directory?03:47
lifelesssure04:14
lifelesscvs init a repo04:14
lifelesscopy the RCS files into a subdir there04:14
lifelessuse cvsps-import04:15
abentleylifeless: The other reason I wanted access to raw records was repository stacking.04:19
lifelessabentley: that would interact badly with different delta formats cross-repo04:20
abentleylifeless: that depends how stupid we are.04:21
lifelesslol04:21
abentleyI had the impression that we were going to explore the possibility of multiple delta formats per repository anyhow.04:22
lifelessI think that for stacking we basically do iter_file_texts on the repos we're stacked *against*, for every component we can't create ourselves04:22
lifelessannotation is the other thing not really covered04:23
abentleyProducing unnecessary fulltexts takes CPU time.04:24
abentleySo if we can treat the foreign repo raw entries the same as local ones, that can aid performance.04:25
abentleyJohn discovered that we were wasting time inserting lines in the middle of lists building fulltexts from knits.04:27
abentleyThat's why MPDiff can generate a fulltext from the top down without generating intermediate fulltexts.04:27
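The cost abentley mentions comes from repeatedly inserting into the middle of a Python list, which shifts every later element. A rough sketch of the alternative, building the output list front to back in one pass, is below. The hunk format `(start, end, new_lines)` and the function name are hypothetical; this is neither the knit delta format nor the MPDiff algorithm.

```python
def apply_line_delta(basis_lines, hunks):
    """Apply a line delta without inserting into the middle of a list.

    `hunks` is a hypothetical format: a list of (start, end, new_lines)
    tuples, sorted by start and non-overlapping, where basis_lines[start:end]
    is replaced by new_lines.
    """
    result = []
    position = 0
    for start, end, new_lines in hunks:
        # Copy the untouched run before this hunk, then the replacement.
        result.extend(basis_lines[position:start])
        result.extend(new_lines)
        position = end
    # Copy whatever follows the last hunk.
    result.extend(basis_lines[position:])
    return result


basis = ["a", "b", "c", "d"]
delta = [(1, 2, ["x", "y"]), (3, 3, ["z"])]
fulltext = apply_line_delta(basis, delta)
```

Each input line is copied once, so the work is linear in the size of the output rather than growing with the number of insertions.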
=== aadis_ is now known as aadis^
lifelessI think this is an undesirable abstraction violation in the general case; the cost of getting data from a remote repo in the first place dwarfs local cpu cost04:28
=== aadis^ is now known as aadis
lifelesssecondly, the number of stacked components is going to be small - I don't imagine many getting more than 3-4 steps04:29
abentleyWhich is the abstraction being violated?  delta compression?04:30
lifelessand that will mean we have 3-4 points where we enforce a full text basis04:30
abentleyI think that we have enough use cases for compression deltas that it's reasonable to question whether that abstraction is a helpful one.04:33
lifelessI'm refining it now, the great thing about strawmen is they get comments04:33
lifelessI really want something that can be consistent across repositories sensibly, and I don't think something exposing deltas itself can; not the way we handle deltas today04:34
abentleylifeless: I think just tagging deltas with their format goes a long way.04:37
igclifeless: re hg-import, the guts of install_revision in there does differ from the same in repository.py ...04:47
igcis that the per-file graph issue ?04:48
lifelessyou're a little opaque in that sentence04:48
igcin repository.py, that bit of code has a "FIXME: TODO:" from yourself and abentley fwiw04:48
igcsorry, I'll try again ...04:49
lifelessI haven't looked at hg-import in great detail; but as I know of no reason to have different to bzr-hg, I'm suspicious from the get-go04:49
* igc looks up line numbers04:49
=== asac_ is now known as asac
igclifeless: in repository.py, the for loop beginning with a comment on line 1973 is the bit of interest04:54
igcin the hg-import plugin, that routine is largely repeated but that inner loop is different - perhaps a copy of some older code at a quick guess04:55
lifelessmeh04:55
lifelessso why does hg-import exist?04:56
lifelesswhat does it do differently to 'bzr pull' with bzr-hg installed ?04:56
igchttp://rafb.net/p/drJQkk16.html is the code btw04:56
igchg-import exists because bzr-hg doesn't work ...04:57
igcand Lukas found it easier to write it that fix bzr-hg04:57
igcs/that/than/04:57
lifelessit's definitely buggy04:57
abentleyLukáš also thinks that bzr-hg does its topo sorting wrong.04:58
lifelessIf I was spending time on this I would be updating bzr-hg, because its more generally useful, and can have the bzr repository conformance tests run against it04:58
lifelesswhich if taken to completion would give -real- confidence in it04:58
igclifeless: my actual focus right now is the git->bzr converter. I'd like to see hg-import merged into bzr-hg and whatever issues bzr-hg has fixed. I'm ok to do that once other stuff is off my plate05:02
igcI'm raising this now simply because ...05:03
igcusers are using hg-import in the absence of a working bzr-hg so if it's broken ...05:03
lifelessmeh, I hope its got a different namespace05:03
lifelessif it doesn't I'll be seriously miffed05:03
igcthen we ought to be sure key people like you know about it05:03
lifelessits an incompatible converter05:04
igcthe namespace prefixes as hg: (bzr-hg) vs hg- (bzr-hgimport)05:06
lifelessgood05:07
igcs/as/are/05:07
jameshlifeless: by the way, you can replace the code in http://www.advogato.org/person/robertc/diary/78.html with "python -mtrace -t program.py"05:29
johnnymwhudson, what is the current state of the loggerhead_dev branch?05:31
abentleyigc: You mean hg-import doesn't use a colon in its prefix?05:37
jmlis there an API in Bazaar to make all non-existing directories above a directory?05:38
spivjml: there's a "_create_prefix" function in bzrlib.builtins05:40
spivjml: not exactly a public API, but you could crib from it I guess.05:40
jmlspiv: thanks05:40
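For plain local paths, the same effect is available from the Python standard library. The sketch below is a generic illustration, not the bzrlib `_create_prefix` helper spiv points to; the function name `ensure_parents` is made up, and `exist_ok` needs a Python 3 interpreter.

```python
import os
import tempfile


def ensure_parents(path):
    """Create every missing directory above `path`, but not `path` itself."""
    parent = os.path.dirname(os.path.abspath(path))
    # makedirs creates all intermediate directories; exist_ok avoids an
    # error when some or all of them are already present.
    os.makedirs(parent, exist_ok=True)
    return parent


# Usage: the parents of <tmp>/a/b/c are created, the leaf "c" is not.
scratch = tempfile.mkdtemp()
target = os.path.join(scratch, "a", "b", "c")
created_parent = ensure_parents(target)
```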
mwhudsonjohnny: it works better than anything else05:42
mwhudsonjohnny: i have a few plans for further improvements but they're a ways off realistically05:42
johnnyjust wondering why there wasn't a newer release on the loggerhead page05:42
mwhudsonah, yes, i should make a release05:42
mwhudsoni'm lazy when it comes to releases :)05:43
johnnyi noticed that you were still messing with it recently via the launchpad page05:43
johnnyalso, the demo of loggerhead seems not to work05:43
johnnyit kinda hangs05:43
johnnysame for bzr-webserve too strangely enough05:43
johnnyproxy error i think05:43
lifelessjamesh: thanks. (Groan at wheel invention)05:46
lifelessjml: I would use _create_prefix directly.05:47
lifelessjml: not public just means 'won't get deprecated'05:47
lifeless:)05:48
jmllifeless: yeah, that's what I'm doing05:48
lifelessjamesh: when was the trace module added ?05:50
jameshlifeless: not sure.  It has been around for a while though05:50
jameshlifeless: http://svn.python.org/view/python/trunk/Lib/trace.py <- it has been in the standard library since 2003, and was in Tools/scripts before that05:51
lifelessrotfl05:51
lifelessthanks05:51
bob2-mtrace has worked since 2.405:51
spivYeah, the "-m" was new in 2.4 IIRC.05:54
igcabentley: hg-import's bzr_revision_id(node) does this: return 'hg-' + mercurial.hg.hex(node)05:58
jameshthe module was in 2.3 too, but you couldn't use the "python -m" syntax, yes05:58
abentleyigc: Well, I can't say I'm surprised.05:58
lifelessigc: (thats not namespaced in our terms, so they could conceptually collide more easily)05:59
abentleyBut that is an entirely legal revision-id for bzr itself to generate.05:59
lifelessI'm glad he didn't use hg: though, because colliding with bzr-hg would have been hilarious06:00
abentleylifeless: It used to.  I asked him to change it.06:00
lifelessabentley: thank you!06:00
abentleynp06:00
igcabentley, lifeless: so what is the convention here?06:01
igcbzr-git is using git-experiemental-r:06:01
igcas the prefix06:01
lifelessigc: if you are generating random ids, use bzr to create them06:01
spivI think ideally the prefix ought to be (or at least include) the name of the plugin, so you know who to blame ;)06:02
igcas an experiment, I changed that to git-r: in one of my test runs and it saved a fair amount of space06:02
lifelessigc: if they are deterministic, namespace them with CONVERTER:, and change CONVERTER whenever the algorithm changes06:02
abentleyRevision ids that include a ':' will never be generated by bazaar-- the ':' is a namespace separator.06:02
abentleyRevision ids ending with ':' are reserved.06:03
igcthanks06:03
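The convention described above (deterministic ids namespaced as CONVERTER:, with ids ending in ':' reserved) could be sketched as follows. The function name and the example converter name "hg-v1" are hypothetical and are not taken from bzr-hg, hg-import or bzr-git.

```python
def foreign_revision_id(converter, native_id):
    """Build a deterministic revision id of the form CONVERTER:native_id.

    `converter` should change whenever the conversion algorithm changes,
    so ids produced by incompatible converters can never collide.
    """
    if not converter or ":" in converter:
        raise ValueError("converter name must be non-empty and contain no ':'")
    if not native_id:
        # An empty native id would yield an id ending in ':', which is reserved.
        raise ValueError("native id must be non-empty")
    return "%s:%s" % (converter, native_id)


example_id = foreign_revision_id("hg-v1", "0123abcd")
```

Because the id depends only on the converter name and the native id, converting the same history twice yields the same revision ids.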
jameshigc: so doing the same import twice produces the same results? (same file IDs, revision testaments, etc?)06:18
igcjamesh, yes, that's the point of deterministic ids as I understand it06:18
jameshwhat do you use as file IDs?06:19
jamesh(out of interest)06:19
igcthe different converters all do something slightly different it seems06:19
jameshyep.  It depends on what sort of file identity rules the source VCS has06:20
igcjamesh: the git does this: file_id.replace('_', '__').replace(' ', '_s')06:20
igcwhich looks a little suspect to me06:21
jameshwhere file_id is what?06:21
igc(path.encode('utf-8')06:22
lifelessigc: thats ok, brittle, but ok.06:22
igcah good06:22
igcI was concerned about any existing sequences of _s06:23
igcthat couldn't be mapped back the other way uniquely IIUIC06:23
lifelessgit is a rename-free system06:23
jameshigc: "_s" would get encoded to "__s"06:23
lifelessso paths are fine but project specific06:23
lifelessthings like svn and hg that support some form of rename are much more complex06:24
lifelessyou need to find the tail of the per-file graph06:24
lifelessand assign a unique id (using the path of the name at that point is reasonable)06:24
igcah - ok06:24
spivigc: that escaping scheme is unambiguous, although it'd be easy for the unescaping to be buggy...06:24
lifelessspiv: what unescaping :)06:25
igc:-)06:25
spivAh, good point.  Problem solved, then ;)06:25
jameshspiv: right.  Doing it in two passes (like the escaping is done) would be buggy.06:25
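The escaping igc quotes, and the reason a two-pass unescape would be buggy, can be shown in a few lines. The escape function mirrors the quoted expression but works on Python 3 strings rather than UTF-8 bytes; the two unescape functions are illustrative sketches and are not part of bzr-git.

```python
def escape_path(path):
    """Escape as quoted in the discussion: '_' becomes '__', ' ' becomes '_s'."""
    return path.replace("_", "__").replace(" ", "_s")


def unescape_path(escaped):
    """Undo escape_path in a single left-to-right pass."""
    out = []
    i = 0
    while i < len(escaped):
        ch = escaped[i]
        if ch != "_":
            out.append(ch)
            i += 1
            continue
        # An underscore always starts a two-character escape sequence.
        following = escaped[i + 1:i + 2]
        if following == "_":
            out.append("_")
        elif following == "s":
            out.append(" ")
        else:
            raise ValueError("invalid escape at offset %d" % i)
        i += 2
    return "".join(out)


def naive_unescape(escaped):
    """Two-pass unescape, shown only to demonstrate how it goes wrong."""
    return escaped.replace("_s", " ").replace("__", "_")


original = "_s"
escaped = escape_path(original)
round_trip = unescape_path(escaped)
broken = naive_unescape(escaped)
```

The literal path "_s" escapes to "__s". The single-pass decoder reads that as an escaped underscore followed by a plain "s", while the two-pass version first sees "_s" inside it and produces a space.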
lifelessother fugly thing is paths are long06:25
jameshthat was a problem for bzr-svn in the past, right?06:25
lifelessI'd probably use the revision-id at time of file creation + a serial within the tree for number of files added in that revision numbering via alpha-sort, or something like that06:26
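One possible reading of that suggestion is sketched below: the file id combines the id of the revision that added the file with a serial number assigned in alphabetical order of the added paths. The function name, the separator and the sample revision id are assumptions; lifeless only outlines the idea.

```python
def assign_file_ids(revision_id, added_paths):
    """Map each path added in `revision_id` to a short deterministic file id.

    The serial is the position of the path in the alphabetically sorted
    list of paths added by that revision, so the result does not depend
    on the order in which the converter happens to see the paths.
    """
    return {
        path: "%s-%d" % (revision_id, serial)
        for serial, path in enumerate(sorted(added_paths))
    }


file_ids = assign_file_ids("git-r:abc123", ["src/b.c", "README", "src/a.c"])
```

Ids built this way stay short however long the paths are, which addresses the concern about long paths raised just before.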
jameshgit does have some idea of tracing a file's history over renames, so simply using the path as a file ID will give a different view of history06:27
lifelessjamesh: pickaxe you mean? Thats always derived06:28
lifeless(its history mining. lolz. hahaha)06:28
igclifeless: so IIUIC, a repo converted from other tool may well have a different (usually bigger?) size than a vanilla bzr repo and might also perform differently06:29
igcI wonder how different on a large repo like the OOo one06:29
igcs/other/another/06:30
jameshlifeless: I was thinking of "git-log -M"06:30
jameshI don't know if that's the same thing as pickaxe06:31
lifelessigc: if you use something like tailor, no. Because its non-deterministic and equivalent to serially doing bzr commits06:31
lifelessjamesh: looks like - note the 'detect renames'.06:31
lifelessigc: if you use something designed primarily as a foreign repository interface, then yes, because we're thunking across to the native metadata.06:31
igcmakes sense to me06:32
=== aadis_ is now known as aadis
lifelessabentley: updated proposal sent06:41
abentleyCool.  Good night.06:42
=== aadis__ is now known as aadis^^
=== aadis__ is now known as aadis
=== doko_ is now known as doko
=== aadis_ is now known as aadis
=== AnMaster_ is now known as AnMaster
ubotuNew bug: #190832 in bzr-svn "PROPFIND exception during check out of Subversion branch behind https" [Undecided,New] https://launchpad.net/bugs/19083209:24
ubotuNew bug: #190843 in bzr-svn "Attempting a lightweight checkout raises KeyError exception" [Undecided,New] https://launchpad.net/bugs/19084309:24
appcineWhat am I doing wrong here?10:02
appcineMy previous workflow was (using svn): client: commit, server: update -- restart web server. done.10:02
appcineMy new workflow: client: bzr commit, bzr push .. server: bzr merge, bzr commit -- restart web server. done.10:03
appcineIf I do not run the bzr commit on server, it complains the next time I merge about having uncommitted changes10:03
appcineAm I doing something wrong, or is this my new life? :)10:06
luksappcine: you can do exactly the same as with svn10:07
luksthat is, server: update10:07
luksthat is, if you push directly to the published branch on the server10:08
luksotherwise you want pull instead of merge and commit10:08
garyvdmOr - client: bzr commit , server: bzr pull10:08
garyvdmluks - you beat me :-)10:09
luks:)10:09
appcinehmm10:11
appcineso I get this: bzr: ERROR: These branches have diverged. Use the merge command to reconcile them.10:12
luksbecause you did commit10:12
appcinethen I merge, but can't because I have uncommitted changes.10:12
luksso the branches don't match anymore10:12
appcineSo I commit, then merge again10:12
appcineHehe. I guess this wasn't made to be used like this :)10:13
luksnope10:13
luksmerge is for merging branches :)10:13
garyvdmor server: bzr pull --overwrite10:13
luksbut be sure you have no local changes with that10:13
appcinePerfect.10:14
appcine:)10:14
appcineI'm not changing anything on server .. It's my way of updating the server source10:14
luksright, so you want pull10:14
garyvdm--overwrite will only be necessary this first time. From now on, just bzr pull10:15
fullermdOr push into it and update.10:16
appcinegaryvdm: Yeah, testing it now. Neat! :)10:16
appcinefullermd: I couldn't get bzr+ssh working on the server .. besides, I want three repositories.. one backup, one working copy and one live version on server10:17
appcineworking copy = development copy10:17
fullermdWell, but if the live running version is always just a duplicate of one of the others, is there really a need for it to have a separate copy?10:18
fullermd(which isn't intended as a rhetorical "Why, of course there isn't", but it's not obvious that there is)10:18
appcinefullermd: Well, I want all my code in one place. I'm running several projects.10:19
appcinefullermd: If I'm not on my computer and the server for project #1 screws up, I can still access the code10:20
appcinefullermd: And I can burn it to dvd from just one location10:20
appcineGives me some kind of freedom. I always know where all my code is. If that server gets borked, I re-push from my personal computers or the servers where I've distributed the code.10:21
appcineAnd it's the perfect test-server .. if I need to test something on a machine that's accessible from outside of our office lan, I can just launch a project from the intermediary server :)10:22
appcineSo. What I've done now is translate my svn work flow into bzr. Removed the need for a cumbersome process of adding a new project on the svn server, I am more agile. Given myself the ability to commit locally (something I rarely do though). bzr ignore is soo much easier than anything that svn has :) I haven't tested branching yet, but I hear that's a lot easier as well.10:29
appcineWhat else should I be considering? :D10:29
=== RichardL_ is now known as rml
johnnyhmm fun.. trying to get loggerhead running under lighttpd11:16
=== aadis_ is now known as aadis
weigon__johnny: works as planned ?11:54
johnnynot yet11:54
johnnysadly11:54
igcjelmer: ping11:56
jelmerigc: pong11:56
johnnygetting  cherrypy.msg: : Page handler: "The path '/loggerhead' was not found."11:57
igcjelmer: how well does svn-import scale?11:57
johnnyi could be wrong on how to set it up on the lighttpd side11:57
igcI'm about to try the OOo repo ...11:58
jelmerigc: As well as the rest of bzr-svn11:58
jelmerigc: Ah..11:58
=== weigon__ is now known as weigon
igcit's 76K files, 506K revisions!11:58
jelmerigc: I'm not sure :-)11:58
weigonjohnny: let's walk through it in #lighttpd11:58
igcthe svn dump file is ~ 85G11:58
jelmerwhoa11:58
igcso I'm wondering whether ...11:58
jelmerigc: A few things that may help are:11:59
igcto load the dump file or ...11:59
igcrun directly on the dump file if I can11:59
jelmerigc: Load the dumpfile into a Svn repository first (otherwise bzr-svn will have to do it for you)11:59
igcok11:59
jelmerigc: Use python-subversion with the memory leak patch (should already be in Ubuntu Hardy)11:59
igcis the one in gutsy good enough?12:00
igcI had a quick go at building subversion 1.5 but the toolchain dependencies seem long12:01
jelmerigc: No, the gutsy one doesn't have it yet12:01
rollyAny place I can see loggerhead in action?12:01
igcI got as far as autoconf, swig and a few others12:01
jelmerigc: You should be able to rebuild the hardy one on gutsy easily12:02
igcas in just the python-subversion bit or all of subversion?12:02
igcrolly: launchpad uses loggerhead12:03
igcso go to any lp branch and click on 'browse code'12:03
rollyah thanks :p12:03
jelmerigc: All of subversion12:04
jelmerigc: How much experience do you have with building Debian packages?12:04
igcjelmer: I'll give it another go. Very little experience building debian packages but ...12:05
igcthere's no time like now to learn :-)12:05
fullermdAll the patches and such will be in 1.5.0, right?12:05
jelmerigc: when you download the .orig.tar.gz and the .diff and apply the diff, it should be a matter of running 'debuild' in the resulting directory12:06
jelmerfullermd: yep12:06
fullermdOh, good.  All these problems should be in the past in only 18 months or so then   ;)12:06
datoigc: to apply the diff, just download the .dsc as well, and do `dpkg-source -x $foo.dsc`12:06
igcjelmer: so I should build subversion trunk before bothering to load the dump file right?12:07
igcis subversion 1.4 compatible with 1.5 w.r.t repos or does it want a dump-load cycle?12:07
jelmerfullermd: Well, the 1.5 release is only 3 months away. Always has been.12:08
jelmerWhen I started on bzr-svn it was 3 months away and it still is now :-)12:08
jelmerigc: 1.4 should work fine as well for loading the repository12:08
igcjelmer: it's a shame we don't have a ppa for subversion 1.5 given we require it12:10
jelmerigc: 1.5 isn't required per se12:10
jelmerUbuntu Feisty, Gutsy and Hardy have 1.4 with the required patches backported12:10
jelmerHowever, only Hardy has a memory leak fix for python-subversion which you will really want given the size of your repository12:11
igcah12:11
fullermdAh, just think of it as a good excuse to buy more RAM.  A lot more RAM.12:11
* igc wonders whether he should upgrade to hardy tonight12:12
jelmerfullermd: right, somewhere in the range of, uhm, 250 to 500 Gb...12:13
jelmer:-)12:13
fullermdWell, see?  After the import's done, he can even install Vista!12:13
igc:-)12:15
igcjelmer: do you have any rough metrics w.r.t. import speed, e.g. revisions per minute?12:21
jelmerit should be delta-dependent12:21
igcbzr-git takes around 18 secs per revision btw12:21
jelmerwow12:22
igcit appears to be limited on the bzr import side, not the git read side12:22
igcthat's for the OOo repo -with 76K files ...12:22
igcit's much faster on smaller code bases12:22
jelmerthe same is probably true for bzr-svn12:23
igcthe trouble is, at 3 revs per minute, OOo will take 3 months to import12:23
jelmerwhoops12:23
jelmerSamba has ~3000 files and ~25000 revisions and takes only a couple of hours to import12:24
igccool12:24
igcit will be interesting to see how my import compares12:25
fullermdWell, those above stats are about 13dB over on files and revs.  That adds up a tad...12:25
igcyes, 76K is 25X higher than 3K; 506K revs is 20X more than 25K12:26
igcso 400 times "a couple of hours" ought to cover it :-)12:27
igcassuming everything scales linearly, of course12:27
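The back-of-envelope figures in this exchange can be checked with a few lines of arithmetic. The inputs are the numbers quoted in the conversation; taking "a couple of hours" to mean 2 hours is an assumption, as is linear scaling in both files and revisions.

```python
import math

SECONDS_PER_DAY = 24 * 60 * 60

# bzr-git on the OOo repository: about 18 seconds per revision.
ooo_revisions = 506000
seconds_per_revision = 18
bzr_git_days = ooo_revisions * seconds_per_revision / SECONDS_PER_DAY

# OOo compared with Samba (about 3000 files, 25000 revisions).
file_ratio = 76000 / 3000
revision_ratio = 506000 / 25000
combined_ratio = file_ratio * revision_ratio

# The same ratios expressed in decibels, as in fullermd's remark.
file_ratio_db = 10 * math.log10(file_ratio)
revision_ratio_db = 10 * math.log10(revision_ratio)

# Assumption: the Samba import takes 2 hours and cost scales linearly.
samba_hours = 2
scaled_days = combined_ratio * samba_hours / 24

print("bzr-git estimate: %.0f days" % bzr_git_days)
print("combined scale factor: %.0f" % combined_ratio)
print("ratios in dB: %.1f files, %.1f revisions" % (file_ratio_db, revision_ratio_db))
print("scaled Samba estimate: %.0f days" % scaled_days)
```

On these inputs the bzr-git rate works out to a little over three months, and the product of the two ratios comes out nearer 500 than 400.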
fullermdSo the real question is, which will finish first; converting the history, or compiling the program   ;)12:27
igcsounds like a close race :-)12:28
jelmerigc: is this all in one branch?12:29
igcjlemer: I believe the branch count is 50012:32
igcsee http://wiki.services.openoffice.org/mwiki/index.php?title=SCM_Migration#Clean_up12:32
igcthe git repo has a 2.4G pack file12:32
igcwhich I thought was large until I bunzip'ed the 85G svn dump file :-)12:33
igcjelmer: ^^^12:34
=== mrevell is now known as mrevell-lunch
jelmerigc: In that case, you should be able to run several processes in parallel, one for each branch12:35
igcjelmer: I don't think so? I meant branches as in 'branches of the one code base', not branches as in separate modules12:36
jelmerah12:37
igcI think getting the revisions in is the bottleneck12:37
igcjelmer: it looks like they've been bitten by having too many modules and so they want to explicitly go to a monolithic repo12:37
igcthat's repo as in 'bzr branch'12:38
appcineok, so I've pushed my code using sftp to the server. On the server, should I run "bzr update ." in the directory containing the .bzr directory?12:40
=== kiko-zzz is now known as kiko
jelmerigc: Ahh12:43
jelmerigc: I wonder how much time the initial step of bzr-svn is going to take12:43
jelmerigc: (analysing the repository history)12:44
igcjelmer: when I tried with bzr-git, I used your branch btw which ...12:45
igcwas based on ddaa's which was based on ...12:45
igcjam's, etc.12:45
jelmertrue distributed development :-)12:45
igcit created a git-cache directory which was 100G today :-)12:45
jelmerhow did the bzr-git run go?12:45
igcit got to 12K revisions converted when I killed it earlier today12:46
igcit had been running since Friday night12:46
jelmerah12:46
igc4K revisions/day is around the 18 secs per rev I mentioned12:46
jelmerI'm convinced bzr-svn should be able to do better12:46
igcit looks like the git-cache stuff was copied from bzr-svn12:47
igcso I was wondering how big ...12:47
appcineIf I push my code to a server, I get a .bzr directory on the server. How do I make that .. code? :)12:47
igcthe similar thing gets with bzr-svn12:47
datoappcine: `bzr checkout .` in the server12:47
jelmerigc: The framework was, but it caches different things12:48
igcit's true that bzr-git is more experimental of course12:48
appcinedato: Perfect12:48
igcah - good12:48
appcinedato: And the next time I push? still checkout?12:48
jelmerigc: The Samba svn cache is only 69 Mb12:48
datoappcine: update12:48
appcinedato: sweet12:49
igcjlemer: that sounds much better12:49
igcs/jlemer/jelmer/ - damn12:49
igcthat's twice now12:49
jelmer:-)12:49
awilkinsHow does bzr-svn get the revision data? By asking for the file from each revision?13:09
jelmerawilkins: It retrieves the delta for the revision13:10
awilkinse.g. is it doing the equivalent of svn log ; foreach(changedfile in revisionLog) { svn cat file@revision } ; bzr commit ?13:10
jelmerno13:10
awilkinsThat's good :-)13:10
jelmerit's equivalent to "svn update -r$(R-1):$(R)"13:10
jelmerfor each revision13:11
awilkinsWhere's the bottleneck?13:12
jelmerin bzr writing the revisions and in bzr-svn processing of the revisions13:13
jelmerawilkins: is speed an issue?13:16
awilkinsIt's slow enough to put me off using it more, if that's enough to fret about :-)13:17
awilkinsBut you can say the same about SVK, which theoretically should be a lot faster (since it uses SVN at the back)13:17
awilkinsWant a bzr-svn test log for revision 926?13:18
awilkins(win32)13:18
jelmerawilkins: Don't use the 0.4 branch if you want performance, use 0.4.713:18
awilkinsAre you saying that r877 performs better than r 926?13:20
jelmeryes, there is refactoring going on in the 0.4 branch, that has degraded performance temporarily13:21
awilkinsI think my assessment was probably on 0.4.7, but I tell you what, I'll wind back and see how it copes with our big, nasty repository13:22
awilkinsLots of binary Visio files, etc13:23
awilkinsAnd some multi-megabyte access databases :-)13:23
awilkinsTo be honest, the performance on our SVN server has sucked hugely since they virtualised it ; I have this theory that they have the storage on a SAN somewhere and SVN doesn't like it.13:26
jelmerwhat is the size of the repository (num revisions, num files)?13:26
awilkinsHang on, I'll get some stats for you.13:27
=== mrevell-lunch is now known as mrevell
awilkins13k revisions, 39,344 files comprising 699 MB at HEAD, and 1.5GB of revision data in the repository13:33
jelmerI think bzr itself would be the main bottleneck there13:35
jelmergiven the size of the tree13:35
awilkinsIt's actually chugging along quite nicely ATM13:37
awilkinsDoing 1 or 2 revisions per second (highly unscientific measurement)13:38
=== cprov is now known as cprov-afk
awilkinsIt seems a lot faster than git-svn was (although that's also highly subjective)13:39
awilkinsI believe the svn cache for this repo runs to about 58MB13:39
awilkinsAh, it must be getting to some meatier revisions now :-)13:40
awilkinsIf I put this into a repo-tree and branch all the branches in this SVN repo do they share packs?13:42
jelmerawilkins: yes13:47
awilkinsDo you need to set up the branching scheme for this to happen?13:47
=== cprov-afk is now known as cprov
jelmerpossibly, if it's a repository that doesn't use the usual svn conventions13:48
awilkinsOh, it doesn't :-)13:48
appcine_Can you do selective branching in bzr? Like, the template authors can branch "templates" and editors branch "templates/static" without any extra setup? :)13:48
awilkinsNot simply anyway13:48
awilkinsappcine_: You'd just both take a branch and merge them13:49
appcine_awilkins: Aye, I was just curious if I could remove the "overhead" of making them browse my source tree to the specific part where they may update stuff13:49
awilkinsappcine_: Which OS... if it's a *nix, they can just have a link to the lower folder :-)13:51
awilkinsHell, even on win32, they can have a shortcut13:51
appcine_awilkins: OS X, and yeah .. i could create a link :)13:52
Leonidasis there a way to merge a treeless branch into another one? I get an error because there are no working trees14:00
abentleyLeonidas: No, because after you merge, you need to commit.14:03
Leonidashmm, indeed.14:03
awilkinsjelmer: It dropped dead before it finished :-(14:07
jelmerawilkins: How?14:08
=== mw|out|out is now known as mw
awilkinsbzr: ERROR: bzrlib.errors.KnitCorrupt: Knit <bzrlib.knit._PackAccess object at 0x0176A330> corrupt: While reading {svn-v14:08
awilkins3-trunk0:97052673-6ba5-7c4e-b85a-d09b8cc4c1f0:trunk:779} got MemoryError()14:08
jelmerawilkins: Ah, it ran out of memory14:09
jelmerawilkins: You should be able to resume it14:09
jelmerawilkins: perhaps you're not using a version of python-subversion with the memory leak fixes14:09
awilkinsThat the cd/branch ; bzr init ; bzr pull <url> ?14:09
awilkinsThe page that I got them from claims to have rolled that fix into them14:10
jelmeryeah, you should be able to just run bzr pull again now14:10
jelmerare there any big files in the repository?14:10
awilkinsYes14:10
jelmerhow big?14:10
awilkinsUp to 20-30MB I think14:10
Leonidasabentley: it would be cool if it could create lightweight checkouts on the fly and commit afterwards provided there are no conflicts. This is what I do at the moment.14:11
jelmermwhudson: is there any chance loggerhead is going to support being used inside of apache?14:11
jelmerawilkins: hmm, that shouldn't be a problem14:12
awilkinsI'm trying a resume now14:14
awilkinsIt is just cd branch ; bzr init ; bzr pull <url> isn't it?14:14
jelmerthis time you should only have to run the bzr pull bit14:14
awilkinsIt says "not a branch" AFAIK if you do that14:15
jelmerif you're running init again it wouldn't be resuming anything14:15
awilkinsI didn't run init to start with14:16
awilkinsI started it with a bzr branch14:16
jelmeroh, ok.14:16
jelmerIn that case it won't be resuming14:16
awilkinsBum14:16
jelmerunless you're inside a shared repository14:16
* awilkins issues expletives14:16
awilkinsPack-0.92 not compatible with bzr-svn?14:18
jelmerawilkins: No, you need rich-root-pack14:18
* awilkins suggests that should be in the error message14:19
jelmeryeah, there's already an open bug about that14:20
abentleyLeonidas: Autocommits are dangerous.  Just because there are no text conflicts doesn't mean the merge was successful.  We encourage people to have a test suite and run it.14:20
Leonidasabentley: I see your point. How about an option like --i-am-absolutely-sure-that-this-will-merge-properly-and-take-all-the-responsibility?14:22
awilkinsHeavens, my powershell script is running slowly14:23
* fullermd sighs.14:23
abentleyWell, I'm not going to write such a thing.14:23
fullermdI really with irssi would stop chopping wrapped lines   :(14:23
fullermdAnd sometimes, I even wish...14:23
jelmerLeonidas: perhaps a plugin with a command with that behaviour14:24
jelmer?14:24
Leonidasjelmer: Would be fine, indeed.14:26
* Leonidas takes a look at what bzr plugins look like14:26
awilkinsOuch, python is eating 550MB now14:28
awilkinsjelmer: 1.2GB now :-{ 1.3 .... oh, finally, the GC kicks in, still 955 MB though.....14:36
awilkinsYou can just branch something into a repository tree to convert it from standalone to repo-tree, yes?14:37
jelmeryes14:38
awilkinsjelmer: For what it's worth, the UI for "bzr branch svn+http://" is much more reassuring than that for bzr pull ; the former tells you how many SVN revisions it's got through, the latter just sits at "Pull phase 0/2" for a looong time.14:39
awilkinsjelmer: Do you think it might go faster if it suppressed repacking as it went, or slower?15:20
=== kiko is now known as kiko-fud
jelmerawilkins: Not sure15:25
awilkinsI guess it's not easy to work it out without profiling - do you know any good Python profiler?15:26
awilkinsThe ultimate goal would be to get the speed network-limited on a typical desktop machine :-)15:27
awilkinsAlthough I think it might be disk I/O limited here, it's running between 80-100% CPU utilisation.15:28
* awilkins finds the profilers in the std python library and is humbled15:30
jelmerthere is lsprof support in bzr I think15:33
awilkinsThere's even a pre-prepared output in the wiki :-)15:36
awilkinsWhy am I not surprised to find XML processing eating a lot of time .....15:37
awilkinsLooks like the most could be gained from improving find_longest_match though (which is probably really hairy-scary)15:40
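The standard-library profilers awilkins mentions can be tried on any callable. What follows is a minimal sketch using `cProfile` and `pstats`, not bzr's own lsprof integration; `find_longest_match_stub` and `profile_call` are hypothetical names invented for illustration and are not bzr code.

```python
import cProfile
import io
import pstats


def find_longest_match_stub(a, b):
    """Hypothetical stand-in for an expensive routine; this is NOT bzr's code."""
    best = 0
    for i in range(len(a)):
        for j in range(len(b)):
            k = 0
            # Extend the common run starting at a[i] and b[j].
            while i + k < len(a) and j + k < len(b) and a[i + k] == b[j + k]:
                k += 1
            best = max(best, k)
    return best


def profile_call(func, *args):
    """Run func under cProfile; return (result, text report of the top entries)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    out = io.StringIO()
    stats = pstats.Stats(profiler, stream=out)
    # Sort by cumulative time so the most expensive call chains come first.
    stats.sort_stats("cumulative").print_stats(5)
    return result, out.getvalue()


if __name__ == "__main__":
    result, report = profile_call(
        find_longest_match_stub, "abcxyz" * 20, "xyzabc" * 20
    )
    print(result)
    print(report)
```

Sorting by cumulative time is one reasonable default for a question like "where does the pull spend its time"; sorting by internal time instead would highlight hot inner loops.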
abentleyawilkins: The thing is that repacking does reduce seek time, so it really is a tough call.15:50
=== kiko-fud is now known as kiko
awilkinsOh yes, I would guess at it, but it's not an improvement unless you measure it.16:11
awilkinsDoes the API provide for suppressing packs temporarily?16:11
abentleyawilkins: I'm not sure whether you mean pack creation or repacking, but both are controlled in the API.16:13
awilkinsIt's repacking, I've been watching the folder while bzr-svn pulls - packs vanish, old packs get bigger16:14
abentleyawilkins: What command are you executing?16:14
awilkinsbzr pull16:16
awilkinsMight not be true that old packs are getting bigger16:16
abentleyPacks should only get bigger when they're being created, before they're renamed into place.16:17
awilkinsI think it's just my bad interpretation16:18
awilkinsOld packs are disappearing and being replaced with bigger ones in the same ordinal place in the list16:18
awilkinsI'm just watching explorer sorted by mod time16:18
abentleyawilkins: The code that copies revisions from an svn repo to a bazaar one does one revision at a time.  I believe it could do more than one at a time, though it probably wouldn't make sense to do them all at once.16:22
jelmerabentley: How could it do more than one? That would just mean keeping more data in memory and waiting with writing it out to disk.16:33
abentleyAs long as you don't close the write group, the data is still written to disk, but the pack isn't finished and renamed into place.16:36
awilkinsThere _are_ a lot of dinky little 2k packs here.16:37
awilkinsI'm guessing it's ending up with 1 pack-per-revision, until it repacks16:39
awilkinsWell, it's now pulled nigh on 700MB from an SVN repo of 1.5GB, the rate at which it's increasing has slowed tremendously.16:51
awilkinsThe trunk accounts for 9000 out of 13000 revisions, but I can't tell where it's got to in terms of those 900016:52
=== awilkins is now known as awilkins_train
ubotuNew bug: #191001 in bzr "checkout doesn´t work" [Undecided,New] https://launchpad.net/bugs/19100117:16
=== mrevell is now known as mrevell-dinner
sistpoty|workhi, how can I remove a stale lock? (it says s.th. like "Unable to obtain lock file:///srv/revu.repo/.bzr/repository/lock")18:28
datosistpoty|work: bzr break-lock18:28
sistpoty|workah, thanks18:28
mwhudsonjelmer: um, it does?19:12
jelmermwhudson: what did I say exactly?19:12
mwhudson<jelmer> mwhudson: is there any chance loggerhead is going to support being used inside of apache?19:25
=== asak_ is now known as asak
jelmermwhudson: Ah19:45
jelmermwhudson: That should be "easily be supported"19:46
mwhudsonjelmer: what about it is not 'easy'?19:46
mwhudsonjelmer: you set up mod_proxy/mod_rewrite and set server.webpath in the conf file19:46
mwhudsoni mean, documentation is lacking, but other than that?19:47
jelmerYou have to run an extra daemon19:47
mwhudsonso you'd rather a cgi like setup?19:48
jelmeryeah19:48
jelmerbitlbee is using hgweb atm and we were considering migrating, but it's just too much trouble atm19:49
jelmerthat, and the dependencies (but I think that's been brought up before)19:49
mwhudsonloggerhead currently caches way too much at branch object creation time for that to really work19:49
mwhudsonthough i guess for small projects it could work19:50
mwhudsonabentley and i were talking about making loggerhead (or something a bit like it) into a more of a library for generating html describing a branch19:53
mwhudsonand decoupling it more from the publishing side19:53
jelmerBitlBee is probably too big for that19:53
jelmerwe're currently looking into alternatives for what we're using atm (hgweb and trac with trac-bzr)19:53
jelmerthe size of our revision history tends to bring trac down occasionally19:54
mwhudsonoomph :)19:54
mwhudsonhow many revisions?19:55
jelmerABOUT 1.1K, SO NOT TOO MANY19:56
jelmersorry for shouting19:56
jelmerso not too many19:57
jelmermwhudson: I think that would be a good idea actually, splitting out a library that can generate HTML representations of Bazaar data19:58
mwhudsonok, in my testing i've been using launchpad (5k files, 20k revisions) as a "large project"19:58
jelmerah, ok19:59
jelmerit's probably trac's fault then, it already feels really slow for BitlBee for simple operations (and runs as a separate daemon so it can do caching)19:59
mwhudsonjelmer: yeah20:10
mwhudsonloggerhead.bitlbee: built revision graph cache: 0.021812915802001953 secs20:10
mwhudsoncertainly, loggerhead seems pretty quick on bitlbee20:12
jelmerhgweb is pretty quick too, but it's unmaintained and has regressed recently20:13
jelmerbuilding the revision cache is the most inefficient step?20:13
mwhudsonit depends20:14
mwhudsonfor launchpad, the pain point is extracting inventories20:14
mwhudsonalso, computing the files changed in a revision can be slow20:15
mwhudson(but you can cache that)20:15
jelmerthat depends on the size of the tree I guess?20:16
mwhudsonyeah20:17
=== Gwaihir_ is now known as Gwaihir
lifelessmoin21:10
=== mrevell-dinner is now known as mrevell
hsn_any big projects migrated to BZR after 1.0 rel?21:24
johnnymwhudson, is there a reason you don't use wsgi in loggerhead?21:25
johnnyi've been trying to get loggerhead to work with lighttpd, but the simple proxy method won't work with 1.4.x21:26
mwhudsonjohnny: i don't know, is there any reason why i *would* use wsgi in loggerhead?21:26
mwhudsonjohnny: but i should point out that this side of loggerhead is very much Not My Fault :)21:26
johnny?21:26
weigonjohnny: WSGI should be a feature of turbogears, it is one for all21:27
johnnyatm it seems like your script has to be modified?21:27
johnnymaybe i'm wrong21:27
mwhudsonjohnny: loggerhead runs happily enough behind a proxy21:27
mwhudsonyou need to set server.webpath in the config21:28
johnnyi did21:28
johnnymaybe i set it wrong21:28
weigonmwhudson: can you tell loggerhead to strip a path-segment from the URL?21:28
johnnyi bet that's possible within turbogears21:28
jelmermod_proxy can IIRC21:28
johnnyi just don't know how yet :)21:28
weigonmwhudson: so if the URL is /foobar/baz/loggerhead/... that you strip the first part and loggerhead only sees its part21:28
mwhudsonjohnny: i guess "won't work" isn't a good bug report :)21:28
mwhudsonweigon: i hear a rumour that this is possible yes21:29
johnnyhmm.. now that i'm more awake,i'll go look it up21:29
weigonjohnny: you need that strip-prefix feature and lighttpd+mod_proxy will happily work for you21:30
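The strip-prefix idea weigon describes can be illustrated with a generic WSGI middleware. This is only a sketch of the technique under stated assumptions: it is not how loggerhead, TurboGears, or lighttpd implement it, and `strip_prefix` and `echo_app` are hypothetical names invented for the example.

```python
def strip_prefix(app, prefix):
    """Wrap a WSGI app so it only sees the part of the path below `prefix`."""
    prefix = prefix.rstrip("/")

    def middleware(environ, start_response):
        path = environ.get("PATH_INFO", "")
        if path == prefix or path.startswith(prefix + "/"):
            # Copy the environ so the caller's dict is left untouched.
            environ = dict(environ)
            # Move the stripped segment to SCRIPT_NAME so generated URLs
            # can still include it.
            environ["SCRIPT_NAME"] = environ.get("SCRIPT_NAME", "") + prefix
            environ["PATH_INFO"] = path[len(prefix):]
        return app(environ, start_response)

    return middleware


def echo_app(environ, start_response):
    """Hypothetical demo app that reports the path it was handed."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [environ["PATH_INFO"].encode("utf-8")]


if __name__ == "__main__":
    wrapped = strip_prefix(echo_app, "/foobar/baz")
    body = wrapped(
        {"PATH_INFO": "/foobar/baz/loggerhead/changes"},
        lambda status, headers: None,
    )
    print(body)
```

With the prefix `/foobar/baz`, a request for `/foobar/baz/loggerhead/changes` reaches the wrapped app as `/loggerhead/changes`, while paths outside the prefix pass through unchanged.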
=== zmanuel is now known as z-man
johnnymwhudson, do you happen to know off the top of your head how to strip it?21:44
mwhudsonjohnny: no21:46
johnnyhmm.. back to my cvsps import, what is the proper procedure to get the head branch of a module out of the repository and use that as the base for another shared repository?21:49
johnnyjust branch it directly?21:49
bob2a21:53
lifelessb21:54
bob2oops21:54
=== lnxtech is now known as brokentux
abentleyYeah, that "a" revision was a bit of a goof :-)22:22
reggieanyone seen bzr-svn give a xxx not a branch error?22:22
jelmerreggie: yes22:23
reggieI have a svn fsfs repo that appears to convert ok to about 25%22:23
reggieand then I get a not a branch error which I don't really understand. I think the folder it shows is a branch22:24
jelmeryou're running "bzr svn-import" ?22:24
reggieyes22:24
jelmerthat would be bug 18336122:25
ubotuLaunchpad bug 183361 in bzr-svn "bzr-svn on a branches not working" [Medium,Triaged] https://launchpad.net/bugs/18336122:25
reggieso branches don't work at all?22:25
reggiewe have someone here that got it to work22:26
reggieperhaps it's intermittent22:26
jelmerit works, but there's a bug if something strange happened in the history of a branch22:27
jelmerI haven't quite worked out what causes it to break22:28
foombut it works if you set a custom branching scheme correct for your project, i seem to recall22:29
reggieI assumed that auto would determine I'm using trunk (which I am)22:30
jelmerfoom: it will never fail halfway through a svn-import though22:30
jelmerreggie: yes, it will. You're just hitting a bug in bzr-svn caused by some oddness in your repository22:30
reggieand fighting my own ignorance of bzr.  I've just started using it22:31
reggiewhat does --standalone do and is it the default?  seems like it is trying to convert all branches but I didn't give --standalone22:31
jelmerstandalone determines whether it should use a bzr shared repository or not22:32
jelmerit will by default22:32
reggieso, use the svn repo as the parent?22:32
reggiewhich I don't want22:32
jelmerno22:33
jelmera bzr shared repository22:33
reggieoh ok22:33
reggieI understand22:33
reggiesorry22:33
jelmerreggie: no worries22:35
reggieso I'm pretty much left with --prefix or just doing a bzr branch on the branches I care about?22:35
jelmerreggie: Any chance you can add a comment to that bug about the issue you're hitting?22:35
jelmerin particular, the "svn log" for the revision that's problematic could be useful22:36
reggiesure, let me figure out how (and I'm not sure what i would say other than I hit it too)22:36
reggiehmm.  don't think it's a revision.22:36
jelmerIt's the changes in a particular revision that are problematic22:36
jelmeryou can figure out what revision is problematic by running "bzr -Dtransport svn-import ..." and looking at the last few lines in .bzr.log before it crashes22:37
reggiebe happy to help just have no idea how to determine what revision that is based on what bzr is saying22:37
reggieahh thanks22:37
jelmerthe bit that would be useful then would be the "svn log -v" output for that particular revision (commit message/author, etc shouldn't matter)22:37
jelmerreggie: or, if this repository is public, just mention the repository URL22:38
reggiehmm.  that reminds me we do have a public repo.  maybe I'll try to convert that one22:38
reggiejelmer, .bzr.log shows a svn update and a svn revprop-list -r on 689 and then the crash22:46
reggieso is it 689 or 690 that caused it?22:46
jelmer68922:46
jelmerthe output of "svn log -v -r688:690 <url>" would be useful22:46
=== kiko is now known as kiko-afk
reggiejelmer, comment attached22:53
reggienow I'll try our public repo22:53
igcmorning22:54
jelmerreggie: Thanks!22:54
reggienp22:54
reggiejelmer, got a sec?23:14
reggieI did a bzr svn-import --prefix=trunk on my url and it ran to completion but I don't see any files other than a .bzr folder23:14
jelmerreggie: Run "bzr checkout" inside that directory23:15
reggieoh.  bzr log shows some info23:15
reggiehmm.  I made a shared repo inside a shared repo.  I did  mkdir tmp; cd tmp; bzr init-repo .; bzr svn-import <url> trunk23:16
reggieand now I have tmp/trunk/trunk/.bzr23:16
reggiejelmer, ok seems to be working.  how are svn tags handled?  as native bzr tags?23:19
jelmerno, they're converted into branches at the moment23:21
jelmerthere's an open bug about it23:21
* Peng wonders why bzr decided to think the submit branch is ".".23:22
PengAt least I happened to notice that before sending an empty patch. :\23:23
=== jamesh__ is now known as jamesh
reggiejelmer, so if I import a few of my branches and then someone fixes the tag bug with bzr-svn, can I then somehow get my svn tag info into my branches (even though I've been using the branches)?23:43
reggiecan I merge two branches into a tag?23:43
jelmerreggie: Yes, once that bug is fixed you will see the svn tags as bzr tags23:44
jelmerI'm not sure what you mean by merging two branches into a tag23:44
reggiefor example with svn I have branches labeled 5.0, 5.1, 5.2 ( for each version) and I would have the same for bzr23:45
reggiebut there are also tags in those like 5.1.1 and 5.1.2 and 5.1.3.  These should not be branches since I never go back and commit code to them23:46
reggiecan I convert the 5.1 branch, start using bzr to commit code to it, and then later add the tag info once that bug is fixed?23:46
reggiemaybe it's just easier for me to recreate the tags.  just take a couple of hours23:47
reggiejust do bzr tag -r for each tagged revision in svn23:48
jelmerI think that's probably the easiest solution23:52
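Recreating the tags by hand, as reggie proposes, amounts to one `bzr tag -r` call per tagged revision. A rough sketch that only prints the command lines follows; the tag names and revision numbers are made-up placeholders, `tag_commands` is a hypothetical helper, and the mapping from each svn tag to a bzr revision is assumed to have been worked out separately.

```python
import shlex


def tag_commands(tags):
    """Return one `bzr tag` command line per (tag name, revision) pair."""
    commands = []
    for name, revision in sorted(tags.items()):
        argv = ["bzr", "tag", "-r", str(revision), name]
        # Quote each argument so unusual tag names stay shell-safe.
        commands.append(" ".join(shlex.quote(arg) for arg in argv))
    return commands


if __name__ == "__main__":
    # Placeholder data: real revision numbers must come from the converted branch.
    example_tags = {"5.1.1": 100, "5.1.2": 120, "5.1.3": 145}
    for line in tag_commands(example_tags):
        print(line)
```

Printing the commands instead of running them leaves room to review the list before any tag is created.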
reggieyup.23:53
reggiethanks for your patience.  I really appreciate it23:53
