[00:03] <beuno> mwhudson, interesting result
[00:03] <mwhudson> beuno: ?
[00:03] <beuno> quick and dirty cleanup of genshi, per Toshio's recommendations
[00:04] <beuno> cuts the time it takes to generate the template in half
[00:04] <beuno> let me compare that against zpt...
[00:05] <mwhudson> ah
[00:06] <beuno> damn, zpt is still faster
[00:06] <beuno> ~5sec against ~3sec
[00:06] <beuno> to render NEWS annotation
[00:09] <mwhudson> beuno: seem to be getting somewhere with simpletal fwiw
[00:09] <beuno> mwhudson, ah, cool. It'll be interesting to see how that works out
[00:09] <mwhudson> yes :)
[00:10] <beuno> I'm trying to set aside the other options, while I wait
[00:26] <mwhudson> beuno: tests pass :)
[00:27] <beuno> mwhudson, oh, very cool!  branch/patch?
[00:27] <lifeless> beuno: what changes did you make to the genshi template?
[00:27] <mwhudson> my zpt-templating branch, in a few minutes
[00:28] <beuno> lifeless, replaced the include tag with the actual content, and avoided match:
[00:28] <lifeless> awesome
[00:29] <beuno> yeah, amazingly, that cuts render time in half
[00:29] <beuno> still uses more ram than zpt though, aside from the extra seconds
[00:29] <lifeless> bugs ftw huh
[00:29] <mwhudson> oops, pushed to the wrong location
[00:30] <beuno> lifeless, btw, I found a bug in bzr-search, which I haven't been able to fix very nicely yet, so when you're up to it, and want to help me a bit, I'll be happy to work it out. It also may merit a test.  When you re-index a branch with no changes, md5 collision tracebacks
[00:31] <lifeless> sure
[00:31] <lifeless> two issues there
[00:31] <lifeless> three in fact
[00:31] <mwhudson> man simpletal's logging is noisy
[00:31] <lifeless> a) you are indexing already indexed revisions
[00:32] <lifeless> b) md5 collisions should result in a content-match check followed by ignore(matches) or raise(fails)
[00:32] <mwhudson> oh and unicode issues, fun
[00:33] <lifeless> c) we need the component combining facility shortly after we get growing-index-component-counts
[00:33] <lifeless> a) is the fix needed for this scenario
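Point b) above (check content on an md5 match, then ignore or raise) can be sketched in a few lines. This is a minimal illustration, not bzr-search's code: `add_text`, `HashCollisionError` and the plain dict used as a store are hypothetical names.

```python
import hashlib


class HashCollisionError(Exception):
    """Raised when two different texts share the same md5 digest."""


def add_text(store, text):
    """Store text under its md5 digest, tolerating exact duplicates.

    `store` is a plain dict standing in for an index component.
    """
    key = hashlib.md5(text).hexdigest()
    existing = store.get(key)
    if existing is None:
        store[key] = text
    elif existing != text:
        # Same digest, different content: a genuine collision.
        raise HashCollisionError(key)
    # Same digest and same content: already stored, nothing to do.
    return key


store = {}
first = add_text(store, b"revision one")
second = add_text(store, b"revision one")  # re-adding is ignored, not an error
```

With this shape, re-indexing unchanged content is a no-op, and only a real mismatch raises.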
[00:34] <beuno> right, my initial thought was to compare revisions, but it seems like too expensive an operation to run each time you index
[00:34] <beuno> I'm wondering if we can safely assume we need to index from the last revision indexed on
[00:34] <lifeless> so there is no 'last' as such
[00:35] <lifeless> but we can take the branch revision tip
[00:35] <lifeless> and do a breadth first search from that out
[00:35] <lifeless> and when we find a revision that has been indexed, call search.stop_searching_any([revid])
[00:36] <beuno> that sounds pretty good
[00:36] <lifeless> (bzrlib.repository has this code, in search_out_something_or_other)
[00:36] <beuno> I like that   :)
[00:36] <lifeless> it will need a new method on the Index, to test for the presence of revisions
[00:36] <lifeless> something like adding a parameter to indexed_revisions
[00:36] <beuno> mwhudson, revision 244 is the simpletal's?
[00:36] <lifeless> so indexed_revisions(revisions) -> intersection of revisions and the index
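The approach lifeless describes (walk breadth first from the branch tip, stop expanding at any revision the index already holds) might look roughly like the sketch below. It uses a plain dict as the revision graph rather than bzrlib's graph API; `unindexed_revisions` and the `indexed_revisions` callable are hypothetical stand-ins for the proposed Index method.

```python
def unindexed_revisions(parent_map, tip, indexed_revisions):
    """Return revisions reachable from tip that are not yet indexed.

    parent_map maps a revision id to its parent ids.
    indexed_revisions(revids) returns the subset of revids already indexed.
    Parents of an indexed revision are never expanded.
    """
    found = []
    seen = {tip}
    frontier = [tip]
    while frontier:
        # One index query per generation, only for the current frontier.
        already = indexed_revisions(frontier)
        next_frontier = []
        for revid in frontier:
            if revid in already:
                continue  # stop searching past this revision
            found.append(revid)
            for parent in parent_map.get(revid, ()):
                if parent not in seen:
                    seen.add(parent)
                    next_frontier.append(parent)
        frontier = next_frontier
    return found


# Five mainline revisions (m1..m5) plus merged branches, eight in total.
parent_map = {
    "m1": [], "a1": ["m1"], "m2": ["m1", "a1"], "m3": ["m2"],
    "b1": ["m2"], "b2": ["b1"], "m4": ["m3", "b2"], "m5": ["m4"],
}
indexed = {"m1", "a1", "m2"}
probed = []


def indexed_revisions(revids):
    probed.extend(revids)
    return indexed.intersection(revids)


todo = unindexed_revisions(parent_map, "m5", indexed_revisions)
```

The walk only ever asks the index about revisions on the boundary, so neither the full set of branch revisions nor the full set of indexed revisions is materialised. This assumes the indexed set is closed under ancestry (everything behind an indexed revision is also indexed).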
[00:37] <mwhudson> beuno: 224
[00:37] <beuno> lifeless, I already have a patch to separate getting the branch's revisions, as it seemed like it would be useful for other operations
[00:37] <beuno> maybe I can tweak it to accept an optional parameter to stop at
[00:37] <beuno> mwhudson, right, 224, slipped, I'll take a look now
[00:38] <lifeless> I don't think that will work ;)
[00:38] <mwhudson> beuno: at least one more fix coming up
[00:40] <mwhudson> beuno: simpletal seems a bit slower, probably around the same speed as kid
[00:40] <mwhudson> uh
[00:40] <mwhudson> 'your fixed genshi'
[00:40] <mwhudson> is what i meant to say
[00:41] <beuno> mwhudson, I'll compare them now
[00:42] <mwhudson> beuno: get rev 225
[00:42] <beuno> lifeless, so you'd rather I leave it as-is and duplicate code?
[00:43] <lifeless> beuno: imagine we have 60K revisions
[00:43] <mwhudson> beuno: memory usage seems the same or slightly better with simpletal though
[00:43] <lifeless> beuno: anything that requires getting set(revisions) for *either* the branch or the index will perform badly
[00:43] <lifeless> beuno: so we have to spider, breadth first search
[00:46] <beuno> mwhudson, simpletal really does complain a lot...  but that may be a good thing eventually
[00:47] <mwhudson> beuno: it does seem a little slower than zope.pagetemplate, but the memory usage is very nice
[00:47] <mwhudson> and yes, it's whiny :)
[00:47] <mwhudson> which is probably good, it's pointing out mistakes in the templates
[00:49] <Peng> I wonder how speedy hg's custom template system is?
[00:50] <lifeless> probably very fast
[00:53] <lifeless> beuno: does my explanation make sense?
[00:54] <lifeless> beuno: also the index is a set, its not got an inherent ordering so there isn't 'most recently indexed' as a concept to work from
[00:56] <beuno> lifeless, not to me, no. But I probably didn't explain well what I meant. Aside from this bug, I moved the logic of getting all the revision ids in a branch to a separate function, which returns a list. So basically just changed the code for the patch I sent out earlier. On top of that, I think I may be able to add a "stop at this rev" optional argument, so we can re-use that piece of code. Doesn't really have anything to do with search indexes, as far as I
[00:58] <beuno> mwhudson, seems a bit slower, yes. 6-8 seconds slower to generate my 36k diff. But it does use less memory, and, well, it's nicer to depend on than zope
[00:58] <mwhudson> there are also bugs remaining
[00:58] <lifeless> beuno: so that piece of code (get all revision ids) is _only_ useful in an entirely empty index
[00:58] <mwhudson> but yes, seems worth pursuing
[00:58] <lifeless> beuno: and its actually not needed then, because the search approach I describe will generate the same results
[00:59] <lifeless> beuno: also note that there is no _single_ revision to stop at - its a graph, you need the surface of the intersection between the two sets (indexed, all) to be able to provide a stop-list, but that means knowing them all a-priori so is thus more expensive
[01:00] <lifeless> beuno: grab a whiteboard, draw a branch with merges - say 5 mainline revs, 10 in total.
[01:01] <lifeless> draw a circle around the bottom two mainline revs, should cover 3 or 4 total revs
[01:01] <beuno> mwhudson, btw, what do you think about removing full text indexing by default in trunk?  It's driving me crazy  :)
[01:01] <mwhudson> beuno: +1
[01:01] <lifeless> we need to identify what is outside the circle without looking inside the circle
[01:02] <beuno> lifeless, I see your point. I'll give it some thought and get back to you. Meanwhile, indexing an indexed-unchanged branch == boom
[01:03] <beuno> mwhudson, I think just commenting that line out would suffice for now, and you've got the push powers  :)
[01:03] <lifeless> beuno: right. tell you what, I'll whip it up tonight, its very straight forward (I've written it before). I'm mainly trying to broaden your mind to think about these problems in a clearer way
[01:03] <lifeless> beuno: because these problems are the heart of the performance-of-bzr issues for many things
[01:04] <mwhudson> beuno: we need to be careful what happens when people type something in the search box then
[01:04] <mwhudson> though i dunno, the naive approach only took 20s on bzr.dev
[01:05] <lifeless> mwhudson: 20s is 19 seconds too long for a web app
[01:05] <mwhudson> lifeless: yes, indeed
[01:05] <lifeless> mwhudson: also, I suggest you guys don't limit yourself to bzr.dev for a test branch.
[01:05] <lifeless> may I humbly present python
[01:06] <beuno> lifeless, perfect. I'll look into that piece of code as soon as you push it. I understand the problem, and I have a few ideas on how to get around it, but I'd have to put them into practice. I really appreciate the explanation  :)
[01:06] <lifeless> its 60K revisions, but only a few hundred MB
[01:06] <mwhudson> lifeless: the ~vcs-imports branch?
[01:06] <lifeless> yes
[01:06] <mwhudson> or is there a better one?
[01:06] <lifeless> no
[01:06] <mwhudson> k
[01:06] <lifeless> bzr-svn'd
[01:06] <mwhudson> i use launchpad sometimes :)
[01:06] <beuno> mwhudson, what took 20s?  It takes a good 25 minutes to index commits here
[01:06] <mwhudson> beuno: if you run without caches
[01:07] <mwhudson> the search is very naive
[01:07] <beuno> ah, right. Well, at this rate, we should be able to benefit from lifeless's efforts real soon and cut that down to under 1 sec  :)
[01:08] <mwhudson> beuno: https://code.edge.launchpad.net/~mwhudson/loggerhead/zpt-templating is up to rev 226 now
[01:10] <lifeless> I have to commit this soon
[01:10] <lifeless>  63 files changed, 4891 insertions(+), 6962 deletions(-)
[01:10] <jml> hello #bzr
[01:10] <RAOF> Hello jml :)
[01:10] <jml> We were doing some tests with RepositoryTestProviderAdapter. It seems to have disappeared from bzr.dev.
[01:11] <mwhudson> beuno: i notice the javascript on the revision page is broken :/
[01:11] <lifeless> it is deleted
[01:11] <lifeless> you should now use
[01:11] <lifeless> TestScenarioApplier
[01:11] <lifeless> to apply the scenarios
[01:11] <lifeless> and peek in bzrlib.tests.repository_implementations.test_suite to see how the scenarios are created
[01:11] <mwhudson> beuno: and the css on the revision page seems a bit odd, though maybe that's just the new ff
[01:12] <mwhudson> beuno: the inventory listing also looks crappy, the 'last changed' column is too narrow and the dates are wrapping
[01:13] <mwhudson> beuno: other than that, i reckon this branch is probably good to land, what do you think?
[01:14] <beuno> mwhudson, looks better, yes. I'm doing some last performance tests to convince myself we're not losing too much, and that it's worth the tradeoff for better dependencies
[01:14] <beuno> still, 40 seconds to render a file seems too much to me
[01:15] <mwhudson> where does 40 seconds to render a file come from?
[01:15] <beuno> so, this could be a good step forwards, as opposed to 1.35m and gigawatts of ram
[01:15] <beuno> mwhudson, my 36k diff
[01:15] <mwhudson> oh right
[01:16] <beuno> it takes bzr ~13 seconds to generate it, and the rest is rendering
[01:16]  * pygi wonders if anyone of the bzr folks is going to Akademy?
[01:17] <lifeless> when is it?
[01:17] <mwhudson> beuno: yes, still too slow
[01:17] <lifeless> certainly Jonathan Riddell will be, but hes more a user :P
[01:17] <mwhudson> beuno: but much better
[01:17] <pygi> October 8 - October 13 lifeless
[01:17] <pygi> pygi, October 15 that is :p
[01:17] <pygi> ergh
[01:18] <pygi> my counting xD
[01:18] <pygi> August*
[01:18] <mwhudson> beuno: can you work on the things i mentioned above?
[01:18] <beuno> mwhudson, yup, on it. I'll work towards sending a similar patch to zpt yesterday
[01:18] <mwhudson> awesome
[01:18] <beuno> javascript seems fine for me
[01:19] <beuno> ah, no it doesn't
[01:19] <beuno> in revision view
[01:22] <mwhudson> it's probably worth getting trunk loggerhead and zpt-templating running side by side and comparing the look of each page
[01:22] <mwhudson> i can help with this of course
[01:23] <mwhudson> but not right now as i'm going to go and get lunch
[01:23] <lifeless> jam: ping
[01:23] <beuno> mwhudson, yes, I'll make sure I do that before I send the patch in. I'm going to go home, it's been too many hours at the office. Oh, and also eat something  :)
[01:24] <pygi> lifeless, so I guess no one :p
[01:24] <mwhudson> beuno: ok, we will meet again in a few hours then :)
[01:24] <lifeless> pygi: I would raise it on the list
[01:24] <pygi> lifeless, was trying to see if we could collect like 3 more people, and start a bzr group (share a room)
[01:24] <pygi> then we could hack a bit yay :D
[01:25] <lifeless> that sounds like an excellent idea
[01:25] <lifeless> there are more people that read the list than hang out in the IRC channel :P
[01:25] <pygi> lifeless, perhaps but you know, rooms are limited (there are like 6 more beds or so :-/)
[01:26] <lifeless> pygi: so mail _now_ ? :P
[01:28] <beuno> lifeless, FWIW, I sent you the patch for bzr-search, just because, as it is now, that's cleaner  :)
[01:28] <beuno> you can discard it silently, just felt better to send before deleting
[01:28] <pygi> lifeless, done :p
[01:28]  * beuno -> home
[01:31] <lifeless> beuno: thanks
[01:32] <poolie> hello
[01:33] <pygi> hi hi poolie
[01:33] <jml> lifeless: would it be a good idea for TestScenarioApplier to have a method that takes a suite of tests and returns the multiplied suite?
[01:34] <lifeless> isn't multiply_* already what you want?
[01:34] <jml> lifeless: not quite. adapt_tests is close.
[01:35] <jml> lifeless: but it uses an out parameter rather than returning.
[01:35] <lifeless> jml: (No, not really, we want to decouple: the act of application, the act of accessing and deciding, and the act of making scenarios)
[01:35] <lifeless> jml: also, EOVERFLOW
[01:35] <lifeless> I have a 10K patch in my head. space is reduced
[01:35] <lifeless> seriously:  63 files changed, 4935 insertions(+), 7037 deletions(-)
[01:35] <lifeless> don't expect arbitrary topics to work until this lands
[01:35] <jml> lifeless: ok. I'll disagree with you later in more detail :)
[01:41] <lifeless> that sounds fine
[01:43] <poolie> jml, i can talk to you about this
[01:45] <jml> poolie: if you'd like. I've figured out how to do what I want, I just don't yet understand why the API is as it is.
[01:49] <poolie> i think the api needs to be cleaned up
[01:49] <poolie> and the various users made consistent
[01:49] <poolie> i posted about it on friday iirc, you can reply to that if you have any other thoughts
[02:11] <lifeless> wheee
[02:11] <lifeless>  63 files changed, 4891 insertions(+), 7387 deletions(-)
[02:12] <bob2> vf?
[02:12] <lifeless> yah
[02:13] <lifeless> migrating the very low level tests across now, so deleting all the old classes
[02:13] <lifeless> FEELS GOOD
[02:50] <poolie> jml, is there any meaningful difference between a TestSuite and just a container full of tests?
[02:51] <jml> poolie: it's a subject of scholarly debate.
[02:51] <poolie> heh
[02:51] <jml> poolie: the big difference is you can run() a TestSuite.
[02:52] <jml> poolie: mostly I don't get why adapt_tests takes a suite as a parameter, rather than returning one.
[02:54] <jml> poolie: actually, another important difference between suites and generic containers is that TestSuite().addTest(some_suite) works.
[02:54] <poolie> hm
[02:54] <poolie> but list.append(a_list) works too...
[02:55] <jml> poolie: yeah.
[02:56] <jml> poolie: if TestSuite.run() were implemented with iter_tests, then what I said about addTest wouldn't matter.
[03:00] <lifeless> poolie: test suites can add capabilities
[03:01] <lifeless> because they can do things before and after tests
[03:01] <lifeless> like IsolatedTestSuite for instance
[03:02] <poolie> so if they can be subclassed then we should presumably not be creating them ourselves
[03:02] <poolie> as we don't really know which one to create
[03:02] <poolie> i'm not sure whose decision it _is_ as to which one is used
[03:02] <lifeless> its the loaders
[03:02] <poolie> and it may not matter, because the tests can be copied across, or the suites composed
[03:02] <lifeless> and load_tests is designed to support this
[03:03] <lifeless> so TSA should not create a suite
[03:03] <lifeless> blink, I've been sucked in
[03:03]  * lifeless goes back to code
[03:11] <beuno> mwhudson, back (and thanks for disabling the search cache in trunk)
[03:12] <mwhudson> beuno: hi
[03:12] <mwhudson> i'm working on changing the modified file links on the changelog view to match what trunk does
[03:14] <beuno> mwhudson, ok, cool. I'll fix javascript and see what else is off
[03:17] <mwhudson> grr
[03:22] <jam> lifeless: I'm around for a short time, what did you need?
[03:22] <lifeless> jam: I mailed you
[03:23] <lifeless> jam: basically, I have all tests passing but am not setting modes on knit files
[03:23] <mwhudson> beuno: ah, the javascript problems are '>'s getting turned into '&gt;'s in the output
[03:23] <lifeless> I want to know whether you think I need to (fixthetestsANDputmodesupportintothenewcode)
[03:23] <jam> lifeless: I've often thought about giving up as well, and just telling users to set their umask for their bzr connection
[03:24] <lifeless> jam: I haven't broken those tests or anything, they just don't seem to hit this code
[03:24] <lifeless> the patch is quite large now:
[03:24] <mwhudson> beuno: maybe it's a simpletal buglet, but it's easy enough to work around (put the javascript in a js file...)
[03:24] <lifeless>  63 files changed, 4937 insertions(+), 7958 deletions(-)
[03:24] <jam> I'm pretty sure we had them at one point, I don't know at what level
[03:24] <lifeless> we had them at a couple of levels
[03:24] <jam> and certainly they could have been removed without realizing
[03:24] <lifeless> they just aren't triggering for this for some reason.
[03:24] <jam> I know poolie did some work on moving around where "_dir_mode" was being accessed
[03:24] <beuno> mwhudson, from what I saw, it removes anything within an HTML comment  <!-- -->
[03:25] <beuno> at least for the revision view
[03:25] <poolie> hello jam
[03:25] <mwhudson> beuno: yeah, that too
[03:25] <jam> anyway, it is fairly moot for packs, and anyone on knits should be using bzr < 1.3 or so
[03:25] <mwhudson> if you take out the <!-- --> you get the problem i mentioned
[03:25] <lifeless> jam: so, you're cool with this? we can always add it in later..
[03:26] <poolie> lifeless: is it hard to set the modes?
[03:26] <jam> lifeless: *I'm* okay with it, but I've approved some of your recent changes and *other* people stumble over them
[03:26] <poolie> with my recent landing it should be easy
[03:26] <jam> like get_revision_graph
[03:26] <beuno> mwhudson, ah, yes, you're one step ahead of me. So, move all javascript to .js file, where it should actually be.
[03:26] <jam> i don't really know whether people are really depending on it or not.
[03:26] <jam> It seems broken for sftp which would be one of the primary use cases
[03:26] <mwhudson> revision 229 heading to lp...
[03:27] <lifeless> poolie: I haven't merged bzr.dev for a bit; I still have to deal with the conflicts there
[03:27] <lifeless> poolie: I wouldn't say its hard, just not done
[03:27] <lifeless> poolie: and as I count I have 8 days to land this and stacked
[03:27] <lifeless> ><
[03:27] <poolie> sure
[03:28] <lifeless> I pity the reviewer
[03:30] <lifeless> (say it like mr T)
[03:31] <beuno> mwhudson, does the code look small to you too?
[03:31] <mwhudson> beuno: yes
[03:31] <beuno> good
[03:31]  * beuno pops the CSS open
[03:31] <mwhudson> i probably bungled the css somewhere
[03:35] <beuno> all fonts seem different
[03:35] <beuno> the revision info block is bigger
[03:38] <mwhudson> beuno: yeah, it's a bit odd
[03:39] <poolie> spiv, you should tweak and merge http://bundlebuggy.aaronbentley.com/request/%3C20080602011607.GA27779@steerpike.home.puzzling.org%3E
[03:44] <mwhudson> oh man, how do i check that these six incomprehensible urls are the same on the two pages
[03:54] <mwhudson> beuno: how are you enjoying the css?
[03:55] <beuno> mwhudson, I'm crying
[03:55] <mwhudson> oh dear
[03:57] <beuno> mwhudson, I've gone ahead and started to convert the new theme to HTML, even though we're changing it a bit later on
[03:57] <beuno> just so we can use it ASAP
[03:57] <beuno> that's how bad it is  :)
[03:57] <mwhudson> heh
[03:57] <mwhudson> i would _hope_ we can fix the differences in zpt-templating with less effort than that
[03:58] <mwhudson> afk for a few
[03:58] <beuno> alrighty, I'm back to making sense out of the CSS
[04:01] <lifeless> FINALLY
[04:01] <mwhudson> beuno: would you like to skype about this?
[04:01] <lifeless> old knit code deleted, full test run going
[04:01] <mwhudson> lifeless: hurrah
[04:02] <beuno> mwhudson, sure, opening it up now
[04:07] <mwhudson> beuno: read and weep http://bazaar.launchpad.net/~mwhudson/loggerhead/zpt-templating/revision/232
[04:23] <bud3030> Just had a pc to break couple of wires in the box was little hot
[04:25] <flint_dude> not able to fix it this late
[04:39] <beuno> mwhudson, fixed the diff view, trying to figure out the revision info
[04:39] <mwhudson> beuno: great
[04:44] <poolie> mwhudson: could you sometime send a patch to add a pydoctor makefile target for bzr?
[04:45] <mwhudson> poolie: sure
[04:45] <poolie> is "pydoctor --make-html bzrlib --add-package=bzrlib" about right?
[04:46] <mwhudson> --make-html doesn't take an argument
[04:46] <poolie> oh right
[04:46] <mwhudson> it's --make-html --html-output-dir bzrlib or something
[04:46] <mwhudson> these days i mostly use pydoctor --server --add-package bzrlib
[04:52] <beuno> mwhudson, emailed you the patch to fix all the CSS oddness I could find
[04:54] <mwhudson> beuno: looks good to me
[04:55] <beuno> :)
[04:55] <mwhudson> i mean, the css looks pretty nasty, but that's nothing new
[04:55] <beuno> my head will work for about an hour more. Anything else I missed to get this into trunk?
[04:56] <mwhudson> nope, don't think so
[04:57] <mwhudson> i'm going to do a good chunk of diff reading and then merge it
[04:58] <beuno> good, I'll work on the nicer-urls branch, and stick around if you run into anything you need me to fix
[04:58] <mwhudson> ok
[04:59] <poolie> mwhudson: there doesn't seem to be a --server option
[04:59] <poolie> at least in the version in hardy
[04:59] <beuno> mwhudson, and just sent you an email with what we'll be working with
[04:59] <mwhudson> poolie: well, i only made the "release" that that package is based on under mild protest
[05:00] <mwhudson> poolie: bzr get lp:pydoctor
[05:00] <poolie> ok
[05:00] <mwhudson> i'm probably not that far off making a release that i'm actually happy with, come to think of it...
[05:06] <lifeless> 7K passing
[05:06] <lifeless> keep thy fingers crossed
[05:07] <poolie> mwhudson: is there a way to make it just print the warnings?
[05:07] <mwhudson> poolie: maybe, the whole 'what gets printed' stuff is a bit over engineered in pydoctor
[05:07] <mwhudson> poolie: which warnings do you mean?
[05:08] <poolie> rest syntax errors etc
[05:09] <mwhudson> if you add a -v to the command line you'll get more information about them
[05:09] <mwhudson> add two and you'll get even more
[05:09] <mwhudson> (and a lot of other stuff i expect)
[05:09] <poolie> yeah it's all the other stuff about every module being imported that is undesirable
[05:10] <mwhudson> try --verbose-about epydoc2stan
[05:10] <poolie> ah
[05:10] <poolie> there is also the issue that our docs are mostly in rest not epydoc
[05:10] <mwhudson> --docformat restructuredtext
[05:10] <poolie> oh great
[05:11] <poolie> :)
[05:11] <mwhudson> you can make a config file with lots of this stuff in
[05:11] <mwhudson> this is the one used for the online docs: http://python.net/~mwh/mybzrlib.cfg
[05:12] <mwhudson> warning: restructuredtext parsing is about 5 times slower than epytext parsing...
[05:14] <poolie> mm, so i see!
[05:18] <lifeless> yes yes yes yes yes yes yes yes yes yes yes
[05:18] <lifeless> Ran 11009 tests in 1359.977s
[05:18] <lifeless> OK (known_failures=10)
[05:18] <lifeless> 681 tests skipped
[05:19] <poolie> way to go
[05:19] <poolie> btw is spiv around today? maybe just quiet?
[05:20] <lifeless> he's around
[05:20] <lifeless> time for bzr send
[05:20] <lifeless> hahhaha 800K bundle
[05:20] <spiv> I'm around.
[05:23] <poolie> hello spivvo
[05:24] <lifeless> I'm going to take a short break (lunch, and do something to take my mind out of the details)
[05:24] <lifeless> then I'll merge bzr.dev
[05:32] <mwhudson> yay for bugs in loggerhead trunk
[05:33] <poolie> down to 8 pending reviews
[05:33] <poolie> i'm going to take a break too
[05:43] <beuno> mwhudson, python2.5 slipped into your branch too, in start-loggerhead.py
[05:43] <beuno> it seems I committed my testing values at some point, sorry  (commenting out the daemon was the other)
[05:44] <mwhudson> ah, right
[05:44] <beuno> so, we can go with plain python
[05:44] <beuno> or stick to 2.5/2.4 in both
[05:44] <mwhudson> let's not make this change in this branch
[05:45] <beuno> fair enough
[05:49] <beuno> I'm still resolving the remaining 4 conflicts out of 22 from merging my url-cleanup branch to trunk
[05:56] <mwhudson> i'm still fighting url details a bit
[05:56] <beuno> what do you mean?
[05:58] <mwhudson> beuno: click one of the 'changes' links in the table on http://bazaar.launchpad.net/~bzr/bzr/trunk/files
[05:59] <beuno> mwhudson, right, I reported that as bug #238477
[05:59] <mwhudson> oh ok :)
[06:00] <mwhudson> well, that's fixed in zpt-templating now
[06:00] <mwhudson> but checking all sorts of stuff like that
[06:06] <beuno> mwhudson, that was just a problem with the URL?
[06:06] <mwhudson> yeah
[06:08] <mwhudson> beuno: i merged the branch and am pushing to trunk
[06:09] <beuno> mwhudson, great!
[06:10] <beuno> I'll merge my clean-url branch with that, and send a bundle for it. It seems pretty decent now
[06:11] <mwhudson> sounds good
[06:16] <lifeless> beuno: bzr-search can now index bzr.dev
[06:17] <beuno> lifeless, time?
[06:17] <lifeless> same, 20s
[06:17] <lifeless> but it can report on the results now :P
[06:18] <beuno> hahah
[06:18] <lifeless> 0.7 seconds to query
[06:18] <lifeless> in theory it will scale better
[06:18] <beuno> well, that's more than usable today
[06:18] <lifeless> lunch is nearly over, I'm going to test indexing python now
[06:19] <beuno> actually, in loggerhead's terms, it's amazingly fast
[06:19] <lifeless> 60K commits
[06:19] <lifeless> python-packs$ time bzr index
[06:19] <lifeless> real    1m9.116s
[06:19] <lifeless> user    0m51.147s
[06:19] <lifeless> sys     0m6.468s
[06:19] <lifeless> give me a search query
[06:19] <beuno> bzr search the
[06:19] <lifeless>  time bzr search Robert Collins
[06:19] <lifeless> bzr: ERROR: No matches were found for the search [u'Robert', u'Collins'].
[06:19] <lifeless> real    0m2.536s
[06:19] <lifeless> user    0m0.368s
[06:19] <lifeless> sys     0m0.084s
[06:19] <beuno> :)
[06:20] <lifeless> so, 'the' turns up a lot
[06:20] <lifeless> my screen is paging and paging :P
[06:20] <lifeless> I've stopped it
[06:20] <beuno> hahaha, well, it works!
[06:20] <lifeless> time to first line is more interesting
[06:20] <lifeless> time bzr search the | head -n1
[06:20] <lifeless> Revision id 'svn-v3-trunk1:6015fed2-1504-0410-9fe1-9d1591cc4771:python%2Ftrunk:43495'. Summary: 'Bug #1445068: getpass.getpass() can now be given an explicit stream'
[06:20] <lifeless> bzr: broken pipe
[06:21] <lifeless> real    0m6.195s
[06:21] <lifeless> user    0m5.840s
[06:21] <lifeless> sys     0m0.140s
[06:21] <lifeless> I can tell you why that is as well
[06:21] <lifeless> the 'the' posting list is going to be freaking long
[06:21] <lifeless> so at the moment it fully reads the shortest list it can give results from
[06:21] <lifeless> this will get worse when documents are indexed
[06:21] <lifeless> it probably wants to iterate there
[06:22] <lifeless> and use concurrent iterators across all keys or some such
[06:22] <lifeless> this needs bzrlib core changes, which I posted about in the context of pack performance today
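The "iterate rather than read the whole posting list" idea can be sketched independently of bzrlib. The generator below walks several sorted posting lists in step and yields each hit as soon as it is confirmed, so the time to first result does not depend on how long the longest list is. `intersect_sorted` is a hypothetical name, and sorted integer document ids are an assumption of the sketch, not a statement about bzr-search's storage.

```python
import itertools


def intersect_sorted(*postings):
    """Lazily yield the ids present in every sorted posting iterable."""
    if not postings:
        return
    iterators = [iter(p) for p in postings]
    try:
        current = [next(it) for it in iterators]
        while True:
            highest = max(current)
            if all(value == highest for value in current):
                yield highest
                current = [next(it) for it in iterators]
                continue
            # Advance every iterator that is behind the highest id.
            for i, it in enumerate(iterators):
                while current[i] < highest:
                    current[i] = next(it)
    except StopIteration:
        # Any exhausted list ends the intersection.
        return


hits = list(intersect_sorted([1, 3, 5, 7, 9], [3, 4, 5, 9], [0, 3, 9, 10]))
# An unbounded "posting list" shows that only a prefix is ever consumed.
first_hit = next(intersect_sorted(itertools.count(), [5, 8]))
```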
[06:22] <lifeless> still:
[06:23] <lifeless>  time bzr search "Lange"
[06:23] <lifeless> Revision id 'svn-v3-trunk1:6015fed2-1504-0410-9fe1-9d1591cc4771:python%2Ftrunk:36422'. Summary: '[Patch #945642] Fix non-blocking SSL sockets, which blocked on reads/writes in Python 2.3.'
[06:23] <lifeless> Revision id 'svn-v3-trunk1:6015fed2-1504-0410-9fe1-9d1591cc4771:python%2Ftrunk:25536'. Summary: 'add SSL class submitted by Tino Lange'
[06:23] <lifeless> real    0m0.538s
[06:23] <lifeless> user    0m0.332s
[06:23] <lifeless> sys     0m0.108s
[06:23] <lifeless> this is pretty fun
[06:23] <lifeless> I might blog about it tonight
[06:24] <beuno> I'm going to try and make it work with Loggerhead tomorrow
[06:24] <lifeless> awesome
[06:24] <lifeless> k, thats my hour, time to merge bzr.dev :P
[06:24] <beuno> have fun
[06:32] <lifeless> indices are fairly large btw:
[06:32] <lifeless> robertc@lifeless-64:~/source/baz/bzr.dev$ du -sh .bzr/bzr-search/
[06:32] <lifeless> 55M     .bzr/bzr-search/
[06:32] <lifeless> robertc@lifeless-64:~/source/baz/bzr.dev$ du -sh ../.bzr/repository/
[06:32] <lifeless> 99M     ../.bzr/repository/
[06:34] <beuno> I'd check to see how much loggerhead's are, but I need my cpu for the next hour
[06:34] <lifeless> :P
[06:34] <beuno> 55M doesn't seem that bad for 16k revision text indexed, especially if it's that fast
[06:34] <lifeless> thats better
[06:34] <lifeless> robertc@lifeless-64:~/source/baz/bzr.dev$ du -sh .bzr/bzr-search/ --apparent
[06:34] <lifeless> 6.1M    .bzr/bzr-search/
[06:34] <lifeless> robertc@lifeless-64:~/source/baz/bzr.dev$ du -sh ../.bzr/repository/ --apparent
[06:34] <lifeless> 99M     ../.bzr/repository/
[06:34] <lifeless> it was all allocation overhead
[06:35] <lifeless> when I shove it into packs it will be that big
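The gap between `du -sh` and `du -sh --apparent` above is the difference between blocks allocated on disk and bytes actually stored in the files. A rough Python equivalent, as a sketch: `disk_usage` is a hypothetical helper, and `st_blocks` is only available on Unix-like systems.

```python
import os
import tempfile


def disk_usage(root):
    """Return (apparent_bytes, allocated_bytes) for all files under root."""
    apparent = allocated = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            info = os.lstat(os.path.join(dirpath, name))
            apparent += info.st_size
            # st_blocks counts 512-byte units; it is missing on some platforms.
            allocated += getattr(info, "st_blocks", 0) * 512
    return apparent, allocated


# Many tiny files: the apparent size is small, while the allocated size
# depends on how the filesystem rounds each file up.
with tempfile.TemporaryDirectory() as root:
    for number in range(100):
        with open(os.path.join(root, "term-%d" % number), "wb") as handle:
            handle.write(b"0123456789")
    apparent, allocated = disk_usage(root)
```

On a filesystem that rounds every file up to a 4K block, 100 ten-byte files would occupy about 400K while holding only 1000 bytes, which is the same effect as the 55M versus 6.1M figures above.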
[06:36] <beuno> :)
[06:37] <lifeless> poolie: is API CHANGES the replacement for API BREAKS?
[06:39] <gour> what is this 'bzr search' stuff? something new in 1.6?
[06:41] <lifeless> gour: its a plugin I wrote on the weekend
[06:44] <gour> lifeless: it looks cool. something like 'darcs changes' or ?
[06:44] <lifeless> I don't know what darcs changes is
[06:44] <lifeless> is darcs changes something like bzr search?
[06:44] <Peng> lifeless: FWIW, your mailing list message went through.
[06:44] <lifeless> Peng: thanks
[06:45] <gour> lifeless: "Gives a changelog-style summary of the repository history" with ability to 'grep' on tags, patches...
[06:46] <lifeless> gour: no, not really then. its a full text index, (though it currently only indexes the commit messages)
[06:47]  * gour is inspecting bzr-search on LP
[06:48] <lifeless> it won't eat your data :P you can install it and play :)
[06:49] <gour> i just wanted to ask if it eats data ;)
[06:50] <lifeless> man I hate conflicts. bzr.DWIM
[06:50] <beuno> mwhudson_, any reason for "kw['start_revid'] = start_revid" in controllers/revision_ui.py?
[06:51] <beuno> I think it has something to do with what you fixed
[06:51] <beuno> but it broke my clean-urls branch
[06:51] <mwhudson_> uh, i forget
[06:51] <mwhudson_> it's probably not essential
[06:51] <beuno> I'll triple check
[06:51] <beuno> just wanted to know if I was missing something obvious
[06:53] <beuno> running start-loggerhead.py twice makes you lose control of it, and you have to kill -9 it
[06:53] <beuno> we should probably do something more sensible there
[06:54] <gour> lifeless: what support do you plan for TERM? regex?
[06:54] <lifeless> words
[06:54] <lifeless> possibly phrases
[06:54] <gour> wildcards?
[06:54] <lifeless> I have some literature searches to do (though I have some probably naive ideas on that)
[06:54] <beuno> mwhudson_, everything seems to work fine. I'm going to send my patch with that removed. Just FYI
[06:55] <mwhudson_> beuno: cool
[06:55] <lifeless> uhm, wild cards, like stemming (and regexs-to-match-one-term) can be done if someone wants to
[06:55] <gour> lifeless: boolean searches?
[06:55] <mwhudson_> what does annotate say about that line?
[06:55] <lifeless> gour: yah, it is a simple boolean at the moment, every term is AND
[06:55] <lifeless> to do wild cards, its basically search the term list for matching terms with the wildcards, then do a boolean on that
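The wildcard scheme lifeless outlines (expand each wildcard against the term list, then run the usual boolean AND) could look like this. The sketch uses a dict of term to set of revision ids as a stand-in index; `search` and the sample data are hypothetical and not bzr-search's API.

```python
import fnmatch


def search(index, patterns):
    """AND together patterns; a pattern matching several terms ORs them."""
    result = None
    for pattern in patterns:
        # Step 1: find every indexed term the pattern matches.
        terms = [term for term in index if fnmatch.fnmatchcase(term, pattern)]
        # Step 2: a document matches the pattern if it has any of those terms.
        hits = set()
        for term in terms:
            hits.update(index[term])
        # Step 3: every pattern must match, so intersect.
        result = hits if result is None else result & hits
    return result if result is not None else set()


index = {
    "ssl": {"r1", "r2"},
    "socket": {"r1"},
    "sockets": {"r2", "r3"},
    "lange": {"r2"},
}
both = search(index, ["ssl", "socket*"])
narrow = search(index, ["lange", "sock*"])
nothing = search(index, ["missing"])
```

A plain word is just a pattern that matches exactly one term, so the non-wildcard case falls out of the same code path.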
[06:56] <gour> lifeless: it just needs some good parser and then it can search everything ;)
[06:56] <lifeless> gour: do you mean 'it needs to get terms from file texts and so on' ?
[06:56] <lifeless> gour: because if thats what you mean, its definitely planned, see DESIGN.
[06:57] <lifeless> there are some complications the more you think about it - given we're dealing with versioned data, what hits are most useful and so on
[06:57] <gour> lifeless: no, i meant something like mairix's syntax which is, afaik, used in e.g. claws-mail
[06:58] <gour> mairix is quite fast
[06:58] <lifeless> oh
[06:58] <mwhudson_> beuno: that kw stuff looks like super old code
[06:59] <gour> although, afaik, no utf-8 support
[06:59] <beuno> mwhudson_, 167.2.12
[06:59] <lifeless> so the syntax you can support is driven by your back end
[06:59] <beuno> 2007-08-23
[06:59] <lifeless> I don't have big plans for this, I wrote it because it seemed like an interesting thing to do
[07:00] <mwhudson_> beuno: that just changed the indentation
[07:00] <lifeless> I'll happily hand it over to someone, or review patches that improve it
[07:00] <gour> lifeless: it could be quite useful
[07:00] <lifeless> but i'm not sure I'm going to spend hours on it
[07:00] <beuno> mwhudson_, well, then yes, it's *super* old  :)
[07:00] <lifeless> gour: sure, it is already :P
[07:00]  * gour thinks that using mairix could be interesting
[07:01] <lifeless> doesn't look like a good fit TBH
[07:01] <lifeless> we have very specific constraints on same management of data to work with FTP etc servers
[07:01] <gour> 'mairix-like' i wanted to say...mairix is for email
[07:02] <lifeless> oh well, feel free to dive in :P or file detailed feature requests.
[07:02] <gour> sure, let me explore present solution 1st
[07:02] <lifeless> ('be mairix like' means nothing to me. "Support '-foo' to exclude documents matching foo" does)
[07:04] <lifeless> ok merge done, running tests to see what broke
[07:05] <gour> lifeless: right, let's say, using syntax similar to the one in mairix, explained under "Match words" in man page
[07:05] <lifeless> gour: I don't have mairix, I don't plan to install it. I'm not trying to be stubborn or anything but referencing something I don't have and don't use *doesn't help*
[07:06] <beuno> mwhudson_, I have a few more quirks to work out on my branch since the merge with trunk. I'm going to defer sending til tomorrow, when I'm more awake
[07:06] <mwhudson_> beuno: sounds good, i'm not awake enough to read it today anyway :)
[07:06] <lifeless> (because everyone can point at their favourite and say 'be like that'). I could spend ages just reading them all. But there are more users than authors for bzr-search (already :P)
[07:07] <lifeless> so I think it makes sense for the larger group, which has the requests, to do the effort of creating specific suggestions
[07:07] <lifeless> in particular, things like 'what is it about mairix's search language you like'
[07:11] <gour> the biggest win is that it's very fast
[07:12] <lifeless> indeed, that is a useful thing :)
[07:12] <gour> here is some simple example of the syntax http://rafb.net/p/4kdE6G75.html i'm just saying as idea, not that it has to be like that at all...i.e. support for boolean (and, or, not) and some wildcarding would be enough
[07:13] <lifeless> I think I prefer the google style language
[07:13] <lifeless> where everything is AND
[07:14] <lifeless> to do not you do -TERM
[07:14] <lifeless> (giving AND NOT)
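A small sketch of the Google-style query language lifeless prefers, where every term is ANDed and a leading `-` means AND NOT. The parser, the `postings` mapping and the function names are hypothetical; bzr-search at this point only supports plain AND terms.

```python
def parse_query(query):
    """Split a query string into required terms and excluded (-TERM) terms."""
    required, excluded = [], []
    for word in query.split():
        if word.startswith("-") and len(word) > 1:
            excluded.append(word[1:])
        else:
            required.append(word)
    return required, excluded


def run_query(query, postings):
    """Return documents containing every required term and no excluded term."""
    required, excluded = parse_query(query)
    if not required:
        # A query made only of exclusions has nothing to subtract from.
        return set()
    hits = set(postings.get(required[0], set()))
    for term in required[1:]:
        hits &= postings.get(term, set())
    for term in excluded:
        hits -= postings.get(term, set())
    return hits
```

Treating exclusion as set difference keeps the engine a pure intersection pipeline, which is probably why this syntax is cheaper to support than a general boolean grammar.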
[07:14] <beuno> I'm off to bed
[07:14] <lifeless> anyhow, it sounds like you are really saying "I want more than exact word searches"
[07:14] <beuno> mwhudson_, great marathon today. Congrats on getting trunk in better shape  :)
[07:15] <gour> how do you handle rebuilding of the index?
[07:15] <lifeless> which is fair enough - but like I said, I'm doing this to scratch my own itches at the moment, which is mainly about the challenge of the thing
[07:15] <lifeless> what rebuild
[07:15] <lifeless> :P
[07:16] <lifeless> more seriously, the disk format is logically done now, but the individual elements of an 'index component' are in separate disk files rather than a single pack file
[07:16] <gour> what about OR? or it is done by running two queries one after another?
[07:16] <beuno> aaaaand another reason why commit messages are immutable: it wouls break bzr-search  :p
[07:16] <beuno> s/wouls/would
[07:16] <lifeless> incremental index operations generate additional components
[07:17] <lifeless> and I'll use a bog standard exponential backoff to combine components to amortise time-to-insert and time-to-query
[07:17] <gour> that's good
[07:17] <lifeless> it has no OR support today, but yes, to do or I would expect serial queries inside the engine with a union
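One possible reading of the plan above, sketched under assumptions: each incremental index run adds a component, neighbouring components of equal size are merged so that the component count grows only logarithmically, and OR is a union of serial per-term lookups. The `(size, postings)` tuples and all names are invented for illustration; the actual combining policy had not been written when this was discussed.

```python
def merge_postings(a, b):
    """Combine two term -> documents mappings into a new one."""
    merged = {term: set(docs) for term, docs in a.items()}
    for term, docs in b.items():
        merged.setdefault(term, set()).update(docs)
    return merged


def add_component(components, new_postings):
    """Append a new component, then merge equally sized neighbours."""
    components.append((1, new_postings))
    while len(components) >= 2 and components[-1][0] == components[-2][0]:
        size_b, b = components.pop()
        size_a, a = components.pop()
        components.append((size_a + size_b, merge_postings(a, b)))
    return components


def search_or(terms, components):
    """OR query: look each term up in turn and union the hits."""
    hits = set()
    for term in terms:
        for _size, postings in components:
            hits |= postings.get(term, set())
    return hits
```

Merging only equal-sized neighbours means any single posting is rewritten a logarithmic number of times, which is the amortisation of insert time against query time that the conversation refers to.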
[07:18] <gour> btw, what follows after bzr-1.9? 1.10?
[07:20] <lifeless> yah
[07:20] <lifeless> its just a serial
[07:22] <gour> will 1.6 release consider guadec conference and gnome needs?
[07:24] <lifeless> I sure hope so
[07:24] <gour> yesterday i was reading about imendio plans to push gtk-3.0 out...too bad, they're on git already
[07:29] <lifeless> poolie: ping
[07:32] <poolie> lifeless: pong
[07:32] <lifeless> want a quick call before EOD?
[07:34] <poolie> sure
[08:56] <lifeless> poolie: if you are still around the updated patch is in the moderator queue
[09:16] <lifeless> holy cow
[09:16] <lifeless> I strongly suspect this is as long a slope as bzr itself...
[09:19] <mwhudson> ?
[09:20] <lifeless> bzr.dev$ time bzr search VersionedFiles
[09:20] <lifeless> File id 'versionedfile.py-20060222045106-5039c71ee3b65490', revision 'robertc@robertcollins.net-20070921013648-i9w180g6ea73w9mf'. Summary: 'No summaries yet.'
[09:20] <lifeless> File id 'fetch.py-20050818234941-26fea6105696365d', revision 'andrew.bennetts@canonical.com-20070821235535-37okm0uaprwku9cu'. Summary: 'No summaries yet.'
[09:20] <lifeless> ...
[09:20] <lifeless> real    0m0.501s
[09:20] <lifeless> user    0m0.420s
[09:20] <lifeless> sys     0m0.076s
[09:22] <mwhudson> nice
[09:22] <lifeless> mwhudson: that is, I just implemented a crude index-file-texts patch for bzr-search
[09:23] <mwhudson> that must bump up the indexing time quite a lot?
[09:23] <lifeless> $ du -sh .bzr/bzr-search/ --apparent
[09:23] <lifeless> 44M     .bzr/bzr-search/
[09:23] <lifeless> real    5m53.305s
[09:23] <lifeless> user    4m2.547s
[09:23] <lifeless> sys     0m14.733s
[09:23] <lifeless> and 545MB of memory at peak
[09:24] <mwhudson> huh, not bad
[09:24] <lifeless> thanks
[09:25] <lifeless> without --apparent, 316M of disk in the index :P. Really must put them into packs.
[09:25] <lifeless> probably can improve things by doing a two-phase task
[09:26] <lifeless> 40K terms, or so
[09:27] <lifeless> make that 73K
[09:30] <lifeless> revno 21 if you feel tempted to, say, index launchpad ;P
[09:30] <lifeless> "boom"
[09:31] <mwhudson> i think i'd like to be able to use my machine for a little while longer :)
[09:34] <lifeless> you could index, say, twisted. for fun.
[09:53] <mtaylor> I don't know if this is a bzr or launchpad thing - but I think bzr since the output looks like it's just bzr log reworked...
[09:53] <mtaylor> if you look at https://code.launchpad.net/~mysql/mysql/mysql-5.1-telco-6.3
[09:53] <lifeless> go on
[09:53] <mtaylor> all you see are a bunch of merge changesets, without any idea of what went on beneath them...
[09:54] <pygi> lifeless, your plan to conquer the world by sending a mail to m-l did not work :p
[09:54] <lifeless> pygi: I know, so I sent another
[09:54] <lifeless> mtaylor: thats an lp thing, because bzr log does know and shows the indented merge
[09:55] <mtaylor> hm. oh you're right... it's missing that doesn't show them, right?
[09:55] <lifeless> yah
[09:55] <mtaylor> k. thanks.
[09:55]  * mtaylor goes to bug launchpad
[09:55] <mwhudson> it's known
[09:55]  * mtaylor aborts mission to annoy launchpad
[09:55]  * mwhudson clicks browse code and wonders if that was a good idea
[09:56] <mtaylor> any time anybody wants a good will-this-work-quickly? testcase, doing stuff with the mysql trees seems to be a good place to start :)
[09:56] <lifeless> mtaylor: yah
[09:56] <mwhudson> how big is the tree?
[09:57] <lifeless> mtaylor: how much ram do you have ?
[09:57] <mtaylor> big tree
[09:57] <mtaylor> I have 3G
[09:57] <lifeless> you might like to try bzr-search out then
[09:57] <lifeless> I'd be interested to know how badly it fares
[09:57] <mtaylor> ooh. yes, I might.
[09:58]  * mtaylor goes to try it
[09:59] <mtaylor> anything specific I should try to give you good feedback ?
[09:59] <lifeless> well
[09:59] <lifeless> rev 21 adds full text indexing
[09:59] <mtaylor> so "bzr index" should make the indexes, right?
[09:59] <mwhudson> mtaylor: roughly how many files?
[09:59] <lifeless> so try with rev 20, (time bzr index), and then rm -rf .bzr/bzr-search, and try with rev 21
[09:59] <mtaylor> ok
[09:59] <lifeless> bzr search term [term ...] performs a search
[10:00] <lifeless> currently case sensitive
[10:00] <mtaylor> can you do bzr search "term with spaces" ?
[10:00] <lifeless> not yet
[10:00] <lifeless> I need to do some reading/thinking
[10:00] <lifeless> I imagine there are canned answers for that in the field
[10:01] <mtaylor> probably so
[10:01] <mtaylor> running index using r20 now
[10:02] <mtaylor> without fulltext - real	0m59.664s
[10:02] <mtaylor> mwhudson: 9600
[10:02] <mtaylor> mwhudson: but around 60k revisions
[10:03] <mwhudson> mtaylor: so pretty big but not huge
[10:03] <mtaylor> correct
[10:03] <mwhudson> ~3 times the size of launchpad
[10:03] <lifeless> I should note
[10:03] <mwhudson> nothing insane like OOo :)
[10:03] <lifeless> this is not in any way optimised today
[10:03] <mtaylor> mwhudson: a fresh branch normally takes around 25 minutes
[10:03] <mtaylor> lifeless: that's fine :)
[10:03] <mwhudson> mtaylor: in a repo?
[10:03] <lifeless> its not completely-stupidly-naive
[10:03] <mtaylor> mwhudson: well, the first time
[10:04] <lifeless> does mysql have fulltext-and-phrases search indices?
[10:05] <mtaylor> well, fulltext indexes
[10:05] <mwhudson_> grr
[10:05] <mtaylor> I'm not sure it does phrases
[10:05] <mwhudson_> mtaylor: in a repo? <- last thing i said, dunno if it got through
[10:05] <mtaylor> mwhudson_: yeah - but the first time - subsequent branches, of course, are not bad
[10:05] <lifeless> so my first idea for phrases is an adjacency graph
[10:06] <mwhudson_> ok
[10:06] <lifeless> start by getting candidates for simple AND
[10:06] <mtaylor> mwhudson_: I think the biggest problem with that is the non-updating status bar bug
[10:06] <lifeless> then look for term, term in an adjacency graph for that
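The phrase idea can be sketched as below, assuming a per-document set of (word, next word) pairs serves as the adjacency graph. The data layout and names are hypothetical. Note that for phrases of three or more words this check is only a filter: the pairs could occur in different places in the document, so a final positional check would still be needed.

```python
def build_index(documents):
    """Build term postings plus a per-document set of adjacent word pairs."""
    postings, adjacency = {}, {}
    for doc_id, text in documents.items():
        words = text.split()
        for word in words:
            postings.setdefault(word, set()).add(doc_id)
        adjacency[doc_id] = set(zip(words, words[1:]))
    return postings, adjacency


def phrase_candidates(phrase, postings, adjacency):
    """Simple AND first, then keep documents where each word pair is adjacent."""
    words = phrase.split()
    if not words:
        return set()
    hits = set(postings.get(words[0], set()))
    for word in words[1:]:
        hits &= postings.get(word, set())
    pairs = list(zip(words, words[1:]))
    return {doc for doc in hits if all(pair in adjacency[doc] for pair in pairs)}
```

Running the cheap AND first keeps the adjacency lookups limited to documents that already contain every word of the phrase.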
[10:06] <mtaylor> mm
[10:07] <mtaylor> that sounds reasonable
[10:07] <lifeless> that's about 40 seconds worth of thought in the weekend
[10:07] <mtaylor> hehe
[10:08] <mtaylor> well, if you want to look at other work in the field, definitely check out sphinx search
[10:08] <lifeless> yeah
[10:08] <lifeless> I was pointed at that
[10:08] <lifeless> seems interesting, but only the search api seems to be python accessible
[10:08] <mtaylor> hm. I wonder how hard it would be to embed...
[10:09] <mtaylor> dang it
[10:09] <lifeless> :)
[10:09] <mtaylor> now you've got me thinking about YET ONE MORE project
[10:09] <lifeless> thing for me is I want this to run over the bzr vfs
[10:09] <lifeless> which means being safe on a small set of IO primitives
[10:09] <lifeless> and having their back end talk back to python
[10:11] <lifeless> it all seemed rather uninteresting, so I wrote an engine instead :P
[10:16] <lifeless> mtaylor: so, whats the memory use up to ?
[10:28] <lifeless> so mwhudson when are you getting a internet connection ?
[10:28] <lifeless> mtaylor: has the fulltext run started thrashing yet ?
[10:31] <radix> lifeless: have you written than indexer yet
[10:31] <radix> is that what mtaylor is testing
[10:31] <radix> s/than/that/
[10:33] <lifeless> radix: https://launchpad.net/bzr-search/
[11:36] <awilkins> How's the XMLRPC server thing going?
[12:09] <lifeless> awilkins: verterok might know
[12:26] <awilkins> I just reverted two revisions and commited them as one but I still have one hanging around as a pending merge, how can I get rid of it?
[12:29] <mwhudson> revert has a --forget-merges option or something, but i don't really understand the question
[12:29] <awilkins> mwhudson: That was the right answer though :-)
[12:30] <awilkins> I uncommitted two revisions, then committed the changes from those two revisions as one revision ; it retained the merge marker for one of those revisions which caused problems
[12:31] <awilkins> Gah, still got problems though
[12:31] <fullermd> Er.  That really shouldn't happen...
[12:32] <awilkins> It seems stuck now... it won't push to a branch it really should be pushing to without complaint "have diverged"
[12:32] <awilkins> I don't think I did anything bad....
[12:33] <awilkins> All I did was hand-merge some changes from the bzr-gtk plugin windows install and commit them
[12:41] <mtaylor> lifeless: it wound up eating all my RAMs
[12:42] <lifeless> mtaylor: yah
[12:42] <lifeless> mtaylor: I suspected it might :P
[12:42] <mtaylor> hehe
[12:42] <lifeless> let me put a quick 'batch by 5000' in there
[12:42] <mtaylor> well, you were correct!
[12:47] <spiv> awilkins: perhaps "bzr missing" will help
[13:04] <lifeless> mtaylor: I'm just acceptance testing this patch
[13:04] <lifeless> 225MB peak
[13:04] <lifeless> thats much more tolerable
[13:04] <mtaylor> yes. much more!
[13:05] <lifeless> when it finishes I'll see if the performance blows chunks or not (it doesn't have the grouping-logic done yet)
[13:05] <lifeless> also, the heuristic I'm using - 5K commit - is not that suitable for very wide trees
[13:05] <lifeless> possibly I should instead use 200 or something ;P
[13:06] <lifeless> 273 mb resident now
[13:08] <lifeless> ok, 4m47 to index
[13:08] <lifeless> now to search
[13:08] <lifeless> yeah, performance has been hit, largely disk cache I think, but all the same -
[13:09] <lifeless> 133505 files
[13:09] <lifeless> I have a plan for that - incremental development FTW
[13:10] <mtaylor> hehe.
[13:10] <mtaylor> I just told someone yesterday "oh yeah, you need this patch I'm working on right now to do that..."
[13:11] <lifeless> ok, pull, rev 22
[13:11] <mtaylor> k
[13:11] <mtaylor> running
[13:12] <lifeless> please keep an eye on top
[13:12] <mtaylor> yup
[13:12] <lifeless> I'm very interested in memory
[13:12] <mtaylor> up to 500m resident
[13:13] <mtaylor> seems to be holding steady at 506m
[13:13] <mtaylor> nope
[13:13] <mtaylor> 602
[13:14] <lifeless> if you want to tweak this, in index.py look for 5000
[13:14] <mtaylor> 650
[13:14] <mtaylor> ok
[13:14] <lifeless> but see how it goes first
[13:15] <lifeless> large trees do imply large developer machines :P
[13:15] <mtaylor> you'd think :)
[13:15] <mtaylor> 725
[13:16] <matkor> Hmm how can I get info about given revision (342.29.1) who did it ? when etc ?
[13:16] <lifeless> matkor: bzr log -r 342.29.1
[13:16] <lifeless> mtaylor: once it starts the progress bar 0...5000 again it will be roughly stable
[13:18] <mtaylor> 1000
[13:18] <mtaylor> it's going to start another status bar?
[13:18] <lifeless> you have 3 :P
[13:18] <lifeless> it does a progress bar for each group of 5K
[13:18] <lifeless> I haven't really done a UI for this, beyond proof-of-capability
[13:19] <mtaylor> so, what's it doing before that status bar is displayed?
[13:20] <lifeless> well
[13:20] <lifeless> at the start it finds what revisions to index
[13:20] <lifeless> then it topo sorts them
[13:21] <lifeless> groups by <size>
[13:21] <lifeless> then for each group it makes a new component index
[13:21] <lifeless> reads the inventories (this shows the progress bar) for those commits
[13:21] <lifeless> reads the lines introduced (roughly) by those commits
[13:22] <lifeless> and then reads the commit messages
[13:22] <lifeless> writes that component out
[13:22] <lifeless> and loop
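The loop lifeless walks through above can be sketched roughly as follows. This is a simplified stand-in, not the code in bzr-search's index.py: `parents` (revision id to list of parent ids) and `texts` (revision id to indexable text) are assumed inputs, whitespace splitting stands in for real term extraction, and the standard library's `graphlib` (Python 3.9+) does the topological sort.

```python
from graphlib import TopologicalSorter


def batches(parents, batch_size):
    """Yield lists of revision ids in topological order, batch_size at a time."""
    order = list(TopologicalSorter(parents).static_order())
    for start in range(0, len(order), batch_size):
        yield order[start:start + batch_size]


def index_revisions(parents, texts, batch_size=5000):
    """Build one in-memory component (term -> revision ids) per batch."""
    components = []
    for group in batches(parents, batch_size):
        postings = {}
        for revid in group:
            for term in texts.get(revid, "").split():
                postings.setdefault(term, set()).add(revid)
        # In the real tool this is where the component is written to disk,
        # which releases the memory before the next group starts.
        components.append(postings)
    return components
```

Only one batch of postings is held at a time, which is why lowering the batch size trades memory pressure for a larger number of components to search.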
[13:22] <mtaylor> I seem to be stable at around a gig so far
[13:23] <mtaylor> and it's just making kcryptd work now
[13:23] <mtaylor> which, I suppose, one would expect with an encrypted root partition
[13:23] <lifeless> :P
[13:23] <lifeless> so its flushing an index probably
[13:23] <lifeless> my entire drive is encrypted
[13:23] <lifeless> now - the posting lists for each term are a separate index
[13:24] <lifeless> this creates hundreds of thousands of files for me with the 5K group
[13:24] <lifeless> I hope you have btrees enabled for your file system :)
[13:28] <mtaylor> hehe.
[13:28] <mtaylor> nope
[13:28] <mtaylor> plain-ol ext3
[13:28] <lifeless> ext3 has btrees
[13:28] <lifeless> uhm dirindex or something the option is called
[13:28] <mtaylor> but not unless I've enabled them, right?
[13:28] <lifeless> how old is your install, what distro
[13:29] <lifeless> (as in, what distro install CD created your file system)
[13:29] <mtaylor> ubuntu. hardy - reinstalled from scratch  a couple of months ago to get the encrypted root
[13:29] <lifeless> you'll be fine :)
[13:29] <mtaylor> so, hardy alt cd
[13:29] <mtaylor> ok
[13:37] <lifeless> stable ?
[13:38] <lifeless> (I mean, is it stable at 1G ?)
[13:40] <jelmer> Peng, do you know if mercurial has something like "bzr send" ?
[13:41] <spiv> tune2fs can tell you what options are set on your ext2/3 partition, e.g. "tune2fs -l /dev/sda3 | grep features"
[13:42] <lifeless> mtaylor: (I mean, is it stable at 1G ?)
[13:43] <mtaylor> lifeless: yes. it seems to be stable
[13:43] <mtaylor> ooh!
[13:43] <mtaylor> and I've got status bar now
[13:43] <lifeless> right, its doing another group
[13:43] <mtaylor> well, I had a status bar for a short time
[13:43] <lifeless> its almost certainly done that quite a few times
[13:43] <mtaylor> the bit with the status bar seems to be the quick bit. :)
[13:43] <lifeless> :P
[13:50] <lifeless> how many mb is the .bzr/repository ?
[13:50] <mtaylor> lifeless: 635M
[13:50] <toyto1> hello lifeless, this is possible 'bzr commit -m "My Message" > filename.txt'?
[13:50] <lifeless> so, I would guess at 1 hour or so to index
[13:52] <lifeless> toyto1: what would you like that to do ?
[13:53] <toyto1> lifeless: let's say viewing the files that you have commited, like if you'll notice the 'bzr commit' it will open the default editor right?
[13:54] <lifeless> yes
[13:54] <toyto1> instead of viewing that editor i want to make it a file instead of a log file or let's say i use 'bzr commit -m "my message" > file.txt so that file.txt can be used as my backup for files that had been in the trunk
[13:55] <toyto1> or is there a way for bzr commands to list the files inside a trunk?
[13:55] <toyto1> let's say 'bzr filelist' will list the files or it can be 'bzr filelist > file.txt'
[13:57] <lifeless> bzr ls
[13:57] <lifeless> bzr ls URL
[13:57] <lifeless> bzr ls -r X URL
[13:59] <toyto1> thank you lifeless, may i know what the -r means?
[13:59] <lifeless> toyto1: its used to specify revisions
[13:59] <lifeless> toyto1: I recommend you read our user guide
[14:00] <toyto1> lifeless: ah thanks, yeah i was reading the user guide but I didn't notice reading it :( never had that idea yet
[14:09] <lifeless> mtaylor: can you do find .bzr/bzr-search/indices -name "*.rix"
[14:09] <Pieter> does bzr search work with large repositories yet?
[14:10] <mtaylor> Pieter: that's what we're testing :)
[14:10] <Pieter> ah
[14:10] <mtaylor> lifeless: I get 6 results
[14:10] <Pieter> is it in the branch now? or do you need changes in bzr.dev?
[14:11] <mtaylor> Pieter: branch. I just pulled it off launchpad
[14:11] <Pieter> let's try it then :)
[14:11] <mtaylor> if you have a large repo - it's not quick to index :)
[14:11] <Pieter> I'll try on emacs
[14:12] <mtaylor> Pieter: lp:bzr-search
[14:12] <mtaylor> Pieter: emacs is in bzr now?
[14:12] <Pieter> mtaylor: yeah, I already have it
[14:12] <toyto1> lifeless: I have found a good user guide it's 'bzr help commands' :D
[14:12] <Pieter> mtaylor: no, they're switching
[14:12] <mtaylor> sweet!
[14:12] <Pieter> mtaylor: but someone imported their cvs in bzr and made it public
[14:13] <lifeless> mtaylor: its completed 30K revisions then
[14:13] <mtaylor> well, then I should be done in another hour or so
[14:13] <lifeless> Pieter: its /functional/ on larger repos now
[14:13] <lifeless> Pieter: how much ram do you have, and how many files are in the emacs repo ?
[14:14] <Pieter> lifeless: 2GB RAM
[14:14] <Pieter> let's see how many files
[14:14] <Pieter> 3000 files, 89000 commits
[14:16] <lifeless> K
[14:16] <lifeless> you should be fine
[14:16] <Pieter> it's still cloning though
[14:18] <lifeless> you don't have a copy already ?
[14:19] <Pieter> yeah, but I didn't want to use my pristine clone :)
[14:19] <lifeless> it writes to .bzr/bzr-search only
[14:20] <lifeless> check BUGS out for current caveats
[14:20]  * mgedmin hugs everyone
[14:20] <mgedmin> bzr is reading my mind again
[14:20] <mgedmin> and working just the way I want it to work
[14:21] <lifeless> cool
[14:21]  * mgedmin discovered that "bzr push" and "bzr merge" remember the last location separately
[14:21] <Pieter> lifeless: what does it currently index
[14:21] <Pieter> ?
[14:24] <lifeless> commit messages and file contents
[14:26] <lifeless> its pretty basic stuff
[14:26] <lifeless> lots of obvious room for improvement
[14:26] <lifeless> the biggest actual caveat today is that it creates hundreds of thousands of files
[14:27] <lifeless> because I haven't written a transport adapter to allow the index layer to read an index from within a .pack file
[14:27] <lifeless> which I will get around to pretty soon
[14:28] <lifeless> the main impact of that is that you have to have dirindex on on ext3 to have it work well :)
[14:28] <Pieter> :)
[14:28] <lifeless> it also badly needs some ui love
[14:28] <Pieter> won't do wonders on my hfs then :P
[14:28] <lifeless> like overall progress during indexing
[14:28] <lifeless> uhm, should be ok I imagine. Well - even though its a known defect feel free to report a bug if it blows horrible chunks
[14:29] <lifeless> if you want a teaser without the file text content searching, revert to revision 20, which only indexes commit messages
[14:29] <lifeless> (it has to do less, so obviously is a lot faster)
[14:31] <Pieter> I tried that some time ago, but then it crashed somewhere in bisect in bzr
[14:31] <lifeless> 'bzr revert -r XXX' ?
[14:31] <Pieter> I meant the index
[14:31] <lifeless> oh right
[14:32] <lifeless> revno 20:
[14:32] <lifeless> message:
[14:32] <lifeless>   New disk format, one step closer to being in packs, and fixes the issue with overflowing bzrlib's index bisection capabilities.
[14:32] <Pieter> :)
[14:34] <lifeless> (I knew going in it would fail on big environments till I built up additional capabilities, but I wanted early-results so took an incremental approach)
[14:34] <lifeless> all the mundane framework stuff like basic ui, format marker, open/close/create etc
[14:35] <lifeless> thats all in place, so actual changes now can deliver real improvements (like the full text stuff, which was about 40 minutes to do)
[14:35] <lifeless> s/full text/file text/
[14:40] <lifeless> anyhow, feedback appreciated
[14:41] <lifeless> if it chews ram (I would expect about 600MB stable for your tree) let me know - file a bug
[14:41] <lifeless> performance too
[14:45] <lifeless> mtaylor: welcome back
[14:45] <lifeless> mtaylor: how many .rix now ?
[14:46] <lifeless> Pieter: I'm just pushing a patch to drop the batch size to 2500 revisions, you may prefer that - less memory pressure
[14:46] <lifeless> it will degrade search performance until I write the batch-combining logic
[14:47] <lifeless> (or someone sends me a patch :))
[14:48] <mtaylor_> lifeless: my machine is taking a very long time to answer that question
[14:49] <mtaylor_> oh, and bzr is up to 1.2G
[14:49] <lifeless> mtaylor_: dentry cache :)
[14:49] <lifeless> mtaylor_: you will have _many_ files :)
[14:49] <mtaylor_> hehe
[14:50] <lifeless> actually
[14:50] <lifeless> I was being nub
[14:50] <lifeless> cat .bzr/bzr-search/names
[14:50] <lifeless> thats the root node
[14:51] <lifeless> each line is a block of batch_size (e.g. 5K) revisions
[14:51] <lifeless> (actually its arbitrary size revisions, but your version only writes in one group size)
[14:55] <Pieter> lifeless: memory pressure is fine so far -- no more than 350MB
[14:55] <lifeless> Pieter: ok cool; its proportional to (term_count * document_references) per batch
[14:55] <lifeless> well, I say proportional, thats the key driver
[14:56] <Pieter> I wonder how big the index will be
[14:56] <lifeless> so lots of small changes will use less memory than lots of big changes
[14:56] <Pieter> the repo itself is 300MB
[14:56] <lifeless> use du --apparent
[14:56] <lifeless> the next stage will combine all the little files
[14:56] <lifeless> by writing them end to end in a big single file
[14:56] <lifeless> (next stage of development)
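The "end to end in a big single file" idea can be illustrated with a toy container. This is not the bzr pack format, only a sketch of the principle under the assumption that an offset table is kept alongside the data; `pack` and `read_blob` are made-up names.

```python
import io


def pack(blobs, out):
    """Write blobs end to end into out; return {name: (offset, length)}."""
    table = {}
    for name, data in blobs.items():
        table[name] = (out.tell(), len(data))
        out.write(data)
    return table


def read_blob(packed, table, name):
    """Read one blob back using only a seek and a bounded read."""
    offset, length = table[name]
    packed.seek(offset)
    return packed.read(length)


# Usage: two small "posting list" files combined into one in-memory file.
buf = io.BytesIO()
table = pack({"term-a": b"hello", "term-b": b"world!"}, buf)
```

Because each lookup needs only a seek and a bounded read, this kind of layout suits a small set of IO primitives, and it removes the per-file block overhead that makes `du` report far more disk use than `du --apparent`.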
[14:57] <lifeless> I would expect, 200MB or something
[14:57] <lifeless> heres the one for bzr
[14:58] <lifeless> robertc@lifeless-64:~/source/baz/bzr.dev$ find .bzr/bzr-search/ | wc -l
[14:58] <lifeless> 133505
[14:58] <lifeless> robertc@lifeless-64:~/source/baz/bzr.dev$ du -sh .bzr/bzr-search/
[14:58] <lifeless> 549M    .bzr/bzr-search/
[14:58] <lifeless> robertc@lifeless-64:~/source/baz/bzr.dev$ du -sh .bzr/bzr-search/  --apparent
[14:58] <lifeless> 50M     .bzr/bzr-search/
[14:58] <lifeless> robertc@lifeless-64:~/source/baz/bzr.dev$ du -sh ../.bzr/repository/ --apparent
[14:58] <lifeless> 99M     ../.bzr/repository/
[14:58] <lifeless> (../.bzr/repository is the shared repository containing all my branches)
[14:59] <Pieter> my du doesn't have apparent
[14:59] <Pieter> it's at 400k files so far
[14:59] <lifeless> oh
[14:59] <lifeless> well
[14:59] <lifeless> the posting lists are generally very small
[15:00] <lifeless> I forget the distribution name, its the one for use of words in programming languages :)
[15:04] <lifeless> Pieter: mtaylor_: gnight
[15:04] <Pieter> see ya
[15:04] <lifeless> please do let me know (email to the list or a bug report in launchpad)
[15:04] <lifeless> how it goes
[15:04] <lifeless> oh
[15:04] <lifeless> for reference
[15:05] <lifeless> its subsecond to start outputting hits for most every search I do on bzr.dev.
[15:06] <Pieter> I think I'm going to stop it and wait until it's a single-file thing
[15:07] <Pieter> it's just doing disk thrashing now
[15:08] <lifeless> do you lurk here?
[15:08] <lifeless> I can ping you when it is
[15:08] <Pieter> yes
[15:08] <lifeless> k, I will do that then if you like
[15:08] <Pieter> yeah, thanks
[15:09] <lifeless> no problems
[15:09] <lifeless> good night
[15:27] <mtaylor_> I know this is off-tope
[15:27] <mtaylor_> I know this is off-topic
[15:28] <mtaylor_> nm
[15:28] <ricardokirkner> hi, I am having an issue with bzr running as a smart server under wsgi. the error I am getting is: 'module' has no 'ElementTree' attribute.
[15:29] <ricardokirkner> this is happening when trying to open the bzr shared repository
[15:29] <ricardokirkner> the server is using bzr 1.5 while the client is using 1.3
[15:30] <ricardokirkner> any ideas? I managed to trace a possible location for that problem to the xml_serializer.py file, where there is an import elementtree which falls back to the elementtree implementation of python2.4 which does not have the ElementTree attribute, but after changing that specific line, the problem is still there, so I am still trying to figure out where the problem is
[16:15] <maanskyn> does anyone here have any experience with loggerhead?
[16:18] <trepca> hey
[16:19] <trepca> is there any support for attribute insertion as in subversion where you inserted "$Revision: $" which was then replaced with actual revision on commit ?
[16:27] <statik> jam: do we have a bzr logo like the one on bazaar-vcs.org but without the blue background?
[16:27] <jam> statik: http://bazaar-vcs.org/LogoOptions
[16:27] <jam> It comes in an SVG
[16:27] <jam> or a bunch of .png sizes
[16:27] <statik> jam: thanks!
[16:27] <jam> statik: btw, looking forward to what comes out of that :)
[16:28] <statik> me too
[16:38] <ricardokirkner> has anyone successfully integrated bzr with trac for multiple projects using http authentication for both read and write? (I am facing several difficulties here)
[16:40] <jelmer> ricardokirkner: read/write in trac is unrelated to trac-bzr - or do you mean something else?
[16:41] <ricardokirkner> sorry, yes
[16:41] <ricardokirkner> I think the trac part is irrelevant here
[16:41] <ricardokirkner> what I am trying to achieve is to have bzr use apache's authentication plugins (like for example ldap, which I already have working)
[16:41] <ricardokirkner> the thing is: authentication works ok (it doesnt depend on bzr)
[16:42] <ricardokirkner> but the integration through mod_<something> is not working correctly
[16:42] <jelmer> ahh, ok
[16:42] <ricardokirkner> first I tried mod_python, but I got some issues that according to my research were due to mod_python
[16:42]  * jelmer doesn't have much experience with bzr and apache
[16:42] <ricardokirkner> so I tried to use mod_wsgi, and I am still having issues (very strange ones)
[16:43] <ricardokirkner> regarding 'module' has no 'ElementTree' attribute, when trying to open a repository
[16:43] <ricardokirkner> but this does only happen when I try to open the second project's repository
[16:43] <ricardokirkner> the first open works ok
[16:44] <ricardokirkner> and if I restart apache and try the other way around, the same thing happens, but with the projects reversed
[17:32] <emilis_info> is it possible to commit a file to an older revision that has already been committed?
[17:37] <mtaylor> emilis_info: no
[17:37] <emilis_info> heh :/
[17:37] <mtaylor> emilis_info: you can uncommit, and then redo the commit
[17:37] <mtaylor> but a commit is a commit
[17:38] <emilis_info> I understand that, but that would require me to uncommit a few revisions back...
[17:38] <mtaylor> yeah. then you're pretty much screwed. :)
[17:38] <emilis_info> the files in these revisions are completely independent
[17:38] <emilis_info> doh, I guess I'll just do another commit
[17:38] <emilis_info> :)
[17:39] <mtaylor> emilis_info: well... you could branch your current branch into a different place up until that revision
[17:39] <mtaylor> emilis_info: then in the new branch, you could uncommit the last rev, then recommit adding in your new file
[17:39] <mtaylor> then you could merge your other changes in maybe...
[17:39] <mtaylor> but it might get ugly
[17:39] <mtaylor> you probably should not do ^^
[17:40] <emilis_info> seems really ugly from your description :)
[17:41] <fredreichbier> anybody here who uses bzrweb from http://vmlinux.org/jocke/bzr/index.py?
[17:50] <gour> i need some advice what kind of workflow to use with bzr
[17:50] <gour> atm i used darcs for one project - study notes...
[17:51] <gour> there are few courses, each has several written assignments with some questions. we could say project with few components, each having few features
[17:52] <gour> i plan to create shared repo and would like to work on specific 'component' and then merge to the main 'trunk' by having clear history
[17:53] <gour> do you suggest to create branch for each 'feature' (question) and when i finish with it, then to merge back to main trunk and discard 'working' branch?
[17:53] <pickscrape> Sounds like you're talking about partial tree branches, or whatever it's called
[17:53] <gour> hmm, dunno
[17:54] <pickscrape> Currently when you branch you branch the whole tree, and your working copy contains that whole tree
[17:54] <gour> in darcs i can cherrypick patches and push them to the main trunk keeping the history clear and after finishing with the 'feature' i can easily dismiss unwanted branch.
[17:55] <pickscrape> Sounds a bit like rebase?
[17:55] <gour> i do not mind having the whole tree, but to have somewhat clean history in the 'main'
[17:55] <gour> hmm, no experience with git, but can take look at bzr's rebase
[18:01] <fullermd> That ["do you suggest..."] sounds like what I'd do...
[18:03] <gour> fullermd: thanks
[18:14] <Leonidas> gour: what do you mean by "clear history"?
[21:23] <Verterok> moin
[21:27] <james_w> hi Verterok
[21:27] <Verterok> hi james_w
[22:05] <fdv> Hi. Does anybody know whether it's possible to push a set of revisions as one to an svn repo?
[22:06] <Odd_Bloke> fdv: As a single revision, or all at the same time?
[22:06] <Odd_Bloke> Both are possible, but require different methods.
[22:06] <fdv> Odd_Bloke: as a single revision
[22:07] <fdv> then I don't necessarily have to have my code in a completely good state for small commits
[22:07] <Odd_Bloke> fdv: You should merge the branch with those revisions into a pristine checkout/branch of the SVN repo.
[22:08] <fdv> ah, so that's possible, too?
[22:08] <fdv> I don't necessarily have to talk directly to the svn repo?
[22:08] <Odd_Bloke> fdv: Well, you'll merge into the checkout of the SVN repo and commit it having first sorted out conflicts/tested it is working properly.
[22:09] <Odd_Bloke> This creates a single mainline revision, which is either in the SVN repo (a checkout) or can be pushed there (a branch).
[22:09] <fdv> Odd_Bloke: right, that seems fine
[22:10] <fdv> so, basically, I have two trees, then, a working tree and a 'commit' tree, and I merge my changes into the commit tree every time I wish to push to svn
[22:10] <fdv> Odd_Bloke: sounds feasible. Thanks :)
[22:10] <Odd_Bloke> fdv: Yup, that's it.  No worries. :)
[22:11] <fdv> (well, apart from that the repo will probably be about 7-8GB.. :p)
[22:12] <LarstiQ> if you don't really care about the bzr history in svn, or roundtripping, you could also coversion
[22:18] <fdv> roundtripping and coversion?
[22:18] <fdv> sorry, just not familiar with the terminology
[22:19] <LarstiQ> fdv: roundtripping would be doing a commit with bzr, send it over to svn, and pull with bzr from it again, getting the equivalent revision.
[22:19] <Odd_Bloke> fdv: Coversioning would be where you just init a bzr branch in your SVN checkout, do intermediate work in bzr and then commit normally in SVN.
[22:19] <LarstiQ> fdv: coversioning is when you use two separate version control systems to version one tree, without either knowing about the other
[22:20] <LarstiQ> fdv: so no sharing of history, but sometimes that is what one wants
[22:22] <fdv> ah, right
[22:26] <fdv> thanks for great help, once again :)
[22:27] <fdv> I think I have to work with this for a while to find out how it's best to incorporate into our workflow
[22:27] <fdv> (or what kind of workflow one can accomplish)
[22:39] <james_w> statik: I was just today trying to put together a screencast about bzr.
[22:40] <james_w> I got stuck trying to find something that I could use to edit the video without crashing.
[22:40] <Odd_Bloke> statik: You're emurphy on Twitter?
[22:42] <james_w> hi Odd_Bloke
[22:44] <Odd_Bloke> james_w: o/
[22:57] <lifeless> mtaylor: hi
[22:57] <lifeless> beuno: also, hi
[22:59] <jelmer> 'morning lifeless
[23:00] <beuno> mornin' lifeless. I see you've managed to index files as well  :)
[23:01] <lifeless> yeah
[23:01] <lifeless> people kept bitching about it
[23:01] <lifeless> so I'm like fuck-it, I'll do it
[23:01] <james_w> moanin' lifeless
[23:02] <beuno> hahah
[23:02] <beuno> so that's the secret...
[23:03] <lifeless> hi james_w, jelmer
[23:39] <mwhudson> beuno: good morning
[23:40] <beuno> good morning to you mwhudson
[23:43] <mwhudson> so i looked at pylons a bit last night
[23:43] <beuno> mwhudson, I've been thinking about what would be enough to make a Loggerhead release. To get things rolling again. Maybe we can add nicer-urls, clean up setup.py, and... well, the templating change is already a big one by itself
[23:43] <mwhudson> i wasn't very impressed :(
[23:44] <beuno> really? Hard to use? Incomplete?
[23:44] <beuno> "just ugly"?
[23:45] <mwhudson> ugly, from my personal pov, and not really appropriate for loggerhead
[23:45] <mwhudson> i think i'm quite stuck in my ways in some things
[23:45] <mwhudson> i like xml templating and i like object publishing
[23:45] <mwhudson> (as opposed to routes-type things)
[23:46] <beuno> fair enough. Did that include looking at Mako as well?
[23:46] <mwhudson> also pylons seems to have and encourage quite a lot of magic
[23:46] <mwhudson> no
[23:47] <beuno> ok, well, a better look at Genshi may be worth a try instead
[23:50] <beuno> either way, if pylons is that ugly, we may as well concentrate on the core instead, now that the templating engine is saner
[23:51] <mwhudson> wrt a release, if we could get away from cherrypy, i'd say we were pretty close to a release-candidate
[23:52] <mwhudson> cherrypy3 is an option i guess, but it conflicts with cherrypy 2 :(
[23:54] <poolie> good morning
[23:54] <beuno> how about paste?
[23:54] <beuno> morning poolie
[23:55] <mwhudson> beuno: i'm still not really sure what paste is :)
[23:55] <mwhudson> hi poolie
[23:57] <mwhudson> beuno: are you going to bzr send your cleaner urls branch?
[23:58] <beuno> mwhudson, http://pythonpaste.org/
[23:58] <mwhudson> beuno: yes
[23:58] <beuno> mwhudson, yeah, I just need to clean up the latest quirks introduced when merging from trunk
[23:58] <mwhudson> beuno: that site is very bad at explaining what paste is :)
[23:59] <poolie> beuno, hi
[23:59] <beuno> heheh, AFAIK, it can replace cherrypy, which is why I have that in my head
[23:59] <mwhudson> ah, http://pythonpaste.org/do-it-yourself-framework.html seems like the page i was looking for