[09:11] hi. at $COMPANY, we use bazaar for our internal development. now the problem appeared that some repository needs over 500 seconds to download, while the CLI says "Inserting stream:Estimate 751/2255" with no progress. AFAIK every other repository runs well. any advice?
[09:13] What's it doing during that time? Churning the disk, slamming the network, chewing the CPU?
[09:14] fullermd: nothing. it takes its RAM share at the repo server (the checkout is via the ssh transport), but the disk and network are idling
[09:14] no cpu usage either
[09:15] fullermd: you could think someone stopped the job (no, nobody did)
[09:16] Check both server and client side?
[09:16] fullermd: both server and client side
[09:17] fullermd: would you mind prefixing posts for me with my username, so I get a highlight? that would be really nice :)
[09:22] fullermd: on the server side, the .bzr.log with some debug flags says "Adding the key (, 23942126, 8836427) to an LRUSizeCache failed. value 62761550 is too big to fit in a the cache with size 41943040 52428800", the client says "inserting substream: texts", and after that both go idle
[09:24] killermoehre: What, to sort it out from all the other conversations happening here? :p
[09:25] killermoehre: I'm pretty sure that's more of a diagnostic message than an error, so probably ignorable.
[09:26] killermoehre: Still, I have a hard time imagining how both sides could be sitting around doing nothing. Something's waiting on something.2, and that's either because something.2 is feasting on some resource, something is broken, or things are deadlocked.
[09:26] If (3), then it'll never get anywhere, and points to a ginormous bug. I doubt it. (2) is always possible, but is also a ginormous bug I haven't seen. So I'm guessing (1) is the case somewhere.
[09:27] fullermd: I don't see the numbers you're referring to ^^
[09:28] killermoehre: Just the 3 things I mentioned in the prior line.
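The disk/network/CPU triage fullermd asks for can be done with a few standard Linux tools. A minimal sketch (standard procps/coreutils only; run it on both server and client — which process to look for depends on your setup):

```shell
# Quick resource triage for a seemingly stuck process, per the question above.
ps -eo pid,pcpu,pmem,comm --sort=-pcpu | head -5   # biggest CPU consumers right now
load=$(cut -d' ' -f1-3 /proc/loadavg)              # 1/5/15-minute load averages
echo "load average: $load"
```

`iostat` (disk) and `ss`/`netstat` (network) round out the picture, but aren't always installed by default.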
[09:32] fullermd: well, I don't see any kind of resource usage, but maybe you can give me a hint. the server is a debian7 with bzr 2.6.0dev2, the clients are very different but all feature the same problem, so I think it's a problem with the server.
[09:33] and now I see with a test on the server that a local checkout hangs, too
[09:34] top for CPU, iostat for disk, netstat for network. The more other stuff is being done by the system, of course, the harder it is to sort out what's happening with bzr itself...
[09:34] If something's memory constrained, there's always the chance it's falling into swap. That would completely tank performance, but you'd probably notice the pagein/out activity pretty easily.
[09:35] fullermd: yeah, 100% cpu! success, I found something
[09:35] and constantly growing RAM usage
[09:35] or not
[09:35] but enough RAM usage
[09:35] killermoehre: what is the version of the repository you are branching from?
[09:35] killermoehre: maybe you can upgrade / pack it?
[09:36] to be fair, this is a VM with 476MB RAM
[09:36] Well, 2000-some texts isn't particularly large, so it's probably not pure history size filling stuff up. Big files in the history?
[09:36] fullermd: how can I check? I'm only the admin, I don't use bzr very often
[09:36] That _is_ a bit tight RAM-wise by modern standards. 'course, it's enough if it's not paging in and out of swap, but it can still kill your page cache etc...
[09:37] killermoehre: Well, by just knowing what's in the code/whatever is being versioned. Images, binaries, ISOs, core dumps, etc?
[09:37] fullermd: should be just php code
[09:37] 2000 texts of program source code is tiny, but if it's multi-meg images, that's a whole 'nother matter.
[09:39] After doing a branch, you should have a repo with a single pack in it. How big is that pack?
[09:39] fullermd: pack?
[09:40] ll .bzr/repository/packs/
[09:41] fullermd: 32MB
[09:42] OK, even a 2002-era memory size shouldn't have too much trouble with _that_.
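The pack-size check above can be scripted. A self-contained sketch — the 1 MiB pack file below is fabricated for the demo; in a real branch you'd just point it at your actual `.bzr/repository/packs/`:

```shell
# Minimal sketch: measure the total size of a 2a-format repo's pack files.
set -eu
demo=$(mktemp -d)
mkdir -p "$demo/.bzr/repository/packs"
head -c 1048576 /dev/zero > "$demo/.bzr/repository/packs/sample.pack"  # fake 1 MiB pack

# Sum the sizes of all pack files, in bytes.
total=$(find "$demo/.bzr/repository/packs" -type f -name '*.pack' -exec cat {} + | wc -c)
echo "pack data: $total bytes"
rm -rf "$demo"
```

A freshly branched repo normally holds a single pack, so this number approximates the compressed history size being discussed.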
[09:42] So let's look at zyga's first suggestion: format mismatch. Look at 'info' on the source repo, see what format it says it's in.
[09:43] fullermd, zyga: format says 2a
[09:45] Well, that should be OK, unless you're branching it in _to_ an old-format repo, which is pretty unlikely.
[09:45] (would probably take some conscious effort to squirrel things up)
[09:45] fullermd: I don't think so; I did this with a new checkout
=== alexlist` is now known as alexlist
[09:54] killermoehre: Well, that's all my particularly good ideas. It's not a tiny repo, but it's not huge either, so I wouldn't expect massive waits on it putting stuff together.
[09:54] Of course, VM does mean all kinds of constrained environment, so "all the CPU available for a long time" might not really be all that much in larger terms...
[09:55] fullermd: as I said, I don't see anything in vmstat either
[09:56] zyga: do you have another idea?
[09:56] killermoehre: not really
[09:57] killermoehre: try bzr pack on the remote end
[09:57] killermoehre: see if that helps
[09:57] VM as in virtual machine; the CPU churning shows in top.
[09:57] killermoehre: use a shared repo locally to avoid the traffic in the first place
[09:57] zyga: well, we have multiple developers and they want to check out the repo first to have it locally ^^
[09:58] For comparison, here's a local repo I have of a project full of PHP and SQL files, but with a bunch of images thrown in too. The history is about 37 megs. pack takes about 30 seconds.
[09:58] fullermd: I can't see the cpu utilization if I do a remote checkout
[09:59] killermoehre: doing a checkout or a branch?
[09:59] killermoehre: checkouts are pretty crap IMHO
[09:59] zyga: checkout
[10:00] fun fact: a lightweight checkout runs fast
[10:00] killermoehre: checkouts are horrible, don't use them unless you really have to
[10:00] killermoehre: everything about them is wrong IMHO
[10:00] zyga: it's the way the devs work here
[10:00] I can't change this
[10:00] I'm just the admin
[10:00] Checkouts are awesome. Lightweight checkouts, now, of non-local branches, are nutso.
[10:01] The CPU burning may be on the server side, not the client.
[10:01] fullermd: the bzr pack takes 100% cpu on the server
[10:01] funny
[10:01] killermoehre: it runs once and then stops
[10:01] Well, yeah; pack is CPU bound, unless you have _really_ slow disks.
[10:01] killermoehre: recommend your developers not use checkouts; they probably don't need them and are just used to "svn"
[10:01] And most VM hosts don't have ESDI controllers, so yours probably aren't that slow ;p
[10:02] A heavy checkout ain't gonna be any slower than a branch.
[10:03] fullermd: the point with checkouts is that they are against the spirit of distribution. just do a local branch, use a shared repository to get local, low-cost branching, push branches back; that's the most efficient bzr workflow
[10:03] centralized checkouts are all evil: no way to do CI, a single point of contention. I cannot understand why people even use them
[10:03] zyga: right now the workflow is "checkout → developing → merging with the local checkout → after developing is done, merging with the repo"
[10:04] killermoehre: branch, hack, merge || rebase, push, have CI merge it
[10:04] killermoehre: (push to a new branch on the far end)
[10:04] Pfui. Checkouts are an _excellent_ mechanism to coordinate work on shared branches. The only rational alternative is just manually implementing the checkout workflow, and that's silly.
[10:04] killermoehre: anyway
[10:04] fullermd: I have yet to see a case where a checkout is worth anything over a CI gatekeeper
[10:04] fullermd: but perhaps in some special cases that makes sense
[10:04] zyga: CI?
[10:04] "not being nutso" :)
[10:04] fullermd: for me checkouts are slow and crappy
[10:05] killermoehre: http://en.wikipedia.org/wiki/Continuous_integration
[10:05] zyga: ah, ok
[10:05] zyga: I don't think this is applicable right now here
[10:06] killermoehre: well, you have what you have then
[10:06] killermoehre: slow stuff
[10:06] fullermd: the pack now hangs at "\ repacking texts:texts 147/650"
[10:06] can I see which file this is?
[10:07] or what bzr does there?
[10:07] It's not hung, it's just digesting.
[10:08] fullermd: hanging := no visible output of success or computing
[10:08] killermoehre: it's not hung, it's just compressing, wait
[10:08] zyga: any way to see what it's compressing?
[10:08] hmm … lsof
[10:09] Won't tell you anything useful.
[10:09] right -__
[10:09] right -__
[10:09] right -_-
[10:09] argh!
[10:10] It's extracting the texts out of the existing packs and re-compressing them into new ones. Maybe verbose logging can tell you text ids, but without a lot of digging that still won't tell you anything too useful.
[10:11] fullermd: so this is an internal action, and probably the problem can only be solved by throwing more resources at the machine?
[10:12] Probably. How long did the pack take?
[10:14] fullermd: if I read the ~/.bzr.log right, it took 810 seconds
[10:15] well, I have multiple debug flags enabled, so this value could be a little too high for production use
[10:16] 810 seconds? For a 36 meg repo? So 24 times as long as on my workstation here?
[10:17] I guess it's possible the contents could be pathological somehow. But yeah, I'd guess the more likely answer is "time to retire that Pentium II".
[10:17] fullermd: again, it's a VM.
two cores at 3GHz
[10:18] brb, coffee wants to come out
[10:19] Well, pack isn't multithreaded, so 2 cores wouldn't be much different from 1.
[10:20] But either it's _really_ busy with something else, or that's a 3 GHz single-issue in-order core with no cache, 'cuz I'm only on a not particularly awe-inspiring 3.7.
[10:20] Or, as above, possibly an ultra-pathological set of contents. But I'd be pretty surprised at that sort of disparity.
[10:26] I do love VM's, but the species does generally describe an animal that's oversubscribed to heck and back, and gets really cranky when subjected to much load. And, absent aforementioned magic, that repo really shouldn't be much load.
[10:28] A lot of possible pathologies could be dreamt up to explain things running like molasses, but they're a pretty far reach. Running out of suds is endemic to the VM world, and accounts for it.
[10:32] fullermd: well, the hosting server has 8 cores and 8 GB RAM; summed up, the machines use 10 cores and 7.5 GB RAM.
[10:35] killermoehre: re-do that operation on your $laptop
[10:35] killermoehre: (just before the pack)
[10:35] killermoehre: if you wonder about virtualization impact
[10:36] zyga: the "bzr pack" at the checkout I did?
[10:38] killermoehre: not at the checkout
[10:38] killermoehre: you need the original branch, if you still have it
[10:38] The checkout will do fine.
[10:38] (assuming it's not a light one, of course. But that's silly anyway)
[10:39] fullermd, zyga: started a bzr pack on my $laptop in the checkout I did. seems to "hang", too
[10:39] I will see how long it takes
[10:41] still running …
[10:42] I like how bzr is logging everything
[10:44] and still running … six minutes now
[10:45] That really sounds excessive. It's got the CPU nailed the whole time?
[11:02] fullermd: yes
[11:03] fullermd: sorry for the delay. sometimes other work interferes ^^
[11:03] .bzr.log probably has some info about the counts of things being dealt with.
[11:03] On my ~37 meg repo here, I've got numbers like
[11:03] 0.158 repacking 8809 revisions
[11:03] 3.278 repacking 8809 inventories
[11:03] 4.600 repacking chk: 8717 id_to_entry roots, 1426 p_id_map roots, 41320 total keys
[11:03] 14.405 repacking 30226 texts
[11:03] 29.317 repacking 0 signatures
[11:03] (and there it's done)
[11:05] fullermd: http://sprunge.us/MMMK
[11:06] fullermd: I'm killermoehre and /home/killermoehre/tmp/repository/ is the directory where the checkout is
[11:08] Almost 10 minutes to repack 251 texts? Crazymuffin.
[11:09] The cache overflows aren't _good_, but having 3 of 'em in the whole process doesn't seem like it would make a sizable difference.
[11:10] fullermd: what are those "texts"?
[11:10] Versions of files, basically.
[11:11] So 219 revisions and 251 texts means you're changing roughly 1.15 files per commit.
[11:12] fullermd: the devs did
[11:12] fullermd: I never touched the source
[11:12] Of course, OTOH, they're obviously pretty big. 251 texts compressing down to 36 megs (assuming revs etc. take up no space, which to a first approximation they do) means that each text after delta compression is taking up about 143k. That's hyuge.
[11:12] Compare my ~37 meg repo with 30k texts, where each is taking up about 1k.
[11:14] fullermd: so what do you suggest? delete the history of this repository?
[11:14] So I suspect there's some significant churn on big files, which usually means either binary or generated stuff (as human-maintained text hardly ever gets THAT big and mobile)
[11:16] Well, that would hack around the issue, certainly.
[11:16] I don't s'pose it's a sharable repo.
[11:17] fullermd: sharable repo?
[11:17] As in you can give somebody else access to it.
[11:18] fullermd: no. it has to stay in-company
[11:18] * fullermd nods.
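The `.bzr.log` timing lines quoted above can be turned into per-stage durations with a little awk: the first column is seconds since the command started, so each stage's cost is the gap to the next line. A self-contained sketch (the sample log below is fabricated from a subset of the numbers quoted above; in real use you'd point it at `~/.bzr.log`):

```shell
# Sketch: compute per-stage durations from "repacking ..." progress lines.
set -eu
log=$(mktemp)
cat > "$log" <<'EOF'
0.158  repacking 8809 revisions
3.278  repacking 8809 inventories
14.405 repacking 30226 texts
29.317 repacking 0 signatures
EOF

report=$(awk '/repacking/ {
    if (stage != "") printf "%-12s %6.1fs\n", stage, $1 - t
    t = $1; stage = $4
} END {
    if (stage != "") printf "%-12s (runs to end of pack)\n", stage
}' "$log")
echo "$report"
rm -f "$log"
```

On the sample data this shows, e.g., the texts stage taking ~14.9 s, which is where slow packs like the one above usually spend their time.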
[11:18] I would be killed (or at least fired) if I gave externals access to it
[11:18] Hey, you gotta sacrifice sometimes ;p
[11:19] fullermd: with a whole checkout I can go back to any previous revision, right?
[11:19] At any rate, that sorta size sounds a couple sigma off the norm, but I'm still a bit surprised it's that slow.
[11:20] Even accounting for my workstation, while not all that screamingly fast, being almost certainly significantly gruntier than your laptop, a factor of 20 difference seems well outside the lines.
[11:21] (how's that for tortured sentence construction?)
[11:21] Yes, you can.
[11:21] fullermd: (I'm German. we invented tortured sentence constructions ;) )
[11:22] fullermd: how can I access previous revisions?
[11:22] Depending on particulars, usually via update or revert.
[11:24] update being more for things like "shuffle my whole WT over to this revision for experimentation", revert more for "slam this file back to the state of that rev for committing a backout"
[11:24] (as a first approximation; endless details and special cases, of course)
[11:25] fullermd: well, I want to find out which file is causing this behavior, and why
[11:25] How big is the existing working tree?
[11:25] fullermd: I can't tell the devs "you fucked this up, live with it"
[11:25] (a quick mental 'du .' - 'du .bzr' will give you a working approximation)
[11:27] fullermd: 252kB
[11:27] Basically, there are 2 possible extremes. One is that it's a bunch of big files and/or massive changes in each commit. In that case, well, there's nothing much you can do; it's just how your project goes.
[11:27] fullermd: echo $(($(du -s . | awk '{print $1}') - $(du -s .bzr | awk '{print $1}')))
[11:28] The other is that somewhere back in history, somebody accidentally committed a giant .iso or .core or the like, and then later noticed and removed it because it shouldn't have been there. In that case, you can in concept rewrite the history without it, and get rid of the deadweight.
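The du arithmetic above, as a self-contained sketch: compare the working-tree size against the `.bzr` history size. The fake tree below (a 256 KiB "history" and an 8 KiB `code.php`, both fabricated) stands in for a real checkout, where you'd run the same three lines from the branch root:

```shell
# Sketch: working-tree size vs. history size, per the 'du .' - 'du .bzr' trick.
set -eu
demo=$(mktemp -d)
mkdir -p "$demo/.bzr/repository"
head -c 262144 /dev/zero > "$demo/.bzr/repository/history.bin"  # fake 256 KiB of history
head -c 8192   /dev/zero > "$demo/code.php"                     # fake 8 KiB working tree

cd "$demo"
total_kb=$(du -sk . | awk '{print $1}')
hist_kb=$(du -sk .bzr | awk '{print $1}')
tree_kb=$((total_kb - hist_kb))
echo "working tree: ${tree_kb} KiB, history: ${hist_kb} KiB"
# A history vastly larger than the tree hints at big deleted files in old revisions.
cd /; rm -rf "$demo"
```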
[11:29] fullermd: any easy way to find the biggest file in bzr history?
[11:29] Well, that's not huge.
[11:30] Not directly. You could try pulling up a 'bzr log -v', then looking through it for 'removed' lines and seeing if any of the files that have been removed look suspicious.
[11:30] It sounds like the latter, really. 252k of size, if it were completely changed in every one of the 219 revs, would total up to 55 megs before compression, which is only twice the size of your fully compressed repo.
[11:31] So there's probably some big stuff piled up in the history.
[11:34] fullermd: seems like in the first revision some sql dumps were added, which were removed in the second revision
[11:34] Sounds like a likely culprit. A "bzr diff -c1 | diffstat" could confirm it.
[11:34] (giving you a quick eyeball of the size of the files)
[11:37] fullermd: 45 files changed, 7351 insertions(+)
[11:38] fullermd: at -c2, I got "41 files changed, 1 insertion(+), 6731 deletions(-)"
[11:38] And SQL dumps often have ultra-long lines, so those ~7000 lines can mean a _lot_ of data.
[11:39] Yep, I'd call that a safe bet for the bloatsource.
[11:40] Getting rid of it would mean redoing the history without those files. That means everyone's branches/checkouts would need redoing, and anybody's outstanding work in unmerged branches would have to be rebased on top of the redone history.
[11:41] Which may be anything from "trivial" to "fiendishly involved", depending on your particular workflows etc.
[11:41] hrmpf
[11:42] Doing it will probably involve some invocation of something in the rewrite plugin, possibly on top of a little manual setup. But that gets deep into details; I don't have a recipe offhand, so it would probably take some experimentation.
[11:43] fullermd: well, experimentation would be … bad
[11:43] Well, obviously you'd make a scratch branch for the experimentation. It wouldn't be _bad_, just potentially time-consuming.
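The 'look for removed lines' hunt above can be semi-automated. A sketch that extracts filenames from the "removed:" sections of verbose log output — the sample text below is a fabricated, simplified stand-in for real `bzr log -v` output (real output has more fields and indentation), which you'd pipe in instead:

```shell
# Sketch: list files that appear under "removed:" sections of a verbose log.
set -eu
sample=$(mktemp)
cat > "$sample" <<'EOF'
revno: 2
removed:
  dump_users.sql
  dump_orders.sql
added:
  README
revno: 1
added:
  dump_users.sql
  dump_orders.sql
  index.php
EOF

# Print filenames between a "removed:" header and the next section header.
removed=$(awk '/^removed:/{flag=1; next} /^[a-z]+:/{flag=0} flag{print $1}' "$sample")
echo "files removed somewhere in history:"
echo "$removed"
rm -f "$sample"
```

Anything big-looking in that list (dumps, ISOs, cores) is a candidate for the "committed by accident, removed later" scenario discussed above.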
[11:44] fullermd: so I have to tell the devs "another dev who doesn't work here anymore screwed up. you're out of luck". i CAN LIVE WITH THAT
[11:44] sorry for the caps
[11:44] That's why god made devs-who-left-the-company ;)
[11:45] fullermd: at least it's not my fault or the machine's
[11:47] fullermd: thank you very much for your time and help and the insight into bzr. should you ever need help with xfce, don't hesitate to ask in #xfce :D
[11:48] np :)
[11:49] Though I'm afraid ctwm would be terribly disappointed in me if I left it for some newfangled upstart wm ;>
=== zyga__ is now known as zyga
[14:03] vila: Hey, you won First Commit Of 2014. Does that come with a trophy, or just a little brass nameplate? ;)
[14:04] fullermd: hehe, no, nothing, pure glory ;) (And even there...)
[14:06] Not even a cake? Jeez, what a buzzkill.
=== Kraln- is now known as Kraln