/srv/irclogs.ubuntu.com/2011/10/03/#bzr.txt

=== vila changed the topic of #bzr to: Bazaar version control <http://bazaar.canonical.com> | try https://answers.launchpad.net/bzr for more help | http://irclogs.ubuntu.com/ | Patch pilot: vila
vila	hi all !	06:13
jam	morning all	06:54
mgz	morning all	07:52
Riddell	hello	08:16
mgz	hey Riddell	08:16
jam	vila: https://code.launchpad.net/~jameinel/bzr/2.5-no-hanging-teardown/+merge/77870 should handle the hanging tests issue	08:18
jam	both some fixes to the testing infrastructure to avoid future hangs, and fix the actual trigger	08:18
jelmer	vila: did you see https://code.launchpad.net/~bryan-tongminh/bzr-webdav/ctype-fix/+merge/77824 ?	09:45
vila	jelmer: yup, on my TODO list	09:51
jelmer	vila: great, just checkin' :)	10:00
jelmer	we should really get some of those inactive projects out of the "bazaar" project on Launchpad	10:00
vila	yeah, this one has been rumored to be merged into core but the maintainer is slacking ;)	10:01
jelmer	hehe	10:04
jelmer	Riddell: just noticed, the changelog entry still says a "get-tar" command was added rather than get-orig-source	11:33
jelmer	sorry for not catching that during review	11:33
Riddell	jelmer: oh aye, fixing	11:34
=== medberry is now known as med_out
rom1	hi all	12:15
rom1	I have a functional question about branches merges : is it possible to do a "merge revision" without merging the files, but selecting the OTHER code ?	12:17
rom1	Or in other words, is it possible to do a pull/push --overwrite with the creation of a revision that bundles all the pulled/pushed revisions ?	12:18
jelmer	rom1: I'm not sure I follow entirely - is this to e.g. land a branch on trunk?	12:21
rom1	jelmer : not sure what land means... In fact, i have some developers requesting a workflow a la gitflow.	12:22
rom1	in this workflow, i can have hotfixes in the master	12:22
jelmer	rom1: Can you give a concrete example ?	12:23
jelmer	rom1: There is no one git work flow afaik :)	12:23
rom1	but to merge the develop revisions to the master, i want to forget about the hotfix changes : either they have been merged into the develop branch, and there is no problem, or they have not, and it means that they have to be replaced by the develop branch	12:23
rom1	jelmer : i talk about this one : http://nvie.com/posts/a-successful-git-branching-model/	12:24
jelmer	rom1: I'm still not sure if I follow but I think you mean "bzr merge OTHER && bzr ci -m 'Merge other'."	12:26
poolie	o/ jelmer	12:26
poolie	(not really here, should go to bed)	12:26
jelmer	hey Martin	12:26
jelmer	Ah, right.. labor day in .au ?	12:27
jelmer	Or I guess that would be labour day?	12:27
nigelb	lol	12:27
poolie	in NSW	12:27
jelmer	ah, that explains why wgrant and wallyworld were still around. I figured they just forgot ;-)	12:28
jelmer	hey mrevell, back on the interwebz?	12:28
mrevell	ja!	12:28
mrevell	:)	12:28
rom1	jelmer : in fact, i do not want to merge the files, i want to get exactly the code from OTHER (just like the push/pull --overwrite) but in a single revision...	12:29
nigelb	jelmer: forgetting is entirely possible :)	12:30
jelmer	rom1: so you want to discard the changes in the current branch, but have history indicate they're present?	12:30
jelmer	nigelb: you're talking to somebody who has once accidentally worked on a holiday :)	12:31
nigelb	jelmer: hahaha	12:31
nigelb	I guess that happens when you work from home :)	12:31
rom1	jelmer : well, discussing about that, i notice that it isn't very clear even for me.	12:33
rom1	:p	12:33
jelmer	rom1: is there a specific git command that does the same thing that you're looking for?	12:34
rom1	jelmer : when we create a hotfix on a production branch, it may be a dirty quick patch to quickly resolve the issue. When we release a new version containing a quick fix of the issue, i do not want to merge the dirty patch and the clean fix, but only take the clean one.	12:35
jam	rom1: bzr merge $OTHER; bzr revert -r -1:$OTHER; bzr commit -m "Merge and reset the tree state to $OTHER" ?	12:35
jam	Or just simply:	12:36
rom1	Yep ! i didn't think about revert !	12:36
jam	bzr revert -r -1:$OTHER; bzr commit -m "Set the tree state to exactly OTHER but don't mark it as merged"	12:36
jam	I don't really think you want to set it to other without merging, but you could, if you really want to throw away all of OTHER's history.	12:36
rom1	jam : i understand. I haven't validated so far this workflow. I wanted first to see if it was feasible with bzr and our release management. I understand that a "merge without merge" is somehow surprising...	12:39
rom1	sorry, in my post to jelmer, i was meaning : "When we release a new version containing a CLEAN fix of the issue[...]"	12:40
jam	rom1: the issue is just what exactly you mean by "pull --overwrite" with only a single revision	12:42
jam	doing a merge, and revert, will create a single new mainline revision	12:43
jam	however	12:43
jam	this also assumes that you don't have any state in mainline that you want to keep	12:43
jam	specifically, say that you emergency fix X, then you do a normal fix of Y, then you finally finish the real fix for X	12:43
jam	doing the revert will throw away the updates to Y.	12:43
jam	Which is why I would suggest just doing "bzr merge && bzr commit"	12:44
jam	but you know your process better than I do	12:44
jam	Assuming that the dev branch always supersedes the production branch sounds a bit risky, but if that is your process, you can stick to it.	12:44
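jam's warning about losing the fix for Y can be illustrated with a toy model. This is only a sketch: trees are plain dicts of file name to content, the file names and contents are made up, and the per-file merge rule (including "prefer OTHER on conflict") is a simplification for illustration, not bzr's merge algorithm.

```python
def three_way_merge(base, this, other):
    """Per-file three-way merge; returns (merged tree, list of conflicting files)."""
    merged, conflicts = {}, []
    for name in sorted(set(base) | set(this) | set(other)):
        b, t, o = base.get(name), this.get(name), other.get(name)
        if t == o or o == b:
            merged[name] = t          # only THIS changed, or both sides agree
        elif t == b:
            merged[name] = o          # only OTHER changed
        else:
            merged[name] = o          # both changed: toy policy, prefer OTHER
            conflicts.append(name)
    return merged, conflicts

base = {"x.py": "buggy", "y.py": "old"}
mainline = {"x.py": "hotfix", "y.py": "fixed"}   # emergency fix for X, normal fix for Y
develop = {"x.py": "clean fix", "y.py": "old"}   # real fix for X only

merged, conflicts = three_way_merge(base, mainline, develop)
reverted = dict(develop)  # tree state after resetting the tree to OTHER
```

In this model the merge keeps the mainline fix for Y and only flags X as needing a decision, while resetting the tree to OTHER silently puts the old Y back.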
rom1	jam : you're right, a temporary hotfix hasn't to be released in a production branch. Just branching it in a dead end branch, and keeping my production branch with merges only.	12:47
rom1	Thx jam and jelmer	12:47
systemclient	is bzr 0.18.0 usable with current repos at all?	12:49
poolie	systemclient, it won't be able to read the default format created by recent bzr releases	12:50
poolie	pre-1.0 is pretty old	12:50
systemclient	poolie: isn't pre 2.0 old already?	12:51
poolie	yeah, therefore 0.18 is really quite old	12:54
poolie	2.1 and later are still in support	12:55
jam	poolie: I know you aren't really here, but if maybe your ghost is around, I have an initial prototype up for https://bugs.launchpad.net/bzr/+bug/819604	13:08
ubot5	Ubuntu bug 819604 in Bazaar 2.1 "when an idle ssh transport is interrupted, bzrlib errors; should reconnect instead" [High,In progress]	13:08
jam	And it would be nice to get some feedback about where it is going.	13:08
mgz	okay, lunch before caches confuse me any more	13:28
jelmer	I never go caching before lunch either.	13:29
jelmer	vila: thanks!	13:53
jam	vila: I replied to your review. I tried to run the babune jobs, but it just told me the servers are unavailable, and it looks like you deleted the requests. (Perhaps I entered them wrong?)	14:28
vila	ha, which requests did you enter on which jobs ?	14:29
jam	vila: freebsd, natty, lucid	14:29
jam	selftest-subset-*	14:29
vila	jam: I'm just recovering from a babune crash and I was typing text in a firefox, unlikely to crash...	14:29
vila	jam: for which tests ?	14:30
jam	vila: I was running all of them, since the failing tests tend to be randomly distributed. I know which ones to suspect if you don't want a full test run.	14:30
vila	I'd prefer that, yes, especially if something is crashing	14:31
vila	jam: the freebsd slave is running a fsck, leave it some time to recover	14:32
jam	vila: So, after the changes, nothing should crash or hang :). The issue is that the test was hanging once it hits 4.0s of runtime. So you need a bit of load to slow the test suite down enough. Here they normally finish in about 2.5s	14:35
jam	(Which is why it seemed so random, a given test has to get some sort of hiccup and go over the 4.0s mark.)	14:35
vila	no test takes 4s to run on babune AFAIK	14:35
jam	vila: I'm pretty sure it did, though not consistently	14:35
vila	did you find reports to back that up ?	14:37
jam	vila: I can trigger the test suite hang by making the test take longer than 4.0s	14:40
jam	is that good enough for you?	14:40
vila	meh, of course not, I mean, this is certainly a bug but it doesn't mean it explains the ones we encountered on babune	14:41
vila	a test taking 4 seconds is already a bug and we've never seen such huge variation without a good reason	14:41
jam	vila: aka, this diff: http://paste.ubuntu.com/701706/	14:42
mgz	load vila?	14:42
jam	vila: it's already at 2.5s here on my reasonably fast laptop	14:42
jam	it isn't that far from 4.0s	14:42
mgz	how careful are you to only be running one test suite on a box at once?	14:42
vila	mgz: I rely on jenkins for that (don't remember the details)	14:43
jam	vila: didn't you have to implement locks to reduce the load spike at midnight?	14:45
jam	I was pretty sure you schedule all jobs to run daily, and then you restrict it via some sort of inter-locking to something like 2 concurrent runs.	14:45
jam	also, you are running --parallel, right?	14:45
jam	if you had per-test timing (which I really had hoped you would), we might have been able to see something like that	14:45
jam	I realize it doesn't get exposed via our junit xml adapter	14:46
vila	inter-locking is on slaves	14:46
jam	though again, it may not strictly matter, since once a test hit 4.2s it would hang, and we wouldn't see it. But we could see that in the past, some test happened to spike higher than 4.0s.	14:46
vila	jam: look at the Test results, the timings are there for all tests (consolidated by prefix)	14:46
jam	vila: ah, just only for successful runs, right?	14:47
vila	yup, but it would be very weird that a spike never occurred	14:47
jam	vila: http://babune.ladeuil.net:24842/job/selftest-chroot-lucid/lastSuccessfulBuild/testReport/bzrlib.tests.per_interrepository.test_fetch/TestInterRepository/	14:48
jam	has a test that takes 2s	14:48
jam	vila: note that it only started failing if you have a 4.0s spike with the ConnectionTimeout patch	14:48
jam	so it certainly could have been spiking in the past, and just didn't cause a failure/hang	14:48
jam	vila: http://babune.ladeuil.net:24842/view/FreeBSD/job/selftest-freebsd/lastStableBuild/testReport/bzrlib.tests.per_interrepository.test_fetch/TestInterRepository/ has a test that takes 5.5s	14:49
jam	test_fetch_from_stacked_smart_old(InterDifferingSerializer,RepositoryFormat2a,RepositoryFormatKnitPack6RichRoot) 5.5 secPassed	14:49
jam	test_fetch_parent_inventories_at_stacking_boundary_smart(InterDifferingSerializer+get_known_graph_ancestry,RepositoryFormatKnitPack1,RepositoryFormatKnitPack6RichRoot) took 5.4s	14:50
vila	can you paste the precise URL instead of letting me find it in ~100 line pages ?	14:50
jam	and the ones after it are taking 6.x *	14:50
jam	vila: I was trying to show you the overview	14:50
jam	http://babune.ladeuil.net:24842/view/FreeBSD/job/selftest-freebsd/lastStableBuild/testReport/bzrlib.tests.per_interrepository.test_fetch/TestInterRepository/test_fetch_parent_inventories_at_stacking_boundary_smart_InterDifferingSerializer_RepositoryFormat2a_RepositoryFormatKnitPack6RichRoot_/?	14:50
jam	there is a direct test	14:50
jam	'took 6.7s'	14:50
jam	which also explains why the freebsd was more likely to hang	14:50
vila	ha, ok	14:54
vila	but why spuriously then ?	14:54
jam	vila: you have to have 4.0s of idle time on a given connection, and then get another connection after that 4.0s. So it isn't strictly a 'test takes >4.0s'.	14:55
vila	jam: if you look at this same test in the previous builds, it's always above 4.0s	14:55
jam	Say you get the last connection at 3.9s, and then spend 2.7s working with the last connection.	14:55
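The distinction jam draws here (idle time before a new connection, not total test duration) can be sketched as follows. This is a rough reading of his description, not bzrlib code: the function name and the timestamps are hypothetical, and it assumes the earlier connection sits idle until the next one arrives.

```python
IDLE_LIMIT = 4.0  # seconds; stands in for the timeout under discussion

def arrives_after_idle_limit(connection_times, limit=IDLE_LIMIT):
    """True if any connection starts more than `limit` seconds after the previous one."""
    pairs = zip(connection_times, connection_times[1:])
    return any(later - earlier > limit for earlier, later in pairs)

# Last connection arrives at 3.9s, then 2.7s of work on it: a 6.6s test, no trigger.
long_but_busy = [0.0, 3.9]
# A new connection arrives at 4.2s, after the previous one idled past the limit.
short_but_idle = [0.0, 4.2]
```

Under these assumptions a test can run well past 4.0s without hitting the problem, which would fit the failures looking random.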
jam	so, vila, even if I'm slightly wrong with my analysis (though I've spent the last 3 days on it), I'm sure that this change makes behavior more friendly, and fixes the "time.sleep(4.0)" problem.	15:03
jam	it certainly should change behavior so rather than hanging, we at least get a failure/exception when appropriate.	15:03
vila	jam: we can spend days discussing, if you know exactly how to trigger the hang, you should be able to make the test fail in a simple way to demonstrate it, you don't need to change all test servers for that leaving the doubt about whether you shake the code enough to make the hang go somewhere else	15:04
jam	anyway, EOD, here, I'll see if we can start talking about it earlier tomorrow.	15:04
jam	vila: I did change the tests to prove it	15:04
vila	indeed, and focus on smart server only if that's where the issue is	15:04
jam	client.read() was hanging	15:04
vila	but you didn't use this	15:04
jam	vila: the test didn't do it	15:04
jam	in fact, you had the test set a client timeout	15:04
jam	because it was hanging	15:04
jam	I was able to fix the test to just read and get a closed connection	15:05
vila	in test_test_server only, that's not used anywhere else !	15:05
jam	vila: test_server is the base implementation of SmartTCPServer_for_testing which is used in every test that calls make_smart_server()	15:05
vila	no !	15:05
vila	test_test_server not test_server	15:05
vila	your change is in the former not the latter	15:06
jam	vila: TestingTCPServerMixin is the class that needed updating as it was the part that implemented the code that SmartTCPServer_for_testing uses	15:06
jam	the tests for that class are in "test_test_server"	15:06
jam	there aren't any tests in "test_server"	15:06
jam	it is the implementation of the "TestServer"	15:07
jam	vila: if you go to "bzrlib.tests.__init__" you can see that we add "bzrlib.tests.test_test_server" but not "bzrlib.test_server" to the test suite.	15:07
jam	anyway, really, I need to go pick up my son. I'm not sure why you don't believe me	15:08
vila	that's what I'm saying, I know how these files are named, I created them	15:08
mgz	okay, I think this has just made my morning worthwhile.	15:08
mgz	>>> om[3062558728]	15:09
mgz	str(3062558728 4194212B 1par 'f\xe7_chknode:\n65536\n1\n1382\n\n\x00\x00sha1:6d13c15b49497a74b59b064e0f1bb074dd05b3be\n\x01\x00sha1:ce73daef8871866fd78')	15:09
mgz	>>> om[3043770376]	15:09
mgz	str(3043770376 4194212B 1par 'f\xe7_chknode:\n65536\n1\n1382\n\n\x00\x00sha1:6d13c15b49497a74b59b064e0f1bb074dd05b3be\n\x01\x00sha1:ce73daef8871866fd78')	15:09
jam	vila: I fixed "test_server.py" to close connections on an exception, or when validate_request() returns false. I updated the tests in test_test_server to test those cases. If you just run the updated tests without the fixes, the test suite hangs.	15:10
jam	I'm not sure what else you want.	15:10
jam	mgz: do you need some help understanding those?	15:10
mgz	of the 25 LRUSizeCache objects over repository packs, two have duplicates	15:10
jam	That is a groupcompress record containing CHK nodes.	15:10
jam	If you open a repository twice, you'll get duplicates	15:11
vila	jam: I want your fix to be specific to the smart test server not invading other servers	15:11
jam	if you open a source and a target, they both might have a copy	15:11
jam	you can check the parents to see who is referencing them.	15:11
mgz	jam, as I understand that readout, there are two different GroupCompressBlock objects with the same content	15:11
jam	vila: I think the other servers are poorly behaved, because they will also cause clients to hang when the server gives up on talking to them	15:11
jam	mgz: also certainly possible	15:11
jam	vila: the tests you have today actually were hanging, but you forced the client to use a socket.timeout to avoid it	15:12
vila	jam: you can't say that without evidence, I told you last Friday this doesn't happen except for the smart server, which uses daemon threads	15:12
jam	mgz: you can ping me tomorrow after standup and I'll poke around with you if you want.	15:12
jam	vila: I can	15:12
vila	jam: don't generalize from a single test specifically designed	15:12
jam	I have evidence	15:12
jam	vila: if you take out the socket.timeout ... the test hangs	15:12
jam	reliably	15:12
jam	the comment that the "server doesn't get cycles" is false	15:13
jam	it is because the server "doesn't close the connection" until teardown	15:13
jam	sorry "socket.settimeout"	15:13
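The behaviour jam describes (a read that blocks while the peer keeps the connection open, versus a read that returns at once when the peer closes) can be reproduced with plain sockets. This is a minimal sketch using `socket.socketpair()` as a stand-in for a client and a test server; the helper name and the 0.2s timeout are assumptions and nothing here is taken from bzrlib.

```python
import socket

def read_with_timeout(sock, timeout):
    """Try one recv() and report what happened instead of blocking forever."""
    sock.settimeout(timeout)
    try:
        data = sock.recv(1024)
    except socket.timeout:
        return "timed out"
    return "closed by peer" if data == b"" else "got data"

# Case 1: the "server" end stays open and silent, so only the timeout frees the client.
server, client = socket.socketpair()
silent_result = read_with_timeout(client, 0.2)
server.close()
client.close()

# Case 2: the "server" end closes the connection, so recv() returns b"" immediately.
server, client = socket.socketpair()
server.close()
closed_result = read_with_timeout(client, 0.2)
client.close()
```

Without the `settimeout()` call, the first case would block indefinitely, which is the hang being argued about; the second case needs no timeout at all.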
mgz	jam, thanks. I'll plug on for a bit longer now and see where I get.	15:13
jam	vila: http://bazaar.launchpad.net/~bzr-pqm/bzr/bzr.dev/view/head:/bzrlib/tests/test_test_server.py#L166	15:14
jam	but for now, I'm gone	15:14
jam	see you all tomorrow	15:14
jam	have a good night	15:14
mgz	bye!	15:14
abentley	jelmer: some time I'd like to get up to speed on co-located branches and their implications for pipelines.	15:15
vila	jam: the comment says "whether our reads or writes may hang" this test requires a timeout	15:15
jelmer	abentley: the colocated branch format hasn't landed yet, so it might still be a bit too early. As far as I can tell pipelines just use the regular bzr APIs, in which case I think	15:17
jelmer	pipelines will just work out of the box with colocated branches.	15:17
abentley	jelmer: pipelines can create and use bzr-colo-style branches using "reconfigure-pipeline".	15:20
jelmer	abentley: colocated branches in core are different from bzr-colo	15:21
abentley	jelmer: Once colocated branches in core are usable, I think reconfigure-pipeline should switch to them. And "add-pipe" will need to support them too, I imagine.	15:22
mgz	looks very hopeful for some duplicate elimination: <http://pastebin.ubuntu.com/701737/>	15:24
jelmer	abentley: It should be able to support both, if you're ok with having the extra code to do so.	15:24
abentley	jelmer: I'm okay with that.	15:24
=== deryck is now known as deryck[lunch]
jelmer	vila: hi, still there?	15:56
vila	jelmer: oh sorry, yes	16:26
=== deryck[lunch] is now known as deryck
AuroraBorealis	hiya mgz/wgz whatever you are today	16:41
AuroraBorealis	i dunno how hard it is to fix whatever was going wrong with the meliae dumps but if one could figure that out then maybe we could finally get somewhere xD	16:45
mgz	after you went to sleep I succeeded in loading the dump by getting meliae to ignore ids that are not present,	16:54
mgz	which there's a TODO over but I'm not sure of the neatest way of doing	16:54
mgz	...and was on the other box	16:54
AuroraBorealis	lol	16:55
mgz	but if you add something similar the dump will at least load	16:55
AuroraBorealis	remember what file i should look in?	16:56
mgz	I've also been looking at where memory is used this morning, so have a generally better idea of what I'm doing	16:56
mgz	sec, nearly done for the day here, will transfer down below	16:56
AuroraBorealis	ah ok	16:56
wgz	okay.	17:01
wgz	workaround: <http://paste.ubuntu.com/701810/>	17:02
AuroraBorealis	that works	17:02
wgz	can still fall over later, but gets the thing loaded	17:02
wgz	so, can you also do `om.summarize()` now? if so, we can progress.	17:04
AuroraBorealis	i shall work on that now	17:06
AuroraBorealis	finally got it	17:23
AuroraBorealis	it wasn't repacking during this, but doing the fast-import again (2 gigs of memory)	17:23
AuroraBorealis	http://paste.ubuntu.com/701828/	17:24
wgz	omg.	17:24
wgz	well, that's only finding 440MB usage at that point	17:26
wgz	but 120MB in frozenset is pretty crazy	17:26
wgz	do `om.get_all("frozenset")[1]`	17:27
wgz	`om.get_all("frozenset")[0]` even.	17:27
AuroraBorealis	even though the memory usage was 2 gb in the process the dump file was only 1 gb	17:28
AuroraBorealis	which was weird	17:28
wgz	it's possible fast-import has 1.5GB of unfindable allocations	17:28
AuroraBorealis	frozenset(37820232 2272B 31refs 1par)	17:28
AuroraBorealis	and frozenset(1340577832 736B 15refs 1par)	17:29
=== beuno is now known as beuno-lunch
AuroraBorealis	for [0] and [1]	17:29
wgz	hm, that's not big, it's just lots of teeny ones adding up to pain then.	17:30
wgz	use _.c to see what's in one.	17:30
AuroraBorealis	after the get_all()[1] call?	17:31
wgz	in the python terminal _ just refers to the last object	17:31
AuroraBorealis	oh	17:31
wgz	you can bind one to a name instead if you want	17:31
AuroraBorealis	getting "address not present" again	17:31
wgz	try some other indexes, see if they're all missing contents	17:32
wgz	might be where some of the extra mem usage is to be found	17:32
AuroraBorealis	[0] worked	17:32
AuroraBorealis	http://paste.ubuntu.com/701837/	17:33
AuroraBorealis	just look like strings o.o	17:33
wgz	heh, yeah, [0] is likely not typical	17:33
wgz	try some other numbers nearer the middle	17:33
AuroraBorealis	2-7 all return KeyError >.>	17:34
wgz	there are 562876 frozenset objects, so pick some bigger indexes	17:35
AuroraBorealis	oh	17:35
AuroraBorealis	lol	17:35
wgz	if lots of them have the same problem, it's likely there's our meliae bug to fix	17:36
AuroraBorealis	yeah	17:36
AuroraBorealis	200,000 does it, 300,000, 400,000	17:36
AuroraBorealis	all keyerror	17:36
wgz	try in the other direction, use .p rather than .c	17:37
wgz	and find out what's holding on to them	17:37
wgz	keep going up with _.p[0].p ..etc as needed	17:38
wgz	(.c is 'children' - the list of object this object references, and .p is 'parents' - the list of objects that reference this object)	17:39
AuroraBorealis	http://paste.ubuntu.com/701839/	17:39
AuroraBorealis	tried some numbers	17:39
AuroraBorealis	oh	17:39
AuroraBorealis	seems to be the same dictionary tho	17:40
wgz	yeah, dict is too generic, go up again till you find a class or something more signpost-y	17:40
wgz	it's all the same dict at least	17:40
wgz	hm... actually, I think I may know where in the code this is	17:40
AuroraBorealis	[bzrlib._known_graph_pyx.KnownGraph(170691824 72B 2refs 1par)]	17:41
AuroraBorealis	they all say that	17:41
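The walk wgz describes (follow parents upward until something more specific than a dict appears) can be sketched on a toy object graph. This does not use meliae; the heap contents, addresses, and helper names are hypothetical, and only the idea of children versus parents is taken from the explanation above.

```python
from collections import defaultdict

# Hypothetical toy heap: address -> (type name, list of child addresses).
heap = {
    1: ("KnownGraph", [2]),
    2: ("dict", [3, 4]),
    3: ("frozenset", []),
    4: ("frozenset", []),
}

def compute_parents(heap):
    """Invert the child links: address -> addresses that reference it."""
    parents = defaultdict(list)
    for addr, (_type, children) in heap.items():
        for child in children:
            parents[child].append(addr)
    return parents

def first_specific_owner(addr, heap, parents, generic=("dict", "list", "tuple")):
    """Walk up first-parent links until reaching a non-generic type name."""
    seen = set()
    current = addr
    while current not in seen:
        seen.add(current)
        owners = parents.get(current, [])
        if not owners:
            return None
        current = owners[0]
        if heap[current][0] not in generic:
            return heap[current][0]
    return None

parents = compute_parents(heap)
owner = first_specific_owner(3, heap, parents)
```

Only the first parent is followed at each step, which mirrors the `_.p[0].p` style of poking around rather than a full search over every referrer.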
wgz	okay, so that's the entire history in memory.	17:45
AuroraBorealis	that seems bad	17:45
wgz	well, fastest way provided it fits, probably	17:45
AuroraBorealis	fast importing the linux kernel does seem like an extreme case	17:46
wgz	AuroraBorealis: get that KnownGraph object and look at its children	17:50
wgz	and also maybe summarize it (with `om.summarize(kg)`)	17:51
AuroraBorealis	i think this is right	17:52
AuroraBorealis	http://paste.ubuntu.com/701847/	17:52
wgz	think there was one too many .c to get kg, but gives the right idea	17:55
AuroraBorealis	yeah without the .c it still shows the same thing	17:56
AuroraBorealis	actually i lied	17:57
wgz	we've lost the giant dictionary with the frozenset objects somehow in that output	17:57
AuroraBorealis	http://paste.ubuntu.com/701848/	17:57
AuroraBorealis	that?	17:57
wgz	there he is.	17:58
wgz	so, that's big, but not huge. however, we seem to have no content for any of those containers, which is apparently where the dump went wrong	17:59
AuroraBorealis	is that something wrong with meliae?	17:59
AuroraBorealis	that it didn't dump everything	17:59
wgz	yup, I'm guessing, will try and repro so it can be fixed.	18:01
AuroraBorealis	ok	18:03
AuroraBorealis	i should be around	18:03
AuroraBorealis	well i have school, and i am probably going to be at school late cause some company is coming and i want to sit in on their talk	18:03
AuroraBorealis	you can email me with stuff to do at markgrandi@gmail.com though :D	18:03
=== beuno-lunch is now known as beuno
=== med_out is now known as med
=== med is now known as medberry
=== medberry is now known as med_out
poolie	jam, hi, i see your mail	23:12
poolie	hi all	23:13
jelmer_	hi poolie	23:22
