=== lifeless [n=robertc@ppp245-86.static.internode.on.net] has joined #launchpad-meeting
=== ddaa [n=ddaa@82.109.136.116] has joined #launchpad-meeting
=== jelmer [n=jelmer@a62-251-123-16.adsl.xs4all.nl] has joined #launchpad-meeting
=== ddaa [n=ddaa@82.109.136.116] has joined #launchpad-meeting
[05:10] hello there
[05:10] hi
[05:10] Is right now ok for you for the meeting? We can postpone it to this evening if you like. I don't mind
[05:12] yes, just need to clarify the issues with SteveA first
[05:13] sorry for the delay
[05:13] no worries, just let me know when you're ready
[05:17] let's do it
=== jamesh [n=james@82.109.136.116] has joined #launchpad-meeting
[05:18] jelmer: last time we talked, I gave you a short summary of the issue
[05:18] jelmer: I would like you to explain it back to me, to make sure we are in sync
[05:18] ddaa: Sure
[05:19] As I understand it, you're looking into improving importd so it can recognize the special properties set on Subversion branches using bzr-svn.
[05:20] Is that correct?
[05:20] that's probably the shortest way of framing the issue
[05:20] jelmer: my understanding is that the main feature we are after is for people working off branches we publish as vcs-imports on launchpad.net to be able to land changes to SVN
[05:21] jamesh: Ok, that's quite a bit different
[05:21] jelmer: the question is whether to do it by making your plugin handle our current vcs-imports code, or whether to change vcs-imports to be compatible with your code
[05:21] jamesh: I assume invalidating all the existing vcs-imports is not an option?
[05:22] for example, by changing the revision id generation algorithm?
[05:22] we are discussing that.
[05:22] If you haven't seen it yet, there's a short summary of the SVN properties bzr-svn sets on http://samba.org/~jelmer/bzr/mapping.txt
[05:23] jelmer: I'd rather we not use your svn import system
[05:23] it has nice properties, in that it does not require a central authority to work
[05:24] but it also has issues with referential integrity in the face of changes in the conversion code
[05:24] ddaa: I'm not suggesting replacing importd by bzr-svn, but those are some of the properties importd will have to set
[05:24] i.e. if you find a problem with your revision and file ID mapping and need to change it
[05:24] if you want to allow people to take an importd-generated branch and push it to subversion
[05:25] let me have a quick look, so I do not talk _entirely_ out of my ass
[05:26] jamesh: There's a version number that's part of the generated revision id that I'll increment whenever I change the mapping code; that hasn't been necessary so far.
[05:27] jamesh: The revision ids are used for informational purposes by bzr-svn, they're not just random identifiers.
[05:27] okay
[05:28] AIUI there are two important processes in what your plugin does
[05:29] the first one is the "commit to svn" process. Its inputs are: 1. a bzr branch 2. a svn url. Its output is 1. a svn repo with a commit in it that includes some magical metadata.
[05:32] the second one is the "merge with svn" process. Its inputs are: 1. a bzr branch that was initially imported from svn, and may have a number of native bzr commits after the last imported revision 2. a svn url to a repository which includes a revision created by "commit to svn", and which may include some of the native commits on the bzr branch. Its output is a bzr working tree which is merged in a history sensitive way with the svn repo.
[05:35] Yeah, I guess dividing it that way makes the most sense from a launchpad POV
[05:37] excuse me, the guys are chatting around
[05:37] I'll move to some quieter place
[05:38] ok
[05:38] back
[05:38] what we would like to have in the end is the ability to:
[05:39] 1. commit to svn from a bzr branch imported from svn, but with non-deterministic ids
[05:40] to do that meaningfully, we might have to alter our import tool to provide the commit tool with some special metadata it needs to do its job
[05:41] a highly desirable property here, is that it should not be a requirement for _all_ the revisions in the history of the bzr branch to have this special metadata
[05:42] but that's only a secondary requirement
[05:42] we can throw away all the existing svn imports, as a one-off event
[05:43] Ok, that makes sense.
[05:43] So, in order to be able to do a commit, given a certain SVN branch and a Bazaar branch derived from that SVN branch, the bzr-svn plugin needs:
[05:46] * A common revision that contains a branch path, revision number, uuid and file ids that can be mapped back to paths in the Subversion repository
[05:47] At the moment, the branch path, revision number and uuid are simply "stored" in the revision id, but I guess you could store them somewhere in Bazaar revision properties
[05:47] 2. import svn into a bzr branch, that uses non-deterministic ids, sets the parent_ids in revisions that were committed using the bzr-svn plugin, and preserves the file-ids for files that were introduced on the svn repo using the bzr-svn plugin.
[05:48] ddaa: I doubt 2 is very well feasible
[05:49] jelmer: okay, so we could start storing that information into, say, a "svn-revision" revision property, and once a launchpad import has a single revision with this property, it could be used to branch off and commit back to svn?
[05:50] ddaa: Yes, but if the revision ids are non-deterministic, there are a couple of weird problems that'll show up
[05:50] I'm all electronic ears.
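The two storage options mentioned above (Subversion coordinates packed into the revision id, or kept in a "svn-revision" revision property) can be sketched roughly as follows. This is only an illustration under stated assumptions: the `svn-v1:` string layout, the property value format, and every function name here are hypothetical and are not taken from bzr-svn or importd.

```python
from dataclasses import dataclass
from urllib.parse import quote, unquote

# Hypothetical mapping version; the log only says that a version number
# is part of the generated revision id.
MAPPING_VERSION = 1


@dataclass(frozen=True)
class SvnCoordinates:
    """Where a revision lives in Subversion: repository uuid, branch path, revnum."""
    uuid: str
    branch_path: str
    revnum: int


def encode_revision_id(coords, version=MAPPING_VERSION):
    """Build a deterministic revision id from the SVN coordinates (assumed layout)."""
    path = quote(coords.branch_path, safe="")  # escapes "/" and ":" in the path
    return f"svn-v{version}:{coords.revnum}:{coords.uuid}:{path}"


def decode_revision_id(revision_id):
    """Recover (mapping version, coordinates) from an id built by encode_revision_id."""
    prefix, revnum, uuid, path = revision_id.split(":", 3)
    if not prefix.startswith("svn-v"):
        raise ValueError(f"not a deterministic svn revision id: {revision_id!r}")
    version = int(prefix[len("svn-v"):])
    return version, SvnCoordinates(uuid, unquote(path), int(revnum))


def to_revision_property(coords):
    """Alternative: keep a random revision id and store the coordinates in a property."""
    value = f"{coords.revnum}:{coords.uuid}:{quote(coords.branch_path, safe='')}"
    return {"svn-revision": value}


def from_revision_property(properties):
    """Read the coordinates back from the (assumed) "svn-revision" property."""
    revnum, uuid, path = properties["svn-revision"].split(":", 2)
    return SvnCoordinates(uuid, unquote(path), int(revnum))
```

With the first option two independent imports of the same Subversion revision agree on the id; with the second the id can stay random while a commit tool still finds the coordinates it needs.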
[05:51] mostly, if you have two bazaar branches created from different svn branches in the same repository
[05:51] yes?
[05:52] then, you merge branches/foo into trunk using bzr-svn
[05:52] later on, you can still merge whatever branch was created by importd for branches/foo
[05:53] I sense trouble here, but can you spell out what the problem will be?
[05:54] from the user perspective, what is the failure mode you are concerned with?
[05:56] because that revision (the same revision) has different revision ids
[05:57] the fact that you will get the same history twice
[05:57] but with different revision ids
[05:57] I'm not sure how much of a problem that is for you
[05:58] it's not a very big problem, it's just not very nice
[05:58] does svn actually _track_ merges?
[05:59] well, bzr-svn does
[05:59] or at least, the revision ids of the merged revisions are stored, the revisions themselves aren't
[05:59] Subversion 1.5 will actually track merges, and as it looks now, bzr-svn should be easily able to support that
[06:01] jelmer: in the end, the problem you just mentioned only happens because there are two competing mappings from svn to bzr revision ids: the deterministic ones you produce, and the random ones importd produces.
[06:01] ddaa: Sure, so I guess as long as you don't mix the two, there isn't much of a problem.
[06:02] ddaa: I guess checkouts don't work on importd branches?
[06:02] jelmer: I do not understand that question about checkouts.
[06:03] ddaa: You can't have a branch bound to a vcs-import, right?
[06:03] jelmer: well, checkouts fail for read-only bzr branches right now, independent of who made them
[06:03] what jamesh said
[06:03] ah. You mean so that commits go directly to svn?
[06:04] jamesh: Yes. There'll be no need for bzr-svn to support direct commits to svn for importd-branches
[06:04] only push
[06:04] If I understand the system correctly
[06:05] that sounds like it, except push is a bit too limited
[06:06] that's pretty much -- just get the delta committed to svn with the metadata needed to recognise the merge
[06:06] ok, that should be doable if importd sets the right metadata in the revision properties
[06:07] I just realised something that your approach gives that importd cannot readily give
[06:07] what is it?
[06:08] Using your system, to merge a bzr branch to svn, you do a checkout of svn, merge your external branch there, and commit.
[06:08] you have the opportunity to resolve any conflict at the merge point
[06:09] And once you have committed, you immediately get the import branch that includes your merge.
[06:09] Putting importd in the loop to assign non-deterministic yet "canonical" revision ids forces the workflow to become:
[06:11] 1. do an atomic "merge and push" operation to the svn repository, that only accepts clean merges 2. wait for importd to come around updating its branch to get the revision you just committed
[06:11] jelmer: is that correct, or do you see a way to avoid this loss in agility?
[06:12] oh! yes, there would be a way to avoid it!
[06:13] When doing the atomic "merge and push" thing, stick the "canonical bzr revision id" in a property of the svn revision.
[06:13] ddaa: well, you'll always commit locally and push to svn - never commit directly to SVN, unless we add a 'commit-svn' command or something
[06:14] right, that used to be the original idea for push-as-merged. Ideally revision id aliases would be used for that
[06:14] let's stay away from revision id aliases for now
[06:15] if abentley says "it's confusing the heck out of me", that means it's definitely too complicated for mere mortals to understand
[06:15] ddaa: I don't think that was about revision id aliases - probably about file id aliases
[06:16] ddaa: anyhow, how would you know what revision id importd is going to assign?
[06:16] in that case, the revision id to use by importd would be encoded explicitly in the svn revision.
[06:17] though that makes it deterministic
[06:18] yeah you are right
[06:18] wrong track
[06:19] I was on crack
[06:19] Are you opposed strongly to deterministic revision ids? Or mainly to breaking the existing revision ids?
[06:20] one minute
[06:26] Okay...
[06:26] I think your deterministic revision id approach is necessary to get something that people will want to use.
[06:27] the push-then-wait-for-launchpad is, IMO, not something people will realistically want to use
[06:28] however, to use those deterministic revision ids, we need additional guarantees
[06:29] the way it currently works, we have a potentially open-ended number of namespace transitions that may affect all svn revisions
[06:29] that's not acceptable in that it makes users of svn branches pay for svn features they do not use
[06:30] for example, consider svn symlinks
[06:30] so, to summarize: you don't want a bug in the mapping code to force all vcs-import users to ditch their existing branch copies?
[06:31] yes
[06:31] fixing the bug should only bump the revision id namespace of revisions whose ancestry contains at least one revision affected by the bugfix
[06:32] note that this is not limited to bugs, it also matters for new features
[06:32] like nested trees and stuff like that
[06:32] or file properties
[06:32] right
[06:33] can you deliver functionality in your plugin to provide that logic right away?
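The rule stated at [06:31], that a fix should only bump the namespace of revisions whose ancestry contains an affected revision, amounts to propagating a flag down the revision graph. A minimal sketch, assuming the graph is available as a plain dict mapping each revision to its parents and that the set of directly affected revisions is already known; the function name and data shapes are hypothetical and are not bzr or bzr-svn API.

```python
def revisions_needing_bump(parents, directly_affected):
    """Return every revision that is affected itself or has an affected ancestor.

    parents: dict mapping revision -> list of parent revisions (a DAG).
    directly_affected: set of revisions whose conversion changed.
    """
    tainted = {}  # revision -> bool, filled in parents-before-children order
    for start in parents:
        stack = [start]
        while stack:
            rev = stack[-1]
            if rev in tainted:
                stack.pop()
                continue
            rev_parents = parents.get(rev, ())
            pending = [p for p in rev_parents if p not in tainted]
            if pending:
                # Resolve the parents first, then come back to this revision.
                stack.extend(pending)
                continue
            tainted[rev] = rev in directly_affected or any(
                tainted[p] for p in rev_parents
            )
            stack.pop()
    return {rev for rev, flag in tainted.items() if flag}


# Example: trunk r1..r4, with branch revision b1 merged into r4.
history = {"r1": [], "r2": ["r1"], "r3": ["r2"], "b1": ["r1"], "r4": ["r3", "b1"]}
# If only b1 used the feature touched by the fix (say, an svn symlink),
# r1..r3 keep their ids and only b1 and r4 move to the new namespace.
to_bump = revisions_needing_bump(history, {"b1"})
```

The sketch sidesteps the hard part raised in the discussion: deciding which revisions are directly affected, which needs knowledge of what each version of the conversion code did.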
[06:35] The easy way would be to just iterate over the branch and compare all the existing revisions with the ones on the SVN branch
[06:36] (where the ones on the SVN branch are created using code with the bug fixed)
[06:36] but I guess it makes sense to provide an upgrade mechanism, much like the one for data formats in bzr at the moment
[06:36] jelmer: I think there's more to it
[06:36] consider the independent import
[06:37] independent import using bzr-svn, you mean?
[06:37] yes
[06:38] in that case, you need to determine how many namespace bumps each revision needs to undergo in order to be compatible with other imports
[06:38] hmm
[06:39] back in a minute
[06:39] I'm starting to think that, maybe, it would be simpler (as in more robust) to actually go the git way, and use content-addressed revision ids.
[06:39] jelmer: cool, I need some nicotine
[06:47] ddaa: that's not really an option for bzr-svn as it requires you to fetch the full svn revision just to figure out what the revision id should be
[06:48] that fixes my main problem at least
[06:48] when committing to svn, you have the full revision handy, so you can readily hash it
[06:49] and importd is very good at sucking the data out of svn servers
[06:49] ddaa: do you hash on the data you preserve, or all the data SVN stores?
[06:49] the hash would have to be computed on what the bzr revision looks like, after being transformed by bzr-svn
[06:50] so if later versions of bzr-svn preserve more data you'd get different hashes?
[06:51] the alternative is to guarantee that for all revisions, we can reliably count the number of required namespace version bumps, using any bzr-svn version ever released
[06:51] which seems too strong a promise to make
[06:51] jamesh: yes
[06:52] ddaa: Keeping a hash is probably a good idea - I'm not sure whether it should be part of the revision id though
[06:52] can't we just rely on the revision sha1?
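The content-addressed idea floated at [06:39] can be sketched as a hash over the converted revision. This is an illustration only: the dict layout of a "converted revision", the `svn-hash:` prefix, and the function name are assumptions, and a real id would hash whatever bzr-svn actually produces after transformation.

```python
import hashlib
import json


def content_revision_id(converted_revision):
    """Derive a revision id from the content of an already converted revision.

    converted_revision: a JSON-serialisable dict standing in for "what the
    bzr revision looks like after being transformed" (hypothetical layout).
    """
    # Canonical serialisation so that key order does not change the hash.
    canonical = json.dumps(
        converted_revision, sort_keys=True, separators=(",", ":")
    ).encode("utf-8")
    return "svn-hash:" + hashlib.sha1(canonical).hexdigest()


old_conversion = {
    "parents": ["svn-hash:parent"],
    "message": "Fix typo",
    "files": {"README": "sha1-of-readme"},
}
# A later converter that also preserves file properties yields a different id
# for this revision, while revisions without such properties keep theirs.
new_conversion = dict(old_conversion, file_properties={"README": {"svn:eol-style": "native"}})

same_id = content_revision_id(old_conversion) == content_revision_id(dict(old_conversion))
changed_id = content_revision_id(old_conversion) != content_revision_id(new_conversion)
```

This shows both sides of the argument: ids change exactly when the converted content changes, but computing one requires the full revision content, which is the cost objected to for remote operations such as `bzr log`.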
[06:53] unless you can come up with a good way to detect and recover from loss of referential integrity of revision id, I think we need to put the hash in the revision id.
[06:53] I simply can't put the hash in the revision id for bzr-svn, it's way too expensive
[06:54] jelmer: what if launchpad provides you readily imported branches, and handles re-importing them now and then?
[06:55] you do need to suck in that data anyway when pulling or branching from a svn repo, don't you?
[06:55] ddaa: big loss of functionality - bzr-svn works stand-alone right now, and svn commits appear immediately in the bzr-svn plugin
[06:55] ddaa: Yes, but if I just run 'bzr log' on a remote branch, that's only a couple of roundtrips now, even on a very big repository
[06:56] ddaa: if you add a hash in the revision id, Branch.revision_history() will slow down big time
[06:57] jelmer: can you propose a different way to reliably handle the requirement of limiting revision-id changes to those revisions that are affected by changes in the conversion logic?
[06:57] So far I see only two options:
[06:57] ddaa: I hope so.. that would be the ideal solution, so I'm trying to think of something
[06:58] 1. merge-and-push then wait for importd
[06:58] 2. put a hash in the revision id
[06:58] none of them are ideal, so I'm certainly open to better ideas
[06:58] where (1) would be with a non-deterministic revision id, (2) with a deterministic revision id?
[06:58] jelmer: yes
[07:00] I think log-directly-from-svn could be handled using a special revision id namespace
[07:00] because then the actual contents of the revision do not really matter
[07:00] ddaa: but how are you going to detect you're in a log-directly-from-svn?
[07:01] ddaa: log isn't really special, it's just a bunch of get_revision() calls
[07:02] maybe have "bzr log svn://something" be interpreted as "bzr log svn+logonly://something"
[07:02] that's just plain ugly :-)
[07:02] yes
[07:02] What I'm looking for here, is something that would make the sabdfl happy.
[07:04] To slightly rephrase what he told me: "we are not paying for the svn plugin because it's cool, but because we want some feature working"
[07:04] so, working trumps pretty
[07:04] Well, bzr-svn works very well without importd... (-:
[07:05] (1) and (2) both sort of defeat the purpose of bzr-svn, as (1) doesn't allow mixing of merges from bzr-svn and importd and (2) slows down bzr-svn too much
[07:05] The reason I'm talking to you here, is that I do not think that in its current form it can work reliably enough for us.
[07:05] but maybe some less anal launchpad folks will disagree with me.
[07:07] where reliability means, mainly, preserving referential integrity of revision ids and avoiding breaking imports that are not affected by changes in the import logic
[07:09] Finally, I do not personally believe that having launchpad assign "canonical" bzr ids to imports from other vcs increases the value of launchpad in a useful way, but some people in the company seem not to share that conviction yet.
[07:09] Why exactly can't it work reliably? I can see the issue with mapping changes and I am convinced that should be fixed, but I'd rather not cripple bzr-svn for it.
[07:10] Because you have not told me yet how to fix that issue in another way.
[07:10] I'm totally open to suggestions.
[07:11] If you need time to think about it, that's fine with me.
[07:11] jamesh: anything you would like to add?
[07:12] Well, one of the original ideas behind bzr-svn (as it read on the wiki) was that it worked without depending on Launchpad. Another option would be to simply extract the commit code of bzr-svn and create a launchpad-svn-commit plugin without complicating the bzr-svn code.
[07:12] Mh, which wiki?
[07:13] launchpad - page BzrRoundTrippingSvn, though I can no longer access that
[07:13] ddaa: I'm still trying to think of a way to properly detect problems.
[07:14] ddaa: my feeling is that jelmer's system is the right general direction, but I agree that we need to address the future in the bzr <-> svn mapping spec
[07:15] jelmer: the idea of the launchpad-svn-commit plugin is one of the options I see now. But it's far from ideal.
[07:15] in that this functionality will just suck
[07:16] ddaa: If we can figure out some way to deal with upgrades but still have deterministic revision ids, that'd be by far the best solution imo.
[07:16] jelmer: I agree with jamesh that your design goes in the right general direction
[07:17] jelmer: how can we help in the figuring out?
[07:18] ddaa: I'm not sure - I'm currently trying to generalize the problem a bit.
[07:21] I mean, how can we help in terms of gathering resources: like attention from core bzr devels working at canonical.
[07:22] ddaa: Ah, in that sense. I think it would be a good idea to discuss this with Martin, Robert or John.
[07:22] I understand that you need to sit back and think about the issues, but when I report to the company, I want to be able to give a date when I will have some more information.
[07:24] ddaa: Sure, I understand. Perhaps it's a good idea to have another discussion on, say, monday?
[07:24] if you expect to have some progress on that issue monday, great
[07:24] btw, I hereby invite you to the launchpad-bazaar meeting
[07:25] Which is held at 1000UTC on monday in that very channel
[07:26] Cool, I'll be there.
[07:26] With SteveA, mpool, lifeless (usually), jamesh, spiv and myself.
[07:27] Ok, I'm off to dinner. See you on monday.
[07:27] just for clarification, this meeting is intended to be a short discussion to assign tasks and resources
[07:27] design discussion should take place outside of the meeting
[07:27] ok
[07:27] thank you for your time
[07:27] have a nice dinner