Re: git: uh-oh

From: Michael Haggerty <mhagger(at)alum(dot)mit(dot)edu>
To: Max Bowsher <maxb(at)f2s(dot)com>
Cc: Magnus Hagander <magnus(at)hagander(dot)net>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: git: uh-oh
Date: 2010-08-21 06:15:30
Message-ID: 4C6F6F02.5080601@alum.mit.edu
Lists: pgsql-hackers

Max Bowsher wrote:
> On 20/08/10 19:07, Magnus Hagander wrote:
>> On Fri, Aug 20, 2010 at 19:56, Max Bowsher <maxb(at)f2s(dot)com> wrote:
>>> On 20/08/10 18:43, Magnus Hagander wrote:
>>>> On Fri, Aug 20, 2010 at 19:41, Max Bowsher <maxb(at)f2s(dot)com> wrote:
>>>>> On 20/08/10 18:30, Magnus Hagander wrote:
>>>>>> On Fri, Aug 20, 2010 at 19:28, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>>>>>>> Max Bowsher <maxb(at)f2s(dot)com> writes:
>>>>>>>> The history that cvs2svn is aiming to represent here is this:
>>>>>>>> 1) At the time of creation of the REL8_4_STABLE branch, plperl_opmask.pl
>>>>>>>> did *not* exist.
>>>>>>>> 2) Later, it was added to trunk.
>>>>>>>> 3) Then, someone retroactively added the branch tag to the file, marking
>>>>>>>> it as included in the REL8_4_STABLE branch. [This corresponds to the git
>>>>>>>> changeset that Robert is questioning]
>>>>>>> Uh, no. We have never "retroactively added" anything to any branch.
>>>>>>> I don't know enough about the innards of CVS to know what its internal
>>>>>>> representation of this sort of thing is, but all that actually happened
>>>>>>> here was a "cvs add" and a "cvs commit" in REL8_4_STABLE long after the
>>>>>>> branch occurred. We would like the git history to look like that too.
>>>>>> Yeah.
>>>>>>
>>>>>> In fact, is the only thing that's wrong here the commit message?
>>>>>> Because it's probably trivial to just patch that away. Hmm, but I
>>>>>> guess we'd like to have the actual commit message and not just another
>>>>>> fixed one.
>>>>> There is no "actual commit message" - the entire changeset is
>>>>> synthesized by cvs2git to represent the addition of a branch tag to the
>>>>> file - i.e. the logical equivalent of a "cvs tag -b", which has no
>>>>> commit message.
>>>> There is a commit message on the trunk when the file was added there.
>>>> Is there any chance of being able to lift that message off trunk and
>>>> just copy it into the branch, instead of saying "this is a cherry-pick
>>>> of this commit over here"?
>>> It wouldn't be accurate to do so, because the synthetic commit is not
>>> copying the entire change, only registering the addition of (in this
>>> case) one file which happens to be part of the changeset. Please note
>>> that there is a changeset on the branch immediately following the
>>> synthetic one under discussion which contains the 'real' commit message
>>> used when committing to the branch.
>> Hmm. Good point.
>>
>> I wonder if we should try to locate these places and fix them with git
>> filter-branch or rebase -i after the fact, with history rewriting.
>>
>> There seem to be 57 of them.
>
> It sounds cumbersome.
>
> Michael Haggerty might be better placed than me to assess whether
> eliding them within cvs2git is practically achievable.

I think this would be nontrivial.

It is (relatively) easy to tweak a file's history during
FilterSymbolsPass, which is the last time during the conversion when the
file's whole history is in memory at once. But you don't want to omit
all connections between file-on-branch and parent branch; you only want
to omit the information if the branching of the particular file cannot
be included with the first commit that creates the branch.
Unfortunately, determination of commits requires *global* information
and is done *after* FilterSymbolsPass.

The elision of the file branching event could conceivably be done at the
point when it would otherwise be output to the dumpfile, but its elision
would affect how the first change to the file on the branch had to be
treated, so information would have to be kept around.
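
To make the bookkeeping concrete, here is a toy model in Python. It is only
a sketch and not cvs2git code: the Event type, the event kinds
("branch_file", "modify", "add") and the `late` set are all hypothetical
names invented for illustration. It assumes that some earlier step has
already decided which file branchings came too late to be folded into the
commit that creates the branch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    kind: str    # "branch_file" (synthetic), "modify" or "add" (hypothetical kinds)
    path: str
    branch: str


def elide_late_branchings(events, late):
    """Drop the synthetic branch_file events whose (path, branch) is in `late`,
    and turn the first later change to that file on that branch into an add."""
    pending = set()  # elided branchings still waiting for their first change
    out = []
    for ev in events:
        key = (ev.path, ev.branch)
        if ev.kind == "branch_file" and key in late:
            pending.add(key)  # remember the elision instead of emitting it
        elif ev.kind == "modify" and key in pending:
            pending.discard(key)
            out.append(Event("add", ev.path, ev.branch))
        else:
            out.append(ev)
    return out
```

The `pending` set is the "information that would have to be kept around":
without it, the first change on the branch would be written as a
modification of a file that, as far as the output is concerned, does not
exist there yet.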

Moreover, this is a pretty specialized request that would be useless to
people who are not so disciplined about their repository as you seem to be.

It seems like you already have a way to find these events in the git
repository after conversion, so I think it would be more practical to
use git-filter-branch to remove the unwanted commits *after* the conversion.
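
As a starting point for that cleanup, the candidate commits could be listed
with a few lines of Python. This is a sketch under stated assumptions: the
marker string is what I would expect the synthesized commit messages to
begin with, but it should be checked against the messages in the converted
repository, and find_synthetic_commits is a hypothetical helper, not part
of git or cvs2git.

```python
# Assumed marker text; verify it against your own converted repository.
MARKER = "This commit was manufactured by cvs2svn"


def find_synthetic_commits(log_text, marker=MARKER):
    """Return the hashes of commits whose subject line starts with `marker`.

    `log_text` holds one "<hash><TAB><subject>" pair per line, for example
    the output of: git log --all --format=%H%x09%s
    """
    hits = []
    for line in log_text.splitlines():
        sha, sep, subject = line.partition("\t")
        if sep and subject.startswith(marker):
            hits.append(sha)
    return hits
```

The number of hashes returned can be compared with the count mentioned
earlier in the thread before anything is rewritten. The resulting list
could then drive a git filter-branch --commit-filter that calls
skip_commit for those hashes and git commit-tree for all others.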

Michael
