Edit Git commit messages including root commit

From time to time I need to clean up a Git repository before making it accessible to others. One step is to replace vague commit messages and add version tags.

Caution: Do not make the following modifications if the repository is already shared with others, because they rewrite its history.

Most of the job can be done with the interactive rebase command, but if you also want to clean up the root commit message, you have to invest a bit more work. This is an adapted solution from a thread found on Stack Overflow.

First fetch the SHA-1 checksum of the root commit:

git rev-list --max-parents=0 HEAD

Check out that commit and ignore the warning about the detached HEAD:

git checkout <root-sha1>

Now you can use the amend feature of the git commit command to fix the root commit, which from this point of view is the most recent one:

git commit --amend

Now you can replay the remaining commits on top of the modified root commit. Making the rebase interactive lets you either combine multiple commits (squash) or just edit their commit messages (reword):

git rebase -i --onto HEAD HEAD master
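
To make the sequence easier to follow, here is a minimal sketch that runs the same steps on a throwaway repository. The temporary directory, the file names, and the commit messages are made up for illustration, and it uses -m and a non-interactive rebase so that no editor opens.

```shell
#!/bin/sh
# Sketch: reword the root commit of a throwaway repository (all names are made up).
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the branch is called master
git config user.name "Demo User"
git config user.email "demo@example.com"

# Three commits with unspecific messages.
for n in 1 2 3; do
    echo "$n" > "file$n.txt"
    git add "file$n.txt"
    git commit -q -m "wip $n"
done

# The steps from the text, with -m instead of an editor and without -i.
root=$(git rev-list --max-parents=0 HEAD)
git checkout -q "$root"
git commit -q --amend -m "Initial import"
git rebase -q --onto HEAD HEAD master

# Show the rewritten history, oldest commit first.
git log --reverse --format=%s
```

Newer Git versions also accept git rebase -i --root, which as far as I know lets you reword the root commit directly in the interactive list.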

How to extract a subdirectory from a Git repository while keeping its history

This is more or less to document a procedure I needed a few days ago: extracting a subdirectory from a Git repository into a separate (new) repository. The nice thing is that the whole history of the respective directory is kept.

First I cloned the existing repository and removed the link to its origin.

git clone SOURCE.git
mv SOURCE NEW-REPO
cd NEW-REPO
git remote rm origin

The next step is the actual extraction. We remove everything at the top level except the desired directory throughout the whole history. This also works with more than one directory; here DIR1 and DIR2 are kept:

git filter-branch --tree-filter 'ls -1 | grep -v -x -e DIR1 -e DIR2 | xargs -I{} rm -rf {}' --prune-empty -f HEAD
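
To see what the filter does, the following sketch builds a small throwaway repository and runs the extraction on it. The directory OTHER, the file names, and the commit messages are invented for this example. It uses a variant of the filter with exact name matching (grep -x) and xargs -I{}, and it sets FILTER_BRANCH_SQUELCH_WARNING, which to my knowledge only suppresses the warning pause of newer Git versions.

```shell
#!/bin/sh
# Sketch: keep only DIR1 and DIR2 in a throwaway repository (names are placeholders).
set -e
export FILTER_BRANCH_SQUELCH_WARNING=1   # skip the warning pause of newer Git versions

demo=$(mktemp -d)
cd "$demo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

mkdir DIR1 DIR2 OTHER
echo a > DIR1/a.txt
echo x > OTHER/x.txt
git add . && git commit -q -m "Add DIR1 and OTHER"
echo b > DIR2/b.txt
git add . && git commit -q -m "Add DIR2"
echo y >> OTHER/x.txt
git add . && git commit -q -m "Touch OTHER only"

# Delete every top-level entry except DIR1 and DIR2 in each commit.
git filter-branch --tree-filter \
    'ls -1 | grep -v -x -e DIR1 -e DIR2 | xargs -I{} rm -rf {}' \
    --prune-empty -f HEAD >/dev/null

git log --reverse --format=%s   # the "Touch OTHER only" commit should be gone
ls                              # only DIR1 and DIR2 should remain
```

Note that ls -1 does not list hidden files, so top-level dotfiles such as .gitignore survive this filter.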

Now the repository should only contain the two directories. But the removed files are still stored in the object database. We have to manually let the references to them expire and then prune them.

git reflog expire --expire=now --all
git gc --aggressive --prune=now
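
As far as I know, git filter-branch also keeps a backup of the rewritten refs under refs/original/, and as long as those exist the old objects stay reachable and are not pruned. The following sketch deletes these backups before expiring the reflog. The function name drop_filter_branch_backups and its argument are made up for this example.

```shell
#!/bin/sh
# Sketch: delete the refs/original/ backups left behind by filter-branch,
# then expire the reflog and prune. The function name is hypothetical.
drop_filter_branch_backups() {
    repo_dir=$1
    # Remove every backup ref below refs/original/.
    git -C "$repo_dir" for-each-ref --format='%(refname)' refs/original/ |
        while read -r ref; do
            git -C "$repo_dir" update-ref -d "$ref"
        done
    # Same clean-up as in the text, applied to the given repository.
    git -C "$repo_dir" reflog expire --expire=now --all
    git -C "$repo_dir" gc --quiet --aggressive --prune=now
}
```

It would be called as drop_filter_branch_backups NEW-REPO. Only do this once you are sure the rewritten history is what you want, since the backups are the easy way back.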

Finally I did some cleaning, but this is optional:

git repack -a -d -l
git clean -f -d

Finally we can, for example, create a new (bare) repository from NEW-REPO and use it from then on. I recommend just deleting the extracted directory in the source repository while keeping its history there, just in case.
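
One possible way to create the bare repository is a bare clone. This is only a sketch; the function name make_bare_copy and both paths are placeholders.

```shell
#!/bin/sh
# Sketch: create a bare copy of a working repository that others can clone from.
# The function name and both arguments are hypothetical.
make_bare_copy() {
    src=$1    # existing repository, e.g. NEW-REPO
    dest=$2   # target path of the bare repository, e.g. NEW-REPO.git
    git clone -q --bare "$src" "$dest"
}
```

It would be called as make_bare_copy NEW-REPO NEW-REPO.git.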

The steps are similar if you instead need to remove a directory including its history. The only difference is the actual removal command:

git filter-branch --tree-filter 'rm -rf DIRECTORY' --prune-empty -f HEAD