This post mostly documents a procedure I needed a few days ago: extracting a sub-directory from a Git repository into a separate, new repository. The nice thing is that the whole history of that directory is kept.
First I cloned the existing repository and removed the link to its origin.
git clone SOURCE.git
mv SOURCE NEW-REPO
cd NEW-REPO
git remote rm origin
The next step is the actual extraction. We remove every top-level entry except the desired directory throughout the whole history. This also works with more than one directory, here for example DIR1 and DIR2:
git filter-branch --tree-filter 'ls -1 | grep -v -x -e DIR1 -e DIR2 | xargs -I {} rm -rf {}' --prune-empty -f HEAD
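To get a feeling for what the tree filter does to each checked-out commit, the same pipeline can be run once by hand in a throwaway directory. This is only a sketch: the scratch layout (DIR1, DIR2, other, README) is made up for illustration, and it uses grep -x so that only exact names match and xargs -I {}, which GNU xargs documents as the replacement for the deprecated -i.

```shell
#!/bin/sh
# Hypothetical scratch layout; none of these names come from a real repository.
scratch=$(mktemp -d)
mkdir "$scratch/DIR1" "$scratch/DIR2" "$scratch/other"
touch "$scratch/README" "$scratch/DIR1/a.txt"

# The same pipeline as in the tree filter, run once by hand:
# list top-level entries, drop DIR1 and DIR2 from the list, delete the rest.
(
    cd "$scratch" || exit 1
    ls -1 | grep -v -x -e DIR1 -e DIR2 | xargs -I {} rm -rf {}
)

# Show what is left at the top level.
ls -1 "$scratch"
```

Note that plain ls skips dot files, so a top-level .gitignore would survive this filter.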
Now the repository should contain only the two directories. The removed files are still stored as objects in the repository, though. We have to expire them manually and remove them.
git reflog expire --expire=now --all
git gc --aggressive --prune=now
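One caveat, as far as I understand the filter-branch documentation: the rewrite keeps backup refs under refs/original/, and as long as those exist the old commits stay reachable, so the garbage collection cannot drop them. A minimal sketch for deleting those backup refs first could look like this. The helper name drop_backup_refs is made up here, and the step is an assumption about what is needed, not part of the procedure as I originally ran it.

```shell
# drop_backup_refs is a hypothetical helper name.
drop_backup_refs() {
    # $1: path to the filtered repository
    git -C "$1" for-each-ref --format='%(refname)' refs/original/ |
    while read -r ref; do
        # Delete each backup ref so the old history becomes unreachable.
        git -C "$1" update-ref -d "$ref"
    done
}

# Usage, from inside NEW-REPO, before the expire and gc commands:
#   drop_backup_refs .
#   git count-objects -vH   # compare size-pack before and after git gc
```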
Finally I did some cleaning up, but this is optional:
git repack -a -d -l
git clean -f -d
Now we can, for example, create a new (bare) repository from our NEW-REPO and use that from here on. I recommend simply removing the extracted directory from the source repository and keeping its history there, just in case.
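Creating the bare repository can be done with a bare clone. The following is just a sketch. The helper name make_bare and the target path NEW-REPO.git are placeholders I chose for illustration.

```shell
# make_bare is a hypothetical helper name; both arguments are placeholders.
make_bare() {
    # $1: filtered working repository, $2: path of the new bare repository
    git clone --bare "$1" "$2"
}

# Usage, from the directory that contains NEW-REPO:
#   make_bare NEW-REPO NEW-REPO.git
```

The bare copy can then be moved to wherever the other repositories are hosted and cloned from there.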
The steps are similar if you instead need to remove a directory including its history. The only difference is the actual removal command:
git filter-branch --tree-filter 'ls -1 | grep -x -e DIRECTORY | xargs -I {} rm -rf {}' --prune-empty -f HEAD
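For the removal case there is also a variant that works on the index instead of a checked-out tree, which is a different technique from the tree filter above and is usually faster, because no files have to be written to disk for each commit. A minimal sketch, assuming the directory sits at the top level of the repository; the helper name purge_dir is made up:

```shell
# purge_dir is a hypothetical helper name.
purge_dir() {
    # $1: repository path, $2: directory to erase from the history of HEAD
    (
        cd "$1" || exit 1
        # --cached touches only the index; --ignore-unmatch keeps the filter
        # from failing on commits that do not contain the directory.
        git filter-branch --index-filter \
            "git rm -r -q --cached --ignore-unmatch -- '$2'" \
            --prune-empty -f HEAD
    )
}

# Usage: purge_dir NEW-REPO DIRECTORY
```

The clean-up steps afterwards stay the same as for the extraction.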