Migrating Repositories from Subversion to Git

I have spent the last few days migrating some older Subversion repositories to Git, and thought I’d share my findings.

Tutorials

There are a number of excellent tutorials outlining the git-subversion integration available out of the box in git.  These are what I based my investigation on:

https://www.atlassian.com/git/tutorials/migrating-overview

http://git-scm.com/book/en/v1/Git-and-Other-Systems-Git-and-Subversion

Deployment

This article assumes you have a Subversion repository you want to migrate, hosted on an internal server, and a Git repository hosted on an internal Stash. 

Subversion – http://subversion.company.com/project

Git – git://stash.company.com/scm/project/project.git

High Level Process

At a high level, the process is as follows:

  1. All source code is migrated from SVN into an SVN / Git Staging Area.  Included in the migration are branches, tags, commit messages, commit authors, etc.
  2. Changes from the SVN / Git Staging Area are merged manually into the final Git repository.
  3. The above steps 1 + 2 are repeated on a frequent basis, until the SVN server is decommissioned.

Refer to the diagram below for a pictorial representation of the process advocated:

Preparation

During the process of converting the Subversion history to Git, we will need to identify the user associated with each commit in Subversion.  The Subversion representation of a user is a single string, but in Git each user has a name and an email address.  We need a way of mapping each unique username in Subversion to a name and email address in Git.  The authors file will do that for us, with one line per Subversion user:

jbloggs = Joe Bloggs <jbloggs@company.com>

egoldstein = Emmanuel Goldstein <egoldstein@company.com>

You can create this file manually, or generate a skeleton by extracting every known author from your Subversion repository.  To do this, you will need to check out the repo.  I chose to put my Subversion working copies in /svn:

cd /svn
svn co http://subversion.company.com/project project
cd project
svn log --xml | grep author | sort -u | perl -pe 's/.*>(.*?)<.*/$1 = /' > /tmp/users.txt

Edit /tmp/users.txt using your favourite editor, adding the Git name and email address after the equals sign on each line.  Save the file.  For the rest of this article, I am assuming the file is /tmp/users.txt.
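
Since the authors file is edited by hand, it is worth a quick sanity check before starting a long clone.  The following is only a minimal sketch: check_authors_file is a name I made up (it is not part of git or svn), and it assumes every entry should look like "svnuser = Full Name <user@host>", with no spaces in the Subversion username.

```shell
# Hypothetical helper (not part of git or svn): list the lines of an authors
# file that do not look like "svnuser = Full Name <user@host>".
# Assumption: every entry should carry an email address containing "@".
check_authors_file() {
    # grep -n numbers the lines, -v keeps the ones that do NOT match.
    # The second grep drops blank lines, which show up as "N:".
    if grep -nvE '^[^ =]+ = [^<>]+ <[^<> ]+@[^<> ]+>$' "$1" | grep -v '^[0-9]*:$'
    then
        return 1    # at least one malformed line was printed
    fi
    return 0        # nothing printed: every entry is well formed
}
```

Running check_authors_file /tmp/users.txt prints each offending line with its line number, so an entry that was left as "jbloggs = " is easy to spot.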

Create Staging Git Repository

Next, we need to create the Git repo which will stage the content from SVN, before pushing it to our Stash.  Instead of using git init to create this repo, we will use git svn clone to bring in our SVN repo’s contents.  I chose to put my Git repos in /git:

mkdir /git
cd /git

The Git-Subversion clone operation works best when the Subversion repository follows the standard Subversion layout (that is, trunk/, branches/, and tags/ directories):

git svn clone --stdlayout --prefix=svn/ --authors-file=/tmp/users.txt http://subversion.company.com/project project-staging

By default, git-svn maps the SVN trunk to the Git master branch.  All other SVN branches are created as remote branches in Git with the given prefix (i.e. remotes/svn/…).

Please note that this clone operation can take an extremely long time for large SVN repos, because git-svn fetches the history one revision at a time.

Once the clone has completed, the staging repository holds the contents of the Subversion trunk on master.  The remote branches under the ‘svn’ prefix will have to be made local in a later step.

Update .gitignore based on Subversion ignore settings

Before we make any local commits to the Git repository, we should update the .gitignore file to match the Subversion ignore settings:
 
# Ignore the same files that the SVN repo is ignoring
git svn show-ignore >> .gitignore
git add .gitignore
git commit -m "Migrated ignore configuration from svn." 

Migrate Remote Git Branches to Local

The following shell script will create local versions of the remote branches:

#!/bin/bash
# Create local Git branches for the SVN branches, so that they can be worked with and merged repeatedly later on
# e.g. git branch branch-0.0.2 remotes/svn/branch-0.0.2
for i in $(git branch -r --no-color | grep 'svn/' | grep -v 'svn/tags/' | grep -v 'svn/trunk')
do
    echo "$i"
    # Strip the leading svn/ prefix to get the local branch name
    git branch "${i#svn/}" "remotes/$i"
done
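
If you want to see what the loop is going to do before it touches the repository, the name mapping can be pulled out into a small filter.  This is just a sketch: svn_branch_pairs is a hypothetical function name, and unlike the script above it anchors its patterns, so it only skips a branch named exactly svn/trunk.

```shell
# Hypothetical helper: read remote-tracking names (one per line, as printed by
# `git branch -r --no-color`) on stdin and print "local-name remote-ref" pairs
# for the SVN branches, skipping trunk and tags.
svn_branch_pairs() {
    # Strip the leading whitespace that `git branch -r` adds, then filter.
    sed -e 's/^[[:space:]]*//' |
        grep '^svn/' |
        grep -v -e '^svn/tags/' -e '^svn/trunk$' |
        while read -r remote
        do
            # ${remote#svn/} removes the svn/ prefix to form the local name
            printf '%s remotes/%s\n' "${remote#svn/}" "$remote"
        done
}

# Possible usage inside the staging repository:
#   git branch -r --no-color | svn_branch_pairs |
#       while read -r name ref; do git branch "$name" "$ref"; done
```

Piping git branch -r --no-color into svn_branch_pairs on its own prints the pairs without creating anything, which makes for a cheap dry run.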
 

Migrate Subversion Tags to Git

The following shell script will create Git tags from the Subversion tags:

#!/bin/bash
# Create Git tags for the SVN tags
# e.g. git tag -a -m "Converting tags" tag-0.0.2.1 remotes/svn/tags/tag-0.0.2.1
for i in $(git branch -r --no-color | grep 'svn/tags/' | sed -e 's|svn/tags/||')
do
    echo "$i"
    # $i is already the bare tag name, as the svn/tags/ prefix was stripped above
    git tag -a -m "Converting tags" "$i" "remotes/svn/tags/$i"
done

Push to Stash

Add a remote to the Git staging repo that points to the destination repo in Stash:

git remote add final_git_repo git://stash.company.com/scm/project/project.git

# Push all of our work into that destination
git push final_git_repo --all
# Push all of our tags into that destination
git push final_git_repo --tags

You will now have the SVN content in your Stash repository.

Subsequent Updates to the ‘Destination’ Git Repo

When subsequent changes are made to the Subversion repository, they can be imported into Git using git svn fetch:
git svn fetch
 
git svn fetch will bring in changes to previously defined remote branches, and create new remote branches for any newly created branches in SVN.
If any new remote branches are created, you should create local branches to match them:
git branch <new_branch> remotes/svn/<new_branch>

For each local branch (feature or release) that has received changes from SVN, check it out and rebase it:

git checkout <feature-branch>
git svn rebase

git svn rebase will rebase the local branch onto the changes in its remote SVN branch.  It also does a fetch first, which is redundant here, as we have already fetched from the Subversion repo.  That does make git svn rebase convenient on its own, however, if you only want to pull changes from a single Subversion branch.
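
The redundant fetch can be avoided with the --local option of git svn rebase, which only rebases against what has already been fetched.  As a rough sketch of the whole update, the function below prints (rather than runs) the commands for a list of branches.  plan_svn_update is a hypothetical name, and the branch names you pass in are whatever local branches you created earlier.

```shell
# Hypothetical helper: print the update commands for the branches given as
# arguments. Nothing is executed; the output is a plan that can be reviewed.
plan_svn_update() {
    # One fetch up front brings in every new SVN revision
    echo "git svn fetch"
    for branch in "$@"
    do
        echo "git checkout $branch"
        # --local rebases against the revisions already fetched above
        echo "git svn rebase --local"
    done
}

# Possible usage inside the staging repository:
#   plan_svn_update master branch-0.0.2          # review the plan
#   plan_svn_update master branch-0.0.2 | sh -e  # run it, stopping on error
```

Printing the plan first keeps the review step and the execution step separate, which is handy while the two repositories are still being kept in sync by hand.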

Automating with Jenkins

In my next post, I will go over automating the initial migration as well as subsequent updates using Jenkins.
