From SVN to Git


Notes and comments on moving to git (see thread).

General Migration Strategy

Migrating from Subversion to git isn't as straightforward as it would seem at first. One could use git-svn to assist in the conversion, but one complication (particularly with a multiple-project repo) is that tags in Subversion are really just branches (one can make commits on them), whereas a tag in git marks a specific commit as something special (a release, for instance). In other words, a git tag is like a branch that never moves.

Migration Tools

  • svn2git (C++/Qt), aka svn-all-fast-export.
  • svn2git (Ruby)
  • git-svn-abandon

I'm currently testing Jonathan Leto's suggestion, namely svn2git (more recently known as svn-all-fast-export). It appears to be the most flexible of the three so far.

Progress

Conversion via --Chris Fields 13:52, 2 May 2010 (UTC)

  • svn2git compiled on Ubuntu 9.10 (was a bit of a pain w/o docs)
  • I am using rsync to sync a local copy of the repo on my Ubuntu setup for testing; when we are ready we can freeze the Subversion repo on dev, make a final conversion to git repos, and push to wherever we choose. As our tags are, by legacy, meant to behave more like git tags than Subversion tags (the original repo was in CVS), we may just allow them to be tags for our purposes.
  • I have pushed a few test repos to GitHub (as suggested by Jonathan Leto) prior to a full GitHub move. We may use a secondary public repo as a git mirror for when GitHub is unavailable.
  • Author mapping: currently all author IDs are mapped to their respective bioperl.org emails (these can become more general open-bio.org emails as well). I believe that if one sets up a GitHub account and adds this address to its alternative emails, GitHub linkifies the author name to that account (at least it works for Jason and me). I won't post the authors file online (damn spambots).
  • Tags: as we've historically used tags as uneditable branches, I chose to convert Subversion tags into git tags. This doesn't really do git tags justice, but it has the same effect.
  • Now working on a Using Git document, analogous to Using Subversion for our last migration.

Timeline

When? The sooner the better (weeks rather than months). Our anonymous svn access is down, likely permanently (http://code.open-bio.org/svnweb/index.cgi/bioperl/browse/bioperl-live).

Migration strategy

This is now mainly worked out using svn2git, which is very fast. We would need to make the svn repo on dev read-only during the transition; my guess is that it would take very little time. Do we want to retain the git-svn metadata on commits? It is viewable in our current read-only mirror on GitHub:

http://github.com/bioperl/bioperl-live/commit/7090e24f3916346b11a6bf960371f1d903d241ca
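To make the metadata question more concrete, here is a minimal Python sketch. It assumes the metadata in question is the trailer line that git-svn appends to each commit message, of the form "git-svn-id: URL@REVISION UUID"; other converters may record the revision differently. The sample message, URL and UUID below are made up for illustration. The sketch shows how the trailer could be read (to keep a revision lookup) or stripped (if we decide not to retain it).

```python
import re

# git-svn appends a trailer of the form:
#   git-svn-id: <repository URL>@<revision> <repository UUID>
GIT_SVN_ID = re.compile(
    r"^git-svn-id: (?P<url>\S+)@(?P<rev>\d+) (?P<uuid>[0-9a-fA-F-]+)[ \t]*$",
    re.MULTILINE,
)


def parse_git_svn_id(message):
    """Return (url, revision, uuid) from a commit message, or None if absent."""
    match = GIT_SVN_ID.search(message)
    if match is None:
        return None
    return match.group("url"), int(match.group("rev")), match.group("uuid")


def strip_git_svn_id(message):
    """Return the commit message without the git-svn-id trailer."""
    return GIT_SVN_ID.sub("", message).rstrip() + "\n"


# Hypothetical commit message; the URL and UUID are placeholders.
sample = (
    "Fix parser bug\n"
    "\n"
    "git-svn-id: svn+ssh://svn.example.org/repo/trunk@12345 "
    "01234567-89ab-cdef-0123-456789abcdef\n"
)

print(parse_git_svn_id(sample))
print(strip_git_svn_id(sample), end="")
```

Keeping the trailer preserves a direct link from each git commit back to its Subversion revision; stripping it gives cleaner messages at the cost of that link.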

Developers

Not everyone has a GitHub account. Recent developers I couldn't find on GitHub: dmessina, fangly

The current authors file used for mapping commit authors to emails uses their respective bioperl.org addresses (DEVNAME -at- bioperl.org). I think that once you have signed up with GitHub, you can add that same address to your account's existing ones, and your commits should then map to your GitHub account. If we use dev.open-bio.org as our central git repo, we won't need to go through with that, but we will need a viewable version of dev available somehow (mirrored on GitHub or otherwise). Speaking of...
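As a rough illustration of the mapping described above, the following Python sketch generates authors-file lines in the "login = Full Name <email>" format that git-svn's authors file uses. It is an assumption that the conversion tool in use accepts the same format, and the logins and full names below are placeholders, not real entries from our authors file.

```python
def authors_file_lines(developers, domain="bioperl.org"):
    """Build authors-file lines mapping each svn login to DEVNAME@domain.

    `developers` maps an svn login to the full name to record in git.
    """
    lines = []
    for login, full_name in sorted(developers.items()):
        lines.append(f"{login} = {full_name} <{login}@{domain}>")
    return lines


# Hypothetical developers, for illustration only.
developers = {
    "devname2": "Second Developer",
    "devname1": "First Developer",
}

for line in authors_file_lines(developers):
    print(line)
```

Switching to the more general open-bio.org addresses would only require passing a different `domain` argument.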

Development strategy

Are we sticking with a single centralized repo (SVN-like)? Will that be github, or will github be a downstream repo to our work on dev? We could feasibly have github be an active, forkable repo that could be bidirectionally synced with dev, but I'm not sure of the logistics on this (this popped up before with svn migration and was rejected b/c it was considered too difficult to maintain).

Git makes it very easy to make branches and merge in code to trunk. With that in mind, I would highly suggest we start working on branches for almost everything and merge over to trunk. There is very little to no overhead in doing so with git.

I like this strategy (Mark Jensen pointed this out): http://nvie.com/git-model

Also, several points were raised in a related project (Parrot) considering a move to git/github from svn. One in particular was that git allows destructive commits. Jonathan Leto indicated we can set up specific branches that don't allow this, using commit hooks, so my guess is the master branch and release branches wouldn't allow rewinds.
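One way such a restriction could look is sketched below in Python as a server-side "update" hook, which git runs once per pushed ref with three arguments (ref name, old commit, new commit) and which rejects the push for that ref when it exits non-zero. This is only a sketch under assumptions: the choice of protected branches (master and anything starting with "release") is a hypothetical naming scheme, and this is not necessarily the setup Jonathan had in mind.

```python
import subprocess
import sys


def is_null(sha):
    """True for the all-zero object name git uses for 'no commit'."""
    return set(sha) == {"0"}


def is_protected(refname):
    # Hypothetical policy: protect master and branches named release*.
    return refname == "refs/heads/master" or refname.startswith("refs/heads/release")


def git_is_ancestor(old, new):
    """True if `old` is an ancestor of `new`, i.e. the update is a fast-forward."""
    result = subprocess.run(["git", "merge-base", "--is-ancestor", old, new])
    return result.returncode == 0


def allow_update(refname, old, new, is_ancestor=git_is_ancestor):
    """Decide whether a pushed ref update should be accepted."""
    if not is_protected(refname):
        return True   # topic branches may be rewound freely
    if is_null(old):
        return True   # creating a new protected branch
    if is_null(new):
        return False  # deleting a protected branch
    return is_ancestor(old, new)  # only fast-forwards are accepted


# When installed as hooks/update, git passes: <refname> <old> <new>
if __name__ == "__main__" and len(sys.argv) == 4:
    ref, old_sha, new_sha = sys.argv[1:4]
    if not allow_update(ref, old_sha, new_sha):
        sys.stderr.write(f"Rejected: {ref} cannot be rewound or deleted\n")
        sys.exit(1)
```

The ancestry check is passed in as a parameter so the decision logic can be exercised without a repository. Git also has a repository-wide receive.denyNonFastForwards setting, but that applies to every branch rather than to selected ones.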

Encouraging Outside Contributors

Do we want to adopt a policy similar to Moose?

http://search.cpan.org/dist/Moose/lib/Moose/Manual/Contributing.pod

This is easy with github and forks.

SVN Read/Write to GitHub

It was recently announced that one can access a GitHub repo using Subversion as read-only, and as of just yesterday experimental write access is available as well:

http://github.com/blog/644-subversion-write-support

I can see allowing read-only svn, but write support is still experimental. Do we want to allow that?

Others?

ADD MORE HERE!

Links

Some helpful links for git newbies:
