Git Locally With Mercurial Remote Repository

Oct. 30th, 2011 at 10:47pm

Git and HG Working Together

So I have taken a new job, and the one thing that concerned me was that they use mercurial as their version control system. I am a long-time svn user, but about 8 months ago I switched to git, partly because my work was switching to git. I was in charge of converting the main project I worked on from svn to git, and that process was pretty simple (considering that converting history was not a top concern). Ever since then I have been one of the two go-to people at my current employer for git questions. I have completely embraced the git way; the system is pretty simple but still very powerful. What I like best is that it really lets the user define the workflow instead of forcing a particular workflow on you like other version control systems do. The only downside to git is that to extend its functionality you need to write bash scripts (though through the process I have become better at it).

Now To Mercurial

I originally wanted to find a way to use git locally and push to a mercurial remote repository. There is a plugin called hg-git that allows the exact opposite very easily (using mercurial locally while pushing to a git remote repository), but it really was not designed for the other way around. I searched the internet and found a number of different ways to set up hg-git for what I wanted, but I always had issues with pulling in new changesets properly. After this I decided to just try to learn mercurial because it couldn't hurt to learn it, or so I thought.

My Git WorkFlow

Let me first go through my general workflow with git, a workflow that just works perfectly for me. I am only going to focus on how I work with the main development branch (in this case master) as it is the main point here.

Repository Workflow

I work with git in a centralized version control system way. There is one remote repository that is considered the "golden" repository, and that is the main repository everyone works off of.

Branch Workflow

I use the master branch as the main line of development; it contains the bleeding-edge code that has only been through the most basic developer testing and unit tests, if available. This is the main branch that lives on the remote repository. I use something referred to as a topic/feature branch workflow. What this means is that every time I have something to work on, whether it is a new feature, enhancement, bug, experiment, whatever, I create a separate branch for it. Working this way means that I create a lot of branches and that sometimes those branches are deleted without ever being merged back into master (maybe a failed experiment or a crappy implementation of something). These topic branches never make it to the central "golden" remote repository. If I ever want to share a topic branch (maybe I am working with someone on it), I generally just create a fork of the golden repository and push my topic branches there.

So a general workflow would look like this:

  • I have the master branch
  • I have 2 issues assigned to me
  • I create a branch for issue1 from master
  • I work on issue1 but reach a stopping point as I need more information from someone; this branch is currently in a broken state
  • I commit my changes so far for this branch and then create a new branch for issue2 from master
  • I start working on issue2 however an urgent bug gets assigned to me that needs to be fixed right away
  • I commit my current progress on the issue2 branch and then create a new branch for the bug from master
  • I fix the bug and commit to the bug branch
  • After testing I merge my bug branch into the master branch, after which I delete the bug branch
  • I then go back to my issue2 branch and do a rebase from master (to pull in the new bug fix)
  • I complete the issue2 branch, rebase to squash my commits (so I don't have any commits where the code is broken when merging into master), merge issue2 branch into master and then delete the issue2 branch
  • It turns out that I really didn't need to work on issue1 after all, so I just delete that branch without merging it back into master

Now let's take a look at a visual that might better show this (I know I am more of a visual person):

[Image: diagram of the branch workflow described above]

Now for what I would do as far as git commands:

git checkout -b issue1 master
git commit -a #commit 1
git commit -a #commit 2 - waiting for more information
git checkout -b issue2 master
git commit -a #commit 1
git commit -a #commit 2
git commit -a #commit 3 - need to work on a major bug fix
git checkout -b bug master
git commit -a #commit a - bug is fixed
git checkout master
git merge bug
git push origin master
git branch -d bug
git checkout issue2
git rebase master
git commit -a #commit 4
git commit -a #commit 5 - feature done
git rebase -i master #squash down to only the commits that are important and do not leave the code in a broken state
git checkout master
git merge issue2
git branch -d issue2
git branch -D issue1

Now while that might be a lot of commands, 8 of them are really just commits (leaving 14 commands we have to run to perform this workflow). Even though there are a number of commands to run (which in my opinion is still pretty small), I don't really need to remember a lot of options/parameters; they are all pretty simple commands. Using bash scripts I have created to streamline some of my processes, the commands can be reduced to:

git checkout -b issue1 master
#n commits
git checkout -b issue2 master
#n commits
git checkout -b bug master
#n commits
git rebase-merge-push bug master # this performs a git checkout master; git pull; git checkout bug; git rebase master; git checkout master; git merge bug; git push origin master; git branch -d bug automatically
git checkout issue2
git rebase master # to pull in the bug fix
#n commits
git rebase -i master #squash down to only the commits that are important and do not leave the code in a broken state
git rebase-merge-push issue2 master # this performs a git checkout master; git pull; git checkout issue2; git rebase master; git checkout master; git merge issue2; git branch -d issue2 automatically
git branch -D issue1

This brings the commands down from 14 to 9. It also covers the commands I would have had to run, twice, if the remote master branch had been updated while I was making these changes, so we are possibly down from 20 to 9.
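The rebase-merge-push helper mentioned above might look roughly like the following. This is only a minimal sketch of the steps listed in the comment, not my actual script: the function name rebase_merge_push is made up for illustration, and it assumes the remote is named origin.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of a "rebase-merge-push" helper (illustrative only).
# Usage: rebase_merge_push <topic-branch> <target-branch>
# Assumes the remote is called "origin".
rebase_merge_push() {
    local topic="$1" target="$2"
    if [ -z "$topic" ] || [ -z "$target" ]; then
        echo "usage: rebase_merge_push <topic-branch> <target-branch>" >&2
        return 1
    fi
    # Bring the local target branch up to date with the remote.
    git checkout "$target" &&
    git pull origin "$target" &&
    # Replay the topic branch on top of the updated target.
    git checkout "$topic" &&
    git rebase "$target" &&
    # Merge, publish, and clean up the topic branch.
    git checkout "$target" &&
    git merge "$topic" &&
    git push origin "$target" &&
    git branch -d "$topic"
}
```

Because the steps are chained with &&, the helper stops at the first failing command (for example a rebase conflict) instead of pushing a half-finished merge. Git runs any executable named git-rebase-merge-push found on the PATH as git rebase-merge-push, so a script like this can be called the same way as the built-in commands.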

This process works very well for me and is not something I want to change. Fighting with my version control system just decreases my effectiveness (just like SVN did).

Mercurial Branches/Bookmarks/Queues

Mercurial has a number of different features for organizing work; the most popular seem to be branches, bookmarks, and queues. I have spent several hours looking into all of them and none seem to let me easily perform the workflow described above. I am not going to go into detail about why these mercurial features don't seem to work for me; that alone would be a long blog post (something I might do in the future).

I am also not saying that the workflow described above is impossible with mercurial. Just because I couldn't find a way to do it doesn't mean that it doesn't exist. I had simply spent more time than I wanted trying to get up to speed with mercurial, so I said, let's try to get git working in my mercurial workflow.

StackOverflow To the Rescue

So I tried to get the hg-git plugin to work again, and I think this time I have finally found a way to use it the way it was designed. A stackoverflow.com user named Lazy Badger first posted the process I had already tried; I tried it again and, as expected, got the same result. I could push changes from git to mercurial, but pulling changes from mercurial was not working. He then made another post with a different option he was thinking of, and that was the ticket. He suggested creating 2 local repositories, one mercurial and the other git. In this setup you are using the hg-git plugin as designed, because it is designed to push changes to git and pull changes from git while in a mercurial repository, and this setup does exactly that; the only difference is that the git repository is local, not remote. Let's go into how I implemented this.

Git <-> HG Implementation

I decided to go one step further and have 3 local repositories. While this might seem like extra work, it really is not. It does add a few extra steps to the setup process, but once it is set up it adds very little extra work. I also have a bash script that does everything described here in 1 command, so in reality the extra repository only costs the disk space (and git is very good about compression) and a few extra lines in the bash script. I will explain later why I have the extra repository in my setup.

Now I store all local repositories in a structure like:

/repositories/[separator_name]/[git | hg-git]

so for example I might have:

/repositories/personal/git
/repositories/personal/hg-git
/repositories/work/git
/repositories/work/hg-git

The git folders just store plain git repositories, nothing special since git is what I work with. The hg-git folder has a special structure and looks like this:

  • hg-git/hg - Stores my local hg repositories. I use this mainly to just push and pull from and to the remote hg repositories and local bare git repository.
  • hg-git/git-bare - Used as the middle man between the local hg repositories and the git working copy repositories.
  • hg-git/git-working - 95% of your work should be able to be done inside these repositories.

Setup

The setup is a relatively simple process and can easily be converted into a bash script for a one command setup. First we need to create the directories that are needed to store all the local repositories. Again, I like to keep my repository directories very organized so I do:

mkdir -p /repositories/work/hg-git/hg/project_name /repositories/work/hg-git/git-bare/project_name /repositories/work/hg-git/git-working/project_name

With the directories created, let's first initialize a bare git repository by running:

git init --bare /repositories/work/hg-git/git-bare/project_name

Next we need to clone the hg repository by doing:

hg clone [repository_url] /repositories/work/hg-git/hg/project_name

Now let's cd into the hg repository and create the master bookmark, which will be needed when pushing to the local git bare repository:

hg bookmark -f master -r default

The last thing we need to do is set up the path for the git bare repository when pushing from the hg repository. To do this, all you have to do is add a git entry to the [paths] section of the repository's hgrc by running:

printf '[paths]\ngit = /repositories/work/hg-git/git-bare/project_name\n' >> .hg/hgrc

This just creates a shortcut name you can use when running the hg push command. Now with the hg repository set up the way we need it, let's push our hg repository to the git bare repository:

hg push git

This should have worked without any issues and your git bare repository should be all set. The last thing we have to do is create the clone of the git bare repository and for that we do:

git clone /repositories/work/hg-git/git-bare/project_name /repositories/work/hg-git/git-working/project_name

After this, the setup portion should be completely done. This process is pretty standard and should not run into any issues, which means it is a great candidate for a bash script. If there is enough interest, I would be willing to release this bash script freely on github/bitbucket.
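As a rough idea of what such a script could look like, here is a minimal sketch of the setup steps above wrapped in one function. It is not the script I actually use: the function name setup_hg_git, its three arguments, and the HG override variable are all made up for illustration. It assumes the hg-git extension is already enabled for your mercurial installation, and it writes the git path under an explicit [paths] header so the entry lands in the right section whatever the hgrc already contains.

```shell
#!/usr/bin/env bash
# Hypothetical one-command setup (illustrative sketch, not the author's script).
# Usage: setup_hg_git <hg-repository-url> <base-dir> <project-name>
# Example base-dir: /repositories/work/hg-git
HG="${HG:-hg}"   # assumption: lets you point at a different hg executable

setup_hg_git() {
    local url="$1" base="$2" name="$3"
    local hg_dir="$base/hg/$name"
    local bare_dir="$base/git-bare/$name"
    local work_dir="$base/git-working/$name"

    # Create the three repository directories.
    mkdir -p "$hg_dir" "$bare_dir" "$work_dir" &&
    # Middle-man bare git repository.
    git init --bare "$bare_dir" &&
    # Local clone of the remote mercurial repository.
    "$HG" clone "$url" "$hg_dir" &&
    # The master bookmark becomes the master branch on the git side.
    "$HG" -R "$hg_dir" bookmark -f master -r default &&
    # Register the bare repository under the path name "git".
    printf '[paths]\ngit = %s\n' "$bare_dir" >> "$hg_dir/.hg/hgrc" &&
    "$HG" -R "$hg_dir" push git &&
    # Working copy where the day-to-day git work happens.
    git clone "$bare_dir" "$work_dir"
}
```

With the directory layout from earlier, a call would look like setup_hg_git [repository_url] /repositories/work/hg-git project_name. The -R option tells hg which repository to operate on, so the function does not need to cd anywhere.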

Normal Workflow

When working in the git working repository, the workflow should be normal, as you are working off a git bare repository. Since there is no direct connection between the hg repository and the git working copy repository, there should be no real issues with any normal git workflow. The only thing that might not work as expected is force-pushing rewritten history to the git bare repository. This is not an issue for me as I almost never rewrite history that has been pushed to the git bare repository.

Working with hg should also be pretty simple. You push and pull normally to and from the mercurial remote repository, and then you just run hg push/pull with git as the parameter to push and pull updates to and from the git bare repository. The only thing you have to remember is that after pulling changes from the remote mercurial repository, in order to push those changes to the git bare repository correctly, you need to run this command before doing the hg push git:

hg bookmarks -f master -r default

This will refresh the master bookmark to point to the latest commit of the default branch.
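Since the bookmark refresh is easy to forget, the three steps can be bundled together. The following is a minimal sketch under a few assumptions: the function name hg_to_git and the HG override variable are hypothetical, the path name git is the one configured during setup, and hg pull -u is used so the hg working directory is updated as part of the pull.

```shell
#!/usr/bin/env bash
# Hypothetical helper (illustrative): carry new remote mercurial changes
# over to the local git bare repository.
# Usage: hg_to_git <path-to-local-hg-repository>
HG="${HG:-hg}"   # assumption: lets you point at a different hg executable

hg_to_git() {
    local hg_dir="$1"
    # Pull from the remote mercurial repository and update the working directory.
    "$HG" -R "$hg_dir" pull -u &&
    # Move the master bookmark to the latest commit of the default branch.
    "$HG" -R "$hg_dir" bookmark -f master -r default &&
    # Push to the bare git repository registered under the path name "git".
    "$HG" -R "$hg_dir" push git
}
```

For example: hg_to_git /repositories/work/hg-git/hg/project_name. After that, a git pull in the git working repository brings the changes in.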

Why Three Repositories

While it is probably completely possible to skip having the bare git repository and just push/pull from hg to the git working copy, that setup does not exactly provide the best workflow for me.

The first issue is that when you first try to push to the git working copy repository, you get this:

error: refusing to update checked out branch: refs/heads/master
error: By default, updating the current branch in a non-bare repository
error: is denied, because it will make the index and work tree inconsistent
error: with what you pushed, and will require 'git reset --hard' to match
error: the work tree to HEAD.
error: 
error: You can set 'receive.denyCurrentBranch' configuration variable to
error: 'ignore' or 'warn' in the remote repository to allow pushing into
error: its current branch; however, this is not recommended unless you
error: arranged to update its work tree to match what you pushed in some
error: other way.
error: 
error: To squelch this message and still keep the default behaviour, set
error: 'receive.denyCurrentBranch' configuration variable to 'refuse'.
abort: git remote error: refs/heads/master failed to update

This is because, by default, git refuses pushes to the checked-out branch of a non-bare repository. You can easily change this by adding the following to the .git/config file of the git working copy:

[receive]
denyCurrentBranch = warn

Remember, though, that you then also have to run git reset --hard on the working copy after it has been pushed to (at least according to the error message). This type of warning just seems like a red flag to me; who knows what will happen down the road.

The other issue, and a bigger one, is the way mercurial pulls from the working copy. Let's say that in my git working copy I create 3 branches, call them issue1, issue2, and issue3. I make commits to all three branches but only want to push one set of those changes to the mercurial repository. I can merge, say, branch issue2 into master and then delete branch issue2 from git. However, when I go into the mercurial repository and do an hg pull from my git working copy, it is also going to pull the changes from branches issue1 and issue3 (it will create a bookmark for each branch in the git repository it is pulling from). This means that if you only want certain changes committed, you have to merge properly after pulling the changes from git into mercurial.

With the 3rd repository, the git bare repository, you add two extra steps while working inside the git working copy repository. You actually have to do a git pull to bring in the mercurial changes (because hg push git pushes to the git bare repository), and when publishing changes you actually have to run git push [remote] [branch] (because hg pull git pulls changes from the git bare repository). This requires a small amount of extra work (if you even consider that extra work) but provides a much more streamlined process: you no longer have to worry about mercurial pulling in unwanted changes, because mercurial will only pull in the changes you specifically push to the git bare repository.
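To see why the bare repository keeps unwanted branches away from mercurial, here is a small self-contained sketch using only git. It builds a throwaway bare repository and working clone, creates two topic branches, merges one into master, and pushes only master. The function name demo_selective_push, the directory names, and the demo identity are all made up for this illustration.

```shell
#!/usr/bin/env bash
# Hypothetical demo (illustrative): only branches that are pushed explicitly
# show up in the bare repository that mercurial would pull from.
# Usage: demo_selective_push <existing-empty-directory>
demo_selective_push() (
    # Runs in a subshell so the cd below does not affect the caller.
    root=$(cd "$1" && pwd) || exit 1
    commit() {
        git -c user.name=demo -c user.email=demo@example.com \
            commit -q --allow-empty -m "$1"
    }

    git init -q --bare "$root/git-bare" &&
    git clone -q "$root/git-bare" "$root/git-working" 2>/dev/null &&
    cd "$root/git-working" &&
    # Make sure the first commit lands on a branch called master.
    git symbolic-ref HEAD refs/heads/master &&
    commit "initial commit" &&
    git push -q origin master &&
    # Two topic branches; neither is ever pushed.
    git checkout -q -b issue1 master && commit "work on issue1" &&
    git checkout -q -b issue2 master && commit "work on issue2" &&
    # Only issue2 is merged, and only master is published.
    git checkout -q master &&
    git merge -q issue2 &&
    git push -q origin master &&
    # List the branches the bare repository knows about.
    git --git-dir="$root/git-bare" for-each-ref --format='%(refname:short)' refs/heads
)
```

Running demo_selective_push "$(mktemp -d)" should list only master: issue1 still exists in the working repository, but an hg pull git against the bare repository never sees it.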

Conclusion

Now I know that mercurial is used by a number of people and it certainly has its following; it is just that I don't quite understand the logic behind how it handles branches/bookmarks/queues. It seems overly complex without adding any real benefits. A lot of people will say that mercurial is better because it is harder to rewrite history and because branches last forever. Unfortunately I believe that both of those are downsides to mercurial, not advantages. Obviously I am biased towards git if I am willing to try to figure out a workflow like this instead of learning mercurial, but I just have not seen anything about mercurial that would ever make me want to fully learn it (though it does have some advantages, which is for another time).

I believe this workflow will work if you are currently working in a setup with heavy branching (1 repository per branch) or named branching (multiple named branches in 1 repository) where you only have 1 head per branch. Since I am unaware of how bookmarking really works or why one would use it, I can't say for sure that it would work in a setup where bookmarks are shared, but I don't see why not.

Now I have only tested this setup for about 95% of the use cases I have encountered in git so far. I haven't tried any real complex use cases that have only come up once in a while with git but that is because I don't have a mercurial repository that has enough data to be able to simulate them.

Like I have said before, I am not saying that the workflow I am looking for is impossible in mercurial, but after spending a number of hours trying a number of different things, it is not nearly as easy to do as it is in git. The ultimate goal of any version control system is to interfere with the developer's workflow as little as possible. For me this workflow is better than trying to use plain mercurial, and that is why I plan on working with mercurial in this way.
