Binary Files in Mercurial

Mercurial doesn’t handle large binary files very well, especially if those files change frequently. To sum up the problem, here is an excerpt from Greg Ward (creator of the bfiles extension):

“bfiles” is a Mercurial extension for handling large binary files. Such files tend to be:

  • not very compressible
  • not very “diffable” (small modifications can result in unexpectedly large deltas)
  • not at all mergeable

Mercurial was not designed to handle large binary files. This shows in a number of ways:

  • Internally, Mercurial generally reads file contents entirely into memory; for doing diffs and merges, it reads multiple whole revisions into memory.
  • Mercurial’s revlog format is based on compressed deltas. This doesn’t work very well with non-diffable, non-compressible data. Large binary files with lots of history can take up quite a lot of space in the repository.
  • Mercurial’s distributed nature means that the overhead of all that history is repeated for every clone, making a bad situation worse.
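The point about compressibility can be illustrated with a small sketch. It is not Mercurial's own code: it only runs zlib (the compression library Mercurial's revlog uses by default, as far as I know) over two made-up inputs. Repetitive text stands in for source code, and random bytes stand in for already-compressed media such as JPEG or ZIP files. The function name and the sample data are assumptions for illustration only.

```python
import os
import zlib


def compression_ratio(data: bytes) -> float:
    """Return compressed size divided by original size (lower is better)."""
    return len(zlib.compress(data)) / len(data)


# Hypothetical sample: repetitive text, standing in for typical source code.
text_like = b"def handler(request):\n    return render(request)\n" * 2000

# Hypothetical sample: random bytes, standing in for media that is
# already compressed and therefore has no redundancy left to remove.
binary_like = os.urandom(len(text_like))

text_ratio = compression_ratio(text_like)
binary_ratio = compression_ratio(binary_like)

print(f"text-like data:   {text_ratio:.3f}")
print(f"binary-like data: {binary_ratio:.3f}")
```

The text-like input shrinks to a tiny fraction of its size, while the binary-like input stays at roughly its original size. Every stored revision of such a file therefore costs close to the full file size, and every clone pays that cost again.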

At the time of writing, several Mercurial extensions try to resolve the issue. We took three for a test drive:

There are several others as well. None of them is all that great yet. I would say they’re all still in beta.

There are signs that the authors are collaborating to come up with a common solution. Kiln, for example, is waiting for TortoiseHg’s rewrite (release 2.0, due in March 2010) to finish before investing in it.

We chose Bfiles but needed to tweak it a bit to work moderately well with TortoiseHg and Mercurial Eclipse:
