So, sha-1 is looking increasingly insecure in applications where birthday attacks are possible. ("Birthday attacks" ... what a phrase ... I hope my non-technical readers stopped at "sha-1".)

Two things about that:


First, I wanted to mention that I've today released jetring 0.15, which adds support for arbitrary hashes in the index file, and deprecates use of sha-1, going to sha-256 by default. There is a jetring-checksum -u utility that can be used to upgrade sha-1 hashes in existing jetring index files.

If you're using jetring in an application where changesets are provided by third parties, then a birthday attack could be possible (though not easy?), and you should upgrade your index. debian-maintainer is a good example of such a jetring user.


Secondly, our beloved git uses sha-1, and this seems unlikely to change soon or without significant pain. So, what kinds of collision attacks would you need to watch out for when using git? Here is a real-world example I've been pondering. Is it accurate?

  • Alice creates a legitimate new version of a file in the linux kernel.
  • Alice uses the new 2^52 work to generate two variants of the file that sha-1 the same. One is suitable for public consumption, and one does something nasty. (Note that this is still gonna be very hard to accomplish for peer-reviewed source code. Maybe the best file to patch would be one containing firmware? Also note that the collision actually needs to occur on the data that git-hash-object(1) will hash for the file.)
  • From the two variants of the file, Alice can generate two patches, a good and a bad. The good version is sent to Linus. Note that the sha-1's of the patches will not be the same, but when applied to a git repo, both patches will generate versions of the original file that sha-1 the same, despite being different.
  • Linus accepts the patch and publishes it in his git repo, and tags a new release. His repo now contains the good variant of the file.
  • Alice sets up her own git repo, a clone of Linus's, and tweaks it to contain the bad version of the file.
  • Alice lets the world know about her git repo, and encourages people to pull from it before pulling from Linus, to save him bandwidth, or for some other plausible reason. (May seem unlikely, but people actually do this in many scenarios in the git world.)
  • Bob pulls from Alice's git repo, then pulls from Linus, and then builds the kernel, from Linus's tag. Git gets the bad version of Alice's file from her repo, and its sha-1 is ok. Alice has succeeded in deploying her evil code.
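
To make the git-hash-object point concrete: git does not hash the raw bytes of a file when it stores a blob, it hashes them with a small header stuck on the front. Here is a minimal Python sketch of that computation (the function name git_blob_sha1 is mine, not anything from git):

```python
import hashlib


def git_blob_sha1(content: bytes) -> str:
    """Return the object id git assigns to a blob with this content."""
    # Git hashes "blob <size in bytes>\0" followed by the raw content.
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


if __name__ == "__main__":
    # Should agree with `git hash-object` run on a file with the same bytes.
    print(git_blob_sha1(b"hello world\n"))
```

So Alice's two variants have to collide with that header included. A collision on the bare files alone would not be enough.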

Here's a different scenario:

  • Say that I have commit access to the firmware-nonfree git repository. Let's also suppose that releases of firmware-nonfree are built by a build server that clones the git repository, and builds from it.
  • I take a new firmware file, and use birthday attacks to generate two variants of it, one good and one bad, that git-hash-object will generate the same sha-1 for.
  • I commit the good one to git, ensuring that the commit appears to have been made by a contributor who is on vacation, not me. I push it to the master git repository, and wait for my co-developers to pull it, test it out, etc.
  • When a release of firmware-nonfree is imminent, I ssh in and modify the object in the master git repo, replacing it with the bad version of the firmware. Since all the developers have pulled the good version already, they are unlikely to notice this change.
  • The build server is kicked off, clones the repository, including the bad firmware, checks the gpg signature on the release tag, and deploys my evil code.
  • I ssh back in and cover my traces, changing the object back to the good version.
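
For a sense of what "modify the object" means there: a loose object in a git repo is a zlib-compressed file named after the sha-1 of its contents, and that name is all there is to check it against. A rough Python sketch, assuming a loose (not packed) blob; the function names are made up for illustration:

```python
import hashlib
import zlib
from typing import Tuple


def encode_loose_blob(content: bytes) -> Tuple[str, bytes]:
    """Return (object id, on-disk bytes) for a loose blob object."""
    # Same "blob <size>\0" header that git-hash-object uses.
    store = b"blob " + str(len(content)).encode("ascii") + b"\0" + content
    return hashlib.sha1(store).hexdigest(), zlib.compress(store)


def loose_object_matches(oid: str, on_disk: bytes) -> bool:
    """Recompute the sha-1 of a loose object and compare it with its name."""
    return hashlib.sha1(zlib.decompress(on_disk)).hexdigest() == oid


if __name__ == "__main__":
    oid, good = encode_loose_blob(b"firmware v1")
    _, other = encode_loose_blob(b"firmware v2")
    print(loose_object_matches(oid, good))   # the object stored under its own name
    print(loose_object_matches(oid, other))  # a non-colliding replacement
```

An ordinary tampered object fails that check. Two variants that really do collide would both pass it, which is the whole problem: swapping one for the other changes nothing that a sha-1 check can see.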

This seems more plausible. This sort of attack is easy to accomplish with a subversion repository, and was one of the reasons I was glad to switch to git, since its checksums and signed tags seemed to prevent this kind of mischief. So, worrying, especially if your project uses such a build server.

Update: Thanks to the commenters for helping me correct my example. (I hope!)