This feed contains some of my blog entries that link to software code that I've developed.
Keysafe securely backs up a gpg secret key or other short secret to the cloud. But not yet. Today's alpha release only supports storing the data locally, and I still need to finish tuning the argon2 hash difficulties with modern hardware. Other than that, I'm fairly happy with how it's turned out.
Keysafe is written in Haskell, and many of the data types in it keep track of the estimated CPU time needed to create, decrypt, and brute-force them. Running that through an AWS spot pricing cost model lets keysafe estimate how much an attacker would need to spend to crack your password.
(Above is for the password "makesad spindle stick")
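To make the idea of a cost model concrete, here is a rough sketch of the arithmetic in Python. It is not keysafe's code (keysafe is Haskell), and the function name, parameters, and any numbers you feed it are hypothetical; it only assumes that an attacker pays per CPU-hour and, on average, has to try half of the password space.

```python
SECONDS_PER_HOUR = 3600


def brute_force_cost(entropy_bits, cpu_seconds_per_guess, dollars_per_cpu_hour):
    """Estimate the expected cost in dollars of brute-forcing a password.

    All three inputs are assumptions supplied by the caller:
    entropy_bits          -- estimated password entropy, in bits
    cpu_seconds_per_guess -- CPU time to test one guess (set by hash tuning)
    dollars_per_cpu_hour  -- rented CPU price (hypothetical, not a real quote)
    """
    # On average the password is found after trying half of all candidates.
    expected_guesses = 2 ** entropy_bits / 2
    cpu_hours = expected_guesses * cpu_seconds_per_guess / SECONDS_PER_HOUR
    return cpu_hours * dollars_per_cpu_hour
```

Each extra bit of password entropy doubles the estimate, and so does doubling the time one guess takes, which is why tuning the hash difficulty matters so much.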
If you'd like to be an early adopter, install it like this:
sudo apt-get install haskell-stack libreadline-dev libargon2-0-dev zenity
stack install keysafe
To back up a gpg key:
~/.local/bin/keysafe --backup --store-local
I still need to tune the argon2 hash difficulty, and I need benchmark data to do so. If you have a top-of-the-line laptop or server-class machine that's less than a year old, send me a benchmark:
~/.local/bin/keysafe --benchmark | mail email@example.com -s benchmark
Bonus announcement: http://hackage.haskell.org/package/zxcvbn-c/ is my quick Haskell interface to the C version of the zxcvbn password strength estimation library.
PS: Past 50% of my goal on Patreon!
Have you ever thought about using a gpg key to encrypt something, but didn't due to worries that you'd eventually lose the secret key? Or maybe you did use a gpg key to encrypt something and lost the key. There are nice tools like paperkey to back up gpg keys, but they require things like printers, and a secure place to store the backups.
I feel that the lack of simple backup and restore of gpg keys (and encryption keys generally) is keeping some users from using gpg. If there were a nice automated solution for that, distributions could come preconfigured to generate encryption keys and use them for backups etc. I know this is a missing piece in the git-annex assistant, which makes it easy to generate a gpg key to encrypt your data, but can't help you back up the secret key.
So, I'm thinking about storing secret keys in the cloud. Which seems scary to me, since when I was a Debian Developer, my gpg key could have been used to compromise millions of systems. But this is not about developers, it's about users, and so trading off some security for some ease of use may be appropriate. Especially since the alternative is no security. I know that some folks back up their gpg keys in the cloud using Dropbox... We can do better.
I've thought up a design for this, called keysafe. The synopsis of how it works is:
The secret key is split into three shards, and each is uploaded to a server run by a different entity. Any two of the shards are sufficient to recover the original key. So any one server can go down and you can still recover the key.
A password is used to encrypt the key. For the servers to access your key, two of them need to collude together, and they then have to brute force the password. The design of keysafe makes brute forcing extra difficult by making it hard to know which shards belong to you.
Indeed the more people that use keysafe, the harder it becomes to brute-force anyone's key!
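The "any two of three shards" property can be had from a threshold secret sharing scheme. Here is a minimal Python sketch using Shamir's scheme, which is one standard way to do it; this post doesn't say which scheme keysafe uses, so treat the choice, the prime, and the function names as my illustration only. It also covers just the splitting step, not the password encryption or the upload.

```python
import secrets

# Assumption: the secret is encoded as an integer smaller than this prime.
# 2**127 - 1 is a Mersenne prime; a real tool would pick a field to fit its data.
PRIME = 2**127 - 1


def split_secret(secret, prime=PRIME):
    """Split an integer secret into 3 shares, any 2 of which recover it."""
    if not 0 <= secret < prime:
        raise ValueError("secret must be in the range [0, prime)")
    # Pick a random line f(x) = secret + slope * x (mod prime).
    slope = secrets.randbelow(prime)
    # Each share is one point on that line; f(0) itself is never handed out.
    return [(x, (secret + slope * x) % prime) for x in (1, 2, 3)]


def recover_secret(share_a, share_b, prime=PRIME):
    """Recover the secret from any two distinct shares."""
    (x1, y1), (x2, y2) = share_a, share_b
    # Two points determine the line; interpolate it back to x = 0:
    # f(0) = (y1 * x2 - y2 * x1) / (x2 - x1)  (mod prime)
    inverse = pow((x2 - x1) % prime, -1, prime)
    return ((y1 * x2 - y2 * x1) * inverse) % prime
```

A single share is just one point on a random line, so on its own it says nothing about where that line crosses x = 0. That is why one server going down costs you nothing and one server going rogue gains nothing.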
I could really use some additional reviews and feedback on the design by experts.
I've been funded for two years by the DataLad project to work on git-annex. This has been a super excellent gig; they provided funding and feedback on ways git-annex could be improved, and I had a large amount of flexibility to decide what to work on in git-annex. Also plenty of spare time to work on new projects like propellor, concurrent-output, and scroll. It was an awesome way to spend the last two years of my twenty years of free software.
That funding is running out. I'd like to continue this great streak of working on the free software projects that are important to me. I'd normally dip into my savings at this point and keep on going until some other source of funding turned up. But, my savings are about to be obliterated, since I'm buying the place where I've had so much success working distraction-free.
So, I've started a Patreon page to fund my ongoing work. Please check it out and contribute if you want to.
Some details about projects I want to work on this fall:
PocketCHIP is the pocket sized Linux terminal I always used to want. Which is to say, it runs (nearly) stock Debian, X, etc, it has a physical keyboard, and the hardware and software are (nearly) non-proprietary and very hackable. Best of all, it's fun and it encourages playful learning.
It's also clunky and flawed and constructed out of cheap components. This keeps it from being something I'd actually carry around in my pocket and use regularly. The smart thing they've done though is embrace these limitations, targeting it at the hobbyist, and not trying to compete with smart phones. The PocketCHIP is its own little device in its own little niche.
Unless you're into hardware hacking and want to hook wires up to the GPIO pins, the best hardware feature is the complete keyboard, with even Escape and Control and arrow keys. You can ssh around and run vi on it, run your favorite REPL (I use ghci) to do quick programming, etc. The keyboard is small and a little strange, but you get used to it quickly; your QWERTY muscle memory is transferrable to it. I had fun installing nethack on it and handing it to my sister who had never played nethack before, to watch her learn to play.
The screen resolution is 480x272, which is pretty tiny. And, it's a cheap resistive touchscreen, with a bezel around it. This makes it very hard to use scroll bars and icons near the edge of the screen. The customized interface that ships with it avoids these problems, and so I've been using that for now. When I have time, I plan to put a fullscreen window manager on it, and write a pdmenu menu configuration for it, so everything can be driven using the keyboard.
I also have not installed Debian from scratch on it yet. This would be tricky because it uses a somewhat patched kernel (to support the display and wifi). The shipped distribution is sadly not entirely free software. There are some nonfree drivers and firmwares. And, they included a non-free gaming environment on it (a very nice one for part of the audience, that allows editing the games, but non-free nevertheless). They did do a good job of packaging up all the custom software they include on it, although they don't seem to have published source packages for everything.
(They might be infringing my GPL copyright of flash-kernel by distributing a modified version without source. I say "might" because flash-kernel is a pile of shell scripts, so you could probably extract the (probably trivial) modifications. Still.. Also, they seem to have patched network-manager in some way and I wasn't able to find the corresponding source.)
The battery life is around 5 hours. Unfortunately the "sleep" mode only turns off the backlight and maybe wifi, and leaves the rest of the system running. This, and the slightly awkward form factor (too big to really fit comfortably in a pocket), limit the use of PocketCHIP quite a bit. Perhaps the sleeping will get sorted out, and perhaps I'll delete the GPIO breakout board from the top of mine to make it more pocket sized.
Propellor is my second big Haskell program. I recently described the motivation for it like this, in a proposal for a Linux.Conf.Au talk:
The configuration of Linux hosts has become increasingly declarative, managed by tools like puppet and ansible, or by the composition of containers. But if a server is a collection of declarative properties, how do you make sure that changes to that configuration make sense? You can test them, but eventually it's 3 AM and you have an emergency fix that needs to go live immediately.
Data types to the rescue! While data types are usually used to prevent, e.g., combining an Int and a Bool, they can be used at a much more abstract level, for example to prevent combining a property that needs a Debian system with a property that needs a Red Hat system.
Propellor leverages Haskell's type system to prove the consistency of the properties it will apply to a host.
The real origin story though, is that I wanted to finally start using configuration management, but the tools for it all seemed very complicated and built on shaky foundations (like piles of yaml), and it seemed it would be easier to write my own than deal with that. Meanwhile, I had Haskell burning a hole in my pocket, ready to be used in a second large project after git-annex.
Propellor has averaged around 2.5 contributions per month from users since it got started, with increasing numbers recently. That's despite having many fewer users than git-annex, which, remember, gets perhaps 1 patch per month.
Of course, I've "cheated" by making sure that propellor's users know Haskell, or are willing to learn some. And, propellor is very compositional; adding a new property to it is not likely to be complicated by any of the existing code. So it's easy to extend, if you're able to use it.
At this point propellor has a small community of regular contributors, and I spend some pleasant weekend afternoons reviewing and merging their work.
Much of my best work on propellor has involved keeping the behavior of the program the same while making its types better, to prevent mistakes. Propellor's core data types have evolved much more than in any program I worked on before. That's exciting!
concurrent-output is a more meaty Haskell library than the ones I've covered so far. Its interface is simple, but there's a lot of complexity under the hood. Things like optimised console updates, ANSI escape sequence parsing, and transparent paging of buffers to disk.
It developed out of needing to display multiple progress bars on the console in git-annex, and also turned out to be useful in propellor. And since it solves a general problem, other Haskell programs are moving toward using it, like shake and stack.
shell-monad is a small project, done over a couple days and not needing many changes since, but I'm covering it separately because it was a bit of a milestone for me.
As I learned Haskell, I noticed that the libraries were excellent and did things to guide their users that libraries in other languages don't do. Starting with using types and EDSLs and carefully constrained interfaces, but going well beyond that, as far as applying category theory. Using these libraries pushes you toward good solutions.
shell-monad was a first attempt at building such a library. The shell script it generates should always be syntactically valid, and it never forgets to quote a shell variable. That's only the basics. It goes further by making it impossible to typo the name of a shell variable or shell function. And it uses phantom types so that the Haskell type checker can check that the types of shell variables and functions match up.
So I think shell-monad is pretty neat, and I certainly learned a lot about writing Haskell libraries making it. Including how much I still have to learn!
I have not used shell-monad much, but keep meaning to make propellor and git-annex use it for some of their shell script needs. And ponder porting etckeeper to generate its shell scripts using it.
My dad sometimes asks when I'll finish git-annex. The answer is "I don't know" because software like that doesn't have a defined end point; it grows and changes in response to how people use it and how the wider ecosystem develops.
But other software has a well-defined end point and can be finished. Some of my smaller projects that are more or less done include electrum-mnemonic, brainfuck-monad, scroll, yesod-lucid, and haskell-mountpoints.
Studies of free software projects have found that the average free software project was written entirely by one developer, is not very large, and is not being updated. That's often taken to mean it's a failed or dead project. But all the projects above look that way, and are not failures, or dead.
It's good to actually finish some software once in a while!
github-backup is an attempt to take something I don't like -- github's centralization of what should be a decentralized technology -- and find a constructive way to make it at least meet my baseline requirements for centralized systems. Namely that when they go away, I don't lose data.
So, it was written partly with my ArchiveTeam hat on.
A recent bug filed on it, "Backup fails for repositories unavailable due to DMCA takedown", made me happy, because it shows github-backup behaving more or less as intended, although perhaps not in the optimal way.
By the way, this is the only one of my projects that uses github for issue tracking. Intentionally ironically.
It was my second real Haskell program (after git-annex) and so also served as a good exercise in applying what I'd learned about writing Haskell up to that point.
It was written just to solve my own problem, but in a general way, that turned out to be useful in lots of other situations. So over the first half a year or so, it started attracting some early adopters who made some very helpful suggestions.
Then I did the git-annex assistant kickstarter, and started blogging about each day I worked on it. Four years of funding and seven hundred and twenty-one posts later, the git-annex devblog is still going. So, I won't talk about technical details in this post; they've all been covered.
One thing I wondered when starting git-annex -- besides whether I would be able to write it in Haskell at all -- was whether that would prevent it from getting many patches. I count roughly 65 "thanks" messages in the changelog, so it gets perhaps one patch contributed per month. It's hard to say if that's a lot or a little.
Part of git-annex is supporting various cloud storage systems via "special remotes". Of those not written by me, only 1 was contributed in Haskell. Compare with 13 that use the plugin system that lets other programming languages be used.
The other question about using Haskell is, did it make git-annex a better program? I think it did. The strong type system did prevent plenty of bugs, although there have still been some real howlers. The code is still not taking full advantage of the power of Haskell's type system; on the other hand, it uses many Haskell libraries that do leverage the type system more. I've done more small and large refactorings of git-annex than on any other program I've written, because the strong types and referential transparency make refactoring easier and safer in Haskell.
And the code has turned out to be much more flexible, for all its static types, than the kind of code I was writing before. Examples include building the git-annex assistant, which uses the rest of git-annex as a library, and making git-annex run actions concurrently, thanks to there being no global variables to complicate things (and excellent support for concurrency and parallelism in Haskell).
So: Glad I wrote it, glad I used Haskell for it, ecstatic that many other people have found it useful, and astounded that I've been funded to work on it for four years.