This feed contains some of my blog entries that link to software code that I've developed.
Propellor is my second big Haskell program. I recently described the motivation for it like this, in a proposal for a Linux.Conf.Au talk:
The configuration of Linux hosts has become increasingly declarative, managed by tools like puppet and ansible, or by the composition of containers. But if a server is a collection of declarative properties, how do you make sure that changes to that configuration make sense? You can test them, but eventually it's 3 AM and you have an emergency fix that needs to go live immediately.
Data types to the rescue! While data types are usually used to prevent, e.g., combining an Int and a Bool, they can be used at a much more abstract level, for example to prevent combining a property that needs a Debian system with a property that needs a Red Hat system.
Propellor leverages Haskell's type system to prove the consistency of the properties it will apply to a host.
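To give a flavor of that idea, here is a minimal sketch using a phantom type parameter. Every name in it (Property, Debian, RedHat, aptInstalled, yumInstalled, describe) is made up for illustration and is not propellor's real API; propellor's actual types are more elaborate than this.

```haskell
-- Hypothetical sketch, not propellor's API: properties are tagged with the
-- OS they need, and only properties with the same tag can be combined.

-- Type-level tags. They have no values; they exist only for the type checker.
data Debian
data RedHat

-- A property is just a list of descriptions here. The 'os' parameter is a
-- phantom type: it does not appear on the right-hand side.
newtype Property os = Property [String]

-- Combining two properties requires both to carry the same 'os' tag.
instance Semigroup (Property os) where
  Property a <> Property b = Property (a ++ b)

aptInstalled :: String -> Property Debian
aptInstalled pkg = Property ["apt install " ++ pkg]

yumInstalled :: String -> Property RedHat
yumInstalled pkg = Property ["yum install " ++ pkg]

describe :: Property os -> [String]
describe (Property ds) = ds

-- Fine: both properties need Debian.
webServer :: Property Debian
webServer = aptInstalled "apache2" <> aptInstalled "ssl-cert"

-- Rejected by the type checker, because Debian and RedHat do not unify:
-- broken = aptInstalled "apache2" <> yumInstalled "httpd"

main :: IO ()
main = mapM_ putStrLn (describe webServer)
```

Uncommenting the broken line turns the 3 AM mistake into a compile error instead of a half-configured host.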
The real origin story though, is that I wanted to finally start using configuration management, but the tools for it all seemed very complicated and built on shaky foundations (like piles of yaml), and it seemed it would be easier to write my own than deal with that. Meanwhile, I had Haskell burning a hole in my pocket, ready to be used in a second large project after git-annex.
Propellor has averaged around 2.5 contributions per month from users since it got started, with increasing numbers recently. That's despite having many fewer users than git-annex, which, remember, gets perhaps one patch per month.
Of course, I've "cheated" by making sure that propellor's users know Haskell, or are willing to learn some. And, propellor is very compositional; adding a new property to it is not likely to be complicated by any of the existing code. So it's easy to extend, if you're able to use it.
At this point propellor has a small community of regular contributors, and I spend some pleasant weekend afternoons reviewing and merging their work.
Much of my best work on propellor has involved keeping the behavior of the program the same while making its types better, to prevent mistakes. Propellor's core data types have evolved much more than in any program I worked on before. That's exciting!
concurrent-output is a more meaty Haskell library than the ones I've covered so far. Its interface is simple, but there's a lot of complexity under the hood. Things like optimised console updates, ANSI escape sequence parsing, and transparent paging of buffers to disk.
It developed out of needing to display multiple progress bars on the console in git-annex, and also turned out to be useful in propellor. And since it solves a general problem, other Haskell programs are moving toward using it, like shake and stack.
shell-monad is a small project, done over a couple days and not needing many changes since, but I'm covering it separately because it was a bit of a milestone for me.
As I learned Haskell, I noticed that the libraries were excellent and did things to guide their users that libraries in other languages don't do. Starting with using types and EDSLs and carefully constrained interfaces, but going well beyond that, as far as applying category theory. Using these libraries pushes you toward good solutions.
shell-monad was a first attempt at building such a library. The shell script it generates should always be syntactically valid, and never forgets to quote a shell variable. That's only the basics. It goes further by making it impossible to typo the name of a shell variable or shell function. And it uses phantom types so that the Haskell type checker can check the types of shell variables and functions match up.
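A much smaller sketch of the same tricks, under stated assumptions: this is not shell-monad's API, and Var, quote, setInt, setStr, add and echo are hypothetical names. Shell variables are Haskell values, so a misspelled name fails to compile, and a phantom type records what each variable holds.

```haskell
-- Hypothetical mini-EDSL for illustration only; not shell-monad's real API.

-- A shell variable. The phantom type 'a' records what the variable holds.
newtype Var a = Var String

-- Quote a literal for POSIX sh: wrap it in single quotes, and replace each
-- embedded single quote with '\'' (close, escaped quote, reopen).
quote :: String -> String
quote s = "'" ++ concatMap esc s ++ "'"
  where
    esc '\'' = "'\\''"
    esc c    = [c]

setInt :: Var Int -> Int -> String
setInt (Var n) v = n ++ "=" ++ show v

setStr :: Var String -> String -> String
setStr (Var n) v = n ++ "=" ++ quote v

-- Arithmetic is only allowed on Int variables.
add :: Var Int -> Var Int -> Var Int -> String
add (Var dst) (Var a) (Var b) = dst ++ "=$(( " ++ a ++ " + " ++ b ++ " ))"

-- Variable expansion is always emitted inside double quotes.
echo :: Var a -> String
echo (Var n) = "echo \"$" ++ n ++ "\""

-- Each shell name is spelled out exactly once, here.
count, total :: Var Int
count = Var "count"
total = Var "total"

greeting :: Var String
greeting = Var "greeting"

script :: [String]
script =
  [ setInt count 3
  , setInt total 4
  , add total total count
  , setStr greeting "it's done"
  , echo greeting
  -- , add total greeting count   -- type error: greeting is not a Var Int
  -- , echo greting               -- scope error: the typo does not compile
  ]

main :: IO ()
main = mapM_ putStrLn script
```

The generated script sets total to 7 and prints "it's done" when run with sh; the two commented-out lines show the mistakes that the Haskell compiler rejects before any shell code is produced.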
So I think shell-monad is pretty neat, and I certainly learned a lot about writing Haskell libraries making it. Including how much I still have to learn!
I have not used shell-monad much, but keep meaning to make propellor and git-annex use it for some of their shell script needs. And ponder porting etckeeper to generate its shell scripts using it.
My dad sometimes asks when I'll finish git-annex. The answer is "I don't know" because software like that doesn't have a defined end point; it grows and changes in response to how people use it and how the wider ecosystem develops.
But other software has a well-defined end point and can be finished. Some of my smaller projects that are more or less done include electrum-mnemonic, brainfuck-monad, scroll, yesod-lucid, and haskell-mountpoints.
Studies of free software projects have found that the average free software project was written entirely by one developer, is not very large, and is not being updated. That's often taken to mean it's a failed or dead project. But all the projects above look that way, and are not failures, or dead.
It's good to actually finish some software once in a while!
github-backup is an attempt to take something I don't like -- github's centralization of what should be a decentralized technology -- and find a constructive way to make it at least meet my baseline requirements for centralized systems. Namely that when they go away, I don't lose data.
So, it was written partly with my ArchiveTeam hat on.
A recent bug filed on it, "Backup fails for repositories unavailable due to DMCA takedown", made me happy, because it shows github-backup behaving more or less as intended, although perhaps not in the optimal way.
By the way, this is the only one of my projects that uses github for issue tracking. Intentionally ironically.
It was my second real Haskell program (after git-annex) and so also served as a good exercise in applying what I'd learned about writing Haskell up to that point.
It was written just to solve my own problem, but in a general way, that turned out to be useful in lots of other situations. So over the first half a year or so, it started attracting some early adopters who made some very helpful suggestions.
Then I did the git-annex assistant kickstarter, and started blogging about each day I worked on it. Four years of funding and seven hundred and twenty-one posts later, the git-annex devblog is still going. So, I won't talk about technical details in this post; they've all been covered.
One thing I wondered when starting git-annex -- besides whether I would be able to write it in Haskell at all -- was whether that would prevent it from getting many patches. I count roughly 65 "thanks" messages in the changelog, so it gets perhaps one patch contributed per month. It's hard to say if that's a lot or a little.
Part of git-annex is supporting various cloud storage systems via "special remotes". Of those not written by me, only 1 was contributed in Haskell. Compare with 13 that use the plugin system that lets other programming languages be used.
The other question about using Haskell is, did it make git-annex a better program. I think it did. The strong type system did prevent plenty of bugs, although there have still been some real howlers. The code is still not taking full advantage of the power of Haskell's type system; on the other hand, it uses many Haskell libraries that do leverage the type system more. I've done more small and large refactorings of git-annex than on any other program I've written, because the strong types and referential transparency make refactoring easier and safer in Haskell.
And the code has turned out to be much more flexible, for all its static types, than the kind of code I was writing before. Examples include building the git-annex assistant, which uses the rest of git-annex as a library, and making git-annex run actions concurrently, thanks to there being no global variables to complicate things (and excellent support for concurrency and parallelism in Haskell).
So: Glad I wrote it, glad I used Haskell for it, ecstatic that many other people have found it useful, and astounded that I've been funded to work on it for four years.
moreutils is a little love letter to the Unix Tools philosophy.
It was interesting to try to find new tools as basic as [...] chronic and others, we managed to find several such tools.
So, it was fun to work on moreutils, but it also ran into inherent problems with the Unix Tools philosophy. One is namespacing; there are only so many good short names for commands, and a command like parallel can easily collide with something else. And since Unix Tools have a bigger surface area than a pure function, my parallel is not going to be quite compatible with [...] parallel, even if they were developed with (erm) parallel [...]
Partly due to that problem, I have gotten pickier about adding new tools to moreutils as it's gotten older, and so there's a lot of suggested additions that I will probably never get to.
And as my mention of pure functions suggests, I have kind of moved on from being a big fan of the Unix Tools philosophy. Unix tools are a decent approximation of pure functions for their time, but they are not really pure, and not typed at all, and not usefully namespaced, and this limits them.
[...] a little bit about the reason I wrote pristine-tar in the first place. There were two reasons:
1. I was once in a talk where someone mentioned that Ubuntu had/was developing something that involved regenerating orig tarballs from version control. I asked the obvious question: How could that possibly be done technically? The (slightly hung over) presenter did not have a satisfactory response, so my curiosity was piqued to find a way to do it. (I later heard that Ubuntu has been using pristine-tar.)
2. Sometimes code can be subversive. It can change people's perspective on a topic, nudging discourse in a different direction. It can even point out absurdities in the way things are done. I may or may not have accomplished the subversive part of my goals with pristine-tar.
Code can also escape its original intention. Many current uses of pristine-tar fall into that category. So it seems likely that some people will want it to continue to work even if it's met the two goals above already.
For me, the best part of building pristine-tar was finding an answer to the question "How could that possibly be done technically?" It was also pretty cool to be able to use every tarball in Debian as the test suite for pristine-tar.
I'm afraid I kind of left Debian in the lurch when I stopped maintaining pristine-tar.
"Debian has probably hundreds, if not thousands of git repositories using pristine-tar. We all rely now on an unmaintained, orphaned, and buggy piece of software." -- Norbert Preining
So I was relieved when it finally got a new maintainer just recently.
Still, I don't expect I'll ever use pristine-tar again. It's the only software I've built in the past ten years that I can say that about.
While Branchable has not reached the point of providing much income, it's still running after 6 years. Ikiwiki-hosting makes it pretty easy to maintain it, and I host all of my websites there.
A couple of other people have also found ikiwiki-hosting useful, which is not only nice, but led to some big improvements to it. Mostly though, releasing the software behind the business as free software caused us to avoid shortcuts and build things well.
myrepos is kind of just an elaborated foreach (@myrepos) loop, but its configuration and extension in a sort of hybrid between an .ini file and shell script is quite nice and plenty of other people have found it useful.
I had to write myrepos when I switched from subversion to git, because git's submodules are too limited to meet my needs, and I needed a tool to check out and update many repositories, not necessarily all using the same version control system.
It was called "mr" originally, but I renamed the package because it's impossible to google for "mr". This is the only software I've ever renamed.