12

In a recent blog post, I mentioned how lucky I feel to keep finding ways to work on free software. In the past couple years, I've had a successful Kickstarter, and followed that up with a second crowdfunding campaign, and now a grant is funding my work. A lot to be thankful for.

A one-off crowdfunding campaign to fund free software development is wonderful, if you can pull it off. It can start a new project, or kick an existing one into a higher gear. But in many ways, free software development is a poor match for kickstarter-type crowdfunding. Especially when it comes to ongoing development, which it's really hard to do a crowdfunding pitch for. That's why I was excited to find Snowdrift.coop, which has a unique approach.

Imagine going to a web page for a free software project that you care about, and seeing this in a button: "1283 current patrons will donate MORE when you pledge"

That's a lot stronger incentive than some paypal donation button or flattr link! The details of how it works are explained on their intro page, or see the ever-insightful and thoughtful Mike Linksvayer's blog post about it.

When I found out about this, I immediately sent them a one-off donation. Later, I got to meet one of the developers face to face in Portland. I've also done a small amount of work on the Snowdrift platform, which is itself free software. (My haskell code will actually render that button above!)

Free software is important, and its funding should be based, not on how lucky or good we are at kickstarter pitches, but on its quality and how useful it is to everyone. Snowdrift is the most interesting thing I've seen in this space, and I really hope they succeed. If you agree, they're running their own crowdfunding campaign right now.

You have a machine someplace, probably in The Cloud, and it has Linux installed, but not to your liking. You want to do a clean reinstall, maybe switching the distribution, or getting rid of the cruft. But this requires running an installer, and it's too difficult to run d-i on remote machines.

Wouldn't it be nice if you could point a program at that machine and have it do a reinstall, on the fly, while the machine was running?

This is what I've now taught propellor to do! Here's a working configuration which will make propellor convert a system running Fedora (or probably many other Linux distros) to Debian:

testvm :: Host
testvm = host "testvm.kitenet.net"
        & os (System (Debian Unstable) "amd64")
        & OS.cleanInstallOnce (OS.Confirmed "testvm.kitenet.net")
                `onChange` propertyList "fixing up after clean install"
                        [ User.shadowConfig True
                        , OS.preserveRootSshAuthorized
                        , OS.preserveResolvConf
                        , Apt.update
                        , Grub.boots "/dev/sda"
                                `requires` Grub.installed Grub.PC
                        ]
        & Hostname.sane
        & Hostname.searchDomain
        & Apt.installed ["linux-image-amd64"]
        & Apt.installed ["ssh"]
        & User.hasSomePassword "root"

And here's a video of it in action.

It was surprisingly easy to build this. Propellor already knew how to create a chroot, so from there it basically just has to move files around until the chroot takes over from the old OS.

After the cleanInstallOnce property does its thing, propellor is running inside a freshly debootstrapped Debian system. Then we just need a few more Propertites to get from there to a bootable, usable system: Install grub and the kernel, turn on shadow passwords, preserve a few config files from the old OS, etc.

It's really astounding to me how much easier this was to build than it was to build d-i. It took years to get d-i to the point of being able to install a working system. It took me a few part days to add this capability to propellor (It's 200 lines of code), and I've probably spent a total of less than 30 days total developing propellor in its entirity.

So, what gives? Why is this so much easier? There are a lot of reasons:

Technology is so much better now. I can spin up cloud VMs for testing in seconds; I use VirtualBox to restore a system from a snapshot. So testing is much much easier. The first work on d-i was done by booting real machines, and for a while I was booting them using floppies.
Propellor doesn't have a user interface. The best part of d-i is preseeding, but that was mostly an accident; when I started developing d-i the first thing I wrote was main-menu (which is invisible 99.9% of the time) and we had to develop cdebconf, and tons of other UI. Probably 90% of d-i work involves the UI. Jettisoning the UI entirely thus speeds up development enormously. And propellor's configuration file blows d-i preseeding out of the water in expressiveness and flexability.
Propellor has a much more principled design and implementation. Separating things into Properties, which are composable and reusable gives enormous leverage. Strong type checking and a powerful programming language make it much easier to develop than d-i's mess of shell scripts calling underpowered busybox commands etc. Properties often Just Work the first time they're tested.
No separate runtime. d-i runs in its own environment, which is really a little custom linux distribution. Developing linux distributions is hard. Propellor drops into a live system and runs there. So I don't need to worry about booting up the system, getting it on the network, etc etc. This probably removes another order of magnitude of complexity from propellor as compared with d-i.

This seems like the opposite of the Second System effect to me. So perhaps d-i was the second system all along?

I don't know if I'm going to take this all the way to propellor is d-i 2.0. But in theory, all that's needed now is:

Teaching propellor how to build a bootable image, containing a live Debian system and propellor. (Yes, this would mean reimplementing debian-live, but I estimate 100 lines of code to do it in propellor; most of the Properties needed already exist.) That image would then be booted up and perform the installation.
Some kind of UI that generates the propellor config file.
Adding Properties to partition the disk.

cleanInstallOnce and associated Properties will be included in propellor's upcoming 1.1.0 release, and are available in git now.

Oh BTW, you could parameterize a few Properties by OS, and Propellor could be used to install not just Debian or Ubuntu, but whatever Linux distribution you want. Patches welcomed...

The Memory Palace: This is the way history should be taught, but rarely is. Nate DiMeo takes past events and puts you in the middle of them, in a way that makes you emphathise so much with people from the past. Each episode is a little short story, and they're often only a few minutes long. A great example is this description of when Niagra falls stopped. I have listened to the entire back archive, and want more. Only downside is it's a looong time between new episodes.
The Haskell Cast: Panel discussion with a guest, there is a lot of expertise amoung them and I'm often scrambling to keep up with the barrage of ideas. If this seems too tame, check out The Type Theory Podcast instead..
Benjamen Walker's Theory of Everything: Only caught 2 episodes so far, but they've both been great. Short, punchy, quirky, geeky. Astoundingly good production values.
Lightspeed magazine and Escape Pod blur together for me. Both feature 20-50 minute science fiction short stories, and occasionally other genre fictions. They seem to get all the award-winning short stories. I sometimes fall asleep to these which can make for strange dreams. Two strongly contrasting examples: "Observations About Eggs from the Man Sitting Next to Me on a Flight from Chicago, Illinois to Cedar Rapids, Iowa" and "Pay Phobetor"
Serial: You probably already know about this high profile TAL spinoff. If you didn't before: You're welcome. :) Nuff said.
Redecentralize: Interviews with creators of decentralized internet tools like Tahoe-LAFS, Ethereum, Media Goblin, TeleHash. I just wish it went into more depth on protocols and how they work.
Love and Radio: This American Life squared and on acid.
Debian & Stuff: My friend Asheesh and that guy I ate Thai food with once in Portland in a marvelously unfocused podcast that somehow connects everything up in the end. Only one episode so far; what are you guys waiting on? :P
Hacker Public Radio: Anyone can upload an episode, and multiple episodes are published each week, which makes this a grab bag to pick and choose from occasionally. While mostly about Linux and Free Software, the best episodes are those that veer var afield, such as the 40 minute river swim recording featured in Wildswimming in France.

Also, out of the podcasts I listed previously, I still listen to and enjoy Free As In Freedom, Off the Hook, and the Long Now Seminars.

PS: A nice podcatcher, for the technically inclined is git-annex importfeed. Featuring list of feeds in a text file, and distributed podcatching!

Inspired by "An ASM Monad", I've built a Haskell monad that produces brainfuck programs. The code for this monad is available on hackage, so cabal install brainfuck-monad.

Here's a simple program written using this monad. See if you can guess what it might do:

import Control.Monad.BrainFuck

demo :: String
demo = brainfuckConstants $ \constants -> do
        add 31
        forever constants $ do
                add 1
                output

Here's the brainfuck code that demo generates: >+>++>+++>++++>+++++>++++++>+++++++>++++++++>++++++++++++++++++++++++++++++++<<<<<<<<[>>>>>>>>+.<<<<<<<<]

If you feed that into a brainfuck interpreter (I'm using hsbrainfuck for my testing), you'll find that it loops forever and prints out each character, starting with space (32), in ASCIIbetical order.

The implementation is quite similar to the ASM monad. The main differences are that it builds a String, and that the BrainFuck monad keeps track of the current position of the data pointer (as brainfuck lacks any sane way to manipulate its instruction pointer).

newtype BrainFuck a = BrainFuck (DataPointer -> ([Char], DataPointer, a))

type DataPointer = Integer

-- Gets the current address of the data pointer.
addr :: BrainFuck DataPointer
addr = BrainFuck $ \loc -> ([], loc, loc)

Having the data pointer address available allows writing some useful utility functions like this one, which uses the next (brainfuck opcode >) and prev (brainfuck opcode <) instructions.

-- Moves the data pointer to a specific address.
setAddr :: Integer -> BrainFuck ()
setAddr n = do
        a <- addr
        if a > n
                then prev >> setAddr n
                else if a < n
                        then next >> setAddr n
                        else return ()

Of course, brainfuck is a horrible language, designed to be nearly impossible to use. Here's the code to run a loop, but it's really hard to use this to build anything useful..

-- The loop is only entered if the byte at the data pointer is not zero.
-- On entry, the loop body is run, and then it loops when
-- the byte at the data pointer is not zero.
loopUnless0 :: BrainFuck () -> BrainFuck ()
loopUnless0 a = do
        open
        a
        close

To tame brainfuck a bit, I decided to treat data addresses 0-8 as constants, which will contain the numbers 0-8. Otherwise, it's very hard to ensure that the data pointer is pointing at a nonzero number when you want to start a loop. (After all, brainfuck doesn't let you set data to some fixed value like 0 or 1!)

I wrote a little brainfuckConstants that runs a BrainFuck program with these constants set up at the beginning. It just generates the brainfuck code for a series of ASCII art fishes: >+>++>+++>++++>+++++>++++++>+++++++>++++++++>

With the fishes^Wconstants in place, it's possible to write a more useful loop. Notice how the data pointer location is saved at the beginning, and restored inside the loop body. This ensures that the provided BrainFuck action doesn't stomp on our constants.

-- Run an action in a loop, until it sets its data pointer to 0.
loop :: BrainFuck () -> BrainFuck ()
loop a = do
    here <- addr
    setAddr 1
    loopUnless0 $ do
        setAddr here
        a

I haven't bothered to make sure that the constants are really constant, but that could be done. It would just need a Control.Monad.BrainFuck.Safe module, that uses a different monad, in which incr and decr and input don't do anything when the data pointer is pointing at a constant. Or, perhaps this could be statically checked at the type level, with type level naturals. It's Haskell, we can make it safer if we want to. ;)

So, not only does this BrainFuck monad allow writing brainfuck code using crazy haskell syntax, instead of crazy brainfuck syntax, but it allows doing some higher-level programming, building up a useful(!?) library of BrainFuck combinators and using them to generate brainfuck code you'd not want to try to write by hand.

Of course, the real point is that "monad" and "brainfuck" so obviously belonged together that it would have been a crime not to write this.

Shell script is the lingua franca of Unix, it's available everywhere and often the only reasonable choice to Get Stuff Done. But it's also clumsy and it's easy to write unsafe shell scripts, that forget to quote variables, typo names of functions, etc.

Wouldn't it be nice if we could write code in some better language, that generated nicely formed shell scripts and avoided such gotchas? Today, I've built a Haskell monad that can generate shell code.

Here's a fairly involved example. This demonstrates several features, including the variadic cmd, the ability to define shell functions, to bind and use shell variables, to build pipes (with the -:- operator), and to factor out generally useful haskell functions like pipeLess and promptFor ...

santa = script $ do
    hohoho <- func $
        cmd "echo" "Ho, ho, ho!" "Merry xmas!"
    hohoho

    promptFor "What's your name?" $ \name -> pipeLess $ do
        cmd "echo" "Let's see what's in" (val name <> quote "'s") "stocking!"
        forCmd (cmd "ls" "-1" (quote "/home/" <> val name)) $ \f -> do
            cmd "echo" "a shiny new" f
            hohoho

    cmd "rm" "/table/cookies" "/table/milk"
    hohoho

pipeLess :: Script () -> Script ()
pipeLess c = c -|- cmd "less"

promptFor :: T.Text -> (Var -> Script ()) -> Script ()
promptFor prompt cont = do
    cmd "printf" (prompt <> " ")
    var <- newVar "prompt"
    readVar var
    cont var

When run, that haskell program generates this shell code. Which, while machine-generated, has nice indentation, and is generally pretty readable.

#!/bin/sh
f1 () { :
    echo 'Ho, ho, ho!' 'Merry xmas!'
}
f1
printf 'What'"'"'s your name?  '
read '_prompt1'
(
    echo 'Let'"'"'s see what'"'"'s in' "$_prompt1"''"'"'s' 'stocking!'
    for _x1 in $(ls '-1' '/home/'"$_prompt1")
    do :
        echo 'a shiny new' "$_x1"
        f1
    done
) | (
    less
)
rm '/table/cookies' '/table/milk'
f1

Santa has already uploaded shell-monad to hackage and git.

There's a lot of things that could be added to this library (if, while, redirection, etc), but I can already see using it in various parts of propellor and git-annex that need to generate shell code.

I have been hard at work on the shell-monad ever since it was born on Christmas Eve. It's now up to 820 lines of code, and has nearly comprehensive coverage of all shell features.

Time to use it for something interesting! Let's make a shell script and a haskell program that both speak a simple protocol. This kind of thing could be used by propellor when it's deploying itself to a new host. The haskell program can ssh to a remote host and run the shell program, and talk back and forth over stdio with it, using the protocol they both speak.

abstract beginnings

First, we'll write a data type for the commands in the protocol.

data Proto
    = Foo String
    | Bar
    | Baz Integer
    deriving (Show)

Now, let's go type class crazy!

class Monad t => OutputsProto t where
    output :: Proto -> t ()

instance OutputsProto IO where
    output = putStrLn . fromProto

So far, nothing interesting; this makes the IO monad an instance of the OutputsProto type class, and gives a simple implementation to output a line of the protocol.

instance OutputsProto Script where
    output = cmd "echo" . fromProto

Now it gets interesting. The Script monad is now also a member of the OutputsProto. To output a line of the protocol, it just uses echo. Yeah -- shell code is a member of a haskell type class. Awesome -- most abstract shell code evar!

Similarly, we can add another type class for monads that can input the protocol:

class Monad t => InputsProto t p where
    input :: t p

instance InputsProto IO Proto where
    input = toProto <$> readLn

instance InputsProto Script Var where
    input = do
        v <- newVar ()
        readVar v
        return v

While the IO version reads and deserializes a line back to a Proto, the shell script version of this returns a Var, which has the newly read line in it, not yet deserialized. Why the difference? Well, Haskell has data types, and shell does not ...

speaking the protocol

Now we have enough groundwork to write haskell code in the IO monad that speaks the protocol in arbitrary ways. For example:

protoExchangeIO :: Proto -> IO Proto
protoExchangeIO p = do
    output p
    input

fooIO :: IO ()
fooIO = do
    resp <- protoExchangeIO (Foo "starting up")
    -- etc

But that's trivial and uninteresting. Anyone who has read to here certianly knows how to write haskell code in the IO monad. The interesting part is making the shell program speak the protocol, including doing various things when it receives the commands.

foo :: Script ()
foo = do
    stopOnFailure True
    handler <- func (NamedLike "handler") $
        handleProto =<< input
    output (Foo "starting up")
    handler
    output Bar
    handler

handleFoo :: Var -> Script ()
handleFoo v = toStderr $ cmd "echo" "yay, I got a Foo" v

handleBar :: Script ()
handleBar = toStderr $ cmd "echo" "yay, I got a Bar"

handleBaz :: Var -> Script ()
handleBaz num = forCmd (cmd "seq" (Val (1 :: Int)) num) $
    toStderr . cmd "echo" "yay, I got a Baz"

serialization

I've left out a few serialization functions. fromProto is used in both instances of OutputsProto. The haskell program and the script will both use this to serialize Proto.

fromProto :: Proto -> String
fromProto (Foo s) = pFOO ++ " " ++ s
fromProto Bar = pBAR ++ " "
fromProto (Baz i) = pBAZ ++ " " ++ show i

pFOO, pBAR, pBAZ :: String
(pFOO, pBAR, pBAZ) = ("FOO", "BAR", "BAZ")

And here's the haskell function to convert the other direction, which was also used earlier.

toProto :: String -> Proto
toProto s = case break (== ' ') s of
    (w, ' ':rest)
        | w == pFOO -> Foo rest
        | w == pBAR && null rest -> Bar
        | w == pBAZ -> Baz (read rest)
        | otherwise -> error $ "unknown protocol command: " ++ w
    (_, _) -> error "protocol splitting error"

We also need a version of that written in the Script monad. Here it is. Compare and contrast the function below with the one above. They're really quite similar. (Sadly, not so similar to allow refactoring out a common function..)

handleProto :: Var -> Script ()
handleProto v = do
    w <- getProtoCommand v
    let rest = getProtoRest v
    caseOf w
        [ (quote (T.pack pFOO), handleFoo =<< rest)
        , (quote (T.pack pBAR), handleBar)
        , (quote (T.pack pBAZ), handleBaz =<< rest)
        , (glob "*", do
            toStderr $ cmd "echo" "unknown protocol command" w
            cmd "false"
          )
        ]

Both toProto and handleProto split the incoming line apart into the first word, and the rest of the line, then match the first word against the commands in the protocol, and dispatches to appropriate actions. So, how do we split a variable apart like that in the Shell monad? Like this...

getProtoCommand :: Var -> Script Var
getProtoCommand v = trimVar LongestMatch FromEnd v (glob " *")

getProtoRest :: Var -> Script Var
getProtoRest v = trimVar ShortestMatch FromBeginning v (glob "[! ]*[ ]")

(This could probably be improved by using a DSL to generate the globs too..)

conclusion

And finally, here's a main to generate the shell script!

main :: IO ()
main = T.writeFile "protocol.sh" $ script foo

The pretty-printed shell script that produces is not very interesting, but I'll include it at the end for completeness. More interestingly for the purposes of sshing to a host and running the command there, we can use linearScript to generate a version of the script that's all contained on a single line. Also included below.

I could easily have written the pretty-printed version of the shell script in twice the time that it took to write the haskell program that generates it and also speaks the protocol itself.

I would certianly have had to test the hand-written shell script repeatedly. Code like for _x in $(seq 1 "${_v#[!\ ]*[\ ]}") doesn't just write and debug itself. (Until now!)

But, the generated scrpt worked 100% on the first try! Well, it worked as soon as I got the Haskell program to compile...

But the best part is that the Haskell program and the shell script don't just speak the same protocol. They both rely on the same definition of Proto. So this is fairly close to the kind of type-safe protocol serialization that Fay provides, when compiling Haskell to javascript.

I'm getting the feeling that I won't be writing too many nontrivial shell scripts by hand anymore! :)

the complete haskell program

Is here, all 99 lines of it.

the pretty-printed shell program

#!/bin/sh
set -x
_handler () { :
    _v=
    read _v
    case "${_v%%\ *}" in FOO) :
        echo 'yay, I got a Foo' "${_v#[!\ ]*[\ ]}" >&2
    : ;; BAR) :
        echo 'yay, I got a Bar' >&2
    : ;; BAZ) :
        for _x in $(seq 1 "${_v#[!\ ]*[\ ]}")
        do :
            echo 'yay, I got a Baz' "$_x" >&2
        done
    : ;; *) :
        echo 'unknown protocol command' "${_v%%\ *}" >&2
        false
    : ;; esac
}
echo 'FOO starting up'
_handler
echo 'BAR '
_handler

the one-liner shell program

set -p; _handler () { :;    _v=;    read _v;    case "${_v%%\ *}" in FOO) :;        echo 'yay, I got a Foo' "${_v#[!\ ]*[\ ]}" >&2;     : ;; BAR) :;        echo 'yay, I got a Bar' >&2;    : ;; BAZ) :;        for _x in $(seq 1 "${_v#[!\ ]*[\ ]}");      do :;           echo 'yay, I got a Baz' "$_x" >&2;      done;   : ;; *) :;      echo 'unknown protocol command' "${_v%%\ *}" >&2;       false;  : ;; esac; }; echo 'FOO starting up'; _handler; echo 'BAR '; _handler

←	Dec 2014					→
S	M	T	W	T	F	S
	1 snowdrift - sustainable crowdfunding for free software development	2	3	4	5 clean OS reinstalls with propellor	6
7	8	9 podcasts that don't suck, 2014 edition	10	11	12 a brainfuck monad	13
14	15	16	17	18	19	20
21	22	23	24 generating shell scripts from haskell using a shell monad	25	26 shell monad day 3	27
28	29	30	31