shell monad day 3

I have been hard at work on the shell-monad ever since it was born on Christmas Eve. It's now up to 820 lines of code, and has nearly comprehensive coverage of all shell features.

Time to use it for something interesting! Let's make a shell script and a haskell program that both speak a simple protocol. This kind of thing could be used by propellor when it's deploying itself to a new host. The haskell program can ssh to a remote host and run the shell program, and talk back and forth over stdio with it, using the protocol they both speak.

abstract beginnings
speaking the protocol
serialization
conclusion
the complete haskell program
the pretty-printed shell program
the one-liner shell program

abstract beginnings

First, we'll write a data type for the commands in the protocol.

data Proto
    = Foo String
    | Bar
    | Baz Integer
    deriving (Show)

Now, let's go type class crazy!

class Monad t => OutputsProto t where
    output :: Proto -> t ()

instance OutputsProto IO where
    output = putStrLn . fromProto

So far, nothing interesting; this makes the IO monad an instance of the OutputsProto type class, and gives a simple implementation to output a line of the protocol.

instance OutputsProto Script where
    output = cmd "echo" . fromProto

Now it gets interesting. The Script monad is now also a member of the OutputsProto. To output a line of the protocol, it just uses echo. Yeah -- shell code is a member of a haskell type class. Awesome -- most abstract shell code evar!

Similarly, we can add another type class for monads that can input the protocol:

class Monad t => InputsProto t p where
    input :: t p

instance InputsProto IO Proto where
    input = toProto <$> readLn

instance InputsProto Script Var where
    input = do
        v <- newVar ()
        readVar v
        return v

While the IO version reads and deserializes a line back to a Proto, the shell script version of this returns a Var, which has the newly read line in it, not yet deserialized. Why the difference? Well, Haskell has data types, and shell does not ...

speaking the protocol

Now we have enough groundwork to write haskell code in the IO monad that speaks the protocol in arbitrary ways. For example:

protoExchangeIO :: Proto -> IO Proto
protoExchangeIO p = do
    output p
    input

fooIO :: IO ()
fooIO = do
    resp <- protoExchangeIO (Foo "starting up")
    -- etc

But that's trivial and uninteresting. Anyone who has read to here certianly knows how to write haskell code in the IO monad. The interesting part is making the shell program speak the protocol, including doing various things when it receives the commands.

foo :: Script ()
foo = do
    stopOnFailure True
    handler <- func (NamedLike "handler") $
        handleProto =<< input
    output (Foo "starting up")
    handler
    output Bar
    handler

handleFoo :: Var -> Script ()
handleFoo v = toStderr $ cmd "echo" "yay, I got a Foo" v

handleBar :: Script ()
handleBar = toStderr $ cmd "echo" "yay, I got a Bar"

handleBaz :: Var -> Script ()
handleBaz num = forCmd (cmd "seq" (Val (1 :: Int)) num) $
    toStderr . cmd "echo" "yay, I got a Baz"

serialization

I've left out a few serialization functions. fromProto is used in both instances of OutputsProto. The haskell program and the script will both use this to serialize Proto.

fromProto :: Proto -> String
fromProto (Foo s) = pFOO ++ " " ++ s
fromProto Bar = pBAR ++ " "
fromProto (Baz i) = pBAZ ++ " " ++ show i

pFOO, pBAR, pBAZ :: String
(pFOO, pBAR, pBAZ) = ("FOO", "BAR", "BAZ")

And here's the haskell function to convert the other direction, which was also used earlier.

toProto :: String -> Proto
toProto s = case break (== ' ') s of
    (w, ' ':rest)
        | w == pFOO -> Foo rest
        | w == pBAR && null rest -> Bar
        | w == pBAZ -> Baz (read rest)
        | otherwise -> error $ "unknown protocol command: " ++ w
    (_, _) -> error "protocol splitting error"

We also need a version of that written in the Script monad. Here it is. Compare and contrast the function below with the one above. They're really quite similar. (Sadly, not so similar to allow refactoring out a common function..)

handleProto :: Var -> Script ()
handleProto v = do
    w <- getProtoCommand v
    let rest = getProtoRest v
    caseOf w
        [ (quote (T.pack pFOO), handleFoo =<< rest)
        , (quote (T.pack pBAR), handleBar)
        , (quote (T.pack pBAZ), handleBaz =<< rest)
        , (glob "*", do
            toStderr $ cmd "echo" "unknown protocol command" w
            cmd "false"
          )
        ]

Both toProto and handleProto split the incoming line apart into the first word, and the rest of the line, then match the first word against the commands in the protocol, and dispatches to appropriate actions. So, how do we split a variable apart like that in the Shell monad? Like this...

getProtoCommand :: Var -> Script Var
getProtoCommand v = trimVar LongestMatch FromEnd v (glob " *")

getProtoRest :: Var -> Script Var
getProtoRest v = trimVar ShortestMatch FromBeginning v (glob "[! ]*[ ]")

(This could probably be improved by using a DSL to generate the globs too..)

conclusion

And finally, here's a main to generate the shell script!

main :: IO ()
main = T.writeFile "protocol.sh" $ script foo

The pretty-printed shell script that produces is not very interesting, but I'll include it at the end for completeness. More interestingly for the purposes of sshing to a host and running the command there, we can use linearScript to generate a version of the script that's all contained on a single line. Also included below.

I could easily have written the pretty-printed version of the shell script in twice the time that it took to write the haskell program that generates it and also speaks the protocol itself.

I would certianly have had to test the hand-written shell script repeatedly. Code like for _x in $(seq 1 "${_v#[!\ ]*[\ ]}") doesn't just write and debug itself. (Until now!)

But, the generated scrpt worked 100% on the first try! Well, it worked as soon as I got the Haskell program to compile...

But the best part is that the Haskell program and the shell script don't just speak the same protocol. They both rely on the same definition of Proto. So this is fairly close to the kind of type-safe protocol serialization that Fay provides, when compiling Haskell to javascript.

I'm getting the feeling that I won't be writing too many nontrivial shell scripts by hand anymore! :)

the complete haskell program

Is here, all 99 lines of it.

the pretty-printed shell program

#!/bin/sh
set -x
_handler () { :
    _v=
    read _v
    case "${_v%%\ *}" in FOO) :
        echo 'yay, I got a Foo' "${_v#[!\ ]*[\ ]}" >&2
    : ;; BAR) :
        echo 'yay, I got a Bar' >&2
    : ;; BAZ) :
        for _x in $(seq 1 "${_v#[!\ ]*[\ ]}")
        do :
            echo 'yay, I got a Baz' "$_x" >&2
        done
    : ;; *) :
        echo 'unknown protocol command' "${_v%%\ *}" >&2
        false
    : ;; esac
}
echo 'FOO starting up'
_handler
echo 'BAR '
_handler

the one-liner shell program

set -p; _handler () { :;    _v=;    read _v;    case "${_v%%\ *}" in FOO) :;        echo 'yay, I got a Foo' "${_v#[!\ ]*[\ ]}" >&2;     : ;; BAR) :;        echo 'yay, I got a Bar' >&2;    : ;; BAZ) :;        for _x in $(seq 1 "${_v#[!\ ]*[\ ]}");      do :;           echo 'yay, I got a Baz' "$_x" >&2;      done;   : ;; *) :;      echo 'unknown protocol command' "${_v%%\ *}" >&2;       false;  : ;; esac; }; echo 'FOO starting up'; _handler; echo 'BAR '; _handler