concurrent-output is a Haskell library I've developed this week to make it easier to write console programs that do a lot of different things concurrently and want to serialize their concurrent output sanely.

It's increasingly easy to write concurrent programs, but all their status reporting has to feed back through the good old console, which is still obstinately serial.

Haskell illustrates this problem well with this "Linus's first kernel" equivalent, interleaving the output of 2 threads:

> import System.IO
> import Control.Concurrent.Async
> putStrLn (repeat 'A') `concurrently` putStrLn (repeat 'B')
BABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABA
BABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABA
...

That's fun, but also horrible if you wanted to display some messages to the user:

> putStrLn "washed the car" `concurrently` putStrLn "walked the dog"
walwkaesdh etdh et hdeo gc
ar

To add to the problem, we often want to run separate programs concurrently, which have output of their own to display. And, just to keep things interesting, sometimes a Unix program will behave differently when stdout is not connected to a terminal (e.g., ls | cat).

Taming simple concurrent programs like these so they generate readable output involves a lot of plumbing: something like running the actions concurrently, taking care to capture the output of any commands, and then feeding the output that the user should see through some sort of serializing channel to the display. Dealing with that when you just wanted a simple concurrent program risks ending up with a not-so-simple program.
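To give a feel for that plumbing, here is a minimal sketch of a serializing channel using only base. It is not how concurrent-output is implemented; runSerialized and the say callback are names made up for this illustration. Each worker queues whole messages on a Chan, and a single loop is the only thing that ever writes to the display.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.Chan (newChan, readChan, writeChan)
import Control.Exception (finally)
import Control.Monad (forM_)

-- | Hypothetical helper: run every worker in its own thread. A worker
-- never writes to the console itself; it calls the `say` callback it is
-- given, which queues one whole message. Only the drain loop calls `sink`.
runSerialized :: (String -> IO ()) -> [(String -> IO ()) -> IO ()] -> IO ()
runSerialized sink workers = do
    queue <- newChan
    forM_ workers $ \worker -> forkIO $
        -- Nothing marks "this worker is done", even if it threw.
        worker (writeChan queue . Just)
            `finally` writeChan queue Nothing
    let drain :: Int -> IO ()
        drain 0 = pure ()                     -- every worker has finished
        drain n = do
            msg <- readChan queue
            case msg of
                Just s  -> sink s >> drain n  -- one message at a time
                Nothing -> drain (n - 1)
    drain (length workers)
```

With this, runSerialized putStr [\say -> say "washed the car\n", \say -> say "walked the dog\n"] prints both lines whole, in whichever order the threads happened to queue them. It does nothing about output from external commands, which is where most of the remaining plumbing goes.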

So, I wanted a library with basically 2 functions:

outputConcurrent :: String -> IO ()
    
createProcessConcurrent :: CreateProcess -> IO whatever

The idea is, you make your program use outputConcurrent to display all its output, and each String you pass to that will be displayed serially, without getting mixed up with any other concurrent output.

And, you make your program use createProcessConcurrent everywhere it starts a process that might output to stdout or stderr, and it'll likewise make sure its output is displayed serially.

Oh, and createProcessConcurrent should avoid redirecting stdout and stderr away from the console, when no other concurrent output is happening. So, if programs are mostly run sequentially, they behave as they normally would at the console; any behavior changes should only occur when there is concurrency. (It might also be nice for it to allocate ttys and run programs there to avoid any behavior changes at all, although I have not tried to do that.)
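The "only change behavior when there is concurrency" rule can be sketched as a non-blocking attempt to claim the console. This is an assumption about one way to structure it, not the library's code; directOrBuffered and newConsoleLock are hypothetical names. If the console is free, the direct action runs (for a process, that would mean starting it with stdout and stderr inherited); if something else holds the console, the buffered action runs instead (capturing the output to display later).

```haskell
import Control.Concurrent.MVar (MVar, newMVar, putMVar, tryTakeMVar)
import Control.Exception (finally, mask)

-- | Hypothetical: a full MVar means the console is free.
newConsoleLock :: IO (MVar ())
newConsoleLock = newMVar ()

-- | Hypothetical: run `direct` while holding the console if it can be
-- claimed without waiting; otherwise fall back to `buffered`.
directOrBuffered :: MVar () -> IO a -> IO a -> IO a
directOrBuffered consoleLock direct buffered = mask $ \restore -> do
    claimed <- tryTakeMVar consoleLock    -- never blocks
    case claimed of
        -- We own the console: run normally, release it even on error.
        Just () -> restore direct `finally` putMVar consoleLock ()
        -- Someone else is writing: take the capturing path.
        Nothing -> restore buffered
```

A program that runs its commands one after another always takes the direct branch, so those commands see a real terminal and behave as usual.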

And that should be pretty much the whole API, although it's ok if it needs some function called by main to set it up:

import Control.Concurrent.Async
import System.Console.Concurrent
import System.Process

main = withConcurrentOutput $
    outputConcurrent "washed the car\n"
        `concurrently`
    createProcessConcurrent (proc "ls" [])
        `concurrently`
    outputConcurrent "walked the dog\n"

$ ./demo
washed the car
walked the dog
Maildir/  bin/  doc/  html/  lib/  mail/  mnt/  src/  tmp/

I think that's a pretty good API to deal with this concurrent output problem. Anyone know of any other attempts at this I could learn from?

I implemented this over the past 3 days and 320 lines of code. It got rather hairy:

  • It has to do buffering of the output.
  • There can be any quantity of output, but program memory use should be reasonably small. Solved by buffering up to 1 MB of output in RAM, and writing excess buffer to temp files.
  • Falling off the end of the program is complicated; there can be buffered output to flush, and it may have to wait for some processes to finish running, etc.
  • The locking was tough to get right! I could not have managed to write it correctly without STM.
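To show why STM helps with that last point, here is a small sketch, assuming the stm package and a much simpler design than the real library. Console, newConsole and emit are hypothetical names. A thread either claims the console or queues its message, in one atomic step, and the thread holding the console releases it in the same transaction that finds the queue empty, so no message can slip in between the check and the release.

```haskell
import Control.Concurrent.STM
    (TVar, atomically, newTVarIO, readTVar, writeTVar)
import Control.Monad (unless, when)

-- | Hypothetical console state for this sketch.
data Console = Console
    { busy    :: TVar Bool      -- True while some thread is writing
    , pending :: TVar [String]  -- messages queued meanwhile, newest first
    }

newConsole :: IO Console
newConsole = Console <$> newTVarIO False <*> newTVarIO []

-- | Display a message via `sink`, or queue it if the console is busy.
emit :: (String -> IO ()) -> Console -> String -> IO ()
emit sink c msg = do
    claimed <- atomically $ do
        isBusy <- readTVar (busy c)
        if isBusy
            then do
                q <- readTVar (pending c)
                writeTVar (pending c) (msg : q)   -- the holder will show it
                pure False
            else writeTVar (busy c) True >> pure True
    when claimed $ sink msg >> drain
  where
    -- Flush whatever was queued while we were writing.
    drain = do
        queued <- atomically $ do
            q <- readTVar (pending c)
            writeTVar (pending c) []
            -- Release only if nothing is left, atomically with the check.
            when (null q) $ writeTVar (busy c) False
            pure (reverse q)
        unless (null queued) $ mapM_ sink queued >> drain
```

This leaves out a lot: the queue is unbounded rather than spilling to temp files, nothing waits for running processes, and an exception in sink would leave the console marked busy.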

It seems to work pretty great though. I got Propellor using it, and Propellor can now run actions concurrently!