concurrent-output is a Haskell library I've developed this week to make it easier to write console programs that do a lot of different things concurrently and need to serialize their concurrent output sanely.
It's increasingly easy to write concurrent programs, but all their status reporting has to feed back through the good old console, which is still obstinately serial.
Haskell illustrates this problem well with this "Linus's first kernel" equivalent, interleaving the output of 2 threads:

    > import System.IO
    > import Control.Concurrent.Async
    > putStrLn (repeat 'A') `concurrently` putStrLn (repeat 'B')
    BABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABA
    BABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABABA
    ...

That's fun, but also horrible if you wanted to display some messages to the user:

    > putStrLn "washed the car" `concurrently` putStrLn "walked the dog"
    walwkaesdh etdh et hdeo gc ar

To add to the problem, we often want to run separate programs concurrently,
which have output of their own to display. And, just to keep things
interesting, sometimes a unix program will behave differently when stdout
is not connected to a terminal (eg, ls | cat).
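
That difference typically comes from the program asking whether its stdout is a terminal. As a small illustration, here is a sketch using hIsTerminalDevice from System.IO; the describeOutput helper and its messages are hypothetical, made up for this example:

```haskell
import System.IO (hIsTerminalDevice, stdout)

-- Hypothetical helper: pick a layout the way ls-like tools tend to.
describeOutput :: Bool -> String
describeOutput isTty
    | isTty     = "terminal: multi-column, possibly colored output"
    | otherwise = "pipe or file: one entry per line"

main :: IO ()
main = do
    -- True when stdout is attached to a terminal, False for pipes and files.
    isTty <- hIsTerminalDevice stdout
    putStrLn (describeOutput isTty)
```

Running it directly and then piping it through cat should take the two different branches.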
Taming simple concurrent programs like these so that they generate readable output involves a lot of plumbing: run the actions concurrently, taking care to capture the output of any commands, and then feed the output that the user should see through some sort of serializing channel to the display. Dealing with that when you just wanted a simple concurrent program risks ending up with a not-so-simple program.
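
To give a feel for the simplest piece of that plumbing, here is a minimal sketch of a serializing channel built from a plain MVar used as a lock. lockedPutStr is a hypothetical helper, not anything from the library, and this version does nothing about output from child processes:

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar

-- Hypothetical helper: print a whole string while holding a lock,
-- so two threads can never interleave their characters.
lockedPutStr :: MVar () -> String -> IO ()
lockedPutStr lock s = withMVar lock (\_ -> putStr s)

main :: IO ()
main = do
    lock <- newMVar ()
    done <- newEmptyMVar
    -- Run the second message in its own thread and signal completion.
    _ <- forkIO $ do
        lockedPutStr lock "walked the dog\n"
        putMVar done ()
    lockedPutStr lock "washed the car\n"
    -- Wait for the forked thread so its message is not lost at exit.
    takeMVar done
```

The two lines can still come out in either order, but each one arrives intact. Every caller has to thread the lock through by hand, which is exactly the kind of plumbing that makes a simple program less simple.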
So, I wanted a library with basically 2 functions:

    outputConcurrent :: String -> IO ()
    createProcessConcurrent :: CreateProcess -> IO whatever

The idea is, you make your program use
outputConcurrent to display
all its output, and each String you pass to that will be displayed serially,
without getting mixed up with any other concurrent output.
And, you make your program use
createProcessConcurrent everywhere it
starts a process that might output to stdout or stderr, and it'll likewise
make sure its output is displayed serially.
createProcessConcurrent should avoid redirecting stdout and
stderr away from the console, when no other concurrent output is happening.
So, if programs are mostly run sequentially, they behave as they normally
would at the console; any behavior changes should only occur when
there is concurrency. (It might also be nice for it to allocate
ttys and run programs there to avoid any behavior changes at all,
although I have not tried to do that.)
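
For comparison, the naive way to get a command's output displayed serially is to always capture it and print it in one piece. This sketch does that with readCreateProcess from System.Process and an MVar lock; runCaptured is a hypothetical name, and unlike the behavior described above it always redirects stdout, so the child never sees a terminal:

```haskell
import Control.Concurrent.MVar
import System.Process (CreateProcess, proc, readCreateProcess)

-- Hypothetical stand-in for the idea, not the library's implementation:
-- capture the child's stdout, then print it in one piece under a lock.
-- stderr is not captured here and still goes straight to the console.
runCaptured :: MVar () -> CreateProcess -> IO ()
runCaptured lock p = do
    -- Empty stdin; readCreateProcess throws if the command exits non-zero.
    out <- readCreateProcess p ""
    withMVar lock (\_ -> putStr out)

main :: IO ()
main = do
    lock <- newMVar ()
    runCaptured lock (proc "echo" ["walked the dog"])
```

Nothing is shown until the command exits, and a command like ls will switch to its non-terminal format, which is the behavior change the design above tries to avoid when nothing else is running.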
And that should be pretty much the whole API, although it's ok if it needs some function called by main to set it up:

    import Control.Concurrent.Async
    import System.Console.Concurrent
    import System.Process

    main = withConcurrentOutput $
        outputConcurrent "washed the car\n"
            `concurrently`
        createProcessConcurrent (proc "ls" [])
            `concurrently`
        outputConcurrent "walked the dog\n"

    $ ./demo
    washed the car
    walked the dog
    Maildir/ bin/ doc/ html/ lib/ mail/ mnt/ src/ tmp/

I think that's a pretty good API to deal with this concurrent output problem. Anyone know of any other attempts at this I could learn from?
I implemented this over the past 3 days and 320 lines of code. It got rather hairy:
- It has to do buffering of the output.
- There can be any quantity of output, but program memory use should be reasonably small. Solved by buffering up to 1 MB of output in RAM, and writing any excess to temp files.
- Falling off the end of the program is complicated; there can be buffered output to flush and it may have to wait for some processes to finish running etc.
- The locking was tough to get right! I could not have managed to write it correctly without STM.
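
To show the flavor of the locking, here is a rough sketch of an output lock built on a TMVar from the stm package. All the names are hypothetical, and the real code has to handle far more than this (buffers, child processes, shutdown):

```haskell
import Control.Concurrent.STM
import Control.Exception (bracket_)

-- Hypothetical names throughout; a sketch of an STM-based output lock.
-- A full TMVar means the lock is free, an empty one means it is held.
newtype OutputLock = OutputLock (TMVar ())

newOutputLock :: IO OutputLock
newOutputLock = OutputLock <$> newTMVarIO ()

-- Take the lock, run the action, and always give the lock back,
-- even if the action throws.
withOutputLock :: OutputLock -> IO a -> IO a
withOutputLock (OutputLock v) =
    bracket_ (atomically (takeTMVar v)) (atomically (putTMVar v ()))

-- Non-blocking attempt: True if the lock was free and is now held.
tryTakeOutputLock :: OutputLock -> IO Bool
tryTakeOutputLock (OutputLock v) = atomically $ do
    r <- tryTakeTMVar v
    return (r == Just ())

main :: IO ()
main = do
    l <- newOutputLock
    withOutputLock l (putStrLn "holding the lock")
    free <- tryTakeOutputLock l
    print free
```

A non-blocking check like tryTakeOutputLock is the sort of thing that could decide whether a process may write straight to the console or has to be buffered. The appeal of STM is that such a check can be combined with other state changes in one atomic transaction.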
It seems to work pretty great though. I got Propellor using it, and Propellor can now run actions concurrently!