Three thousand lines of code is not a huge program, but it is enough to get a pretty good feel for a language. Now that I've completed my first real Haskell program I feel that I've gotten over several of the humps in the learning curve and am starting to get a good feel for it.

Actually, I've written closer to five thousand lines, since there were several big refactorings. One was when I stopped manually threading my program state around and added a StateT monad. I did know from the beginning I would need one, but it seemed easier and a better learning exercise to let the program start out with a vestigial tail and gills before growing up into a modern Haskell program. (I suppose it's still written in baby-Haskell, really..)
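
To make that refactoring concrete, here is a minimal sketch of the before and after. It is not git-annex's actual code: AppState, its fields, and the step functions are hypothetical names made up for illustration, and it assumes the transformers package that ships with ghc.

```haskell
import Control.Monad (when)
import Control.Monad.Trans.Class (lift)
import Control.Monad.Trans.State (StateT, evalStateT, gets, modify)

-- Hypothetical program state; the real git-annex state is not shown here.
data AppState = AppState
  { verbose :: Bool
  , counter :: Int
  }

-- Before: every function takes the state and hands back the updated one.
stepManual :: AppState -> IO AppState
stepManual st = do
  when (verbose st) (putStrLn "stepping")
  return st { counter = counter st + 1 }

runManual :: AppState -> IO Int
runManual st0 = do
  st1 <- stepManual st0
  st2 <- stepManual st1
  return (counter st2)

-- After: StateT carries the state, so it no longer appears in signatures.
type App = StateT AppState IO

step :: App ()
step = do
  v <- gets verbose
  when v (lift (putStrLn "stepping"))
  modify (\st -> st { counter = counter st + 1 })

runApp :: AppState -> IO Int
runApp = evalStateT (step >> step >> gets counter)
```

Both runManual and runApp take two steps from the same starting state; the difference is only in who does the bookkeeping.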

Another refactoring came when I realized I needed to use a custom data type, not String, to represent keys. That was a great experience in type-based refactoring. Being able to keep typing ':make' and landing on the next bit of code that needed fixing was great, and simply adding that one type exposed several non-obvious bugs.
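
A rough sketch of what such a key type might look like follows. The field names and the "backend:name" string form are assumptions for illustration, not necessarily what git-annex uses; the point is only that once a key has its own type, the compiler flags every place that still treats it as a bare String.

```haskell
-- Hypothetical key type; the fields are illustrative only.
data Key = Key
  { keyBackend :: String
  , keyName    :: String
  } deriving (Eq, Ord, Show)

-- Parse the assumed "backend:name" form, rejecting anything malformed.
parseKey :: String -> Maybe Key
parseKey s = case break (== ':') s of
  (backend, ':' : name)
    | not (null backend) && not (null name) -> Just (Key backend name)
  _ -> Nothing

-- Turn a key back into its string form.
showKey :: Key -> String
showKey (Key backend name) = backend ++ ":" ++ name

-- A function that wants a Key can no longer be handed a filename by
-- accident; the directory argument is a hypothetical layout.
keyFile :: FilePath -> Key -> FilePath
keyFile dir key = dir ++ "/" ++ showKey key
```

Changing a signature from String to Key is what turns ':make' into a to-do list: each type error marks a caller that has to decide how it gets a real key.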

I found myself writing code that is much more solid and reusable than normally comes easily. And yet it's also very malleable. Actually, pulling out better data types and abstractions can get a bit addictive.

When I realized that I had a similar three-stage control flow being used for each of git-annex's subcommands, and factored that control flow out into a function that used the 3 data types below, I felt I'd gone down that rabbit hole perhaps far enough for now.

type SubCmdStart = String -> Annex (Maybe SubCmdPerform)
type SubCmdPerform = Annex (Maybe SubCmdCleanup)
type SubCmdCleanup = Annex Bool

(That will allow for some nice parallelism later though, and removed dozens of lines of code, so was worth it.)
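
The factored-out control flow could look roughly like the sketch below. The three type synonyms are the ones above; everything else is an assumption: Annex is stood in for by a StateT over IO with a list of log lines as state, doSubCmd and the demo stages are hypothetical names, and the reading of Nothing (nothing to do after start, failure after perform) is one plausible convention rather than a documented one.

```haskell
import Control.Monad.Trans.State (StateT, modify, runStateT)

-- Assumption: a stand-in for the real Annex monad, with a log as state.
type Annex = StateT [String] IO

type SubCmdStart = String -> Annex (Maybe SubCmdPerform)
type SubCmdPerform = Annex (Maybe SubCmdCleanup)
type SubCmdCleanup = Annex Bool

-- Run the three stages in order, stopping as soon as one yields Nothing.
doSubCmd :: SubCmdStart -> String -> Annex Bool
doSubCmd start param = do
  mperform <- start param
  case mperform of
    Nothing -> return True        -- assumed: nothing to do counts as success
    Just perform -> do
      mcleanup <- perform
      case mcleanup of
        Nothing -> return False   -- assumed: perform stage failed
        Just cleanup -> cleanup

-- Hypothetical subcommand that just records each stage it passes through.
demoStart :: SubCmdStart
demoStart file
  | null file = return Nothing
  | otherwise = do
      modify (++ ["start " ++ file])
      return (Just (demoPerform file))

demoPerform :: String -> SubCmdPerform
demoPerform file = do
  modify (++ ["perform " ++ file])
  return (Just (demoCleanup file))

demoCleanup :: String -> SubCmdCleanup
demoCleanup file = do
  modify (++ ["cleanup " ++ file])
  return True

-- Run the demo subcommand on one parameter, returning result and log.
runDemo :: String -> IO (Bool, [String])
runDemo file = runStateT (doSubCmd demoStart file) []
```

Because each stage returns the next stage as a value instead of calling it, the driver decides when (and potentially where) each stage runs.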

Since git-annex is a very Real World Haskell type program, there is a lot of impure code in it. I could probably do better at factoring out more pure code. I count 117 impure functions, and only 37 pure.

Anyhow, from my perspective as a long-time perl programmer, some other random impressions..

  • ghc --make is handy, but every time it spits out a new 13 MB executable I can feel my laptop's SSD groan!
  • It was surprisingly easy to get into nasty situations with recursive dependencies between the 19 Haskell modules I wrote. Sometimes solving them was really messy. I lost hours to this. More time than I've lost to the problem in all other languages combined over 15 years. It's not clear to me if it was due to the overall design of my program, or if Haskell's types tend to encourage this problem. Or if there's some simple "please let me have recursive dependencies" switch to ghc that I missed..
  • I'm used to being able to use man to get at multiple books worth of detailed documentation for perl, and work easily offline or with limited bandwidth. With Haskell, I spend much more time searching online for documentation than I am comfortable with (although Hoogle is pretty neat). And the haddock-produced documentation is often pretty sketchy. The saving grace is that the source to any library function is a click away, and tends to be very readable.
  • I'm used to being able to use pretty much any Unix syscall by name from perl: mkdir, chmod, rename, etc. In Haskell, there is a Windows smell to the names, like createDirectoryIfMissing and setPermissions. And there are pointless distinctions like renameFile vs renameDirectory. These long names are not memorable and I have to look them up every time. Most of POSIX is available, but it's scattered among many disparate libraries, and I can't find an interface for sysconf(3) at all. There is a certain temptation, that I am so far resisting, to make a library for C/perl refugees that exports the sane Unix names for everything.
  • Anything involving the IO monad, or probably most monads, has a certain level of syntactic clumsiness about it. Compare:
      if ($flag{foo} && length($l = <>)) {
    
    vs
      foo <- getFlag "foo"
      l <- getLine
      if foo && not (null l)
          then do
    
    When writing lots of impure code, that got old, and while I could use ifM, or make up some other similar thing, its syntax would also be somewhat clumsy.
  • The fixity levels for a lot of stuff seem a bit off. I too often found myself writing error $ "foo: " ++ (show bar) or return $ Just $ ... (Still a lot better than Scheme thanks to $!)
  • I've leveled up a couple times now, but this particular video game seems to have more levels going up and up, forever. Can't even see the top from here!
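
For the monadic-condition complaint in the list above, this is roughly what the ifM route looks like. It is only a sketch: ifM is defined locally rather than imported from any particular library, the <&&> operator and its fixity are invented here, and getFlag is a hypothetical stand-in for whatever reads the flag.

```haskell
-- if-then-else where the condition is itself a monadic action.
ifM :: Monad m => m Bool -> m a -> m a -> m a
ifM cond thenAct elseAct = do
  c <- cond
  if c then thenAct else elseAct

-- Short-circuiting monadic "and": the right side only runs if the left
-- side produced True. The fixity mirrors that of (&&).
infixr 3 <&&>
(<&&>) :: Monad m => m Bool -> m Bool -> m Bool
a <&&> b = ifM a b (return False)

-- Hypothetical stand-in for the getFlag used in the comparison above.
getFlag :: String -> IO Bool
getFlag name = return (name == "foo")

-- The comparison above, rewritten with the helpers.
demo :: IO ()
demo =
  ifM (getFlag "foo" <&&> (not . null <$> getLine))
    (putStrLn "flag set and got a non-empty line")
    (putStrLn "skipped")
```

That gets it down to one expression and, like the perl version, only reads the line when the flag is set, but the parentheses and operator noise are still there, so "somewhat clumsy" seems fair.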