CouchDB came onto my radar since distributed stuff is interesting to me these days. But most of what was being written about it put me off, since it seemed to be very web-oriented, with JavaScript and HTML and stuff stored in the database, served right out of it to web browsers in an AJAXy mess.

Also, it's a database. I decided a long, long time ago not to mess with traditional databases. (They're great, they're just not great for me. Said the guy leaving after 5 years in the coal mines.)

Then I saw Damien Katz's talk about how he gave up everything to go off and create CouchDB. It was very inspirational. Seemed it must be worth another look, with that story behind it.

Now I'm reading the draft O'Reilly book. I like some things and, as expected, don't like others[1], and I'm not sure what to think overall (plus I still have half the book to get through), but it has spurred some early thoughts:

... vs DVCS

CouchDB is very unlike a distributed VCS, and yet it's moved from traditional database country much closer to VCS land. It's document-oriented, not normalized; the data stored in it has significant structure, but is also in a sense freeform. It doesn't necessarily preserve all history, but it does support multiple branches, merging, and conflict resolution.

Oddly, the thing I dislike most about it is possibly its biggest strength compared to a VCS, and that is that code is stored in the database alongside the data. That means that changes to the data can trigger processing: the data is mapped and reduced, views are updated, and so on, all on demand. This is done using code that is included in the database, so it is always available, and it runs in an environment CouchDB provides -- so replicating the database automatically deploys the code.
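To make that concrete, here is a rough sketch of the idea in Python. It is a toy model, not CouchDB: ToyStore, replicate_to, view, and count_by_key are names made up for illustration, and the map function is Python source kept in a document (the real thing uses JavaScript in design documents and sandboxes it). The point is only that the view code is an ordinary document, so copying the documents also copies the code.

```python
import copy


class ToyStore:
    """Hypothetical in-memory document store; not CouchDB's API."""

    def __init__(self):
        self.docs = {}

    def put(self, doc_id, doc):
        self.docs[doc_id] = doc

    def replicate_to(self, other):
        # Replication copies every document, including the one holding code.
        for doc_id, doc in self.docs.items():
            other.docs[doc_id] = copy.deepcopy(doc)

    def view(self, design_id, view_name):
        # The map function is source text stored in the design document.
        source = self.docs[design_id]["views"][view_name]["map"]
        namespace = {}
        exec(source, namespace)  # toy only: a real system would sandbox this
        map_fn = namespace["map_doc"]
        rows = []
        for doc_id, doc in self.docs.items():
            if doc_id.startswith("_design/"):
                continue  # skip documents that hold code rather than data
            rows.extend(map_fn(doc))
        return sorted(rows)


# Map step, stored as data: emit one (tag, 1) row per tag on a document.
MAP_SOURCE = '''
def map_doc(doc):
    for tag in doc.get("tags", []):
        yield (tag, 1)
'''


def count_by_key(rows):
    """Reduce step: sum the values emitted for each key."""
    totals = {}
    for key, value in rows:
        totals[key] = totals.get(key, 0) + value
    return totals


origin = ToyStore()
origin.put("_design/wiki", {"views": {"tags": {"map": MAP_SOURCE}}})
origin.put("page1", {"tags": ["couchdb", "vcs"]})
origin.put("page2", {"tags": ["vcs"]})

# The replica never had the view installed by hand; it arrived with the data.
replica = ToyStore()
origin.replicate_to(replica)
replica_totals = count_by_key(replica.view("_design/wiki", "tags"))
print(replica_totals)
```

Querying the view on the replica works without any setup step there, which is the property a cloned VCS repository lacks. This toy recomputes the view on every query; it makes no attempt at incremental updates.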

Compare with a VCS, where anything that is triggered by changes to the data is tacked onto the side in hooks, has to be manually set up, and so is poorly integrated overall.

Basically, what I've been doing with ikiwiki is adding some smarts about handling a particular kind of data, on top of the VCS. But this is done via a few narrow hooks; cloning the VCS repository does not get you a wiki set up and ready to go.

There are good reasons why cloning a VCS repository does not clone the hooks associated with it. The idea of doing so seems insane; how could you trust those hooks? How could they work when cloned to another environment? And so that's Never Been Done[2]. But with CouchDB's example, this is looking to me like a blind spot that has probably stunted the range of things VCSs are used for.

If you feel, like I do, that it's great we have these amazing distributed VCSs, with so many advanced capabilities, but a shame that they're only used by software developers, then that is an exciting thought.


[1] Javascript? Mixed all in a database with data it runs on? Imperative code that's supposed to be side-effect free? (I assume the Haskell guys have already been all over that.) Code stored without real version control? Still having a hard time with this. :)

[2] I hope someone will give a counterexample of a VCS that does so in the comments?

BitKeeper

BitKeeper has triggers, pretty similar to git hooks, yet they're tracked and are part of the repository. Triggers are used where I work to enforce some software development policies.

When I first started using git I was surprised that hooks are not managed the same way as in bk.

But triggers can be quite a nuisance - I found myself adding exit statements to buggy trigger scripts (both on the local and remote repositories) just so I could pull/push stuff.

This behavior in itself may be considered a security/policy enforcement problem: the local triggers (with uncommitted modifications) take precedence over remote triggers. I think that bk has a notion of permissions regarding triggers, but it's not the way we have it set up at work, so it may just be a configuration issue, rather than a design flaw.

Comment by machine-cycle [blogspot.com]
Comment

"Javascript? Mixed all in a database with data it runs on? Imperative code that's supposed to be side-effect free? (I assume the Haskell guys have already been all over that.) Code stored without real version control? Still having a hard time with this."

JavaScript is the lingua franca of the web. And storing it in the database only seems weird because the usual understanding of the word brings to mind MySQL or other relational systems, where the data is highly structured. If you think of CouchDB more like a filesystem for the Web, it starts to make sense. Say you built a simple JavaScript app using flat files, a directory, and an Apache virtual host - you wouldn't think twice about keeping your JavaScript in the same directory as your images, or your HTML, or the SQLite database you might be writing to. It just so happens that CouchDB exposes a bunch of cool stuff you can do with the data it stores. Imagine a filesystem that let you run map/reduce jobs over it! Also, with regard to the version control stuff - check out CouchApp. We cover it in the book! It lets you keep the code locally, in a VCS if you choose, and "sync" with a CouchDB instance.

Comment by nslater [claimid.com]
Re: Comment

There is a significant difference between keeping some JavaScript files in a directory, VCS, or even in a table in a database, and running them in the client web browser (where, as you note, it's the lingua franca); and keeping JavaScript files in the same place and using them to control the filesystem, or the VCS, or the database, or whatever you want to call it. I probably did conflate dislike of both of these cases in my blog. I have accepted that CouchDB can be considered a type of filesystem, so I have no grounds left to dislike the first, but am still having a visceral dislike of the second.

(And ya, I know about CouchApp now, having mostly finished the book, but I'm not convinced that it's not slightly papering over an underlying hole: this sort of information, which should be version controlled, is being kept in a database that does not offer complete revision control.)

Comment by joey [kitenet.net]