I keep my life in a subversion repository. For the past five years, every file I've created and worked on, every email I've sent or received, and every config file I've tweaked have all been checked into revision control. Five years ago when I started doing this, using CVS, people thought I was nuts to use revision control in this way. Today it's still not a common practice, but thanks to my earlier article, CVS homedir (Linux Journal issue 101), I know I'm not alone. In this article I will describe how my new home directory setup is working, now that I've switched from CVS to subversion.
Subversion is a revision control system, and like the earlier and much
cruftier CVS, it's meant to be used for managing chunks of code, such as
free software programs that are worked on by many developers, or in-house
software projects that are collaborated on by several employees. Unlike
CVS, subversion has reasonable handling of directories and file renaming,
which is more than sufficient reason to switch to it if you're already
using CVS, and most of CVS's other misfeatures are also fixed. But
subversion still has its warts, such as an inability to store some file
permissions and its need for twice as much disk space as you'd expect
thanks to the pristine copies of everything kept in its .svn
directories. These problems can be quite annoying when you're keeping your
whole home directory in svn. So why bother?
I see three main benefits of keeping my entire home directory in svn:
- home directory replication
- history
- distributed backups
The first of these is what originally drove me to using revision control for my whole home directory, and is still the greatest benefit of it today. I have many accounts on unix machines scattered around my house, the country, and the planet, and I have an abiding desire for every single one of these disparate accounts to work and look exactly the same. I don't care if the machine I'm logging into is in Japan or the Netherlands, or a California co-location center, or my home office; I don't care if it's a PC clone, or a Mac, or a S/390 virtual machine; if it's not set up the same as all the others, if I cannot concentrate on the important differences instead of being distracted by the unimportant differences, then I will be less productive. The final ingredient for configuration insanity is that I'm constantly tweaking my setup, and as soon as I make an improvement, I want it to be available on every one of my accounts, everywhere. Without subversion, keeping all these accounts in sync would be well-nigh impossible. With subversion it's as easy as typing "svn up" now and then.
It seems that the next big change in how we use computers might be the introduction of filesystems that store every old version of every file. With the explosion in size of cheap hard disks there seems to be no reason not to keep a complete record of your computing life, and several research projects are working on it. Meanwhile, I've been doing just that for five years, using first CVS and now svn. It's amazing to check out my home directory as it looked on Christmas day, 1999, and play around in it. It's neat to be able to look at the entire revision history of my .procmailrc, and watch as I moved mail around, dealt with the growing spam problem, and joined and left many mailing lists. It's handy to be able to run "svn diff" on my kernel config file to see how "make xconfig" changed it. I can recover files that I've deleted, or delete files because they're not relevant right now, and know I've not really lost them at all. Amazingly my subversion repository is only 4 gigabytes in size for all this historical data.
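The history commands mentioned above can be sketched roughly as follows. This is only an illustration: REPO_URL is a placeholder rather than a real repository, and the function names are invented for this example. The SVN variable exists only so the commands can be dry-run by pointing it at echo.

```shell
#!/bin/sh
# Illustrative sketch; REPO_URL is a placeholder and the function names are hypothetical.
SVN="${SVN:-svn}"
REPO_URL="${REPO_URL:-svn+ssh://svn.example.org/svn/home/trunk}"

# Check out the tree as it stood on a given date (YYYY-MM-DD) into a scratch directory.
# Subversion accepts a date in curly braces wherever a revision number is expected.
checkout_as_of() {
    "$SVN" checkout -r "{$1}" "$REPO_URL" "$2"
}

# Show the full revision history of one file, including the paths changed in each revision.
file_history() {
    "$SVN" log -v "$1"
}

# Show uncommitted changes to a file, e.g. what a configuration tool just rewrote.
local_changes() {
    "$SVN" diff "$1"
}
```

For example, `checkout_as_of 1999-12-25 /tmp/home-1999` would fetch the tree as of that date into /tmp/home-1999, and `file_history ~/.procmailrc` would list every recorded change to that file.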
I have not lost a file since 1999. And I don't intend to, ever again. Take one crucial file, like my resume or sent-mail archive. I have a copy of that file on my desktop computer in the .svn directory. There's another copy on my home directory on my laptop, and yet another copy in the subversion repository on my server thousands of miles away. I'm told that the best backups are done without effort -- so you actually do them -- and are widely scattered among many machines and a lot of area -- so a local disaster doesn't knock them out -- and are tested on a regular basis -- to make sure the backup works. I'm doing all of these things, as a mere side effect of keeping it all in subversion. To complete the picture, I only have to be sure to take very careful backups of my subversion repository itself. The automated distributed backups via svn keep me sleeping quietly at night -- I know that no matter what I do my life will still be there, safe and secure in svn.
At this point I should fess up to my dirty little secret: Not everything is in svn after all. My full home directory with all the trimmings often runs to the dozens of gigabytes. Much of that is collections of music files and documentation, which I have not yet dared to check into svn, and which I rsync between computers. As disk sizes continue to grow, it's looking more and more likely that I will take the plunge soon and check these large file libraries into svn too. Then too I have the occasional file, such as a disk image for a virtual machine, which is too large and too much bother to check into svn. And my incoming mailboxes are not kept in svn because that would lead to a merging nightmare -- instead I use offlineimap to keep them synchronized between several computers. The mail archives do get checked into svn, by a cron job. A few other missing corners include my web browser cache, which I would love to have a history of, and my temporary directory, which I'd rather not.
I have made some progress recently in moving more things into svn. I've managed to check the /etc directories of several machines into svn. While this is of questionable value as a way to replicate those machines, and some files like /etc/shadow are not included, it's useful to be able to check old versions of config files. I've also come up with a way to check crontabs into svn. This is a great improvement, since I can edit and view any machine's cron jobs from anywhere, and have all the history and backup benefits of svn. I'm sure that my use of svn will only increase as I find ways to use it in the odd little corners that remain. Yesterday I even found myself checking baby photos into svn for my family's website.
Time to get down to the details of how I organize my home directory in svn. I speak of my svn repository, but I actually have several repositories. First there's the public one, which holds most of the less private parts of my home directory, and lots of software projects. You can even browse the contents of this directory on the web at svn.kitenet.net, or check it out anonymously from svn://svn.kitenet.net/joey/. Next I have a private repository that holds things like my email archives, and I have several other small special purpose repositories. I also work on other projects which are themselves kept in svn, on other servers. A full checkout of my home directory will include parts from all of these repositories; the svn:externals feature of svn lets me knit them all together into a whole that I can check out or update with a single command.
I've always managed my home directory with an iron hand, and keeping files in revision control has only exacerbated this tendency. Let's look at the top level:
joey@dragon:~>ls
Maildir/ bin/ doc/ html/ lib/ mail/ src/ tmp/
That really is everything, except for 100+ dot-files. Most people use their home directory as a cluttered scratch space for files they're working on, and subversion is more suited to this kind of work than CVS, since it lets you easily rename and move files and directories. My tightly controlled home directory is partly personal preference and partly a leftover from my days as a CVS user. Keeping a home directory in subversion does encourage some neatness, since svn will complain about files that are not checked in. This encourages keeping things organized, or out of the way in a temporary directory.
Since my home directory is publicly available online, I have to take care to keep private files private, and one tricky thing is private dotfiles. These need to be in my home directory, but I can't keep them in the public repository. To manage this, I keep all the private dotfiles in ~/.hide, which is stored in an entirely different, private subversion repository.
The private dotfiles in ~/.hide have to be symlinked into my home directory to be used, and for this I have a svnfix program that symlinks them into my home directory, as well as doing some permissions fixing and other symlinking, and even updating my crontab from svn. I have to remember to run this program from time to time, or put a call to it in my crontab, since there is no way to add a client-side hook in subversion, or CVS for that matter.
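The symlinking step could look something like the sketch below. It is not the svnfix program itself, whose contents the article does not show. The function name, the handling of files already in the way, and the permission tightening are all assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical sketch of a symlinking step; this is NOT the author's svnfix.
# Usage: link_hidden_dotfiles HIDE_DIR TARGET_DIR   (pass HIDE_DIR as an absolute path)
link_hidden_dotfiles() {
    hide_dir=$1
    target_dir=$2
    for src in "$hide_dir"/.[!.]* "$hide_dir"/*; do
        # A glob that matched nothing is left as a literal pattern; skip it.
        [ -e "$src" ] || continue
        name=${src##*/}
        # Never link Subversion's own metadata directory.
        [ "$name" = ".svn" ] && continue
        dest=$target_dir/$name
        if [ -L "$dest" ]; then
            # Refresh a symlink left behind by an earlier run.
            rm -f "$dest"
        elif [ -e "$dest" ]; then
            # Assumption: never clobber a real file or directory.
            echo "skipping $dest: a real file is in the way" >&2
            continue
        fi
        ln -s "$src" "$dest"
        # Assumption: private files should not be readable by group or others.
        chmod go-rwx "$src"
    done
}
```

It would be called as `link_hidden_dotfiles "$HOME/.hide" "$HOME"`, either by hand after an update or from a crontab entry.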
My ~/.hide directory is just one of several subversion repositories that are pulled into my home directory by subversion's useful svn:externals feature. My ~/src subdirectory, which holds various code projects I'm working on, is an even better example, as some of its contents come from repositories shared with others.
joey@dragon:~>ls src
Words2Nums/   debconf/      filters/     packages/        sleepd/
alien/        debhelper/    flashybrid/  pdmenu/          tasksel/
apt-src/      debian-cd/    kernel/      sarge/           ticker/
base-config/  debian-edu/   misc/        secure-testing/  unreleased/
d-i/          dpkg-repack/  mooix/       skolelinux/      wmbattery/
joey@dragon:~>svn propget svn:externals src
mooix svn+ssh://svn.mooix.net/home/svn/mooix/trunk
debhelper svn+ssh://kitenet.net/home/svn/debhelper/trunk
tasksel svn+ssh://svn.debian.org/svn/tasksel/trunk
d-i svn+ssh://svn.debian.org/svn/d-i/trunk
base-config svn+ssh://svn.debian.org/svn/base-config/trunk
debconf svn+ssh://svn.debian.org/svn/debconf/trunk/src/debconf
secure-testing svn+ssh://svn.debian.org/svn/secure-testing
After I use the svn propedit command to add external repositories, they are pulled in and become subdirectories in my home directory that behave, mostly, as if they are part of the same larger repository. This is a great feature, and it has uses beyond including directories from other repositories. On many of the machines I use, I don't need my entire home directory checkout, and so my home directory is more minimal.
joey@elephant:~>ls
bin/ tmp/
I use this machine for occasional development. It's not a fully trusted machine, so I don't want to put private files there. So I've got a branch of my home directory that includes (using svn:externals) only the basics and is perfectly usable for everything I normally do on that machine. Using svn:externals like this to pull in optional directories keeps the part of my home directory that I have to branch (and merge) small.
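As a rough sketch of how svn:externals definitions can be set without an interactive editor, the helpers below load them from a file. The function names and the definitions file are hypothetical. The file holds the same "directory URL" lines that svn propget svn:externals prints, and the SVN variable is there only to allow a dry run with echo.

```shell
#!/bin/sh
# Illustrative sketch; function names and file paths are hypothetical.
SVN="${SVN:-svn}"

# set_externals DIR DEFINITIONS_FILE
# Replace the svn:externals property on DIR with the contents of DEFINITIONS_FILE,
# which holds one "subdirectory URL" pair per line.
set_externals() {
    "$SVN" propset svn:externals -F "$2" "$1"
}

# show_externals DIR
# Print the externals currently defined on DIR.
show_externals() {
    "$SVN" propget svn:externals "$1"
}
```

After `set_externals ~/src externals.txt`, the property change still has to be committed, and the next `svn update` then pulls the listed directories in. A minimal home directory simply carries a shorter definitions file.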
When I want to check out my home directory to a new account, I run one of these commands:
% svn co svn+ssh://email@example.com/svn/joey/trunk/home-base .
% svn co svn+ssh://firstname.lastname@example.org/svn/joey/trunk/home-full .
The first is the minimal version of my home directory, the other is the whole thing. The dots at the end of the command lines make svn check it out directly into my home directory.
I switched from CVS to svn over a painful couple of months in the winter of 2003. CVS had many misfeatures that made keeping a home directory in it annoying. I'm glad I don't have to worry anymore about picking file and directory names (svn can easily rename them; CVS couldn't), that svn can handle binary files well and efficiently (unlike CVS), that svn is quite a bit faster at updating large home directories than CVS, that managing branches is so much easier with svn that I actually have some branches of my home directory, and that those annoying "CVS" directories that once cluttered up every corner of my home directory have gone away. The transition from CVS to svn would be easier today, since the conversion software has improved, but such a large conversion between revision control systems is bound to be a slow and painstaking process.
It's interesting to think that the longevity of my home directory's history is not limited to the useful lifetime of a given revision control system, or even the lifetime of a given computer platform. Converting repositories of past revision control systems seems likely to be something new systems will continue to support, and if I one day switch to arch, or a distant future relative of arch and svn, I fully expect to take my history with me when I go.
Before I go, I want to thank the hundreds of readers who responded to my original article about keeping my life in CVS. Thanks for your encouragement, your ideas, and for letting me know I wasn't as crazy as I thought. And yes, as you can see, I finally have switched to subversion! Now I'm off to commit this file.
Joey Hess email@example.com
Selene Scriven uses svn for her home directory in a similar way, so you might want to read her article for more details and a different perspective on some things.
There's a mailing list for people who use version control for their home directories, at https://lists.madduck.net/mailman/listinfo/vcs-home.
Also, my own setup is constantly evolving, and changes will be documented, to some extent, in my blog.
Updates:
- svnhome update: rethinking dotfiles
- introducing mr: a tool to make checkouts and updates easier, and allow mixing svn, git, etc.
- I actually keep most of my home directory in git now. Go figure. :-) I have not updated the above svn.kitenet.net urls, since references to git ones would not fit very well in the article, but see git.joeyh.name for the git versions of everything.
- I use etckeeper to keep /etc in git too.
- I used to use unison, but have found an efficient way to store large files in git -- and then I developed an even better way: git-annex