Keeping pristine upstream tarballs around is a pain. Wouldn't it be nice to be able to keep them in revision control? Except, it would use far too much disk...

Here's a solution. It generates a binary delta between the pristine upstream tarball and a tarball created from files checked out of the repository. The delta should be quite small, and any checkout of the repository that has the same file contents as the one used to create the delta can be used to regenerate the pristine tarball.
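To make the idea concrete, here is a rough Python sketch of it. It is not how pristine-tar itself is implemented: a naive block-matching delta stands in for a real binary delta tool, and the helper names (`make_tar`, `make_delta`, `apply_delta`), the file names, and the timestamps are all made up for illustration.

```python
import io
import tarfile

BLOCK = 512  # tar's block size; an arbitrary choice for this sketch


def make_tar(files, mtime):
    """Build an uncompressed tarball in memory from a {name: bytes} mapping."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.USTAR_FORMAT) as tar:
        for name in sorted(files):
            info = tarfile.TarInfo(name)
            info.size = len(files[name])
            info.mtime = mtime
            tar.addfile(info, io.BytesIO(files[name]))
    return buf.getvalue()


def make_delta(source, target):
    """Describe target as block copies out of source plus literal data."""
    index = {}
    for off in range(0, len(source) - BLOCK + 1, BLOCK):
        index.setdefault(source[off:off + BLOCK], off)
    ops = []
    for off in range(0, len(target), BLOCK):
        chunk = target[off:off + BLOCK]
        if chunk in index:
            ops.append(("copy", index[chunk]))  # block already exists in source
        else:
            ops.append(("data", chunk))         # block must be stored in the delta
    return ops


def apply_delta(source, ops):
    """Rebuild the target from source and the delta operations."""
    out = bytearray()
    for kind, arg in ops:
        out += source[arg:arg + BLOCK] if kind == "copy" else arg
    return bytes(out)


# Hypothetical file contents, standing in for a repository checkout.
files = {"comm/a.ogg": b"A" * 5000, "comm/b.ogg": b"B" * 3000}

# The "upstream" tarball carries metadata (here, timestamps) that the
# tarball regenerated from the checkout does not reproduce.
pristine = make_tar(files, mtime=1199145600)
regenerated = make_tar(files, mtime=0)

delta = make_delta(regenerated, pristine)
literal = sum(len(arg) for kind, arg in delta if kind == "data")

print("tarball bytes:", len(pristine))
print("literal bytes stored in delta:", literal)
print("round trip ok:", apply_delta(regenerated, delta) == pristine)
```

Only the blocks that differ between the two tarballs (the headers, in this toy case) have to be stored, which is why the delta stays small as long as the file contents match.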

Example (which would presumably be more fun and faster if I used git):

joey@kodama:~package/uqm-voice> svn switch svn+ssh://
D    debian
A    comm/blackur/black041.ogg
U    comm/shofixt/shofi040.ogg
U    comm/starbas/starb182.ogg
D    comm/slyland/slyla030.ogg
U    comm/chmmr/chmmr035.ogg
U    comm/supox/supox031.ogg
Updated to revision 223.
joey@kodama:~package/uqm-voice> pristine-tar extract ~/ ../uqm-voice_0.3.orig.tar.gz

The generated tarball is bit-identical to the 19 MB upstream tarball, though you have to gunzip them both to check this, since the gzip compression differs. The delta file is only 41k.
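One way to do that check without leaving gunzipped copies lying around is to hash the decompressed streams. This is only a sketch using Python's standard library; the function names are mine and are not part of pristine-tar.

```python
import gzip
import hashlib


def gunzipped_sha256(path, chunk_size=1 << 20):
    """Return the SHA-256 of a .gz file's decompressed contents."""
    digest = hashlib.sha256()
    with gzip.open(path, "rb") as stream:
        # Read in chunks so a large tarball never has to fit in memory.
        for chunk in iter(lambda: stream.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_tarball(path_a, path_b):
    """True if two .tar.gz files decompress to identical bytes."""
    return gunzipped_sha256(path_a) == gunzipped_sha256(path_b)
```

Calling `same_tarball()` on the upstream tarball and the regenerated one should return True even when the two .gz files themselves differ byte for byte.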

The delta file was created earlier as follows:

joey@kodama:~> pristine-tar stash lib/debian/unstable/uqm-voice_0.3.orig.tar.gz

I've uploaded pristine-tar to incoming.