Obnam 1.0 was released during several months when I had no well-connected server to use for backups. Yesterday I installed a terabyte disk in a basement with a fiber optic network connection, so my backupless time is over.
Now, granted, I have a very multi-layered approach to backups; all my data is stored in git, most of it with dozens of copies automatically maintained, and with archival data managed by git-annex. But I still like to have a "real" backup system underneath, to catch anything else. And to back up those parts of my users' data that I have not given them tools to put into git yet...
My backup server is not my basement, so I need to securely encrypt
the backups stored there. Encrypting your offsite backups is such a good
idea that I've always been surprised at the paucity of tools to do it. I
got by with duplicity for years, but it's increasingly creaky, and the
few times I've needed to restore, it's been a terrific pain. So I'm excited
to be trying Obnam today.
So far I quite like it. The only real problem is that it can be slow when there's a transatlantic link between the client and the server. Each file backed up requires several TCP round-trips, and the latency kills the bandwidth. Large files are still sent fast, and Obnam uses few resources on either the client or server while running. And this mostly only affects the initial, full backup.
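To see why many small files hurt far more than one large file, here is a back-of-the-envelope sketch. It is a simple additive model, not a measurement of Obnam: the function name, the 100 ms round-trip time, the figure of four round-trips per file, and the 10 MB/s bandwidth are all assumptions chosen for illustration.

```python
def estimated_backup_seconds(num_files, total_bytes, rtt_s,
                             round_trips_per_file, bandwidth_bytes_per_s):
    """Rough estimate: per-file latency cost plus raw transfer time.

    Hypothetical model: assumes round-trips are strictly sequential
    and that transfer time is simply size divided by bandwidth.
    """
    latency_cost = num_files * round_trips_per_file * rtt_s
    transfer_cost = total_bytes / bandwidth_bytes_per_s
    return latency_cost + transfer_cost


# Assumed numbers: 1 GB of data, 100 ms RTT, 4 round-trips per file, 10 MB/s.
many_small = estimated_backup_seconds(100_000, 10**9, 0.1, 4, 10**7)
one_large = estimated_backup_seconds(1, 10**9, 0.1, 4, 10**7)
print(f"100,000 small files: {many_small / 3600:.1f} hours")
print(f"one large file:      {one_large / 60:.1f} minutes")
```

Under these assumed numbers the same gigabyte takes roughly eleven hours as 100,000 small files and under two minutes as a single file, because the waiting dominates the sending.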
But the encryption and ease of use more than make up for this. The real
killer feature with Obnam's encryption isn't that it's industry-standard
encryption with gpg, that can be trivially enabled with a single option
(--encrypt-with=DEADBEEF). No, the great thing about it is its key
management.
I generate a new gpg key for each system I back up. This prevents systems from reading each other's backups. But that means you have to back up the backup keys, or when a system is lost, its backup would be inaccessible.
With Obnam, I can instead just grant my personal gpg key access to
the repository: obnam add-key --keyid 2512E3C7. Now both the machine's
key and my gpg key can access the data. Great system; can't revoke access,
but otherwise perfect. I liked this so much I stole the design and used
it in git-annex too. :)
I'm also pleased I can lock down .ssh/authorized_keys on my backup
server, to prevent clients running arbitrary commands. Duplicity runs
ad-hoc commands over ssh, which defeated my attempts to lock it down.
Obnam can be easily locked down, like this:
command="/usr/lib/openssh/sftp-server"
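As a sketch of what a complete entry might look like, the helper below builds one authorized_keys line from a client's public key. The function name and the placeholder key are made up for illustration, and the extra options (no-port-forwarding, no-X11-forwarding, no-agent-forwarding, no-pty) are standard OpenSSH authorized_keys options that I am adding as an assumption; the post itself only shows the command= part.

```python
def restricted_authorized_keys_line(public_key,
                                    command="/usr/lib/openssh/sftp-server"):
    """Return an authorized_keys entry that forces `command` for this key.

    Hypothetical helper: the extra no-* options are an assumption about
    sensible hardening, not something taken from the original post.
    """
    if '"' in command:
        # A double quote inside command="..." would need escaping in the file.
        raise ValueError("command must not contain double quotes")
    options = [
        f'command="{command}"',
        "no-port-forwarding",
        "no-X11-forwarding",
        "no-agent-forwarding",
        "no-pty",
    ]
    return ",".join(options) + " " + public_key.strip()


# Placeholder key material, not a real key.
print(restricted_authorized_keys_line("ssh-rsa AAAAplaceholder client@example"))
```

With the forced command in place, whatever the client asks ssh to run, the server starts the sftp server instead.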
This could still be improved, since clients can still read the whole filesystem with sftp. I'd like to have something like git-annex's git-annex-shell, which can limit access to only a specific repository. Hmm, if Obnam had its own server-side program like this, it could stream backup data to it using a protocol that avoids the round-trips needed by the SFTP protocol, and fix the latency issue too. Lars, I know you've been looking for a Haskell starter project ... perhaps this is it? :)
I'd like to use obnam as well, but the latency issues currently make it unusable for me, so I still use duplicity.
I've managed to lock it down quite tightly, though, and the same procedure will work for anything else that uses sftp.
First, configure sshd on your backup machine to support chrooted sftp-only access for members of a given group, by adding the following to /etc/ssh/sshd_config:
Then create a dedicated user for each machine you want to back up, and put that user in the sftponly group.
Create an ssh key for each, put the public key in .ssh/authorized_keys in that dedicated backup account, and put the private key somewhere convenient on the machine you want to back up.
Create an sftp directory in that user's home directory, owned by root:root (because sshd will not chroot to a directory owned by the user). Create a writable user-owned subdirectory sftp/backup or sftp/duplicity that you can actually back up to.
Now, configure duplicity on the machine to back up:
Hope that helps.
I see you're using short (8 hex digit) keyids in the example above. These are trivially forgeable with a couple of hours on low-power consumer hardware, and new keys are easy enough to inject into any keyring if the keyring is regularly refreshed from the keyservers (as it should be if you want to get revocation information).
The short keyids are just the last digits of the full OpenPGP v4 fingerprint, which is significantly harder to forge. Can you use the full fingerprint in place of the short keyid?
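To make the relationship concrete, here is a minimal sketch that derives the short and long keyids from a v4 fingerprint. A v4 fingerprint is 40 hex digits; the short keyid is its last 8 and the long keyid its last 16. The function name and the fingerprint below are made-up placeholders, not a real key.

```python
def key_ids(fingerprint):
    """Derive short and long keyids from an OpenPGP v4 fingerprint.

    Hypothetical helper for illustration: accepts the fingerprint with
    or without spaces and returns upper-case hex strings.
    """
    hex_digits = "".join(fingerprint.split()).upper()
    if len(hex_digits) != 40 or any(c not in "0123456789ABCDEF" for c in hex_digits):
        raise ValueError("expected a 40 hex digit v4 fingerprint")
    return {
        "short": hex_digits[-8:],    # low 32 bits: cheap to collide
        "long": hex_digits[-16:],    # low 64 bits
        "fingerprint": hex_digits,   # all 160 bits
    }


# Placeholder fingerprint, not a real key.
ids = key_ids("0123 4567 89AB CDEF 0123 4567 89AB CDEF DEAD BEEF")
print(ids["short"], ids["long"])
```

Anyone who can generate a key whose fingerprint merely ends in the same 8 digits gets the same short keyid, which is why the full fingerprint is the safer thing to pass around.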