propelling containers

Propellor has supported docker containers for a "long" time, and it works great. This week I've worked on adding more container support.

docker containers (revisited)

The syntax for docker containers has changed slightly. Here's how it looks now:

example :: Host
example = host "example.com"
    & Docker.docked webserverContainer

webserverContainer :: Docker.Container
webserverContainer = Docker.container "webserver" "joeyh/debian-stable"
    & os (System (Debian (Stable "wheezy")) "amd64")
    & Docker.publish "80:80"
    & Apt.serviceInstalledRunning "apache2"
    & alias "www.example.com"

That makes example.com have a web server in a docker container, as you'd expect, and when propellor is used to deploy the DNS server it'll automatically make www.example.com point to the host (or hosts!) where this container is docked.

I use docker a lot, but I have drunk little of the Docker KoolAid. I'm not keen on using random blobs created by random third parties using either unreproducible methods, or the weirdly underpowered dockerfiles. (As for vast complicated collections of containers that each run one program and talk to one another etc ... I'll wait and see.)

That's why propellor runs inside the docker container and deploys whatever configuration I tell it to, in a way that's both replicatable later and lets me use the full power of Haskell.

Which turns out to be useful when moving on from docker containers to something else...

systemd-nspawn containers

Propellor now supports containers using systemd-nspawn. It looks a lot like the docker example.

example :: Host
example = host "example.com"
    & Systemd.persistentJournal
    & Systemd.nspawned webserverContainer

webserverContainer :: Systemd.Container
webserverContainer = Systemd.container "webserver" chroot
    & Apt.serviceInstalledRunning "apache2"
    & alias "www.example.com"
  where
    chroot = Chroot.debootstrapped (System (Debian Unstable) "amd64") Debootstrap.MinBase

Notice how I specified the Debian Unstable chroot that forms the basis of this container. Propellor sets up the container by running debootstrap, boots it up using systemd-nspawn, and then runs inside the container to provision it.

Unlike docker containers, systemd-nspawn containers use systemd as their init, and it all integrates rather beautifully. You can see the container listed in systemctl status, including the services running inside it, use journalctl to examine its logs, etc.

But no, systemd is the devil, and docker is too trendy...

chroots

Propellor now also supports deploying good old chroots. It looks a lot like the other containers. Rather than repeat myself a third time, and because we don't really run webservers inside chroots much, here's a slightly different example.

example :: Host
example = host "mylaptop"
    & Chroot.provisioned (buildDepChroot "git-annex")

buildDepChroot :: Apt.Package -> Chroot.Chroot
buildDepChroot pkg = Chroot.debootstrapped system Debootstrap.buildd dir
    & Apt.buildDep pkg
  where
    dir = "/srv/chroot/builddep/"++pkg
    system = System (Debian Unstable) "amd64"

Again this uses debootstrap to build the chroot, and then it runs propellor inside the chroot to provision it (btw without bothering to install propellor there, thanks to the magic of bind mounts and completely linux distribution-independent packaging).

In fact, the systemd-nspawn container code reuses the chroot code, and so turns out to be really rather simple. 132 lines for the chroot support, and 167 lines for the systemd support (which goes somewhat beyond the nspawn containers shown above).

Which leads to the hardest part of all this...

debootstrap

Making a propellor property for debootstrap should be easy. And it was, for Debian systems. However, I have crazy plans that involve running propellor on non-Debian systems, to debootstrap something, and installing debootstrap on an arbitrary linux system is ... too hard.

In the end, I needed 253 lines of code to do it, which is barely one order of magnitude less code than the size of debootstrap itself. I won't go into the ugly details, but this could be made a lot easier if debootstrap catered more to being used outside of Debian.

closing

Docker and systemd-nspawn have different strengths and weaknesses, and there are sure to be more container systems to come. I'm pleased that Propellor can add support for a new container system in a few hundred lines of code, and that it abstracts away all the unimportant differences between these systems.

PS

Seems likely that systemd-nspawn containers can be nested to any depth. So, here's a new kind of fork bomb!

infinitelyNestedContainer :: Systemd.Container
infinitelyNestedContainer = Systemd.container "evil-systemd"
    (Chroot.debootstrapped (System (Debian Unstable) "amd64") Debootstrap.MinBase)
    & Systemd.nspawned infinitelyNestedContainer

Strongly typed purely functional container deployment can only protect us against a certain subset of all badly thought out systems. ;)

on leaving

I left Debian. I don't really have a lot to say about why, but I do want to clear one thing up right away. It's not about systemd.

As far as systemd goes, I agree with my friend John Goerzen:

I promise you – 18 years from now, it will not matter what init Debian chose in 2014. It will probably barely matter in 3 years.

read the rest

And with Jonathan Corbet:

However things turn out, if it becomes clear that there is a better solution than systemd available, we will be able to move to it.

read the rest

I have no problem with trying out a piece of Free Software, that might have abrasive authors, all kinds of technical warts, a debatable design, scope creep etc. None of that stopped me from giving Linux a try in 1995, and I'm glad I jumped in with both feet.

It's important to be unafraid to make a decision, try it out, and if it doesn't work, be unafraid to iterate, rethink, or throw a bad choice out. That's how progress happens. Free Software empowers us to do this.

Debian used to be a lot better at that than it is now. This seems to have less to do with the size of the project, and more to do with the project having aged, ossified, and become comfortable with increasing layers of complexity around how it makes decisions. To the point that I no longer feel I can understand the decision-making process at all ... or at least, that I'd rather be spending those scarce brain cycles on understanding something equally hard but more useful, like category theory.

It's been a long time since Debian was my main focus; I feel much more useful when I'm working in a small nimble project, making fast and loose decisions and iterating on them. Recent events brought it to a head, but this is not a new feeling. I've been less and less involved in Debian since 2007, when I dropped maintaining any packages I wasn't the upstream author of, and took a year of mostly ignoring the larger project.

Now I've made the shift from being a Debian developer to being an upstream author of stuff in Debian (and other distros). It seems best to make a clean break rather than hang around and risk being sucked back in.

My mailbox has been amazing over the past week by the way. I've heard from so many friends, and it's been very sad but also beautiful.

continuing to be pleasantly surprised

Free software has been my career for a long time -- nothing else since 1999 -- and it continues to be a happy surprise each time I find a way to continue that streak.

The latest is that I'm being funded for a couple of years to work part-time on git-annex. The funding comes from the DataLad project, which was recently awarded a grant by the National Science Foundation. DataLad folks (at Dartmouth College and at Magdeburg University in Germany) are working on providing easy access to scientific data (particularly neuroimaging). So git-annex will actually be used for science!

I'm being funded for around 30 hours of work each month, to do general work on the git-annex core (not on the webapp or assistant). That includes bugfixes and some improvements that are wanted for DataLad, but are all themselves generally useful. (see issue list)

This is enough to get by on, at least in my current living situation. It would be great if I could find some funding for my other work time -- but it's also wonderful to have the flexibility to spend time on whatever other interesting projects I might want to.

a programmable alarm clock using systemd

I've taught my laptop to wake up at 7:30 in the morning. When it does, it will run whatever's in my ~/bin/goodmorning script. Then, if the lid is still closed, it will go back to sleep again.

So, it's a programmable alarm clock that doesn't need the laptop to be left turned on to work.

But it doesn't have to make noise and wake me up (I rarely want to be woken up by an alarm; the sun coming in the window is a much nicer method). It can handle other tasks like downloading my email, before I wake up. When I'm at home and on dialup, this tends to take an hour in the morning, so it's nice to let it happen before I get up.

This took some time to figure out, but it's surprisingly simple. Besides ~/bin/goodmorning, which can be any program/script, I needed just two files to configure systemd to do this.

/etc/systemd/system/goodmorning.timer

[Unit]
Description=good morning

[Timer]
Unit=goodmorning.service
OnCalendar=*-*-* 7:30
WakeSystem=true
Persistent=false

[Install]
WantedBy=multi-user.target

/etc/systemd/system/goodmorning.service

[Unit]
Description=good morning
RefuseManualStart=true
RefuseManualStop=true
ConditionACPower=true

[Service]
Type=oneshot
ExecStart=/bin/systemd-inhibit --what=handle-lid-switch --why=goodmorning /bin/su joey -c "/usr/bin/timeout 45m /home/joey/bin/goodmorning"

installation

After installing those files, run (as root): systemctl enable goodmorning.timer; systemctl start goodmorning.timer

Then, you'll also need to edit /etc/systemd/logind.conf, and set LidSwitchIgnoreInhibited=no -- this overrides the default, which is not to let systemd-inhibit block sleep on lid close.

almost too easy

I don't think this would be anywhere near as easy to do without systemd, logind, etc. Especially the handling of waking the system at the right time, and the behavior around lid sleep inhibiting.

The WakeSystem=true relies on some hardware support for waking from sleep; my laptop supported it with no trouble but I don't know how broadly available that is.

Also, notice the ConditionACPower=true, which I added once I realized I don't want the job to run if I forgot to leave the laptop plugged in overnight. Technically, it will still wake up when on battery power, but then it should go right back to sleep.

Quite a lot of nice pieces of systemd all working together here!

xfce workaround

If using xfce, xfce4-power-manager takes over handling of lid close from systemd, and currently prevents the system from going back to sleep if the lid is still closed when goodmorning finishes. Happily, there is an easy workaround; this configures xfce to not override the lid switch behavior:

xfconf-query -c xfce4-power-manager -n -p /xfce4-power-manager/logind-handle-lid-switch -t bool -s true

Other desktop environments may have similar issues.

why not a per-user unit?

It would perhaps be better to use the per-user systemd, not the system wide one. Then I could change the time the alarm runs without using root.

What's prevented me from doing this is that systemd-inhibit uses policykit, and policykit prevents it from being used in this situation. It's a lot easier to run it as root and use su, than it is to reconfigure policykit.

propellor is d-i 2.0

I think I've been writing the second system to replace d-i with in my spare time for a couple months, and never noticed.

I'm as surprised as you are, but consider this design:

  • Installation system consists of debian live + haskell + propellor + web browser.

  • Entire installation UI consists of a web-based (and entirely pictographic and prompt based, so does not need to be translated) selection of the installation target.

  • Installation target can be local disk, remote system via ssh (wiping out crufty hacked-up pre-installed debian), local VM, live ISO, etc.

  • Really, no other questions. Not even user name/password! The installed system will only allow login via the same method that was used to install it. So a locally installed system will accept console/X login with no password and then a forced password change. Or a system installed via ssh will only allow login using the same ssh key that was used to install it.

  • The entire installation process consists of a disk format, followed by debootstrap, followed by running propellor in the target system. This also means that the installed system includes a propellor config file which now describes the properties of the system as installed (so can be edited to tweak the installation, or reused as starting point for next installation).

  • Users who want to configure installation in any way write down properties of system using a simple propellor config file. I suppose some people still use more than one partition or gnome or some such customization, so they'd use:

main :: IO ()
main = Installer.main
    & Installer.partition First "/boot" Ext3 (MiB 256)
    & Installer.partition Next "/" Ext4 (GiB 5)
    & Installer.partition Next "/home" Ext4 FreeSpace
    & Installer.grubBoots "hd0"
    & os (System (Debian Stable) "amd64")
    & Apt.stdSourcesList
    & Apt.installed ["task-gnome-desktop"]

  • The installation system is itself built using propellor. A free feature given the above design, so basically all it will take to build an installation iso is this code:

main :: IO ()
main = Installer.main
    & Installer.target CdImage "installer.iso"
    & os (System (Debian Stable) "amd64")
    & Apt.stdSourcesList
    & Apt.installed ["task-xfce-desktop", "ghc", "propellor"]
    & User.autoLogin "root"
    & User.loginStarts "propellor --installer"

  • Propellor has a nice display of what it's doing so there is no freaking progress bar.

Well, now I know where propellor might end up if I felt like spending a month and adding a few thousand lines of code to it.

using a debian package as the remote for a local config repo

Today I did something interesting with the Debian packaging for propellor, which seems like it could be a useful technique for other Debian packages as well.

Propellor is configured by a directory, which is maintained as a local git repository. In propellor's case, it's ~/.propellor/. This contains a lot of haskell files, in fact the entire source code of propellor! That's really unusual, but I think this can be generalized to any package whose configuration is maintained in its own git repository on the user's system. From now on, I'll refer to this as the config repo.

The config repo is set up the first time a user runs propellor. But, until now, I didn't provide an easy way to update the config repo when the propellor package was updated. Nothing would break, but the old version would be used until the user updated it themselves somehow (probably by pulling from a git repository over the network, bypassing apt's signature validation).

So, what I wanted was a way to update the config repo, merging in any changes from the new version of the Debian package, while preserving the user's local modifications. Ideally, the user could just run git merge upstream/master, where the upstream repo was included in the Debian package.

But, that can't work! The Debian package can't reasonably include the full git repository of propellor with all its history. So, any git repository included in the Debian binary package would need to be a synthetic one, that only contains probably one commit that is not connected to anything else. Which means that if the config repo was cloned from that repo in version 1.0, then when version 1.1 came around, git would see no common parent when merging 1.1 into the config repo, and the merge would fail horribly.

To solve this, let's assume that the config repo's master branch has a parent commit that can be identified, somehow, as coming from a past version of the Debian package. It doesn't matter which version, although the last one merged with will be best. (The easy way to do this is to set refs/heads/upstream/master to point to it when creating the config repo.)

Once we have that parent commit, we have three things:

  1. The current content of the config repo.
  2. The content from some old version of the Debian package.
  3. The new content of the Debian package.

Now git can be used to merge #3 onto #2, with -Xtheirs, so the result is a git commit with parents of #3 and #2, and content of #3. (This can be done using a temporary clone of the config repo to avoid touching its contents.)

Such a git commit can be merged into the config repo, without any conflicts other than those the user might have caused with their own edits.

So, propellor will tell the user when updates are available, and they can simply run git merge upstream/master to get them. The resulting history looks like this:

* Merge remote-tracking branch 'upstream/master'
|\  
| * merging upstream version
| |\  
| | * upstream version
* | user change
|/  
* upstream version
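
The join step can be tried out in a throwaway repository. What follows is a minimal sketch and not propellor's actual code: the file names, the pkg-new branch, and the demo identity are made up for illustration, and the new package version is simulated with an orphan branch rather than a repository shipped in a .deb. It also builds the joining commit with git commit-tree instead of a -Xtheirs merge, since that plumbing command lets the two parents and the resulting content be stated directly.

```shell
#!/bin/sh
set -eu

# Identity for the demo commits, so nothing depends on global git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

work=$(mktemp -d)
cd "$work"

# 1. The config repo, created from package version 1.0.
git init -q config
cd config
user_branch=$(git symbolic-ref --short HEAD)
printf 'setting = default\n' > upstream.conf
printf 'hosts = none\n' > local.conf
git add .
git commit -q -m 'upstream version'
git branch upstream/master      # remember which commit came from the package
old=$(git rev-parse upstream/master)

# 2. The user edits their own file.
printf 'hosts = example.com\n' > local.conf
git commit -q -a -m 'user change'

# 3. Package version 1.1 shows up as one commit with no parents.
git checkout -q --orphan pkg-new
printf 'setting = improved\n' > upstream.conf
printf 'hosts = none\n' > local.conf
git add .
git commit -q -m 'upstream version'
new=$(git rev-parse HEAD)

# 4. Join old and new: both are parents, the content is exactly the new tree.
joined=$(git commit-tree "$new^{tree}" -p "$old" -p "$new" -m 'merging upstream version')
git branch -f upstream/master "$joined"

# 5. The user's branch now shares an ancestor with upstream/master.
git checkout -q "$user_branch"
git merge -q -m "Merge branch 'upstream/master'" upstream/master
git log --graph --format=%s
```

After the merge, upstream.conf carries the new packaged content while local.conf keeps the user's edit, and the log should have the same shape as the history shown above.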

So, generalizing this, if a package has a lot of config files, and creates a git repository containing them when the user uses it (or automatically when it's installed), this method can be used to provide an easily mergeable branch that tracks the files as distributed with the package.

It would perhaps not be hard to get from here to a full git-backed version of ucf. Note that the Debian binary package doesn't have to ship a git repository, it can just as easily ship the current version of the config files somewhere in /usr, and check them into a new empty repository as part of the generation of the upstream/master branch.

abram's 2014

pics from trip to Abram's Falls

The trail to Abram's Falls seems more treacherous as we get older, but the sights and magic of the place are unchanged in our first visit in years.

laptop death

So I was at Ocracoke island, camping with family, and I brought my laptop along as I've done probably half a dozen times before. An enormous thunderstorm came up. It rained for 8 hours and thundered for 3 of those. Some lightning cracks quite close by as we crouched in the food tent, our feet up off the increasingly wet ground "just in case". The campground flooded. Luckily we were camped in the dunes and tents mostly avoided being flooded with 2-3 inches of water. (That was just the warmup; a hurricane hit a week after we left.)

My laptop was in my tent when this started, and I got soaked to the skin just running over there and throwing it up on the thermarest to keep it out of any flooding and away from any drips. It seemed ok, so best not to try to move it to the car in that downpour.

Next time I checked, it turned out the top vent of the tent was slightly open and dripping. The laptop bag was damp. But inside it seemed ok. Rain had slackened to just heavy, so I ran it down to the car. Laptop appeared barely damp, but it was hard to tell as I had quite forgotten what "dry" was. Turned it on for 10 seconds to check the time. It was 7:30 and we still had to cook dinner in this mess. Transferred it to a dry bag.

(By the way, in some situations, discovering you have a single dry towel you didn't know you had is the best gift in the world!)

Next morning, the laptop was dead. When powered on, the fan came on full, the screen stayed black, and after a few seconds it turned itself back off.

I need this for work, so it was a crash priority to get it fixed or a replacement. Before I even got home, I had logged onto Lenovo's website to check warranty status and found 2 things:

  1. They needed some number from a sticker on the bottom of my laptop. Which was no longer there.
  2. The process required some strange login on an entirely different IBM website.

At this point, I had a premonition of how the bureaucracy would go. Reading Sesse's Blehnovo, I see I was right. I didn't even try. I ordered a replacement with priority shipping.

When I got home, I pulled the laptop apart to try to debug it. I still don't know what's wrong with it. The SSD may be damaged; it seems to cause anything I put it into to fail to work.

New laptop arrived in 2 days. Since this model is now a year old, it was a few hundred dollars cheaper this time around. And now I have an extra power supply, and a replacement keyboard, and a replacement fan etc. And I've escaped the dead USB port and broken rocker switch of the old laptop too.

The only weird thing is that, while my old laptop had no problem with my Toshiba passport USB drive, this new one refuses to recognize it unless I plug it into a USB 1.0 hub. Oh well..


Update: Ocracode for this trip:

OBX1.1 P6 L7 SA3d+++b+c++ U2(rearended,laptop death) T4f2b1 R2T Bb+m++++n++
F+++u++ SC+s-g6 H+f0i3 V+++s++m0 E++r+

what does docker.io run -it debian sh run?

When you type docker.io run -it debian sh, it goes off and gets "debian" and runs it. But what is in this "debian" image? How was it built?

The docker hub does not really say. All it tells us is this is a "(Semi) Official Debian base image" and that its sources.list uses http.debian.net for geolocation.

There's a link to https://github.com/dotcloud/stackbrew/blob/master/library/debian which in turn uses a very strange git repository, owned by Debian maintainer Tianon Gravi, that contains compressed tarballs of Debian: http://github.com/tianon/docker-brew-debian "Git is not a fan of what we're doing here."

The "source", such as it is, that is used to build this image consists of:

FROM scratch
ADD rootfs.tar.xz /
CMD ["/bin/bash"]

and

mkimage.sh -t tianon/debian:wheezy -d . debootstrap --variant=minbase --components=main --include=inetutils-ping,iproute wheezy http://http.debian.net/debian

I don't know where mkimage.sh is. [Update: Probably /usr/share/docker.io/contrib/mkimage-debootstrap.sh or a modified version] And anyway, I have no reason to trust that this image is built the way it claims to be built. So, the question remains: What is in this image?

To find out, I did a debootstrap --variant=minbase stable and diffed the entire docker debian image against it. The diff was 6738 lines, from which I found the following interesting differences.
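
A comparison of this kind can be sketched as a recursive diff of two unpacked trees. This is a simplified illustration under stated assumptions, not the exact procedure used for the numbers above: compare_trees is a hypothetical helper name, and the two small directories built here are stand-ins with made-up contents. In practice one tree would come from debootstrap and the other from unpacking the image (for example the tarball produced by docker export), and device nodes, symlinks and permissions would need more care than a plain diff gives them.

```shell
#!/bin/sh
set -eu

# compare_trees REFERENCE CANDIDATE
# Prints one line per file that differs or exists on only one side.
# diff exits 1 when it finds differences, so callers should expect that.
compare_trees() {
    diff -r -q "$1" "$2"
}

# Stand-in trees; the file contents are invented for the demo.
ref=$(mktemp -d)    # plays the role of the pure debootstrap
img=$(mktemp -d)    # plays the role of the unpacked image
mkdir -p "$ref/etc" "$ref/usr/sbin" "$img/etc" "$img/usr/sbin"
printf 'stand-in motd\n' > "$ref/etc/motd"
printf 'Linux viper 3.12.20-gentoo\n' > "$img/etc/motd"
printf 'exit 101\n' > "$img/usr/sbin/policy-rc.d"

report=$(compare_trees "$ref" "$img" || true)
printf '%s\n' "$report"
```

The report names only which paths changed; each interesting path still has to be inspected by hand, which is where most of the effort goes.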

added packages

The image has iputils-ping and netbase and iproute added. These are not in a minbase debootstrap, but are in a regular debootstrap. It's rather weird that the docker image is based on a minbase debootstrap, since this means they have to add back important stuff like this on an ad-hoc basis.

If the expectation is that an experienced Unix person who found it missing would say "What on earth is going on, where is 'foo'?", it must be an 'important' package. -- Debian Policy

apt hooks

DPkg::Post-Invoke { "rm -f /var/cache/apt/archives/*.deb /var/cache/apt/archives/partial/*.deb /var/cache/apt/*.bin || true"; };
APT::Update::Post-Invoke { "rm -f /var/cache/apt/archives/*.deb /var/cache/apt/archives/partial/*.deb /var/cache/apt/*.bin || true"; };

Dir::Cache::pkgcache "";
Dir::Cache::srcpkgcache "";

Acquire::Languages "none";

These are some strange modifications to apt's config. The intent is clearly to avoid wasting disk space, even at the expense of making apt slower (by disabling caches) and losing translations.

I am curious if apt might ever invoke the DPkg::Post-Invoke twice in an upgrade in which it runs dpkg twice. I'm also curious whether deleting /var/cache/apt/archives/lock could cause a problem.

unsafe-io

dpkg is configured to use unsafe-io.

motd

Linux viper 3.12.20-gentoo #1 SMP Sun May 18 12:36:24 MDT 2014 x86_64

Yes, that's "gentoo". Presumably this tells us something about the build host.

policy-rc.d

/usr/sbin/policy-rc.d contains "exit 101", which prevents daemons from being automatically started after they are installed. This may or may not be desirable, depending on what you're doing with docker.

It notably also prevents restarting running daemons in this container if they're upgraded for eg, a security fix. It would almost certainly be better if this script allowed restarting running daemons.
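
One possible shape for such a script is sketched below. It is an illustration only, not something shipped in any image, and it rests on assumptions: that policy-rc.d is invoked with the init script name followed by a single action, that exit status 0 means allowed and 101 means forbidden by policy, and that service NAME status is a usable way to tell whether a daemon is running. The decide and is_running names are hypothetical, and real invocations may pass options and a space-separated list of actions that this sketch does not parse.

```shell
#!/bin/sh
# Hypothetical helper: succeeds when the named service reports it is running.
is_running() {
    service "$1" status >/dev/null 2>&1
}

# decide SERVICE ACTION
# Returns 0 (allowed) for restart-like actions on a daemon that is already
# running, and 101 (forbidden by policy) for everything else.
decide() {
    case "$2" in
        restart|try-restart|reload|force-reload)
            if is_running "$1"; then
                return 0
            fi
            return 101
            ;;
        *)
            return 101
            ;;
    esac
}

# An installed /usr/sbin/policy-rc.d would end by calling decide with the
# init script name and action taken from its arguments.
```

With this policy a freshly installed daemon still does not start, but an upgrade of a daemon that is already running is allowed to restart it.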

diversions

/sbin/initctl is diverted and replaced with /bin/true. This is a workaround for a bug in sysvinit; when upgraded inside a docker container it hangs while trying to run initctl.

missing devices

Some versions of the debian image are missing things in /dev. See this bug.

(I had listed some device files that I thought were missing, but I was wrong.)

some gpg thing is different

Binary files pure-debootstrap/etc/apt/trustdb.gpg and from-docker/etc/apt/trustdb.gpg differ

Oh well, that can't be important.. Or can it? I did not check.

conclusions

I would hardly consider this to be a "(Semi) Official Debian image". Some of the changes are quite dubious. The build environment is not Debian. There is no guarantee you'll get the same image I examined. Diffing thousands of lines of filesystem changes is not a particularly fun or reliable way to spot accidental or malicious changes.

I'd recommend only trusting docker images you build yourself. I have some docker images published somewhere that are built with 100% straight debootstrap with no modifications (and even an armel image that can be used on an x86 system thanks to qemu). But I'm not going to link to them, because again, you should only trust docker images you built yourself. To help increase your mistrust of me, I present this IRC snippet:

<joeyh> I'll bet I could publish an image that just did a killall5 as root on startup and get plenty of people to nuke their container hosts

Here are some ideas for things Debian could do to improve this:

  • Make a package that can build docker images of Debian, in a fully reproducible fashion. Ie, same versions of debs in, same byte stream out.
  • If it makes sense for the docker image to not contain all the packages in a standard debootstrap (maybe leaving out init systems), or to contain other packages, write down the rationale for this, and make a --variant=docker.
  • Make a package that provides appropriate tweaks for Debian in a container. This might include a policy-rc.d that allows restarting daemons on upgrade if they're already running in the container, and otherwise prevents running daemons.
  • Make a low-disk-space package that eg, prevents apt from caching debs.
  • Provide some way to verify, through gpg signatures, that docker has pulled an actual trusted image and not some https-MITMed thing. (See also #746394.)

PS, if this wasn't enough fun, just consider the tweaks made to the "Debian" images on all the VPS hosts out there.

how I wrote init by accident

I wrote my own init. I didn't mean to, and in the end, it took 2 lines of code. Here's how.

Propellor has the nice feature of supporting provisioning of Docker containers. Since Docker normally runs just one command inside the container, I made the command that docker runs be propellor, which runs inside the container and takes care of provisioning it according to its configuration.

For example, here's a real live configuration of a container:

        -- Exhibit: kite's 90's website.
        , standardContainer "ancient-kitenet" Stable "amd64"
                & Docker.publish "1994:80"
                & Apt.serviceInstalledRunning "apache2"
                & Git.cloned "root" "git://kitenet-net.branchable.com/" "/var/www"
                        (Just "remotes/origin/old-kitenet.net")

When propellor is run inside this container, it takes care of installing apache, and since the property states apache should be running, it also starts the daemon if necessary.

At boot, docker remembers the command that was used to start the container last time, and runs it again. This time, apache is already installed, so propellor simply starts the daemon.

This was surprising, but it was just what I wanted too! The only missing bit to make this otherwise entirely free implementation of init work properly was two lines of code:

                void $ async $ job reapzombies
  where
        reapzombies = void $ getAnyProcessStatus True False

Propellor-as-init also starts up a simple equivalent of rsh on a named pipe (for communication between the propellor inside and outside the container), and also runs a root login shell (so the user can attach to the container and administer it). Also, running a compiled program from the host system inside a container, which might use a different distribution or architecture, was an interesting challenge (solved using the method described in completely linux distribution-independent packaging). So it wasn't entirely trivial, but as far as init goes, it's probably one of the simpler implementations out there.

I know that there are various other solutions in the space of an init for Docker -- personally I'd rather the host's systemd integrated with it so I could see the status of the container's daemons in systemctl status. If that does happen, perhaps I'll eventually be able to remove 2 lines of code from propellor.