new year

I'm trying to work on having days that are somehow individually memorable this year. So far..

0 (leap day)

Finally tackled the chapter on monads. I'd read various explanations a year ago, but was swimming in syntax I didn't understand. After percolating for a year, and learning to read the syntax better, monads turned out to make very simple sense.

(I can't say the same about Johnny Monad.)

I had been meaning to write sometime about a method I used in ikiwiki that lets expressions in a mini-language, which are normally evaluated to match a set of pages, instead be evaluated to explain why they succeed or fail. It's a cute technique, though hard to explain. Now I'm pretty sure it's just a monad. So I don't have to explain it!
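
(Okay, a quick sketch anyway, for the curious. This is just the shape of the thing in Haskell, not ikiwiki's actual code, which is Perl; the pagespec and all the names are invented for illustration. Each test either matches, or fails with an explanation, and the monad propagates the first failure's explanation through the whole expression for free.)

    import Data.List (isInfixOf, isPrefixOf)

    -- A test either fails with an explanation of why, or matches.
    -- Either String is a monad, so do-notation sequences tests and
    -- short-circuits with the first failing test's explanation.
    check :: Bool -> String -> Either String ()
    check True  _      = Right ()
    check False reason = Left reason

    -- Something like the pagespec "blog/* and !*draft*":
    matches :: String -> Either String ()
    matches page = do
        check ("blog/" `isPrefixOf` page) (page ++ " is not under blog/")
        check (not ("draft" `isInfixOf` page)) (page ++ " is a draft")

    main :: IO ()
    main = mapM_ (print . matches) ["blog/foo", "news/foo", "blog/draft1"]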

1

Visiting Abram's falls this time of year, the canyon is in constant wintry shadow. The falls are not frozen, but have icicles twice my height, and there are rank upon rank of icicles all down the walls, an ice cathedral.

It's a bright sunny day, but on the whole hike, I only get into the sunlight once, briefly, at the top of the giant steps. Then back into the shade. Back at my car, I'm surprised that it's only 3 pm, feels like it should be 5.

Made a pecan pie with daddy's pecans and eggs.

2

A grey day with snow and worse. The paper's rss feed repeats "dozens–hundreds of wrecks" over and over, as if to make up for there being no 60 point type.

I'm reading Ted Nelson's book Geeks Bearing Gifts. The chapter summaries seem better than the actual book. And the on-demand printing makes me think I'm reading a poorly laid out web page, rather than something typeset. But I love that he goes all the way back to the invention of the alphabet and of hierarchical categorization, and suggests that the entire basis of modern computers is arbitrary and/or wrong.

Eating ginger duck downtown, I look up and a pizza delivery guy has slid out of control right in front of me and crashed.

Posted
proposing rel-vcs

I'm working on designing a microformat that can be used to indicate the location of VCS (git, svn, etc) repositories related to a web page.
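
The markup would be along these lines (the exact rel values here are from my draft, and may still change):

    <a rel="vcs-git" href="git://git.example.com/project.git"
       title="project source repository">source</a>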

I'd appreciate some web standards-savvy eyes on my rel-vcs microformat rfc.

If it looks good, next steps will be making things like gitweb, viewvc, ikiwiki, etc, support it. I've already written a preliminary webcheckout tool that will download an url, parse the microformat, and run the appropriate VCS program(s).
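
Given a page carrying markup like the example above, a run looks something like this (made-up url, and the output shown is illustrative):

    joey@gnu:~>webcheckout http://example.com/project/
    git clone git://git.example.com/project.git project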

(Followed by, with any luck, github, ohloh, etc using the microformat in both the pages they publish, and perhaps, in their data importers.)

Why? Well,

  1. A similar approach worked great for Debian source packages with the XS-VCS-* fields.
  2. Pasting git urls from download pages of software projects gets old.
  3. I'm tired of having to do serious digging to find where to clone the source to websites like Keith Packard's blog, or cairographics.org, or St Hugh of Lincoln Primary School. Sites that I know live in a git repo, somewhere.
  4. With the downturn, hosting sites are going down left and right, and users who trusted their data to these sites are losing it. Examples include AOL Hometown and Ficlets, Google Lively, JournalSpace, Podango, etc etc. Even LiveJournal's future is looking shaky. Various people are trying to archive some of this data before it vanishes for good. I'm more interested in establishing best practices that make it easy and attractive to let all the data on your website be cloned/forked/preserved. Things that people bitten by these closures just might demand in the future. This will be one small step in that direction.
Posted
rip your own adventure

I used to read Choose Your Own Adventure books straight through from page 1 to 100.

Now, I wget -m websites, and less * ... same deal.

PS, Thanks for the great feedback on rel-vcs. The spec has been updated, fixing many of your nits, and there's an implementation in gitweb, ikiwiki, and mr.

Posted
fakechroot warning label

fakechroot, or any similar tool that uses a facility such as LD_PRELOAD, is not suitable for use in a security context. Such tools are way cool, but an attacker can trivially break out of them.

Kapil Paranjape suggests using fakechroot for locking down ssh authorized keys for unison. The idea is that unison will run in a (fake)chroot, set up by a regular, non-root user, and will thus be limited to the files you want it to access.

The first problem is that any statically linked executable on the system (unison can be used to upload one, too) is immune to the fakechroot.

joey@gnu:~>FAKECHROOT_EXCLUDE_PATH=/bin fakeroot fakechroot chroot /tmp/empty /bin/sh
sh-3.2# ls
sh-3.2# ls /
sh-3.2# cd ..
sh-3.2# ls
sh-3.2# sash
Stand-alone shell (version 3.7)
> cd ..
> ls
empty  gconfd-joey  gpg-XhukMB  keyring-ZxSuTB  orbit-joey
> cd ..
> ls
bin    etc         lib     opt   selinux  usr

Getting unison to run a static executable in a setup such as Kapil describes is left as an exercise for an attacker more determined than I.

Here, though, is an easier way.

sh-3.2# ls /
sh-3.2# ls /bin/..
bin    etc         lib     opt   selinux  usr

To understand why this works, notice that I left /bin excluded when I ran fakechroot. (Still .. Is this a bug in fakechroot?)

Or, you could use unison to upload a symlink:

joey@gnu:~>ln -s / /tmp/empty/root
joey@gnu:~>FAKECHROOT_EXCLUDE_PATH=/bin fakeroot fakechroot chroot /tmp/empty /bin/sh
sh-3.2# ls root
bin    etc         lib     opt   selinux  usr

Moral: Taking a program, be it fakechroot or unison, that was never designed with security in mind, and trying to use it as a security barrier, is an open invitation to pain.

Posted
little known fact

An apparently little known fact about dpkg is that it clears suid bits when upgrading packages. This defeats the hardlink-a-suid-binary-and-wait-for-exploit attack that used to be a worry, and which apparently still is to some.

joey@gnu:~>ln /usr/bin/sudo
joey@gnu:~>dir sudo
-rwsr-xr-x 3 root root 112K Jul  6  2008 sudo*
joey@gnu:~>sudo apt-get --reinstall install sudo
Reading package lists... Done
[...]
Get:1 http://ftp.egr.msu.edu unstable/main sudo 1.6.9p17-1 [177kB]
Fetched 177kB in 1s (91.0kB/s)    
Created commit e5523fe: saving uncommitted changes in /etc prior to apt run
 1 files changed, 14 insertions(+), 0 deletions(-)
(Reading database ... 167780 files and directories currently installed.)
Preparing to replace sudo 1.6.9p17-1 (using .../sudo_1.6.9p17-1_i386.deb) ...
Unpacking replacement sudo ...
Processing triggers for man-db ...
Setting up sudo (1.6.9p17-1) ...
joey@gnu:~>dir sudo 
-rw------- 1 root root 112K Jul  6  2008 sudo

dpkg has done this since version 1.10.18.1, released in 2004.

PS, Can any rpm users tell me if rpm does this?

PPS, If you find yourself making statements like "While noexec is only a weak defense, it gives a little bit more protection", you are probably not really talking about security, but instead about a warm fuzzy feeling.

PPPS, If shellcode can create a suid root executable, it can create it in /root or some other directory that is not mounted nosuid.

Posted
investigation of an inelegant installer

Various people have been taken in by a pretend Debian installer/bootloader for the Android G1 phone.

What is it, really? Well, it's a 200+ mb zip file, containing a filesystem image of a Debian installation. About 100 mb of that is cached .debs in /var. A few minor changes have been made, and /root/.bash_history helpfully shows what was done. None of it seems to be malicious, but I did not check every file.

Also included is a hilarious "boot loader" script that contains literally the following code:

echo "Custom Linux Pseudo Bootstrapper V1.0 - by Mark Walker"
echo "WEB: http://www.androidfanatic.com/"
echo "EML: admin@androidfanatic.com"
echo " "
sleep 1
echo "Starting init process"
sleep 1
echo "INIT: Debian booting....."
sleep 1
echo "Running Linux Kernel"
sysctl -w net.ipv4.ip_forward=1
sleep 1
echo "AutoMounter started"
sleep 1
echo "Type EXIT to end session"
echo "Make sure you do a proper EXIT for a clean kill of Debian!"
echo " "

chroot $mnt /bin/bash

That's worth the price of admission right there! I especially like how "init" is "started" before the "kernel". You can't make stuff like this up.

So, in summary, this is a way to get a prebuilt Debian chroot at the expense of wasting bandwidth and space on your phone with unnecessary bits. Running debootstrap on the phone is the correct way to accomplish the same thing.
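
(Roughly like so; this is untested on a phone, and armel, lenny, and the target directory are my assumptions. You need debootstrap itself and a few hundred mb of space.)

    debootstrap --arch armel lenny /sdcard/debian http://ftp.debian.org/debian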

It's sad that there's such interest in Debian on the G1, and such a general lack of clue, that people will probably actually use this. And, like it or not, this "is" the Debian install for the G1 -- no further efforts are likely to get this kind of press. Sorta like how that badly pressed CD back in the day stole the thunder of Debian 1.0.

More generally, it's a pity that when people want to get Debian on a new device, be it the G1 phone or the EEE or OpenMoko or what have you, only about half of them seem to do it right, by modifying d-i and sending the modifications back to the installer team. For the rest, this kind of hand-built, unverifiable chroot gimmickry is the de facto standard.

This also shows you the kind of fact checking that slashdot and random blogs tend to do -- ie, none. (My own blog, of course, is rigorously fact-checked and edited by armies of lolcats.)

(Sorry, this seems to be my week for posting corrections to other people's posts to Planet Debian. I don't mean to pick on you guys.)

Posted
git as an alternative to unison

I've used unison for a long while for keeping things like my music in sync between machines. But it's never felt entirely safe, or right. (Or fast!) Using a VCS would be better, but would consume a lot more space.

Well, space still matters on laptops, with their smallish SSDs, but I have terabytes of disk on my file servers, so VCS space overhead there is no longer of much concern for files smaller than videos. So, here's a way I've been experimenting with to get rid of unison in this situation.

  • Set up some sort of networked filesystem connection to the file server. I hate to admit I'm still using NFS.

  • Log into the file server, init a git repo, and check all your music (or whatever) into it.

  • When checking out on each client, use git clone --shared. This avoids including any objects in the client's local .git directory.

    git clone --shared /mnt/fileserver/stuff.git stuff
  • Now you can just use git as usual, to add/remove stuff, commit, update, etc.
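
Day to day, a session on a client then looks something like this. (The album name is made up; also note that pushing into a branch that is checked out on the server needs care.)

    cd stuff
    git pull                   # get changes committed on other machines
    git add new-album          # stage newly added files
    git commit -m 'new album'
    git push                   # send it back to the repo on the file server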

Caveats:

  • git add is not very fast. Reading, checksumming, and writing out gig after gig of data can be slow. Think hours. Maybe days. (OTOH, I ran that on a Thecus.)
  • Overall, I'm happy with the speed, after the initial setup. Git pushes data around faster than unison, despite not really being intended to be used this way.
  • Note the use of git clone --shared, and read the caveats about this mode in git-clone(1).
  • git repack is not recommended on clients because it would read and write the whole git repo over NFS.
  • Make sure your NFS server has large file support. (The userspace one doesn't; kernel one does.) You don't just need it for enormous pack files. The failure mode I saw was git failing in amusing ways that involved creating empty files.
  • Git doesn't deal very well with a bit flipping somewhere in the middle of a 32 gigabyte pack file. And since this method avoids duplicating the data in .git, the clones are not available as backups if something goes wrong. So if regenerating your entire repo doesn't appeal, keep a backup of it.

(Thanks to Ted Ts'o for the hint about using --shared, which makes this work significantly better, and simpler.)

Posted
a hairy tale

When I was a kid, I had a daddy with this moustache.

One day, he shaved it off. I looked at the pale, exposed strip of lip, and demanded he grow it back. There are no images of this event, besides those burned into my brain. It was in simpler times.

Dad did grow it back, and a beard too, and now he's (great)-Grandpa Hess.

Who, come to think, looks similar to my tie-dyed friend Bdale.

Bdale is one of the Debian grandfathers, those old guys who the young ones in the project look up to. And today, he scarred all their minds.

The collective shout of "grow it back!" is gonna be huge.

Did I mention that it all involves Tasmanian devils, charity, and the mean razor of Linus Torvalds?

These are not simple times.

Posted
ephemera vs the law

The changes to whitehouse.gov are being used as an example of ephemeral digital content.

At the exact moment Barack Obama was inaugurated, all traces of President Bush vanished from the White House website, replaced by images of and speeches by his successor. Attached to the website had been a booklet entitled 100 Things Americans May Not Know About the Bush Administration - they may never know them now. When the website changed, the link was broken and the booklet became unavailable.

At the same time, the Obama administration is chafing under rules that don't allow them to use Facebook, gmail, Blackberries, and Twitter. Because of this pesky Presidential Records Act.

Suddenly, they have to relearn how to communicate, because the law has not caught up with the way people live. They are being forced to return to older technologies (or they will once those technologies start working properly), and abandon or become more guarded in their use of newer technologies.

Anyone else find the dissonance am(az|us)ing?

Seems that anyone who really wants a copy of that missing booklet will be able to access it via FOIA in a few years (unless it got wiped with all of Cheney's files).

Meanwhile, if the new guys find a way to stop being hobbled by the law, their digital content will likely be truly ephemeral, since they have become addicted to "newer technologies" that eliminate all expectations about data's preservation and privacy.

I think this is one case of the law not keeping up with technology that I can support. It might be more fair, though, to describe it as technology not keeping up with the law.

Posted
GSoC followups

Arthur Liu is doing a great series of posts following up on the results of Google Summer of Code '08 projects in Debian. (post 1, post 2) Most of them failed to produce code that's actually used in Debian, though there were some very successful projects, too.

My experience with ikiwiki in '07 parallels that:

latex

Patrick Winnertz produced a working teximg (latex -> image) plugin that's in ikiwiki.

The initial proposal also included doing latex -> html and wiki page -> latex conversions; that work never happened.

file upload and image gallery

Ben Coffey had an ambitious proposal to add file upload support to ikiwiki and also write an image gallery plugin using that.

Ben did produce a working file upload interface, but that code never got into ikiwiki. This was due to a combination of at least three things:

  1. The only communication with the ikiwiki community was a design proposal at the beginning, and a code dump toward the end of the project.
  2. There was an existing, complex spec for how to limit who could upload what files, but Ben chose not to implement those access controls.
  3. The code dump was not even noticed until much later, when I was putting the finishing touches on my own implementation of an attachments plugin.

By the way, since I happened to do that attachments plugin as a paid consultant, I have an interesting data point: It took me 19 hours total to implement it, vs roughly a summer of work for the GSoC project. I don't mean this to reflect poorly on Ben; it just shows that someone who is familiar with a code base and has thought a lot about a problem can work on it much more efficiently than a newcomer.

AFAICS, Ben never did get to the gallery part. His involvement in SoC was cut short for personal reasons.

Wiki WYSIWYG Editor

Taylor Killian produced a working interface to Wikiwyg, but it tragically never made it into ikiwiki.

This seemed to be going swimmingly -- Taylor set up a subversion repository for his work and produced several revisions of the wikiwyg plugin in response to feedback. At the end of the summer, it was close to ready to be included in ikiwiki.

Then we lost contact with Taylor, and his site fell off the net. Subversion repo: Gone. Tarballs: Gone.

The final tragic part of this story is that I had a local copy of all of Taylor's work. When I moved ikiwiki over from subversion to git, I set up a wikiwyg branch, put his work into it, and happily deleted my other copies of his work. But I made some kind of newbie mistake pushing that branch, and so his work never made it onto my git server, and at this point seems completely lost.

(I still hope to hear from Taylor..)

gallery

Arpit Jain produced a working gallery plugin for ikiwiki. But that code never made it into ikiwiki.

The main stumbling block seems to be that it used the lightbox javascript library, which, at least at the time, was licensed under a non-free CC license.

The code is present in a branch in ikiwiki's git repo, but at this point there is still no image gallery plugin in ikiwiki, though others are now working on other implementations.

Conclusions

I learned some important things from participating in the '07 SoC:

  • There needs to be a well-defined place where students check in their code regularly, and where it is regularly reviewed. Either give them each accounts and branches in the project's main subversion repository, or, better, use a distributed VCS. Ikiwiki switched to git as a direct result of the problems I saw with students publishing their code in the ad-hoc ways that resulted from not having that. And I keep backups of every ikiwiki git repo I am aware of, because any could fall off the net at any time.
  • If your goal is to have a student develop something that is included in your project, that needs to be one of the up-front success criteria for the SoC project. Otherwise, most of them will fail to get all the way there. Students need an incentive to deal with licensing issues, to respond to maintainers' feedback, and to keep following up until their code is merged. I suspect that requiring students to leap over this sometimes very tall bar will scare a lot of them away, though. You have to decide whether your goal is indeed to get working code into production, or if it's more to mentor a student for the summer.
  • Just because one is very familiar with a code base and very productive in working on it, and has had lots of success getting others to contribute to it in the usual free software manner, does not mean that one will be a good mentor for a student working on that code base. In fact, it probably means you won't be, unless you have some prior experience teaching students. Which is why I've not participated in SoC since. :-/
Posted