
I've posted my cellphone snapshots from Spain and have also collected pictures with me in them.

I had a good time at DebConf 9, though it was also sort of weird, and just a bit too long toward the end. I'm currently uncertain whether it will induce me to get more involved in Debian than I have been in the past year. There were many people it was wonderful to see again, or meet for the first time.

I don't think I've mentioned this publicly before, but I found a security hole in OpenBSD in 1999. Ok, so it wasn't in the security-focused OS, but just in their infrastructure. And it wasn't a very bad hole; it just allowed doing things like IRCing with an openbsd.org address.

The hole was in their CVS over SSH setup. They had forgotten to turn off port forwarding. I have a clear memory of contacting them, probably on irc, and getting back a dismissive response, but assuming they'd fix it.

So I was really surprised when, checking as I write this article, I found the same hole still exists in one of their main CVS servers, as well as more than one of their CVS mirrors. What this says about OpenBSD and security is left to the reader. And if the reader is in China -- hey, this is a great way to get around the Great Firewall ...

But I hadn't planned to poke at OpenBSD here. What I really want to think about is why ssh, which is very security focused, has in one area had an insecure default for over a decade. By default, if you can connect to an ssh server, you can forward ports to and from that server (with the -L and -R options). The sshd_config(5) man page says:

AllowTcpForwarding
        Specifies whether TCP forwarding is permitted.
        The default is “yes”.  Note that disabling TCP
        forwarding does not improve security unless users
        are also denied shell access, as they can always
        install their own forwarders.

Which is a pretty silly justification -- have you ever tried to install a forwarder on some random shell account? Without pissing off the admin? ssh provides a much less obvious way to do it, one the admin will probably never notice.

Anyway, the assumption is that if a user connects via ssh, they get an unrestricted unix shell. But in many ssh setups, from OpenBSD's CVS, to the Pro Git book's ssh setup, to innumerable limited purpose ssh accounts, that's not true. The user lacks a shell, but unless the administrator remembers to turn off port forwarding, they retain the ability to bounce through the ssh server to google, or an intranet, or whatever.

I wonder what percentage of people setting up restricted ssh accounts forget to add "no-port-forwarding" to authorized_keys? It's hard to tell because most such accounts are limited to a few people. You can't scan the whole net for them. Still, I've seen enough people get this wrong that I wouldn't be surprised if it affects between 10% and 50% of restricted ssh accounts.
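For a single server, at least, the question can be answered by scanning its authorized_keys files. Here is a minimal Python sketch of such a check. It is not part of OpenSSH; the names `split_options` and `risky_entries` are made up for illustration, and it assumes that any entry with a forced command= option is meant to be a restricted account. It flags those entries when they carry neither no-port-forwarding nor the newer restrict option.

```python
# Key-type prefixes that mark the start of the key itself (no options before it).
KEY_TYPES = ("ssh-rsa", "ssh-dss", "ssh-ed25519", "ecdsa-sha2-",
             "sk-ssh-ed25519", "sk-ecdsa-sha2-")


def split_options(line):
    """Split one authorized_keys line into (list of options, rest of line)."""
    if line.startswith(KEY_TYPES):
        return [], line
    options, current, in_quotes = [], [], False
    i = 0
    while i < len(line):
        ch = line[i]
        # A backslash-escaped quote inside a quoted value does not end the value.
        if in_quotes and ch == "\\" and line[i + 1:i + 2] == '"':
            current.append('\\"')
            i += 2
            continue
        if ch == '"':
            in_quotes = not in_quotes
        elif ch == "," and not in_quotes:
            options.append("".join(current))
            current = []
            i += 1
            continue
        elif ch in " \t" and not in_quotes:
            break  # unquoted whitespace ends the options field
        current.append(ch)
        i += 1
    options.append("".join(current))
    return options, line[i:].strip()


def risky_entries(text):
    """Return line numbers of forced-command entries that still allow forwarding."""
    risky = []
    for lineno, raw in enumerate(text.splitlines(), 1):
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        options, _ = split_options(line)
        names = {opt.split("=", 1)[0].lower() for opt in options}
        restricted = "command" in names
        blocked = "no-port-forwarding" in names or "restrict" in names
        if restricted and not blocked:
            risky.append(lineno)
    return risky


# Made-up sample file; the keys are placeholders.
SAMPLE = """\
# anonymous CVS: forced command, but forwarding left on
command="cvs server",no-pty ssh-rsa AAAA... anoncvs
command="cvs server",no-port-forwarding,no-pty ssh-rsa AAAA... anoncvs2
restrict,command="git-shell" ssh-ed25519 AAAA... git
ssh-ed25519 AAAA... admin
"""
print(risky_entries(SAMPLE))  # [2]
```

This only looks at authorized_keys. It can't see whether sshd_config turns off AllowTcpForwarding globally or in a Match block, and it ignores a later port-forwarding option that re-enables forwarding after restrict, so treat its output as a list of things to look at, not a verdict.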

Why did this feature get into ssh in the first place? I'm sure it was inherited into the current OpenSSH from the original, proprietary ssh. Which turns out to have been one of those programs where the "secure" in the name has a certain amount to do with marketing. Note that the more recent PermitTunnel option defaults to off. I doubt that AllowTcpForwarding would be accepted into OpenSSH today with its current default.

Why does ssh still have this increasingly bad default? Well, partly because it has indeed become increasingly bad over the years as ssh use for special purpose things like git push has spread. (OpenBSD were fairly far ahead of the curve using ssh for anonymous CVS in the 90's.) The frog has been heating up for a while. In 2004 there was CVE-2004-1653 about this, and today books are being published documenting insecure configurations for legions of DVCS fans to copy. Boiling yet?

Another reason might be that any ssh admin you ask about this will swear that they always remember to disable it. And a lot of them do, and that makes them feel smart & superior, so why complain about it. And another reason, of course, is that changing the default would break untold amounts of weird stuff.

I've been taking a closer look at the WebOS side of my Palm Pre tonight, and I noticed that it periodically uploads information to Palm, Inc.

The first thing sent is intended to be my GPS location. It's the same location I get if I open the map app on the Pre. Not very accurate in this case, but I've seen it be accurate enough to find my house before.

{ "errorCode": 0, "timestamp": 1249855555954.000000, "latitude": 36.594108, "longitude": -82.183260, "horizAccuracy": 2523, "heading": 0, "velocity": 0, "altitude": 0, "vertAccuracy": 0 }

Here they can tell every WebOS app I use, and for how long.

{ "appid": "com.palm.app.phone", "event": "close", "timestamp": 1250006362 }
{ "appid": "com.palm.app.messaging", "event": "launch", "timestamp": 1250006422 }
{ "appid": "com.palm.app.messaging", "event": "close", "timestamp": 1250006446 }

It sends the above info on a daily basis.

2009-08-10t09:15:10z    upload  /var/context/pending/1249895710-contextfile.gz.contextlog       ok      rdx-30681971
2009-08-11t09:15:10z    upload  /var/context/pending/1249982110-contextfile.gz.contextlog       ok      rdx-31306808
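To make "and for how long" concrete: pairing each launch event with the next close event for the same app gives the session length. A minimal Python sketch, assuming one JSON object per line as in the excerpt above (the function name `usage_seconds` is mine, and I don't know how Palm actually processes these logs):

```python
import json
from collections import defaultdict


def usage_seconds(lines):
    """Sum seconds between each launch and the following close, per app."""
    open_since = {}             # appid -> timestamp of the pending launch
    totals = defaultdict(int)   # appid -> total seconds in use
    for line in lines:
        event = json.loads(line)
        app, kind, ts = event["appid"], event["event"], event["timestamp"]
        if kind == "launch":
            open_since[app] = ts
        elif kind == "close" and app in open_since:
            totals[app] += ts - open_since.pop(app)
        # A close with no recorded launch (like the phone app below) is skipped.
    return dict(totals)


log = [
    '{ "appid": "com.palm.app.phone", "event": "close", "timestamp": 1250006362 }',
    '{ "appid": "com.palm.app.messaging", "event": "launch", "timestamp": 1250006422 }',
    '{ "appid": "com.palm.app.messaging", "event": "close", "timestamp": 1250006446 }',
]
print(usage_seconds(log))  # {'com.palm.app.messaging': 24}
```

So from just these three lines, anyone holding the log knows I spent 24 seconds in the messaging app.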

There is also some info that is recorded when a WebOS app crashes. Now, I've seen WebOS crash hard a time or two, but it turns out apps are crashing fairly frequently behind the scenes, and each such crash is logged and a system state snapshot taken. At least some of these are uploaded, though if things are crashing a whole lot it will be throttled.

2009-08-09T17:01:22Z    upload  /var/log/rdxd/pending/rdxd_log_59.tgz   OK      RDX-30246857
2009-08-09T17:05:36Z    upload  /var/log/rdxd/pending/rdxd_log_26.tgz   OK      RDX-30249465
2009-08-09T17:09:11Z    upload  /var/log/rdxd/pending/rdxd_log_56.tgz   OK      RDX-30252374
2009-08-09T17:11:46Z    upload  /var/log/rdxd/pending/rdxd_log_70.tgz   OK      RDX-30253958
2009-08-09T17:16:29Z    upload  /var/log/rdxd/pending/rdxd_log_67.tgz   ERR_UPLOAD_THROTTLED_DAILY      
2009-08-09T17:17:28Z    upload  /var/log/rdxd/pending/rdxd_log_51.tgz   ERR_UPLOAD_THROTTLED_DAILY      
2009-08-09T17:20:40Z    upload  /var/log/rdxd/pending/rdxd_log_21.tgz   ERR_UPLOAD_THROTTLED_DAILY

Each tarball contains a kernel dmesg, syslog, a manifest.txt listing all installed ipkg packages (including third-party apps), a backtrace of the crash, a df (from which they can tell I'm using Debian on the phone), and ps -f output listing all processes owned by root (but not by joey).

The uploading is handled by uploadd, which reads /etc/uploadd.conf:

[SERVER=rdx]
RepositoryURL=https://<HOST>/palmcsext/prefRequest?prefkey=APPLICATIONS,RDX_SRV
UploadURL=https://<HOST>/palmcsext/RDFileReceiver

[SERVER=context]
RepositoryURL=https://<HOST>/palmcsext/prefRequest?prefkey=APPLICATIONS,RDX_SRV
UploadURL=https://<HOST>//palmcsext/RDFileReceiver

The "HOST" this is sent to via https is ps.palmws.com.

My approach to disable this, which may not stick across WebOS upgrades, was to comment out the 'exec' line in /etc/event.d/uploadd and reboot. However, then I noticed a contextupload process running. This is started by dbus, so the best way to disable it seems to be: rm /usr/bin/contextupload

BTW, since Palm has lawyers, they have a privacy policy, which covers their ass fairly well regarding all this, without going into details or making clear that the above data is being uploaded.

Update: WebOS upgrades do re-enable the spyware; this has to be repeated after each upgrade.

Previously: Debian chroot on Palm Pre, debian desktop via vnc on the palm pre

My sleepy midnight blog post on Palm Pre privacy was picked up by more than 60 news outlets. This is a postmortem of what such an experience is like from the inside.

I first learned that the media were picking up on the story when I got an email from a reporter wanting to do an interview. At the end he mentioned that he had just noticed my blog was submitted to slashdot. I spent half an hour doing an email interview, and by then the story was at the top of slashdot. So I had to switch gears to keeping my web server up.

I've been slashdotted several times before, and figured it would not be a problem since I have a static web site. And the server load was never a problem, but apache still slowed to a crawl. Turned out I had it configured to accept a very low number of simultaneous clients. When I tried to set it to a more realistic value (like 150), apache crashed. I tried switching out apache2-mpm-worker for apache2-mpm-prefork, but that had the same problem. This apache problem was solved by the new apache2-mpm-event package, which handled slashdot with ease. Server response turned from molasses to snappy.

Two days later, apache has transmitted 6 gigabytes of data to 32 thousand curious visitors.

After reaching Slashdot, the story spread to lots of technical news sites. One of the better of these stories was on The Register, who actually interviewed me -- the only reporter to do so that day.

Overnight, the story spread from tech news sites to more established media, like the Washington Post, BBC, LA Times, MSNBC, and to the UK's Telegraph, Mirror, and Guardian. (None of these news sites contacted me.)

As the story filtered from site to site, I watched random wrong things accumulate in it -- such as me being called "Joey Hess, a mobile app developer". I knew the story had hit bottom when it reached The Inquirer, which referred to me as "A bogger (sic) called Josh (sic)".

By this time, the story came complete with parts of a press release from Palm, Inc, (which of course didn't address the true issue) and was sufficiently third-hand and vague as to do nothing but scare people.

Palm's stock price dropped that day, on news that they were scaling down their production plans. I doubt that was related.

On the second day I was interviewed by Wired. That day and today, the story spread to the non-English news in Austria, Turkey, Poland, France, Russia, Slovakia, Hungary, Italy, China, Germany, Brazil, and The Netherlands.

At this point, the story seems to have died back in the news, and after all the noise, little has been accomplished. Palm insists there is no problem and that data collection can be turned off. I have not found a way to do so without hacking WebOS.

On Tuesday, I released ikiwiki 3.141592, with a unique new feature contributed by Intrigeri: Translation via PO files.

The normal way to translate a wiki is ad-hoc and generally you get one wiki in English and a different, vaguely related, or out of sync one in another language. What Intrigeri has done with ikiwiki and PO files is to tie translation right into the core of the wiki, so that the wiki engine knows what parts of each page are translated, can display untranslated text if a translation is not available (and even shows the translation percentage on the page and updates it as edits are made).

And the really awesome part is that it uses standard gettext PO files, one per page, which support things like fuzzy translations, and can be checked out and committed via revision control. (Or edited in a stupid web form if you prefer.)
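To show why PO files make a per-page translation percentage easy to compute, here is a minimal Python sketch that counts translated entries in a simplified PO file. This is not how ikiwiki's po plugin does it; the function name `translation_percent`, the sample strings, and the choice to count fuzzy entries as untranslated are all assumptions of mine, and the parser ignores plural forms and msgctxt.

```python
import re


def _unquote(s):
    """Return the text inside a double-quoted PO string, or '' if malformed."""
    s = s.strip()
    return s[1:-1] if len(s) >= 2 and s[0] == '"' and s[-1] == '"' else ""


def translation_percent(po_text):
    """Percentage of non-header entries with a non-empty, non-fuzzy msgstr."""
    total = translated = 0
    for block in re.split(r"\n\s*\n", po_text.strip()):
        fuzzy = False
        fields = {"msgid": "", "msgstr": ""}
        current = None
        for line in block.splitlines():
            line = line.strip()
            if line.startswith("#,"):
                fuzzy = fuzzy or "fuzzy" in line
            elif line.startswith("#"):
                continue
            elif line.startswith("msgid "):
                current = "msgid"
                fields[current] += _unquote(line[len("msgid "):])
            elif line.startswith("msgstr "):
                current = "msgstr"
                fields[current] += _unquote(line[len("msgstr "):])
            elif line.startswith('"') and current:
                fields[current] += _unquote(line)  # continuation line
            else:
                current = None  # keyword this sketch does not handle
        if not fields["msgid"]:
            continue  # the header entry has an empty msgid
        total += 1
        if fields["msgstr"] and not fuzzy:
            translated += 1
    return 100.0 * translated / total if total else 0.0


# Made-up sample: one translated, one fuzzy, one untranslated entry.
SAMPLE_PO = r'''
msgid ""
msgstr ""
"Content-Type: text/plain; charset=UTF-8\n"

msgid "Welcome to the wiki."
msgstr "Velkommen til wikien."

#, fuzzy
msgid "Edit this page."
msgstr "Rediger siden."

msgid "Recent changes"
msgstr ""
'''
print(round(translation_percent(SAMPLE_PO), 1))  # 33.3
```

For real PO files a proper gettext parser is the better tool; the point is only that the format carries enough per-string state for the wiki engine to track this.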

I've set up l10n.ikiwiki.info, which is both a demo of a translated ikiwiki wiki, and a platform for easily translating ikiwiki's basewiki.

I hope to get the basewiki, which includes the default front page and some basic documentation, translated into lots of languages. So far we have one translation, to Danish, done by Jonas Smedegaard. Translating the basewiki is a fairly big job; the POT files for it are 120k in size. (There's another 200k set of POT files of additional documentation if anyone is feeling ambitious.)

This paper has an interesting, almost scary fact in it. Among other things, they looked at the 2.5% of commits in the projects they studied that did not change any code, but only added comments. In those commits an average of 47 lines of comments were added. 47 lines‽ That's two full pages of comments, either added in multiple places or as one big block.

The only sensible explanation I can think of for a 47 line comment block is if you're:

Talking about some design-level type thing in detail.
Explaining a complex and horrible hack or bug.
Writing a literate or self-documenting program, such as a perl program with a big POD block. (But perl was the least commented language they studied, with only 1/3 the comments of Java.)
Adding a license block. But the typical license block is less than 47 lines.
Trying to explain why this picture appears in the middle of your blog post and makes people happy.

I suspect what's really going on mostly is none of the above, but instead adding lots of scattered little comments, or horrible comment boilerplate, or horribly excessively verbose comments.

Of course, these days I write commit messages first, user documentation second, and comment dead last. And if I want to understand why code is the way it is or even what it does, the first thing I reach for is git annotate. Looking at how revision control system use influences comment density would be an interesting followup.

PS: They also found a single commit that contained 39 lines of code and 364,438 lines of comments. I'm curious where that lurks; it must be some interesting code.

PPS: According to the paper, "the Debian distribution of Linux is mostly generated code, repeating the same patterns over and over." Grains of salt fly everywhere.

PPPS: Why did this paper leave out the median and mode? Meh.
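The reason the median and mode matter here: a handful of enormous commits can drag the average far away from what a typical commit looks like. A small Python illustration, using entirely made-up numbers (not the paper's data) chosen so that the mean comes out at 47:

```python
from statistics import mean, median, mode

# Hypothetical comment-only commits: lines of comments added in each.
# Nine small commits and one huge one.
comment_lines = [1, 2, 3, 3, 4, 5, 6, 8, 10, 428]

print(mean(comment_lines))    # 47
print(median(comment_lines))  # 4.5
print(mode(comment_lines))    # 3
```

With numbers like these, "an average of 47 lines" would describe no actual commit. Whether the paper's data looks anything like this is exactly what the median and mode would have told us.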
