In this post I will say some things about localization that I've never felt it was politic to say before. (It still isn't, but what the hey.) But I first want to mention that one of the reasons Christian Perrier is awesome is that he manages to steer projects in ways that avoid (most of) the problems below, and that he has managed to scale rather impressively far. But not far enough that these things haven't gotten under my skin.

So, I hate localization. Hate, hate, hate. I wish it would die. I wish everyone spoke the same language. I hate that we have to be politically correct about it, to the point that even if everyone does speak English (hello, Denmark?), we still have to worry about localization into their native language. I especially hate that, after we waste time and resources doing that, everyone uses the software in English anyway (hello, most of Europe).

I hate that the typical best-effort localization of software is so incomplete that it's not actually usable by people who really do need the translation. Either because "best effort" == 80% translated, which is sometimes worse than 0%, or because the documentation doesn't get translated either. And that even with what we consider a complete translation, there are still all sorts of English bits that nobody would ever consider translating. Such as the names of commands, of command-line switches, of markup keywords and configuration files.

I hate that translating software and documentation always involves bolting an enormous, nasty build system onto its side, and that the bulk of translations always greatly outweighs the size of the code. I hate that these systems constantly produce deltas that I have to avoid cluttering my commits with. I hate the expectation that every programmer somehow become an expert in internationalization, rather than letting them devote their time to something they're good at.

But none of that is what really bothers me about localization. What really, really bothers me is that I often see the need to localize code keeping that code from becoming better. Just two examples:

  • Oh no, we worry, if we fix this message, we will have to do 100 times more work to update the translations. Surely, then, anyone who wants to fix a bad message is an anglocentric who should be ignored.
  • Rather than allow the computer to dynamically generate correct and expressive English on the fly, to actually talk to us, we have to reduce everything to gettextable static strings. The poster child for this disease is that it becomes easier to tack on "(s)", or to reword entirely, than to deal with the cumulative complexity of plural forms in every language.
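
To make the "(s)" complaint concrete, here is a minimal sketch in Python using the standard library's gettext module. The function names and the message strings are made up for illustration, and no translation catalog is loaded, so this only shows the English fallback behaviour.

```python
import gettext

# No catalog is loaded here (an assumption to keep the sketch self-contained),
# so NullTranslations just applies the English rule: singular when n == 1.
translations = gettext.NullTranslations()


def describe_deleted_lazy(count: int) -> str:
    """The easy way out: one static string, with "(s)" tacked on."""
    return "Deleted %d file(s)" % count


def describe_deleted(count: int) -> str:
    """The gettext way: the singular and plural templates both become
    translatable strings, and ngettext picks one based on count."""
    template = translations.ngettext(
        "Deleted %d file", "Deleted %d files", count
    )
    return template % count


if __name__ == "__main__":
    for n in (0, 1, 2):
        print(describe_deleted_lazy(n), "|", describe_deleted(n))
```

With a real catalog, the translator has to supply one string per plural form of their language, and some languages have more than two, which is the cumulative complexity that makes "(s)" look tempting.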

(PS, I also hate that we have to keep "l10n" and "i18n" straight, and that the words are so unwieldy in this f5g modern English that we represent them based on their length.)