I got my Vaio's hibernation working again - yay! Seems it was a victim of "I wonder what this file does -- let's delete it and find out" syndrome. Luckily fixing this was easy.
On the other hand, I now have a 128 MB file in /dos that contains the full contents of memory as of the last hibernation. So I'll have to repartition fairly soon anyway to move that someplace a little less visible.
I spent most of today writing HTML::Sanitizer, a Perl module that sanitizes untrusted HTML by removing all tags except a set you specify, and all tag attributes except the sets you specify.
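For illustration only, here is a minimal sketch of the whitelist idea in Python rather than Perl. It is not HTML::Sanitizer's code or API: the names `ALLOWED`, `WhitelistSanitizer` and `sanitize` are made up for this example, and the whitelist contents are an assumption. It keeps only listed tags and listed attributes, escapes everything else as text, and drops the contents of <script> and <style> outright. It does not inspect URL schemes inside attribute values.

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical whitelist: tag name -> set of attribute names allowed on it.
ALLOWED = {
    "a": {"href", "title"},
    "b": set(),
    "i": set(),
    "p": set(),
}


class WhitelistSanitizer(HTMLParser):
    """Re-emit only whitelisted tags and attributes; escape all text."""

    def __init__(self, allowed):
        super().__init__(convert_charrefs=True)
        self.allowed = allowed
        self.out = []
        self.skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self.skip_depth += 1
            return
        if self.skip_depth or tag not in self.allowed:
            return  # unknown tag: drop the tag itself, keep its text
        kept = [(k, v) for k, v in attrs
                if k in self.allowed[tag] and v is not None]
        rendered = "".join(f' {k}="{escape(v, quote=True)}"' for k, v in kept)
        self.out.append(f"<{tag}{rendered}>")

    def handle_endtag(self, tag):
        if tag in ("script", "style"):
            if self.skip_depth:
                self.skip_depth -= 1
            return
        if not self.skip_depth and tag in self.allowed:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self.skip_depth:
            self.out.append(escape(data, quote=False))


def sanitize(markup, allowed=ALLOWED):
    parser = WhitelistSanitizer(allowed)
    parser.feed(markup)
    parser.close()  # flush any buffered text
    return "".join(parser.out)


if __name__ == "__main__":
    print(sanitize('<p onclick="x()">Hi <b>there</b><script>alert(1)</script></p>'))
```

The design choice worth noting is that the output is rebuilt from parser events rather than edited in place, so anything the parser does not recognise as an allowed tag simply never reaches the output.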
It also removes all traces of JavaScript. This is surprisingly hard -- not only can JavaScript lurk inside <script> tags, but there are HTML JavaScript entities, and even a way to embed JavaScript inside any tag attribute that is used as a URL. I regard embedding such a major source of security holes into the HTML spec in such a myriad of ways as very irresponsible -- and I wonder what other stupid HTML extensions are out there that I should make my code deal with.
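To make the URL case concrete, here is a rough Python sketch (again, not the Perl module's code) of one way to decide whether a raw attribute value is safe to keep as a URL. The function name `is_safe_url` and the `SAFE_SCHEMES` set are assumptions for this example. The idea is to allow only known schemes rather than hunt for "javascript:", since the scheme can be hidden behind character references, mixed case, or embedded whitespace.

```python
import html
import re

# Assumed whitelist of URL schemes; adjust for the site in question.
SAFE_SCHEMES = {"http", "https", "mailto", "ftp"}


def is_safe_url(value):
    """Return True if a raw attribute value uses no scheme or a whitelisted one."""
    # Decode character references so "&#106;avascript:" is seen as "javascript:".
    decoded = html.unescape(value)
    # Remove whitespace and control characters, which could otherwise be used
    # to break up a scheme name; this errs on the side of rejecting.
    cleaned = re.sub(r"[\x00-\x20\x7f]", "", decoded)
    match = re.match(r"([A-Za-z][A-Za-z0-9+.\-]*):", cleaned)
    if match is None:
        return True  # no scheme at all, so treat it as a relative URL
    return match.group(1).lower() in SAFE_SCHEMES


if __name__ == "__main__":
    for candidate in ("http://example.com/",
                      "JaVaScRiPt:alert(1)",
                      "&#106;avascript:alert(1)",
                      "java\tscript:alert(1)",
                      "/relative/path"):
        print(repr(candidate), is_safe_url(candidate))
```

This is deliberately conservative: a value with an unfamiliar scheme is rejected even if it is harmless, which seems the right default for untrusted input.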
I wrote this because I needed it for a web site I am designing and I could find no equivalent library for Perl. Of course numerous sites like advogato and slashdot address these problems with, ahem, varying degrees of success, but doing it in an independent library seems like a better solution. If anyone knows of any other code that deals with this problem, preferably general-purpose code in a library, I'd really like to examine it.