  • PHP has an HTML Tidy module. I have been playing around with it because of some recent frustration with comments throwing off validation. A quick glance at the Tidy QuickRef shows a wealth of configuration options. Basically, Tidy will take invalid tag soup and give you valid XHTML (if you ask nicely, of course). Implementing this in PHP is not rocket science. The biggest problem is that the module is not compiled in by default.
    The only validation issue that remains is that Tidy will not strip elements and attributes according to the doctype. That kind of filtering would be fairly trivial to implement in XSLT.
    The big question is whether there is any interest in this type of thing. I don’t know how WordPress currently handles tag stripping and validation, as I do not have it installed myself. Let me know if you would be interested in this type of plug-in.
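
To make the idea of running comment markup through Tidy concrete, here is a minimal sketch of what it might look like in PHP. The function name `clean_comment()` is hypothetical and chosen only for illustration; it is not part of Tidy or WordPress. The sketch assumes the tidy extension is available and returns `null` when it is not, since the module is not compiled in by default.

```php
<?php
// Hypothetical helper: clean_comment() is an illustrative name,
// not an existing Tidy or WordPress function.
function clean_comment(string $html): ?string
{
    // The tidy extension is not compiled in by default, so check first.
    if (!extension_loaded('tidy')) {
        return null;
    }

    $config = [
        'output-xhtml'   => true, // ask Tidy for XHTML output
        'show-body-only' => true, // return just the fragment, not a full document
        'wrap'           => 0,    // disable line wrapping
    ];

    // tidy_repair_string() parses the markup and returns the repaired string.
    $clean = tidy_repair_string($html, $config, 'utf8');

    return $clean === false ? null : trim($clean);
}

// Example: tag soup with unclosed <b> and <p> elements.
$result = clean_comment('<p>Hello <b>world');
if ($result === null) {
    echo "tidy extension not available\n";
} else {
    echo $result, "\n";
}
```

This only repairs the markup. It does not remove elements that the target doctype disallows, so that step would still need separate handling.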