Forum Replies Created

Viewing 7 replies - 1 through 7 (of 7 total)
  • Forum: Plugins
    In reply to: Trouble with HTML Purified

    Hi, if you’re able to convert the right double quotation marks (”) before the text reaches HTML Purifier, your problem should be fixed.
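
    As a rough sketch of that ordering (normalise first, filter second), something like the following could work. It is written in Python purely for illustration, and it assumes the goal is plain ASCII quotes, which the reply above does not actually specify. `normalise_double_quotes`, `clean`, and `html_filter` are hypothetical names; `html_filter` stands in for whatever call hands the text to the real filter.

```python
# Assumption: curly double quotes should become plain ASCII quotes.
# The target of the conversion is a guess; adjust it to your own case.
CURLY_TO_STRAIGHT = str.maketrans({"\u201c": '"', "\u201d": '"'})


def normalise_double_quotes(text):
    """Replace left/right curly double quotes with straight ones."""
    return text.translate(CURLY_TO_STRAIGHT)


def clean(text, html_filter):
    """Normalise quotes BEFORE filtering, so the filter sees real quotes.

    html_filter is a hypothetical callable standing in for the real filter.
    """
    return html_filter(normalise_double_quotes(text))
```

    The point is only the order of operations: whatever produces the curly quotes has to be undone (or run later) so the filter parses attribute values correctly.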

    Forum: Plugins
    In reply to: Clean HTML from MS Word.

    Sorry about that, I’ll be sure to make it clear in the future.

    Forum: Plugins
    In reply to: Clean HTML from MS Word.

    Yes. 🙂 I’m quite proud of the library.

    Don’t get me wrong: it’s really frustrating seeing people constantly botching HTML filtering. It’s a *hard* problem to solve. You can read this comparison for more info.

    Forum: Plugins
    In reply to: Clean HTML from MS Word.

    You’re dead on. Yes: JavaScript is here to stay, and you’d be a fool not to use it. Yes: JavaScript is a fully featured programming language: Mozilla Firefox is practically built on JavaScript. Yes: the average user does not turn off scripting.

    But there’s one job that no amount of client-side scripting can take over: filtering incoming data. “Client-side is a bad idea for any serious coding” is not the same claim as “Do not trust data that comes from the client.” The former is false, the latter true. Because JavaScript can be turned off, *any* security check performed in the browser (for example, removing undesirable tags and attributes) can easily be circumvented.

    Let’s give an example. First the normal use case:

    1. Bob wants to post an MS Word document. He copies and pastes it into a text editor.
    2. JavaScript (client side) transparently cleans up the formatting for him.
    3. Bob presses submit; the content is sent to the server, which DOESN’T do any other checking and puts it on the result page.

    How to abuse:

    1. Mallory surfs to the web page and turns off JavaScript. She fills in the web form with malicious, raw HTML.
    2. The data gets sent to the server. Since the server doesn’t do any checking, XSS and other meanies get onto the HTML page.
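
    Mallory doesn’t even need a browser. The sketch below (Python, standard library only) builds the same kind of POST a form would send, with no JavaScript involved at any point. The URL and the `comment` field name are made up for illustration, and the request is only constructed, never sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical endpoint and field name, used only for illustration.
FORM_URL = "https://blog.example/submit-comment"


def build_raw_submission(body_html):
    """Build the POST a form would send, without any browser or JavaScript."""
    payload = urlencode({"comment": body_html}).encode("utf-8")
    return Request(FORM_URL, data=payload, method="POST")


# The raw markup goes out exactly as typed; no client-side cleanup ever ran.
req = build_raw_submission('<img src=x onerror="alert(1)">')
print(req.get_method(), req.full_url)
print(req.data)
```

    From the server’s point of view this request is indistinguishable from one sent by Bob’s browser, which is why the check has to happen on the server.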

    JavaScript is great for thwarting good-faith incompetency/blundering, but against a determined attacker it is no good. You must implement server-side filtering with something like HTML Purifier.
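
    To make “server-side filtering” concrete, here is a minimal allowlist sketch in Python using only the standard library. It is NOT how HTML Purifier works internally and is no substitute for it: it doesn’t balance unclosed tags, validate CSS, or handle the many edge cases a real filter must. The tag list, attribute list, and URL prefixes are hypothetical choices for illustration.

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical policy for illustration; a real policy needs far more care.
ALLOWED_TAGS = {"p", "b", "i", "em", "strong", "ul", "ol", "li", "a"}
ALLOWED_ATTRS = {"a": {"href"}}
SAFE_URL_PREFIXES = ("http://", "https://", "mailto:")


class AllowlistSanitizer(HTMLParser):
    """Re-emit only allowlisted tags/attributes; escape all text."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self.skip_depth += 1  # drop the element and its contents
            return
        if self.skip_depth or tag not in ALLOWED_TAGS:
            return  # unknown tag: drop the tag itself, keep its text
        kept = []
        for name, value in attrs:
            if value is None or name not in ALLOWED_ATTRS.get(tag, set()):
                continue
            if name == "href" and not value.strip().lower().startswith(SAFE_URL_PREFIXES):
                continue  # rejects javascript: and other unexpected schemes
            kept.append(f' {name}="{escape(value, quote=True)}"')
        self.out.append(f"<{tag}{''.join(kept)}>")

    def handle_endtag(self, tag):
        if tag in ("script", "style"):
            self.skip_depth = max(0, self.skip_depth - 1)
        elif not self.skip_depth and tag in ALLOWED_TAGS:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self.skip_depth:
            self.out.append(escape(data))


def sanitize(html_text):
    """Run on the server, on every submission, whatever the client did."""
    parser = AllowlistSanitizer()
    parser.feed(html_text)
    parser.close()
    return "".join(parser.out)
```

    With this policy, `sanitize('<p class="MsoNormal" onclick="evil()">Hi <script>alert(1)</script><b>there</b></p>')` returns `<p>Hi <b>there</b></p>`: the Word class, the event handler, and the script are all gone, whether or not JavaScript ran in the browser.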

    P.S. Theoretically speaking, websites should degrade gracefully: when JavaScript is turned off, they should still function, albeit without any of the client-side flashiness/polish. Alas, this is not true of many websites, though most still degrade gracefully. Personally, I use NoScript to block scripting on all sites I visit, and then enable scripting on a case-by-case basis.

    Forum: Plugins
    In reply to: Clean HTML from MS Word.

    Client-side filtering is a bad idea for anything serious and should not be trusted. If you want to get rid of the MsoNormals Microsoft Word is so fond of, be my guest, but realize that anything done in JavaScript can (easily) be circumvented.

    Forum: Plugins
    In reply to: Clean HTML from MS Word.

    Perhaps you should look for the solution in a different place: instead of a standards-compliant WYSIWYG editor, try a standards-compliant HTML filter to integrate into WordPress directly. Not sure how well that filter will deal with Microsoft’s proprietary tags though. And it doesn’t have a WP plugin yet, although the API is so simple that I think writing one would be trivial.

    As for XStandard, this seems to be an application in and of itself (not JavaScript), so it would require users to install something. Also, since it’s client-side, there’s no guarantee that the input coming to you will be compliant. You really ought to look for something server-side. If the server can transparently clean up the code, it doesn’t matter how bad or good the WYSIWYG editor is as long as it doesn’t drop any tags.

    Sorry if this is resurrecting a dead topic.
