Strip out Microsoft Word’s extraneous HTML tags

  • I’m using WP 2.7 RC3 for a client’s site. He’s copying and pasting from Microsoft Word into WP to generate his posts. Unfortunately, WP is picking up the extraneous HTML generated by Word.

    Is there a way to catch this godawful code?

Viewing 15 replies - 1 through 15 (of 15 total)
  • There is a “Paste from Word” button in the visual editor.

    Try using Windows Live Writer, or publish directly from Word 2007+ using the Atom Publishing Protocol. It does that sort of thing correctly.

    You know, I’ve been curious forever as to whether or not the “Paste from Word” button works. I’m pleased to say that it does. I actually tested it, and the opposite, a straight paste, on 2.8 bleeding edge.

    They behaved exactly as expected. One was nice, one wasn’t.

    Thanks for the tip about Paste from Word!


    (@offordscott)

    I just tried the “Paste from Word” and “Paste as Plain Text” buttons in WP 2.7, and they don’t work. A blank popup layer appears, then expands, and then... nothing.

    In Safari, it also downloads an HTML file when I click the button.

    What’s up with that?

    Scott

    Is there a plugin that does this?

    -Brad

    I’m not sure if this will strip all MS Word code out, but I’ve been using this custom function successfully on a project.

    Pass $post->post_content to this function:

    function strip_msword_tags($content){
    
    	// Strip <span ...> and </span> tags, including spans that carry attributes
    	$content = preg_replace('/<\/?span\b[^>]*>/i', '', $content);
    
    	// Turn <div><br /></div> into </p><p>
    	$content = preg_replace('/<div\b[^>]*>\s*<br\s*\/?>\s*<\/div>/i', '</p><p>', $content);
    
    	// Strip the remaining <div ...> and </div> tags
    	$content = preg_replace('/<\/?div\b[^>]*>/i', '', $content);
    
    	// Apply WP filters
    	$content = apply_filters('the_content', $content);
    
    	return $content;
    }

    Hope that helps someone.
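
    The function above only deals with span and div tags. Pasted Word content often also carries conditional comments, Office namespace tags such as <o:p>, Mso* class names, and mso- inline styles. Here is a minimal sketch that targets those as well. The function name strip_word_residue is made up for illustration, the patterns are assumptions about typical Word output rather than a complete list, and regex-based cleanup like this can miss cases or touch markup you wanted to keep.

```php
<?php
// Hypothetical helper, not part of WordPress: removes a few common kinds of
// Word residue from an HTML string. Regex-based, so treat it as a sketch.
function strip_word_residue($html) {
    // Remove conditional comments such as <!--[if gte mso 9]>...<![endif]-->
    $html = preg_replace('/<!--\[if[^\]]*\]>.*?<!\[endif\]-->/is', '', $html);

    // Remove Office namespace tags such as <o:p> and </o:p>, keeping any text inside
    $html = preg_replace('/<\/?[ovw]:[a-z0-9]+\b[^>]*>/i', '', $html);

    // Remove class attributes whose value starts with "Mso" (quoted or unquoted)
    $html = preg_replace('/\sclass=(?:"Mso[^"]*"|\'Mso[^\']*\'|Mso[^\s>]*)/i', '', $html);

    // Remove whole inline style attributes that contain any mso- property
    $html = preg_replace('/\sstyle=(?:"[^"]*mso-[^"]*"|\'[^\']*mso-[^\']*\')/i', '', $html);

    return $html;
}
```

    It takes and returns a plain string, so it could be run on $post->post_content before the span and div cleanup.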

    A plugin would be awesome!

    In the meantime, interesting idea darinreid. Where would one install that code? (which template or core file?)
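
    One possible answer, offered as a sketch rather than anything confirmed in this thread: code like this usually goes in the active theme's functions.php or in a small plugin file under wp-content/plugins, not in a core file. The example below assumes a hypothetical callback named swt_strip_word_tags and hooks it to the the_content filter, so the stored post is left alone and the tags are stripped at display time. Unlike the function posted above, it does not call apply_filters('the_content') itself, since doing that inside a the_content callback would loop.

```php
<?php
/*
Plugin Name: Strip Word Tags (sketch)
Description: Illustrative only. Strips span and div tags from post content on display.
*/

// Hypothetical callback name; any unique function name would do.
function swt_strip_word_tags($content) {
    // Remove opening and closing span tags, with or without attributes
    $content = preg_replace('/<\/?span\b[^>]*>/i', '', $content);

    // Remove opening and closing div tags, with or without attributes
    $content = preg_replace('/<\/?div\b[^>]*>/i', '', $content);

    return $content;
}

// Register the filter only when WordPress is loaded. Priority 1 is an
// assumption, chosen so this runs before the default formatting filters.
if (function_exists('add_filter')) {
    add_filter('the_content', 'swt_strip_word_tags', 1);
}
```

    Note that this strips every span and div in a post, including ones added on purpose, so it only suits sites where posts never need those tags.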

    Believe it or not, any time you copy and paste from one website to another you can pick up the other service’s tags along with the text. The only way to get rid of them is to delete them yourself. If you can avoid copying and pasting, it is best to do so.

    Please correct me if I am wrong, but I think you can add your plugins on your main website. Once you have your site up and going, you can add plugins, comments, RSS and anything else. Some website setups have extra items you can include for free if you wish. A word to the wise.

    Any time you are writing in a Word document, the paste button does work. For instance, if I am typing and I make a mistake or need to add another quote, sentence or word, I can highlight the text, click Copy, put the cursor where I want the text to go, and then either click the Paste button in the drop-down list or right-click with the mouse to insert it.

    You could also cut and paste, but you cannot paste on its own: without a copy or cut first, it will not work.

    Let me know if this helps.

    You might want to retry, because sometimes the paste button may not have caught all the words or letters you wanted to paste, and that can create a problem in itself.

    I just used the built-in Paste from Word button and it worked like a charm!

  • The topic ‘Strip out Microsoft Word’s extraneous HTML tags’ is closed to new replies.