WP is invalidating my XHTML Strict valid post (15 posts)

  1. rook
    Posted 11 years ago #

    Wow, this forum cannot handle my attempts to replicate HTML code, in particular HTML comment sequences (lt bang dash dash). Here's the post that I wanted to put on this forum but couldn't figure out how to:
    I really hope you've hung on to this point.

  2. rook
    Posted 11 years ago #

    FYI: On the web page in my previous reply ( http://ziphstric.com/wordpress-mangling-valid-xhtml-post.html ), I've added a link to the original post data (cut-and-pasted from the WP edit text field) which WP is mangling: wordpress-mangling-valid-xhtml-post-data.txt

  3. Anonymous
    Posted 11 years ago #

    And can you have blockquote and cite together ? (says he grappling with 115 validation errors on a new site) ;)
    Tip for the community: If you want to validate
    1. Keep off Javascript unless you know the proper syntax
    2. Do not let googleads anywhere near your blog
    3. Be careful if you have a long blogroll - it is dirty work hand cleaning
    4. Do not copy and paste posts.
    Just my two cents.

  4. rook
    Posted 11 years ago #

    Anonymous asked:
    And can you have blockquote and cite together ?
    Yes. Cite is an attribute on blockquote in XHTML 1.0 Strict.
    Any Javascript I have came with WP. I've made very little customization.
    Not sure what #4 is about. I attempted to post relevant clips from a post that demonstrated the problem. I copy-and-pasted and then tried to get suitable markup for these forums, which I wasn't able to do. Copy-and-paste is very valuable in revealing all details of a problem, including subtle ones that otherwise might not be noticed. For example, take Lane's suggestion that I put two things on the same line that were on separate lines -- had I abstracted the problem, that detail might have disappeared, because that CR should be irrelevant.

  5. Root
    Posted 11 years ago #

    Those tips were not specifically implying they affect you but they are just a quick summary of what I am going thru at the moment for the benefit of other readers.
    I am sharing your pain - believe me. Good luck. ;)

  6. rook
    Posted 11 years ago #

    Thanks to all for suggestions thus far. I had some coding work to do to make a good controlled experiment here. I wrote a script that verifies that all 58 of my posts are valid XHTML 1.0 Strict. So I'm starting clean here.
    I then checked the validity of all 58 posts by having the w3.org validator check each individual post archive page. Those pages are checked against XHTML 1.0 Transitional. Six failed.
    So I commented out these lines in wp-includes/template-functions-post.php :
    //add_filter('the_content', 'convert_smilies');
    //add_filter('the_content', 'convert_chars');
    //add_filter('the_content', 'wpautop');
    //add_filter('the_excerpt', 'convert_smilies');
    //add_filter('the_excerpt', 'convert_chars');
    //add_filter('the_excerpt', 'wpautop');

    (commenting out the_excerpt filters probably wasn't necessary)
    Only one of the six is now valid. The other five are still invalid because of WP processing.
    All five are good test cases because they have no comments. So they contain only what WordPress produces plus my valid posts.
    I then commented out two more statements in the same file (shown commented out below):
    function the_content($more_link_text = '(more...)', $stripteaser = 0, $more_file = '') {
        $content = get_the_content($more_link_text, $stripteaser, $more_file);
        //$content = apply_filters('the_content', $content);
        //$content = str_replace(']]>', ']]&gt;', $content);
        echo $content;
    }

    Now all my archive pages are valid. So there appear to be more filters affecting the_content than just those add_filter calls at the top of template-functions-post.php.
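
    One way to picture why commenting out the add_filter lines in a single file might not be enough: a filter registry is shared, so any other file can attach callbacks to the same tag. The sketch below is a toy registry, not WordPress code. SketchFilters and its methods are hypothetical names, and strtoupper and nl2br merely stand in for real content filters.

    ```php
    <?php
    // Toy filter registry (hypothetical; NOT the WordPress implementation).
    final class SketchFilters
    {
        /** @var array<string, callable[]> callbacks registered per tag */
        private static array $filters = [];

        public static function add(string $tag, callable $fn): void
        {
            self::$filters[$tag][] = $fn;
        }

        public static function apply(string $tag, string $value): string
        {
            // Run every callback registered for this tag, in registration order.
            foreach (self::$filters[$tag] ?? [] as $fn) {
                $value = $fn($value);
            }
            return $value;
        }
    }

    // "File A": the registration that was commented out by hand.
    // SketchFilters::add('the_content', 'nl2br');

    // "File B": another registration on the same tag, untouched by the edit.
    SketchFilters::add('the_content', 'strtoupper');

    // The tag still has one active callback, so the content is still altered.
    $out = SketchFilters::apply('the_content', "hello\nworld");
    echo $out, "\n";
    ```

    In a real install, WordPress's own remove_filter() is the hook-level counterpart to commenting out an add_filter() line, and it avoids editing core files.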

  7. rook
    Posted 11 years ago #

    I suppose it's worth noting that if I leave my WP as modified, there is no error. Replicating should be easy, though: just cut-and-paste the story_content into a default WP setup.

  8. TechGnome
    Posted 11 years ago #

    Hrmmm....... okaaaay. I tried to view the links you gave and validate them and they all came out XHTML Transitional valid. So, clicked around and found the advanced page and retried it as strict.... I get "This Page Tentatively Validates As XHTML 1.0 Strict (Tentatively Valid)!" ..... have yea figured it out or ????
    In fact, plugging in each one came out valid......

  9. Anonymous
    Posted 11 years ago #

    TechGnome: did you miss my followup comment: "if I leave my WP as modified, there is no error"? They are all now valid because I have commented out lines in function the_content in wp-includes/template-functions-post.php
    I didn't notice Tentatively Valid. I'll have to look for that. I wrote a script that queries with a request for XML results and then flags entries that have any <messages>.
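
    A rough sketch of the flagging step described above, assuming each validator response has already been fetched as an XML string. The function name and the sample documents are made up for illustration, as is every element name other than <messages>. The real validator output may be shaped differently (namespaced elements in particular would need a different XPath expression).

    ```php
    <?php
    // Hypothetical helper: true when a validator result should be flagged.
    function result_has_messages(string $xml): bool
    {
        libxml_use_internal_errors(true);   // collect parse errors quietly
        $doc = simplexml_load_string($xml);
        if ($doc === false) {
            return true;                    // unparseable response: flag it for a manual look
        }
        $hits = $doc->xpath('//messages');  // any <messages> element, at any depth
        return is_array($hits) && count($hits) > 0;
    }

    // Made-up sample responses, not real validator output.
    $clean   = '<result><uri>http://example.com/a</uri></result>';
    $flagged = '<result><uri>http://example.com/b</uri>'
             . '<messages><msg>some error</msg></messages></result>';

    var_dump(result_has_messages($clean));
    var_dump(result_has_messages($flagged));
    ```

    A check like this only looks for the presence of <messages>, so a "Tentatively Valid" verdict would pass unnoticed unless the script also inspects whatever field carries the verdict text.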

  10. rook
    Posted 11 years ago #

    Oops, forgot to log in for that previous post. I wrote it (as is probably clear).

  11. TechGnome
    Posted 11 years ago #

    Rook - yeah, my apologies, I did miss it.... I *really* need to get those darned glasses fixed!
    I think the reason it came out tentatively valid is because the strict isn't a full standard, is it? As far as I know it's still a work in progress and hasn't been set in stone yet. Or is it because of the doctype encoding override?

  12. rook
    Posted 11 years ago #

    I've put the six posts that demonstrate problems into a new WP installation without any modifications that aren't done simply via the admin panels:
    Not only are the pages generated by WP invalid, but some are corrupted such that the post content gets cut off and is not displayed.
    The links in the sidebar include links to pages with the uncorrupted post content for each of the six posts in question.

  13. Beel
    Posted 11 years ago #

    We're back to that script comment within a script comment which is not valid.

  14. rook
    Posted 11 years ago #

    Beel, I'm unclear on what you are saying. Are you saying that I have a comment within a comment in my post data? (What do you mean by 'script comment'? You say we are back to it, but I don't see 'script comment' mentioned earlier in this forum thread. We did have the trouble in this thread with trying to capture XHTML snippets, but I don't see that now. So I'm just flummoxed trying to understand what you are saying.)
    I agree that nested comments in my post data would be invalid. I don't see it. And w3.org's and WDG's validators are apparently both missing it as well if that's the case. I'm thus disinclined to believe that I'm posting invalid data.
    After WP processes my valid input, then all manner of things are wrong and invalid. That's what I'm pointing out.
    I do think there should be a mode where one can post valid data and not have WP make it invalid.

  15. rook
    Posted 11 years ago #

    I agree it happens. The question is where, and whether it should. To my eye, the problem is that WP processing is creating the comment-within-a-comment condition on that entry. My post data is valid. But WP processing is creating an invalid page for display.
    The "Coordinates" post is a different example, where I have a CR in the middle of an img tag, between attributes (nothing wrong with that with respect to valid XML). WP processing inserts a br tag in the middle of the img tag, which definitely breaks the link. I won't try to reproduce the code here in the forum, but that entry is short and you can see the problem readily if you look at the source of the following. The error is on just the second line of the story content.
    What I gave WP to work with:
    What WP produces:
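
    The linked before/after pages are not reproduced in this thread. As a stand-in illustration only, the sketch below shows how a naive newline-to-br conversion breaks a tag that has a line break between attributes, and how skipping tags avoids it. It uses PHP's built-in nl2br(), not WP's own formatting code; the img markup, the file name, and the nl2br_outside_tags helper are all hypothetical.

    ```php
    <?php
    // Hypothetical input: an img tag with a newline between two attributes.
    $post = "<img src=\"map.png\"\n     alt=\"Coordinates\" />";

    // Naive conversion: nl2br() puts <br /> before every newline,
    // including the one inside the tag, which corrupts the markup.
    $naive = nl2br($post);

    // Tag-aware conversion: split into tags and text, convert text only.
    function nl2br_outside_tags(string $html): string
    {
        // With PREG_SPLIT_DELIM_CAPTURE, even indexes hold text and
        // odd indexes hold the captured tags.
        $parts = preg_split('/(<[^>]*>)/', $html, -1, PREG_SPLIT_DELIM_CAPTURE);
        foreach ($parts as $i => $part) {
            if ($i % 2 === 0) {
                $parts[$i] = nl2br($part);
            }
        }
        return implode('', $parts);
    }

    $safe = nl2br_outside_tags($post);

    echo $naive, "\n\n", $safe, "\n";
    ```

    The tag pattern here is deliberately simple: it would misjudge a ">" inside an attribute value or inside an HTML comment, so it is only a starting point, not a full fix.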

