WP is invalidating my XHTML Strict valid post

  • rook

    (@rook)


    Hello, all went well with my migration from MT until I checked my site for validity.
    My post was originally validated as XHTML 1.0 Strict. However, the content returned by
    http://ziphstric.com/blog/archives/2004/04/13/where-is-it-enumerated/
    is not valid XHTML 1.0 Transitional.
    I clicked to “Edit” my post in my WP blog and cut-and-pasted the content into a document and was able to confirm that the post content is valid XHTML 1.0 Strict.
    Since my attempts to capture the problem here were so troublesome (I couldn’t encode them on this forum), I put the details up on my website:
    http://ziphstric.com/wordpress-mangling-valid-xhtml-post.html
    On a more general note: Can I do something to prevent WP from doing anything to the post? It seems to take real carriage returns and create line breaks and new paragraphs. WP also makes a bunch of entity substitutions, some of which are incorrect.
    Update: I originally had multiple posts here and managed to mess up the forum with my encoding attempts. I see that something has been changed since I originally posted and it made some posts by me and others disappear.
    Lane suggested that I put the

    after the (lt bang dash dash) on the same line.
    That might be a workaround for this situation, but it does not solve the more general problem. I’ll have to give it a try though.
    Another update: now that I updated my posts below to delete the problematic forum encoding, the posts that appeared to have disappeared have re-appeared!

Viewing 15 replies - 1 through 15 (of 15 total)
  • rook

    (@rook)

    Wow, this forum cannot handle my attempts to replicate HTML code, in particular HTML comment sequences (lt bang dash dash). Here’s the post that I wanted to put on this forum but couldn’t figure out how to:
    http://ziphstric.com/wordpress-mangling-valid-xhtml-post.html
    I really hope you’ve hung on to this point.

    rook

    (@rook)

    FYI: On the web page in my previous reply ( http://ziphstric.com/wordpress-mangling-valid-xhtml-post.html ), I’ve added a link to the original post data (cut-and-pasted from the WP edit text field) which WP is mangling: wordpress-mangling-valid-xhtml-post-data.txt

    Anonymous

    And can you have blockquote and cite together? (says he, grappling with 115 validation errors on a new site) 😉
    Tip for the community: if you want to validate,
    1. Keep off Javascript unless you know the proper syntax.
    2. Do not let googleads anywhere near your blog.
    3. Be careful if you have a long blogroll; cleaning it by hand is dirty work.
    4. Do not copy and paste posts.
    Just my two cents.

    rook

    (@rook)

    Anonymous asked:
    And can you have blockquote and cite together ?
    Yes. Cite is an attribute on blockquote in XHTML 1.0 Strict.
    Any Javascript I have came with WP. I’ve made very little customization.
    Not sure what #4 is about. I attempted to put relevant clips from a post that demonstrated the problem. I copy-and-pasted and then tried to get suitable markup for these forums, which I wasn’t able to do. Copy-and-paste is very valuable in revealing all details of a problem, including subtle ones that otherwise might not be noticed. Take Lane’s suggestion that I put two things on the same line that were on separate lines: had I abstracted the problem, that detail might have disappeared, because that CR should be irrelevant.

    Root

    (@root)

    Those tips were not meant to imply that they apply to you specifically; they are just a quick summary of what I am going through at the moment, for the benefit of other readers.
    I am sharing your pain – believe me. Good luck. 😉

    rook

    (@rook)

    Thanks to all for suggestions thus far. I had some coding work to do to make a good controlled experiment here. I wrote a script that verifies that all 58 of my posts are valid XHTML 1.0 Strict. So I’m starting clean here.
    I then checked the validity of all 58 posts by having the w3.org validator check each individual post archive page. At that point it is checking against XHTML 1.0 Transitional. Six failed.
    So I commented out these lines in wp-includes/template-functions-post.php :
    //add_filter('the_content', 'convert_smilies');
    //add_filter('the_content', 'convert_chars');
    //add_filter('the_content', 'wpautop');
    //
    //add_filter('the_excerpt', 'convert_smilies');
    //add_filter('the_excerpt', 'convert_chars');
    //add_filter('the_excerpt', 'wpautop');

    (commenting out the_excerpt filters probably wasn’t necessary)
    Only one of the six is now valid. The other five are still invalid because of WP processing.
    All five are good test cases because they have no comments, so the pages contain only what WordPress produces plus my valid posts.
    http://ziphstric.com/blog/archives/2003/09/13/shiny-happy-people/
    http://ziphstric.com/blog/archives/2003/09/16/forecasting-rain-on-the-parade/
    http://ziphstric.com/blog/archives/2004/02/04/prefixsuffix-disambiguation-for-programmers/
    http://ziphstric.com/blog/archives/2004/02/14/defining-heuristic/
    http://ziphstric.com/blog/archives/2004/04/13/where-is-it-enumerated/
    I then commented out two more statements in the same file (shown commented out below):
    function the_content($more_link_text = '(more...)', $stripteaser = 0, $more_file = '') {
    $content = get_the_content($more_link_text, $stripteaser, $more_file);
    //$content = apply_filters('the_content', $content);
    //$content = str_replace(']]>', ']]&gt;', $content);
    echo $content;
    }

    Now all my archive pages are valid. So there appear to be more filters affecting the_content than the add_filter calls at the top of template-functions-post.php.

    rook

    (@rook)

    I suppose it’s worth noting that if I leave my WP as modified, there is no error. Replicating the problem should be easy, though: cut-and-paste the story_content into a default WP setup.

    TechGnome

    (@techgnome)

    Hrmmm……. okaaaay. I tried to view the links you gave and validate them and they all came out XHTML Transitional valid. So, clicked around and found the advanced page and retried it as strict…. I get “This Page Tentatively Validates As XHTML 1.0 Strict (Tentatively Valid)!” ….. have yea figured it out or ????
    In fact, pluggin in each one came out valid……
    TG

    Anonymous

    TechGnome: did you miss my followup comment: “if I leave my WP as modified, there is no error”? They are all now valid because I have commented out lines in function the_content in wp-includes/template-functions-post.php
    I didn’t notice Tentatively Valid. I’ll have to look for that. I wrote a script that queries with a request for XML results and then flags entries that have any <messages>.
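    A script along those lines might look like the following minimal sketch. It is not the script described above: the result layout (a messages element holding one child per reported problem) and the function names are assumptions made for illustration, and the sketch treats an empty messages element as a clean result.

```python
import xml.etree.ElementTree as ET


def has_messages(result_xml):
    """Return True if the result XML contains a non-empty <messages> element.

    Assumption: problems are reported as children (or text) of <messages>;
    an empty <messages/> is treated as "nothing to report".
    """
    root = ET.fromstring(result_xml)
    for node in root.iter("messages"):
        if len(node) > 0 or (node.text or "").strip():
            return True
    return False


def flag_failures(results):
    """Given {page_url: result_xml}, return the sorted URLs that have messages."""
    return sorted(url for url, xml_text in results.items() if has_messages(xml_text))


# Hypothetical validator responses, keyed by a made-up page name.
SAMPLE = {
    "post-a": "<result><messages/></result>",
    "post-b": "<result><messages><msg>end tag omitted</msg></messages></result>",
}
print(flag_failures(SAMPLE))  # prints ['post-b']
```

    A check like this says nothing about a “Tentatively Valid” status, which would need to be read from wherever the validator reports it.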

    rook

    (@rook)

    Oops, forgot to login for that previous post. I wrote it (as is probably clear).

    TechGnome

    (@techgnome)

    Rook – yeah, my apologies, I did miss it…. I *really* need to get those darned glasses fixed!
    I think the reason it came out tentatively valid is because the strict isn’t a full standard, is it? As far as I know it’s still a work in progress and hasn’t been set in stone yet. Or is it because of the doctype encoding override?
    TG

    rook

    (@rook)

    I’ve put the six posts that demonstrate problems into a new WP installation with no modifications other than those made via the admin panels:
    http://ziphstric.com/wordpress/support-3-8153/index.php
    Not only are the pages generated by WP invalid, but some are corrupted such that the post content gets cut off and is not displayed.
    The links in the sidebar include links to pages with the uncorrupted post content for each of the six posts in question.

    Beel

    (@beel)

    We’re back to that script comment within a script comment which is not valid.

    rook

    (@rook)

    Beel, I’m unclear on what you are saying. Are you saying that I have a comment within a comment in my post data? (What do you mean by ‘script comment’? You say we are back to it, but I don’t see ‘script comment’ mentioned earlier in this forum thread. We did have the trouble in this thread with trying to capture XHTML snippets, but I don’t see that now. So I’m just flummoxed trying to understand what you are saying.)
    I agree that nested comments in my post data would be invalid. I don’t see it. And w3.org’s and WDG’s validators are apparently both missing it as well if that’s the case. I’m thus disinclined to believe that I’m posting invalid data.
    After WP processes my valid input, then all manner of things are wrong and invalid. That’s what I’m pointing out.
    I do think there should be a mode where one can post valid data and not have WP make it invalid.
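    One way to settle which side introduces a comment-within-a-comment is to run the same check over the raw post data and over the page WP generates. The sketch below is a rough aid, not a validator: it relies on the XML rule that "--" may not appear inside a comment, it ignores CDATA sections, and the function name is made up for illustration.

```python
import re

# Non-greedy match from a comment opener to the first closer that follows it.
COMMENT_RE = re.compile(r"<!--(.*?)-->", re.DOTALL)


def bad_comments(markup):
    """Return the comments whose body breaks the XML comment rules.

    A comment body may not contain "--" and may not end with "-".
    A nested opener such as "<!-- a <!-- b -->" is caught because the
    inner "<!--" itself contains "--".
    """
    bad = []
    for match in COMMENT_RE.finditer(markup):
        body = match.group(1)
        if "--" in body or body.endswith("-"):
            bad.append(match.group(0))
    return bad


print(bad_comments("<p>ok</p><!-- fine -->"))              # prints []
print(bad_comments("<!-- outer <!-- inner --> tail -->"))  # prints ['<!-- outer <!-- inner -->']
```

    If the raw post data comes back clean and the generated page does not, the nesting is being created during processing rather than in the post.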

    rook

    (@rook)

    I agree it happens. The question is where, and whether it should. To my eye, the problem is that WP processing is creating the comment-within-a-comment condition on that entry. My post data is valid. But WP processing is creating an invalid page for display.
    The “Coordinates” post is a different example where I have the CR in the middle of an img tag, between attributes (nothing wrong with that with respect to valid XML). WP processing inserts a br tag in the middle of the tag, which definitely breaks the link. I won’t try to reproduce the code here in the forum, but that entry is short and you can see the problem readily if you look at the source of the following; the error is on just the second line of the story content.
    What I gave WP to work with:
    http://ziphstric.com/wordpress/support-3-8153/uncorrupted/coordinates-in-an-infinite-universe-sing-it.html
    What WP produces:
    http://ziphstric.com/wordpress/support-3-8153/archives/2004/06/29/coordinates-in-an-infinite-universe-sing-it/
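    The failure mode described for the “Coordinates” post can be reproduced in miniature. The sketch below is not WP’s actual code: naive_breaks and tag_aware_breaks are hypothetical names, and the sample markup is invented. It only shows how a newline-to-br pass that ignores tag boundaries lands a br inside an img tag, and how skipping text inside tags avoids that.

```python
import re

# Anything from "<" to the next ">" is treated as one tag. This simple pattern
# does not handle ">" inside attribute values, comments, or CDATA sections.
TAG_RE = re.compile(r"(<[^>]*>)")


def naive_breaks(text):
    """Insert <br /> before every newline, with no regard for tag boundaries."""
    return text.replace("\n", "<br />\n")


def tag_aware_breaks(text):
    """Insert <br /> only before newlines that fall outside of tags."""
    parts = TAG_RE.split(text)
    # With one capturing group, split() returns text at even indexes
    # and the captured tags at odd indexes; tags pass through untouched.
    return "".join(
        part if i % 2 else part.replace("\n", "<br />\n")
        for i, part in enumerate(parts)
    )


sample = 'See:\n<img src="a.png"\nalt="A" />'
print(naive_breaks(sample))      # a <br /> ends up between the two attributes
print(tag_aware_breaks(sample))  # the img tag is left exactly as written
```

    The second function is only one possible approach; a mode that leaves already-valid post content untouched altogether would sidestep the question.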

  • The topic ‘WP is invalidating my XHTML Strict valid post’ is closed to new replies.