remove HTML-markup from RSS: any suggestions? (14 posts)

  1. infranic
    Posted 11 years ago #

    i really don't know if this is a topic worth dealing with, but in my opinion it looks like garbage when xml content is enriched with html markup.
    some posts and comments have a really huge amount of it.

    is it possible that this could affect syndication and ranking in search engines?

    and if it does, is there a solution, or do i have to fix this on my own?

    thx for any reply or help :)

  2. Kafkaesqui

    Posted 11 years ago #

    "i really don't know if this is a topic worth dealing with, but in my opinion it looks like garbage when xml content is enriched with html markup."

    I'll just enjoy that comment for a moment.


    Ok, one option is to switch to summary display for syndication: Options > Reading, Syndication Feeds.

    Another is to modify your various feed templates to use the_content_rss() instead of the_content() for full text feeds. For example, in wp-rss2.php (RSS 2), look for this line:

    <content:encoded><![CDATA[<?php the_content('', 0, '') ?>]]></content:encoded>

    which I suggest changing to:

    <content:encoded><![CDATA[<?php the_content_rss('', false, '', 0, 2) ?>]]></content:encoded>

    Info on the_content_rss() and its parameters:
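
    In short, as far as I understand the parameters: the final 2 makes the_content_rss() strip HTML tags, and the 0 before it means the text is not cut to a word limit. The plain-PHP sketch below only mimics that effect outside WordPress. feed_plain_text() and its $max_words parameter are made-up names for illustration, not WordPress functions, and the trailing ' ...' marker is my own choice.

```php
<?php
// Hypothetical helper, NOT part of WordPress: reduce post HTML to plain text for a feed.
function feed_plain_text($html, $max_words = 0)
{
    // Drop all tags, keeping only the text between them.
    $text = strip_tags($html);

    // Collapse runs of whitespace left behind by removed block elements.
    $text = trim(preg_replace('/\s+/', ' ', $text));

    // Optionally cut the text to a word limit (0 means no limit).
    if ($max_words > 0) {
        $words = explode(' ', $text);
        if (count($words) > $max_words) {
            $text = implode(' ', array_slice($words, 0, $max_words)) . ' ...';
        }
    }

    return $text;
}

// prints: Hello world.
echo feed_plain_text('<p>Hello <a href="http://example.com/">world</a>.</p>'), "\n";
```

    Note that the link target is lost along with the tags, so this only suits feeds that are meant to carry plain text.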


  3. infranic
    Posted 11 years ago #

    i guessed that my question would cause some amusement out there, but as you can see, i'm a newbie to blogging.

    by carefully following your hints i discovered:

    1. that your suggested code changes in the feed templates work successfully - many thx
    2. that (my) wp always gives me the same feed output, regardless of which post i am on. the info on the_content_rss() says that it encodes the current post, but (for me) it does not. it always shows the rss for the whole blog. guess i'll amuse you again..., but something went wrong.
  4. Kafkaesqui

    Posted 11 years ago #

    My enjoyment is not due to amusement, but agreement with your point. Sorry that wasn't clear.

    On 2: which version of WP are you running right now? And do you have a link?

  5. infranic
    Posted 11 years ago #

    i'm running:

    wp 1.5.1,
    php 4.3.3,
    MySQL 4.1.12,
    apache 1.3.30

    my site isn't online yet; at the moment i'm working on the lan at our office.

    the header uses, for example, the following tag: <link rel="alternate" type="application/rss+xml" title="RSS 2.0" href="<?php bloginfo('rss2_url'); ?>" />. should some other parameters be passed with it?

    maybe i have to dig into php more than i originally wanted. ask me anything about perl or javascript, but i have been at daggers drawn with php from its beginning :)...

    glad to meet someone who cares about poetry!

  6. Kafkaesqui

    Posted 11 years ago #

    <?php bloginfo('rss2_url'); ?> only generates the RSS2 link to the blog as a whole. It's not post-aware. Individual posts will typically provide a feed link using comments_rss_link(), which points to that post's comments feed. There is no single-post RSS per se.

  7. infranic
    Posted 11 years ago #

    okay, i got it. experimenting with the comment tags will be useful. unfortunately i made the mistake of customizing the layout and structure of my site too early. a real crash course in php. tweak, tweak...

    maybe i will post another question here in the next few days.

    thx a lot, be blessed.

  8. Firas
    Posted 11 years ago #

    HTML is the format you're writing your content in. You need the HTML to be in your RSS feed so that paragraphs break properly, lists show as lists, images show up, etc. in aggregators. This will not affect search engines, but removing HTML tags might make the readers of your syndicated feed miserable.

  9. Kafkaesqui

    Posted 11 years ago #

    It's a shame I disagree with you on that, Firas, it really is.

  10. Firas
    Posted 11 years ago #

    Kafkaesqui, I don't quite understand what there is to disagree about--how are you going to link to things in RSS item content sans HTML?

  11. Kafkaesqui

    Posted 11 years ago #

    I don't link to things in RSS.

  12. Firas
    Posted 11 years ago #

    Ahh, yeah. It depends on what one thinks RSS should do (i.e., what sort of information it should contain), I guess.

  13. Kafkaesqui

    Posted 11 years ago #

    And that quite simply is the key.

  14. infranic
    Posted 11 years ago #

    Ooops, may I hand you a cleaver.


    maybe this will shed some light on feeds and their history:

    RDF = Resource Description Framework

    RSS = Rich Site Summary, later Really Simple Syndication

    If one keeps in mind that any pattern-matching search task in the semantic web has to chew through rich markup with inline styles, JavaScript, etc., one might understand why it can be useful to straighten out feeds so that they can easily be searched for plausible matches.

    In my opinion RSS can be used in many different ways. I will use it as an additional reminder of site updates; others may use it for blogging. But in any case the codex intends machine-readable output, and not just any string (with or without markup) will do for that. The basis is always well-formed XML and qualified DTDs.
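
    To make the well-formedness point concrete, here is a small plain-PHP sketch. is_well_formed() and the sample strings are made up for illustration, and it assumes PHP's SimpleXML extension is available. It shows why feed templates wrap HTML content in CDATA: loose HTML dropped straight into an element can break the XML, while the same markup inside CDATA is just character data.

```php
<?php
// Hypothetical helper: returns true when $xml parses as well-formed XML.
function is_well_formed($xml)
{
    // Collect parse errors internally instead of emitting warnings.
    $previous = libxml_use_internal_errors(true);
    $doc = simplexml_load_string($xml);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // simplexml_load_string() returns false when parsing fails.
    return $doc !== false;
}

// HTML with an unclosed <br> placed directly inside an element breaks the XML.
$raw = '<item><description><p>Line one<br>Line two</p></description></item>';

// The same markup inside a CDATA section is treated as plain character data.
$wrapped = '<item><description><![CDATA[<p>Line one<br>Line two</p>]]></description></item>';

var_dump(is_well_formed($raw));     // bool(false)
var_dump(is_well_formed($wrapped)); // bool(true)
```

    Being well-formed only says the feed parses. Whether a consumer can do anything useful with the markup inside the CDATA section is a separate question.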


Topic Closed

This topic has been closed to new replies.
