Forums

incomplete import rss in spanish (3 posts)

  1. nicollb
    Member
    Posted 4 years ago #

    I am trying to convert from an old phpnuke site to WordPress, using the RSS import for my news items. I have done this successfully for an English-language site, but the Spanish one is failing miserably because post content and categories are cut off at the first diacritic (á, ñ, ó etc). Funny - the post title seems to work fine.

    The RSS item XML looks like this:

    <item>
    <title>Cuatro diáconos más</title>
    <link>http://www.famvin.org/es/modules.php?name=News&file=article&sid=2066</link>
    <description><p align="justify">En Brasil, en la provincia brasileña de la C.M. fueron ordenados cuatro nuevos diáconos y además este mes se incorporará un nuevo cohermano. Para alegría de la Iglesia, de la Congregación y de los más pobres. </description>
    <pubDate>2007-11-04 12:35:03</pubDate>
    <category><Font size=2>Congregación de la M.</Font</category>
    </item>

    and the resulting post looks like this:

    Cuatro diáconos más
    Noviembre 4th, 2007

    En Brasil, en la provincia brasile

    Posted in Congregaci | Edit | No Comments »

    ---

    Any ideas? Hints?

    I tried loading without UTF-8 set on the database; then I get the full post, but I get ? instead of the non-English characters.

    I'm tearing my hair out.

  2. nicollb
    Member
    Posted 4 years ago #

    I looked into the RSS import module. There are 2 differences I can see between the title field (which works) and the category and post_content fields (which fail):

    Title is Text; the other 2 are Longtext;

    Title is not decoded before it is written to the database; the other 2 are.

    The feed displays fine in my browser - diacritics and all.
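
A feed can look right in a browser and still not be valid UTF-8, because browsers often guess the encoding. The following is a minimal Python sketch, assuming the phpnuke feed was emitting ISO-8859-1 bytes (the thread does not state this outright). `valid_utf8_prefix` is a hypothetical helper, and the sketch does not model the WordPress importer or the database; it only shows where a strict UTF-8 reader would stop.

```python
def valid_utf8_prefix(raw: bytes) -> str:
    """Return the text up to the first byte that is not valid UTF-8."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError as err:
        # err.start is the offset of the first offending byte.
        return raw[:err.start].decode("utf-8")


# Assumption: the feed text arrives as ISO-8859-1 bytes.
description = "En Brasil, en la provincia brasileña de la C.M.".encode("latin-1")
category = "Congregación de la M.".encode("latin-1")

print(valid_utf8_prefix(description))  # En Brasil, en la provincia brasile
print(valid_utf8_prefix(category))     # Congregaci
```

Under that assumption, the cut-off points match the truncated post content and category shown in the first post.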

  3. nicollb
    Member
    Posted 4 years ago #

    Just in case anyone else has to do battle with utf8 encoding:

    The issue was resolved by taking the RSS feed from the supplying application and passing the text through utf8_encode (in the PHP script that created the RSS feed). That finally replaced the offending characters with ones that would load into the database using the RSS import script.
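
PHP's utf8_encode converts ISO-8859-1 text to UTF-8. For anyone doing the same conversion outside PHP, here is a rough Python equivalent, offered as a sketch rather than the script actually used. The function names `latin1_to_utf8` and `ensure_utf8` are hypothetical, and the sample string is taken from the feed item quoted earlier in the thread.

```python
def latin1_to_utf8(raw: bytes) -> bytes:
    """Re-encode ISO-8859-1 bytes as UTF-8 (the conversion utf8_encode performs)."""
    return raw.decode("latin-1").encode("utf-8")


def ensure_utf8(raw: bytes) -> bytes:
    """Convert only if needed, so already-valid UTF-8 is not double-encoded."""
    try:
        raw.decode("utf-8")
        return raw
    except UnicodeDecodeError:
        return latin1_to_utf8(raw)


# Assumption: the source application emits ISO-8859-1 bytes.
category = "Congregación de la M.".encode("latin-1")
fixed = ensure_utf8(category)

print(fixed.decode("utf-8"))  # Congregación de la M.
```

The guard in `ensure_utf8` matters because applying the conversion to text that is already UTF-8 turns each accented character into two garbled ones. Note also that utf8_encode is deprecated as of PHP 8.2; mb_convert_encoding is the usual replacement.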
