UTF-8 and ISO-8859-1

  • WP 1.2 supports UTF-8, that’s nice. But what do I do with my old posts containing special characters? These characters display correctly only if I set 1.2 to use ISO-8859-1 on the options page.
    Is there a way to have the old posts converted?

Viewing 13 replies - 1 through 13 (of 13 total)
  • In lieu of a permanent solution, I think you could probably do it using vars.php. Give it a look-see and add the character changes there.

    ‘ISO code here’ => ‘UTF code here’, and WP will take care of translating the ISO to UTF? Is that all?
    Thank you, Beel.

    I just thought of something: suppose I use this in vars.php:
    234 => 135 (just making up something as an example)
    That tells WP to use Unicode #135 every time it sees a #234, right? That may cause some conflicts – what if I do want to use #234 someday?
    I think there is a need to provide conversion for previous posts, if Unicode support is to be taken seriously. Am I right?

    THAT looks good! I will try it right now, after backing up the DB of course 🙂

    Update: I couldn’t do the MySQL string replacement thing. What I did was open the dump with a text editor and look inside a post, to check how the strings with the ISO codes were written there. But there are no ISO code strings in the dump! The dump reads nearly like a normal text file, with no HTML entities or ampersands with numeric codes inside.
    As for the other option, a conversion in the vars.php file: I was listing the ISO and UTF codes, and the codes look the same for many letters! For example, I look up the capital O with acute accent, and the ISO table I am using shows #211. Then I check a UTF table, and under the column “U-dec” (which seems to be the kind of value used for UTF in the vars.php file) it shows the same #211.
    What am I missing here?
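A small Python sketch may help with the question above. It assumes the character in question is U+00D3 (capital O with acute accent). ISO-8859-1 and Unicode share the same numbers for code points 0 to 255, which is why both tables show 211; what differs is the bytes that UTF-8 uses to store that number.

```python
# Capital O with acute accent, the character discussed above.
ch = "\u00d3"

code_point = ord(ch)                     # Unicode code point (the "U-dec" value)
latin1_bytes = ch.encode("iso-8859-1")   # bytes as stored in ISO-8859-1
utf8_bytes = ch.encode("utf-8")          # bytes as stored in UTF-8

print(code_point)           # 211
print(list(latin1_bytes))   # [211]
print(list(utf8_bytes))     # [195, 147]

# A lone ISO-8859-1 byte 211 is not valid UTF-8, so old data read as
# UTF-8 cannot be decoded as-is.
try:
    latin1_bytes.decode("utf-8")
    decodes_as_utf8 = True
except UnicodeDecodeError:
    decodes_as_utf8 = False

print(decodes_as_utf8)      # False
```

So a table that maps one number to another does not help here: the number stays 211, and only the byte encoding of the stored text has to change.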

    (I am about to bang my head on the table in front of me. I can’t even make the WP forum display a special character like a capital O with an acute accent. I have edited the above post 4 times trying that, to no avail, which makes me feel very, very stupid.)

    That character? Ó

    Yes, that character. Man :-/
    Well, anyway – the conversion was sorted out by michel_v. On WP’s IRC channel he told me to open the SQL dump in an editor, save it in UTF-8 encoding, and import it into the database again. That did the trick.
    Beel, thank you one more time!

    Hi anatman,
    I have the same problem that you had.
    Could you please post a step-by-step guide? If possible with the programs you used.
    Thanks a lot in advance.

    If you have a Linux system:
    iconv -f iso-8859-15 -t utf-8 < dbdump > dbdump.1
    iconv should be available on all glibc2 systems.

    On Windows I don’t know. On Linux you could do this via ssh. As you can see, I converted the whole DB with no drawbacks so far. The next problem was Apache, where you should have:
    AddDefaultCharset utf-8
    in the configuration for your blog directory.
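Where iconv is not at hand (on Windows, for example), the same re-encoding can be sketched in Python. This is a minimal sketch, not a tested migration tool: the function name `convert_dump` is made up for illustration, the source encoding defaults to the iso-8859-15 used in the iconv command above, and the whole dump is read into memory at once.

```python
from pathlib import Path


def convert_dump(src, dst, src_encoding="iso-8859-15"):
    """Re-encode the file at src from src_encoding to UTF-8 and write it to dst.

    Hypothetical helper for illustration. Works on raw bytes so that line
    endings in the dump are left untouched.
    """
    text = Path(src).read_bytes().decode(src_encoding)
    Path(dst).write_bytes(text.encode("utf-8"))
```

Calling `convert_dump("dbdump", "dbdump.1")` would mirror the iconv line. Keep the original dump as a backup, and pass `src_encoding="iso-8859-1"` if that is what the old data was actually stored in.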

    Hi joern,
    thanks very much for your help!
    The support of my hoster did that for me 🙂 Now everything looks great.
    There is still a problem with the comment notification emails, which contain weird characters… But that’s not so important.
    Could you please let me know why the default charset should be changed in Apache?
    Is it necessary?

    It was necessary because Mozilla/Firefox recognized the page encoding as ISO-8859-1 regardless of what is in the Content-Type meta tag. Mozilla makes this decision based on the Apache server headers, which are normally set to ISO-8859-1.
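The precedence described above can be modelled roughly in Python. This is a simplified sketch under stated assumptions: the function names are hypothetical, and real browsers also consider other signals (such as a byte order mark) that are ignored here. It only shows that a charset in the HTTP Content-Type header wins over the one in the meta tag.

```python
from email.message import Message


def header_charset(content_type):
    """Return the charset parameter of a Content-Type header value, or None."""
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset()  # lowercased, None if absent


def effective_charset(http_content_type, meta_charset):
    """Simplified model: the HTTP header charset overrides the meta tag."""
    return header_charset(http_content_type) or meta_charset.lower()


# Server sends ISO-8859-1 in the header, page declares UTF-8 in its meta tag.
print(effective_charset("text/html; charset=ISO-8859-1", "UTF-8"))  # iso-8859-1

# Server sends no charset, so the meta tag decides.
print(effective_charset("text/html", "UTF-8"))                      # utf-8
```

With `AddDefaultCharset utf-8` the header itself says UTF-8, so header and meta tag no longer disagree.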
