• Hi there,

    I’ve noticed that single quotes (') entered in the text editor get converted to ’ in the final HTML output by WordPress, despite wp-config.php containing the line define('DB_CHARSET', 'utf8');.

    Does anybody know of an easy way, preferably without hacking the core code, to have single quotes in text converted to the relevant UTF8 character (’) instead?

    Many thanks indeed.

Viewing 3 replies - 1 through 3 (of 3 total)
  • Thread Starter hydrurga

    (@hydrurga)

    So, my character entity code was converted in my message above, despite being surrounded by backticks. It should read (removing spaces) … get converted to & # 8 2 1 7 ; ….

    Moderator bcworkz

    (@bcworkz)

    So you want your posts to contain the actual UTF8 curly or slanted characters instead of the HTML character entities that are normally substituted for the straight characters you get when typing content on a keyboard, correct?

    Post content is run through wptexturize() to insert HTML character entities. WP does this by adding it as a callback to ‘the_content’ filter. You could create your own callback function that in turn replaces the HTML character entities with the actual UTF8 characters in the same manner. Your callback needs to be added with a priority number higher than 10 so that it runs after wptexturize().
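A minimal sketch of the callback approach described above, assuming it lives in a small plugin or a theme's functions.php. The function name `myprefix_entities_to_utf8` and the list of entities are illustrative assumptions, not part of WordPress; the list covers only some of the entities wptexturize() can emit.

```php
<?php
// Hypothetical helper: swap a few numeric character entities for the
// equivalent UTF-8 characters. The map is illustrative, not exhaustive.
function myprefix_entities_to_utf8( $content ) {
    $map = array(
        '&#8216;' => "\u{2018}", // left single quote
        '&#8217;' => "\u{2019}", // right single quote / apostrophe
        '&#8220;' => "\u{201C}", // left double quote
        '&#8221;' => "\u{201D}", // right double quote
        '&#8211;' => "\u{2013}", // en dash
        '&#8212;' => "\u{2014}", // em dash
        '&#8230;' => "\u{2026}", // ellipsis
    );
    return strtr( $content, $map );
}

// Register only when running inside WordPress. Priority 11 is higher than
// the default 10, so this runs after wptexturize().
if ( function_exists( 'add_filter' ) ) {
    add_filter( 'the_content', 'myprefix_entities_to_utf8', 11 );
}
```

Because the replacement is a plain string map, an entity that an author typed deliberately into a post would be converted as well.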

    Alternatively, you could write your own version of wptexturize() that uses UTF8 characters instead of HTML character entities. Use a different function name. Then remove wptexturize() from ‘the_content’ filter and add in your version. This technically alters core behavior, but in a manner considered acceptable because it is done through a filter and persists through upgrades.
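The filter swap could look roughly like the sketch below. The replacement function `myprefix_texturize_utf8` is a hypothetical, heavily simplified stand-in that handles only straight single and double quotes; it is not a port of wptexturize() and ignores things the core function accounts for, such as shortcodes and `<pre>` or `<code>` blocks.

```php
<?php
// Hypothetical, simplified texturizer: converts straight quotes to UTF-8
// curly quotes in text, leaving anything inside HTML tags untouched.
function myprefix_texturize_utf8( $text ) {
    // With one capture group, even indexes hold text and odd indexes hold tags.
    $parts = preg_split( '/(<[^>]*>)/', $text, -1, PREG_SPLIT_DELIM_CAPTURE );
    foreach ( $parts as $i => $part ) {
        if ( $i % 2 === 1 ) {
            continue; // an HTML tag: keep attribute quotes as they are
        }
        // A quote at the start of the text or after whitespace or an opening
        // bracket is treated as an opening quote; any other one as closing.
        $part = preg_replace( '/(^|[\s(\[{])"/', '$1' . "\u{201C}", $part );
        $part = str_replace( '"', "\u{201D}", $part );
        $part = preg_replace( '/(^|[\s(\[{])\'/', '$1' . "\u{2018}", $part );
        $part = str_replace( "'", "\u{2019}", $part );
        $parts[ $i ] = $part;
    }
    return implode( '', $parts );
}

// Register only when running inside WordPress. wptexturize() is attached to
// 'the_content' at the default priority (10), which remove_filter() assumes.
if ( function_exists( 'add_filter' ) ) {
    remove_filter( 'the_content', 'wptexturize' );
    add_filter( 'the_content', 'myprefix_texturize_utf8' );
}
```

With this approach the entities are never generated in the first place, at the cost of maintaining your own quote-handling rules.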

    Thread Starter hydrurga

    (@hydrurga)

    Just what I needed to know bcworkz – many thanks!

  • The topic ‘Unwanted character entities in utf8’ is closed to new replies.