Hi,
Are you still seeing this behavior? If so, could you explain a bit more about what you are doing? You say you are entering Unicode characters: how are you entering them, and in which view modes?
I developed a plugin that allows for easy insertion of Font Awesome icons into the editor. It parses the CSS file and grabs the :before content string for each icon and makes a selectable list. For example, the android icon (fa-android) CSS :before content is “\f17b”, which is the $unicode variable below.
When the icon is inserted into the editor, I use:
str_replace('\\', '&#x', $unicode)
That results in the following HTML for the icon (a semicolon is appended before insertion):
& # x f 1 7 b ;
(remove the spaces as the forum editor was rendering a boxed question mark)
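For reference, the conversion described above could be sketched roughly like this. The function name css_escape_to_entity is made up for illustration and is not the plugin's actual code; the sketch also assumes the parsed CSS value holds a single escape, possibly still wrapped in quotes.

```php
<?php
// Hypothetical helper (not from the plugin): turn a CSS escape such as "\f17b"
// into the HTML hex entity "&#xf17b;".
function css_escape_to_entity(string $unicode): string
{
    // Strip surrounding quotes/whitespace, then the leading backslash.
    $hex = ltrim(trim($unicode, " \t\n\r\"'"), '\\');

    // Reject anything that is not purely hexadecimal (including an empty string).
    if (!ctype_xdigit($hex)) {
        return '';
    }

    // Prefix with "&#x" and append the semicolon in one step.
    return '&#x' . strtolower($hex) . ';';
}

echo css_escape_to_entity('\f17b'), "\n";
```

Compared with a bare str_replace, this appends the semicolon in the same step and returns an empty string for values that are not a hex escape.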
The icon appears in visual mode, and will be retained when switching back and forth between visual and text modes. As soon as the page is saved, the icon code gets replaced by ‘?’ and visual mode just shows a question mark.
It works fine in 4.1 but does not in any of the latest alpha releases. DB_CHARSET is utf8 in wp-config on the server.
Do you have a link to your plugin so I can have a look at what could be happening?
You don’t need the plugin to duplicate it. I can duplicate it just by inserting the following decimal HTML entity (remove the spaces):
& # 9 8 2 4 ;
http://www.w3schools.com/charsets/ref_utf_symbols.asp
That’s the spade character (U+2660) from the Unicode Miscellaneous Symbols block. On localhost running 4.1, it shows on insert and is retained after save. On 4.2-alpha-31471 on a live server, it shows on insert and survives switching back and forth between visual and text modes, but on save it is stripped out and replaced by a question mark.
I tested the above on 4.2-alpha-31471, and it renders as it should in both editor views, the post preview, and published posts.
Are you perhaps running a different theme on your live server that isn’t declaring the character set properly? (As a side note, I’d also generally advise against running alpha builds on live sites.)
Exact same theme on both. The live site is just a “playground” with no user access. I tested with the TwentyFifteen theme and no plugins installed, with the same result: the code was replaced by a ?. Multiple browsers, same result.
I think I figured it out. The server set the WP database collation to latin1_swedish_ci rather than utf8_general_ci. That would explain the inability to save the utf8 characters.
UPDATE: I altered the collation to utf8_general_ci on the wp_posts table, and it works fine now.
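For anyone who lands here later, the question mark can be reproduced outside the database. The snippet below is only an analogy using PHP's mbstring extension, with ISO-8859-1 standing in for MySQL's latin1; it is not the code path WordPress or MySQL actually runs, and the variable names are made up for the example.

```php
<?php
// Illustration only: mimics a lossy UTF-8 -> latin1 conversion with mbstring.
// This is NOT WordPress's or MySQL's actual code path.

// Make the substitution character explicit: "?" (0x3F).
mb_substitute_character(0x3F);

// Decode the decimal entity into the real UTF-8 character (U+2660).
$spade = html_entity_decode('&#9824;', ENT_QUOTES | ENT_HTML5, 'UTF-8');

// ISO-8859-1 has no spade, so the conversion substitutes "?".
$asLatin1 = mb_convert_encoding($spade, 'ISO-8859-1', 'UTF-8');

// A character that ISO-8859-1 does contain survives as a single byte.
$eAcute = mb_convert_encoding('é', 'ISO-8859-1', 'UTF-8');

echo bin2hex($spade), "\n";   // UTF-8 bytes of the spade
echo $asLatin1, "\n";         // what is left after the conversion
echo bin2hex($eAcute), "\n";  // single latin1 byte for é
```

If the table's character set (and not only its collation) turns out to be latin1, a statement along the lines of ALTER TABLE wp_posts CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci is the usual way to switch it. Back up the database first.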