It appears that the plugin chokes on Atom feeds.
Plugin Author
Allen
(@amweiss98)
What is the URL of the RSS feed where you think this is choking?
Plugin Author
Allen
(@amweiss98)
well, here is that feed running successfully on my public site (this is using the feed to post)
http://www.wprssimporter.com/
and here it is using the shortcode
http://www.wprssimporter.com/mypage/
It's random. Sometimes it works, sometimes it doesn’t.
It appears to be related to character encoding from what I can research so far.
I found a workaround:
ALTER TABLE wp_posts DEFAULT CHARACTER SET utf8 COLLATE utf8_general_ci;
One could say it's a WordPress problem due to inadequate installation requirements, or that the plugin is not handling character sets properly.
Hmmm…that seemed to fix some of the errors. Do you check for dupes before posting?
Plugin Author
Allen
(@amweiss98)
If you’re using the feed to post, it checks for dupes using the permalink and title. If either of them is already in the database, it doesn’t enter the item again. (Some people make the mistake of putting an item into the trash and thinking the item is gone, but WordPress keeps the item in the database until it is permanently deleted.)
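For anyone following along, here is a minimal sketch of the kind of check described above. It is not the plugin's actual code: the function name `is_duplicate_item` and the array layout are made up for illustration, and a plain PHP array stands in for the posts table.

```php
<?php
// Hypothetical helper, not taken from the plugin.
// $existing stands in for rows of the posts table, trashed rows included.
function is_duplicate_item(array $item, array $existing): bool
{
    foreach ($existing as $row) {
        // A match on EITHER the permalink or the title counts as a duplicate.
        if ($row['permalink'] === $item['permalink'] || $row['title'] === $item['title']) {
            return true;
        }
    }
    return false;
}

// A trashed post is still a row, so it still blocks re-import.
$existing = [
    ['title' => 'Smoked Ribs', 'permalink' => 'http://example.com/ribs', 'status' => 'trash'],
];
$item = ['title' => 'Smoked Ribs', 'permalink' => 'http://example.com/ribs-2'];
var_dump(is_duplicate_item($item, $existing));
```

The point of the sketch is that the trashed row still matches on title, which is why an item sitting in the trash is not imported again.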
Okay. I think I found a feed that it breaks on consistently.
http://betterbarbecueblog.com/feed/
On that feed you're escaping the ® HTML entity as \xAE, which causes the “WordPress database error Incorrect string value:”
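A lone 0xAE byte is not valid UTF-8 (in UTF-8, ® is the two bytes 0xC2 0xAE), which would explain MySQL rejecting it. Below is a rough sketch of one way to catch that before inserting. The function name `ensure_utf8` is made up, it needs the mbstring extension, and it assumes that anything that is not valid UTF-8 is really ISO-8859-1, which may not hold for every feed.

```php
<?php
// Hypothetical helper, not part of the plugin.
// ASSUMPTION: text that fails the UTF-8 check is ISO-8859-1 (Latin-1).
function ensure_utf8(string $text, string $assumedSource = 'ISO-8859-1'): string
{
    // Already valid UTF-8: leave it untouched.
    if (mb_check_encoding($text, 'UTF-8')) {
        return $text;
    }
    // Otherwise re-encode from the assumed source charset.
    return mb_convert_encoding($text, 'UTF-8', $assumedSource);
}

$broken = "Weber\xAE";                 // Latin-1 byte for the registered sign
echo bin2hex(ensure_utf8($broken)), "\n";
```

Running valid UTF-8 through the function leaves it unchanged, so it should be safe to apply to every item.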
Plugin Author
Allen
(@amweiss98)
As you said, I’m not a programmer, and I’m not having any problem with that feed on my end. Where precisely do you see this feed breaking? Is there a specific feed item?
Actually, use this file which will solve the problem.
https://dl.dropboxusercontent.com/u/3132388/pluginfiles/excerpt_functions.php
It causes MySQL to reject the post. You are having problems; it just depends on which characters are in the feed. It still posts the items that don’t fail. A log file should be implemented, since this plugin works on a lot of different feeds. If you look in your PHP error_log, you will see the errors.
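To make the log file suggestion concrete, here is a minimal sketch. The function name `log_failed_item`, the line format, and the log path are all invented for illustration; only PHP's built-in `error_log()` with message type 3 (append to a file) is relied on.

```php
<?php
// Hypothetical helper, not part of the plugin.
// Appends one line per failed feed item to a dedicated log file.
function log_failed_item(string $logFile, string $feedUrl, string $title, string $error): bool
{
    $line = sprintf("[%s] feed=%s title=%s error=%s\n", date('c'), $feedUrl, $title, $error);
    // Message type 3 appends to the given file; no newline is added for us.
    return error_log($line, 3, $logFile);
}

// Example usage with a made-up log location.
$logFile = sys_get_temp_dir() . '/rss-import-failures.log';
log_failed_item($logFile, 'http://betterbarbecueblog.com/feed/', 'Some item', 'Incorrect string value');
```

With something like this, the rejected items would be visible in one place instead of being buried in the general PHP error_log.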
Here is a snippet from that same file:
function CleanHTML($content,$thisLink){
$content=str_replace(" »", "", $content);
There are already hacks in place to strip out characters. You would need one of these for every extended ASCII character for the RSS import to posts to operate flawlessly. However, this turns the function into a “lossy” method; one loses information. Anyway, the loss would probably be acceptable if the post were actually inserted. In this particular case you lose all of the information because it's never posted.
I’ve moved on to a different plugin and will not be spending any more time on this. Thanks for your time.
This is a three-part problem.
1. The plugin attempts to insert utf8mb4 (4-byte) characters into MySQL, which rejects them because the column's character set does not support them.
2. Although it is not the exact cause of the problem, my.cnf should state the proper character sets explicitly, regardless of how you created the database.
3. One must get rid of or translate the 4-byte UTF-8 characters. This code problem should be addressed in the core of WordPress.
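As a rough illustration of point 3, here is one way to replace 4-byte characters before the insert. This is a sketch and not the code from any link in this thread: the function name `strip_four_byte_utf8` is made up, and swapping in U+FFFD is just one choice of replacement. It needs PHP 7 or later for the `\u{...}` string escapes.

```php
<?php
// Hypothetical helper, not part of the plugin or of WordPress.
// Replaces every code point above U+FFFF (the ones that need 4 bytes in UTF-8).
function strip_four_byte_utf8(string $text, string $replacement = "\u{FFFD}"): string
{
    $result = preg_replace('/[\x{10000}-\x{10FFFF}]/u', $replacement, $text);
    // preg_replace returns null when the input is not valid UTF-8;
    // in that case the text is handed back unchanged.
    return $result === null ? $text : $result;
}

echo strip_four_byte_utf8("Great ribs \u{1F600}"), "\n";
```

This is lossy in the same sense as the existing str_replace hacks, but it only touches characters that a 3-byte utf8 column cannot store, and 2-byte characters such as ® pass through untouched.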
Here is a link to some code that appears to have solved the problem.
http://www.avidheap.org/2013/a-quick-way-to-normalize-a-utf8-string-when-your-mysql-database-is-not-utf8mb4
Plugin Author
Allen
(@amweiss98)
You can go into excerpt_functions.php (line 247) and you’ll see different options to decode the HTML. Try using this one:
$content=html_entity_decode(pre_esc_html($content), ENT_QUOTES,'UTF-8');
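To show why the charset argument matters here, this small sketch decodes the ® entity both ways. It leaves out `pre_esc_html()`, which is the plugin's own function, and only uses PHP's built-in `html_entity_decode()`; the variable names are just for illustration.

```php
<?php
// Decode the same entity with two different target charsets.
$entity = '&reg;';

$utf8   = html_entity_decode($entity, ENT_QUOTES, 'UTF-8');
$latin1 = html_entity_decode($entity, ENT_QUOTES, 'ISO-8859-1');

// Print the raw bytes as hex so the difference is visible.
echo bin2hex($utf8), "\n";   // c2ae
echo bin2hex($latin1), "\n"; // ae
```

The single byte from the ISO-8859-1 decode matches the \xAE from the database error earlier in the thread, while the UTF-8 decode gives the two-byte form that a utf8 column accepts.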