Hello. I’m writing a PHP program to build a spam comment database. More specifically, I select specific fields from the wp_comments table and insert the data into a similar table in a different database. The first several rows are inserted into the new table successfully. But… the collation of the original comment_content field is set to latin1_swedish_ci, and spammers use some irregular characters. It appears that the collation of the comment_content field blocks those comments where irregular characters are used. How do I know? If I skip this field (comment_content), all the hundreds or thousands of rows with the other 14 fields (skipping comment_ID) are imported into the new table successfully. Changing the collation to utf8_unicode_ci doesn’t help. Does anybody have any idea which collation is best for accepting all the characters spammers use?
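For illustration, here is a minimal sketch of the copy step described above. It rests on assumptions the post does not confirm: that the failures may come from quotes or non-UTF-8 bytes in comment_content breaking a hand-built INSERT rather than from the collation itself, that the mysqli and mbstring extensions are available, and that the target table is called `spam_comments` (a hypothetical name) with only three of the columns shown. The helper names `to_utf8` and `copy_comments` are also made up for this example.

```php
<?php
// Sketch only: the table name `spam_comments`, the column subset, and both
// helper names are assumptions, not taken from the original post.

/**
 * Return $text as valid UTF-8.
 * Bytes that are not already valid UTF-8 are ASSUMED to be latin1
 * (ISO-8859-1); that guess may be wrong for some spam payloads.
 */
function to_utf8(string $text): string
{
    if (mb_check_encoding($text, 'UTF-8')) {
        return $text; // already valid, leave untouched
    }
    return mb_convert_encoding($text, 'UTF-8', 'ISO-8859-1');
}

/**
 * Copy rows from wp_comments into the (hypothetical) spam_comments table.
 * Returns the number of rows inserted.
 */
function copy_comments(mysqli $source, mysqli $target): int
{
    // State the connection character sets explicitly instead of relying on
    // server defaults. latin1 on the source side hands back the stored bytes
    // of a latin1 column unchanged.
    $source->set_charset('latin1');
    $target->set_charset('utf8mb4');

    // A prepared statement sends values separately from the SQL text, so
    // quotes and other odd characters in a comment cannot break the query.
    $insert = $target->prepare(
        'INSERT INTO spam_comments
            (comment_author, comment_author_email, comment_content)
         VALUES (?, ?, ?)'
    );
    if ($insert === false) {
        throw new RuntimeException('Prepare failed: ' . $target->error);
    }

    $result = $source->query(
        'SELECT comment_author, comment_author_email, comment_content
           FROM wp_comments'
    );
    if ($result === false) {
        throw new RuntimeException('Select failed: ' . $source->error);
    }

    $copied = 0;
    while ($row = $result->fetch_assoc()) {
        $author  = to_utf8((string) $row['comment_author']);
        $email   = to_utf8((string) $row['comment_author_email']);
        $content = to_utf8((string) $row['comment_content']);

        $insert->bind_param('sss', $author, $email, $content);
        if ($insert->execute()) {
            $copied++;
        } else {
            // Log the MySQL error so the failing row is visible.
            error_log('Insert failed: ' . $insert->error);
        }
    }
    return $copied;
}
```

With two open connections, calling `copy_comments($source, $target)` would run the copy. If rows still fail, the logged MySQL error message should show whether the cause is the character set, the column definition, or something else. For the target column, a utf8mb4 collation such as utf8mb4_unicode_ci is the usual choice when four-byte characters need to be stored.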
- The topic ‘Collation of Spam comments’ is closed to new replies.