configuring “alternative languages” (Plugin: Akismet Anti-Spam support topic)

  • A linguistic website running on WordPress that I visit uses the Akismet plugin. Unfortunately, for over half a year now, comments posted in Russian have been rejected as spam.

    Another site visitor did some research and told us this: “the Akismet spam recognition service can in fact be told to expect alternative languages such as Russian in the comments, but the plugin available for WordPress cannot be configured to give it that information.”

    Would it be possible to extend the plugin to expose this configuration API of Akismet?

    • This topic was modified 1 year, 2 months ago by  stuclayton.
Viewing 4 replies - 1 through 4 (of 4 total)
  • Hi there, sorry to hear that.

    The plugin actually does that on its own, based on the value WP’s get_locale() function returns, and transmits it to the API.

    [~/akismet-trunk]# grep -rn blog_lang . --include="*.php"
    ./class.akismet.php:12:	private static $comment_as_submitted_allowed_keys = array( 'blog' => '', 'blog_charset' => '', 'blog_lang' => '', 'blog_ua' => '', 'comment_agent' => '', 'comment_author' => '', 'comment_author_IP' => '', 'comment_author_email' => '', 'comment_author_url' => '', 'comment_content' => '', 'comment_date_gmt' => '', 'comment_tags' => '', 'comment_type' => '', 'guid' => '', 'is_test' => '', 'permalink' => '', 'reporter' => '', 'site_domain' => '', 'submit_referer' => '', 'submit_uri' => '', 'user_ID' => '', 'user_agent' => '', 'user_id' => '', 'user_ip' => '' );
    ./class.akismet.php:143:		$comment['blog_lang']    = get_locale();
    ./class.akismet.php:514:		$c['blog_lang']      = get_locale();
    ./class.akismet.php:653:		$comment->blog_lang    = get_locale();
    ./class.akismet.php:703:		$comment->blog_lang    = get_locale();


    So if the site is set to Russian, it should benefit from said API.
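
    To make the grep output above concrete, here is a rough sketch of how a blog_lang value could end up in a form-encoded request body. Only the key names come from the allowed-keys list on line 12 of class.akismet.php; the function name example_build_check_body, the URL, and every value are hypothetical, and this is not the plugin's actual request-building code.

```php
<?php
// Hypothetical helper: assemble a few of the fields named in the
// allowed-keys list and form-encode them like a typical POST body.
function example_build_check_body( string $site_url, string $locale, string $content ): string {
    $fields = array(
        'blog'            => $site_url,
        'blog_lang'       => $locale,   // the value get_locale() would supply
        'blog_charset'    => 'UTF-8',   // assumed charset for this sketch
        'comment_type'    => 'comment',
        'comment_content' => $content,
    );
    return http_build_query( $fields );
}

// Example call with made-up values for a site whose locale is Russian.
echo example_build_check_body( 'https://example.com', 'ru_RU', 'Привет' ), "\n";
```

    On a site whose language is English, the same field would carry the English locale instead, which is the situation described in this topic.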


    Now, you did mention a linguistic website, which might get more complex if the same site sports many languages. The plugin will adapt to a site’s defined language, but anything more complicated would need to be custom coded by their devs around WP’s WPLANG constant and get_locale(). They could, for example, set it on a per-user basis. This is beyond the scope of what the generic Akismet plugin is intended to do, but the plugin can at least be extended to do it. 🙂
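
    The custom coding mentioned above could hang off the locale filter that get_locale() applies before returning its value. The following is only a sketch under stated assumptions: the example_choose_locale helper, the allowed-locale list, and the idea of switching only while wp-comments-post.php handles a submission are all hypothetical, and nothing in this topic confirms that a different blog_lang changes how Akismet classifies a comment.

```php
<?php
// Hypothetical helper: use $override only when it is one of the locales
// the site owner allows; otherwise keep the locale already resolved.
function example_choose_locale( string $current, string $override, array $allowed ): string {
    return in_array( $override, $allowed, true ) ? $override : $current;
}

// Register the hook only when WordPress is loaded, so the helper above
// can also be exercised on its own.
if ( function_exists( 'add_filter' ) ) {
    add_filter( 'locale', function ( $locale ) {
        // Assumption: only comment submissions should report ru_RU.
        // wp-comments-post.php is the core script that receives comment forms.
        $is_comment_post = isset( $_SERVER['SCRIPT_NAME'] )
            && 'wp-comments-post.php' === basename( $_SERVER['SCRIPT_NAME'] );

        return $is_comment_post
            ? example_choose_locale( $locale, 'ru_RU', array( 'en_US', 'ru_RU' ) )
            : $locale;
    } );
}
```

    Any other code that calls get_locale() during the same request sees the overridden value too, so messages shown to the commenter may change language as a side effect.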

    Thanks. The problem is merely that comments in Russian are being suppressed as spam, not comments in other languages. It’s not about “the site”, however you define that, but comments falsely identified as spam, apparently for the sole reason that 1) they contain characters from Russian codepages and 2) the site language is English.

    The problem is unwanted Akismet behavior. I don’t know what you mean by the rather general term “adapt” in “the plugin will adapt to a site’s defined language”. The site operator does not want “adaptation”, or special per-user configuration, or anything like that. He wants Akismet to stop discriminating against Russian comments. At present Akismet appears to operate on the Nancy Reagan principle: “Just say no” to Russian when the site language is not Russian.

    Although I develop in Java, not PHP, I had already found the get_locale() call in class.akismet.php. However, it is merely my guess that whatever is listed (or not listed) in the blog_lang parameter affects the Akismet spam-inference algorithms.

    Another site visitor familiar with this suppression of comments in Russian speculates that the Akismet machine has “inferred” that spam is often associated with Russian codepages. My hope is that adding the Russian tag to blog_lang will disable that inference.

    It would be easy to patch class.akismet.php, but patching might have to be repeated following WP updates, say security updates. Is there no way to specify a “config file” with values that are used in place of hard-coded defaults, a file that is not affected by WP updates?
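
    One update-safe place for this kind of override is a must-use plugin: a PHP file in wp-content/mu-plugins/ that WordPress loads automatically and that plugin updates do not overwrite. The sketch below reads its value from a constant that could be defined in wp-config.php. The file name, the EXAMPLE_BLOG_LANG constant, and the function name are hypothetical, and whether the resulting blog_lang value affects Akismet's verdicts is not established in this topic.

```php
<?php
/**
 * Sketch of wp-content/mu-plugins/example-blog-lang.php (hypothetical name).
 * Assumes wp-config.php may contain: define( 'EXAMPLE_BLOG_LANG', 'ru_RU' );
 */
function example_configured_locale( string $current ): string {
    if ( defined( 'EXAMPLE_BLOG_LANG' ) ) {
        $value = constant( 'EXAMPLE_BLOG_LANG' );
        if ( is_string( $value ) && '' !== $value ) {
            return $value; // configured override wins
        }
    }
    return $current; // no override configured: keep the existing locale
}

// Register only inside WordPress. Caution: as written, this applies to
// every call to get_locale(), not only to the value sent to Akismet.
if ( function_exists( 'add_filter' ) ) {
    add_filter( 'locale', 'example_configured_locale' );
}
```

    Because the filter is unconditional, it is equivalent to changing the site language; a real deployment would probably narrow it to the requests where a comment is being checked.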

    • This reply was modified 1 year, 2 months ago by  stuclayton.

    As I said, I don’t know in what way, if at all, the blog_lang values can affect the spam-inference behavior. The language of the site is “*” so to speak – all codepage tags, not a list of selected tags.

    No one is sure what causes the unwanted behavior (suppression of comments in Russian). The “blog_lang” issue may be a distraction. Is a machine-learned “inference rule” in Akismet part of the problem, based perhaps on a human-specified principle such as “look for codepage correlation when probable spam has been identified by other means”?

    • This reply was modified 1 year, 2 months ago by  stuclayton.
  • The topic ‘configuring “alternative languages”’ is closed to new replies.