• Resolved msittig

    (@msittig)


    http://wubi.org/wflms-announce/

    I was using Twitter Tools + untco + wp-sns-share to mirror tweets in Chinese onto a WP blog, and share them to a Chinese social networking site. Since the upgrade to the newest Twitter Tools, the posts to WP and the posts sent to the other SNS are truncated versions of the original tweets if the original tweets are in Chinese.

    http://msittig.wubi.org/test/tt-encoding-posts.jpg

    See how it’s only six characters? Actually, the tweets are downloaded and stored correctly:

    http://msittig.wubi.org/test/tt-encoding-tweets.jpg

    The problem comes when the tweets are published to the blog or sent to the SNS by wp-sns-share. I suspect this is an encoding/utf-8 issue because the posts are truncated to almost exactly one third of the original number of characters (in UTF-8, most Chinese characters take up 3 bytes as opposed to 1 byte in ASCII). After testing I find that this is only happening with tweets in Chinese, not with English tweets.
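    To make the byte-versus-character point concrete, here is a minimal sketch. The strings are made up for illustration (they are not the tweets from the screenshots), and it assumes the file is saved as UTF-8 and the mbstring extension is loaded.

```php
<?php
// Illustrative strings only; not taken from the original tweets.
$english = 'hello';
$chinese = '你好世界'; // 4 characters, each 3 bytes in UTF-8

// strlen() counts bytes; mb_strlen() counts characters in the given encoding.
$englishBytes = strlen($english);
$chineseBytes = strlen($chinese);
$chineseChars = mb_strlen($chinese, 'UTF-8');

printf("english bytes: %d\n", $englishBytes);
printf("chinese bytes: %d\n", $chineseBytes);
printf("chinese chars: %d\n", $chineseChars);
```

    For ASCII text the two counts agree, which would be consistent with English tweets being unaffected.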

    Let me know if you need any more testing done on this issue.

    http://wordpress.org/extend/plugins/twitter-tools/

Viewing 8 replies - 1 through 8 (of 8 total)
  • Plugin Contributor Alex King

    (@alexkingorg)

    Please create a pull request on GitHub with the appropriate patch to address this, thanks!

    https://github.com/crowdfavorite/wp-twitter-tools

    Thread Starter msittig

    (@msittig)

    Thanks! Checking first to make sure I didn’t miss anything obvious.

    Not sure I have the time or PHP know-how to patch this, but will poke around a bit.

    Thread Starter msittig

    (@msittig)

    Looking through the code, I’m guessing it has something to do with the link_entities function in aktt_tweet.php. It seems to be populating an object (OO-code newb, can you tell?) using built-in functions like str_pad and strlen that may not play nice with UTF-8-encoded strings, and the logic of link_entities may also make some assumptions about strings and bytes that may be true about ASCII but not about UTF-8.

    That said, I’m pretty sure it’s strlen.
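    A rough sketch of the kind of mismatch I suspect, assuming the entity offsets count characters rather than bytes. The tweet text and offsets below are hypothetical, and this is not the plugin's actual code.

```php
<?php
// Hypothetical tweet text: four Chinese characters, a space, then a hashtag.
$text = '你好世界 #php';

// Character offsets of the hashtag (assumed to be counted in characters).
$start = 5;
$end = 9;

// Byte-based: substr() treats the offsets as bytes and cuts inside a character.
$byByte = substr($text, $start, $end - $start);

// Character-based: mb_substr() treats the offsets as characters.
$byChar = mb_substr($text, $start, $end - $start, 'UTF-8');

// The byte-based slice is not even valid UTF-8; the character-based one is the hashtag.
var_dump(mb_check_encoding($byByte, 'UTF-8'));
var_dump($byChar);
```

    With an all-ASCII tweet both calls would return the same slice, so the bug would only show up with multibyte text.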

    Now, to figure out what exactly link_entities is doing, to brush up on my PHP, and to learn how to use github…

    Plugin Contributor Alex King

    (@alexkingorg)

    Awesome, thanks!

    WPChina

    (@wordpresschina)

    Yes, I see a similar problem here with Chinese, on databases encoded in both UTF-8 and GB2312.

    I also encountered this problem. The following is how I handled it.

    1. Open the file aktt_tweet.php.
    2. Find the function link_entities().
    3. Change the line

      $str = substr_replace($str, $entity['replace'], $start, ($end - $start));

      into

      $str = mb_substr_replace($str, $entity['replace'], $start, ($end - $start));

    4. Change the line

      $diff += strlen($entity['replace']) - ($end - $start);

      into

      $diff += mb_strlen($entity['replace']) - ($end - $start);

    5. Copy mb_substr_replace() from http://www.php.net/manual/en/function.substr-replace.php#90146 to the top of aktt_tweet.php.
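    For anyone who wants to see what such a helper does before pasting it in, here is a minimal sketch. The function name mb_substr_replace_sketch, the sample text, and the offsets are hypothetical; this is not the function from the PHP manual comment, and it only handles non-negative offsets with UTF-8 input.

```php
<?php
// Hypothetical helper: a character-aware stand-in for substr_replace().
// Assumes non-negative $start and $length; not the function from the manual comment.
function mb_substr_replace_sketch($str, $replacement, $start, $length, $encoding = 'UTF-8') {
    // Everything before the span, counted in characters.
    $before = mb_substr($str, 0, $start, $encoding);
    // Everything after the span, to the end of the string.
    $after = mb_substr($str, $start + $length, null, $encoding);
    return $before . $replacement . $after;
}

// Hypothetical tweet and entity, with offsets counted in characters.
$text = '你好世界 #php';
$entity = array('replace' => '<a href="#">#php</a>');
$start = 5;
$end = 9;

$linked = mb_substr_replace_sketch($text, $entity['replace'], $start, $end - $start);

// Offset shift for any later entities, also counted in characters.
$diff = mb_strlen($entity['replace'], 'UTF-8') - ($end - $start);

echo $linked, "\n";
```

    The point is that both the replacement and the running offset are measured in characters, so they stay in step with each other.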

    Now enjoy your twitter tools.

    Thread Starter msittig

    (@msittig)

    Thank you ruanyf! You are the author of untco? I suspected strlen, but it would have taken me a long time to find mb_substr_replace.

    I will now figure out how to push this back onto git and submit a pull request.

    PS. This is for updating this Weibo with news from my school’s website:
    http://weibo.com/u/2513119342

    Thread Starter msittig

    (@msittig)

    OK, I think I understand git well enough to branch, commit, and push the updates back to my branch. I’ve submitted a pull request to the original Twitter Tools. I’ll mark this thread resolved now.

  • The topic ‘Encoding issue(?) with Twitter Tools 3.0’ is closed to new replies.