WordPress.org

Yet Another Related Posts Plugin (YARPP)
[Plugin: Yet Another Related Posts Plugin] French "overused" words (18 posts)

  1. saymonz
    Member
    Posted 3 years ago #

    Hi!

    French user of YARPP here. I've been using it for a long time now, and I just noticed that the keyword selection wasn't effective at all for the French language. So I managed to make it better.

    Here's a list of "overused" words (ready to use in YARPP after a small modification to intl.php): http://goo.gl/LF7J2. This list was compiled from a list of French words ordered by frequency of use (http://eduscol.education.fr/cid47916/liste-des-mots-classee-par-frequence-decroissante.html).

    I also noticed that when using wp-Typography, YARPP's keywords get completely messed up, e.g. half-words are used as keywords... Probably because of the hyphenation feature. It would be great to fix that!

    Hope you will take my contribution into consideration, and thanks for this great plugin!

    -- saymonz

    http://wordpress.org/extend/plugins/yet-another-related-posts-plugin/
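
For readers unfamiliar with the idea, here is a minimal sketch of how an "overused" (stop) word list can improve keyword selection. It is written in Python purely for illustration and is not YARPP's code: the function name `top_keywords`, the tiny `FRENCH_OVERUSED` set, and the sample sentence are all assumptions made up for this example.

```python
import re
from collections import Counter

# Hypothetical, heavily shortened stop-word list (assumption for this sketch);
# the real list linked in the post is far longer.
FRENCH_OVERUSED = {"le", "la", "les", "de", "du", "des", "un", "une",
                   "et", "est", "que", "qui", "dans", "pour"}


def top_keywords(text, stop_words, limit=5):
    """Return the most frequent words in `text` that are not stop words."""
    words = re.findall(r"\w+", text.lower())
    counts = Counter(w for w in words if w not in stop_words)
    return [word for word, _ in counts.most_common(limit)]


sample = "Le lecteur audio et le clip : la version audio du clip est dans les pages."

# Without a French list, a function word such as "le" competes with real keywords.
print(top_keywords(sample, set(), 3))
# With the list, only content words remain.
print(top_keywords(sample, FRENCH_OVERUSED, 2))
```

The point is only that frequent function words crowd out meaningful keywords unless the list matches the language of the text.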

  2. mitcho (Michael Yoshitaka Erlewine)
    Member
    Plugin Author

    Posted 3 years ago #

    Great! I'll include it with YARPP 3.2.1!

  3. saymonz
    Member
    Posted 3 years ago #

    Good. Be aware that I converted the list from the link I gave without checking whether every word is relevant... I hope they are (and as I can see on my own blog, it's better than using the English list on French texts anyway), but it surely could be better.

    What about the problem with wp-Typography?

  4. mitcho (Michael Yoshitaka Erlewine)
    Member
    Plugin Author

    Posted 3 years ago #

    The list looks pretty good.

    Not sure about the wp-typography issue. Did you figure this out by manually looking at the keywords produced?

  5. saymonz
    Member
    Posted 3 years ago #

    Yeah, I looked directly in the database via phpMyAdmin. Keyword selection totally changed after deactivating wp-Typography.

    That's strange: wp-Typography applies its modifications to the page just before it's sent to the client, so it shouldn't affect the data that YARPP gathers from the database.

  6. mitcho (Michael Yoshitaka Erlewine)
    Member
    Plugin Author

    Posted 3 years ago #

    YARPP doesn't get the text straight from the db, but instead from the db content with some of the content filters applied to it. Maybe WP-Typography's filters mess with that.

    Can you send me a link to the WP-Typography you're using so I can check it out?

  7. saymonz
    Member
    Posted 3 years ago #

    Here is wp-Typography in the WordPress plugin directory. I use the latest version (2.0.4).

    Here's an example of keywords selected by YARPP for one of my posts without wp-Typography:
    "cest quon the souvenirs jai clip version audio pages blowing justine1991 orphelin 2011 pourra seront lecteur choses survirais mêmes gouffre "

    And with wp-Typography enabled:
    "qu ver per sonne the ai ve pages aurais der sion lec audio clip rais nirs ter jus rable orphelin "

    As you can see, YARPP selects half-words as keywords, probably because of the hyphenation feature of wp-Typography (I use it with default settings except for the hyphenation rules language, which is set to French).

    wp-Typography removes all default wptexturize filters and adds its own.

    EDIT: I confirm that hyphenation causes this. I'm not a WordPress developer, but if you have a simple way to bypass text filters and just get the content as it is in the database, that would solve the problem.

  8. mitcho (Michael Yoshitaka Erlewine)
    Member
    Plugin Author

    Posted 3 years ago #

    I just updated the dev version to add wp-typography to the blacklist that I maintain. Give this new version a try. You may have to clear your keywords cache table to see any effect.

    http://downloads.wordpress.org/plugin/yet-another-related-posts-plugin.zip

    Let me know how that goes.
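
The general idea of a filter blacklist can be sketched as follows. This is a language-agnostic illustration in Python, not YARPP's or WordPress's actual code: `apply_filters` here is a hypothetical stand-in, and the filter names `strip_edges` and `fake_hyphenate` are invented for the example.

```python
SOFT_HYPHEN = "\u00ad"  # U+00AD, the invisible hyphenation hint


def strip_edges(text):
    """Harmless filter: trim surrounding whitespace."""
    return text.strip()


def fake_hyphenate(text):
    """Stand-in for a typography filter that inserts soft hyphens (assumption)."""
    return text.replace("version", "ver" + SOFT_HYPHEN + "sion")


def apply_filters(text, filters, blacklist=frozenset()):
    """Run each (name, function) filter in order, skipping blacklisted names."""
    for name, fn in filters:
        if name in blacklist:
            continue
        text = fn(text)
    return text


filters = [("strip_edges", strip_edges), ("fake_hyphenate", fake_hyphenate)]

# All filters applied: the word now contains a soft hyphen.
print(repr(apply_filters(" version audio ", filters)))
# With the hyphenating filter blacklisted, the word stays intact.
print(repr(apply_filters(" version audio ", filters, {"fake_hyphenate"})))
```

A blacklist of this kind only helps if the offending filter is registered under a name the blacklist actually matches.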

  9. saymonz
    Member
    Posted 3 years ago #

    Nothing changed with that version. Keywords are still weird with wp-Typography's hyphenation function turned on.

  10. mitcho (Michael Yoshitaka Erlewine)
    Member
    Plugin Author

    Posted 3 years ago #

    Okay. :( Because it applies consistently to your content across pages, though, I suspect it doesn't affect the actual reliability of your results that much... as such, I'm going to not worry about this right now. Thanks!

  11. saymonz
    Member
    Posted 3 years ago #

    That's fine. Maybe I'll take a look at the code myself, but I always get lost in WordPress-related code (though I'm actually not a beginner in PHP coding).

    Thanks for your help anyway!

  12. mitcho (Michael Yoshitaka Erlewine)
    Member
    Plugin Author

    Posted 3 years ago #

    No problem. Thanks!

  13. berniecz
    Member
    Posted 3 years ago #

    Hi Mitcho, I can confirm exactly the same issue with WP-Typography described by saymonz. The thing is that your plugin takes the post content after applying filters (which would not be a problem in general). But WP-Typography injects soft hyphens into words at the points where they can be divided at the end of a line, and unfortunately YARPP treats a soft hyphen as a space. Thus it effectively cuts every word into several syllables. The only (brute-force hack) solution was to disable filtering in your plugin (the test for blacklisted filters always returns false). When I check the database now, the keyword cache is perfectly readable, so this is indeed the issue. I'm not a good enough coder to work out the name of the filter applied by WP-Typography so that you could blacklist it. I can send you the part of the code where WP-Typography hooks "something" onto the the_content hook, but I don't understand that code at all.
    For me this is a clear collision between two plugins. Maybe instead of blacklisting WP-Typography, it would be enough for YARPP to treat a soft hyphen as nothing, i.e. strip soft hyphens out of the post content, as you do with other markup.
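
The effect described above, and the proposed fix, can be reproduced in a few lines. This is a minimal Python sketch under the assumption that the keyword extractor splits on every non-word character; it is not YARPP's actual code, and the function names are invented for illustration.

```python
import re

SOFT_HYPHEN = "\u00ad"  # U+00AD, inserted by hyphenation filters, normally invisible


def naive_keywords(text):
    """Split on anything that is not a word character.

    The soft hyphen is not a word character, so it acts as a separator here.
    """
    return re.findall(r"\w+", text.lower())


def keywords_without_soft_hyphens(text):
    """Remove soft hyphens first, so hyphenated words stay whole."""
    return naive_keywords(text.replace(SOFT_HYPHEN, ""))


hyphenated = "ver\u00adsion or\u00adphe\u00adlin"

print(naive_keywords(hyphenated))                 # words broken into syllables
print(keywords_without_soft_hyphens(hyphenated))  # words kept whole
```

Stripping the character before tokenizing leaves the visible hyphenation on the page untouched, since only the copy used for keyword extraction is modified.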

  14. mitcho (Michael Yoshitaka Erlewine)
    Member
    Plugin Author

    Posted 3 years ago #

    Stripping soft hyphens here indeed seems like it could be a good solution... let me try that out now...

  15. mitcho (Michael Yoshitaka Erlewine)
    Member
    Plugin Author

    Posted 3 years ago #

    @berniecz, I added stripping of soft hyphens (or tried to) in the dev version. Could you try it out and let me know how it goes?

    http://downloads.wordpress.org/plugin/yet-another-related-posts-plugin.zip

    You'll want to manually clear your cache and keyword tables after installing this version.

  16. berniecz
    Member
    Posted 3 years ago #

    Hohoo! It seems to work like a charm now. Just two lines of code and what a difference! Great job, mitcho. It would be a pity if the best related posts plugin and the best typographic plugin did not work together.

  17. berniecz
    Member
    Posted 3 years ago #

    And if Czech stop words could be included, that would also be great: http://www.provocado.cz/overuser_words.txt
    Many thanks.

  18. mitcho (Michael Yoshitaka Erlewine)
    Member
    Plugin Author

    Posted 3 years ago #

    Alright, I'm going to release this (as well as another minor fix) as 3.2.2! Thanks for your feedback!

Topic Closed

This topic has been closed to new replies.