• Hi,

    I don’t know why, but I have a problem in the function ‘relevanssi_create_excerpt’.

    (FYI, I’m calling it from my own code to create an excerpt for some post metas.)

    Some of my text is returned with “?” characters in it. I was able to fix the problem by replacing

    $content = preg_replace('/\s+/', ' ', $content);

    with

    $content = preg_replace('/\s+/u', ' ', $content);

    What’s weird is that sometimes it works even without the /u modifier.

    I suspect there is a PHP bug here, but I’m not really sure, because it happens only sometimes, even with the exact same text!

    As all my texts are Unicode (UTF-8) encoded, I can go with the /u modifier, but somehow that doesn’t seem right to me when the behavior is so erratic.
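    For reference, here is a minimal sketch of the change. The helper name normalize_whitespace is hypothetical and not part of Relevanssi. One plausible explanation for the stray “?” characters, not confirmed in this thread, is that without /u the pattern works on single bytes, and under some locales \s can match the byte 0xA0, which is also the second byte of “à” (0xC3 0xA0) in UTF-8. That would also explain why the result varies from one run or setup to another.

```php
<?php
// Hypothetical helper, not part of Relevanssi: collapse runs of whitespace
// in a UTF-8 string without splitting multibyte characters.
function normalize_whitespace($content) {
    // With /u the pattern and the subject are treated as UTF-8, so \s can
    // only match whole characters, never a single byte inside one.
    $result = preg_replace('/\s+/u', ' ', $content);

    // preg_replace() returns null on error, for example when the subject
    // is not valid UTF-8. Return the input untouched in that case.
    return ($result === null) ? $content : $result;
}

$text = "L’afrique  à\n\tbientôt";
echo normalize_whitespace($text), "\n"; // prints: L’afrique à bientôt
```

    The null check matters because, with /u, an input that is not valid UTF-8 makes preg_replace fail instead of returning a string.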

    My setup is WordPress 4.4.2 and Relevanssi 3.4.2

    PHP 5.5.12
    Apache 2.4.9
    MySQL 5.6.17

    https://wordpress.org/plugins/relevanssi/

Viewing 7 replies - 1 through 7 (of 7 total)
  • Thread Starter leup

    (@leup)

    In the same function, I have another problem.

    There is this line (234~):
    $term = " $term";

    I understand that you are searching for whole words and not parts of words (not fuzzy), but this causes a problem with words that follow an apostrophe.

    Example: query => “afrique”. If the text is “L’afrique”, the excerpt will fail to find the term “ afrique”, because the word is preceded by an apostrophe and not by a space.

    Also, if I understand this correctly, if the function “mb_stripos” does not exist, you do:

    $titlecased = mb_strtoupper(mb_substr($term, 0, 1)) . mb_substr($term, 1);

    and as the term always starts with a blank space, the “first character” that gets uppercased is that space, so the search never finds the term with an uppercase first letter.
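    A minimal sketch of one way around the uppercase issue, assuming the leading space is kept: uppercase the first character after any leading whitespace instead of the first character of the string. The function name titlecase_term is hypothetical and not Relevanssi code.

```php
<?php
// Hypothetical helper, not part of Relevanssi: uppercase the first letter
// of a term while preserving any leading ASCII whitespace.
function titlecase_term($term) {
    // ltrim() only strips ASCII whitespace bytes, so it is safe on UTF-8.
    $trimmed = ltrim($term);
    if ($trimmed === '') {
        return $term; // nothing to uppercase
    }
    // Whatever ltrim() removed is the leading whitespace to put back.
    $leading = substr($term, 0, strlen($term) - strlen($trimmed));

    // Pass the encoding explicitly so the result does not depend on
    // the mbstring internal encoding setting.
    $first = mb_substr($trimmed, 0, 1, 'UTF-8');
    $rest  = mb_substr($trimmed, 1, mb_strlen($trimmed, 'UTF-8'), 'UTF-8');

    return $leading . mb_strtoupper($first, 'UTF-8') . $rest;
}

echo titlecase_term(" afrique"), "\n"; // prints: " Afrique" (without the quotes)
echo titlecase_term("école"), "\n";    // prints: École
```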

    Plugin Author Mikko Saari

    (@msaari)

    Yes, that’s a bug with the first uppercase character. Also, adding the space – that’s a bit complicated as well, as it makes sense in some situations and not so much in others.

    I think adding the /u modifier makes sense, since WP content is pretty much always UTF8. I’ll have to see about the added space – something needs to be done with that, I’m just not quite sure what.

    In general the whole excerpt-building is far from being the most brilliant bit of programming in Relevanssi =)

    Thread Starter leup

    (@leup)

    Hi! Thanks for your answer! 🙂

    I removed the leading space character, as that suits my needs better, and added the /u modifier.

    I understand why you add the leading space, but indeed it is far from perfect in every case. Maybe using regular expressions would be better? Well, it would not give you the position of the occurrence in the text… complex indeed. I will check what solutions exist on the internet ^^
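    As a rough sketch of the regular expression idea (not Relevanssi code, and the function name find_term_offset is hypothetical): preg_match can report where a match starts when it is given the PREG_OFFSET_CAPTURE flag. The offset is in bytes, so the sketch converts it to a character offset. The lookarounds require that the term is not surrounded by letters or digits, which lets it match after an apostrophe without matching inside a longer word.

```php
<?php
// Hypothetical helper, not part of Relevanssi: find the character offset of
// the first whole-word, case-insensitive occurrence of $term in $text.
// Returns false when the term is not found.
function find_term_offset($text, $term) {
    // A "whole word" here means: not preceded or followed by a Unicode
    // letter (\p{L}) or digit (\p{N}). An apostrophe counts as a boundary.
    $pattern = '/(?<![\p{L}\p{N}])' . preg_quote($term, '/') . '(?![\p{L}\p{N}])/iu';

    if (preg_match($pattern, $text, $matches, PREG_OFFSET_CAPTURE) !== 1) {
        return false; // no match, or a PCRE error such as invalid UTF-8
    }

    // PREG_OFFSET_CAPTURE reports a byte offset; count the characters
    // that precede it to get a character offset.
    $byte_offset = $matches[0][1];
    return mb_strlen(substr($text, 0, $byte_offset), 'UTF-8');
}

var_dump(find_term_offset("L’afrique est grande", "afrique")); // int(2)
var_dump(find_term_offset("panafrique", "afrique"));           // bool(false)
```

    Whether an apostrophe should always count as a word boundary is a design choice. It suits French elisions such as “L’afrique”, but it may be less desirable for other languages.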

    Thread Starter leup

    (@leup)

    I made a quick search

    Google

    I think these links could be useful

    Drupal 7

    Stackoverflow

    WordPress plugin for search excerpts

    Plugin Author Mikko Saari

    (@msaari)

    Thanks, those should help.

    Plugin Author Mikko Saari

    (@msaari)

    Leup, I’m working on a better excerpt-building mechanism. If you’re interested in testing it, please drop me an email at mikko @ mikkosaari.fi.

    Thread Starter leup

    (@leup)

    Hi Mikko,

    Sorry for the delay. It would definitely be interesting, but I don’t have much time for testing right now.


The topic ‘Problem with relevanssi_create_excerpt and unicode ?’ is closed to new replies.