[resolved] [Plugin: Relevanssi - A Better Search] strip_tags issue (5 posts)

  1. b.l.k
    Posted 4 years ago #


    In the function relevanssi_index_doc(), strip_tags() is used at line 2353:

    $contents = relevanssi_strip_invisibles($contents);
    $contents = strip_tags($contents);
    $contents = relevanssi_tokenize($contents);

    This causes a problem with some text. For example:

    <p>my text</p>
    <p>my second text</p>

    We get "textmy" in the list of indexed words.
    Wouldn't it be better to use preg_replace() to replace each tag with a space, so that the words are indexed separately?

    Thank you for your answer.

  2. Mikko Saari
    Posted 4 years ago #

    strip_tags() does not remove whitespace, but if the original text is

    <p>my text</p><p>my second text</p>

    then yes, there's a problem. I suppose something like


    would do the trick, without running into terrible problems.
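
    The difference between the two inputs discussed so far can be checked with a short sketch. It only calls PHP's built-in strip_tags(); the variable names are made up for illustration and nothing here is taken from the Relevanssi source.

```php
<?php
// Paragraphs separated by a newline: strip_tags() leaves the newline in place,
// so the words on either side stay apart.
$separated = "<p>my text</p>\n<p>my second text</p>";

// Paragraphs with nothing between the tags: once the tags are removed,
// "text" and "my" end up joined into "textmy".
$adjacent = "<p>my text</p><p>my second text</p>";

$separated_result = strip_tags($separated);
$adjacent_result  = strip_tags($adjacent);

var_dump($separated_result); // "my text\nmy second text"
var_dump($adjacent_result);  // "my textmy second text"
```

    So the joined word only shows up when the markup has no whitespace between a closing tag and the next opening tag.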

  3. b.l.k
    Posted 4 years ago #

    Thank you for your answer. I added:
    $contents = preg_replace('/<[a-zA-Z\/][^>]*>/', ' ', $contents);
    $contents = strip_tags($contents);

    and now the indexing looks pretty good!

    I could also add:
    $pcoms = preg_replace('/<[a-zA-Z\/][^>]*>/', ' ', $pcoms);
    between relevanssi_strip_invisibles() and strip_tags() here:
    $pcoms = relevanssi_strip_invisibles($pcoms);

    $pcoms = strip_tags($pcoms);
    $pcoms = relevanssi_tokenize($pcoms);

    so that comments are indexed the same way.

    Would it be possible to include a fix in the next version?

    Thank you again !
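
    For anyone who wants to try the workaround outside the plugin, here is a minimal sketch. The helper name strip_tags_with_spaces() is hypothetical and not part of Relevanssi, and the final preg_split() is only a stand-in for relevanssi_tokenize(), which is not reproduced here.

```php
<?php
// Hypothetical helper (not a Relevanssi function): replace each tag with a
// space first, then let strip_tags() clean up whatever is left.
function strip_tags_with_spaces(string $contents): string
{
    // Same pattern as in the post above: "<", then a letter or "/",
    // then everything up to the next ">".
    $contents = preg_replace('/<[a-zA-Z\/][^>]*>/', ' ', $contents);

    // strip_tags() still runs afterwards to remove anything the pattern
    // does not match, such as HTML comments.
    return strip_tags($contents);
}

$result = strip_tags_with_spaces('<p>my text</p><p>my second text</p>');

// Assumption: a plain whitespace split stands in for the plugin's tokenizer.
$words = preg_split('/\s+/', $result, -1, PREG_SPLIT_NO_EMPTY);

print_r($words); // my, text, my, second, text
```

    Replacing tags with a space can leave runs of several spaces, which is harmless as long as the tokenizer splits on whitespace.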

  4. Mikko Saari
    Posted 4 years ago #

    Yeah, this is already on my to-do list for the next version. I'll fix the comments as well.

  5. b.l.k
    Posted 4 years ago #

    Thank you for your answers and your plugin :)
