• Dear Mikko,

    I’ve been using Relevanssi for quite a while now, and it has always worked without any problems, until I created some new posts that are quite lengthy (some of them contain the text of a 300-page PDF file).

    Since then, I have a performance problem: A search with Relevanssi takes so long that it sometimes even ends in an internal server error (probably due to timeout).

    The total number of posts is still at about 100, so I assume it has something to do with the length of these posts. Can you confirm that this is a problem for Relevanssi, and do you know what I could try to solve it?

    I know for sure it has something to do with Relevanssi, as the site gets considerably faster when I disable the plugin and do the search via “[URL]?s=XYZ”.

    Additional info: Relevanssi is up to date. My custom search can be restricted by the user to some “cats” values and a value for a custom post field.

    Kind regards,

    Requin

    • This topic was modified 8 years, 6 months ago by requin1989.
Viewing 8 replies - 1 through 8 (of 8 total)
  • Plugin Author Mikko Saari

    (@msaari)

    Yeah, sounds like this is caused by too much content. I’ve never tried this much data, so I don’t know what the exact mechanism is here. What kind of searches time out? Few search terms, many search terms, common search terms?

    I created an 80,000-word post on my test site, and Relevanssi finds it pretty much as fast as any shorter post.

    Can you install the Query Monitor plugin and let me know what that reports – it will show you which are the slow queries. That would shed some light on this and maybe hint at a solution.

    Thread Starter requin1989

    (@requin1989)

    Dear Mikko,

    thank you very much for your quick response 🙂

    I just tried it with one-word search terms. It appears to matter whether the search term is common: the Query Monitor report below comes from the search term “asd” (nonsense), whereas a search for “technology” results in an internal server error (probably a timeout). I know that there is a long blog post that should be found with the search term “technology”, so how common a search term is might make a difference.

    Query Monitor flagged two queries as slow; however, their indicated “time” does not reflect the real time of the search, which took about 30 seconds.

    Does this give you additional insights? 🙂

    Query 1:

    SELECT DISTINCT(relevanssi.doc), relevanssi.*, relevanssi.title * 10 + relevanssi.content + relevanssi.comment * 0.75 + relevanssi.tag * 5 + relevanssi.link * 0 + relevanssi.author + relevanssi.category * 3 + relevanssi.excerpt + relevanssi.taxonomy + relevanssi.customfield + relevanssi.mysqlcolumn AS tf
    FROM wp_relevanssi AS relevanssi
    WHERE (term LIKE '%asd'
    OR term LIKE 'asd%')
    AND ((relevanssi.doc IN (SELECT DISTINCT(posts.ID)
    FROM wp_posts AS posts
    WHERE posts.post_type NOT IN ('revision', 'nav_menu_item', 'custom_css', 'customize_changeset', 'feedzy_categories', 'mt_pp')))
    OR (doc = -1))
    ORDER BY tf DESC
    LIMIT 500

    Caller: relevanssi_search()
    wp-content/plugins/relevanssi/lib/search.php:513

    Time: 0.0911

    Query 2:

    SELECT COUNT(DISTINCT(relevanssi.doc))
    FROM wp_relevanssi AS relevanssi
    WHERE (term LIKE '%asd'
    OR term LIKE 'asd%')
    AND ((relevanssi.doc IN (SELECT DISTINCT(posts.ID)
    FROM wp_posts AS posts
    WHERE posts.post_type NOT IN ('revision', 'nav_menu_item', 'custom_css', 'customize_changeset', 'feedzy_categories', 'mt_pp')))
    OR (doc = -1))

    Caller: relevanssi_search()
    wp-content/plugins/relevanssi/lib/search.php:550

    Time: 0.1447

    Thread Starter requin1989

    (@requin1989)

    Additional information: Besides the long text in my posts, I include PDFs in an iframe in every post. The reason is that I would like to have the PDFs displayed in the post, but at the same time have them searchable (thus, the extracted PDF text is in a hidden div, which can be searched by Relevanssi).

    Maybe the inclusion of the PDFs in an iframe could also be an issue which causes the Relevanssi search to be so slow?

    A typical post looks like:

    <div class="hidden"> --- PDF extracted text --- </div>
    <div id="pdfviewer">
    <iframe class="pdf_document" src="/wp-content/pdf.js/web/viewer.html?file=URL.pdf">
    </iframe>
    </div>
    • This reply was modified 8 years, 6 months ago by requin1989.
    Plugin Author Mikko Saari

    (@msaari)

    So clearly the problem is not in the database queries. If the queries take less than 0.3 seconds, they’re not the reason for 30-second search times. Searching in itself shouldn’t be very slow, and shouldn’t really be affected by long posts; it doesn’t matter in the database how long the posts are.

    Are you using custom excerpts? If you are, that’s probably the reason for the timeout. Creating custom excerpts from long posts can take a lot of time. Does the problem go away if you disable custom excerpts?

    Plugin Author Mikko Saari

    (@msaari)

    Also, I’m guessing you have the excerpt length defined in characters. Switching to counting words is probably enough to solve the issue.

    I created a post that’s over 200,000 words long. Creating a 300-character excerpt took 37 seconds, while creating a 30-word excerpt from the same post took less than 2 seconds.
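To illustrate what a word-counted excerpt involves, here is a minimal sketch in Python. It is not Relevanssi's code (the plugin is written in PHP and its excerpt logic is more involved); the function name `excerpt_by_words` and its parameters are made up for this example. It only shows the idea of splitting the text into words once and cutting a fixed-size window around the first hit.

```python
def excerpt_by_words(text: str, term: str, length: int = 30) -> str:
    """Return up to `length` words centred on the first word containing `term`.

    Hypothetical illustration only; not how Relevanssi builds excerpts.
    """
    words = text.split()
    needle = term.lower()
    # Index of the first word containing the search term; fall back to the start.
    hit = next((i for i, w in enumerate(words) if needle in w.lower()), 0)
    # Centre the window on the hit, then clamp it to the bounds of the text.
    start = max(0, hit - length // 2)
    start = min(start, max(0, len(words) - length))
    return " ".join(words[start:start + length])


# Usage: a 100-word dummy text with the search term in the middle.
sample_words = [f"w{i}" for i in range(100)]
sample_words[50] = "technology"
sample_excerpt = excerpt_by_words(" ".join(sample_words), "technology", length=10)
print(sample_excerpt)
```

The window is measured in words, so the text is scanned once and the cost does not depend on how many characters each word has.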

    Thread Starter requin1989

    (@requin1989)

    Dear Mikko,
    indeed, this seems to be the issue! Disabling custom excerpts makes the search considerably faster. Unfortunately, the excerpt length was already set to “30 words”, so I have to deactivate custom excerpts completely.
    Is there maybe some kind of workaround to make it work? These custom excerpts are a really convenient feature… it would be a shame to have to do without them.
    In any case, thank you already for your help! 🙂
    Requin

    Plugin Author Mikko Saari

    (@msaari)

    Using words is much faster than using characters, but as I found out in my tests, it can still take 1.5 seconds to create a 30-word excerpt from a 200,000-word post. Not a problem if that’s the only result, but if you have 30 posts like that, you’re still looking at 45 seconds. It’s not the 18 minutes it would be with 300-character excerpts, but it’s still too much.
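The arithmetic behind those estimates can be written out as a short Python sketch. It uses only the timings quoted in this thread and assumes excerpts are built one after another, so the total grows linearly with the number of long posts in the results; the function name is made up for the example.

```python
# Timings quoted in this thread (measured on a 200,000-word test post).
SECONDS_PER_WORD_EXCERPT = 1.5   # one 30-word excerpt
SECONDS_PER_CHAR_EXCERPT = 37.0  # one 300-character excerpt


def total_excerpt_seconds(seconds_per_post: float, long_posts: int) -> float:
    """Assume excerpts are built sequentially, so the cost adds up linearly."""
    return seconds_per_post * long_posts


word_total = total_excerpt_seconds(SECONDS_PER_WORD_EXCERPT, 30)
char_total = total_excerpt_seconds(SECONDS_PER_CHAR_EXCERPT, 30)

print(f"word-based: {word_total:.0f} s")
print(f"character-based: {char_total / 60:.1f} min")
```

With 30 long posts this gives 45 seconds for word-based excerpts and 18.5 minutes for character-based ones. Either is enough to hit a typical web server timeout.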

    If you want to help me a bit, here’s some improved code: https://gist.github.com/msaari/a6d97668cadcebcc80a90f8cd843868d

    This is a replacement for the relevanssi_create_excerpt() function that should create excerpts faster. It may slightly reduce the quality of excerpts, but if it makes excerpts possible, then that’s a bonus, right?

    Thread Starter requin1989

    (@requin1989)

    Hi Mikko,

    I was just about to post, but then I saw you replied first.
    I solved it for now by trimming the post content to about 20,000 characters. This should be enough to capture the table of contents with the relevant keywords, so the documents can still be found in the search.
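For anyone wanting to do the same, the trimming step could look roughly like this. This is a minimal Python sketch, not the code actually used on the site; the function name `trim_extracted_text` and the choice to cut at the last space before the limit are assumptions made for the example.

```python
def trim_extracted_text(text: str, max_chars: int = 20_000) -> str:
    """Cut `text` to at most `max_chars` characters, preferably at a word boundary.

    Hypothetical helper for shortening extracted PDF text before it is
    placed in the post content.
    """
    if len(text) <= max_chars:
        return text
    # Look for the last space at or before the limit so no word is cut in half.
    cut = text.rfind(" ", 0, max_chars + 1)
    if cut <= 0:
        # No usable space found: fall back to a hard cut at the limit.
        cut = max_chars
    return text[:cut].rstrip()


# Usage: shorten a long dummy text to at most 20,000 characters.
long_text = "table of contents " * 5_000
trimmed = trim_extracted_text(long_text)
print(len(trimmed))
```

Cutting at a word boundary avoids leaving a half word at the end that would be indexed as a meaningless term.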

    Nevertheless, I’ll try your improved code to make it even faster! I’ll tell you if there are any issues with it.

    Thank you for your great support! 🙂
    Requin


The topic ‘Relevanssi – performance problem’ is closed to new replies.