• Resolved CMProduct

    (@ddapparel)


    Hello,

I’m hoping you can help. I’ve been testing for a while now, trying to figure out what the culprit is.

My problem? The crawler will run through the sitemap, but it won’t create a ‘hit’ on the pages it crawled. On first load a page is a ‘miss’, but if I reload it, it then says ‘hit’. So LiteSpeed Cache works, just not with the crawler. I enabled logs and found one line saying this: forced no cache [reason] DONOTCACHEPAGE const. So I added add_filter( 'litespeed_const_DONOTCACHEPAGE', '__return_false' ); to my functions.php file. That line then disappeared from the log, but otherwise nothing changed. In the logs (debug and crawler.log) I cannot find anything else saying why it won’t cache when crawling.
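For clarity, this is exactly what I added to functions.php (with straight quotes – the forum converts them to curly ones):

```php
// In the active theme's functions.php: tell LiteSpeed Cache to ignore
// the DONOTCACHEPAGE constant so it no longer forces pages to no-cache.
add_filter( 'litespeed_const_DONOTCACHEPAGE', '__return_false' );
```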

The logs (I cleared all logs + cache, ran the crawler, then downloaded the logs right away, so they should cover as closely as possible only the crawler run):
    crawler.log: https://pastebin.com/Ea7AgHLy
    debug.log: https://pastebin.com/ZZvYDMyU

    The weird part (to me)?
    The crawler log will say this: “X-LiteSpeed-Cache-Control: public,max-age=2592000”
    The debug log will say this: “X-LiteSpeed-Cache-Control: no-cache”

Can you think of any reason why the crawler says caching is happening when it really isn’t?

    Some things I have tried:

• I disabled all caching through Cloudflare, so it now says “DYNAMIC”
• I then enabled QUIC.cloud CDN, which is fully functional and has replaced Cloudflare (I’m using Cloudflare for DNS only)
• I tried different presets in LiteSpeed (I have gotten “Basic” to work a few times, but as soon as I change any LiteSpeed setting, the same issue recurs)
• I disabled most of my plugins (definitely anything that could mess with caching)
• I changed the crawler “Delay” and “Timeout” settings to different values

Maybe there is something in the logs that you can see that I am missing? I honestly don’t know too much about what I’m looking at, so maybe you’ll spot it, since you know what it’s supposed to look like.

Thank you for your time, it’s much appreciated.

Viewing 10 replies - 1 through 10 (of 10 total)
  • Plugin Support qtwrk

    (@qtwrk)

because you have a vary cookie set up:

    02/08/24 01:00:59.652 [66.249.66.13:12348 1 v6R] LSCACHE_VARY_COOKIE: woodmart_recently_viewed_products

and the no-cache you see was coming from an admin-ajax.php request or something like that, which is meant to be no-cache
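if that cookie was added through the plugin's API, it probably looks something like this (a guess at the hook – I'm assuming the litespeed_vary_cookies filter here):

```php
// Hypothetical example: registering a vary cookie like this tells LSCache
// to keep a separate cache copy per cookie value. The crawler's requests
// don't carry the visitor's cookie, so the copies it warms up never match
// a real visitor's request - hence 'miss' on first load.
add_filter( 'litespeed_vary_cookies', function ( $cookies ) {
    $cookies[] = 'woodmart_recently_viewed_products';
    return $cookies;
} );
```

removing that hook (or the matching setting) lets the crawler warm the shared public cache again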

    Thread Starter CMProduct

    (@ddapparel)

    Hi @qtwrk

Thank you for your quick response! And you’re right – that was the problem. I remember I added that vary cookie because of another crawling issue, but didn’t realize it broke the crawler in this way. An oversight on my part – my apologies, I should have noticed.

While I have you here, could you help with one other, somewhat related thing? The vary cookie I had set up was for a recently-viewed-products block. I noticed that when a page is crawled, that block (which is in the footer of every page) doesn’t show the product images – it shows a data:image/jpg;base64 placeholder instead (I tried turning off lazy load and the responsive placeholder), which leaves just whitespace. That’s why I added the cache vary cookie, thinking it would fix this – but I won’t do that if it breaks the crawler. So maybe this is where ESI comes in? I tried [esi blockname], but it sort of breaks the block, and even where it does work, the image is still a base64 (whitespace only). I also tried a cookie exclude (in LiteSpeed Cache > Excludes), but that just stops the whole page from caching (the header says no-cache, esi=on) – so I guess that won’t work either? Is there any way to crawl/cache the pages with that block on them but not cache the block itself?
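For reference, this is roughly the ESI shortcode I tried, with the extra attributes I found in the LiteSpeed docs (blockname is a stand-in for the theme’s actual shortcode, and I may be misreading how cache/ttl are meant to be used):

```
[esi blockname cache='private,no-vary' ttl='0']
```

From what I understand, cache='private' is supposed to serve the block per visitor while the rest of the page stays public, and ttl='0' would skip caching the block entirely.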

    Thank you again in advance for your time.

    Plugin Support qtwrk

    (@qtwrk)

please provide the report number, you can get it in Toolbox -> Report -> click “Send to LiteSpeed”

    Thread Starter CMProduct

    (@ddapparel)

Sorry, I’m a stickler for security and would rather not have IPs and such sent off to some system – I hope you understand. I have created a pastebin with my server IP, API keys, and domain omitted; everything else is exactly as the report would have sent it.

    https://pastebin.com/x1QJrXRy

    Thank you for your time

    Plugin Support qtwrk

    (@qtwrk)

hmmm, I kind of need to look at the page itself, at least

    Thread Starter CMProduct

    (@ddapparel)

I geoblock using Cloudflare, but I can temporarily unblock an IP or country. Or, if you’d prefer, you could use a VPN? If so, you’d have to choose a server in the US. (If you’re already in the US, you’re good to go.)

https://pastebin.com/Nzg4LGg9 – I’m using pastebin because I’ve posted my domain in the past and had quite a few extra bots hit my site afterward.

    Plugin Support qtwrk

    (@qtwrk)

hm, my US VPN is dead slow connecting from my physical location – perhaps you could temporarily allow ES for a day or two?

    Thread Starter CMProduct

    (@ddapparel)

    Spain? If so, I have unblocked it. I appreciate your time on this.

    Plugin Support qtwrk

    (@qtwrk)

well, for that recently-viewed block, I don’t really have a good solution – such dynamic content is the most cache-unfriendly thing there is

ESI may not be compatible in certain cases. If ESI is a no-go, then I don’t have any other idea; probably the only way is to ask the theme dev to make it AJAX-loaded, which will bypass the cache entirely
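a rough sketch of what that theme-side change could look like (the names here are made up – the real implementation is up to the theme dev):

```php
// Hypothetical sketch: serve the recently-viewed block via admin-ajax.php,
// which is no-cache by design, so the cached page can ship a placeholder
// and the per-visitor content stays out of the page cache entirely.
add_action( 'wp_ajax_recently_viewed_block', 'theme_recently_viewed_block' );
add_action( 'wp_ajax_nopriv_recently_viewed_block', 'theme_recently_viewed_block' );

function theme_recently_viewed_block() {
    // Read the per-visitor cookie here instead of varying the whole
    // page cache on it.
    $ids = isset( $_COOKIE['woodmart_recently_viewed_products'] )
        ? sanitize_text_field( wp_unslash( $_COOKIE['woodmart_recently_viewed_products'] ) )
        : '';
    // ... theme renders the block HTML from $ids and echoes it ...
    wp_die();
}
```

then a small script in the footer fetches that endpoint after page load and injects the HTML into the placeholder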

    Thread Starter CMProduct

    (@ddapparel)

Hm, I see. Yeah, when I put [esi blockname] in, that recents block doesn’t seem to fully function afterwards – so I guess it’s not compatible, as you say. And unfortunately, I have already enabled AJAX on that block (it has that option).

    I’ll try reaching out to the theme dev and see if maybe they can help somehow to make the block more compatible with caching.

Well, thank you very much for your time looking into this – it is very much appreciated.

  • The topic ‘Crawler Log: Says page cached / Debug Log: says no-cache. Page is ‘miss’’ is closed to new replies.