WordPress.org

W3 Total Cache
W3 Total Cache - Page cache not preloading (35 posts)

  1. unixpedant
    Member
    Posted 3 years ago #

    Hi,
    I have cache preloading turned on with the default settings, and a sitemap.xml in the WordPress directory, but the cache is not preloading at all. Any ideas? Is this feature logged anywhere?

    Jim.

    http://wordpress.org/extend/plugins/w3-total-cache/

  2. Frederick Townes
    Member
    Plugin Author

    Posted 3 years ago #

    How do you know for sure that it is not working? It starts working according to the cron interval.

  3. unixpedant
    Member
    Posted 3 years ago #

    It is easy to see that the preload is not working because my site is new and traffic is almost zero. When I browse the site, no page is cached until its second visit. Also, page cache debug is turned on and reports each page as not cached until the second visit. I can also see the (general lack of) pages in wp-content/w3tc/pgcache.

    You are saying it works off cron? Where is that documented? I checked the cron log and there is no sign of a W3 related job triggering at any time.

  4. wallyO
    Member
    Posted 3 years ago #

    I too was having trouble with cache preloading.
    I checked the sitemap.xml location and its well-formedness, set the preloading interval to 30 seconds (to make it happen soon), then saved.
    I checked the page cache for the next few minutes with an FTP program, and no cached pages were created in \wp-content\w3tc\pgcache\.
    After researching wp_cron, I found that it is triggered by any page load (which makes sense; everything needs a trigger).
    So I called the home page in a browser.
    Sure enough, 10 pages appeared in \wp-content\w3tc\pgcache\.
    I imagine that by setting up a cron job in your hosting panel to hit the file [ABSOLUTE PATH TO WEBROOT]\wp-cron.php, the page cache could be populated x pages per cron run, even without site visitors to trigger it.
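
    The idea of hitting wp-cron.php without a visitor can be sketched in a few lines of Python. This is only an illustration, assuming Python 3 and that wp-cron.php is reachable over HTTP at the site root; the function names are made up for this example, and example.com is a placeholder.

```python
import urllib.request


def wp_cron_url(site_url):
    """Build the wp-cron.php URL for a site root such as http://example.com/."""
    return site_url.rstrip("/") + "/wp-cron.php"


def trigger_wp_cron(site_url, timeout=30):
    """Request wp-cron.php so that any due scheduled tasks get a chance to run.

    Returns the HTTP status code of the response.
    """
    with urllib.request.urlopen(wp_cron_url(site_url), timeout=timeout) as response:
        return response.status


# Usage (placeholder address, replace with the real site):
#     trigger_wp_cron("http://example.com")
```

    Scheduling this script from the hosting account's crontab would fire WP-Cron on a fixed interval whether or not the site has traffic.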

    Huge thank you to Frederick for creating such an awesome plugin.
    I struggle to think of any feature which would make it better.

  5. michael
    Member
    Posted 3 years ago #

    I'm having the same issue. I have a "semi-static" home page where I have posts that are being pulled in random order. I also have a script where it cycles through a set of quotes.

    I have cache preload setting "automatically prime the page cache" enabled, 900 seconds, pages per interval 10 pages.

    I verified my sitemap: http://fearlessflyer.com/sitemap.xml to be valid.

    I also enabled "HTTP compression" and "Expires header lifetime" via Browser Cache settings (per note in the page cache settings page).

    Now, the reason that I can tell that it's not working is after an hour or so - the random post / random quote / random testimonial - is still the same order.

    Am I missing something? my website is: http://fearlessflyer.com/

  6. michael
    Member
    Posted 3 years ago #

    Discard my last entry - I discovered that it is flushing the cached pages. It just took longer than I was expecting.

    Thanks for an awesome plugin.

  7. Frederick Townes
    Member
    Plugin Author

    Posted 3 years ago #

    Unfortunately, there are no perfect settings to ensure that an entire blog is preloaded without risking that all server resources are exhausted to get the job done. That is, WP Cron is used to do this and, as a result, a certain ratio of traffic to cached pages must occur in order for enough "attempts" to prime the cache to happen. There's no ideal algorithm that works for all servers, although the settings can be tuned. If someone has a better algorithm, I'm all ears; for now I don't see a way to implement this without creating other liabilities. For that reason the sitemap was used, with the ability to set priorities, so that reasonable amounts of control are possible.
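
    To make the "pages per interval" and sitemap priority idea concrete, here is a rough sketch of how one batch could be chosen per cron run. It is not W3 Total Cache's actual implementation; the function name, the offset bookkeeping, and the tie-breaking by URL are all assumptions made for illustration.

```python
def next_batch(urlmap, offset, pages_per_interval):
    """Return the next slice of URLs, highest sitemap priority first.

    urlmap maps each URL to its priority as a string such as '0.8'.
    offset is how many URLs earlier runs have already handled.
    """
    # Sort by descending priority; break ties by URL so the order is stable.
    ordered = sorted(urlmap, key=lambda url: (-float(urlmap[url]), url))
    return ordered[offset:offset + pages_per_interval]
```

    Each cron run would process one batch and advance the offset, so a low-traffic site needs many runs before every page has been primed.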

  8. wallyO
    Member
    Posted 3 years ago #

    A number of separate issues that all affect w3 cache preload are being discussed in this thread.
    1). Firing of wp-cron.
    This can be controlled by hitting wp-cron with an http request from the hosting account's crontab or by using an external spider application.
    */30 * * * * wget -O - http://domain.com/wp-cron.php > /dev/null 2>&1
    will fire wp-cron every 30 minutes whether the site has traffic or not.
    This has no liability. If wp-cron has tasks scheduled, they will run in the same way as if a website visitor had triggered them. This would have to be set up independently of the w3tc plugin.
    2). Pages missing from cache because garbage collection has deleted them and w3tc is waiting for next cron run to rebuild them.
    This could be eliminated if the w3tc plugin first built a path to the cached page, then deleted it just before the HTTP request that rebuilds it. This would have a very small resource liability, which could be further minimised by adding a sleep after each HTTP request.
    3). Garbage collection only deletes files whose HTML Expires header lifetime has expired (this is set on a different page from cache preload but has a huge effect on it).
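
    The "sleep after each HTTP request" suggestion in point 2 could look roughly like the sketch below. It is a generic throttled loop, not code from the plugin; `prime_urls` and its parameters are hypothetical, and the fetch function is passed in so that any HTTP client can be used.

```python
import time


def prime_urls(urls, fetch, delay_seconds=1.0, sleep=time.sleep):
    """Fetch each URL in turn, pausing between requests to limit server load.

    fetch is any callable that takes a URL. Failures are collected and
    returned as (url, message) pairs instead of stopping the run.
    """
    failures = []
    for index, url in enumerate(urls):
        if index > 0:
            # Pause before every request except the first one.
            sleep(delay_seconds)
        try:
            fetch(url)
        except Exception as error:
            failures.append((url, str(error)))
    return failures
```

    The pause trades total run time for lower peak load; whether that trade-off is acceptable depends on the server.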

  9. Frederick Townes
    Member
    Plugin Author

    Posted 3 years ago #

    Adding a sleep operation to scripts is actually non-performant. Lots of these points sound reasonable, but they do not scale.

  10. Patrick Mylund Nielsen
    Member
    Posted 3 years ago #

    I too pondered this a bit and it seemed like the ideal would be to do the priming outside PHP, so I ended up writing a Python script, Optimus Cache Prime, which does essentially the same as the W3 priming feature, except for a few things: It supports throttling (to my knowledge, PHP can't really do this without blocking other requests), but most importantly it checks the state of your static file cache before making any requests, so only the pages that aren't already in the cache get crawled.
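
    The "check the static file cache before making any requests" step can be illustrated as follows. This is not OCP's code. It assumes a hypothetical layout in which each URL path maps to a directory under the cache folder containing a file named _index.html; the real layout should be checked on the server.

```python
import os
from urllib.parse import urlparse


def cache_file_for(url, cache_dir, index_name="_index.html"):
    """Map a URL to the file where its cached copy is assumed to live."""
    parts = [part for part in urlparse(url).path.split("/") if part]
    return os.path.join(cache_dir, *(parts + [index_name]))


def uncached_urls(urls, cache_dir):
    """Return only the URLs that have no cache file on disk yet."""
    return [url for url in urls
            if not os.path.isfile(cache_file_for(url, cache_dir))]
```

    Only the URLs returned by `uncached_urls` would need an HTTP request, which is why a mostly primed cache can be checked quickly.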

    After doing some testing, it seems that OCP can check around 10,000 pages per second with WordPress and W3 Total Cache, assuming that the cache is already mostly primed (if not, there would be a lot of requests).

    You need a server with Python and the ability to run a script to use it. Hopefully someone will find a use for it :)

  11. Avin
    Member
    Posted 3 years ago #

    When I try to run the script with Python, I get an error at line 113:

    urlmap[url.text] = prio.text if prio is not None else '0.0'

    I have read about OCP and it's just FANTASTIC. Our website has more than 200,000 posts that need to be preloaded, and with OCP our problem is solved, so please tell me why I get this error.

    Thank you

  12. Patrick Mylund Nielsen
    Member
    Posted 3 years ago #

    Hi Avin!

    Glad to hear it will be useful for you :)

    Could you tell me the URL of your sitemap and your Python version ('python --version')? I will try to recreate the problem.

    Feel free to email it to contact at pmylund dot com

  13. Avin
    Member
    Posted 3 years ago #

    Hello,

    Thank you for the answer. My Python version is 2.4.3, and for the sitemap we use XML Sitemap, a WordPress plugin.
    Thank you

  14. Patrick Mylund Nielsen
    Member
    Posted 3 years ago #

    Hi Avin,

    I will test it, but if you have any possibility of upgrading to Python 2.5, 2.6 or 2.7 (http://python.org/download/), that would be an easy solution. Python 2.4 is very old. (It's from 2004.)

  15. Avin
    Member
    Posted 3 years ago #

    Hello,
    We use HG servers, and we will ask if they can update our Python. Do you think this is why I get the error?
    Thank you again for the answer.

  16. Patrick Mylund Nielsen
    Member
    Posted 3 years ago #

    Hi Avin,

    I just tested with Python 2.4, and got the same error. I fixed the line mentioned, but the module for parsing the XML (ElementTree) didn't exist in Python 2.4, so it won't work without major alterations.

    I am almost completely sure it would work if you tried with Python 2.5-2.7. If you are able to compile applications, you can compile the Python sources without root access.
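
    For context, both things that fail here arrived in Python 2.5: the conditional expression (`a if condition else b`) used on the reported line, and ElementTree in the standard library. A minimal sketch of the same kind of sitemap parsing, written for Python 3 and not taken from ocp.py (the function name and namespace constant are this example's own), looks like this:

```python
import xml.etree.ElementTree as ET

# Namespace used by standard sitemap files.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def parse_sitemap(xml_text):
    """Map each <loc> URL in a sitemap to its <priority>, defaulting to '0.0'."""
    urlmap = {}
    root = ET.fromstring(xml_text)
    for entry in root.iter(SITEMAP_NS + "url"):
        url = entry.find(SITEMAP_NS + "loc")
        prio = entry.find(SITEMAP_NS + "priority")
        if url is None or url.text is None:
            continue
        # Conditional expression: a syntax error on Python 2.4.
        has_prio = prio is not None and prio.text
        urlmap[url.text.strip()] = prio.text.strip() if has_prio else "0.0"
    return urlmap
```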

    Please let me know how it goes :)

  17. Avin
    Member
    Posted 3 years ago #

    Hello,

    We have sent a request to HG and i am sure they will answer in several minutes.
    If OCP works as described, it will be the best solution for us: we update more than 400 posts daily and have 200,000+ posts in our DB.

    Thank you again; I will let you know of any changes.

  18. Avin
    Member
    Posted 3 years ago #

    Hello,

    We installed Python 2.7. The site we are testing is:
    http://www.noa.al/noa-re/wordpress

    I have set up a cron job that runs the script every 5 minutes, just to test it.

    Here is the script: http://noa.al/avin/ocp.py

    I tried it with the "local" option and got some errors, so I changed it to "local = False".

    There were other errors with the XML sitemap. The sitemap URL is:

    http://noa.al/noa-re/wordpress/sitemap.xml, but I got another error, so I changed it to: http://noa.al/noa-re/wordpress/sitemap.xml.gz

    But I see that the script is not uploading the files as fast as you said.
    Have I done something wrong?
    Thank you

  19. Patrick Mylund Nielsen
    Member
    Posted 3 years ago #

    Hi Avin,

    You must use Local mode -- what makes it fast is that it's checking the files on the disk instead of the pages via web requests. What error do you get when using local mode?

    Make sure that ocp.py can read the files in the 'pgcache' folder, by the way -- for example by running 'sudo -u www-data ocp.py'.

  20. Avin
    Member
    Posted 3 years ago #

    Hello

    Thank you for the answer. To run the script I use the cron job.
    As I said, I have installed the other version of Python, 2.7.
    The XMLMAP path is correct; I have replaced var/www/ with http://wwww,
    and the same for the local path of pgcache. You can check the pgcache dir yourself; the URL is here:
    http://noa.al/noa-re/wordpress/wp-content/w3tc/pgcache/

    Please take a look at the OCP.PY file. Is it all OK?
    Here is the error: 'The folder /var/www/noa.al/noa-re/wordpress/wp-content/w3tc/pgcache doesn't seem to exist. Please ensure that cache_dir points to the base of the local file cache.'

    Thnx again.

  21. Patrick Mylund Nielsen
    Member
    Posted 3 years ago #

    Hi Avin,

    It looks correct.

    It sounds like ocp.py doesn't have access to /var/www/noa.al/noa-re/wordpress/wp-content/w3tc/pgcache or /var/www/noa.al/noa-re/wordpress/sitemap.xml, which would explain why it can't find the sitemap or the cache files on the disk. You should run ocp.py as a user that can access those folders and files -- root or, preferably, the file/folder owner, e.g. "www", "www-data" (check 'ls -la' in /var/www) -- THEN it should work ;)
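
    A quick way to tell a wrong path from a permissions problem is to test the folder as the same user that runs the cron job. The helper below is a hypothetical diagnostic, not part of ocp.py:

```python
import os


def diagnose_dir(path):
    """Return a short diagnosis of whether the current user can read a directory."""
    # Note: os.path.exists also returns False when a parent directory
    # cannot be traversed, so "missing" may hide a permissions problem.
    if not os.path.exists(path):
        return "missing"
    if not os.path.isdir(path):
        return "not a directory"
    # Listing a directory needs both read and execute permission on it.
    if not os.access(path, os.R_OK | os.X_OK):
        return "permission denied"
    return "ok"
```

    If it reports "missing" even when run as root, the path itself is most likely wrong rather than the permissions.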

  22. Avin
    Member
    Posted 3 years ago #

    hello,
    I haven't understood it well. Could you please tell me the steps, if you can?
    Thank you

  23. Patrick Mylund Nielsen
    Member
    Posted 3 years ago #

    Hi Avin,

    It's hard to say without knowing what kind of access you have to the server, and what kind of server it is, but the problem is that the user running the cron job cannot access the files in /var/www, which means ocp.py can't access them. How are you setting up the cron job? Can you specify that it should be run as an administrator? Do you have shell access?

    HG should be able to fix it easily :) It doesn't sound like a problem with OCP, just basic file permissions. You don't need to change the files or folders, ocp.py just needs to be run as the correct user. (Usually the user that owns the files in /var/www is www-data, apache, nginx, or similar.)

    Sorry I can't help more.

  24. Avin
    Member
    Posted 3 years ago #

    Hello,

    OK, I will try to configure it. Thank you; I will let you know whether it works.

  25. Avin
    Member
    Posted 3 years ago #

    hello

    this is the answer

    bash-3.2$ /usr/bin/python2.7 ./ocp.py
    The folder /var/www/noa.al/noa-re/wordpress/wp-content/w3tc/pgcache doesn't seem to exist. Please ensure that cache_dir points to the base of the local file cache.

    They tried with the root user. Are you sure there is nothing wrong with your script?

    Thank you

  26. Patrick Mylund Nielsen
    Member
    Posted 3 years ago #

    Hi Avin,

    Sorry if this is a stupid question, but IS the website actually stored in /var/www?

    What result do you get if you run the command find / -name "sitemap.xml" as root?

  27. Avin
    Member
    Posted 3 years ago #

    hello

    this is their answer...

    The convention of cPanel is to separate the document root of each domain by owner. For example, if user 'fuber' has a primary domain of 'bazoids.com' for its cPanel, then the document root of 'bazoids.com' will be /home/fuber/public_html. Subdomains and addon domains usually occupy subfolders of public_html. So frob.bazoids.com will live in a folder like /home/fuber/public_html/frob. If you need to know which user owns a domain, cPanel provides a script for this purpose. The syntax is:
    /scripts/whoowns domain.com

  28. Patrick Mylund Nielsen
    Member
    Posted 3 years ago #

    Hi Avin,

    Then it seems if you change the paths to the sitemap and pgcache in ocp.py from /var/www/noa.al/noa-re/... to /home/yourusername/public_html/noa-re/..., it will work. If you get the same error message as before -- that it can't find the folder -- it's probably not the right path.

    If you go to WordPress options -> W3 Total Cache -> Install, does it tell you the path under "Rewrite rules"?

  29. Avin
    Member
    Posted 3 years ago #

    heh, now a huge error

    Couldn't crawl http://noa.al/noa-re/wordpress/2009/10/qeveria-gruevski-distancohet-nga-enciklopedia-ristevski-te-jape-doreheqje/. Error: HTTP Error 404: Not Found
    Couldn't crawl http://noa.al/noa-re/wordpress/2010/11/ruci-berishes-kudo-dhe-kurdo-ti-flet-vetem-per-opoziten-dhe-edi-ramen/. Error: HTTP Error 500: Internal Server Error
    Couldn't crawl http://noa.al/noa-re/wordpress/2009/11/ps-shqiperia-ne-nje-krize-qe-po-thellohet-dita-dites/. Error: HTTP Error 404: Not Found

    [Moderator: Cut the rest because it was breaking the forum page]

  30. Avin
    Member
    Posted 3 years ago #

    Is there any way to make the script cache all the latest posts first, and then move on to the older ones?

Topic Closed

This topic has been closed to new replies.
