404 for Sitemap & robots.txt with Browser Cache on

  • I’m getting 404 errors for sitemap.xml and robots.txt whenever Browser Cache is turned on. I’m using the latest versions of W3TC and WordPress, and version 4.0b11 of Arne Brachhold’s Google XML Sitemaps plugin (http://www.arnebrachhold.de/projects/wordpress-plugins/google-xml-sitemaps-generator/). My site runs on Nginx 1.5.2, using the configuration generated by W3TC for Nginx.

    The default 404 exception rules for W3TC are in use:

    robots\.txt
    sitemap(_index)?\.xml(\.gz)?
    [a-z0-9_\-]+-sitemap([0-9]+)?\.xml(\.gz)?
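
    Both request paths do match those exception patterns; here is a quick sanity check with grep (just a sketch, testing the regexes outside of W3TC):

    $ echo "/robots.txt" | grep -E 'robots\.txt'
    /robots.txt
    $ echo "/sitemap.xml" | grep -E 'sitemap(_index)?\.xml(\.gz)?'
    /sitemap.xml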

    Here is the result of a curl header check for sitemap.xml:

    $ curl -I http://christiaanconover.com/sitemap.xml
    HTTP/1.1 404 Not Found
    Server: nginx/1.5.2
    Date: Thu, 04 Jul 2013 00:47:38 GMT
    Content-Type: application/xml; charset=utf-8
    Connection: keep-alive
    X-Pingback: http://christiaanconover.com/xmlrpc.php
    X-Powered-By: W3 Total Cache/0.9.2.11
    X-W3TC-Minify: On
    X-Robots-Tag: noindex

    Turning off Browser Cache then yields the following headers:

    $ curl -I http://christiaanconover.com/sitemap.xml
    HTTP/1.1 200 OK
    Server: nginx/1.5.2
    Date: Thu, 04 Jul 2013 00:48:05 GMT
    Content-Type: application/xml; charset=utf-8
    Connection: keep-alive
    X-Powered-By: PHP/5.4.9-4ubuntu2.1
    X-Pingback: http://christiaanconover.com/xmlrpc.php
    X-W3TC-Minify: On
    X-Robots-Tag: noindex
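
    To re-test both URLs in one go while toggling settings, a small convenience sketch that prints only the status codes:

    $ for path in /robots.txt /sitemap.xml; do curl -s -o /dev/null -w "$path: %{http_code}\n" "http://christiaanconover.com$path"; done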

    Any suggestions?

    http://wordpress.org/extend/plugins/w3-total-cache/

  • Could be a bug in W3TC. To work around it, you can hardcode location blocks for those URLs in your nginx configuration, like this…

    server {
      server_name domainname.com;
      # other directives

      # Exact-match ("=") locations are checked before regex locations,
      # so these requests should bypass the regex-based W3TC rules.
      # Serve the file if it exists on disk, otherwise hand the request
      # to WordPress.
      location = /robots.txt {
        try_files $uri /index.php;
      }

      location = /sitemap.xml {
        try_files $uri /index.php;
      }

      location = /sitemap.xml.gz {
        try_files $uri /index.php;
      }
    }
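
    After adding those blocks, test and reload nginx, then re-check the headers (a sketch; the reload command assumes you run nginx directly rather than through a service manager):

    $ sudo nginx -t && sudo nginx -s reload
    $ curl -I http://christiaanconover.com/sitemap.xml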

    I hope this helps.

  • I have a global try_files rule that already serves these files correctly, but with Browser Cache turned on the request returns a 404; with it off, it returns a 200.

  • To quote the official nginx wiki on how location blocks are matched (the order in which location directives are checked):

    1. Directives with the “=” prefix that match the query exactly (literal string). If found, searching stops.
    2. All remaining directives with conventional strings. If this match used the “^~” prefix, searching stops.
    3. Regular expressions, in the order they are defined in the configuration file.
    4. If #3 yielded a match, that result is used. Otherwise, the match from #2 is used.

    The regex-based rules from W3TC may have taken precedence over your global try_files rule, which is why I suggested the exact-match location blocks above. I hope that makes sense.
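
    One way to see which location blocks could be competing for those requests (a sketch, assuming your configuration lives under /etc/nginx/) is to list them in the order they appear:

    $ grep -Rn 'location' /etc/nginx/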

  • Ah, I’m following you now. I’ll try that, thanks!

  • The topic ‘404 for Sitemap & robots.txt with Browser Cache on’ is closed to new replies.