WordPress.org

[Plugin: WP Super Cache] apache_request_headers very selfish for non apache php builds (13 posts)

  1. unncola
    Member
    Posted 5 years ago #

    If PHP is built as a CGI, apache_request_headers() is unavailable, which disables user-agent filtering. There is a simple fix for this:

    if( !function_exists('apache_request_headers') ) {
    	// Fallback for PHP builds without the Apache module (e.g. CGI):
    	// rebuild the request headers from the HTTP_* entries in $_SERVER.
    	function apache_request_headers() {
    		$arh = array();
    		$rx_http = '/\AHTTP_/';
    		foreach($_SERVER as $key => $val) {
    			if( preg_match($rx_http, $key) ) {
    				// Strip the HTTP_ prefix, e.g. HTTP_USER_AGENT -> USER_AGENT.
    				$arh_key = preg_replace($rx_http, '', $key);
    				$rx_matches = explode('_', $arh_key);
    				if( count($rx_matches) > 0 and strlen($arh_key) > 2 ) {
    					// Convert to header case, e.g. USER_AGENT -> User-Agent.
    					foreach($rx_matches as $ak_key => $ak_val) $rx_matches[$ak_key] = ucfirst(strtolower($ak_val));
    					$arh_key = implode('-', $rx_matches);
    				}
    				$arh[$arh_key] = $val;
    			}
    		}
    		return( $arh );
    	}
    }

    In addition to this fix, it would be nice to have an option to serve fresh pages to the rejected user agents. There is no reason Googlebot and other crawlers shouldn't see the newest version of the site anyway.

  2. Donncha O Caoimh
    Member
    Posted 5 years ago #

    Thanks for that patch, I'll add it in the next release!

    If you want Googlebot and other search engines to see the newest pages, you'll have to add their user agents to the checks in the mod_rewrite rules. That could become unwieldy, and it's not something everyone will want to do anyway.
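
    As a rough illustration of why this can grow unwieldy, the condition below rejects several crawlers at once. The Slurp and msnbot tokens and the [NC] flag are assumptions added for the example, not something taken from the plugin; the line would sit alongside the other RewriteCond lines that guard the cached files.

    ```apache
    # Hypothetical sketch: skip the static cache for several crawlers.
    # Every additional bot means another alternative in this pattern.
    RewriteCond %{HTTP_USER_AGENT} !(Googlebot|Slurp|msnbot) [NC]
    ```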

  3. Donncha O Caoimh
    Member
    Posted 5 years ago #

    Added a similar chunk of code from the php manual (with a fix to the second substr) that doesn't use preg_match, but thank you very much for the inspiration!

  4. unncola
    Member
    Posted 5 years ago #

    Donncha, could you elaborate on how you would do the user-agent checking before mod_rewrite?

    Something like....

    RewriteCond %{HTTP_USER_AGENT} ^Googlebot/.*

    But how do you send it to the live pages? Add WordPress's rewrite rules after it? Something like this:

    <IfModule mod_rewrite.c>
    RewriteEngine On
    RewriteBase /
    RewriteCond %{HTTP_USER_AGENT}  ^Googlebot/.*
    RewriteCond %{REQUEST_FILENAME} !-f
    RewriteCond %{REQUEST_FILENAME} !-d
    RewriteRule . /index.php [L]
    </IfModule>
    
    # BEGIN WPSuperCache
    <IfModule mod_rewrite.c>
    RewriteEngine On
    RewriteBase /
    RewriteCond %{QUERY_STRING} !.*s=.*
    RewriteCond %{HTTP_COOKIE} !^.*(comment_author_|wordpress|wp-postpass_).*$
    RewriteCond %{HTTP:Accept-Encoding} gzip
    RewriteCond %{DOCUMENT_ROOT}/wp-content/cache/supercache/%{HTTP_HOST}/$1/index.html.gz -f
    RewriteRule ^(.*) /wp-content/cache/supercache/%{HTTP_HOST}/$1/index.html.gz [L]
    
    RewriteCond %{QUERY_STRING} !.*s=.*
    RewriteCond %{HTTP_COOKIE} !^.*(comment_author_|wordpress|wp-postpass_).*$
    RewriteCond %{DOCUMENT_ROOT}/wp-content/cache/supercache/%{HTTP_HOST}/$1/index.html -f
    RewriteRule ^(.*) /wp-content/cache/supercache/%{HTTP_HOST}/$1/index.html [L]
    </IfModule>
    
    # END WPSuperCache
    # BEGIN WordPress
    <IfModule mod_rewrite.c>
    RewriteEngine On
    RewriteBase /
    RewriteCond %{REQUEST_FILENAME} !-f
    RewriteCond %{REQUEST_FILENAME} !-d
    RewriteRule . /index.php [L]
    </IfModule>
    
    # END WordPress

    Would wp-supercache even work with this .htaccess application?

  5. unncola
    Member
    Posted 5 years ago #

    I don't think it does, actually. I cleared the WordPress cache, switched my UA to Googlebot and was served fresh pages. I switched the UA back to the default and visited a few pages, which were then cached. When I switched back to Googlebot, I was served the cached pages. Am I missing something? I have both WP Cache and Super Cache enabled. It might work with Super Cache disabled, but I've got a pretty high-traffic site and I don't want to lose the gzip.

  6. unncola
    Member
    Posted 5 years ago #

    I've sort of figured it out, but I could use a hand. Basically, my rewrite stops requests from being rewritten to the supercache files. Is there a way to disable regular caching too? I'm assuming all the wp-cache-b5b1b16cf885f1022835ca349593ff0c.html type files are loaded from wp-cache-phase1.php. I'm looking for a way to incorporate wp_cache_user_agent_is_rejected() without white-screening my WP. Any suggestions or help would be awesome.

  7. Donncha O Caoimh
    Member
    Posted 5 years ago #

    Is "Googlebot" in the list of rejected user agents?

  8. unncola
    Member
    Posted 5 years ago #

    Yes sir it is.

  9. unncola
    Member
    Posted 5 years ago #

    Any thoughts?

  10. Donncha O Caoimh
    Member
    Posted 5 years ago #

    If the page is supercached as a static file, you need to explicitly disallow Googlebot. So, add your user-agent condition to each group of RewriteCond lines; that just makes it one more condition for serving cached content.

    You probably want this line instead:
    RewriteCond %{HTTP_USER_AGENT} !^Googlebot/.*
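
    A sketch of where that line might go, assuming the WPSuperCache block quoted earlier in the thread is otherwise unchanged. The condition is repeated because each RewriteRule only looks at the RewriteCond lines directly above it.

    ```apache
    # BEGIN WPSuperCache
    <IfModule mod_rewrite.c>
    RewriteEngine On
    RewriteBase /
    # Serve the gzipped static file only when the UA is not Googlebot
    RewriteCond %{HTTP_USER_AGENT} !^Googlebot/.*
    RewriteCond %{QUERY_STRING} !.*s=.*
    RewriteCond %{HTTP_COOKIE} !^.*(comment_author_|wordpress|wp-postpass_).*$
    RewriteCond %{HTTP:Accept-Encoding} gzip
    RewriteCond %{DOCUMENT_ROOT}/wp-content/cache/supercache/%{HTTP_HOST}/$1/index.html.gz -f
    RewriteRule ^(.*) /wp-content/cache/supercache/%{HTTP_HOST}/$1/index.html.gz [L]

    # Same check for the uncompressed static file
    RewriteCond %{HTTP_USER_AGENT} !^Googlebot/.*
    RewriteCond %{QUERY_STRING} !.*s=.*
    RewriteCond %{HTTP_COOKIE} !^.*(comment_author_|wordpress|wp-postpass_).*$
    RewriteCond %{DOCUMENT_ROOT}/wp-content/cache/supercache/%{HTTP_HOST}/$1/index.html -f
    RewriteRule ^(.*) /wp-content/cache/supercache/%{HTTP_HOST}/$1/index.html [L]
    </IfModule>
    # END WPSuperCache
    ```

    Note that the ^ anchor only matches user-agent strings that begin with "Googlebot/". Strings that merely contain the token, such as "Mozilla/5.0 (compatible; Googlebot/2.1; ...)", would need an unanchored pattern such as !Googlebot.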

  11. unncola
    Member
    Posted 5 years ago #

    Donncha, I think we're misunderstanding each other. I've already got that line in my htaccess. So I've disabled supercaching for whatever page. Since I'm telling the rewrite "if not this ua" they don't get shown the supercached version, which is great.

    My problem now is disabling regular caching for that same page, which the URL rewrite doesn't affect. I'm pretty sure the display of the non supercached page occurs in wp-cache-phase1.php but I'm not sure how to gracefully disable the loading.

    I tried this.... in phase1.php but it whitescreened my wp because I'm not sure how to tell it to load the live blog.

    if(wp_cache_user_agent_is_rejected())
    	return;

    Basically, I'm trying to disable the behavior described by the "Note that cached files are still sent to these request if they already exists." text in Settings > WP-Super-Cache, so that I can serve rejected user agents the live site.

  12. Donncha O Caoimh
    Member
    Posted 5 years ago #

    But do you have the exclamation mark, "!", in front of "^Googlebot"? That negates the condition, which means the supercache files will only be served if the UA isn't Googlebot.

  13. unncola
    Member
    Posted 5 years ago #

    Yeah, the problem isn't with turning supercache off. I've got it turned off with the rewrite. The problem is getting the live page to display, because it's being cached using the old-school wp-caching.
