
BulletProof Security
[resolved] WP Super Cache, custom code and 'RewriteEngine On' (18 posts)

  1. definitio
    Member
    Posted 1 year ago #

    I have noticed the following which I don't know whether it signifies a problem or not.

I've observed that the secure root .htaccess file has "RewriteEngine On" rather high in the file (preceding the rewrite rules, as expected), but when WP Super Cache is enabled, almost every instance of "RewriteEngine On" is removed (except the inactive one, which belongs to the hotlink-prevention code), and the WP Super Cache code inserted at the bottom of the .htaccess file contains the only active declaration.

As it stands I am not sure whether this is already a problem, since now the Rewrite rules are not preceded by a "RewriteEngine On" directive.

Furthermore, if I go ahead and add a rewrite rule, e.g. from www to non-www (this is the code I used with Joomla), in your CUSTOM CODE BOTTOM area

    RewriteCond %{HTTP_HOST} ^www\.(.+)$ [NC]
    RewriteRule ^(.*)$ http://%1/$1 [R=301,L]

then this rewrite rule too would not be preceded by an active (not commented out) instance of "RewriteEngine On".

    Is there any issue with this configuration in general? Should I include the "RewriteEngine On" declaration myself when adding rewrite rules as custom code?

    http://wordpress.org/extend/plugins/bulletproof-security/

  2. AITpro
    Member
    Plugin Author

    Posted 1 year ago #

Yes, technically RewriteEngine On should come before any other rewrite rules/conditions.

RewriteEngine On is an assumed thing, and yeah, I have added it too many times in the root .htaccess file - it does not hurt to duplicate this directive. I think for most folks' hosts it would not be a problem that RewriteEngine On is being stripped out of the .htaccess file, but for some folks it might be.
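For example, a sketch of a self-contained custom-code entry that re-declares the directive itself (the <IfModule> guard is an extra safety habit, not something BulletProof requires):

```apache
# Sketch of a self-contained CUSTOM CODE BOTTOM entry: it declares
# RewriteEngine On itself, so it keeps working even if another
# plugin strips that directive elsewhere in the file.
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteCond %{HTTP_HOST} ^www\.(.+)$ [NC]
RewriteRule ^(.*)$ http://%1/$1 [R=301,L]
</IfModule>
```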

I think the best place for your www to non-www code is here in the root .htaccess file. You want your code to be inside of the WordPress rewrite loop and not outside of it, or what I assume would happen is that URLs deeper than the root directory would not rewrite correctly to non-www URLs. Example: example.com/ will rewrite to non-www correctly, but example.com/some-category/ will not.

    RewriteEngine On
    RewriteBase /
    RewriteCond %{HTTP_HOST} ^www\.(.+)$ [NC]
    RewriteRule ^(.*)$ http://%1/$1 [R=301,L]
    RewriteRule ^index\.php$ - [L]
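For context, this is roughly how that placement looks inside the full standard WordPress block (the last three rules are the stock WordPress ones; treat this as a sketch and adapt it to your own file):

```apache
# BEGIN WordPress
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /
# www to non-www redirect, placed inside the WordPress rewrite block
RewriteCond %{HTTP_HOST} ^www\.(.+)$ [NC]
RewriteRule ^(.*)$ http://%1/$1 [R=301,L]
# stock WordPress rules
RewriteRule ^index\.php$ - [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]
</IfModule>
# END WordPress
```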
  3. definitio
    Member
    Posted 1 year ago #

    Thank you for your response.

I will place the rewrite code in the suggested place.

    Just one more question, not really relevant to Bulletproof, if you don't mind.
    I've observed that some kind of redirection from www. to non-www is already in place, following the permalink structure in the settings, which does not include the www.
    I am not certain whether I actually should be adding that rewrite code, that is, whether it is unneeded or it may even cause a problem with the existing functionality.

I read somewhere that in WordPress it may cause trouble, specifically end up redirecting all links to the home page, but when I tried it for a while, I did not get this behaviour.

The reason I added it in the first place was that, after moving from Joomla to WordPress, I saw errors in Google Webmaster Tools for the http://www.mysite.com version of the domain (one should add both versions and direct Google to use one of them), i.e. crawling errors with regard to robots.txt, where none existed before; previously the http://www.mysite.com version seemed "invisible".

I am looking for a technician's opinion, but I know this is outside the BulletProof support duties, so feel free to ignore it.

  4. AITpro
    Member
    Plugin Author

    Posted 1 year ago #

    Double check your WordPress General Settings page to see what your WordPress Address (URL) & Site Address (URL) are. Settings >>> General.

Personally I believe that www is better than non-www, but I do not have any proof that it is actually better. ;) I believe I have come across sites that said www is better than non-www and explained why, but it has been years since I looked that up. So whether or not it is really needed or better, I do not have a definitive answer for you.

  5. definitio
    Member
    Posted 1 year ago #

My settings are without the www, just the way I wanted it to be, from when I used the Duplicator plugin to transfer my site from a testing subfolder to the main folder.

My question regarding www to non-www concerned whether the EXTRA code I want to add is actually NEEDED, given that the permalink structure and the Site & WordPress URLs do not include www and some (efficient or not) redirection is already implemented

i.e. I don't want to have two rules for the same function:
    1) wordpress using non-www. as set in the installation process
    2) my extra code redirecting www to non-www.

    That's what I read might cause trouble. I am just not certain that WP does that rewrite efficiently enough, because I see Google trying to index the www. version of the site, whereas it used to be "invisible" before.

  6. AITpro
    Member
    Plugin Author

    Posted 1 year ago #

    Well all I know for sure is that I have always used www since around 2000 for all the websites I have built, worked on, my personal sites, etc. In the back of my mind is a set rule that I go by - always use www and do not use non-www. Where I got the information or came to that decision is lost to me since it has been many years.

All I can offer you is my personal opinion - always use www and not non-www. What you should do is Google this info to find out why a www prefix is better to use. ;)

  7. definitio
    Member
    Posted 1 year ago #

    Ok will investigate more.

    Thank you

  8. AITpro
    Member
    Plugin Author

    Posted 1 year ago #

  9. definitio
    Member
    Posted 1 year ago #

Thank you, but the choice between www and non-www is not my issue; I don't know if I have been vague above.

    Let me try and sum up my problem:

    1) My site has been on the non-www for two years, so I don't intend to change it (used to be Joomla, now WP).

    2) When I installed WordPress, I went for the non-www version, intentionally.

    3) By picking the non-www version in settings, WP automatically redirects all www. to non-www

    4) I have doubts as to whether the rewrite WP is automatically doing is actually efficient

5) The reason I think so is that, for the first time in two years, I see errors in Google Webmaster Tools for the www version of the site (set to point to non-www), whereas it was totally invisible before.
Now I am getting reports about indexing failures by Googlebot. Before, Google didn't even see it and reported no stats for the www version.

6) I want to add my old www to non-www rewrite code to the WP .htaccess file, but I read somewhere that because WP already does that, the inclusion of the extra code might actually cause trouble.

    So I fear WP does not do the rewrite efficiently & that adding the previously proven as efficient code might break the site in some way.

For now I have the extra code added as you instructed. It doesn't seem to break anything, though it may be early to tell.
My question is: will the extra code break WordPress, which is already redirecting www to non-www in what may not be a definitely efficient manner?

  10. AITpro
    Member
    Plugin Author

    Posted 1 year ago #

1. Yep, do not change from non-www to www if you have an established website. There are other ways, as you can see from the link I posted above, to handle adding subdomains for future site growth.

4. I believe the internal rewriting that WordPress does is very efficient, and I believe the issue/problem going on is outside of/not directly related to WordPress itself. Or maybe some plugin you have installed could be doing something incorrectly? Like an SEO or Sitemap plugin?

5. I think you need to look at what is going on in Google Webmaster Tools and then try to solve the problem completely in Google Webmaster Tools. And also look at your plugins. WordPress is very efficient and very streamlined. I have been using WordPress for years and have never had the kind of problem you are describing. I have fixed websites that were using SEO and Sitemap plugins that caused massive damage. And I have also helped clients fix their fubar Google Webmaster Tools settings, i.e. adding both a non-www and www site in Google Webmaster Tools. You should only have one, not both, or you will have problems.

6. Yes, WordPress does internal rewriting with PHP code. I am not exactly sure of the order of things, i.e. does WordPress defer to or override .htaccess rewriting? The problem is not going to be with WordPress itself, guaranteed.

No, nothing should break. You should be testing your site using sniffers and other testing tools to ensure there is not a problem, before you find out there is one the hard way - an email from Google, a deindexed site, or Google penalties.

  11. AITpro
    Member
    Plugin Author

    Posted 1 year ago #

    Actually what was I thinking/saying - of course WordPress internal rewriting always defers to whatever rewriting you add to your .htaccess file. DUH. LOL

  12. definitio
    Member
    Posted 1 year ago #

4. Don't really know... I went for one of the most popular plugins out there, Yoast's SEO plugin, which also creates the sitemap.

The error I am getting in Webmaster Tools is this:
A red exclamation mark next to the www version of the site, and a warning to check its health, saying that robots.txt is blocking some important page. The link to that important page cited in the "details" on the right is the www site itself.

When I navigate to the health section for the www site, I am told that Googlebot is blocked from the www site.

    The content of the robots.txt file shown there is

    User-agent: *
    Disallow: /

When I click the link to 'www.mysite.com/robots.txt', I am instead shown the contents of 'mysite.com/robots.txt', although the browser URL shows "www.mysite.com/robots.txt". Its contents are

    User-agent: *
    Disallow: /wp-admin/
    Disallow: /wp-includes/

    Never had an error before in Webmasters Tools. I have correctly set my preferred domain to non-www from the beginning.

Don't know what to make of this, whether there really is an issue or not. My novice rationalization may have complicated it more and confused me worse.
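One possible reading of the robots.txt observation, sketched below: the www-to-non-www 301 discussed earlier in this thread would explain why the browser ends up showing the non-www file (mysite.com is of course a placeholder):

```apache
# With this 301 in place, a browser request for
# http://www.mysite.com/robots.txt is answered with a redirect to
# http://mysite.com/robots.txt, which is why the non-www file's
# contents are displayed even though the link clicked was the www URL.
RewriteCond %{HTTP_HOST} ^www\.(.+)$ [NC]
RewriteRule ^(.*)$ http://%1/$1 [R=301,L]
```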

  13. ryansatterfield
    Member
    Posted 1 year ago #

I have a few questions for you. What plugin(s) are you using? What WordPress version are you using? What is your site's address?

  14. AITpro
    Member
    Plugin Author

    Posted 1 year ago #

    I thought you said you had a non-www site? You should not have both a www and non-www domain submitted in Google Webmasters Tools.

So is what Google is saying actually important or valid? Have you verified this yourself? I get automated warnings and notifications from Google intermittently. Some of them are valid and some are just false alerts/not valid. All warnings, messages and alerts are autogenerated by Google so some are going to be accurate and some are not. So you need to check and confirm that there is an issue/problem or if it is just a false alert.

  15. definitio
    Member
    Posted 1 year ago #

    I thought you said you had a non-www site? You should not have both a www and non-www domain submitted in Google Webmasters Tools.

    I have read guides telling administrators to claim both the www. and the non-www. version of their site in Google and set a preferred choice through the dashboard settings. Google will then display their links in www. or non-www as per choice.

    I get automated warnings and notifications from Google intermittently. Some of them are valid and some are just false alerts/not valid. All warnings, messages and alerts are autogenerated by Google so some are going to be accurate and some are not. So you need to check and confirm that there is an issue/problem or if it is just a false alert.

Yes - I don't have the knowledge to do that, and I have already acknowledged that I may be making novice rationalizations.

What I am seeing is Google reporting a failure to crawl the www version of the site.
- Is this actually "good", signifying that WP has set a global "Disallow" on 'www.mysite.com'? Could be.
- The other hypothesis/conjecture is that the www version is somehow now more "visible", for Webmaster Tools to report errors on it (?). Never had this happen before.

I forgot to mention that in "Crawl Errors" for this site I am getting a reported failure to crawl this URL:
http://www.mysite.com/function.require
...at least that's new, and it started on the day I made the official switch from Joomla to WordPress.

I have a few questions for you. What plugin(s) are you using? What WordPress version are you using? What is your site's address?

    1) Plugins: http://postimage.org/image/c9q78gclz/
    2) WordPress 3.5
    3) http://tomakrypodari.gr

    Have to say a big thank you for your interest.

  16. AITpro
    Member
    Plugin Author

    Posted 1 year ago #

    I have read guides telling administrators to claim both the www. and the non-www. version of their site in Google and set a preferred choice through the dashboard settings. Google will then display their links in www. or non-www as per choice.

I have personally fixed several client sites' problems that were caused by doing this. The problems ranged from Google assigning penalties to Google deindexing the sites' pages. If you know what you are doing then be my guest. I can only tell you what I have found and fixed in my own personal experience.

    If Google is not crawling the www site submission then this is of course a good thing otherwise you would be in the possible scenarios I mentioned above of recovering from Google penalties and scrambling to get your website pages reindexed after Google deindexed them. ;)

    The crawl error is obviously not an "error" but a good thing. I doubt that you want Google to crawl the function.require file.

  17. definitio
    Member
    Posted 1 year ago #

I have personally fixed several client sites' problems that were caused by doing this. The problems ranged from Google assigning penalties to Google deindexing the sites' pages. If you know what you are doing then be my guest. I can only tell you what I have found and fixed in my own personal experience.

In this case, the www site is not being indexed, nor was it indexed at any time (I made conscious decisions from the start and stuck to them), so we're not talking about deindexing penalties. In my case it was crawling errors that were reported.
Still, if the non-preferred version of a site were being indexed, that does not seem like something that could have been caused by claiming both versions of the site (with and without www) in Google itself.
It sounds like it could have been caused by:
- a bad rewrite
- no rewrite
- a change of heart concerning the use of www along the way.
In that case, there would have to be deindexing.

Still, the option of making Google use only one version of the site for the search results/links it presents to users is no bad thing, and it can only be done if one claims both versions and explicitly sets the preferred one.
I cannot see how this is in itself problematic, unless one sets the wrong choice in the preference settings, which is not that easy to do, i.e. it's not at all complicated.

    If Google is not crawling the www site submission then this is of course a good thing otherwise you would be in the possible scenarios I mentioned above of recovering from Google penalties and scrambling to get your website pages reindexed after Google deindexed them. ;)

    The crawl error is obviously not an "error" but a good thing. I doubt that you want Google to crawl the function.require file.

So we're good then! That's great actually, 'cause it saves me the worrying.

As I said, my novice rationalizations interpreted this as a potential problem: before, the site was not being indexed either, but without (crawl) errors being reported; now it's still not being indexed, but crawl errors are reported.

I thought that might mean an inefficient rewrite/redirection, but I'd rather be wrong than right.

If the result is the same, I'm happy. The difference does still seem to be in how WordPress manages the www version... is it the PHP rewrite making the difference between Joomla and WordPress? Let it be...

Thank you for the answers. I know I have taken advantage of your good will and taken up much of your time.

  18. AITpro
    Member
    Plugin Author

    Posted 1 year ago #

    "...it's not that complicated."

    What is complicated to one person may not be complicated to another person. ;) If you know what you are doing then proceed on.

Since Google is a giant and everything is automated, some alerts/"errors" can be confusing. From time to time I get automated emails from Google telling me I have a significant increase in errors. When I check those "errors", they are exactly what I want to be happening, i.e. pages are blocked/forbidden from Google being able to access/crawl them - perfect! That is exactly what I wanted to happen. ;)

So to sum it all up - a Google "error" may actually be a good thing that verifies something is working/set up correctly. LOL

Topic Closed

This topic has been closed to new replies.
