How to prevent Google and Archive.org from caching pages? (4 posts)

  1. Pokus
    Member
    Posted 3 years ago #

    How to prevent Google and Archive.org (and other such sites, if they exist) from caching pages?

    I have a blog, but some time in the future I might want to remove it for this reason or that. In that case I am obviously not going to want it to be safely cached away at various places on the internet.

    How can I achieve this?

  2. Otto42
    Moderator
    Posted 3 years ago #

    At the root of your site (and I mean the *root*, not any subdirectories), create a robots.txt file.

    In that file, put the following:
    User-agent: ia_archiver
    Disallow: /

    This tells archive.org's crawler (ia_archiver) not to archive your site.
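    As a rough way to sanity-check the rule before uploading it, the sketch below feeds the same two lines to Python's standard urllib.robotparser and asks whether a given crawler may fetch a path. The helper name is_allowed and the sample path are assumptions made for illustration, and the result only shows how a compliant parser reads the file, not what any particular crawler actually does.

```python
from urllib.robotparser import RobotFileParser

# The robots.txt content from the post above.
ROBOTS_TXT = """\
User-agent: ia_archiver
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, path: str = "/") -> bool:
    """Return True if a compliant parser would let user_agent fetch path.

    is_allowed is a hypothetical helper name, not part of any library.
    """
    parser = RobotFileParser()
    # parse() accepts an iterable of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, path)


if __name__ == "__main__":
    # The archive.org crawler is blocked everywhere on the site.
    print("ia_archiver:", is_allowed(ROBOTS_TXT, "ia_archiver"))
    # Other crawlers have no matching rule, so they stay allowed.
    print("Googlebot:", is_allowed(ROBOTS_TXT, "Googlebot"))
```

    Because the rule names only ia_archiver, other crawlers can still index the site, which matches the intent of the post.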

    To prevent Google from keeping a cache of your site (but still allow them to index it), add this to your theme's header.php (near all the other meta tags):
    <meta name="ROBOTS" content="NOARCHIVE">
    <meta name="GOOGLEBOT" content="NOARCHIVE">
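    To confirm the tags actually end up in the rendered page, something like the following sketch can scan a page's HTML for a NOARCHIVE directive using Python's standard html.parser. The class name RobotsMetaParser, the helper has_noarchive, and the SAMPLE_HEAD string are all assumptions for illustration; in practice you would pass in the HTML your theme really outputs.

```python
from html.parser import HTMLParser

# Hypothetical sample of what header.php might emit after the change.
SAMPLE_HEAD = """
<head>
<meta name="ROBOTS" content="NOARCHIVE">
<meta name="GOOGLEBOT" content="NOARCHIVE" />
</head>
"""


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots"> style tags."""

    def __init__(self):
        super().__init__()
        # Maps a lowercased meta name to the set of lowercased directives.
        self.directives = {}

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag != "meta":
            return
        attr = {key: (value or "") for key, value in attrs}
        name = attr.get("name", "").lower()
        if name in ("robots", "googlebot"):
            parts = attr.get("content", "").split(",")
            values = {p.strip().lower() for p in parts if p.strip()}
            self.directives.setdefault(name, set()).update(values)


def has_noarchive(html: str, crawler: str = "robots") -> bool:
    """Return True if the page carries a noarchive directive for crawler."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noarchive" in parser.directives.get(crawler.lower(), set())


if __name__ == "__main__":
    print("robots:", has_noarchive(SAMPLE_HEAD, "robots"))
    print("googlebot:", has_noarchive(SAMPLE_HEAD, "googlebot"))
```

    The check is case-insensitive, so it accepts both the upper-case form shown above and lower-case variants, with or without a self-closing slash.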

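    Some people also add the following, which ask browsers and proxies not to cache the page. They target HTTP caching rather than search-engine archives, so they may have little effect on crawlers: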
    <meta http-equiv="CACHE-CONTROL" content="NO-CACHE">
    <meta http-equiv="PRAGMA" content="NO-CACHE">

    There is no real way to prevent *all* sites from keeping archives of your site, but the methods above will keep almost all well-behaved systems from doing so.

  3. Pokus
    Member
    Posted 3 years ago #

    That was fast. Thank you. Should I add </meta> to make the tags XML compliant?

  4. Otto42
    Moderator
    Posted 3 years ago #

    Whoops. Yeah, you can change the > at the end of each to a /> to make it XML compliant.