How to keep WP from being indexed?

  • I tried to search for this, but most people want their blog found and indexed by other engines.

    I don’t want the one I just started up indexed or linked to by any engine (Google, Technorati, etc). It’s for a small mutual water company, and not for general distribution.

    I know that, being on the ‘net, it’s still public, but I’d like it to not show up in search engines, etc.

    What’s the best way to do that?

Viewing 12 replies - 1 through 12 (of 12 total)
  • Create a robots.txt file (with exactly that file name), containing the following, and upload it to the main directory of your website/blog (where the index.php is):

    User-agent: *
    Disallow: /

    See TheSiteWizard.com (not my site; just the first one I found).
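
    As a rough way to sanity-check those two lines before uploading them, the sketch below feeds them to Python's standard-library robots.txt parser and asks whether a crawler may fetch a page. The crawler names and the example.com URLs are placeholders, not anything from this thread, and the check only reflects what a crawler that honors robots.txt would do.

```python
from urllib.robotparser import RobotFileParser

# The two rules suggested in the reply above.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def is_blocked(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if a robots.txt-compliant crawler must not fetch `url`."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


# Placeholder crawler names and URLs, used only for illustration.
blocked_home = is_blocked(ROBOTS_TXT, "Googlebot", "http://example.com/")
blocked_post = is_blocked(ROBOTS_TXT, "SomeOtherBot", "http://example.com/?p=1")
print(blocked_home, blocked_post)
```

    Both checks should report the URL as blocked, since "User-agent: *" applies the rule to every crawler and "Disallow: /" covers every path.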

    Thread Starter carndt

    (@carndt)

    Thanks Diane! You rock! I knew about robots.txt, but was unclear on whether it would work on blogs. I guess it’s just another website, eh?

    Also (and you have to do this BEFORE you make your first post, or the chance is lost):

    Go into WP Admin > Options and delete any pinging services. I think “Ping-O-Matic” is in there by default. That means the first time you click “publish”, an alert is sent to Ping-O-Matic, and they tell Technorati, and then you’re indexed whether you like it or not. (I personally think this should be opt-in and turned off by default in WP until activated.)
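
    For anyone wondering what such an "alert" actually contains, here is a minimal sketch that builds, but does not send, the body of a standard weblogUpdates.ping XML-RPC call, which is the kind of request update services accept. The blog name and URL are made up, and whether WP sends exactly this call is an assumption on my part, not something confirmed in this thread.

```python
import xmlrpc.client


def build_ping_payload(blog_name: str, blog_url: str) -> str:
    """Build the XML-RPC request body for weblogUpdates.ping without sending it."""
    # dumps() only serializes the call; nothing goes over the network.
    return xmlrpc.client.dumps(
        (blog_name, blog_url),
        methodname="weblogUpdates.ping",
    )


# Hypothetical blog name and URL, for illustration only.
payload = build_ping_payload("Water Company Blog", "http://example.com/")
print(payload)
```

    The point is that the ping carries the blog's name and URL, which is all a service needs to go and fetch the feed.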

    One more trick: you might want to go into Options > Reading and change your RSS feeds to display 1 post with summary instead of full post; or hack out the RSS files entirely. Without pinging, I’m not sure if your RSS feeds would be found accidentally, but they might.

    I researched Bitacle, this is from their FAQ. It is unclear what it means, and is probably written by a non-English speaker. One thing is clear: they get the articles from RSS feeds (which you can limit or disable in WP, but they’re turned on by default). Here’s the info I found:

    What type of blogs are inlcuded in the search?
    All blogs that has a feed. This it can be RSS or Atom.
    How I can do that my blog appears in the list?
    If your blog publishes a feed in the web (in any format) and automatically makes ping to a service of updates, like, for example Weblogs.com or Feedshot, we can find you and add it to the list.
    What happens if I don’t wish to appear in the list?
    If you doesn’t publish feed, it won’t be included in the blog’s search. Nevertheless, if you previously has published feed of the site that was indexed, the old entrances will remain in the index, although the new ones aren’t added.
    The blog search don’t follow the archives ‘robots.txt’ or META labels like NOINDEX, NOFOLLOW.

    I don’t know what that last sentence means, when it says “The blog search don’t follow the archives ‘robots.txt’ or META labels like NOINDEX, NOFOLLOW.” Does it mean these methods will preclude Bitacle, or these methods have no effect on Bitacle?
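
    In case the "META labels" in that FAQ are unfamiliar: they are a robots meta tag in the page's head. The sketch below uses Python's standard-library HTML parser to pull the directives out of a sample page. The sample HTML is invented for illustration, and the sketch says nothing about whether Bitacle honors the tag.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect the directives from any <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() != "robots":
            return
        content = attr.get("content") or ""
        # Directives are comma-separated and case-insensitive.
        for directive in content.split(","):
            if directive.strip():
                self.directives.add(directive.strip().lower())


def robots_directives(html: str) -> set:
    """Return the set of robots meta directives found in `html`."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


# Invented sample page, for illustration only.
SAMPLE = (
    '<html><head><meta name="robots" content="NOINDEX, NOFOLLOW">'
    "</head><body></body></html>"
)
directives = robots_directives(SAMPLE)
print(sorted(directives))
```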

    BTW – there’s almost no way to delete rpc.pingomatic.com BEFORE making your first post: since it’s included by default in the update services field WHEN YOU INSTALL, the first post (the “Hello World” stupidity) is sent the minute you complete the install.

    The only saving grace is there are probably a million of those identical posts a day, so yours will hopefully be lost among the multitude.

    You sure, vkaryl? Of course WP comes with the ‘Hello World’ post, but I don’t think the ping routine happens until you actually click the “Publish” button. (Note that it would happen if you edit the Hello World post and save the edit)

    If I’m wrong, all the more egregious to include this invasion of privacy. A significant percentage of potential WP users (albeit a minority) wish to use WP for semi-private diaries, friend blogs, health and illness journals, baby logs, company websites, etc. It should be the user’s choice whether and when to publicize the existence of the blog and the live-ness of the URL.

    I of course could be wrong, Dgold. But my impression of the install process is that the Hello World post is “published” by default at the point of install completion. In order to completely verify, I’ll have to do a new install somewhere – give me a bit, and I’ll post back with the results.

    And the results are that the Hello World post shows as “published” immediately when the install completes. Before you do anything other than simply look at it in the manage posts screen.

    Yep. I consider that invasion of privacy, though as I have said in the past, using a blogging program (inherently an “internet publishing program”) for private stuff is rather like closing the barn door after the horse is out on the highway….

    “I researched Bitacle, this is from their FAQ.”

    There’s some opinion out there that Bitacle are content thieves and their bot should be banned. See this for more:

    http://stopbitacleorg.wordpress.com/
    http://lutrov.com/blog/80/

    “And the results are that the Hello World post shows as ‘published’ immediately when the install completes.”

    Yes, the “post_status” column IS set to “published” by default but I’ve looked at the code and I can’t see where it pings afterwards.

    You probably have a better idea of that than I do, I’m hardly anyone’s idea of a coder!

    I do remember that a while back, a couple of people commented that they were informed by friends that their new “hello world” posts on installs showed up through pingomatic. I haven’t myself checked that out; when I want an install with zero output like that, I run it first on my local server, set up the whole thing the way I want it, edit the options table in the database and then set it up online (creating a virgin db and importing the local dump I mean….)

    My theory is that once Technorati gets your feed from your first Publish (presumably your 2nd post, or the first time you actually write a post yourself), then it indexes your Hello World retroactively if it’s still in the feed.

    I appreciate the research and thoughts. I still think/hope that installing WP alone does not send a ping, but pressing Publish (or Save Changes) does.

    From what I can tell by looking at the code, the installation sets the “posts” table columns “to_ping” and “pinged” both to nothing, so the ping is executed when you save a post later.
