My Calendar
[resolved] Can I stop robots excessively crawling My Calendar (5 posts)

  1. mandy@nepeta
    Posted 2 years ago #

    I'm using version 2.2.9 of My Calendar on a site with only a handful of calendar entries (about 7) at present. Since July I have been getting 500+ Googlebot hits on URLs with what appear to be random parameters (e.g. yr values up to 2024, even though the last event is next spring!).

    I notice that this has happened with a different plugin, and they have published a suggested fix (http://support.time.ly/limiting-excessive-google-crawls/). Do you have any similar recommendations for limiting the hits without compromising visitor searches?


  2. Joe Dolson
    Plugin Author

    Posted 2 years ago #

    The easiest thing you can do is much like what that post suggests: add a robots.txt line that blocks the links in your calendar from being crawled.

    That's a little extreme, however, as it will actually block all links on the page.

    Something like this is probably more effective, although the wildcard syntax is only supported by Google and Bing:

    User-agent: *
    Disallow: /calendar/?yr=*

    That will disallow any URL under /calendar/ whose query string starts with the parameter yr=, whatever characters follow it.
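
    To see which URLs a rule like this would catch, here is a minimal Python sketch of Google-style pattern matching (prefix match, `*` as a wildcard, an optional trailing `$` as an end anchor). The function name `rule_matches` and the sample URLs are made up for illustration; this is an approximation of the matching rules, not the crawler's actual code.

```python
import re


def rule_matches(pattern: str, path: str) -> bool:
    """Approximate Google-style robots.txt path matching (hypothetical helper).

    '*' matches any run of characters, a trailing '$' anchors the end of
    the URL, and otherwise the pattern only has to match a prefix.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape everything literally, then turn each '*' into '.*'.
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    if anchored:
        regex += "$"
    # re.match anchors at the start only, which gives the prefix behaviour.
    return re.match(regex, path) is not None


if __name__ == "__main__":
    rule = "/calendar/?yr=*"
    for url in ("/calendar/",
                "/calendar/?yr=2024&month=3",
                "/calendar/?month=3&yr=2024"):
        print(url, "->", "blocked" if rule_matches(rule, url) else "allowed")
```

    Under these assumptions the bare /calendar/ page stays crawlable and /calendar/?yr=2024&month=3 is blocked, but /calendar/?month=3&yr=2024 is not, because yr is no longer the first parameter. A pattern such as /calendar/?*yr= would cover both orderings.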

  3. mandy@nepeta
    Posted 1 year ago #

    Sorry not to have responded sooner; I've been experimenting with a combination of robots.txt blocks and Google's URL parameter settings. I'm not convinced that Google obeys any of them consistently.

    Adding year blocks does stop hits on the specific year, but Google then goes to extremes and is now requesting the years 1700 and 3400! I have just blocked all calendar hits with parameters to see if that will do the job. Hopefully it will let through the page without parameters, which will allow the next few months' events to be seen.

    Fingers crossed.

  4. dalibu
    Posted 1 year ago #

    I love this plugin, it has helped me a lot. This might be a big ask though. Would AJAX reloading help solve this one?

  5. Joe Dolson
    Plugin Author

    Posted 1 year ago #

    I wouldn't think so. It would depend on whether or not the crawler processes the JavaScript: only then would the URL stay constant as the calendar reloads. Most likely, however, the robots are not crawling with JS enabled, so they would still follow the plain links with their parameters.

Topic Closed

This topic has been closed to new replies.
