Support » Plugin: My Calendar » Can I stop robots excessively crawling My Calendar

Viewing 4 replies - 1 through 4 (of 4 total)
  • Plugin Author Joe Dolson

    (@joedolson)

    The easiest thing you can do is much like what that post suggests: adding a robots.txt rule that blocks links in your calendar from being crawled.

    That’s a little extreme, however, as it will actually block all links on the page.

    Something like this is probably more effective, although wildcard matching is only supported by Google and Bing:

    User-agent: *
    Disallow: /calendar/?yr=*

    That will disallow any link containing the parameter ?yr=; the trailing * is a wildcard that matches any run of characters after it.
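If you want to sanity-check a wildcard rule before deploying it, the matching can be simulated with a few lines of Python. This is a minimal sketch, not part of the plugin: it translates * and a trailing $ the way Google and Bing document them (Python's built-in urllib.robotparser does plain prefix matching and treats * literally, so it isn't suitable for this).

```python
import re

def rule_matches(rule: str, path: str) -> bool:
    # Translate a Google/Bing-style Disallow rule into a regex:
    # '*' matches any run of characters; a trailing '$' anchors the URL end.
    pattern = re.escape(rule).replace(r"\*", ".*")
    if pattern.endswith(r"\$"):
        pattern = pattern[:-2] + "$"
    return re.match(pattern, path) is not None

# The rule above blocks parameterized calendar URLs
# but leaves the bare calendar page crawlable:
print(rule_matches("/calendar/?yr=*", "/calendar/?yr=1700"))  # True (blocked)
print(rule_matches("/calendar/?yr=*", "/calendar/"))          # False (crawlable)
```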

    Sorry not to have responded sooner; I’ve been experimenting with a combination of robots.txt blocks and Google parameters. I’m not convinced that Google obeys any of them consistently.

    Adding year blocks does stop hits on the specific years, but Google then gets extreme and is now requesting years 1700 and 3400! I have just blocked all calendar hits with parameters to see if that will do the job. Hopefully it will let through the page without parameters, so the next few months’ events can still be seen.
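    For reference, blocking every parameterized calendar URL while leaving the bare page crawlable might look like this (assuming the calendar lives at /calendar/; adjust the path to match your own permalink):

    User-agent: *
    Disallow: /calendar/?

    Because this is a prefix match on /calendar/?, any query string is blocked but /calendar/ itself is not.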

    Fingers crossed.

    I love this plugin; it has helped me a lot. This might be a big ask, though. Would AJAX reloading help solve this one?

    Plugin Author Joe Dolson

    (@joedolson)

    I wouldn’t think so; it might depend on whether or not the robot crawler was processing the JavaScript, because without it the URL would stay constant. Most likely, however, the robots are not crawling with JS enabled.

  • The topic ‘Can I stop robots excessively crawling My Calendar’ is closed to new replies.