Robots.txt, plus XML Sitemap plugin are not friendly

  • Resolved Drawer

    (@drawer)


    I asked a question here a few weeks ago about how to block Google Imagebot. I do not want all of my hard work displayed in Google Images for anyone to take.

    I handwrote the robot text, here:
    Sitemap: http://myblog.com/sitemap.xml

    User-agent: Googlebot-Image
    Disallow: /uploads/

    User-agent: Mediapartners-Google
    Allow: /

    User-agent: Adsbot-Google
    Allow: /

    User-agent: *
    Disallow: /uploads/

    Note that in the robots text I used the word uploads instead of images, because I think that’s how WordPress files images – that only took me a week to discover.
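One way to check how the rules above behave is to run them through Python's standard-library `urllib.robotparser`. This is only a sketch: it assumes a default WordPress install, where media is served from `/wp-content/uploads/`, and both image URLs are made up for illustration. Python's parser is not Google's, but both match a `Disallow` path as a prefix of the URL path, so a rule for `/uploads/` would not be expected to cover `/wp-content/uploads/`.

```python
from urllib.robotparser import RobotFileParser

# The rules exactly as posted above.
POSTED_RULES = """\
Sitemap: http://myblog.com/sitemap.xml

User-agent: Googlebot-Image
Disallow: /uploads/

User-agent: Mediapartners-Google
Allow: /

User-agent: Adsbot-Google
Allow: /

User-agent: *
Disallow: /uploads/
"""

parser = RobotFileParser()
parser.parse(POSTED_RULES.splitlines())

# Assumption: default WordPress layout, media under /wp-content/uploads/.
# Both URLs are hypothetical examples, not taken from the real site.
default_image = "http://myblog.com/wp-content/uploads/cartoon.png"
root_image = "http://myblog.com/uploads/cartoon.png"

# True means the crawler is allowed to fetch the URL.
image_in_default_dir_allowed = parser.can_fetch("Googlebot-Image", default_image)
image_in_root_dir_allowed = parser.can_fetch("Googlebot-Image", root_image)

print("default uploads dir allowed:", image_in_default_dir_allowed)
print("root /uploads/ dir allowed:", image_in_root_dir_allowed)
```

If the first result comes back as allowed, the `Disallow` path probably needs to be the full path from the site root, as it appears in the image URLs.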

    Before this robots text, I had been using the XML Sitemap plugin for WordPress, which did an excellent job.

    Now my images are no longer collected by Google, but I think my posts are not being indexed, either!

    I still have the XML plugin enabled, but I unchecked the box which says: Add Sitemap url to the virtual robots file. But maybe there is still a conflict between my robots.txt and the XML plugin?

    Can you please help me enable my posts to be found by Google, without images being scanned? Thank you.

Viewing 5 replies - 1 through 5 (of 5 total)
  • Thread Starter Drawer

    (@drawer)

    Did I do the robot text right, at least?

    Joe

    (@shopping-guide)

    What you can do is log in to Google Webmaster Tools, add your site, and verify it. Then check the settings menu for your robots.txt file – it will let you enter a series of URLs and test whether Google can crawl each page or whether it is blocked by your robots.txt rules.
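The same kind of URL-by-URL check can be approximated locally before uploading a new robots.txt. The sketch below is not the Webmaster Tools tester and may not match Google's behaviour in every case. The helper `crawl_report`, the draft rules, and the URLs are all hypothetical, and the draft assumes images live under `/wp-content/uploads/`.

```python
from urllib.robotparser import RobotFileParser


def crawl_report(robots_txt, agents, urls):
    """Return {(agent, url): allowed} for every agent/URL pair."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {
        (agent, url): parser.can_fetch(agent, url)
        for agent in agents
        for url in urls
    }


# Hypothetical draft: block only the image crawler from the uploads folder.
DRAFT_RULES = """\
User-agent: Googlebot-Image
Disallow: /wp-content/uploads/
"""

# Hypothetical URLs standing in for a post and an uploaded image.
post_url = "http://myblog.com/my-first-post/"
image_url = "http://myblog.com/wp-content/uploads/cartoon.png"

report = crawl_report(
    DRAFT_RULES,
    agents=["Googlebot", "Googlebot-Image"],
    urls=[post_url, image_url],
)

for (agent, url), allowed in report.items():
    print(f"{agent:16} {'allowed' if allowed else 'blocked':8} {url}")
```

With a draft like this, the post should show as allowed for Googlebot and the image as blocked for Googlebot-Image. The live result is still worth confirming in Webmaster Tools.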

    However, having said all this, I think you are taking the wrong approach by blocking the Google imagebot to prevent others from downloading your images – your site and images are still accessible through other means and that will not stop anyone determined to take a copy of your images.

    If you are that concerned about it, you could consider embedding a watermark over the images – though that can lower the quality of the user experience if not done in a subtle manner.

    Joe

    Thread Starter Drawer

    (@drawer)

    Thanks a lot, Joe! That helped. It turns out Google can still find me, but I forgot that they write a robots (no-index) form themselves for you on the Tools, if you want, so I’m using that one now.

    What most people don’t seem to know about SEO – even the big guys – is that images are crawled and indexed faster than anything else. Within a couple of hours. Text takes much longer – 3 to 4 days. Still, happy Google is no longer stealing my images! 🙂

    I make a living from my cartoons, Joe, and the new and worse Google has doubled the size of the thumbnails in its image results, and they do NOT lead people to your site! You can lift the image without even going to the site – which is how Google wants it, of course. Try it. That’s not okay with most people. And yes, I have a watermark, but I don’t care if it’s subtle.

    Joe

    (@shopping-guide)

    You have your reasons for blocking your images from Google – it’s just my personal opinion that, in order to block a potential minority of, say, 5% of users who would download your work, you are also potentially shutting out the 95% who could become new fans and customers.

    Many websites would love to have more free traffic from Google and would bend over backwards to get it if they could, so it can seem counterintuitive to want to block a site that is sending you organic traffic.

    Also, regarding the crawling of text vs images, the time difference in your experience is likely unique to your situation – there are many sites that have their text and image content (because it is often on the same page) indexed almost immediately or within a couple of hours. The crawl rate is determined by factors such as how often you update the site with fresh content and how much authority Google attributes to your site.

    Joe

    Thread Starter Drawer

    (@drawer)

    Think about this, Joe – what % of people go to Google images to download, or just look? I suggest that at least 50% of people want to download. I’m a cartoonist, and lazy bloggers love to lift a cartoon (or a photo) – takes up a lot of words!

    Also, my customers are editors, not readers. I’m happy for people to read and enjoy my cartoons, but I think it’s fair that I request they not steal them, and do my best to not get them indexed.

    As far as the timing, it takes Googlebot-Image far less time to index images than it takes Google to index text. I do not believe that I’m unique in which it indexes first – I have 4 blogs, I’m embarrassed to say, and I pay a lot of attention to when people find posts. Honestly, it’s images it indexes first. Think of Google without them – it wouldn’t be half the monster it is.
