Support » Plugin: The SEO Framework » No index media and Google SEC crawl issues

  • Resolved tommel

    (@tommel)


    Hello there,
    I have set my media files to noindex in the SEO settings, yet they are still showing up with crawl issues in Google Search Console. Also, does changing an individual page's SEO indexing properties override this?

    Thanks in advance.
    Thomas.

Viewing 4 replies - 1 through 4 (of 4 total)
  • Plugin Author Sybre Waaijer

    (@cybr)

    Hi Thomas!

    WordPress stores its media uploads in the database under a post type called attachment, which it labels “Media.” This post type creates a page for each attachment you upload, and those pages carry little to no value for your site.

    In some cases, these pages have such thin content that they can drag your site down in the search results (SERPs). This is where the noindex option for attachments comes in, and we enable it by default in The SEO Framework.

    Now, this option doesn’t prevent indexing of the media files themselves. Again, it’s only for the (useless) Media post type’s pages.
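
To make the distinction concrete, here is a minimal Python sketch of how one could check whether a page's HTML carries a robots noindex directive. It assumes the directive is delivered as a `<meta name="robots">` tag; the sample markup and the names `RobotsMetaParser` and `is_noindexed` are hypothetical, made up for illustration. An image file has no HTML head, so a meta tag like this can only ever apply to the attachment page, not to the file itself.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives found in <meta name="robots"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        # Self-closing tags (<meta ... />) are routed here as well.
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() != "robots":
            return
        for token in attr.get("content", "").split(","):
            token = token.strip().lower()
            if token:
                self.directives.add(token)


def is_noindexed(html):
    """Return True when the HTML contains a robots meta tag with noindex."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


# Hypothetical markup, not copied from any real site.
ATTACHMENT_PAGE = (
    '<html><head><title>photo</title>'
    '<meta name="robots" content="noindex" /></head><body></body></html>'
)
REGULAR_PAGE = "<html><head><title>About</title></head><body></body></html>"

print(is_noindexed(ATTACHMENT_PAGE))
print(is_noindexed(REGULAR_PAGE))
```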

    If you wish crawlers to stop indexing images, then please see this answer: https://support.google.com/webmasters/answer/35308.
    There, Google explains that you need to add this to your robots.txt file:

    User-agent: Googlebot-Image
    Disallow: /

    Alternatively, you can use this entry, which affects more crawlers:

    User-agent: *
    Disallow: /*.jpg
    Disallow: /*.png
    Disallow: /*.gif
    Disallow: /*.svg
    

    With that, a complete robots.txt file for WordPress should look a bit like this:

    User-agent: *
    Disallow: /wp-admin/
    Allow: /wp-admin/admin-ajax.php
    Disallow: /*.jpg
    Disallow: /*.png
    Disallow: /*.gif
    Disallow: /*.svg
    
    Sitemap: https://example.com/sitemap.xml
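
If you want to see which paths these rules would block before uploading the file, a rough Python sketch like the one below can help. It is not Google's parser: it assumes the wildcard semantics Google documents (`*` matches any run of characters, a trailing `$` anchors the end, the longest matching rule wins, and Allow wins a tie), and it only reads the `User-agent: *` group. The function names are hypothetical. Python's built-in `urllib.robotparser` is deliberately not used, because it does not interpret `*` inside paths.

```python
import re

ROBOTS_TXT = """\
User-agent: *
Disallow: /wp-admin/
Allow: /wp-admin/admin-ajax.php
Disallow: /*.jpg
Disallow: /*.png
Disallow: /*.gif
Disallow: /*.svg

Sitemap: https://example.com/sitemap.xml
"""


def parse_rules(robots_txt):
    """Collect (directive, pattern) pairs from the 'User-agent: *' group."""
    rules = []
    in_star_group = False
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            in_star_group = value == "*"
        elif field in ("allow", "disallow") and in_star_group and value:
            rules.append((field, value))
    return rules


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regex."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_allowed(path, rules):
    """Apply the longest-match rule; a path with no matching rule is allowed."""
    best = None  # (pattern length, is_allow)
    for directive, pattern in rules:
        if pattern_to_regex(pattern).match(path):
            candidate = (len(pattern), directive == "allow")
            if best is None or candidate > best:
                best = candidate
    return True if best is None else best[1]


rules = parse_rules(ROBOTS_TXT)
for path in ("/wp-content/uploads/photo.jpg", "/wp-admin/admin-ajax.php",
             "/wp-admin/options.php", "/about/", "/wp-content/uploads/photo.jpeg"):
    print(path, "allowed" if is_allowed(path, rules) else "blocked")
```

Under these assumptions, `/wp-content/uploads/photo.jpeg` stays allowed, because the rules only list `.jpg`, `.png`, `.gif`, and `.svg`. Add further extensions if your uploads use them.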

    Now, Google reports a plethora of different warnings and errors in Search Console. If you need assistance with one, I need the warning verbatim, including the affected URL(s). Often, when you click on the warning, Google tells you why they encountered an issue, and sometimes they link you to a fix.

    As for your final question, since TSF v4.0, you can override the global robots settings on a per-page and per-term basis. The SEO Bar should advise you on the status.

    Thread Starter tommel

    (@tommel)

    Hello Sybre,
    Thank you for the reply and direction, it is appreciated. As I understand it, I will need to edit the robots.txt file to disallow the uploaded images. OK, I will look further into how to do that.

    I got about
    61 x crawl issues down to indexing errors
    3 x submitted URL marked ‘no index’ – these are my checkout pages
    2 x submitted URL seems to be a soft 404

    The website is http://www.megalithicmaps.com

    I did a search and found that after removing some of the offending pages it is good to do validation as it may whittle a large number of errors down. I am just doing that at the minute.

    Thanks for the help. I am a complete novice.

    Plugin Author Sybre Waaijer

    (@cybr)

    Hi again, Thomas!

    No problem 🙂

    It’s a bit difficult to assess the crawling errors from here.

    May I suggest disabling Jetpack’s sitemap functionality? It doesn’t respect the indexing settings from The SEO Framework. This alone should resolve some of the issues, as The SEO Framework’s sitemap can then take over.
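
A "Submitted URL marked noindex" report generally means a URL is listed in a sitemap while its page also says noindex. As a hedged illustration of that conflict, the Python sketch below lists the sitemap entries that also appear in a set of noindexed URLs. The sitemap snippet, the URLs, and the function names are all hypothetical; only the sitemaps.org XML namespace is real.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Hypothetical sitemap content, for illustration only.
SAMPLE_SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/checkout/</loc></url>
  <url><loc>https://example.com/about/</loc></url>
</urlset>"""


def sitemap_urls(sitemap_xml):
    """Return every <loc> value listed in a sitemap document."""
    root = ET.fromstring(sitemap_xml)
    return [
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    ]


def conflicting_urls(sitemap_xml, noindexed):
    """Return sitemap URLs that are also marked noindex, in sitemap order."""
    return [url for url in sitemap_urls(sitemap_xml) if url in noindexed]


# Hypothetical set of pages known to carry noindex.
noindexed_pages = {"https://example.com/checkout/"}
print(conflicting_urls(SAMPLE_SITEMAP, noindexed_pages))
```

A sitemap generator that respects the noindex settings would leave such URLs out, so this list would come back empty.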

    The other issues listed may be from old pages, which are now removed and marked 404. These issues should resolve automatically over time.

    Thread Starter tommel

    (@tommel)

    Hello Sybre,
    Just to say a big thanks, as I now have a site with no page errors. I disabled Jetpack’s sitemap functionality, as suggested, and deleted the unnecessary webpages. ’Tis all nice and shiny.

    Thank you very much for your help.

    Best,

    Thomas.
