• Resolved kosaacupuncture

    (@kosaacupuncture)


    Google Search Console says that URL is not available to Google.
    What do I need to do to fix this?
    Thanks in advance.
    All the best

    • This topic was modified 2 years, 2 months ago by Yui. Reason: moved to Fixing WordPress

    The page I need help with: [log in to see the link]

  • Why do you want Google to index an internal PHP file? What’s there for Google (or your human readers) to see anyway?

    You should only worry about Google accessing and indexing your PUBLICLY AVAILABLE URLs where you have published content for humans to view.

    Anything else… ignore it, or, better yet, block Google from seeing it entirely.

  • To run the URL Inspection tool and see a URL’s current index status:

    1) Open the URL Inspection tool.
    2) Enter the complete URL to inspect. A few notes:
       - The URL must be in the current property. URLs outside the current property cannot be tested. If you own that other property, you must switch properties to test the URL.
       - AMP vs non-AMP URLs: you can inspect both AMP and non-AMP URLs. The tool provides information about the corresponding AMP or non-AMP version of the page.
       - Alternate page versions: if the page has alternate/duplicate versions, the tool also provides information about the canonical version, if the canonical version is in a property that you own.
    3) Read how to understand the results.
    4) Optionally run an indexability test on the live URL.
    5) Optionally request indexing for the URL.

    There is a daily limit of inspection requests for each property that you own.

    Thread Starter kosaacupuncture

    (@kosaacupuncture)

    @gappiah
    Not that I want Google to index it, but somehow Google is looking at it; I don’t know why or how. I have 9 other Server error (5xx) flags, which have been fixed, but Google won’t clear those red flags because of this problem.
    Thank you though.

    @scarletthompson
    I appreciate your kind help.
    I know how to run the URL Inspection tool; that’s how I found this red flag.
    Other than that, I just don’t understand what you’ve said.
    Can you please speak in English?

    Not that I want Google to index it, but somehow Google is looking at it; I don’t know why or how.

    It’s simple: in your robots.txt file, you’re explicitly giving Googlebot full and free rein to rummage through your entire site.

    What I’m saying is, completely block Google from even trying to access directories and paths that you don’t want to be indexed, which you’re currently not doing.
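
    For example, a Googlebot-specific group that actually blocks those paths might look like the sketch below. This is only a sketch, not a drop-in replacement for your file: the Disallow paths here are borrowed from the WordPress defaults.

    # Sketch only: a crawler obeys just the one user-agent group that
    # matches it most specifically, so a Googlebot group must repeat
    # every rule you want Googlebot to honor.
    User-agent: Googlebot
    Allow: /wp-admin/admin-ajax.php
    Disallow: /wp-admin/
    Disallow: /wp-includes/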

    Thread Starter kosaacupuncture

    (@kosaacupuncture)

    What I don’t understand is that this robots.txt has been in there for more than a month, and my other website has pretty much the same settings as below, but it doesn’t give me this kind of trouble.
    User-agent: Googlebot
    Allow: /

    How can I specifically ask Googlebot not to look at the files it doesn’t need to?
    Thank you so much.

    What I don’t understand is that this robots.txt has been in there for more than a month, and my other website has pretty much the same settings as below, but it doesn’t give me this kind of trouble.

    Translation: I’ve left my door open for more than a month, and pretty much all my houses have the same setup… and I never had a single incident before. Why are thieves getting into only this house now? 😀

    How can I specifically ask Googlebot not to look at the files it doesn’t need to?

    How did you get all those lines for different crawlers into your robots.txt file in the first place? They are not generated by WordPress.

    Your robots.txt file includes a directive at the top that should prevent Google (and all good, responsible bots) from crawling the entire wp-includes folder.

    User-agent: *
    Allow: /wp-admin/admin-ajax.php
    ...
    Disallow: /wp-includes/
    ...
    Disallow: /?blackhole

    But then you went ahead and explicitly overrode this, allowing Googlebot, Bingbot, Yandex, and a whole lot of other bots to crawl your entire site with per-bot Allow: / groups. A crawler obeys only the most specific user-agent group that matches it, so a group for Googlebot containing just Allow: / makes Googlebot ignore the * group, Disallow rules and all.
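
    A minimal illustration of that precedence rule (a hypothetical two-group file, not your actual one):

    User-agent: *
    Disallow: /wp-includes/

    # Googlebot matches this group instead of the * group above,
    # so for Googlebot that Disallow might as well not exist.
    User-agent: Googlebot
    Allow: /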

    So, to answer your question:

    1) The “error” message you posted from GSC is Google saying they’re unable to access and crawl that URL on your website. But this is a URL that Google shouldn’t index anyway, and WordPress and your webserver are doing a great job of keeping Google out of it. So this is not really a problem or “error” to worry about.

    2) If you don’t want to see such “error” messages in your GSC, then explicitly tell Google not to crawl these paths at all. How? For Google specifically, just remove the following lines from your robots.txt file:

    User-agent: Googlebot
    Allow: /

    But why stop at Google? I’d remove all the explicit full-site overrides:

    User-agent: Googlebot
    Allow: /
    
    User-agent: Mediapartners-Google
    Allow: /
    
    User-agent: AdsBot-Google
    Allow: /
    
    User-agent: AdsBot-Google-Mobile
    Allow: /
    
    User-agent: Bingbot
    Allow: /
    
    User-agent: Msnbot
    Allow: /
    
    User-agent: Applebot
    Allow: /
    
    User-agent: Yandex
    Allow: /
    
    User-agent: Slurp
    Allow: /
    
    User-agent: DuckDuckBot
    Allow: /
    
    User-agent: Qwantify
    Allow: /
    
    User-agent: googleusercontent
    Allow: /
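
    With those per-bot overrides gone, every crawler falls under the single blanket group you already have, which is exactly what you want (the ... stands for the lines elided in the excerpt above):

    User-agent: *
    Allow: /wp-admin/admin-ajax.php
    ...
    Disallow: /wp-includes/
    ...
    Disallow: /?blackhole
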
    Thread Starter kosaacupuncture

    (@kosaacupuncture)

    @gappiah
    You’re awesome.
    I guess I had too much faith in the Blackhole Pro plugin.
    I will be correcting robots.txt as per your suggestion.
    Thank you so much.
    All the best.

  • The topic ‘URL is not available to Google. It cannot be indexed.’ is closed to new replies.