    Plugin Author Sybre Waaijer

    (@cybr)

    Hi MDC2957,

    I hope you’re doing well!

    Could you test the sitemap on the most current version? You can do so right here.
    Simply hit the sitemap URL in the table below and from there hit the Test button at the top right. It should give you the most recent data.

    If there are errors, could you show me a screenshot of it? Thanks!

    If the files exist, they should be in your site’s root folder (www or public_html). The SEO Framework should report their existence within the options page, and it’s extremely unlikely for that report to be wrong.

    That said, if a notice is output within the metabox, could you tell me its contents? There are various possible notices, each with its own cause. Thanks!

    And yes, it’s true. As Yoast SEO is vastly more popular than The SEO Framework, it’s far more likely that plugins and themes have implemented compatibility patches for it. I believe that’s the grounds for the suggestion.
    With The SEO Framework I try to achieve equilibrium by leaning more towards WordPress core :).

    I hope this helps! Have a great day!

    Thread Starter MDC2957

    (@mdc2957)

    Yes, that’s exactly what I did: I submitted the sitemap for a test, like the person in the other thread. Here’s what you get:

    https://i.imgsafe.org/9780bcdbe4.png

    The files do exist, but they don’t appear in the public_html folder, so I don’t understand what your plugin is doing to generate them, or where they are.

    Plugin Author Sybre Waaijer

    (@cybr)

    Oh! Now I understand your question :).

    The files are “virtual”, just like your pages are virtual (they don’t exist in your root folder either). As explained here:

    The plugin doesn’t write anything to disk :). All files, like the robots.txt file and the sitemap.xml file, are “virtual”. This means they’re there because of “magic”, or simply WordPress Rewrite Rules :).

    WordPress Rewrite Rules, put in layman’s terms:
    When a page or file request reaches a WordPress site, WordPress determines where that request should point, be it a page, a post, robots.txt, or the sitemap!
    Once that’s determined, the output is generated on the fly for that request.
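
    To illustrate, here’s a minimal sketch of the mechanism. This is not The SEO Framework’s actual code; the “my_sitemap” query variable is a made-up example, and new rewrite rules need a one-time flush (e.g. flush_rewrite_rules() on plugin activation) before they take effect:

    add_action( 'init', function () {
        // Map requests for /sitemap.xml to a custom query variable.
        add_rewrite_rule( '^sitemap\.xml$', 'index.php?my_sitemap=1', 'top' );
    } );

    add_filter( 'query_vars', function ( $vars ) {
        $vars[] = 'my_sitemap'; // Register the variable so WordPress keeps it.
        return $vars;
    } );

    add_action( 'template_redirect', function () {
        if ( ! get_query_var( 'my_sitemap' ) ) {
            return;
        }
        header( 'Content-Type: application/xml; charset=UTF-8' );
        echo '<?xml version="1.0" encoding="UTF-8"?>';
        echo '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">';
        // A real plugin would print one <url> entry per public post here.
        echo '</urlset>';
        exit; // Nothing was read from or written to disk: the file is “virtual”.
    } );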

    About the robots errors
    It seems that some of the website’s records have recently changed (looking at the DNS history).

    I’d suggest asking Google to recrawl your website.
    You can do so by following these steps:
    1. Go to this page, and select your site from the list.
    2. Hit “Fetch and Render”.
    3. Wait a little until the page is fully loaded.
    4. Hit “Submit to Index”.
    5. The errors should no longer come back.

    Please note that the older issues don’t go away; they remain visible for logging purposes. The date should stay at the 21st of June.

    I hope this helps!

    Thread Starter MDC2957

    (@mdc2957)

    I understand now too, thank you. You’re correct: the domain name is the same, but the old CMS and shopping cart were on a different web host, and I only took the site live about 6 days ago.

    Should I choose:

    Crawl only this URL
    or
    Crawl this URL and its direct links

    for the purpose of getting everything possible indexed for my site?

    Plugin Author Sybre Waaijer

    (@cybr)

    Hi MDC2957,

    It should be “Crawl this URL and its direct links” :), as you have over 700 pages to be crawled.

    It might take up to a month for everything to be indexed and stable; this is something you can’t influence.

    Good luck! 🙂

    Thread Starter MDC2957

    (@mdc2957)

    OK, I did it; it now shows the status as complete, with “URL and linked pages submitted to index”. Are you saying it could take up to a month for those 705 warnings to stop coming up?

    Plugin Author Sybre Waaijer

    (@cybr)

    Hi MDC2957,

    Google isn’t clear on this part :).
    But I think they shouldn’t come up anymore from this point on.

    What I do know is that it can take up to a month for everything to be re-indexed; this depends on various factors. Asking for a resubmission (as you did) speeds this up a little.

    Thread Starter MDC2957

    (@mdc2957)

    “they shouldn’t come up anymore as of now”

    Well, it’s been an hour; I resubmitted the sitemap in test mode and everything is still the same. Should I actually “submit” it now, or should I wait until the test mode no longer shows those 700 warnings about blocked URLs?

    Plugin Author Sybre Waaijer

    (@cybr)

    Hi MDC2957,

    You should submit them. I thought you already did so :).

    The automated errors then shouldn’t come up anymore. If you actively test it, it could still show those errors until Google has finished processing your transferred website.

    Thread Starter MDC2957

    (@mdc2957)

    OK, thank you for clarifying again. I submitted the sitemap, and now it shows the 705 warnings; under the Indexed column, it says “Pending”. So now I guess we just wait and see what Google does to me 🙂

    Thread Starter MDC2957

    (@mdc2957)

    So I checked the Search Console this morning, because I got a Google Alert last night relating to one of the pages on my new site.

    https://i.imgsafe.org/ac3334ef21.png

    Of the 705 I submitted, it says 253 are indexed, but it now shows 1410 warnings, and under the description it says the same thing: “Sitemap contains urls which are blocked by robots.txt.”

    We know nothing is blocked by my robots.txt file. So why have the warnings now doubled in number?

    Plugin Author Sybre Waaijer

    (@cybr)

    Hi MDC2957,

    That’s not what I expected, and I can’t explain its cause. I believe it’s an accumulation of data; as Google uses multiple servers, the counts can get out of sync.
    Webmaster Tools data isn’t shown in real time: they have billions of pages to process, so it might take some time to catch up.

    If you have questions about the Search Console, you might be better off visiting Google Support. All that I know about it is found there, and I don’t use it quite as often as many others (I have a busy schedule).

    I hope this clears things up :). It’s best to wait a while; we’ve already confirmed everything should be working correctly, and Google Webmaster Tools is just lagging behind :).

    P.S. Your site is found correctly on Google again. Use this search query (replacing example.com with your own domain) to see which of your pages can be found through Google:

    site:http://www.example.com/

    pftdc

    (@pftdc)

    @cybr Hey man, I have the same problem and I’m desperately in need of help. I’d really appreciate it if you took a look.

    I have the following robots.txt submitted to Webmaster Tools (this is the live one):

    User-agent: *
    Disallow:

    Sitemap: http://XXXXXXXXX.com/post-sitemap.xml

    My sitemap is created automatically by the Yoast SEO plugin. Webmaster Tools shows me 759 warnings and says “Sitemap contains urls which are blocked by robots.txt”. But there’s nothing blocked in the live version of the robots.txt! What should I do?

    Plugin Author Sybre Waaijer

    (@cybr)

    What’s up @pftdc? 🙂

    This is a support topic for The SEO Framework.

    The SEO Framework doesn’t generate a post-sitemap.xml endpoint, so I think you might have found the wrong plugin support page.

    Nevertheless, note that this part doesn’t actually block anything, as an empty Disallow: directive allows all robots (see the contrast below):

    User-agent: *
    Disallow:
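
    For clarity, here’s the difference between an empty Disallow and a blocking one (a generic illustration, not your live file); only the second form blocks crawling:

    User-agent: *
    Disallow:      # empty value: nothing is blocked, all robots may crawl everything

    User-agent: *
    Disallow: /    # a slash blocks the entire site

    So since your live file blocks nothing, Google is most likely still reading an older, cached copy of your robots.txt.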
    

    For a correctly working robots.txt output, feel free to use this template:
    https://theseoframework.com/robots.txt

    For best results, you should remove the robots.txt file from your website’s root directory (through FTP), and then the SEO plugin you use should take care of its contents.
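
    Under the hood, when no physical robots.txt file exists in the root, WordPress serves a virtual one that SEO plugins adjust through the core robots_txt filter. A minimal sketch of that mechanism (the filter is WordPress core; the rules added here are only illustrative, not any particular plugin’s output):

    add_filter( 'robots_txt', function ( $output, $is_public ) {
        if ( $is_public ) { // Only when the site allows search engines.
            $output .= "Disallow: /wp-admin/\n";
            $output .= "Allow: /wp-admin/admin-ajax.php\n";
            $output .= "\nSitemap: " . home_url( '/sitemap.xml' ) . "\n";
        }
        return $output;
    }, 10, 2 );

    A physical robots.txt file in the root always wins over this virtual output, which is exactly why removing it through FTP matters.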

    Cheers!

    Hi Guys,

    I have the same issue, too.

    I’m getting 51 warnings for a simple Web Design & Development Company site in Sydney – http://www.elegantwebservices.com.au

    Warning: “Sitemap contains urls which are blocked by robots.txt.” I’m editing the robots.txt in the Yoast plugin, but the file is not changing. So I’m assuming the root folder has a physical copy of the file.

    Well, the robots.txt part is:
    User-agent: *
    Disallow: /*?comments=all

    Sitemap: https://elegantwebservices.com.au/sitemap_index.xml

    And when I check the URL in Search Console, it allows the home page, but it’s not showing up in search.

    Please help.

  • The topic ‘Sitemap contains urls which are blocked by robots.txt.’ is closed to new replies.