Support » Plugin: XML Sitemap & Google News » Google News Sitemap Not Found

  • Resolved leonard208

    (@leonard208)


    Hi,

    I’m trying to set up a sitemap for Google News. I’ve installed the plugin and configured it in the front end, but when I open the sitemap it shows “not found”. It is set to posts and to the categories I want displayed in the sitemap.

    Any help would be appreciated


Viewing 9 replies - 1 through 9 (of 9 total)
  • Hi, the news sitemap itself works, as you can verify on the dynamic URL /?feed=sitemap-news, but it looks like CloudFront returns the error for some reason. You might want to verify your CloudFront settings. Or maybe something in W3 Total Cache?

    Alternatively: You could use the URL /sitemap-news/ to submit to Google but I cannot guarantee that that address will continue to work in future versions of the plugin…

    Thread Starter leonard208

    (@leonard208)

    Hi @ravanh, Thanks for the response. Sorry, I’m not fully technical.
    I checked the feed URL and it’s working, but I republished a blog post with today’s date and the feed URL does not seem to fetch it for the same category. I emptied the cache as well.

    We need something for the long term so I will pass this on to the developers.
    What I understand is: the feed URL must fetch the blog posts, and they should display on /sitemap-news.xml as well? Also, as you pointed out, it could be a CloudFront error; can you direct me to the error if possible? Is there any other reason why it might not be fetching the posts, or why sitemap-news.xml isn’t working?

    So I understand that (apart from the issue with the “404 Not found” response from CloudFront) you have just published a new post but it’s not visible in the news sitemap? And the problem persists after clearing the W3 Total Cache? In that case it sounds like a conflict with CloudFront, which acts as a second (remote) cache to speed up your site responses from all over the world.

    To explain: the news sitemap is essentially a feed that will take just the latest posts from the last two days. It should therefore show your recent publication.
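
As a rough illustration of that two-day window (a sketch only, not the plugin's actual code), the selection boils down to a date filter. The post dictionaries, the `recent_posts` helper and the fixed `now` value are assumptions made up for this example.

```python
from datetime import datetime, timedelta, timezone


def recent_posts(posts, now, window=timedelta(hours=48)):
    """Return the posts published within `window` before `now`, newest first."""
    cutoff = now - window
    recent = [p for p in posts if cutoff <= p["published"] <= now]
    return sorted(recent, key=lambda p: p["published"], reverse=True)


# Hypothetical "current time" and posts, chosen only for the example.
now = datetime(2021, 9, 3, 12, 0, tzinfo=timezone.utc)
posts = [
    {"title": "Older post",
     "published": datetime(2021, 8, 31, 11, 41, 50, tzinfo=timezone.utc)},
    {"title": "Recent post",
     "published": datetime(2021, 9, 2, 12, 1, 6, tzinfo=timezone.utc)},
]

titles = [p["title"] for p in recent_posts(posts, now)]
print(titles)  # only "Recent post" falls inside the 48-hour window
```

A post republished with today's date should therefore reappear in the sitemap, provided nothing in front of WordPress serves an older cached copy.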

    I checked your /feed/ and I see in the code the latest post:
    <item>...<pubDate>Tue, 31 Aug 2021 11:41:50 +0000</pubDate>... from 3 days ago. Then I checked /feed/?nocache and there it shows <item>...<pubDate>Thu, 02 Sep 2021 12:01:06 +0000</pubDate>..., which is the recent post you talked about, I assume.

    My conclusion is that the CloudFront response is indeed being cached longer than you want it to be. Or perhaps a broken connection between W3TC and CloudFront means the cache is not being purged properly?
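
The same comparison can be scripted. The sketch below assumes a standard RSS 2.0 layout (`rss > channel > item > pubDate`) and works on two feed bodies that you have already fetched, one from /feed/ and one from /feed/?nocache. The helper names and the trimmed sample feeds are hypothetical; only the two dates come from the check above.

```python
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime


def latest_pub_date(rss_xml):
    """Return the newest item <pubDate> in an RSS 2.0 document, or None."""
    root = ET.fromstring(rss_xml)
    dates = [
        parsedate_to_datetime(el.text.strip())
        for el in root.findall("./channel/item/pubDate")
        if el.text
    ]
    return max(dates, default=None)


def looks_stale(cached_xml, fresh_xml):
    """True if the uncached feed has a newer item than the cached one."""
    cached = latest_pub_date(cached_xml)
    fresh = latest_pub_date(fresh_xml)
    return fresh is not None and (cached is None or cached < fresh)


# Trimmed stand-ins for the two responses (real feeds contain much more).
CACHED = """<rss version="2.0"><channel><item>
<pubDate>Tue, 31 Aug 2021 11:41:50 +0000</pubDate>
</item></channel></rss>"""

FRESH = """<rss version="2.0"><channel><item>
<pubDate>Thu, 02 Sep 2021 12:01:06 +0000</pubDate>
</item><item>
<pubDate>Tue, 31 Aug 2021 11:41:50 +0000</pubDate>
</item></channel></rss>"""

stale = looks_stale(CACHED, FRESH)
print("cached copy is stale:", stale)
```

If the cached copy keeps lagging behind after a purge, that points at the CDN layer rather than at WordPress or the plugin.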

    You might also want to force CloudFront to exclude the following URL patterns from being cached:

    
    /feed*
    /*/feed*
    /sitemap*
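
To get a feel for which request paths these wildcards would cover, here is a small sketch. It approximates the matching with Python's `fnmatchcase`, where `*` matches any run of characters; that is an assumption about how the patterns behave, so check it against the CloudFront documentation. The sample paths are hypothetical.

```python
from fnmatch import fnmatchcase

# The three patterns suggested above.
NO_CACHE_PATTERNS = ["/feed*", "/*/feed*", "/sitemap*"]


def bypasses_cache(path, patterns=NO_CACHE_PATTERNS):
    """True if `path` matches at least one of the no-cache patterns."""
    return any(fnmatchcase(path, pattern) for pattern in patterns)


# Hypothetical request paths, for illustration only.
sample_paths = [
    "/feed/",
    "/category/news/feed/",
    "/sitemap-news.xml",
    "/us/sitemap-news.xml",
    "/about/",
]
results = {path: bypasses_cache(path) for path in sample_paths}
for path, matched in results.items():
    print(f"{path:<24} {'no-cache' if matched else 'cached'}")
```

Under this approximation a sub-site path such as /us/sitemap-news.xml is not covered by the three patterns, so on a network install an extra pattern along the lines of /*/sitemap* may be worth considering.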
    

    Hope that helps 🙂

    If all else fails, you could also try the development version of the sitemap plugin which you can download from https://github.com/RavanH/xml-sitemap-feed (green button “Code” then “Download zip”). Then disable the current sitemap plugin and install the zip via Plugins > Add new > Upload, then activate.

    Make sure to never activate the two versions at the same time.

    This development version includes extra sitemap response headers that should tell CloudFront to never cache the response. Maybe that will help your case…

    Thread Starter leonard208

    (@leonard208)

    Hi @ravanh
    I’ve checked the feed issue you’ve mentioned, I will pass that on to the developers to exclude the patterns from caching, thanks for discovering the issue.

    Okay, now the other issue is that the feed URL is working and displays the blog posts from up to the last 48 hours without delay; see for example /us/?feed=sitemap-news (it’s on a network, so we have /us/ at the beginning). But the sitemap-news.xml version isn’t being generated. We have the Yoast plugin installed as well, so I’m unsure if we need the news XML, as I think we can submit the feed URL in Google News? I’m unsure, but please let me know if possible.

    To explain: the news sitemap is not for submission in the Google News Publisher Center. As you mention, you can submit different feeds for your different news sections there.

    The difference is basically:

    1. The RSS/Atom feeds are to help Google News categorize your different publications and to fetch their content (feeds should contain the full posts). It is up to Google News how often they will check your feeds for updates.

    2. The news sitemap is to quickly alert Google News about recent publications and point toward the posts (it does not provide any content, except the title and meta data). A ping is sent after every new publication to let them know about the changes. It is of course still up to Google News to decide how often they come and index new posts but at least they will be made aware instantly.
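
To make the second point concrete, the sketch below builds a single news sitemap entry with the elements defined by the Google News sitemap schema (`news:publication`, `news:publication_date`, `news:title`). It only illustrates the shape of the XML and is not how the plugin generates it; the URL, publication name and title are placeholders.

```python
import xml.etree.ElementTree as ET

SM = "http://www.sitemaps.org/schemas/sitemap/0.9"
NEWS = "http://www.google.com/schemas/sitemap-news/0.9"
ET.register_namespace("", SM)
ET.register_namespace("news", NEWS)


def news_sitemap(entries):
    """Serialize entries (dicts) into a news sitemap XML string."""
    urlset = ET.Element(f"{{{SM}}}urlset")
    for entry in entries:
        url = ET.SubElement(urlset, f"{{{SM}}}url")
        ET.SubElement(url, f"{{{SM}}}loc").text = entry["loc"]
        news = ET.SubElement(url, f"{{{NEWS}}}news")
        publication = ET.SubElement(news, f"{{{NEWS}}}publication")
        ET.SubElement(publication, f"{{{NEWS}}}name").text = entry["publication"]
        ET.SubElement(publication, f"{{{NEWS}}}language").text = entry["language"]
        ET.SubElement(news, f"{{{NEWS}}}publication_date").text = entry["date"]
        ET.SubElement(news, f"{{{NEWS}}}title").text = entry["title"]
    return ET.tostring(urlset, encoding="unicode")


# Placeholder values, for illustration only.
xml_out = news_sitemap([{
    "loc": "https://example.com/us/recent-post/",
    "publication": "Example News",
    "language": "en",
    "date": "2021-09-02T12:01:06+00:00",
    "title": "Recent post",
}])
print(xml_out)
```

Note that the entry carries the post address, title and publication metadata but no article body, which is why the RSS feeds are still needed for the content itself.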

    This last part illustrates that it is very important that the news sitemap is never cached!

    To achieve that, you will probably also need to tweak W3TC. I’m not an expert on that plugin, but there should be settings that allow you to exclude certain URLs, maybe similar to the CloudFront ones (with wildcards), or otherwise by entering the full filename/path of each news sitemap.

    About the /sitemap-news.xml not working: you might need to check your .htaccess (if you’re on Apache) or your Nginx config rules. The server should be set up so that all requests (including .xml file requests) that “fail” fall back to /index.php (meaning: WordPress will handle the request).

    The default .htaccess or recommended Nginx rules (see https://wordpress.org/support/article/nginx/) do just that, but in your case there are likely to be some custom rules that interfere with .xml requests. See for example the difference between /test.xml and /test.xmlz on your site. If the issue is not on your server, then it is at CloudFront…

    Thread Starter leonard208

    (@leonard208)

    Hey @ravanh
    That’s a very detailed reply! It will help a lot.

    So what I understood, in short: the feeds (?feed=sitemap-news) must be submitted to Google Publisher Center, not the XML version.

    The news sitemap XML version does not need to be submitted in the Publisher Center, but can be added to Search Console? This is just to update Google about the pages faster? But it is still up to Google to add these to Google News?

    The feeds and the sitemap should never be cached; I will get the devs to add them to the exclusions, as well as check why /sitemap-news.xml is not being generated.

    Please let me know if I’m on track!

    No, sorry for the confusion: when I talk about feeds I mean an RSS feed address.

    So in Publisher Center you configure your RSS feeds for each news section. If you have different categories in WordPress, for example, you could add each of them as a section with an address like …/category/my-category/feed/ for the RSS feed.

    In Search Console you add the news sitemap address, either in format /sitemap-news.xml or /?feed=sitemap-news (it is all the same XML response)

    About the cache issue, I found instructions here https://www.redbridgenet.com/how-to-exclude-specific-pages-from-w3-total-cache/

    Try:

    
    sitemap.*\.xml
    sitemap-news
    */feed
    
    Thread Starter leonard208

    (@leonard208)

    @ravanh Understood! Thank you for all your help, I’ve forwarded the issues to the developers, hopefully they can get it sorted! Will keep you updated with the caching issue as well!
