• Hi, thank you for your work.
    We’re using two of your sitemaps, with the Yoast sitemap function deactivated.
    The plugin is working very well.
    Reading the forum, I saw that it is not good to overload the Google crawler,
    but we have about 50k posts in the sitemap, divided by month.
    So I kindly ask: is it possible to make a sitemap of only the last 1,000 posts (for instance), leaving out the older ones?
    Thank you very much for your time
    M

    • This topic was modified 2 years, 3 months ago by Mattiave.

Viewing 6 replies - 1 through 6 (of 6 total)
  • Hi, why do you worry about overloading the Google crawler? Having 50k posts in your sitemap is not going to overload it. The crawler will decide on its own when and how many times it visits your site. It might base its decision on
    the last modification dates in your sitemaps, but it might also decide differently.

    How many posts do you have in your last month? You could just start with submitting only the current month sitemap (sitemap-posttype-post.202204.xml) instead of the complete index (sitemap.xml) all at once. Then later submit older months…
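For anyone who wants to submit the monthly sitemaps one at a time, here is a minimal sketch that builds the file names for the most recent months. It only assumes the naming pattern shown above (sitemap-posttype-post.202204.xml); the function name and its parameters are hypothetical and not part of the plugin.

```python
from datetime import date


def monthly_sitemap_names(latest: date, months: int = 1, post_type: str = "post") -> list[str]:
    """Return sitemap file names for the `months` most recent months, newest first.

    The name pattern is copied from the example in this thread; it is an
    assumption that other months and post types follow the same pattern.
    """
    names = []
    year, month = latest.year, latest.month
    for _ in range(months):
        names.append(f"sitemap-posttype-{post_type}.{year:04d}{month:02d}.xml")
        # Step back one month, rolling over into December of the previous year.
        month -= 1
        if month == 0:
            year, month = year - 1, 12
    return names


if __name__ == "__main__":
    for name in monthly_sitemap_names(date(2022, 4, 15), months=3):
        print(name)
```

Starting with `months=1` and raising the number later matches the "submit the current month first, older months later" approach.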

    But again: I would not worry about the crawler. You are not responsible for the crawler, and the crawler has no obligation to your sitemap. It will probably use the sitemap but it will decide on its own when and how. Or it may decide to just start crawling your site starting with the home page, simply following links.

    In any case, the submitted sitemap will help you to monitor how many of the pages/posts are already indexed and how many remain on the waiting list.

    Thread Starter Mattiave

    (@mattiave)

    Thanks Rolf,
    I’ll do as you suggest, your reasoning makes sense.
    We were just thinking about this after reading in a forum that crawler bandwidth is limited and that having obsolete items takes resources away from indexing new things …
    but it’s true, nobody says to limit the sitemap…
    thank you
    M

    Well… I’d argue that if you are worried about the crawler taking bandwidth, you should be more worried about the bandwidth that the new (future) visitors referred by the crawler (search engine) will be taking.

    When your visitor numbers grow, consider upgrading your site hosting. Better than limiting crawlers 🙂

    Thread Starter Mattiave

    (@mattiave)

    No, I did not explain myself well.
    The issue here is not the bandwidth of the server (it is a dedicated one)
    but what, according to recent theories, is called “crawl budget”:
    that is what gets consumed (perhaps) uselessly on old things…

    Ok now I see 🙂 indeed the sitemap should help the crawler decide where to spend this “crawl budget”. In theory, the parameter “Priority” should help but the biggest search engine has announced it does not take that parameter into account. That leaves the last modification date, where it should favour the more recently modified posts.

    Please note that if you modify (or even just re-save) an old post, the modification date will be updated. But if you leave your old posts alone they should keep an old modification date and the crawler will (likely) first try to index a more recent post or page.
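To see which entries carry the most recent modification dates, here is a rough sketch that reads a standard sitemap and lists the newest URLs first. It illustrates the lastmod ordering only; it is not how any search engine actually schedules its crawl, and it is not plugin code. The function name, the sample XML and the example.com URLs are made up for illustration, and it assumes every lastmod value uses the same ISO 8601 format with a timezone offset.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

# Namespace defined by the sitemaps.org protocol.
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Made-up sample data, for illustration only.
SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/old-post/</loc><lastmod>2019-01-05T10:00:00+00:00</lastmod></url>
  <url><loc>https://example.com/new-post/</loc><lastmod>2022-04-10T08:30:00+00:00</lastmod></url>
  <url><loc>https://example.com/mid-post/</loc><lastmod>2021-07-20T12:00:00+00:00</lastmod></url>
</urlset>"""


def most_recent_urls(sitemap_xml: str, limit: int) -> list[str]:
    """Return up to `limit` URLs from a sitemap, newest lastmod first."""
    root = ET.fromstring(sitemap_xml)
    entries = []
    for url in root.findall("sm:url", NS):
        loc = url.findtext("sm:loc", namespaces=NS)
        lastmod = url.findtext("sm:lastmod", namespaces=NS)
        if loc is None or lastmod is None:
            continue  # skip entries without a URL or a modification date
        entries.append((datetime.fromisoformat(lastmod.strip()), loc.strip()))
    entries.sort(key=lambda entry: entry[0], reverse=True)
    return [loc for _, loc in entries[:limit]]


if __name__ == "__main__":
    for loc in most_recent_urls(SAMPLE, 2):
        print(loc)
```

With `limit=1000` this gives the "last 1,000 posts" view asked about at the top of the thread, without removing anything from the sitemap itself.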

    Tip: the plugin has an option to update the modification date upon each new comment, so make sure NOT to enable that option. You can try enabling the Automatic Priority Calculation, which is built to give old posts a lower priority automatically (unless they have many comments), but this is an “expensive” option in the sense that it takes a lot of server resources to calculate and therefore might be too heavy for a very large site. Plus, like I said before, Google has indicated the priority value is ignored anyway…

    Good luck with your site 🙂

    Thread Starter Mattiave

    (@mattiave)

    perfect!
    thanks for your patience,
    You hit the mark and your suggestions are valuable!
    All the best Rolf

  • The topic ‘Can I live out old posts’ is closed to new replies.