Support » Plugin: XML Sitemap & Google News » Yearly sitemaps use too much memory and make very long queries

  • The sitemap for the year 2018 (for example, but it’s true for any year) crashes due to ridiculously high memory usage (500+ MB) and very long queries (17k+ characters). That year has around 2100 posts. The question is, why is a sitemap being generated on every request, especially for a year that has already concluded? Surely some optimizations can be made regarding both problems:
    1. Generate (update) a sitemap only when an object that’s included in it is added to the db or an existing one is updated, as opposed to generating the whole sitemap on every request.
    2. Possibly make some sitemaps static, especially those for past years.
    3. Do the processing in chunks instead of running extremely long queries for all meta, then all terms, etc.
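
    For illustration, points 1 and 3 could be sketched roughly as follows. This is a minimal Python sketch and not the plugin's code: `chunked`, `YearlySitemapCache`, `render` and `invalidate` are hypothetical names, and the cache lives in memory only, whereas a real site would persist it somewhere.

```python
from typing import Callable, Dict, Iterator, Sequence


def chunked(ids: Sequence[int], size: int) -> Iterator[Sequence[int]]:
    """Yield consecutive slices of `ids`, each at most `size` items long."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(ids), size):
        yield ids[start:start + size]


class YearlySitemapCache:
    """Hypothetical cache: one rendered sitemap per year, rebuilt only on change."""

    def __init__(self, render: Callable[[int], str]) -> None:
        # `render` is an assumed callback that builds the sitemap for a year.
        self._render = render
        self._cache: Dict[int, str] = {}

    def get(self, year: int) -> str:
        """Return the cached sitemap, rendering it only on a cache miss."""
        if year not in self._cache:
            self._cache[year] = self._render(year)
        return self._cache[year]

    def invalidate(self, year: int) -> None:
        """Call this when a post in `year` is added, updated or deleted."""
        self._cache.pop(year, None)
```

    With around 2100 post IDs and a batch size of 500, `chunked` yields five batches, so no single step has to deal with all posts at once. A concluded year would be rendered once and then served from the cache until `invalidate` is called for it.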

    • This topic was modified 6 months, 1 week ago by Igor Yavych.
Viewing 8 replies - 16 through 23 (of 23 total)
  • Plugin Author RavanH

    (@ravanh)

    Anyway, it’s already planned to find a way to limit queries to 1k posts. Your suggestion to do it staggered (in batches) within one sitemap is much appreciated.

    I’ll certainly consider it for the next major release 🙂

    You never know. They’re news articles so at some point it might reach that amount of posts per month.
    Glad to hear about that, though do you have any idea what is causing the high memory usage when generating sitemaps for past years?

    Plugin Author RavanH

    (@ravanh)

    No, I don’t know why the memory usage is so high. On my test site with 3.5k posts, the no-split sitemap goes to about 100M. I’ll be testing with 7k posts to see what happens then…

    For some reason this only happens with posts from years before 2019. The 2019 and 2020 sitemaps both have reasonable memory usage.

    Plugin Author RavanH

    (@ravanh)

    I’ll do further testing with posts spread over multiple years to see if I can find anything.

    Plugin Author RavanH

    (@ravanh)

    Testing with nearly 7k posts spread over 2 years still does not reveal any excessive memory usage. Even the no-split sitemap does not go over 184M…

    The largest query was 47.7 kB, which is nowhere near the default max_allowed_packet size (it may vary per server/MySQL version, but I’ve never seen it below 1M).

    May I ask what is limiting the query size in your case? I’d like to reproduce the issue for further testing 🙂

    The no-split sitemap doesn’t have issues to begin with. Issues happen with year <=2018.
    Also, it’s not the packet size that is being limited, but the character length of the query.

    I’m not sure what kind of limit is being applied, as we have no control over this server.
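
    To make the character-length point concrete, here is a rough Python sketch of how a query that lists every post ID in one `IN (...)` clause grows with the number of posts, and how batching bounds it. The query shape and the helper names `in_clause_query` and `batched_queries` are assumptions for illustration; the plugin's actual queries may look different.

```python
from typing import List, Sequence


def in_clause_query(post_ids: Sequence[int]) -> str:
    """Build one query that lists all given IDs (assumed query shape)."""
    if not post_ids:
        raise ValueError("post_ids must not be empty")
    # int() guards against anything that is not a plain integer ID.
    id_list = ",".join(str(int(post_id)) for post_id in post_ids)
    return (
        "SELECT post_id, meta_key, meta_value "
        f"FROM wp_postmeta WHERE post_id IN ({id_list})"
    )


def batched_queries(post_ids: Sequence[int], batch_size: int) -> List[str]:
    """Split the IDs into batches and build one shorter query per batch."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [
        in_clause_query(post_ids[start:start + batch_size])
        for start in range(0, len(post_ids), batch_size)
    ]


if __name__ == "__main__":
    ids = list(range(100000, 102100))  # 2100 made-up six-digit IDs
    print(len(in_clause_query(ids)))
    print(max(len(query) for query in batched_queries(ids, 100)))
```

    With 2100 six-digit IDs the single query comes to well over 14,000 characters, while each batch of 100 IDs stays under 1,000 characters, so a per-query character limit would only affect the unbatched form.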

    Another question is, why is there even a query for taxonomies, if the option to include them is unchecked?

    Plugin Author RavanH

    (@ravanh)

    Issues happen with year <=2018

    Like I said, I cannot reproduce this behavior. And no idea why it would behave differently than other years. Sorry.

    Another question is, why is there even a query for taxonomies, if the option to include them is unchecked?

    This is caused by the fact that the plugin is using a regular WP loop. This also causes separate author queries. I’m currently looking into improving this, which will mean moving away from the main loop.

    To be continued…
