• Resolved graffics

    (@graffics)


    I want to build a sitemap for my website.

    This is a multisite with over 70,000 posts in the form of directory listings. Is there a sitemap plugin that will create an XML sitemap this large?

    Does anyone have any suggestions?

Viewing 15 replies - 1 through 15 (of 25 total)
  • Moderator Ipstenu (Mika Epstein)

    (@ipstenu)

    🏳️‍🌈 Advisor and Activist

    http://wordpress.org/extend/plugins/bwp-google-xml-sitemaps/ is the only one I know of that goes over 50k.
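
For context on the 50k figure: the sitemaps.org protocol caps a single sitemap file at 50,000 URLs, so a larger site has to split its URLs across several files and list those files in a sitemap index. Below is a minimal Python sketch of that idea. It is not the plugin's code; the example.com URLs, file names, and function names are made up for illustration.

```python
from xml.sax.saxutils import escape

MAX_URLS_PER_SITEMAP = 50_000  # per-file URL limit from the sitemaps.org protocol
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def chunk_urls(urls, size=MAX_URLS_PER_SITEMAP):
    """Split a list of URLs into consecutive chunks of at most `size` items."""
    return [urls[i:i + size] for i in range(0, len(urls), size)]


def build_sitemap(urls):
    """Return the XML text of one <urlset> sitemap file."""
    entries = "\n".join(f"  <url><loc>{escape(u)}</loc></url>" for u in urls)
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        f'<urlset xmlns="{SITEMAP_NS}">\n'
        f"{entries}\n"
        "</urlset>\n"
    )


def build_sitemap_index(sitemap_urls):
    """Return the XML text of a <sitemapindex> that points at each sitemap file."""
    entries = "\n".join(
        f"  <sitemap><loc>{escape(u)}</loc></sitemap>" for u in sitemap_urls
    )
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        f'<sitemapindex xmlns="{SITEMAP_NS}">\n'
        f"{entries}\n"
        "</sitemapindex>\n"
    )


# Hypothetical data: 70,000 listing URLs on a placeholder domain.
urls = [f"https://example.com/listing/{n}" for n in range(70_000)]
chunks = chunk_urls(urls)
print(len(chunks))  # 2 files: one with 50,000 URLs, one with 20,000

# Hypothetical file names for the per-chunk sitemaps.
sitemap_files = [f"https://example.com/post_part{i + 1}.xml" for i in range(len(chunks))]
index_xml = build_sitemap_index(sitemap_files)
```

Each chunk would be written out with `build_sitemap`, and only the index file needs to be submitted to search engines.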

    Thread Starter graffics

    (@graffics)

    I am trying to use that plugin but it only accepts one sitemap from one site. The other websites give me error messages even though I have asked the plugin to create a sitemap of just the posts. Or at least I think I have.
    May 17, 2011 : 17:03:19 — Successfully generated sitemapindex.xml using module sitemapindex.php.
    May 17, 2011 : 17:03:19 — Cache file for module sitemapindex is not found and will be built right away.
    May 17, 2011 : 16:57:12 — Successfully generated page.xml using module page.php.
    May 17, 2011 : 15:42:04 — post_ppt_feedback.xml does not have any item. The plugin has fired a 404 header to the search engine that requests it. You should check the module that generates that sitemap (post.php).
    May 17, 2011 : 15:42:04 — Sub-module file: post_ppt_feedback.php is not available in both default and custom module directory. The plugin will now try loading the parent module instead.
    May 17, 2011 : 15:42:04 — Cache file for module post_ppt_feedback is not found and will be built right away.
    May 17, 2011 : 15:22:31 — post_ppt_message.xml does not have any item. The plugin has fired a 404 header to the search engine that requests it. You should check the module that generates that sitemap (post.php).
    May 17, 2011 : 15:22:31 — Sub-module file: post_ppt_message.php is not available in both default and custom module directory. The plugin will now try loading the parent module instead.
    May 17, 2011 : 15:22:31 — Cache file for module post_ppt_message is not found and will be built right away.
    May 17, 2011 : 15:01:53 — post_article_type.xml does not have any item. The plugin has fired a 404 header to the search engine that requests it. You should check the module that generates that sitemap (post.php).
    May 17, 2011 : 15:01:53 — Sub-module file: post_article_type.php is not available in both default and custom module directory. The plugin will now try loading the parent module instead.
    May 17, 2011 : 15:01:53 — Cache file for module post_article_type is not found and will be built right away.
    May 17, 2011 : 14:51:03 — Successfully generated sitemapindex.xml using module sitemapindex.php.
    May 17, 2011 : 14:47:49 — post_faq_type.xml does not have any item. The plugin has fired a 404 header to the search engine that requests it. You should check the module that generates that sitemap (post.php).
    May 17, 2011 : 14:47:49 — Sub-module file: post_faq_type.php is not available in both default and custom module directory. The plugin will now try loading the parent module instead.
    May 17, 2011 : 14:47:49 — Cache file for module post_faq_type is not found and will be built right away.
    May 17, 2011 : 14:26:06 — Requested module (taxonomy_category) not found or not allowed.
    May 17, 2011 : 14:25:17 — Requested module (taxonomy_article) not found or not allowed.
    May 17, 2011 : 13:56:49 — Requested module (taxonomy_faq) not found or not allowed.
    May 17, 2011 : 13:25:47 — Successfully generated post.xml using module post.php.
    May 17, 2011 : 13:13:42 — Requested module (taxonomy_post_tag) not found or not allowed.
    May 17, 2011 : 13:06:46 — Successfully generated page.xml using module page.php.
    May 17, 2011 : 12:50:44 — Successfully generated sitemapindex.xml using module sitemapindex.php.
    May 17, 2011 : 12:11:19 — post_ppt_message.xml does not have any item. The plugin has fired a 404 header to the search engine that requests it. You should check the module that generates that sitemap (post.php).
    May 17, 2011 : 12:11:19 — Sub-module file: post_ppt_message.php is not available in both default and custom module directory. The plugin will now try loading the parent module instead.

    Moderator Ipstenu (Mika Epstein)

    (@ipstenu)

    🏳️‍🌈 Advisor and Activist

    I am trying to use that plugin but it only accepts one sitemap from one site.

    Okay, just so you understand something, ALL sitemap plugins will work that way. They’re SUPPOSED to. WordPress MultiSite means you’re running multiple, separate, sites.

    You get every site to make a sitemap, and then link to ’em from your robots.txt. Done.
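
As a rough illustration of the robots.txt step: the sitemaps.org protocol lets a robots.txt file point at a sitemap with a `Sitemap:` line holding the full URL. Here is a small Python sketch that builds such a file's contents. The example.com address is a placeholder and the `sitemapindex.xml` path is an assumption, so use whatever URL the plugin actually reports for each site.

```python
def build_robots_txt(sitemap_urls):
    """Return robots.txt text that allows all crawling and lists each sitemap URL."""
    lines = ["User-agent: *", "Disallow:", ""]
    # One "Sitemap:" line per sitemap (or sitemap index); each must be a full URL.
    lines += [f"Sitemap: {url}" for url in sitemap_urls]
    return "\n".join(lines) + "\n"


# Hypothetical sitemap location for one site in the network.
robots_txt = build_robots_txt(["http://example.com/sitemapindex.xml"])
print(robots_txt)
```

On a multisite network, each site that is served from its own domain or subdomain would get its own robots.txt listing its own sitemap.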

    Thread Starter graffics

    (@graffics)

    Okay, yes, the plugin creates sitemaps for each website, but I do not see a robots.txt file in the root.

    Should I create one?

    Moderator Ipstenu (Mika Epstein)

    (@ipstenu)

    🏳️‍🌈 Advisor and Activist

    Go to yourdomain.com/robots.txt and see if anything pops up and, if so, what.

    You MAY need to make your own to force the sitemap in there, if the plugin doesn’t do it.

    Thread Starter graffics

    (@graffics)

    This is what it says. Does this need to be changed?

    User-agent: *
    Disallow:

    Moderator Ipstenu (Mika Epstein)

    (@ipstenu)

    🏳️‍🌈 Advisor and Activist

    Yeah, you need to add the sitemaps in there.

    Go to the plugin config tab (looks like http://s.wordpress.org/extend/plugins/bwp-google-xml-sitemaps/screenshot-3.gif?r=385968 )

    Check the ‘add sitemap to robots.txt’ option.

    Thread Starter graffics

    (@graffics)

    I have been on some other forums and they are saying a sitemap with 70,000 URLs is pointless to even have.

    Is this true? Is there another avenue I can take?

    Moderator Ipstenu (Mika Epstein)

    (@ipstenu)

    🏳️‍🌈 Advisor and Activist

    Yeah, it’s true.

    But you’re wandering off the topic 🙂

    http://wordpress.org/extend/plugins/bwp-google-xml-sitemaps/ is the only one I know of that goes over 50k.

    You need to CONFIGURE it to put the sitemap in your robots.txt (looks like http://s.wordpress.org/extend/plugins/bwp-google-xml-sitemaps/screenshot-3.gif?r=385968 )

    Thread Starter graffics

    (@graffics)

    Yes, Ipstenu, but if there is no reason to have a sitemap, then the current topic is moot and the real question is whether I need a sitemap at all.

    So what do you think? Sitemap or NO?

    I want to thank you for your help. I’m not trying to be pushy, I just need to figure this out.

    Moderator Ipstenu (Mika Epstein)

    (@ipstenu)

    🏳️‍🌈 Advisor and Activist

    Well. Read this: http://www.google.com/support/webmasters/bin/answer.py?hl=en&answer=156184&from=40318&rd=1

    If everything’s crawlable from your site, then you have nothing to worry about.

    Wow 12 posts in just one day!

    Thank you, Ipstenu, for being so helpful. As the developer of BWP GXS, I much appreciate it!

    About the topic: Personally I think 10,000 URLs should be enough, unless you publish 70,000 URLs all at the same time. Sitemaps are needed for crawlers to explore your site in a more “expected” way. Say you have a page at this level: Page 1 > page 2 > page 3, but on page 1 you don’t have any links to page 2 or 3; in such a case a sitemap can help. Also, when you have new content, a sitemap can help crawlers index it faster. There are other benefits for sure, but a sitemap will not guarantee that your content is indexed.

    Okay, yes, the plugin creates sitemaps for each website, but I do not see a robots.txt file in the root.

    It should create one if you enable that option, unless you are using a sub-folder multisite installation.

    Moderator Ipstenu (Mika Epstein)

    (@ipstenu)

    🏳️‍🌈 Advisor and Activist

    No prob 🙂 I don’t actually use the plugin, but I saw its value for others.

    Sitemaps are needed for crawlers to explore your site in a more “expected” way. Say you have a page at this level: Page 1 > page 2 > page 3, but on page 1 you don’t have any links to page 2 or 3; in such a case a sitemap can help.

    Right, which IMO is a flaw in site design. If a CRAWLER can’t find the links, how the heck is a USER going to find them!? Even a static front page should link to the rest of the site in an organic way that flows.

    Thread Starter graffics

    (@graffics)

    Right, which IMO is a flaw in site design. If a CRAWLER can’t find the links, how the heck is a USER going to find them!? Even a static front page should link to the rest of the site in an organic way that flows.

    With 70,000 posts it would be impossible for me to have all those links on the homepage. But all my posts are under categories, which I believe are all linked together.

    I believe Google will be able to crawl the whole website.

    http://www.greenbookpages.com Can someone take a look and say what they think?

    Moderator Ipstenu (Mika Epstein)

    (@ipstenu)

    🏳️‍🌈 Advisor and Activist

    You don’t have to have all 70k linked to on the main page, you have to have them LINKED to.

    If page 10000 is ONLY linked off page 99, but you can follow a link-trail from page 1 to page 99, then you’re fine 🙂
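
The link-trail point can be checked with a simple breadth-first walk over a site's internal links, which is roughly what a crawler does when it follows links from the front page. A minimal Python sketch follows; the page names and the link map are invented for illustration and are not taken from the site discussed here.

```python
from collections import deque


def reachable_pages(links, start):
    """Return the set of pages reachable from `start` by following links.

    `links` maps each page to the list of pages it links to.
    """
    seen = {start}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links.get(page, ()):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen


# Hypothetical link map: page-10000 is only linked from page-99,
# but a trail exists from page-1 to page-99.
links = {
    "page-1": ["page-2"],
    "page-2": ["page-99"],
    "page-99": ["page-10000"],
    "orphan": ["page-1"],  # links out, but nothing links to it
}

found = reachable_pages(links, "page-1")
print("page-10000" in found)  # True: reachable through the trail
print("orphan" in found)      # False: no page links to it
```

A page like `orphan`, which nothing links to, is the kind of case where a sitemap can help a crawler discover it.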

  • The topic ‘Sitemap Troubles’ is closed to new replies.