[resolved] Sitemap Troubles (26 posts)

  1. graffics
    Member
    Posted 1 year ago #

    I want to build a sitemap for my website.

    This is a multisite with over 70,000 posts in the form of directory listings. Is there a sitemap plugin that will create an XML sitemap this large?

    Does anyone have any suggestions?

  2. http://wordpress.org/extend/plugins/bwp-google-xml-sitemaps/ is the only one I know of that goes over 50k.
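
For context on the 50k figure: the sitemap protocol caps a single sitemap file at 50,000 URLs, so a site with 70,000 posts needs several sitemap files tied together by a sitemap index. Below is a minimal Python sketch of that split. The example.com URLs and the file names are hypothetical, and this is not a description of how BWP GXS works internally.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_FILE = 50_000  # per-file URL limit from the sitemap protocol


def chunk_urls(urls, size=MAX_URLS_PER_FILE):
    """Split a list of URLs into consecutive chunks of at most `size`."""
    return [urls[i:i + size] for i in range(0, len(urls), size)]


def build_urlset(urls):
    """Return one <urlset> sitemap document (as a string) for a chunk."""
    root = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        ET.SubElement(ET.SubElement(root, "url"), "loc").text = url
    return ET.tostring(root, encoding="unicode")


def build_index(sitemap_urls):
    """Return a <sitemapindex> document that points at each sitemap file."""
    root = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for url in sitemap_urls:
        ET.SubElement(ET.SubElement(root, "sitemap"), "loc").text = url
    return ET.tostring(root, encoding="unicode")


# Hypothetical data: 70,000 listing URLs on a made-up domain.
urls = [f"http://example.com/listing/{n}" for n in range(70_000)]
chunks = chunk_urls(urls)

# Hypothetical file names, one sitemap file per chunk.
files = {f"post_part{i + 1}.xml": build_urlset(c) for i, c in enumerate(chunks)}
index = build_index([f"http://example.com/{name}" for name in files])
```

With 70,000 URLs this yields two sitemap files (50,000 and 20,000 URLs) plus one index that references both.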

  3. graffics
    Member
    Posted 1 year ago #

    I am trying to use that plugin but it only accepts one sitemap from one site. Other websites give me error messages even though I have asked the plugin to just create a sitemap of the posts. Or at least I think I have.
    May 17, 2011 : 17:03:19 — Successfully generated sitemapindex.xml using module sitemapindex.php.
    May 17, 2011 : 17:03:19 — Cache file for module sitemapindex is not found and will be built right away.
    May 17, 2011 : 16:57:12 — Successfully generated page.xml using module page.php.
    May 17, 2011 : 15:42:04 — post_ppt_feedback.xml does not have any item. The plugin has fired a 404 header to the search engine that requests it. You should check the module that generates that sitemap (post.php).
    May 17, 2011 : 15:42:04 — Sub-module file: post_ppt_feedback.php is not available in both default and custom module directory. The plugin will now try loading the parent module instead.
    May 17, 2011 : 15:42:04 — Cache file for module post_ppt_feedback is not found and will be built right away.
    May 17, 2011 : 15:22:31 — post_ppt_message.xml does not have any item. The plugin has fired a 404 header to the search engine that requests it. You should check the module that generates that sitemap (post.php).
    May 17, 2011 : 15:22:31 — Sub-module file: post_ppt_message.php is not available in both default and custom module directory. The plugin will now try loading the parent module instead.
    May 17, 2011 : 15:22:31 — Cache file for module post_ppt_message is not found and will be built right away.
    May 17, 2011 : 15:01:53 — post_article_type.xml does not have any item. The plugin has fired a 404 header to the search engine that requests it. You should check the module that generates that sitemap (post.php).
    May 17, 2011 : 15:01:53 — Sub-module file: post_article_type.php is not available in both default and custom module directory. The plugin will now try loading the parent module instead.
    May 17, 2011 : 15:01:53 — Cache file for module post_article_type is not found and will be built right away.
    May 17, 2011 : 14:51:03 — Successfully generated sitemapindex.xml using module sitemapindex.php.
    May 17, 2011 : 14:47:49 — post_faq_type.xml does not have any item. The plugin has fired a 404 header to the search engine that requests it. You should check the module that generates that sitemap (post.php).
    May 17, 2011 : 14:47:49 — Sub-module file: post_faq_type.php is not available in both default and custom module directory. The plugin will now try loading the parent module instead.
    May 17, 2011 : 14:47:49 — Cache file for module post_faq_type is not found and will be built right away.
    May 17, 2011 : 14:26:06 — Requested module (taxonomy_category) not found or not allowed.
    May 17, 2011 : 14:25:17 — Requested module (taxonomy_article) not found or not allowed.
    May 17, 2011 : 13:56:49 — Requested module (taxonomy_faq) not found or not allowed.
    May 17, 2011 : 13:25:47 — Successfully generated post.xml using module post.php.
    May 17, 2011 : 13:13:42 — Requested module (taxonomy_post_tag) not found or not allowed.
    May 17, 2011 : 13:06:46 — Successfully generated page.xml using module page.php.
    May 17, 2011 : 12:50:44 — Successfully generated sitemapindex.xml using module sitemapindex.php.
    May 17, 2011 : 12:11:19 — post_ppt_message.xml does not have any item. The plugin has fired a 404 header to the search engine that requests it. You should check the module that generates that sitemap (post.php).
    May 17, 2011 : 12:11:19 — Sub-module file: post_ppt_message.php is not available in both default and custom module directory. The plugin will now try loading the parent module instead.

  4. I am trying to use that plugin but it only accepts one sitemap from one site.

    Okay, just so you understand something, ALL sitemap plugins will work that way. They're SUPPOSED to. WordPress MultiSite means you're running multiple, separate sites.

    You get every site to make a sitemap, and then link to 'em from your robots.txt. Done.
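
A minimal sketch of what "link to 'em from your robots.txt" could produce, written in Python. It assumes hypothetical sites under example.com that each serve a sitemapindex.xml; the helper name is made up for illustration, and the plugin's own output may look different.

```python
def robots_with_sitemaps(sitemap_urls, disallow=""):
    """Build robots.txt text that allows crawling and lists each sitemap."""
    lines = ["User-agent: *", f"Disallow: {disallow}".rstrip(), ""]
    # One "Sitemap:" line per site; the directive may appear multiple times.
    lines += [f"Sitemap: {url}" for url in sitemap_urls]
    return "\n".join(lines) + "\n"


# Hypothetical multisite network with three sites.
sitemaps = [
    "http://example.com/sitemapindex.xml",
    "http://site1.example.com/sitemapindex.xml",
    "http://site2.example.com/sitemapindex.xml",
]
robots_txt = robots_with_sitemaps(sitemaps)
print(robots_txt)
```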

  5. graffics
    Member
    Posted 1 year ago #

    Okay, yes, the plugin creates sitemaps for each website, but I do not see a robots.txt file in the root.

    Should I create one?

  6. Go to yourdomain.com/robots.txt and see if anything pops up and, if so, what.

    You MAY need to make your own to force the sitemap in there, if the plugin doesn't do it.

  7. graffics
    Member
    Posted 1 year ago #

    This is what it says. Does this need to be changed?

    User-agent: *
    Disallow:
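
One way to check a robots.txt like the one above for sitemap entries is Python's standard `urllib.robotparser` module (its `site_maps()` method needs Python 3.8 or later). A small sketch follows; the helper name and the example.com URL are illustrative assumptions.

```python
from urllib.robotparser import RobotFileParser


def sitemaps_in_robots(robots_txt):
    """Return the list of Sitemap URLs declared in robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # site_maps() returns None when no Sitemap line is present.
    return parser.site_maps() or []


# The file quoted in the post above: no Sitemap line at all.
without_sitemap = "User-agent: *\nDisallow:\n"

# The same file with a hypothetical Sitemap line added.
with_sitemap = without_sitemap + "\nSitemap: http://example.com/sitemapindex.xml\n"

print(sitemaps_in_robots(without_sitemap))
print(sitemaps_in_robots(with_sitemap))
```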

  8. Yeah, you need to add the sitemaps in there.

    Go to the plugin config tab (looks like http://s.wordpress.org/extend/plugins/bwp-google-xml-sitemaps/screenshot-3.gif?r=385968 )

    Check the 'add sitemap to robots.txt' option.

  9. graffics
    Member
    Posted 1 year ago #

    I have been on some other forums and they are saying a sitemap with 70,000 URLs is pointless to even have.

    Is this true? Is there another avenue I can take?

  10. Yeah, it's true.

    But you're wandering off the topic :)

    http://wordpress.org/extend/plugins/bwp-google-xml-sitemaps/ is the only one I know of that goes over 50k.

    You need to CONFIGURE it to put the sitemap in your robots.txt (looks like http://s.wordpress.org/extend/plugins/bwp-google-xml-sitemaps/screenshot-3.gif?r=385968 )

  11. graffics
    Member
    Posted 1 year ago #

    Yes, Ipstenu, but if there is no reason for a sitemap, then the current topic is moot and the real question is whether I need a sitemap at all.

    So what do you think? Sitemap or NO?

    I want to thank you for your help, and I'm not being pushy; I just need to figure this out.

  12. Well. Read this: http://www.google.com/support/webmasters/bin/answer.py?hl=en&answer=156184&from=40318&rd=1

    If everything's crawlable from your site, then you have nothing to worry about.

  13. OddOneOut
    Member
    Posted 1 year ago #

    Wow 12 posts in just one day!

    Thank you, Ipstenu, for being so helpful. As the developer of BWP GXS, I much appreciate it!

    About the topic: Personally I think 10,000 URLs should be enough, unless you publish 70,000 URLs all at the same time. Sitemaps are needed for crawlers to explore your site in a more "expected" way. Say you have a page at this level: Page 1 > page 2 > page 3, but on page 1 you don't have any links to page 2 or 3; in such a case a sitemap can help. Also, when you have new content, a sitemap can help crawlers index it faster. There are other benefits for sure, but a sitemap will not ensure that your content is indexed.

    Okay, yes, the plugin creates sitemaps for each website, but I do not see a robots.txt file in the root.

    It should create one if you enable that option, unless you are using a sub-folder multisite installation.

  14. No prob :) I don't actually use the plugin, but I saw its value for others.

    Sitemaps are needed for crawlers to explore your site in a more "expected" way. Say you have a page at this level: Page 1 > page 2 > page 3, but on page 1 you don't have any links to page 2 or 3; in such a case a sitemap can help.

    Right, which IMO is a flaw in site design. If a CRAWLER can't find the links, how the heck is a USER going to find them!? Even a static front page should link to the rest of the site in an organic way that flows.

  15. graffics
    Member
    Posted 1 year ago #

    Right, which IMO is a flaw in site design. If a CRAWLER can't find the links, how the heck is a USER going to find them!? Even a static front page should link to the rest of the site in an organic way that flows.

    With 70,000 posts it would be impossible for me to have all those links on the homepage. But all my posts are under categories, which I believe are all linked together.

    I believe Google will be able to crawl the whole website.

    http://www.greenbookpages.com Can someone take a look and say what they think?

  16. You don't have to have all 70k linked to on the main page, you have to have them LINKED to.

    If page 10000 is ONLY linked off page 99, but you can follow a link-trail from page 1 to page 99, then you're fine :)
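
The link-trail idea above amounts to a reachability check over the site's internal links. Here is a minimal Python sketch using breadth-first search. The page names and the link map are invented for illustration; a real check would build the map from a crawl of the site.

```python
from collections import deque


def reachable_from(start, links):
    """Return every page that can be reached by following links from `start`."""
    seen = {start}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links.get(page, ()):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen


# Hypothetical link map: each page maps to the pages it links to.
links = {
    "page-1": ["page-2"],
    "page-2": ["page-99"],
    "page-99": ["page-10000"],  # page-10000 is ONLY linked from page-99
    "orphan": ["page-1"],       # nothing links TO this page
}

all_pages = set(links) | {t for targets in links.values() for t in targets}
unreachable = all_pages - reachable_from("page-1", links)
print(sorted(unreachable))
```

Here page-10000 is reachable through the trail page-1, page-2, page-99, so only the orphan page is reported. Pages like that orphan are the ones a crawler cannot find by following links alone.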

  17. drokkon
    Member
    Posted 1 year ago #

    If I understand correctly, if you have a subdomain multisite, then you'll have one robots.txt file under each install, e.g. site1.main.com/robots.txt, site2.main.com/robots.txt. And if you have a subfolder multisite, then you'll have just one robots.txt file under main.com/robots.txt that will contain all the sitemaps, no?

    Two questions: first of all, I have a sub-folder install, and my robots.txt file only says:

    # Added by Link Alias Generator (LAG) module
    User-agent: *
    Disallow: /go/
    # End LAG
    
    User-agent: *
    Disallow:
    
    Sitemap: http://jotcreative.net/sitemapindex.xml

    Is that right? I unchecked and rechecked the option in the plugin to add the site indexes to robots.txt, but it sure doesn't look like it's adding all 7 of the sites in my multisite...

    Second...is there any benefit to switching to subdomain over subfolder? Can't remember why I picked one over the other, but...

  18. Andrea_r
    team pirate
    Posted 1 year ago #

    No, one robots.txt file works for all sites regardless of format. Same as one set of WP files for all sites.

  19. drokkon
    Member
    Posted 1 year ago #

    Right, but shouldn't that one robots.txt file link to the individual sitemaps, not just the sitemap for the main blog?

  20. Andrea_r
    team pirate
    Posted 1 year ago #

    I wouldn't put the sitemap link in there myself.

  21. drokkon
    Member
    Posted 1 year ago #

    At all? Hm. Now I'm very confused, especially in light of Ipstenu's recommendations (see screenshot). Maybe OddOneOut can shed some light on the plugin itself.

  22. I would, actually, list all the sitemaps there. I do on one site.

    Then again I don't always bother with a sitemap, since if everything's linked, what do I need it for other than priority... And page age tends to cover that, in my experience.

  23. OddOneOut
    Member
    Posted 1 year ago #

    Is that right? I unchecked and rechecked the option in the plugin to add the site indexes to robots.txt, but it sure doesn't look like it's adding all 7 of the sites in my multisite...

    I will improve this in the next version, if you can wait :).

  24. drokkon
    Member
    Posted 1 year ago #

    Of COURSE I can! Great plugin. :) Was looking for an effective multisite solution...

  25. Andrea_r
    team pirate
    Posted 1 year ago #

    I don't always bother with a sitemap, since if everything's linked, what do I need it for other than priority... And page age tends to cover that, in my experience.

    Yeah that's usually my rationale.

  26. graffics
    Member
    Posted 1 year ago #

    Thanks everyone for your suggestions

Topic Closed

This topic has been closed to new replies.