Gryz
Forum Replies Created
One more update.
I couldn’t believe something was wrong in the Google Analytics code. Google isn’t that sloppy. But then I realized that the code does not come from Google. It is part of the “Google Analytics for WordPress” plugin, and it is written by a volunteer from the WP community, not by Google.
When I looked at a lot of the websites that had the same problem, I noticed they were all using the “Google Analytics for WordPress” plugin. So this plugin might very well be the cause of the problems.
I disabled it, and replaced it with another plugin.
Ultimate Google Analytics
Let’s hope this fixes the root of the problem.
We should know in a few days. I’ll update this post.
So there are three parts to the solution.
1) Use a different GA plugin.
2) Add rules to robots.txt to stop Googlebot from crawling the old malformed search URLs.
3) Wait for the old search URLs to drop out of the Google database.
If this turns out to fix the problem, I wonder if I should notify the parties involved. Would Google remove the bogus 18 million entries from their database ? Should I contact the author of the GA for WordPress plugin ?
Anyway, Relevanssi had nothing to do with this. It was a very useful tool to warn us that something was wrong. Thanks for all your effort, msaari !
It turns out our webhost keeps a logfile with all HTTP requests.
I can see that many of the “no-results” queries are from Googlebot.
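A minimal sketch of how such a logfile could be tallied, assuming the host writes the common Apache/Nginx "combined" format. The sample lines, IP addresses and the function name are made up for illustration, and a user-agent string can be spoofed, so a "Googlebot" hit here is only an indication.

```python
import re
from collections import Counter

# Assumed "combined" access-log layout:
# ip ident user [time] "METHOD path PROTO" status bytes "referer" "user-agent"
LINE_RE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]*\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" '
    r'\d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)

def count_no_results_by_source(lines):
    """Count requests whose path contains 'no-results', grouped by source.

    Requests whose user agent mentions Googlebot are grouped under
    'Googlebot'; everything else is grouped by client IP address.
    """
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if not match or "no-results" not in match.group("path"):
            continue
        if "Googlebot" in match.group("agent"):
            counts["Googlebot"] += 1
        else:
            counts[match.group("ip")] += 1
    return counts

# Hypothetical sample lines, not taken from a real log.
sample = [
    '192.0.2.10 - - [01/Jan/2000:10:00:00 +0000] '
    '"GET /page/3/?s=no-results:no-results:foo HTTP/1.1" 200 5120 "-" '
    '"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '203.0.113.7 - - [01/Jan/2000:10:00:05 +0000] '
    '"GET /?s=wordpress HTTP/1.1" 200 4096 "-" "Mozilla/5.0"',
]

counts = count_no_results_by_source(sample)
print(counts)
```

On a real log, the same function could be fed the open file object instead of the sample list.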
I now also understand why the
Disallow: /?s=
line in robots.txt didn’t work. It turns out Google does queries for
/page/3/?s=no-results:no-results:<etc>
/page/8/?s=no-results:<etc>
So I added another line to robots.txt.
Disallow: /page/*/?s=
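A minimal sketch of how these two rules behave, assuming Google-style robots.txt matching: a rule is matched from the start of the URL path, * stands for any run of characters, a trailing $ anchors the end, and every other character (including ? and =) is taken literally. The function name is hypothetical.

```python
import re

def rule_matches(rule_path, url_path):
    """Return True if a robots.txt path rule matches the given URL path.

    Assumed semantics: prefix match from the start of the path, '*' matches
    any sequence of characters, a trailing '$' anchors the end of the URL,
    and all other characters are literal.
    """
    anchored = rule_path.endswith("$")
    if anchored:
        rule_path = rule_path[:-1]
    # Escape each literal piece, then join the pieces with '.*' for each '*'.
    pattern = ".*".join(re.escape(part) for part in rule_path.split("*"))
    if anchored:
        pattern += "$"
    return re.match(pattern, url_path) is not None

old_rule = "/?s="
new_rule = "/page/*/?s="
paged_url = "/page/3/?s=no-results:no-results:foo"

print(rule_matches(old_rule, "/?s=foo"))   # the old rule covers the plain search URL
print(rule_matches(old_rule, paged_url))   # but not the paginated one
print(rule_matches(new_rule, paged_url))   # the new rule covers the paginated one
```

Because the old rule has to match from the start of the path, it cannot cover URLs that begin with /page/, which is why the second rule is needed.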
I hope the ? and = characters are not special characters, like * is.
Msaari, if you are still reading this.
I have a small suggestion.
Maybe you can include a line:
<meta name="robots" content="noindex">
in all result-pages from searches ?
I don’t think people want dynamic search results indexed in search engines anyway. So if WordPress/Relevanssi would include the “noindex” tag in all search results, that could prevent problems ?
Thanks for the suggestion. I’m an old C programmer who used to write C code for networking devices. I have no knowledge of PHP, and I’m not sure I want to check out all the WP code to see how it hangs together. I was hoping for a log function in WP, where I can just go through all HTTP requests. Maybe I’ll see if I can write some code.
I’ve grepped through all the php-code. The only place where I could find the exact string “no-results:” was in the google analytics code.
From googleanalytics.php:
} else if ($wp_query->is_search) {
    $pushstr = "'_trackPageview','".get_bloginfo('url')."/?s=";
    if ($wp_query->found_posts == 0) {
        $push[] = $pushstr."no-results:".rawurlencode($wp_query->query_vars['s'])."&cat=no-results'";
    } else
It looks like the string “no-results:” is prepended to the search string. This seems like a place where excessive “no-results:” prefixes could be prepended.
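A minimal sketch of how the prefix could pile up, assuming (this is a guess, not something confirmed) that the tracked URL itself gets requested again as a search and again finds nothing. The function names and example.com are made up; quote(..., safe="") stands in for PHP’s rawurlencode(), and only the zero-results branch of the snippet is modelled.

```python
from urllib.parse import parse_qs, quote, urlsplit

def tracked_no_results_url(site_url, search_term):
    """Mimic the zero-results branch: report '<site>/?s=no-results:<term>&cat=no-results'."""
    return site_url + "/?s=no-results:" + quote(search_term, safe="") + "&cat=no-results"

def search_term_from_url(url):
    """Extract the decoded 's' parameter, as a site would when the URL is requested."""
    return parse_qs(urlsplit(url).query)["s"][0]

term = "foo"
history = []
for _ in range(3):
    # Each round: the search finds nothing, the tracked URL is reported,
    # and (by assumption) that URL is later requested as a new search.
    url = tracked_no_results_url("http://example.com", term)
    term = search_term_from_url(url)
    history.append(term)

for entry in history:
    print(entry)
```

Each round trip adds one more “no-results:” prefix, which would fit the nested queries showing up in the logs, if the feedback assumption holds.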
I disabled Google Analytics for a few minutes on our website, and I still saw new searches with the mangled query. :( It’s very weird. I’ll look into it again this weekend.
The problem is happening at many sites.
When searching on Google for “no-results:no-results:”, it reports 15.8 million results ! Although Google only actually listed 355 results for me. Still doesn’t look good. I’m surprised nobody ever looked at this before.
http://www.google.nl/search?complete=0&q=%22no-results%3Ano-results%3A%22
I figured out how Google can pick up weird search URLs. We are running Google Analytics. When some broken (or weird) site is generating those searches with the nested no-results: queries, the resulting page will trigger Google Analytics. And Google will be notified about the existence of the no-results: page. Maybe Google uses that information in their page-ranking algorithms ? Not sure if this is what happens, but it could explain one part of the puzzle.
Is there a way to see where searches are coming from (IP address or domain name) ? When I reset the Relevanssi logs, I’m getting log entries with no-results: in the query string within minutes. I can’t believe it’s Google’s web crawler that is so quick to crawl my website. However, if it’s not Google, then how does Google pick up those bogus URLs ?
Also, when I disable Relevanssi, is there a way for me to see the query-strings that get processed by the default WP-search engine ? That would allow me to prove to myself that the bogus searches also happen when Relevanssi is disabled.
After posting here, I realized that this support-forum isn’t about Relevanssi, but about plugins in general. Sorry about that. But if Relevanssi isn’t the cause, then this is maybe a correct place to ask ?
So what is causing this ?
The WordPress setup we have is nothing special. Only a handful of plugins. And our site is clearly not the only site that has this problem.
We have had the line:
Disallow: /search
in our robots.txt file for a long time. I guess that isn’t enough, and we need /?s= in there as well. But I’d rather see the root cause disappear than have every WordPress user in the world change his robots.txt file manually.
Forum: Plugins
In reply to: [Relevanssi - A Better Search] Causing massive CPU load
Relevanssi was updated today (June 21st).
http://www.relevanssi.com/release-notes/free-2-9-3/
Quote: “I’m sorry to report 2.9.2 was a buggy update. It had a very small bug - just an add_action() call to a non-existing function - with very large consequences. Fortunately nothing permanent, so a quick upgrade to 2.9.3 will fix this”.
I’m sometimes helping a family member with her WordPress site. Today her website showed much higher memory usage than normally. I noticed she had installed a new plugin (Relevanssi). I also noticed high cpu-usage (never seen that before). After the upgrade, things seem back to normal.
Unfortunately Mikko did not notify you in this thread (which probably got the ball rolling). And he doesn’t find it necessary to tell his users exactly what went wrong, and what the symptoms are. That stuff helps users to find the problem, so they can fix it, even if only by making googling easier. Keeping your users updated is the first step towards keeping your users happy.
Forum: Fixing WordPress
In reply to: What average memory usage can I expect with a simple WP site ?
So nobody can tell me what memory consumption to expect on a standard WP installation with the TwentyTen theme ?
We are using the latest non-beta version (as usual). 3.2.4.
It seems XML sitemap generator uses a spike of a few MB while creating the sitemap. That’s fine, I guess. XML sitemap generator was the plugin that showed me there is a problem with memory on my server. But that doesn’t mean it is the cause.
Once I realized that, I could have removed my post here. Or kept it, hoping that someone would read it and make a useful comment. I’ve spoken with a friend (who is not very technical, but does have a lot of experience with building websites). He told me that my webspace provider might have installed a lot of stuff in Apache or PHP, which can be statically linked into the executables and thus cause my webserver process to grow, without me having any influence on it. I think it’s weird, because that means they sell a web process with 48 MB of RAM, of which 25-30 MB is already used just by Apache/PHP.
At the moment the website runs fine. I disabled 1-2 specific plugins, and the normal memory usage is 42 MB out of 48 MB. I am now waiting to see if the server ever restarts/reboots. Then I can maybe see if we have a memory leak, or whether the high usage is normal.
No way to edit these posts ?
Anyway, I copied the whole website to my own PC. I’m using XAMPP for Windows. I copied the SQL database and the whole WordPress folder from the original website (and made a few small changes to the wp-config.php file for the new URL: localhost). This allowed me to mess around with all plugins.
WordPress install with Akismet, and no other plugins: 16.8MB.
I enabled all plugins one by one, and got to a maximum of 23MB.
I have no idea how and why we got to 45MB used on the webhost, and only 23MB on my local setup. Very weird.
Maybe a memory leak ? Unfortunately I can find no way to restart our process on the webhost. :(
I have a little more data.
I installed the TPC Memory Usage plugin.
It seems the website is using 40-45MB constantly. So the number reported by this XML sitemap plugin is not the memory usage of the plugin only, but the usage of the whole process.
Still, adding 2500 tags to the sitemap adds 3-5MB (or more) to memory usage. And that pushes memory usage over the 48MB limit. Not good.
Now I am wondering: is a base memory usage of 45MB normal or not ? Too bad the TPC Memory Usage plugin does not specify how much memory each plugin uses. I guess I’m gonna disable plugins one by one to see what impact they have on memory usage. But I can only do that during the “quiet hours”.