Yes, they see posts. Search_Engine_Optimization_for_Wordpress has some info.
Also check your site:
If search engines only indexed static HTML files, they would miss the larger part of the websites on the internet. No, they index everything that is publicly served; the only thing to wonder about is what and how, since I notice that Google prefers tag archives over individual posts (the “permalinks”), for example. Otherwise there are no problems at all, with either “ugly” or “pretty” permalinks.
I keep reading the info but I don’t get it. You say that search engines index everything? What is everything? How can a search engine get into a MySQL database and index articles and things stored in there?
It can’t peek inside a database. But it can see the same webpage that everybody else can see, and that is what it indexes.
It might help you understand that search engines do see your WordPress site as HTML because the WP engine processes PHP and MySQL data then returns HTML code to the browser.
Do a View > Source code in your browser and you will see your posts are presented in HTML code such as h2 and paragraph tags. This is similar enough to what those search engine bots see.
I do understand that the posts are displayed as HTML. I also thought that they were stored in the MySQL database and that the HTML was being dynamically built. So when a search engine starts looking at HTML, I don’t know how it can see a post that is dynamically built HTML.
I’m sure I’m missing some big concept here and I want to understand it.
Unless I’m wrong that the posts are stored in the MySQL database…if they are actually stored in HTML documents, then it all makes sense.
Posts are stored in a MySQL database. Do you need to peek into your database to see what is on your blog? No.
And neither does Googlebot.
As was already explained, a searchbot sees basically what YOU see.
But how does the SearchBot know to do this? Most blogs require a login to see blogs.
None of the blogs I visit require me to log in to see the content, only if I want to comment on them.
I’ve never seen a blog where I have to log in to see anything; that would be silly.
Mpiaser, when you go to a page built (for example) with WordPress, WP will ‘tell’ the browser what it is supposed to show. In most cases that will be an index page with the last 20 (or so) posts. This page is built because PHP scripts pull the information from the database and send it to the browser as HTML, and the browser comes up with a nice page that looks the same as a static HTML page. This result is also what is indexed by search engines. The same thing happens when you visit an archive, tag archive, single post (“permalink”) or whatever. An HTML page is generated, viewable by your browser and ‘indexable’ by Google.
Nothing, of course, happens with your administration part, which is indeed safely behind a password.
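To make the page-building step above concrete, here is a rough Python sketch of the same idea. It is not WordPress code: sqlite3 stands in for MySQL, and the posts table, its columns, and the build_index_page function are made up for illustration. The /?p=ID links only imitate the “ugly” permalink style.

```python
import sqlite3
from html import escape


def build_index_page(db, limit=20):
    """Build an HTML index page from the newest rows in a (hypothetical) posts table."""
    rows = db.execute(
        "SELECT id, title, body FROM posts ORDER BY id DESC LIMIT ?", (limit,)
    ).fetchall()
    parts = ["<html><body>"]
    for post_id, title, body in rows:
        # Each post becomes plain HTML: a linked heading and a paragraph.
        parts.append(f'<h2><a href="/?p={post_id}">{escape(title)}</a></h2>')
        parts.append(f"<p>{escape(body)}</p>")
    parts.append("</body></html>")
    return "\n".join(parts)


# An in-memory SQLite database plays the role of the MySQL server here.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, title TEXT, body TEXT)")
db.executemany(
    "INSERT INTO posts (title, body) VALUES (?, ?)",
    [("Hello world", "My first post."), ("Second post", "More text.")],
)

page = build_index_page(db)
print(page)
```

Whatever gets printed is all a visitor ever receives, whether that visitor is a person with a browser or a search engine bot. The database itself never leaves the server.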
Does this make sense?
Makes sense but I’m still missing the real thing. I expect that WP is creating some permanent HTML page which contains the URLs to the blogs. Then it would make sense, since Google can read the links in an HTML document and follow them.
I know that WP is dynamically creating the temporary HTML pages using PHP and MySQL. I’m a programmer and I do this all the time. But, those pages are temporary (unless WP is saving the HTML code before presenting it to the browser) so they aren’t really readable by Google.
I expect that WP is creating some permanent HTML page
Nope. Some plugins do this for speed reasons, but by default, nope. And even if it did, it wouldn’t help search engines.
But, those pages are temporary (unless WP is saving the HTML code before presenting it to the browser) so they aren’t really readable by Google.
Sure they are. They’re readable by you and a web browser, aren’t they? Why would Google be any different?
Google sees exactly the same thing you do. Really. This is not all that complicated.
Look, a web browser connects to the server and asks for a page. The server runs some PHP code (WordPress) to generate that page, and sends the resulting HTML back to the web browser. Right? Okay.
Think of the Googlebot as a web browser. Because it is one. It sends a request to the server, gets the exact same HTML results back. Done and done. Googlebot does not care *how* the HTML was created. It doesn’t care whether it’s a file or not. All it sees is a URL and whatever it gets back from that URL. That’s it. It’s not any different than any other web browser.
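If it helps to see this in code, here is a small Python sketch of the request/response cycle described above. Everything in it is an assumption for illustration: the tiny local server stands in for WordPress, the POSTS dictionary stands in for the database, and the two User-Agent strings are invented names, not real browser or Googlebot identifiers.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# Stand-in for the database: nothing here is ever saved as an .html file.
POSTS = {"/first-post": "Hello from a dynamically built page."}


class BlogHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        text = POSTS.get(self.path)
        if text is None:
            self.send_error(404)
            return
        # The HTML is generated on every request, then sent back to the client.
        body = f"<html><body><h2>Post</h2><p>{text}</p></body></html>".encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example output quiet


# Ignore any system proxy settings so the request goes straight to localhost.
opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def fetch(url, user_agent):
    """Request a URL the way any HTTP client would and return the HTML text."""
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    with opener.open(request) as response:
        return response.read().decode("utf-8")


server = HTTPServer(("127.0.0.1", 0), BlogHandler)  # port 0: pick any free port
threading.Thread(target=server.serve_forever, daemon=True).start()
url = f"http://127.0.0.1:{server.server_port}/first-post"

as_browser = fetch(url, "ExampleBrowser/1.0")  # hypothetical browser
as_crawler = fetch(url, "ExampleBot/1.0")      # hypothetical search bot

server.shutdown()
server.server_close()

print(as_browser == as_crawler)
```

Both clients only ever deal with a URL and the HTML that comes back from it. Neither one can tell, or needs to know, that the page was assembled on the fly.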
As MichaelH pointed out, you can search Google for “site:www.yourdomain.com” to find out which pages and posts on your WordPress site Google is aware of.
Just to add to that: it’s worth signing up for the free Google Webmaster Tools because you can get much more detailed reports on how well Google is spidering your blog.