WordPress.org

External Site Integration gives HTTP 404 error - Possible Fix (2 posts)

  1. zAlbee
    Member
    Posted 4 years ago #

    Short version: I have found a bug that affects anyone integrating WP with the rest of their site through the WP theme engine. It can cause the non-WP pages of the site to be dropped from search engine indexes entirely. How can I submit a patch to the developers?

    Background: I am hosting my own website with a WordPress 2.9.1 install. I wanted to keep WP in a subdirectory of the site, so that the rest of the site is *not* managed by WP but still has the same appearance. I accomplished this by having my external pages load 'blog/wp-blog-header.php' and by making major edits to the theme (I forget the exact details; anyway, it took a lot of effort and it works).

    Problem: When accessing pages outside WP, they look perfectly normal to a user, but WP spits out HTTP 404 (Not Found) headers instead of HTTP 200 (OK) headers. This caused my pages to be dropped from all search engines!
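    Because the page renders normally in a browser, the wrong status is easy to miss. A minimal sketch for checking it from PHP follows; the helper name and the example URL are hypothetical, and `get_headers()` is PHP's built-in function that returns the response headers with the status line as the first element.

```php
<?php
// Hypothetical helper: pull the numeric status code out of an HTTP status line
// such as "HTTP/1.1 404 Not Found". Returns null if the line is not a status line.
function status_code_from_status_line($status_line)
{
    if (preg_match('#^HTTP/\d+(?:\.\d+)?\s+(\d{3})\b#', $status_line, $m)) {
        return (int) $m[1];
    }
    return null;
}

// Usage against a live page (the URL is a placeholder, not from the post):
// $headers = get_headers('http://example.com/some-external-page.php');
// if ($headers !== false) {
//     echo status_code_from_status_line($headers[0]), "\n";
// }
```

    Running this against one external page and one blog page should show whether only the external pages are affected.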

    Here are similar reports from a quick search: http://bbpress.org/forums/topic/bbpress-wordpress-mu-or-not-leads-to-404-errors-but-pages-still-load [WP 2.3]

    http://www.strangerstudios.com/blog/2009/08/hidden-404-errors-with-wordpress-plugin-pages/ [WP 2.8.4]

    The solutions proposed above are to create and load a new plugin, or to manually edit the WP source code. I feel the problem should be corrected in the WP source code itself. Also, both essentially propose whitelisting every page of your site, which seems neither scalable nor generic.

    My solution: WP only does URL rewrites within a certain directory. If any requests are for outside that directory, then WP simply should not consider it to be an error. i.e. WP has no jurisdiction outside that directory. This assumes your non-blog pages are outside, and your blog is in a subdirectory.

    This is my diff [WP 2.9.1]:

    In /wp-includes/classes.php, function parse_request():
    239c239,242
    < 			if (empty($request) || $req_uri == $self || strpos($_SERVER['PHP_SELF'], 'wp-admin/') !== false) {
    ---
    > 			if (empty($request) || $req_uri == $self || strpos($_SERVER['PHP_SELF'], 'wp-admin/') !== false
    > 				|| strpos($_SERVER['PHP_SELF'], '/immerge/blog/') !== 0 ## zAlbee change
    > 				## it is not a rewrite unless PHP_SELF is index.php inside the wordpress install directory
    > 			) {

    Here, '/immerge/blog/' is the root of my blog. Basically, the existing code checks for cases where it's NOT an error. I am adding the case where the request (PHP_SELF) does not match my blog root. In fact, I only need to see that it's not '/immerge/blog/index.php', since that seems to be the page that triggers all rewrites, but I wanted to be safe. I'm sure WP has a variable for the blog root, so this code can then be generic for everyone. (Didn't bother looking for it.)
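    The hard-coded check above can be sketched as a standalone function that takes the blog root as a parameter. This is only an illustration of the idea, not code from WordPress: the function name is hypothetical, and where the root value would come from inside WP (possibly the path part of the `siteurl` option) is an assumption that has not been checked against 2.9.1.

```php
<?php
// Hypothetical helper, not part of WordPress.
// Returns true when the requested script ($php_self, e.g. $_SERVER['PHP_SELF'])
// lies outside the blog directory ($blog_root, e.g. '/immerge/blog/').
function request_is_outside_blog($php_self, $blog_root)
{
    // Normalise the root to one leading and one trailing slash,
    // so '/immerge/blog' does not also match '/immerge/blogger/'.
    $blog_root = '/' . trim($blog_root, '/') . '/';
    if ($blog_root === '//') {
        $blog_root = '/'; // blog installed at the site root
    }
    // Same test as in the diff: "outside" means PHP_SELF does not start with the root.
    return strpos($php_self, $blog_root) !== 0;
}
```

    With such a helper, the extra condition in the diff would read `|| request_is_outside_blog($_SERVER['PHP_SELF'], $blog_root)`, where `$blog_root` is whatever value WP exposes for the install path.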

    Could someone review this code, or point me to where I can submit a report/patch? Thanks!

  2. apo36
    Member
    Posted 3 years ago #

    I ran into the same problem and found a different solution:

    Instead of using this at the beginning of your scripts:
    require('/path/to/wp-blog-header.php');
    use:
    require_once('/path/to/wp-load.php');

    This way, the function wp() won't be called, and your page won't have the 404 status attributed to it.
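    For illustration, an external page built this way might look like the sketch below. The path is a placeholder, and whether a given theme's header and footer render correctly without the main query having been run is an assumption that depends on the theme. `get_header()` and `get_footer()` are the standard WordPress template functions.

```php
<?php
// Load the WordPress environment only. Unlike wp-blog-header.php,
// this does not call wp(), so no request parsing or 404 handling happens.
// Placeholder path: adjust to the real install location.
require_once('/path/to/wp-load.php');

get_header(); // output the theme's header template
?>
<p>Content of the non-WP page goes here.</p>
<?php
get_footer(); // output the theme's footer template
```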

This topic has been closed to new replies.