• Is basically broken. Specifically it tries too hard to generate absolute URLs all the time and more often than not, it gets them wrong.

    I have a site which is hidden behind a proxy server. The WP machine is running a vanilla PHP install on Apache with no SSL cert. The reason is that the blog is but one server in a cluster of many different machines with different jobs, all living under a common domain (call it http://www.example.com). There is an NGINX server in front of everything handling routing based on URL paths. The NGINX handles the SSL encryption and then communicates with backend servers using http. Works fine.

    However – this means that the WP installation thinks it is operating under http rather than https. Any test of headers in any code comes up with protocol http and not https. So any absolute URLs generated such as for style sheets and JS files are being generated with a big fat http: in front and conservative browsers like Chrome are declining to load them as they are viewed as potential security threats. Many themes and plugins are written to only exacerbate the problem but for many it isn’t their fault exactly since they are relying on calls like get_stylesheet_directory_uri() which returns an http: prefixed string on our nifty https: served blog.

    In order to fix our site I did the following modifications to various files in wp-includes. I found where the URL was about to be written and I stripped the protocol off of it using something like

    $baseurl = ltrim(self::$baseurl, 'htpsHTPS:');

    which crudely strips off any leading http/https protocol. The reason this is OK is because RFC 3986 part 4.2 allows for protocol-less or protocol relative URLs. So instead of http://www.example.com it is fine to use //www.example.com and the browser will use whatever protocol was used to fetch the parent page. If WP were to generate these sorts of URLs, wacky plugins like http://wordpress.org/extend/plugins/wordpress-https/ would be totally unnecessary.
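For anyone copying the approach above: ltrim() treats its second argument as a set of characters, not as a prefix, so it also eats the start of a relative path such as static/site.js. A minimal sketch of a stricter variant follows. The helper name to_protocol_relative() is hypothetical (it is not a WordPress function), and the example URLs are placeholders.

```php
<?php
// Hypothetical helper, not part of WordPress: turn an absolute http/https
// URL into a protocol-relative one and leave every other string untouched.
function to_protocol_relative(string $url): string
{
    // Anchored, case-insensitive match on "http:" or "https:" only when it
    // is followed by "//", so the leading "//" of the host part is kept.
    return preg_replace('#^https?:(?=//)#i', '', $url);
}

echo to_protocol_relative('http://www.example.com/style.css'), "\n"; // //www.example.com/style.css
echo to_protocol_relative('static/site.js'), "\n";                   // static/site.js (unchanged)
```

By contrast, ltrim('static/site.js', 'htpsHTPS:') returns atic/site.js, because the leading s and t are both in the character set.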

    Please update WordPress to use protocol-relative URLs. Trying to “guess” the unguessable and getting it wrong is just inviting all sorts of security holes. A developer should feel confident that if he has secured his server, then all his resources will be secured by default.

Viewing 15 replies - 1 through 15 (of 18 total)
  • Moderator Jan Dembowski

    (@jdembowski)

    Forum Moderator and Brute Squad

    Is basically broken. Specifically it tries too hard to generate absolute URLs all the time and more often than not, it gets them wrong.

    Were that true there would be lots more support forum threads I think…

    There is an NGINX server in front of everything handling routing based on URL paths. The NGINX handles the SSL encryption and then communicates with backend servers using http. Works fine.

    That’s cool. HTTP on WordPress HTTPS handled by your NGINX reverse proxy.

    However – this means that the WP installation thinks it is operating under http rather than https.

    Yes. That’s because it is. You’re fronting the HTTP web site with that NGINX reverse proxy.

    In order to fix our site I did the following modifications to various files in wp-includes.

    That’s not a good idea and you’ll a) possibly make your installation insecure or buggy and b) you’ll lose any of your edits when a new release comes out.

    Trying to “guess” the unguessable and getting it wrong is just inviting all sorts of security holes.

    No, that’s not correct. It’s an inconvenience to you perhaps but it is absolutely not a security issue here.

    If WP were to generate these sorts of URLs, wacky plugins like http://wordpress.org/extend/plugins/wordpress-https/ would be totally unnecessary.

    It’s an old topic and the outcome of it is “Absolute URLs are here to stay”. I believe that that is partly because it works well for the majority and partly because it would be nightmarish to change it. That’s my opinion anyway. 😉

    It’s not the solution you want, but as you are running Apache with no SSL cert, perhaps you may want to look at Apache’s mod_substitute?

    http://httpd.apache.org/docs/2.2/mod/mod_substitute.html

    That may let you modify the HTML output without hacking, breaking, or turning your installation into something that can’t be supported by other people.

    Moderator Jan Dembowski

    (@jdembowski)

    Forum Moderator and Brute Squad

    Note: Do not cross post your topic into other un-related topics.

    I’ve deleted your other post, your assertion that URL handling is somehow a security issue is incorrect. If you want to discuss URL generation by WordPress fine. But do not inject your topic into others needlessly especially when it is not a security issue.

    Thread Starter tblanchard

    (@tblanchard)

    Were that true there would be lots more support forum threads I think…

    I meant, when I look at my source for my page. Lots of http: a few https: here and there.

    That’s not a good idea and you’ll a) possibly make your installation insecure or buggy and b) you’ll lose any of your edits when a new release comes out.

    I know that. Fix WP and I’ll stop doing it.

    It’s an inconvenience to you perhaps but it is absolutely not a security issue here.

    Strongly disagree. There are apparently MANY ways URLs get generated in WP and the rate of inconsistency is high which tells me that the way URLs are generated is not well thought out or well designed and this means it is not all that secure either. Secure code is simple and consistent code. I’m seeing neither here. If every single URL were generated incorrectly I might buy it. But the large number of places I had to institute hacks to get the URLs correct tells me that this is a real problem that requires a real rethink.

    It’s an old topic and the outcome of it is “Absolute URLs are here to stay”. I believe that that is partly because it works well for the majority and partly because it would be nightmarish to change it.

    So the opinion of the WordPress developers is that code that works for the majority is better than code that works for all? That is a disturbing attitude. It may well speak to why I’ve had 8 WP sites hacked in the last three weeks. No, I haven’t figured out how they’ve gotten in. But I can see that there is a lot of inconsistency and gratuitous complexity here. Supposing I do the legwork to get URL generation fixed? How do I submit that patch in a way that is likely to be accepted?

    As to mod_sub, that would be a bandaid to fix a problem that simply shouldn’t exist in the first place. I’d rather generate the correct HTML in the first place. I’m willing to dive in and figure out a better place to patch it, but the fact is that WP is using several inconsistent methods to determine protocol to generate URLs and the fact is that this is work that doesn’t need to be done at all. It makes the site more fragile.

    Moderator Jan Dembowski

    (@jdembowski)

    Forum Moderator and Brute Squad

    We disagree and it’s your installation. Ultimately no one else’s opinion even matters in regards to what you want to do.

    I know that. Fix WP and I’ll stop doing it.

    It’s not broken (again we disagree) but you really should, could, and I wholeheartedly recommend that you continue to do whatever you like with your installation. 😉

    So the opinion of the WordPress developers is that code that works for the majority is better than code that works for all? That is a disturbing attitude.

    I’ve no idea. I’m not a WordPress developer and by no means speak for any of that (really cool, amazingly generous with their time and effort) crowd.

    As to mod_sub, that would be a bandaid to fix a problem

    It does abstract the problem away from hacking and making your installation potentially buggy. I like it myself but see above comment regarding doing what you want to do.

    Thread Starter tblanchard

    (@tblanchard)

    I would buy “not broken” if all of the schemes at the fronts of the urls were the same. They’re not. Hence broken.

    I’ve downloaded the source and will submit a patch. I intend to also highlight its inconsistent behavior across the web until the problem is fixed either by a single flag that can be set to specify generation of the correct url scheme in the admin that is actually properly honored by client code or by adopting protocol-relative URLs.

    I have read the “justification” for using absolute URLs at http://make.wordpress.org/core/handbook/design-decisions/ and while I’m fine with including the server name in the URLs, I think any design that requires search and replace operations across an entire database is pretty fragile. Furthermore, I notice that the section on SSL and load balancing is blank which to me indicates that they recognize there is a problem, but haven’t thought it through.

    Furthermore, I notice that the section on SSL and load balancing is blank which to me indicates that they recognize there is a problem, but haven’t thought it through.

    Or … it’s just not been written yet.

    The handbook you are referencing is actively being developed… like right now – it literally just went from concept to product in the past couple of months. Folks are just being assigned to, or volunteering for, sections. There is no ‘they’, just whoever volunteers to write the section. The low-hanging-fruit theory probably applies when trying to write an entire handbook – write the easiest stuff first, etc.

    Moderator Ipstenu (Mika Epstein)

    (@ipstenu)

    🏳️‍🌈 Advisor and Activist

    So the opinion of the WordPress developers is that code that works for the majority is better than code that works for all?

    One would presume that majority rules exist when there is no possibility for an ‘all’ solution, however in this case it’s ‘works for pretty much everyone and doesn’t break going forward.’ If at all possible, the devs try not to do massive, sweeping, database changes. The rather immense change this would involve is daunting, dangerous, and apt to be incomplete.

    I don’t disagree that the handling of https in particular is mangled, but part of the reason that the WordPress HTTPS plugin is still around is the number of people who use it is relatively small.

    The other, main, reason however, is that generating URLs is messy. When people insert images, they may or may not notice that https is there, which would result in the same sort of cross-contamination you’re seeing now. Not everyone uses the media uploader to insert images, after all.

    If you can figure out a solution that handles both of those, and regresses cleanly, we’d be a lot closer to a fix :/

    Thread Starter tblanchard

    (@tblanchard)

    I would point out that I found the bulk of the problem isn’t in the database, it is in the themes, the plugins, things that reference site wide resources like style sheets, javascript files, etc….

    The system for referencing these things is what is broken and I think it is because I can set in admin that the site should be known via https://www.example.com but there is code out there looking at $_SERVER instead.
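To illustrate the alternative described above, here is a minimal sketch that takes the scheme from the configured site address instead of from $_SERVER. The helper match_configured_scheme() is hypothetical (not a WordPress function), and the addresses are the example ones from this thread.

```php
<?php
// Hypothetical helper, not part of WordPress: rewrite the scheme of an
// absolute http/https URL so it matches the configured site address.
function match_configured_scheme(string $url, string $configuredSiteUrl): string
{
    // parse_url() returns null when the component is missing and false
    // when the URL is too malformed to parse.
    $scheme = parse_url($configuredSiteUrl, PHP_URL_SCHEME);
    if (!is_string($scheme)) {
        return $url; // nothing to go on, leave the URL as it is
    }
    // Only absolute http/https URLs are touched; relative paths pass through.
    return preg_replace('#^https?://#i', strtolower($scheme) . '://', $url);
}

$configured = 'https://www.example.com'; // what the admin setting says
echo match_configured_scheme('http://www.example.com/wp-content/style.css', $configured), "\n";
```

The point of the sketch is only that the admin setting, not the backend request, decides the scheme. It does not address URLs already stored in the database.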

    Thread Starter tblanchard

    (@tblanchard)

    So after some more investigation I have found that the core culprit is this function

    function is_ssl() {
    	if ( isset($_SERVER['HTTPS']) ) {
    		if ( 'on' == strtolower($_SERVER['HTTPS']) )
    			return true;
    		if ( '1' == $_SERVER['HTTPS'] )
    			return true;
    	} elseif ( isset($_SERVER['SERVER_PORT']) && ( '443' == $_SERVER['SERVER_PORT'] ) ) {
    		return true;
    	}
    	return false;
    }

    It is wrong because of the proxy. There needs to be a way to override this cleanly. I found that just changing it to return true solves my issues. Why the database content links get figured one way and the supporting script and styles get figured another is a mystery to me.
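A way to get the same effect without editing is_ssl() is to set $_SERVER['HTTPS'] before WordPress loads, for example near the top of wp-config.php. The sketch below assumes the proxy sends an X-Forwarded-Proto header (with NGINX, something like proxy_set_header X-Forwarded-Proto $scheme;) and that the backend is reachable only through that proxy, since the header is otherwise easy to forge. The helper name apply_forwarded_proto() is hypothetical.

```php
<?php
// Hypothetical helper, not part of WordPress: mark the request as HTTPS
// when the front-end proxy reports that the original request used HTTPS.
// Assumes a single trusted proxy that sets X-Forwarded-Proto.
function apply_forwarded_proto(array $server): array
{
    $proto = strtolower(trim($server['HTTP_X_FORWARDED_PROTO'] ?? ''));
    if ($proto === 'https') {
        // is_ssl() as quoted above returns true when HTTPS is 'on'.
        $server['HTTPS'] = 'on';
    }
    return $server;
}

// In wp-config.php this would run before WordPress is loaded:
$_SERVER = apply_forwarded_proto($_SERVER);
```

With that in place, the is_ssl() check quoted above sees HTTPS set to 'on' for proxied HTTPS requests, and the core file stays unmodified across upgrades.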

    Moderator Ipstenu (Mika Epstein)

    (@ipstenu)

    🏳️‍🌈 Advisor and Activist

    If a theme or plugin hardcodes the URLs in, that’d do it :/

    They’re supposed to let the URL be dynamically generated, which won’t have a problem.

    I strongly disagree with http://make.wordpress.org/core/handbook/design-decisions/#absolute-versus-relative-urls as a whole. Every point it makes is either wrong or stupid.
    I’m not discussing it here but I wanted to pass by and show my support to the OP.

    Why not submit a patch to core correcting these “issues”?

    A patch has already been suggested in this thread. Read carefully.

    I did, thank you. The same applies. The keyword was “submitted” – not “suggested”.

    Thread Starter tblanchard

    (@tblanchard)

    I appreciate the support from alamart.

    I think the reason WP is the way it is is because it is generally installed in place, edited in place, and never moved. Basically it is designed for amateurs who edit and develop things live on their one copy.

    Professional organizations generally have multiple servers for development, staging, and production, and thus pages may be accessed via different server names, behind proxies, etc. Code moves from dev to staging for approval and ultimately is placed before the public on production.

    Dumping the protocol would be a great step in the right direction though – ultimate protocol is not always known to the server if it is part of a distributed deployment – for instance behind a load balancer that is handling the https side.

  • The topic ‘WordPress URL generation’ is closed to new replies.