WordPress.org

WordPress URL generation (19 posts)

  1. tblanchard
    Member
    Posted 1 year ago #

    Is basically broken. Specifically it tries too hard to generate absolute URLs all the time and more often than not, it gets them wrong.

    I have a site which is hidden behind a proxy server. The WP machine is running a vanilla PHP install on Apache with no SSL cert. The reason is that the blog is just one server in a cluster of many different machines with different jobs, all living under a common domain (call it http://www.example.com). There is an NGINX server in front of everything handling routing based on URL paths. NGINX handles the SSL encryption and then communicates with the backend servers over plain HTTP. Works fine.

    However – this means that the WP installation thinks it is operating under http rather than https. Any test of headers in any code comes up with protocol http and not https. So any absolute URLs generated such as for style sheets and JS files are being generated with a big fat http: in front and conservative browsers like Chrome are declining to load them as they are viewed as potential security threats. Many themes and plugins are written to only exacerbate the problem but for many it isn't their fault exactly since they are relying on calls like get_stylesheet_directory_uri() which returns an http: prefixed string on our nifty https: served blog.

    In order to fix our site I did the following modifications to various files in wp-includes. I found where the URL was about to be written and I stripped the protocol off of it using something like

    $baseurl = ltrim(self::$baseurl,'htpsHTPS:');

    which crudely strips off any leading http/https protocol. The reason this is OK is that RFC 3986, section 4.2, allows protocol-relative URLs (it calls them network-path references). So instead of http://www.example.com it is fine to use //www.example.com and the browser will use whatever protocol was used to fetch the parent page. If WP were to generate these sorts of URLs, wacky plugins like http://wordpress.org/extend/plugins/wordpress-https/ would be totally unnecessary.
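
    One caveat with the ltrim() call above: its second argument is a set of characters, not a prefix, so it also eats the start of any string that happens to begin with h, t, p, s or a colon. A minimal sketch of a stricter version follows. The helper name to_protocol_relative is made up for illustration and is not part of WordPress.

    ```php
    <?php
    // Hypothetical helper (not a WordPress function): turn an absolute
    // http/https URL into a protocol-relative one and leave anything
    // else untouched.
    function to_protocol_relative(string $url): string
    {
        // Anchored, case-insensitive match of "http:" or "https:" only
        // when it is directly followed by "//".
        return preg_replace('#^https?:(?=//)#i', '', $url);
    }

    echo to_protocol_relative('http://www.example.com/style.css'), "\n"; // //www.example.com/style.css
    echo to_protocol_relative('HTTPS://www.example.com/app.js'), "\n";   // //www.example.com/app.js
    echo to_protocol_relative('static/site.css'), "\n";                  // static/site.css

    // For comparison, the character-mask behaviour of ltrim():
    echo ltrim('static/site.css', 'htpsHTPS:'), "\n";                    // atic/site.css
    ```

    This is still a patch to generated output, so it carries the same maintenance cost as any other edit to files in wp-includes.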

    Please update WordPress to use protocol-relative URLs. Trying to "guess" the unguessable and getting it wrong is just inviting all sorts of security holes. A developer should feel confident that if he has secured his server, then all his resources will be secured by default.

  2. Is basically broken. Specifically it tries too hard to generate absolute URLs all the time and more often than not, it gets them wrong.

    Were that true there would be lots more support forum threads I think...

    There is an NGINX server in front of everything handling routing based on URL paths. NGINX handles the SSL encryption and then communicates with the backend servers over plain HTTP. Works fine.

    That's cool: HTTP on WordPress, with HTTPS handled by your NGINX reverse proxy.

    However – this means that the WP installation thinks it is operating under http rather than https.

    Yes. That's because it is. You're fronting the HTTP web site with that NGINX reverse proxy.

    In order to fix our site I did the following modifications to various files in wp-includes.

    That's not a good idea and you'll a) possibly make your installation insecure or buggy and b) you'll lose any of your edits when a new release comes out.

    Trying to "guess" the unguessable and getting it wrong is just inviting all sorts of security holes.

    No, that's not correct. It's an inconvenience to you perhaps but it is absolutely not a security issue here.

    If WP were to generate these sorts of URLs, wacky plugins like http://wordpress.org/extend/plugins/wordpress-https/ would be totally unnecessary.

    It's an old topic and the outcome of it is "Absolute URLs are here to stay". I believe that that is partly because it works well for the majority and partly because it would be nightmarish to change it. That's my opinion anyway. ;)

    It's not the solution you want, but as you are running Apache with no SSL cert perhaps you may want to look at Apache's mod_substitute?

    http://httpd.apache.org/docs/2.2/mod/mod_substitute.html

    That may let you modify the HTML output without hacking, breaking, or turning your installation into something that can't be supported by other people.
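
    For illustration only, a mod_substitute rule for this case might look roughly like the fragment below. It assumes the module is loaded on the backend Apache, that www.example.com is the placeholder domain from the first post, and that the HTML reaching the filter is not already compressed.

    ```apache
    # Sketch: rewrite absolute http:// links to the site's own host into
    # protocol-relative links in HTML responses.
    <Location "/">
        AddOutputFilterByType SUBSTITUTE text/html
        # n = treat the pattern as a fixed string, i = case-insensitive
        Substitute "s|http://www.example.com/|//www.example.com/|ni"
    </Location>
    ```

    Only text/html responses are filtered here, so URLs written inside CSS or JavaScript files would need their own rule.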

  3. Note: Do not cross post your topic into other un-related topics.

    I've deleted your other post; your assertion that URL handling is somehow a security issue is incorrect. If you want to discuss URL generation by WordPress, fine. But do not inject your topic into others needlessly, especially when it is not a security issue.

  4. tblanchard
    Member
    Posted 1 year ago #

    Were that true there would be lots more support forum threads I think...

    I meant, when I look at my source for my page. Lots of http: a few https: here and there.

    That's not a good idea and you'll a) possibly make your installation insecure or buggy and b) you'll lose any of your edits when a new release comes out.

    I know that. Fix WP and I'll stop doing it.

    It's an inconvenience to you perhaps but it is absolutely not a security issue here.

    Strongly disagree. There are apparently MANY ways URLs get generated in WP and the rate of inconsistency is high which tells me that the way URLs are generated is not well thought out or well designed and this means it is not all that secure either. Secure code is simple and consistent code. I'm seeing neither here. If every single URL were generated incorrectly I might buy it. But the large number of places I had to institute hacks to get the URLs correct tells me that this is a real problem that requires a real rethink.

    It's an old topic and the outcome of it is "Absolute URLs are here to stay". I believe that that is partly because it works well for the majority and partly because it would be nightmarish to change it.

    So the opinion of the WordPress developers is code that works for the majority is better than code that works for all? That is a disturbing attitude. It may well speak to why I've had 8 WP sites hacked in the last three weeks. No, I haven't figured out how they've gotten in. But I can see that there is a lot of inconsistency and gratuitous complexity here. Supposing I do the legwork to get URL generation fixed? How do I submit that patch in a way that is likely to be accepted?

    As to mod_sub, that would be a bandaid to fix a problem that simply shouldn't exist in the first place. I'd rather generate the correct HTML in the first place. I'm willing to dive in and figure out a better place to patch it, but the fact is that WP is using several inconsistent methods to determine protocol to generate URLs and the fact is that this is work that doesn't need to be done at all. It makes the site more fragile.

  5. We disagree and it's your installation. Ultimately no one else's opinion even matters in regards to what you want to do.

    I know that. Fix WP and I'll stop doing it.

    It's not broken (again we disagree) but you really should, could, and I wholeheartedly recommend that you continue to do whatever you like with your installation. ;)

    So the opinion of the WordPress developers is code that works for the majority is better than code that works for all? That is a disturbing attitude.

    I've no idea. I'm not a WordPress developer and by no means speak for any of that (really cool, amazingly generous with their time and effort) crowd.

    As to mod_sub, that would be a bandaid to fix a problem

    It does abstract the problem away from hacking and making your installation potentially buggy. I like it myself but see above comment regarding doing what you want to do.

  6. tblanchard
    Member
    Posted 1 year ago #

    I would buy "not broken" if all of the schemes at the fronts of the urls were the same. They're not. Hence broken.

    I've downloaded the source and will submit a patch. I intend to also highlight its inconsistent behavior across the web until the problem is fixed either by a single flag that can be set to specify generation of the correct url scheme in the admin that is actually properly honored by client code or by adopting protocol-relative URLs.

    I have read the "justification" for using absolute URLs at http://make.wordpress.org/core/handbook/design-decisions/ and while I'm fine with including the server name in the URLs, I think any design that requires search and replace operations across an entire database is pretty fragile. Furthermore, I notice that the section on SSL and load balancing is blank which to me indicates that they recognize there is a problem, but haven't thought it through.

  7. Rev. Voodoo
    Volunteer Moderator
    Posted 1 year ago #

    Furthermore, I notice that the section on SSL and load balancing is blank which to me indicates that they recognize there is a problem, but haven't thought it through.

    Or ... it's just not been written yet.

    The handbook you are referencing is actively being developed... like right now - it literally just went from concept to product in the past couple of months. Folks are just being assigned/volunteering for sections. There is no 'they', just whoever volunteers to write the section. The low hanging fruit theory probably applies when trying to write an entire handbook - write the easiest stuff first, etc.

  8. So the opinion of the WordPress developers is code that works for the majority is better than code that works for all?

    One would presume that majority rules exist when there is no possibility for an 'all' solution, however in this case it's 'works for pretty much everyone and doesn't break going forward.' If at all possible, the devs try not to do massive, sweeping, database changes. The rather immense change this would involve is daunting, dangerous, and apt to be incomplete.

    I don't disagree that the handling of https in particular is mangled, but part of the reason that the WordPress HTTPS plugin is still around is the number of people who use it is relatively small.

    The other, main, reason however, is that generating URLs is messy. When people insert images, they may or may not notice that https is there, which would result in the same sort of cross-contamination you're seeing now. Not everyone uses the media uploader to insert images, after all.

    If you can figure out a solution that handles both of those, and regresses cleanly, we'd be a lot closer to a fix :/

  9. tblanchard
    Member
    Posted 1 year ago #

    I would point out that I found the bulk of the problem isn't in the database, it is in the themes, the plugins, things that reference site wide resources like style sheets, javascript files, etc....

    The system for referencing these things is what is broken and I think it is because I can set in admin that the site should be known via https://www.example.com but there is code out there looking at $_SERVER instead.

  10. tblanchard
    Member
    Posted 1 year ago #

    So after some more investigation I have found that the core culprit is this function

    function is_ssl() {
    	if ( isset($_SERVER['HTTPS']) ) {
    		if ( 'on' == strtolower($_SERVER['HTTPS']) )
    			return true;
    		if ( '1' == $_SERVER['HTTPS'] )
    			return true;
    	} elseif ( isset($_SERVER['SERVER_PORT']) && ( '443' == $_SERVER['SERVER_PORT'] ) ) {
    		return true;
    	}
    	return false;
    }

    It is wrong because of the proxy. There needs to be a way to override this cleanly. I found that just changing it to return true solves my issues. Why the database content links get figured one way and the supporting script and styles get figured another is a mystery to me.
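
    A less invasive way to get the same effect, without editing is_ssl() itself, is to set $_SERVER['HTTPS'] early in wp-config.php based on a header from the proxy. The sketch below makes two assumptions. First, NGINX is configured to send X-Forwarded-Proto (for example with proxy_set_header X-Forwarded-Proto $scheme;). Second, the backend can only be reached through that proxy, since otherwise any client could forge the header. The helper name mark_https_from_proxy is made up for illustration.

    ```php
    <?php
    // Sketch for wp-config.php, placed above the line that loads
    // wp-settings.php. Hypothetical helper, not part of WordPress.
    function mark_https_from_proxy(array $server): array
    {
        // PHP exposes the X-Forwarded-Proto request header under this key.
        $proto = strtolower($server['HTTP_X_FORWARDED_PROTO'] ?? '');
        if ($proto === 'https') {
            // 'on' is one of the values the is_ssl() code above checks for.
            $server['HTTPS'] = 'on';
        }
        return $server;
    }

    $_SERVER = mark_https_from_proxy($_SERVER);

    // Quick check with a simulated proxied request:
    $demo = mark_https_from_proxy([
        'HTTP_X_FORWARDED_PROTO' => 'https',
        'SERVER_PORT'            => '80',
    ]);
    echo $demo['HTTPS'], "\n"; // on
    ```

    Because is_ssl() reads $_SERVER['HTTPS'] first, code that calls it should then see the request as secure, and the change survives core updates because it lives in wp-config.php.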

  11. If a theme or plugin hardcodes the URLs in, that'd do it :/

    They're supposed to let the URL be dynamically generated, which won't have a problem.

  12. almart
    Member
    Posted 10 months ago #

    I strongly disagree with http://make.wordpress.org/core/handbook/design-decisions/#absolute-versus-relative-urls as a whole. Every point it makes is either wrong or stupid.
    I'm not discussing it here but I wanted to pass by and show my support to the OP.

  13. esmi
    Theme Diva & Forum Moderator
    Posted 10 months ago #

    Why not submit a patch to core correcting these "issues"?

  14. almart
    Member
    Posted 10 months ago #

    A patch has been already suggested in this thread. Read carefully.

  15. esmi
    Theme Diva & Forum Moderator
    Posted 10 months ago #

    I did, thank you. The same applies. The keyword was "submitted" - not "suggested".

  16. tblanchard
    Member
    Posted 10 months ago #

    I appreciate the support from almart.

    I think the reason WP is the way it is is because it is generally installed in place, edited in place, and never moved. Basically it is designed for amateurs who edit and develop things live on their one copy.

    Professional organizations generally have multiple servers for development, staging, and production, and thus pages may be accessed via different server names, behind proxies, and so on. Code moves from dev to staging for approval and ultimately is placed before the public on production.

    Dumping the protocol would be a great step in the right direction though - ultimate protocol is not always known to the server if it is part of a distributed deployment - for instance behind a load balancer that is handling the https side.

  17. Basically it is designed for amateurs who edit and develop things live on their one copy.

    Ha! That's funny and many small online companies such as CNN, NY Times, the whole family of CBS Network news sites, many universities, some financial companies, etc. wouldn't think much of that statement.

    Again we disagree. ;)

    Professional organizations generally have multiple servers for development, staging, and production, and thus pages may be accessed via different server names, behind proxies, and so on. Code moves from dev to staging for approval and ultimately is placed before the public on production.

    Yes they do.

    How is that different from running and developing WordPress new releases, testing and QA'ing it to great length, and then when it's cooked releasing it?

    If you haven't already done so you really should take an informed look at the Make:WordPress sites as well as the Core trac site.

    Look at the collaboration and how it's all developed. In comparison to corporate shops it's really refreshing to observe. In "Professional organizations" releases have been made because someone put their foot down and cowed the other developers into toeing the line. A manager makes a decision and that's produced some really horrific products.

    It of course does not happen all the time but many organizations have released brown paper bag code.

    But the way WordPress is developed really is collaborative and it's constantly moving to a better "product". It's successful and what was once a simple blogging tool is developing over time into a robust platform.

    That's far and away from the URL generation topic you posted months ago. ;) But as Esmi stated you or anyone really can submit a patch. If it's good it will move up. If it doesn't pass then the patch will not be accepted.

  18. almart
    Member
    Posted 10 months ago #

    Ha! That's funny and many small online companies such as CNN, NY Times, the whole family of CBS Network news sites, many universities, some financial companies, etc. wouldn't think much of that statement.

    It IS designed for amateurs. Those companies you name have probably made heavy modifications to the WP code to make it perform as they want. They won't apply your updates as they become available. And I don't think they will give the community anything back any time soon.

    So stop saying that big companies find the WP code perfectly suitable for their needs. It is just self-contempt.

    I do like WP as a project, but it has flaws (like the URL generation) that are not being addressed. This leads to modified WP code, which leads to updates not being applied, which leads to a poor WP experience.

  19. esmi
    Theme Diva & Forum Moderator
    Posted 10 months ago #

    It IS designed for amateurs.

    I beg your pardon! As a professional developer for the past 12 years, I'm trying really hard not to take exception to that. And I very much doubt that I am the only one...

    Those companies you name have probably made heavy modifications to the WP code to make it perform as they want.

    No. They are all Automattic VIP clients using WordPress at an enterprise level.

    If you don't like WordPress "as is" either don't use it or get involved and start submitting your own patches. Standing on the sidelines throwing insults is just childish.

Topic Closed