[closed] WordPress phone home privacy issues. Why is it there? (20 posts)

  1. andreasnrb
    Posted 6 years ago #

    I would like to know why WordPress.org collects information about my WordPress installations, including identifiable information.

    That WordPress does this ain't written anywhere on the wordpress.org site. There is no reason for WordPress.org to collect this information at all. The only information required for updates is version numbers.
    And yes, I know about the privacy policy, but people who don't know what API means have no idea about this. Even I know what an API is, and I didn't realize that wp.org collects identifiable information.

    So why is WordPress.org so sneaky and secretive about this?
    And why the refusal to remove it?

  2. Steffen Jørgensen
    Posted 6 years ago #

    What kind of information are you referring to?

  3. andreasnrb
    Posted 6 years ago #

    I'm referring to the information WordPress sends to wordpress.org when it checks for updates:
    URL, locale (language), language package, WordPress version, MySQL version, PHP version. On the receiving side they also get your IP, of course.
    It's in the file wp-includes/update.php:

    // Versions and locale go in the query string.
    $url = "http://api.wordpress.org/core/version-check/1.3/?version=$wp_version&php=$php_version&locale=$locale&mysql=$mysql_version&local_package=$local_package";
    $options = array(
        'timeout' => 3,
        // The site URL is sent as part of the user agent.
        'user-agent' => 'WordPress/' . $wp_version . '; ' . get_bloginfo( 'url' )
    );

    But it doesn't say anywhere what is done with that information,
    or whether they collect it all in a database with your IP, URL, locale, PHP version, MySQL version, WP version, and which plugins and themes you use.

    I for one would like to know what they do with the information and how it is stored.
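
    For anyone who wants to see exactly which fields a request like this carries, here is a minimal sketch in plain PHP that rebuilds the query string from the snippet above and splits it back into its parameters. The version numbers and the example.com URL are made-up placeholders, and build_version_check_url() is a hypothetical helper, not a WordPress function.

```php
<?php
// Hypothetical helper: rebuilds the version-check URL quoted above from its parts.
function build_version_check_url(array $fields): string {
    // http_build_query() URL-encodes each value and joins the pairs with '&'.
    return 'http://api.wordpress.org/core/version-check/1.3/?' . http_build_query($fields);
}

// Placeholder values for illustration only.
$fields = array(
    'version'       => '2.8.6',
    'php'           => '5.2.11',
    'locale'        => 'en_US',
    'mysql'         => '5.0.51',
    'local_package' => '',
);

$url = build_version_check_url($fields);

// Split the query string back into an array to list what is transmitted.
parse_str((string) parse_url($url, PHP_URL_QUERY), $sent);

foreach ($sent as $name => $value) {
    echo $name . ' = ' . $value . "\n";
}

// The site URL is not in the query string; in the snippet it is part of the user agent.
$user_agent = 'WordPress/' . $fields['version'] . '; ' . 'http://example.com';
echo 'user-agent = ' . $user_agent . "\n";
```

    This only covers the core version check; the plugin and theme checks send more than this.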

  4. The WordPress version is included in case the response format changes, so it can send back the right responses to the right WP versions.

    The locale you are using is sent to send the correct language data back.

    The versions of PHP and MySQL you are using are used to create aggregate data about how many installs use PHP5, etc. For example, they've said that about 11% of users still use PHP4. This info tells the developers which versions of the software they need to support in the future.

    The blog url is a unique identifier for each site, so that the statistical information can be correct. Otherwise you wouldn't be able to get accurate percentages, since some sites might check more often than others.

    All the plugin information is sent so the server can determine which of your plugins have updates available. Sending just the plugin name and version number is not enough: the plugin name, version, description and such can all change, and there's no unique identifier. So the update server uses a fuzzy match to work out which of the plugins it knows about you're asking about. Ditto themes.

    All this data is covered under the Privacy Policy.
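
    The point about the URL being needed for accurate percentages can be illustrated with a small sketch. This is not the code WordPress.org runs (nothing in this thread shows that); php_major_share() and the sample check-ins are invented for illustration. It keeps only the latest report per URL, so a site that checks in often still counts once.

```php
<?php
// Hypothetical aggregation: one vote per site URL, however often the site checks in.
function php_major_share(array $checkins): array {
    $latest = array();
    foreach ($checkins as $c) {
        // Later check-ins from the same URL overwrite earlier ones.
        $latest[$c['url']] = $c['php'];
    }

    $counts = array();
    foreach ($latest as $php) {
        // Group by major version, e.g. "5.2.11" becomes "5".
        $major = explode('.', $php)[0];
        $counts[$major] = ($counts[$major] ?? 0) + 1;
    }

    $total = count($latest);
    $share = array();
    foreach ($counts as $major => $n) {
        // Percentage of unique sites on this major version.
        $share[$major] = round(100 * $n / $total, 1);
    }
    return $share;
}

// Invented sample data: site-a checks in three times, site-b once.
$checkins = array(
    array('url' => 'http://site-a.example', 'php' => '4.4.9'),
    array('url' => 'http://site-a.example', 'php' => '4.4.9'),
    array('url' => 'http://site-a.example', 'php' => '4.4.9'),
    array('url' => 'http://site-b.example', 'php' => '5.2.11'),
);

$share = php_major_share($checkins);
```

    With the URL as the key, the two sites split 50/50. Counting raw check-ins instead would put PHP 4 at 75%, which is the skew described above.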

  5. andreasnrb
    Posted 6 years ago #

    You didn't answer the question, Otto, and I'm not that interested in what you think. I want to know from the people in charge, those who actually control things.
    I want to know what they do with the data and why they refuse to make it non-identifiable.

    Also, the privacy policy is questionable, if you've checked the discussion on the wp-hackers list (Wp-hackers discussion), where people are insulted for questioning the point of WordPress sending all this info.

    All the information sent is still not needed for updates to work. And it should not be stored in a way that can be connected with a site.

    Why doesn't Matt/WordPress.org just make a post disclosing all this: what is stored and how it is used? It's open source; the data collected should be open too.

  6. keylan
    Posted 6 years ago #

    I'm not planning on getting that deep into this one, but I just wanted to mention that it appears obvious that Otto has indeed checked the discussion you are referring to, considering he is actually one of the larger contributors to it.

  7. NetworkGeek
    Posted 6 years ago #

    Check Google. Matt did comment on this ages and ages ago. In fact, I think it was in that same discussion...

    You can read the entire debate and remedy paraphrased here:

    If you use Windows, you send more information back to Microsoft every time you do updates and they sure don't tell you about it or give you a remedy for it.

  8. Mark Jaquith
    WordPress Lead Dev
    Posted 6 years ago #

    I want to know what they do with the data

    We use it to see plugin/theme popularity, track adoption rates of new WP versions, and get a feel for the platforms that people run WP on. For instance, knowing how many people run PHP 4 is very helpful in deciding whether or not to drop support for it.

    why they refuse to make it non identifiable.

    A URL is a standardized, globally unique, verifiable identifier for a WordPress site. It is not disclosed, per the privacy policy. Indeed, our tool for viewing this statistical data has only numbers and percentages — for instance, the number of people running MySQL 4 vs MySQL 5, or the number of active installs of a particular plugin. We don't have a "see what site X is running" tool. The purpose is to make better decisions about WordPress development by looking at the aggregate data.

    If you're uncomfortable with this, there are plugins available to obfuscate the URL you send (though obviously it cannot obfuscate the IP address your server sends, unless you're willing to turn off update notifications altogether).
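
    A rough sketch of what such an obfuscation plugin might do: replace the site URL in the user-agent string with a one-way hash, so the server still sees a stable unique identifier but not the address itself. obfuscate_user_agent() and the salt are hypothetical; in a real plugin a function like this would presumably be hooked into WordPress's http_request_args filter, which is an assumption here and not something stated in this thread.

```php
<?php
// Hypothetical helper: swaps the site URL in the request arguments for a salted SHA-256 hash.
function obfuscate_user_agent(array $args, string $site_url, string $salt): array {
    if (!isset($args['user-agent'])) {
        return $args; // Nothing to rewrite.
    }
    // The same URL and salt always give the same token, so it still works as a unique ID.
    $token = hash('sha256', $salt . $site_url);
    $args['user-agent'] = str_replace($site_url, $token, $args['user-agent']);
    return $args;
}

// Placeholder request arguments shaped like the $options array from wp-includes/update.php.
$args = array(
    'timeout'    => 3,
    'user-agent' => 'WordPress/2.8.6; http://example.com',
);

$safe = obfuscate_user_agent($args, 'http://example.com', 'local-secret');
echo $safe['user-agent'] . "\n";
```

    As noted above, this hides only the URL. The IP address of the server making the request is still visible to the update server.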

  9. andreasnrb
    Posted 6 years ago #

    Keylan, I know he is. But he never reads what people actually write. He has his view and everyone else is wrong.

    NetworkGeek: You do know that it's most likely illegal to send identifiable information and not disclose it? There was a game a few years back that sent info so the makers could identify pirate copies. They lost in court. That's why you have to disclose to the user what you send.
    And Matt's comment is useless. No plugin should be required. And that the discussion is old only means that they don't give a rat's ass about what users think. It's simple to change in the code. It would take like 2 seconds.

    Your viewing tool is one thing; what's actually in the database is another. You can still store all the information connected with a URL and only summarize it in the viewing tool. Therefore the problem still exists.
    I also don't think you need all the info sent with the plugin and theme update checks either: author, descriptions, etc.

    I should not have to install a plugin that obfuscates my URL to make it non-identifiable. The identifiable data should be made anonymous before it's sent to the WordPress API. It's really simple, so why don't you just implement it?

  10. andreasnrb: Consider this a warning. Stop the personal attacks. The information I posted was factual in nature, you don't have any grounds for making comments about me.

    If you do it again, I will take action.

  11. NetworkGeek
    Posted 6 years ago #

    Because why should they? If I go to your URL, I can see most of the information you're worried about out for public display. Or, if you really are that concerned about it, you have a remedy. Actually, two or three.

    And, I'd disagree about them not "giving a rats ass" about what users think. When this initially came up, it was very clearly debated and I, as a user, felt it was put to bed. The identifiable information that you're so concerned with doesn't track you to a computer. It tracks information to a server on the PUBLIC internet. I'm not a lawyer, nor do I play one on the Internet, but I am someone with a business degree who's had to navigate more than one contract and privacy issue for businesses. I think you'd better check your case law because the case that you're very generally referring to sounds like one of several game console cases. They're different and not applicable here. Those involved either Nintendo trying to block third-party cartridges, Xbox modifications, or the EA Spore DRM fiasco, where they installed extra software that interfered with the OS to report back. All totally different situations than this.

    Back when this originally came up, the debate was so limited that I don't think you can really claim a significant portion of the installed userbase really "gives a rats ass" about the data being collected by WordPress.org.
    It's good that people keep track of it and make sure everything is on the up and up, like you were trying to do, but you've come to the party a bit late. This was all hashed out two years ago when it first came up. If you'd been around then, you might have been able to participate in that debate in an effective way.

    Keep fighting the good fight, though.

  12. andreasnrb
    Posted 6 years ago #

    NetworkGeek: The URL ain't even the main problem; it's all the information connected to it, or that might be connected to it. I still haven't gotten an answer on that. And you can't get all the info wordpress.org/Automattic gets from just visiting my sites. So that's a flawed argument.

    Your work and your degrees don't really matter.
    And the game was for Windows PC, some version of Ultima I think. No DRM stuff involved, just a phone-home function that they didn't tell anyone about.

    Just because it was "dealt" with two years ago doesn't mean things can't change now. It's a bad argument against change.
    People can't react against it if they don't know about it in the first place. Almost no users check the wp-hackers list.

  13. And you can't get all the info wordpress.org/Automattic gets from just visiting my sites. So that's a flawed argument.

    URL: http://andreasnurbo.com
    Locale: default (en)
    WP Version: 2.8.6
    PHP Version: 5.2.11 with Suhosin-Patch
    Theme: WP Premium
    Some of your plugins:
    Contact Form 7
    Organize Series
    Tweet This

    That's what I got from just looking at your site. No insider info involved.

  14. andreasnrb
    Posted 6 years ago #

    So? Now do it for every single WordPress installation and collect it all in one place with PHP version and MySQL version.

  15. NetworkGeek
    Posted 6 years ago #

    Okay, I swear this is the last time I'll "feed the troll"...

    You know, you're right, as long as you use some vague recollection of a case you might have read about a couple of years ago as some kind of "evidence" that your opinion is somehow a legal argument, my degrees and experience don't matter. On the other hand, your example still doesn't apply. If you bothered to Google the EA games DRM issue, you may find that it was, in fact, a PC game and quite possibly what you were talking about. Or not. Regardless, the example may or may not be applicable.

    I'm not sure what particular axe you have to grind, but after Otto showed your straw man argument for what it was, I'd pretty well have thought the point was moot. Also, the "people in charge" have responded to you. Granted, you may not have liked their answer, but that doesn't mean they haven't answered your questions or given a good reason for collecting the data. They just haven't answered in the way you want.

    And, now, I'm just waiting for this "discussion" to devolve to a point that someone invokes Godwin's Law. It's headed that way, as far as I'm concerned, and makes me realize how much of a time-sink this just became.

    Good luck!

  16. andreasnrb
    Posted 6 years ago #

    Learn what a straw man is dude.

    What I want is for wordpress.org to disclose what data is retained in their database and at what level of identifiability.
    Also a way to opt out of this collection. Not all the info is required for updates to work.

    Nobody has so far disclosed any of this. Not even Otto's ridiculous attempt of justification of the data collection. Just because some plugin info is available from the frontend doesn't mean I want everything else that ain't public to be collected. Heck, most of my plugins aren't even hosted at wordpress.org, but info about them is sent to the API nonetheless.


    • Make public what data is stored in the database.
    • Make a way to opt out of this data collection.
    That's all I want. It's really simple. Just do that and I'm one happy camper =).
    One more thing. If the data ain't identifiable then make it public. There are already hidden pages at wordpress.org that show the number of users per locale and their activity level concerning updates.

  17. Clayton James
    Posted 6 years ago #

    As the owner of a "Software Company" and the user of various social media constructs, I would think certain concepts would not be foreign to you. The aggregation of data necessary to provide a consistent and professional quality of service is nothing new. Ever used a computer? Ever get an update? Those would be based on an inventory of your machine hardware and software. Ever have any thoughts about the information being collected by every piece of software you have ever installed that offers an automatic update service? Ever used PhotoShop? Flash? Windows? OS X? Linux? Sure you have. Hell, I'd be willing to bet that even Svenska Antipiratbyrån knows more about you than WordPress does.

    Just kidding... (but not really).

    Not that you should care, nor that it should matter, nor that I really give a rats ass what you think either way, but the flip-flop way in which you "argue" your contentions in this thread is evidence that you really don't possess the organizational skills required to present a compelling argument. Nor do your statements give the appearance that you have yet committed to whichever side of the fence you're actually standing on. I find that somewhat disappointing from a "just the facts please" point of view.

    I admire your ability to use your self-acquired programming language education to do something you really love to do. Not everyone is fortunate enough to be able to fully benefit from osmotic learning methods.

    Not even Otto's ridiculous attempt of justification of the data collection.

    Not cool. Not even mildly humorous anymore. You're just waving your own insecurities in the air like a big flag, for everyone else to see.

    I think NetworkGeek hit the nail right on the head. You just seem to be laying bait and trolling for twitter fodder.

    Best wishes!

  18. Clayton James
    Posted 6 years ago #

    Still laughing my ass off. If I could, I would e-mail Matt a virtual beer right now.

    Suggest Agenda Items for Dec 17th Dev Chat


    Good thread, "Ain't" it "Dude"? Read it all if you care to.

  19. NetworkGeek
    Posted 6 years ago #

    I know I said I wouldn't feed the troll again, but...


    The argument was first, "you're collecting private data that you can't get via a public URL, so you're hiding something and must be evil!"
    Then, Otto showed, in fact, you could get most of it.
    The argument changed to "Well, the *rest* of the data you collect isn't public, so you're hiding something and must be evil!"
    In fact, the veritable definition of the straw man argument as found on Wikipedia. The argument goes through subtle changes and refinements as it is disproved. See the examples.


  20. andreasnrb
    Posted 6 years ago #

    ClaytonJames: Almost all software lets you know when it collects data on you and sends it to the developer. Those that don't get scolded. I for one wouldn't make software that collects data and sends it without informing my users, whether the data is public or private. It's just bad business practice.

    NetworkGeek: There was no such argument.
    So who's attacking a straw man? My standpoint has been the same from the start: disclosure, and making the data non-identifiable. The only thing added was that there should be a way to opt out.

    I'm not alone in my questions and suggestions.
    Why is it so hard to disclose what data is stored and make it possible to opt out?

Topic Closed

This topic has been closed to new replies.
