Make WordPress Core

Opened 7 years ago

Closed 6 years ago

Last modified 3 years ago

#40794 closed task (blessed) (fixed)

WordPress needs a privacy policy

Reported by: johnbillion
Owned by: pento
Milestone: 4.9
Priority: normal
Severity: normal
Version:
Component: Help/About
Keywords:
Focuses:
Cc:

Description

It's been many years since an installation of WordPress operated in isolation. The software sends data to various endpoints on api.wordpress.org, most visibly for update checks, but also for fetching translations, checking browser compatibility, and (since 4.8) determining the user's location and fetching nearby WordPress events.

WordPress needs a privacy policy which covers data that gets sent to wordpress.org. The wordpress.org website has a privacy policy, and it may be sufficient to link to this, or it may be required to extend this with information specifically regarding the data that installations of WordPress send to api.wordpress.org. I recommend the addition of a new Privacy tab on the About WordPress screen.

It's worth noting that the pending EU GDPR affects everyone because it covers the export of data outside of the EU.

Adding to the 4.8 milestone as the WordPress Events and News dashboard widget is a particularly visible example of data collection in WordPress.

Related: Long-running discussion on #16778.

Attachments (7)

privacy.diff (3.9 KB) - added by swissspidy 7 years ago.
Early patch for the about page as an inspiration
40794.2.diff (3.6 KB) - added by jnylen0 7 years ago.
Patch suitable for 4.8.1 (no new file)
40794.3.diff (3.7 KB) - added by jnylen0 7 years ago.
Minor string update
40794.3.trunk.diff (3.9 KB) - added by jnylen0 7 years ago.
40794.4.diff (3.6 KB) - added by jnylen0 7 years ago.
Remove TODO comment - no longer necessary now that we are using a separate patch for trunk / 4.9.
40794.5.diff (4.0 KB) - added by danieltj 6 years ago.
new-privacy.png (122.9 KB) - added by danieltj 6 years ago.


Change History (74)

@swissspidy
7 years ago

Early patch for the about page as an inspiration

#1 @swissspidy
7 years ago

Also related in terms of data collection: #38418

Too bad previous efforts for such a privacy policy on the about screen were kinda ignored (see https://make.wordpress.org/core/2017/02/24/dev-chat-summary-february-22nd-4-7-3-week-4/). @mattyrob and I even shared mockups and a patch there. Attaching this patch now here for further discussion.

#2 @SergeyBiryukov
7 years ago

  • Component changed from General to Help/About

This ticket was mentioned in Slack in #core by jeffpaul. View the logs.


7 years ago

#4 @jbpaul17
7 years ago

  • Milestone changed from 4.8 to 4.8.1

Punting to 4.8.1 per discussion in today's 4.8 rc1 bug scrub in #core.

#5 follow-up: @netweb
7 years ago

If WordPress 4.8 is going to ship with a new data collection feature, I think it should include a privacy policy in 4.8, not 4.8.1. Privacy should not be considered an afterthought by the project; it should be front and centre IMHO.

#6 in reply to: ↑ 5 @kirasong
7 years ago

Replying to netweb:

If WordPress 4.8 is going to ship with a new data collection feature, I think it should include a privacy policy in 4.8, not 4.8.1. Privacy should not be considered an afterthought by the project; it should be front and centre IMHO.

Agreed.

This ticket was mentioned in Slack in #core by iandunn. View the logs.


7 years ago

This ticket was mentioned in Slack in #core by jeffpaul. View the logs.


7 years ago

This ticket was mentioned in Slack in #core by jeffpaul. View the logs.


7 years ago

#11 @netweb
7 years ago

Here's privacy.diff in action on /trunk. I'm all for the approach taken here of adding a new tab:

  • https://cldup.com/F0rskG2Kxc.png

The current WordPress.org policy at https://wordpress.org/about/privacy/ explicitly mentions WordPress.org throughout the document. Changing these references to "WordPress and WordPress.org" where applicable would be a good start towards covering both WordPress and WordPress.org.

#12 @Clorith
7 years ago

I like this addition, simple to understand. I also don't think we need to change the policy page on WordPress.org, as the patch mentions that's where we are transferring data, so anything covering .org would be covered by the data we transmit.

This ticket was mentioned in Slack in #core by jeffpaul. View the logs.


7 years ago

@jnylen0
7 years ago

Patch suitable for 4.8.1 (no new file)

#14 @jnylen0
7 years ago

  • Keywords has-patch added; needs-patch removed
  • Owner set to jnylen0
  • Status changed from new to assigned

This is long overdue. Let's do it in 4.8.1.

The attached patch moves the privacy text to freedoms.php temporarily because we can't add a new file in a minor release.

I'd like to see a blurb about the events widget too, but I think this is a good start.

#15 @Clorith
7 years ago

Looks good, but I'd suggest a slight string change:

Your WordPress site may send anonymous data including the list of installed plugins and themes to WordPress.org when requesting updates.

Let's make this a bit more "vague", if you will, so that we're not painting ourselves into a corner:

Your WordPress site may send anonymous data including, but not limited to, the list of installed plugins and themes to WordPress.org when requesting updates.

@jnylen0
7 years ago

Minor string update

#16 @jnylen0
7 years ago

@Clorith done in 40794.3.diff.

The events widget collects and sends a "network ID" value based on the IP address. In order to write the privacy text about this value, we need to know what the WP.org servers do with it. So it looks like we should just go with 40794.3.diff for the upcoming 4.8.1 beta release.

Last edited 7 years ago by jnylen0 (previous) (diff)

#17 @swissspidy
7 years ago

By the way, MattyRob should get props for privacy.diff as well. The patch came together after his initial patch/idea.

#18 @jnylen0
7 years ago

  • Keywords commit added

Per Slack discussion, added a separate patch for trunk:

The trunk patch is very similar to privacy.diff but with the string change from @Clorith and the cleanup of a couple of things that were copied over from the Credits page.

Regarding svn and the merge of these two patches:

westonruter [11:13 PM]
For the Custom HTML widget I committed the separate file in trunk. And then for the 4.8 branch I did svn merge as normal, but before committing, I removed the newly added file and amended it on the existing file. In that way, the merge info is retained.

Last edited 7 years ago by jnylen0 (previous) (diff)

#19 @jnylen0
7 years ago

  • Resolution set to fixed
  • Status changed from assigned to closed

In 41096:

About page: Add a privacy policy.

Props MattyRob, johnbillion, swissspidy.
Fixes #40794.

#20 @jnylen0
7 years ago

  • Keywords fixed-major added
  • Resolution fixed deleted
  • Status changed from closed to reopened

Reopening to get 40794.4.diff (DIFFERENT from the above commit) landed in 4.8.1.

Last edited 7 years ago by jnylen0 (previous) (diff)

@jnylen0
7 years ago

Remove TODO comment - no longer necessary now that we are using a separate patch for trunk / 4.9.

#21 @netweb
7 years ago

  • Keywords i18n-change added

#22 @iandunn
7 years ago

The most important thing to be transparent about regarding the Events Widget is the partially anonymized _client_ IP address. Usually API calls only expose the _server_ address, but this one needs to send the client so that we can geolocate their IP to get their location.

The IP is anonymized to the netblock, e.g., 50.60.70.80 becomes 50.60.70.0. That’s typically accurate enough for geolocation, but removes the ability to identify the specific user.
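The netblock anonymization described above can be sketched roughly as follows. This is an illustration in Python, not the PHP that WordPress itself runs; the function name `anonymize_ipv4` is hypothetical, and the /24 prefix is an assumption inferred from the 50.60.70.80 to 50.60.70.0 example.

```python
import ipaddress


def anonymize_ipv4(address: str, prefix_length: int = 24) -> str:
    """Zero the host bits of an IPv4 address, keeping only its netblock.

    The default /24 prefix is an assumption based on the example in the
    comment above, not a confirmed detail of the Events Widget.
    """
    # strict=False lets us pass a host address; the host bits are masked off.
    network = ipaddress.IPv4Network(f"{address}/{prefix_length}", strict=False)
    return str(network.network_address)


print(anonymize_ipv4("50.60.70.80"))
```

Every address in the same /24 block maps to the same value, which is why the result is still usable for coarse geolocation but no longer points at a single device.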

There are also a few other things that the Events Widget sends to api.w.org, but they might not be sensitive enough to be worth mentioning:

  • the locale for their WP user account (or site locale if user locale isn’t set)
  • the timezone from their browser (not the site timezone)
  • the value they typed in to the City field, if they chose to override the geolocated location

Core also exposes the client IP of logged-in users and front-end visitors to external sites in several situations. In those cases, it is not partially anonymized, so the specific device could be identified.

  • Requesting images/videos/etc from the w.org CDN (like wp-admin/about.php)
  • Requesting images from Gravatar (owned by Automattic) in wp-admin and on the front-end (via the default themes).
  • Requesting fonts from Google Fonts on the front-end (via the default themes)
  • Maybe a few others I missed

Here's a rough draft at some user-oriented language:

Your WordPress site may expose your computer's IP address, and the IP addresses of your visitors, to external websites. This happens when WordPress needs to download images, fonts, and other assets used within the Administration Panels and when browsing your site. To learn more, you can read the privacy policies for WordPress.org, Gravatar, and Google Fonts.

Your site may also send your IP address to WordPress.org, in order to determine your approximate location, so that you can be shown upcoming WordPress events in your area. WordPress.org does not use your IP address for any other purpose, and does not store it permanently.

Since the CDN requests expose the full IP, I don't think it's worth burdening the user with information about the partial anonymizing that the Events Widget does.

We should probably also add something about Akismet, like:

If you choose to enable the Akismet plugin to block spam, your WordPress site will also send data to Akismet's API, in order to determine if the comment should be blocked. The data may include the text of the comment, and metadata about the commenter, including their IP address, name, and email address. For more details, see Akismet's privacy policy.

If you choose to install any plugins or themes that are not bundled with WordPress, they may also send additional data to external services. You can learn more by reading their respective privacy policies.

This ticket was mentioned in Slack in #core by iandunn. View the logs.


7 years ago

#24 @coreymckrill
7 years ago

Could also mention the Community Events Privacy plugin, which specifically prevents the Events Widget from sending the user's IP address:

https://wordpress.org/plugins/community-events-privacy/

#25 @pento
7 years ago

  • Keywords commit fixed-major removed
  • Owner changed from jnylen0 to pento
  • Status changed from reopened to assigned

The language used here needs to be reviewed by the Foundation before it can be released.

#26 follow-ups: @swissspidy
7 years ago

@iandunn IMHO there should be a filter the Akismet plugin would leverage. It‘s not a part of core after all.

#27 in reply to: ↑ 26 @iandunn
7 years ago

Replying to swissspidy:

@iandunn IMHO there should be a filter the Akismet plugin would leverage. It‘s not a part of core after all.

I don't have any objection to that. I just included it here because being bundled means that the majority of users won't realize the distinction.

This ticket was mentioned in Slack in #core by jeffpaul. View the logs.


7 years ago

#29 follow-up: @jbpaul17
7 years ago

  • Milestone changed from 4.8.1 to 4.9

Punting to 4.9 per today's bug scrub to give the Foundation time to confirm appropriate language for this.

@pento who can help shepherd this with the Foundation so that we get confirmed language to include this in 4.9?

#30 in reply to: ↑ 26 @kirasong
7 years ago

Replying to swissspidy:

@iandunn IMHO there should be a filter the Akismet plugin would leverage. It‘s not a part of core after all.

I love the idea of a filter here for plugins to add their privacy policy information to.
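For readers unfamiliar with the idea, the filter proposed above could look roughly like this. It is an illustration in Python, not WordPress code: `add_filter` and `apply_filters` borrow the names of WordPress's PHP hook functions, but the registry below, the `privacy_policy_sections` hook name, and the section texts are all hypothetical.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List

# Hypothetical in-memory hook registry: hook name -> list of callbacks.
_filters: DefaultDict[str, List[Callable[[List[str]], List[str]]]] = defaultdict(list)


def add_filter(hook: str, callback: Callable[[List[str]], List[str]]) -> None:
    """Register a callback that may modify the value passed through `hook`."""
    _filters[hook].append(callback)


def apply_filters(hook: str, value: List[str]) -> List[str]:
    """Pass `value` through every callback registered on `hook`, in order."""
    for callback in _filters[hook]:
        value = callback(value)
    return value


def akismet_privacy_section(sections: List[str]) -> List[str]:
    """What a plugin might do: append its own (hypothetical) disclosure."""
    return sections + ["Akismet: comment text and commenter metadata are sent to Akismet's API."]


add_filter("privacy_policy_sections", akismet_privacy_section)

core_sections = ["Core: update checks send data to WordPress.org."]
all_sections = apply_filters("privacy_policy_sections", core_sections)
print("\n".join(all_sections))
```

In a design like this, whatever text a callback returns is shown on the page as-is, so plugin-supplied wording would not go through the same review as the core text.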

#31 in reply to: ↑ 29 @pento
7 years ago

Replying to jbpaul17:

@pento who can help shepherd this with the Foundation so that we get confirmed language to include this in 4.9?

I'm on it, chatting to folks now. :-)

#32 @jnylen0
7 years ago

I think adding a filter for plugins is a bit overkill, especially for the first version we ship. It also re-introduces the same issue that caused this to be punted to 4.9, where un-vetted language would be appearing on this page.

A simpler alternative would be to include some phrasing like "Third-party plugins installed on this WordPress site may also collect and send data, subject to [reference to some rules for plugins]. Refer to the privacy policies of the individual plugins for more information."

This ticket was mentioned in Slack in #meta by clorith. View the logs.


7 years ago

This ticket was mentioned in Slack in #core by jeffpaul. View the logs.


6 years ago

@danieltj
6 years ago

#35 @danieltj
6 years ago

  • Keywords needs-testing added

I've updated the patch. I've added a new file (/wp-admin/privacy.php) because I don't think tagging on the back of the freedoms.php using GET parameters is the best way forward. Anyway, the patch I've added updates the about section with the new page and I've updated the copy on that page as well.

Thoughts? It would be good to test this and get it shipped in 4.9 as soon as possible so we're not late.

This ticket was mentioned in Slack in #core by danieltj. View the logs.


6 years ago

@danieltj
6 years ago

This ticket was mentioned in Slack in #core by jeffpaul. View the logs.


6 years ago

This ticket was mentioned in Slack in #core by jeffpaul. View the logs.


6 years ago

#39 @jbpaul17
6 years ago

  • Milestone changed from 4.9 to Future Release

Punting this to Future Release per the 4.9 bug scrub earlier today. :sad-panda:

#40 @pento
6 years ago

  • Milestone changed from Future Release to 4.9
  • Type changed from enhancement to task (blessed)

Unpunting. The initial version is already in 4.9 ([41096]), but needs language review. I think it will mostly stay the same; the bulk of the policy will live on w.org.

#41 follow-up: @patrickgarman
6 years ago

The privacy page in its current form doesn't seem to acknowledge that WordPress will send data about the site (multisite counts and user counts) to WordPress.org. It says more information about the data collected will live at wordpress.org/privacy which is not currently live. As a user, where could I find the full list of all data being collected by WordPress.org and what that data is used for? I think having it live on a linked .org page is a reasonable solution, but I couldn't find a reference to what that page would actually look like.

Also - now that a privacy policy will exist, will there be an opt-out for users to *not* share anonymous data with WordPress.org? Or are we just forcing them to continue sharing by hiding the off button in obscurity, requiring complex filters that most of the userbase has no idea how to use?

#42 @javorszky
6 years ago

Cross posting this information from another ticket, if you haven't seen that one: https://core.trac.wordpress.org/ticket/16778#comment:96

Then there are the examples of other projects dealing with the issue differently (better):
Ghost - https://github.com/TryGhost/Ghost/pull/3064 (issue where this was discussed) and https://github.com/TryGhost/Ghost/blob/master/PRIVACY.md (current version of the documentation)
npm - https://github.com/npm/policies/blob/master/privacy.md
piwik - https://github.com/piwik/piwik/issues/6196

Not only that, Ghost offers the ability to turn off updates / gravatars / google fonts, etc, because each and every one of them are leaking personally identifiable information (no, I'm not interested in debating how that information is personally identifiable, that's been established in other tickets / in blog posts, etc).

#43 @idea15
6 years ago

I personally hate the term "privacy policy" because it suggests impenetrable paragraphs of backside-covering written by a lawyer which bears little to no resemblance to the actual data collection and use on the site. Everyone needs to switch the perspective from privacy policies to GDPR's privacy notices, which are clear, accountable, transparent disclosures of what information is sent, to whom it is sent, and what control the user has over that.

Anyone building a .org site which collects personal data and is subject to GDPR will need to disclose, in that site's privacy notice, what personal data (which, under GDPR, includes online identifiers) is being sent to wp.com and what control they have over the transmission of that information. That goes for the data being collected through plugins and themes as well; see the WP Tavern discussion on Gforms and contact form retention on databases.

The anonymised or pseudonymised information sent for security purposes (updates) is fine. However, if the information transmitted to WP.org for the purposes of checking for upgrades also allows wp.com to see that Popular Ecommerce Site X has 100,000 customers, that's an online identifier, commercially sensitive information, and another headache.

At the very least, there will need to be a way for anyone building a .org site to immediately reference all of the information they need about the data collection and transmission taking place both within the base wp install *and* any plugins and themes in order to include that information within their own privacy notice. That information has to include granular choices for opting-out if the user so wishes, whether that is Gravatar or Google Fonts or anything bar the most essential functionality.

And if wp.org could save the users of the web literally thousands of hours in both sourcing and properly arranging that information, rather than (as has been mentioned above) sending users on a mystery tour for information which only developers can comprehend, all the better.

#44 in reply to: ↑ 41 @zodiac1978
6 years ago

Replying to patrickgarman:

It says more information about the data collected will live at wordpress.org/privacy which is not currently live.

The correct link is https://wordpress.org/about/privacy/

Is this intentional, that the link text is only https://wordpress.org/privacy/ which is a 404?

#45 follow-up: @pento
6 years ago

  • Keywords has-patch i18n-change needs-testing removed

Thank you for the opinions, everyone! I'm checking in with legal folks to see where we're at. This will definitely be ready for 4.9, though!

To address a few of the questions brought up:

The privacy page in its current form doesn't seem to acknowledge that WordPress will send data about the site (multisite counts and user counts) to WordPress.org.

The privacy page on w.org will go into detail about what's sent. I don't know what the exact wording for privacy.php in Core will be, but it will likely continue to be a summary, with a pointer to the page on w.org.

The correct link is https://wordpress.org/about/privacy/

Yup, this will need to be updated in Core once we have the final wording.

will there be an opt-out for users to *not* share anonymous data with WordPress.org?

This ticket is just about the privacy policy. Discussion about filters, opt-outs, and the like should happen on #16778.

Anyone building a .org site which collects personal data and is subject to GDPR will need to disclose, in that site's privacy notice, what personal data (which, under GDPR, includes online identifiers) is being sent to wp.com and what control they have over the transmission of that information.

Unless you install and activate Jetpack, data is only being sent to WP.org, not WP.com. WP.org is hosted on separate infrastructure, and Automattic employees do not generally have access to it. Some Automattic employees (myself included) have access, but we absolutely do not share that data inside Automattic; my colleagues know better than to ask. 🙂

#46 in reply to: ↑ 45 ; follow-up: @javorszky
6 years ago

Replying to pento:

Unless you install and activate Jetpack, data is only being sent to WP.org, not WP.com. WP.org is hosted on separate infrastructure, and Automattic employees do not generally have access to it. Some Automattic employees (myself included) have access, but we absolutely do not share that data inside Automattic; my colleagues know better than to ask. 🙂

The main point is that it needs to be declared what data is sent where, who has access to it, and for what purpose. In this case, as soon as you install WP core, your blog name, if multisite, how many subsites, and your user count will be sent to the .org infrastructure every time WP Core checks for available updates (twice daily by default). The following people have access to the data: core contributors (? I don't actually know, but would love to), and for what purpose, ie: how is the data used to inform whatever decision it is informing.

I'd also like to know what stops "some Automattic employees" from exfiltrating data from the .org infrastructure to use on other projects, such as .com infrastructure upgrades / marketing / whatever. Separation of roles / concerns / specific purpose.

GDPR isn't even here yet, but having read about data protection on the UK Government's site (https://www.gov.uk/data-protection), it seems current data protections aren't adequate even now.

My next question: is this ticket only going to be about wording of the privacy policy, or will there be revisions to how the .org infrastructure handles data as well?

#47 in reply to: ↑ 46 ; follow-up: @pento
6 years ago

Replying to javorszky:

The main point is that it needs to be declared what data is sent where, who has access to it, and for what purpose.

That's the plan. 🙂

The following people have access to the data: core contributors (? I don't actually know, but would love to)

I'm going to take a wild guess and say that there won't be a list of which specific individuals have access. 😉

and for what purpose, ie: how is the data used to inform whatever decision it is informing.

Also part of the plan!

is this ticket only going to be about wording of the privacy policy

This ticket is only to track implementation of the privacy policy, exact wording is up to the lawyers. As programmers, we get to leave it in their expert hands, as it should be. 🙂

will there be revisions to how the .org infrastructure handles data as well?

The Systems team doesn't follow Core Trac, but is aware of GDPR requirements. The current advice I have is that no changes are needed to the .org infrastructure at the moment.

#48 @ocean90
6 years ago

#42169 was marked as a duplicate.

This ticket was mentioned in Slack in #core by melchoyce. View the logs.


6 years ago

#50 @pento
6 years ago

In 41944:

About page: Update the privacy policy language.

See #40794.

#51 @pento
6 years ago

Note: the w.org page isn't finished, so I'll leave this ticket open pending that being updated.

#52 @pento
6 years ago

In 41946:

About page: Update the privacy policy dashes.

If one were to insert 1—3 dashes into a sentence - on purpose - they should use the correct da–
sh.

See #40794.

#53 in reply to: ↑ 47 ; follow-up: @samuelsidler
6 years ago

Replying to pento:

This ticket is only to track implementation of the privacy policy, exact wording is up to the lawyers. As programmers, we get to leave it in their expert hands, as it should be. 🙂

Did a lawyer choose the wording you committed to core?

This data helps WordPress to protect your site by finding and automatically installing new updates, as well as other general enhancements.

I have no idea what "as well as other general enhancements" has to do with the rest of that sentence. It feels out of place and poor English. If I remove the initial part of the sentence, it would turn into: "This data helps WordPress to protect your site by finding and automatically installing other general enhancements." ... And that's assuming the best case with grammar, but is factually inaccurate, to the best of my knowledge.

None of the information shared with the update server contains personally identifiable information.

I'm guessing this line was removed because it's not true and the information shared with the update server contains personally identifiable information. Is that true or was the removal an accident?

This ticket was mentioned in Slack in #core by jeffpaul. View the logs.


6 years ago

#55 in reply to: ↑ 53 ; follow-up: @pento
6 years ago

Replying to samuelsidler:

Did a lawyer choose the wording you committed to core?

Yep.

I have no idea what "as well as other general enhancements" has to do with the rest of that sentence. It feels out of place and poor English. If I remove the initial part of the sentence, it would turn into: "This data helps WordPress to protect your site by finding and automatically installing other general enhancements." ... And that's assuming the best case with grammar, but is factually inaccurate, to the best of my knowledge.

Yah, I can see how this could be confusing. I'll see what we can do about clarifying it.

I'm guessing this line was removed because it's not true and the information shared with the update server contains personally identifiable information. Is that true or was the removal an accident?

This was changed because "personally identifiable information" has special meaning in the GDPR, while also being open to interpretation by the supervisory authorities. Once the GDPR has come into force and some test cases have been run, we'll have a better idea of exactly what wording we can use here, I think.

#56 @pento
6 years ago

In 42017:

About page: The link URLs in the privacy policy shouldn't be translatable.

Both of these URLs live on the main wordpress.org site, not Rosetta sites.

See #40794.
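The general pattern behind this change can be sketched as follows: the translatable string carries a placeholder, and the URL is substituted afterwards, so translators never see or alter it. This is an illustration in Python, not the committed PHP; the `translate` helper, the catalogue, and the example strings are hypothetical.

```python
# The URL lives outside the translatable text, so every locale links to
# the same page on the main wordpress.org site.
PRIVACY_URL = "https://wordpress.org/about/privacy/"

# Hypothetical translation catalogue: locale -> {source string: translation}.
CATALOGUE = {
    "de": {
        'Read our <a href="%s">privacy policy</a>.':
            'Lies unsere <a href="%s">Datenschutzerklärung</a>.',
    },
}


def translate(text: str, locale: str) -> str:
    """Return the translation for `text`, falling back to the source string."""
    return CATALOGUE.get(locale, {}).get(text, text)


def privacy_link(locale: str) -> str:
    """Translate the template first, then fill in the fixed URL."""
    template = translate('Read our <a href="%s">privacy policy</a>.', locale)
    return template % PRIVACY_URL


print(privacy_link("de"))
```

Because only the template passes through translation, a translator cannot accidentally point the link at a localized site where the page does not exist.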

#57 @pento
6 years ago

Note for milestone purposes: this ticket will stay open until the privacy policy changes happen on the wordpress.org side, so that progress is being tracked in one location.

#58 in reply to: ↑ 55 @javorszky
6 years ago

Replying to pento:

This was changed because "personally identifiable information" has special meaning in the GDPR, while also being open to interpretation by the supervisory authorities. Once the GDPR has come into force and some test cases have been run, we'll have a better idea of exactly what wording we can use here, I think.

Wait, are you (.org foundation, A8C lawyers, people having decision making powers over this piece of text) suggesting you're going to intentionally leave sites open for liability because you don't know how to deal with GDPR? Even though you had prior knowledge about what needs to be done?

Would YOU want to be one of the few sites on which the test cases will be run?

For the record, "personal data" is a fairly explicit list of things, not really open for interpretation. See the definition at https://gdpr-info.eu/art-4-gdpr/, and the PDF attached to this post for a more visual representation of the same thing: https://enterprivacy.com/2017/03/01/categories-of-personal-information/.

Last edited 6 years ago by javorszky (previous) (diff)

This ticket was mentioned in Slack in #core by jeffpaul. View the logs.


6 years ago

This ticket was mentioned in Slack in #core by jeffpaul. View the logs.


6 years ago

#61 @pento
6 years ago

In 42045:

About page: Tweak the privacy policy language, for clarity.

See #40794.

#62 @pento
6 years ago

  • Resolution set to fixed
  • Status changed from assigned to closed

To clean up the 4.9 milestone, I'm closing this ticket in favour of #meta3237, which is for updating the w.org page.

This ticket was mentioned in Slack in #community-team by iandunn. View the logs.


6 years ago

This ticket was mentioned in Slack in #meta by sergey. View the logs.


6 years ago

This ticket was mentioned in Slack in #gdpr-compliance by iandunn. View the logs.


6 years ago

This ticket was mentioned in Slack in #forums by ipstenu. View the logs.


6 years ago

#67 @shaikhali123
3 years ago

[spam, account blocked -iandunn]

Last edited 3 years ago by iandunn (previous) (diff)