Make WordPress Core

Opened 6 years ago

Closed 5 years ago

Last modified 5 years ago

#45889 closed enhancement (fixed)

Include Session Tokens as personal information in data exports and erasure

Reported by: lakenh's profile lakenh Owned by: garrett-eclipse's profile garrett-eclipse
Milestone: 5.4 Priority: normal
Severity: normal Version: 4.9.6
Component: Privacy Keywords: has-patch has-unit-tests has-screenshots commit has-dev-note
Focuses: administration Cc:

Description (last modified by garrett-eclipse)

#44161 raised some concerns about if we missed any personal data when the personal information export was released. Upon further investigation, the core-privacy team found multiple places in the user_meta table that still contains information that we should include in exports.

The currently known ones are the following:

  • Session Tokens: Contains IP address and user agent
  • Community Events: Contains IP address

*Community Events data will be handled via #43921

The scope of this ticket isn't about removing/anonymizing this information, instead just including it within the current user export and erasure tools.

Attachments (9)

45889.diff (1.3 KB) - added by nickylimjj 5 years ago.
session_token_export.png (155.6 KB) - added by nickylimjj 5 years ago.
45889.2.diff (1.3 KB) - added by nickylimjj 5 years ago.
export-with-session token.png (147.9 KB) - added by nickylimjj 5 years ago.
45889.3.diff (3.6 KB) - added by garrett-eclipse 5 years ago.
Updated patch to isolate the Session Tokens to their own exporter group and introduce unit test coverage.
Screen Shot 2020-01-29 at 1.48.13 AM.png (89.5 KB) - added by garrett-eclipse 5 years ago.
Session Tokens export group
session_tokens.png (47.1 KB) - added by xkon 5 years ago.
45889.4.diff (3.6 KB) - added by garrett-eclipse 5 years ago.
Refreshed patch to separate individual session tokens within the group.
Screen Shot 2020-01-29 at 1.32.18 PM.png (81.3 KB) - added by garrett-eclipse 5 years ago.
Updated Session Tokens grouping

Download all attachments as: .zip

Change History (23)

#1 @desrosj
6 years ago

  • Component changed from Privacy to Users
  • Focuses privacy added
  • Milestone changed from Awaiting Review to Future Release

One thing to be careful with here is the erasing the IP and user agent information used for sessions would cause an issue with sessions becoming invalidated. This would need to either be communicated to the user or, instead of erasing in the current process, a second step could be provided for erasing the sessions.

#2 follow-up: @garrett-eclipse
6 years ago

One question I had about the use of IP in the Community Events is would an anonymized IP be sufficient to geolocate an area to surface community events from? If so we could avoid needing to include it in export and erasure by anonymizing it prior to storing in the usermeta table which would make it no longer PII. Just a thought.

With Session Tokens I don't believe they would be as secure if the IP was anonymized and might not work at all so wouldn't go down that route with them.

#3 in reply to: ↑ 2 @garrett-eclipse
6 years ago

  • Component changed from Users to Privacy
  • Description modified (diff)
  • Focuses administration added; privacy removed
  • Summary changed from Include personal information from within the user_meta table in data exports to Include Session Tokens as personal information in data exports and erasure
  • Version changed from 5.0.2 to 4.9.6

As #43921 already exists with an existing patch we'll continue work for the Community Events Location information through that ticket. As such I've updated this ticket to change it's focus to be specific to Session Tokens.

And to answer my question from previous garrett-eclipse:

One question I had about the use of IP in the Community Events is would an anonymized IP be sufficient to geolocate an area to surface community events from? If so we could avoid needing to include it in export and erasure by anonymizing it prior to storing in the usermeta table which would make it no longer PII. Just a thought.

*This was answered on the other ticket indicating that the IP address is already partially anonymized as was indicated on this ticket comment;
https://core.trac.wordpress.org/ticket/40794#comment:22

#4 @garrett-eclipse
6 years ago

  • Keywords needs-patch added

@nickylimjj
5 years ago

#5 follow-up: @nickylimjj
5 years ago

  • Keywords has-patch needs-unit-tests added; needs-patch removed

Added a patch to export session tokens data. $token-props defined based on ticket but would require investigating into where in codebase it is defined. Did a cursory check but can't seem to find it in the __constructor for abstract base class WP_Session_Tokens nor class WP_User_Meta_Session_Tokens.

#6 in reply to: ↑ 5 @nickylimjj
5 years ago

Replying to nickylimjj:

Added a patch to export session tokens data. $token-props defined based on ticket but would require investigating into where in codebase it is defined. Did a cursory check but can't seem to find it in the __constructor for abstract base class WP_Session_Tokens nor class WP_User_Meta_Session_Tokens.

To any helpful debugger, the attachment is an example of the key-values for the session_tokens metadata if anyone is familiar where value's type is defined.

@nickylimjj
5 years ago

#7 @rconde
5 years ago

Thanks @nickylimjj it would be great to have this feature included in a next version of WP, as IP addresses fall as private data under the GDPR.

Your patch works great so at least it's something to be fully compliant.

@garrett-eclipse
5 years ago

Updated patch to isolate the Session Tokens to their own exporter group and introduce unit test coverage.

@garrett-eclipse
5 years ago

Session Tokens export group

#8 @garrett-eclipse
5 years ago

  • Keywords has-unit-tests has-screenshots needs-testing added; needs-unit-tests removed
  • Milestone changed from Future Release to 5.4
  • Owner set to garrett-eclipse
  • Status changed from new to accepted

Thanks @nickylimjj for getting this started, your patch worked nicely but I found the more tokens we get the more convoluted the value of that row becomes, as well the User profile information we latched in to is for the profile. I felt we should isolate the tokens to their own section so have refreshed your patch in 45889.3.diff to go that direction and provide a unit test for coverage.

Note: This follows the recent patch I placed on #43921

I also included in the export the expiration and last login date as that information can assist users attempting to use the data to vet if any malicious sessions exist.

Give it a test please and hopefully we can address this in 5.4.

@xkon
5 years ago

#9 @xkon
5 years ago

Do we really need an extra group just for session data? We're adding a filter (core values are protected) at #47509 to have an easier way to include metadata that are simple & fast getters in the User meta section. Since sessions are also within usermeta and they should be in the same table IMHO.

On a further note since you can have multiple sessions per user, the table view from 45889.3.diff isn't optimal from a "reading" perspective, see session_tokens.png as there's no clear view of where 1 sessions data start and end.

I would prefer them under the existing "User meta data" (it's getting renamed on the ticket mentioned earlier from Users profile data to Users meta data) identified as Session Data group 1, Session data group 2 so on so forth, or with any other incremental name to again have an easier identification while being read by someone. We can explode the actual info with <br/> or something like that to separate the individual data per session but essentially keep them within the same <tr> of the table if that makes sense.

@garrett-eclipse
5 years ago

Refreshed patch to separate individual session tokens within the group.

@garrett-eclipse
5 years ago

Updated Session Tokens grouping

#10 @garrett-eclipse
5 years ago

Thanks for testing @xkon and the feedback.

The screenshot you provided was exactly why going with a grouping I feel will be easier to read and digest. When it was in a single table row inside the User Meta even with one token it's convoluted, as soon as you have 2+ it's impossible to distinguish them. I took that into account for 45889.4.diff so each token inside of the group will be slightly separated similar to comments. See screenshot.

Along with the TOC coming in #46894 by using a grouping we make it easier to find and digest this information. And currently I'm seeing the User grouping is labelled specifically for profile data which Session Tokens aren't apart of.

Let me know what you think.

#11 @xkon
5 years ago

Aha! Now it actually makes total sense @garrett-eclipse . My main concern will always be on how to make it as easy as possible to end-users who are reading unknown things already. After our Privacy bug-scrubs as well yes we're definitely on the same side on moving this forward as it is on its own section :-).

Thanks for the fast follow-up patch! I'll give it another run asap.

#12 @xkon
5 years ago

  • Keywords commit added; needs-testing removed

Re-tested the final patch and works as expected. Thanks for the refresh & breaking the groups up @garrett-eclipse ! Marking this for commit.

#13 @SergeyBiryukov
5 years ago

  • Resolution set to fixed
  • Status changed from accepted to closed

In 47237:

Privacy: Include session tokens in Personal Data Export.

Session tokens contain an IP address and user agent.

Props garrett-eclipse, nickylimjj, lakenh, xkon, rconde.
Fixes #45889.

Note: See TracTickets for help on using tickets.