Make WordPress Core

Opened 7 years ago

Closed 7 years ago

Last modified 6 years ago

#43443 closed enhancement (fixed)

Add a method for confirmation of requests for deleting or anonymizing of personal data

Reported by: azaozz's profile azaozz Owned by: mikejolley's profile mikejolley
Milestone: 4.9.6 Priority: normal
Severity: normal Version:
Component: Privacy Keywords: gdpr needs-testing commit fixed-major
Focuses: Cc:

Description

To avoid misuse when a request is made to anonymize or delete personal data, we will need to have a simple way of confirming the user's intention.

A good way would be to send an email to the user's known email address. That email will contain a link similar to the reset password link. When that link is clicked, we can send an email to the site's owner/admin informing them about the request.

Attachments (12)

feature.43443.diff (11.8 KB) - added by mikejolley 7 years ago.
First pass
43443.patch (12.6 KB) - added by azaozz 7 years ago.
update.43443.diff (13.8 KB) - added by mikejolley 7 years ago.
Update for 43443
update.43443.2.diff (15.0 KB) - added by mikejolley 7 years ago.
43443.3.patch (14.4 KB) - added by azaozz 7 years ago.
43443.4.diff (14.4 KB) - added by mikejolley 7 years ago.
Synced with master and working :)
repeating-requests.png (17.7 KB) - added by azaozz 7 years ago.
43443-refactor.diff (36.7 KB) - added by mikejolley 7 years ago.
43443.5.patch (38.4 KB) - added by azaozz 7 years ago.
Screen Shot 2018-04-26 at 8.11.17 PM.png (107.1 KB) - added by desrosj 7 years ago.
43443.6.patch (23.7 KB) - added by mikejolley 7 years ago.
43443.diff (516 bytes) - added by desrosj 7 years ago.

Download all attachments as: .zip

Change History (71)

#1 @azaozz
7 years ago

  • Owner set to mikejolley
  • Status changed from new to assigned

@mikejolley
7 years ago

First pass

#2 @mikejolley
7 years ago

  • Keywords has-patch dev-feedback needs-testing added; needs-patch removed

Rather than a solution just for the above 2 use cases (delete account and anonymize), I thought it would be good to work on something more generic for this.

  • A function that could be called to confirm action X for email Y
  • An email to be generated with a confirmation link:
    • Based on email/password change email functions
    • Avoid personal identifiable information in the URL itself
    • Support for VISITORS who may not have an account, but may still have data in the database e.g. from comments.
    • Email content is filterable.
  • Avoided a new DB table (although we could add one for performance reasons, I opted for a combination of usermeta and options in first pass).
  • Added handling code to wp-login.php with similar code. That file is a mess but refactor is out of scope.

I've attached my first pass diff. Feedback welcome for naming/wording/approach.

Usage:

  1. Code which needs confirmation from the user first calls the send_confirm_account_action_email() function. You give it a name for your action, user email, and optionally a user friendly description for the action that is added to the email.
  2. Email is generated and sent - send_confirm_account_action_email() returns true if the mail was sent, or WP_Error object if there was a problem.
  3. User clicks link in the email. It will look something like this: https://local.wordpress.test/wp-login.php?action=emailconfirm&confirm_action=confirm-edit-account&uid=1&confirm_key=jIzpeoknqQZHErNhQsWJ
    1. Note the UID. This will be a user ID for a real WP account, and an email hash for a visitor who has no account.
    2. confirm_action is your given action name.
    3. confirm_key avoids conflicts with password reset.
  4. wp-login.php handles the new action (emailconfirm) and calls check_confirm_account_action_key(). This function returns true or false depending on if the confirmation data is valid. After this one of two things can happen:
    1. The link was valid. account_action_confirmed action is fired which passes the action name and email address of the user.
    2. The link was not valid, or expired. account_action_failed action is fired which passes the error object. The page is killed with error message.

The email that gets sent looks like this:

https://dl.dropboxusercontent.com/s%2Fn16y623mw1f1tgn%2Fs31c4pdznjnmauozSpark%252520-%252520Inbox%2525202018-03-05%25252017-32-34.png%2520%2528746%25C3%2597507%2529%25202018-03-05%252018-14-40.png

Thats it in a nutshell. The roughest part is the wp-login handling but I want feedback before working on it further.

@azaozz
7 years ago

#3 follow-up: @azaozz
7 years ago

In 42791:

Add a method to confirm user requests by email. First run.

Props mikejolley.
See #43443.

#4 @azaozz
7 years ago

feature.43443.diff works quite well. Only changed it so we always delete the stored token when the hash matches, and fixed a typo in var order. Left the ticket open so we can iterate/enhance it further.

Things to consider:

  • Prevent "flood" of requests. If a request is made and it hasn't expired, perhaps limit how many new requests can be made for the same email. Something like 10 should be plenty to cover legitimate user cases.
  • Perhaps add garbage collection function to delete expired requests.
  • Consider how this can be used through the REST API and add an endpoint.
  • Log confirmed requests and perhaps show them on the dashboard? Typically an admin will have to perform the requested action. When a site has more than one admin, would be good if all can see pending requests.
Last edited 7 years ago by azaozz (previous) (diff)

#5 follow-up: @azaozz
7 years ago

Log confirmed requests and perhaps show them on the dashboard

Was thinking about this a bit more: instead of deleting the confirmation token from the DB perhaps we can set a "confirmed" status on it and keep it until the action is performed?

Also, may be better to have a permanent log. Perhaps we can make a new private CPT (without editor, terms, revisions, etc. support) that will hold the log. Then can use postmeta to store the tokens on it. After the action is performed can add a row with the date and type of action but not the user email so it is anonymous.

#6 in reply to: ↑ 3 ; follow-up: @ocean90
7 years ago

Replying to azaozz:

In 42791:

Add a method to confirm user requests by email. First run.

Props mikejolley.
See #43443.

  • hash() may not be available, see #29518.
  • @return WP_ERROR|bool should be @return WP_Error|bool.
  • There should be no empty line between @param and @return tags.
  • All the WP_Error messages should end with a full stop.
  • In check_confirm_account_action_key the function should bail directly if one of the parameters is empty. So move that return new WP_Error( 'invalid_key', __( 'Invalid key' ) ); line up to avoid one huge if condition.
  • All the new actions have no @since tag.
  • Why are none of the functions prefixed with wp_?

#7 in reply to: ↑ 6 @azaozz
7 years ago

Replying to ocean90:

This is just the "first run" patch committed for easier testing/enhancements. Fixes welcome! :)

This ticket was mentioned in Slack in #core by azaozz. View the logs.


7 years ago

#9 in reply to: ↑ 5 @allendav
7 years ago

Replying to azaozz:

Log confirmed requests and perhaps show them on the dashboard

Was thinking about this a bit more: instead of deleting the confirmation token from the DB perhaps we can set a "confirmed" status on it and keep it until the action is performed?

Also, may be better to have a permanent log. Perhaps we can make a new private CPT (without editor, terms, revisions, etc. support) that will hold the log. Then can use postmeta to store the tokens on it. After the action is performed can add a row with the date and type of action but not the user email so it is anonymous.

I like the idea of a privacy actions log, but since that may lead us to having to have a separate db table, I recommend we break that out into a separate issue and implement this issue as a priority without that sub-feature.

A privacy actions to-do-list/log could be useful for demonstrating compliance or even just for knowing who did what and when. I don't get the impression from Article 30 or Recital 82 that the GDPR requires this to be automated or electronically held however, so I recommend a lower priority for privacy actions logging, but IANAL.

Maybe for V1 we just leave request tracking to the administrator(s) and their email inbox :)

#10 @mikejolley
7 years ago

All the WP_Error messages should end with a full stop.

@ocean90 See check_password_reset_key. If it needs adding it should be added to all strings. I don't want to introduce another localisation string just to add a ., and want to avoid touching other code for the moment :)

Why are none of the functions prefixed with wp_?

Can add - none of the other functions in that file were so I assumed there was no standard.

Addressed the other items of feedback, and renamed some functions with what I think is more logical naming.

@azaozz

Consider how this can be used through the REST API and add an endpoint.

I think this would be separate. Consider these to be helper functions than an endpoint could utlize to perform a user action.

Perhaps add garbage collection function to delete expired requests.
Log confirmed requests and perhaps show them on the dashboard? Typically an admin will have to perform the requested action. When a site has more than one admin, would be good if all can see pending requests.
Prevent "flood" of requests. If a request is made and it hasn't expired, perhaps limit how many new requests can be made for the same email. Something like 10 should be plenty to cover legitimate user cases.

These points would be better addressed by a custom table as I suggested. It would be difficult to scale garbage collection with the current meta/options because it would need to load all and compare the values inside the serialized array.

Which leads on to @allendav's point:

I like the idea of a privacy actions log, but since that may lead us to having to have a separate db table, I recommend we break that out into a separate issue and implement this issue as a priority without that sub-feature.

I agree separate issue but this is the same idea. The generic user action confirmations/requests table I'd like to introduce would not only store the generated key, date of request, action name, UID, it would also store dates of confirmation which serves as a log and could be exposed via a UI. e.g. list table showing all rows for action "delete_account".

Should that be moved out of this issue?

Maybe for V1 we just leave request tracking to the administrator(s) and their email inbox :)

I also agree with this. If we just agreed users would contact site administrator to request details or anonomize, this system is probably not needed. If we want to support form's, the confirmation step is still required to verify requests are genuine.

Sidenote; I am in favour of emailing requests to admin for account deletion/details instead of showing new list tables in core. The UI side could be let for plugins. For our use, an email with the details of the request and some shortcuts to the new export/anonymize tools would be sufficient.

Flow without confirmation:

  1. User contacts admin via email.
  2. Admin generates export via new tools.
  3. Admin sends export to user.

Flow with confirmation:

  1. User requests export via form or link on site.
  2. System emails user.
  3. User clicks confirm link.
  4. Admin gets email with request and tool links.

Can we discuss this further? If thats an option we should prioritize the new tools over the processes, including this issue.

---

Anyhow, new code is here in my fork here https://github.com/WordPress/WordPress/compare/master...mikejolley:update/43443?expand=1

Functions:

  • wp_send_account_verification_key
  • wp_get_account_verification_key
  • wp_check_account_verification_key

wp-login action:

  • verifyaccount

Hooks:

  • account_verification_email_content
  • account_verification_expiration

@azaozz will send a diff too.

@mikejolley
7 years ago

Update for 43443

#11 @DavidAnderson
7 years ago

Mike asked me to post this here (copied from a comment in the WooCommerce github):

One complex issue that I haven't seen a lot of discussion about for GDPR is the need to keep an *external* log of deletion events, so that if you have to restore your database/site from a backup, you must then re-delete all the data that has been deleted since the backup.

i.e. If a user requests deletion of their data, then technically it should be deleted from backups too. But, GDPR has allowances for technically infeasible tasks - having to unpack your backup, delete data, and re-pack it would arguably be in that category. But, if the backup is actually used, then the deleted data is back. So, a deletion log has to be kept. Obviously having that log in the database itself is no good because the backup might be needed because of the database being lost. About the only universally available mechanism would be "send an email", and then the site owner has to go through his emails. Larger sites would be able to use a hook to deploy a more sophisticated method of logging upon every deletion event.

What do you think?

#12 @allendav
7 years ago

Nice work Mike! A few thoughts:

In src/wp-includes/user.php you generate a pseudo user ID using
$uid = function_exists( 'hash' ) ? hash( 'sha256', $email ) : sha1( $email );
would you add a comment saying why you chose that ternary? i assume that is similar to code elsewhere in WordPress core?

I kinda liked the send_confirm_account_action_email function name better than wp_send_account_verification_key - because the latter suggests a specific action (verify your account) whereas the former is more generic so… maybe wp_send_account_action_confirmation_email instead? IDK - I guess this would impact the (re-)naming of your filters too so feel free to ignore

The account_verification_expiration filter - seems that might be named something like account_action_key_max_age

BTW, I get a few whitespace and equal sign alignment warnings from phpcs with the WordPress package (just installed it myself today)

Next, in the email template

All at ###SITENAME###

The All seems weird in an actual email, e.g. I got

Regards,
All at WordPressSVN
http://localhost/wordpress-svn/src

Maybe make that "All" template-able too (or make it "All of us", idk)

Tests well. I made sure I could confirm an action. Made sure old action keys were not accepted once a new key was generated for an email.

Lastly, regarding flows - if the user contacts the admin via email, especially in the case of a deletion request, the admin should probably email them back before doing the delete, since email headers can be faked. We might want to expose this feature you've written to admins in that flow especially so they can get verification before a destructive action. That could be part of #43437 of course.

Maybe add to this a means of cc'ing the administrator on the confirmation request? I.e. as part of the paper trail? IDK. The reason I'm wondering about that is it isn't clear to me yet is where we could log the final confirmation after the user completes it. At the end of the scenario I've described, the admin would just have the original request from the user, we could cc ourselves to get the confirmation request, but we wouldn't end up with a record (e.g. in email) of their confirming the request. Hmmmm. Maybe we can't avoid that log :P

#13 @xkon
7 years ago

My 2c on this:

Requests UI: if the handling is only through e-mails and not add a UI at all to keep requests there are tons of failing e-mails everyday throughout installations for various ( silly ofc ) reasons as not all admins are well Admins per say + in the case of more than 1 admins it's easier to just provide a view that everybody has access to.

If not for the first push, it's something that definitely has to happen at some point and soon imho.

About the e-mail: Not necessarily for 'v1' again but the whole e-mail should be an editable thing for Admins through the UI again. If possible for a shortcode that they can [confirm-link] in there wherever they want even better, if not let them type whatever they like and just insert the link either before or after their message and keep the translation editors just change the text label for the actual link in a way of 'Confirmation Link'. We can't decide what they want to write in there for them, this e-mail might seem as an automated WordPress message, but it's basically an automated Website message so it should be personally handled on the way it 'talks' to it's users.

About Backups: I've asked the same question over and over again on different lawyers and everyone said the same thing:

If you restore a backup with deleted information, sure you could have an extra list and re-delete them. Under the GDPR though there are failsafes for technical issues so you might as well don't even want to do that, nobody is going to blame you, that's something either way that is going to be 'seen' IF and when you ever reach an Audit etc, so basically you can simply keep ( not in WordPress ) the date that you reverted to a previous backup and that's it.

As for progressively 'deleting' within previous backups that's not up to the core at all as nobody knows of course how those backups are even kept or where.

In general backups and re-deletions etc is not something for core and especially not at this stage imho, there's already the reason out of it let's just try to use it and not add extra weight for the time being. This could be easily bumped for further looks if you like as this actually has the regulation itself protecting it.

--

Note: You always see me focusing on UI and trying to push things into the Admin. You have to always see it from a non-tech / experienced user (I'm sure you do but do it x2 this time as we're talking about Regulations and not a plugin that isn't that important to understand or you can call your dev to adjust it for you). Point being we have to make this whole 'UX' zombie level for both users + admins (for the user side it's actually mentioned in the GDPR itself).

Note 2: If I'm getting super-crazy feel free to stop me at any point hahah :P I'm always replying here with the actual stuff and ways / things that I'm told to fix for our clients in the company I'm at, and usually everything is from lawyers.

Last edited 7 years ago by xkon (previous) (diff)

#14 @mikejolley
7 years ago

@allendav I copied the sha code from elsewhere yes (some hosts don't support hash apparently), and also copied the email and expiration filter wording from the confirm email change email functions in the same file :) I'd like to leave those unless the other functions are changing also (out of scope?).

r/e

I kinda liked the send_confirm_account_action_email function name better than wp_send_account_verification_key

My thinking is that any action you're confirming for an account needs verification, so I thought it would work. I'm not tied to either.

Maybe add to this a means of cc'ing the administrator on the confirmation request? I.e. as part of the paper trail? IDK. The reason I'm wondering about that is it isn't clear to me yet is where we could log the final confirmation after the user completes it.

This is really dependent on how we track things elsewhere. The logging table doesn't have to be part of this but I imagine functions using this code would create a log on request and log the confirmation when the action fires?

@xkon

About the e-mail: Not necessarily for 'v1' again but the whole e-mail should be an editable thing for Admins through the UI again. If

I'd leave this for plugins just like the other emails WordPress generates. Having a UI here would be inconsistent with the rest and isn't a 'must have' IMO.

#15 @allendav
7 years ago

@mikejolley - what would you think about a small change to the site option key, i.e. instead of

'_verify_' . $uid . '_' . $action_name

what if it was

'_verify_' . $action_name . '_' . $uid

That way it would 1) have the same initial string as the user meta key ( '_verify_' . $action_name ) and 2) would be a little easier to write a SQL LIKE query to fetch all the options for presentation, e.g.

SELECT option_value FROM wp_options WHERE option_name LIKE '_verify_export_personal_data_%'

#16 @allendav
7 years ago

@mikejolley - I've made some progress on an export UX with this patch - you can see the work thus far here: #43546

I'm in the middle of hooking up everything and I've gotten to the point where I'd like to fill a WP_List_Table with the requests and their status - but like @azaozz points out above, we delete the option/meta on confirmation

Was thinking about this a bit more: instead of deleting the confirmation token from the DB perhaps we can set a "confirmed" status on it and keep it until the action is performed?

I realize your code allows us to hook on account_action_confirmed, but if I were to leverage that it seems like more complexity (and queries) than I'd prefer - I would love it if the option_value/meta_value were just amended to include a confirmed timestamp (in addition to the requested timestamp)

If you agree, this patch would need to add that confirmed timestamp and probably reject confirmations if that confirmed timestamp was already present

This would allow my patch in 43546 to use the options/meta you presented as a log and by only using two queries (one for options and one for meta)

#17 follow-up: @mikejolley
7 years ago

@allendav I actually think that would be a mistake here. We should keep this mechanism for the approval only, not make it into a log.

I'm in the middle of hooking up everything and I've gotten to the point where I'd like to fill a WP_List_Table with the requests and their status

If this what we're doing, with a CPT, when the request is triggered it should be logged as pending, then approval should be obtained with the email.

When confirmed, hook in and update the log entry. Then the log is independent of this mechanism.

What we may want to add is instead a way to pass data with the action so when it's confirmed, the data is passed to the action. Then you could for example pass the log ID so it's updated :)

Sound good?

#18 in reply to: ↑ 17 @allendav
7 years ago

Replying to mikejolley:

@allendav I actually think that would be a mistake here. We should keep this mechanism for the approval only, not make it into a log.

I'm in the middle of hooking up everything and I've gotten to the point where I'd like to fill a WP_List_Table with the requests and their status

If this what we're doing, with a CPT, when the request is triggered it should be logged as pending, then approval should be obtained with the email.

When confirmed, hook in and update the log entry. Then the log is independent of this mechanism.

What we may want to add is instead a way to pass data with the action so when it's confirmed, the data is passed to the action. Then you could for example pass the log ID so it's updated :)

Sound good?

If we switch to a CPT for facilitating the approvals, we wouldn't use the options/meta approach anymore right?

And, if we switch to a CPT, would we not want this patch/code to store the approval in post meta itself instead of relying on code in a separate patch?

I _do_ like the idea of having the action (including passing hooked callbacks the CPT id and action in play) so that plugins can do additional things with this approval mechanism you've created beyond the storing of request-started, request-approved in a CPT.

Does that make sense?

#19 @mikejolley
7 years ago

If we switch to a CPT for facilitating the approvals, we wouldn't use the options/meta approach anymore right?

It depends. I've built this with other uses in mind - maybe things that don't need a log like account resets. Trying to make it generic as possible but useful this this use case too :)

Let me send my latest diff and you can take a look. I'm not concerned with an extra update to set the log confirmed when the action fires TBH. In fact, having an action too might be good because it can mean our core log could be replaced/enhanced with outside code.

#20 @mikejolley
7 years ago

Latest diff above.

  • Takes care of the email rewording @allendav requested
  • Allows extra request data to be passed with the request and passed to the actions.
  • Makes the option/user meta names more consistent
  • Adds an inline comment explaining use of hash

#21 @allendav
7 years ago

@mikejolley - I'm having trouble applying either of your latest patches - can you double check them?

src $ svn up
Updating '.':
At revision 42873.
src $ patch -p0 < update.43443.
update.43443.2.diff  update.43443.diff    
src $ patch -p0 < update.43443.
update.43443.2.diff  update.43443.diff    
src $ patch -p0 < update.43443.diff 
patching file wp-includes/user.php
patching file wp-login.php
Hunk #3 FAILED at 874.
Hunk #4 FAILED at 889.
2 out of 4 hunks FAILED -- saving rejects to file wp-login.php.rej
src $ svn revert wp-includes/user.php 
Reverted 'wp-includes/user.php'
src $ svn revert wp-login.php
Reverted 'wp-login.php'
src $ patch -p0 < update.43443.2.diff 
patching file wp-includes/user.php
patching file wp-login.php
Hunk #3 FAILED at 874.
Hunk #4 FAILED at 889.
2 out of 4 hunks FAILED -- saving rejects to file wp-login.php.rej

@azaozz
7 years ago

#22 @azaozz
7 years ago

43443.3.patch is the same as update.43443.2.diff just cleaned a bit of white space.

@mikejolley
7 years ago

Synced with master and working :)

#23 @allendav
7 years ago

Thank you @mikejolley - .4 works

#24 follow-up: @mikejolley
7 years ago

@azaozz Anything left to do on this before it can be considered for merge?

#25 in reply to: ↑ 24 ; follow-up: @azaozz
7 years ago

Replying to mikejolley:

43443.4.diff looks good, but weren't you preparing another (larger) patch on gh together with @allendav?

#26 in reply to: ↑ 25 @allendav
7 years ago

Replying to azaozz:

Replying to mikejolley:

43443.4.diff looks good, but weren't you preparing another (larger) patch on gh together with @allendav?

The larger patch (on gh at https://github.com/allendav/wp-privacy-requests ) is the next layer up - which uses this patch to do its work. This patch can land by itself. See the README.md on that repo in particular - it prompts you to install 43443.4.diff in order to work with the repo.

#27 @azaozz
7 years ago

In 42964:

Privacy: fixes and updates for the method to confirm user requests by email.

  • Improve function and variable names.
  • Allow extra data to be passed with the request.
  • Make the option/user meta names more consistent.
  • Adds an inline comment explaining use of hash.

Props mikejolley.
See #43443.

#28 @azaozz
7 years ago

  • Keywords needs-patch added; has-patch dev-feedback removed
  • Milestone changed from 5.0 to 4.9.6

Now we have CPTs for storing these user requests. We should change wp_send_account_verification_key() and friends to use the CPTs for storing the verification key, etc. No point to use the options or user_meta tables.

@mikejolley could you have a look please :)

#29 follow-up: @mikejolley
7 years ago

@azaozz Correct CPTs are used for logging requests, but the verification system is independent of that on purpose so it could be used in other places. For example, if I wanted to use these for lost password emails I wouldn't need to log via the CPT.

Make sense?

#30 in reply to: ↑ 29 ; follow-up: @azaozz
7 years ago

Replying to mikejolley:

...the verification system is independent of that on purpose so it could be used in other places. For example, if I wanted to use these for lost password emails I wouldn't need to log via the CPT.

Hmm, why not? On large/busy sites it would be nice to see when somebody has changed passwords, or the last time an email was verified. Perhaps we can abstract the CPT a bit, make it more generic, then set different post_name depending on request type?

I was also thinking we would need only one CPT, not one per request type. We can use all the data WP_Post has to offer: title, post_name, content, content_filtered, excerpt, etc. etc. For example the request type can be stored in post_name, and the request status (pending, verified, expired) in post_status.

The main reason we started using CPTs was to not store the verification key in the options table :)

#31 in reply to: ↑ 30 ; follow-up: @mikejolley
7 years ago

Replying to azaozz:

Hmm, why not? On large/busy sites it would be nice to see when somebody has changed passwords, or the last time an email was verified.

CPT is much heavier just to store a hash/key, and I disagree you need logs for that type of activity where it's a) automated and b) something that happens more or less in the background. Look at the existing system; it's just meta.

Perhaps we can abstract the CPT a bit, make it more generic, then set different post_name depending on request type?

We implemented the 2 CPT for the requests to take advantage of wp_count_posts. We'll need lots of custom queries otherwise on our screens to do the status filtering.

I was also thinking we would need only one CPT, not one per request type. We can use all the data WP_Post has to offer: title, post_name, content, content_filtered, excerpt, etc. etc. For example the request type can be stored in post_name, and the request status (pending, verified, expired) in post_status.

Previous comment. This was more efficient for the list tables/counts.

The main reason we started using CPTs was to not store the verification key in the options table :)

No, the main reason for CPTs was to get a list table log without lots of heavy custom queries on meta data, since no one liked the idea of a custom table.

Last edited 7 years ago by mikejolley (previous) (diff)

#32 in reply to: ↑ 31 ; follow-up: @azaozz
7 years ago

Replying to mikejolley:

CPT is much heavier just to store a hash/key, and I disagree you need logs for that type of activity where it's a) automated and b) something that happens more or less in the background. Look at the existing system; it's just meta.

Hm, if we plan for this to be reusable in the future, imho we should also plan for the log to be reusable.

Yeah, at first CPT may look heavier but the options table is just the wrong place to store this type of data. All options get loaded on each request, including on the front-end, which makes them a lot heavier than a CPT which is going to be loaded only when an admin visits the privacy screen.

Another big disadvantage is that the hash, time, email are stored in a CSV which makes it impossible to look them up separately. If I remember right that was one of the main reasons a new DB table was so attractive. Given that such table will stay mostly empty on the great majority of sites, a CPT is the next best thing.

Perhaps we can abstract the CPT a bit, make it more generic, then set different post_name depending on request type?

We implemented the 2 CPT for the requests to take advantage of wp_count_posts. We'll need lots of custom queries otherwise on our screens to do the status filtering.

Hm, don't think so. wp_count_posts() uses a simple db query to count. We actually can do better as it has some specific stuff for posts and pages that is not needed. Making another similar function which is adapted to our needs would be the right thing in any case.

Also, this will be needed "once in a blue moon" when an admin visits one of the privacy screens :)

The main reason we started using CPTs was to not store the verification key in the options table :)

No, the main reason for CPTs was to get a list table log without lots of heavy custom queries on meta data, since no one liked the idea of a custom table.

Hmm, think there has been some miscommunication here :) As a custom table would have been empty or almost empty on the great majority of sites, it doesn't make sense to add one. Even if we added one it will also need a whole new set of low-level functions to become usable. For example we would need to add or duplicate most of the stuff that drives the list tables UI.

CPTs are the next best thing to a custom table. They can store a lot of information quite efficiently. Not using them to store the simple data needed for implementing email verification doesn't seem right. In addition, having two almost identical custom post types doesn't seem best.

Last edited 7 years ago by azaozz (previous) (diff)

#33 @azaozz
7 years ago

Also things like the above screenshot shouldn't be possible. Storing the request in the same CPT that later is used for logging would make that easier to fix.

We'd also probably need some garbage collection for expired unconfirmed requests. See comment 4.

#34 @mnelson4
7 years ago

I tested this out on trunk and it worked fine.

Maybe the text shown to the user could give more direction about what to do after they've clicked the link.

Here's the content from an email that I generated on my test site:

Howdy,
A request has been made to perform the following action on your account:
Export Personal Data
To confirm this, please click on the following link: {link}
You can safely ignore and delete this email if you do not want to take this action.
This email has been sent to {email}.
Regards,
All at {sitename}

And then when the link is clicked, the user sees

Action has been confirmed.

I think we should add instructions on what the user should do next. Something like this:

Action has been confirmed. The site administrator has been notified and will fulfill your request as soon as possible.

Or something like that. The key point being that it says they need to wait for the admin to do something now.

#35 @xkon
7 years ago

Even though I do understand the way you see it @mnelson4 , let's not forget that the Admins themselves will create any given Form on their website to take 'requests' from users that want anonymization/exports and then send the confirmation needed.

So they will/should already communicate the process needed in their own way on that form probably as well. There's no need imho from a pre-made text like that even though it seems to server a purpose.

#36 @mnelson4
7 years ago

that the Admins themselves will create any given Form on their website to take 'requests' from users that want anonymization/exports and then send the confirmation needed.

Ya what you're saying makes sense on sites that have a form for anonymization/export where they explain the process. But these requests can also come over the phone, or a user could simply contact the admin via email. So I don't think users will have always seen any text explaining the process. So some users might not be previously aware of the process, so IMHO it's good to inform them.

Also, I don't think a reminder about what will happen next is bad.

What's more, even if the admin does create a contact form explaining the process for getting an export of personal data or how to delete data, this part of the process will always be the same (the user gets an email, clicks the link, then waits for the admin to fulfill their request), so why does every site admin need to create their own text explaining the process? That's a lot of duplication, no? (Granted admins are free to explain as much as they want on their contact form page; it's just non-essential.)

#37 in reply to: ↑ 32 ; follow-up: @mikejolley
7 years ago

Replying to azaozz:

Another big disadvantage is that the hash, time, email are stored in a CSV which makes it impossible to look them up separately. If I remember right that was one of the main reasons a new DB table was so attractive. Given that such table will stay mostly empty on the great majority of sites, a CPT is the next best thing.

Maybe there is some confusion about how this was intended to work.

There are 2 completely separate parts here; let's call them:

a) the request logging UI
b) the verify action API

The request logging UI (what you see/screenshot) lists requests. Requests are created during certain events. For example,

  • I may want to add a request manually without email confirmation.
  • I may want to add a request and get the user to confirm the action.

If there are duplicates, then the requests table logic needs an update, simply to check if a request already exists for the email address.

Actions can be verified via the other part: verify action API. This generates an action email, stores a hash, and gets confirmation from the user.

Why is this separate? For a number of reasons:

  1. It can be used by things which don't need a logging table, or are completely unrelated to the request logging UI. Possible future examples: a) Lost password notifications b) Account update requests from WooCommerce (thinking ahead) c) Changing your email address d) Deleting your account

By combining the request logging and this, you'll be restricting the above and it won't be as reusable.

  1. A request in the request logging UI can trigger multiple emails. E.g. 'resend' will keep the same request ID and trigger a new email, with a new hash, via the verify action API
  1. When an action is confirmed, a generic action files. Anything can use this - request logging uses it to update the request but there are other use cases, again, outside of the request logging which could perform an action once confirmed.
  1. It doesn't matter that the verify action API stores it's hash and other supporting information as a CSV; it's not exposed via UI anywhere and isn't designed to be listed. It's just data that gets stored and sent via the actions once the request is confirmed.

I feel they should remain separate. The duplication issue should be fixed but can be fixed without merging them into one thing.

Hm, don't think so. wp_count_posts() uses a simple db query to count. We actually can do better as it has some specific stuff for posts and pages that is not needed. Making another similar function which is adapted to our needs would be the right thing in any case.

Just trying to KISS :) That functionality exists so piggybacking it where possible.

CPTs are the next best thing to a custom table. They can store a lot of information quite efficiently. Not using them to store the simple data needed for implementing email verification doesn't seem right. In addition, having two almost identical custom post types doesn't seem best.

It's just for convenience. Seemed cheaper to query by post_type rather than name, meta, or any other abuse of the API. I still prefer the way it is.

#38 in reply to: ↑ 37 @azaozz
7 years ago

Replying to mikejolley:

Maybe there is some confusion about how this was intended to work.

Yeah, I agree. Seems we are still talking about different things. The title of this ticket is "Add a method for confirmation of requests for deleting or anonymizing of personal data" and that is what we need to do. Anything else is out of scope.

Three of the four TODO items from seven weeks ago are still not done:

  1. Prevent "flood" of requests. If a request is made and it hasn't expired, perhaps limit how many new requests can be made for the same email.
  2. Perhaps add garbage collection function to delete expired requests.
  3. Consider how this can be used through the REST API and add an endpoint.

In addition this stores random data in the options table which is a bad thing to do. Yep, it's true I committed the first patch that adds this, but the reason was to not hold back development of the UI and other functionality that depended on this, not because it is the proper way to do things.

  • I may want to add a request manually without email confirmation.

I'm not sure what that means. Are you talking about some sort of a messaging system between users inside of WP?

If there are duplicates, then the requests table logic needs an update...

No. See the TODO list above. Duplicates have to be detected and discarded as soon as a new request is made. That has nothing to do with the logging.

Why is this separate? For a number of reasons:

  1. It can be used by things which don't need a logging table, or are completely unrelated to the request logging UI. Possible future examples:
    • Lost password notifications
    • Account update requests from WooCommerce (thinking ahead)
    • Changing your email address
    • Deleting your account

This currently is out of scope. I'd argue that it would be a very nice feature to have logging for all of the above actions (security/user activity log?). Not sure why we shouldn't log them. However we simply don't have the time to implement this now. Opened #43841 as a follow-up.

I feel they should remain separate. The duplication issue should be fixed but can be fixed without merging them into one thing.

Uh, I'm still thinking we misunderstand each other :(

My strongest concern is that this feature uses the options table inappropriately and stores the requests inefficiently (in different places and with no way to fetch them from the db when needed). The main reason we moved to using CPTs was to not store the requests in the options and user_meta tables.

My second concern is that it uses multiple CPTs. It seems inefficient/not a good design to use more than one CPT, and this hinders any possibility to extend it in the future.

...Seemed cheaper to query by post_type rather than name, meta, or any other abuse of the API. I still prefer the way it is.

I'm not sure what you mean here too. A db query is pretty much the same on any table field as long as it has an index. It makes no difference if we are going to select posts by id, post_name, post_type, GUID, etc.

Last edited 7 years ago by azaozz (previous) (diff)

This ticket was mentioned in Slack in #gdpr-compliance by mikejolley. View the logs.


7 years ago

#40 @mikejolley
7 years ago

@azaozz https://core.trac.wordpress.org/attachment/ticket/43443/43443-refactor.diff takes care of:

  • Preventing multiple requests
  • Using a single post type
  • Moving user actions into a post type rather than meta.

This is just missing cleanup which I don't have time to tackle this evening.

For now this is also using post_title for action_name - we could use meta or something else, but I found post_name was no good due to handling in wp_insert_post.

@azaozz
7 years ago

#41 @azaozz
7 years ago

43443-refactor.diff works quite well :)

43443.5.patch is like 43443-refactor.diff with a bit more docs, few var names changes and couple of small fixes.

This ticket was mentioned in Slack in #core by jeffpaul. View the logs.


7 years ago

#43 follow-up: @desrosj
7 years ago

Can the wp-login.php confirmation have more information for the user? The notice is short, but it doesn't really explain what action has been confirmed, or what is next. Also, "return to Site Name" may be a little weird here because they were not on the site, to begin with.

See Shot 2018-04-26 at 8.11.17 PM.png

This ticket was mentioned in Slack in #gdpr-compliance by xkon. View the logs.


7 years ago

#45 @azaozz
7 years ago

In 43008:

Privacy: update the method to confirm user requests by email. Use a single CPT to store the requests and to allow logging/audit trail.

Props mikejolley.
See #43443.

#46 in reply to: ↑ 43 @azaozz
7 years ago

Replying to desrosj:

Can the wp-login.php confirmation have more information for the user?

That was improved a bit on [43008].

@mikejolley
7 years ago

#47 @mikejolley
7 years ago

https://core.trac.wordpress.org/attachment/ticket/43443/43443.6.patch

Latest patch moves meta to post columns for more efficient storage:

  • post title - email address
  • post name - action name
  • post password - confirm key/hash
  • post modified - timestamp for the key

All wrapped in a new WP_User_Request object for convenience.

Also introduces some clean up methods which marked expired requests as failed as requested by @azaozz

Tested both screens and seems to work :)

This ticket was mentioned in Slack in #gdpr-compliance by mikejolley. View the logs.


7 years ago

#49 @azaozz
7 years ago

In 43011:

Privacy: update and enhance the method to confirm user requests by email. Introduce WP_User_Request to hold all request vars similarly to WP_Post.

Props mikejolley.
See #43443.

#50 @desrosj
7 years ago

  • Keywords commit fixed-major added; needs-patch removed

@desrosj
7 years ago

#51 @desrosj
7 years ago

43443.diff adds a missing @since tag to the user_request_action_description filter inline documentation and removes trailing tabs at the end of the docblock.

#52 @azaozz
7 years ago

In 43014:

Fix docs and white space.

Props desrosj.
See #43443.

This ticket was mentioned in Slack in #gdpr-compliance by desrosj. View the logs.


7 years ago

#54 @SergeyBiryukov
7 years ago

In 43069:

Add a method to confirm user requests by email. First run.

Props mikejolley.
Merges [42791] to the 4.9 branch.
See #43443.

#55 @SergeyBiryukov
7 years ago

In 43070:

Privacy: fixes and updates for the method to confirm user requests by email.

  • Improve function and variable names.
  • Allow extra data to be passed with the request.
  • Make the option/user meta names more consistent.
  • Adds an inline comment explaining use of hash.

Props mikejolley.
Merges [42964] to the 4.9 branch.
See #43443.

#56 @SergeyBiryukov
7 years ago

In 43083:

Privacy: update the method to confirm user requests by email. Use a single CPT to store the requests and to allow logging/audit trail.

Props mikejolley.
Merges [43008] to the 4.9 branch.
See #43443.

#57 @SergeyBiryukov
7 years ago

In 43084:

Privacy: update and enhance the method to confirm user requests by email. Introduce WP_User_Request to hold all request vars similarly to WP_Post.

Props mikejolley, desrosj.
Merges [43011] and [43014] to the 4.9 branch.
See #43443.

#58 @SergeyBiryukov
7 years ago

  • Resolution set to fixed
  • Status changed from assigned to closed

#59 @desrosj
6 years ago

  • Component changed from General to Privacy

Moving to the new Privacy component.

Note: See TracTickets for help on using tickets.