Make WordPress Core

Opened 6 years ago

Closed 5 years ago

Last modified 4 years ago

#43438 closed enhancement (fixed)

Add filters and Ajax support for personal data export

Reported by: azaozz's profile azaozz Owned by: desrosj's profile desrosj
Milestone: 5.2 Priority: low
Severity: normal Version: 4.9.6
Component: Privacy Keywords: has-unit-tests has-patch commit
Focuses: Cc:

Description (last modified by azaozz)

It is a GDPR requirement that users should be able to export their private data from one site and (eventually) import/reuse it at another.

As discussed in Slack, this is the data the users entered themselves like name, email, nickname, etc. This is not automatically generated data about the user like ID, IP addresses, browser UA (for comments), etc.

All of this data is already available/visible on the user's profile screen. It also has to be downloadable in a common format like JSON.

Attachments (29)

43438.diff (15.8 KB) - added by allendav 6 years ago.
First pass at an extensible personal data exporter - includes core comments personal data export
43438.2.diff (17.8 KB) - added by allendav 6 years ago.
Updated patch to target privacy.php page instead
43438.3.diff (17.6 KB) - added by allendav 6 years ago.
Added Ajax referrer check; current_user_can check; only get comments with email match; break comments up into batches of 100; coding standards
1.png (293.9 KB) - added by allendav 6 years ago.
Proposed report style export, pg.1
2.png (299.4 KB) - added by allendav 6 years ago.
Proposed report style export, pg.2
3.png (173.0 KB) - added by allendav 6 years ago.
Proposed report style export, pg.3
43438.4.diff (2.9 KB) - added by allendav 6 years ago.
Removes comment handling, ux, adds multibyte email support
43438.5.diff (3.0 KB) - added by allendav 6 years ago.
Cast page and exporter index to int, sanitize post vars
43438.6.diff (4.9 KB) - added by allendav 6 years ago.
Ruggedize further, more error handling, action->filter for export builder
43438.7.diff (5.4 KB) - added by desrosj 6 years ago.
43438.8.diff (5.1 KB) - added by desrosj 6 years ago.
43438.9.diff (18.6 KB) - added by birgire 6 years ago.
error_response_without_Error_string_part.jpg (17.9 KB) - added by birgire 6 years ago.
43438.10.diff (18.9 KB) - added by birgire 6 years ago.
43438.11.diff (6.5 KB) - added by birgire 6 years ago.
43438.12.diff (12.8 KB) - added by birgire 6 years ago.
43438.13.diff (30.8 KB) - added by birgire 6 years ago.
43438.14.diff (51.4 KB) - added by desrosj 6 years ago.
43438.15.diff (50.7 KB) - added by desrosj 6 years ago.
43438.15.2.diff (48.9 KB) - added by desrosj 6 years ago.
43438.16.diff (49.0 KB) - added by birgire 6 years ago.
43438.17.diff (48.3 KB) - added by desrosj 6 years ago.
43438.18.diff (48.3 KB) - added by desrosj 6 years ago.
Use _unset_..._key() instead of _erase_..._key() for consistency, use protected visibility on helper methods.
43438.19.diff (48.1 KB) - added by birgire 6 years ago.
43438-20.diff (49.4 KB) - added by birgire 5 years ago.
43438-21.diff (52.4 KB) - added by garrett-eclipse 5 years ago.
Updated invalid translator comments, and clarified a few others by defining which array index eraser/exporter
43438.20.diff (2.0 KB) - added by david.binda 5 years ago.
Partial patch for fixing already committed unit tests on multisite install.
43438.21.diff (1.9 KB) - added by garrett-eclipse 5 years ago.
Cleaned up the is_multisite() conditionals and a spacing issue.
43438.22.diff (3.3 KB) - added by desrosj 5 years ago.

Download all attachments as: .zip

Change History (126)

#1 @azaozz
6 years ago

  • Description modified (diff)

#2 @fclaussen
6 years ago

I think that a more common file extension for exports like this is XML, but it's nice to give both options.

Regarding generated data, I believe from your list, the only generated data is the user ID, but I can also identify something with that data.

Other examples like browser UA and IP addresses can also be used to identify an individual.

Last edited 6 years ago by fclaussen (previous) (diff)

#3 @allendav
6 years ago

A few us of at Automattic chatted about a personal data exporter design that could support WooCommerce and its extensions needs (and be applicable to core itself or any plugin)

In a nutshell, we eventually thought it would be best to try having plugins (and extensions) register a callback function(s) for the export of personal data, e.g. have a plugin do something like this):

add_filter( "wp_privacy_register_export_data_callback", array( $this, 'register_export_data_callback' ) );
 
function register_export_data_callback( $export_callbacks ) {
    $export_callbacks[] = array(
        'slug' => 'stripe',
        'plugin_friendly_name' => __( 'Stripe Payments for WooCommerce' ),
        'callback' => array( $this, 'export_data' )
    );
    return $export_callbacks;
}

The plugin’s actual data-exporting callback would then look something like this, accepting the search parameter (e.g. the email address to export for) and a page parameter (more about that below).

function export_data( $email_address, $page ) {
    // do something here to get the next page of data for the email address given
    return array(
        'data' => array(
            // data goes here
        ),
    'done' => true // or return false if we want to be called again for another page of data
    );
}

For now, we are thinking the data should be super simple, an array of name/value pairs, to keep the barrier to entry low – we could always add more things to the pairs later, e.g:

[
    {
        name: “billing address order #1000”,
        value:  “1444 Seattle Hill Rd”
    },
    {
        name: “billing address order #998”,
        value:  “17725 108th St SE”
    },
]

Then, ajax on a core provided wp-admin page (or the REST API) could be used to fetch a list of all the registered callbacks (and their plugins), e.g. ala:

$registered_callbacks_array = apply_filter( “wp_privacy_register_plugin_export_data_callback”, array() );

and work with each plugin in the $registered_callbacks_array in turn calling its callback(s) to fetch how ever many pages of data that plugin decides to return for the email address of interest.

The web CLIENT would assemble the responses in memory (not the server) and then the user could do whatever they want with the data.

Thoughts?

#4 follow-up: @jesperher
6 years ago

Instead of returning arrays, would it be a safer approach, to use objects, like in WooCommerce and adding items to cart dynamicly ?

like
$object->add_callback('plugin_slug','callback');

#5 in reply to: ↑ 4 @allendav
6 years ago

Replying to jesperher:

Instead of returning arrays, would it be a safer approach, to use objects, like in WooCommerce and adding items to cart dynamicly ?

like
$object->add_callback('plugin_slug','callback');

Instead of using a filter to register our callback, we could certainly use a global method, like core does today with things like register_post_type.

I do appreciate what you proposed is not entirely dissimilar from the approach your GDPRWP project is using by having plugins call methods on an instance of GdprToolbox, but I don't think calling instance methods will be as familiar to your average plugin developer as using a filter or calling a global method.

What do you think?

#6 @allendav
6 years ago

  • Summary changed from Export registered user's private data on request to Export registered user's personal data on request

#7 follow-up: @jesperher
6 years ago

@allendav The method of just using a filter, is not far away from my current code, and was infact the first thought of how to approach this.

I can change my code to do the filter/array method istead but i am hitting an dead end. And can see it still will be an issue even when using the filter/array method.

when we are calling 1 callback at a time, the $variabel who contains all the callbacks, from the

´wp_privacy_register_export_data_callback´ filter.

when this is going into the javascript, the instance of the objects used to inform the callbacks dies.

example:

--Random plugin--
´´´
add_filter( "wp_privacy_register_export_data_callback", array( $this, 'register_export_data_callback' ) );

function register_export_data_callback( $export_callbacks ) {

$export_callbacks[] = array(

'slug' => 'stripe',
'plugin_friendly_name' => ( 'Stripe Payments for WooCommerce' ),
'callback' => array( $this, 'export_data' )

);
return $export_callbacks;

}

$all_callbacks = apply_filters('wp_privacy_register_export_data_callback',array() );

localize_script.... $all_callbacs
´´´
The $this, object dies as it goes into the javascript.
if creating Ajax call, and sending the 'callback' => array( $this, 'export_data' )´ to REST endpoint, "$this" doesnt work anymore, and using the ´call_user_func´ wont work.

A "solution" (but not a good one) would be to build "$all_callbacks" in every callback, and call the specific callback by index.

Do you have any thought on how to get around this issue.?

@allendav
6 years ago

First pass at an extensible personal data exporter - includes core comments personal data export

#8 in reply to: ↑ 7 @allendav
6 years ago

Replying to jesperher:

@allendav The method of just using a filter, is not far away from my current code, and was infact the first thought of how to approach this.

I can change my code to do the filter/array method istead but i am hitting an dead end. And can see it still will be an issue even when using the filter/array method.

when we are calling 1 callback at a time, the $variabel who contains all the callbacks, from the

´wp_privacy_register_export_data_callback´ filter.

when this is going into the javascript, the instance of the objects used to inform the callbacks dies.

A "solution" (but not a good one) would be to build "$all_callbacks" in every callback, and call the specific callback by index.

Do you have any thought on how to get around this issue.?

My patch attached is index based. The JS doesn't even need to really know about any particular callback other than how many there are.

What do you think?

#9 @allendav
6 years ago

  • OK, to try my patch, you'll need a site where someone (anyone) has left at least one comment (the more the merrier)
  • Then, go wp-admin > Tools > Privacy
  • Enter the commentor's username or email address in the box provided
  • Hit the button to start the export
  • You should see a table of all the personal data from their comments assemble

edit: next steps - expand coverage to usermeta and options, come to agreement on what to do with the export (i.e. making it email-able to the data subject, downloadable, how to handle things like images)

edit: edit: this patch will work against revision 42806 (i will update shortly to account for the merge of 43435 in revision 42814)

Last edited 6 years ago by allendav (previous) (diff)

@allendav
6 years ago

Updated patch to target privacy.php page instead

#10 follow-up: @azaozz
6 years ago

Few initial points on 43438.2.diff:

  • The admin-ajax endpoint will need a nonce check and current_user_can() check.
  • Think that we should have a specific DB query for this export, something like SELECT * FROM {$wpdb->comments} WHERE comment_author_email = '[email]' LIMIT 10000. Then can pick only the unique values from that, no need to repeat the same data all the time. Also, that DB query can possibly be more complex so we get only the unique data out of the DB.
  • It's using AJAX but overwrites the whole page. That is somewhat unexpected and can be confusing. IMHO better to just show it inline. Can move the PHP bits in the AJAX endpoint and just send the user_name or email to get it started.

@allendav
6 years ago

Added Ajax referrer check; current_user_can check; only get comments with email match; break comments up into batches of 100; coding standards

#11 in reply to: ↑ 10 @allendav
6 years ago

  • Keywords has-patch added; needs-patch removed

Replying to azaozz:

Few initial points on 43438.2.diff:

Thank you so much for the fast turn around!

  • The admin-ajax endpoint will need a nonce check and current_user_can() check.

Done!

  • Think that we should have a specific DB query for this export, something like SELECT * FROM {$wpdb->comments} WHERE comment_author_email = '[email]' LIMIT 10000. Then can pick only the unique values from that, no need to repeat the same data all the time. Also, that DB query can possibly be more complex so we get only the unique data out of the DB.

Done - instead of custom query, get_comments was up to the task. I set the limit to 100 per page - let me know if you want something higher. 10000 seemed uncomfortably high :)

  • It's using AJAX but overwrites the whole page. That is somewhat unexpected and can be confusing. IMHO better to just show it inline. Can move the PHP bits in the AJAX endpoint and just send the user_name or email to get it started.

Yep - I didn't make any changes here - this is not meant to be the final UX - I'd like us to all discuss what we want to do with the gathered data and if name/value is going to work in general.

#12 follow-up: @jesperher
6 years ago

src/wp-admin/includes/ajax-actions.php
$exporters = apply_filters( 'wp_privacy_personal_data_exporters', array() );


This line/filter will be called once pr callback added to it. + once for counting how big an index we need.

so for each callback, we are calling all plugins and asking witch callbacks we need. Although the function here is asynchronous, I imagine it can be a heavy process. is there a way that we can only get the retrieved callbacks once, instead of n-times?

#13 in reply to: ↑ 12 @allendav
6 years ago

Replying to jesperher:

src/wp-admin/includes/ajax-actions.php
$exporters = apply_filters( 'wp_privacy_personal_data_exporters', array() );


This line/filter will be called once pr callback added to it. + once for counting how big an index we need.

so for each callback, we are calling all plugins and asking witch callbacks we need. Although the function here is asynchronous, I imagine it can be a heavy process. is there a way that we can only get the retrieved callbacks once, instead of n-times?

I don't think the add_filter for each plugin will be heavy - it really just needs to add its callback and do no other processing. The callback isn't called when the filter is added. It is not that different from how custom post types are registered by every plugin on every page load

Does that make sense?

#14 follow-up: @allendav
6 years ago

There have been a few questions about why User Agent is personal data. I think we should include it because combined with IP address, you can probably identify a single device in a home or business and possibly then identify a user.

This also aligns with the definitions in Article 4(1) of the GDPR https://gdpr-info.eu/art-4-gdpr/

@allendav
6 years ago

Proposed report style export, pg.1

@allendav
6 years ago

Proposed report style export, pg.2

@allendav
6 years ago

Proposed report style export, pg.3

#15 follow-up: @allendav
6 years ago

I have an idea for cutting down on the repetition in the export - we drop the identifier from the name (making it not specific to any one comment, etc) and then collect all the "findings" in the final export with numbers indicating where there is more than one finding with that value, e.g. see images attached immediately above.

Last edited 6 years ago by allendav (previous) (diff)

#16 @allendav
6 years ago

Question to those watching this ticket - what sorts of non-string-able personal data do you need export to handle - in the examples above we have an image which is thankfully URL reference-able - is there anything should we support at the get-go?

#17 in reply to: ↑ 14 @azaozz
6 years ago

Replying to allendav:

There have been a few questions about why User Agent is personal data. I think we should include it because combined with IP address, you can probably identify a single device in a home or business and possibly then identify a user.

You may be able to identify a device (with high probability of errors), but you cannot identify a person.

This also aligns with the definitions in Article 4(1) of the GDPR https://gdpr-info.eu/art-4-gdpr/

Right, the text there is:

...an identifiable natural person is one who can be identified, directly or indirectly, 
in particular by reference to an identifier such as a name, an identification number, 
location data, an online identifier or to one or more factors specific to the physical, 
physiological, genetic, mental, economic, cultural or social identity of that natural 
person;

The things we are looking for here are:

  • a name
  • an identification number
  • location data
  • an online identifier

of that natural person.

Where is the identification of a software a person may or may not have used listed there? :)

In any case, thinking we need to clear that with a lawyer.

For the time being I agree we should probably err on the side of "safety" and include the browser UA.

Last edited 6 years ago by azaozz (previous) (diff)

#18 in reply to: ↑ 15 ; follow-up: @azaozz
6 years ago

Replying to allendav:

Yep, the screenshots look a lot better now, not even sure we need the count after the different bits of data.

Also thinking we can perhaps remove the "Provider" column. This is showing the personal data for a user of a particular website. There is no difference which plugin stores the data, the important part is that the data is on the site and they get to see and download it.

For getting only unique data from the comments query in 43438.3.diff I think we can try something like:

$name = $url = $ip = $ua = array(); // we know the email is unique
foreach ( $comments as $comment ) {
  if ( ! in_array( $comment->comment_author, $name ) ) {
    $name[] = $comment->comment_author;
  }
  if ( ! in_array( $comment->comment_author_url, $url ) ) {
    $url[] = $comment->comment_author_url;
  }
  .....

or alternatively just do:

foreach ( $comments as $comment ) {
  $name[ $comment->comment_author ] = 1;
  $url[ $comment->comment_author_url ] = 1;
  .....
}
$name = array_filter( array_keys( $name ) );
....

Maybe 10000 is too much but this way we can easily process 5000-6000 comment rows in one go and get only the unique data. The idea is to get all of the user's comments at once and not have to merge these arrays later in js.

Last edited 6 years ago by azaozz (previous) (diff)

#19 follow-up: @Kloon
6 years ago

So since the actual content of a post or comment does not qualify as user data, and I agree it should not, what happens when that data contains an email address for example? Might be worth looking into excepts and post content to make sure no personal information is present there and then display that?

#20 follow-up: @mikejolley
6 years ago

@allendav

$email_address = trim( strtolower( $email_address ) );

Why don't we do this before passing to the data exporter? Also do we need mb_ function here for special chars?

#21 @allendav
6 years ago

  • Summary changed from Export registered user's personal data on request to Add filters and Ajax support for collecting personal data

Re-focusing this ticket to just "Add filters and Ajax support for personal data export" per ticket scrub discussion in Making WordPress gdpr-compliance chat today.

Last edited 6 years ago by allendav (previous) (diff)

#22 @allendav
6 years ago

  • Summary changed from Add filters and Ajax support for collecting personal data to Add filters and Ajax support for personal data export

#23 in reply to: ↑ 18 @allendav
6 years ago

Replying to azaozz:

Replying to allendav:

Yep, the screenshots look a lot better now, not even sure we need the count after the different bits of data.

Also thinking we can perhaps remove the "Provider" column. This is showing the personal data for a user of a particular website. There is no difference which plugin stores the data, the important part is that the data is on the site and they get to see and download it.

Agreed.

For getting only unique data from the comments query in 43438.3.diff I think we can try something like:

$name = $url = $ip = $ua = array(); // we know the email is unique
foreach ( $comments as $comment ) {
  if ( ! in_array( $comment->comment_author, $name ) ) {
    $name[] = $comment->comment_author;
  }
  if ( ! in_array( $comment->comment_author_url, $url ) ) {
    $url[] = $comment->comment_author_url;
  }
  .....

or alternatively just do:

foreach ( $comments as $comment ) {
  $name[ $comment->comment_author ] = 1;
  $url[ $comment->comment_author_url ] = 1;
  .....
}
$name = array_filter( array_keys( $name ) );
....

Maybe 10000 is too much but this way we can easily process 5000-6000 comment rows in one go and get only the unique data. The idea is to get all of the user's comments at once and not have to merge these arrays later in js.

Agreed. I am going to split out the comments into #43440 so we'll pick this up over there.

@allendav
6 years ago

Removes comment handling, ux, adds multibyte email support

#24 in reply to: ↑ 20 @allendav
6 years ago

Replying to mikejolley:

@allendav

$email_address = trim( strtolower( $email_address ) );

Why don't we do this before passing to the data exporter? Also do we need mb_ function here for special chars?

Agreed. Good catch. Also, replaced strtolower with mb_strtolower

#25 in reply to: ↑ 19 @allendav
6 years ago

Replying to Kloon:

So since the actual content of a post or comment does not qualify as user data, and I agree it should not, what happens when that data contains an email address for example? Might be worth looking into excepts and post content to make sure no personal information is present there and then display that?

Not a bad idea. At the very least, we could look for the email address in the the comment content I suppose? Since we're breaking the comment exporter out of this ticket, let's move this discussion over to #43440

Last edited 6 years ago by allendav (previous) (diff)

#26 @allendav
6 years ago

FYI - testing this patch now requires also applying the patches from #43440 and from #43546

#27 follow-up: @mikejolley
6 years ago

@allendav Working great, I've got an implementation going for WooCommerce now.

Just one bug. The callback is passed $page but it’s a string, not an int. Can you cast it?

#28 in reply to: ↑ 27 @allendav
6 years ago

Replying to mikejolley:

@allendav Working great, I've got an implementation going for WooCommerce now.

Just one bug. The callback is passed $page but it’s a string, not an int. Can you cast it?

Awesome! Yes - can do - I'm going to cast exporter index too, and sanitize_text_field all the post vars for good measure.

@allendav
6 years ago

Cast page and exporter index to int, sanitize post vars

#29 @allendav
6 years ago

Hang on - one more tweak needed - I want wp_privacy_personal_data_export_page to be a filter, not an action, so we can use it to return the assembled ZIP file url in the response on the last loop in #43551 - will post a tweaked patch in a sec

@allendav
6 years ago

Ruggedize further, more error handling, action->filter for export builder

#30 follow-up: @allendav
6 years ago

Note: It is possible that we will need to export not just obvious unique personal data for a user (e.g. author_email) from objects like comments, but every "field" in a comment that user made. If so, this means that the name->value approach would result in extremely large files since we 1) wouldn't be able to de-duplicate since 2) we'd probably keep each exported comment object "intact" (i.e. with all its fields together, so the export made more sense.)

Hope to have an update shortly from discussion with stakeholders.

cc @azaozz

#31 in reply to: ↑ 30 @azaozz
6 years ago

Replying to allendav:

Note: It is possible that we will need to export ... every "field" in a comment that user made.

If that's the case we are doing it all wrong :) We'll need to "dump" most fields from the comments table from the database.

However the guidance for now is that the "personal data export" should include only "private or identifiable personal data" like email, street address, IP address, etc. That does not include publicly published data as it is covered under other laws and perhaps other licenses, and is available on the internet.

It may be nice to have a way to export all comments by author email, but not sure if this is needed for law enforcement compliance.

Last edited 6 years ago by azaozz (previous) (diff)

#32 @azaozz
6 years ago

In 42889:

Privacy: add support for exporting multiple pages of personal data.

Props allendav.
See #43438.

#33 @azaozz
6 years ago

  • Keywords needs-unit-tests added

Leaving the ticket open so tests can be added.

#34 follow-ups: @ocean90
6 years ago

  • Keywords needs-patch added; has-patch removed

[42889] has a few issues which need to be addressed:

  • wp_ajax_wp_privacy_export_personal_data() has no DocBlock.
  • The mbstring PHP extension might be not available, so mb_strtolower() needs function_exists() check. Note that is_email() currently doesn't support email addresses with multibyte characters so strtolower() should be enough.
  • The docs for the wp_privacy_personal_data_exportersfilter doesn't match the docs standards.
  • It seems like $_POST['exporter'] is an numeric index of an exporter. Wouldn't it be easier if exporters are referenced by a name instead?
  • What's the idea behind the error messages? Should probably be translatable or a code instead.
  • The nonce check should be enabled.

#35 in reply to: ↑ 34 @allendav
6 years ago

Replying to ocean90:

[42889] has a few issues which need to be addressed:

Thank you @ocean90 ! - I'll submit a patch to address these

This ticket was mentioned in Slack in #core-multisite by mnelson4. View the logs.


6 years ago

This ticket was mentioned in Slack in #gdpr-compliance by joemcgill. View the logs.


6 years ago

#38 @desrosj
6 years ago

  • Milestone changed from 5.0 to 4.9.6
  • Owner set to desrosj
  • Status changed from new to assigned

Other functionality depends on this, so it needs to land in 4.9.6. Going to review this and address the remaining concerns.

Last edited 6 years ago by desrosj (previous) (diff)

@desrosj
6 years ago

#39 in reply to: ↑ 34 @desrosj
6 years ago

Replying to ocean90:

[42889] has a few issues which need to be addressed:

  • wp_ajax_wp_privacy_export_personal_data() has no DocBlock.

A DocBlock was added in [43012].

  • The mbstring PHP extension might be not available, so mb_strtolower() needs function_exists() check. Note that is_email() currently doesn't support email addresses with multibyte characters so strtolower() should be enough.

The mb_strtolower() call was removed in [43012] when the email was switched to use the email property of a WP_User_Request object. I looked at that, though, and it does not seem to account for any multibyte characters, so added strtolower() back in 43438.7.diff.

  • The docs for the wp_privacy_personal_data_exportersfilter doesn't match the docs standards.

Fixed in 43438.7.diff.

  • It seems like $_POST['exporter'] is an numeric index of an exporter. Wouldn't it be easier if exporters are referenced by a name instead?

I agree with this. @allendav, @azaozz any objection? If not, a new ticket may be best for this, I think.

  • What's the idea behind the error messages? Should probably be translatable or a code instead.

Ensured every error message is translatable in 43438.7.diff.

  • The nonce check should be enabled.

Was also addressed in [43012].

@desrosj
6 years ago

#40 @desrosj
6 years ago

Looked at the email again, and I don't think a strtolower() is needed at all. 43438.8.diff does not include that.

This ticket was mentioned in Slack in #gdpr-compliance by desrosj. View the logs.


6 years ago

@birgire
6 years ago

#42 @birgire
6 years ago

The patch in 43438.9.diff:

  • Adds checks if array keys are set in $_POST, else send an error, for these cases:
$request_id     = (int) $_POST['id'];
$exporter_index = (int) $_POST['exporter'];
$eraser_index   = (int) $_POST['eraser'];
$page           = (int) $_POST['page'];

  • Adds order to multiple placeholders in translation strings. For example %s, %d to '%1$s, %2$d.
  • Adds "translators:" comments.
  • Removes Error: in error messages. We already have "An error occurred while attempting to erase/export personal data" displayed before the error message. Also for consistency, because not all error messages start with Error:. See error_response_without_Error_string_part.jpg.
  • Adjusts the doc block for the wp_privacy_personal_data_export_page filter
    • Adds the possible WP_Error to inline doc for $response.
    • Removes the "zero-based" from the inline doc for $page.
  • Adjusts the doc block for the wp_privacy_personal_data_erasure_page filter
    • Adds the possible WP_Error to inline doc for $response.
    • Removes the "zero-based" from the inline doc for $page.
  • Further adjustments to the Coding Standard.
Last edited 6 years ago by birgire (previous) (diff)

#43 @birgire
6 years ago

Further possible changes could be to adjust the error messages for more consistency.

Here are the possible error messages raised in wp_ajax_wp_privacy_export_personal_data() for the WordPress Comments exporter, as an example:

  • Missing request ID.
  • Invalid request type.
  • Invalid request ID.
  • Invalid request.
  • A valid email address must be given.
  • Missing exporter index.
  • Missing page index.
  • An exporter has improperly used the registration filter.
  • Exporter index cannot be negative.
  • Exporter index out of range.
  • Page index cannot be less than one.
  • Expected an array describing the exporter at index 1.
  • Exporter array at index 1 does not include a friendly name.
  • Exporter does not include a callback: WordPress Comments.
  • Exporter callback is not a valid callback: WordPress Comments.
  • Expected response as an array from exporter: WordPress Comments.
  • Expected data in response array from exporter: WordPress Comments.
  • Expected data array in response array from exporter: WordPress Comments.
  • Expected done (boolean) in response array from exporter: WordPress Comments.

Here are the possible error messages raised in wp_ajax_wp_privacy_erase_personal_data() for the WordPress Comments exporter, as an example:

  • Missing request ID.
  • Invalid request type.
  • Invalid request ID.
  • Invalid request.
  • A valid email address must be given.
  • Missing exporter index.
  • Missing page index.
  • Eraser index cannot be negative.
  • Eraser index out of range.
  • Page index cannot be less than one.
  • Expected an array describing the exporter at index 1.
  • Eraser array at index 1 does not include a callback.
  • Eraser callback at index 1 is not a valid callback.
  • Eraser array at index 1 does not include a friendly name.
  • Did not receive array from WordPress Comments eraser (index 1).
  • Expected num_items_removed key in response array from WordPress Comments eraser (index 1).
  • Expected num_items_retained key in response array from WordPress Comments eraser (index 1).
  • Expected messages key in response array from WordPress Comments eraser (index 1).
  • Expected messages key to reference an array in response array from WordPress Comments eraser (index 1).
  • Expected done flag in response array from WordPress Comments eraser (index 1).

For example:

  • Expected done (boolean) in response array from exporter: WordPress Comments.
  • Expected done flag in response array from WordPress Comments eraser (index 1).

and

  • Expected response as an array from exporter: WordPress Comments.
  • Did not receive array from WordPress Comments eraser (index 1).

Since plugins can define their own friendly name, there's a possible confusion if the friendly name is a bad one like "response eraser". Then we have e.g.:

  • Expected messages key in response array from response eraser eraser (index 1).

that is difficult to understand.

I wonder if we should use translatable quotes, so it would display like:

  • Expected messages key in response array from "response eraser" eraser (index 1).

or keep the friendly name at last, like:

  • Expected messages key in response array (index 1) from eraser: response eraser.
Last edited 6 years ago by birgire (previous) (diff)

#44 @birgire
6 years ago

An additional important thing here is that we need to sanitize/escape the exporter/eraser friendly name.

We don't want it to contain HTML and stuff, but currently it goes through.

Here's an example:

wp_send_json_error(
	sprintf(
		/* translators: %s: exporter friendly name */
		__( 'Exporter does not include a callback: %s.' ),
		$exporter['exporter_friendly_name']
	)
);

The wp_send_json_error() is a wrapper for wp_send_json() that uses wp_encode_json().

What comes to mind are esc_js(), esc_html() and wp_encode_json() but I'm thinking sanitize_text_field() on $exporter['exporter_friendly_name'] should do it here.

What do you think?

ps: I did some testing here:

Raw friendly name:

WordPress Comments " \' \ - &amp; & <b>Hello</b> <script>alert(1)</script>

and here's the how the ajax error is displayed in the HTML table:

sanitize_text_field:
...<li>Exporter does not include a callback: WordPress Comments " ' \ - &amp; &amp; Hello.</li>...

esc_html:
...<li>Exporter does not include a callback: WordPress Comments " ' \ - &amp; &amp; &lt;b&gt;Hello&lt;/b&gt; &lt;script&gt;alert(1)&lt;/script&gt;.</li>...

esc_js:
...<li>Exporter does not include a callback: WordPress Comments " \'  - &amp; &amp; &lt;b&gt;Hello&lt;/b&gt; &lt;script&gt;alert(1)&lt;/script&gt;.</li>...

esc_html + sanitize_text_field:
...<li>Exporter does not include a callback: WordPress Comments " ' \ - &amp; &amp; Hello.</li>...

esc_js + sanitize_text_field:
...<li>Exporter does not include a callback: WordPress Comments " \'  - &amp; &amp; Hello.</li>...

wp_json_encode:
...<li>Exporter does not include a callback: "WordPress Comments \" ' \\ - &amp; &amp; <b>Hello&lt;\/b&gt; <script>alert(1)<\/script>".</li></ul></div></td></tr></tbody></table></script></b></li>...

So I would now think that we should do (at least) esc_html for the escaping and then decide if we want the sanitizing part.

Last edited 6 years ago by birgire (previous) (diff)

@birgire
6 years ago

#45 @birgire
6 years ago

43438.10.diff adds escaping for the friendly exporter/eraser names.

This ticket was mentioned in Slack in #gdpr-compliance by iandunn. View the logs.


6 years ago

#47 @azaozz
6 years ago

In 43060:

Privacy: translate error messages, some fixes and improvements for the AJAX actions for exporting and erasing user data.

Props desrosj, birgire.
See #43438.

#48 @birgire
6 years ago

43438.11.diff is a first pass for adding tests for the wp_ajax_wp_privacy_export_personal_data() function.

@birgire
6 years ago

#49 @SergeyBiryukov
6 years ago

In 43074:

Privacy: add support for exporting multiple pages of personal data.

Props allendav.
Merges [42889] to the 4.9 branch.
See #43438.

#50 @SergeyBiryukov
6 years ago

In 43088:

I18N: Use consistent pattern for placeholder references in translator comments in wp_ajax_wp_privacy_erase_personal_data().

See #43438.

#51 @SergeyBiryukov
6 years ago

In 43105:

Privacy: translate error messages, some fixes and improvements for the AJAX actions for exporting and erasing user data.

Props desrosj, birgire.
Merges [43060] to the 4.9 branch.
See #43438.

#52 @SergeyBiryukov
6 years ago

In 43107:

I18N: Use consistent pattern for placeholder references in translator comments in wp_ajax_wp_privacy_erase_personal_data().

Merges [43088] to the 4.9 branch.
See #43438.

This ticket was mentioned in Slack in #gdpr-compliance by desrosj. View the logs.


6 years ago

#54 @desrosj
6 years ago

  • Keywords needs-patch removed

This ticket was mentioned in Slack in #core by desrosj. View the logs.


6 years ago

This ticket was mentioned in Slack in #gdpr-compliance by allendav. View the logs.


6 years ago

This ticket was mentioned in Slack in #core by desrosj. View the logs.


6 years ago

@birgire
6 years ago

#58 @birgire
6 years ago

  • Keywords has-unit-tests added; needs-unit-tests removed

43438.12.diff adds tests for the Ajax handlers:

  • wp_ajax_wp_privacy_erase_personal_data()
  • wp_ajax_wp_privacy_export_personal_data()

#59 @azaozz
6 years ago

  • Priority changed from normal to low

Set priority to low for tickets that only need unit tests.

This ticket was mentioned in Slack in #core by desrosj. View the logs.


6 years ago

This ticket was mentioned in Slack in #gdpr-compliance by desrosj. View the logs.


6 years ago

This ticket was mentioned in Slack in #gdpr-compliance by birgire. View the logs.


6 years ago

#63 @desrosj
6 years ago

  • Component changed from General to Privacy

Moving to the new Privacy component.

#64 @casiepa
6 years ago

  • Milestone changed from 4.9.6 to 4.9.8

Moving enhancement to 4.9.8

#65 @desrosj
6 years ago

  • Keywords needs-refresh added
  • Milestone changed from 4.9.8 to 4.9.7
  • Version set to 4.9.6

Moving to 4.9.7 so that these unit tests can get added. Some thoughts on 43438.12.diff (mostly a check list of what is left):

wp_ajax_wp_privacy_export_personal_data() tests:

  • A test for Invalid request ID. scenario is missing.
  • test_wp_ajax_wp_privacy_export_personal_data_should_error_when_current_user_missing_required_capability() test's documentation is confusing. It says that the function should successfully send exporters response data when the user is an administrator, but it then sets the user as an Author, and tests that the response was not a success.
  • Missing tests for the Invalid request type scenario.
  • MIssing tests for the A valid email address must be given. scenario.
  • Missing tests for the Missing exporter index. scenario.
  • Missing tests for the Missing page index. scenario.
  • Missing tests for all of the errors indicating incorrect exporter usage.

wp_ajax_wp_privacy_erase_personal_data() tests:

  • This is also missing a large handful of error scenarios, as well as scenarios indicating incorrect erasure usage by plugins.

#66 @birgire
6 years ago

@desrosj thanks, the ajax tests have a reputation of being slow, so it was a very reserved patch regarding the number of tests.

I haven't looked into it where that slowness might come from, if it's related to the usage of WP_Ajax_UnitTestCase, _handleAjax() or elsewhere.

I can try to add more tests and see how it affects the run time.

Then there's
this kind of approach that uses WP_UnitTestCase with selected bits from the WP_Ajax_UnitTestCase class setup. Wonder if this approach is faster.

Last edited 6 years ago by birgire (previous) (diff)

This ticket was mentioned in Slack in #core-privacy by desrosj. View the logs.


6 years ago

@birgire
6 years ago

#68 @birgire
6 years ago

43438.13.diff updates the Tests_Ajax_PrivacyExportPersonalData class with more test cases.

An update for Tests_Ajax_PrivacyErasePersonalData is in progress.

#69 @ocean90
6 years ago

  • Milestone changed from 4.9.7 to 4.9.8

4.9.7 has been released, moving to next milestone.

#70 @desrosj
6 years ago

  • Keywords gdpr removed

Removing the GDPR keyword. This has been replaced by the new Privacy component and privacy focuses in Trac.

#71 @pbiron
6 years ago

  • Milestone changed from 4.9.8 to 4.9.9

4.9.8 has hit RC, moving to 4.9.9.

@desrosj
6 years ago

#72 @desrosj
6 years ago

  • Keywords has-patch added; needs-refresh removed

In 43438.14.diff

  • Change @since to 4.9.9.
  • Fix some docblock indentation and spelling/grammar issues.
  • Moved $this->_setRole( 'administrator' ); calls to the setUp() method. This code was in every method except one, and that one set the role to author.
  • Added test method for when an incorrect request type is passed (export request sent for an erase request).
  • Added missing test assertions for erase functionality.
  • Moved the eraser friendly name check higher up in wp_ajax_wp_privacy_erase_personal_data().
  • Include the eraser friendly name in more error messages to be more informative.
  • Fixes an incorrect error message when attempting to erase personal data and the request is not the remove type.

When the user submitting the request does not have the correct capability, the error returned is Invalid request.. Thinking that this should be changed to something more descriptive, such as Incorrect permissions..

I believe all potential scenarios in these two functions are now covered. @birgire if you could review, that would be great. I am wondering if we can cut down on some code repetition.

This ticket was mentioned in Slack in #core-privacy by desrosj. View the logs.


6 years ago

@desrosj
6 years ago

@desrosj
6 years ago

#74 @desrosj
6 years ago

43438.15.2.diff has a few small changes. (ignore 43438.15.diff, uploaded the patch wrong file).

  • Make the test method names more precise.
  • Remove $_POST = array() in the tearDown() method (this is handled in parent::tearDown()).
  • Add a test method in both classes to validate the correct error is encountered when a nonce does not properly validate.
  • Add a test method validating a WP_Error returned by a data exporter callback is correctly passed as the response.
  • Consolidated repetitive code that unset exporter/eraser indexes and changed exporter/eraser callback values into one method each.

I think these tests are good to go.

#75 @desrosj
6 years ago

  • Keywords commit added

@birgire
6 years ago

#76 @birgire
6 years ago

great @desrosj

Some minor adjustments in 43438.16.diff:

In both test classes:

  • Fix a typo in the property inline documentation by replacing $key_to_unset to $new_callback_value.
  • Fix a typo: Replace "An value to change the ..." to "A value to change the ..."
  • Change for better readability from 'security' => 'not-a-valid-nonce' to 'security' => 'invalid-nonce' in test_failure_with_invalid_nonce().
  • Adds an inline comment in test_error_when_incorrect_request_type(): // Incorrect request type, expects 'remove_personal_data'/'export_personal_data'.
  • Adds to the inline comment in test_error_when_request_id_invalid(): // Invalid request I, less than 1.

Adds inline comments in the following Tests_Ajax_PrivacyErasePersonalData methods:

  • test_error_when_missing_request_id(): // Missing request ID.
  • test_error_when_incorrect_request_type(): // Incorrect request type, expects 'remove_personal_data'.
  • test_error_when_invalid_email: // Invalid requester's email address.
  • test_error_when_missing_eraser_index: // Missing eraser index.
  • test_error_when_missing_page_index: // Missing page index.

---

PS: I wonder if we could simplify and dry it more with a helper method like _set_exporter_callback( $callback ) ?

Example:

$this->_set_exporter_callback( 'callback_return_wp_error' );
$this->_make_ajax_call();

instead of:

$this->new_callback_value = array( $this, 'callback_return_wp_error' );
add_filter( 'wp_privacy_personal_data_exporters', array( $this, 'filter_exporter_callback_index_value' ), 20 );
$this->_make_ajax_call();

#77 @desrosj
6 years ago

  • Keywords commit removed

#78 follow-up: @desrosj
6 years ago

Thanks, @birgire.

I made that code consolidation change in 43438.17.diff. I also fixed the @group ajax1 typo and fixed some typos in the erasure tests that incorrectly referred to exporting.

@desrosj
6 years ago

#79 in reply to: ↑ 78 @birgire
6 years ago

Replying to desrosj:

Thanks, @birgire.

I made that code consolidation change in 43438.17.diff. I also fixed the @group ajax1 typo and fixed some typos in the erasure tests that incorrectly referred to exporting.

We've both _unset_..._key() and _erase_..._key(). Should we use same naming format in both test classes?

Should we use protected/private for the underscore helper methods, that don't need public visibility?

Last edited 6 years ago by birgire (previous) (diff)

@desrosj
6 years ago

Use _unset_..._key() instead of _erase_..._key() for consistency, use protected visibility on helper methods.

This ticket was mentioned in Slack in #core-privacy by desrosj. View the logs.


6 years ago

@birgire
6 years ago

#81 @birgire
6 years ago

  • Keywords commit added

43438.19.diff has minor adjustments that:

  • Renames the _erase_response_key() method to _unset_response_key().
  • Changes the inline doc for the eraser's $key_to_unset to * An array key in the test eraser to unset.
  • Changes int he to in the in the inline doc for the _unset_exporter_key() method.

@desrosj I mark it with commit.

This ticket was mentioned in Slack in #core-privacy by desrosj. View the logs.


5 years ago

#83 @pento
5 years ago

  • Milestone changed from 4.9.9 to Future Release

This ticket was mentioned in Slack in #core-privacy by webdevlaw. View the logs.


5 years ago

#85 follow-up: @garrett-eclipse
5 years ago

  • Keywords needs-refresh added; commit removed

Thanks @birgire this still applies cleanly and all tests run successfully.

Minor note, the translator comments are now invalid as the string placeholder is now %s and not %d.
/* translators: %d: array index */
*There's two instances of this.
Can probably just reuse this comment found earlier in the file;
/* translators: %s: exporter friendly name */

Also, It will need a refresh for version numbers once it gets milestoned.

This ticket was mentioned in Slack in #core-privacy by garrett-eclipse. View the logs.


5 years ago

#87 in reply to: ↑ 85 @garrett-eclipse
5 years ago

  • Milestone changed from Future Release to 5.2

Let's move this into 5.2, @birgire are you able to refresh your patch to apply version numbers as well as address these points;
garrett-eclipse comment 85:

Minor note, the translator comments are now invalid as the string placeholder is now %s and not %d.
/* translators: %d: array index */
*There's two instances of this.
Can probably just reuse this comment found earlier in the file;
/* translators: %s: exporter friendly name */

Appreciated

@birgire
5 years ago

#88 @birgire
5 years ago

The 43438-20.diff:

  • Adds test_success_when_no_items_to_erase()
  • Adds test_success_when_no_items_to_export()
  • Adds :: prefix to @cover
  • Updates to @since 5.2.0
  • Updates the tests regarding the string change in #44833.
  • Adjusts an assert in test_success_when_current_user_has_required_capabilities() to use self::$request_email.

Thanks @garrett-eclipse, regarding the translators comments, do you mean these:

https://core.trac.wordpress.org/browser/tags/5.1/src/wp-admin/includes/ajax-actions.php#L4708

in wp_ajax_wp_privacy_erase_personal_data() ?


@garrett-eclipse
5 years ago

Updated invalid translator comments, and clarified a few others by defining which array index eraser/exporter

#89 @garrett-eclipse
5 years ago

  • Keywords commit added; needs-refresh removed

Thanks @birgire for the refresh and additional tests cases. Everything seems to be working nicely and passes tests.

As to the translator comments I've uploaded 43438-21.diff to avoid wasting your time with that. It also places the eraser/exporter prefix on some other translator comments for the array index.

Moving this forward to commit and we can do a final review in Monday's Bug Scrub.

#90 @desrosj
5 years ago

In 44908:

Privacy: Improve data export/erasure error messages and translator comments.

These changes address inaccuracies in current messages and makes them more informative.

Props birgire, garrett-eclipse, desrosj.
See #43438.

#91 @desrosj
5 years ago

  • Resolution set to fixed
  • Status changed from assigned to closed

In 44909:

Privacy: Add unit tests for exporting and erasing personal data.

Props birgire, garrett-eclipse, desrosj.
Fixes #43438.

#92 @david.binda
5 years ago

The unit tests added in r44909 do not seem to work properly on multisite installation, as export_others_personal_data and erase_others_personal_data meta capabilities are mapped to manage_network capability on multisite (only super admins have this cap by default).

The issue can be reproduced by running phpunit -c tests/phpunit/multisite.xml --group ajax.

@david.binda
5 years ago

Partial patch for fixing already committed unit tests on multisite install.

#93 @desrosj
5 years ago

  • Keywords needs-testing added; commit removed
  • Resolution fixed deleted
  • Status changed from closed to reopened

Thanks for catching this, @davidbinda. Your patch looks good at first glance. Tests can be committed during the beta period, so let's get this fixed after this Thursday.

These were able to sneak in because the ajax group is not being tested for multisite on Travis or in any grunt tasks (such as grunt precommit). I have opened #46567 to propose adding that so AJAX related multisite issues can be caught moving forward.

@garrett-eclipse
5 years ago

Cleaned up the is_multisite() conditionals and a spacing issue.

#94 @garrett-eclipse
5 years ago

Nice @davidbinda good catch. I ran the test before patching to see what I was dealing with and got 1 error and 46 failures with 42 of them being in the privacy component. Applying your patch and retesting I still have the 1 error but only 4 failures and none within privacy. That's awesome, thank you.

That error and failures will be dealt with in #46567 that @desrosj opened there, so I posted the remaining error and failures there.

As to the refreshed patch I just uploaded 43438.21.diff. It simply switches from the unnecessary yoda conditionals to just check is_multisite() which follows the current convention in core. *Also there was a minor spaces instead of tabs thing which is also corrected.

Cheers
P.S. @desrosj this is good to commit in my opinion but I left in needs-testing so you can give it a once through.

@desrosj
5 years ago

#95 @desrosj
5 years ago

  • Keywords commit added; needs-testing removed

43438.22.diff adds two test methods to ensure the right message is returned when the current user is not a network admin on multisite.

#96 @desrosj
5 years ago

  • Resolution set to fixed
  • Status changed from reopened to closed

In 44943:

Tests: Fix failing tests for privacy export/erase requests when running the ajax group for multisite.

Props davidbinda, garrett-eclipse.
Fixes #43438.

#97 @SergeyBiryukov
4 years ago

In 46693:

Tests: Don't skip the tests intended for Multisite when running on single site, add them to the ms-required group instead.

See #43438.

Note: See TracTickets for help on using tickets.