WordPress.org

Make WordPress Core

Opened 3 months ago

Last modified 7 days ago

#43438 assigned enhancement

Add filters and Ajax support for personal data export

Reported by: azaozz Owned by: desrosj
Milestone: 4.9.6 Priority: low
Severity: normal Version:
Component: Privacy Keywords: gdpr has-unit-tests
Focuses: Cc:

Description (last modified by azaozz)

It is a GDPR requirement that users should be able to export their private data from one site and (eventually) import/reuse it at another.

As discussed in Slack, this is the data the users entered themselves like name, email, nickname, etc. This is not automatically generated data about the user like ID, IP addresses, browser UA (for comments), etc.

All of this data is already available/visible on the user's profile screen. It also has to be downloadable in a common format like JSON.

Attachments (16)

43438.diff (15.8 KB) - added by allendav 3 months ago.
First pass at an extensible personal data exporter - includes core comments personal data export
43438.2.diff (17.8 KB) - added by allendav 3 months ago.
Updated patch to target privacy.php page instead
43438.3.diff (17.6 KB) - added by allendav 2 months ago.
Added Ajax referrer check; current_user_can check; only get comments with email match; break comments up into batches of 100; coding standards
1.png (293.9 KB) - added by allendav 2 months ago.
Proposed report style export, pg.1
2.png (299.4 KB) - added by allendav 2 months ago.
Proposed report style export, pg.2
3.png (173.0 KB) - added by allendav 2 months ago.
Proposed report style export, pg.3
43438.4.diff (2.9 KB) - added by allendav 2 months ago.
Removes comment handling, ux, adds multibyte email support
43438.5.diff (3.0 KB) - added by allendav 2 months ago.
Cast page and exporter index to int, sanitize post vars
43438.6.diff (4.9 KB) - added by allendav 2 months ago.
Ruggedize further, more error handling, action->filter for export builder
43438.7.diff (5.4 KB) - added by desrosj 3 weeks ago.
43438.8.diff (5.1 KB) - added by desrosj 3 weeks ago.
43438.9.diff (18.6 KB) - added by birgire 3 weeks ago.
error_response_without_Error_string_part.jpg (17.9 KB) - added by birgire 3 weeks ago.
43438.10.diff (18.9 KB) - added by birgire 3 weeks ago.
43438.11.diff (6.5 KB) - added by birgire 3 weeks ago.
43438.12.diff (12.8 KB) - added by birgire 10 days ago.

Download all attachments as: .zip

Change History (79)

#1 @azaozz
3 months ago

  • Description modified (diff)

#2 @fclaussen
3 months ago

I think that a more common file extension for exports like this is XML, but it's nice to give both options.

Regarding generated data, I believe from your list, the only generated data is the user ID, but I can also identify something with that data.

Other examples like browser UA and IP addresses can also be used to identify an individual.

Last edited 3 months ago by fclaussen (previous) (diff)

#3 @allendav
3 months ago

A few us of at Automattic chatted about a personal data exporter design that could support WooCommerce and its extensions needs (and be applicable to core itself or any plugin)

In a nutshell, we eventually thought it would be best to try having plugins (and extensions) register a callback function(s) for the export of personal data, e.g. have a plugin do something like this):

add_filter( "wp_privacy_register_export_data_callback", array( $this, 'register_export_data_callback' ) );
 
function register_export_data_callback( $export_callbacks ) {
    $export_callbacks[] = array(
        'slug' => 'stripe',
        'plugin_friendly_name' => __( 'Stripe Payments for WooCommerce' ),
        'callback' => array( $this, 'export_data' )
    );
    return $export_callbacks;
}

The plugin’s actual data-exporting callback would then look something like this, accepting the search parameter (e.g. the email address to export for) and a page parameter (more about that below).

function export_data( $email_address, $page ) {
    // do something here to get the next page of data for the email address given
    return array(
        'data' => array(
            // data goes here
        ),
    'done' => true // or return false if we want to be called again for another page of data
    );
}

For now, we are thinking the data should be super simple, an array of name/value pairs, to keep the barrier to entry low – we could always add more things to the pairs later, e.g:

[
    {
        name: “billing address order #1000”,
        value:  “1444 Seattle Hill Rd”
    },
    {
        name: “billing address order #998”,
        value:  “17725 108th St SE”
    },
]

Then, ajax on a core provided wp-admin page (or the REST API) could be used to fetch a list of all the registered callbacks (and their plugins), e.g. ala:

$registered_callbacks_array = apply_filter( “wp_privacy_register_plugin_export_data_callback”, array() );

and work with each plugin in the $registered_callbacks_array in turn calling its callback(s) to fetch how ever many pages of data that plugin decides to return for the email address of interest.

The web CLIENT would assemble the responses in memory (not the server) and then the user could do whatever they want with the data.

Thoughts?

#4 follow-up: @jesperher
3 months ago

Instead of returning arrays, would it be a safer approach, to use objects, like in WooCommerce and adding items to cart dynamicly ?

like $object->add_callback('plugin_slug','callback');

#5 in reply to: ↑ 4 @allendav
3 months ago

Replying to jesperher:

Instead of returning arrays, would it be a safer approach, to use objects, like in WooCommerce and adding items to cart dynamicly ?

like $object->add_callback('plugin_slug','callback');

Instead of using a filter to register our callback, we could certainly use a global method, like core does today with things like register_post_type.

I do appreciate what you proposed is not entirely dissimilar from the approach your GDPRWP project is using by having plugins call methods on an instance of GdprToolbox, but I don't think calling instance methods will be as familiar to your average plugin developer as using a filter or calling a global method.

What do you think?

#6 @allendav
3 months ago

  • Summary changed from Export registered user's private data on request to Export registered user's personal data on request

#7 follow-up: @jesperher
3 months ago

@allendav The method of just using a filter, is not far away from my current code, and was infact the first thought of how to approach this.

I can change my code to do the filter/array method istead but i am hitting an dead end. And can see it still will be an issue even when using the filter/array method.

when we are calling 1 callback at a time, the $variabel who contains all the callbacks, from the

´wp_privacy_register_export_data_callback´ filter.

when this is going into the javascript, the instance of the objects used to inform the callbacks dies.

example:

--Random plugin-- ´´´ add_filter( "wp_privacy_register_export_data_callback", array( $this, 'register_export_data_callback' ) ); function register_export_data_callback( $export_callbacks ) {

$export_callbacks[] = array(

'slug' => 'stripe', 'plugin_friendly_name' => ( 'Stripe Payments for WooCommerce' ), 'callback' => array( $this, 'export_data' )

); return $export_callbacks;

}

$all_callbacks = apply_filters('wp_privacy_register_export_data_callback',array() );

localize_script.... $all_callbacs ´´´ The $this, object dies as it goes into the javascript. if creating Ajax call, and sending the 'callback' => array( $this, 'export_data' )´ to REST endpoint, "$this" doesnt work anymore, and using the ´call_user_func´ wont work.

A "solution" (but not a good one) would be to build "$all_callbacks" in every callback, and call the specific callback by index.

Do you have any thought on how to get around this issue.?

@allendav
3 months ago

First pass at an extensible personal data exporter - includes core comments personal data export

#8 in reply to: ↑ 7 @allendav
3 months ago

Replying to jesperher:

@allendav The method of just using a filter, is not far away from my current code, and was infact the first thought of how to approach this.

I can change my code to do the filter/array method istead but i am hitting an dead end. And can see it still will be an issue even when using the filter/array method.

when we are calling 1 callback at a time, the $variabel who contains all the callbacks, from the

´wp_privacy_register_export_data_callback´ filter.

when this is going into the javascript, the instance of the objects used to inform the callbacks dies.

A "solution" (but not a good one) would be to build "$all_callbacks" in every callback, and call the specific callback by index.

Do you have any thought on how to get around this issue.?

My patch attached is index based. The JS doesn't even need to really know about any particular callback other than how many there are.

What do you think?

#9 @allendav
3 months ago

  • OK, to try my patch, you'll need a site where someone (anyone) has left at least one comment (the more the merrier)
  • Then, go wp-admin > Tools > Privacy
  • Enter the commentor's username or email address in the box provided
  • Hit the button to start the export
  • You should see a table of all the personal data from their comments assemble

edit: next steps - expand coverage to usermeta and options, come to agreement on what to do with the export (i.e. making it email-able to the data subject, downloadable, how to handle things like images)

edit: edit: this patch will work against revision 42806 (i will update shortly to account for the merge of 43435 in revision 42814)

Last edited 3 months ago by allendav (previous) (diff)

@allendav
3 months ago

Updated patch to target privacy.php page instead

#10 follow-up: @azaozz
3 months ago

Few initial points on 43438.2.diff:

  • The admin-ajax endpoint will need a nonce check and current_user_can() check.
  • Think that we should have a specific DB query for this export, something like SELECT * FROM {$wpdb->comments} WHERE comment_author_email = '[email]' LIMIT 10000. Then can pick only the unique values from that, no need to repeat the same data all the time. Also, that DB query can possibly be more complex so we get only the unique data out of the DB.
  • It's using AJAX but overwrites the whole page. That is somewhat unexpected and can be confusing. IMHO better to just show it inline. Can move the PHP bits in the AJAX endpoint and just send the user_name or email to get it started.

@allendav
2 months ago

Added Ajax referrer check; current_user_can check; only get comments with email match; break comments up into batches of 100; coding standards

#11 in reply to: ↑ 10 @allendav
2 months ago

  • Keywords has-patch added; needs-patch removed

Replying to azaozz:

Few initial points on 43438.2.diff:

Thank you so much for the fast turn around!

  • The admin-ajax endpoint will need a nonce check and current_user_can() check.

Done!

  • Think that we should have a specific DB query for this export, something like SELECT * FROM {$wpdb->comments} WHERE comment_author_email = '[email]' LIMIT 10000. Then can pick only the unique values from that, no need to repeat the same data all the time. Also, that DB query can possibly be more complex so we get only the unique data out of the DB.

Done - instead of custom query, get_comments was up to the task. I set the limit to 100 per page - let me know if you want something higher. 10000 seemed uncomfortably high :)

  • It's using AJAX but overwrites the whole page. That is somewhat unexpected and can be confusing. IMHO better to just show it inline. Can move the PHP bits in the AJAX endpoint and just send the user_name or email to get it started.

Yep - I didn't make any changes here - this is not meant to be the final UX - I'd like us to all discuss what we want to do with the gathered data and if name/value is going to work in general.

#12 follow-up: @jesperher
2 months ago

src/wp-admin/includes/ajax-actions.php
$exporters = apply_filters( 'wp_privacy_personal_data_exporters', array() );

This line/filter will be called once pr callback added to it. + once for counting how big an index we need.

so for each callback, we are calling all plugins and asking witch callbacks we need. Although the function here is asynchronous, I imagine it can be a heavy process. is there a way that we can only get the retrieved callbacks once, instead of n-times?

#13 in reply to: ↑ 12 @allendav
2 months ago

Replying to jesperher:

src/wp-admin/includes/ajax-actions.php
$exporters = apply_filters( 'wp_privacy_personal_data_exporters', array() );

This line/filter will be called once pr callback added to it. + once for counting how big an index we need.

so for each callback, we are calling all plugins and asking witch callbacks we need. Although the function here is asynchronous, I imagine it can be a heavy process. is there a way that we can only get the retrieved callbacks once, instead of n-times?

I don't think the add_filter for each plugin will be heavy - it really just needs to add its callback and do no other processing. The callback isn't called when the filter is added. It is not that different from how custom post types are registered by every plugin on every page load

Does that make sense?

#14 follow-up: @allendav
2 months ago

There have been a few questions about why User Agent is personal data. I think we should include it because combined with IP address, you can probably identify a single device in a home or business and possibly then identify a user.

This also aligns with the definitions in Article 4(1) of the GDPR https://gdpr-info.eu/art-4-gdpr/

@allendav
2 months ago

Proposed report style export, pg.1

@allendav
2 months ago

Proposed report style export, pg.2

@allendav
2 months ago

Proposed report style export, pg.3

#15 follow-up: @allendav
2 months ago

I have an idea for cutting down on the repetition in the export - we drop the identifier from the name (making it not specific to any one comment, etc) and then collect all the "findings" in the final export with numbers indicating where there is more than one finding with that value, e.g. see images attached immediately above.

Last edited 2 months ago by allendav (previous) (diff)

#16 @allendav
2 months ago

Question to those watching this ticket - what sorts of non-string-able personal data do you need export to handle - in the examples above we have an image which is thankfully URL reference-able - is there anything should we support at the get-go?

#17 in reply to: ↑ 14 @azaozz
2 months ago

Replying to allendav:

There have been a few questions about why User Agent is personal data. I think we should include it because combined with IP address, you can probably identify a single device in a home or business and possibly then identify a user.

You may be able to identify a device (with high probability of errors), but you cannot identify a person.

This also aligns with the definitions in Article 4(1) of the GDPR https://gdpr-info.eu/art-4-gdpr/

Right, the text there is:

...an identifiable natural person is one who can be identified, directly or indirectly, 
in particular by reference to an identifier such as a name, an identification number, 
location data, an online identifier or to one or more factors specific to the physical, 
physiological, genetic, mental, economic, cultural or social identity of that natural 
person;

The things we are looking for here are:

  • a name
  • an identification number
  • location data
  • an online identifier

of that natural person.

Where is the identification of a software a person may or may not have used listed there? :)

In any case, thinking we need to clear that with a lawyer.

For the time being I agree we should probably err on the side of "safety" and include the browser UA.

Last edited 2 months ago by azaozz (previous) (diff)

#18 in reply to: ↑ 15 ; follow-up: @azaozz
2 months ago

Replying to allendav:

Yep, the screenshots look a lot better now, not even sure we need the count after the different bits of data.

Also thinking we can perhaps remove the "Provider" column. This is showing the personal data for a user of a particular website. There is no difference which plugin stores the data, the important part is that the data is on the site and they get to see and download it.

For getting only unique data from the comments query in 43438.3.diff I think we can try something like:

$name = $url = $ip = $ua = array(); // we know the email is unique
foreach ( $comments as $comment ) {
  if ( ! in_array( $comment->comment_author, $name ) ) {
    $name[] = $comment->comment_author;
  }
  if ( ! in_array( $comment->comment_author_url, $url ) ) {
    $url[] = $comment->comment_author_url;
  }
  .....

or alternatively just do:

foreach ( $comments as $comment ) {
  $name[ $comment->comment_author ] = 1;
  $url[ $comment->comment_author_url ] = 1;
  .....
}
$name = array_filter( array_keys( $name ) );
....

Maybe 10000 is too much but this way we can easily process 5000-6000 comment rows in one go and get only the unique data. The idea is to get all of the user's comments at once and not have to merge these arrays later in js.

Last edited 2 months ago by azaozz (previous) (diff)

#19 follow-up: @Kloon
2 months ago

So since the actual content of a post or comment does not qualify as user data, and I agree it should not, what happens when that data contains an email address for example? Might be worth looking into excepts and post content to make sure no personal information is present there and then display that?

#20 follow-up: @mikejolley
2 months ago

@allendav

$email_address = trim( strtolower( $email_address ) );

Why don't we do this before passing to the data exporter? Also do we need mb_ function here for special chars?

#21 @allendav
2 months ago

  • Summary changed from Export registered user's personal data on request to Add filters and Ajax support for collecting personal data

Re-focusing this ticket to just "Add filters and Ajax support for personal data export" per ticket scrub discussion in Making WordPress gdpr-compliance chat today.

Last edited 2 months ago by allendav (previous) (diff)

#22 @allendav
2 months ago

  • Summary changed from Add filters and Ajax support for collecting personal data to Add filters and Ajax support for personal data export

#23 in reply to: ↑ 18 @allendav
2 months ago

Replying to azaozz:

Replying to allendav:

Yep, the screenshots look a lot better now, not even sure we need the count after the different bits of data.

Also thinking we can perhaps remove the "Provider" column. This is showing the personal data for a user of a particular website. There is no difference which plugin stores the data, the important part is that the data is on the site and they get to see and download it.

Agreed.

For getting only unique data from the comments query in 43438.3.diff I think we can try something like:

$name = $url = $ip = $ua = array(); // we know the email is unique
foreach ( $comments as $comment ) {
  if ( ! in_array( $comment->comment_author, $name ) ) {
    $name[] = $comment->comment_author;
  }
  if ( ! in_array( $comment->comment_author_url, $url ) ) {
    $url[] = $comment->comment_author_url;
  }
  .....

or alternatively just do:

foreach ( $comments as $comment ) {
  $name[ $comment->comment_author ] = 1;
  $url[ $comment->comment_author_url ] = 1;
  .....
}
$name = array_filter( array_keys( $name ) );
....

Maybe 10000 is too much but this way we can easily process 5000-6000 comment rows in one go and get only the unique data. The idea is to get all of the user's comments at once and not have to merge these arrays later in js.

Agreed. I am going to split out the comments into #43440 so we'll pick this up over there.

@allendav
2 months ago

Removes comment handling, ux, adds multibyte email support

#24 in reply to: ↑ 20 @allendav
2 months ago

Replying to mikejolley:

@allendav

$email_address = trim( strtolower( $email_address ) );

Why don't we do this before passing to the data exporter? Also do we need mb_ function here for special chars?

Agreed. Good catch. Also, replaced strtolower with mb_strtolower

#25 in reply to: ↑ 19 @allendav
2 months ago

Replying to Kloon:

So since the actual content of a post or comment does not qualify as user data, and I agree it should not, what happens when that data contains an email address for example? Might be worth looking into excepts and post content to make sure no personal information is present there and then display that?

Not a bad idea. At the very least, we could look for the email address in the the comment content I suppose? Since we're breaking the comment exporter out of this ticket, let's move this discussion over to #43440

Last edited 2 months ago by allendav (previous) (diff)

#26 @allendav
2 months ago

FYI - testing this patch now requires also applying the patches from #43440 and from #43546

#27 follow-up: @mikejolley
2 months ago

@allendav Working great, I've got an implementation going for WooCommerce now.

Just one bug. The callback is passed $page but it’s a string, not an int. Can you cast it?

#28 in reply to: ↑ 27 @allendav
2 months ago

Replying to mikejolley:

@allendav Working great, I've got an implementation going for WooCommerce now.

Just one bug. The callback is passed $page but it’s a string, not an int. Can you cast it?

Awesome! Yes - can do - I'm going to cast exporter index too, and sanitize_text_field all the post vars for good measure.

@allendav
2 months ago

Cast page and exporter index to int, sanitize post vars

#29 @allendav
2 months ago

Hang on - one more tweak needed - I want wp_privacy_personal_data_export_page to be a filter, not an action, so we can use it to return the assembled ZIP file url in the response on the last loop in #43551 - will post a tweaked patch in a sec

@allendav
2 months ago

Ruggedize further, more error handling, action->filter for export builder

#30 follow-up: @allendav
2 months ago

Note: It is possible that we will need to export not just obvious unique personal data for a user (e.g. author_email) from objects like comments, but every "field" in a comment that user made. If so, this means that the name->value approach would result in extremely large files since we 1) wouldn't be able to de-duplicate since 2) we'd probably keep each exported comment object "intact" (i.e. with all its fields together, so the export made more sense.)

Hope to have an update shortly from discussion with stakeholders.

cc @azaozz

#31 in reply to: ↑ 30 @azaozz
2 months ago

Replying to allendav:

Note: It is possible that we will need to export ... every "field" in a comment that user made.

If that's the case we are doing it all wrong :) We'll need to "dump" most fields from the comments table from the database.

However the guidance for now is that the "personal data export" should include only "private or identifiable personal data" like email, street address, IP address, etc. That does not include publicly published data as it is covered under other laws and perhaps other licenses, and is available on the internet.

It may be nice to have a way to export all comments by author email, but not sure if this is needed for law enforcement compliance.

Last edited 2 months ago by azaozz (previous) (diff)

#32 @azaozz
8 weeks ago

In 42889:

Privacy: add support for exporting multiple pages of personal data.

Props allendav.
See #43438.

#33 @azaozz
8 weeks ago

  • Keywords needs-unit-tests added

Leaving the ticket open so tests can be added.

#34 follow-ups: @ocean90
8 weeks ago

  • Keywords needs-patch added; has-patch removed

[42889] has a few issues which need to be addressed:

  • wp_ajax_wp_privacy_export_personal_data() has no DocBlock.
  • The mbstring PHP extension might be not available, so mb_strtolower() needs function_exists() check. Note that is_email() currently doesn't support email addresses with multibyte characters so strtolower() should be enough.
  • The docs for the wp_privacy_personal_data_exportersfilter doesn't match the docs standards.
  • It seems like $_POST['exporter'] is an numeric index of an exporter. Wouldn't it be easier if exporters are referenced by a name instead?
  • What's the idea behind the error messages? Should probably be translatable or a code instead.
  • The nonce check should be enabled.

#35 in reply to: ↑ 34 @allendav
8 weeks ago

Replying to ocean90:

[42889] has a few issues which need to be addressed:

Thank you @ocean90 ! - I'll submit a patch to address these

This ticket was mentioned in Slack in #core-multisite by mnelson4. View the logs.


4 weeks ago

This ticket was mentioned in Slack in #gdpr-compliance by joemcgill. View the logs.


4 weeks ago

#38 @desrosj
3 weeks ago

  • Milestone changed from 5.0 to 4.9.6
  • Owner set to desrosj
  • Status changed from new to assigned

Other functionality depends on this, so it needs to land in 4.9.6. Going to review this and address the remaining concerns.

Last edited 3 weeks ago by desrosj (previous) (diff)

@desrosj
3 weeks ago

#39 in reply to: ↑ 34 @desrosj
3 weeks ago

Replying to ocean90:

[42889] has a few issues which need to be addressed:

  • wp_ajax_wp_privacy_export_personal_data() has no DocBlock.

A DocBlock was added in [43012].

  • The mbstring PHP extension might be not available, so mb_strtolower() needs function_exists() check. Note that is_email() currently doesn't support email addresses with multibyte characters so strtolower() should be enough.

The mb_strtolower() call was removed in [43012] when the email was switched to use the email property of a WP_User_Request object. I looked at that, though, and it does not seem to account for any multibyte characters, so added strtolower() back in 43438.7.diff.

  • The docs for the wp_privacy_personal_data_exportersfilter doesn't match the docs standards.

Fixed in 43438.7.diff.

  • It seems like $_POST['exporter'] is an numeric index of an exporter. Wouldn't it be easier if exporters are referenced by a name instead?

I agree with this. @allendav, @azaozz any objection? If not, a new ticket may be best for this, I think.

  • What's the idea behind the error messages? Should probably be translatable or a code instead.

Ensured every error message is translatable in 43438.7.diff.

  • The nonce check should be enabled.

Was also addressed in [43012].

@desrosj
3 weeks ago

#40 @desrosj
3 weeks ago

Looked at the email again, and I don't think a strtolower() is needed at all. 43438.8.diff does not include that.

This ticket was mentioned in Slack in #gdpr-compliance by desrosj. View the logs.


3 weeks ago

@birgire
3 weeks ago

#42 @birgire
3 weeks ago

The patch in 43438.9.diff:

  • Adds checks if array keys are set in $_POST, else send an error, for these cases:
$request_id     = (int) $_POST['id'];
$exporter_index = (int) $_POST['exporter'];
$eraser_index   = (int) $_POST['eraser'];
$page           = (int) $_POST['page'];

  • Adds order to multiple placeholders in translation strings. For example %s, %d to '%1$s, %2$d.
  • Adds "translators:" comments.
  • Removes Error: in error messages. We already have "An error occurred while attempting to erase/export personal data" displayed before the error message. Also for consistency, because not all error messages start with Error:. See error_response_without_Error_string_part.jpg.
  • Adjusts the doc block for the wp_privacy_personal_data_export_page filter
    • Adds the possible WP_Error to inline doc for $response.
    • Removes the "zero-based" from the inline doc for $page.
  • Adjusts the doc block for the wp_privacy_personal_data_erasure_page filter
    • Adds the possible WP_Error to inline doc for $response.
    • Removes the "zero-based" from the inline doc for $page.
  • Further adjustments to the Coding Standard.
Last edited 3 weeks ago by birgire (previous) (diff)

#43 @birgire
3 weeks ago

Further possible changes could be to adjust the error messages for more consistency.

Here are the possible error messages raised in wp_ajax_wp_privacy_export_personal_data() for the WordPress Comments exporter, as an example:

  • Missing request ID.
  • Invalid request type.
  • Invalid request ID.
  • Invalid request.
  • A valid email address must be given.
  • Missing exporter index.
  • Missing page index.
  • An exporter has improperly used the registration filter.
  • Exporter index cannot be negative.
  • Exporter index out of range.
  • Page index cannot be less than one.
  • Expected an array describing the exporter at index 1.
  • Exporter array at index 1 does not include a friendly name.
  • Exporter does not include a callback: WordPress Comments.
  • Exporter callback is not a valid callback: WordPress Comments.
  • Expected response as an array from exporter: WordPress Comments.
  • Expected data in response array from exporter: WordPress Comments.
  • Expected data array in response array from exporter: WordPress Comments.
  • Expected done (boolean) in response array from exporter: WordPress Comments.

Here are the possible error messages raised in wp_ajax_wp_privacy_erase_personal_data() for the WordPress Comments exporter, as an example:

  • Missing request ID.
  • Invalid request type.
  • Invalid request ID.
  • Invalid request.
  • A valid email address must be given.
  • Missing exporter index.
  • Missing page index.
  • Eraser index cannot be negative.
  • Eraser index out of range.
  • Page index cannot be less than one.
  • Expected an array describing the exporter at index 1.
  • Eraser array at index 1 does not include a callback.
  • Eraser callback at index 1 is not a valid callback.
  • Eraser array at index 1 does not include a friendly name.
  • Did not receive array from WordPress Comments eraser (index 1).
  • Expected num_items_removed key in response array from WordPress Comments eraser (index 1).
  • Expected num_items_retained key in response array from WordPress Comments eraser (index 1).
  • Expected messages key in response array from WordPress Comments eraser (index 1).
  • Expected messages key to reference an array in response array from WordPress Comments eraser (index 1).
  • Expected done flag in response array from WordPress Comments eraser (index 1).

For example:

  • Expected done (boolean) in response array from exporter: WordPress Comments.
  • Expected done flag in response array from WordPress Comments eraser (index 1).

and

  • Expected response as an array from exporter: WordPress Comments.
  • Did not receive array from WordPress Comments eraser (index 1).

Since plugins can define their own friendly name, there's a possible confusion if the friendly name is a bad one like "response eraser". Then we have e.g.:

  • Expected messages key in response array from response eraser eraser (index 1).

that is difficult to understand.

I wonder if we should use translatable quotes, so it would display like:

  • Expected messages key in response array from "response eraser" eraser (index 1).

or keep the friendly name at last, like:

  • Expected messages key in response array (index 1) from eraser: response eraser.
Last edited 3 weeks ago by birgire (previous) (diff)

#44 @birgire
3 weeks ago

An additional important thing here is that we need to sanitize/escape the exporter/eraser friendly name.

We don't want it to contain HTML and stuff, but currently it goes through.

Here's an example:

wp_send_json_error(
	sprintf(
		/* translators: %s: exporter friendly name */
		__( 'Exporter does not include a callback: %s.' ),
		$exporter['exporter_friendly_name']
	)
);

The wp_send_json_error() is a wrapper for wp_send_json() that uses wp_encode_json().

What comes to mind are esc_js(), esc_html() and wp_encode_json() but I'm thinking sanitize_text_field() on $exporter['exporter_friendly_name'] should do it here.

What do you think?

ps: I did some testing here:

Raw friendly name:

WordPress Comments " \' \ - &amp; & <b>Hello</b> <script>alert(1)</script>

and here's the how the ajax error is displayed in the HTML table:

sanitize_text_field:
...<li>Exporter does not include a callback: WordPress Comments " ' \ - &amp; &amp; Hello.</li>...

esc_html:
...<li>Exporter does not include a callback: WordPress Comments " ' \ - &amp; &amp; &lt;b&gt;Hello&lt;/b&gt; &lt;script&gt;alert(1)&lt;/script&gt;.</li>...

esc_js:
...<li>Exporter does not include a callback: WordPress Comments " \'  - &amp; &amp; &lt;b&gt;Hello&lt;/b&gt; &lt;script&gt;alert(1)&lt;/script&gt;.</li>...

esc_html + sanitize_text_field:
...<li>Exporter does not include a callback: WordPress Comments " ' \ - &amp; &amp; Hello.</li>...

esc_js + sanitize_text_field:
...<li>Exporter does not include a callback: WordPress Comments " \'  - &amp; &amp; Hello.</li>...

wp_json_encode:
...<li>Exporter does not include a callback: "WordPress Comments \" ' \\ - &amp; &amp; <b>Hello&lt;\/b&gt; <script>alert(1)<\/script>".</li></ul></div></td></tr></tbody></table></script></b></li>...

So I would now think that we should do (at least) esc_html for the escaping and then decide if we want the sanitizing part.

Last edited 3 weeks ago by birgire (previous) (diff)

@birgire
3 weeks ago

#45 @birgire
3 weeks ago

43438.10.diff adds escaping for the friendly exporter/eraser names.

This ticket was mentioned in Slack in #gdpr-compliance by iandunn. View the logs.


3 weeks ago

#47 @azaozz
3 weeks ago

In 43060:

Privacy: translate error messages, some fixes and improvements for the AJAX actions for exporting and erasing user data.

Props desrosj, birgire.
See #43438.

#48 @birgire
3 weeks ago

43438.11.diff is a first pass for adding tests for the wp_ajax_wp_privacy_export_personal_data() function.

@birgire
3 weeks ago

#49 @SergeyBiryukov
3 weeks ago

In 43074:

Privacy: add support for exporting multiple pages of personal data.

Props allendav.
Merges [42889] to the 4.9 branch.
See #43438.

#50 @SergeyBiryukov
3 weeks ago

In 43088:

I18N: Use consistent pattern for placeholder references in translator comments in wp_ajax_wp_privacy_erase_personal_data().

See #43438.

#51 @SergeyBiryukov
3 weeks ago

In 43105:

Privacy: translate error messages, some fixes and improvements for the AJAX actions for exporting and erasing user data.

Props desrosj, birgire.
Merges [43060] to the 4.9 branch.
See #43438.

#52 @SergeyBiryukov
3 weeks ago

In 43107:

I18N: Use consistent pattern for placeholder references in translator comments in wp_ajax_wp_privacy_erase_personal_data().

Merges [43088] to the 4.9 branch.
See #43438.

This ticket was mentioned in Slack in #gdpr-compliance by desrosj. View the logs.


3 weeks ago

#54 @desrosj
3 weeks ago

  • Keywords needs-patch removed

This ticket was mentioned in Slack in #core by desrosj. View the logs.


3 weeks ago

This ticket was mentioned in Slack in #gdpr-compliance by allendav. View the logs.


2 weeks ago

This ticket was mentioned in Slack in #core by desrosj. View the logs.


13 days ago

@birgire
10 days ago

#58 @birgire
10 days ago

  • Keywords has-unit-tests added; needs-unit-tests removed

43438.12.diff adds tests for the Ajax handlers:

  • wp_ajax_wp_privacy_erase_personal_data()
  • wp_ajax_wp_privacy_export_personal_data()

#59 @azaozz
9 days ago

  • Priority changed from normal to low

Set priority to low for tickets that only need unit tests.

This ticket was mentioned in Slack in #core by desrosj. View the logs.


9 days ago

This ticket was mentioned in Slack in #gdpr-compliance by desrosj. View the logs.


8 days ago

This ticket was mentioned in Slack in #gdpr-compliance by birgire. View the logs.


7 days ago

#63 @desrosj
7 days ago

  • Component changed from General to Privacy

Moving to the new Privacy component.

Note: See TracTickets for help on using tickets.