Opened 7 years ago
Last modified 6 years ago
#44215 new defect (bug)
Personal data exports - Empty values are included in the report
Reported by: | bor0 | Owned by: | |
---|---|---|---|
Milestone: | Awaiting Review | Priority: | normal |
Severity: | normal | Version: | 4.9.6 |
Component: | Privacy | Keywords: | has-patch has-screenshots 2nd-opinion |
Focuses: | Cc: |
Description
Empty values should not be included in the report. Otherwise, the report will be cluttered with empty values.
Use the following code in mu-plugins
to reproduce:
<?php function exporter_1( $email_address, $page = 1 ) { return array( 'data' => array( array( 'group_id' => 'group-1', 'group_label' => __( 'Orders', 'woocommerce-services' ), 'item_id' => 'item-1', 'data' => array( array( 'name' => __( 'Exporter 1' ), 'value' => 1234, ), array( 'name' => __( 'Exporter 2' ), 'value' => '', ), array( 'name' => __( 'Exporter 3' ), 'value' => '0', ), array( 'name' => __( 'Exporter 4' ), 'value' => 0, ), ), ) ), 'done' => true, ); } function register_exporter_1( $exporters ) { $exporters['exporter_1'] = array( 'exporter_friendly_name' => __( 'Exporter 1' ), 'callback' => 'exporter_1', ); return $exporters; } add_filter( 'wp_privacy_personal_data_exporters', 'register_exporter_1' );
Results:
Attachments (3)
Change History (17)
#2
@
7 years ago
I don't know - I would hope plugins could avoid inserting meaningless values into the report. That said, a completely empty string - sure - we could discard that. Beyond that however, I would push back beyond having core pass judgement on what is passed to it by an exporter.
This ticket was mentioned in Slack in #gdpr-compliance by desrosj. View the logs.
7 years ago
#5
@
7 years ago
- Version changed from trunk to 4.9.6
I am wondering if leaving empty values in the report is important. The site is technically storing data on the user, the string is just empty. This could be significant in certain contexts.
In your example there are 0
s. A 0
(which is falsey or can be considered empty) could indicate that a user has no orders or no published posts. An empty string could mean that they did not provide a URL on their profile. A user meta field could be stored for speaks_english
and be set to false
or 0
.
So in certain contexts, these empty values may be important.
#6
@
7 years ago
It's a good question, but I agree with @desrosj, that we should not filter out "empty" values, like false
and 0
as they could have a specific meaning.
Also anonymous functions, like in 44215.patch, were first supported in PHP 5.3.0
https://secure.php.net/manual/en/functions.anonymous.php
but WordPress still supports PHP 5.2.4
This ticket was mentioned in Slack in #core-privacy by desrosj. View the logs.
6 years ago
#9
@
6 years ago
I agree with @desrosj on that it should include data that's falsey, but present for the reasons he already mentioned in his comment (#5).
#10
follow-up:
↓ 11
@
6 years ago
One thing that I want to verify before closing this is that values, such as false
or 0
, are correctly represented in the export. For example, a false
value should not show up as a blank space. It should show as false
. I think it is important for accuracy that these empty values are not just displayed as an empty table cell.
#11
in reply to:
↑ 10
@
6 years ago
That sounds reasonable, though that depends on the type of column being used. Varchars can't store false
, only 0
or ''
. But they could also store the string false
.
I see situations where changing the display value from an empty string to the word false
might be misleading, because does false
mean it's really an empty string, or is it the actual word false
?
Maybe there should be a type declaration as well for falsey values, much like var_dump
gives us?
Replying to desrosj:
One thing that I want to verify before closing this is that values, such as
false
or0
, are correctly represented in the export. For example, afalse
value should not show up as a blank space. It should show asfalse
. I think it is important for accuracy that these empty values are not just displayed as an empty table cell.
#12
@
6 years ago
- Keywords close removed
Did some testing. Here are my findings of the current behaviors in core.
The fields currently included in a data export are core user fields, a few core user meta fields, and core comment fields. All of these are either varchars, int/bigints, or text types.
- Passing
false
as a meta value results in an empty string being stored. - Passing a
0
integer as a meta value results in a0
string being stored.
All values for these fields are passed through if ( ! empty( $value ) )
before adding the data to the export. This means that in the current state, any empty field (which indicates an absence of data) will always be removed from the export. This should probably be changed.
If the meta key exists, it potentially means something and should be displayed in the export, even if empty. Users, for example, may have an empty description, last name, or nickname field. Comments may have an empty author URL or Author IP. It could be important for the user to know that those fields are present and are in fact empty.
If a core user meta field has been deleted, a PHP notice is currently thrown. Example: A plugin deletes the last_name
meta field for a user. When this happens, the field can be excluded from the report because it does not exist.
Incoming patch with these changes.
Worth noting, only the behavior in core's data exporters can be controlled. Plugins with custom data exporters will have their own logic about what to include and why. I think that the plugin handbook pages about creating data exporters should be updated to include a section about the differences between no data and empty data.
#13
@
6 years ago
The patch 44215.2.diff is based on 44215.diff and additionally:
- Adds the
@group privacy
annotation. - Uses
assertCount(...)
instead ofassertSame(..., count(...) )
. - Moves the cleanup before assertions, so it runs also on failures.
ps: I hope you don't mind that I used the opportunity to do similar minor adjustments
on few other privacy test methods + removed unused $c
variables. :-)
Pinging @allendav for any input. Thanks!