WordPress.org

Make WordPress Core

Opened 19 months ago

Last modified 7 months ago

#44215 new defect (bug)

Personal data exports - Empty values are included in the report

Reported by: bor0 Owned by:
Milestone: Awaiting Review Priority: normal
Severity: normal Version: 4.9.6
Component: Privacy Keywords: has-patch has-screenshots 2nd-opinion
Focuses: Cc:
PR Number:

Description

Empty values should not be included in the report. Otherwise, the report will be cluttered with empty values.

Use the following code in mu-plugins to reproduce:

<?php
function exporter_1( $email_address, $page = 1 ) {
	return array(
		'data' => array( array(
			'group_id'    => 'group-1',
			'group_label' => __( 'Orders', 'woocommerce-services' ),
			'item_id'     => 'item-1',
			'data'	=> array(
				array(
					'name'  => __( 'Exporter 1' ),
					'value' => 1234,
				),
				array(
					'name'  => __( 'Exporter 2' ),
					'value' => '',
				),
				array(
					'name'  => __( 'Exporter 3' ),
					'value' => '0',
				),
				array(
					'name'  => __( 'Exporter 4' ),
					'value' => 0,
				),
			),
		) ),
		'done' => true,
	);
}

function register_exporter_1( $exporters ) {
	$exporters['exporter_1'] = array(
		'exporter_friendly_name' => __( 'Exporter 1' ),
		'callback' => 'exporter_1',
	);
	return $exporters;
}

add_filter(
	'wp_privacy_personal_data_exporters',
	'register_exporter_1'
);

Results:

Before proposed patch:
https://cloudup.com/iPJ8ocUnyiz

After proposed patch:
https://cloudup.com/i7MqWkMZjrj

Attachments (3)

44215.patch (726 bytes) - added by bor0 19 months ago.
44215.diff (3.7 KB) - added by desrosj 16 months ago.
44215.2.diff (5.7 KB) - added by birgire 16 months ago.

Download all attachments as: .zip

Change History (17)

@bor0
19 months ago

#1 @bor0
19 months ago

Pinging @allendav for any input. Thanks!

#2 @allendav
19 months ago

I don't know - I would hope plugins could avoid inserting meaningless values into the report. That said, a completely empty string - sure - we could discard that. Beyond that however, I would push back beyond having core pass judgement on what is passed to it by an exporter.

This ticket was mentioned in Slack in #gdpr-compliance by desrosj. View the logs.


19 months ago

#4 @bor0
19 months ago

Empty strings would be neat and sufficient :)

#5 @desrosj
19 months ago

  • Version changed from trunk to 4.9.6

I am wondering if leaving empty values in the report is important. The site is technically storing data on the user, the string is just empty. This could be significant in certain contexts.

In your example there are 0s. A 0 (which is falsey or can be considered empty) could indicate that a user has no orders or no published posts. An empty string could mean that they did not provide a URL on their profile. A user meta field could be stored for speaks_english and be set to false or 0.

So in certain contexts, these empty values may be important.

Last edited 19 months ago by desrosj (previous) (diff)

#6 @birgire
19 months ago

It's a good question, but I agree with @desrosj, that we should not filter out "empty" values, like false and 0 as they could have a specific meaning.

Also anonymous functions, like in 44215.patch, were first supported in PHP 5.3.0

https://secure.php.net/manual/en/functions.anonymous.php

but WordPress still supports PHP 5.2.4

https://wordpress.org/about/requirements/

#7 @desrosj
18 months ago

  • Keywords 2nd-opinion close added; dev-feedback removed

This ticket was mentioned in Slack in #core-privacy by desrosj. View the logs.


16 months ago

#9 @javorszky
16 months ago

I agree with @desrosj on that it should include data that's falsey, but present for the reasons he already mentioned in his comment (#5).

#10 follow-up: @desrosj
16 months ago

One thing that I want to verify before closing this is that values, such as false or 0, are correctly represented in the export. For example, a false value should not show up as a blank space. It should show as false. I think it is important for accuracy that these empty values are not just displayed as an empty table cell.

#11 in reply to: ↑ 10 @javorszky
16 months ago

That sounds reasonable, though that depends on the type of column being used. Varchars can't store false, only 0 or ''. But they could also store the string false.

I see situations where changing the display value from an empty string to the word false might be misleading, because does false mean it's really an empty string, or is it the actual word false?

Maybe there should be a type declaration as well for falsey values, much like var_dump gives us?

Replying to desrosj:

One thing that I want to verify before closing this is that values, such as false or 0, are correctly represented in the export. For example, a false value should not show up as a blank space. It should show as false. I think it is important for accuracy that these empty values are not just displayed as an empty table cell.

#12 @desrosj
16 months ago

  • Keywords close removed

Did some testing. Here are my findings of the current behaviors in core.

The fields currently included in a data export are core user fields, a few core user meta fields, and core comment fields. All of these are either varchars, int/bigints, or text types.

  • Passing false as a meta value results in an empty string being stored.
  • Passing a 0 integer as a meta value results in a 0 string being stored.

All values for these fields are passed through if ( ! empty( $value ) ) before adding the data to the export. This means that in the current state, any empty field (which indicates an absence of data) will always be removed from the export. This should probably be changed.

If the meta key exists, it potentially means something and should be displayed in the export, even if empty. Users, for example, may have an empty description, last name, or nickname field. Comments may have an empty author URL or Author IP. It could be important for the user to know that those fields are present and are in fact empty.

If a core user meta field has been deleted, a PHP notice is currently thrown. Example: A plugin deletes the last_name meta field for a user. When this happens, the field can be excluded from the report because it does not exist.

Incoming patch with these changes.

Worth noting, only the behavior in core's data exporters can be controlled. Plugins with custom data exporters will have their own logic about what to include and why. I think that the plugin handbook pages about creating data exporters should be updated to include a section about the differences between no data and empty data.

@desrosj
16 months ago

@birgire
16 months ago

#13 @birgire
16 months ago

The patch 44215.2.diff is based on 44215.diff and additionally:

  • Adds the @group privacy annotation.
  • Uses assertCount(...) instead of assertSame(..., count(...) ).
  • Moves the cleanup before assertions, so it runs also on failures.

ps: I hope you don't mind that I used the opportunity to do similar minor adjustments
on few other privacy test methods + removed unused $c variables. :-)

This ticket was mentioned in Slack in #core-privacy by garrett-eclipse. View the logs.


7 months ago

Note: See TracTickets for help on using tickets.