Make WordPress Core

Opened 6 years ago

Closed 6 years ago

Last modified 6 years ago

#44141 closed defect (bug) (fixed)

Privacy: Don't replace comment author URL and email with anything

Reported by: johnbillion's profile johnbillion Owned by: azaozz's profile azaozz
Milestone: 4.9.8 Priority: normal
Severity: normal Version: 4.9.6
Component: Comments Keywords: has-patch has-unit-tests needs-testing fixed-major
Focuses: privacy Cc:

Description

When anonymizing comment author information, replacing the comment author's site URL with https://site.invalid has the effect of exposing broken links to end users.

During the anonymization process there's no reason to replace the comment author URL with anything. It should be replaced with an empty string as if the comment author had not entered a URL.

Introduced in [42994].

Attachments (3)

44141.patch (878 bytes) - added by TZ Media 6 years ago.
44141.diff (5.7 KB) - added by desrosj 6 years ago.
44141.2.diff (7.1 KB) - added by birgire 6 years ago.

Download all attachments as: .zip

Change History (22)

@TZ Media
6 years ago

#1 @TZ Media
6 years ago

  • Keywords has-patch added; needs-patch removed

Quick fix that sets comment_author_url to an empty string instead of using wp_privacy_anonymize_data().

#2 @johnbillion
6 years ago

  • Keywords needs-unit-tests added

Thanks for the patch, @TZ Media . This will need some updates to the corresponding unit tests too.

#3 @iandunn
6 years ago

I think the reason to replace it was so that it's obvious that its been redacted, rather than introducing ambiguity about whether or not there originally was data there.

I don't think that's critical in this case, but it should be considered before changing it.

An alternative approach would be to modify the front end comment form to ignore anonymized URLs, although that's not as elegant as just making it an empty string.

#4 @summoner
6 years ago

What if there would be a screen with some text like "Most probably the user has let their data deleted" or something like that. So the people could find out what has happened to the content they wanted to see and even the link would not be invalid.

#5 @desrosj
6 years ago

  • Milestone changed from 4.9.7 to 4.9.8

Moving all tickets in 4.9.7 to 4.9.8.

#6 @SergeyBiryukov
6 years ago

  • Type changed from enhancement to defect (bug)

Exposing broken links to end users sounds like a bug to me, I don't think avoiding the ambiguity about whether or not the URL was originally provided is worth it.

@desrosj
6 years ago

#7 follow-up: @desrosj
6 years ago

  • Keywords has-unit-tests added; needs-unit-tests removed
  • Milestone changed from 4.9.8 to 4.9.7

In 44141.diff:

  • Add @group privacy to the tests in Tests_Comments that test the anonymization of comment data.
  • Adds translator context to the Anonymous string that replaces the commenter's name. This could be important for some languages.
  • Continue passing the comment author's URL through wp_privacy_anonymize_data() so that plugins and themes can filter the URL if desired.
  • Adjust tests for anonymizing comment author URLs and emails.
  • Add a _wp_anonymized comment meta value of true when a comment is successfully anonymized.
  • Change wp_privacy_anonymize_data() to return an empty string by default for both emails & URLs.

The main change here is that a comment meta value is added flagging a comment as anonymized. It allows the URL to be empty but it's still obvious that there was potentially data there that was redacted. The _wp_anonymized key could also contain an array of the fields that did contain data before being erased if knowing that is important.

This ticket was mentioned in Slack in #gdpr-compliance by desrosj. View the logs.


6 years ago

#9 in reply to: ↑ 7 @earnjam
6 years ago

44141.diff looks good and works well for me.

I don't think it's necessary to store an array of the fields that were anonymized instead the boolean that you have in your patch. IMO, the point of anonymizing is to remove any indication of who made the comment in the first place. In that regard, it might be helpful to know that it was anonymized for overall tracking purposes, but it doesn't matter what specifically about the comment was anonymized.

@birgire
6 years ago

#10 @birgire
6 years ago

44141.2.diff includes some minor suggestions:

  • Uses assertSame() instead of assertEquals(), for stricter checks.
  • Changes @since 5.0.0 to @since 4.9.6. in anonymization.php.
  • Adjusts the test_anonymize_with_filter(), so it removes only the filter callback that was added, before the assertions.
  • Adjusts the filter_wp_privacy_anonymize_data() so it also tests the $data input argument.
  • Adjusts the inline docblock for filter_wp_privacy_anonymize_data(): adds a @since and adds the variable to the @return line.

ps: I noticed that 44141.diff changes the email anonymization as well. Should the ticket be adjusted to mention that as well?

#11 @desrosj
6 years ago

  • Keywords needs-testing added
  • Summary changed from Privacy: Don't replace comment author URL with anything to Privacy: Don't replace comment author URL and email with anything

Thanks for the adjustments, @birgire! Updating the ticket title to reflect the email behavior is also changed.

#12 @ocean90
6 years ago

  • Milestone changed from 4.9.7 to 4.9.8

4.9.7 has been released, moving to next milestone.

#13 @pbiron
6 years ago

  • Keywords needs-refresh added

Any chance this can land in 4.9.8? If it does, then it needs a refresh for @since 4.9.8.

Beta is scheduled for July 17, next Tues.

#14 @azaozz
6 years ago

Add a _wp_anonymized comment meta value of true when a comment is successfully anonymized.

Why more data? We are trying to get rid of it :)
If this really needs to be logged, it should go somewhere else: a proper logging system perhaps, although that would mostly defeat the purpose (as it would be possible to trace anonymized data back to a particular user from a particular "anonymization event").

Looking at 44141.diff and 44141.2.diff, wp_privacy_anonymize_data() shouldn't be changed. It's not meant just for comments, it should "Return uniform anonymous data by type" (see the inline docs) :) The empty strings should be in the wp_comments_personal_data_eraser() as these strings are optional for comments but may not be optional in other places.

I understand the concerns about having invalid URLs in the DB, but the whole point of these privacy efforts is to get the users to "look before they click", i.e. be a bit more concerned with their own privacy. Following an URL that has .invalid domain is pretty clear imho :)

#15 @azaozz
6 years ago

  • Owner set to azaozz
  • Resolution set to fixed
  • Status changed from new to closed

In 43467:

Privacy: Don't replace comment author URL and email with anything.

Props TZ-Media, desrosj, birgire.
Fixes #44141.

#16 @azaozz
6 years ago

  • Keywords fixed-major added; needs-refresh removed
  • Resolution fixed deleted
  • Status changed from closed to reopened

Reopen for 4.9.8.

#17 @azaozz
6 years ago

  • Resolution set to fixed
  • Status changed from reopened to closed

In 43468:

Privacy: Don't replace comment author URL and email with anything.

Props TZ-Media, desrosj, birgire.
Merges [43467] to the 4.9 branch.
Fixes #44141.

#18 @azaozz
6 years ago

In 43471:

Privacy: Fix tests after [43467].

See #44141.

#19 @azaozz
6 years ago

In 43473:

Privacy: Fix tests after [43467].

Merges [43471] to the 4.9. branch.
See #44141.

Note: See TracTickets for help on using tickets.