Make WordPress Core

Opened 3 years ago

Last modified 2 months ago

#53465 assigned task (blessed)

PHP 8.1.: the default value of the flags parameter for htmlentities() et all needs to be explicitly set

Reported by: jrf's profile jrf Owned by: hellofromtonya's profile hellofromTonya
Milestone: 6.7 Priority: normal
Severity: normal Version:
Component: General Keywords: php81 has-patch has-unit-tests
Focuses: coding-standards, php-compatibility Cc:

Description

From the PHP 8.1 changelog:

htmlspecialchars(), htmlentities(), htmlspecialchars_decode(),
html_entity_decode() and get_html_translation_table() now use
ENT_QUOTES | ENT_SUBSTITUTE rather than ENT_COMPAT by default. This means
that ' is escaped to ' while previously it was left alone.
Additionally, malformed UTF-8 will be replaced by a Unicode substitution
character, instead of resulting in an empty string.

Ref: https://github.com/php/php-src/blob/28a1a6be0873a109cb02ba32784bf046b87a02e4/UPGRADING#L149-L154

If effect this means that the output of the above mentioned functions may be different depending on the PHP version and the passed text string, unless the $flags parameter is explicitly passed.

I've run an initial scan over WordPress core with a new (not yet published) sniff for PHPCompatibility and this flags 33 issues.

  • 1 issue in GetID3 which should be fixed upstream and the copy of GetID3 used in WP should be updated once the issue is fixed.
  • 1 issue in PHPMailer which should be fixed upstream and the copy of PHPMailer used in WP should be updated once the issue is fixed.
  • 1 issue in SimplePie which should be fixed upstream and the copy of SimplePie used in WP should be updated once the issue is fixed.
  • And 30 issues in WP Core native code or code from external dependencies which are no longer maintained externally.

Detailed issue list: https://gist.github.com/jrfnl/9d56b4053faa62a0fe91dea1b14839bf

To fix this issue, the $flags parameter should be explicitly passed in each of these function calls.

Some investigation will be needed for each of these instances to determine what will be the optimal value for $flags.

Take note that the "old" parameter default in the function signature is documented as ENT_COMPAT, while in the parameter detail documentation, it states that the default, in actual fact, is ENT_COMPAT | ENT_HTML401.

However, by the looks of it, the full range of flag constants is available to us, which is at least one less problem.
There is no mention of any of the flags being added since PHP 5.6.
Ref: https://php-legacy-docs.zend.com/manual/php5/en/string.constants

It is strongly recommended to make sure that for each of these at least one unit test exists which exposes the difference in output between PHP < 8.1 and PHP 8.1 to safeguard the fixes which will be added for the future.

Also see:

Change History (30)

#2 @hellofromTonya
3 years ago

  • Milestone changed from 5.9 to 6.0

With 4 days left until 5.9 Beta 1, moving this to 6.0.

#3 follow-up: @GaryJ
2 years ago

According to https://www.php.net/manual/en/function.htmlentities.php, the default for flags for PHP 8.1 was "changed from ENT_COMPAT to ENT_QUOTES | ENT_SUBSTITUTE | ENT_HTML401".

I don't know what difference that ENT_HTML401 might make (there's also a ENT_HTML5 flag available as well), but might be something else to consider.

#4 in reply to: ↑ 3 @jrf
2 years ago

Replying to GaryJ:

According to https://www.php.net/manual/en/function.htmlentities.php, the default for flags for PHP 8.1 was "changed from ENT_COMPAT to ENT_QUOTES | ENT_SUBSTITUTE | ENT_HTML401".

I don't know what difference that ENT_HTML401 might make (there's also a ENT_HTML5 flag available as well), but might be something else to consider.

The ENT_HTML401 makes no difference as the value of that constant is 0, so the "bit" value remains the same with or without.
The ENT_HTML401 part is not an actual change, just better documentation.

See:

#5 @costdev
2 years ago

  • Milestone changed from 6.0 to 6.1

With 6.0 RC1 tomorrow, I'm moving this to the 6.1 milestone.

#6 @rafiahmedd
2 years ago

I think we can discuss this ticket on the next dev chat on slack. And it seems to be an important one.

#7 @jrf
2 years ago

FYI: If I remember correctly, @SergeyBiryukov is working on patches for this ticket together with other people from the Yoast WP Core team.

#8 @SergeyBiryukov
2 years ago

Yes, @aristath, @justinahinon, @poena, @afercia, and I worked on this recently in group sessions and should have a PR ready for review soon :)

This ticket was mentioned in Slack in #core by chaion07. View the logs.


2 years ago

This ticket was mentioned in Slack in #hosting-community by javier. View the logs.


2 years ago

This ticket was mentioned in Slack in #hosting-community by jadonn. View the logs.


2 years ago

#12 @audrasjb
2 years ago

  • Milestone changed from 6.1 to 6.2

With WP 6.1 RC 1 scheduled tomorrow (Oct 10, 2022), there is not much time left to address this ticket. Given it still needs a patch, let's move this ticket to 6.2.

Ps: if you were about to send a patch and if you feel it is realistic to commit it in the next few hours, please feel free to move this ticket back to milestone 6.1.

This ticket was mentioned in PR #3429 on WordPress/wordpress-develop by SergeyBiryukov.


2 years ago
#13

  • Keywords has-patch has-unit-tests added; needs-patch removed

This ticket was mentioned in Slack in #core by costdev. View the logs.


19 months ago

#15 @costdev
19 months ago

This ticket was discussed during the recent bug scrub.

@SergeyBiryukov Do you have time to give the PR a quick bump to see results on GitHub Actions?

#16 @hellofromTonya
19 months ago

  • Milestone changed from 6.2 to 6.3
  • Owner set to hellofromTonya
  • Status changed from new to assigned

The last scheduled 6.2 beta happened today. RC1 is next week. Moving this ticket to 6.3 and self-assigning ownership to help shepherd its completion.

This ticket was mentioned in Slack in #core by oglekler. View the logs.


15 months ago

This ticket was mentioned in Slack in #core by mukeshpanchal27. View the logs.


14 months ago

#19 @chaion07
14 months ago

  • Milestone changed from 6.3 to 6.4

Thanks @jrf for continuing to work on this. We reviewed this ticket during a recent bug-scrub session and based on the feedback we are updating the milestone to 6.4. Cheers!

Props to @costdev & @mukesh27

#20 @hellofromTonya
13 months ago

  • Focuses php-compatibility added

Marking this work as a PHP 8.1 incompatibility. This is done for tracking purposes and for identifying documented exceptions to Core being compatible with PHP 8.1.

These functions will be included in the list of PHP 8.1 "compatible with documented exceptions" for WP 6.3.0 and beyond until resolved. See https://make.wordpress.org/core/2023/06/20/proposal-criteria-for-removing-beta-support-from-each-php-8-version/.

@hellofromTonya commented on PR #3429:


11 months ago
#21

Besides the merge conflicts, is this patch ready for review @SergeyBiryukov?

#22 @hellofromTonya
11 months ago

Hey @SergeyBiryukov and @jrf, following up on the patch for this ticket as 6.4 RC1 is next week.

Is https://github.com/WordPress/wordpress-develop/pull/3429 ready for review and commit consideration? It's still marked as WIP and there are merge conflicts. I haven't yet gone through the code. But am I also curious if each instance has a test associated with it:

It is strongly recommended to make sure that for each of these at least one unit test exists which exposes the difference in output between PHP < 8.1 and PHP 8.1 to safeguard the fixes which will be added for the future.

@jrf commented on PR #3429:


11 months ago
#23

@hellofromtonya If you don't mind, I'll review when the PR gets marked _ready for review_.

#24 @hellofromTonya
11 months ago

  • Milestone changed from 6.4 to 6.5

6.4 RC1 is in a few hours. The patch is not quite ready yet. Moving to 6.5.

#25 @swissspidy
7 months ago

@hellofromTonya Is this on your radar still for 6.5? Looks like a punt candidate given the lack of activity.

#26 @swissspidy
6 months ago

  • Milestone changed from 6.5 to 6.6

#27 in reply to: ↑ description @SergeyBiryukov
5 months ago

Replying to jrf:

From the PHP 8.1 changelog:

htmlspecialchars(), htmlentities(), htmlspecialchars_decode(),
html_entity_decode() and get_html_translation_table() now use
ENT_QUOTES | ENT_SUBSTITUTE rather than ENT_COMPAT by default. This means
that ' is escaped to &#039; while previously it was left alone.
Additionally, malformed UTF-8 will be replaced by a Unicode substitution
character, instead of resulting in an empty string.

...
To fix this issue, the $flags parameter should be explicitly passed in each of these function calls.

I had a question while revisiting this ticket and PR: What about the instances where this change is not relevant, e.g.:

  • ' or malformed UTF-8 would never occur under normal circumstances.
  • They might occur, but the escaping would not make any difference.

For example, instances like htmlentities( __( 'Unknown Feed' ) ). Do we still need to add the $flags parameter there?

Some investigation will be needed for each of these instances to determine what will be the optimal value for $flags.

Take note that the "old" parameter default in the function signature is documented as ENT_COMPAT, while in the parameter detail documentation, it states that the default, in actual fact, is ENT_COMPAT | ENT_HTML401.

The current draft PR adds ENT_COMPAT | ENT_HTML401 to most of the instances to keep the current behavior, whether or not the change is relevant to that particular instance. In some cases, it replaces the function call altogether with a more appropriate function.

Should there be any additional considerations when determining the optimal value for $flags?

#28 @hellofromTonya
3 months ago

  • Milestone changed from 6.6 to 6.7

With the last scheduled beta for 6.6 happening in ~2 hours, seems this ticket needs a wee bit more time to land. Moving to 6.7.

vlakoff commented on PR #3429:


2 months ago
#29

How about using a constant? Something like WP_LEGACY_ENTITY_FLAGS.

  • This could help for future changes in the codebase, where the uses of that constant would be progressively replaced by something else. This way, the removals of the legacy flags could be more easily tracked.
  • And of course, searching "WP_LEGACY_ENTITY_FLAGS" in the codebase would be easier then searching the flags.

As a subsequent step, I'm thinking about adding another constant, WP_ENTITY_FLAGS, that would consist of more adequate flags:

  • Replace ENT_COMPAT with ENT_QUOTES: Way more secure.
  • Add ENT_SUBSTITUTE: Better show the substitution problems.
  • Stay on ENT_HTML401: ENT_HTML5 seems to be tricky, and ENT_HTML401 is the most compatible.

Note it would match the newer PHP default flags. But by specifying our flags, it would prevent breakage if the PHP default flags were to change again (just like it happened here).

Then, you guessed it: WP_LEGACY_ENTITY_FLAGS would be progressively replaced with WP_ENTITY_FLAGS in the codebase. Then, WP_LEGACY_ENTITY_FLAGS could be removed.

Additionally, you may be interested by this very exhaustive answer on Stack Overflow.

vlakoff commented on PR #3429:


2 months ago
#30

However, this PR has not been merged yet, and probably many people are now running on at least PHP 8.1.

This means they are already using the newer flags, apparently without disruption.

If merging this PR or adding a WP_LEGACY_ENTITY_FLAGS constant as I suggested above, people would be switched back to the old flags.

Therefore, it may be better to skip the "legacy flags" step, and directly make the change to the newer flags, using the WP_ENTITY_FLAGS constant I suggested above.

Note: See TracTickets for help on using tickets.