Make WordPress Core

Opened 17 months ago

Last modified 2 months ago

#55603 assigned task (blessed)

PHP 8.2: address deprecation of the utf8_encode() and utf8_decode() functions

Reported by: jrf's profile jrf Owned by: hellofromtonya's profile hellofromTonya
Milestone: 6.4 Priority: normal
Severity: normal Version: 6.0
Component: General Keywords: 2nd-opinion php82 dev-feedback
Focuses: coding-standards Cc:

Description (last modified by jrf)

Context

The PHP RFC to remove the `utf8_encode()` and `utf8_decode()` functions from PHP in PHP 9.0 has recently been accepted.

This means in effect that as of PHP 8.2, those functions will be deprecated and a deprecation notice will be thrown whenever they are called.

The reasoning behind the deprecation and removal is that these functions are confusing and rarely used correctly.

See the Usage section of the RFC for an analysis of the various (mostly incorrect) uses of the functions.

The Problem

The typical replacements for these functions are using the MBString extension and/or the Iconv extension.

As these extensions are both optional extensions in PHP, they cannot be relied on to be available in an open source context.

WordPress uses the utf8_encode() function a few times in the codebase:

  • 1 x utf8_encode() in src/wp-admin/includes/export.php
  • 2 x utf8_encode() in src/wp-admin/includes/image.php
  • 1 x utf8_encode() in tests/phpunit/tests/kses/php

Aside from that the external dependency GetID3 also uses both these functions a number of times.

A search of the plugin and theme directory shows more worrying results with a plenitude of matches:

Options

So, what are the options we have ?

In my opinion, especially seeing how these functions are used so often in plugins, there are only two realistic options:

1. We could polyfill these functions.

While some functions which may not be available are polyfilled by WP, this is generally only done to have access to new PHP functionality or to allow for using functions which require certain optional extensions to be enabled.

As far as I know, no PHP native function has ever been polyfilled due to it being removed from PHP.

Pro:
Relatively simple solution and everything keeps working (deprecation notices will still show when running on PHP 8.x, though these could silenced).

Con:
As most uses of these functions are likely to be incorrect usage (especially in plugins), these "bugs" will remain and not be reviewed or addressed, undercutting the improvement PHP is trying to make.

2. We could make the MbString (or the Iconv) extension a requirement

At this moment, both the MbString as well as the Iconv extension are recommended, but not required by WP.

A couple of MbString functions are also polyfilled in WP, so now might be a good time to make the MbString extension a requirement for WP.

Pro:
MbString being available will allow for fixing the deprecations in a forward-compatible manner. It will also allow for other code improvements to be made to improve WPs support for languages using non-latin based scripts.

Con:
A new requirement would be added to WP which should not be taken lightly. At the same time, it should be noted that MbString is generally enabled already anyway, so this will impact only a small percentage of users.

Why MbString instead of Iconv ?

While both are included (though not enabled) by default with PHP, Iconv requires the `libiconv` library, which may not be available, while MbString has no external dependencies.

MbString is not enabled by default in PHP, but generally is enabled in practice.
Iconv is enabled by default in PHP, but can be disabled.

Having said that, MbString offers much more functionality than the limited functionality offered by Iconv and - as indicated by a number of functions being polyfilled - is already in use in WP.

Still, it would be helpful if someone with access to the underlying statistics data collected by WP could add figures to this issue showing how often either extension is enabled on systems running WP.

Recommendation

I'd strongly recommend option 2, but would like to hear the opinions of additional Core devs.

Action lists

General

  • [ ] Report the issue to GetID3

Action list for option 1

  • [ ] Polyfill the functions.
  • [ ] Review the uses of the functions in WP Core anyhow to see if those could/should be removed/the code using them should be refactored.
  • [ ] Add a note about the polyfills in a dev-note with a recommendation for plugin/theme authors to review their use of these functions anyhow.

Action list for option 2

  • [ ] Make the MbString a requirement for installing WP/in the WP bootstrapping.
  • [ ] Change the MbString extension from optional to required in the Site Health component.
  • [ ] Remove the current MbString related polyfills from the compat.php file.
  • [ ] Review the uses of the functions in WP Core and replace with more appropriate alternatives.
  • [ ] Add a note about the deprecation in the PHP 8.2 dev-note with a recommendation for plugin/theme authors to review their use of these functions and noting that the MbString extension can be relied upon to be available (as of WP 6.1).

Change History (44)

#1 follow-up: @SergeyBiryukov
17 months ago

Thanks for the ticket!

For option 2, I think we'd need to also check the required PHP modules during installation, to avoid fatal errors. As far as I can tell, marking a module as required in Site Health displays a warning on existing sites if the module is unavailable, but does not block the installation for new sites.

#2 in reply to: ↑ 1 @jrf
17 months ago

  • Description modified (diff)

Replying to SergeyBiryukov:

For option 2, I think we'd need to also check the required PHP modules during installation, to avoid fatal errors. As far as I can tell, marking a module as required in Site Health displays a warning on existing sites if the module is unavailable, but does not block the installation for new sites.

Very good point! I always forget that those are two different lists of checks.
I've added it to the action list.

Might also be a good time to improve the code re-usability in that regards (have one master list of requirements and recommendations, which both the requirements checker on installation/load + the Site Health component draw from).

#3 @ayeshrajans
16 months ago

Not a core developer, but I'd strongly prefer the option 2, too.

I apologize if I missed any part of the conversation, but I was wondering if we considered using symfony/symfony/polyfill-mbstring (contains 36 mb_ functions) to replace our two functions in compat.php. It would work as a stop-gap, and we can add a note if we are polyfill-ing mbstring asking the user to enable mbstring extension.

In case polyfilled, symfony/symfony/polyfill-mbstring adds about 1,000 LOCs to the code base, plus some unicode data to the WordPress code base, but it's a maintained library and we might eventually use it as a composer dependency rather than manually bundling it.

#4 @jrf
16 months ago

Hi @ayeshrajans Thanks for your input.

I did consider the Symfony polyfills, but as the Symfony polyfills have a minimum requirement of PHP 7.1 and the MbString extensions has been a recommended extension via WP Site Health for quite a while, I think requiring the extension should be the preferred option.

To me, the PHP 7.1 requirement is a blocker as that would mean that WP would have to use an old version of the polyfills which will not get any updates, which is asking for trouble.
As for raising the minimum PHP version of WP, that's a whole other discussion and I have little hope of that happening in time for the release of PHP 8.2.

Other than that: Symfony polyfills is licensed under the MIT license and I'm not sure how well that plays together with the WordPress GPL license. That would need some advise from legal.

This ticket was mentioned in Slack in #core by costdev. View the logs.


16 months ago

This ticket was mentioned in Slack in #core by audrasjb. View the logs.


16 months ago

#7 @Chouby
16 months ago

@jrf In the eventuality that you would like to use the Symfony polyfills, the license should not be an obstacle. The license is listed as compatible with the GPL by GNU itself. See https://www.gnu.org/licenses/license-list.en.html#Expat.
As there is an ambiguity for the name "MIT license", you can check that the Symfony license (https://symfony.com/doc/current/contributing/code/license.html) uses exactly the same terms as the Expact (or MIT) license: https://directory.fsf.org/wiki/License:Expat.

#8 follow-ups: @TimothyBlynJacobs
16 months ago

What does making the extension require look like for servers that don't currently have mbstring installed? Do we have the technology to ensure they stay on 6.0.x? Is it that they'll see the 6.1 update as available, but if they try to update to it, they'll get an error message?

#9 in reply to: ↑ 8 @SergeyBiryukov
15 months ago

Replying to TimothyBlynJacobs:

What does making the extension require look like for servers that don't currently have mbstring installed? Do we have the technology to ensure they stay on 6.0.x? Is it that they'll see the 6.1 update as available, but if they try to update to it, they'll get an error message?

Not at this time, see comment:1. We should check for the required PHP extensions both during installation and updates, similar to how we check for the required PHP and MySQL versions. I have created #56017 for that.

#10 @ayeshrajans
15 months ago

Thanks a lot for the reply @jrf, you are absolutely right about the minimum version requirement in symfony/polyfill, that really deal breaker.

One more imperfect approach I could think of is adding WP's own equivalents (say, iso8859_1_to_utf8) based on symfony/polyfill-php72 (which provides user-land polyfills) to utf8_encode and utf8_decode if we cannot afford to require ext-mbstring yet. But this again comes with the MIT license legality, and is inherently a bad idea.

#11 @ayeshrajans
15 months ago

As for the two utf8_encode calls in wp_read_image_metadata and wxr_cdata functions, they are wrapped with !seems_utf8() clauses, which make believe core's utf8_encode calls are wrong too because there is no guarantee the source strings are always ISO-8859-1.

This ticket was mentioned in Slack in #hosting-community by javier. View the logs.


15 months ago

This ticket was mentioned in Slack in #hosting-community by jadonn. View the logs.


15 months ago

This ticket was mentioned in Slack in #core-php by jrf. View the logs.


15 months ago

This ticket was mentioned in Slack in #core by audrasjb. View the logs.


15 months ago

This ticket was mentioned in Slack in #hosting-community by javier. View the logs.


15 months ago

This ticket was mentioned in Slack in #hosting-community by chaion07. View the logs.


15 months ago

This ticket was mentioned in Slack in #hosting-community by jadonn. View the logs.


15 months ago

This ticket was mentioned in Slack in #hosting-community by javier. View the logs.


15 months ago

This ticket was mentioned in Slack in #hosting-community by amykamala. View the logs.


15 months ago

This ticket was mentioned in Slack in #core by chaion07. View the logs.


14 months ago

This ticket was mentioned in Slack in #core by audrasjb. View the logs.


14 months ago

#23 @audrasjb
14 months ago

Hello there, as per today's bug scrub, just wanted to mention that we'll give this ticket a few days before moving in to Future Release, as it is marked as early and as we're just one month away from 6.1 beta 1 :)

#24 @jrf
14 months ago

@audrasjb Thanks for the heads-up. The problem is not a patch, the problem is that we need more people speaking up to support adding MbString (or iconv or Intl) as a required extension.

I personally would find it pretty problematic if this gets moved to a future release, as it will make it significantly harder for plugins/themes to get ready for PHP 8.2.

This ticket was mentioned in Slack in #core by chaion07. View the logs.


14 months ago

This ticket was mentioned in Slack in #core by audrasjb. View the logs.


13 months ago

#27 @audrasjb
13 months ago

  • Keywords early removed

As per today's bugscrub, let's remove the early keyword and raise this ticket in the next devchat.

This ticket was mentioned in Slack in #core by abhanonstopnews. View the logs.


13 months ago

This ticket was mentioned in Slack in #core by hellofromtonya. View the logs.


13 months ago

This ticket was mentioned in Slack in #hosting-community by costdev. View the logs.


13 months ago

This ticket was mentioned in Slack in #core by chaion07. View the logs.


12 months ago

#32 @chaion07
12 months ago

  • Keywords dev-feedback added

Thanks @jrf for reporting this. We reviewed the ticket during a recent bug-scrub session. Based on the feedback received we are making the following changes:

  1. Adding the dev-feedback for additional attention
  2. We also recommend this ticket to be discussed during devchat if possible.

Cheers!

Props to @mukesh27 & @robinwpdeveloper

#33 @costdev
12 months ago

@jrf @SergeyBiryukov is #56017 a blocker for Option 2, or a nice-to-have enhancement?

This ticket was mentioned in Slack in #core by abhanonstopnews. View the logs.


12 months ago

This ticket was mentioned in Slack in #core by jeffpaul. View the logs.


12 months ago

#36 in reply to: ↑ 8 @desrosj
12 months ago

Replying to TimothyBlynJacobs:

What does making the extension require look like for servers that don't currently have mbstring installed? Do we have the technology to ensure they stay on 6.0.x? Is it that they'll see the 6.1 update as available, but if they try to update to it, they'll get an error message?

I wanted to get clarification on the answer to this question.

There is precedent for blocking a site from upgrading when a PHP extension is missing. In WordPress 5.2, the native JSON extension was marked as required (see [46455]). When an upgrade was attempted and a site did not have this extension loaded, an error was shown and the update was cancelled.

I know it's not ideal to add a bunch of on-off extension checks in the updater, but these changes are rare and they could be consolidated into a better solution through #56017 once solved.

I'm not yet advocating for or against this being in 6.1, but I want to make sure I'm not missing something when to comes to Core having the ability to "block" an update if a site lacks a certain PHP extension.

One other thing to note is that an error code specific to the scenario where the JSON extension missing is returned, which allows the number of sites to being blocked from an update to be tracked in .org's mission control to gauge whether this deprecation ends up being problematic or not (see [46560]).

#37 @audrasjb
12 months ago

  • Milestone changed from 6.1 to 6.2

With WP 6.1 RC 1 scheduled today (Oct 11, 2022), there is not much time left to address this ticket. Let's move it to the next milestone.

This ticket was mentioned in Slack in #hosting-community by jrf. View the logs.


11 months ago

This ticket was mentioned in Slack in #core by mukeshpanchal27. View the logs.


8 months ago

#40 @hellofromTonya
7 months ago

  • Milestone changed from 6.2 to 6.3
  • Owner set to hellofromTonya
  • Status changed from new to assigned

No additional work or discussion has happened in the 6.2 cycle. With Beta 4 next week and then RC1 the week after, moving this ticket to 6.3. As I'm planning to focus efforts on PHP 8+ compatibility during this next cycle, self-assigning to help shepherd it forward.

This ticket was mentioned in Slack in #core by chaion07. View the logs.


3 months ago

#42 @costdev
3 months ago

This ticket was reviewed during the recent bug scrub. @hellofromTonya Has there been any further work on this during the cycle that hasn't been added to the ticket yet, or will there be?

Props to @chaion07 for discussion.

This ticket was mentioned in Slack in #core by mukeshpanchal27. View the logs.


2 months ago

#44 @chaion07
2 months ago

  • Milestone changed from 6.3 to 6.4

We reviewed this ticket once again during a recent bug-scrub session and based on the feedback we are updating the milestone to 6.4. Cheers!

Props to @mukesh27 & @hareesh-pillai

Note: See TracTickets for help on using tickets.