Make WordPress Core

Opened 11 months ago

Closed 7 months ago

Last modified 7 months ago

#59656 closed enhancement (fixed)

Merge Performant Translations (Ginger MO)

Reported by: swissspidy's profile swissspidy Owned by: swissspidy's profile swissspidy
Milestone: 6.5 Priority: high
Severity: normal Version:
Component: I18N Keywords: has-patch has-unit-tests has-dev-note
Focuses: performance Cc:

Description (last modified by swissspidy)

Over the past half a year or so, the core performance team spent significant time on extensively analyzing i18n performance in WordPress and finding ways to improve it. This cumulated in the Performance Translations feature plugin, which uses PHP files in favor of MO files for translations, which is much faster and benefits from OPcache as well.

This ticket tracks the suggestion of merging Performant Translations, which was originally developed by @dd32 under the name Ginger MO, into WordPress core.

View related make/core post: https://make.wordpress.org/core/2023/11/08/merging-performant-translations-into-core/

Performant Translations supports multiple file formats (.mo and .php), as well as multiple text domains and locales loaded at the same time.

This means:

  • Loading .mo files as usual will be much faster and use less memory
  • If an .mo translation file has a corresponding .php file, that file will be loaded instead, making things even faster and use even less memory
  • When switching locales, we no longer have to unload the previous translations, but can instead keep them in memory, making locale switching much cheaper

A quick comparison:

Locale Scenario Memory Usage Load Time
en_US Default 14.46 MB 124.66 ms
de_DE Default 27.96 MB 173.44 ms
de_DE Performant Translations 15.62 MB 132.60 ms

As you can see, translations used to massively slow down a site, but with this new library, there is little to no additional overhead.

Since this uses a new file format for translations, some changes are required to add support for it in related places such as translate.wordpress.org.

Relevant companion tickets:

Everything is 100% backward compatible and if there are no PHP translation files, MO files will be loaded as usual.


Related: this is an alternative to #17268

Attachments (2)

wordpress-6-5-translations-issue (2).png (244.8 KB) - added by oglekler 7 months ago.
plugin-activate-message-in-english.png (64.2 KB) - added by oglekler 7 months ago.

Download all attachments as: .zip

Change History (65)

This ticket was mentioned in Slack in #core-performance by clarkeemily. View the logs.


11 months ago

#2 @swissspidy
11 months ago

  • Description modified (diff)

#3 @flixos90
11 months ago

  • Priority changed from normal to high

Marking this as a high priority for the 6.5 due to the significant estimated impact this will have for localized sites' performance.

This ticket was mentioned in Slack in #core-performance by mukeshpanchal27. View the logs.


11 months ago

#5 @swissspidy
10 months ago

In 57083:

Build/Test Tools: Expand performance test scenarios.

Adds new tests for localized sites as well as the dashboard.
Also amends Server-Timing output to measure memory usage in all scenarios.

Props swissspidy, joemcgill, flixos90, mukesh27, mamaduka.
See #59656.
Fixes #59815.

This ticket was mentioned in Slack in #core-performance by mukeshpanchal27. View the logs.


10 months ago

#7 @joemcgill
10 months ago

The PR for this isn't being picked up by the GH -> Trac connection. It's linked from the ticket description, but for anyone else having trouble finding it, the PR is: https://github.com/WordPress/wordpress-develop/pull/5306

#8 follow-up: @swissspidy
10 months ago

@joemcgill I think it‘s because it‘s still a draft

#9 in reply to: ↑ 8 @joemcgill
10 months ago

Replying to swissspidy:

@joemcgill I think it‘s because it‘s still a draft

I've seen other draft PRs get picked up in tickets, like this one on #59600. I suspect the additional URLs in your PR description.

#10 @swissspidy
10 months ago

  • Description modified (diff)

#11 @akirk
10 months ago

Since this adds the ability to include a PHP file generated elsewhere, I am worried that this potentially creates a vector for putting malicious code in translation files that didn't exist before.

The contents of the PHP file is predictable, we could parse it easily using PHP's token_get_all() function. We could introduce a "secure" mode where the file would be checked before it is included.

Although, since this is about performance, we likely don't want to add a performance penalty. Thus we could try and use a checksum to ensure the file was not changed after it has been checked.

What do you think?

#12 @swissspidy
10 months ago

Note that the security aspect has been previously mentioned in the corresponding i18n performance analysis post.
The main takeaway is that installing translations is no different than installing a plugin or theme, which can also execute arbitrary code. And WordPress has always considered translations to be trusted.
The post does also mention a static analysis or checksum check for added safety, though doing that at runtime (vs. at install time) will negate the performance wins again. The latter would require additional infrastructure for storing and retrieving checksums too.
Right now the "secure" mode is using the translation_file_format filter to disable PHP file usage.

This ticket was mentioned in Slack in #core-i18n by swissspidy. View the logs.


10 months ago

This ticket was mentioned in PR #5306 on WordPress/wordpress-develop by @swissspidy.


10 months ago
#14

Trac ticket: https://core.trac.wordpress.org/ticket/59656

What it does

  • If an MO file has a corresponding PHP file, that file is loaded instead.

What it does not do (notable changes from the plugin)

  • Does not automatically convert any MO files to PHP files upon reading.
    • This avoids any FS interaction on regular requests, avoiding any unexpected results.
    • Makes the logic much simpler.
    • This additional feature can stay in the Performant Translations plugin for those who wish to use it.
  • No integration with Language_Pack_Upgrader to automatically convert MO files to PHP files.
    • This was only needed because WordPress.org didn't yet serve PHP files in language packs.
    • If someone is using a custom translation platform like Traduttore, it is up to them to provide PHP files as well (if they want to)

Related

This ticket was mentioned in Slack in #core-performance by mukeshpanchal27. View the logs.


10 months ago

@swissspidy commented on PR #5306:


8 months ago
#16

@westonruter For the return types I wasn't actually sure whether they are allowed in core, which is why I removed them at one point.

But seems like they are: https://developer.wordpress.org/coding-standards/wordpress-coding-standards/php/#type-declarations - just the docs need a little updating now that PHP 7 is the required minimum

#17 @swissspidy
8 months ago

  • Resolution set to fixed
  • Status changed from assigned to closed

In 57337:

I18N: Introduce a more performant localization library.

This introduces a more lightweight library for loading .mo translation files which offers increased speed and lower memory usage.
It also supports loading multiple locales at the same time, which makes locale switching faster too.

For plugins interacting with the $l10n global variable in core, a shim is added to retain backward compatibility with the existing pomo library.

In addition to that, this library supports translations contained in PHP files, avoiding a binary file format and leveraging OPCache if available.
If an .mo translation file has a corresponding .l10n.php file, the latter will be loaded instead.
This behavior can be adjusted using the new translation_file_format and load_translation_file filters.

PHP translation files will be typically created by downloading language packs, but can also be generated by plugins.
See https://make.wordpress.org/core/2023/11/08/merging-performant-translations-into-core/ for more context.

Props dd32, swissspidy, flixos90, joemcgill, westonruter, akirk, SergeyBiryukov.
Fixes #59656.

#19 @swissspidy
8 months ago

In 57338:

I18N: Add missing variable in string replacement.

Ensures the preferred file name for lookup has the correct extension.

Follow-up to [57337].
See #59656.

#20 @swissspidy
8 months ago

In 57339:

I18N: Improve edge case handling in WP_Translation_Controller.

Prevents PHP warnings for possibly undefined array keys.
Also fixes incorrect @covers annotations.

Follow-up to [57337].
See #59656.

@swissspidy commented on PR #5306:


8 months ago
#21

@mukeshpanchal27 thanks for the feedback. just fyi it's easy to miss feedback on closed PRs, I almost didn't see your comments because I didn't get any notification. Best to ping separately.

#22 @swissspidy
8 months ago

In 57344:

I18N: Improve docblocks after [57337].

Props mukesh27.
See #59656.

@mukesh27 commented on PR #5306:


8 months ago
#23

Yes, but it's nit-pick feedback, so I missed pinging you.

#24 @swissspidy
8 months ago

  • Keywords needs-dev-note added

#25 @swissspidy
8 months ago

In 57350:

I18N: Rename WP_Translation_Controller::instance() method to get_instance().

This improves consistency as get_instance() is more commonly used in core.

See #59656.

#26 @swissspidy
8 months ago

In 57381:

I18N: Ensure .l10n.php files are deleted when upgrading language packs.

Props amieiro.
See #59656.

#27 @swissspidy
8 months ago

In 57382:

I18N: Delete .l10n.php files when deleting a theme.

Follow-up to [57337] where this was already added for plugins.

See #59656.

#28 @swissspidy
8 months ago

In 57386:

I18N: Improve singular lookup of pluralized strings.

Ensures that looking up a singular that is also used as a pluralized string works as expected.
This improves compatibility for cases where for example both __( 'Product' ) and _n( 'Product', 'Products’, num ) are used in a project, where both will use the same translation for the singular version.

Although such usage is not really recommended nor documented, it must continue to work in the new i18n library in order to maintain backward compatibility and maintain expected behavior.

See #59656.

#29 @swissspidy
8 months ago

In 57387:

I18N: Add missing space after foreach keyword.

Follow-up to [57386].

See #59656.

#30 @swissspidy
8 months ago

In 57504:

I18N: Load new translation library in wp_load_translations_early().

Ensures localization continues to work as expected with the new library in case
translations need to be loaded early in the process.

See #59656.

#31 @swissspidy
8 months ago

In 57505:

I18N: Revert [57386] pending further investigation.

Reverts the change for fallback string lookup due to a performance regression in the bad case scenario.

See #59656.

#32 @Chrystl
8 months ago

  • Resolution fixed deleted
  • Status changed from closed to reopened

@swissspidy With WordPress 6.5-alpha-57505, I noticed that the .php file works only if the .mo file is present. If I remove the .mo file, the .php file is no longer loaded.

Steps to reproduce the issue:

  1. Update WordPress to the nightly version, specifically 6.5-alpha-57505.
  2. Activate the Twenty Twenty-One theme; no plugins are activated.
  3. Go to Settings > General > set the Site Language to "French."
  4. Visit your "Hello world" post on the front end and check that you see "Laisser un commentaire" as expected.
  5. Go to https://translate.wordpress.org/projects/wp-themes/twentytwentyone/fr/default/ and download the wp-themes-twentytwentyone-fr.l10n.php file.
  6. Edit your file, modify "Laisser un commentaire" to "Laisser un commentaire - Test," and save your changes.
  7. Rename your file to twentytwentyone-fr_FR.l10n.php, then move it to wp-content/languages/themes.
  8. Visit your "Hello world" post on the front end and check that you now see "Laisser un commentaire - Test" as expected.

=> If you rename the twentytwentyone-fr_FR.po file, the twentytwentyone-fr_FR.l10n.php file is still loaded.
=> If you rename the twentytwentyone-fr_FR.mo file, the twentytwentyone-fr_FR.l10n.php file is no longer loaded.

This ticket was mentioned in PR #6000 on WordPress/wordpress-develop by @swissspidy.


8 months ago
#33

Trac ticket:

#34 @swissspidy
8 months ago

In 57513:

I18N: Improve singular lookup of pluralized strings.

Ensures that string lookup in MO files only uses the singular string.

This matches expected behavior with gettext files and improves compatibility for cases where for example both __( 'Product' ) and _n( 'Product', 'Products’, num ) are used in a project, where both will use the same translation for the singular version. Maintains backward compatibility and feature parity with the pomo library and the PHP translation file format.

Replaces [57386], which was reverted in [57505], with a more accurate and performant solution.

See #59656.

#36 @swissspidy
8 months ago

@Chrystl That is currently the expected behavior, with .mo files remaining the source of truth as there are many places in core and plugins that expect .mo files to exist. .l10n.php is sort of the more performant variant that takes precedence if it's available.

I was planning on iterating on that in a future release. That said, I see how it can be confusing. Given that we already made some related i18n changes in #58919 anyway, it might be best to address this already in 6.5. I'll take another look at this!

This ticket was mentioned in PR #6004 on WordPress/wordpress-develop by @swissspidy.


8 months ago
#37

@swissspidy commented on PR #6004:


8 months ago
#38

FYI @Chrystll

#39 @Chouby
8 months ago

@swissspidy First of all thank you for all the effort you are putting in getting this huge performance improvement.

However I notice a possible issue with the translation of plurals
This test done with the Slovenian translation of WordPress used to pass with older versions of WP and doesn't pass with current alpha (57514):

public function test_translate_plurals() {
        load_textdomain( 'default', TEST_DATA_DIR . 'wp-dev-sl.mo' );

        $this->assertSame( '%s razpoložljiva posodobitev', _n( '%s update available', '%s updates available', 101 ) ); // 1, 101, 201
        $this->assertSame( '%s razpoložljivi posodobitvi', _n( '%s update available', '%s updates available', 102 ) ); // 2, 102, 202
        $this->assertSame( '%s razpoložljive posodobitve', _n( '%s update available', '%s updates available', 103 ) ); // 3, 4, 103
        $this->assertSame( '%s razpoložljivih posodobitev', _n( '%s update available', '%s updates available', 5 ) ); // 0, 5, 6
}

Last edited 8 months ago by Chouby (previous) (diff)

This ticket was mentioned in PR #6005 on WordPress/wordpress-develop by @swissspidy.


8 months ago
#40

Trac ticket:

#41 @swissspidy
8 months ago

Thanks @Chouby! Can't believe I missed that. Turns out there was a silly mistake in parsing plural forms. I'm working on a fix now in https://github.com/WordPress/wordpress-develop/pull/6005 and will commit it soon.

Please let me know if you encounter any other oddities. The more testing the better.

#42 @swissspidy
8 months ago

In 57516:

I18N: Support loading .l10n.php translation files on their own.

Adjusts the translation file lookup in WP_Textdomain_Registry so that just-in-time translation loading
works even if there is only a .l10n.php translation file without a corresponding .mo file.

While language packs continue to contain both file types, this makes it easier to use translations in a project
without having to deal with .mo or .po files.

Props Chrystl.
See #59656.

#46 @swissspidy
8 months ago

In 57518:

I18N: Fix plural forms parsing in WP_Translation_File.

Ensures the plural expression from the translation file header is correctly parsed.
Prevents silent failures in the attempt to create the plural form function.

Adds additional tests.

Props Chouby.
See #59656.

#48 @swissspidy
8 months ago

In 57519:

I18N: Add type declaration to new method missed in [57518].

See #59656.

#49 @swissspidy
7 months ago

  • Resolution set to fixed
  • Status changed from reopened to closed

#50 @swissspidy
7 months ago

In 57639:

I18N: Prevent incorrect language dropdown entries when there are .l10n.php files.

In [57516], the just-in-time translation loading logic was enhanced to support cases where only .l10n.php translation exist but no .mo or .po files. This caused a slight regression in get_available_languages(), which uses the list of files to populate the language dropdown list on the settings page.

To address this, the new file extension is now properly stripped off, and the resulting file list is de-duplicated. New test files are added to allow the existing tests to cover this new scenario.

See #59656.
Fixes #60553.

This ticket was mentioned in Slack in #core-performance by joemcgill. View the logs.


7 months ago

#52 @oglekler
7 months ago

Hi @swissspidy I have an issue with the translation when the user language is not the same as the site language:

Environment

  • WordPress: 6.5-beta2
  • Translations are updated

Steps to Reproduce

  1. Choose site language: German
  2. Choose user language: Russian
  3. Note that some of the strings in the interface are not translated
  4. Choose site language: Russian
  5. Check that translations are in place.

The same is true for the other languages I tried.

This ticket was mentioned in PR #6174 on WordPress/wordpress-develop by @swissspidy.


7 months ago
#53

Fixes issues when site & user language are different and then the loaded translations are associated with the wrong locale, causing translations to be missing.

Trac ticket: https://core.trac.wordpress.org/ticket/59656

#54 @swissspidy
7 months ago

@oglekler Thanks a lot for your report! I was able to reproduce this. Can you please check whether https://github.com/WordPress/wordpress-develop/pull/6174 fixes this for you? Your help is much appreciated!

#55 @oglekler
7 months ago

@swissspidy Yes, it looks like it is fixed 🙌 I didn't manage to find anything that isn't matching. There are strings that should be translated due to changes, but this is ok 🙂

#56 @swissspidy
7 months ago

In 57704:

I18N: Do not set translation controller locale in bootstrap.

Removes the WP_Translation_Controller::set_locale() call from wp-settings.php, which happened before the current user was loaded.
That caused translations to be missing when the site locale and user locale were different, as the translation was associated with the wrong locale.

Turns out this call was not needed at all, as the locale will be set/updated when calling load_textdomain() anyway.

Props oglekler.
See #59656.

#58 @oglekler
7 months ago

I am sorry, @swissspidy,
I have no idea where this is comming from: I am installing plugins, and after installation, I am getting the message 'Activate' in English. Reload is adding the message in the correct language. Themes installation is not getting such results; there is the right language. 6.4 don't have such behavior.

Environment

  • WordPress: 6.5-beta2
  • Checked Site languages: Russian/German, User language: Site Default.

#59 @swissspidy
7 months ago

Ah, very interesting! Looks like an issue with the JavaScript translation. Thanks for flagging. I'll check it out ASAP.

#60 @swissspidy
7 months ago

@oglekler Quick update: This looks to be caused by the plugin dependencies work.

WordPress used to just use the string __( 'Activate' ) for both plugins and themes. But now after installing a plugin, its dependencies are checked, and it now uses a new string _x( 'Activate', 'plugin' ) that hasn't been used in core before.

So this is simply a new string that will need to be translated for 6.5. Not a bug.

That said, it's not ideal that we have both __( 'Activate' ) and _x( 'Activate', 'plugin' ) in core. Not sure if this string really needs the context.

For comparison, we have _x( 'Activate %s', 'plugin' ) and _x( 'Activate %s', 'theme' ) in core, but not __( 'Activate %s' ). So I suppose the context is useful.

I'll open a new ticket to make this more clear.

Aside: the plugin dependencies work also added incorrect function calls such as _x( 'Network Activate' ) without any context, which is wrong too.

I'll address this too.

/cc @costdev simply FYI

#61 @oglekler
7 months ago

Thank you @swissspidy 🙏❤️🔥

This ticket was mentioned in Slack in #core-performance by spacedmonkey. View the logs.


7 months ago

Note: See TracTickets for help on using tickets.