Make WordPress Core

Opened 22 months ago

Closed 12 days ago

#60362 closed defect (bug) (fixed)

Emojis cause "Updating failed. Could not update post in the database." error

Reported by: mikefitzman's profile mikefitzman Owned by: westonruter's profile westonruter
Milestone: 6.9 Priority: normal
Severity: normal Version: 4.2
Component: Database Keywords: has-test-info needs-testing has-patch commit
Focuses: Cc:

Description

Hi guys

I recently discovered what I think is a bug, 'confirmed' by the folks at Brainstorm Force.

As of 26 January, I was suddenly unable to update certain posts and pages in WordPress. When I tried to save, I received the "Updating failed" error, but had made no backend or other changes to spontaneously create this error. I use Gutenberg.

I stumbled across a GitHub thread about someone with a lightning emoji in a post with the same problem. With this in mind, I removed all emojis from the pages and posts that could not save and sure enough, all functionality was restored to normal after doing so. I verified this several times by re-introducing emoijis, which reintroduced the error.

I've reported this to my host, Brainstorm Force (Astra theme devs) and now here. I hope it is of some assistance.

Regards

Mike

liquidaudio.com.au

Change History (26)

#1 @mikefitzman
22 months ago

Sorry, my first post here and suggested by Brainstorm Force. I should add that I tried all suggested troubleshooting including deactivating plugins, clearing caches, checking REST API functionality, confirming database health and checking with my hosting provider.

Mike

#3 @dmsnell
22 months ago

@mikefitzman can you check on which encoding your database is using? if it's storing content with the utf8mb3 encoding then it cannot represent characters which require four bytes in utf8.

you can test this out indirectly by saving a post separately with each of the following two characters:

  • 😄 (U+1F604) requires four bytes to encode in utf8, this should fail if the database encoding is utf8mb3
  • (U+2684) requires only three bytes to encode in utf8, a post should save with this character in it.

ideally, for many reasons, it would be preferable to store your database content as utf8mb4.

#4 @mikefitzman
22 months ago

@dmsnell Great suggestion, I've completed the test and cannot save posts containing the '😄'. Posts containing the other '⚄' character save normally. I'm wondering why my ability to save these 4-byte emojis changed overnight. I've had them in posts for a decade or more.

Mike

#5 @dmsnell
22 months ago

@mikefitzman you will want to figure out how to convert your database into utf8mb4. it's possible, though unlikely, that in the past those were being stored as their HTML numeric character references. For example, U+1F604 can be saved in HTML using only basic ASCII characters as 😄 or as 😄.

Another possibility which is more likely is that there were some glyphs that were saving because they only require three bytes, as did. Another example is the ❤️, which only requires three bytes in utf8.

#6 @mikefitzman
22 months ago

Very interesting. I'd hate to break my site, so I may not try to convert the database unless I find a method that looks safe enough for me to try. Makes me think that Hostinger changed something without saying so. I wonder if this will be resolved through code changes...?

#7 @salzano
22 months ago

I think this is the same bug but with a different behavior because there is no error message.

When using the Full Site Editor to edit a Template Part, the Save button appears to work, but no changes are saved after pasting in the US flag emoji into the template 🇺🇸. My database collation is utf8mb4_unicode_ci.

Here are steps to reproduce on 6.4.3 and using the free FotaWP theme:

  1. Go to Appearance > Editor > Patterns > Template Parts > Footer.
  2. Paste the US flag emoji into a paragraph block in the template.
  3. Press Save twice.
  4. View a page that uses the Footer Template Part, see no changes.

This ticket was mentioned in Slack in #core-test by ankit-k-gupta. View the logs.


22 months ago

#9 @Ankit K Gupta
22 months ago

  • Keywords has-testing-info needs-testing needs-screenshots added

#10 @nataliat2004
22 months ago

I was not able to reproduce this issue with and without Gutenberg activated. @mikefitzman , @salzano could you please check if instruction from this comment will help https://github.com/WordPress/gutenberg/issues/19552#issuecomment-1758798993?

#11 @salzano
22 months ago

OK. I deactivated all plugins. I still can't save a flag emoji in a Template Part.

I found that my tables were not all the same collation. I ran the query SHOW TABLE STATUS and found they looked like this:

wp_cf_form_entries	utf8mb4_unicode_520_ci
wp_cf_form_entry_meta	utf8mb4_unicode_520_ci
wp_cf_form_entry_values	utf8mb4_unicode_520_ci
wp_cf_forms	utf8mb4_unicode_520_ci
wp_cf_pro_messages	utf8mb4_unicode_520_ci
wp_cf_queue_failures	utf8mb4_unicode_520_ci
wp_cf_queue_jobs	utf8mb4_unicode_520_ci
wp_commentmeta	utf8mb3_general_ci
wp_comments	utf8mb3_general_ci
wp_edd_adjustmentmeta	utf8mb4_unicode_520_ci
wp_edd_adjustments	utf8mb4_unicode_520_ci
wp_edd_customer_addresses	utf8mb4_unicode_520_ci
wp_edd_customer_email_addresses	utf8mb4_unicode_520_ci
wp_edd_customermeta	utf8mb3_general_ci
wp_edd_customers	utf8mb3_general_ci
wp_edd_license_activations	utf8mb3_general_ci
wp_edd_licensemeta	utf8mb3_general_ci
wp_edd_licenses	utf8mb3_general_ci
wp_edd_logmeta	utf8mb4_unicode_520_ci
wp_edd_logs	utf8mb4_unicode_520_ci
wp_edd_logs_api_requestmeta	utf8mb4_unicode_520_ci
wp_edd_logs_api_requests	utf8mb4_unicode_520_ci
wp_edd_logs_file_downloadmeta	utf8mb4_unicode_520_ci
wp_edd_logs_file_downloads	utf8mb4_unicode_520_ci
wp_edd_notemeta	utf8mb4_unicode_520_ci
wp_edd_notes	utf8mb4_unicode_520_ci
wp_edd_notifications	utf8mb4_unicode_520_ci
wp_edd_order_addresses	utf8mb4_unicode_520_ci
wp_edd_order_adjustmentmeta	utf8mb4_unicode_520_ci
wp_edd_order_adjustments	utf8mb4_unicode_520_ci
wp_edd_order_itemmeta	utf8mb4_unicode_520_ci
wp_edd_order_items	utf8mb4_unicode_520_ci
wp_edd_order_transactions	utf8mb4_unicode_520_ci
wp_edd_ordermeta	utf8mb4_unicode_520_ci
wp_edd_orders	utf8mb4_unicode_520_ci
wp_gf_addon_feed	utf8mb4_unicode_520_ci
wp_gf_draft_submissions	utf8mb4_unicode_520_ci
wp_gf_entry	utf8mb4_unicode_520_ci
wp_gf_entry_meta	utf8mb4_unicode_520_ci
wp_gf_entry_notes	utf8mb4_unicode_520_ci
wp_gf_form	utf8mb4_unicode_520_ci
wp_gf_form_meta	utf8mb4_unicode_520_ci
wp_gf_form_revisions	utf8mb4_unicode_520_ci
wp_gf_form_view	utf8mb4_unicode_520_ci
wp_gf_rest_api_keys	utf8mb4_unicode_520_ci
wp_links	utf8mb3_general_ci
wp_nextend2_image_storage	utf8mb3_general_ci
wp_nextend2_section_storage	utf8mb3_general_ci
wp_nextend2_smartslider3_generators	utf8mb3_general_ci
wp_nextend2_smartslider3_sliders	utf8mb3_general_ci
wp_nextend2_smartslider3_sliders_xref	utf8mb3_general_ci
wp_nextend2_smartslider3_slides	utf8mb3_general_ci
wp_options	utf8mb3_general_ci
wp_p2p	utf8mb4_unicode_520_ci
wp_p2pmeta	utf8mb4_unicode_520_ci
wp_postmeta	utf8mb3_general_ci
wp_posts	utf8mb3_general_ci
wp_redirection_404	utf8mb4_unicode_520_ci
wp_redirection_groups	utf8mb4_unicode_520_ci
wp_redirection_items	utf8mb4_unicode_520_ci
wp_redirection_logs	utf8mb4_unicode_520_ci
wp_term_relationships	utf8mb3_general_ci
wp_term_taxonomy	utf8mb3_general_ci
wp_termmeta	utf8mb3_general_ci
wp_terms	utf8mb3_general_ci
wp_usermeta	utf8mb3_general_ci
wp_users	utf8mb3_general_ci

I used the linked github page to run queries to change them. Every table in the database is now utf8mb4_unicode_ci. I still can't save the flag emoji in a Template Part.

#12 @mikefitzman
21 months ago

My apologies guys, I'd love to try this but I don't know PHP or have a staging capability to do this without worrying about bricking my site. I still have the issue BTW, no 4-byte emojis are possible for me.

#13 @dotdoc
10 months ago


Last edited 10 months ago by dotdoc (previous) (diff)

#14 @josklever
10 months ago

Today I've spent some time investigating this issue on a website. Some of the tables were indeed using utf8mb3 collations, so I changed them to utf8mb4_unicode_ci, but no success.

So I found a code snippet where the wp_encode_emoji() function was used in a filter to fix this and that works.

I've saved it as a small plugin: https://github.com/JosKlever/emoji-fix

This client always used emojis until it didn't work anymore (posts with emojis couldn't be saved again.

Almost at the same time I got a report from another client with the same issue. They have the same webhost, so that might be related. I've found info on the web about upgrades of database software (in this case MariaDB 10.6) that might have caused this. But I didn't really understand it enough to explain here.

#15 @wordpressdotorg
6 months ago

  • Keywords has-test-info added; has-testing-info removed

#16 @slaFFik
3 months ago

I was shot in a foot today with this problem, some commands I ran to fix the issue.

List all tables that are not utf8mb4:

SELECT TABLE_SCHEMA, TABLE_NAME, COLUMN_NAME, CHARACTER_SET_NAME, COLLATION_NAME
FROM information_schema.COLUMNS
WHERE TABLE_SCHEMA = 'DATABASE_NAME'
  AND CHARACTER_SET_NAME <> 'utf8mb4';

Don't forget to change DATABASE_NAME to your database name.

---

Here is a wp-cli command to fix it on a per table basis:

wp db query "ALTER TABLE wp_posts CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;"

Don't forget to change the wp_posts table to the one you want to fix.

---

You can also fix that in a loop for all tables using a bash script:

for t in $(wp db tables); do
  wp db query "ALTER TABLE $t CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;"
done

---

But ideally a new if check(s) should be added to \wpdb::process_fields() and/or \wpdb::strip_invalid_text() methods that will notify specifically about emojis.

#17 @kasparsd
3 weeks ago

  • Keywords needs-patch added; needs-screenshots removed

I was able to replicate the issue with a fresh WP 6.8.2 + mysql v8 and importantly define( 'DB_CHARSET', 'utf8mb3' ); in wp-config.php before running the WP install.

Recent versions of MySQL and MariaDB now use utf8mb3_* instead of utf8_* for collations/charsets.

This causes the wp_insert_post() logic https://github.com/WordPress/wordpress-develop/blob/35037e699148d9e73607480319bd494f8449767f/src/wp-includes/post.php#L4840-L4850 to skip wp_encode_emoji() because $wpdb->get_col_charset( $wpdb->posts, 'post_content' ) returns utf8mb3 instead of utf8.

$emoji_fields = array( 'post_title', 'post_content', 'post_excerpt' );

foreach ( $emoji_fields as $emoji_field ) {
        if ( isset( $data[ $emoji_field ] ) ) {
                $charset = $wpdb->get_col_charset( $wpdb->posts, $emoji_field );

                if ( 'utf8' === $charset ) {
                        $data[ $emoji_field ] = wp_encode_emoji( $data[ $emoji_field ] );
                }
        }
}

get_col_charset() returns utf8mb3 because:

public function get_col_charset( $table, $column ) {
        // Rest of logic.

        list( $charset ) = explode( '_', $this->col_meta[ $tablekey ][ $columnkey ]->Collation );
        return $charset;
}

where $this->col_meta[ $tablekey ][ $columnkey ]->Collation is something like utf8mb3_general_ci from which it extracts utf8mb3.

The actual fix here is to extend the if ( 'utf8' === $charset ) { conditional to if ( 'utf8' === $charset || 'utf8mb3' === $charset ) {. This is safe to do because utf8mb3 is an alias https://dev.mysql.com/doc/refman/8.4/en/charset-unicode-utf8.html

This ticket was mentioned in PR #10376 on WordPress/wordpress-develop by @kasparsd.


3 weeks ago
#18

  • Keywords has-patch added; needs-patch removed

Recent versions of MySQL (v8) and MariaDB server now use utf8mb3_* instead of utf8_* (an alias of utf8mb3) for collations/charsets. This makes WP skip the encoding of emojies when it _should_ happen.

Trac ticket: https://core.trac.wordpress.org/ticket/60362

#19 @staurand
3 weeks ago

I had a similar issue in a plugin, I ended up encoding emojis for all charsets except "utf8mb4".

if ( 'utf8mb4' !== $charset ) {

That's the safest way we found to fix this bug.

#20 @westonruter
2 weeks ago

  • Milestone changed from Awaiting Review to 6.9
  • Owner set to westonruter
  • Status changed from new to reviewing

#21 @westonruter
2 weeks ago

#46947 was marked as a duplicate.

#22 @westonruter
2 weeks ago

@kasparsd Can you confirm that #46947 is the same issue as this?

This ticket was mentioned in Slack in #core-test by jon_bossenger. View the logs.


2 weeks ago

#24 @westonruter
12 days ago

  • Component changed from Database to General
  • Keywords commit added
  • Version set to 4.2

I was able to reproduce the issue.

I updated my wp-config.php to define( 'DB_CHARSET', 'utf8mb3' ); and then reset the database to do a fresh install.

I then tried creating a post with an emoji (🦬), and when attempting to save, I got an error banner. The REST API response was:

{
    "code": "db_update_error",
    "message": "Could not update post in the database.",
    "data": {
        "status": 500
    },
    "additional_data": [
        "WordPress database error: Processing the value for the following field failed: post_content. The supplied value may be too long or contains invalid data."
    ]
}

I tried applying the change from @kasparsd and it was able to save successfully.

Note that the initial post content I saved was:

 HTML
<!-- wp:paragraph -->
<p>Bison: 🦬</p>
<!-- /wp:paragraph -->

And when I reloaded the edit post screen, the post_content was revealed to be:

 HTML
<!-- wp:paragraph -->
<p>Bison: &#x1f9ac;</p>
<!-- /wp:paragraph -->

So this is exactly as expected.

The emoji also renders on the frontend as expected.

#25 @westonruter
12 days ago

  • Component changed from General to Database

#26 @westonruter
12 days ago

  • Resolution set to fixed
  • Status changed from reviewing to closed

In 61110:

Emoji: Convert emoji to HTML entities for utf8mb3 columns during wp_insert_post().

The utf8 character set is a deprecated alias of utf8mb3, but only the former was accounted for.

Follow-up to [31733].

Props kasparsd, mikefitzman, dmsnell, salzano, sabernhardt, ankit-k-gupta, nataliat2004, josklever, slaFFik, staurand.
See #31242, #21212.
Fixes #60362.

Note: See TracTickets for help on using tickets.