WordPress.org

Make WordPress Core

Opened 4 years ago

Closed 4 years ago

Last modified 4 years ago

#33697 closed defect (bug) (invalid)

utf8mb4 conversion runs on "term_relationships" when perhaps it should not

Reported by: tollmanz Owned by:
Milestone: Priority: normal
Severity: normal Version: 4.2
Component: Database Keywords:
Focuses: Cc:

Description

When the term_relationships table is passed to maybe_convert_table_to_utf8mb4(), it will always end in the ALTER TABLE command. As far as I can tell, there is no point in running the update on this table since it only contains BIGINT and INT columns. Additionally, one of the exit early conditions is, 'utf8' !== $charset && 'utf8mb4' !== $charset, which all columns in term_relationships would match, but this statement is never reached for the table.

In this snippet:

	foreach ( $results as $column ) {
		if ( $column->Collation ) {
			list( $charset ) = explode( '_', $column->Collation );
			$charset = strtolower( $charset );
			if ( 'utf8' !== $charset && 'utf8mb4' !== $charset ) {
				// Don't upgrade tables that have non-utf8 columns.
				return false;
			}
		}
	}

the if ( $column->Collation ) condition prevents all of the columns in term_relationships from entering the condition. Since $column->Collation === NULL for each term_relationship, the exit condition will never be reached.

Further on, this snippet appears:

	$table_details = $wpdb->get_row( "SHOW TABLE STATUS LIKE '$table'" );
	if ( ! $table_details ) {
		return false;
	}

	list( $table_charset ) = explode( '_', $table_details->Collation );
	$table_charset = strtolower( $table_charset );
	if ( 'utf8mb4' === $table_charset ) {
		return true;
	}

I cannot seem to get $table_details->Collation to be anything other than utf8_general_ci, even after running the utf8mb4 alter query (which does make sense). Because the if ( 'utf8mb4' === $table_charset ) is not true, the function does not exit, and the alter query executes.

Admittedly, I don't think this is a big deal, but perhaps has the ability to become an annoying bug in the future. It seems to me like an additional check could be helpful here. The check would look to see if all columns in a table have a NULL value for Collation, the function should exit early.

Change History (6)

#1 @pento
4 years ago

  • Milestone Awaiting Review deleted
  • Resolution set to invalid
  • Status changed from new to closed

This behaviour is on purpose.

Apart from changing the character set of the columns, the ALTER TABLE also changes the default character set of the table. While it's unlikely that term_relationships will ever have new columns added, it's quite possible that a plugin using that function could add new string columns. If we didn't update the table character set, and the plugin didn't specify a character set for the new column, it would unexpectedly be added as utf8 instead of utf8mb4.

(I also like having all of the tables showing as utf8mb4 when looking at the table list in phpMyAdmin.) ;-)

#2 @tollmanz
4 years ago

Thanks for the response @pento. When I run the alter statement against the table manually, then check the result with SHOW TABLE STATUS LIKE 'wp_term_relationships', I'm not seeing it marked at utf8mb4, hence that final check fails. Is that the only way to check if the table has been updated or am I missing something?

#3 @pento
4 years ago

Are you getting a warning when you run the ALTER TABLE? Because it works for me:

mysql> SHOW TABLE STATUS LIKE "wp_term_relationships"\G
*************************** 1. row ***************************
           Name: wp_term_relationships
         Engine: InnoDB
        Version: 10
     Row_format: Compact
           Rows: 1
 Avg_row_length: 16384
    Data_length: 16384
Max_data_length: 0
   Index_length: 16384
      Data_free: 0
 Auto_increment: NULL
    Create_time: 2015-09-03 02:38:27
    Update_time: NULL
     Check_time: NULL
      Collation: utf8_unicode_ci
       Checksum: NULL
 Create_options: 
        Comment: 
1 row in set (0.00 sec)

mysql> ALTER TABLE wp_term_relationships CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
Query OK, 1 row affected (0.00 sec)
Records: 1  Duplicates: 0  Warnings: 0

mysql> SHOW TABLE STATUS LIKE "wp_term_relationships"\G
*************************** 1. row ***************************
           Name: wp_term_relationships
         Engine: InnoDB
        Version: 10
     Row_format: Compact
           Rows: 1
 Avg_row_length: 16384
    Data_length: 16384
Max_data_length: 0
   Index_length: 16384
      Data_free: 0
 Auto_increment: NULL
    Create_time: 2015-09-03 02:38:43
    Update_time: NULL
     Check_time: NULL
      Collation: utf8mb4_unicode_ci
       Checksum: NULL
 Create_options: 
        Comment: 
1 row in set (0.00 sec)

#4 @tollmanz
4 years ago

This is what I get and hence the reason I opened this ticket:

mysql> SHOW TABLE STATUS LIKE "wp_term_relationships"\G
*************************** 1. row ***************************
           Name: wp_term_relationships
         Engine: InnoDB
        Version: 10
     Row_format: Compact
           Rows: 685560
 Avg_row_length: 42
    Data_length: 28917760
Max_data_length: 0
   Index_length: 15253504
      Data_free: 4194304
 Auto_increment: NULL
    Create_time: 2015-03-25 21:48:18
    Update_time: NULL
     Check_time: NULL
      Collation: utf8_general_ci
       Checksum: NULL
 Create_options:
        Comment:
1 row in set (0.00 sec)

mysql> ALTER TABLE wp_term_relationships CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
Query OK, 0 rows affected (0.11 sec)
Records: 0  Duplicates: 0  Warnings: 0

mysql> SHOW TABLE STATUS LIKE "wp_term_relationships"\G                                              *************************** 1. row ***************************
           Name: wp_term_relationships
         Engine: InnoDB
        Version: 10
     Row_format: Compact
           Rows: 685560
 Avg_row_length: 42
    Data_length: 28917760
Max_data_length: 0
   Index_length: 15253504
      Data_free: 4194304
 Auto_increment: NULL
    Create_time: 2015-03-25 21:48:18
    Update_time: NULL
     Check_time: NULL
      Collation: utf8_general_ci
       Checksum: NULL
 Create_options:
        Comment:
1 row in set (0.00 sec)

Is the original utf8_general_ci collation causing an issue?

P.S. - Thanks for the \G trick!

#5 @pento
4 years ago

It's not utf8_general_ci, the conversion worked fine for me when I tried it with that.

What server version are you using? Is it vanilla MySQL, or a fork?

#6 @tollmanz
4 years ago

I'm using 5.6.23 on CentOS 7.

Note: See TracTickets for help on using tickets.