Make WordPress Core

Opened 15 years ago

Closed 15 years ago

#8915 closed defect (bug) (wontfix)

Multi-Byte characters not readable because "latin1_swedish_ci" is set by default of WP 2.7 Install

Reported by: misty9 Owned by:
Milestone: Priority: high
Severity: major Version: 2.7
Component: General Keywords:
Focuses: Cc:

Description

WordPress 2.7 automatically set the collation of all tables to "latin1_swedish_ci", except for wp_terms, wp_term_relationships and wp_taxonomy, which use "utf8_general_ci". Because of this, multi-byte characters such as Japanese and Korean are unreadable in the Page or Post editor of Site Admin.

The workaround:

Manually set each field of all applicable tables to "utf8_general_ci". Can you guys please fix the problem? Thanks.

Change History (10)

#1 @misty9
15 years ago

  • Summary changed from Multi-Byte display does NOT work because "latin1_swedish_ci" is set by default for WP 2.7 Install to Multi-Byte characters not readable because "latin1_swedish_ci" is set by default of WP 2.7 Install

#2 @ryan
15 years ago

All new tables should be created with utf8_general_ci. The latin1_swedish_ci tables were likely created with an old version of WordPress. We don't automatically change charsets for old tables.

#3 @misty9
15 years ago

Not true. Tables wp_terms, wp_term_relationships and wp_taxonomy use "utf8_general_ci".

The multi-byte characters worked fine in WP 2.2 until WP 2.7 was installed.

On another note, why are there so many issues in WP 2.7? I've just found another one. The page ID counter appears to be out of sync. There were 70 pages in total, yet it created page ID 133 after the WP 2.7 upgrade.

#4 @DD32
15 years ago

"Not true. Tables wp_terms, wp_term_relationships and wp_taxonomy use "utf8_general_ci"."

When the other tables were created, WP didn't set the character set. Therefore, they used the MySQL default, which for many was latin1_swedish_ci.

Somewhere around 2.3/2.5 the new taxonomy tables were introduced. When they were created, WP used a default of utf8_general_ci instead of relying on MySQL being set up well.

As for why the multi-byte characters broke: that would once again be due to later WP versions using utf8 instead of non-utf8 types. The end result is that certain setups break and you need to fix them manually, simply because converting them automatically would be horrendously error-prone. The odd thing is that you should have had problems with this in 2.2, which means the characters were probably being stored incorrectly in the first place. See http://codex.wordpress.org/Converting_Database_Character_Sets for more info.
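One common way characters end up "stored incorrectly" yet still display fine is sketched below. This is an illustration only, in pure Python with no database involved. It assumes the old setup passed UTF-8 bytes through a connection that both sides treated as latin1, and that the newer setup declares utf8, so the server converts the stored bytes as if they were latin1 text. Python's "latin-1" codec stands in for MySQL's latin1 here, which is not byte-for-byte identical.

```python
# Illustration only: simulates a charset mismatch without touching a database.
text = "日本語"

# Assumed old behaviour: UTF-8 bytes are written into a latin1 column untouched.
stored_bytes = text.encode("utf-8")

# Old read path: the bytes come back unchanged and the browser decodes them as UTF-8.
old_view = stored_bytes.decode("utf-8")

# Assumed new read path: the server treats the stored bytes as latin1 text and
# converts them to UTF-8, so each original byte becomes its own character.
new_view = stored_bytes.decode("latin-1").encode("utf-8").decode("utf-8")

# Reversing the mistaken conversion recovers the original text.
repaired = new_view.encode("latin-1").decode("utf-8")

print(old_view == text)   # the old setup appeared to work
print(new_view == text)   # the new setup shows garbage
print(repaired == text)   # the data itself is still recoverable
```

In this sketch the data is never lost, only misinterpreted, which is why a careful manual conversion can succeed where a blind automatic one could make things worse.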

As for the page counter, that's correct. The IDs are used by posts, pages, revisions, and attachments, so you'll never get 1, 2, 3, 4 ... 10000 unless you only create pages and disable revisions.
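The shared counter can be sketched as follows. This is a toy model, not WordPress code: `add_row` and the `rows` list are hypothetical stand-ins for inserting into a single table whose auto-increment ID is shared by every row type.

```python
from itertools import count

# One auto-increment counter shared by pages, revisions, and attachments.
next_id = count(1)
rows = []

def add_row(post_type):
    """Hypothetical helper: store a row and give it the next shared ID."""
    row = {"ID": next(next_id), "post_type": post_type}
    rows.append(row)
    return row

add_row("page")        # first page
add_row("revision")    # saved edits consume IDs too
add_row("revision")
add_row("attachment")  # so do uploads
add_row("page")        # second page

page_ids = [r["ID"] for r in rows if r["post_type"] == "page"]
print(page_ids)  # [1, 5]
```

Two consecutive pages end up with non-consecutive IDs because the rows created in between draw from the same sequence.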

#5 @MichaelH
15 years ago

This warning from http://codex.wordpress.org/Editing_wp-config.php applies here, right?

"WARNING: Those performing upgrades (especially blogs that existed before 2.2) If DB_CHARSET and DB_COLLATE do not exist in your wp-config.php file, DO NOT add either definition to your wp-config.php file unless you read and understand Converting Database Character Sets. Adding DB_CHARSET and DB_COLLATE to the wp-config.php file, for an existing blog, can cause major problems."

#6 @ryan
15 years ago

  • Milestone 2.8 deleted
  • Resolution set to wontfix
  • Status changed from new to closed

In this case, either DB_CHARSET has been defined or the default collation for the database has changed since the older tables were created. We can check the charset on existing tables when creating new tables during upgrade, but I doubt bothering will be worth the potential for bugs since this is very much an edge case. Further, we might someday do as Drupal does and convert existing tables to UTF-8, which would address this. In the meantime, marking this wontfix.

#7 @misty9
15 years ago

"As for the page counter, Thats correct. The ID's are used by posts, pages, revisions, and attachments, so you'll never get 1, 2, 3, 4....10000 unless you only create pages and disable revisions."

The page and post IDs were pretty much in sync in WP 2.2. There has never been a case where one page ID is 70 and the next page ID is 133. It makes no sense no matter what your argument is here. You guys should at least try to figure out what's really wrong with your code rather than shutting down the bugs that customers report. You may not have a product one day if you keep at it like this.

I refuse to use the buggy product WP 2.7.

#8 @misty9
15 years ago

The bottom line is that the WP 2.7 installation should never have broken what was working in WP 2.2, regardless of whether the table is new or old.

#9 @misty9
15 years ago

  • Resolution wontfix deleted
  • Status changed from closed to reopened

I have reopened only for you to view my latest comments.

#10 @ryan
15 years ago

  • Resolution set to wontfix
  • Status changed from reopened to closed