Make WordPress Core

Opened 10 years ago

Closed 10 years ago

Last modified 9 years ago

#7554 closed defect (bug) (invalid)

Indic Characters replaced by ???

Reported by: murky
Owned by:
Milestone:
Priority: normal
Severity: normal
Version:
Component: General
Keywords:
Focuses:
Cc:


I was looking back at old posts and noticed that characters which used to work have been mangled at some stage in the WordPress upgrades. Characters like é that had changed could be changed back, but Indic characters, when corrected, revert to the mangled state.

For example: http://www.murky.org/blg/back-in-delhi/

I had some Bengali on this page, and it worked well, but it has now been replaced by ?????

I can demonstrate that it was working, as at a later date I exported my site over to wordpress.com as a backup


--- and as you can see, the Bengali is visible here (though not semantically correct, as I had to be cunning with the coding due to a bug in Windows XP or Mozilla)

I have tried to fix this at murky.org - but every time I do, it's replaced by ???? again - obviously quite frustrating.

The vast majority of the site is in English, but I do want to be able to show non-English characters from time to time.

My charset is utf8: define('DB_CHARSET', 'utf8'); define('DB_COLLATE', '');

Change History (9)

#1 @DD32
10 years ago

My charset is utf8: define('DB_CHARSET', 'utf8'); define('DB_COLLATE', '');

What is the database's actual charset, though? And what was it when you created the original post?

Might be useful: http://eazyvg.linuxoss.com/2008/05/fixing-wordpress-and-mysql-charset-problem-especially-when-importing-from-one-blog-to-another/

#2 @murky
10 years ago

Unless I've had a very specific memory lapse, I've never adjusted this - I'd have no reason to (unless wp-config used to be something else, back in the 1.x days).

This post was originally written on the site with the problem, but the characters have since been lost. Very weird. I can't edit them back (or rather, I can, but they don't get saved).

#3 @murky
10 years ago

How do I find the actual database charset? More to the point, how do I put it right, if somehow it is different?

#4 @murky
10 years ago

Re: the URL - possibly useful (though the export/import would mean manually editing XML, as the WordPress import fails for big imports). I'd like to try to understand how this went wrong before fiddling with things at such a level, though.

#5 @DD32
10 years ago

The DB_CHARSET constants were introduced in a recent version of WordPress. If the database predates that, there's a chance it's using the default MySQL charset of latin1. If that's the case and WordPress talks to it with utf8, then non-ASCII (i.e. non-English) characters can be misinterpreted.
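
To illustrate why é could be fixed while the Bengali kept turning into question marks, here is a minimal Python sketch. It is only an analogy: Python's latin-1 codec stands in for a latin1 column, and the helper name `store_in_latin1` and the sample Bengali word are assumptions for illustration, not anything taken from MySQL or from the affected post.

```python
# Illustrative analogy only: Python's latin-1 codec stands in for a latin1 MySQL column.
def store_in_latin1(text: str) -> str:
    """Round-trip text through latin-1, replacing unrepresentable characters with '?'."""
    return text.encode("latin-1", errors="replace").decode("latin-1")

e_acute = "\u00e9"  # é, which latin1 can represent
# Sample Bengali word ("Bangla", five code points); hypothetical, not from the post.
bengali = "\u09ac\u09be\u0982\u09b2\u09be"

print(store_in_latin1(e_acute) == e_acute)  # True: é survives the round trip
print(store_in_latin1(bengali))             # ?????
```

Anything outside latin1's repertoire has no byte to map to, so it comes back as `?` each time it is saved, which would match the behaviour described in the report.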

If you've got access to phpMyAdmin, then it's easy to find the details: just load up the database and look at the "Collation" column.

The other option is to comment out the DB_CHARSET/DB_COLLATE defines and see if that fixes it.

The article I linked to before doesn't use WordPress at all, so its export/import should work (it used phpMyAdmin).

#6 @murky
10 years ago

I've looked at PHPMyAdmin....

it's latin1_swedish_ci

I take it that means I have to use the details in the article, or can I change it directly?

#7 @murky
10 years ago

... except for terms related tables, which are utf8_general_ci
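
The same check can be done without phpMyAdmin. The sketch below is a hedged example: the query reads MySQL's `information_schema.TABLES` view, while the helper `non_utf8_tables` and the two sample rows are hypothetical and only mirror the mix of collations reported above. Running the query against a real server would need a MySQL client library, which is left out here.

```python
# Query against MySQL's information_schema; the schema (database) name is a parameter.
COLLATION_QUERY = (
    "SELECT TABLE_NAME, TABLE_COLLATION "
    "FROM information_schema.TABLES WHERE TABLE_SCHEMA = %s"
)

def non_utf8_tables(rows):
    """Return the names of tables whose collation does not start with 'utf8'."""
    return [name for name, collation in rows
            if not (collation or "").startswith("utf8")]

# Hypothetical result rows, mirroring the collations reported in this ticket.
sample_rows = [
    ("wp_posts", "latin1_swedish_ci"),
    ("wp_terms", "utf8_general_ci"),
]

print(non_utf8_tables(sample_rows))  # ['wp_posts']
```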

#8 @murky
10 years ago

Steps taken:

1) Back up database
2) Make empty database
3) Import old database to new one
4) Assign database user mentioned in wp-config to database
5) Back up wp-config
6) Edit wp-config to refer to new database
7) Test: check all is working (e.g. make a change in WP, then swap wp-config back and forth to be sure it's all swapping as expected)
8) Run database change plugin (ignoring warnings that it's not been tested)
9) Observe the database in phpMyAdmin, note utf8_general_ci
10) Be amazed it all seems okay now (although I am having to manually re-edit the Bengali); at least this time it sticks
11) Leave the original database sitting there untouched for a while


#9 @DD32
10 years ago

  • Milestone 2.6.2 deleted
  • Resolution set to invalid
  • Status changed from new to closed

Glad to hear you've got it working. As it's not a WP fault but rather a configuration error, I'm closing this as Invalid.

Damn MySQL for defaulting to latin1_*_ci... Damn people who set up MySQL and do not change the default! :P
