#42 closed defect (bug) (wontfix)
Encoding Problem with old entries UTF-8 <-> Latin1
Reported by: | Agent Orange | Owned by: | matt |
---|---|---|---|
Milestone: | | Priority: | normal |
Severity: | minor | Version: | |
Component: | General | Keywords: | |
Focuses: | | Cc: | |
Description
When a new WordPress version that uses UTF-8 is run against a database with Latin1-encoded posts, these posts look awful. In my case (German language), the umlauts and the ß were garbage.
Maybe the upgrade script could convert old entries.
Attachments (1)
Change History (12)
#3
20 years ago
That seems to deal with only a few characters. Is that all that's needed for Latin-1?
#4
20 years ago
No, there are a lot more characters. The attachment lists only the German special characters, not everything people use. Dutch, for example, has 'ï' and 'ë' and accents on the e, a, i, o, etc. Other languages have a '/' through the 'o' (I should learn the names of those...). French also has accents and a special 'c' (see Tantek).
#5
20 years ago
Yep, I made the character-array manually and only put the chars I need in there (selfish, yes :). But the array can be easily extended.
If you want me to integrate any changes you may have made to the list into the "official" version on my homepage, send them to janvarwig [at] gmx.net.
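The attachment itself is not reproduced in this thread, so the following is only a sketch of the idea discussed above, assuming the array maps single Latin-1 bytes to their UTF-8 sequences. Since every ISO-8859-1 byte value equals its Unicode code point, the full table can be generated rather than typed by hand. The names `LATIN1_TO_UTF8` and `convert_on_the_fly` are hypothetical and do not come from the attachment.

```python
# Sketch only: generate the complete mapping instead of listing characters manually.
# In ISO-8859-1, byte value N is Unicode code point U+00NN, so chr(N) is the character.
LATIN1_TO_UTF8 = {b: chr(b).encode("utf-8") for b in range(0x80, 0x100)}


def convert_on_the_fly(raw: bytes) -> bytes:
    """Re-encode a Latin-1 byte string as UTF-8 using the generated table.

    Assumes `raw` really is Latin-1; text that is already UTF-8 would be
    double-encoded by this function.
    """
    out = bytearray()
    for b in raw:
        if b < 0x80:
            out.append(b)             # ASCII bytes are identical in both encodings
        else:
            out += LATIN1_TO_UTF8[b]  # every high byte becomes a two-byte sequence
    return bytes(out)
```

With this table, Dutch, French, and Scandinavian characters are covered along with the German ones, because all 128 high bytes are mapped.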
#6
20 years ago
This should not be integrated in "your official version"; I think Matt wants this in the default build.
#7
20 years ago
That would be even better of course, although the on-the-fly conversion is hardly an ideal solution.
Not many people have the iconv program, and I also doubt that everybody running WordPress is familiar with such tools or even able to dump their database.
Another approach is to use iconv, if it is available on your system. Dump your database to dbdump and then run:
iconv -f iso-8859-1 -t utf-8 < dbdump > dbdump.utf
Restore dbdump.utf.
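For systems without iconv, the same dump conversion can be sketched in a few lines of Python. This is a minimal illustration and not part of WordPress or the upgrade script; the function names are hypothetical. As an added assumption, it leaves data that is already valid UTF-8 untouched, which is only a heuristic guard against converting a dump twice and will not help with dumps that mix both encodings.

```python
from pathlib import Path


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8 (pure ASCII included)."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def latin1_to_utf8(data: bytes) -> bytes:
    """Rough equivalent of `iconv -f iso-8859-1 -t utf-8` for one byte string."""
    if is_valid_utf8(data):
        # Heuristic (an assumption of this sketch): treat it as already converted.
        return data
    # The iso-8859-1 codec maps all 256 byte values, so decoding cannot fail.
    return data.decode("iso-8859-1").encode("utf-8")


def convert_dump(src: str, dst: str) -> None:
    """Read the dump at `src`, convert it, and write the result to `dst`."""
    Path(dst).write_bytes(latin1_to_utf8(Path(src).read_bytes()))
```

Calling `convert_dump("dbdump", "dbdump.utf")` would produce the file to restore. Working on bytes throughout avoids any newline translation, and keeping the original dump around is advisable in case the heuristic guesses wrong.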