Make WordPress Core

Changes between Initial Version and Version 12 of Ticket #7813


Timestamp:
01/04/2026 08:27:05 AM (4 months ago)
Author:
dmsnell
Comment:

The WXR export should be UTF-8 because it’s an XML document.

There are still things to improve here, but none of the improvements should use utf8_encode().
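As a rough illustration of why utf8_encode() is a poor fit: it treats its input as ISO-8859-1 no matter what the input actually is, so content that is already UTF-8 gets encoded a second time. The sketch below reproduces that behavior in Python rather than PHP; `latin1_to_utf8` is a hypothetical stand-in written for this example, not a WordPress or PHP function.

```python
def latin1_to_utf8(raw: bytes) -> bytes:
    """Hypothetical stand-in: read every byte as ISO-8859-1, re-encode as UTF-8."""
    return raw.decode("iso-8859-1").encode("utf-8")


latin1_bytes = "é".encode("iso-8859-1")  # one byte: 0xE9
utf8_bytes = "é".encode("utf-8")         # two bytes: 0xC3 0xA9

# Correct only when the input really is ISO-8859-1.
converted_ok = latin1_to_utf8(latin1_bytes)

# Input that is already UTF-8 is encoded again, producing mojibake.
double_encoded = latin1_to_utf8(utf8_bytes)

print(converted_ok == utf8_bytes)      # True
print(double_encoded.decode("utf-8"))  # Ã©
```

The conversion is only correct when the source charset happens to be ISO-8859-1, which an export cannot assume.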

It would be nice to see improvements in the export flow to convert into UTF-8, but that is definitely a complicated matter.

For now it would be great if someone could clarify the behavior and ensure that WordPress does not indicate that the WXR is in the blog_charset or any value other than utf-8. @tott can you confirm whether this is still a problem and elaborate on the “header and encoding is set to encoding used in blog” part?

The fix here is not to preserve the charset, because that will cause all sorts of trouble when attempting to import the files. Note that this does not depend in any way on support for UTF-8 code or conversion functionality. If WordPress is unable to convert to UTF-8 from a known charset, then it should produce an error of some kind.

Eventually I hope to add full fallback support in #64473 which would give us the tools to do so reliably and safely regardless of which extensions are installed.
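The "convert to UTF-8 or produce an error" behavior described above could look roughly like the following. This is a minimal sketch in Python, not WordPress code; `to_utf8_or_fail` and its error messages are hypothetical names chosen for the example, and a real implementation would depend on which conversion tools are available.

```python
def to_utf8_or_fail(raw: bytes, source_charset: str) -> bytes:
    """Re-encode raw from source_charset to UTF-8, raising instead of guessing."""
    try:
        # Strict decoding: invalid byte sequences raise rather than being replaced.
        text = raw.decode(source_charset)
    except LookupError as exc:
        # The charset label itself is not recognized.
        raise ValueError(f"Unknown charset: {source_charset!r}") from exc
    except UnicodeDecodeError as exc:
        # The bytes are not valid in the declared charset.
        raise ValueError(f"Content is not valid {source_charset}") from exc
    return text.encode("utf-8")


# A known charset converts cleanly.
print(to_utf8_or_fail("café".encode("iso-8859-1"), "iso-8859-1") == "café".encode("utf-8"))
```

Failing loudly keeps the exporter from writing a file whose bytes do not match the encoding it declares.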

  • Ticket #7813

    • Property Severity changed from normal to minor
    • Property Cc lloydbudd added
    • Property Component changed from i18n to Export
    • Property Version set to 2.7
    • Property Milestone set to Future Release
    • Property Keywords 2nd-opinion close added; export encoding i18n removed
  • Ticket #7813 – Description

    initial  v12
    2        2
    3        3    this causes trouble when importing later.
             4
             5    WordPress should always convert to UTF-8 and indicate this in the XML declaration and metadata.