Opened 18 years ago
Closed 13 years ago
#4794 closed defect (bug) (fixed)
xml-rpc should identify encoding
| Reported by: | | Owned by: | |
|---|---|---|---|
| Milestone: | 3.5 | Priority: | normal |
| Severity: | normal | Version: | 2.2.2 |
| Component: | XML-RPC | Keywords: | has-patch dev-feedback |
| Focuses: | | Cc: | |
Description (last modified by )
WordPress provides users with a preference to identify the text encoding of the blog's content. But this encoding is not declared in (most) XML documents generated by xmlrpc.php.
Notice that when RSD support was added, the developer who wrote that code *did* include the blog's encoding in the document header. But for all other XML documents generated (i.e. replies to XML-RPC queries), the encoding is omitted.
When the encoding is omitted, as I understand it, the presumed encoding is UTF-8. In my limited experience, customers running non-UTF-8 blogs tend to use ISO-8859-1. When they use this encoding and also use some of the accented characters in that set, such as 0xE9 (é) or 0xC9 (É), the resulting document is not well-formed XML, because those bytes on their own are not valid UTF-8 sequences.
This failure to properly identify the encoding of XML documents can lead blog clients to fail to parse the XML, and therefore cause XML-RPC to fail more or less completely for a certain class of users.
I propose that xmlrpc.php be modified so that every XML document it generates to expose blog content declares the encoding specified by the user in Options -> Reading.
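To illustrate the mismatch described above, here is a minimal PHP sketch, not taken from WordPress itself. The helper name `is_valid_utf8()` is hypothetical; it relies only on PCRE's `/u` modifier, which rejects subjects that are not well-formed UTF-8.

```php
<?php
// Hypothetical helper (not part of WordPress): reports whether a byte
// string is well-formed UTF-8, using PCRE's /u modifier.
function is_valid_utf8($bytes) {
    // preg_match() returns false when the subject is not valid UTF-8,
    // and 1 when the empty pattern matches a valid subject.
    return preg_match('//u', $bytes) === 1;
}

$latin1 = "caf\xE9";     // "café" in ISO-8859-1: é is the single byte 0xE9
$utf8   = "caf\xC3\xA9"; // the same word in UTF-8: é is the bytes 0xC3 0xA9

var_dump(is_valid_utf8($latin1)); // ISO-8859-1 bytes treated as UTF-8
var_dump(is_valid_utf8($utf8));   // genuine UTF-8 bytes
```

A client that assumes UTF-8 because no encoding is declared would hit the first case when a blog stores its content as ISO-8859-1.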
Attachments (5)
Change History (23)
#1
@
18 years ago
- Summary changed from WordPress should identify XML document text encoding to xml-rpc should identify encoding
#3
@
18 years ago
If it was, I don't think that would be sufficient, because the idea is that the XML document should be legally parseable as-is, right?
But it doesn't advertise it in the Content-type header. Here are the relevant lines from a typical response:
Content-Length: 3714
Content-Type: text/xml
<?xml version="1.0"?>
<methodResponse>
#4
@
18 years ago
Perhaps there was some confusion as to what constitutes "encoding": yes, the Content-type header identifies the content as text/xml, but the XML itself does not identify which character encoding it uses for its node contents.
#5
@
18 years ago
The problem is specifically in the IXR_Server class:
function output($xml) {
    $xml = '<?xml version="1.0"?>'."\n".$xml;
    ...
Edit that to include the charset and you should be good.
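As a rough sketch of what that edit might look like (this is an illustration, not the patch attached to this ticket), the declaration could carry an `encoding` attribute. The function name `xml_with_declared_encoding()` and its `$charset` parameter are hypothetical.

```php
<?php
// Hypothetical helper: prepends an XML declaration that names the charset.
// In WordPress the charset would presumably come from the blog settings;
// here it is passed in explicitly so the sketch stays self-contained.
function xml_with_declared_encoding($xml, $charset = 'UTF-8') {
    return '<?xml version="1.0" encoding="' . $charset . '"?>' . "\n" . $xml;
}

// Example: a response body for a blog configured as ISO-8859-1.
echo xml_with_declared_encoding('<methodResponse></methodResponse>', 'ISO-8859-1');
```

With the encoding declared, a parser no longer has to fall back to assuming UTF-8 for the node contents.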
#9
@
16 years ago
- Keywords needs-patch added
- Milestone changed from 2.9 to Future Release
- Type changed from defect (bug) to enhancement
#11
@
13 years ago
- Cc sergey.s.betke@… added
- Type changed from enhancement to defect (bug)
This bug still exists in WordPress 3.2.1. When php.ini does not set the default_charset option and the HTTP server uses a default charset different from get_option('blog_charset'), XML-RPC applications (for example, Microsoft Live Writer and Microsoft Word) do not recognize the content encoding when reading posts, tags, and categories.
Example - http://sergey-s-betke.blogs.novgaro.ru/wordpress-3-2-1-i-problema-s-utf-8-v-ajax-i-xmlrpc.
Patch tested - it's working properly.
#12
@
13 years ago
- Keywords dev-feedback added
I have attached the patch (class-IXR.php.diff) against the current WordPress core version. I ask the developers to commit it.
#14
in reply to:
↑ 13
@
13 years ago
- Cc cappuccino.e.cornetto@… added
Replying to SergeyBiryukov:
Closed #20705 as a duplicate.
Right, it's a duplicate. But is there any reason for not fixing this issue?
I've just checked 3.4.1 and still no fix...
#16
@
13 years ago
This is technically a third-party library, but we've inherited it so we can tweak if needed. Just in case someone is trying to use this standalone, let's add a function_exists() for get_option().
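A minimal sketch of that guard, assuming the charset is read with get_option('blog_charset') as mentioned earlier in this ticket. The function names `ixr_declared_charset()` and `ixr_xml_declaration()` are hypothetical and are not taken from the attached patches.

```php
<?php
// Hypothetical helper: picks the charset for the XML declaration.
// get_option() only exists when WordPress is loaded; when the library is
// used standalone, no charset is returned.
function ixr_declared_charset() {
    if (function_exists('get_option')) {
        return get_option('blog_charset');
    }
    return null;
}

// Hypothetical helper: builds the declaration, adding the encoding
// attribute only when a charset is available.
function ixr_xml_declaration() {
    $charset = ixr_declared_charset();
    if ($charset) {
        return '<?xml version="1.0" encoding="' . $charset . '"?>';
    }
    return '<?xml version="1.0"?>';
}

echo ixr_xml_declaration(), "\n";
```

Run outside WordPress, this falls back to the original declaration without an encoding, so standalone users of the library see no change in behavior.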
#17
@
13 years ago
4794.3.diff only specifies encoding if get_option() exists.
Isn't the encoding sent in the Content-type header even in xmlrpc.php?