#21903 closed defect (bug) (duplicate)
UTF-8 encoded image caption processed incorrectly
Reported by: | | Owned by: |
---|---|---|---
Milestone: | | Priority: | normal
Severity: | normal | Version: |
Component: | Media | Keywords: | has-patch
Focuses: | | Cc: |
Description
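utf8_encode() is always run on image captions, even when they are already UTF-8 encoded. Because utf8_encode() assumes ISO-8859-1 input, this double-encodes the text and corrupts any non-ASCII characters in a UTF-8 caption.
A tentative patch is attached (not tested with non-UTF-8 encoded content).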
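The corruption described above can be reproduced at the byte level. The following is a minimal sketch in Python rather than PHP; `utf8_encode_like` is a hypothetical helper that imitates what PHP's utf8_encode() does (interpret each byte as ISO-8859-1 and re-encode as UTF-8), not the WordPress code itself.

```python
def utf8_encode_like(data: bytes) -> bytes:
    """Imitate PHP's utf8_encode(): read every byte as ISO-8859-1, write UTF-8."""
    return data.decode("latin-1").encode("utf-8")


# A caption that is already stored as UTF-8 in the image metadata.
caption = "Zürich".encode("utf-8")

# Running the conversion again double-encodes the non-ASCII character.
mangled = utf8_encode_like(caption)

print(ascii(caption.decode("utf-8")))   # the intended text
print(ascii(mangled.decode("utf-8")))   # the mojibake a user would see

# The conversion is only correct when the input really is ISO-8859-1.
latin1_caption = "Zürich".encode("latin-1")
assert utf8_encode_like(latin1_caption) == "Zürich".encode("utf-8")
```

Pure ASCII captions pass through unchanged, which is likely why the problem only shows up with accented or non-Latin text.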
Attachments (1)
Change History (4)
#1
@
13 years ago
- Component changed from Administration to Media
- Keywords needs-unit-tests added
utf8_encode() makes sense when going from ISO-8859-1 to UTF-8. You're right that there is an escape sequence in the IPTC standard to mark that the encoding is UTF-8, and that we currently don't check it. It would be helpful if "#090" and "\x1B%G" were fully explained.
Also, for this, we are going to want some unit tests with an image with metadata encoded with UTF-8.
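For context, "1#090" is the key under which iptcparse() reports the IPTC Coded Character Set dataset (record 1, dataset 90), and "\x1B%G" (ESC % G) is the ISO 2022 escape sequence that designates UTF-8. A minimal sketch of the check follows, written in Python for illustration. It assumes a dictionary shaped like iptcparse() output (keys such as "1#090", each mapping to a list of byte strings) and assumes the caption lives under "2#120"; `decode_iptc_caption` and `first_value` are hypothetical names, not part of any patch on this ticket.

```python
UTF8_MARKER = b"\x1b%G"  # ESC % G: ISO 2022 designation for UTF-8


def first_value(iptc: dict, key: str) -> bytes:
    """Return the first value stored under key, or b'' if absent or empty."""
    values = iptc.get(key) or [b""]
    return values[0]


def decode_iptc_caption(iptc: dict) -> str:
    """Decode the caption, honoring the coded character set marker if present."""
    raw = first_value(iptc, "2#120")          # assumed caption dataset
    if first_value(iptc, "1#090") == UTF8_MARKER:
        # Metadata declares UTF-8: decode as-is, do not convert again.
        return raw.decode("utf-8", errors="replace")
    # No marker: fall back to the ISO-8859-1 assumption (current behavior).
    return raw.decode("latin-1")


utf8_image = {"1#090": [UTF8_MARKER], "2#120": ["Zürich".encode("utf-8")]}
legacy_image = {"2#120": ["Zürich".encode("latin-1")]}

print(ascii(decode_iptc_caption(utf8_image)))
print(ascii(decode_iptc_caption(legacy_image)))
```

The two dictionaries above also suggest the shape of the unit tests requested here: one fixture with the marker and UTF-8 bytes, one without the marker and ISO-8859-1 bytes, both expected to yield the same caption.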
#2
@
13 years ago
- Keywords has-patch added; needs-unit-tests removed
- Milestone Awaiting Review deleted
- Resolution set to duplicate
- Status changed from new to closed
#3
@
13 years ago
- Cc chenxing added
I don't know if there is a reliable source. I got it from here:
http://php.net/manual/en/function.iptcparse.php#105025
Otherwise maybe we can try seems_utf8.
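The fallback idea can be sketched as a heuristic: try to read the bytes as strict UTF-8, and only treat them as ISO-8859-1 if that fails. The Python below is an illustration of that approach under stated assumptions; `decode_caption_heuristic` is a hypothetical name, and this is not the implementation of seems_utf8.

```python
def decode_caption_heuristic(raw: bytes) -> str:
    """Guess the encoding: valid UTF-8 is kept, anything else is read as ISO-8859-1."""
    try:
        return raw.decode("utf-8")       # strict: raises on invalid sequences
    except UnicodeDecodeError:
        return raw.decode("latin-1")     # every byte is valid ISO-8859-1


print(ascii(decode_caption_heuristic("Zürich".encode("utf-8"))))
print(ascii(decode_caption_heuristic("Zürich".encode("latin-1"))))
```

A heuristic like this can misfire: some ISO-8859-1 byte sequences also happen to be valid UTF-8 (for example the two bytes 0xC3 0xA9), so an explicit marker in the metadata is the more reliable signal when it is present.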
tentative patch