#20408 closed defect (bug) (duplicate)

wp_read_image_metadata wrongly assumes IPTC metadata is Latin1

Reported by: koke
Owned by:
Priority: normal
Milestone:
Component: Media
Version: 3.4
Severity: normal
Keywords:
Cc:

Description

Some applications, like Aperture, use UTF-8 in the IPTC tags. When images exported from them are uploaded to WordPress, the title and description metadata end up with the wrong encoding.

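The problem is in wp_read_image_metadata, which runs utf8_encode on every field even if it is already UTF-8. Since utf8_encode treats its input as Latin-1, text that is already UTF-8 gets encoded a second time.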

To check if the parsed iptc is already UTF-8:

$iptc_is_utf8 = isset( $iptc['1#090'][0] ) && "\x1B%G" === $iptc['1#090'][0];

Depending on that result, the tags should either go through utf8_encode or be left as they are.
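
A minimal sketch of that conditional conversion, not a patch against core. The helper name example_iptc_value_as_utf8() is hypothetical, and $iptc is assumed to be the array returned by iptcparse(), where each tag maps to a list of raw strings. Note that utf8_encode is deprecated as of PHP 8.2; it is used here only because it is the call under discussion.

```php
<?php
// Hypothetical helper (not part of WordPress): return one IPTC value as UTF-8.
// Assumes $iptc has the shape produced by iptcparse(): tag => list of strings.
function example_iptc_value_as_utf8( array $iptc, $tag ) {
	if ( ! isset( $iptc[ $tag ][0] ) ) {
		return '';
	}

	// 1#090 is the coded character set tag; ESC % G declares UTF-8.
	$iptc_is_utf8 = isset( $iptc['1#090'][0] ) && "\x1B%G" === $iptc['1#090'][0];

	$value = trim( $iptc[ $tag ][0] );

	// Convert from Latin-1 only when the file does not declare UTF-8.
	return $iptc_is_utf8 ? $value : utf8_encode( $value );
}
```

With this approach, a title tag ('2#005') from a file that declares UTF-8 is returned unchanged, while the same tag from a file without the marker is still converted from Latin-1 as before.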

Attachments (1)

testiptc.php (245 bytes) - added by koke 14 months ago.
Test script: dump IPTC contents


Change History (5)

koke 14 months ago

Test script: dump IPTC contents

Test image: https://breakmeplease.files.wordpress.com/2012/04/dsc_4320.jpg

Title: Título

Description: Descripción

Related: #7495

Duplicate of #9417?

It seems that the $iptc['1#090'] marker may not always be present: test-image-iptc.jpg (with IPTC tags written by IrfanView 4.33) doesn't have it. In that case, a seems_utf8() check looks more reliable.
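
A rough sketch of that content-based fallback, for files that carry no character set marker. Both function names are hypothetical. To keep the example runnable outside WordPress, example_looks_like_utf8() stands in for seems_utf8() and relies on PCRE's /u validation, which is an assumption of this sketch and not how seems_utf8() itself is implemented.

```php
<?php
// Hypothetical stand-in for WordPress's seems_utf8(): preg_match() with the
// /u modifier returns false when the subject is not valid UTF-8.
function example_looks_like_utf8( $value ) {
	return 1 === preg_match( '//u', $value );
}

// Hypothetical helper: convert only values that are not already valid UTF-8.
function example_iptc_to_utf8( $value ) {
	return example_looks_like_utf8( $value ) ? $value : utf8_encode( $value );
}
```

The trade-off is that this inspects the bytes instead of trusting a declaration, so a Latin-1 string whose bytes happen to form valid UTF-8 would be left unconverted. Pure ASCII values are unaffected either way.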

Version 0, edited 14 months ago by SergeyBiryukov
  • Keywords needs-patch removed
  • Milestone Awaiting Review deleted
  • Resolution set to duplicate
  • Status changed from new to closed