Make WordPress Core

Opened 13 years ago

Closed 13 years ago

#20408 closed defect (bug) (duplicate)

wp_read_image_metadata wrongly assumes IPTC metadata is Latin1

Reported by: koke
Owned by:
Milestone:
Priority: normal
Severity: normal
Version: 3.4
Component: Media
Keywords:
Focuses:
Cc:

Description

Some applications, like Aperture, use UTF-8 in the IPTC tags. When images exported from them are uploaded to WordPress, the title and description metadata end up with the wrong encoding.

The problem is in wp_read_image_metadata, which runs utf8_encode on every field even if it is already UTF-8.

To check if the parsed iptc is already UTF-8:

$iptc_is_utf8 = isset( $iptc['1#090'][0] ) && "\x1B%G" === $iptc['1#090'][0];

Depending on that result, the tags should either go through utf8_encode or be left as they are.
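
A minimal sketch of that branching, for illustration only. It assumes $iptc has the shape returned by iptcparse() (each key maps to a list of raw byte strings). The helper names latin1_to_utf8, iptc_declares_utf8 and iptc_field_as_utf8 are hypothetical and not part of WordPress; latin1_to_utf8 is a hand-written stand-in for what utf8_encode does (ISO-8859-1 to UTF-8), and the '2#005' key in the usage lines is just an example field.

```php
<?php
// Hypothetical helpers, not WordPress core code.

// Stand-in for utf8_encode(): re-encode ISO-8859-1 bytes as UTF-8.
function latin1_to_utf8( string $bytes ): string {
    $out = '';
    $len = strlen( $bytes );
    for ( $i = 0; $i < $len; $i++ ) {
        $b = ord( $bytes[ $i ] );
        // Bytes below 0x80 are plain ASCII; the rest become two-byte sequences.
        $out .= $b < 0x80
            ? chr( $b )
            : chr( 0xC0 | ( $b >> 6 ) ) . chr( 0x80 | ( $b & 0x3F ) );
    }
    return $out;
}

// True when record 1:90 carries the ESC % G escape sequence that declares UTF-8.
function iptc_declares_utf8( array $iptc ): bool {
    return isset( $iptc['1#090'][0] ) && "\x1B%G" === $iptc['1#090'][0];
}

// Return the first value of $key as UTF-8, converting only when needed.
function iptc_field_as_utf8( array $iptc, string $key ): string {
    if ( ! isset( $iptc[ $key ][0] ) ) {
        return '';
    }
    $raw = trim( $iptc[ $key ][0] );
    return iptc_declares_utf8( $iptc ) ? $raw : latin1_to_utf8( $raw );
}

// Usage: the same title, once flagged as UTF-8 and once as unflagged Latin-1 bytes.
$utf8_iptc   = array( '1#090' => array( "\x1B%G" ), '2#005' => array( "T\xC3\xADtulo" ) );
$latin1_iptc = array( '2#005' => array( "T\xEDtulo" ) );
echo iptc_field_as_utf8( $utf8_iptc, '2#005' ), "\n";
echo iptc_field_as_utf8( $latin1_iptc, '2#005' ), "\n";
```

Note that this sketch treats a missing 1#090 record as "Latin-1", which is the existing behaviour for every file; it only stops double-encoding files that explicitly declare UTF-8.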

Attachments (1)

testiptc.php (245 bytes) - added by koke 13 years ago.
Test script: dump IPTC contents


Change History (5)

@koke
13 years ago

Test script: dump IPTC contents

#1 @koke
13 years ago

Test image: https://breakmeplease.files.wordpress.com/2012/04/dsc_4320.jpg

Title: Título

Description: Descripción

#2 @koke
13 years ago

Related: #7495

#3 @SergeyBiryukov
13 years ago

Duplicate of #9417?

It seems that the $iptc['1#090'] marker may not always be present. test-image-iptc.jpg (with IPTC tags written by IrfanView 4.33) doesn't have it. In that case, a seems_utf8() check looks more reliable.
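
A rough sketch of that content-based fallback, for illustration only. To stay runnable outside WordPress it does not call seems_utf8(); looks_like_utf8 is a hypothetical stand-in that uses PCRE's /u modifier to test whether the bytes are valid UTF-8, and latin1_bytes_to_utf8 and to_utf8_heuristic are likewise made-up names. The assumption that anything which is not valid UTF-8 is ISO-8859-1 is exactly that, an assumption.

```php
<?php
// Hypothetical helpers, not WordPress core code.

// Stand-in for a seems_utf8()-style check: preg_match() with the /u modifier
// returns 1 for valid UTF-8 subjects and false for invalid ones.
function looks_like_utf8( string $bytes ): bool {
    return 1 === preg_match( '//u', $bytes );
}

// Re-encode ISO-8859-1 bytes as UTF-8 (what utf8_encode() does).
function latin1_bytes_to_utf8( string $bytes ): string {
    $out = '';
    $len = strlen( $bytes );
    for ( $i = 0; $i < $len; $i++ ) {
        $b = ord( $bytes[ $i ] );
        $out .= $b < 0x80
            ? chr( $b )
            : chr( 0xC0 | ( $b >> 6 ) ) . chr( 0x80 | ( $b & 0x3F ) );
    }
    return $out;
}

// Leave valid UTF-8 untouched; treat everything else as Latin-1 (an assumption).
function to_utf8_heuristic( string $bytes ): string {
    return looks_like_utf8( $bytes ) ? $bytes : latin1_bytes_to_utf8( $bytes );
}

// Usage: both inputs spell the same description, in UTF-8 and in Latin-1 bytes.
echo to_utf8_heuristic( "Descripci\xC3\xB3n" ), "\n";
echo to_utf8_heuristic( "Descripci\xF3n" ), "\n";
```

The trade-off is that this is a heuristic: it needs no marker, but a Latin-1 string whose bytes happen to form valid UTF-8 would be left unconverted. Checking the 1#090 record first and falling back to the content check only when it is absent would combine both signals.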


#4 @SergeyBiryukov
13 years ago

  • Keywords needs-patch removed
  • Milestone Awaiting Review deleted
  • Resolution set to duplicate
  • Status changed from new to closed