Make WordPress Core

Opened 10 years ago

Closed 6 years ago

#33787 closed defect (bug) (worksforme)

Euro hex code being converted to smiley url

Reported by: UmeshSingla Owned by:
Milestone: Priority: normal
Severity: normal Version: 4.2
Component: Emoji Keywords: has-patch needs-unit-tests
Focuses: Cc:

Description

While using a hex code for the euro symbol "€" in an HTML email, the wp_staticize_emoji_for_email filter tries to convert it into a smiley URL that doesn't exist, so the symbol appears broken in the email.

Screenshot of email, Euro symbol being converted to false emoji url

To test this out, here is a small plugin.

Updating the regex in /wp-includes/formatting.php (https://core.trac.wordpress.org/browser/trunk/src/wp-includes/formatting.php#L4530) fixes the issue.
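The failure mode can be sketched with a deliberately over-broad pattern. Both patterns below are hypothetical and written only for illustration; neither is copied from formatting.php or from the attached patch, and the three code points in the narrow pattern are an arbitrary sample.

```php
<?php
// Hypothetical over-broad pattern: any four-digit hex entity starting with 2 or 3.
// Illustration only, NOT the pattern used in formatting.php.
$broad_pattern = '/&#x[23][0-9a-f]{3};/i';

// Hypothetical narrower pattern that lists specific emoji code points only.
$narrow_pattern = '/&#x(?:263a|2764|2b50);/i';

$euro  = 'Price: &#x20AC;10';
$heart = 'I &#x2764; WordPress';

var_dump( preg_match( $broad_pattern, $euro ) );   // int(1): the euro entity is wrongly treated as emoji
var_dump( preg_match( $narrow_pattern, $euro ) );  // int(0): the euro entity is left alone
var_dump( preg_match( $narrow_pattern, $heart ) ); // int(1): a listed emoji still matches
```

Under these assumptions, any pattern that matches a whole range of entities rather than known emoji code points will also catch ordinary symbols such as the euro sign.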

Attachments (4)

33787.diff (457 bytes) - added by UmeshSingla 10 years ago.
Updated regex to match the emoji hex code
euro-to-emoji url.png (98.2 KB) - added by UmeshSingla 10 years ago.
Screenshot of email, Euro symbol being converted to false emoji url
wp-test-email.php (1.2 KB) - added by UmeshSingla 10 years ago.
Plugin to test the issue
Euro sign converted to Emoji URL ‹ DilliBoss Site — WordPress 2015-09-09 15-13-26.png (45.5 KB) - added by UmeshSingla 10 years ago.
Using the plugin to send test email


Change History (9)

@UmeshSingla
10 years ago

Updated regex to match the emoji hex code

@UmeshSingla
10 years ago

Screenshot of email, Euro symbol being converted to false emoji url

@UmeshSingla
10 years ago

Plugin to test the issue

#1 @johnbillion
10 years ago

  • Component changed from Mail to Formatting
  • Keywords has-patch needs-unit-tests added
  • Milestone changed from Awaiting Review to Future Release
  • Summary changed from Euro hex code being converted to smiley url in html email to Euro hex code being converted to smiley url
  • Version changed from 4.3 to 4.2

Thanks for the patch, Umesh!

This affects any text that gets passed through wp_staticize_emoji(), including HTML emails and post content in RSS feeds.

The wp_staticize_emoji() function needs unit tests.

#2 @samuelsidler
10 years ago

I'm sure @pento is going to enjoy this.

#3 @pento
10 years ago

Good times.

Ideally, we should be using the same regex as in twemoji.js, but PHP only supports the \u style code points from PHP 7.

Writing something to convert the regex to \x notation isn't too tricky, but there aren't any options for keeping them in sync that excite me.

@azaozz - Do you have thoughts on syncing methods?

#4 @azaozz
10 years ago

Ideally, we should be using the same regex as in twemoji.js

You mean this one: https://github.com/twitter/twemoji/blob/gh-pages/twemoji.js#L236? It is generated from http://www.unicode.org/Public/UNIDATA/EmojiSources.txt. Since we are matching (hex) HTML entities as strings, we can probably just take the first column (as an array) and do something like:

'/&#x(' . implode( '|', $codepoints ) . ');/i';

However, that will generate a pretty long regexp. I'm not sure whether PCRE (all the versions used in different PHP versions) is optimized to handle it. We had problems with that in Chrome, which were fixed by optimizing the actual regexp (by the browser) before running it.
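A minimal sketch of the list-based approach described above, assuming a hard-coded sample of four code points in place of the full list. The `$codepoints` values and the `[emoji:...]` placeholder are hypothetical; a real implementation would load the full list and substitute an image URL.

```php
<?php
// Hypothetical sample; a real list would be much longer.
$codepoints = array( '1f600', '1f602', '2764', '263a' );

// Escape each entry defensively, then join them into a single alternation.
$escaped = array_map(
	function ( $cp ) {
		return preg_quote( $cp, '/' );
	},
	$codepoints
);
$pattern = '/&#x(' . implode( '|', $escaped ) . ');/i';

$text = 'Total: &#x20AC;5 &#x1F600;';

// Replace only listed code points; the marker stands in for an image URL.
$result = preg_replace_callback(
	$pattern,
	function ( $m ) {
		return '[emoji:' . strtolower( $m[1] ) . ']';
	},
	$text
);

echo $result, "\n"; // Total: &#x20AC;5 [emoji:1f600]
```

Because the euro entity is not in the list, it passes through unchanged. The sample is far too small to say anything about the performance concern raised for a full-length alternation.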


#5 @pento
6 years ago

  • Component changed from Formatting to Emoji
  • Resolution set to worksforme
  • Status changed from new to closed

wp_staticize_emoji_for_email() was fixed ages ago; it now correctly replaces based on the list of emoji that Twemoji supports.

I've confirmed that the original bug no longer occurs.
