WordPress.org

Make WordPress Core

Opened 3 years ago

Last modified 3 years ago

#33787 new defect (bug)

Euro hex code being converted to smiley url

Reported by: UmeshSingla
Owned by:
Milestone: Future Release
Priority: normal
Severity: normal
Version: 4.2
Component: Formatting
Keywords: has-patch needs-unit-tests
Focuses:
Cc:

Description

When an HTML email contains the hex code for the euro symbol "€" (U+20AC), the wp_staticize_emoji_for_email filter tries to convert it into a smiley URL that doesn't exist, so the symbol appears broken in the email.

[Screenshot of the email: euro symbol converted to a false emoji URL]

To test this out, here is a small plugin.

Updating the regex in /wp-includes/formatting.php (https://core.trac.wordpress.org/browser/trunk/src/wp-includes/formatting.php#L4530) fixes the issue.
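
To illustrate the failure mode, here is a minimal sketch. The pattern below is a hypothetical range-based pattern chosen for illustration, not necessarily the one in formatting.php; it only shows how a hex range wide enough to catch symbol emoji can also catch the euro entity.

```php
<?php
// Hypothetical, overly broad pattern: any 4-digit hex entity starting with 2 or 3.
// This is an assumption for illustration, not the regex from WordPress core.
$broad_pattern = '/&#x[23][0-9a-f]{3};/i';

// Sample email body: a euro sign (U+20AC) and a heavy black heart (U+2764).
$email_body = 'Total: &#x20ac;25 &#x2764;';

preg_match_all( $broad_pattern, $email_body, $matches );

// Both entities match, so a replacement step keyed on this pattern would
// rewrite the euro sign into an image URL as well.
print_r( $matches[0] );
```

A fix along the lines of the attached patch would have to narrow the pattern so that non-emoji code points such as U+20AC fall outside it.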

Attachments (4)

33787.diff (457 bytes) - added by UmeshSingla 3 years ago.
Updated regex to match the emoji hex code
euro-to-emoji url.png (98.2 KB) - added by UmeshSingla 3 years ago.
Screenshot of email, Euro symbol being converted to false emoji url
wp-test-email.php (1.2 KB) - added by UmeshSingla 3 years ago.
Plugin to test the issue
Euro sign converted to Emoji URL ‹ DilliBoss Site — WordPress 2015-09-09 15-13-26.png (45.5 KB) - added by UmeshSingla 3 years ago.
Using the plugin to send test email

Change History (8)

@UmeshSingla
3 years ago

Updated regex to match the emoji hex code

@UmeshSingla
3 years ago

Screenshot of email, Euro symbol being converted to false emoji url

@UmeshSingla
3 years ago

Plugin to test the issue

#1 @johnbillion
3 years ago

  • Component changed from Mail to Formatting
  • Keywords has-patch needs-unit-tests added
  • Milestone changed from Awaiting Review to Future Release
  • Summary changed from "Euro hex code being converted to smiley url in html email" to "Euro hex code being converted to smiley url"
  • Version changed from 4.3 to 4.2

Thanks for the patch, Umesh!

This affects any text that gets passed through wp_staticize_emoji(), including HTML emails and post content in RSS feeds.

The wp_staticize_emoji() function needs unit tests.

#2 @samuelsidler
3 years ago

I'm sure @pento is going to enjoy this.

#3 @pento
3 years ago

Good times.

Ideally, we should be using the same regex as in twemoji.js, but PHP has only supported \u-style code points since PHP 7.

Writing something to convert the regex to \x notation isn't too tricky, but there aren't any options for keeping them in sync that excite me.

@azaozz - Do you have thoughts on syncing methods?
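
As a rough idea of what such a conversion could look like, here is a minimal sketch. The helper name js_unicode_escapes_to_pcre() and the sample input are hypothetical, and the sketch only covers plain \uXXXX escapes in the Basic Multilingual Plane. JavaScript regexes written with UTF-16 surrogate pairs would need extra handling, because PCRE in UTF-8 mode does not accept surrogate code points.

```php
<?php
/**
 * Convert JavaScript-style \uXXXX escapes to PCRE \x{XXXX} notation.
 * Hypothetical helper for illustration; not part of WordPress or twemoji.
 * Surrogate pairs are NOT handled here.
 */
function js_unicode_escapes_to_pcre( $js_regex_source ) {
	return preg_replace_callback(
		'/\\\\u([0-9a-fA-F]{4})/',
		function ( $m ) {
			return '\\x{' . strtolower( $m[1] ) . '}';
		},
		$js_regex_source
	);
}

// Made-up sample in the style of a JavaScript character class (BMP only).
$js_source = '[\u2600-\u26FF]|\u00A9';
$pcre_body = js_unicode_escapes_to_pcre( $js_source );
echo $pcre_body, "\n"; // [\x{2600}-\x{26ff}]|\x{00a9}

// The /u modifier makes PCRE treat pattern and subject as UTF-8.
$pattern = '/' . $pcre_body . '/u';
$sun     = "\xE2\x98\x80"; // U+2600 BLACK SUN WITH RAYS
$euro    = "\xE2\x82\xAC"; // U+20AC EURO SIGN
var_dump( preg_match( $pattern, $sun ) );  // int(1)
var_dump( preg_match( $pattern, $euro ) ); // int(0)
```

The conversion itself is mechanical; the open question of how to keep the PHP copy in sync with the JavaScript source is not addressed by this sketch.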

#4 @azaozz
3 years ago

> Ideally, we should be using the same regex as in twemoji.js

You mean this one: https://github.com/twitter/twemoji/blob/gh-pages/twemoji.js#L236? It is generated from http://www.unicode.org/Public/UNIDATA/EmojiSources.txt. Since we are matching (hex) HTML entities as strings, we can probably just take the first column (as an array) and do something like:

'/&#x(' . implode( '|', $codepoints ) . ');/i';

However, that will generate a pretty long regexp. I'm not sure whether PCRE (in all the versions bundled with different PHP versions) is optimized to handle it. We had problems with that in Chrome, which were fixed by optimizing the actual regexp (by the browser) before running it.
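
A minimal, self-contained sketch of the alternation approach described in this comment follows. The $codepoints array is a tiny hypothetical sample; a real list would be derived from the Unicode data file and be far longer, which is where the pattern-length concern comes from.

```php
<?php
// Hypothetical sample of emoji code points (lowercase hex, no prefix).
// Assumption for illustration only; not the list WordPress would ship.
$codepoints = array( '1f600', '1f4a9', '2764', '263a' );

// Quote each entry defensively, then build a single alternation.
// The /i flag lets the pattern match both &#x1f600; and &#x1F600;.
$pattern = '/&#x(' . implode( '|', array_map( 'preg_quote', $codepoints ) ) . ');/i';

$text = 'Price: &#x20ac;10 &#x1F600;';

preg_match_all( $pattern, $text, $matches );

// Only the listed emoji entity matches; the euro entity (20ac) is left alone.
print_r( $matches[0] );

// The pattern grows with the size of the list.
echo strlen( $pattern ), "\n";
```

Because only listed code points can match, entities such as the euro sign are never touched, at the cost of a pattern whose length scales with the number of supported emoji.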

Last edited 3 years ago by azaozz