Emoji: Port the Twemoji regex to PHP.
Previously, wp_encode_emoji() and wp_staticize_emoji() used inaccurate regular expressions to find emoji and transform them into HTML entities or <img> tags, respectively. This would result in emoji not being correctly transformed or, occasionally, non-emoji being incorrectly transformed.
This commit adds a new grunt task, grunt precommit:emoji. It finds the regex in twemoji.js, transforms it into a PHP-friendly version, and adds it to formatting.php. This task is also run automatically by grunt precommit when it detects that twemoji.js has changed.
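The transformation step can be illustrated with a minimal sketch. This is not the code of the grunt task itself: the helper name toPcreSource() is hypothetical, and the sketch assumes the JavaScript regex source uses \uXXXX escapes that need to become PCRE-style \x{...} escapes. It only handles literal escapes and adjacent surrogate pairs; surrogates inside character-class ranges would need extra handling.

```javascript
// Hypothetical helper (not part of the commit): convert JS-style \uXXXX
// escapes in a regex source string into PCRE-style \x{...} escapes.
function toPcreSource( jsSource ) {
	return jsSource
		// Combine an adjacent high/low surrogate pair into a single code point,
		// since PCRE in UTF-8 mode matches code points, not UTF-16 code units.
		.replace( /\\u(d[89ab][0-9a-f]{2})\\u(d[c-f][0-9a-f]{2})/gi, function ( match, high, low ) {
			var codePoint = ( parseInt( high, 16 ) - 0xd800 ) * 0x400 +
				( parseInt( low, 16 ) - 0xdc00 ) + 0x10000;
			return '\\x{' + codePoint.toString( 16 ) + '}';
		} )
		// Convert the remaining Basic Multilingual Plane escapes directly.
		.replace( /\\u([0-9a-f]{4})/gi, '\\x{$1}' );
}

// Example inputs are regex *source* text, so backslashes are doubled here.
console.log( toPcreSource( '\\ud83c\\udde6' ) );  // \x{1f1e6}
console.log( toPcreSource( '\\u2764\\ufe0f' ) );  // \x{2764}\x{fe0f}
```

On the PHP side, a pattern built from such output would presumably be wrapped in delimiters and used with the u modifier so that \x{...} escapes are treated as Unicode code points.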
The new regex requires features introduced in PCRE 8.32, which was bundled with PHP 5.4.14 and was also backported to later releases of the PHP 5.3 series. On versions of PHP that don't support this, it will fall back to an updated version of the loose-matching regex.
For short posts, the performance difference between the old and new regex is negligible. As posts get longer, however, the new regex is exponentially faster.
Fixes #35293.