Make WordPress Core

Opened 6 months ago

Last modified 6 months ago

#59691 new defect (bug)

WordPress doesn't sanitize character ʼ (unicode U+02BC) when converting post title to slug

Reported by: ivanzhuck
Owned by:
Milestone: Awaiting Review
Priority: normal
Severity: normal
Version: 1.0
Component: Formatting
Keywords: has-patch
Focuses:
Cc:

Description

WordPress doesn't sanitize the character ʼ (Unicode U+02BC, MODIFIER LETTER APOSTROPHE) when converting a post title to a slug.

How to reproduce:

  1. Create a post with the title "Phʼnglui mglwʼnafh Cthulhu Rʼlyeh wgahʼnagl fhtagn"
  2. Publish the post
  3. The post gets the slug "phʼnglui-mglwʼnafh-cthulhu-rʼlyeh-wgahʼnagl-fhtagn" instead of the expected "phnglui-mglwnafh-cthulhu-rlyeh-wgahnagl-fhtagn"
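
To confirm which character the title in step 1 actually contains, a quick check outside WordPress can help. The following is a minimal Python sketch, not WordPress code; the helper name `non_ascii_report` is hypothetical. It lists each distinct non-ASCII character in the title with its code point, Unicode name, and general category.

```python
import unicodedata

# Title from step 1, with the apostrophe-like character written as an escape.
TITLE = "Ph\u02bcnglui mglw\u02bcnafh Cthulhu R\u02bclyeh wgah\u02bcnagl fhtagn"


def non_ascii_report(text):
    """Describe each distinct non-ASCII character found in text.

    Hypothetical helper for illustration; returns a list of
    (character, 'U+XXXX', Unicode name, general category) tuples.
    """
    seen = []
    for ch in text:
        if ord(ch) > 127 and ch not in seen:
            seen.append(ch)
    return [
        (ch, f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.category(ch))
        for ch in seen
    ]


report = non_ascii_report(TITLE)
print(report)
# [('ʼ', 'U+02BC', 'MODIFIER LETTER APOSTROPHE', 'Lm')]
```

The category `Lm` (modifier letter) means Unicode classifies this character as a letter rather than punctuation, which may be part of why it is easy to overlook in a punctuation-stripping list.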

Attachments (1)

59691.diff (598 bytes) - added by ivanzhuck 6 months ago.


Change History (4)

#1 @swissspidy
6 months ago

  • Component changed from General to Formatting
  • Version changed from 6.3.2 to 1.0

#2 @ashikur698
6 months ago

I tried to reproduce the issue. Screenshots of the results:
SC 1 - https://prnt.sc/Qc-iAbv4uiOA
SC 2 - https://prnt.sc/I64NUhsf-zO1

In SC 1, I copied the exact title from the ticket and successfully reproduced the issue.

However, when I tried to reproduce the issue using the same character, as shown in SC 2, WordPress sanitized the character just fine.

Maybe something specific to the exact text "Phʼnglui mglwʼnafh Cthulhu Rʼlyeh wgahʼnagl fhtagn" triggers the problem.
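
One possible explanation, which is an assumption and not confirmed anywhere in this ticket, is that the character entered in SC 2 was a look-alike apostrophe with a different code point. The Python sketch below (not WordPress code; `codepoint_differences` is a hypothetical helper, and U+2019 is only a guess at what a keyboard or editor might have produced) shows how to compare two visually similar strings code point by code point.

```python
def codepoint_differences(a, b):
    """Return (index, 'U+XXXX' in a, 'U+XXXX' in b) for each position where a and b differ.

    Hypothetical helper for illustration; both strings must have the same length.
    """
    if len(a) != len(b):
        raise ValueError("strings must have the same length")
    return [
        (i, f"U+{ord(x):04X}", f"U+{ord(y):04X}")
        for i, (x, y) in enumerate(zip(a, b))
        if x != y
    ]


from_ticket = "R\u02bclyeh"  # copied from the ticket: U+02BC
typed_guess = "R\u2019lyeh"  # assumption: a hand-typed right single quotation mark

diffs = codepoint_differences(from_ticket, typed_guess)
print(diffs)
# [(1, 'U+02BC', 'U+2019')]
```

Running a check like this on the titles used in SC 1 and SC 2 would show whether both really contain U+02BC.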


#3 This ticket was mentioned in PR #5541 on WordPress/wordpress-develop by @ivanzhuck.
6 months ago

  • Keywords has-patch added

Added apostrophe-like characters to the list of symbols that are stripped from the post URL slug.

The patch replaces the following characters with an empty string:

ʼ (U+02BC);
ˮ (U+02EE);
՚ (U+055A);
ߴ (U+07F4);
ߵ (U+07F5);
＇ (U+FF07).
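
The effect of the change described above can be sketched in Python. This is an illustration only, not the contents of 59691.diff or of WordPress's PHP code. The names `strip_apostrophe_like` and `to_slug` are hypothetical, and `to_slug` is a much-simplified stand-in for real slug generation, assuming only lowercasing and hyphen-joining of words.

```python
# The six code points listed in this comment, written as escapes.
APOSTROPHE_LIKE = "\u02bc\u02ee\u055a\u07f4\u07f5\uff07"

# Translation table that deletes every character in APOSTROPHE_LIKE.
STRIP_TABLE = str.maketrans("", "", APOSTROPHE_LIKE)


def strip_apostrophe_like(text):
    """Remove the apostrophe-like characters, i.e. replace them with an empty string."""
    return text.translate(STRIP_TABLE)


def to_slug(title):
    """Hypothetical, simplified slug builder: strip, lowercase, join words with hyphens."""
    return "-".join(strip_apostrophe_like(title).lower().split())


TITLE = "Ph\u02bcnglui mglw\u02bcnafh Cthulhu R\u02bclyeh wgah\u02bcnagl fhtagn"
slug = to_slug(TITLE)
print(slug)
# phnglui-mglwnafh-cthulhu-rlyeh-wgahnagl-fhtagn
```

With the characters removed before the slug is built, the result matches the slug the reporter expected in the reproduction steps.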

@ivanzhuck
6 months ago
