Make WordPress Core

Opened 11 months ago

Last modified 11 months ago

#59691 new defect (bug)

WordPress doesn't sanitize character ʼ (unicode U+02BC) when converting post title to slug

Reported by: ivanzhuck Owned by:
Milestone: Awaiting Review Priority: normal
Severity: normal Version: 1.0
Component: Formatting Keywords: has-patch
Focuses: Cc:

Description

WordPress does not strip the character ʼ (Unicode U+02BC, MODIFIER LETTER APOSTROPHE) when converting a post title to a slug, so the character ends up in the generated slug.

How to reproduce:

  1. Create a post with the title "Phʼnglui mglwʼnafh Cthulhu Rʼlyeh wgahʼnagl fhtagn"
  2. Publish the post
  3. The post gets the slug "phʼnglui-mglwʼnafh-cthulhu-rʼlyeh-wgahʼnagl-fhtagn" instead of the expected "phnglui-mglwnafh-cthulhu-rlyeh-wgahnagl-fhtagn"
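
The reported behaviour can be sketched with a toy slugifier. This is a Python illustration only, not WordPress's PHP implementation: `toy_slug` is a hypothetical helper, and the assumption that non-ASCII characters pass through unless explicitly listed is a simplification made for this sketch.

```python
import re

# Title from the reproduction steps, with U+02BC written as an escape.
TITLE = "Ph\u02bcnglui mglw\u02bcnafh Cthulhu R\u02bclyeh wgah\u02bcnagl fhtagn"


def toy_slug(title: str) -> str:
    """Hypothetical, simplified slugifier (not WordPress's implementation)."""
    slug = title.lower()
    # Drop ASCII characters other than letters, digits, spaces, underscores
    # and hyphens. Non-ASCII characters are deliberately kept (assumption).
    slug = re.sub(r"[^a-z0-9 _\-\u0080-\U0010ffff]", "", slug)
    # Collapse runs of whitespace into single hyphens.
    slug = re.sub(r"\s+", "-", slug.strip())
    return slug


# U+02BC is not ASCII, so it survives the filter.
actual = toy_slug(TITLE)
# The same title typed with the ASCII apostrophe U+0027 is cleaned up.
ascii_variant = toy_slug(TITLE.replace("\u02bc", "'"))

print(actual)
print(ascii_variant)
```

Under these assumptions, only the variant typed with the ASCII apostrophe yields the slug the reporter expected.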

Attachments (1)

59691.diff (598 bytes) - added by ivanzhuck 11 months ago.


Change History (4)

#1 @swissspidy
11 months ago

  • Component changed from General to Formatting
  • Version changed from 6.3.2 to 1.0

#2 @ashikur698
11 months ago

I tried to reproduce the issue. The results are in the screenshots below:
SC 1 - https://prnt.sc/Qc-iAbv4uiOA
SC 2 - https://prnt.sc/I64NUhsf-zO1

In SC 1, I copied the exact title from the ticket and successfully reproduced the issue.

However, when I tried to recreate the issue using the same character, as in SC 2, WordPress seemed to sanitize it just fine.

Maybe there is something specific to that exact text, "Phʼnglui mglwʼnafh Cthulhu Rʼlyeh wgahʼnagl fhtagn".
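
One possible explanation, which the ticket does not confirm, is that the character typed in SC 2 was a visually similar but different code point. A small Python sketch for checking which code point a pasted or typed character really is (`describe` and `LOOKALIKES` are hypothetical names chosen for this example):

```python
import unicodedata

# Three apostrophe look-alikes: ASCII apostrophe, right single quotation
# mark, and the modifier letter apostrophe named in this ticket.
LOOKALIKES = ["'", "\u2019", "\u02bc"]


def describe(ch: str) -> str:
    """Return the code point and official Unicode name of one character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


descriptions = [describe(ch) for ch in LOOKALIKES]
for line in descriptions:
    print(line)
```

Running the same check on the character from each screenshot would show whether SC 1 and SC 2 actually used the same code point.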


#3 This ticket was mentioned in PR #5541 on WordPress/wordpress-develop by @ivanzhuck.
11 months ago

  • Keywords has-patch added

Added apostrophe-like character codes to the list of symbols that are stripped from the post URL slug.

It replaces the following characters:

ʼ (U+02BC);
ˮ (U+02EE);
՚ (U+055A);
ߴ (U+07F4);
ߵ (U+07F5);
＇ (U+FF07)

with an empty string.
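
The effect of stripping these six code points can be sketched in Python. This is an illustration only: the actual change is in the attached patch and PR, which are not reproduced here, and `APOSTROPHE_LIKE` and `strip_apostrophe_like` are hypothetical names.

```python
# The six code points listed above, written as escapes to avoid ambiguity.
APOSTROPHE_LIKE = "\u02bc\u02ee\u055a\u07f4\u07f5\uff07"

# Mapping a code point to None makes str.translate delete it.
STRIP_TABLE = {ord(ch): None for ch in APOSTROPHE_LIKE}


def strip_apostrophe_like(text: str) -> str:
    """Remove the listed apostrophe-like characters from text."""
    return text.translate(STRIP_TABLE)


# Slug reported in the ticket, before stripping.
slug = "ph\u02bcnglui-mglw\u02bcnafh-cthulhu-r\u02bclyeh-wgah\u02bcnagl-fhtagn"
cleaned = strip_apostrophe_like(slug)
print(cleaned)
```

Applied to the slug from the reproduction steps, this yields the slug the reporter expected.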

@ivanzhuck
11 months ago
