WordPress.org

Make WordPress Core

Opened 8 weeks ago

Last modified 8 weeks ago

#49791 reviewing defect (bug)

sanitize title should filter out the bullet • punctuation mark

Reported by: veromary
Owned by: SergeyBiryukov
Milestone: 5.5
Priority: normal
Severity: normal
Version: 5.4
Component: Formatting
Keywords: has-patch needs-unit-tests
Focuses:
Cc:

Description

Some people use bullet points in their titles as decorative punctuation.

"Fancy Title • Amazing"

This bullet point is passed into the page slug, which breaks some applications (such as Facebook links):

fancy-title-•-amazing

which would be better rendered as

fancy-title-amazing

In formatting.php, the bullet character should be added to the list of special_chars that are filtered out.
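
To illustrate the requested behaviour, here is a minimal Python sketch, not the actual PHP code in formatting.php. It assumes the title is percent-encoded before the filter list is applied, so the bullets are matched as encoded UTF-8 byte sequences. The names `slugify` and `BULLET_SEQUENCES` are hypothetical, and the list holds only the three characters named in the first patch.

```python
import re
from urllib.parse import quote

# Percent-encoded UTF-8 forms of the characters to strip (illustrative list).
BULLET_SEQUENCES = (
    "%e2%80%a2",  # U+2022 BULLET
    "%e2%97%a6",  # U+25E6 WHITE BULLET
    "%e2%97%98",  # U+25D8 INVERSE BULLET
)


def slugify(title: str) -> str:
    """Rough stand-in for a title-to-slug routine (assumed behaviour)."""
    # Percent-encode non-ASCII characters, keep spaces, lowercase the hex digits.
    slug = quote(title.lower(), safe=" ").lower()
    # Drop the encoded bullet characters entirely.
    for seq in BULLET_SEQUENCES:
        slug = slug.replace(seq, "")
    # Remove anything that is not a letter, digit, space, %, underscore or hyphen.
    slug = re.sub(r"[^%a-z0-9 _-]", "", slug)
    # Turn runs of whitespace into one hyphen, then collapse repeated hyphens.
    slug = re.sub(r"\s+", "-", slug)
    slug = re.sub(r"-+", "-", slug)
    return slug.strip("-")


print(slugify("Fancy Title • Amazing"))  # fancy-title-amazing
```

Without the replacement loop, the same input would keep the encoded bullet and produce `fancy-title-%e2%80%a2-amazing`.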

Attachments (2)

49791.diff (433 bytes) - added by roytanck 8 weeks ago.
Adds the bullet, white bullet and inverse bullet characters to the filter list in sanitize_title_with_dashes().
49791-2.diff (489 bytes) - added by roytanck 8 weeks ago.
Removes all bullet characters that are categorized as punctuation or geometric shapes.


Change History (6)

@roytanck
8 weeks ago

Adds the bullet, white bullet and inverse bullet characters to the filter list in sanitize_title_with_dashes().

#1 @roytanck
8 weeks ago

  • Keywords has-patch added

Hello @veromary. Thank you for submitting this report. I've created a patch that adds the bullet characters to the filter list. I agree that these should not be present in permalinks.

#2 @veromary
8 weeks ago

Thanks @roytanck!
Hope that makes it into the next release.

#3 @SergeyBiryukov
8 weeks ago

  • Keywords needs-unit-tests added
  • Milestone changed from Awaiting Review to 5.5
  • Owner set to SergeyBiryukov
  • Status changed from new to reviewing

@roytanck
8 weeks ago

Removes all bullet characters that are categorized as punctuation or geometric shapes.

#4 @roytanck
8 weeks ago

After looking into Unicode punctuation a bit more, I added a second patch that removes four more characters. All seven are bullets categorized as punctuation or geometric shapes.

There are also a number of bullet characters that are mathematical operators/symbols, and some in other categories. I did not include these, but can add them if we decide so.
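
The category split described above can be checked with a short Python sketch using the standard `unicodedata` module. The first three characters are the ones named in the first patch. U+2219 BULLET OPERATOR is included only as an assumed example of a mathematical-operator bullet, and the helper name `describe` is hypothetical. Python reports the Unicode general category, so bullets from the Geometric Shapes block show up as "So" (Symbol, other) rather than "Po" (Punctuation, other).

```python
import unicodedata

# Three characters from the first patch, plus one math-operator bullet
# chosen here purely for comparison (an assumption, not from the patches).
CANDIDATES = ["\u2022", "\u25e6", "\u25d8", "\u2219"]


def describe(ch: str) -> tuple:
    """Return (code point, Unicode name, general category) for one character."""
    return (f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.category(ch))


for ch in CANDIDATES:
    print(*describe(ch))
# U+2022 BULLET Po
# U+25E6 WHITE BULLET So
# U+25D8 INVERSE BULLET So
# U+2219 BULLET OPERATOR Sm
```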

In general, there are probably thousands of characters in the Unicode standard that are "not ideal" to include in slugs/URLs. Is there a standard approach to dealing with this in core?

Last edited 8 weeks ago by roytanck