Opened 5 years ago

Closed 4 years ago

#49791 closed defect (bug) (fixed)

sanitize title should filter out the bullet • punctuation mark

Reported by: veromary
Owned by: SergeyBiryukov
Milestone: 5.5
Priority: normal
Severity: normal
Version: 5.4
Component: Formatting
Keywords: has-patch has-unit-tests
Focuses:
Cc:

Description

Some people use bullet points in their titles as decorative punctuation.

"Fancy Title • Amazing"

This bullet point is passed through into the page slug, which breaks some applications (such as Facebook links):

fancy-title-•-amazing

which would be better rendered as

fancy-title-amazing

In formatting.php this bullet character should be added to the list of special_chars to filter out.
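
As an illustration of the requested behavior, here is a minimal Python sketch. It is not the WordPress implementation: the function name `slugify` and the `STRIP_CHARS` set are hypothetical, and the real `sanitize_title_with_dashes()` in formatting.php does considerably more. The sketch only shows how removing the bullet before hyphenating whitespace yields the desired slug.

```python
import re

# Hypothetical strip list for this sketch; only U+2022 BULLET comes from the ticket.
STRIP_CHARS = {"\u2022"}


def slugify(title, strip_chars=STRIP_CHARS):
    """Lowercase the title, drop listed characters, and hyphenate whitespace."""
    slug = title.lower()
    # Remove every character that appears in the strip list.
    slug = "".join(ch for ch in slug if ch not in strip_chars)
    # Turn each run of whitespace into a single hyphen.
    slug = re.sub(r"\s+", "-", slug)
    # Collapse repeated hyphens and trim them from both ends.
    slug = re.sub(r"-+", "-", slug)
    return slug.strip("-")


print(slugify("Fancy Title • Amazing"))                     # fancy-title-amazing
print(slugify("Fancy Title • Amazing", strip_chars=set()))  # fancy-title-•-amazing
```

With an empty strip list the bullet survives into the slug, which reproduces the problem reported above.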

Attachments (3)

49791.diff (433 bytes) - added by roytanck 5 years ago.
Adds the bullet, white bullet and inverse bullet characters to the filter list in sanitize_title_with_dashes().
49791-2.diff (489 bytes) - added by roytanck 5 years ago.
Removes all bullet characters that are categorized as punctuation or geometric shapes.
49791.2.diff (1.9 KB) - added by deepaklalwani 4 years ago.
Adds unit test cases

Change History (10)

@roytanck
5 years ago

Adds the bullet, white bullet and inverse bullet characters to the filter list in sanitize_title_with_dashes().

#1 @roytanck
5 years ago

  • Keywords has-patch added

Hello @veromary. Thank you for submitting this report. I've created a patch that adds the bullet characters to the filter list. I agree that these should not be present in permalinks.

#2 @veromary
5 years ago

Thanks @roytanck !
Hope that makes it into the next release.

#3 @SergeyBiryukov
5 years ago

  • Keywords needs-unit-tests added
  • Milestone changed from Awaiting Review to 5.5
  • Owner set to SergeyBiryukov
  • Status changed from new to reviewing

@roytanck
5 years ago

Removes all bullet characters that are categorized as punctuation or geometric shapes.

#4 follow-up: @roytanck
5 years ago

After looking into Unicode punctuation a bit more I added a second patch that removes four more characters. All seven are bullets categorized as punctuation or geometric shapes.

There are also a number of bullet characters that are mathematical operators/symbols, and some in other categories. I did not include these, but can add them if we decide to.

In general, there are probably thousands of characters in the Unicode standard that are "not ideal" to include in slugs/URLs. Is there a standard approach to dealing with this in core?
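
To make the categories discussed above concrete, here is a small Python sketch that looks up the Unicode name and general category of a few bullet-like characters. It is not part of either patch. The first three characters are the ones named in the first patch; U+2219 BULLET OPERATOR is added here only as an assumed example of a bullet classified as a mathematical symbol. The helper name `describe` is hypothetical. Note that "geometric shapes" is a Unicode block rather than a general category, so those characters report the general category `So` (Symbol, other).

```python
import unicodedata

# Characters named in the first patch, plus one assumed math-operator example.
CANDIDATES = ["\u2022", "\u25e6", "\u25d8", "\u2219"]


def describe(ch):
    """Return (code point, Unicode name, general category) for one character."""
    return (f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.category(ch))


for ch in CANDIDATES:
    code_point, name, category = describe(ch)
    print(code_point, name, category)
```

A lookup like this is one way to decide whether a candidate character counts as punctuation (`P*` categories) or as a symbol (`S*` categories) before adding it to the filter list.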

Last edited 5 years ago by roytanck

@deepaklalwani
4 years ago

Adds unit test cases

#5 @deepaklalwani
4 years ago

  • Keywords has-unit-tests added; needs-unit-tests removed

#6 in reply to: ↑ 4 @SergeyBiryukov
4 years ago

Replying to roytanck:

In general, there are probably thousands of characters in the Unicode standard that are "not ideal" to include in slugs/URLs. Is there a standard approach to dealing with this in core?

The list in sanitize_title_with_dashes() includes some commonly used characters and is expanded on a case-by-case basis by specific requests. Since the ticket was initially about just one bullet character, let's go with that for now and expand further if there are any more requests.

#7 @SergeyBiryukov
4 years ago

  • Resolution set to fixed
  • Status changed from reviewing to closed

In 48593:

Formatting: Filter out the bullet character in sanitize_title_with_dashes().

Props roytanck, deepaklalwani, veromary.
Fixes #49791.
