Opened 5 years ago
Closed 4 years ago
#49791 closed defect (bug) (fixed)
sanitize title should filter out the bullet • punctuation mark
Reported by: | veromary | Owned by: | SergeyBiryukov |
---|---|---|---|
Milestone: | 5.5 | Priority: | normal |
Severity: | normal | Version: | 5.4 |
Component: | Formatting | Keywords: | has-patch has-unit-tests |
Focuses: | | Cc: | |
Description
Some people use bullet points in their titles as decorative punctuation.
"Fancy Title • Amazing"
This bullet point is passed into the page slug, which breaks some applications (such as Facebook links):
fancy-title-•-amazing
which would be better rendered as
fancy-title-amazing
In formatting.php, this bullet character should be added to the special_chars list so that it is filtered out.
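To make the requested behavior concrete, here is a minimal Python sketch of the idea. It is not the WordPress implementation: `STRIP_CHARS` and `slugify` are hypothetical names used only for illustration, and the real filter list lives in PHP in formatting.php.

```python
import re

# Hypothetical strip list for illustration only.
STRIP_CHARS = {"\u2022"}  # U+2022 BULLET


def slugify(title: str) -> str:
    """Lowercase the title, drop listed characters, join words with single dashes."""
    # Remove every character that appears in the strip list.
    cleaned = "".join(ch for ch in title.lower() if ch not in STRIP_CHARS)
    # Turn each run of whitespace into one dash.
    cleaned = re.sub(r"\s+", "-", cleaned.strip())
    # Collapse repeated dashes and trim any at the ends.
    return re.sub(r"-+", "-", cleaned).strip("-")


print(slugify("Fancy Title \u2022 Amazing"))
```

Because the bullet is removed before whitespace is converted, the two spaces left around it collapse into a single dash, which gives the slug suggested above.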
Attachments (3)
Change History (10)
#1
@
5 years ago
- Keywords has-patch added
Hello @veromary. Thank you for submitting this report. I've created a patch that adds the bullet characters to the filter list. I agree that these should not be present in permalinks.
#3
@
5 years ago
- Keywords needs-unit-tests added
- Milestone changed from Awaiting Review to 5.5
- Owner set to SergeyBiryukov
- Status changed from new to reviewing
@
5 years ago
Removes all bullet characters that are categorized as punctuation or geometric shapes.
#4
follow-up: ↓ 6
@
5 years ago
After looking into Unicode punctuation a bit more, I added a second patch that removes four more characters. All seven are bullets categorized as punctuation or geometric shapes.
There are also a number of bullet characters that are mathematical operators/symbols, and some in other categories. I did not include these, but can add them if we decide to.
In general, there are probably thousands of characters in the Unicode standard that are "not ideal" to include in slugs/URLs. Is there a standard approach to dealing with this in core?
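For anyone who wants to see how the bullet-like characters are spread across Unicode categories, the following Python sketch groups every character whose name contains "BULLET" by its general category. The function name `bullets_by_category` and the scan limit are assumptions made for this example, and matching on the character name is only a rough heuristic, not the selection used in the patches.

```python
import unicodedata
from collections import defaultdict


def bullets_by_category(limit: int = 0x3000) -> dict:
    """Group code points below `limit` whose Unicode name contains 'BULLET'."""
    groups = defaultdict(list)
    for cp in range(limit):
        ch = chr(cp)
        # Unassigned code points have no name; fall back to an empty string.
        name = unicodedata.name(ch, "")
        if "BULLET" in name:
            groups[unicodedata.category(ch)].append(f"U+{cp:04X} {name}")
    return dict(groups)


for category, entries in sorted(bullets_by_category().items()):
    print(category, entries)
```

In this grouping, `Po` is "other punctuation", `So` is "other symbol" (where the geometric-shape bullets fall), and `Sm` is "math symbol".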
#6
in reply to: ↑ 4
@
4 years ago
Replying to roytanck:
In general, there are probably thousands of characters in the Unicode standard that are "not ideal" to include in slugs/URLs. Is there a standard approach to dealing with this in core?
The list in sanitize_title_with_dashes() includes some commonly used characters and is expanded on a case-by-case basis by specific requests. Since the ticket was initially about just one bullet character, let's go with that for now and expand further if there are any more requests.
Adds the bullet, white bullet and inverse bullet characters to the filter list in sanitize_title_with_dashes().
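As a rough aid for checking such a list, the Python sketch below computes the lowercase percent-encoded UTF-8 form of the three characters named above. It assumes the filter matches percent-encoded byte sequences, which is how I understand the existing entries to be written; treat that as an assumption and check it against formatting.php. `BULLETS`, `percent_encoded`, and `strip_encoded` are hypothetical names for this example.

```python
from urllib.parse import quote

# The three characters named in the note above.
BULLETS = {
    "bullet": "\u2022",
    "white bullet": "\u25e6",
    "inverse bullet": "\u25d8",
}


def percent_encoded(ch: str) -> str:
    """Return the lowercase percent-encoded UTF-8 form of a character."""
    return quote(ch, safe="").lower()


ENCODED_BULLETS = {name: percent_encoded(ch) for name, ch in BULLETS.items()}


def strip_encoded(encoded_title: str) -> str:
    """Remove each encoded bullet sequence from an already-encoded title."""
    for seq in ENCODED_BULLETS.values():
        encoded_title = encoded_title.replace(seq, "")
    return encoded_title


for name, seq in ENCODED_BULLETS.items():
    print(name, seq)
```

Note that `strip_encoded` only removes the sequences; collapsing the doubled dash left behind would be a separate step.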