Opened 4 years ago
Closed 20 months ago
#10797 closed defect (bug) (fixed)
curly quotes not stripped out slugs
| Reported by: |
|
Owned by: |
|
|---|---|---|---|
| Priority: | normal | Milestone: | 3.3 |
| Component: | Permalinks | Version: | 2.9 |
| Severity: | normal | Keywords: | has-patch |
| Cc: | info@…, jeremy@… |
Description
Curly single- and double-quotes are percent-encoded instead of stripped out when creating slugs for post titles, categories, or tags.
To recreate, add a new post with the title: Procol Harum’s “A Whiter Shade of Pale”
expected slug: procol-harums-a-whiter-shade-of-pale
actual slug: procol-harum%e2%80%99s-%e2%80%9ca-whiter-shade-of-pale%e2%80%9d
Attachments (8)
Change History (32)
comment:2
nacin
— 4 years ago
I'm all for the fix, but it should be str_replace instead of preg_replace, no?
comment:3
lloydbudd
— 4 years ago
- Component changed from General to Permalinks
- Keywords developer-feedback submitter-feedback added; has-patch removed
- Owner set to ryan
No version or svn revision info included, please always set and include in description as well for bugs.
http://codex.wordpress.org/Reporting_Bugs#Reporting_a_Bug
This is not a newly introduced issue (regression), updating milestone to 3.0 as too late for existing, lower impact issues.
Please see #9591, which I believe this strongly relates to -- better to work on a more cohesive solution, then one offs.
comment:6
Denis-de-Bernardy
— 3 years ago
- Milestone 3.0 deleted
- Resolution set to duplicate
- Status changed from new to closed
SergeyBiryukov
— 23 months ago
comment:7
SergeyBiryukov
— 23 months ago
- Keywords has-patch added; developer-feedback reporter-feedback removed
- Milestone set to Awaiting Review
- Resolution duplicate deleted
- Status changed from closed to reopened
Closed #16036 as duplicate.
I guess sanitize_title_with_dashes() is a more appropriate function for such replacements than remove_accents(), which handles i18n-related improvements.
So I suggest to reopen that ticket.
Refreshed the patch for 3.3.
comment:8
SergeyBiryukov
— 22 months ago
- Milestone changed from Awaiting Review to 3.3
This falls under "Finally fix the issues relating to special characters in permalinks using an upgrade routine" from 3.3 scope.
Not sure if an upgrade routine is necessary here. Old permalinks (with curly quotes) still work after the patch.
SergeyBiryukov
— 22 months ago
comment:9
SergeyBiryukov
— 22 months ago
Moved sanitization to sanitize_title(), as per today's dev chat.
comment:10
scribu
— 22 months ago
Actually, the best place would be remove_accents().
comment:11
SergeyBiryukov
— 22 months ago
I guess remove_accents() should only handle international characters.
comment:12
follow-up:
↓ 14
scribu
— 22 months ago
Yeah, you're right, but the rest of the special chars are handled elsewhere and hooked into 'sanitize_title'.
I'm not sure why nacin said we shouldn't rely on the filter.
comment:13
scribu
— 22 months ago
Oh yeah. Because it wouldn't have the 'save' context check.
comment:14
in reply to:
↑ 12
;
follow-up:
↓ 15
nacin
— 22 months ago
Replying to scribu:
Yeah, you're right, but the rest of the special chars are handled elsewhere and hooked into 'sanitize_title'.
I'm not sure why nacin said we shouldn't rely on the filter.
I was referring to something else.
This logic probably makes the most sense in sanitize_title_with_dashes. We can alter the add_filter() call to let the context arg be passed in.
SergeyBiryukov
— 22 months ago
comment:15
in reply to:
↑ 14
SergeyBiryukov
— 22 months ago
Replying to nacin:
This logic probably makes the most sense in sanitize_title_with_dashes. We can alter the add_filter() call to let the context arg be passed in.
Done in 10797.3.patch.
comment:16
nacin
— 22 months ago
- Keywords needs-unit-tests added
Looks good. Let's get some unit tests.
SergeyBiryukov
— 22 months ago
SergeyBiryukov
— 22 months ago
comment:17
SergeyBiryukov
— 22 months ago
Based on the tickets linked from #9591, 10797.5.patch fixes:
- #3206: ¡ and ¿ should get stripped
- #8765: Strip ° characters from permalink
- ticket:9591:30: Guillemets
- #12956: © and ™ not stripped from sanitize_title
SergeyBiryukov
— 22 months ago
comment:18
SergeyBiryukov
— 22 months ago
- #15768: Unencoded non-breaking spaces in titles don't get sanitized properly
comment:19
ampt
— 21 months ago
Add unit tests for the latest patch
comment:20
nacin
— 21 months ago
In [18705]:
comment:21
nacin
— 21 months ago
:-) Thanks ampt! Things look great.
Before patch: Tests: 14, Assertions: 15, Failures: 6.
After patch: OK (14 tests, 22 assertions)
comment:22
toscho
— 20 months ago
- Cc info@… added
comment:23
hd-J
— 20 months ago
- Cc jeremy@… added
comment:24
nacin
— 20 months ago
- Keywords needs-unit-tests removed
- Resolution set to fixed
- Status changed from reopened to closed
Think this one is done.
patch to sanitize_title_with_dashes() in formatting.php, replaces curly single- and double- quotes and en- and em-dashes with empty string