WordPress.org

Make WordPress Core

Opened 5 years ago

Closed 2 years ago

#10797 closed defect (bug) (fixed)

curly quotes not stripped out slugs

Reported by: alxndr Owned by: ryan
Milestone: 3.3 Priority: normal
Severity: normal Version: 2.9
Component: Permalinks Keywords: has-patch
Focuses: Cc:

Description

Curly single- and double-quotes are percent-encoded instead of stripped out when creating slugs for post titles, categories, or tags.

To recreate, add a new post with the title: Procol Harum’s “A Whiter Shade of Pale”
expected slug: procol-harums-a-whiter-shade-of-pale
actual slug: procol-harum%e2%80%99s-%e2%80%9ca-whiter-shade-of-pale%e2%80%9d

Attachments (8)

slug-curlyquotes-dashes.patch (600 bytes) - added by alxndr 5 years ago.
patch to sanitize_title_with_dashes() in formatting.php, replaces curly single- and double- quotes and en- and em-dashes with empty string
10797.patch (1.0 KB) - added by SergeyBiryukov 3 years ago.
10797.2.patch (1019 bytes) - added by SergeyBiryukov 3 years ago.
10797.3.patch (2.4 KB) - added by SergeyBiryukov 3 years ago.
10797.4.patch (2.5 KB) - added by SergeyBiryukov 3 years ago.
10797.5.patch (2.8 KB) - added by SergeyBiryukov 3 years ago.
10797.6.patch (2.8 KB) - added by SergeyBiryukov 3 years ago.
10797.tests.patch (2.2 KB) - added by ampt 3 years ago.
unit tests

Download all attachments as: .zip

Change History (32)

alxndr5 years ago

patch to sanitize_title_with_dashes() in formatting.php, replaces curly single- and double- quotes and en- and em-dashes with empty string

comment:1 scribu5 years ago

  • Keywords has-patch added
  • Milestone changed from Unassigned to 2.9

comment:2 nacin4 years ago

I'm all for the fix, but it should be str_replace instead of preg_replace, no?

comment:3 lloydbudd4 years ago

  • Component changed from General to Permalinks
  • Keywords developer-feedback submitter-feedback added; has-patch removed
  • Owner set to ryan

No version or svn revision info included, please always set and include in description as well for bugs.
http://codex.wordpress.org/Reporting_Bugs#Reporting_a_Bug

This is not a newly introduced issue (regression), updating milestone to 3.0 as too late for existing, lower impact issues.

Please see #9591, which I believe this strongly relates to -- better to work on a more cohesive solution, then one offs.

comment:4 lloydbudd4 years ago

  • Milestone changed from 2.9 to 3.0
  • Version set to 2.9

comment:5 lloydbudd4 years ago

  • Keywords reporter-feedback added; submitter-feedback removed

comment:6 Denis-de-Bernardy4 years ago

  • Milestone 3.0 deleted
  • Resolution set to duplicate
  • Status changed from new to closed

SergeyBiryukov3 years ago

comment:7 SergeyBiryukov3 years ago

  • Keywords has-patch added; developer-feedback reporter-feedback removed
  • Milestone set to Awaiting Review
  • Resolution duplicate deleted
  • Status changed from closed to reopened

Closed #16036 as duplicate.

I guess sanitize_title_with_dashes() is a more appropriate function for such replacements than remove_accents(), which handles i18n-related improvements.

So I suggest to reopen this ticket.

Refreshed the patch for 3.3.

Last edited 3 years ago by SergeyBiryukov (previous) (diff)

comment:8 SergeyBiryukov3 years ago

  • Milestone changed from Awaiting Review to 3.3

This falls under "Finally fix the issues relating to special characters in permalinks using an upgrade routine" from 3.3 scope.

Not sure if an upgrade routine is necessary here. Old permalinks (with curly quotes) still work after the patch.

SergeyBiryukov3 years ago

comment:9 SergeyBiryukov3 years ago

Moved sanitization to sanitize_title(), as per today's dev chat.

comment:10 scribu3 years ago

Actually, the best place would be remove_accents().

comment:11 SergeyBiryukov3 years ago

I guess remove_accents() should only handle international characters.

comment:12 follow-up: scribu3 years ago

Yeah, you're right, but the rest of the special chars are handled elsewhere and hooked into 'sanitize_title'.

I'm not sure why nacin said we shouldn't rely on the filter.

comment:13 scribu3 years ago

Oh yeah. Because it wouldn't have the 'save' context check.

comment:14 in reply to: ↑ 12 ; follow-up: nacin3 years ago

Replying to scribu:

Yeah, you're right, but the rest of the special chars are handled elsewhere and hooked into 'sanitize_title'.

I'm not sure why nacin said we shouldn't rely on the filter.

I was referring to something else.

This logic probably makes the most sense in sanitize_title_with_dashes. We can alter the add_filter() call to let the context arg be passed in.

SergeyBiryukov3 years ago

comment:15 in reply to: ↑ 14 SergeyBiryukov3 years ago

Replying to nacin:

This logic probably makes the most sense in sanitize_title_with_dashes. We can alter the add_filter() call to let the context arg be passed in.

Done in 10797.3.patch.

comment:16 nacin3 years ago

  • Keywords needs-unit-tests added

Looks good. Let's get some unit tests.

SergeyBiryukov3 years ago

SergeyBiryukov3 years ago

comment:17 SergeyBiryukov3 years ago

Based on the tickets linked from #9591, 10797.5.patch fixes:

  • #3206: ¡ and ¿ should get stripped
  • #8765: Strip ° characters from permalink
  • ticket:9591:30: Guillemets
  • #12956: © and ™ not stripped from sanitize_title

SergeyBiryukov3 years ago

comment:18 SergeyBiryukov3 years ago

  • #15768: Unencoded non-breaking spaces in titles don't get sanitized properly

ampt3 years ago

unit tests

comment:19 ampt3 years ago

Add unit tests for the latest patch

comment:20 nacin3 years ago

In [18705]:

Strip a number of special characters in sanitize_title_with_dashes on save. Includes quotes (curly, angle), dashes, marks, etc. props SergeyBiryukov. props ampt for the unit tests in [UT438]. see #10797.

comment:21 nacin3 years ago

:-) Thanks ampt! Things look great.

Before patch: Tests: 14, Assertions: 15, Failures: 6.

After patch: OK (14 tests, 22 assertions)

http://unit-tests.trac.wordpress.org/changeset/438

comment:22 toscho2 years ago

  • Cc info@… added

comment:23 hd-J2 years ago

  • Cc jeremy@… added

comment:24 nacin2 years ago

  • Keywords needs-unit-tests removed
  • Resolution set to fixed
  • Status changed from reopened to closed

Think this one is done.

Note: See TracTickets for help on using tickets.