WordPress.org

Make WordPress Core

Opened 2 years ago

Closed 2 years ago

#19820 closed defect (bug) (fixed)

More post slug cleaning

Reported by: honza.skypala Owned by: nacin
Milestone: 3.4 Priority: normal
Severity: normal Version: 3.3.1
Component: Formatting Keywords: has-patch commit
Focuses: Cc:

Description

Hi there,

removing a number of funky characters from post slugs in WP 3.3 is a great improvement, I suggest two more characters:

  • „ (lower curly quotes, used by some european languages) - remove from slug
  • × (multiplay sign) - substitute with letter x

Thx. Take care,

Honza

Attachments (4)

19820.patch (802 bytes) - added by SergeyBiryukov 2 years ago.
19820.2.patch (1.1 KB) - added by kurtpayne 2 years ago.
Fixes a unit test case
19820_unit_test.patch (1.2 KB) - added by kurtpayne 2 years ago.
Unit test for quote entities and multiply sign
19820.3.patch (797 bytes) - added by kurtpayne 2 years ago.
Removed the patch that belongs in 10823

Download all attachments as: .zip

Change History (14)

SergeyBiryukov2 years ago

comment:1 SergeyBiryukov2 years ago

  • Component changed from General to Formatting
  • Keywords has-patch needs-unit-tests added
  • Milestone changed from Awaiting Review to 3.4

19820.patch adds the remaining quotation marks and &times sign to the replacement list.

kurtpayne2 years ago

Fixes a unit test case

kurtpayne2 years ago

Unit test for quote entities and multiply sign

comment:2 follow-up: kurtpayne2 years ago

  • Cc kpayne@… added
  • Keywords needs-unit-tests removed

19820.2.patch fixes this test case:

$this->assertEquals("one-two", sanitize_title_with_dashes("One & Two;", '', 'save'));

Output was:

There was 1 failure:

1) TestSanitizeTitleWithDashes::test_strips_entities
Failed asserting that two strings are equal.
--- Expected
+++ Actual
@@ @@
-'one-two'
+'one'

This change is unrelated to SergeyBiryukov's patch, but is related to the request for unit tests.

SergeyBiryukov's patch looks good.

comment:3 in reply to: ↑ 2 SergeyBiryukov2 years ago

Replying to kurtpayne:

19820.2.patch fixes this test case:

$this->assertEquals("one-two", sanitize_title_with_dashes("One & Two;", '', 'save'));

This test case doesn't fail anymore, seems to be fixed elsewhere.

Committed unit tests: [UT688]

comment:4 nacin2 years ago

The test is marked as skipped against #10823.

php wp-test.php -p -f -t TestSanitizeTitleWithDashes
.......F......

Time: 1 second, Memory: 66.75Mb

There was 1 failure:

1) TestSanitizeTitleWithDashes::test_strips_entities
Failed asserting that two strings are equal.
--- Expected
+++ Actual
@@ @@
-'one-two'
+'one'

/Users/nacin/Sites/tests/wp-testcase/test_includes_formatting.php:1031
/Users/nacin/Sites/tests/wp-testlib/base.php:604
/Users/nacin/Sites/tests/wp-test.php:209

comment:5 follow-ups: nacin2 years ago

Is the change from . to \S intentional/necessary? We have found that \s (and therefore I imagine \S) incorrectly targets some letters in other languages or elsewhere in UTF8. A good example to test for is the Hebrew letter nun.

comment:6 in reply to: ↑ 5 SergeyBiryukov2 years ago

Replying to nacin:

The test is marked as skipped against #10823.

Thanks, didn't notice that. There are some alternative suggestions for the regex, we should probably leave this change for that ticket.

comment:7 in reply to: ↑ 5 kurtpayne2 years ago

Replying to nacin:

Is the change from . to \S intentional/necessary? We have found that \s (and therefore I imagine \S) incorrectly targets some letters in other languages or elsewhere in UTF8. A good example to test for is the Hebrew letter nun.

It was intentional at the time (I didn't know about #10823). A patch for ampersand handling belongs in #10823, though.

kurtpayne2 years ago

Removed the patch that belongs in 10823

comment:8 nacin2 years ago

  • Keywords commit added

comment:9 kurtpayne2 years ago

For clarity: 19820.patch (SergeyBiryukov) and 19820.3.patch (my latest one, with the patch for 10823 removed) are the same thing. Please disregard core patches from me on this ticket.

comment:10 nacin2 years ago

  • Owner set to nacin
  • Resolution set to fixed
  • Status changed from new to closed

In [20686]:

Add characters to be stripped or replaced in sanitize_title_with_dashes().

  • Replace times (multiplication sign) with x.
  • Strip low quotation marks and other curly quotes we don't already deal with.

props SergeyBiryukov. fixes #19820.

Note: See TracTickets for help on using tickets.