Make WordPress Core

Opened 9 years ago

Closed 8 years ago

#31790 closed defect (bug) (fixed)

sanitize_title_with_dashes() has issues with non breaking space (  and  )

Reported by: michelski's profile michelski Owned by: sergeybiryukov's profile SergeyBiryukov
Milestone: 4.5 Priority: normal
Severity: normal Version: 4.1.1
Component: Formatting Keywords: has-patch has-unit-tests needs-refresh needs-testing
Focuses: Cc:

Description

This is a follow-up to #13623. Maybe related to #31499 and #19292.

If you have a   oder   (non breaking space) in between the words in your title, the slug won't get converted into a dash, instead the words before and after the nbsp are merged.

So, if you have the title

Die Crux mit Blogsystem und Wordpress

or

Die Crux mit Blogsystem und Wordpress

the permalink is created to

die-crux-mit-blogsystem-undwordpress

and not

die-crux-mit-blogsystem-und-wordpress

If you just put a " " in the title, it works fine.

I have struggle with this since I imported some postings from a Textpattern blog (some years ago, thinking of 2009) which had a function named no_widow turned on. This had put the   in the titles.

Attachments (1)

31790.diff (2.5 KB) - added by polevaultweb 8 years ago.

Download all attachments as: .zip

Change History (7)

#1 @swissspidy
8 years ago

  • Component changed from Permalinks to Formatting
  • Keywords needs-patch added
  • Milestone changed from Awaiting Review to Future Release
  • Summary changed from Slug/Permalink has issues with non breaking space (  and  ) to sanitize_title_with_dashes() has issues with non breaking space (  and  )

This bug still exists in trunk.

The problem is this line in sanitize_title_with_dashes():

$title = preg_replace('/&.+?;/', '', $title); // kill entities

A bit further down, we'll see:

// Convert nbsp, ndash and mdash to hyphens
$title = str_replace( array( '%c2%a0', '%e2%80%93', '%e2%80%94' ), '-', $title );

This should be extended to replace the characters mentioned here.

@polevaultweb
8 years ago

#2 @polevaultweb
8 years ago

  • Keywords has-patch needs-testing added; needs-patch removed

#3 follow-up: @polevaultweb
8 years ago

@swissspidy would you mind having a look at the patch when you get a second? Cheers :)

#4 @swissspidy
8 years ago

  • Keywords has-unit-tests added

#5 in reply to: ↑ 3 @swissspidy
8 years ago

  • Keywords needs-refresh added
  • Milestone changed from Future Release to 4.5

Replying to polevaultweb:

@swissspidy would you mind having a look at the patch when you get a second? Cheers :)

The replacement should probably only be done if 'save' === $context, just like further down the function. That's what you're testing, after all. With your current patch you're doing it on display as well.

Regarding the test, I'd personally stick to the "1 assertion per test" rule used in most of the other functions — and use non-German strings for testing :). Example (taken from another test):

<?php
function test_replaces_ndash_entities() {
        $this->assertEquals("do-the-dash", sanitize_title_with_dashes("Do&ndash;the Dash", '', 'save'));
        $this->assertEquals("do-the-dash", sanitize_title_with_dashes("Do the&#8211;Dash", '', 'save'));
}

Oh, and you should add the @ticket 31790 annotation for every new test method you introduce, not only the first one.

#6 @SergeyBiryukov
8 years ago

  • Owner set to SergeyBiryukov
  • Resolution set to fixed
  • Status changed from new to closed

In 36775:

Formatting: In sanitize_title_with_dashes(), convert &nbsp, &ndash, and &mdash HTML entities to hyphens on save.

Props polevaultweb for initial patch.
Fixes #31790.

Note: See TracTickets for help on using tickets.