Opened 15 years ago
Last modified 5 years ago
#10823 new defect (bug)
Bad handling of ampersand in post titles
Reported by: | Commeuneimage | Owned by: | |
---|---|---|---|
Milestone: | Priority: | normal | |
Severity: | normal | Version: | 2.8 |
Component: | Formatting | Keywords: | has-patch needs-refresh |
Focuses: | Cc: |
Description
Titles with ampersand (&) are not correctly handled:
1/ the_title_attribute() doesn't transform & to & causing XHTML validation errors
2/ titles with & followed by ; generate truncated post name (slug)
e.g. "Test this & believe ; what ?" => "test-this-what"
Attachments (7)
Change History (29)
#1
@
14 years ago
- Component changed from General to Formatting
- Milestone changed from Unassigned to 3.0
#3
@
14 years ago
Could that be changed to :
$title = preg_replace('/&[a-z0-9]+?;/', '', $title); // kill entities
perhaps? That'd still catch all valid entities...
#4
@
14 years ago
- Keywords has-patch needs-testing added; ampersand the_title_attribute htmlspecialchars removed
Tested out dd32's code and it fixes the issue. Like mdawaffe said this could cause other issues so wanted to patch so others could test.
#5
follow-up:
↓ 13
@
14 years ago
Sorry to be pedantic but that patch introduces a slight regression for cases such as × (the permalinks would include "215").
Maybe we should include the hash character in the regex, like so:
$title = preg_replace('/&[#a-z0-9]+?;/', '', $title); // kill entities
After that, I can't think of any other cases.
#6
@
14 years ago
Sorry to be pedantic but that patch introduces a slight regression for cases such as × (the permalinks would include "215").
Very true, That thought came to me after posting, I thought I'd added another reply to that effect..
#8
@
14 years ago
- Keywords needs-patch added; has-patch needs-testing removed
- Milestone changed from Awaiting Triage to Future Release
#9
@
13 years ago
- Keywords has-patch added; needs-patch removed
Patch addresses item (2) - I'm not sure if this would break anything else. In any case this seems quite edge.
#10
@
13 years ago
I imagine we have a regex somewhere that deals with killing entities.
Of note I also see this regex used here:
./wp-includes/formatting.php:758: $username = preg_replace( '/&.+?;/', '', $username ); // Kill entities
#13
in reply to:
↑ 5
@
13 years ago
Replying to pufuwozu:
$title = preg_replace('/&[#a-z0-9]+?;/', '', $title); // kill entities
That looks good except the "?" is not needed. Also we can specify the length of that string:
$title = preg_replace('/&[#a-z0-9]{2,6};/', '', $title);
#14
@
13 years ago
- Keywords 3.4-early needs-unit-tests added
- Milestone changed from 3.3 to Awaiting Review
Too late in the cycle to be tweaking this.
We should also add some unit tests around this change.
#15
@
13 years ago
- Keywords needs-unit-tests removed
dd32 * http://unit-tests.trac.wordpress.org/changeset/489 /wp-testcase/test_includes_formatting.php: Cover more entity striping cases for sanitize_user() and sanitize_title_with_dashes(); See http://core.trac.wordpress.org/ticket/10823
Testing for that was done with/passed with the following regex:
'/&#?\w+;/'
I'm sure the above regex's would probably pass with this still.
#16
@
12 years ago
- Keywords 3.5-early added; 3.4-early removed
- Milestone changed from Awaiting Review to Future Release
#17
@
12 years ago
- Cc mdhansen@… added
Tested this patch, it still seems to work with 3.6-alpha. Maybe we should try to get this in 3.6 early?
I cannot confirm 1. the_title_attribute() does correctly turn ampersands into HTML entities for me.
I can confirm 2.
It comes from the following line in sanitize_title_with_dashes().
Fixing this could generate more bugs since the output is used in permalinks.