Opened 4 years ago
Last modified 4 months ago
#10823 new defect (bug)
Bad handling of ampersand in post titles
| Reported by: |
|
Owned by: | |
|---|---|---|---|
| Priority: | normal | Milestone: | Future Release |
| Component: | Formatting | Version: | 2.8.4 |
| Severity: | minor | Keywords: | has-patch 3.5-early |
| Cc: | mdhansen@… |
Description
Titles with ampersand (&) are not correctly handled:
1/ the_title_attribute() doesn't transform & to & causing XHTML validation errors
2/ titles with & followed by ; generate truncated post name (slug)
e.g. "Test this & believe ; what ?" => "test-this-what"
Attachments (4)
Change History (22)
- Component changed from General to Formatting
- Milestone changed from Unassigned to 3.0
Could that be changed to :
$title = preg_replace('/&[a-z0-9]+?;/', '', $title); // kill entities
perhaps? That'd still catch all valid entities...
williamsba1 — 3 years ago
comment:4
williamsba1 — 3 years ago
- Keywords has-patch needs-testing added; ampersand the_title_attribute htmlspecialchars removed
Tested out dd32's code and it fixes the issue. Like mdawaffe said this could cause other issues so wanted to patch so others could test.
Sorry to be pedantic but that patch introduces a slight regression for cases such as × (the permalinks would include "215").
Maybe we should include the hash character in the regex, like so:
$title = preg_replace('/&[#a-z0-9]+?;/', '', $title); // kill entities
After that, I can't think of any other cases.
Sorry to be pedantic but that patch introduces a slight regression for cases such as × (the permalinks would include "215").
Very true, That thought came to me after posting, I thought I'd added another reply to that effect..
- Keywords needs-patch added; has-patch needs-testing removed
- Milestone changed from Awaiting Triage to Future Release
solarissmoke — 2 years ago
comment:9
solarissmoke — 2 years ago
- Keywords has-patch added; needs-patch removed
Patch addresses item (2) - I'm not sure if this would break anything else. In any case this seems quite edge.
comment:10
nacin — 2 years ago
I imagine we have a regex somewhere that deals with killing entities.
Of note I also see this regex used here:
./wp-includes/formatting.php:758: $username = preg_replace( '/&.+?;/', '', $username ); // Kill entities
- Milestone changed from Future Release to 3.3
Related: #10797
Related: #10792
comment:13
in reply to:
↑ 5
azaozz — 18 months ago
Replying to pufuwozu:
$title = preg_replace('/&[#a-z0-9]+?;/', '', $title); // kill entities
That looks good except the "?" is not needed. Also we can specify the length of that string:
$title = preg_replace('/&[#a-z0-9]{2,6};/', '', $title);
comment:14
westi — 18 months ago
- Keywords 3.4-early needs-unit-tests added
- Milestone changed from 3.3 to Awaiting Review
Too late in the cycle to be tweaking this.
We should also add some unit tests around this change.
comment:15
dd32 — 17 months ago
- Keywords needs-unit-tests removed
dd32 * http://unit-tests.trac.wordpress.org/changeset/489 /wp-testcase/test_includes_formatting.php: Cover more entity striping cases for sanitize_user() and sanitize_title_with_dashes(); See http://core.trac.wordpress.org/ticket/10823
Testing for that was done with/passed with the following regex:
'/&#?\w+;/'
I'm sure the above regex's would probably pass with this still.
SergeyBiryukov — 13 months ago
- Keywords 3.5-early added; 3.4-early removed
- Milestone changed from Awaiting Review to Future Release
comment:17
MikeHansenMe — 5 months ago
- Cc mdhansen@… added
Tested this patch, it still seems to work with 3.6-alpha. Maybe we should try to get this in 3.6 early?
comment:18
wonderboymusic — 4 months ago
Related #17648

I cannot confirm 1. the_title_attribute() does correctly turn ampersands into HTML entities for me.
I can confirm 2.
It comes from the following line in sanitize_title_with_dashes().
$title = preg_replace('/&.+?;/', '', $title); // kill entitiesFixing this could generate more bugs since the output is used in permalinks.