#32556 closed defect (bug) (invalid)
Clarify behaviour of esc_attr() with respect to HTML entities
Reported by: | Owned by:
Milestone: | Priority: normal
Severity: normal | Version:
Component: Formatting | Keywords:
Focuses: | Cc:
Description
I've just come across this, and would welcome some info on what the *right* thing is to do here. For background, also see #25485.
Currently, I have a string:
Next Events <span>&raquo;</span>
I want to place this into the value of an INPUT tag. I was using esc_attr(), e.g.
<input value="<?php echo esc_attr( $string ); ?>">
That results in an INPUT field that displays as:
Next Events <span>»</span>
i.e. the &raquo; has been converted to » rather than &amp;raquo;, which is what is desired to make the input box display as:
Next Events <span>&raquo;</span>
So, I assumed that I should be able to encode the entity myself, and then apply esc_attr() since esc_attr() advertises that it "will never double encode entities" (https://codex.wordpress.org/Function_Reference/esc_attr).
However, beyond "not double-encoding entities", what esc_attr() actually does is normalize any entities, even if they've previously been deliberately encoded. This seems like a bug, if not in the function then in the documentation, but I'm not sure what the *right* thing to do here is.
It's possible to just use htmlentities(), and not use esc_attr() at all, but that feels like I might be missing out on some additional protection afforded by esc_attr(). Any guidance welcome.
Change History (7)
#1, 10 years ago
- Milestone Awaiting Review deleted
- Resolution set to invalid
- Status changed from new to closed
- Version 4.2.2 deleted
#2, in reply to: ↑ description, 10 years ago
Replying to leewillis77:
It's possible to just use htmlentities(), and not use esc_attr() at all
Just to clarify, it's htmlspecialchars(). If you need help with this, please see the support forums.
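As a rough illustration of that suggestion, here is a minimal sketch in plain PHP (no WordPress loaded; the variable names are made up for this example). With its default double encoding, htmlspecialchars() also escapes the ampersand of an existing entity, so the field shows the entity literally. Passing false as the fourth argument ($double_encode) leaves existing entities alone.

```php
<?php
// Illustrative only: $label is a hypothetical variable, not code from the ticket.
$label = 'Next Events <span>&raquo;</span>';

// Default ($double_encode = true): the & of &raquo; is escaped as well,
// so an input field would display the literal text &raquo;.
$literal = htmlspecialchars( $label, ENT_QUOTES, 'UTF-8' );

// $double_encode = false: the existing &raquo; entity is left untouched,
// so an input field would display the » character.
$rendered = htmlspecialchars( $label, ENT_QUOTES, 'UTF-8', false );

echo $literal, PHP_EOL;   // Next Events &lt;span&gt;&amp;raquo;&lt;/span&gt;
echo $rendered, PHP_EOL;  // Next Events &lt;span&gt;&raquo;&lt;/span&gt;
```

Either result is safe to place inside a double-quoted attribute; the two differ only in whether the entity appears as text or as a character.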
#3, follow-up: ↓ 6, 10 years ago
Hi,
I'm aware it's a fine line between bug and support, but please consider re-opening this - rationale below.
The reason I opened it as a bug is that esc_attr() is altering the string it is passed in undocumented ways. At the very least, that's a documentation bug; ideally it just shouldn't do it, although I appreciate that's probably difficult.
According to the documentation, all three of these should return the same string; however, the versions that include esc_attr() return different output from the non-esc_attr() version.
htmlspecialchars('&raquo;'); // Returns &amp;raquo;
esc_attr(htmlspecialchars('&raquo;')); // Returns &raquo;
esc_attr('&amp;raquo;'); // Returns &raquo;
The &amp; passed into esc_attr() is being decoded to & when it should not be. I presume this is part of esc_attr() trying to make sure it's not double-encoding things, but it should not decode if the encoding was part of the source string.
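To make the reported behaviour concrete, here is a minimal sketch in plain PHP of a "decode first, then escape everything except entities" approach. The function name is hypothetical and this is not WordPress's actual code; it only assumes that the escaping works roughly this way, which would explain why an ampersand that was deliberately encoded comes back out as a bare entity.

```php
<?php
// Hypothetical helper, NOT WordPress's implementation. It only mimics the
// behaviour reported in this ticket, under the assumption described above.
function sketch_escape_without_double_encoding( string $text ): string {
    // Step 1: decode &amp; back to a bare ampersand.
    $text = str_replace( '&amp;', '&', $text );

    // Step 2: split on anything that looks like an entity, keeping the
    // matched entities in the result (odd indexes).
    $parts = preg_split( '/(&#?x?[0-9a-z]+;)/i', $text, -1, PREG_SPLIT_DELIM_CAPTURE );

    // Step 3: escape only the text between entities (even indexes).
    for ( $i = 0, $c = count( $parts ); $i < $c; $i += 2 ) {
        $parts[ $i ] = htmlspecialchars( $parts[ $i ], ENT_QUOTES, 'UTF-8' );
    }

    return implode( '', $parts );
}

echo sketch_escape_without_double_encoding( '&amp;raquo;' ), PHP_EOL; // &raquo;
echo sketch_escape_without_double_encoding( 'a & b' ), PHP_EOL;       // a &amp; b
```

Under this assumption, step 1 is what loses the distinction between an encoded ampersand followed by "raquo;" and a genuine entity.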
#4, 10 years ago
So, I assumed that I should be able to encode the entity myself, and then apply esc_attr() since esc_attr() advertises that it "will never double encode entities" (https://codex.wordpress.org/Function_Reference/esc_attr).
The actual inline docs of esc_attr() make no such claim; this is just another case of the codex being wrong.
#5, 10 years ago
Relevant research may be added to #17780. But keep it focused on the decoding of &amp; to &. It was likely done that way for a good reason.
#6, in reply to: ↑ 3, 10 years ago
Replying to leewillis77:
The &amp; passed into esc_attr is being decoded to & when it should not be.
This appears to be a valid bug. Let me just add that this is very different from the ticket description, and I don't think we will need an extra ticket at this point. The enhancement request at #17780 will fix this bug. I hope that fully addresses your concern.
htmlspecialchars() does what you want and will run faster anyway. There is no bug here.