WordPress.org

Make WordPress Core

Opened 7 years ago

Closed 7 years ago

Last modified 7 years ago

#25485 closed defect (bug) (invalid)

Why does esc_attr not double encode entities by default?

Reported by: smerriman Owned by:
Milestone: Priority: normal
Severity: normal Version: 3.6.1
Component: Formatting Keywords:
Focuses: Cc:

Description

This function has been around for a long time, so I'm sure there is a good explanation for why this isn't a "bug", but I'm very confused.

The main point of esc_attr is that you can use it inside, say:

input value="<?php echo esc_attr($x);?>"

so that the input field displays exactly the string $x, rather than messing up quotes and other special characters. However, this is not true of HTML entities.

A simple example - in the Tagline field on the Wordpress general settings page, enter:

&copy;

In the database, this is saved exactly as typed.

When the page reloads, you see the copyright symbol rather than what you typed.

If you save the options again without changing anything, the database changes to the actual single copyright character, rather than the HTML.

It seems wrong to me that saving the page without changing anything would result in different results in the database.

This would become extremely problematic if, say, you wanted &nbsp; characters, as after the first save you would no longer be able to distinguish them from normal characters.

I know there is an optional variable you can pass to esc_attr to avoid this, but should not all inputs throughout Wordpress, or indeed the default esc_attr, encode these entities properly? Or is there a good reason for this unusual (to me) behaviour?

(Note that they are properly encoded in esc_textarea).

Change History (9)

#1 @mark-k
7 years ago

Can confirm that this is not only limited to options but also happens with tag names.

Not sure that the proposed solution is the right one, as esc_attr can be used with hard coded copyright symbol as a meta value (or other read only type of values) for which you don't want to escape the html. Maybe there should be a new function that does both escaping 'esc_value_attr' ?

#2 @nacin
7 years ago

We've generally found that the benefits of esc_attr() not double-encoding entities far outweigh any potential side effects.

#3 @nacin
7 years ago

  • Milestone Awaiting Review deleted
  • Resolution set to invalid
  • Status changed from new to closed

#4 follow-up: @bmthemes
7 years ago

Hey Nacin,

I came across the same exact problem.
Could you give an example of a potential side effect?

#5 in reply to: ↑ 4 @bmthemes
7 years ago

  • Resolution invalid deleted
  • Status changed from closed to reopened

Actually it really looks like a bug.
It's a bit hard to understand why an option which has not been changed would change in the database.
Or am I missing something?

#6 follow-up: @SergeyBiryukov
7 years ago

  • Milestone set to Awaiting Review

Currently, it's safe to assume that esc_attr( esc_attr() ) doesn't break anything (see comment:7:ticket:20009 for example), which won't be the case with double-encoding.

The issue with saving &copy; in the Tagline field seems worth investigating. At a glance, however, it doesn't have anything to do with esc_attr(), as it doesn't escape HTML entities.

#7 @nacin
7 years ago

  • Milestone Awaiting Review deleted
  • Resolution set to invalid
  • Status changed from reopened to closed

We're not changing how esc_attr() works. In fact the whole point of it is that it does not re-encode. If you want double-encoding, then you should use htmlspecialchars() directly.

Specific bugs related to, for example, option values belong in separate tickets.

#8 @nacin
7 years ago

Also note that esc_textarea() needs to be double-encoded because of how textareas work. That's by design.

#9 in reply to: ↑ 6 @SergeyBiryukov
7 years ago

Replying to SergeyBiryukov:

The issue with saving &copy; in the Tagline field seems worth investigating.

Related: #27942

Note: See TracTickets for help on using tickets.