Opened 7 years ago
Last modified 6 years ago
#43147 new enhancement
Introduce `esc_html_comment` and translation related functions
Reported by: | jipmoors | Owned by: | |
---|---|---|---|
Milestone: | Awaiting Review | Priority: | normal |
Severity: | normal | Version: | |
Component: | Formatting | Keywords: | has-patch has-unit-tests 2nd-opinion |
Focuses: | Cc: |
Description
Triggered by the following PR on Yoast SEO #8687 I saw the need for esc_html_comment
functionality in WordPress.
This patch is based on the current API for formatting and escaping functions.
The implementation is based on the specifications found on the HTML specifications page.
I've added unit tests to cover the situations that I could come up with.
Attachments (3)
Change History (12)
#3
@
7 years ago
@jdgrimes thanks for noticing, have created an updated patch with proper implementation.
#4
@
7 years ago
- Keywords 2nd-opinion added
I don't personally think this is needed myself.
The whole premise of the need for it is not being able to trust the data you're outputting, which if it's not hard coded, sounds like it should be being sanitised rather than escaped. WordPress also trusts translations inherently, meaning you should fix a translation that contains rogue HTML rather than simply attempting to escape the bad translations.
#5
follow-up:
↓ 7
@
7 years ago
@dd32 I personally never trust the translations, because they are outside of a developer's control.
When you say "WordPress also trusts translations inherently", do you have something definitive you can point me to? I tried to read up on it, but all I found was (mostly implicit) recommendations to escape anything that will be rendered into HTML.
Also, "fix a translation that contains rogue HTML" is not technically possible in a lot of cases, as translations can just be uploaded to sites as additional (potentially rogue) files.
I personally see the value in this, especially since we decided to use HTML comments as our content data structure.
#6
@
7 years ago
There is one more argument to consider, we can escape all of the ways to communicate with data: json, SQL, HTML, JavaScript. HTML comments are part of this domain, if not escaped properly they will cause side-effects and unwanted behaviour.
#7
in reply to:
↑ 5
@
7 years ago
Replying to schlessera:
@dd32 I personally never trust the translations, because they are outside of a developer's control.
While that's your preference, WordPress itself trusts it's translations from translate.wordpress.org, and suggest that if you're running other translations you need to trust the source of those strings. If you're unable to trust them, you've got potentially larger issues than random extra html tags in said strings, and you should probably review them before shipping them out.
When you say "WordPress also trusts translations inherently", do you have something definitive you can point me to? I tried to read up on it, but all I found was (mostly implicit) recommendations to escape anything that will be rendered into HTML.
https://core.trac.wordpress.org/ticket/30724 is probably the best reference I have right now.
Unfortunately most references, such as the codex, have been updated to suggest escaping everything or running it through kses, even though it's not recommended or suggested as required by core. Unfortunately a certain PHPCS ruleset suggested escaping everything at some point and went against cores implicit "we trust translation strings".
#8
@
7 years ago
@dd32 Thanks for the link to the ticket. That discussion should be transcribed into the handbook somewhere, I think.
I can understand this stance for Core, but I don't think this would apply for plugins or themes. If I'm not mistaken, one plugin can easily install language files that provide translations for other plugins/themes. These are just random additional files that no-one will care about.
The spec says that a comment cannot "contain two consecutive U+002D HYPHEN-MINUS characters (
--
)". However, the patch seems to only disallow--
at the beginning of the comment text, not anywhere within it.