WordPress.org

Make WordPress Core

Opened 12 months ago

Last modified 12 months ago

#44988 new defect (bug)

The sanitize_html_class() is deceptive / "buggy"

Reported by: ChiefAlchemist Owned by:
Milestone: Awaiting Review Priority: normal
Severity: normal Version: 4.9.6
Component: Formatting Keywords:
Focuses: Cc:

Description

Best I can tell, sanitize_html_class() will return a class name that begins with an invalid character (e.g., an integer). AFAIK a class can't begin with an integer ( as well as a handful of other special / odd characters).

Mind you, I understand sanitation is not validation :) That said, the function description says "If this results in an empty string, then the function will return the alternative value supplied."

If you interpret that as "if your class is invalid..." (as I did), then at the very least some level of validation is implied.

Long to short, if the function returns a useless class, there's (almost) no point in having a special sanitizer. Sure, it might sanitize - I can hear the sec team barking - but the result isn't any more useful than using a "normal" sanitize function.

Attachments (1)

44988-broken.patch (2.2 KB) - added by ayeshrajans 12 months ago.
This breaks tests: https://travis-ci.org/Ayesh/wordpress-develop/builds/434524930

Download all attachments as: .zip

Change History (4)

#1 @mukesh27
12 months ago

  • Component changed from General to Formatting

#2 @ayeshrajans
12 months ago

The W3 spec does say that class names cannot start with a digit, single or double hyphens.
https://www.w3.org/TR/CSS21/syndata.html#characters

We can either remove the invalid characters or replace them. I'm leaning towards replacing them with an underscore because we cal still salvage an invalid class name from it (12345 would become '_2345' instead of '').

$sanitized = preg_replace( array(
                '/^[0-9]/',
                '/^(-[0-9])|^(--)/',
        ), array(
                '_',
                '__',
        ), $sanitized);

Backwards compatibility should not be issue because invalid classes are dropped by browsers anyway: https://jsfiddle.net/4om9bqhp/

I will attach a patch in next reply, but it raises some more issues...

#3 @ayeshrajans
12 months ago

The patch above (which fixes the class name according to W3 spec) breaks the tests including \Tests_Post_GetPostClass::test_with_utf8_category_slugs. This is because \get_post_class function uses this function to specifically get numeric values sanitized although the return values break W3 spec.

We could of course fix the failing tests, but this will clearly break BC. One way to fix this would be to introduce an additional boolean parameter to the function that only enforces stricter validations only when explicitly required so. Ugly solution for a minimal issue.

Note: See TracTickets for help on using tickets.