WordPress.org

Make WordPress Core

Opened 6 years ago

Last modified 6 months ago

#26759 new enhancement

New Generic Sanitize Functions for Core

Reported by: georgestephanis
Owned by:
Milestone:
Priority: normal
Severity: normal
Version: 3.8
Component: Formatting
Keywords: dev-feedback needs-patch
Focuses:
Cc:
PR Number:

Description

Core currently supplies a number of sanitize functions:

sanitize_email()
sanitize_file_name()
sanitize_html_class()
sanitize_key()
sanitize_meta()
sanitize_mime_type()
sanitize_option()
sanitize_sql_orderby()
sanitize_post_field()
sanitize_text_field()
sanitize_title()
sanitize_title_for_query()
sanitize_title_with_dashes()
sanitize_user()

They all sanitize by usage, not by data type.

As such, I (and I suspect others) wind up using these to sanitize things they weren't originally meant for, simply because it's quicker and leads to tidier code.

I believe it could result in better and simpler sanitizing if we were to include sanitize-by-format functions in core. For example,

wp_sanitize_numeric( $raw ); // [\d]
wp_sanitize_numeric_float( $raw ); // [\d\.,] allowing both commas and periods as decimal indicator and thousands separator
wp_sanitize_hex( $raw ); // [\da-f] case-insensitive
wp_sanitize_alphanumeric( $raw ); // [\da-z] case-insensitive
wp_sanitize_letters( $raw ); // [a-z] case-insensitive
wp_sanitize( $raw, $regex ); // uses passed in regex to determine what to strip.

The specific functions to use are up for discussion. I'm just hoping to make it simpler for users to sanitize data by expected type.
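
A minimal sketch of how these could look, assuming each one is a thin wrapper that strips every character outside the allowed set. The function names come from the list above; the bodies, and the assumption that $regex is a complete pattern matching the characters to strip, are hypothetical and not a proposed patch.

```php
<?php
// Hypothetical implementations for illustration only; none of these exist in core.

// Remove everything matched by $regex (a full PCRE pattern) from $raw.
function wp_sanitize( $raw, $regex ) {
	return preg_replace( $regex, '', (string) $raw );
}

// Keep ASCII digits only.
function wp_sanitize_numeric( $raw ) {
	return wp_sanitize( $raw, '/[^\d]/' );
}

// Keep digits, periods, and commas. This does not validate that the
// result is a well-formed number; it only strips other characters.
function wp_sanitize_numeric_float( $raw ) {
	return wp_sanitize( $raw, '/[^\d.,]/' );
}

// Keep hexadecimal digits, case-insensitive.
function wp_sanitize_hex( $raw ) {
	return wp_sanitize( $raw, '/[^\da-f]/i' );
}

// Keep ASCII letters and digits, case-insensitive.
function wp_sanitize_alphanumeric( $raw ) {
	return wp_sanitize( $raw, '/[^\da-z]/i' );
}

// Keep ASCII letters only, case-insensitive.
function wp_sanitize_letters( $raw ) {
	return wp_sanitize( $raw, '/[^a-z]/i' );
}
```

With these definitions, wp_sanitize_hex( '#FF00aaZZ' ) would return 'FF00aa', and wp_sanitize_numeric_float( '$1,234.56' ) would return '1,234.56'.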

As a side note, this would let folks use wp_sanitize_numeric() to sanitize integers larger than PHP_INT_MAX, which Tumblr and Twitter IDs often are in imports, feeds, and the like (where casting to (int) isn't a good idea).
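
To illustrate the PHP_INT_MAX point, here is a small sketch. The ID value is made up, and the preg_replace() call stands in for what a wp_sanitize_numeric() would presumably do; treat both as assumptions.

```php
<?php
// A made-up ID that exceeds PHP_INT_MAX even on 64-bit builds,
// with some trailing junk to sanitize away.
$raw_id = '18446744073709551615abc';

// Casting to int cannot represent the value, so the digits are lost.
$as_int = (int) $raw_id;

// Stripping non-digits keeps the ID intact as a string.
$as_digits = preg_replace( '/[^\d]/', '', $raw_id );

var_dump( (string) $as_int === '18446744073709551615' ); // bool(false)
var_dump( $as_digits === '18446744073709551615' );       // bool(true)
```

The sanitized value stays a string, so it would need to be stored and compared as one.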

Change History (11)

#1 @georgestephanis
6 years ago

  • Component changed from General to Validation
  • Keywords dev-feedback added
  • Type changed from defect (bug) to enhancement
  • Version set to trunk

#2 @betzster
6 years ago

  • Cc j@… added

#3 @ethitter
6 years ago

  • Cc erick@… added

#4 @DrewAPicture
6 years ago

  • Cc xoodrew@… added

I think personally I'd lean more toward creating one sanitize function with passed arguments instead of introducing a bunch of new functions.

Of course there's give and take with both approaches, but overall I think having generic sanitization functionality in place would be helpful :)

#5 @jdgrimes
6 years ago

  • Cc jdg@… added

#6 @alex-ye
6 years ago

  • Cc nashwan.doaqan@… added

#7 @goto10
6 years ago

  • Cc dromsey@… added

#8 @nacin
6 years ago

  • Component changed from Validation to Formatting

I'm not sure how useful this is, to be honest. wp_sanitize() would just be a wrapper for preg_replace(). All that would do is obscure what's actually occurring. Rather than hide that this is the functionality being performed, one should just use preg_replace().

Beyond that, I like it when core can provide good utility functions. But I've generally seen this to be a rabbit hole. You add one function like alphanumeric, then someone wants one that only allows lowercase letters. Or you add one function like letters, and someone wants to know why a ligature or diacritic is getting stripped. Every field is different. Maybe work on metadata APIs will reveal some new sanitization shorthands, but I don't think we need to be adding the kitchen sink when core doesn't have much of a clear use for them and when it might just be more confusing than just sanitizing things on your own.
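
As a rough illustration of the diacritic case, the sketch below compares an ASCII-only letter filter with a Unicode-aware one. The sample string is made up, and the \p{L} variant is just one possible alternative, not something proposed in this ticket.

```php
<?php
// "Zoë", written with an escape so the example does not depend on file encoding.
$name = "Zo\u{00EB}";

// ASCII-only filter, as in the proposed [a-z] letters function:
// the accented character is stripped.
$ascii_only = preg_replace( '/[^a-z]/i', '', $name );

// Unicode-aware filter: \p{L} matches any letter when the /u modifier is set.
$unicode_letters = preg_replace( '/[^\p{L}]/u', '', $name );

var_dump( $ascii_only );                // string(2) "Zo"
var_dump( $unicode_letters === $name ); // bool(true)
```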

#9 @norcross
6 years ago

I agree that it could turn into a rabbit-hole scenario, but some base-level ones in common use should be considered. Even some basic ones that strip all letters or all numbers would save the often annoying process of writing regex (which, since there are multiple ways to get the same result, breeds inconsistency).

#10 @SergeyBiryukov
6 years ago

  • Version changed from trunk to 3.8

#11 @chriscct7
4 years ago

  • Keywords needs-patch added