Make WordPress Core

Opened 12 months ago

Last modified 5 weeks ago

#58921 new defect (bug)

wp_kses_allowed_html doesn't allow to add esi:include

Reported by: alekv's profile alekv Owned by:
Milestone: Awaiting Review Priority: normal
Severity: normal Version: 3.5
Component: Formatting Keywords: has-patch has-unit-tests
Focuses: Cc:

Description

In my plugin I want to implement an [ESI exclusion](https://www.w3.org/TR/esi-lang/) for Litespeed using their filter like this:

<?php
echo apply_filters(
        'litespeed_esi_url',
        'pmw_data_layer',
        'Inject data layer through ESI block');

The filter returns a script that contains an HTML comment and an ESI exclusion tag (which is not a standard HTML tag):

<!-- lscwp Inject data layer through ESI block -->
<esi:include src='/?lsesi=pmw_data_layer&_control=private%2Cno-vary&_hash=6b8400dc86345e005f6cbea3e58da1e2' cache-control='private,no-vary' />
<!-- lscwp Inject data layer through ESI block esi end -->

However, using the unescaped echo is bad practice and is not allowed by WooCommerce (my plugin is published on woocomerce.com, so I can't upload a version that's using echo).

So I tried using wp_kses.

Using the wp_kses_allowed_html filter, one could add custom tags to wp_kses. But it doesn’t work in this particular case because the ESI tag contains a colon esi:include.

I’m stuck here. I can’t change the [ESI specification](https://www.w3.org/TR/esi-lang/) which specifies the tag esi:include.

Litespeed (or any other technology that uses ESI) must consume the esi:include tag to work. That means we need to output the script somehow.

I think there should be a way to add custom tags like esi:include.

Change History (17)

#1 @swissspidy
12 months ago

  • Component changed from Plugins to Formatting
  • Version changed from 6.2.2 to 3.5

Related: #34105

This ticket was mentioned in PR #4923 on WordPress/wordpress-develop by @rajinsharwar.


12 months ago
#2

  • Keywords has-patch added

Adding ESI in the allowed HTML list

Trac ticket: https://core.trac.wordpress.org/ticket/58921

alewolf commented on PR #4923:


12 months ago
#3

This patch doesn't work and won't solve the issue. It doesn't allow the full tag name which is esi:include.

And adding esi:include to that allow list won't work either. It get's ignored because of the colon somehow.

This ticket was mentioned in PR #4926 on WordPress/wordpress-develop by alewolf.


12 months ago
#4

This allows HTML tag names that contain a colon such as <esi:include>

The changed regex now allows zero or one colon within the tag name.

(<esi:include> tags are specified by the ESI Languages Specification: https://www.w3.org/TR/esi-lang/)

Trac ticket: https://core.trac.wordpress.org/ticket/58921

alewolf commented on PR #4926:


12 months ago
#5

Here's the regex on regex101.com if you like to investigate it further: https://regex101.com/r/vI98sd/1

Bear in mind that I had to add one backslash to make it work in regex101. The backslash is not necessary in the PHP code.

Regex101 code: `^<\s*(\/\s*)?([a-zA-Z0-9-]+:?[a-zA-Z0-9-]+)([^>]*)>?`
PHP code:      `^<\s*(/\s*)?([a-zA-Z0-9-]+:?[a-zA-Z0-9-]+)([^>]*)>?`

This ticket was mentioned in Slack in #core by alekv. View the logs.


12 months ago

This ticket was mentioned in PR #4929 on WordPress/wordpress-develop by alewolf.


12 months ago
#7

  • Keywords has-unit-tests added

This allows HTML tag names that contain a colon such as <esi:include>

The changed regex now allows zero or one colon within the tag name.

(<esi:include> tags are specified by the ESI Languages Specification: https://www.w3.org/TR/esi-lang/)

Here's the regex on regex101.com if you like to investigate it further: https://regex101.com/r/qujMNv/1

Bear in mind that I had to add one backslash to make it work in regex101. The backslash is not necessary in the PHP code.

Regex101 code: `^<\s*(\/\s*)?([a-zA-Z0-9-:]+)([^>]*)>?$`
PHP code:      `^<\s*(/\s*)?([a-zA-Z0-9-:]+)([^>]*)>?$`

(This PR replaces the PR https://github.com/WordPress/wordpress-develop/pull/4926)

Trac ticket: https://core.trac.wordpress.org/ticket/58921

alewolf commented on PR #4926:


12 months ago
#8

I created a new PR for this that now includes a unit test: https://github.com/WordPress/wordpress-develop/pull/4929

#9 follow-up: @costdev
12 months ago

The change in #34105 made sense as custom elements can have hyphens in multiple places.

The elements listed in ESI Language Specification 1.0 only seem to indicate a prefix of esi:, so allowing : in general may not be preferable when we could just allow a tag to start with esi:.

This way:

  • We can be specific in adding support for ESI.
  • We won't lock ourselves into a BC promise for some other format that we aren't intending to support at this time.

For example:

^<\s*(/\s*)?((?:esi:)?[a-zA-Z0-9-]+)([^>]*)>?$

#10 in reply to: ↑ 9 @alekv
12 months ago

Replying to costdev:

Yes, that makes sense. I'll change the regex and submit a new commit.

#11 follow-up: @peterwilsoncc
12 months ago

I don't think it's wise to add the tag to the default list of supported elements, as in in PR#4923, the main reason being that it should only be permitted on servers that support the tag.

However, it may be worth supporting the syntax, as in the follow up pull requests, to allow developers to add it to the list of supported tags via a plugin making use of the various filters in KSES.

To that end, the discussion on this ticket can become:

  • continue with no support (close the ticket without a change to the code)
  • support tags containing a colon :
  • support the specific prefix esi:

As the linked w3 document is a note rather than a specification, it would be good to know how widely supported ESI tags are supported?

#12 in reply to: ↑ 11 @alekv
12 months ago

Replying to peterwilsoncc:

To that end, the discussion on this ticket can become:

  • support the specific prefix esi:

The latest PR does exactly that. It adds support for using the esi: prefix specifically.

As the linked w3 document is a note rather than a specification, it would be good to know how widely supported ESI tags are supported?

The LiteSpeed Cache WordPress plugin, with over 4mio active installs, fully supports ESI.

Other popular services that support ESI:

Not every service currently supports or wants to support ESI:

  • Cloudflare says that "old school CDNs" support ESI. And they (Cloudflare) promote their own no-code solution that achieves the same as ESI.
  • KeyCDN points out that it requires too much technical know-how to implement ESI for the end user and is not good for TTFB. So they will only implement ESI if it becomes a W3C standard.

So, pretty large services support ESI. But it doesn't tell us much about usage, and I couldn't find much about that. I only can deduct from what experience I have in relation to the LiteSpeed Cache plugin.

Not all the websites activate ESI in LiteSpeed Cache, and it only makes sense to use it for logged-in users. So this narrows it down to be mostly useful to membership websites and e-commerce websites that offer to create customer accounts. That's certainly not millions, but it could be tens of thousands if not hundreds of thousands of websites that use ESI. And I don't have any numbers of how many of the 50'000 Pixel Manager for WooCommerce users also use LiteSpeed Cache with ESI enabled.

Out of own experience, we had only a few customers (of the Pixel Manager for WooCommerce) to reach out to us regarding ESI support for LiteSpeed Cache. However, those are the more technical users who cared about looking into it more closely and asking. Since the Pixel Manager outputs PII for logged-in users, its output must be excluded if ESI is enabled in LiteSpeed Cache. For now, I implemented a way that disables the entire caching (for logged-in users) if ESI is enabled to ensure that no PII is cached. But we had complaints from our users about that because they'd prefer to use ESI for what it's made for. We must assume that every LiteSpeed Cache user who has enabled ESI wants to use it the way it's made for (and not disable caching for logged-in users), which going back to the above numbers, is probably something around tens of thousands up to hundreds of thousands of users.

Last edited 12 months ago by alekv (previous) (diff)

#13 @alekv
10 months ago

@peterwilsoncc and @costdev

I've provided the information you asked for. Do you need anything else? Or is this ready to be committed to core?

@martin.krcho commented on PR #4926:


5 weeks ago
#14

@alewolf can you please close this PR in favor of https://github.com/WordPress/wordpress-develop/pull/4929?

@martin.krcho commented on PR #4923:


5 weeks ago
#15

@Rajinsharwar have a look at the approach implemented in https://github.com/WordPress/wordpress-develop/pull/4929, please.

Would you also please close this pull request? Thank you

alewolf commented on PR #4926:


5 weeks ago
#16

@martinkrcho How do you mean? It is already closed...

@martin.krcho commented on PR #4926:


5 weeks ago
#17

Scratch that. I didn't realite it's already closed.

Note: See TracTickets for help on using tickets.