WordPress.org

Make WordPress Core

Opened 3 months ago

Last modified 9 days ago

#53450 new enhancement

[WP_Meta_Query] Add faster LIKE based 'STARTSWITH' and 'ENDSWITH' compare modes for value query

Reported by: janthiel Owned by:
Milestone: Awaiting Review Priority: normal
Severity: trivial Version: 5.7.2
Component: Query Keywords: dev-feedback has-patch needs-docs needs-codex has-unit-tests
Focuses: performance Cc:

Description

Currently the "LIKE" compare mode for meta value compares is only usable for contains queries as it always adds wildcards around all queries LIKE '%query%'. This makes one use the more complex REGEXP compare mode for queries which easily could be written with LIKE '%query' or LIKE 'query%'.

As LIKE is faster than REGEXP it is preferable to use LIKE.
See: http://billauer.co.il/blog/2020/12/mysql-index-pattern-matching-performance/

In addition people don't have to use the much more complex regex. Which tends to introduce errors in my experience as most people just copy & paste but do not understand how regex really works (not meant as an offense). So REGEXP should be avoided if possible.

I would suggest naming the new compare types STARTSWITH and ENDSWITH. Also adding their NON ... counter parts to match up the LIKE behaviour.

Maybe also add an alias for LIKE named CONTAINS as the current naming LIKE suggests that you could pass in the wildcards yourself. Which is not the case and thus misleading. But this is just for the sake of the tickets completenes. The pull request only contains code and tests for the new modes.

As an alternative I thought about reworking the current LIKE mode to allow custom wildcard passing. But this will clearly break backward compat and thus I discarded this approach.

Attachments (7)

53450.patch (6.0 KB) - added by janthiel 3 months ago.
53450_v2.diff (5.6 KB) - added by janthiel 3 months ago.
Fixed codestyle and typos
53450_v3.diff (6.8 KB) - added by janthiel 3 months ago.
Added more complex Test cases
53450_v4.diff (16.5 KB) - added by janthiel 8 weeks ago.
Adds the new operators on top of LIKE / NOT LIKE for value and key meta query operators
53450_v5.diff (16.5 KB) - added by janthiel 8 weeks ago.
Minor codestyle fix
53450_v6.diff (16.6 KB) - added by janthiel 8 weeks ago.
Final codestyle fixes
53450_v7.diff (16.6 KB) - added by janthiel 4 days ago.
Implemented Buds feedback on comment order

Download all attachments as: .zip

Change History (24)

This ticket was mentioned in PR #1392 on WordPress/wordpress-develop by JanThiel.


3 months ago

Adds the new LIKE based compare operators STARTSWITH, ENDSWITH, NOT STARTSWITH, NOT ENDSWITH to the meta query value compares.
They allow more performant and easier regular query cases than REGEXP which would be the alternative.

Trac ticket: https://core.trac.wordpress.org/ticket/53450

@janthiel
3 months ago

#2 @prbot
3 months ago

JanThiel commented on PR #1392:

Will take care of the failing test and CS tomorrow :-)

@janthiel
3 months ago

Fixed codestyle and typos

#3 @jorbin
3 months ago

Thanks Jantheil, this feels like a good improvement.

One improvement to the tests I would like to see here is to do some things that should be more explicitly excluded. For example, in test_meta_query_single_query_compare_startswith, having a post that has a meta that contains but does not start with ba but does contain it.

Also, tagging @boonebgorges to see if he has any opinion on this since he did a good amount of work on meta queries.

#4 @janthiel
3 months ago

Good catch @jorbin regarding the tests. I will add those tomorrow.

This ticket was mentioned in Slack in #core by jeffpaul. View the logs.


3 months ago

This ticket was mentioned in Slack in #core by janthiel. View the logs.


3 months ago

#7 @janthiel
3 months ago

  • Version changed from trunk to 5.7.2

@janthiel
3 months ago

Added more complex Test cases

#8 @janthiel
3 months ago

@jorbin I just added more complex test cases as requested to all new operators. As expected, it all looks still good ;-)

Would any maintainer mind approving the workflows for the open GitHub PR?
https://github.com/WordPress/wordpress-develop/pull/1392

Just to be sure it all works out in the official test pipeline as well.

Thanks!

#9 @boonebgorges
3 months ago

@jorbin Thanks for the ping, and @janthiel thanks for the ticket.

I agree strongly with the motivation for this ticket. REGEXP should be avoided where possible, and it's likely that many of the real-world uses of REGEXP would be covered by the suggested new feature.

My one misgiving is about the syntax. Currently, all values of compare are MySQL keywords. STARTSWITH etc, being passed to compare (especially in all upper-case), give the false appearance of being keywords. I'm concerned that this could cause confusion.

Here's an idea for an alternative syntax. A number of years ago, BuddyPress introduced a similar feature, but added a separate parameter to trigger it: search_wildcard, which defaults to 'both' but accepts 'right' or 'left'. See https://buddypress.trac.wordpress.org/ticket/5769

In the case of meta queries, this might look like this:

[
  'key'                    => 'foo',
  'value'                  => 'bar'
  'compare'                => 'LIKE',
  'like_wildcard_position' => 'left', // defaults to 'both'
]

We might find a better name for the parameter than this, and we might also consider 'start' and 'end' instead of 'left' and 'right'.

In any case, I think that this kind of syntax is better for internal consistency. Does anyone have thoughts about whether this is a better or worse idea?

#10 @janthiel
3 months ago

@boonebgorges I agree that anything based upon the LIKE compare key feels better than introducing new keywords.
Yet I do not believe that most users do really understand or care that the passed in operators for compare are actually SQL keywords. So introducing another additional subkey makes it kind of harder to use. Whereas the STARTSWITH and ENDSWITH keywords are very easy to get and document. As people will see them directly when they read about which compare modes are offered. So I believe offering new, easy to understand keywords as 1st level citizens is more convenient from a users point of view.

Anyway, I also get that from a technical and architectural point of view it is very consequent to only support SQL keywords as 1st level compare keys. If we go with your suggestion - and I can absolutely support that - what about:

[
  'key'                    => 'foo',
  'value'                  => 'bar'
  'compare'                => 'LIKE',
  'like_compare_mode'      => 'startswith', // (startswith | endswith | contains) defaults to 'contains'
]

#11 @boonebgorges
3 months ago

Thanks @janthiel. Your syntax seems fine to me, though it'd be great to get another opinion. Maybe @jorbin or @desrosj (who I see is a watcher) could chime in with thoughts? Once we're happy with the parameter naming, we can work on an updated patch.

@janthiel
8 weeks ago

Adds the new operators on top of LIKE / NOT LIKE for value and key meta query operators

#12 @janthiel
8 weeks ago

@boonebgorges @desrosj I just uploaded the new patch. It now adds two more properties to the Meta Query and removes the non-sql keywords for the compare mode:

	 *         @type string $compare_key_like_mode  Search mode for LIKE compares. Accepts 'startswith ', 'endswith' or
	 *                                              'contains'. Default is 'contains'.
	 *         @type string $compare_like_mode      Search mode for LIKE compares. Accepts 'startswith ', 'endswith' or
	 *                                              'contains'. Default is 'contains'.

Those two will then select an appropriate wildcard "template" for the LIKE / NOT LIKE queries. It works both for value and key. I chose the naming to be very close to the $compare and $compare_key variables to make clear that there is a relation between both.
I also added the capability to use both properties as meta_... props in the WP_Query directly. Shouldn't hurt.

I also added and extended several test cases to cover the new scenarios and combination possibilities.

Let me know what you think.

@janthiel
8 weeks ago

Minor codestyle fix

@janthiel
8 weeks ago

Final codestyle fixes

This ticket was mentioned in Slack in #core by hellofromtonya. View the logs.


2 weeks ago

#14 follow-up: @manzwebdesigns
2 weeks ago

  • Severity changed from normal to trivial

The only suggestion I would make is to make the following changes, moving 'contains' to the beginning of the list of accepted options:

@type string $compare_key_like_mode  Search mode for LIKE compares. Accepts 'contains', 'startswith ' or
                                     'endswith'. Default is 'contains'.
@type string $compare_like_mode      Search mode for LIKE compares. Accepts 'contains', 'startswith ' or
                                     'endswith'. Default is 'contains'.

#15 in reply to: ↑ 14 ; follow-up: @janthiel
11 days ago

Replying to manzwebdesigns:

The only suggestion I would make is to make the following changes, moving 'contains' to the beginning of the list of accepted options:

@type string $compare_key_like_mode  Search mode for LIKE compares. Accepts 'contains', 'startswith ' or
                                     'endswith'. Default is 'contains'.
@type string $compare_like_mode      Search mode for LIKE compares. Accepts 'contains', 'startswith ' or
                                     'endswith'. Default is 'contains'.

Hi @manzwebdesigns Thank you very much for your input. Could you be so kind and explain to me why you propose this change?
I didn't spend much thinking about the order of the comment till now :-) Yet when considering it, all orders are kind of reasonable. Alphabetical would be as well as a "logical" order where "contains" would be between "startsWith" and "endsWith". My current implementation groups "...with" and closes with "contains". So it is kind of similar to your suggestion. Just reversed.

Just curious.

Thanks, Jan :-)

#16 in reply to: ↑ 15 ; follow-up: @manzwebdesigns
9 days ago

Replying to janthiel:

Replying to manzwebdesigns:

The only suggestion I would make is to make the following changes, moving 'contains' to the beginning of the list of accepted options:

@type string $compare_key_like_mode  Search mode for LIKE compares. Accepts 'contains', 'startswith ' or
                                     'endswith'. Default is 'contains'.
@type string $compare_like_mode      Search mode for LIKE compares. Accepts 'contains', 'startswith ' or
                                     'endswith'. Default is 'contains'.

Hi @manzwebdesigns Thank you very much for your input. Could you be so kind and explain to me why you propose this change?
I didn't spend much thinking about the order of the comment till now :-) Yet when considering it, all orders are kind of reasonable. Alphabetical would be as well as a "logical" order where "contains" would be between "startsWith" and "endsWith". My current implementation groups "...with" and closes with "contains". So it is kind of similar to your suggestion. Just reversed.

Just curious.

Thanks, Jan :-)

Hi @janthiel,

You are very welcome. I just noticed that the other parameters' default options were listed first... it is fine either way, but when I do this, I always put the default option first.

Thanks,
Bud

#17 in reply to: ↑ 16 @janthiel
9 days ago

Replying to manzwebdesigns:

Hi @janthiel,

You are very welcome. I just noticed that the other parameters' default options were listed first... it is fine either way, but when I do this, I always put the default option first.

Thanks,
Bud

Hey Bud,

That sounds reasonable and logical. I will change the patch accordingly :-)

Best regards,
Jan

@janthiel
4 days ago

Implemented Buds feedback on comment order

Note: See TracTickets for help on using tickets.