Make WordPress Core

Opened 3 years ago

Last modified 22 months ago

#53450 new enhancement

[WP_Meta_Query] Add faster LIKE based 'STARTSWITH' and 'ENDSWITH' compare modes for value query

Reported by: janthiel's profile janthiel Owned by:
Milestone: Future Release Priority: normal
Severity: trivial Version:
Component: Query Keywords: dev-feedback has-patch needs-docs needs-codex has-unit-tests early early-like-actually-early needs-testing
Focuses: performance Cc:

Description

Currently the "LIKE" compare mode for meta value compares is only usable for contains queries as it always adds wildcards around all queries LIKE '%query%'. This makes one use the more complex REGEXP compare mode for queries which easily could be written with LIKE '%query' or LIKE 'query%'.

As LIKE is faster than REGEXP it is preferable to use LIKE.
See: http://billauer.co.il/blog/2020/12/mysql-index-pattern-matching-performance/

In addition people don't have to use the much more complex regex. Which tends to introduce errors in my experience as most people just copy & paste but do not understand how regex really works (not meant as an offense). So REGEXP should be avoided if possible.

I would suggest naming the new compare types STARTSWITH and ENDSWITH. Also adding their NON ... counter parts to match up the LIKE behaviour.

Maybe also add an alias for LIKE named CONTAINS as the current naming LIKE suggests that you could pass in the wildcards yourself. Which is not the case and thus misleading. But this is just for the sake of the tickets completenes. The pull request only contains code and tests for the new modes.

As an alternative I thought about reworking the current LIKE mode to allow custom wildcard passing. But this will clearly break backward compat and thus I discarded this approach.

Attachments (7)

53450.patch (6.0 KB) - added by janthiel 3 years ago.
53450_v2.diff (5.6 KB) - added by janthiel 3 years ago.
Fixed codestyle and typos
53450_v3.diff (6.8 KB) - added by janthiel 3 years ago.
Added more complex Test cases
53450_v4.diff (16.5 KB) - added by janthiel 3 years ago.
Adds the new operators on top of LIKE / NOT LIKE for value and key meta query operators
53450_v5.diff (16.5 KB) - added by janthiel 3 years ago.
Minor codestyle fix
53450_v6.diff (16.6 KB) - added by janthiel 3 years ago.
Final codestyle fixes
53450_v7.diff (16.6 KB) - added by janthiel 3 years ago.
Implemented Buds feedback on comment order

Download all attachments as: .zip

Change History (50)

This ticket was mentioned in PR #1392 on WordPress/wordpress-develop by JanThiel.


3 years ago
#1

Adds the new LIKE based compare operators STARTSWITH, ENDSWITH, NOT STARTSWITH, NOT ENDSWITH to the meta query value compares.
They allow more performant and easier regular query cases than REGEXP which would be the alternative.

Trac ticket: https://core.trac.wordpress.org/ticket/53450

@janthiel
3 years ago

JanThiel commented on PR #1392:


3 years ago
#2

Will take care of the failing test and CS tomorrow :-)

@janthiel
3 years ago

Fixed codestyle and typos

#3 @jorbin
3 years ago

Thanks Jantheil, this feels like a good improvement.

One improvement to the tests I would like to see here is to do some things that should be more explicitly excluded. For example, in test_meta_query_single_query_compare_startswith, having a post that has a meta that contains but does not start with ba but does contain it.

Also, tagging @boonebgorges to see if he has any opinion on this since he did a good amount of work on meta queries.

#4 @janthiel
3 years ago

Good catch @jorbin regarding the tests. I will add those tomorrow.

This ticket was mentioned in Slack in #core by jeffpaul. View the logs.


3 years ago

This ticket was mentioned in Slack in #core by janthiel. View the logs.


3 years ago

#7 @janthiel
3 years ago

  • Version changed from trunk to 5.7.2

@janthiel
3 years ago

Added more complex Test cases

#8 @janthiel
3 years ago

@jorbin I just added more complex test cases as requested to all new operators. As expected, it all looks still good ;-)

Would any maintainer mind approving the workflows for the open GitHub PR?
https://github.com/WordPress/wordpress-develop/pull/1392

Just to be sure it all works out in the official test pipeline as well.

Thanks!

#9 @boonebgorges
3 years ago

@jorbin Thanks for the ping, and @janthiel thanks for the ticket.

I agree strongly with the motivation for this ticket. REGEXP should be avoided where possible, and it's likely that many of the real-world uses of REGEXP would be covered by the suggested new feature.

My one misgiving is about the syntax. Currently, all values of compare are MySQL keywords. STARTSWITH etc, being passed to compare (especially in all upper-case), give the false appearance of being keywords. I'm concerned that this could cause confusion.

Here's an idea for an alternative syntax. A number of years ago, BuddyPress introduced a similar feature, but added a separate parameter to trigger it: search_wildcard, which defaults to 'both' but accepts 'right' or 'left'. See https://buddypress.trac.wordpress.org/ticket/5769

In the case of meta queries, this might look like this:

[
  'key'                    => 'foo',
  'value'                  => 'bar'
  'compare'                => 'LIKE',
  'like_wildcard_position' => 'left', // defaults to 'both'
]

We might find a better name for the parameter than this, and we might also consider 'start' and 'end' instead of 'left' and 'right'.

In any case, I think that this kind of syntax is better for internal consistency. Does anyone have thoughts about whether this is a better or worse idea?

#10 @janthiel
3 years ago

@boonebgorges I agree that anything based upon the LIKE compare key feels better than introducing new keywords.
Yet I do not believe that most users do really understand or care that the passed in operators for compare are actually SQL keywords. So introducing another additional subkey makes it kind of harder to use. Whereas the STARTSWITH and ENDSWITH keywords are very easy to get and document. As people will see them directly when they read about which compare modes are offered. So I believe offering new, easy to understand keywords as 1st level citizens is more convenient from a users point of view.

Anyway, I also get that from a technical and architectural point of view it is very consequent to only support SQL keywords as 1st level compare keys. If we go with your suggestion - and I can absolutely support that - what about:

[
  'key'                    => 'foo',
  'value'                  => 'bar'
  'compare'                => 'LIKE',
  'like_compare_mode'      => 'startswith', // (startswith | endswith | contains) defaults to 'contains'
]

#11 @boonebgorges
3 years ago

Thanks @janthiel. Your syntax seems fine to me, though it'd be great to get another opinion. Maybe @jorbin or @desrosj (who I see is a watcher) could chime in with thoughts? Once we're happy with the parameter naming, we can work on an updated patch.

@janthiel
3 years ago

Adds the new operators on top of LIKE / NOT LIKE for value and key meta query operators

#12 @janthiel
3 years ago

@boonebgorges @desrosj I just uploaded the new patch. It now adds two more properties to the Meta Query and removes the non-sql keywords for the compare mode:

	 *         @type string $compare_key_like_mode  Search mode for LIKE compares. Accepts 'startswith ', 'endswith' or
	 *                                              'contains'. Default is 'contains'.
	 *         @type string $compare_like_mode      Search mode for LIKE compares. Accepts 'startswith ', 'endswith' or
	 *                                              'contains'. Default is 'contains'.

Those two will then select an appropriate wildcard "template" for the LIKE / NOT LIKE queries. It works both for value and key. I chose the naming to be very close to the $compare and $compare_key variables to make clear that there is a relation between both.
I also added the capability to use both properties as meta_... props in the WP_Query directly. Shouldn't hurt.

I also added and extended several test cases to cover the new scenarios and combination possibilities.

Let me know what you think.

@janthiel
3 years ago

Minor codestyle fix

@janthiel
3 years ago

Final codestyle fixes

This ticket was mentioned in Slack in #core by hellofromtonya. View the logs.


3 years ago

#14 follow-up: @manzwebdesigns
3 years ago

  • Severity changed from normal to trivial

The only suggestion I would make is to make the following changes, moving 'contains' to the beginning of the list of accepted options:

@type string $compare_key_like_mode  Search mode for LIKE compares. Accepts 'contains', 'startswith ' or
                                     'endswith'. Default is 'contains'.
@type string $compare_like_mode      Search mode for LIKE compares. Accepts 'contains', 'startswith ' or
                                     'endswith'. Default is 'contains'.

#15 in reply to: ↑ 14 ; follow-up: @janthiel
3 years ago

Replying to manzwebdesigns:

The only suggestion I would make is to make the following changes, moving 'contains' to the beginning of the list of accepted options:

@type string $compare_key_like_mode  Search mode for LIKE compares. Accepts 'contains', 'startswith ' or
                                     'endswith'. Default is 'contains'.
@type string $compare_like_mode      Search mode for LIKE compares. Accepts 'contains', 'startswith ' or
                                     'endswith'. Default is 'contains'.

Hi @manzwebdesigns Thank you very much for your input. Could you be so kind and explain to me why you propose this change?
I didn't spend much thinking about the order of the comment till now :-) Yet when considering it, all orders are kind of reasonable. Alphabetical would be as well as a "logical" order where "contains" would be between "startsWith" and "endsWith". My current implementation groups "...with" and closes with "contains". So it is kind of similar to your suggestion. Just reversed.

Just curious.

Thanks, Jan :-)

#16 in reply to: ↑ 15 ; follow-up: @manzwebdesigns
3 years ago

Replying to janthiel:

Replying to manzwebdesigns:

The only suggestion I would make is to make the following changes, moving 'contains' to the beginning of the list of accepted options:

@type string $compare_key_like_mode  Search mode for LIKE compares. Accepts 'contains', 'startswith ' or
                                     'endswith'. Default is 'contains'.
@type string $compare_like_mode      Search mode for LIKE compares. Accepts 'contains', 'startswith ' or
                                     'endswith'. Default is 'contains'.

Hi @manzwebdesigns Thank you very much for your input. Could you be so kind and explain to me why you propose this change?
I didn't spend much thinking about the order of the comment till now :-) Yet when considering it, all orders are kind of reasonable. Alphabetical would be as well as a "logical" order where "contains" would be between "startsWith" and "endsWith". My current implementation groups "...with" and closes with "contains". So it is kind of similar to your suggestion. Just reversed.

Just curious.

Thanks, Jan :-)

Hi @janthiel,

You are very welcome. I just noticed that the other parameters' default options were listed first... it is fine either way, but when I do this, I always put the default option first.

Thanks,
Bud

#17 in reply to: ↑ 16 @janthiel
3 years ago

Replying to manzwebdesigns:

Hi @janthiel,

You are very welcome. I just noticed that the other parameters' default options were listed first... it is fine either way, but when I do this, I always put the default option first.

Thanks,
Bud

Hey Bud,

That sounds reasonable and logical. I will change the patch accordingly :-)

Best regards,
Jan

@janthiel
3 years ago

Implemented Buds feedback on comment order

This ticket was mentioned in Slack in #core by janthiel. View the logs.


3 years ago

#19 @audrasjb
3 years ago

  • Milestone changed from Awaiting Review to 5.9

Moving for 5.9 consideration.

#20 @hellofromTonya
3 years ago

  • Keywords early early-like-actually-early added
  • Milestone changed from 5.9 to Future Release

Query changes by design need careful consideration and plenty of soak time to ensure nothing breaks. With 5.9 feature freeze ~2 days away, it's too late in the release cycle for a change to query. Punting this out of 5.9. As the next milestone is not yet available, moving to Future Release. Once available, feel free to move into the next release.

#21 @janthiel
3 years ago

Good Morning @hellofromTonya,

can you let me know what is missing to this ticket to make it work? I would like to understand what I can do to make it work for at least the next release.

Currently I wouldn't know what else I could do as:

  • The ticket started within the 5.8 release cycle and was completed before planing of 5.9 even started
  • There is positive Core Contributor Feedback (#3, #9)
  • There is positive DEV Feedback (#16)
  • There is extensive test coverage around the WP_Query / WP_Meta_Query with added positive and negative tests to cover the new queries and asure they don't break existing code
  • I pushed this topic several times to the DEV chat agenda

I totally agree that central code has to be tested thoroughly, but that rarely happens only because a ticket is slacking around for longer. So following your concerns would also mean that one cannot trust the automatic tests. As they should deliver exactly the safety you mentioned.

As usual I am thankful for your advice on how to proceed here :-)
Thank you very much and best regards!

#22 @hellofromTonya
3 years ago

Hello @janthiel, Once 5.9 is in Beta (a week from now), let's circle back and figure out what the next steps are for review and testing. 6.0 technically starts in a week.

#23 follow-up: @ironprogrammer
2 years ago

Hi, @janthiel! Thanks for sticking with this enhancement. As you know 5.9 was a little late, but I think this can be refreshed and ready for 6.0.

There have been several DocBlock updates to WP_Meta_Query that presented merge conflicts during recent testing of this PR. Would you be able to refresh off latest trunk? You can also update to @since 6.0.0 🤞🏻

Nice work on the unit tests! I think they could be a little more consistent by using factory()->post->create_many, $query, and $expected across the eight new tests. This would help readability.

Looking forward to helping test and get this committed! 🙌🏻

#24 in reply to: ↑ 23 ; follow-up: @janthiel
2 years ago

Replying to ironprogrammer:

Nice work on the unit tests! I think they could be a little more consistent by using factory()->post->create_many, $query, and $expected across the eight new tests. This would help readability.

Hey @ironprogrammer Thanks for your feedback. Updating the PR for 6.0.0 is not an issue at all.
Yet I do not understand what you mean with $query and $expected. create_many is an obvious change. The other two are not.
Can you please provide me with reference to the changes you propose for $query and $expected?

Thanks :-)

#25 in reply to: ↑ 24 @janthiel
2 years ago

Replying to janthiel:

Replying to ironprogrammer:

Nice work on the unit tests! I think they could be a little more consistent by using factory()->post->create_many, $query, and $expected across the eight new tests. This would help readability.

Hey @ironprogrammer Thanks for your feedback. Updating the PR for 6.0.0 is not an issue at all.
Yet I do not understand what you mean with $query and $expected. create_many is an obvious change. The other two are not.
Can you please provide me with reference to the changes you propose for $query and $expected?

Thanks :-)

Ah ... never mind. Just saw what you mean. As I only copied the tests just the last ones were missing $query and $expected :-)

Fixed that too.

#26 @ironprogrammer
2 years ago

Hi, @janthiel:

Yep, those were what I was referring to. I'm glad you could decipher my terse note! The PR now applies without merge issues. ✅

If you have time, please run through the new inline comments to be sure they conform to the inline doc standards (including punctuation). I only left one example highlighting that in GitHub, as I didn't want to litter the code with trivial comments (I tried not to be persnickety 🤓).

In summary, LGTM! 👍🏻

Test Report

PR 1392

Env

  • WordPress 6.0-alpha-52448-src
  • Safari 15.3
  • macOS 12.2.1 (Monterey)

Steps to Test

  1. Run phpunit --filter Tests_Query_MetaQuery.
  2. [Optional] For deeper coverage, run phpunit --group query,meta.
  3. Observe that all tests pass. ✅

Performance Example

It would be nice to have a real-world test that compares REGEXP/RLIKE to LIKE using the new operator options. Perhaps it could be added to the WP_Query documentation in the sample code section (once this drops, of course). I'll poke around my test data to see if I can put something together, but please share if you've got anything yourself!

#27 @janthiel
2 years ago

Regarding example code:
Let's stick with the Stack Overflow questions regarding "WordPress Meta Query Startswith" / "- Endswith".

Just picked some of them which seem relevant to me:

https://wordpress.stackexchange.com/a/159433
https://wordpress.stackexchange.com/q/178302

The "Email" example is a good one:

Task: Get all Posts / Users where the email address stars- or ends with something particular.

Before 6.0.0:

    'meta_query' => array(
        array(
            'key'       => 'email_address',
            'value'     => '^hello@',
            'compare'   => 'REGEXP',
        )
    )

Or:

    'meta_query' => array(
        array(
            'key'       => 'email_address',
            'value'     => '@example\.com$',
            'compare'   => 'REGEXP',
        )
    )

As of 6.0.0:

    'meta_query' => array(
        array(
            'key'               => 'email_address',
            'value'             => 'hello@',
            'compare'           => 'LIKE',
            'compare_like_mode' => 'startswith'
        )
    )

Or:

    'meta_query' => array(
        array(
            'key'               => 'email_address',
            'value'             => '@example.com',
            'compare'           => 'LIKE',
            'compare_like_mode' => 'endswith'
        )
    )

#28 @ironprogrammer
2 years ago

  • Keywords needs-testing added

Making sure this bubbles up in test circles..

This ticket was mentioned in Slack in #core-test by ironprogrammer. View the logs.


2 years ago

#30 @mukesh27
2 years ago

  • Keywords needs-refresh added
  • Milestone changed from Future Release to 6.1
  • Version 5.7.2 deleted

The patch needs to update against trunk. Moving for 6.1 consideration.

This ticket was mentioned in Slack in #core by audrasjb. View the logs.


2 years ago

#32 @SergeyBiryukov
2 years ago

Hi there, thanks for the ticket!

This was discussed in a 6.1-early bug scrub on WordPress Slack with @hellofromTonya, @audrasjb, and @markjaquith. Some of the initial feedback:

  • The ticket needs a strategy to help it move forward including a testing strategy to validate it works and discover potential unknowns.
  • As this is about performance, benchmarking will also be needed.

Replying to janthiel:

As an alternative I thought about reworking the current LIKE mode to allow custom wildcard passing. But this will clearly break backward compat and thus I discarded this approach.

There was some consensus in the chat that this approach should be explored further, allowing people to place the % in the value as they wish. Laravel does it like that:

->where('column', 'like', 'sometext%')

The backward compatibility concerns can be reviewed to see if they can be mitigated, perhaps via a way to mark up values as raw.

This ticket was mentioned in Slack in #core by chaion07. View the logs.


2 years ago

This ticket was mentioned in Slack in #core by audrasjb. View the logs.


2 years ago

This ticket was mentioned in Slack in #core by audrasjb. View the logs.


2 years ago

#36 @audrasjb
2 years ago

  • Milestone changed from 6.1 to Future Release

Hello there,

This ticket is marked early-like-actually-early. As per today's bugscrub, let's move it to Future Release since that kind of ticket needs to get a strong momentum before commit, with a lot of tests and reviews from experienced contributors.

Hopefully in will be ready to ship for the next release.

mehulkaklotar commented on PR #1392:


22 months ago
#37

@JanThiel Can you update the PR from the latest trunk? I would like to jump on the PR changes for the testing and benchmarking it. Let's try to make it to 6.1

JanThiel commented on PR #1392:


22 months ago
#38

Will try, but no promises. I am on vacation starting Friday. Workload is already quite maxed.

Thanks for hopping on board this PR anyway :-)

mehulkaklotar commented on PR #1392:


22 months ago
#39

@JanThiel can you add me t your repo as collaborator for this branch or push access for this branch. so that I can do changes if needed? If you can do that, would be awesome.

JanThiel commented on PR #1392:


22 months ago
#40

@mehulkaklotar Trunk is merged. Thanks for your support :-)
I wasn't able to find a quick solution to add you just to one branch in my repo.

mehulkaklotar commented on PR #1392:


22 months ago
#41

@JanThiel I think you can add some branch protection rules for the collaborators.

Or I will create a PR against your branch in your repo. That way you will not have to add me to your repo.

Whichever is best for you. Thanks!

JanThiel commented on PR #1392:


22 months ago
#42

@mehulkaklotar Checks are all good now against trunk. If you want to add changes, feel free to fork my branch and open a separate PR against the WordPress repo referencing the trac ticket.
As written before, I will not be near a PC within the next 3 weeks. So I cannot handle any PRs from your end.

This ticket was mentioned in PR #3293 on WordPress/wordpress-develop by mehulkaklotar.


22 months ago
#43

  • Keywords needs-refresh removed

Adds the new LIKE based compare operators STARTSWITH, ENDSWITH, NOT STARTSWITH, NOT ENDSWITH to the meta query value compares.
They allow more performant and easier regular query cases than REGEXP which would be the alternative.

This PR is a fork from existing PR as contributor is temporarily unavailable to work on the changes.

Trac ticket: https://core.trac.wordpress.org/ticket/53450

Note: See TracTickets for help on using tickets.