Make WordPress Core

Opened 6 years ago

Closed 19 months ago

Last modified 19 months ago

#43867 closed feature request (fixed)

Introduce the ability to control which fields are searched in a search query

Reported by: johnbillion's profile johnbillion Owned by: audrasjb's profile audrasjb
Milestone: 6.2 Priority: normal
Severity: normal Version:
Component: Query Keywords: has-patch has-unit-tests commit needs-dev-note add-to-field-guide
Focuses: Cc:

Description

The s argument of the WP_Query::parse_query() method always searches the post_title, post_excerpt, and post_content fields, with no way of controlling this apart from using the posts_search filter and adjusting the SQL manually.

It would be good if it was possible to specify which fields are searched when performing a query, for example via a search_fields argument or similar.

My use case is for a REST API driven autocomplete field which allows a user to search post titles without a large number of less relevant results appearing due to the search spanning post content too.

Attachments (1)

43867.diff (15.2 KB) - added by birgire 6 years ago.

Download all attachments as: .zip

Change History (28)

#1 @birgire
6 years ago

That sounds like a useful feature.

For comparison, the way it's done in WP_User_Query is with the search_columns query argument:

https://core.trac.wordpress.org/browser/tags/4.9.5/src/wp-includes/class-wp-user-query.php#L163

and also the user_search_columns filter:

https://core.trac.wordpress.org/browser/tags/4.9.5/src/wp-includes/class-wp-user-query.php#L532

where the default search fields are (marked with x):

x ID
x user_login
  user_pass
x user_nicename
x user_email
x user_url
  user_registered
  user_activation_key
  user_status
x display_name

Similar the WP_Site_Query uses the search_columns query argument:

https://core.trac.wordpress.org/browser/tags/4.9.5/src/wp-includes/class-wp-site-query.php#L137

and the site_search_columns filter, with the default search columns:

x domain
x path 
  ... 

https://core.trac.wordpress.org/browser/tags/4.9.5/src/wp-includes/class-wp-site-query.php#L502

So taking a similar approach as above, we could support:

  • the search_columns query argument
  • and the post_search_columns or search_columns filter.

with the default columns as:

x post_title
x post_excerpt
x post_content
  ...
Last edited 6 years ago by birgire (previous) (diff)

#2 @birgire
6 years ago

The parse_search() seems like a good place for
the filter:

$search_columns = apply_filters( 'post_search_columns', $search_columns, $q['s'], $this );

where we can loop over the columns with e.g.:

$searches = array(); 
foreach( (array) $search_columns as $search_column ) {
	$searches[] = $wpdb->prepare( "({$wpdb->posts}.{$search_column} $like_op %s)", $like ); 
}

if( $searches ) {
	$search .= "{$searchand}(" . join( " $andor_op", $searches ) . ')';
}

within the search terms loop.

I will look into this further with unit tests.

@birgire
6 years ago

#3 @birgire
6 years ago

43867.diff is a first iteration with unit tests in tests/query/searchColumns.php.

Some notes on this first iteration:

  • It uses the default search columns: post_title, post_excerpt and post_content.
  • It uses the supported search columns: post_title, post_excerpt and post_content. We could consider expanding this further.
  • It uses the default search columns as a fallback for empty input: 'search_columns' => array().
  • It uses the default search columns as a fallback for an existing but not supported search field, e.g. 'search_columns' => array( 'post_name' ).
  • It ignores non-existing or non supported search column(s) together with supported search column(s), e.g. 'search_columns' => array( 'post_title', 'post_name' ) is the same as 'search_columns' => array( 'post_title' ).
  • Introduces the post_search_columns filter. We restrict it to the supported search columns, to avoid any kind of column names in the SQL query.

#4 @birgire
6 years ago

  • Keywords has-patch has-unit-tests added; needs-patch needs-unit-tests removed

#5 @petitphp
22 months ago

#56880 was marked as a duplicate.

#7 @petitphp
22 months ago

@johnbillion Duh, I don't know how I missed this ticket.

I refreshed birgire's patch against trunk. I've also updated the REST Post endpoint to add the ability to select the columns to use when searching.

This ticket was mentioned in Slack in #core by petitphp. View the logs.


21 months ago

#9 @SergeyBiryukov
21 months ago

  • Milestone changed from Awaiting Review to 6.2

@petitphp commented on PR #3585:


21 months ago
#10

@costdev @audrasjb Thanks for the review, I've gone through all the comments and fixed them.

@petitphp commented on PR #3585:


21 months ago
#11

@costdev Thanks, I've committed the suggested changes.

@petitphp commented on PR #3585:


20 months ago
#12

@costdev Looking at the test failures, I noticed it always happen on the same three tests :

  • Tests_Query_SearchColumns::test_s_should_support_post_excerpt_search_column,
  • Tests_Query_SearchColumns::test_s_should_support_post_content_search_column,
  • Tests_Query_SearchColumns::test_s_should_support_post_excerpt_and_post_content_search_columns

These tests have one similarity, they don't use the post_title column in the search query.

Summary :

Failures were due to a change made in this PR which result in the orderby clause to only contain a statement for the post_date column.

For now, I've reverted this change (see 4e55c57).

Detail report :

Previously when doing a basic search query we would generate an orderby clause looking like this ORDER BY wptests_posts.post_title LIKE '%foo%' DESC, wptests_posts.post_date DESC, containing a post_title and a post_date statement.

In this PR, I've initially changed the condition which add the post_title statement in the orderby clause to only add it when the post_title column was used in the query. This result in an orderby clause containing only the post_date part (ORDER BY wptests_posts.post_date DESC).

My thinking for this change was that it wouldn't be relevant to order results by post_title for a search query that didn't include this column in it WHERE condition.

When running the tests, all the posts are generated rapidly and end up having the same post_date, down to the second. The queries would sometime return results in an unexpected order for some of the tests.

Upon additional testing, I ran into an issue when using the search_column parameter and the 'orderby' => 'relevance' parameter, which generated an invalid orderby clause ORDER BY DESC, wptests_posts.post_date DESC.

I think it's safer to revert that change and go back to the original behavior for now.

#13 @costdev
20 months ago

PR 3585 is looking great to me. I've just left a small docs suggestion, but other than that I think this one is ready for final review.

@johnbillion commented on PR #3585:


20 months ago
#14

The issue with the seemingly random ordering of results with the same post date is one we've seen before. Some related tickets off the top of my head:

This ticket was mentioned in Slack in #core by mukeshpanchal27. View the logs.


19 months ago

#16 @costdev
19 months ago

  • Keywords commit added

This ticket was discussed during the bug scrub. The feedback on the PR appears to have been addressed and this one looks ready for commit consideration.

Additional props: @mukesh27

#17 @audrasjb
19 months ago

Self assigning for commit.

#18 @audrasjb
19 months ago

  • Owner set to audrasjb
  • Status changed from new to accepted

#19 @audrasjb
19 months ago

  • Keywords needs-dev-note add-to-field-guide added

#20 @audrasjb
19 months ago

  • Resolution set to fixed
  • Status changed from accepted to closed

In 55248:

Query: Add a search_columns argument to control which fields are searched in a search query.

Previously, the s argument of the WP_Query::parse_query() method searched the post_title, post_excerpt, and post_content fields, with no way of controlling this apart from using the posts_search filter and adjusting the SQL manually. This changeset adds the ability to specify which fields are searched when performing a query, using the search_columns argument.

Props johnbillion, birgire, petitphp, audrasjb, costdev, mukesh27.
Fixes #43867.

#22 @petitphp
19 months ago

  • Resolution fixed deleted
  • Status changed from closed to reopened

Reopen to update tests and fix intermittent failures in CI.

This ticket was mentioned in PR #4023 on WordPress/wordpress-develop by @petitphp.


19 months ago
#23

Switch to assertSameSets

Trac ticket: https://core.trac.wordpress.org/ticket/43867

#24 @SergeyBiryukov
19 months ago

  • Resolution set to fixed
  • Status changed from reopened to closed

In 55264:

Tests: Use assertSameSets() in WP_Query tests for search_columns argument.

This aims to resolve intermittent test failures due to indeterminate sort order.

Follow-up to [55248].

Props petitphp, costdev, audrasjb, johnbillion.
Fixes #43867.

@SergeyBiryukov commented on PR #4023:


19 months ago
#25

Thanks for the PR! Merged in r55264.

#26 @milana_cap
19 months ago

@johnbillion @audrasjb do we have a volunteer for writing the Dev note?

#27 @audrasjb
19 months ago

I think I can add it to my plate but anyone is welcome to help :)

Note: See TracTickets for help on using tickets.