Make WordPress Core

Opened 6 months ago

Closed 3 weeks ago

Last modified 3 weeks ago

#58116 closed defect (bug) (fixed)

rest_{$this->taxonomy}_query Cache Problem

Reported by: tnolte's profile tnolte Owned by: spacedmonkey's profile spacedmonkey
Milestone: 6.4 Priority: normal
Severity: normal Version: 6.0
Component: Taxonomy Keywords: has-patch has-unit-tests
Focuses: rest-api, performance Cc:

Description

This is a follow-up to #55837.

Every since the terms caching changes in WP 6.0+ there is an issue with caching when using the rest_{$this->taxonomy}_query filter. This filter is apparently bypassed when the cache results are stored and thus only the first REST API response has the correct data, following cache responses have the wrong data. This is specifically seen when adjusting the query to limit the results/returned counts to a specific post-type.

<?php
function my_rest_some_taxonomy_query( $params ) {
        $params['post_type']   = 'some_custom_post_type';
        $params['post_status'] = 'publish';

        return $params;
}
add_filter( 'rest_some_taxonomy_query', 'my_rest_some_taxonomy_query' );

Currently this is blocking us from upgrading past WordPress 5.9 as we haven't found how to correct this caching change. I may have to explore writing a custom REST API endpoint though I'm not confident that will ultimately fix the issue.

Change History (30)

#1 @tnolte
6 months ago

A follow-up note is that this sort of filtering is being done because our specific use case of the custom taxonomy results contains counts for 2 post types there the same taxonomy is being used.

#2 follow-up: @spacedmonkey
6 months ago

  • Severity changed from major to normal

@tnolte neither post_type or post_status are valid parameters to be passed to get_terms. These values would be removed and ignored. So I don’t see why these would be an issue.

Can you explain what you are trying to do with this filter. What is the expected result here?

Also, have you tested again 6.1 and 6.2. There were fixes in both releases.

Downgraded this from major, as this ticket is just a bug.

#3 in reply to: ↑ 2 @tnolte
6 months ago

  • Severity changed from normal to major

@spacedmonkey so to be very clear this works with WordPress older than 6.x, and doesn't work for all versions of 6.x, yes I have tested all of the versions. This filter is for the taxonomy REST endpoint this isn't a simple get_terms() call. The REST API calls for taxonomies return a count for how many posts are in each term, and when a taxonomy is shared between multiple post types you get a total of all posts across the associated post types. You can see the endpoint that I'm having problems with right here:

https://vilcek.org/wp-json/wp/v2/country_of_birth/?hide_empty=true

That example filter I provide works exactly as expected both when the Object Cache is empty and when it is filled prior to WordPress 6.x. After a WordPress 6.x upgrade, across all versions of 6, the only time that filter triggers is when the Object Cache is empty, afterwards all requests appear to completely bypass the filter and return the total counts once again. You can verify this currently at that endpoint I provided when looking at the Bulgaria results and you'll see that the count being returned is 2 however if you check both the 2 post type feeds you'll see that one of them contains 2 and the other contains 3. If I upgrade the site to WP 6.x then the only time I will get a count of 2 is when the object cache is empty, after the fact I will only ever get the total count across both post types.

#4 @spacedmonkey
6 months ago

  • Severity changed from major to normal

#5 @spacedmonkey
6 months ago

From what I can see your filter, passes two invalid params to the get_terms call and that changed the cache before and now it does not.

As you were passing invalid params here, fixing this issue, maybe in fact be a breaking change, as ignoring invalid params is now an expected behaviour, for 3 releases now.

The best work around is for to change your code to pass a value to the cache_group parameter. This is allow you to affect how data is cached.

Also note, this is not a major issue, as to see it, you have to use a filter. Major issues are issues affects all users, where data is being lost and a major performance issue.

#6 @tnolte
6 months ago

So, I took a look at the documentation for this and while it doesn't list the post_type as a supported method argument I went looking to see if we had additional code leveraging hooks in get_terms. What I did find is that we have do in fact leverage the terms_clauses hook in correlation to this. It is in that hook where the post_type parameter is being checked for and we are apparently modifying the query to do this filtering.

So the issue still remains though that there is still an inconsistency with the handling of the code. When object caching is involved in WordPress >=5.9.x those 2 hooks are being accounted for and returned as expected pre & post object caching. However, starting in 6.0+ that is no longer the case. The changes starting in WordPress 6.0 are actually the breaking change incorrectly handling those hooks.

#7 @peterwilsoncc
6 months ago

@tnolte Are you able to provide your term_clauses filter?

The cache key used for the queries does include a hash of the SQL query following term_clauses being applied so I can't see how it would have an effect but would like to double check.

#8 @tnolte
6 months ago

@peterwilsoncc this is what we have set:

<?php
                // Set to query specific posts types if set.
                if ( ! empty( $post_types ) ) {
                        $clauses['fields'] = 'DISTINCT ' . str_replace( 'tt.*', 'tt.term_taxonomy_id, tt.taxonomy, tt.description, tt.parent', $clauses['fields'] ) . ', COUNT(p.post_type) AS count';
                        $clauses['join'] .= ' LEFT JOIN ' . $wpdb->term_relationships . ' AS r ON r.term_taxonomy_id = tt.term_taxonomy_id LEFT JOIN ' . $wpdb->posts . ' AS p ON p.ID = r.object_id';
                        $clauses['where'] .= ' AND (p.post_type IN (' . implode( ',', $post_types ) . ') OR p.post_type IS NULL)';
                        $clauses['orderby'] = 'GROUP BY t.term_id ' . $clauses['orderby'];
                }

                // Set to query posts with specific status.
                if ( ! empty( $post_status ) ) {
                        $clauses['where'] .= ' AND (p.post_status IN (' . implode( ',', $post_status ) . '))';
                }

#9 @tnolte
6 months ago

@peterwilsoncc what I may try is using the "supported" object_ids parameter and specify all of the IDs of that Custom Post Type that is being used, though it's not readily clear to me if that is going to actually suffice based on the developer documentation.

#10 @peterwilsoncc
6 months ago

I created a mini-plugin for testing the ticket and was able to reproduce:

  1. Activate WP with caching dropin
  2. Create a post using the term "Two Post Types"
  3. Create a '58116 Types' CPT also with the term "Two Post Types"
  4. Ensure cache is clean via wp cli: wp eval "wp_cache_set_terms_last_changed();"
  5. Visit the rest api endpoint for the taxonomy: http://wp-dev-cached.local/?rest_route=/wp/v2/58116_taxo
  6. Observe count: 1
  7. Refresh the rest api endpoint
  8. Observe count: 2

On an instance without a persistent cache, the count will remain 1, as expected.

I agree this isn't a major issue as the argument isn't supported but it's certainly something to should look in to. Which is not to say it's something that ought to be accounted for.

As yet, I can't figure out exactly what is happening. I expect it's related to using wp_term_taxonomy->count (the terms count in the database) in the cached data but returning the number of objects returned in the initial request. Possibly around the code here.

#11 follow-up: @spacedmonkey
6 months ago

I think the issue came in here [52836].

These filters change the requested fields, as it returns a props that could the the WP_Term object, the count field returned by the db query, maps to count in WP_Term. Now, we only get term_ids and get term object from object cache, it breaks that.

I am not sure, if we should fix this issue. Using filters in this way, it is pretty edge case and is somewhat brittle. I am not sure we can put a fix in place that would fix this forever and feels like we are more likely to break things, as this as been this way for 3 versions now.

#12 in reply to: ↑ 11 @peterwilsoncc
6 months ago

Replying to spacedmonkey:

I am not sure, if we should fix this issue. Using filters in this way, it is pretty edge case and is somewhat brittle. I am not sure we can put a fix in place that would fix this forever and feels like we are more likely to break things, as this as been this way for 3 versions now.

I think it needs to be fixed so the returned data is consistent. There are a few options:

  • similar to WP_Query, consider the result uncachable if the SELECT has been modified
  • cache count as returned from the DB query, if set, otherwise use the term's count column
  • always return the term's count column, whether cached or uncached

I lean towards either the first or last option.

#13 @tnolte
6 months ago

@spacedmonkey

I am not sure, if we should fix this issue. Using filters in this way, it is pretty edge case and is somewhat brittle. I am not sure we can put a fix in place that would fix this forever and feels like we are more likely to break things, as this as been this way for 3 versions now.

The fact that it has been this way for 3 versions is also why our site in question has been locked into WP 5.9.x without the ability to upgrade to any 6.x version without breaking how the functionality used to work. Whether intentional or not the changes were a breaking change.

This ticket was mentioned in PR #4362 on WordPress/wordpress-develop by @spacedmonkey.


5 months ago
#14

  • Keywords has-patch added

#17 @spacedmonkey
3 months ago

  • Milestone changed from Awaiting Review to 6.4
  • Owner set to spacedmonkey
  • Status changed from new to assigned

#18 @spacedmonkey
2 months ago

  • Keywords needs-refresh needs-unit-tests added

#19 @spacedmonkey
2 months ago

@tnolte As this is custom code, could you try the following.

  • Use terms_pre_query to return a custom results.
  • Make 'term-queries' cache group none persistent, meaning caching would be disabled.
  • Use the get_term filter and change the properties of the WP_Term object.

The change is WP 6.0 was a valid one. It is bad to store the whole object in cache. It waste lots of space in cache. I am not personally sure this is a use case this use case as there are number ways to work around this, see above.

#20 @spacedmonkey
2 months ago

  • Keywords has-unit-tests added; needs-refresh needs-unit-tests removed

I have a PR with unit tests. The fix is pretty simple. Pinging @peterwilsoncc to take a look.

#21 @tnolte
2 months ago

@spacedmonkey Sorry for the delay. Our internal ticket related to this issue was put on hold internally. I'm going to attempt to apply your PR as a patch to WP on the site with our issue and test this out. Thanks!

@tnolte commented on PR #4363:


2 months ago
#22

@spacedmonkey I applied this patch to our WordPress site and things are working as they should. I'm going ahead with deploying are site for QA with this patch applied to WP Core.

#23 @tnolte
2 months ago

I have confirmed that https://github.com/WordPress/wordpress-develop/pull/4363 fixes this issue. I'm rolling out WP 6.2.2 to our QA team with this patch applied.

@spacedmonkey commented on PR #4363:


5 weeks ago
#24

Requesting review from @peterwilsoncc

@spacedmonkey commented on PR #4363:


5 weeks ago
#25

Requesting review from @peterwilsoncc

#26 @spacedmonkey
4 weeks ago

Pr is ready for review after actioned feedback.

This ticket was mentioned in Slack in #core-performance by spacedmonkey. View the logs.


4 weeks ago

#28 @spacedmonkey
3 weeks ago

  • Resolution set to fixed
  • Status changed from assigned to closed

In 56555:

Taxonomy: Cache term objects in WP_Term_Query if query is filtered.

When utilizing the terms_clauses or get_terms_fields filters within WP_Term_Query and the selected fields are modified, the entire term object is now cached. This adjustment is necessary because filters can broaden the selected fields beyond just the term ID. Fields linked to the term object, such as the count or parent, may undergo modifications when queried. Caching the complete object ensures the accurate storage of these modified fields within the cache.

Props spacedmonkey, tnolte, peterwilsoncc.
Fixes #58116.

Note: See TracTickets for help on using tickets.