Make WordPress Core

Opened 3 months ago

Closed 5 weeks ago

Last modified 5 weeks ago

#57966 closed enhancement (fixed)

High number of call to sanitize_term in wp_queue_posts_for_term_meta_lazyload

Reported by: spacedmonkey's profile spacedmonkey Owned by: spacedmonkey's profile spacedmonkey
Milestone: 6.3 Priority: normal
Severity: normal Version: 6.2
Component: Taxonomy Keywords: has-patch
Focuses: performance Cc:


In [55252] was refactored wp_queue_posts_for_term_meta_lazyload to use wp_cache_get_multiple. This refactor includes a call to get_term. However calling get_term calls sanitize_term, which sanitizes term all fields in a term. This can mean, if there are say 40 terms called to this function, it can mean that sanitize_term and then sanitize_term_field is called. As there 9 fields to sanitize, this can mount up quickly.

This is no need to use get_term here, a object with the shape of a term will work as well and will not run through all these sanitize_term_field calls.

Attachments (1)

Screenshot 2023-03-21 at 18.03.24.png (75.7 KB) - added by spacedmonkey 3 months ago.

Download all attachments as: .zip

Change History (13)

This ticket was mentioned in PR #4246 on WordPress/wordpress-develop by @spacedmonkey.

3 months ago

  • Keywords has-patch added

#2 @spacedmonkey
3 months ago

  • Owner set to spacedmonkey
  • Status changed from new to assigned

#4 @spacedmonkey
3 months ago

There are now two PRs.

  1. #4246 Remove the use of get_term and replace with check the raw object that comes from object cache. This keeps the check for the taxonomy.
  1. #4259 Removes all checks for if the term exists or taxonomy. This may have more edge case issues, but it much much faster.

I await feedback on this.

#5 @joemcgill
2 months ago

At minimum, it looks like the first approach makes sense to me. We're only calling get_term to make sure an error isn't thrown before adding the term ID to the queue. One interesting behavior of get_term is that if it is passed an object instead of an ID, it will apply raw filtering, which may already be more performant in scenarios like this than passing an ID which requires a new WP_Term instance to be generated and sanitized.

Since we're not making use of any of those fields, but instead are only worried about adding the IDs to the queue, I think ignoring sanitation here is totally fine, and is in line with how this was functioning prior to the changes made in [55252].

I'm not sure I fully grasp why the additional taxonomy checks were added in the prior commit. Could you provide more context, and why you think it would be fine to remove them in the second approach you're suggesting?

Last edited 2 months ago by joemcgill (previous) (diff)

#6 @spacedmonkey
2 months ago

@joemcgill For context. The old version of this function used get_object_term_cache. In that function, it does this. To ensure that backwards compatibility is maintained, I copied this logic over, but looking at the code again, I am not sure it is needed.

Calling get_term here does two things.

  1. Check the taxonomy exists.
  2. Check the term exists.

Why are either of these checks needed? The code in the function, gets taxonomies assigned to the current posts. But if had a taxonomy from say a plugin and then deactivated that plugin, then it would not be registered and not be looked up, so the taxonomy look up is unrequired IMO. As for the ID, I don't see why the term would come back as part of the relationship to the post, if the term was deleted, but maybe some weird caching issue. What is the problem is asking for a term meta for a term that doesn't exist? Either will never be rendered or more than likely nothing will return. As the lookup for term meta is down in one database query, it is just one id in a list of others. As a key lookup, it will quickly return nothing.

Unless I am missing something, these checks seem pointless, as we already have the id we need,just pass it to the term meta lazy load queue and forget about it. It also worth noting that term meta is not heavily used, that the likelihood of this ever being a problem seems low.

#7 @joemcgill
2 months ago

Thank you for the context, I think I understand better what is happening here. get_object_term_cache() calls get_term() because it expects that only IDs are being returned by wp_cache_get( $id, "{$taxonomy}_relationships" ). Since that function wants to return a list of WP_Term objects, the error check is there to make sure no WP_Error objects are mixed in.

The individual calls to get_term() also explains why the function calls _prime_term_caches prior to looping through the IDs.

Since one of the base assumptions of wp_queue_posts_for_term_meta_lazyload is that it will only work if the term cache has already been primed, and because we are only trying to get a list of term IDs to queue (not full objects), I think the second approach should be fine. The lazy loader should be responsible for making sure any term IDs it is loading are valid term objects regardless (not sure if it is actually implemented that way), but I agree that we shouldn't need to add an additional check here.

#8 @spacedmonkey
7 weeks ago

I provided some profiling data for my PR. It is a net positive. It worth noting, for pages with multiple WP_Query, there would have been way more of a performance improvement.

Classic theme
Trunk PRDifference
Response Time 81.62 81.09 0.53
wp-before-template 27.84 27.59 0.25
wp-template 47.17 46.96 0.21
wp-total 75.16 74.84 0.32

Block theme
Trunk PR Difference
Response Time 221.25 217.15 4.1
wp-before-template 109.07 105.92 3.15
wp-template 103.35 101.85 1.5
wp-total 213.39 208.73 4.66

I think this PR should go ahead and be committed at part of 6.3.

@spacedmonkey commented on PR #4259:

6 weeks ago

Thanks @spacedmonkey, Overall look good to me.

  • Use unique term ids as there may be duplicate term ids
  • It's good to add @since annotation with message that "As performance improvement remove get_term call.

I am not sure about adding that as a comment in code. That is what commit messages are for.

#10 @joemcgill
6 weeks ago

I think the approach in PR 4259 makes sense. I agree with @spacedmonkey that since there is no public API change here, there is no need to add a @since comment in the docblock.

#11 @spacedmonkey
5 weeks ago

  • Resolution set to fixed
  • Status changed from assigned to closed

In 55701:

Taxonomy: Remove redundant call to get_term in wp_queue_posts_for_term_meta_lazyload.

In [55252] the function wp_queue_posts_for_term_meta_lazyload was refactored to use wp_cache_get_multiple. This refactor included a call to get_term. However, calling get_term calls sanitize_term, which sanitizes all fields in a term. The full term object is not needed in this context as term meta only needs to the term id, which is already in the function. Saving calls to sanitize_term will improve performance of this function.

Props spacedmonkey, joemcgill, mukesh27.
Fixes #57966.

Note: See TracTickets for help on using tickets.