Opened 9 years ago
Closed 8 years ago
#37198 closed enhancement (fixed)
Use `WP_Term_Query` for `wp_get_object_terms()`
Reported by: |
|
Owned by: |
|
---|---|---|---|
Milestone: | 4.7 | Priority: | normal |
Severity: | normal | Version: | 4.6 |
Component: | Taxonomy | Keywords: | has-patch |
Focuses: | Cc: |
Description
As discussed in #35381, WP_Term_Query
should be used for wp_get_object_terms()
as well. It works almost similar, except for that it requires to pass object IDs and make an additional SQL JOIN with the term_relationships
table. In addition to a better structured code, this would also provide significant cache benefits.
Attachments (3)
Change History (18)
This ticket was mentioned in Slack in #core by flixos90. View the logs.
8 years ago
#5
follow-up:
↓ 6
@
8 years ago
- Keywords 2nd-opinion added; early removed
- Milestone changed from Future Release to 4.7
@flixos90 Thank you for the hard work you've done on the first pass, and sorry for the time it took to get around to reviewing it. It's a hard one :)
Your patch is extremely conservative, in the sense that (a) it guarantees 100% backward compatibility, and (b) it attempts not to touch any existing code beyond what must be touched to refactor wp_get_object_terms()
. I like this as a general approach :) But in this case, I think that we can simplify our strategy by looking a bit deeper at the back compat issues.
37198.2.diff is a somewhat simplified version of your patch. (It's also sloppy - I left out some documentation and default params. This will be cleaned up.) Here are the important differences:
// Verification of term hierarchy must be skipped for wp_get_object_terms().
If I'm reading it correctly, this block in 37198.diff was meant to addressTest_Term_WpGetObjectTerms::test_parent()
. These tests fail without the block. The reason is that the tests were written incorrectly: they define hierarchical terms, in a taxonomy that is non-hierarchical. By changing the tests, we can avoid the additional logic. This is technically a compatibility break. Currently, if you manually create hierarchical terms in a non-hierarchical taxonomy, and you callwp_get_object_terms()
with 'parent', only the object-terms with the correct 'parent' will be returned. With my patch, doing this same thing will result in an empty array being returned. IMO, passing 'parent' when the taxonomy is non-hierarchical is not meaningful, and it's OK to break it - but I would be interested to hear what you think.
- When building the
SELECT
clause, I default to selectingt.*, tt.*
whenever possible. The performance differences are negligible, and keeping a smaller number of variations in the SQL query means better cache coverage. Note that this part of the patch will need refreshing when the mainSELECT
clause inWP_Term_Query
grabs only theterm_id
(orterm_taxonomy_id
) column. See #37189, #34239.
- I removed the
post_process_terms()
logic. The sanitization you mention was introduced in [26010] to ensure that integer values are, in fact, integers. But we now haveWP_Term
for this purpose. So, instead, my patch sends results throughget_term()
whenfields=all_with_object_id
, and also adds some integer-casting whenfields=ids
and some other numeric values. I think this logic can probably be simplified even further with #34239 etc, but for now I think it's clearer to ensure thatWP_Term
objects are returned.
- You had a special clause for 'orderby' in
wp_get_object_terms()
to 'id'. I think this is related to the fact that some of our unit tests passedorderby=t.term_id
towp_get_object_terms()
. (You also have some logic inparse_orderby()
for this.) But I'm fairly sure that these tests are incorrect:t.term_id
has never been officially supported as a value for 'orderby'. It worked in some cases because, since it didn't match a whitelisted 'orderby' value, it fell through to the default - which happened to beterm_id
. However, I don't think this is something that we must continue to support. I could be convinced otherwise if we found a bunch of uses oforderby=t.term_id
in the wild, but even then I'd strongly consider the break. My patch fixes the tests, instead.
So, my approach is a bit more cavalier, but results in a considerably smaller patch. What do you think, @flixos90 ?
#6
in reply to:
↑ 5
@
8 years ago
Replying to boonebgorges:
Your patch is extremely conservative, in the sense that (a) it guarantees 100% backward compatibility, and (b) it attempts not to touch any existing code beyond what must be touched to refactor
wp_get_object_terms()
. I like this as a general approach :) But in this case, I think that we can simplify our strategy by looking a bit deeper at the back compat issues.
I'm glad you see it that way. :) I wasn't sure about how backward-compatible it should be, but indeed, especially the post_process_terms()
logic in my patch to take care of previous handling in wp_get_object_terms()
was kind of painful, so I like handling it in a unified way, plus it's smaller and better readable like you said.
IMO, passing 'parent' when the taxonomy is non-hierarchical is not meaningful, and it's OK to break it - but I would be interested to hear what you think.
I agree, returning an empty result when passing a parent
to a non-hierarchical taxonomy is actually what I would expect it to return.
Your other changes make sense to me as well. I'm not exactly sure about whether getting rid of the extra sanitization I did could introduce BC issues, but you probably have a better overview of that. I think we should mention this in a dev-note, and maybe get this in some time soon to see if anything breaks.
One thing I'd like to add back in is only allowing "object_id-related" query vars if object_ids
is passed and not empty. On the one hand, providing fields => all_with_object_id
doesn't make sense without passing object_ids
anyway, but on the other hand we will be more foolproof by preventing it entirely. The same applies to sorting by term order (this can break the SQL query atm), although I don't like how I did it in the first patch (the method is encapsulated otherwise, but for this particular reason accesses a class property - passing a bool like $has_object_ids
to the method is structurally better I think).
Are you planning to add documentation and default params back in? You could probably merge a lot of my first patch for that. Looking forward to see a comprehensive patch. :)
#7
follow-up:
↓ 8
@
8 years ago
Are you planning to add documentation and default params back in? You could probably merge a lot of my first patch for that. Looking forward to see a comprehensive patch. :)
Sorry about this, I got lazy :)
37198.3.diff is a more complete patch. It adds back in the documentation etc. Also:
- I'm enforcing the 'term_count' thing by changing the value of 'order' to something legal before passing it to
parse_orderby()
- I moved the deduplication logic so that it can be done in a single block (instead of conditionally using
array_unique()
) - Added some tests
So, fun fact, this change adds caching to all wp_get_object_terms()
calls. Yet we still have a separate {$taxonomy}_relationships
cache. We can't easily tear the latter bit out, because it's scattered through the codebase. Plus, it's a more durable cache than the WP_Term_Query
cache, because WP_Term_Query
is based on last_changed
, which will be bumped pretty frequently. Still, having two layers of cache makes me uneasy. I wrote a test that demonstrates that things are working OK. Do you have thoughts about it, @flixos90 ?
#8
in reply to:
↑ 7
@
8 years ago
Replying to boonebgorges:
- I'm enforcing the 'term_count' thing by changing the value of 'order' to something legal before passing it to
parse_orderby()
I think it's term_order
, other than that it appears to be good solution. :)
So, fun fact, this change adds caching to all
wp_get_object_terms()
calls. Yet we still have a separate{$taxonomy}_relationships
cache. We can't easily tear the latter bit out, because it's scattered through the codebase. Plus, it's a more durable cache than theWP_Term_Query
cache, becauseWP_Term_Query
is based onlast_changed
, which will be bumped pretty frequently. Still, having two layers of cache makes me uneasy. I wrote a test that demonstrates that things are working OK. Do you have thoughts about it, @flixos90 ?
That's interesting. I don't think it will be a problem since the caches shouldn't interfere with each other. Also, the {$taxonomy}_relationships
cache has a more specific use-case (terms of _one_ taxonomy for _one_ object) and it is actually built using wp_get_object_terms()
, so I think we can ignore it here. They both have their use-case.
What we could theoretically do is actually make use of that cache in wp_get_object_terms()
:
- check if _one_ object ID and _one_ taxonomy is provided and if
$orderby
doesn't includeterm_order
- if so, check the
{$taxonomy}_relationships
cache for that object ID usingget_object_term_cache()
- if the cache brings a successful response, we adjust the query arguments:
- remove the
object_ids
argument - add the term IDs to a
include
argument
- remove the
- send the (possibly modified) arguments to a
WP_Term_Query
If any of the above checks fails, we proceed as usual instead. I'm not sure if that actually makes sense performance-wise, but just wanted to mention it as a possibility. When having the relationships cached, this would allow for much simpler queries (and I assume there _are_ many cases where terms for one object ID and one taxonomy are requested.
#9
@
8 years ago
- Keywords 2nd-opinion removed
- Owner set to boonebgorges
- Status changed from new to assigned
I think it's term_order, other than that it appears to be good solution. :)
Whoops :)
That's interesting. I don't think it will be a problem since the caches shouldn't interfere with each other.
Right. For posterity's sake, here's how it'll work in practice. Functions like get_the_terms()
, which call get_object_term_cache()
, do something like the following:
1. $terms = get_object_term_cache( $post_id, $taxonomy ); 2. if ( false === $terms ) { 3. $terms = wp_get_object_terms( $post_id, $taxonomy ); 4. // then set the {$taxonomy}_relationships cache 5. }
Under normal circumstances, one of the following conditions will hold:
- The relationship cache is full, and so is hit on line 1.
- The relationship cache is empty, but the cache related to the
wp_get_object_terms()
call is full, so that cached value is returned and used to populate the{$taxonomy}_relationships
cache. - The relationship cache is empty, as is the
wp_get_object_terms()
cache. The database is hit and the term query cache is populated. This is then used to populate the{$taxonomy}_relationships
cache.
What we could theoretically do is actually make use of that cache in wp_get_object_terms()
That's an interesting thought. Essentially, it'd save us a JOIN against wp_term_relationships
. Let's explore the possibility in a separate ticket, with some benchmarks. (It'll take a bit of confusing code to make it work right, so it's not worth doing if it doesn't result in visible benefits.)
I played around with this to create a first draft for a patch in 37198.diff.
In addition to conditionally adding the necessary JOIN statement (only when object IDs are passed), there were some additional tweaks necessary as several things that used to happen in
get_terms()
(and nowWP_Term_Query
) were not applicable / compatible with howwp_get_object_terms()
work. Therefore I include checks for whetherobject_ids
is empty or not in several locations to be backwards-compatible.Also, new values for the
fields
argument are supported as needed forwp_get_object_terms()
(all_with_object_id
,tt_ids
,slugs
).There is a new private method
post_process_terms
which is used to ensure that the terms are sanitized if object IDs are passed (wp_get_object_terms()
expects sanitized data). This method is not only run after doing the SQL query, but also after retrieving cached results.