WordPress.org

Make WordPress Core

Opened 5 months ago

Last modified 5 months ago

#47351 new defect (bug)

term_exists() needs to apply_filters('pre_term_name') to match wp_insert/update_term()

Reported by: piklistuser23 Owned by:
Milestone: Awaiting Review Priority: normal
Severity: normal Version: 5.2.1
Component: Taxonomy Keywords:
Focuses: Cc:
PR Number:

Description

term_exists() first searches via slug, then failing that searches by name.

When wp_insert_term()/wp_update_term() format their data they invoke the pre_term_name filter which in turn eventually calls esc_html(). This causes, eg, any '&' to become '&' in the term name saved in the db. (which is horrible practice, but that's beside the point)

In a hierarchical taxonomy, if there are multiple terms with the same name under different parents, the insert/update functions will enforce unique slugs. Thus the second and subsequent terms added with the same name will no longer be found in term_exists() via the default slug comparison.

The second test term_exists() does is by comparing the name. However, currently term_exists() does not invoke the pre_term_name filter before submitting the name for comparison.

The result is that in the case where multiple terms exist with the same name, which name includes contents altered by the pre_term_name filter hook, term_exists() will fail to find them even though they do exist.

Subsequent attempts in code to wp_insert_term() the alleged missing term (a common pattern) will of course fail and report a duplicate collision.

How to fix:
in wp-includes/taxonomy.php, in the function term_exists(), near the beginning,

replacing:

$term = trim( wp_unslash( $term ) );
$slug = sanitize_title( $term );

with:

$term = trim( wp_unslash( $term ) );
$slug = sanitize_title( $term );
$term = apply_filters( "pre_term_name", $term, $taxonomy );

solves this.

Example:

taxonomy 'tax':
id parent term
1 0 Foo
2 1 a & b -> slug generated: a-b
3 0 Bar
4 3 a & b -> slug generated: a-b-bar

the db will have name = 'a & b'
term_exists('a & b', 'tax', 1):
will test via slug 'a-b' and succeed and return the result

term_exists('a & b', 'tax', 3):
will test via slug 'a-b' and fail to match 'a-b-bar'
will then test name 'a & b' and fail to match 'a & b'
will return no result!

With the fix,
term_exists('a & b', 'tax', 3):
will still fail the slug comparison
will then test name using the correct value 'a & b' and find the match as it should.

Another potential issue I haven't tested yet is with the slug naming. Given the above example, what happens if you added a top-level term id 5, name 'a & b'? Since id 2 has used the 'a-b' slug, how would it make the new slug unique--it has no parent slug to append. Perhaps making the slug always append the parent slug is something else that may need to be considered--if that were the behavior already I would never have noticed this bug since my case would not have failed the slug comparison to expose the issue with the name comparison.

In addition, I don't know why it's testing slug first anyways or at all even, since the function signature as documented is accepting 'name' it should start and end there, especially since users can edit slugs arbitrarily. If I swapped the slugs between terms 2 and 4 in the above example, it would be a perfectly valid way to use the system, yet term_exists() would now fail its first slug comparison query every time and have to do the name query anyways--an extra query every time rather than just testing name first and only.

Another possible improvement, since the slug test is probably relied upon in existing code and can't be removed now, would be to look at the $term value and see if it looks like a slug format. If so, do the slug query first, if not, do the name query first. A simple strpos($term, '-') test and/or $term == $slug test could potentially save a lot of extra queries.

The better/correct solution to the above would be to actually remove the excessive default pre_term_name filters in wp_insert|update_term() and only filter the content to minimal SQL grammar safety--leaving the user content opaque and unmangled in the database. But, this would require updating the db contents when deploying the change, and could adversely affect existing client code that deals with this very issue.

Change History (1)

#1 @piklistuser23
5 months ago

Actually, if it always appended the parent slug for child terms this issue would have been discovered much sooner because any nested terms would always fail the slug query test and many more would have tested the name comparison query code path.

Note: See TracTickets for help on using tickets.