Opened 9 years ago
Last modified 3 years ago
#37009 assigned defect (bug)
When two different tags generate the same slug, the second tag is rejected
Reported by: | michael.costanza | Owned by: | boonebgorges |
---|---|---|---|
Milestone: | Priority: | normal | |
Severity: | normal | Version: | 4.5.2 |
Component: | Taxonomy | Keywords: | bulk-reopened |
Focuses: | Cc: |
Description
When two different tags result in the same slug, instead of the slugs being made unique the second tag is discarded and treated as if it were the first. Here's an example:
1) Add the tag "$4 gas" to a post, via the post edit screen.
2) Update the post.
3) Verify that the term "$4 gas" is added to the wp_terms table with "4-gas" as its slug and that the tag is displayed with the post on the edit screen.
4) Add the tag "#4 gas" (or "#4-gas or similar) to the post, via the post edit screen.
5) Update the post.
6) Note that the term "#4 gas" has not been added to the wp_terms table and the tag is not displayed with the post.
Also, if you try to add "#4 gas" to any post, you'll get "$4 gas" instead.
(I realize this example sounds silly, but with the many tags based on words in different languages or brands, unique tags can result in the same slug.)
It seems that what should happen in this case is that if the original tags are actually unique, the second one should get a unique slug (with something like '-2' appended).
I'm using WordPress 4.5.2 with all plug-ins deactivated.
Attachments (1)
Change History (12)
#1
@
9 years ago
- Milestone changed from Awaiting Review to 4.6
- Owner set to boonebgorges
- Status changed from new to assigned
#6
@
9 years ago
37009.diff starts to move to the more precise get_term_by()
, but it's still broken.
When you query for slug
- in term_exists()
, get_term_by()
, or get_terms()
- the value is passed through sanitize_title()
. This sanitization is done on the way into the database to ensure that the slug is URL-safe. It's done while querying mostly to avoid SQL injection issues. This mixing of concerns makes it impossible to do a strict query for a slug match: get_term_by( 'slug', '$foo' )
will return a term with the slug foo
. This pretty clearly seems wrong.
Ideally, we would stop doing so much heavy sanitization on values passed to get_term_by()
and get_terms()
, so that we would stop getting false-positive matches for things like $foo
/foo
. Simply removing the sanitize_title()
sanitization would probably solve the immediate problem, but has compatibility ramifications for plugins that are expecting the current fuzzy matching. My inclination is to stop supporting fuzzy matches in the case of get_term_by()
, because the nature of the function is such that (IMO) you should expect exact matches only. But a second opinion would be helpful here.
#8
@
9 years ago
- Keywords 4.7-early added
- Milestone changed from 4.6 to Future Release
Let's look at this again for 4.7.
This ticket was mentioned in Slack in #core by helen. View the logs.
8 years ago
#12
@
6 years ago
- Keywords bulk-reopened added
We are experiencing a problem with this that's straightforward to reproduce --
We have unique terms such as 1:23 and 12:3; if added from a post page, both get sanitised in term_exists() to same 123 and only one term is saved.
[Note: if we add the terms from the standalone terms page (rather than on the post page), the term isn't passed through term_exists() and we can create both without problem]
@michael.costanza Thanks for the detailed report, and welcome to WordPress Trac!
The low-level taxonomy API (
wp_insert_term()
) supports the creation of terms like$foo
and#foo
in just the way you've suggested - by mapping them to slugsfoo
andfoo-2
. I'll add a test that demonstrates this. However, this breaks when callingwp_set_object_terms()
, because of the use of the impreciseterm_exists()
.term_exists()
runs the passed value throughsanitize_title()
(which reduces$foo
tofoo
), so it thinks that the post already has the term. We should switch to using more precise checks.