Opened 3 years ago

Last modified 3 years ago

#31665 new defect (bug)

Duplicate slugs in DB, created for hierarchical terms with long, non-latin names, when the default slug already exists

Reported by: nevma Owned by:
Milestone: Future Release Priority: normal
Severity: normal Version: 4.1.1
Component: Taxonomy Keywords: needs-patch needs-unit-tests
Focuses: Cc:

Description

When WP automatically generates a slug for a child term and the default slug (normally generated from the term's name alone) already exists in that taxonomy, it builds a slug by concatenating the slugs of all the parent terms, following the hierarchy.

For example, in a term structure like:

Parent
-Child

If a term with the name "Grandchild" is to be inserted under "Child", normally it would get the slug "grandchild". However, if that slug already exists in the taxonomy, WP generates the slug "parent-child-grandchild".

When the terms have long, non-Latin names, their slugs are stored URL-encoded in wp_terms, where a single non-ASCII character takes several characters (six for a two-byte UTF-8 character such as a Greek letter). The stored string's length can therefore easily overflow the field's size (varchar(200)). Terms created under that condition end up having the same slug stored in the DB: the URL-encoded string, truncated to 200 characters.
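The arithmetic behind the overflow can be checked with a rough sketch. This is Python rather than WordPress code: `encoded_slug()` and `stored()` are hypothetical helpers that only approximate slug sanitization and the silent truncation to the column size, and the concatenation order follows the description above.

```python
import unicodedata
from urllib.parse import quote

DB_SLUG_LIMIT = 200  # wp_terms slug column is varchar(200), per this report


def encoded_slug(name: str) -> str:
    """Approximate slug generation: lowercase, hyphenate, percent-encode.

    Hypothetical stand-in for WordPress sanitization, not the real routine.
    """
    normalized = unicodedata.normalize("NFC", name).lower().replace(" ", "-")
    return quote(normalized, safe="-").lower()


def stored(slug: str) -> str:
    """Simulate the column silently cutting the value at 200 characters."""
    return slug[:DB_SLUG_LIMIT]


name = "Ένα δύο τρία τέσσερα πέντε"

first = encoded_slug(name)             # default slug, fits in the column
second = stored(first + "-" + first)   # parent slug + own slug, truncated
third = stored(second + "-" + first)   # ancestors' slug + own slug, truncated

print(len(first))                # 22 Greek letters * 6 + 4 hyphens
print(len(first + "-" + first))  # already past the 200-character limit
print(second == third)           # the two child terms collide
```

Under these assumptions the name alone encodes to 136 characters, so a single level of parent concatenation is enough to exceed the column, and everything past character 200 (the only part that would distinguish the third term) is lost.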

To reproduce the issue:

  • Create a term (e.g. category) with this name (without the quotes): "Ένα δύο τρία τέσσερα πέντε"
  • Create another term with the same name, defining the first term as its parent.
  • Create a third term with the same name, defining the second term as its parent.

The second and third terms end up having duplicate slugs in the DB, a situation which is normally an error.

The duplicate also goes undetected when the same procedure is repeated using wp_insert_term(). Normally an attempt to insert a duplicate slug into the same taxonomy would raise a "duplicate_term_slug" WP_Error, but that does not happen here.

Change History (4)

#1 @boonebgorges
3 years ago

  • Keywords needs-patch needs-unit-tests added
  • Milestone changed from Awaiting Review to Future Release

Thank you for the report. This is going to be a problem for any slug longer than 200 characters, but obviously it's a lot more severe in the case of non-Latin characters.

My initial thought is that we should be doing a length check in wp_unique_term_slug() (which is used by both wp_insert_term() and wp_update_term()). If we find that the generated slug is longer than our schema can handle, truncate (in a multibyte-friendly way) to, say, 190 chars, and then append the usual -2, -3, etc. suffix.
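A minimal sketch of that idea, in Python rather than PHP and not taken from any WordPress patch: `truncate_slug()` and `unique_slug()` are hypothetical names, the set of existing slugs stands in for a database lookup, and 190 is the headroom figure suggested above. Truncation works on decoded characters so that a percent-encoded sequence is never cut in half.

```python
from urllib.parse import quote, unquote

MAX_SLUG_LENGTH = 190  # leaves room for a numeric suffix below varchar(200)


def truncate_slug(slug: str, limit: int = MAX_SLUG_LENGTH) -> str:
    """Shorten a percent-encoded slug without splitting an encoded character."""
    if len(slug) <= limit:
        return slug
    result = ""
    for char in unquote(slug):
        piece = quote(char, safe="-").lower()
        if len(result) + len(piece) > limit:
            break  # adding this character would exceed the limit
        result += piece
    return result.rstrip("-")


def unique_slug(slug: str, existing: set) -> str:
    """Truncate first, then append -2, -3, ... until the slug is unused."""
    base = truncate_slug(slug)
    candidate, counter = base, 2
    while candidate in existing:
        candidate = f"{base}-{counter}"
        counter += 1
    return candidate


# Two terms whose generated slugs are identical and too long for the column.
long_slug = quote("ένα-δύο-τρία-τέσσερα-πέντε" * 2, safe="-").lower()
taken = set()
a = unique_slug(long_slug, taken)
taken.add(a)
b = unique_slug(long_slug, taken)
print(len(a), len(b), a != b)
```

Because the suffix is added after truncation, both results stay under the column size and remain distinct, which is the property the current behaviour loses.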

#2 follow-up: @SergeyBiryukov
3 years ago

wp_unique_post_slug() already handles too long slugs in a multibyte-friendly way, see [23420]. We could probably mirror some of that code in wp_unique_term_slug().

#3 in reply to: ↑ 2 ; follow-up: @boonebgorges
3 years ago

Replying to SergeyBiryukov:

wp_unique_post_slug() already handles too long slugs in a multibyte-friendly way, see [23420]. We could probably mirror some of that code in wp_unique_term_slug().

Great minds think alike :) It looks to me like _truncate_post_slug() is not really post-specific at all. Any objection if I move the logic to a general mb-safe slug truncation function, and convert _truncate_post_slug() to a wrapper?

#4 in reply to: ↑ 3 @SergeyBiryukov
3 years ago

Replying to boonebgorges:

Any objection if I move the logic to a general mb-safe slug truncation function, and convert _truncate_post_slug() to a wrapper?

None :)
