Make WordPress Core

Opened 5 months ago

Last modified 5 months ago

#63683 new enhancement

Improve wp_insert_term() duplicate query

Reported by: xipasduarte's profile xipasduarte Owned by:
Milestone: Awaiting Review Priority: normal
Severity: minor Version:
Component: Taxonomy Keywords: has-patch 2nd-opinion
Focuses: performance Cc:

Description

The query in wp_insert_term() to get the $duplicate_term is fetching all matching terms from the database and extracting the first afterwords.

If there is an issue and terms are created with duplicate slugs in large amounts, the query becomes slow, not just due to the amount of comparisons, but also the data being returned. Both aspects could be improved with a LIMIT 1 appended to the query.

Change History (4)

This ticket was mentioned in PR #9237 on WordPress/wordpress-develop by @xipasduarte.


5 months ago
#1

  • Keywords has-patch added

Add LIMIT 1 to $duplicate_term query to improve performance returning earlier and with less data from the database.

Trac ticket: https://core.trac.wordpress.org/ticket/63683

This ticket was mentioned in Slack in #core-test by oglekler. View the logs.


5 months ago

#3 @oglekler
5 months ago

  • Keywords 2nd-opinion added

Hi @xipasduarte,
Thank you for the ticket!

$wpdb->get_row() returns only one row, so there is no need to add LIMIT 1. It should be no long request. We need reproduction report with actual numbers and DB queries that were made. And if $wpdb->get_row() actually makes a query in such a way that request to DB is getting more than 1 row and slows down due to amount of matching rows, then we should raise a question about fixing the get_row() method instead of adding LIMIT 1 here and there.

So, I am adding 2nd-opinion that indicated that we need some evaluation from a developer.

#4 @xipasduarte
5 months ago

Hello @oglekler. Thanks for the feedback. I thought about raising an issue for the get_row() method, but there is good reason for it to be like this. You are right that it only returns a single result, but it does not limit what is requested from the database, as that is the responsibility of the query itself.

My thoughts were that queries can vary a lot and, in some cases, it might be very bad to mess around with what is passed as it could lead to breaking the query itself. The method get_row() is used as part of WordPress' API and we would need to be very careful when "messing" with it.

That being said, there is an issue for potential improvement as a hole, by looking into get_row(). But due to the time it would take, the current change would be of benefit right away and would not lead to any duplication. If we consider that the case where queries containing LIMIT need to be address in rewriting get_row() this would fall under that.

I'll gladly open an issue in regards to get_row() once I've got the time to review the best approach and offer some insights.

Note: See TracTickets for help on using tickets.