Make WordPress Core

Opened 17 years ago

Closed 15 years ago

Last modified 13 years ago

#4365 closed enhancement (wontfix)

Deleting a category scales poorly

Reported by: markjaquith's profile markjaquith Owned by:
Milestone: Priority: normal
Severity: normal Version:
Component: Performance Keywords: needs-patch
Focuses: Cc:

Description

Deleting a category scales very poorly if that category has many posts (especially if it has many posts whose only category is the deleted one).

The operation can take several minutes with only a few thousand posts.

Change History (18)

#1 @DD32
16 years ago

Does this still apply to the latest trunk, Given that this was reported before the taxonomy changes were made?

#2 @DD32
16 years ago

  • Component changed from Administration to Taxonomy
  • Keywords needs-patch added
  • Owner changed from anonymous to ryan

Just tested, Took about.. 60 seconds to delete a category with >1000 posts on my personal desktop, should probably be a much faster method to recategorize posts..

For reference, Some code to create a thousand posts in a category, Change the cat ID:

$cat_ID = 186;
for($i=0; $i<=1000; $i++) {
$a = array('post_status' => 'publish', 'post_category' => array($cat_ID), 'post_title' => "Test Post #$i", 'post_content' => $i );
var_dump( wp_insert_post( $a ) );
}

#3 @Denis-de-Bernardy
16 years ago

wontfix imo. the slow area is this, from within a foreach loop:

wp_set_object_terms($object, $terms, $taxonomy);

the *proper* workaround would be to use an sql trigger, but that is not an option until mysql4.x is a thing of the past.

the other potential workaround would be some kind of wp_refresh_object_terms($object_ids) function, which updates the post counts accordingly.

#4 @Denis-de-Bernardy
15 years ago

See #6035, which contains some of the needed code

#5 @vladimir_kolesnikov
15 years ago

  • Cc vladimir@… added

I have a question, sorry if it is silly - I am not familiar with taxonomy internals: why not use a query like this:

DELETE 
FROM wp_term_taxonomy AS tt LEFT JOIN wp_term_relationships AS tr USING(term_taxonomy_id)
WHERE tt.term_id = TERM_ID AND tt.term_texonomy = 'TAXONOMY'

instead of that foreach loop in wp_delete_term()?

#6 @Denis-de-Bernardy
15 years ago

It's too late for 2.8. WP 2.9 will bump MySQL requirements to 4.1, and thus introduce subqueries. We'll be looking into quite a few optimizations along the same lines, and better ones.

#7 @Denis-de-Bernardy
15 years ago

  • Component changed from Taxonomy to Performance

#8 @Denis-de-Bernardy
15 years ago

  • Owner changed from ryan to Denis-de-Bernardy
  • Status changed from new to accepted

#9 follow-up: @vladimir_kolesnikov
15 years ago

Subqueries? Why? In many cases JOINs are faster (MySQL optimizer does not handle subqueries well until 5.x).

If I understand the idea of wp_delete_term() correcly, its foreach loop can be replaced with 5 queries (even 1 if $args is absent).

The first query gets the IDs of all objects to be affected by term removal:

SELECT DISTINCT tr.object_id
FROM wp_term_taxonomy AS tt INNER JOIN wp_term_relationships AS tr USING(term_taxonomy_id)
WHERE tr.term_id = 'TERM ID HERE' AND tr.taxonomy = 'TAXONOMY NAME HERE'

The second query removes the term:

DELETE
FROM wp_term_taxonomy AS tt LEFT JOIN wp_term_relationships AS tr USING(term_taxonomy_id)
WHERE tr.term_id = 'TERM ID HERE' AND tr.taxonomy = 'TAXONOMY NAME HERE'

The third one retrieves the objects that were affeceted by DELETE and do not belong to the removed term's taxonomy anymore:

SELECT object_id FROM wp_term_relationship WHERE object_id IN ("IDs from the 1st query")

Then we do array_diff($result_1st_query, $result_3rd_query) and generate an extended INSERT statement (4th query):

INSERT INTO wp_term_relationships VALUES
('OBJECT_ID_1', 'DEFAULT_TERM_TAXONOMY_ID', 0), /* ... */ ('OBJECT_ID_N', 'DEFAULT_TERM_TAXONOMY_ID', 0)

The last query updates term count:

UPDATE wp_term_taxonomy 
SET `count` = `count` + 'number of rows affected by INSERT'
WHERE term_taxonomy_id = 'DEFAULT_TERM_TAXONOMY_ID'

This is just a proof of concept, I haven't tested it much yet but it seems working to me.

#10 in reply to: ↑ 9 @Denis-de-Bernardy
15 years ago

Replying to vladimir_kolesnikov:

Subqueries? Why? In many cases JOINs are faster (MySQL optimizer does not handle subqueries well until 5.x).

Agreed. On occasion, they're a little easier to read, though.

select *
from a
left join b
on b.id = a.id
and condition
where a.id is null

select *
from a
where a.id not in ( select b.id where condition )

they're also useful to extract constants:

select *
from a
where a.field = ( select max( field ) from b )

I'm sure we'll find them good use in 2.9

#11 @Denis-de-Bernardy
15 years ago

that was, where b.id is null, in the first example. but I'm sure you got the point.

#12 @Denis-de-Bernardy
15 years ago

  • Owner Denis-de-Bernardy deleted
  • Status changed from accepted to assigned

#13 @hakre
15 years ago

Was able to test now. Did not take minutes over here but some time. The impact is further lighter if you delete by ajax request which is now the default if javascript is enabled.

Bulk deletion does take it's time anyway.

wp_delete_category($cat_ID)
wp_delete_term($cat_ID, 'category', array('default' => $default))

This is all because of the model. The last routine iterates over every object that is bound to a term instead of just removing the taxonomy link to the object.

So this is documented so far with the according lines:

  • markjaquith 2007.10.13 05:51:11 prepare(), insert(), update() for wp-includes/ taxonomy.php, rss.php, registration.php
  • ryan 2007.06.19 02:33:44 Add some taxonomy validation. Rearrange funcs.
  • westi 2009.03.18 22:06:49 Allow wp_delete_term users to force all objects to have a new term applied. Allow for category merging use-case. Fixes #9355.

So this looks like this is by intention since #9355 and therefore I assume this wontfix.

#14 @hakre
15 years ago

Changesets are: [5726], [6241], [10813] and [11457].

#15 @janeforshort
15 years ago

  • Type changed from defect (bug) to enhancement

#16 @markjaquith
15 years ago

  • Milestone changed from 2.9 to Unassigned
  • Resolution set to wontfix
  • Status changed from assigned to closed

A lot has changed, and this doesn't seem to be the problem it once was.

#17 @Denis-de-Bernardy
15 years ago

  • Milestone Unassigned deleted

#18 @jeremyclarke
13 years ago

  • Cc jer@… added

Related #18631 "wp_set_object_terms() should only wp_update_term_count() for affected terms"

Note: See TracTickets for help on using tickets.