WordPress.org

Make WordPress Core

Opened 5 years ago

Closed 5 years ago

#9661 closed defect (bug) (fixed)

cat_row doesn't not list all categories

Reported by: hailin Owned by:
Milestone: 2.8 Priority: normal
Severity: normal Version:
Component: General Keywords: has-patch dev-feedback
Focuses: Cc:

Description

wp-admin/includes/template.php

cat_rows and _cat_rows are buggy - it can not list all legitimate categories. The key reason is that within loop

foreach ( $categories as $category )

...
unset($categories[$i]) Prune the working set

We can not unset an element while the foreach loop is traversing the array, because it causes the internal PHP cursor to jump ahead, leaving out valid elements. It's ok to unset an element as long as there is no outer loop performing on the array.

Also, we should not try to get categories and children inside _cat_rows, instead, it's better to pass them in once and only once.

If the page starts in a subtree, the original author forgot to assign $my_parent = get_category( $p ); so the subtree handling was not working at all.

Overall, before and after my fix, the time complexity is O(N2), so it's not efficient when # of categories reach 1000. However, I think reasonable range is at most a few hundreds. If speed becomes an issue, I suggest we use an O(N) algorithm, similar to that used in page_rows.

Attachments (1)

9661_category.diff (2.9 KB) - added by hailin 5 years ago.
revised patch again

Download all attachments as: .zip

Change History (29)

comment:1 hailin5 years ago

tested and verified

comment:2 in reply to: ↑ description hakre5 years ago

  • Resolution set to invalid
  • Status changed from new to closed

Replying to hailin:

[...] The key reason is that within loop

foreach ( $categories as $category )

...
unset($categories[$i]) Prune the working set

We can not unset an element while the foreach loop is traversing the array, because it causes the internal PHP cursor to jump ahead, leaving out valid elements.

That is not true. foreach() works on a copy of the array so you do not manipulate the array pointer while removing one element from the (source) array while iterating over a copy of the (source) array. compare to http://de.php.net/manual/en/control-structures.foreach.php .

since you name that beign the "key reason" I assume that your report is invalid. I therefore close it as invalid. Feel free to provide more info and reopen it.

Please name the wordpress version you provided the patch for.

comment:3 hakre5 years ago

  • Keywords reporter-feedback added

comment:4 follow-up: hailin5 years ago

From that PHP page:

Note: Unless the array is referenced, foreach operates on a copy of the specified array and not the array itself. foreach has some side effects on the array pointer. Don't rely on the array pointer during or after the foreach without resetting it.

I guess "some side effects" is the key issue.

I've tracked down this issue by meticulously tracking what went wrong on a WordPress.com blog categories. The category has 29 entries, yet the wp-admin/categories.php only lists 23 on two pages combined. Everything is perfect in DB. Once we unset the element, and after a few recursions, the original pointer behaves weirdly, and missed one subtree.

If you want to verify it, try construct a category stree with at least two levels with at least 25 elements (or force per_page=5, and use less elements). It is easy to see that _cat_rows and cat_row will miss elements.

comment:5 hailin5 years ago

For those who have access, here is the blog where the original problem happened (fixed now):
http://peterthomas.wordpress.com/wp-admin/categories.php

comment:6 hailin5 years ago

the patch is for the WP trunk

comment:8 follow-up: hailin5 years ago

that is probably because (pass by reference + unset during looping ) = problem.

Even (pass by value + unset during looping) causes problem.
In my patch, there is no unset, so it is good to pass by reference.

comment:9 in reply to: ↑ 4 hakre5 years ago

Replying to hailin:

From that PHP page:

Note: Unless the array is referenced, foreach operates on a copy of the specified array and not the array itself. foreach has some side effects on the array pointer. Don't rely on the array pointer during or after the foreach without resetting it.

I guess "some side effects" is the key issue.

the refernces side effects are related to the array pointer only. the code is not using any array pointer. additionally the foreach loop iterates over a copy which means this is a new array, not $categories any longer (you just quoted that as well).

Please stop guessing, you are mixing things. Using $array[$index] is not using the arrays pointer. Learn more about array pointers in php here: http://php.net/manual/en/function.current.php I hope that helps you to better understand what an internal array pointer is.

So please make a proof.

comment:10 in reply to: ↑ 8 hakre5 years ago

Replying to hailin:

that is probably because (pass by reference + unset during looping ) = problem.

Even (pass by value + unset during looping) causes problem.
In my patch, there is no unset, so it is good to pass by reference.

ryan already pointed to that: there is no more pass by reference.

comment:11 follow-up: hailin5 years ago

Okay, he root cause is that unset($categories[$i])
mistakenly unsets wrong element because it assumes
the index always begins with 0, and auto-increments. This is not the case
with Array index - it can be anything, and even when it's numeric, it may not begin
with 0 and auto-increments. I remembered running into a similar bug in walker class
before.

So the quick fix is to remember the $key, and unset( $categories[ $key ] )

The second issue the new patch fixes is when the page starts in a subtree case.

The third minor issue the new patch fixes is to pass children by ref, since
we are not operating on it; and align the argument order to be consistent with that of the original.

comment:12 hailin5 years ago

Either of my two proposed patches fixes the issues.

The revised one retains unset, so it's faster.

comment:13 in reply to: ↑ 11 hakre5 years ago

  • Keywords developer-feedback added; reporter-feedback removed

Replying to hailin:

Okay, he root cause is that unset($categories[$i])
mistakenly unsets wrong element because it assumes
the index always begins with 0, and auto-increments.

Now that sounds like a Bug! Thanks for pin-pointing it down.

So the quick fix is to remember the $key, and unset( $categories[ $key ] )

array_values() might come in handy as well.

The second issue the new patch fixes is when the page starts in a subtree case.

could you anaylze why this is running into problems?

comment:14 hakre5 years ago

  • Milestone changed from Unassigned to 2.8
  • Resolution invalid deleted
  • Status changed from closed to reopened

hailin: can you tell against which wordpress version you are patching? you see it in the admin, down-right corner.

comment:15 hakre5 years ago

  • Keywords reporter-feedback added

comment:16 Denis-de-Bernardy5 years ago

see: #9089 (duplicate of this, with other details)

comment:18 hailin5 years ago

patch is for WP trunk revision 11073 (those functions were there since June 2008).

The subtree handling had an issue because $my_parent in the following block was not initialized, thus while statement won't execute at all. In the patch, I also used $p to refer to the parent term_id, to distinguish it from category object itself so that it's a little bit more readable.

$my_parents = array();
while ( $my_parent) {

$my_parent = get_category($my_parent);
$my_parents[] = $my_parent;
if ( !$my_parent->parent )

break;

$my_parent = $my_parent->parent;

}

hakre: thanks for the array foreach notes, I learned something too.

comment:19 janeforshort5 years ago

  • Keywords has-patch added

comment:21 hailin5 years ago

yeah, that also fixes the second issue on subtree.
Meanwhile, I think it's better to distinguish between term_id and category object (don't use $my_parent for both).

comment:22 Denis-de-Bernardy5 years ago

@hailin: please update the patch accordingly (and put a notice that you did in #8632), as I'm not entirely certain of what you're meaning.

comment:23 hailin5 years ago

The current patch is up to date.

What I was saying is that in my patch, I used an additional var
$p = $category->parent;
instead of using $my_parent throughout that block, for better clarity.

hailin5 years ago

revised patch again

comment:24 hailin5 years ago

This above revised patch is taken against latest WP trunk rev 11132.

comment:25 Denis-de-Bernardy5 years ago

  • Keywords dev-feedback added; developer-feedback reporter-feedback removed

comment:26 Denis-de-Bernardy5 years ago

patch looks neat. one suggestion though: don't change the order of parameters in existing functions, else it might get rejected for backward compat reasons.

comment:27 hailin5 years ago

I thought about that, and made sure it didn't have backward compat issue since _cat_row is meant for a private function to be called only by cat_row.

The orignal para orders of cat_row and _cat_row are so different that it was an annoyance. I believe little things - choosing right var name, para order - contribute to good code.

comment:28 ryan5 years ago

  • Resolution set to fixed
  • Status changed from reopened to closed

(In [11139]) cat_row fixes. Props hailin. fixes #9661 see #8632

Note: See TracTickets for help on using tickets.