Opened 16 years ago
Closed 16 years ago
#9661 closed defect (bug) (fixed)
cat_row doesn't not list all categories
Reported by: | hailin | Owned by: | |
---|---|---|---|
Milestone: | 2.8 | Priority: | normal |
Severity: | normal | Version: | |
Component: | General | Keywords: | has-patch dev-feedback |
Focuses: | Cc: |
Description
wp-admin/includes/template.php
cat_rows and _cat_rows are buggy - it can not list all legitimate categories. The key reason is that within loop
foreach ( $categories as $category )
...
unset($categories[$i]) Prune the working set
We can not unset an element while the foreach loop is traversing the array, because it causes the internal PHP cursor to jump ahead, leaving out valid elements. It's ok to unset an element as long as there is no outer loop performing on the array.
Also, we should not try to get categories and children inside _cat_rows, instead, it's better to pass them in once and only once.
If the page starts in a subtree, the original author forgot to assign $my_parent = get_category( $p ); so the subtree handling was not working at all.
Overall, before and after my fix, the time complexity is O(N2), so it's not efficient when # of categories reach 1000. However, I think reasonable range is at most a few hundreds. If speed becomes an issue, I suggest we use an O(N) algorithm, similar to that used in page_rows.
Attachments (1)
Change History (29)
#2
in reply to:
↑ description
@
16 years ago
- Resolution set to invalid
- Status changed from new to closed
Replying to hailin:
[...] The key reason is that within loop
foreach ( $categories as $category )
...
unset($categories[$i]) Prune the working set
We can not unset an element while the foreach loop is traversing the array, because it causes the internal PHP cursor to jump ahead, leaving out valid elements.
That is not true. foreach() works on a copy of the array so you do not manipulate the array pointer while removing one element from the (source) array while iterating over a copy of the (source) array. compare to http://de.php.net/manual/en/control-structures.foreach.php .
since you name that beign the "key reason" I assume that your report is invalid. I therefore close it as invalid. Feel free to provide more info and reopen it.
Please name the wordpress version you provided the patch for.
#4
follow-up:
↓ 9
@
16 years ago
From that PHP page:
Note: Unless the array is referenced, foreach operates on a copy of the specified array and not the array itself. foreach has some side effects on the array pointer. Don't rely on the array pointer during or after the foreach without resetting it.
I guess "some side effects" is the key issue.
I've tracked down this issue by meticulously tracking what went wrong on a WordPress.com blog categories. The category has 29 entries, yet the wp-admin/categories.php only lists 23 on two pages combined. Everything is perfect in DB. Once we unset the element, and after a few recursions, the original pointer behaves weirdly, and missed one subtree.
If you want to verify it, try construct a category stree with at least two levels with at least 25 elements (or force per_page=5, and use less elements). It is easy to see that _cat_rows and cat_row will miss elements.
#5
@
16 years ago
For those who have access, here is the blog where the original problem happened (fixed now):
http://peterthomas.wordpress.com/wp-admin/categories.php
#8
follow-up:
↓ 10
@
16 years ago
that is probably because (pass by reference + unset during looping ) = problem.
Even (pass by value + unset during looping) causes problem.
In my patch, there is no unset, so it is good to pass by reference.
#9
in reply to:
↑ 4
@
16 years ago
Replying to hailin:
From that PHP page:
Note: Unless the array is referenced, foreach operates on a copy of the specified array and not the array itself. foreach has some side effects on the array pointer. Don't rely on the array pointer during or after the foreach without resetting it.
I guess "some side effects" is the key issue.
the refernces side effects are related to the array pointer only. the code is not using any array pointer. additionally the foreach loop iterates over a copy which means this is a new array, not $categories any longer (you just quoted that as well).
Please stop guessing, you are mixing things. Using $array[$index] is not using the arrays pointer. Learn more about array pointers in php here: http://php.net/manual/en/function.current.php I hope that helps you to better understand what an internal array pointer is.
So please make a proof.
#10
in reply to:
↑ 8
@
16 years ago
Replying to hailin:
that is probably because (pass by reference + unset during looping ) = problem.
Even (pass by value + unset during looping) causes problem.
In my patch, there is no unset, so it is good to pass by reference.
ryan already pointed to that: there is no more pass by reference.
#11
follow-up:
↓ 13
@
16 years ago
Okay, he root cause is that unset($categories[$i])
mistakenly unsets wrong element because it assumes
the index always begins with 0, and auto-increments. This is not the case
with Array index - it can be anything, and even when it's numeric, it may not begin
with 0 and auto-increments. I remembered running into a similar bug in walker class
before.
So the quick fix is to remember the $key, and unset( $categories[ $key ] )
The second issue the new patch fixes is when the page starts in a subtree case.
The third minor issue the new patch fixes is to pass children by ref, since
we are not operating on it; and align the argument order to be consistent with that of the original.
#12
@
16 years ago
Either of my two proposed patches fixes the issues.
The revised one retains unset, so it's faster.
#13
in reply to:
↑ 11
@
16 years ago
- Keywords developer-feedback added; reporter-feedback removed
Replying to hailin:
Okay, he root cause is that unset($categories[$i])
mistakenly unsets wrong element because it assumes
the index always begins with 0, and auto-increments.
Now that sounds like a Bug! Thanks for pin-pointing it down.
So the quick fix is to remember the $key, and unset( $categories[ $key ] )
array_values() might come in handy as well.
The second issue the new patch fixes is when the page starts in a subtree case.
could you anaylze why this is running into problems?
#14
@
16 years ago
- Milestone changed from Unassigned to 2.8
- Resolution invalid deleted
- Status changed from closed to reopened
hailin: can you tell against which wordpress version you are patching? you see it in the admin, down-right corner.
#18
@
16 years ago
patch is for WP trunk revision 11073 (those functions were there since June 2008).
The subtree handling had an issue because $my_parent in the following block was not initialized, thus while statement won't execute at all. In the patch, I also used $p to refer to the parent term_id, to distinguish it from category object itself so that it's a little bit more readable.
$my_parents = array();
while ( $my_parent) {
$my_parent = get_category($my_parent);
$my_parents[] = $my_parent;
if ( !$my_parent->parent )
break;
$my_parent = $my_parent->parent;
}
hakre: thanks for the array foreach notes, I learned something too.
#21
@
16 years ago
yeah, that also fixes the second issue on subtree.
Meanwhile, I think it's better to distinguish between term_id and category object (don't use $my_parent for both).
#22
@
16 years ago
@hailin: please update the patch accordingly (and put a notice that you did in #8632), as I'm not entirely certain of what you're meaning.
#23
@
16 years ago
The current patch is up to date.
What I was saying is that in my patch, I used an additional var
$p = $category->parent;
instead of using $my_parent throughout that block, for better clarity.
#26
@
16 years ago
patch looks neat. one suggestion though: don't change the order of parameters in existing functions, else it might get rejected for backward compat reasons.
#27
@
16 years ago
I thought about that, and made sure it didn't have backward compat issue since _cat_row is meant for a private function to be called only by cat_row.
The orignal para orders of cat_row and _cat_row are so different that it was an annoyance. I believe little things - choosing right var name, para order - contribute to good code.
tested and verified