id,summary,reporter,owner,description,type,status,priority,milestone,component,version,severity,resolution,keywords,cc,focuses
8384,Improving WP hierarchical data structure / Use SQL trees,hailin,hailin,"Currently WordPress uses a simple adjacency list model to manage hierarchical data, including pages, categories, and terms.
As data sets grow larger, we run into inherent limitations of this data structure.
For example:
When there are 5500 pages, in order to list a particular subset of pages,
we have to retrieve all of them from the db (step 1), and enumerate through each one to construct a sub-tree (step 2).
We previously made an algorithmic improvement to step 2, reducing its complexity from O(n^2) to O(n) in
function page_rows($pages, $pagenum = 1, $per_page = 20)
This reduced the processing time of step 2 from 20 sec to 0.3 sec in a case with ~2000 pages.
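The linear-time idea can be illustrated with a minimal sketch. This is not the actual page_rows() code; the function names and sample data are hypothetical, and it assumes each page is an (id, parent id) pair where parent id 0 means top level. One pass groups pages by parent, and the sub-tree walk then visits each page once instead of rescanning the whole list for every parent.

```python
from collections import defaultdict

def build_children_index(pages):
    # pages: iterable of (page_id, parent_id) tuples (hypothetical shape).
    # A single pass groups page ids by their parent: O(n).
    children = defaultdict(list)
    for page_id, parent_id in pages:
        children[parent_id].append(page_id)
    return children

def subtree_preorder(children, root_id):
    # Iterative depth-first walk; each page in the sub-tree is visited once.
    order = []
    stack = [root_id]
    while stack:
        page_id = stack.pop()
        order.append(page_id)
        # Push children in reverse so they pop in their original order.
        stack.extend(reversed(children.get(page_id, [])))
    return order

# Hypothetical sample data: pages 2 and 3 are children of 1, page 4 is a child of 2.
pages = [(1, 0), (2, 1), (3, 1), (4, 2), (5, 0)]
index = build_children_index(pages)
print(subtree_preorder(index, 1))  # [1, 2, 4, 3]
```

Note that this only speeds up step 2: every page still has to be loaded from the db before the index can be built.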
However, step 1 is still a major hurdle because of the limitations of our current pages data structure.
In the case of one blog with 5500+ pages, it takes 33 seconds to display wp-admin/edit-pages.php. In particular, 17 seconds are spent in update_post_caches() and 11 seconds in apply_filters('the_posts', $this->posts), simply because we are operating over all 5500+ pages. Because we store only rudimentary parent-child information in the table, we have to query all 5500+ pages.
If we improve the data structure and store the pages in an efficient hierarchical order in the db, then for every operation we could retrieve only the rows we want, e.g. 30 pages at a time. This could bring the page load time down from 33 seconds to under a second, substantially improving the user experience.
The above is just one example; the same applies to any other case involving hierarchical data sets.
The potential change will have many ripple effects, as it may affect many other functions, or even themes, that depend on the existing behavior. So we should proceed cautiously and pay close attention to backward compatibility.
We could consider:
Modified Preorder Tree Traversal Algorithm
http://www.sitepoint.com/article/hierarchical-data-database/
Or the approaches recommended in the classic book:
http://www.amazon.com/Hierarchies-Smarties-Kaufmann-Management-Systems/dp/1558609202/ref=pd_bxgy_b_img_b
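To make the Modified Preorder Tree Traversal option concrete, here is a minimal sketch using an in-memory SQLite table. The table name pages, the columns lft and rgt, and the helper names are assumptions for illustration only, not a proposed WordPress schema. Each row stores the left and right numbers assigned during a preorder walk, so a sub-tree becomes a single range query that can be paginated with LIMIT and OFFSET.

```python
import sqlite3

def assign_bounds(children, node_id, counter, bounds):
    # Preorder walk: lft is assigned on the way down, rgt on the way back up.
    lft = counter
    counter += 1
    for child_id in children.get(node_id, []):
        counter = assign_bounds(children, child_id, counter, bounds)
    bounds[node_id] = (lft, counter)
    return counter + 1

# Hypothetical sample data: parent id -> ordered child ids (0 is a virtual root).
children = {0: [1, 5], 1: [2, 3], 2: [4]}
bounds = {}
assign_bounds(children, 0, 1, bounds)

db = sqlite3.connect(':memory:')
db.execute('CREATE TABLE pages (id INTEGER PRIMARY KEY, lft INTEGER, rgt INTEGER)')
db.executemany('INSERT INTO pages VALUES (?, ?, ?)',
               [(i, l, r) for i, (l, r) in bounds.items() if i != 0])

def subtree_page(db, root_id, limit, offset=0):
    # Look up the bounds of the sub-tree root, then fetch one slice of its
    # descendants (root included) in display order with a single range query.
    lft, rgt = db.execute('SELECT lft, rgt FROM pages WHERE id = ?',
                          (root_id,)).fetchone()
    rows = db.execute(
        'SELECT id FROM pages WHERE lft BETWEEN ? AND ? '
        'ORDER BY lft LIMIT ? OFFSET ?',
        (lft, rgt, limit, offset))
    return [row[0] for row in rows]

print(subtree_page(db, 1, 3))     # [1, 2, 4]
print(subtree_page(db, 1, 3, 3))  # [3]
```

The trade-off is on the write side: inserting or moving a page means renumbering lft and rgt for the rows after it, so any real implementation would need to handle those updates and keep the existing parent column for backward compatibility.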
Besides, WordPress is evolving into a CMS, which requires a better foundation for manipulating various data formats. A solid, efficient, robust data structure would serve as that foundation as we evolve in the future.
",feature request,closed,low,,Optimization,,minor,maybelater,,,