Make WordPress Core

Opened 7 years ago

Last modified 17 months ago

#41714 new defect (bug)

wp_list_pages() - horrible performance due to eventual "SELECT *"

Reported by: davidanderson's profile DavidAnderson Owned by:
Milestone: Awaiting Review Priority: normal
Severity: normal Version: 4.9
Component: Posts, Post Types Keywords: 2nd-opinion
Focuses: Cc:

Description

I'm investigating awful performance (MySQL slow queries logged constantly) a site which uses a plugin to display a page list.

The plugin calls wp_list_pages(). wp_list_pages() in turn calls get_posts(). And get_posts() ends up making the slow query:

SELECT * FROM wp_posts WHERE (post_type = ‘page’ AND post_status = ‘publish’) ORDER BY wp_posts.post_date ASC;

So, all the post_content fields (along with everything else) are being requested, just for the purposes of constructing a page list. (The site has ~ 1200 pages - and the above call returns 34MB from the MySQL server).

It looks like either get_posts() needs some more flexibility so that it has an option to return only specified fields.

Change History (9)

#1 @subrataemfluence
7 years ago

If I understand correctly you want more control over core get_posts() method so that you can build a query with specific "WHERE" filters in order to fetch the type of records you want, not just "ALL" posts.

To achieve this you have the option to build an array and pass it through $args parameter get_posts() (function get_posts( $args = null ) { ... }) method, which by default is set to null. If you do not pass any argument, all posts would be retrieved while if you have a properly built argument, get_posts() will use this to fetch matching records according to criteria you set in the query.

https://developer.wordpress.org/reference/functions/get_posts/

I think the following article might also be helpful:
https://www.sitepoint.com/exploring-the-wordpress-get_posts-function/

Last edited 7 years ago by subrataemfluence (previous) (diff)

#2 @DavidAnderson
7 years ago

@subrataemfluence Thank you, but no. The ticket is not about calling get_posts() directly, or about controlling which posts are selected.

#3 @arena
6 years ago

Have you tried just adding an index on post table

CREATE INDEX `wp_list_pages`  
ON `**db**`.`**prefix**_posts` 
(post_type, 
post_status, 
post_date ASC)

#4 @garrett-eclipse
6 years ago

  • Keywords 2nd-opinion added

Thanks @DavidAnderson for the report.

Looking at the chain of function calls just wanted to clarify that wp_list_pages() calls get_pages() and not get_posts() as seen here;
https://github.com/WordPress/WordPress/blob/df438d90a0d8affaaded8177ad96986cacf83994/wp-includes/post-template.php#L1311

If it was calling get_posts() then it could simply use the 'fields' argument in the $args array, but that's not the case.

Looking at core probably the best approach here would be updating get_posts() to support a fields argument similar to what's used in get_posts() as get_posts simply uses WP_Query. If get_pages was updated to also use WP_Query then it would by default support the 'fields' parameter as well which is documented here;
https://codex.wordpress.org/Class_Reference/WP_Query#Return_Fields_Parameter

The get_pages function is drastically different than get_posts and has caching and a direct SQL Select call made. It would be nice to make them consistent but would be a large effort to ensure back-compat.

"Note that, although similar to get_posts(), several of the parameter names and values differ. (It is implemented quite differently, see get_posts().)"
Reference - https://codex.wordpress.org/Function_Reference/get_pages

If it's determined that making them consistent and using WP_Query is an overwhelming effort than potentially introducing 'fields' to the get_pages arguments may be an easier approach as it could then be used directly in the SQL statement.

#5 @DavidAnderson
6 years ago

@garrett-eclipse Thank you.

@arena Adding indexes to the table is not going to make any difference to SELECT * making MySQL send over all that's in post_content (34MB in this case, on a not-large site) when all that was ultimately wanted through the initial API call made was a list of post titles and URLs.

Last edited 6 years ago by DavidAnderson (previous) (diff)

#6 @birgire
6 years ago

Related #8874

#7 @arena
6 years ago

@DavidAnderson

worth a try !

as long as your site has an important number of pages

#8 @spacedmonkey
17 months ago

I believe this issue was fixed as part of #12821.

Can any of the original comments on this ticket validate this is still an issue in WordPress 6.3 beta?

#9 @DavidAnderson
17 months ago

@spacedmonkey I see no improvement.

I ran a script with this content:

`
define('WP_DEBUG', true);
require 'wp-load.php';
define('SAVEQUERIES', true);
echo $wp_version;
wp_list_pages();
var_dump($wpdb->queries);
`

Here's the output on WP 6.2.2:

6.2.2
array(1) {
  [0] =>
  array(5) {
    [0] =>
    string(126) "SELECT * FROM wp_posts  WHERE (post_type = 'page' AND post_status = 'publish')     ORDER BY menu_order,wp_posts.post_title ASC"
    [1] =>
    double(0.00075888633728027)
    [2] =>
    string(24) "wp_list_pages, get_pages"
    [3] =>
    double(1688653138.0517)
    [4] =>
    array(0) {
    }
  }
}

Upgraded to the latest WP 6.3 nightly. The output is then:

6.3-beta3-56143
array(1) {
  [0] =>
  array(5) {
    [0] =>
    string(200) "
                        SELECT   wp_posts.*
                        FROM wp_posts 
                        WHERE 1=1  AND wp_posts.post_type = 'page' AND ((wp_posts.post_status = 'publish'))

                        ORDER BY wp_posts.menu_order ASC, wp_posts.post_title ASC

                "
    [1] =>
    double(0.0011658668518066)
    [2] =>
    string(85) "wp_list_pages, get_pages, WP_Query->__construct, WP_Query->query, WP_Query->get_posts"
    [3] =>
    double(1688653365.4942)
    [4] =>
    array(0) {
    }
  }
}

So, the only thing that's changed is the trace through to the SQL query, and inconsequential details of the query. The query itself is still SELECT * FROM on the posts table, and the full load of post_content for every page on the site is being fetched just to list the post titles and links.

Note: See TracTickets for help on using tickets.