Make WordPress Core

Opened 2 years ago

Last modified 23 months ago

#55206 assigned defect (bug)

wp core api memory leaks

Reported by: sllimrovert's profile sllimrovert Owned by:
Milestone: Awaiting Review Priority: normal
Severity: normal Version:
Component: Database Keywords: 2nd-opinion
Focuses: performance Cc:

Description

I've experienced the following two memory leaks in WP core. One involves $wpdb when SAVEQUERIES is defined truthy, and the other involves $wp_object_cache growing as a consequence of calling core api functions that themselves save to the object cache. Both have happened for me in cases where I'm doing large batch processing involving thousands or tens of thousands of posts. I've had memory usage exceed 512MB and cause crashes.

I'm including unit tests here showing each memory leak and also the fix that I've used to prevent the memory leak and keep my batch jobs running.

<?php

/**
 * Class WP_Memory_Leak_Tests
 *
 * This class tests two cases that cause memory leaks in WordPress that could
 * lead to crashes, particularly in CLI jobs that work on larger batches.  For
 * each of the cases ( one for the wpdb class and one for the global
 * $wp_object_cache ), we perform some seemingly innocuous task many times -
 * enough times to require that PHP allocate more memory because of a specific
 * action.
 *
 * Neither of the tests here show a particularly large memory increase, but I've
 * personally had both occur for me on large jobs hitting WP API functions.  The
 * one with $wpdb->queries particularly has a tendency to blow up.
 */
class WP_Memory_Leak_Tests extends WP_UnitTestCase {

        /**
         * This tests a condition which exposes a memory leak in the WPDB class.
         * If 'SAVEQUERIES' is defined as truthy, then the $wpdb->queries property
         * can grow indefinitely.
         */
        public function test_WPDB_Memory_Leak() {

                // Once a constant is defined, it can't be undefined, it's often defined in dev or staging environments.
                define( 'SAVEQUERIES', true );

                // I'll just start my cron job to read the import file I've got.  It's
                // got a decent number of records.
                $number_of_records = 1000;

                global $wpdb;
                $memory = memory_get_usage( true );
                $peak   = memory_get_peak_usage( true );
                foreach ( [ 'first', 'second' ] as $pass ) {
                        // first pass through, we'll apply a fix for this memory leak.
                        // second pass through, we'll bypass the fix and the tests will fail.
                        for ( $i = 1; $i <= $number_of_records; $i ++ ) {
                                if ( 'first' === $pass ) {
                                        $wpdb->queries = [];
                                }

                                // for this test, we'll do direct calls to $wpdb
                                $wpdb->query( $wpdb->prepare( "SELECT * FROM $wpdb->posts WHERE ID = %d", $i ) );
                        }
                        $this->assertEquals( $memory, memory_get_usage( true ), "$pass pass" );
                        $this->assertEquals( $peak, memory_get_peak_usage( true ), "$pass pass" );
                }

        }

        /**
         * This tests a condition which exposes a memory leak in wp cache API.  If
         * a large batch job attempts to do a lot of something that ends up caching
         * things ( like, for example, get_post or wp_insert_post ), then unless
         * the cache is flushed regularly, the memory usage grows indefinitely.
         */
        public function test_WP_Cache_Memory_Leak() {

                // I'll just start my cron job to read the import file I've got.  It's
                // got a decent number of records.
                $number_of_records = 1000;

                global $wpdb;
                $memory = memory_get_usage( true );
                $peak   = memory_get_peak_usage( true );
                foreach ( [ 'first', 'second' ] as $pass ) {
                        // first pass through, we'll apply a fix for this memory leak.
                        // second pass through, we'll bypass the fix and the tests will fail.
                        for ( $i = 1; $i <= $number_of_records; $i ++ ) {
                                if ( 'first' === $pass ) {
                                        wp_cache_flush();
                                }

                                // Because our last test defined 'SAVEQUERIES', we need to
                                // always apply this fix, otherwise that memory leak manifests.
                                // With us doing a core API function `wp_insert_post`, the number
                                // of queries is quite large and memory __really__ grows.
                                $wpdb->queries = [];

                                // let's say we're inserting posts, maybe from an excel file.
                                // this caches some things, so $wp_object_cache grows.
                                wp_insert_post([
                                        'post_type' => 'post',
                                        'post_title' => "post $i",
                                        'post_content' => "pass $pass"
                                ]);
                        }
                        $this->assertEquals( $memory, memory_get_usage( true ), "$pass pass" );
                        $this->assertEquals( $peak, memory_get_peak_usage( true ), "$pass pass" );
                }

        }

}

Attachments (1)

test-wp-memory-leaks.php (3.4 KB) - added by sllimrovert 2 years ago.
failing unit tests

Download all attachments as: .zip

Change History (4)

@sllimrovert
2 years ago

failing unit tests

#1 @SergeyBiryukov
2 years ago

  • Focuses performance added
  • Keywords 2nd-opinion added

Hi there, welcome back to WordPress Trac! Thanks for the ticket.

At a glance, the SAVEQUERIES constant works as intended here, it is only meant for debugging and not meant to be enabled during large batch processing jobs.

As for the object cache growing, you may get better results using the wp_suspend_cache_addition() function introduced in [18681] / #5389 exactly for cases like this. There is also wp_suspend_cache_invalidation() introduced in [9106] that might be helpful.

So I would say that both $wpdb and the object cache work as expected in these circumstances. To impove perfomance of large batch processing jobs, I would suggest looking into asynchronous solutions like WP Background Processing. That said, I'm curious to know if you or anyone else have any ideas on how WordPress core could be improved here.

Last edited 2 years ago by SergeyBiryukov (previous) (diff)

#2 @sllimrovert
2 years ago

Hi Sergey,

Thank you for the response.

I agree SAVEQUERIES is working as intended, but it's not uncommon for it to be on by default in a development environment. It could potentially lead to confusing time spent by inexperienced devs who've coded themselves into a memory limit corner. For this one, I'd solve it using the constant to initialize a $save_queries property of $wpdb, same as how $show_errors is. It would make it useful for devs to be able to grab just the queries from their set of calls. Even with that in place, I'd probably still look to cap the number of queries saved at some reasonably large number like 10,000 just to prevent the issue altogether.

For the cache one, two days ago I learned of a function called wp_suspend_cache_addition. Thank you for that. Totally solves the scenario. Still, it's conceivable someone doing generally best practices using the common api could find themselves in OOM territory. For this one I'd solve it with garbage collection. If a group in the cache exceeds some memory threshold, it gets pruned.

I just looked at WP Background Processing, another one new to me. That's a beautifully written piece of code by the folks at Delicious Brains, as it usually is. And in fact, didn't they even come across memory issues as evidenced by - https://github.com/deliciousbrains/wp-background-processing/master/wp-background-process.php#L343.

Me, I have a code snippet that I use when I need it, which is always when processing thousands of unique posts.

<?php
/**
 * Caching is normally good.  However, when running a big batch process, it often isn't.  We might be
 * caching things for the whole request that we really only need for the current record.  That's a
 * memory leak.  Additionally, if SAVEQUERIES is defined as true, then $wpdb saves all queries throughout
 * the request.  That's another memory leak.  We prevent these here with the next two lines.
 *
 * @return void
 */

function wp_prevent_memory_leak() {
        wp_cache_flush();
        global $wpdb;
        $wpdb->queries = array();
}
Last edited 2 years ago by sllimrovert (previous) (diff)

#3 @costdev
23 months ago

  • Version trunk deleted

Removing trunk as this was not introduced during the 6.0 cycle.

Note: See TracTickets for help on using tickets.