WordPress.org

Make WordPress Core

Opened 3 years ago

Closed 3 years ago

#36711 closed enhancement (fixed)

Add caching to get_page_by_path

Reported by: spacedmonkey Owned by: boonebgorges
Milestone: 4.6 Priority: normal
Severity: normal Version: 2.1
Component: Posts, Post Types Keywords:
Focuses: performance Cc:
PR Number:

Description

The function is get_page_by_path is called lot in WP_Query. Currently is doesn't have any caching on it. This is performance issue if you are using persistent cache.

Attachments (5)

36711.patch (1.0 KB) - added by spacedmonkey 3 years ago.
36711.2.patch (965 bytes) - added by spacedmonkey 3 years ago.
36711.3.patch (2.4 KB) - added by spacedmonkey 3 years ago.
New patch with unit tests
36711.4.patch (2.9 KB) - added by spacedmonkey 3 years ago.
36711.5.patch (3.7 KB) - added by boonebgorges 3 years ago.

Download all attachments as: .zip

Change History (21)

@spacedmonkey
3 years ago

#1 @spacedmonkey
3 years ago

  • Keywords has-patch added

First pass at this patch. Pretty simple patch to get caching. The caching method is simple to the one in the get_pages function. This cache key use last_changed, a value set in the clean_post_cache function.

#2 @ocean90
3 years ago

  • Keywords needs-unit-tests added

#3 follow-up: @spacedmonkey
3 years ago

@ocean90 What unit tests should we be writing for this patch? I run the current tests and they are passing.

#4 in reply to: ↑ 3 @boonebgorges
3 years ago

Replying to spacedmonkey:

@ocean90 What unit tests should we be writing for this patch? I run the current tests and they are passing.

At a minimum, I'd write one test that demonstrates that two subsequent, identical queries result in only one database query; and one test that demonstrates that running a query, updating the page, and then running the same query results in two database queries (and that the results of the second query match the post-update page).

@spacedmonkey
3 years ago

New patch with unit tests

#5 @spacedmonkey
3 years ago

  • Keywords has-unit-tests added; needs-unit-tests removed

#6 @spacedmonkey
3 years ago

  • Keywords dev-feedback added

36711.4 patch, is another way of doing caching for this function. It is better as it isn't invalidated every time the post cache is cleaned. The code isn't as clean through.

#7 @boonebgorges
3 years ago

In 37478:

Tests: Improve tests for get_page_by_path().

  • Move existing test into its own file.
  • Add tests that cover most pieces of functionality.

See #36711.

#8 @boonebgorges
3 years ago

  • Milestone changed from Awaiting Review to 4.6

@spacedmonkey Thanks for your work here.

The approach in 36711.4.patch will not work, because a page's path depends on pages other than itself. For example, a page /foo/bar with a parent of foo will get a new patch if foo changes to baz. It would probably be possible to build a more targeted invalidation routine. It'd mean that when a page's path is updated, the path cache of each of its descendants must also be updated. This is a lot of code and a lot of overhead for a small improvement, so I'd suggest we skip it for now and go with the more general last_changed technique.

36711.5.patch cleans up the improvements in 36711.3.patch and clarifies the tests. I think it's a pretty safe change, but I'd appreciate another set of eyes.

#9 @spacedmonkey
3 years ago

I have been running a patch similar patch on production ( millions of requests a day ) and it has massively cut down the number of queries to the posts table. Very happy to see this in core :D

Couple of notes on 36711.5.patch

The follow lines are problem for me.

    if ( false !== $cached ) { 
           return get_post( $cached ); 
    } 

I believe they should look something like this.

    if ( false !== $cached ) { 
           if ( $cached ) {
                return get_post( $cached, $output ); 
           }
           return;
    } 

Fixes 2 issues.

  1. The format of output is now respected.
  2. If $cached is 0 ( nothing found ), it will not give that to get_post. If you give get_post an empty value (null/0/false) it will just return the current post, which is incorrect. If the cached value is 0, the function should just return.

One thing I have never understand about this function, why doesn't it return anything if nothing is found? It could at least return null or false.

Once we have worked through all the issues with this patch, similar caching can be applied to the get_page_by_title function as well :D

#10 @boonebgorges
3 years ago

I believe they should look something like this.

Ah yes, thanks. I've made the change and added a test that describes it.

One thing I have never understand about this function, why doesn't it return anything if nothing is found? It could at least return null or false.

The answer, if there is one, lies in the log. Go to the log, my son. Go to the log. (And open a new ticket if you want to change it, and have supporting info from the log.)

Let's put this in and see what breaks.

#11 @boonebgorges
3 years ago

  • Owner set to boonebgorges
  • Resolution set to fixed
  • Status changed from new to closed

In 37479:

Cache queries in get_page_by_path().

Props spacedmonkey.
Fixes #36711.

#12 @spacedmonkey
3 years ago

  • Resolution fixed deleted
  • Status changed from closed to reopened

@boonebgorges in 37479 the cache return value doesn't take the output param.

#13 @boonebgorges
3 years ago

@spacedmonkey I'm afraid I don't understand. Can you please explain in more detail, ideally with a test that demonstrates the problem?

#14 @spacedmonkey
3 years ago

If doesn't require a test. get_page_by_path takes a $output param. The original function passes $output to get_post like so.

if ( $foundid ) {
    return get_post( $foundid, $output );
}

However currently, if the value returned from cache it doesn't pass the output param.

} else {
   return get_post( $cached );
}

This line should look like this

} else {
   return get_post( $cached, $output );
}

Currently if you run the function like this.

$value = get_page_by_path( 'foo', ARRAY_A );

It would return an object.

#15 @boonebgorges
3 years ago

  • Keywords has-patch has-unit-tests dev-feedback removed
  • Status changed from reopened to assigned

If doesn't require a test.

If it's a bug, it does.

Thanks for the explanation. I will write the tests.

#16 @boonebgorges
3 years ago

  • Resolution set to fixed
  • Status changed from assigned to closed

In 37481:

In get_page_by_path(), values fetched from cache should obey $output param.

Introduced in [37479].

Props spacedmonkey.
Fixes #36711.

Note: See TracTickets for help on using tickets.