Make WordPress Core

Opened 11 years ago

Closed 10 years ago

#28425 closed enhancement (fixed)

get_permalink passes only post ID to called functions instead of post object

Reported by: arnee's profile arnee Owned by: johnbillion's profile johnbillion
Milestone: 4.0 Priority: normal
Severity: normal Version: 4.0
Component: Posts, Post Types Keywords: has-patch dev-feedback commit
Focuses: performance Cc:

Description

The function get_permalink passes only the ID of the post to the called functions get_page_link, get_attachment_link and get_post_permalink.

This will result in an additional post-resolving with eventually a database call in some situations.

Since get_permalink already resolved the post, it should only pass the object.

Attachments (2)

28425-1.diff (776 bytes) - added by arnee 11 years ago.
28425.test.php (4.0 KB) - added by SergeyBiryukov 10 years ago.

Download all attachments as: .zip

Change History (24)

@arnee
11 years ago

#1 @arnee
11 years ago

  • Keywords has-patch added

#2 @johnbillion
11 years ago

  • Milestone changed from Awaiting Review to 4.0
  • Type changed from defect (bug) to enhancement

Makes sense.

#3 @MikeHansenMe
11 years ago

At first I thought there would need to be some additional check to see if it was an object but the functions above all call get_post which does a check to see if it is an object instead of an id. Looks good to me.

#4 @arnee
11 years ago

Yes, I've checked that too. All called functions ensure the post object again.

#5 @kovshenin
11 years ago

This will result in an additional post-resolving with eventually a database call in some situations.

Can you provide some examples of such situations? I can be wrong, but afaik if the post has been fetched before, get_post() will simply get the post from cache via WP_Post::get_instance().

#6 follow-ups: @arnee
11 years ago

A case like this would be if you have a post object, which is not in the post cache yet. There might be several reasons for this, one is if you used a custom SELECT SQL to fetch more than one post (like my sitemap plugin does).

On the get_permalink call, the first get_post call will transform the given object into a WP_Post object. This is good. Unfortunately if the post is a page, get_page_link will be called with only the ID. get_page_link therefore calls get_post (with only the ID) which leads to a SQL query.

Anyway there is no reason the object should not be passed, since passing the ID creates definitely some unnecessary overhead in any case.

#7 in reply to: ↑ 6 @kovshenin
11 years ago

Replying to arnee: can you provide a code sample, perhaps from your sitemap plugin, where a call to get_permalink() results in an database hit vs. no hit if you pass the whole object to get_page_link()?

#8 follow-up: @arnee
10 years ago

@kovshenin, as I wrote before any time you query a page directly with wpdb and try to get the permalink then.

$posts = $wpdb->get_results("SELECT * FROM wp_posts WHERE ID=21278");
$p = $posts[0];
$permalink = get_permalink($p);
var_dump($wpdb->queries);

wpdb->queries:

  [7]=>
  array(3) {
    [0]=>
    string(37) "SELECT * FROM wp_posts WHERE ID=21278"
    [1]=>
    float(0.00099992752075195)
    [2]=>
    string(0) ""
  }
  [8]=>
  array(3) {
    [0]=>
    string(47) "SELECT * FROM wp_posts WHERE ID = 21278 LIMIT 1"
    [1]=>
    float(0)
    [2]=>
    string(61) "get_permalink, get_page_link, get_post, WP_Post::get_instance"
  }

#9 in reply to: ↑ 8 @kovshenin
10 years ago

Replying to arnee: I'm not sure that's a valid use case.

If you need the post with the ID 21278, you're better off with get_post( 21278 ). If you're looking for pages by path, title, etc, you should be using get_page_by_* or WP_Query. All of these leverage object caching, which prevents further querying of the same posts regardless of the context (the same post can appear in a widget for example).

Furthermore, when using these in an environment with persistent object caching, there's a high chance you don't need to hit the database even once, especially if you're simply querying by ID.

Also, passing an object instead of the ID doesn't really solve much. The page_link and attachment_link filters pass the post ID, so there's a chance anything attached to those filters will run get_post with an ID anyway.

#10 follow-up: @arnee
10 years ago

kovshenin, this example was very very simple just to demonstrate the behavior. In the actual case I would of course not query a single post, but probably 100, 200 or more.
Regarding your suggestions:

  • WP_Query is not a solution, since other plugins will break the query with their filters. WP_Query with suppress_filters is not a solution, since I would need to use my own filters to get the result I want. I've tried it before and it was a huge unreliable mess.
  • As long as a persitent cache is not part of WP Core, the probability that the posts are in the cache is zero. It has to work for all WP sites not for the 1% which run a persistent object cache.

The additional db query can avoided by just using the available objects instead of the ID. I think the change is pretty simple and makes sense, it actually saves some CPU. There is no reason for throwing away the object and passing the ID just to get the object from the cache again.

Actually, the get_instance method of WP_Post (which is called by get_post if you only use the ID) will even create a new WP_Post object every time it is read from the cache, since the cache contains the raw data and not the WP_Post object.

Last edited 10 years ago by arnee (previous) (diff)

#11 @kovshenin
10 years ago

  • Keywords dev-feedback added

I'll let somebody else weigh in. I'm not against the proposed change, I just think it's useless unless you're trying to hack your way around WP_Query and friends.

#12 @arnee
10 years ago

I agree that WP_Query should be used, there are just cases where it is not possible or using it results in hacking and workarounds.

#13 in reply to: ↑ 6 @rmccue
10 years ago

  • Keywords close added

Replying to arnee:

A case like this would be if you have a post object, which is not in the post cache yet. There might be several reasons for this, one is if you used a custom SELECT SQL to fetch more than one post (like my sitemap plugin does).

If you're doing a custom SELECT, then you should be calling update_post_cache() (or update_post_caches()) on the posts to ensure they're added to the cache. You should be doing this regardless of how WP behaves internally. (Even if you don't have an external object cache, this will still cache them in memory for the current request.)

Apart from that, I'd still recommend using WP_Query if at all possible. You note "WP_Query is not a solution, since other plugins will break the query with their filters", but I'm not sure exactly what would be actually conflicting here.

I'm not sure that changing get_permalink() behaviour really gains us anything here. Seems like unnecessary code churn to me. Recommend close.

#14 in reply to: ↑ 10 ; follow-up: @SergeyBiryukov
10 years ago

  • Keywords commit added; close removed

Replying to arnee:

Actually, the get_instance method of WP_Post (which is called by get_post if you only use the ID) will even create a new WP_Post object every time it is read from the cache, since the cache contains the raw data and not the WP_Post object.

While the use case might not exactly be valid (as mentioned above), this sounds like a valid observation. Reusing the object we already have would save us some overhead.

A quick test with 100 000 iterations shows a 15% performance increase (21,281 seconds with 28425-1.diff vs. 25,135 seconds unpatched).

Last edited 10 years ago by SergeyBiryukov (previous) (diff)

#15 in reply to: ↑ 14 @rmccue
10 years ago

Replying to SergeyBiryukov:

A quick test with 100 000 iterations shows a 15% performance increase (21,281 seconds with 28425-1.diff vs. 25,135 seconds unpatched).

I am +1 in that case. :)

#16 @kovshenin
10 years ago

I don't see a 15% increase with or without the patch. I ran a few benchmarks with 1000 calls to get_permalink() which passes an object vs id to get_page_link(), I don't see a significant increase or decrease in performance.

I did run the benchmarks separately too and there's ~ 8,000 extra function calls when working with the id, most notably wp_cache_get(), is_object(), and WP_Post::get_instance(). None of them seems to make a significant difference in execution time.

#17 follow-up: @arnee
10 years ago

If you are used to work with objects, this mix of using sometimes the ID and sometimes passing the object is highly confusing. Another example:

$page = get_post(21278);
echo get_permalink($page); 
//Will echo http://localhost/wp_svn/adipiscing-at-aliquam/

$page->post_name="someotherslug";
echo get_permalink($page); 
//Will echo http://localhost/wp_svn/adipiscing-at-aliquam/
//Should echo http://localhost/wp_svn/someotherslug/

#18 in reply to: ↑ 17 @kovshenin
10 years ago

Replying to arnee: As I said, I'm not against this change, just not for the reasons you mentioned above (tricking get_permalink() into returning something that does not exist, working directly with the database and by-passing object caching, resulting in extra database queries). I'd accept the consistency/cleaner code reason, maybe even the performance reason although there seems to be very little gain if any.

#19 follow-up: @SergeyBiryukov
10 years ago

Replying to kovshenin:

I don't see a 15% increase with or without the patch. I ran a few benchmarks with 1000 calls to get_permalink() which passes an object vs id to get_page_link(), I don't see a significant increase or decrease in performance.

I'd suggest trying 100 000 calls for more consistent results.

Here's my test file: 28425.test.php. I do see a consistent difference on both PHP 5.2.17 and 5.4.29:

// PHP 5.2.17, 1000 calls
get_permalink(): 0,225 seconds
get_permalink_with_obj(): 0,177 seconds

// PHP 5.2.17, 100 000 calls
get_permalink(): 22,158 seconds
get_permalink_with_obj(): 17,931 seconds

// PHP 5.4.29, 1000 calls
get_permalink(): 0,125 seconds
get_permalink_with_obj(): 0,094 seconds

// PHP 5.4.29, 100 000 calls
get_permalink(): 12,422 seconds
get_permalink_with_obj(): 9,406 seconds

#20 in reply to: ↑ 19 @kovshenin
10 years ago

Replying to SergeyBiryukov: You're right, 100k yielded more consistent results and there indeed is a ~ 5-15% performance increase. Here's the last run: https://cloudup.com/ifbVpepbkSP on PHP 5.6.

Should we look for other places that could benefit from a similar change? Perhaps worth a new ticket? A quick search revealed:

  • clean_post_cache() in wp_publish_post
  • get_the_guid() in wp_get_attachment_url() and _transition_post_status()
  • wp_get_post_revisions() in wp_list_post_revisions()
  • get_attachment_link() in wp_get_attachment_link()

P.S. You, sir, must have a very fast computer to have those executed in 12 seconds ;)

This ticket was mentioned in IRC in #wordpress-dev by helen. View the logs.


10 years ago

#22 @johnbillion
10 years ago

  • Owner set to johnbillion
  • Resolution set to fixed
  • Status changed from new to closed

In 29024:

Pass the post object rather than the post ID between the post permalink functions. Fixes #28425. Props arnee

Note: See TracTickets for help on using tickets.