WordPress.org

Make WordPress Core

Opened 3 years ago

Closed 3 years ago

#32077 closed enhancement (duplicate)

get_pages re-factor to reduce line-count, DRY up code & give more specific functions for some of it's main processes

Reported by: LewisCowles Owned by:
Milestone: Priority: normal
Severity: normal Version: 1.5
Component: Posts, Post Types Keywords: needs-patch needs-unit-tests
Focuses: docs, performance Cc:

Description

While investigating another bug report, I noticed get_pages was more than a little unweildy & required significant screen scrolling, and had everything thrown into the function.

Really the whole thing needs a rewrite or to be chopped up into smaller functions for readibility and sandboxing some effects of code, which also increases adaptability.

I also noticed the caching seemed to be adding overhead where it was unnecesarry, and checks on caching seemed to be in the wrong order.

Attachments (1)

post.php.patch (18.7 KB) - added by LewisCowles 3 years ago.
Initial re-factor

Download all attachments as: .zip

Change History (11)

#1 follow-up: @helen
3 years ago

Related / duplicate-in-spirit: #12821

#2 @johnbillion
3 years ago

  • Focuses docs removed
  • Keywords needs-patch needs-unit-tests added
  • Version changed from 4.1.2 to 1.5

@LewisCowles
3 years ago

Initial re-factor

#3 in reply to: ↑ 1 @LewisCowles
3 years ago

  • Focuses docs added

Replying to helen:

Related / duplicate-in-spirit: #12821

No, lol. I see what you were thinking, maybe that is an eventual goal, but I think it's definitely not the spirit of this.

This is to increase readability, make documentation in the code easier to follow, and making it work correctly, as efficiently as possible. This has not added any functionality and most code is simply copied & pasted.

As-is in trunk there are a number of oddities with the code

  • It's a massive & hard to read function with many responsibilities (in the patch these are sent to other helper functions)
  • One filter is applied 3 times, instead of every time the function exits
  • Some of the checks seemed a little flimsy
  • There is no separation of concerns, or ability to graph, or iterate the function (which this is not specifically about, but for larger systems, this could be a nice ability to have)

Things I have not done are

  • add some filters for the SQL / WP_SQL, (I think deserves it's own issue, would help advanced plugin builders to use core code augmenting it with their own functionality, keeps people away from wpdb global or even rewriting queries / bits of queries on their own!)
  • To separate the data from the code (IMHO nice to have, maybe another issue, something I have raised before)
  • Maybe have an object instead of global access (again separate issue)

None of the above are about using posts ;)

#4 follow-up: @helen
3 years ago

Pages are a post type. If one is wandering into refactoring this function, perhaps it is prudent to consider that the general movement of its existence should be to utilizing our own query API.

#5 follow-up: @wonderboymusic
3 years ago

+1 what @helen said - would be interested to see if this could be broken out into a generic class

#6 in reply to: ↑ 4 @LewisCowles
3 years ago

Replying to helen:

Pages are a post type. If one is wandering into refactoring this function, perhaps it is prudent to consider that the general movement of its existence should be to utilizing our own query API.

I am not disputing what Helen is saying as part of a separate issue. This has nothing to do with her proposal to ensure pages follow the post's, and in-fact should make that easier by compartmentalizing the functionality of a huge dinosaur function.

#7 in reply to: ↑ 5 @LewisCowles
3 years ago

Replying to wonderboymusic:

+1 what @helen said - would be interested to see if this could be broken out into a generic class

Do you mean querying post_types as a generic class? If not, then again it is a separate issue to pages and probably one step ahead of this issue in terms of where things are destined to go, but it's a leap... Either way definitely separate! I Think that despite pages being post types, as they are currently handled separately, what is there, needs a bit of a clean before popping onto anything else.

There are other bits I would love to put into this, but as per the SVG functionality I tried to have pushed into the core #31973 to stop WP needing inefficient plugins to supplement it's limited file-type support, where I was told not to add upload types, and gallery fixes for displaying the added upload type in one issue. I have tried to not do this, and not include the Cutty Sark and her crew in patches; Now I feel like you are asking me to do this for this patch which is inconsistent.

#8 @boonebgorges
3 years ago

Do you mean querying post_types as a generic class?

I assume @helen meant: get_pages() should use WP_Query.

We generally avoid refactoring for refactoring's sake. Making get_pages() more readable, more DRY, etc are good things, but if these are the only gains, then it may not be worth the cost of potential regressions. See https://make.wordpress.org/core/2011/03/23/code-refactoring/ for more about this philosophy.

That said, converting get_pages() to use WP_Query - see https://core.trac.wordpress.org/ticket/12821#comment:47 - is probably worthwhile. The more we can centralize the generation of query SQL, post caching, and other stuff like that, the better off we'll be in the long run. If you'd like to work on this project, I'd recommend closing this ticket in favor of #12821 and using https://core.trac.wordpress.org/attachment/ticket/12821/12821-wp-query.diff as a starting point - copious unit tests for the existing get_pages() would be the natural starting point.

#9 @LewisCowles
3 years ago

We generally avoid refactoring for refactoring's sake.

limited data from functions

Old function 0.044784545898438MB RAM used 0ms running time 0.011528015136719MB RAM used 0ms running time

New function 0.0069198608398438MB RAM used 1ms running time 1/1000th of a second approx 0.005706787109375MB RAM used 0ms running time

So it uses less RAM, and represents maybe 1ms time per unique-call...

If anyone else wants to formally benchmark

function start

$rustart = getrusage(); $start_memory = memory_get_usage();

all function exits (replaced return with below + trailing return)

$ruend = getrusage(); echo "\r\n" . ( ( memory_get_usage() - $start_memory) / 1024 / 1024 ) . "MB RAM used\r\n"; echo "\r\n" . ( ($ruendru_utime.tv_sec?*1000 + intval($ruendru_utime.tv_usec?/1000)) - ($rustartru_utime.tv_sec?*1000 + intval($rustartru_utime.tv_usec?/1000)) ) . "ms running time\r\n";

#10 @helen
3 years ago

  • Milestone Awaiting Review deleted
  • Resolution set to duplicate
  • Status changed from new to closed

I don't think get_pages() is worth futzing with on its own until we've moved it to actually use our own query API. At that point, I imagine there would be significantly less code to construct SQL and do caching, etc., so realistically, this is a duplicate of #12821 as far as what the problem is (tickets are ideally about problems, not solutions).

Note: See TracTickets for help on using tickets.