Make WordPress Core

Opened 8 months ago

Closed 2 months ago

Last modified 2 months ago

#59900 closed task (blessed) (fixed)

Measure performance with a persistent object cache in performance tests

Reported by: flixos90's profile flixos90 Owned by: swissspidy's profile swissspidy
Milestone: 6.6 Priority: normal
Severity: normal Version:
Component: Build/Test Tools Keywords: has-patch has-unit-tests
Focuses: performance Cc:

Description

Similar to #59815: Whether or not a persistent object cache is used makes a major difference on server-side performance of a WordPress site. While there are several factors impacting server-side performance and we cannot benchmark for every single one of them, object caching is a quite significant one and therefore should probably be covered, similar to internationalization.

It is common for certain performance optimizations to only apply for sites with a persistent object cache, so we should be able to monitor the impact of such changes over time.

Change History (23)

#1 @flixos90
8 months ago

cc @swissspidy @joemcgill

#2 @flixos90
8 months ago

  • Keywords 2nd-opinion added

So far, the different performance benchmark scenarios all simply depend on different configuration of the WordPress site. However, this one is slightly different as it ties into the server stack.

I'm thinking of two alternative options we have here:

  • Either we set up separate environments, one with object caching and one without. This could for example be done by restructuring the performance.yml workflow to use a matrix, where e.g. every single scenario is its own entry and could be run in its own environment. That may be beneficial for better test separation and maintenance anyway, but it could also make the process slower or more costly overall.
  • Or we set up the single environment we already have with object caching enabled, but by default ensure the object-cache.php drop-in isn't placed in wp-content. That way, the object cache wouldn't be used so it shouldn't make a difference for all existing scenarios. We could then implement a way to place the file from within PHP (e.g. a REST endpoint just for this workflow), accompanied with a Playwright set of functions like enableObjectCache() and disableObjectCache().

I think the first one could be a bit cleaner, but it's also clearly a lot more work. If it's possible to place the drop-in via PHP in the GitHub workflow, I think that would be a much more straightforward solution not involving any refactoring.

Thoughts?

#3 @swissspidy
8 months ago

Related ticket & discussion: #58000

The exact setup really depends on what we want to test / what we hope to gain from these tests.

My quick 2 cents:

As for implementation details, the former shouldn't be a lot of work. A matrix is probably the easiest way to do it. You can then still mark some tests to not run or only run with the object cache is enabled.
The second approach would lead to more code repetition and worse readability, if we do it for every single test.

That said, a combination of both would provide most flexibility: use a matrix _and_ a way to enable/disable the object cache during the test.

#4 @joemcgill
8 months ago

I agree that we should use a matrix for this, which could also provide the opportunity to add other configurations, like PHP version, etc., in the future. Some of our other workflows in Core separate the task of running the workflow from the matrix that defines the inputs for the workflow, like .github/workflows/phpunit-tests.yml and .github/workflows/phpunit-tests-run.yml, where the former defines the matrixes and uses the latter to actually execute the tests. Something like this might be worth looking into as we broaden the complexity of the performance tests.

This ticket was mentioned in PR #5679 on WordPress/wordpress-develop by @swissspidy.


8 months ago
#5

  • Keywords has-patch has-unit-tests added; needs-patch removed

Trac ticket:

#6 @swissspidy
8 months ago

I discussed this a bit with @flixos90 and tried to further wrap my head around this.

Introducing a matrix will definitely unlock additional improvements to the test suite, though there are some things to consider. Some loose thoughts:

  • This is a great opportunity to remove a lot of duplicate code like e.g. *-l10n tests
  • Even the classic & block theme tests could be deduplicated, theoretically allowing us to even test all the default themes in the future
  • Theoretically even the admin & home tests could be deduplicated, as they are very similar.
  • Theoretically allows us to test more hardware configurations in the future (i.e. different PHP & MySQL versions) if needed
  • An alternative is to do this matrix in JS instead of the GHA (example).
    • Con: can't really change server configuration (e.g. PHP version).
    • Pro all tests happen within the same GHA runner
    • Pro: you can easily run the whole matrix locally.
  • We should use this opportunity to remove the hardcoded test suite names in the CLI scripts
  • Each matrix job should still do its own comparison with target commit and baseline version. This is to ensure the runs happen within the same environment.
  • The exact setup depends on whether we want/need to print and log results after the whole matrix has run, or if that can be done within a single matrix job.
    • For the former we need to upload an artifact in each job and then in another job merge them together. Not really difficult, but if we don't _have_ to do it, it will make things less complex.
    • Need to check what the current dashboard requires. For back compat we need to ensure we keep sending the same keys there that it expects.

Here's a simple POC I whipped up to see what's possible: https://github.com/WordPress/wordpress-develop/pull/5679
Workflow summary: https://github.com/WordPress/wordpress-develop/actions/runs/6907241045?pr=5679

As you can see, printing the results in each job leads to a lot of noise due to separated summaries, so printing only 1 summary after the matrix concluded is probably favorable.

This ticket was mentioned in Slack in #core-performance by swissspidy. View the logs.


8 months ago

This ticket was mentioned in Slack in #core-performance by mukeshpanchal27. View the logs.


6 months ago

#9 @swissspidy
6 months ago

  • Owner changed from flixos90 to swissspidy

This ticket was mentioned in Slack in #core-performance by mukeshpanchal27. View the logs.


6 months ago

#11 @swissspidy
5 months ago

  • Type changed from enhancement to task (blessed)

This ticket was mentioned in Slack in #core-performance by mukeshpanchal27. View the logs.


5 months ago

This ticket was mentioned in Slack in #core-performance by mukeshpanchal27. View the logs.


5 months ago

#14 @swissspidy
5 months ago

  • Milestone changed from 6.5 to 6.6

This ticket was mentioned in Slack in #core-performance by mukeshpanchal27. View the logs.


3 months ago

#16 @swissspidy
3 months ago

  • Keywords 2nd-opinion removed

@swissspidy commented on PR #5679:


3 months ago
#17

Do we need to make any updates to the codevitals.run/project/wordpress dashboard to be able to collect some of these new metrics (e.g., DB calls, additional themes, etc.) over time or adjust anything else about how these are now being collected?

I updated the log script so that the existing metrics should be collected as usual. Now that we finally collect memory usage and server-timing in the admin, those fields, which were previously empty, can finally be displayed properly too.

For tracking things like db queries, changes to the dashboard are indeed necessary, but I usually handle this on Slack directly with Riad :)

That said, the dashboard really doesn't scale for that many metrics. So if we want to track all the default themes etc. then we need to improve/switch the dashboard first.

@youknowriad commented on PR #5679:


3 months ago
#18

I agree that the UI of the dashboard is not meant to handle a huge list of metrics. I don't mind much If we think there are alternative dashboards that we can switch to without too much hassle.

If not, we can also implement this issue to have like "main metrics" and "all metrics" pages per project or something.

@desrosj commented on PR #5679:


3 months ago
#19

I also think we may want to consider breaking this workflow into two separate workflows:

@joemcgill I'm working on something like this in #6232 for all workflows because there are huge benefits for maintainability. Let's overhaul the workflow first, and then work on splitting it into a callable workflow there.

#20 @swissspidy
2 months ago

  • Resolution set to fixed
  • Status changed from assigned to closed

In 58076:

Build/Test Tools: Overhaul performance tests to improve stability and cover more scenarios.

Simplifies the tests setup by leveraging a test matrix, improving maintenance and making it much easier to test more scenarios. With this change, tests are now also run with an external object cache (Memcached). Additional information such as memory usage and the number of database queries is now collected as well.

Improves test setup and cleanup by disabling external HTTP requests and cron for the tests, as well as deleting expired transients and flushing the cache in-between. This should aid the test stability.

When testing the previous commit / target branch, this now leverages the already built artifact from the build process workflow. Raw test results are now also uploaded as artifacts to aid debugging.

Props swissspidy, adamsilverstein, joemcgill, mukesh27, desrosj, youknowriad, flixos90.
Fixes #59900

#22 @swissspidy
2 months ago

In 58077:

Build/Test Tools: Fix performance tests logging script after [58076].

Removes some unintended debug cruft, whoops!

See #59900.

This ticket was mentioned in Slack in #core-performance by joemcgill. View the logs.


2 months ago

Note: See TracTickets for help on using tickets.