#62148 closed task (blessed) (fixed)
Add Twenty Twenty-Five to the Performance Tests
| Reported by: | joemcgill | Owned by: | pbearne |
|---|---|---|---|
| Milestone: | 6.8 | Priority: | normal |
| Severity: | normal | Version: | |
| Component: | Build/Test Tools | Keywords: | early has-patch |
| Focuses: | performance | Cc: | |
Description
During the release party for WP 6.7-beta1, it was discovered that adding Twenty Twenty-Five to the Performance Tests caused an error when the tests ran against the baseline version of WordPress, which that theme does not support. To address those failures, the theme was removed from the performance tests in [59151].
We should start collecting performance data for this new default theme, so we either need to conditionally choose the baseline version based on the theme being tested, or consider updating the baseline for all performance tests.
Change History (17)
This ticket was mentioned in Slack in #core by sergey. View the logs.
4 months ago
This ticket was mentioned in Slack in #core-performance by mukeshpanchal27. View the logs.
4 months ago
#4
@
4 months ago
- Keywords early added
- Milestone changed from 6.7 to 6.8
After discussion in the #core-performance bug scrub earlier today, we decided to push this to 6.8-early, since making any changes to the workflow that would impact the baselines would be risky until after the release. Even then, it will be important to keep in mind that all release branches use a reusable workflow from trunk, so changes after branching could still affect the 6.7 release.
This ticket was mentioned in PR #7688 on WordPress/wordpress-develop by @pbearne.
3 months ago
#5
- Keywords has-patch added; needs-patch removed
Changed the default value of the baseline version from 6.1.1 to 6.6.2 in the reusable-performance.yml workflow file. This ensures the workflow uses the latest WordPress release as the baseline for performance measurements.
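In practice, a change like this amounts to bumping a single default in the reusable workflow. A minimal sketch of what that could look like, assuming a `workflow_call` input named `BASELINE_VERSION` (the actual input name and surrounding structure of reusable-performance.yml may differ):

```yaml
# Hypothetical excerpt of .github/workflows/reusable-performance.yml.
# The real input name and layout may differ; only the default value changes.
on:
  workflow_call:
    inputs:
      BASELINE_VERSION:
        description: 'WordPress version to test against as the performance baseline.'
        required: false
        type: string
        default: '6.6.2' # previously '6.1.1'
```

Callers that pass an explicit baseline are unaffected; only runs relying on the default pick up the new version.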
This ticket was mentioned in Slack in #core-performance by mukeshpanchal27. View the logs.
2 months ago
This ticket was mentioned in Slack in #core-performance by joemcgill. View the logs.
7 weeks ago
@joemcgill commented on PR #7688:
6 weeks ago
#9
> I wonder whether we should bump this to 6.7 instead, now that it's out?

I think we should give this a try. We've not updated the baseline version since the performance tests were first introduced, so now would be a good time to try this prior to any code syncs being merged into the core repo for this release.
@flixos90 commented on PR #7688:
6 weeks ago
#10
@joemcgill
> I wonder whether we should bump this to 6.7 instead, now that it's out?
>
> I think we should give this a try. We've not updated the baseline version since the performance tests were first introduced, so now would be a good time to try this prior to any code syncs being merged into the core repo for this release.
We're still early enough in the 6.8 cycle IMO so that I think it would be reasonable to commit this now. I think it would be great to do so this week.
@swissspidy commented on PR #7688:
6 weeks ago
#12
@swissspidy commented on PR #7688:
6 weeks ago
#15
> Are there any changes required upstream where we submit the reports?

If we want to send the data there and display it there, then yes. We need to update tests/performance/log-results.js, and Riad will need to add it to the dashboard. But I think we should only do that when we add some sort of filter there...
#16
@
6 weeks ago
Just wanted to follow up with more context about the purpose of these baseline measurements, to clear up some potential confusion I noticed in the PR discussion about whether these should be bumped to 6.7.1 instead. Specifically:
> I think it should be fine to use 6.7 and not 6.7.1. After all, we're comparing performance between major releases, not minor ones. If we start with 6.7.1, it's possible (though unlikely) that a positive performance change in 6.7.1 would not be included in the next report.
In this workflow, the whole purpose of including a baseline test is to normalize the data when comparing these metrics for multiple commits over time, as we do when we log and display this data in the WordPress Code Vitals dashboard. The theory is that by collecting performance metrics for the same commit over time, we can control for the performance impact introduced by non-code factors, like differences in the performance of the GitHub workers, and get a better indication that a performance change was introduced by the specific code changes of a commit.
The way this works in practice is that when we record the performance data to the dashboard, the metrics for the latest commit get normalized based on the difference between the baseline metrics from that workflow run and the previously recorded baseline (the reference).
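To make that concrete, here is a small sketch of the kind of adjustment being described. This is purely illustrative: the metric name, the `normalize()` helper, and the ratio-based scaling are assumptions, not the actual logic in tests/performance/log-results.js or the dashboard.

```js
// Illustrative only: one plausible way to factor out runner variance using the
// baseline run, per the description above. Names and scaling are assumptions.

// Baseline (fixed WordPress version) metrics recorded when the reference was set.
const referenceBaseline = { timeToFirstByte: 50.0 };

// Baseline metrics measured in the current workflow run: same code and version,
// but possibly a slower or faster GitHub runner.
const currentBaseline = { timeToFirstByte: 55.0 };

// Metrics measured for the latest commit in the current workflow run.
const currentCommit = { timeToFirstByte: 110.0 };

// Scale each metric by how far the current runner deviates from the reference,
// so runner variance does not show up as a code-level regression.
function normalize( commitMetrics, baseline, reference ) {
	const normalized = {};
	for ( const [ key, value ] of Object.entries( commitMetrics ) ) {
		normalized[ key ] = value * ( reference[ key ] / baseline[ key ] );
	}
	return normalized;
}

console.log( normalize( currentCommit, currentBaseline, referenceBaseline ) );
// { timeToFirstByte: 100 } -- the roughly 10% slower runner is factored out.
```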
We've maintained the same baseline since this was introduced so that our normalization stayed consistent, so it is expected that after this change the trends on the dashboard will take some time to even out (as you can already see when visiting the dashboard after this commit). Since we really only use this method as a tool to spot commits that may have introduced a performance change, I think starting fresh each release is likely fine: it only affects how we observe this data, not how it is measured, which can be confirmed by looking at the summary data displayed on the workflows before and after this change.
Looking into the performance tests failure for TT5:
It appears that the baseline version for performance tests is WP 6.1.1, whereas register_block_bindings_source() was added in [57375] / #60282 for WP 6.5.
The 6.1.1 baseline version was added in [55459] / #57687, back when trunk was 6.2. Should it be updated to 6.6.2 now?