Make WordPress Core

Opened 5 weeks ago

Closed 4 weeks ago

Last modified 4 weeks ago

#62153 closed defect (bug) (fixed)

Fix recording Performance Test results

Reported by: joemcgill's profile joemcgill Owned by: flixos90's profile flixos90
Milestone: 6.7 Priority: high
Severity: normal Version: 6.6
Component: Build/Test Tools Keywords: has-patch has-unit-tests fixed-major
Focuses: performance Cc:

Description

The automated Performance Tests that run on each individual commit send the recorded metrics to the WordPress Code Vitals dashboard so we can track the performance change over time. However, these metrics stopped being recorded since [58165].

Ideally, we could manually recover the delta from when these metrics stopped being shipped to the dashboard. If not, we should at minimum fix the connection so these results can begin being recorded again.

Change History (37)

This ticket was mentioned in Slack in #core-performance by joemcgill. View the logs.


5 weeks ago

#2 @swissspidy
5 weeks ago

Wow, I guess we didn't look at that dashboard in quite some time.

There's a AggregateError [ECONNREFUSED] error in the log-results.js script for some reason, see https://github.com/WordPress/wordpress-develop/actions/runs/11131505615/job/30933229744#step:41:23. It didn't cause the step to fail, hence we didn't notice. Such an error should really have an exit code > 0.

#3 @adamsilverstein
5 weeks ago

Wow, I guess we didn't look at that dashboard in quite some time.

Even looking, it is a bit hard to tell looking at the dashboard that it isn't getting updated. It would be nice to see dates for the commits or the last update date on the chart page itself.

#4 @flixos90
5 weeks ago

@desrosj Looking at [58165], it looks like the change was made in an effort to rely on reusable workflows, which makes sense.

That said, given that the breakage of the dashboard is a high priority issue, I wonder whether we could as a first attempt revert the performance.yml changes from [58165] and commit that if it solves the problem.

We could then open a follow up ticket to reinstate the approach using reusable workflows in a way that works, which will give us more time to actually find the underlying problem and address it, e.g. during the 6.8 cycle.

This ticket was mentioned in PR #7490 on WordPress/wordpress-develop by @flixos90.


5 weeks ago
#5

  • Keywords has-patch added

For reference, see https://docs.github.com/en/actions/sharing-automations/reusing-workflows#using-inputs-and-secrets-in-a-reusable-workflow

Trac ticket: https://core.trac.wordpress.org/ticket/62153

#6 @flixos90
5 weeks ago

  • Owner set to flixos90
  • Status changed from new to assigned

Please ignore my previous comment. I took a closer look and I believe I found the problem: Secrets aren't automatically inherited in reusable workflows, but we're not passing the crucial CODEVITALS_PROJECT_TOKEN to the reusable performance workflow.

I believe that's the issue, but let's validate. I opened https://github.com/WordPress/wordpress-develop/pull/7490 with a fix.

#7 @joemcgill
5 weeks ago

Thanks for investigating, @flixos90! I suspected that the CODEVITALS_PROJECT_TOKEN was likely the issue but hadn't been able to verify.

@flixos90 commented on PR #7490:


5 weeks ago
#8

@desrosj @joemcgill @mukeshpanchal27 With my initial fix, the performance workflow now fails (see https://github.com/WordPress/wordpress-develop/actions/runs/11167077994), however this is not really related and it only happens because performance.yml always uses reusable-performance.yml from the trunk branch - which here does not work since I'm also making changes to reusable-performance.yml.

So in https://github.com/WordPress/wordpress-develop/pull/7490/commits/2ad3672c176f4b4de126493b880c2acb09a361d6 I temporarily change that so that both workflows come from this PR branch. The relevant workflow run https://github.com/WordPress/wordpress-develop/actions/runs/11167231240/job/31042991530?pr=7490 passes, however since data for pull requests is never sent to the dashboard, it doesn't allow validating that the problem is fixed.

But we can use the workflow logs to find that out. Take a look at any recent Core commit run (e.g. https://github.com/WordPress/wordpress-develop/actions/runs/11153662921/job/31001600641#step:41:20) and you'll see that in those the env.CODEVITALS_PROJECT_TOKEN logging shows an empty value / nothing, but it should be shown *** instead (used to keep the value secret).

So in https://github.com/WordPress/wordpress-develop/pull/7490/commits/7cc4f3a6f8c609c5d91cbe967081cc47896cd5e1 I temporarily added the secret as env in another step that always runs, just so we can inspect that it is correctly available via the logs. The relevant workflow run however shows that it's still empty (see https://github.com/WordPress/wordpress-develop/actions/runs/11167384998/job/31043488494?pr=7490#step:36:22) so that didn't seem to work.

That's my progress update so far, I'm continuing to experiment...

@flixos90 commented on PR #7490:


5 weeks ago
#9

Turns out I didn't think about that this PR is running in my fork. 🙃 So the only reason there was no CODEVITALS_PROJECT_TOKEN available was that my fork didn't have it configured (see e.g. https://github.com/WordPress/wordpress-develop/actions/runs/11167940794/job/31045233520?pr=7490#step:36:22).

I now configured a test secret in my fork and pushed another commit. Here's the workflow run for that, using the secrets: inherit approach, but there's still no secret: https://github.com/WordPress/wordpress-develop/actions/runs/11168272117/job/31046297639?pr=7490#step:36:22 🤔

So I went back and in https://github.com/WordPress/wordpress-develop/pull/7490/commits/aaf748761cd091f1a82e1aa11aeec52b01ae0c55 tried to use the more explicit approach of requiring specific secrets, which is probably a safer choice anyway. It also makes the reusable workflows easier to understand, by having the list of relevant secrets on top of the workflow file. Here's the relevant workflow run, but it still doesn't show: https://github.com/WordPress/wordpress-develop/actions/runs/11168437399/job/31046822425?pr=7490#step:36:22

At this point, I'm not sure how to best debug this. Maybe it worked all along in my commits and my approach of testing it is just wrong. I tried both of the documented approaches from https://docs.github.com/en/actions/sharing-automations/reusing-workflows#passing-inputs-and-secrets-to-a-reusable-workflow and https://docs.github.com/en/actions/sharing-automations/reusing-workflows#passing-secrets-to-nested-workflows, so I would expect one of them to work. Not sure what's wrong here... 🤔

@desrosj @joemcgill @mukeshpanchal27 Any additional ideas and testing would be much appreciated.

This ticket was mentioned in PR #7492 on WordPress/wordpress-develop by @flixos90.


5 weeks ago
#10

  • Keywords has-unit-tests added

This is the same as #7490, but opened temporarily within the same repository, for debugging, as there are some limitations on forks in regards to GH Actions.

I'm aware that this PR and branch will be auto-deleted, but it may help even just temporarily.

Trac ticket: https://core.trac.wordpress.org/ticket/62153

@flixos90 commented on PR #7492:


5 weeks ago
#12

Closing this as the branch was already auto-deleted, will open another temp PR to validate this once more.

This ticket was mentioned in PR #7493 on WordPress/wordpress-develop by @flixos90.


5 weeks ago
#13

This is the same as #7492, a temporary PR so that we don't have to use a fork.

See #7490 for the source of truth PR.

This PR has one additional commit on top of #7492, that temporarily reverts the fix, so that we can validate that the fix actually fixed the problem. :)

Trac ticket: https://core.trac.wordpress.org/ticket/62153

@flixos90 commented on PR #7493:


5 weeks ago
#15

Yep!

@flixos90 commented on PR #7490:


5 weeks ago
#16

@desrosj @joemcgill @mukeshpanchal27 I have temporarily opened #7492, to check whether the fact that it's not working was only due to this being a fork. Looks like that's it!

Please see the workflow run of the same code in #7492: https://github.com/WordPress/wordpress-develop/actions/runs/11169200148/job/31049367116?pr=7492#step:36:22

It confirms that the secret is now recognized in the reusable workflow as expected.

I opened yet another temporary follow-up #7493 (simply because by the time I wanted to test this the branch from #7492 had already been auto-deleted): In this PR I temporarily removed the fix, so that we can check that the workflow run for that code again no longer has access to the secret. And the workflow run _does_ confirm it, it indeed no longer has access to the secret: https://github.com/WordPress/wordpress-develop/actions/runs/11169595371/job/31050658243?pr=7493#step:36:22

As such, we now validated that the fix is working. I will remove the temporary workarounds from this PR so that it can be properly reviewed.

@joemcgill commented on PR #7490:


5 weeks ago
#17

Looks like running this from a branch of the WordPress org works: https://github.com/WordPress/wordpress-develop/actions/runs/11169200148/job/31049368386?pr=7492#step:36:22

And removing the fix shows that the env variable is not set: https://github.com/WordPress/wordpress-develop/actions/runs/11169595371/job/31050658243?pr=7493#step:36:22

Since we only care about this on actual commits (not PRs) this should do the job. I think it's worth updating the workflow to point to trunk again and testing a real commit.

@flixos90 commented on PR #7490:


5 weeks ago
#18

@joemcgill

Since we only care about this on actual commits (not PRs) this should do the job. I think it's worth updating the workflow to point to trunk again and testing a real commit.

That's not possible simply because this is a PR. If I change the reusable workflow reference back to @trunk, this PR's performance workflow will fail because the trunk version of reusable-performance.yml is missing the relevant changes.

We need to rely on the validation done before, and we will only have 100% validation once this is actually committed to Core.

@joemcgill commented on PR #7490:


5 weeks ago
#19

When the commit is made, then the workflow in trunk will be updated. If it's still failing, then we can either revert that commit or do a follow-up, but if it passes (which seems likely given the testing already done) then no follow-up will be necessary.

@flixos90 commented on PR #7490:


5 weeks ago
#20

Marking as approved.

@felixarntz[[Image(chrome-extension://hgomfjikakokcbkjlfgodhklifiplmpg/images/wp-logo.png)]] I assume you'll prep the commit and can handle whether to revert the changes to jobs.performance.uses in .github/workflows/performance.yml or to do so in a follow-up.

Yes, I'll make sure to take care of this. Will wait for another approval before I commit.

#21 @flixos90
5 weeks ago

  • Resolution set to fixed
  • Status changed from assigned to closed

In 59170:

Build/Test Tools: Resolve access failure to continue sending commit performance data to Code Vitals dashboard.

This originally broke in [58165] and unfortunately went unnoticed for a while because the failing request to send the data did not cause the GitHub workflows to fail. This changeset resolves the underlying access problem, which was happening because reusable GitHub workflows do not automatically receive secrets from the calling workflow. More concretely, the relevant CODEVITALS_PROJECT_TOKEN was not being explicitly passed to the reusable workflow.

The changeset also includes a change so that in the future a failing request would cause the workflow to fail, which ensures a similar problem further down the road wouldn't go unnoticed.

Props joemcgill, flixos90, swissspidy, mukesh27, sergeybiryukov
Fixes #62153.
See #61213.

@flixos90 commented on PR #7490:


5 weeks ago
#22

Committed in https://core.trac.wordpress.org/changeset/59170. I'll monitor the workflow.

#23 @flixos90
5 weeks ago

I can confirm the commit resolved the problem.

The workflow sent the commit data successfully: https://github.com/WordPress/wordpress-develop/actions/runs/11183467868/job/31092115383#step:41:24

That commit also shows up in https://www.codevitals.run/project/wordpress as expected.

Going forward, the workflow should fail, should this request ever fail again in the future, so that we'll find out about it right away.

#24 @desrosj
4 weeks ago

  • Keywords fixed-major added
  • Resolution fixed deleted
  • Status changed from closed to reopened

It looks like we'll need to backport the related changes to older branches still running the performance tests. Here are a few failing runs:

Does this make sense? Or should we stop running this workflow for these branches?

#26 @joemcgill
4 weeks ago

Good point @desrosj, We really only need the new env variable when sending the data to the dashboard for commits to trunk, so I wonder if we could update the reusable test so that the CODEVITALS_PROJECT_TOKEN input is not required. The only step that uses that value already has a conditional to only run on trunk so this should be fine.

I've opened a new PR that makes this change.

#27 @joemcgill
4 weeks ago

  • Resolution set to fixed
  • Status changed from reopened to closed

In 59214:

Build/Test Tools: Fix Performance GH workflows on release branches.

This fixes an issue after [59170] that was causing the Performance release workflows to fail on older branches since the CODEVITALS_PROJECT_TOKEN input value was marked as required but was not being passed.

Fixes #62153.
Props desrosj, joemcgill, flixos90.

#28 @joemcgill
4 weeks ago

In 59215:

Build/Test Tools: Revert [59214].

The commit caused the Performance Tests workflow to fail due to an invalid workflow file.

See #62153.
Unprops @joemcgill.

#29 @joemcgill
4 weeks ago

  • Resolution fixed deleted
  • Status changed from closed to reopened

Reopening, since [59214] caused an invalid workflow error in the reusable workflow which needed to be reverted so the Performance Test workflow would continue to run on trunk.

@joemcgill commented on PR #7550:


4 weeks ago
#30

@felixarntz and @desrosj. I _think_ that 04396b9 should make this work, but am curious if there's a better way to test and validate this before making another commit?

@flixos90 commented on PR #7550:


4 weeks ago
#31

@joemcgill This is probably okay as is? I'd say, just like for the other fix, monitor after the commit that the workflows complete successfully and that the data is sent. If not, we can follow up.

@swissspidy commented on PR #7550:


4 weeks ago
#32

You can test this against the trunk branch of your fork, no need to commit to find out.

@flixos90 commented on PR #7550:


4 weeks ago
#33

You can test this against the trunk branch of your fork, no need to commit to find out.

There are still limitations with that since forks don't have access to the project secrets. And even adding that secret to your own fork won't solve that, per what I tried in #7490. At the end of the day, it looks like this kind of stuff is unfortunately not fully testable until you commit it to trunk.

@joemcgill commented on PR #7550:


4 weeks ago
#34

You can test this against the trunk branch of your fork, no need to commit to find out.

Ran a quick test on my fork, and while it fails for unrelated reasons, it confirms that the workflow is valid and will run.

#35 @joemcgill
4 weeks ago

  • Resolution set to fixed
  • Status changed from reopened to closed

In 59217:

Build/Test Tools: Fix Performance Test workflows on release branches.

This fixes an issue after [59170] that was causing the Performance Test workflows to fail on older branches since the CODEVITALS_PROJECT_TOKEN input value was marked as required but was not being passed.

This is a follow-up to [59214] and [59215].

Fixes #62153.
Props desrosj, joemcgill, flixos90, swissspidy.

@joemcgill commented on PR #7550:


4 weeks ago
#36

Committed in https://core.trac.wordpress.org/changeset/59217 and confirmed that the commit showed up in the Code Vitals dashboard. @desrosj can you keep an eye on this next time there is a need to backport a commit to a release branch?

@desrosj commented on PR #7550:


4 weeks ago
#37

Because the reusable part of the workflow is run from trunk and there's no changes required to the calling workflow in the branches that failed, rerunning the workflow will use the updated version committed in Core-50217 as long as you rerun the entire workflow (rerunning only failed jobs will always use the same version of the file used in the other jobs within the workflow).

I've gone and rerun the three failures from the other day and they are no longer experiencing the problem. 🎉 You'll notice one is still failing, but it's due to an unrelated problem (the built artifact from the previous run of the workflow expired is no longer available).

Note: See TracTickets for help on using tickets.