#56407 closed enhancement (fixed)
Rerun GitHub Action workflows on the first failure
Reported by: | desrosj | Owned by: | desrosj |
---|---|---|---|
Milestone: | 6.1 | Priority: | normal |
Severity: | normal | Version: | |
Component: | Build/Test Tools | Keywords: | has-patch commit |
Focuses: | Cc: |
Description
It's fairly common for GitHub Action workflows to fail because of several different types of errors. Some examples include:
- Chromium timeout errors
- WordPress.org hiccups when making requests within PHP unit tests
- NPM install errors
- Docker container registry timeout and connection issues.
In almost all cases (except when there are service level outages), rerunning the workflow will result in a successful outcome. Because this is currently manual, it's sometimes forgotten.
Our Actions workflows should attempt to fix these runs by automatically rerunning failed jobs.
Attachments (1)
Change History (32)
This ticket was mentioned in PR #3111 on WordPress/wordpress-develop by desrosj.
2 years ago
#1
2 years ago
#2
There is some inconsistency, where `github.rest.actions.createWorkflowDispatch` is sometimes called with `const result = await` and sometimes just called on its own. I think one should be picked and used throughout.
Ah, good spot. This was a relic from when I was debugging why the `workflow_dispatch` call was not working in my fork.
One thing that is not clear to me reading through the action's documentation is whether `github-script` will cause a step or job to fail if a non-2xx response code is returned by an API call. If not, using `await` and checking the response for a problem manually would be preferred. But for now I have removed the one instance of that in the coding standards workflow.
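The "await and check the response manually" approach mentioned above could look roughly like the following. This is a minimal sketch, not the code from the patch: `dispatchOrFail`, `workflowId`, and `ref` are hypothetical names, and `github` is assumed to be the authenticated client that `github-script` hands to the script. Octokit's request layer normally rejects on 4xx/5xx responses, so the sketch handles both a thrown error and an unexpected status.

```javascript
// Sketch only: dispatch a workflow and fail loudly if the API call does not succeed.
// `github` is assumed to be the Octokit client provided by actions/github-script.
// `dispatchOrFail`, `workflowId`, and `ref` are hypothetical names for illustration.
async function dispatchOrFail(github, { owner, repo, workflowId, ref }) {
  let response;
  try {
    response = await github.rest.actions.createWorkflowDispatch({
      owner,
      repo,
      workflow_id: workflowId,
      ref,
    });
  } catch (error) {
    // Octokit normally rejects on 4xx/5xx responses; rethrow with some context
    // so the step (and therefore the job) is marked as failed.
    throw new Error(`Workflow dispatch request failed: ${error.message}`);
  }

  // The REST documentation lists 204 as the success status for this endpoint.
  if (response.status !== 204) {
    throw new Error(`Unexpected status ${response.status} from workflow dispatch`);
  }
  return response.status;
}
```

An uncaught error thrown from the script is what makes the step fail, so wrapping the call this way makes the failure mode explicit rather than relying on the action's defaults.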
#5
2 years ago
- Resolution fixed deleted
- Status changed from closed to reopened
The workflow itself works when called manually with a cancelled or failed workflow run ID (see this test), but for some reason the step is being skipped in other workflows. It worked in the test repository I made on GitHub, so I'm not sure what is missing.
Will dig in.
#6
2 years ago
- Keywords commit added
Turns out an incorrect variable was being used. `github.run_number` (a unique number for each run of a particular workflow, which does not change when the run is re-run) is not the same as `github.run_attempt` (the attempt number for a given workflow run, which increments with each re-run). The attempt is the correct one here.
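To illustrate the difference, here is a minimal sketch of a "first attempt only" check. It reads the default environment variables `GITHUB_RUN_ATTEMPT` and `GITHUB_RUN_NUMBER`, which mirror `github.run_attempt` and `github.run_number`; the helper name `isFirstAttempt` is hypothetical and not taken from the patch.

```javascript
// Sketch only: decide whether a run is on its first attempt.
// GITHUB_RUN_ATTEMPT mirrors github.run_attempt (1 on the first attempt,
// incremented on every re-run). GITHUB_RUN_NUMBER mirrors github.run_number
// and is deliberately NOT used here, because it counts runs of the workflow
// over time and stays the same across re-runs.
// `isFirstAttempt` is a hypothetical helper name.
function isFirstAttempt(env) {
  const attempt = Number.parseInt(env.GITHUB_RUN_ATTEMPT ?? "", 10);
  return attempt === 1;
}

// Example: the 57th run of a workflow, on its first attempt.
console.log(isFirstAttempt({ GITHUB_RUN_NUMBER: "57", GITHUB_RUN_ATTEMPT: "1" }));
```

A check against the run number would only ever pass for the very first run of a workflow, which would be consistent with the rerun step being skipped almost everywhere.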
#8
2 years ago
Hi @desrosj, thanks for your work on this! Cool idea!
To have it documented: Why is the `GHA_WORKFLOW_DISPATCH` secret needed over e.g. the default `GITHUB_TOKEN`, and what permission scope does it need? Thanks!
#9
2 years ago
Great question, @TobiasBg!
I definitely wanted to just use `GITHUB_TOKEN` instead. But in my testing, I found that `GITHUB_TOKEN` does not have the required permissions to modify workflow runs through the REST API.
The documentation for creating a workflow dispatch event mentions `actions:write` is required, but that specifically mentions GitHub Apps.
GitHub Actions does support specifying custom permissions in a workflow through `permissions` at the top or job level, but in my testing, even specifying `permissions: write-all` had no effect.
It can also be a bit hard to get to the bottom of, because calling `github.rest.actions.createWorkflowDispatch()` with a token lacking the required permissions still returns a `204` status (the one documented as expected). But eventually, I found the right documentation on this here.
When you use the repository's `GITHUB_TOKEN` to perform tasks, events triggered by the `GITHUB_TOKEN` will not create a new workflow run. This prevents you from accidentally creating recursive workflow runs... If you do want to trigger a workflow from within a workflow run, you can use a personal access token instead of `GITHUB_TOKEN` to trigger events that require a token.
So it seems that this is an intentional design decision to prevent user error.
#10
2 years ago
This came through today: https://github.blog/changelog/2022-09-08-github-actions-use-github_token-with-workflow_dispatch-and-repository_dispatch/. Looks like using the supplied token is now allowed! I’ll update the workflows through #55652.
#11
2 years ago
- Resolution fixed deleted
- Status changed from closed to reopened
Reopening to add the `enable_debug_logging` parameter when restarting a workflow. See https://docs.github.com/en/rest/actions/workflow-runs#re-run-failed-jobs-from-a-workflow-run.
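As a rough illustration of that change, a rerun call with debug logging enabled might look like the sketch below. It assumes a `github-script` version whose bundled Octokit exposes `rest.actions.reRunWorkflowFailedJobs()`; the wrapper name `rerunFailedJobs` and its `runId` and `debug` options are hypothetical and not taken from the committed workflow.

```javascript
// Sketch only: re-run the failed jobs of a workflow run, optionally with
// debug logging turned on for the new attempt.
// `github` is assumed to be the Octokit client provided by actions/github-script.
// `rerunFailedJobs`, `runId`, and `debug` are hypothetical names for illustration.
async function rerunFailedJobs(github, { owner, repo, runId, debug = false }) {
  const response = await github.rest.actions.reRunWorkflowFailedJobs({
    owner,
    repo,
    run_id: runId,
    // Maps to the `enable_debug_logging` body parameter of the REST endpoint.
    enable_debug_logging: debug,
  });
  return response.status;
}
```

Enabling debug logging only on the rerun keeps first attempts quiet while making the second attempt, which is the one most likely to need investigation, more verbose.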
This workflow will restart a failed or cancelled workflow when it has only been run once.
This is dependent on actions/github-script#283, which will bring in the needed `.rest.actions.reRunWorkflowFailedJobs()` function added in octokit/plugin-rest-endpoint-methods.js@v5.14.0.
Trac ticket: https://core.trac.wordpress.org/ticket/56407