Make WordPress Core

Opened 4 years ago

Closed 2 years ago

Last modified 2 years ago

#49783 closed enhancement (fixed)

PHPCS config: Persistent caching during CI

Reported by: johnbillion's profile johnbillion Owned by: desrosj's profile desrosj
Milestone: 5.9 Priority: normal
Severity: normal Version:
Component: Build/Test Tools Keywords: has-patch
Focuses: coding-standards Cc:

Description

Caching between linting runs with PHPCS was introduced in [45665]. We can go a step further and use a specific cache location and then instruct Travis CI to cache this between builds. This effectively reduces the linting time to a few seconds once the cache is warm.

Previously: #40539

Attachments (2)

49783.diff (1.2 KB) - added by johnbillion 4 years ago.
49783.2.diff (1.7 KB) - added by desrosj 4 years ago.

Download all attachments as: .zip

Change History (27)

@johnbillion
4 years ago

#1 @johnbillion
4 years ago

  • Keywords has-patch added; needs-patch removed

#2 @jrf
4 years ago

Just checking: if I remember correctly, the caching feature checks if files have been changed since the last run and only runs phpcs on those changed files.

What I'm not so sure about is, whether the cache gets invalidated when PHPCS and/or external standards used with it, has been updated (i.e. composer update) and whether the cache gets invalidated if any of the used rulesets (project + underlying) change.

This should really need be verified in detail as if that is not the case, only changed files would get the benefit of updated rulesets/sniffs/fixes etc and the cache would quickly contain a lot of outdated information.

#3 @johnbillion
4 years ago

@jrf Excellent point and I'm going to check that before making the change. I would expect the cache interface to consider that, but definitely worth verifying.

#4 @johnbillion
4 years ago

cc @jdgrimes

#5 @johnbillion
4 years ago

According to https://github.com/squizlabs/PHP_CodeSniffer/issues/530 it appears that --cache and --cache=<file> behave differently. The former means the cache will not be invalidated if the standard changes, but the latter means it will.

#6 @johnbillion
4 years ago

  • Keywords commit added

Okay I've verified locally that changing the standard does invalidate the cache. I tested:

  • Manually altering the rules in phpcs.xml.dist
  • Altering (specifically, downgrading) the wp-coding-standards/wpcs version
  • Applying a different standard via the --standard CLI flag

Anything else that might need testing?

#7 @jrf
4 years ago

@johnbillion I've ran some tests myself as well & had a look at the caching code. I concur with your conclusions.

The only thing I can think of which still needs addressing in the PHPCS caching code would be to take enabled PHP extensions into account, but that's not something we should be concerned about for this ticket.

I do wonder a little about the tests/cache/phpcs.json file name as it would suggest this has something to do with the (unit) tests, which it doesn't.

What about just letting it cache in the root directory where the ruleset lives as well to a file called .phpcs.cache ?

@desrosj
4 years ago

#8 @desrosj
4 years ago

  • Keywords commit removed

49783.2.diff changes the cache file name to .phpcs-cache.json and moves it to the root directory as suggested by @jrf. I experimented with using a cache directory in root, but it seems that PHPCS wants the directory to exist.

49783.2.diff also adds a .phpcompat-cache.json file to cache PHP compatibility scans.

In my testing, though, I could only get the compatibility cache file to be created and work correctly.

#9 @jrf
4 years ago

@desrosj Been working on some related things. I have a feeling that @johnbillion used a directory for the cache file to allow Travis to cache the file as it seems that there is no option to cache individual files, just directories.

With that in mind, using a /[.]cache/ directory off the project root may not be a bad idea.

This directory would need a .gitkeep file to ensure it exists and can be committed (as per John's earlier patch).
The cache files then need to be added to both .gitignore as well as .svnignore to make sure that the cache files themselves do not get committed.

Also: why is /phpcompat.xml added to .gitignore ? This is a user-specific override file, not a default file name, so should probably be in the user .git/info/exclude/ rather than the project .gitignore.
Or is there documentation/a make manual page which refers to using this as an overload file ?

#10 @desrosj
4 years ago

@jrf I added that because /phpcs.xml was present in the .gitignore already and assumed this was desired and missed when I introduced the new phpcompat.xml.dist file. If this is an incorrect assumption, I can remove that from the diff.

Also, I misread the documentation for the Travis config file, so that will need to be fixed. What if the cache folder is added to the root and the tests/cache folder is removed in favor of that one? It would consolidate all cache files in one place and not cause confusion.

#11 @jrf
4 years ago

I added that because /phpcs.xml was present in the .gitignore already and assumed this was desired and missed when I introduced the new phpcompat.xml.dist file. If this is an incorrect assumption, I can remove that from the diff.

@desrosj Yes, that assumption is incorrect.

To explain it: PHPCS will automatically pick up on a configuration file if it's called [.]phpcs.xml[.dist]. This means that if one of those file names is used, the user does not have to pass the --standard=... argument on the command-line.

Between the different file names, there is precedence, with .phpcs.xml having the highest and phpcs.xml.dist the lowest.
This allows for a project to have a phpcs.xml.dist file and an individual dev working on the project to have a local phpcs.xml file which will overrule the project file (but may include it) and will automatically be loaded as long as the --standard=... CLI argument is not used.
Using this "trick", devs can, for instance, have stricter settings for themselves, test new rules before adding them or ignore a particular issue (like Windows line endings) etc.

With that in mind, adding phpcs.xml and .phpcs.xml to the .gitignore file is best practice as that allows for devs using this automatic overload feature and prevents those local files from getting committed.

The phpcompat.xml.dist file is not a standard file name PHPCS will pick up on, so it will always have to be passed as an argument on the command-line to be used.
If a dev wants to locally overrule that file, they can name the local file whatever they want (safe for the above mentioned file names) as it won't get picked up automatically anyway and they will need to pass the filename of their local overload config file on the command-line via --standard=....

So, as there is no standard naming scheme for overloading the phpcompat.xml.dist file, adding an arbitrary name to the .gitignore, is, well, arbitrary, and this is probably better suited for the dev's own .git/info/exclude file.

What if the cache folder is added to the root and the tests/cache folder is removed in favor of that one? It would consolidate all cache files in one place and not cause confusion.

All in favour of that. To make it obvious its a CI/dev directory and does not contain project files, you may want to consider naming the directory /.cache/ as I suggested above.

#12 @jrf
4 years ago

The only thing I can think of which still needs addressing in the PHPCS caching code would be to take enabled PHP extensions into account, but that's not something we should be concerned about for this ticket.

FYI: I've opened a PR to address this: https://github.com/squizlabs/PHP_CodeSniffer/pull/2982

#13 @jrf
4 years ago

According to https://github.com/squizlabs/PHP_CodeSniffer/issues/530 it appears that --cache and --cache=<file> behave differently. The former means the cache will not be invalidated if the standard changes, but the latter means it will.

I've run some additional tests with this. Just adding the results here to document it.

When using --cache the cache does not get invalidated if the standard changes, instead a separate cache file for the files under the other standard is created.

So, in effect, the options actually behave the same when using them.

The difference is that when using a named cache file, you only have one cache file and it gets invalidated and rebuild completely when something has changed (either in the ruleset, the sniffs or PHPCS itself), while when using the --cache option, you have many different cache files depending on the options used in a run. When the same options are used however, an existing cache will be used.

#14 @jrf
4 years ago

FYI - I discovered another bug in the PHPCS caching. Only relevant for those people who use the --sniffs or --exclude CLI arguments.

Bug reported in PHPCS: https://github.com/squizlabs/PHP_CodeSniffer/issues/2992 and fix pulled: https://github.com/squizlabs/PHP_CodeSniffer/pull/2996

#15 @johnbillion
4 years ago

  • Milestone changed from 5.5 to Future Release

#16 @jrf
4 years ago

FYI: PR https://github.com/squizlabs/PHP_CodeSniffer/pull/2982 has been merged in the mean time and will be included in the PHPCS 3.5.6 release.

Issue https://github.com/squizlabs/PHP_CodeSniffer/issues/2992 has been confirmed, but needs a different fix than the one I initially pulled.

#17 @desrosj
3 years ago

  • Keywords needs-refresh added

The wheels are in motion to move from Travis to GitHub Actions (see #50401). We should turn our attention to how GitHub Actions can cache this between workflow runs. I'll add a to-do to look into this.

#18 @johnbillion
3 years ago

FYI I haven't found a great way to run this on GitHub Actions yet because GHA doesn't support always pulling in a cache file and simultaneously always updating it if there are changes, unlike Travis, because it uses the same cache key for fetching and saving the cache during a run. Unless this has changed recently, this means you need to pull in a somewhat stale cache.

Example: https://github.com/johnbillion/query-monitor/blob/fbe8124794d8b53edb3455c87ac39fc596b14fd4/.github/workflows/test.yml#L35-L44

#19 @johnbillion
3 years ago

  • Summary changed from PHPCS config: Persistent caching on Travis to PHPCS config: Persistent caching during CI

This ticket was mentioned in PR #1877 on WordPress/wordpress-develop by desrosj.


2 years ago
#20

  • Keywords needs-refresh removed

This adds the needed steps to store the PHPCS caches for coding standards and PHP compatibility scanning across workflow runs.

It uses the date of the previous Monday for the cache key to ensure that the cache is regularly refreshed. As the week progresses, some files will need to be manually scanned for each run as they are updated, but this should will not result in big difference, unless many PHP files are updated over the course of a week.

Trac ticket: https://core.trac.wordpress.org/ticket/49783

desrosj commented on PR #1877:


2 years ago
#21

Before adding caching in this PR, the PHPCS scan took just over a minute. After, it has been reduced by over 95% to less than 3 seconds.

The caching the PHP compatibility scanning resulted in similar reduction. It's not currently being shown in action runs for this PR because I believe there is a bad cache value. That cache will be flushed at some point today when the action runner's date changes into Monday and the key resets. I've tested and verified the improvement locally.

#22 @desrosj
2 years ago

  • Milestone changed from Future Release to 5.9

I've set up the PR above to implement caching in GitHub Actions across workflow runs. I think it's ready to go. A second set of eyes and final agreement on using a .cache directory in the project root would be appreciated.

Moving this to the 5.9 milestone so it can be wrapped up.

#23 @desrosj
2 years ago

  • Owner set to desrosj
  • Resolution set to fixed
  • Status changed from new to closed

In 52179:

Build/Test Tools: Cache the results of PHP_CodeSniffer across workflow runs.

When the PHP_CodeSniffer runs, it produces a cache file. When a cache file is present, only changed files are rescanned, making subsequent scans significantly faster.

This adds the needed steps to the corresponding GitHub Actions workflows to cache these files across runs. The cache keys include the date of the previous Monday to ensure that the cache is flushed at least weekly.

Since GitHub Action caches cannot be updated once created, the scans will take slightly longer as the week progresses and more PHP files are updated. The date within the cache key can be updated to purge twice weekly if the scan time starts to approach the current scan times.

This change also introduces a .cache directory for all caching files related to build/test tools.

Props johnbillion, jrf.
Fixes #49783.

This ticket was mentioned in Slack in #core by sergey. View the logs.


2 years ago

Note: See TracTickets for help on using tickets.