Make WordPress Core

Opened 5 months ago

Last modified 4 months ago

#63901 new enhancement

Add `AGENTS.md` for the project

Reported by: flixos90
Owned by:
Milestone: Awaiting Review
Priority: normal
Severity: normal
Version:
Component: Build/Test Tools
Keywords:
Focuses:
Cc:

Description

This ticket proposes including an AGENTS.md file in WordPress Core, to centrally provide context for AI-assisted coding tools and agentic solutions.

AGENTS.md is an emerging standard for a central LLM assistance file supported by many tools. It addresses the problem of having to favor specific tools, and of maintaining many different files with similar contents just to satisfy different popular tools.

AGENTS.md is already widely supported, as seen on the linked website. For tools without out-of-the-box support, it should be possible to configure the file manually as an additional context file, or to symlink it under another file name mandated by the respective tool.
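
As one illustration (the file name here is just an example, not something any specific tool mandates): a contributor whose tool reads only a file such as CLAUDE.md could run ln -s AGENTS.md CLAUDE.md in their local checkout, so that both names point to the same content.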

As for the contents of the file, I think we should approach this pragmatically. We will not be able to (nor should we) holistically cover everything, and we will need to come up with something that works well enough to start with, rather than being perfect. Only actual usage with the different AI coding assistants that individual contributors prefer will identify room for improvement, whether that is missing crucial context or existing content that confuses LLMs.

Change History (12)

#1 @flixos90
5 months ago

Here's a proposal for a high-level outline we could start with:

  1. Project brief: Describes what the project / repository is and does in 1-2 short paragraphs.
  2. High-level architecture: Outlines a few key concepts, design patterns, and philosophies for the project's architecture, potentially including a few sub sections. Could also cover aspects like directory structure.
  3. Development tooling and commands: Lists the most commonly useful development tooling and relevant commands, such as composer lint or npm run test:php.
  4. Coding standards and best practices: Clarifies coding standards and related best practices to follow, e.g. naming conventions, documentation guidelines etc.
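
To make that outline concrete, here is a rough sketch of what such a skeleton could look like. The headings mirror the four points above, the commands are the ones already mentioned, and everything else is placeholder wording to be refined:

# AGENTS.md

## Project brief

One or two short paragraphs describing what the repository is and does.

## High-level architecture

Key concepts, design patterns, and philosophies, plus an overview of the directory structure (e.g. src/ vs. build/).

## Development tooling and commands

- composer lint
- npm run test:php

## Coding standards and best practices

Naming conventions, documentation guidelines, and pointers to the WordPress Coding Standards.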

#2 @johnbillion
5 months ago

I've had good luck pointing agents to where they can find information such as how to run tests, rather than including it in the helper file. "See CONTRIBUTING.md for information on how to run tests", etc.

#3 @SirLouen
5 months ago

I think that this reopens the debate: Is wordpress-develop meant to be a fully-featured development environment? Or is it only meant for CI building purposes?

I have the same doubt when planning to add new features, like needing to test anything for the Mail component without having a tool available by default.

Personally, I'm in favor of adding dev tools of any type, maybe including this (this is technically not a tool, but it serves as one).

Although it might generate a lot of redundancy with other files, like contributing.md, as @johnbillion suggests. Obviously, keeping a file like this short and clear for an LLM will save some context credits.

PS: Btw, are we eager to promote the use of LLMs for Core?

Last edited 5 months ago by SirLouen

#4 @gziolo
5 months ago

Here's a proposal for a high-level outline we could start with:

That's a great overview of what could be included. At the same time, I share the general sentiment that, at this point, all these coding tools should be mature enough to understand a project's usual structure, process README.md, CONTRIBUTING.md, composer.json, and package.json, and synthesize that into their own entry in the context.

I'm not against including AGENTS.md if that improves the workflows. It's more of a remark that CONTRIBUTING.md should work for both usual contributors and agents. They could standardize sections in that file if there is a need for some agent-oriented information.

#5 @matt
5 months ago

"Are we eager to promote the use of LLMs for Core?"

Absolutely. We aim to make WordPress itself, as well as its plugins, themes, and extended ecosystem, more legible and easy to use with AI tools. This will enable us to harness the passion, talent, and creativity of WordPress contributors to explore and experiment with these tools, ultimately becoming more efficient in achieving our mission of democratizing publishing, making the web more open source, and enhancing the stability, performance, and security of all WordPress users.

Our founding ethos was fueled by web standards, interoperability, and hackability. This is today's version of that. There are tools available for free or pennies that give capabilities beyond what we could have imagined even five years ago; let's support that and see what happens. Let a thousand flowers bloom.

#6 @bordoni
5 months ago

  2. High-level architecture: Outlines a few key concepts, design patterns, and philosophies for the project's architecture, potentially including a few sub sections. Could also cover aspects like directory structure.

@flixos90 For bigger code bases, I've had really good success conveying bigger concepts by mentioning specific folders where the "Agent" can find .md files with more explanations and even code examples, following the Context7 pattern (https://context7.com/wordpress/gutenberg?topic=slotfill).

#7 @justlevine
5 months ago

My anecdotal experience aligns with @johnbillion 's and @gziolo 's comments:

A whatever.md / AGENTS.md is great when a project lacks documentation or tooling, but it can't compete with those "sources of truth", and in many cases it can cause LLM output to degrade, for example:

  • Across models, e.g. using absolutist language ("always", "never") is strongly recommended in GPT3.5/Claude Sonnet 3.7, but a footgun in more "sycophantic" models like GPT4o.
  • When the .md conflicts with the sources of truth, e.g. when told to "follow WordPress Coding Standards" but the agent keeps discovering noncompliant code (legacy in core, modern if we're talking other WordPress/* projects) or the lints keep failing.

I also want to remind folks how amorphous evaluating the efficacy of these early-stage experiments is. Taking a cue from Matt's Q&A (albeit in a different context), I think it's crucial to first lay out a plan to test/measure/iterate instead of just theory-crafting with our (albeit collectively experienced) gut. For example, we should be able to answer:

  • Is this (or any) AGENTS.md better or worse than no file at all?
  • Is this (or any) AGENTS.md better than a Directory Tree with some context comments and a link to existing documentation? (Or just the CONTRIBUTING.md, if it's already been optimized for both humans and agents.)
  • Is X version of the .md better or worse than whatever first version we decide to commit?

Otherwise we're just throwing seeds out of the car window shouting "bloom" in hopes something will catch hold and germinate; there are faster and more effective/impactful ways to start a garden. (I'm assuming the metaphor was intended literally and not as employed by Mao.)

This ticket was mentioned in Slack in #core-committers by westonruter. View the logs.

5 months ago

#9 follow-up: @jeremyfelt
4 months ago

I think adding AGENTS.md is a great idea, that we should start simple, and that while tools should generally be good at analyzing and picking up existing structure, it can be helpful to point them in the right initial direction.

FWIW, I've found success by maintaining context in a separate directory (like /docs) and then explicitly loading it through the main agent file (previously, CLAUDE.md).

# Agent context for the WordPress project

This is the root level context file for the open source project, WordPress.

## Project overview

Always load @CONTRIBUTING.md, @README.md, and @docs/architecture.md .... when starting a new session.

This can then be tested with a prompt like:

> What context have you loaded already? Please provide filenames.

I've loaded the following context files:

- /{HOME}/wordpress-develop/AGENTS.md
- /{HOME}/wordpress-develop/CONTRIBUTING.md
- /{HOME}/wordpress-develop/README.md
- /{HOME}/wordpress-develop/docs/architecture.md

IMO, this helps keep AGENTS.md clean and can allow for additional context to be designed more for people and agents.

This is also a good opportunity to revisit our existing documentation and improve it for the current state of the project. (e.g. code maintained in other repos, explanation of src/ and build/ directories, etc...)

#10 in reply to: ↑ 9 ; follow-up: @justlevine
4 months ago

Using this solely for illustrative purposes (I understand the specific wording isn't the focus 🙇):

FWIW, I've found success by maintaining context in a separate directory (like /docs) and then explicitly loading it through the main agent file (previously, CLAUDE.md).

# Agent context for the WordPress project

This is the root level context file for the open source project, WordPress.

## Project overview

Always load @CONTRIBUTING.md, @README.md, and @docs/architecture.md .... when starting a new session.

I want to repeat that GitHub Copilot explicitly recommends against using absolute language like "always".

From https://docs.github.com/en/copilot/how-tos/configure-custom-instructions/add-repository-instructions

You should also consider the size and complexity of your repository. The following types of instructions may work for a small repository with only a few contributors, but for a large and diverse repository, these may cause problems:

  • Requests to refer to external resources when formulating a response
  • Instructions to answer in a particular style
  • Requests to always respond with a certain level of detail

For example, the following instructions may not have the intended results:

Always conform to the coding styles defined in styleguide.md in repo my-org/my-repo when generating code.

Use @terminal when answering questions about Git.

Answer all questions in the style of a friendly colleague, using informal language.

Answer all questions in less than 1000 characters, and words of no more than 12 characters.

Does Claude Code or whatever still need absolute language to prevent it from ignoring our AGENTS.md and falling back to the built-in instruction set when the context window gets too large? Is GitHub's recommendation just as true when using GPT5, or only for the more sycophantic 4x models that are used by default?

I don't know. But I do feel that in most other contexts the bulk of core committers and leadership (yup, acutely aware of all my heroes I'm core-splaining to right now 😅) would strongly oppose adding such an opaque footgun to core. I mean, we won't even phpcbf legacy code because it might cause some diff headaches on old PRs, but we're cool with something that can actively degrade the contributor experience - while costing them money on wasted tokens! - with no explicit indicator or hint that it's a bug with the instructions and not e.g. Anthropic secretly rate limiting and using a worse model?


More direct feedback

Replying to jeremyfelt:

This can then be tested with a prompt like:

What context have you loaded already? Please provide filenames.

I think we need to test the _results_, i.e. the effect on the ability to generate compliant code or accurately answer questions about/navigate the codebase.

  1. A positive answer here doesn't prove those files are in context. It doesn't even prove that AGENTS.md is in the context (it may still be unsupported by the IDE), just that when asked the question the LLM was able to discover it and parrot back what's written there.
  2. Just because something is "in context" doesn't mean it's having a positive effect on the LLM output. A big part of the shift to subagents right now is the ability to only have the relevant info for the task.

#11 in reply to: ↑ 10 @jeremyfelt
4 months ago

Replying to justlevine:

I don't know. But I do feel that in most other contexts the bulk of core committers and leadership [...] would strongly oppose adding such an opaque footgun to core.

I think that's why starting simple is important. Focus on keeping documentation readable and useful to humans. See what happens when you tell the model to start with that. As time goes on, add documentation for tools that enhance the experience.

My previous example could be just the one line, and without "Always", but I don't think it hurts for AGENTS.md to also be targeted to humans.
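
For illustration, that softened single line (again, just a sketch) could read:

Load @CONTRIBUTING.md, @README.md, and @docs/architecture.md ... when starting a new session.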

  1. A positive answer here doesn't prove those files are in context.

Don't trust, verify. :) So far, in my test case of me, additional prompts react as expected, even with weird chains of context files I've set up.

  2. Just because something is "in context" doesn't mean it's having a positive effect on the LLM output.

This is much harder to measure, of course, and I'm not sure there's an automated way to test it.

Issue-specific user prompting matters more, but it at least feels helpful for there to be an entry point that provides a readable, structured overview so that the agent doesn't start attempting to parse a bunch of unrelated files into context immediately.

#12 @justlevine
4 months ago

Felt it important to come back and highlight that yet again a tool that was ostensibly supposed to improve results when working with an LLM is now being said to produce worse results vs traditional best practices that we already enforce. That doesn't mean that LLMs.txt, Structured Markup, MCP, or AGENTS.md is complete hype; it's just a reminder that we should test the results of adopting this tool at least as much as we would for any other.

As to why not go with our gut and iterate by trial and error, I'll remind everyone about July's METR report that showed that developers consistently felt AI-assisted coding tools sped them up, when the tools were actually making them take ~19% longer. There is a measurable dopamine influence involved in how we "experience" AI productivity gains, so it's pretty important we rely on something just a bit more concrete than vibes.

This is much harder to measure

We can use the preexisting test report flow, where we come up with a handful of prompts for a few different types of tasks/scenarios, and then people share their results (branch diff if it was a task or answer if it was a question, chat history, model, contents of their AGENTS.md file, etc.). We make a template so it's mindless to report, and we compare it to a baseline of no AGENTS.md and an AGENTS.md that is just a table of contents for existing documentation files.

It's a lower barrier to entry than contributing normal test reports because the tester doesn't even need specific WP or AI knowledge to prompt the LLM, nor to even evaluate the results, just report them. Plus it balances the presumed model skew based on who's starred this thread (which is probably the closest metaphor to traditional "environments" we have right now). We don't need a specific threshold on how many to collect, just a bare minimum of due diligence and a tangible feedback loop.
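
As a rough sketch of what such a template could capture (field names are only a suggestion and would need bikeshedding):

## AGENTS.md test report

- Prompt / task used:
- Task type (code change, question, codebase navigation):
- Tool and model:
- AGENTS.md variant (none / table-of-contents / full):
- Result (branch diff for a task, answer for a question):
- Chat history (link or attachment):
- Notes: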
