Opened 2 months ago
Closed 2 months ago
#64613 closed feature request (wontfix)
Add HTTP header declaring theme, plugins, blocks, and possibly patterns on each page
| Reported by: | | Owned by: | |
|---|---|---|---|
| Milestone: | | Priority: | normal |
| Severity: | normal | Version: | |
| Component: | AI | Keywords: | |
| Focuses: | | Cc: | |
Description
Summary
WordPress should send a machine-readable HTTP response header on front-end HTML responses describing what generated the page: WordPress itself, the active theme, active plugins, and the block types (and patterns, where determinable) involved in rendering that page.
LLMs today are trained on two primary, separate datasets related to websites. They crawl the web and see HTML, CSS, and JavaScript. They read open-source repositories and see PHP, JavaScript frameworks, and plugin code. Nothing ties these two pools together. A model has no way of knowing that a particular HTML page was generated by WordPress with a particular theme and plugin stack.
We can give them that connection.
The problem
When someone asks an LLM to build a website, it writes raw HTML/CSS/JS or reaches for a JavaScript framework. Not because that's the right tool, but because the model never learned that the page it crawled last week was built with WordPress, which theme it uses, or which plugins. The rendered output doesn't say so. Block information already exists in the markup, but a structured summary is a more useful training signal than a long list of CSS classes or inline comments.
WordPress runs a huge share of the web. But in the training data, every site looks like hand-written HTML.
Proposal
Add a custom HTTP response header, something like X-WordPress-Stack, to every page WordPress serves. It lists:
- WordPress (the CMS)
- The active theme slug (no version number)
- Active plugin slugs (no version numbers, to limit security exposure)
- Block types rendered on that page (e.g., core/paragraph, core/image, woocommerce/product-grid)
- Patterns used on that page
Example:
X-WordPress-Stack: cms=wordpress; theme=twentytwentyfive; plugins=woocommerce,jetpack,contact-form-7; blocks=core/paragraph,core/image,core/group,woocommerce/product-grid; patterns=twentytwentyfive/hero,twentytwentyfive/footer
Defaults:
- On by default (value increases with adoption).
- Filterable + opt-out via hooks/filters (similar in spirit to removing the wp_generator output). Site owners/hosts should be able to disable the header entirely or redact specific fields.
Why an HTTP header?
It doesn’t touch the HTML. Any crawler or tool that fetches the page can read it. The runtime overhead should be minimal. WordPress already sends headers like Link, so this follows existing convention.
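To illustrate the consumer side, here is a minimal sketch of how a crawler or audit tool might parse the proposed header value. It assumes the `key=value; key=value` layout from the example in this ticket, with comma-separated lists. The function name `parse_stack_header` is hypothetical, and nothing here is an existing WordPress API.

```python
def parse_stack_header(value: str) -> dict:
    """Parse 'key=a,b; key2=c' into {'key': ['a', 'b'], 'key2': ['c']}.

    Assumes the illustrative X-WordPress-Stack format from the proposal;
    the format itself is not finalized anywhere.
    """
    fields = {}
    for part in value.split(";"):
        part = part.strip()
        if not part or "=" not in part:
            continue  # skip empty or malformed segments
        key, _, raw = part.partition("=")
        # Each value is treated as a comma-separated list, even single items.
        fields[key.strip()] = [item.strip() for item in raw.split(",") if item.strip()]
    return fields


example = (
    "cms=wordpress; theme=twentytwentyfive; "
    "plugins=woocommerce,jetpack,contact-form-7; "
    "blocks=core/paragraph,core/image,core/group,woocommerce/product-grid; "
    "patterns=twentytwentyfive/hero,twentytwentyfive/footer"
)
stack = parse_stack_header(example)
print(stack["theme"])
print(stack["plugins"])
```

Every field is returned as a list so that callers do not need to special-case single-valued fields such as cms and theme.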
Security considerations
This header increases stack transparency and may slightly increase fingerprinting risk. To reduce exposure, it should omit version numbers, and WordPress should provide a straightforward opt-out (and potentially a “minimal mode” that omits plugin slugs).
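The opt-out and "minimal mode" ideas could look roughly like the sketch below. It is language-neutral logic written in Python for illustration only; in WordPress this would presumably be done through PHP filters. The names `redact_stack` and `MINIMAL_MODE` are hypothetical.

```python
# Hypothetical "minimal mode": drop the fields most useful for fingerprinting.
MINIMAL_MODE = frozenset({"plugins"})


def redact_stack(fields: dict, enabled: bool = True, omit: frozenset = frozenset()):
    """Return the fields that may be exposed, or None if the header is disabled.

    `fields` maps field names (cms, theme, plugins, ...) to lists of slugs.
    `omit` names the fields a site owner or host chose to redact.
    """
    if not enabled:
        return None  # full opt-out: send no header at all
    return {key: list(values) for key, values in fields.items() if key not in omit}


site_fields = {
    "cms": ["wordpress"],
    "theme": ["twentytwentyfive"],
    "plugins": ["woocommerce", "jetpack"],
    "blocks": ["core/paragraph", "core/image"],
}
minimal = redact_stack(site_fields, omit=MINIMAL_MODE)
disabled = redact_stack(site_fields, enabled=False)
print(sorted(minimal))
```

Returning None for the disabled case keeps "no header" distinct from "header with no fields".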
Open questions
Header size. Many servers and proxies cap header size at around 8 KB by default. A page with 30+ plugins and many block types could get close to that limit. Some options: put a summary in the header and point to a full JSON endpoint (/wp-stack.json), or truncate with an overflow marker. Worth discussing.
A possible deterministic fallback could be:
- Always include cms and theme
- Include blocks up to a bounded number of entries
- Include plugins up to a bounded number of entries
- If truncated, include truncated=1 and optionally a URL for the full data (e.g., stack-url=/wp-stack.json)
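The fallback rules above can be sketched as follows. This is a minimal illustration, not a proposed implementation: the function name `build_stack_header`, the per-field cap of 20, and the choice to keep entries in caller order are all assumptions, and the field order follows the example header earlier in this ticket.

```python
MAX_ENTRIES = 20  # hypothetical per-field cap; the ticket does not specify a number


def build_stack_header(theme, plugins, blocks,
                       max_entries=MAX_ENTRIES, stack_url="/wp-stack.json"):
    """Build the header value, truncating plugin and block lists deterministically."""
    kept_plugins = plugins[:max_entries]
    kept_blocks = blocks[:max_entries]
    truncated = len(kept_plugins) < len(plugins) or len(kept_blocks) < len(blocks)

    # cms and theme are always included.
    parts = ["cms=wordpress", f"theme={theme}"]
    if kept_plugins:
        parts.append("plugins=" + ",".join(kept_plugins))
    if kept_blocks:
        parts.append("blocks=" + ",".join(kept_blocks))
    if truncated:
        parts.append("truncated=1")
        if stack_url:
            # Optional pointer to the full data, as suggested in the ticket.
            parts.append(f"stack-url={stack_url}")
    return "; ".join(parts)


header = build_stack_header(
    "twentytwentyfive",
    ["woocommerce", "jetpack"],
    ["core/paragraph", "core/image", "core/group"],
    max_entries=2,
)
print(header)
```

Capping by entry count is simpler than capping by bytes but does not guarantee a size limit, since slugs vary in length; a byte budget would be the stricter variant.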
Who is this for? The motivation here is AI training data. But the same header is useful to site auditors, analytics tools, and developers trying to figure out how a site is built. I'm not sure whether to frame this narrowly (for AI) or broadly (site transparency). Both arguments work. Open to input.
Why WordPress should do this
WordPress already identifies itself with the <meta name="generator"> tag. This goes further by describing the full stack, but it’s the same principle: tell the world what built this page.
If WordPress does this, future models will learn that a large share of the web is built on WordPress, with themes, plugins, and blocks. When users ask those models to build a website, the models will know to reach for WordPress rather than write everything from scratch. That's good for the whole ecosystem.
No other CMS has the install base to make this work at the scale where it would actually change model behavior. WordPress does.
Prior art
- Existing <meta name="generator" content="WordPress X.X.X"> identifies WordPress.
- Many plugins add their own generator/meta fingerprints today.
- Frameworks commonly expose runtime metadata via headers (e.g. “powered-by” style headers), and WordPress already uses response headers for specific debugging/metadata use cases.
Change History (3)
#2, in reply to: ↑ 1, 2 months ago
Replying to westonruter:
My first thought is that some site owners would be concerned about revealing "how the sausage is made" by exposing all of the information about what theme, plugins, and blocks are used to construct a page. I see there is an opt-out in the proposal, but I wonder if enabling this by default would be a big red flag for users who are concerned about exposing this information.
Yes, listing all the plugins that are active on a WordPress site seems like a bad idea. This is information that attackers are often interested in: https://www.wordfence.com/learn/how-to-protect-yourself-from-wordpress-security-issues/#how-are-they-attacking-my-wordpress-site
#3, 2 months ago
- Milestone Awaiting Review deleted
- Resolution set to wontfix
- Status changed from new to closed
I agree with @westonruter and @siliconforks that many site owners would be concerned about WordPress Core exposing this data in such a way.
Additionally, features added to WordPress are generally intended to benefit the majority of users. A feature that's specifically written to benefit machines doesn't meet such a criterion.
Determining the technology stack of a site is generally possible (see the BuiltWith profile of my site), so LLMs can access the information via cross-referencing.
A plugin exists that allows developers to list plugins installed on their site, and I think such a feature is best left as plugin territory.
As multiple people have expressed concern about including this in WordPress, I'm going to close this ticket as unplanned.
My first thought is that some site owners would be concerned about revealing "how the sausage is made" by exposing all of the information about what theme, plugins, and blocks are used to construct a page. I see there is an opt-out in the proposal, but I wonder if enabling this by default would be a big red flag for users who are concerned about exposing this information.
For prior art, there is also the Site Health info, which includes a lot of this information. But it requires an administrator to access, as it can contain sensitive information. That said, users are often asked to share it when requesting support.
At the very least, it should be omitted by default if a site owner has done remove_action( 'wp_head', 'wp_generator' ).
That said, why introduce a new X-WordPress-Stack HTTP response header and not just add this information to the existing generator tag that WordPress outputs? Currently it outputs:
When Elementor is active, it includes much more information in its own generator tag:
Performance Lab also does something similar:
If the page information were added to this existing META tag, then there wouldn't be concerns about an HTTP header being too long.
Closely related to this is #49509 for adding Server-Timing to core. This is implemented in the Performance Lab plugin at present, and a user has to opt in to adding timing granularity (e.g. how long it takes a given hook to run).
Given that HTTP headers are sent before the template is rendered, something closely related here would be the “template enhancement output buffer” introduced in #43258. This would allow the header to be sent after the template is rendered. When using the META tag, the HTML Tag Processor could be used to amend the content attribute with how the page was constructed. (This was similarly leveraged in 6.9 for classic themes to hoist stylesheets printed in the footer to HEAD via #64099.)