Make WordPress Core

Opened 3 weeks ago

Last modified 10 days ago

#65232 assigned enhancement

Abilities API: Add a fast site health summary to `core/get-environment-info`

Reported by: gziolo Owned by: gziolo
Milestone: 7.1 Priority: normal
Severity: normal Version: 6.9
Component: Abilities API Keywords: has-patch has-unit-tests
Focuses: Cc:

Description

Extend core/get-environment-info with two additive changes:

  1. An optional fields input parameter that filters the response (mirroring core/get-site-info).
  2. A new site_health output field returning a high-level, LLM-friendly Site Health overview — "is everything working, what to improve, what is broken" — populated entirely from cached data.
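The fields filter in item 1 can be pictured as an allow-list over the top-level response properties. The following Python sketch is for illustration only and is not the core implementation (which is PHP). The function name filter_fields and the sample response are hypothetical, and silently ignoring unknown field names is an assumption the ticket does not specify.

```python
from typing import Any, Optional


def filter_fields(response: dict[str, Any], fields: Optional[list[str]] = None) -> dict[str, Any]:
    """Return only the requested top-level properties.

    Omitting `fields` (None) returns every property, as the ticket describes.
    Assumption: unknown field names are silently ignored.
    """
    if fields is None:
        return dict(response)
    wanted = set(fields)
    return {key: value for key, value in response.items() if key in wanted}


# Hypothetical sample response, trimmed for brevity.
sample = {
    "environment": "production",
    "php_version": "8.2.10",
    "site_health": {"status": "unknown"},
}

print(filter_fields(sample, ["php_version"]))  # {'php_version': '8.2.10'}
print(sorted(filter_fields(sample)))           # ['environment', 'php_version', 'site_health']
```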

Background

core/get-environment-info shipped in 6.9 as part of the orientation layer for AI agents. Adding a cached Site Health summary lets agents answer "is anything on fire?" in the same call they already make for runtime context, without paying for a synchronous test run.

Related discussion: WordPress/ai#40.

Proposed shape

{
  "environment": "production",
  "php_version": "8.2.10",
  "db_server_info": "mysql 8.0.35",
  "wp_version": "7.1.0",
  "site_health": {
    "status": "recommended",
    "counts": { "good": 12, "recommended": 3, "critical": 0 },
    "issues": [
      {
        "label": "Background updates are not working as expected.",
        "severity": "recommended",
        "recommendation": "Contact your hosting provider to ensure WP-Cron can run."
      }
    ],
    "truncated": false
  }
}

Behavioral rules

  • Read cache only. Populate site_health from WordPress's cached Site Health results — the same cache the admin dashboard widget reads. Never trigger a synchronous test run from this ability.
  • Unknown when uncached. If no cached results exist, return status: "unknown" with empty counts and issues, so the agent can distinguish "no data" from "clean bill of health".
  • Only actionable issues listed. counts covers all severities; issues includes only recommended and critical entries.
  • Bounded payload. Cap the issues list at 10 entries with truncated: true when exceeded.
  • Included by default. Omitting fields returns all properties including site_health, matching core/get-site-info. Passing fields: ["php_version"] filters down to just that property.
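The rules above can be sketched as a small pure function over whatever the cache holds. This is a hedged Python illustration, not the proposed PHP code. The shape of the cached record, the name summarize_site_health, the empty-dict form of "empty counts", and the way the overall status is derived from the counts are all assumptions.

```python
from typing import Any, Optional

MAX_ISSUES = 10  # cap from the "Bounded payload" rule
ACTIONABLE = ("recommended", "critical")


def summarize_site_health(cached: Optional[dict[str, Any]]) -> dict[str, Any]:
    """Build the site_health summary from cached data only; never runs tests."""
    if not cached:
        # No cache: report "unknown" rather than implying a clean result.
        return {"status": "unknown", "counts": {}, "issues": [], "truncated": False}

    raw_counts = cached.get("counts", {})
    counts = {key: int(raw_counts.get(key, 0)) for key in ("good", "recommended", "critical")}

    # Only recommended and critical entries are listed; counts still covers everything.
    actionable = [issue for issue in cached.get("issues", [])
                  if issue.get("severity") in ACTIONABLE]

    # Assumption: overall status is the worst severity with a non-zero count.
    if counts["critical"] > 0:
        status = "critical"
    elif counts["recommended"] > 0:
        status = "recommended"
    else:
        status = "good"

    return {
        "status": status,
        "counts": counts,
        "issues": actionable[:MAX_ISSUES],
        "truncated": len(actionable) > MAX_ISSUES,
    }
```

Keeping the function free of side effects makes the "no synchronous evaluation" rule easy to assert in a test: the only input is the cached record.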

Capability check unchanged: the existing manage_options gate on core/get-environment-info covers Site Health access.

Open question

core/get-environment-info declares idempotent: true. The site_health field can vary across calls as the cache refreshes. The annotation remains accurate in the "no state change" sense but the response is not byte-stable. Worth documenting in the field description so adapters can communicate the nuance.

Acceptance criteria

  • core/get-environment-info accepts an optional fields input parameter that filters the response.
  • When fields is omitted, the response includes all properties — the existing 6.9 fields plus site_health.
  • site_health reads cached Site Health results only; no synchronous test execution under any input.
  • JSON Schema documents the new field with title and description on each sub-property so adapters can surface them to agents.
  • Unit tests cover: fields filtering, site_health with cached results, site_health with no cache, bounded issues list with truncated: true, and confirmation that no synchronous evaluation is triggered.
  • Documentation updated on the Abilities API reference pages (separate task).

Change History (11)

#1 @gziolo
3 weeks ago

  • Owner set to gziolo
  • Status changed from new to assigned

#2 follow-up: ↓ 5 @karunyachavan84
3 weeks ago

While reviewing this, I noticed that the existing health-check-site-status-result transient currently only stores aggregate counts (good, recommended, critical) and does not persist the underlying issue details needed for the proposed site_health.issues response (label, severity, recommendation).
Since the proposal also explicitly requires that site_health be populated entirely from cached data and never trigger synchronous Site Health tests, it seems like we would first need to expand the Site Health caching mechanism itself (likely during the scheduled health check cron run and related persistence flows) to store normalized actionable issue summaries alongside the counts.
Just wanted to confirm whether that scope expansion is expected here, or if there’s another existing cached source I may be missing.

This ticket was mentioned in Slack in #core by gziolo. View the logs.


3 weeks ago

This ticket was mentioned in PR #11833 on WordPress/wordpress-develop by @khokansardar.


3 weeks ago
#4

  • Keywords has-patch has-unit-tests added

Add optional fields input to mirror core/get-site-info. Expose site_health from the health-check-site-status-result transient only (no synchronous tests), including counts, actionable issues (recommended/critical), and truncation when more than ten issues are cached.

Persist issue summaries in the Site Health transient from the weekly cron and from the Site Health screen AJAX handler, with merge behavior when the client omits the issues payload.

#5 in reply to: ↑ 2 @westonruter
3 weeks ago

Replying to karunyachavan84:

> While reviewing this, I noticed that the existing health-check-site-status-result transient currently only stores aggregate counts (good, recommended, critical) and does not persist the underlying issue details needed for the proposed site_health.issues response (label, severity, recommendation).
> Since the proposal also explicitly requires that site_health be populated entirely from cached data and never trigger synchronous Site Health tests, it seems like we would first need to expand the Site Health caching mechanism itself (likely during the scheduled health check cron run and related persistence flows) to store normalized actionable issue summaries alongside the counts.
> Just wanted to confirm whether that scope expansion is expected here, or if there’s another existing cached source I may be missing.

Yes, I think we should be caching the entire Site Health results in a transient, not just the aggregate counts. This would be very helpful for implementing #64066. As the PR for that ticket shows, it had to resort to adding a new transient specifically for health_check_page_cache_detail, since the results as a whole aren't cached. If the full results were cached, that extra transient wouldn't be needed.

The transient should also store a timestamp for when the results were obtained, so the Ability can indicate how stale the results are. For sites that have broken WP-Cron and no traffic, it could be that the results become quite stale.

#6 @gziolo
3 weeks ago

> The transient should also store a timestamp for when the results were obtained, so the Ability can indicate how stale the results are. For sites that have broken WP-Cron and no traffic, it could be that the results become quite stale.

If that were in place, we would include it as part of the site_health object. On its own, it could also serve as another Site Health metric.

This ticket was mentioned in PR #11834 on WordPress/wordpress-develop by @karunyachavan84.


3 weeks ago
#7

Trac Ticket: https://core.trac.wordpress.org/ticket/65232

---
This PR extends core/get-environment-info with two additive changes:

  • Adds an optional fields input parameter, matching the response filtering behavior already used by core/get-site-info.
  • Adds a cached-only site_health output field with a high-level Site Health summary for agents and other consumers of the Abilities API.

The Site Health summary is read from the existing health-check-site-status-result transient. The ability does not run Site Health tests, call Site Health test discovery, or trigger synchronous evaluation. If cached data is missing or malformed, the ability returns status: "unknown" with empty counts and issues.

flowchart LR
    A[Site Health checks run from UI or scheduled check] --> B[Normalize counts and actionable issues]
    B --> C[Store health-check-site-status-result transient]
    D[core/get-environment-info] --> E[Read transient only]
    E --> F[Return site_health summary]

The site_health response includes:

  • status: unknown, good, recommended, or critical.
  • counts: cached Site Health result counts for good, recommended, and critical.
  • issues: up to 10 actionable recommended or critical issue summaries.
  • truncated: whether more than 10 actionable issues were available.
  • timestamp: Unix timestamp for when the cached Site Health data was collected, or 0 when no cached data exists.

Example:

{
  "site_health": {
    "status": "recommended",
    "counts": {
      "good": 12,
      "recommended": 3,
      "critical": 0
    },
    "issues": [
      {
        "test": "background_updates",
        "label": "Background updates are not working as expected.",
        "severity": "recommended",
        "recommendation": "Contact your hosting provider to ensure WP-Cron can run."
      }
    ],
    "truncated": false,
    "timestamp": 1715714399
  }
}
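A consumer could use the timestamp field to judge how fresh the summary is. The Python sketch below is illustrative and not part of this PR. The helper names are hypothetical, and the one-week default threshold is an assumption, chosen only because the scheduled Site Health check is described elsewhere in this ticket as weekly.

```python
import time
from typing import Optional

WEEK_IN_SECONDS = 7 * 24 * 60 * 60  # assumed freshness window


def cache_age_seconds(timestamp: int, now: Optional[float] = None) -> Optional[int]:
    """Age of the cached data in seconds, or None when timestamp is 0 (no cached data)."""
    if timestamp <= 0:
        return None
    current = time.time() if now is None else now
    # Clamp to zero so clock skew never yields a negative age.
    return max(0, int(current - timestamp))


def is_stale(timestamp: int, max_age: int = WEEK_IN_SECONDS, now: Optional[float] = None) -> bool:
    """Treat missing data as stale; otherwise compare the age against max_age."""
    age = cache_age_seconds(timestamp, now)
    return age is None or age > max_age


# Using the timestamp from the example above, checked one hour later.
print(is_stale(1715714399, now=1715714399 + 3600))  # False
```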

## Cache Behavior

The Site Health transient is expanded to include normalized actionable issue summaries and a timestamp while preserving the existing aggregate counts used by the dashboard and admin menu.

flowchart TD
    A[Site Health result] --> B{Status}
    B -->|good| C[Increment good count]
    B -->|recommended| D[Increment recommended count]
    B -->|critical| E[Increment critical count]
    D --> F[Store normalized issue summary]
    E --> F
    C --> G[Do not include in issues array]
    F --> H[Save transient with timestamp]
    G --> H

Only actionable recommended and critical entries are stored in the issues list. Passing good checks remain represented in counts.good.
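The tallying step in the flowchart might look like the following Python sketch. It is an illustration under stated assumptions rather than the PHP in this PR: build_cache_record is a hypothetical name, the input result shape (test, status, label, recommendation keys) is assumed, and skipping results with an unrecognized status is an assumption.

```python
import time
from typing import Any, Optional


def build_cache_record(results: list[dict[str, Any]], now: Optional[int] = None) -> dict[str, Any]:
    """Tally results by status and keep summaries for actionable ones only."""
    counts = {"good": 0, "recommended": 0, "critical": 0}
    issues = []
    for result in results:
        status = result.get("status")
        if status not in counts:
            continue  # Assumption: unrecognized statuses are skipped.
        counts[status] += 1
        if status != "good":
            # Passing checks are counted but never listed in issues.
            issues.append({
                "test": result.get("test", ""),
                "label": result.get("label", ""),
                "severity": status,
                "recommendation": result.get("recommendation", ""),
            })
    timestamp = int(time.time()) if now is None else int(now)
    return {"counts": counts, "issues": issues, "timestamp": timestamp}
```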

The Site Health screen AJAX persistence path also sanitizes incoming issue summaries and preserves previous issue details only when the new counts still indicate actionable issues. This avoids carrying stale issue details after the site becomes clean.
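The preservation rule can be sketched as follows. This Python snippet is a simplified illustration, not the PR's PHP: merge_issues is a hypothetical name, sanitization is omitted, and modeling "the client omitted the issues payload" as None is an assumption.

```python
from typing import Any, Optional


def merge_issues(
    previous_issues: list[dict[str, Any]],
    incoming_counts: dict[str, int],
    incoming_issues: Optional[list[dict[str, Any]]] = None,
) -> list[dict[str, Any]]:
    """Decide which issue summaries to persist alongside the new counts."""
    if incoming_issues is not None:
        # The client sent issue summaries: they replace the previous ones.
        return list(incoming_issues)
    actionable = incoming_counts.get("recommended", 0) + incoming_counts.get("critical", 0)
    # Keep old details only while the new counts still report actionable issues.
    return list(previous_issues) if actionable > 0 else []


old = [{"test": "background_updates", "severity": "recommended"}]
print(merge_issues(old, {"good": 15, "recommended": 0, "critical": 0}))  # []
```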

## Schema Changes

The output schema now documents the site_health object and all nested properties with titles and descriptions. It also constrains known values with enums and disallows unexpected nested properties where appropriate.

This makes the response clearer for adapters by documenting:

  • The difference between unknown and good.
  • That issue severity is limited to recommended or critical.
  • That timestamp communicates the freshness of cached Site Health data.
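For orientation, a schema fragment along these lines could express those constraints. It is written here as a Python dict holding JSON Schema keywords. The titles and descriptions are placeholder wording, not the text from this PR, and the exact keywords used in the PR (for example maxItems or minimum) are assumptions.

```python
import json

COUNT = {"type": "integer", "minimum": 0}

ISSUE_SCHEMA = {
    "type": "object",
    "additionalProperties": False,  # disallow unexpected nested properties
    "properties": {
        "test": {"type": "string", "title": "Test", "description": "Identifier of the Site Health test."},
        "label": {"type": "string", "title": "Label", "description": "Human-readable summary of the issue."},
        "severity": {"type": "string", "enum": ["recommended", "critical"],
                     "title": "Severity", "description": "Only actionable severities are listed."},
        "recommendation": {"type": "string", "title": "Recommendation",
                           "description": "Suggested remediation."},
    },
}

SITE_HEALTH_SCHEMA = {
    "type": "object",
    "title": "Site Health summary",
    "description": "Overview built from cached Site Health results only.",
    "additionalProperties": False,
    "properties": {
        "status": {"type": "string", "enum": ["unknown", "good", "recommended", "critical"],
                   "title": "Status",
                   "description": "unknown means no cached data exists, which differs from good."},
        "counts": {"type": "object", "title": "Counts",
                   "description": "Cached result totals per status.",
                   "properties": {"good": COUNT, "recommended": COUNT, "critical": COUNT}},
        "issues": {"type": "array", "maxItems": 10, "items": ISSUE_SCHEMA,
                   "title": "Issues", "description": "Up to 10 actionable issue summaries."},
        "truncated": {"type": "boolean", "title": "Truncated",
                      "description": "True when more than 10 actionable issues were available."},
        "timestamp": {"type": "integer", "minimum": 0, "title": "Timestamp",
                      "description": "Unix time the cached data was collected, or 0 when none exists."},
    },
}

print(json.dumps(SITE_HEALTH_SCHEMA["properties"]["status"]["enum"]))
# ["unknown", "good", "recommended", "critical"]
```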

## Use of AI Tools

AI assistance: Yes

Tool(s): Codex

Model(s): GPT-5

Used for: refactoring and test coverage suggestions.

---

@khokansardar commented on PR #11833:


3 weeks ago
#8

@apermo Thanks for reviewing, I've addressed the following fixes.

#9 @soyebsalar01
3 weeks ago

  • Document cache staleness directly in the JSON Schema description, not just the ticket.
  • Confirm counts always shows full totals even when issues is capped at 10.
  • Mark recommendation as nullable, since not all Site Health tests have a remediation step.
  • Add status: "unknown" as an explicit unit test case.

#10 @gziolo
10 days ago

Thanks to both @karunyachavan84 and @khokansardar for the proposed PRs here. We now have two open implementations that are very similar in scope; the main differences concern the final shape and semantics of the cached Site Health summary. In particular, PR #11834 adds a timestamp/freshness signal and is a bit more defensive about not preserving stale issue details when the latest counts no longer indicate actionable issues, while PR #11833 keeps the issue payload closer to the raw cached Site Health data shape.

I think we should narrow this down before iterating too much further. One possible next step would be to split the work into two smaller pieces: first extend the Site Health caching mechanism to persist the counts plus actionable issue summaries, since that part looks largely aligned across both PRs. Next, follow up with the Abilities API changes in a separate PR, using the discussion here and the current implementations to decide the final response shape. That might keep the feedback loop more focused and make it easier to review the cache behavior separately from the public ability contract.

#11 @gziolo
10 days ago

I opened #65355, along with a PR that introduces fields in the input schema, to isolate that change from the core work on Site Health improvements.
