Make WordPress Core

Opened 6 weeks ago

Last modified 6 weeks ago

#64579 new defect (bug)

Build: env:install fails intermittently with database connection refused

Reported by: kraftbj Owned by:
Milestone: Awaiting Review Priority: normal
Severity: normal Version:
Component: Build/Test Tools Keywords: has-patch
Focuses: Cc:

Description

The env:install script (tools/local-env/scripts/install.js) can fail intermittently with "Database connection error (2002) Connection refused" when the database container reports healthy but isn't yet accepting connections.

Steps to reproduce:

  1. Run npm run env:start && npm run env:install
  2. The second command occasionally fails with a "Connection refused" error

Root cause:

The first wp_cli() call runs immediately without waiting for the database:

// Line 13 - runs immediately with no wait
wp_cli( 'config create --dbname=wordpress_develop --dbuser=root --dbpass=password --dbhost=mysql --force' );

The Docker healthcheck uses mysqladmin ping / mariadb-admin ping, which can pass before the database is ready to accept authenticated connections from other containers.

This causes intermittent CI failures across random PHP/MariaDB combinations (different jobs fail each run).

Related:

Change History (1)

This ticket was mentioned in PR #10834 on WordPress/wordpress-develop by @kraftbj.


6 weeks ago
#1

  • Keywords has-patch added

This PR adds retry logic with exponential backoff to the wp config create command in tools/local-env/scripts/install.js to handle intermittent database connection failures during environment setup.

## The Problem

The env:install script fails intermittently with "Database connection error (2002) Connection refused" because:

  1. Docker's healthcheck (mysqladmin ping) passes as soon as the database process is running
  2. At that point the database may not yet be accepting authenticated connections from other containers
  3. The wp config create command runs immediately, without waiting for the database

This causes random CI job failures across different PHP/MariaDB combinations.

## Approach Chosen: Application-Level Retry

This PR adds a wp_cli_with_retry() function that retries the config create command with exponential backoff (1s, 2s, 4s, 8s, 16s).
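
The PR's actual wp_cli_with_retry() is not reproduced in this ticket, so the following is only a minimal sketch of the retry-with-backoff idea described above. The names backoff_delays, sleep_sync, and with_retry are hypothetical, and the sketch assumes the wrapped command throws on failure.

```javascript
/**
 * Returns the waits (in seconds) between attempts: 1, 2, 4, 8, 16 by default.
 * Hypothetical helper, not part of install.js.
 */
function backoff_delays( max_retries = 5, base_seconds = 1 ) {
	return Array.from( { length: max_retries }, ( _, i ) => base_seconds * 2 ** i );
}

/**
 * Blocks the current thread for the given number of seconds.
 * Atomics.wait() is used because the install script runs synchronously.
 */
function sleep_sync( seconds ) {
	const cell = new Int32Array( new SharedArrayBuffer( 4 ) );
	Atomics.wait( cell, 0, 0, seconds * 1000 );
}

/**
 * Calls run() and, if it throws, waits and tries again.
 * One initial attempt plus one retry per entry in `delays`;
 * the last error is rethrown once the delays are exhausted.
 */
function with_retry( run, { delays = backoff_delays(), sleep = sleep_sync } = {} ) {
	for ( let attempt = 0; ; attempt++ ) {
		try {
			return run();
		} catch ( error ) {
			if ( attempt >= delays.length ) {
				throw error;
			}
			console.log( `Attempt ${ attempt + 1 } failed; retrying in ${ delays[ attempt ] }s...` );
			sleep( delays[ attempt ] );
		}
	}
}
```

In install.js this could wrap the existing call, for example with_retry( () => wp_cli( 'config create ...' ) ), assuming wp_cli() throws when the command exits with a non-zero status. With the default delays the total wait before giving up is 1 + 2 + 4 + 8 + 16 = 31 seconds.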

Pros:

  • Handles the actual failure case directly
  • No unnecessary delay when database is ready quickly
  • Self-documenting: the retry logic shows this is a known timing issue
  • Works regardless of how Docker reports container health
  • No changes needed to docker-compose.yml

Cons:

  • Adds ~25 lines of code to install.js
  • Gives up after at most 31 seconds of waiting (1 + 2 + 4 + 8 + 16), which may not be enough in extreme cases

## Alternative Considered: Improved Docker Healthcheck

Change the healthcheck in docker-compose.yml from ping-based to connection-based:

healthcheck:
  test: ['CMD', 'mysql', '-u', 'root', '-ppassword', '-e', 'SELECT 1']

Pros:

  • Fixes the issue at the source (container health definition)
  • No changes to application code

Cons:

  • Healthcheck runs repeatedly (every 5s) even after startup, adding overhead
  • Different syntax needed for MySQL vs MariaDB vs different MariaDB versions
  • Exposes password in healthcheck command
  • Doesn't help if there are other transient connection issues