WordPress.org

Make WordPress Core

Opened 4 months ago

Last modified 4 months ago

#50789 new defect (bug)

Improve WPDB logic around information_schema

Reported by: andy Owned by:
Milestone: Awaiting Review Priority: normal
Severity: normal Version: 5.5
Component: Database Keywords:
Focuses: multisite Cc:

Description

WooCommerce uses a query on information_schema.table_constraints to determine whether it needs to add a foreign key constraint to one of its tables.

SELECT
  COUNT(*) AS fk_count
FROM
  information_schema.TABLE_CONSTRAINTS
WHERE
  CONSTRAINT_SCHEMA = ''
  AND CONSTRAINT_NAME = 'fk_wp_12345_wc_download_log_permission_id'
  AND CONSTRAINT_TYPE = 'FOREIGN KEY'
  AND TABLE_NAME = 'wp_12345_wc_download_log'

$wpdb->get_table_from_query returns information_schema which is not a table, it is a database containing the table TABLE_CONSTRAINTS. So it seems there is room to improve this method to extract the table name in cases where it is preceded by a database name. However, this is not the goal of this ticket. In the specific case of information_schema tables, we're more interested in the table referenced in the WHERE clause.

My use case involves a wpdb drop-in, WordPress.com's hyperdb, which was the original source of this method when it was added in [30345]. We use hyperdb to map queries to database servers using table names as map keys. Given a table like wp_12345_posts we connect to the right database.

The database information_schema exists in all database servers. When WooCommerce queries the table table_constraints it's looking for information about the table wp_12345_wc_download_log. To route the query to the appropriate database server, we are interested in this table name.

I would propose adding this before the first preg_match in get_table_from_query:

		// SELECT FROM information_schema.* WHERE TABLE_NAME = 'wp_12345_foo'
		if ( preg_match('/^\s*'
				. 'SELECT.*?\s+FROM\s+`?information_schema`?\.'
				. '.*\s+TABLE_NAME\s*=\s*["\']([\w-]+)["\']/is', $q, $maybe) )
			return $maybe[1];

This returns wp_12345_wc_download_log which allows us to route the query to the correct database server.

I am able to patch WordPress.com's drop-in to check this pattern before calling the parent method in core's wpdb. So there is no need to rush on our behalf.

Does anyone know of a use case that relies on the existing implementation?

It might be argued that a caller of this function would expect the return to be TABLE_CONSTRAINTS in this case. However, I was unable to find any tickets requesting a fix for the current behavior. I believe the best fix would be to return the table name from the WHERE clause.

Attachments (1)

50789.diff (615 bytes) - added by johnjamesjacoby 4 months ago.
First pass diff at targeting information_schema.* table, TABLE_NAME column queries

Download all attachments as: .zip

Change History (4)

#1 follow-up: @johnjamesjacoby
4 months ago

Interesting. (Also, hey @andy!)

It's not common, but querying the information_schema database is a valid use of WPDB::query(). I expect for get_table_from_query() to return the first table being queried, not the TABLE_NAME - it seems too clever.

The get_table_from_query() method documentation says:

	/**
	 * Finds the first table name referenced in a query.
	 *

...emphasis first. 😕


I probably wouldn't query information_schema directly in this situation for this reason. WooCommerce should be storing a version as an option for each of its custom tables, and the values of those options should dictate what SQL is generated. (This is also the approach I've taken in the BerlinDB project.)

If table versions are unknown, I would reverse engineer them by calling:

	$prefix = $wpdb->get_blog_prefix();
	$table  = "{$prefix}wc_download_log";
	$query  = "SHOW CREATE TABLE {$table}";
	$create = $wpdb->query( $query );

...and comparing those results to known database table schema changes.


I do think if this idea made its way into get_table_from_query() that probably not very many people would notice. But I still don't think it's a good idea. That said, I'm attaching a patch imminently so that others can test it out and draw their own conclusions. ❤️

@johnjamesjacoby
4 months ago

First pass diff at targeting information_schema.* table, TABLE_NAME column queries

#2 @johnjamesjacoby
4 months ago

Related ticket on testing foreign keys: #44351

#3 in reply to: ↑ 1 @andy
4 months ago

Replying to johnjamesjacoby:

Interesting. (Also, hey @andy!)

Hey, @johnjamesjacoby! :D

It's not common, but querying the information_schema database is a valid use of WPDB::query(). I expect for get_table_from_query() to return the first table being queried, not the TABLE_NAME - it seems too clever.

The original and primary use for get_table_from_query in hyperdb was to allow hyperdb to route the query to the right database server. Years later, @pento borrowed this code for core in [30345] for a different purpose: to enable get_table_charset() to write a SHOW COLUMNS statement ultimately supporting strip_invalid_text_from_query(). This does not fit with the proposal that I made, i.e. to extract from the TABLE_NAME value in the WHERE clause.

In agreement with your assessment, I now think that hyperdb should use a separate or hybrid solution, taking back responsibility for extracting the name that it needs for query routing.

Still, there is a valid issue in the original description: the bug where get_table_from_query returns the database name information_schema instead of the table name TABLE_CONSTRAINTS. This should be fixed. In the process, it can be considered whether to return the extracted database name somewhere (e.g. information_schema).

I probably wouldn't query information_schema directly in this situation for this reason. WooCommerce should be storing a version as an option for each of its custom tables, and the values of those options should dictate what SQL is generated. (This is also the approach I've taken in the BerlinDB project.)

We agree that information_schema queries should be avoided. I can't speak for the WooCommerce logic of using both a stored version number and a schema check. But WooCommerce got a pull request to replace the information_schema query with a SHOW CREATE TABLE.

Last edited 4 months ago by andy (previous) (diff)
Note: See TracTickets for help on using tickets.