Make WordPress Core

Opened 13 years ago

Closed 6 months ago

Last modified 6 months ago

#18210 closed defect (bug) (worksforme)

Update_post_meta is case insensitive on meta_key, but get_post_meta is NOT

Reported by: anmari's profile anmari Owned by:
Milestone: Priority: normal
Severity: normal Version: 3.2
Component: Database Keywords: dev-feedback needs-patch needs-unit-tests
Focuses: performance Cc:

Description (last modified by sabernhardt)

In WordPress 3.3-aortic-dissection and 3.2.1

get_post_meta is case sensitive on the meta-key

BUT

update_post_meta is NOT case sensitive

Thus If there is a pre-existing meta record with a key in say UPPERCASE, then one can issue an update for a lowercase key that one can not then fetch as it does not exist - only the uppercase key exists.

Example Code

	$meta = get_post_meta ($post->ID, '_allday');
	if ($meta ) { echo '<br />got lower: '; var_dump($meta);
	}
	$meta = get_post_meta ($post->ID, '_ALLDAY');
	if ($meta ) { echo '<br />got upper: '; var_dump($meta);
	}
	update_post_meta (21, '_allday','Tried to update lowercase');
	$meta = get_post_meta ($post->ID, '_allday');
	if ($meta ) { echo '<br />got lower: '; var_dump($meta);
	}
	else { echo '<br />Tried to get lower but no go';
	}
	$meta = get_post_meta ($post->ID, '_ALLDAY');
	if ($meta ) { echo '<br />Still have upper: '; var_dump($meta);
	}


Output of above:

got upper: array(1) { [0]=> string(14) "tis the upper." }
Tried to get lower but no go
Still have upper: array(1) { [0]=> string(25) "Tried to update lowercase" }

Attachments (1)

18210.diff (1.2 KB) - added by kdoole 9 years ago.
Added tests for the cases @anmari describes

Download all attachments as: .zip

Change History (15)

#1 @anmari
13 years ago

Further..

delete_post_meta also appears to be case_insensitive.

IE: one can delete issue

delete_post_meta(21, '_ALLDAY');

and any lowercase entries will be deleted too.

#2 @anmari
13 years ago

get_posts also NOT case sensitive:

	$args = array(
	'posts_per_page' => '-1',
	'post_type'=>'any',
	'meta_key'=> $key  // update to your key
);
	$allpostwith = get_posts( $args );

will return a MIX of lower and uppercase.

Is it caused by a mysql 'feature' perhaps? Similar behaviour in phpmyadmin? by why get_post_meta works ie is case sensitive ?

#3 @xknown
13 years ago

Yes, it is due to the collation used by default in WP (utf8_general_ci), which is case insensitive. There are more cases than case insensivity, for example the following query returns true:

select ('_allday' COLLATE utf8_general_ci) = ('_âlldây        ' COLLATE utf8_general_ci)

The reason why get_post_meta does not work for you is because the meta data are first cached in PHP using an array with key/value pairs and as you know, the string comparison in PHP is case sensitive.

Version 0, edited 13 years ago by xknown (next)

#4 @anmari
13 years ago

I just very quickly learnt a whole heap more about collation and the sorting and matching issues especially for foreign languages.

I gather the _ci (case insenstive) collations are faster than the _cs (case sensitive) and different collations define different orders of search results. Eg: _bin the binary ones do not sort the capital 'A' with the lower 'a'.

It seems weird to give up accuracy for speed, and it seems that in mysql one can assign a collation (if available) at query time. Is there a case that functions that work together should do the 'same' thing. Eg: maybe use a binary collation where 'matching' keys is important?

Are there some 'best practises' around appropriate way to handle this that could be communicated. So far I've added a couple of edits to the get_post_meta and update_post_meta codex to highlight the risks in mixing case, or changing case of one's meta key.

For others who may read, these are useful
http://dev.mysql.com/doc/refman/5.0/en/charset-collations.html
http://hakre.wordpress.com/2010/12/26/wordpress-database-charset-and-collation-configuration/

#5 @brokentone
12 years ago

  • Cc kenton.jacobsen@… added

#6 @nacin
12 years ago

  • Component changed from General to Database

#7 @chriscct7
9 years ago

  • Focuses performance added
  • Keywords dev-feedback needs-patch added
  • Version changed from 3.2.1 to 3.2

#8 @netweb
9 years ago

  • Keywords needs-unit-tests added

#9 @kdoole
9 years ago

I don't know of / couldn't find best practices around this.

The collation can be changed in the query statement, so one option would be adding COLLATE utf8_general_cs when adding / updating / deleting meta. Would it be worthwhile to benchmark the performance difference between ci and cs?

Another option would be allowing users to set this themselves with a parameter when using the various get/update/insert/delete post meta fns. Seems like overkill though. Is it at all useful to let people make meta fields case sensitive?

Perhaps it's simply a matter of updating get_post_meta to be case INsensitive (and noting in the codex that post meta keys are case insensitive) so that everything works consistently.

@kdoole
9 years ago

Added tests for the cases @anmari describes

#10 @webaware
7 years ago

I ran into a related problem with user meta. A client used a CSV importer and accidentally capitalised the first letter of a meta key. Because case insensitivity in MySQL, but case sensitivity in the WP user meta PHP code, the billing last name would never load "for some users". Fixed with this query, and a flush of the object cache:

update wp_usermeta
set meta_key = 'billing_last_name'
where meta_key collate utf8mb4_bin = 'Billing_last_name'

FWIW, I ran the query below on one of my testbed sites with over 200 plugins (multisite, lots of compatibility testing) and didn't find any examples of plugins having user meta keys with capital letters.

select distinct meta_key
from wp_usermeta
where meta_key collate utf8mb4_bin regexp '[A-Z]'

Post meta is a different story, however; meta keys are sometimes directly used on the front end, and thus may legitimately contain capital letters.

#11 follow-up: @jrchamp
7 years ago

Rather than specifying the collation, you'll have better luck using the BINARY keyword. Example:

SELECT meta_key, meta_value
FROM wp_usermeta
WHERE meta_key = BINARY 'Billing_last_name'

Note that it's also important from a performance perspective to place the BINARY keyword on the value side of the equals or regexp (otherwise, you may not be able to benefit from the existing index).

Last edited 7 years ago by jrchamp (previous) (diff)

#12 in reply to: ↑ 11 @webaware
7 years ago

Replying to jrchamp:

Rather than specifying the collation, you'll have better luck using the BINARY keyword.
...
Note that it's also important from a performance perspective to place the BINARY keyword on the value side of the equals or regexp (otherwise, you may not be able to benefit from the existing index).

Thanks for the tips, will try to remember for next time I bump into this one :)

#13 @pbearne
6 months ago

  • Milestone Awaiting Review deleted
  • Resolution set to invalid
  • Status changed from new to closed

I have run the test provided and this seems to be fixed
So closing ticket as invalid

#14 @sabernhardt
6 months ago

  • Description modified (diff)
  • Resolution changed from invalid to worksforme
Note: See TracTickets for help on using tickets.