Make WordPress Core

Opened 12 years ago

Closed 9 years ago

#21267 closed defect (bug) (fixed)

Kill the serialization of $wp_filter in get_terms()

Reported by: nacin's profile nacin Owned by: boonebgorges's profile boonebgorges
Milestone: 4.4 Priority: normal
Severity: normal Version:
Component: Cache API Keywords: has-patch
Focuses: Cc:

Description (last modified by scribu)

We use this as part of a persistent cache key:

serialize($GLOBALS['wp_filter']['list_terms_exclusions'])

See [8225].

This is not good for two reasons. First, well, it's a not fun to reach into our internal API like that. But worse, it's broken whenever an object method is used, because spl_object_hash() will be unique to that object, thus rendering the cache invalidated on the next pageload.

As an aside, we should probably have _wp_filter_build_unique_id() create a delimiter when dealing with a static class method — $function[0].$function[1] can conflict with a legitimate function name.

I'm not sure how this should be fixed.

Attachments (3)

filter-callback-keys.diff (5.2 KB) - added by wonderboymusic 12 years ago.
21267.diff (5.1 KB) - added by wonderboymusic 11 years ago.
21267.2.diff (7.4 KB) - added by wonderboymusic 11 years ago.

Download all attachments as: .zip

Change History (22)

#1 @scribu
12 years ago

  • Description modified (diff)
  • Keywords needs-patch added

That looks incredibly hacky. Besides, closures are not serializable (you get a fatal error).

#2 @scribu
12 years ago

  • Description modified (diff)

#3 @wonderboymusic
12 years ago

Here's how you can serialize a Closure instance, and by "serialize" I really mean "create a unique identifier":

ob_start();
echo new ReflectionFunction( function ( $var1, $var2 ) { 
	return $var1 + $var2;
} );
$source = ob_get_clean();
echo md5( $source );

// makes md5 of this:
Closure [ <user> function {closure} ] {
  @@ /Users/scott/Sites/wordpress-core/test.php 64 - 66

  - Parameters [2] {
    Parameter #0 [ <required> $var1 ]
    Parameter #1 [ <required> $var2 ]
  }
}

This works because ReflectionFunction implements the toString magic method and is better than spl_object_hash which will refresh the hash on every request.

http://www.php.net/manual/en/reflectionfunction.tostring.php

#4 @wonderboymusic
12 years ago

Since PHP 5.2 (the required version for WP is 5.2.4), spl_object_hash exists, which means the else statement and function_exists() check are completely useless in _wp_filter_build_unique_id()

Also, since all of that code is ditched and strings are returned for functions, there is nothing unique about the ID created, it just returns a string representation of a_function, MyClass::method, or whatever represents a hash for a closure

I am working up a patch, and I also found a bug in add_filter() as pertains to call_user_func( array( 'MyClass::static_method' ) ) - if $arr[1] is empty, a Warning is thrown and the callback is invalid

Version 0, edited 12 years ago by wonderboymusic (next)

#5 @wonderboymusic
12 years ago

FYI, here's a good way to debug this in functions.php or whatever:

function a_function( $filter ) { error_log( 'called' ); return $filter; }
class MyClass1 { function filter( $filter ) { error_log( 'called' ); return $filter; } }
class MyClass2 { static function filter( $filter ) { error_log( 'called' ); return $filter; } }

$my_class_1a = new MyClass1;
$my_class_1b = new MyClass1;

add_filter( 'list_terms_exclusions', 'a_function' );
add_filter( 'list_terms_exclusions', array( $my_class_1a, 'filter' ) );
add_filter( 'list_terms_exclusions', array( $my_class_1b, 'filter' ) );
add_filter( 'list_terms_exclusions', array( 'MyClass2', 'filter' ) );
add_filter( 'list_terms_exclusions', 'MyClass2::filter' );
add_filter( 'list_terms_exclusions', function ( $filter ) { error_log( 'called' ); return $filter; } );

print_r( $GLOBALS['wp_filter']['list_terms_exclusions'] );

echo wp_filter_callbacks_hash( 'list_terms_exclusions' );

get_terms( 'link_category' );
exit();
Last edited 12 years ago by wonderboymusic (previous) (diff)

#6 @wonderboymusic
12 years ago

  • Keywords has-patch added; needs-patch removed

Attached the patch:

Removed the unreachable code
Created a function to hash a closure: _wp_build_unique_closure_id( $function )
Created a function to hash a list of callbacks by key: wp_filter_callbacks_hash( $tag )

_wp_filter_build_unique_id() now generally returns the string name of a function or class ('MyClass::method') which is what this function was doing anyways - it returns an md5 string representing a closure where applicable. For class instances, it will prepend spl_object_hash( $function[0] )~ which will be removed when the wp_filter_callbacks_hash is generated - allows persistence between requests

get_terms() now internally calls:

$filter_key = wp_filter_callbacks_hash( 'list_terms_exclusions' );

It previously called:

$filter_key = ( has_filter('list_terms_exclusions') ) ? serialize($GLOBALS['wp_filter']['list_terms_exclusions']) : '';
Last edited 12 years ago by wonderboymusic (previous) (diff)

#7 @wonderboymusic
12 years ago

Updated the patch and edited my comments where appropriate

#8 @nacin
12 years ago

Clever.

Since this is the only time we've had to hash the filters being used, I wonder if this is a bit overkill.

#9 @nacin
12 years ago

  • Milestone changed from 3.5 to Future Release

#10 @wonderboymusic
12 years ago

@nacin - do we still care about this, or should we ditch the filter and throw a doing_it_wrong or something? I mean, this could be our chance to have an md5'd output buffer of ReflectionFunction::__toString in Core, which everyone's been asking for

#11 @scribu
12 years ago

this could be our chance to have an md5'd output buffer of ReflectionFunction::__toString in Core, which everyone's been asking for

Everybode? Really? :P

or should we ditch the filter and throw a doing_it_wrong or something?

Instead of basing the cache key on the list of callbacks attached to that filter, why not just use the term IDs currently returned by the filter?

#12 @scribu
12 years ago

Or, ugh, the SQL string returned by the filter rather.

#13 @wonderboymusic
11 years ago

  • Keywords needs-unit-tests needs-testing added
  • Milestone changed from Future Release to 3.7

Refreshed against trunk. The bug is still legitimate - patch allows us to update the object hashing strategy

#14 @wonderboymusic
11 years ago

  • Keywords needs-unit-tests needs-testing removed
  • Milestone changed from 3.7 to 3.8

Too big for right now

#15 @wonderboymusic
11 years ago

  • Keywords 3.9-early added
  • Milestone changed from 3.8 to Future Release

#16 @wonderboymusic
10 years ago

#30960 was marked as a duplicate.

#17 @boonebgorges
9 years ago

  • Keywords 3.9-early removed
  • Milestone changed from Future Release to 4.4

Instead of basing the cache key on the list of callbacks attached to that filter, why not just use the term IDs currently returned by the filter?

Yes. Imagine you have the following:

add_filter( 'list_terms_exclusions', function( $e ) {
    if ( is_page() ) {
        return one thing
    } else {
        return something else
    } 
} );

Cache is primed while looking at a page. It's now broken for all other uses.

There are a couple places in this function where we're allowing filters to intervene: get_terms_fields, get_terms_orderby, and the Mother Filter, terms_clauses. Currently, we ignore these when building our cache key, but they're subject to the exact same bug.

Using a SQL string to build a cache key is not the best thing in the world. For one thing, it's ugly. For another, it means running through all the parameter processing, even if we're going to end up hitting the cache. However, I think it's the only thing we can do in this case. (The additional processing should introduce minimal overhead - it shouldn't require any database queries, by the looks of things.) In this way, we sidestep the issue of serializing closures, and other kewl things discussed in this ticket.

We still need to include a hash of the function params, because some of them don't reflect in the SQL query (like child_of and fields, both of which are processed on the fetched results).

See 21267.3.diff.

#18 @wonderboymusic
9 years ago

  • Owner set to boonebgorges
  • Status changed from new to assigned

#19 @boonebgorges
9 years ago

  • Resolution set to fixed
  • Status changed from assigned to closed

In 35120:

Use a more reliable method for generating get_terms() cache key.

Previously, the cache key included a serialization of list_terms_exclusions
callbacks, to ensure that the cache was differentiated properly for different
uses of the list_terms_exclusions filter. This strategy was flawed in a
couple of ways: serialization doesn't work equally well for all callable types;
the serialization required reaching into the $wp_filter global; serializing
the callback itself didn't properly account for the possibility that the
callback might return different values in different contexts; the cache key
didn't account for other filters that similarly affect the cached values, such
as terms_clauses.

We skirt all these issues by concatenating the cache key using the SQL query
string, which will reflect all filters applied earlier in get_terms().

Props boonebgorges, wonderboymusic.
Fixes #21267.

Note: See TracTickets for help on using tickets.