Make WordPress Core

Opened 4 years ago

Last modified 4 years ago

#50223 new enhancement

Performance improvement: Avoid using array_unique() where possible

Reported by: aristath Owned by:
Milestone: Awaiting Review Priority: lowest
Severity: normal Version:
Component: General Keywords: reporter-feedback
Focuses: performance Cc:

Description

We could improve performance by avoiding expensive calls to array_unique( $array ) where possible.
A more performant option is array_flip( array_flip( $array ) ).
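
As a minimal illustration of the idea (a sketch using made-up sample data, not code taken from core): the first array_flip() turns values into keys, which collapses duplicates because array keys are unique, and the second array_flip() turns them back into values.

```php
<?php
// Hypothetical sample data: a list of strings containing duplicates.
$array = array( 'post', 'page', 'post', 'attachment', 'page' );

// First flip: values become keys, so duplicate values collapse into one key.
$flipped = array_flip( $array );

// Second flip: the keys become values again.
$unique = array_flip( $flipped );

// Keys are discarded here so that only the remaining values are compared.
var_dump( array_values( $unique ) === array_values( array_unique( $array ) ) ); // bool(true)
```

Only the values match in this sketch; the keys of the two results are not the same.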

How to test:

<?php
$array_size = 100000;
$runs       = 10000;
$test       = [];

// Generate an array of random integers between 0 and 100 (many duplicates).
for ( $run = 0; $run < $array_size; $run++ ) {
        $test[] = rand( 0, 100 );
}

// Test array_unique().
$time = microtime( true );
for ( $run = 0; $run < $runs; $run++ ) {
        $out = array_unique( $test );
}
$time = microtime( true ) - $time;
echo 'Array Unique: ' . $time . PHP_EOL;

// Test array_flip( array_flip() ).
$time = microtime( true );
for ( $run = 0; $run < $runs; $run++ ) {
        $out = array_flip( array_flip( $test ) );
}
$time = microtime( true ) - $time;
echo 'Flip Flip: ' . $time . PHP_EOL;

In PHP 7.3 this prints the following:

Array Unique: 31.995715856552
Flip Flip: 9.5911109447479 

In previous PHP versions the difference is a lot more dramatic (in PHP 5.6 array_flip( array_flip() ) is up to 1000 times faster).

Of course this can't be done in arrays where the values are arrays or objects, because array_flip() can only turn integers and strings into keys. In most cases I've seen, though, we use array_unique() with strings, so we should see a performance gain.
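
A small sketch of that limitation, using made-up mixed data: array_flip() raises a warning for every value that is not an integer or a string and skips it, so such values silently disappear from the flip-flip result.

```php
<?php
// Hypothetical mixed data: two strings, a nested array and a float.
$mixed = array( 'a', array( 'nested' ), 'a', 1.5, 'b' );

// The @ operator is used for this demonstration only. The inner array_flip()
// raises an E_WARNING for the nested array and for the float, and skips both.
$flip_flip = array_flip( @array_flip( $mixed ) );

// Only the string values survive.
var_dump( array_values( $flip_flip ) );
```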

This may look like a micro-optimization, and it is. The goal is to reduce carbon emissions: if we manage to save even 0.1 ms on each page load, globally it adds up to a few tons of CO2.

Change History (5)

#1 @johnbillion
4 years ago

  • Keywords reporter-feedback added
  • Priority changed from normal to lowest

Thanks for the report @aristath. Do you have concrete examples where these changes can be made in WordPress core to save processing time?

#2 follow-up: @Cybr
4 years ago

I believe the most prominent ones are at:

.\wp-admin\load-scripts.php
.\wp-admin\load-styles.php
.\wp-includes\class-wp-query.php

Via Xdebug, you can find how many times this function is called during a typical request so that you can estimate the relative performance impact based on the results from the OP.

However, I don't think we should sacrifice readability for this. A flip-flip doesn't convey that the variable is transformed to yield unique keys; I, therefore, believe these rudimentary low-level issues should be forwarded to the PHP Group.

Last edited 4 years ago by Cybr

#3 @aristath
4 years ago

Adding an inline comment like this above any line where we change to flip-flip should resolve the readability issue...

// Use array_flip( array_flip() ) instead of array_unique() for improved performance.

#4 @joyously
4 years ago

I'm not sure it's relevant to WP's data, but the result of flip-flip is not always the same as array_unique.
I see two ways that they can differ.
1) array_unique preserves the keys, retaining the first one for a duplicate value, whereas array_flip uses the last one of the keys of the duplicate values.
2) array_unique will compare as string by default (although there is a flag). array_flip() does not retain the data type of values, and they must be either integer or string (no boolean or array or object, although you can encode them first).
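
Both differences can be reproduced with a short sketch. The sample array and its keys are made up for illustration:

```php
<?php
// Hypothetical sample data with explicit keys, a duplicate and a numeric string.
$input = array(
	'first'  => 'apple',
	'second' => 'pear',
	'third'  => 'apple',
	'fourth' => '10',
);

$unique    = array_unique( $input );
$flip_flip = array_flip( array_flip( $input ) );

// 1) Keys: array_unique() keeps the first key of a duplicate value,
// while flip-flip ends up with the last one.
var_dump( array_search( 'apple', $unique, true ) );    // string(5) "first"
var_dump( array_search( 'apple', $flip_flip, true ) ); // string(5) "third"

// 2) Types: the numeric string '10' is converted to an integer key by the
// first flip, so it comes back as the integer 10.
var_dump( $unique['fourth'] );    // string(2) "10"
var_dump( $flip_flip['fourth'] ); // int(10)
```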

It also appears that PHP 7.2 changed the internals, so array_unique() could now return slightly different keys than it used to. From https://www.php.net/manual/en/function.array-unique :

If sort_flags is SORT_STRING, formerly array has been copied and non-unique elements have been removed (without packing the array afterwards), but now a new array is built by adding the unique elements. This can result in different numeric indexes.

Perhaps a test can be done on the different sort_flags, to see if SORT_REGULAR is faster than the default SORT_STRING.
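
Such a test could look roughly like the sketch below. It mirrors the benchmark from the description, with smaller, arbitrarily chosen sizes so that it finishes quickly. The timings depend on the PHP version and the data, so no expected numbers are given.

```php
<?php
// Arbitrary sizes for this sketch, smaller than in the original benchmark.
$array_size = 10000;
$runs       = 100;
$test       = array();

// Generate an array of random integers between 0 and 100 (many duplicates).
for ( $i = 0; $i < $array_size; $i++ ) {
	$test[] = mt_rand( 0, 100 );
}

// The two comparison modes to time.
$flags = array(
	'SORT_STRING'  => SORT_STRING,
	'SORT_REGULAR' => SORT_REGULAR,
);

$results = array();
foreach ( $flags as $name => $flag ) {
	$start = microtime( true );
	for ( $run = 0; $run < $runs; $run++ ) {
		$results[ $name ] = array_unique( $test, $flag );
	}
	echo $name . ': ' . ( microtime( true ) - $start ) . PHP_EOL;
}
```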

#5 in reply to: ↑ 2 @SergeyBiryukov
4 years ago

I'm all for performance improvements, but if replacing array_unique() causes a noticeable difference somewhere in core, I'd rather see if the underlying logic could be improved.

Unless using array_unique() is disallowed in WPCS (which is probably not a good idea), new instances will inevitably get added over time. Replacing some instances randomly would only cause inconsistencies and confusion, even with inline comments.

Replying to Cybr:

However, I don't think we should sacrifice readability for this. A flip-flip doesn't convey that the variable is transformed to yield unique keys; I, therefore, believe these rudimentary low-level issues should be forwarded to the PHP Group.

This seems like a good idea. That way plugins, themes, and other platforms would also benefit from any performance improvements to array_unique().

Last edited 4 years ago by SergeyBiryukov