Opened 3 years ago
Closed 4 weeks ago
#59239 closed defect (bug) (fixed)
wp_generate_uuid4 collisions
| Reported by: |
|
Owned by: |
|
|---|---|---|---|
| Milestone: | 7.0 | Priority: | normal |
| Severity: | normal | Version: | |
| Component: | General | Keywords: | has-patch needs-testing commit |
| Focuses: | Cc: |
Description
It seems like wp_generate_uuid4() is prone to creating UUID collisions, since it internally uses mt_rand, which uses a 32-bit seed. When this seed repeats, it will generate the same UUID twice. Maybe it should be updated to use wp_rand instead which uses random_int so it's backed by a real CSPRNG?
Change History (19)
This ticket was mentioned in PR #10694 on WordPress/wordpress-develop by @johnbillion.
3 months ago
#2
- Keywords has-patch added
Trac ticket: #59239
#4
follow-up:
↓ 9
@
3 months ago
One concern I have is that with this change in place, wp_generate_uuid4() now relies on a pluggable function. This means it would break any code that calls wp_generate_uuid4() before the plugins_loaded hook. What's the likelihood of this? Low, but perhaps a plugin somewhere is generating a UUID for each request for logging purposes.
#5
@
3 months ago
I created this ticket too because I observed some collisions when generating UUIDs with wp_generate_uuid4(). According to PHP documentation, after 80,000 iterations the chance of a collision is 50%. I tested this myself by generating UUIDs using this function in a loop and after few minutes I managed to generate a collision around 80,000 iterations.
It seems like wp_rand() is a wrapper around random_int() and it only uses a fallback if the random_int() throws an exception (no random source available) or the function is not available. As it was added in PHP 7.0 that's probably not relevant in current WordPress versions as they don't support PHP 5 any longer.
I think it could be possible to just use random_int() directly if wp_rand() is not available. If that's too much duplicate code it might be possible to drop wp_rand() completely and just use random_int() so the function works without the dependency on pluggable functions being loaded.
This ticket was mentioned in Slack in #core by juanmaguitar. View the logs.
5 weeks ago
#7
@
5 weeks ago
- Keywords needs-testing added
From today's bug scrub
There's a PR #10694 opened that has been approved. I'll add the needs-testing keyword
#8
@
4 weeks ago
Tested PR #10694 on PHP 8.5.1 (NTS, Windows x64) against trunk at r62029.
Environment
- PHP: 8.5.1 (cli) NTS Visual C++ 2022 x64
- WordPress: 7.0-beta5 (trunk r62029)
- OS: Windows 11
Reproduction
Demonstrated the collision problem by generating UUIDs with mt_rand() using repeated seeds. Since mt_rand() uses a 32-bit seed, identical seeds produce identical UUID sequences.
Before patch (mt_rand):
mt_srand(12345); $uuid1 = wp_generate_uuid4(); // 51e22de5-fd1d-4881-8da4-ada90ffe517e mt_srand(12345); $uuid2 = wp_generate_uuid4(); // 51e22de5-fd1d-4881-8da4-ada90ffe517e // $uuid1 === $uuid2 → collision
Generated 131,072 UUIDs with 65,536 sequential seeds across 2 rounds. Result: 65,536 unique UUIDs, 65,536 collisions (every UUID from round 2 collided with round 1).
After patch (wp_rand / random_int):
Generated 100,000 UUIDs using random_int() (which wp_rand() delegates to). Result: 100,000 unique UUIDs, 0 collisions.
UUID4 format validation
Verified 1,000 UUIDs generated with the patched function:
- All 1,000 match the UUID4 regex
^[0-9a-f]{8}-[0-9a-f]{4}-4[0-9a-f]{3}-[89ab][0-9a-f]{3}-[0-9a-f]{12}$ - Version nibble (position 13) is always
4 - Variant nibble (position 17) is always
8,9,a, orb
Bitwise operations | 0x4000 and | 0x8000 are correctly preserved after switching from mt_rand() to wp_rand().
Results
| Scenario | Before Patch (mt_rand) | After Patch (wp_rand) |
|---|---|---|
| Same seed produces same UUID | Yes (collision) | N/A (CSPRNG, no seed control) |
| 131K UUIDs with repeated seeds | 65,536 collisions | 0 collisions in 100K |
| UUID4 format valid | Yes | Yes |
| Version/variant bits correct | Yes | Yes |
Patch is minimal and correct. No regressions. +1 for commit.
#9
in reply to:
↑ 4
;
follow-up:
↓ 13
@
4 weeks ago
Replying to johnbillion:
One concern I have is that with this change in place,
wp_generate_uuid4()now relies on a pluggable function. This means it would break any code that callswp_generate_uuid4()before theplugins_loadedhook. What's the likelihood of this? Low, but perhaps a plugin somewhere is generating a UUID for each request for logging purposes.
This could be protected against with:
<?php $randomizer = function_exists( 'wp_rand' ) ? 'wp_rand' : 'mt_rand'; //... $randomizer( 0, 0xffff );
I agree it's a low possibility, it can only really affect mu-plugins, plugins using the plugin_loaded (singular) hook and plugins running bootstrapping code upon file inclusion rather than a hook.
This ticket was mentioned in Slack in #core by audrasjb. View the logs.
4 weeks ago
#11
@
4 weeks ago
- Owner set to jorbin
- Status changed from new to assigned
Reassigning to @jorbin as per today's bug scrub.
#12
@
4 weeks ago
- Keywords commit added
Updated based on @peterwilsoncc's suggestion. Planning to commit soon.
#13
in reply to:
↑ 9
@
4 weeks ago
Isn't mt_rand the problem here, though? Wouldn't it be better to use random_int instead?
#14
follow-up:
↓ 15
@
4 weeks ago
@siliconforks mt_rand is a fallback in the scenario where someone is using wp_generate_uuid4 extremely early. If random_int would to be used, the logic in wp_rand would need to be duplicated including, a fallback for when random_int cannot find an appropriate source of randomness.
#15
in reply to:
↑ 14
@
4 weeks ago
Replying to jorbin:
@siliconforks
mt_randis a fallback in the scenario where someone is usingwp_generate_uuid4extremely early. Ifrandom_intwould to be used, the logic in wp_rand would need to be duplicated including, a fallback for whenrandom_intcannot find an appropriate source of randomness.
Well, I think the safest thing to do in that case would be to just throw the exception rather than keep handing out bogus UUIDs. But if you really want to guarantee that wp_generate_uuid4() never throws an exception and always returns something, you could simply
- Use
wp_rand()if that function exists - If
wp_rand()is not available, try callingrandom_int() - If
random_int()throws an exception, catch it and fall back tomt_rand()
#16
follow-up:
↓ 17
@
4 weeks ago
@siliconforks I don't think it's worth the additional logic.
The risk of collision with mt_rand() is present but very low. The updated function will only ever fire if a third party developers calls it somewhere between the mu-plugins includes and before the plugins_loaded hook. Which itself is rare.
Running the following using the existing (collision prone) code about tenish didn't result in any collisions.
<?php function gguid_collision_test() { $gguids = array(); $count = 0; $tries = 1000000; for ( $i = 0; $i < $tries; $i++ ) { $gguid = wp_generate_uuid4(); if ( isset( $gguids[ $gguid ] ) ) { echo "Collision detected after $count tries.\n"; return; } $gguids[ $gguid ] = true; $count++; } echo "No collisions detected after $count tries.\n"; } echo '<pre>'; gguid_collision_test(); exit;
As this isn't using mt_rand for cryptographic purposes, I think it's fine to use the KISS approach for a low^2 probability.
#17
in reply to:
↑ 16
@
4 weeks ago
Replying to peterwilsoncc:
Running the following using the existing (collision prone) code about tenish didn't result in any collisions.
I don't think you would ever get a collision that way, because the Mersenne Twister has a very long period. The problem is that the seed is only 32 bits.
As this isn't using
mt_randfor cryptographic purposes,
Well, it's a library function, so you don't really know what callers are using it for...
I think it's fine to use the KISS approach for a
low^2probability.
I think KISS is fine too - I would just call random_int() and then if it throws an exception, let the caller decide what to do with it.
We've noted this while investigating a support case - specific server/PHP config was (suspectedly) causing UUID to be duplicated at least once between a few random requests. And highlighting another report from @thanasisxan here https://developer.wordpress.org/reference/functions/wp_generate_uuid4/