WordPress.org

Make WordPress Core

Opened 4 years ago

Closed 3 years ago

Last modified 3 years ago

#36356 closed defect (bug) (fixed)

Function signature of wp_parse_url() does not match parse_url()

Reported by: johnbillion Owned by: peterwilsoncc
Milestone: 4.7 Priority: normal
Severity: normal Version: 4.4
Component: HTTP API Keywords: has-patch has-unit-tests
Focuses: Cc:
PR Number:

Description (last modified by johnbillion)

The function description for wp_parse_url() states that the function is:

A wrapper for PHP's parse_url() function that handles edgecases in < PHP 5.4.7

PHP's parse_url() function accepts an optional second parameter for specifying which URL part should be returned. This parameter is not supported in wp_parse_url(), therefore either wp_parse_url() should be modified so this parameter is accepted, or its documentation should be corrected.

The former is preferable, the latter is easier.

Attachments (10)

36356-wp_parse_url.patch (6.4 KB) - added by jrf 4 years ago.
Add $component parameter - includes unit tests
36356-wp_parse_url-revisited.patch (8.3 KB) - added by jrf 3 years ago.
Complete patch including all previous changes and passing for PHP 5.2
36356-tests.diff (869 bytes) - added by peterwilsoncc 3 years ago.
36356-wp_parse_url-additional-edge-cases.patch (11.7 KB) - added by jrf 3 years ago.
Fixes the additional test cases and some more.
36356-unit-tests-for-new-functions.patch (2.3 KB) - added by jrf 3 years ago.
Unit tests for the new functions introduced
36356-Fix-potential-undefined-notice.patch (2.7 KB) - added by jrf 3 years ago.
Fix for = $url[0]. Includes additional unit tests.
36356.diff (13.5 KB) - added by peterwilsoncc 3 years ago.
36356.2.diff (11.5 KB) - added by peterwilsoncc 3 years ago.
36356.3.diff (11.5 KB) - added by peterwilsoncc 3 years ago.
36356.3.2.diff (10.7 KB) - added by peterwilsoncc 3 years ago.

Download all attachments as: .zip

Change History (49)

#1 @johnbillion
4 years ago

  • Description modified (diff)

@jrf
4 years ago

Add $component parameter - includes unit tests

#2 @jrf
4 years ago

  • Keywords has-patch has-unit-tests added; needs-patch removed

#3 @jrf
4 years ago

The patch also includes a minor performance improvement to only execute the work-arounds for PHP < 5.4.7 compatibility when the code is actually run on a lower PHP version.

#5 @johnbillion
3 years ago

  • Keywords needs-testing added; 2nd-opinion removed
  • Milestone changed from Awaiting Review to 4.7
  • Owner set to johnbillion
  • Status changed from new to reviewing

#6 @johnbillion
3 years ago

  • Resolution set to fixed
  • Status changed from reviewing to closed

In 38449:

HTTP API: Add a $component parameter to wp_parse_url() to give it parity with PHP's parse_url() function.

Fixes #36356
Props jrf

#7 @johnbillion
3 years ago

  • Focuses docs removed
  • Keywords needs-patch added; has-patch removed
  • Resolution fixed deleted
  • Status changed from closed to reopened

This fatals on < 5.4.7 due to line 663 trying to unset array elements of $parts when $parts is a string.

Ref: https://travis-ci.org/aaronjorbin/develop.wordpress/jobs/156277415

#8 @johnbillion
3 years ago

  • Resolution set to fixed
  • Status changed from reopened to closed

In 38450:

HTTP API: Prevent a fatal error on PHP < 5.4.7 due to changes introduced in [38449].

Fixes #36356

#9 @johnbillion
3 years ago

  • Keywords needs-unit-tests added; needs-patch has-unit-tests needs-testing removed
  • Resolution fixed deleted
  • Status changed from closed to reopened

#10 @johnbillion
3 years ago

  • Resolution set to fixed
  • Status changed from reopened to closed

In 38452:

HTTP API: The tests for wp_parse_url() can't be strict on type because this causes the tests to fail on PHP 5.2 which, bizarrely, returns the results of parse_url() (when called with a $component parameter) in a different order to later PHP versions.

Fixes #36356

#11 @johnbillion
3 years ago

  • Keywords needs-patch added
  • Resolution fixed deleted
  • Status changed from closed to reopened

Okay we have an actual failure now, on PHP 5.2.

test_wp_parse_url_with_component with data set #17 ('/path/http://example.net/', 1, NULL)
Failed asserting that 'placeholder' matches expected null.
tests/phpunit/tests/http/http.php:125

#12 @johnbillion
3 years ago

In 38453:

HTTP API: Separate the test for wp_parse_url() with -1 as its component into a separate test, so the remaining tests can use strict type checking. This helps avoid gotches with the potentially empty values (ie. null) that we're testing for.

See #36356

#13 @johnbillion
3 years ago

In 38456:

HTTP API: Revert changes to wp_parse_url() while PHP 5.2 errors are investigated.

See #36356

#14 @jorbin
3 years ago

Anyone that has a 5.2 install handy and can help test, please do.

#15 @SergeyBiryukov
3 years ago

  • Owner changed from johnbillion to SergeyBiryukov
  • Status changed from reopened to reviewing

@jrf
3 years ago

Complete patch including all previous changes and passing for PHP 5.2

#16 @jrf
3 years ago

  • Keywords has-patch has-unit-tests added; needs-unit-tests needs-patch removed

Ok, I've finally had some time to take another look at this, including the build failures.

The patch I've just uploaded includes:

  • The original patch
  • The additional changes made by @johnbillion
  • A fix for the PHP 5.2 failure - or rather a code logic fix which will prevent the PHP 5.2 failure.

The code logic change basically looks at what $component parameter was passed and for relative URLs bows out early if the $component is either scheme or host - which won't be available for relative URLs anyway.
It also only removes the placeholders when the $component is the default -1 as that's the only time the $parts array will actually be an array.

Minor additional change: changed the protected static $full_test_url in the test file to a class constant.

The change has been tested on PHP 5.2 - 7.1 and the unit tests pass.

I've also had travis run the unit tests (just for the http group which is affected by this change) and the build passes there as well: https://travis-ci.org/jrfnl/develop.wordpress/builds/164150049

#17 @johnbillion
3 years ago

  • Keywords commit added

LGTM

#18 @johnbillion
3 years ago

  • Resolution set to fixed
  • Status changed from reviewing to closed

In 38694:

HTTP API: Add a $component parameter to wp_parse_url() to give it parity with PHP's parse_url() function.

Fixes #36356
Props jrf

#19 @peterwilsoncc
3 years ago

  • Keywords needs-unit-tests needs-patch added; has-patch has-unit-tests commit removed
  • Resolution fixed deleted
  • Status changed from closed to reopened

Reopening, PHP7 doesn't cope with protocol relative URLs in some circumstances.

This will need a dedicated test case in Tests_HTTP_HTTP::parse_url_testcases as it's only coincidence that caught it.

$ php -v
PHP 7.0.9-1+deb.sury.org~trusty+1 (cli) ( NTS )

$ wp eval 'var_dump( wp_parse_url( "//fonts.googleapis.com/css?family=Open+Sans:400&subset=latin" ) );'
array(2) {
  ["path"]=>
  string(26) "//fonts.googleapis.com/css"
  ["query"]=>
  string(33) "family=Open+Sans:400&subset=latin"
}

$ wp eval 'var_dump( wp_parse_url( "//fonts.googleapis.com/css/family/Open/Sans/400/subset/latin" ) );'
array(2) {
  ["host"]=>
  string(20) "fonts.googleapis.com"
  ["path"]=>
  string(38) "/css/family/Open/Sans/400/subset/latin"
}

#20 @peterwilsoncc
3 years ago

36356-tests.diff includes two forms of Google font urls that are particularly good at causing bugs in various versions of PHP.

ref: https://3v4l.org/IpTl1

@jrf
3 years ago

Fixes the additional test cases and some more.

@jrf
3 years ago

Unit tests for the new functions introduced

#21 @jrf
3 years ago

  • Keywords has-patch has-unit-tests added; needs-unit-tests needs-patch removed

Sometimes PHP just drives me up the wall....

@peterwilsoncc Thanks for the additional test cases. I've fixed the handling of these.

I've also come up with two more test cases which previously weren't handled at all:

http://www.test.com/path1/path2/&q=a
http://www.test.com/path1/path2/file.php&q=a

PHP would return an empty query for these and have the &q=a part within the path.

Both @peterwilsoncc's as well my own additional test cases are now handled correctly cross-version PHP 5.2 - 7.1 + HHVM by the additional patch which I just uploaded.

The patch includes two new internal functions to prevent code duplication.
The patch includes updated documentation to reflect the changes.

The second patch I've uploaded contains perfunctory unit tests for the new functions I introduced.

A travis build which covers the groups which contain the http tests as well as the template tests which contained the Google URLs (now also added to the http tests) passes on the combination of the two new patches (fix + additional tests): https://travis-ci.org/jrfnl/develop.wordpress/builds/164235958

#22 @johnbillion
3 years ago

The colon in those Google Fonts URLs is actually an illegal character; the colon should be encoded. However, I'm not about to pick that fight.

#23 @jrf
3 years ago

@johnbillion You are completely correct and waiting for that to be solved is another matter. In the mean time, it will work.

One thing I'm now wondering though: should we maybe account for the fact that it is an illegal character and encode it within this function so the output will not contain the illegal character ?

#24 @johnbillion
3 years ago

I think that's beyond the scope of this particular ticket. Maybe it could happen in esc_url() though.

#25 @jrf
3 years ago

FYI: in the mean time I've also run a complete travis build on the master + patch (not just for the affected groups) and that passes as well: https://travis-ci.org/jrfnl/develop.wordpress

#26 @swissspidy
3 years ago

In the meantime we could also resolve this in wp_dependencies_unique_hosts() or in the relevant test so the build is not failing anymore

#27 @jrf
3 years ago

@swissspidy The 36356-wp_parse_url-additional-edge-cases.patch I send in two days ago will fix the build - see the travis build report: https://travis-ci.org/jrfnl/develop.wordpress/builds/164317694 - this is a complete run on master with both patches from two days ago applied.
The additional 36356-unit-tests-for-new-functions.patch contains unit tests for the two new functions the edge cases patch introduces.

#28 @peterwilsoncc
3 years ago

@jrf The === $url[0] in that patch will throw a notice if an empty string is passed to the function, are you able to add a couple of invalid use cases while refining the tests? I think this has been in WP since forever and I didn't think to test it over the weekend.

From rudimentary testing, placeholder schemes and hosts seem the most reliable across versions of PHP. Can they be used and unset generally? Version sniffing is proving unreliable.

#29 @jrf
3 years ago

@peterwilsoncc

The === $url[0] in that patch will throw a notice if an empty string is passed to the function, are you able to add a couple of invalid use cases while refining the tests? I think this has been in WP since forever and I didn't think to test it over the weekend.

While I agree that's an issue, it's not an issue with the patch as it's been - like you say - in WP since forever.

I detect a distinct case of scope creep here. This ticket is about adding the $component parameter (without breaking anything else), nothing more.
Variable & type checking has never been WP's strong suite, this function is no exception and the adding of the $component parameter has already made improvements in that respect (maybe even more than the ticket warrants).

I will add the empty string check in as it's an easy fix, but for anything else, I'd suggest opening a separate ticket.

From rudimentary testing, placeholder schemes and hosts seem the most reliable across versions of PHP. Can they be used and unset generally? Version sniffing is proving unreliable.

Version sniffing was already removed in my edge-cases patch from three days ago, so I don't understand what you're talking about.
Similarly, placeholder schemes and hosts are already being used and unset when appropriately, so what is your point here ?

Last edited 3 years ago by jrf (previous) (diff)

@jrf
3 years ago

Fix for = $url[0]. Includes additional unit tests.

#30 @jrf
3 years ago

Additional patch to fix the potential undefined notice uploaded.

A travis unit test run on the current master + last three patches (which have not been committed to core yet) for just the relevant groups has been run and passes: https://travis-ci.org/jrfnl/develop.wordpress/builds/164806773

A complete travis unit test run on the current master + last three patches is currently running and I expect it to pass as well: https://travis-ci.org/jrfnl/develop.wordpress/builds/164806903

#31 @johnbillion
3 years ago

  • Owner changed from SergeyBiryukov to peterwilsoncc
  • Status changed from reopened to assigned

@peterwilsoncc Can I assign you to take a look at this today please and expand on your feedback?

@peterwilsoncc
3 years ago

#32 @peterwilsoncc
3 years ago

@jrf Looks like I was looking at the wrong three day old version file re: version compare.

Combined version of your patches in 36356.diff.

Looking over it this-eventing.

#33 @peterwilsoncc
3 years ago

@johnbillion 36356.2.diff incorporates my second comment above about simplfying matters by using placeholder schemes and hosts as a matter of course.

@jrf I've looked over the spec and can't find anything about query parameters beginning with &, are you able to send a pointer.

Somewhat complicated by the fact another part of the build is failing, I get consistent results at https://travis-ci.org/peterwilsoncc/WordPress/builds/164890491

#34 @jrf
3 years ago

@peterwilsoncc

I've looked over the spec and can't find anything about query parameters beginning with &, are you able to send a pointer.

This is similar to other invalidly formatted urls being passed to this function and being corrected by it. Invalid formats are not in the specs.

As you removed the fix now, you might as well remove the tests for it as they do not have any added value unless the fix for it is in (and the unfixed results just don't make sense).

I get consistent results

The HHVM fix has been removed in your patch - will that still be caught now ? HHVM will sometimes take the :### in the query string of the google font urls and use it as the port. See my previous fix for that.

Looked over the rest of the code.

I generally like the simplification. Wished I dared to do that. I considered it, but my experience with contributing to WP core has been that any time I change things drastically, the change gets rejected out of hand, so glad you have the whuffies to do it.

Only other point - I would suggest wrapping the unset foreach at the end of the function within a if ( false !== $parts ) {}

#35 @peterwilsoncc
3 years ago

36356.3.2.diff for history while I wait for the full test suite to run.

@jrf Thanks for looking over that, I've added the check before unsetting.

HHVM isn't failing on the Google URLs so it looks like the placeholders are covering that.

I'll leave the ampersand as is, php, chrome network tools and firefox inspector all show it as part of the path. So it's at least consistent. I've removed these tests.

#36 @peterwilsoncc
3 years ago

  • Resolution set to fixed
  • Status changed from assigned to closed

In 38726:

HTTP API: Simplify wp_parse_url() to ensure consistent results.

[38694] revealed some URL formats were been parsed incorrectly, including those used by Google Fonts. This change simplifies the function to use placeholder values which cause PHP's parsing to behave consistently.

Props jrf, peterwilsoncc.
Fixes #36356.

#37 follow-up: @jrf
3 years ago

@peterwilsoncc

I've added the check before unsetting.

Except you are returning $parts early in that case which will give issues when a $component has been requested (in that case, they would expect null, not false as a return).

Either return using _get_component_from_parsed_url_array( $parts, $component ); or wrap the foreach() in the false check like I suggested.

Last edited 3 years ago by jrf (previous) (diff)

#38 in reply to: ↑ 37 @peterwilsoncc
3 years ago

Replying to jrf:

@peterwilsoncc
Except you are returning $parts early in that case which will give issues when a $component has been requested (in that case, they would expect null, not false as a return).

The doco is ambiguous in the circumstance, normally I'd clarify but wanted to get the build passing

I'm happy to look over it, to clarify you thoughts:

  • with component: component if exists
  • with component: null on parse error
  • no component: false on parse error
  • no component: array of set values

is this correct?

P

#39 @jrf
3 years ago

@peterwilsoncc

The doco is ambiguous in the circumstance, normally I'd clarify but wanted to get the build passing

Very true. I've just run some tests and to my surprise on malformed urls + a component parameter parse_url() returns false, so the current behaviour of wp_parse_url() is in line with that.

All the same, it makes for unnecessarily extra complicated variable checking of the return value of wp_parse_url() in the case of a $component parameter being passed.

// Current check would have to be:
$component = wp_parse_url( $url, $component_constant );
if ( isset($component) && false !== $component ) {
	// continue processing
}

// What I suggested would simplify this to:
$component = wp_parse_url( $url, $component_constant );
if ( isset($component) ) {
	// continue processing
}

So to clarify - this was what I suggested:

  • with component: component if exists
  • with component: null on parse error and if the component does not exist
  • no component: false on parse error
  • no component: array of set values
Note: See TracTickets for help on using tickets.