#36356 closed defect (bug) (fixed)
Function signature of wp_parse_url() does not match parse_url()
Reported by: | johnbillion | Owned by: | peterwilsoncc |
---|---|---|---|
Milestone: | 4.7 | Priority: | normal |
Severity: | normal | Version: | 4.4 |
Component: | HTTP API | Keywords: | has-patch has-unit-tests |
Focuses: | Cc: |
Description (last modified by )
The function description for wp_parse_url()
states that the function is:
A wrapper for PHP's parse_url() function that handles edgecases in < PHP 5.4.7
PHP's parse_url()
function accepts an optional second parameter for specifying which URL part should be returned. This parameter is not supported in wp_parse_url()
, therefore either wp_parse_url()
should be modified so this parameter is accepted, or its documentation should be corrected.
The former is preferable, the latter is easier.
Attachments (10)
Change History (49)
#3
@
8 years ago
The patch also includes a minor performance improvement to only execute the work-arounds for PHP < 5.4.7 compatibility when the code is actually run on a lower PHP version.
#5
@
8 years ago
- Keywords needs-testing added; 2nd-opinion removed
- Milestone changed from Awaiting Review to 4.7
- Owner set to johnbillion
- Status changed from new to reviewing
#7
@
8 years ago
- Focuses docs removed
- Keywords needs-patch added; has-patch removed
- Resolution fixed deleted
- Status changed from closed to reopened
This fatals on < 5.4.7 due to line 663 trying to unset array elements of $parts
when $parts
is a string.
Ref: https://travis-ci.org/aaronjorbin/develop.wordpress/jobs/156277415
#9
@
8 years ago
- Keywords needs-unit-tests added; needs-patch has-unit-tests needs-testing removed
- Resolution fixed deleted
- Status changed from closed to reopened
#11
@
8 years ago
- Keywords needs-patch added
- Resolution fixed deleted
- Status changed from closed to reopened
Okay we have an actual failure now, on PHP 5.2.
test_wp_parse_url_with_component with data set #17 ('/path/http://example.net/', 1, NULL) Failed asserting that 'placeholder' matches expected null. tests/phpunit/tests/http/http.php:125
#15
@
8 years ago
- Owner changed from johnbillion to SergeyBiryukov
- Status changed from reopened to reviewing
#16
@
8 years ago
- Keywords has-patch has-unit-tests added; needs-unit-tests needs-patch removed
Ok, I've finally had some time to take another look at this, including the build failures.
The patch I've just uploaded includes:
- The original patch
- The additional changes made by @johnbillion
- A fix for the PHP 5.2 failure - or rather a code logic fix which will prevent the PHP 5.2 failure.
The code logic change basically looks at what $component
parameter was passed and for relative URLs bows out early if the $component
is either scheme or host - which won't be available for relative URLs anyway.
It also only removes the placeholders when the $component
is the default -1
as that's the only time the $parts
array will actually be an array.
Minor additional change: changed the protected static $full_test_url
in the test file to a class constant.
The change has been tested on PHP 5.2 - 7.1 and the unit tests pass.
I've also had travis run the unit tests (just for the http
group which is affected by this change) and the build passes there as well: https://travis-ci.org/jrfnl/develop.wordpress/builds/164150049
#19
@
8 years ago
- Keywords needs-unit-tests needs-patch added; has-patch has-unit-tests commit removed
- Resolution fixed deleted
- Status changed from closed to reopened
Reopening, PHP7 doesn't cope with protocol relative URLs in some circumstances.
This will need a dedicated test case in Tests_HTTP_HTTP::parse_url_testcases
as it's only coincidence that caught it.
$ php -v PHP 7.0.9-1+deb.sury.org~trusty+1 (cli) ( NTS ) $ wp eval 'var_dump( wp_parse_url( "//fonts.googleapis.com/css?family=Open+Sans:400&subset=latin" ) );' array(2) { ["path"]=> string(26) "//fonts.googleapis.com/css" ["query"]=> string(33) "family=Open+Sans:400&subset=latin" } $ wp eval 'var_dump( wp_parse_url( "//fonts.googleapis.com/css/family/Open/Sans/400/subset/latin" ) );' array(2) { ["host"]=> string(20) "fonts.googleapis.com" ["path"]=> string(38) "/css/family/Open/Sans/400/subset/latin" }
#20
@
8 years ago
36356-tests.diff includes two forms of Google font urls that are particularly good at causing bugs in various versions of PHP.
#21
@
8 years ago
- Keywords has-patch has-unit-tests added; needs-unit-tests needs-patch removed
Sometimes PHP just drives me up the wall....
@peterwilsoncc Thanks for the additional test cases. I've fixed the handling of these.
I've also come up with two more test cases which previously weren't handled at all:
http://www.test.com/path1/path2/&q=a http://www.test.com/path1/path2/file.php&q=a
PHP would return an empty query
for these and have the &q=a
part within the path
.
Both @peterwilsoncc's as well my own additional test cases are now handled correctly cross-version PHP 5.2 - 7.1 + HHVM by the additional patch which I just uploaded.
The patch includes two new internal functions to prevent code duplication.
The patch includes updated documentation to reflect the changes.
The second patch I've uploaded contains perfunctory unit tests for the new functions I introduced.
A travis build which covers the groups which contain the http
tests as well as the template
tests which contained the Google URLs (now also added to the http
tests) passes on the combination of the two new patches (fix + additional tests): https://travis-ci.org/jrfnl/develop.wordpress/builds/164235958
#22
@
8 years ago
The colon in those Google Fonts URLs is actually an illegal character; the colon should be encoded. However, I'm not about to pick that fight.
#23
@
8 years ago
@johnbillion You are completely correct and waiting for that to be solved is another matter. In the mean time, it will work.
One thing I'm now wondering though: should we maybe account for the fact that it is an illegal character and encode it within this function so the output will not contain the illegal character ?
#24
@
8 years ago
I think that's beyond the scope of this particular ticket. Maybe it could happen in esc_url()
though.
#25
@
8 years ago
FYI: in the mean time I've also run a complete travis build on the master + patch (not just for the affected groups) and that passes as well: https://travis-ci.org/jrfnl/develop.wordpress
#26
@
8 years ago
In the meantime we could also resolve this in wp_dependencies_unique_hosts()
or in the relevant test so the build is not failing anymore
#27
@
8 years ago
@swissspidy The 36356-wp_parse_url-additional-edge-cases.patch
I send in two days ago will fix the build - see the travis build report: https://travis-ci.org/jrfnl/develop.wordpress/builds/164317694 - this is a complete run on master with both patches from two days ago applied.
The additional 36356-unit-tests-for-new-functions.patch
contains unit tests for the two new functions the edge cases patch introduces.
#28
@
8 years ago
@jrf The === $url[0]
in that patch will throw a notice if an empty string is passed to the function, are you able to add a couple of invalid use cases while refining the tests? I think this has been in WP since forever and I didn't think to test it over the weekend.
From rudimentary testing, placeholder schemes and hosts seem the most reliable across versions of PHP. Can they be used and unset generally? Version sniffing is proving unreliable.
#29
@
8 years ago
@peterwilsoncc
The
=== $url[0]
in that patch will throw a notice if an empty string is passed to the function, are you able to add a couple of invalid use cases while refining the tests? I think this has been in WP since forever and I didn't think to test it over the weekend.
While I agree that's an issue, it's not an issue with the patch as it's been - like you say - in WP since forever.
I detect a distinct case of scope creep here. This ticket is about adding the $component
parameter (without breaking anything else), nothing more.
Variable & type checking has never been WP's strong suite, this function is no exception and the adding of the $component
parameter has already made improvements in that respect (maybe even more than the ticket warrants).
I will add the empty string check in as it's an easy fix, but for anything else, I'd suggest opening a separate ticket.
From rudimentary testing, placeholder schemes and hosts seem the most reliable across versions of PHP. Can they be used and unset generally? Version sniffing is proving unreliable.
Version sniffing was already removed in my edge-cases patch from three days ago, so I don't understand what you're talking about.
Similarly, placeholder schemes and hosts are already being used and unset when appropriately, so what is your point here ?
#30
@
8 years ago
Additional patch to fix the potential undefined notice uploaded.
A travis unit test run on the current master + last three patches (which have not been committed to core yet) for just the relevant groups has been run and passes: https://travis-ci.org/jrfnl/develop.wordpress/builds/164806773
A complete travis unit test run on the current master + last three patches is currently running and I expect it to pass as well: https://travis-ci.org/jrfnl/develop.wordpress/builds/164806903
#31
@
8 years ago
- Owner changed from SergeyBiryukov to peterwilsoncc
- Status changed from reopened to assigned
@peterwilsoncc Can I assign you to take a look at this today please and expand on your feedback?
#32
@
8 years ago
@jrf Looks like I was looking at the wrong three day old version file re: version compare.
Combined version of your patches in 36356.diff.
Looking over it this-eventing.
#33
@
8 years ago
@johnbillion 36356.2.diff incorporates my second comment above about simplfying matters by using placeholder schemes and hosts as a matter of course.
@jrf I've looked over the spec and can't find anything about query parameters beginning with &
, are you able to send a pointer.
Somewhat complicated by the fact another part of the build is failing, I get consistent results at https://travis-ci.org/peterwilsoncc/WordPress/builds/164890491
#34
@
8 years ago
@peterwilsoncc
I've looked over the spec and can't find anything about query parameters beginning with &, are you able to send a pointer.
This is similar to other invalidly formatted urls being passed to this function and being corrected by it. Invalid formats are not in the specs.
As you removed the fix now, you might as well remove the tests for it as they do not have any added value unless the fix for it is in (and the unfixed results just don't make sense).
I get consistent results
The HHVM fix has been removed in your patch - will that still be caught now ? HHVM will sometimes take the :###
in the query string of the google font urls and use it as the port. See my previous fix for that.
Looked over the rest of the code.
I generally like the simplification. Wished I dared to do that. I considered it, but my experience with contributing to WP core has been that any time I change things drastically, the change gets rejected out of hand, so glad you have the whuffies to do it.
Only other point - I would suggest wrapping the unset foreach
at the end of the function within a if ( false !== $parts ) {}
#35
@
8 years ago
36356.3.2.diff for history while I wait for the full test suite to run.
@jrf Thanks for looking over that, I've added the check before unsetting.
HHVM isn't failing on the Google URLs so it looks like the placeholders are covering that.
I'll leave the ampersand as is, php, chrome network tools and firefox inspector all show it as part of the path. So it's at least consistent. I've removed these tests.
#37
follow-up:
↓ 38
@
8 years ago
I've added the check before unsetting.
Except you are returning $parts
early in that case which will give issues when a $component
has been requested (in that case, they would expect null, not false as a return).
Either return using _get_component_from_parsed_url_array( $parts, $component );
or wrap the foreach()
in the false
check like I suggested.
#38
in reply to:
↑ 37
@
8 years ago
Replying to jrf:
@peterwilsoncc
Except you are returning$parts
early in that case which will give issues when a$component
has been requested (in that case, they would expect null, not false as a return).
The doco is ambiguous in the circumstance, normally I'd clarify but wanted to get the build passing
I'm happy to look over it, to clarify you thoughts:
- with component: component if exists
- with component: null on parse error
- no component: false on parse error
- no component: array of set values
is this correct?
P
#39
@
8 years ago
@peterwilsoncc
The doco is ambiguous in the circumstance, normally I'd clarify but wanted to get the build passing
Very true. I've just run some tests and to my surprise on malformed urls + a component parameter parse_url()
returns false
, so the current behaviour of wp_parse_url()
is in line with that.
All the same, it makes for unnecessarily extra complicated variable checking of the return value of wp_parse_url()
in the case of a $component
parameter being passed.
// Current check would have to be: $component = wp_parse_url( $url, $component_constant ); if ( isset($component) && false !== $component ) { // continue processing } // What I suggested would simplify this to: $component = wp_parse_url( $url, $component_constant ); if ( isset($component) ) { // continue processing }
So to clarify - this was what I suggested:
- with component: component if exists
- with component: null on parse error and if the component does not exist
- no component: false on parse error
- no component: array of set values
Add $component parameter - includes unit tests