Opened 6 years ago
Last modified 2 days ago
#49385 new defect (bug)
wp_remote_get() cannot retrieve webcal URIs
| Reported by: | | Owned by: | |
|---|---|---|---|
| Milestone: | Awaiting Review | Priority: | normal |
| Severity: | normal | Version: | |
| Component: | HTTP API | Keywords: | needs-unit-tests has-patch |
| Focuses: | | Cc: | |
Description
In #31666, webcal was added to the list of allowed protocols. Unfortunately this does not bubble up into the HTTP API for remote requests, and wp_remote_get() on a webcal:// URI will fail with:
object(WP_Error)[532]
public 'errors' =>
array (size=1)
'http_request_failed' =>
array (size=1)
0 => string 'A valid URL was not provided.' (length=29)
public 'error_data' =>
array (size=0)
empty
Here is my proof-of-concept to show off the failure:
add_action( 'plugins_loaded', function() {
// Public iCloud calendar I created
$uri = 'webcal://p41-caldav.icloud.com/published/2/AAAAAAAAAAAAAAAAAAAAAF-eqSypTVlehAPwNTiPeHHBkTEvCi1qK6G4LDcU1Fr6AKLM-yaJrbRrhSSGMrjSbAxJZJ6TibzOCKLh0xBSpKI';
// Regular remote get call
$get = wp_remote_get( $uri );
// Dump results
var_dump( $get ); die;
} );
Change History (5)
#1
@
6 years ago
#2
@
6 years ago
For context & clarity, webcal schemes being supported inside of the_content is in no way directly connected to the HTTP API itself.
All that ticket 31666 exhibits is a willingness to explicitly support them in the WordPress project. I hope that will help justify further code changes to accommodate developers who want to use the recommended APIs to interact with remote webcal:// URIs.
#3
@
6 years ago
If WordPress wanted to connect the HTTP API to its allowed protocols, the code would look something like:
$scheme = parse_url( $url, PHP_URL_SCHEME );
$allowed_protocols = wp_allowed_protocols();
if ( empty( $url ) || ! in_array( $scheme, $allowed_protocols, true ) ) {
If it simply wanted to maintain essentially any scheme, it would look something like:
$scheme = parse_url( $url, PHP_URL_SCHEME );
if ( empty( $url ) || empty( $scheme ) ) {
PHP.net says:
This function is intended specifically for the purpose of parsing URLs and not URIs. However, to comply with PHP's backwards compatibility requirements it makes an exception for the file:// scheme where triple slashes (file:///...) are allowed. For any other scheme this is invalid.
So... whether WordPress considers webcal:// a URL or a URI scheme may also be up for discussion.
I believe for the purposes laid out here it is a URI scheme that is intended to be allowed, making my first code change recommendation the most accurate one I can imagine at this time.
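To make the first recommendation concrete, here is a minimal, self-contained sketch of a scheme allow-list check. It is not core code: example_allowed_protocols() is a hypothetical stand-in for wp_allowed_protocols() (whose real list is longer and filterable), and example_url_scheme_is_allowed() is a name invented for this illustration.

```php
<?php
// Hypothetical stand-in for wp_allowed_protocols(); assumed list, for illustration only.
function example_allowed_protocols(): array {
	return array( 'http', 'https', 'webcal' );
}

// Returns true only when $url carries a scheme found in the allowed list.
function example_url_scheme_is_allowed( string $url ): bool {
	if ( '' === $url ) {
		return false;
	}

	// parse_url() returns null when no scheme is present and false on malformed input.
	$scheme = parse_url( $url, PHP_URL_SCHEME );
	if ( ! is_string( $scheme ) ) {
		return false;
	}

	// Schemes are case-insensitive, so compare in lowercase.
	return in_array( strtolower( $scheme ), example_allowed_protocols(), true );
}
```

With this assumed list, a webcal:// URI passes the check while a scheme that is not listed (gopher://, for example) and a scheme-less path are both rejected. Passing the check says nothing about whether a transport can actually fetch the URI.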
I don't have a core development checkout on this computer right now to make the patches myself, but I'll try to remember to circle back here once I do.
#4
@
6 years ago
I just read the second section of the docblock for WP_Http::request(), which says:
 * Send an HTTP request to a URI.
 *
 * Please note: The only URI that are supported in the HTTP Transport implementation
 * are the HTTP and HTTPS protocols.
So, even though it works for URIs, it only works for HTTP and HTTPS, which really stinks.
Perhaps there is an opportunity to introduce a filter here, allowing plugins to use this API with their own unsupported protocols and at their own risk.
This ticket was mentioned in PR #12078 on WordPress/wordpress-develop by @yashyadav247.
2 days ago
#5
- Keywords has-patch added; needs-patch removed
webcal:// and webcals:// URLs are allowed in HTML via wp_allowed_protocols() (#31666), but the HTTP API still rejects them. wp_remote_get() and wp_safe_remote_get() fail with WP_Error: A valid URL was not provided. because wp_kses_bad_protocol() only permits http/https (and ssl in WP_Http::request()), and HTTP transports only support http/https.
This change:
wp_http_normalize_url() — Rewrites webcal:// and webcals:// to https:// (case-insensitive) before validation and transport. Exposes the http_normalize_url filter so sites that serve calendars over plain HTTP can map to http:// if needed.
wp_http_validate_url() — Validates the normalized URL (SSRF checks run against the real https:// target) but returns the original URL when safe, so callers can keep storing/displaying webcal:// links.
http_allowed_protocols filter — Lets plugins extend allowed schemes in validation/wp_kses_bad_protocol() at their own risk. Documented that transports still only execute http/https; webcal is intentionally not added to the default list because normalization handles it without accepting raw webcal:// in the transport path.
WP_Http::request() — Normalizes the URL immediately after pre_http_request, before kses and wp_http_validate_url().
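The normalize-then-validate flow described above could look roughly like the following sketch. This is not the code from the pull request: both function names are hypothetical, and the validation step is reduced to a scheme and host check, whereas the real wp_http_validate_url() performs further checks (including the SSRF protections mentioned above) that are omitted here.

```php
<?php
// Hypothetical helper: rewrite a leading webcal:// or webcals:// to https://.
function example_normalize_webcal_url( string $url ): string {
	// The "i" modifier makes the scheme match case-insensitive; limit to one replacement.
	$normalized = preg_replace( '#^webcals?://#i', 'https://', $url, 1 );

	// preg_replace() returns null on error; fall back to the input in that case.
	return is_string( $normalized ) ? $normalized : $url;
}

// Hypothetical helper: validate the normalized form, but hand back the original URL.
// Returns the original string when acceptable, false otherwise.
function example_validate_url( string $url ) {
	$normalized = example_normalize_webcal_url( $url );
	$scheme     = parse_url( $normalized, PHP_URL_SCHEME );
	$host       = parse_url( $normalized, PHP_URL_HOST );

	if ( ! is_string( $scheme ) || ! is_string( $host ) ) {
		return false;
	}

	// Only schemes the transports can execute are accepted after normalization.
	if ( ! in_array( strtolower( $scheme ), array( 'http', 'https' ), true ) ) {
		return false;
	}

	return $url;
}
```

Returning the original string rather than the normalized one is what lets a caller keep storing or displaying the webcal:// link while the request itself goes out over https://.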
Trac ticket: https://core.trac.wordpress.org/ticket/49385
Use of AI Tools
AI assistance: Yes
Tool(s): Cursor (Auto)
Used for: Implementation plan, code changes, unit tests, and this PR description; changes were reviewed against the Trac ticket and existing HTTP API patterns.
This Pull Request is for code review only. Please keep all other discussion in the Trac ticket. Do not merge this Pull Request. See GitHub Pull Requests for Code Review in the Core Handbook for more details.
Test plan
vendor/bin/phpunit tests/phpunit/tests/http/http.php --filter 49385 (or npm run test:php -- --filter 49385 tests/phpunit/tests/http/http.php)
Confirm wp_http_normalize_url( 'webcal://example.com/feed.ics' ) returns https://example.com/feed.ics
Confirm wp_http_validate_url( 'webcal://example.com/caniload.php' ) returns the original webcal:// URL
Confirm wp_http_validate_url( 'ftp://example.com/caniload.php' ) still returns false
Confirm wp_remote_get( 'webcal://example.com/feed.ics' ) no longer returns A valid URL was not provided. (outbound request uses https://)
Confirm wp_safe_remote_get( 'webcal://example.com/feed.ics' ) passes validation and fetches over HTTPS
Run broader HTTP API tests: phpunit --group http
This bug exists because PHP's parse_url() function apparently does not consider webcal a valid scheme, even though WordPress does.
Inside WP_Http::request(), parse_url() is used on the webcal:// URL, which does not return a scheme. $arrURL['scheme'] ends up being empty, and a WP_Error() is returned.
Oddly, if you attempt to use parse_url() with the PHP_URL_SCHEME flag, it will correctly identify the webcal scheme:
add_action( 'plugins_loaded', function() {
// Public iCloud calendar I created
$uri = 'webcal://p41-caldav.icloud.com/published/2/AAAAAAAAAAAAAAAAAAAAAF-eqSypTVlehAPwNTiPeHHBkTEvCi1qK6G4LDcU1Fr6AKLM-yaJrbRrhSSGMrjSbAxJZJ6TibzOCKLh0xBSpKI';
// Extract only the scheme
$scheme = parse_url( $uri, PHP_URL_SCHEME );
// Dump results
var_dump( $scheme ); die;
} );
No doubt this is an error/oddity in PHP's implementation of parse_url(), but because WordPress made the decision to explicitly support it, I believe there is an obligation to follow through with that where PHP itself may be failing it.