WordPress.org

Make WordPress Core

Opened 6 months ago

Last modified 5 months ago

#25738 new defect (bug)

WP_HTTP uses transports that incorrectly claim to support a request

Reported by: dd32 Owned by:
Milestone: Future Release Priority: normal
Severity: normal Version:
Component: HTTP API Keywords:
Focuses: Cc:

Description (last modified by dd32)

Following on from #25716

We now have 2 HTTP transports, and 2 types of requests that we need to support:

HTTPHTTPS
cURL
PHP Streams

If we break it down further based on known compatibility issues, we can end up with this:

HTTP GETHTTPS GETHTTP POSTHTTPS POST
cURL no SSL
cURL w/ OpenSSL
cURL w/ GnuTLS~~
cURL 7.31.0 w/ OpenSSL
cURL 7.10.6 w/ any SSL~~
cURL + DNS timeout
PHP Streams no SSL
PHP Streams w/ OpenSSL

(✓ = should work, ✘ = doesn't work, ~ = known issues)

For background of cURL 7.31.0 and GnuTLS see #25716
For background of cURL 7.10.6 see #25751 - SSL subjectAltNames is not supported by 7.10.6 or earlier.

WP_HTTP currently

  • Tries to use cURL first, Streams second
  • Only uses cURL for HTTPS if it claims to support HTTPS, we have no check in place to verify it can make a SSL connection
  • Doesn't fall over to a 2nd transport in event of error
  • Doesn't disable a transport in event of continued troubles

I think there are a few things we can possibly do here:

  1. Disable a transport after multiple failed requests (See #16870)
  2. Add a parameter to try the other transport if available (and if that works, lock it to using that transport for the rest of that page load so we have less timeouts)

#1 would be a win for most users who have issues with cURL and DNS lookup failures, and any sites which have plugins that make a bunch of external HTTP calls.
#1 can also have false positives, for example, oEmbed calls to sites that are down could disable a sites ability to make requests, so we might want to limit it to *.wordpress.org requests as a way of saying "This site SHOULD be working"
#1 and #2 might not play nicely though if #2 was implemented, we might not have enough data to reliably trigger the blocking.

#2 would mean that we could use that for WordPress.org API requests, which would cause any HTTPS connections that fail via cURL succeeding through Streams (if supported).
Since it's possible for a site to have cURL with broken SSL, and no OpenSSL installed, we would still need extra code to handle a HTTPS -> HTTP fallback in the update API's though (perhaps this would be good as a wrapper "connect over SSL if possible, else, use HTTP)

We also probably don't want to re-try every HTTP request, which makes adding a parameter for it probably a better idea, the other option would be to simply detect certain cURL/streams error codes and re-try those.

potentially we wouldn't want it to re-try zip package downloads too

Attachments (2)

25738.diff (1.1 KB) - added by dd32 5 months ago.
25738.2.diff (1.2 KB) - added by dd32 5 months ago.

Download all attachments as: .zip

Change History (16)

comment:1 dd326 months ago

Evidently the cURL + DNS timeout case is still an issue, although it's not clear if streams would work in these cases, or if it was a transient issue:
http://wordpress.org/support/topic/rss-feed-loss-and-unable-to-search-plugins?replies=1
http://wordpress.org/support/topic/as-of-3-weeks-ago-emails-take-8-20-hours-to-arrive?replies=2

Last edited 6 months ago by dd32 (previous) (diff)

comment:2 follow-up: dd326 months ago

Looking at the chart above with the currently-known bugs, one might question why cURL is the first transport we try, it's three fold:

  • cURL was supposably going to have less bugs than our own PHP implementation. Over the years of WP_HTTP's lifespan we've replaced many functionalities of cURL with PHP code (redirects, some error handling, stream decompression/decoding) and more recently we've replaced the two PHP-implementations (fsockopen and fopen) with a new PHP Streams class (based on stream_context_create() ) in #25007
  • cURL is more memory efficient as most string operations happen in C, which supposably also makes cURL requests faster than Streams. This doesn't seem to hold true, as most requests complete in about the same time (+/- a %age point)
  • cURL had better SSL verification than Streams, which is no longer true as of #25007

so when it comes down to it, it's mostly a legacy ordering, but there exists code out there which expects HTTP requests to be made via cURL and doesn't take into account its possible for it to use something else.

comment:3 in reply to: ↑ 2 rmccue6 months ago

Replying to dd32:

  • cURL is more memory efficient as most string operations happen in C, which supposably also makes cURL requests faster than Streams. This doesn't seem to hold true, as most requests complete in about the same time (+/- a %age point)

cURL still is faster by my testing, but the disparity is no longer what it used to be with regards to speed and memory. I'd still prefer to use cURL personally, since they do have more experience with handling this scenario than we are.

comment:4 dd326 months ago

I'd still prefer to use cURL personally, since they do have more experience with handling this scenario than we are.

They are, but we're also at the mercy of hosts running severely outdated versions of cURL which miss the vital fixes that we can apply to our Streams/Socket transport immediately.

comment:5 dd325 months ago

  • Description modified (diff)

See Also: #26010

comment:6 dd325 months ago

  • Description modified (diff)

dd325 months ago

comment:7 soulseekah5 months ago

  • Cc gennady@… added

Only two false positives for some servers have been caught in 7.31.0 so far:

  • #25716 (lingering_close)
  • #26010 (HTTP version mismatch, possibly related to how forced 1.0 is interpreted by OpenSSL)

More false positives probably here: http://curl.haxx.se/bug/view.cgi?id=1249 (still down).

Blacklisting this single version seems fine, but only if there are other options. If streams are not available, 7.31.0 (with SSL) will still probably work, which is better than failing altogether perhaps. This might be too complicated to check for in reality.

But, yes, overall I agree that 7.31.0 (w/ OpenSSL) should make the blacklist due to the false positives it errors out with (https://github.com/bagder/curl/commit/8a7a277c086199b37c07a8e01165168037866f3e).

Last edited 5 months ago by soulseekah (previous) (diff)

dd325 months ago

comment:8 dd325 months ago

Blacklisting this single version seems fine, but only if there are other options. If streams are not available, 7.31.0 (with SSL) will still probably work, which is better than failing altogether perhaps. This might be too complicated to check for in reality.

It'll succeed against most Apache servers, and a few others, but fail against a bunch of others.

Most Plugins need to be written with the fact that not all servers can perform SSL connections reliably anyway, so blocking HTTPS connections via cURL on a host with a known bad version and letting it fallback to Streams+OpenSSL is a better option.
If OpenSSL isn't available for Streams on that server, they're out of luck and the plugin will have to resort to HTTP.

The tough part about blacklisting cURL 7.31.0 and cURL+GnuTLS (which both have the same issue) is that both of these can now communicate with api.wordpress.org securely, but if we block these, and the host doesn't have OpenSSL installed then they'll be left out in the cold with manual updates (Background Updates require HTTPS).
So it's a balance between core functionality and more reliable requests for plugins.

One alternative is that we only blacklist those 2 conditions when we can otherwise probably successfully make a connection with Streams: 25738.2.diff,

comment:9 rmccue5 months ago

What if we do something a little funky and change the order? We have a (probably) working implementation in Streams, and a half-working implementation in cURL, so it makes sense to try the probably working version first.

As to whether we can do stuff like this in the code: maybe. IIRC the order is hardcoded, but I'm sure we can work some black magic.

(As part of this, we could introduce a priority method to the transports to replace/coexist with the test method. By default, cURL = 10, fsockopen = 20, but if cURL looks like it could be broken, cURL = 30, e.g.)

Last edited 5 months ago by rmccue (previous) (diff)

comment:10 follow-up: dd325 months ago

I have been considering for a while of changing the order to Streams first for all connections.
Streams also has the benefit that it's supposably better at non-blocking connections (well, it doesn't block the connection for long..).
It may also mean that we start getting more bug reports for it, and we realise it's not as reliable as we previously believed (although I find this unlikely).

Changing the order has implications elsewhere though, as a result, I wouldn't want to change the order unless we also added a parameter to try both transports before failing (ie. for Update-related stuff), which I think would be good regardless.

We have a hard-coded order at present, in the past it was a hard-coded order based on blocking/non-blocking, we can easily add ordering based on a priority test or some such.

comment:11 in reply to: ↑ 10 rmccue5 months ago

Replying to dd32:

I have been considering for a while of changing the order to Streams first for all connections.

See also comment:2 ;)

Streams also has the benefit that it's supposably better at non-blocking connections (well, it doesn't block the connection for long..).
It may also mean that we start getting more bug reports for it, and we realise it's not as reliable as we previously believed (although I find this unlikely).

cURL has a fairly large install base, and the advantage of handling more in C gives us better performance. Adding parallel request support would help with this, but I don't think anyone has proposed that yet.

Changing the order has implications elsewhere though, as a result, I wouldn't want to change the order unless we also added a parameter to try both transports before failing (ie. for Update-related stuff), which I think would be good regardless.

I'm not sure special-casing it is a good idea; more options = worse.

comment:12 dd325 months ago

#26010 was marked as a duplicate.

comment:13 dd325 months ago

cURL has a fairly large install base, and the advantage of handling more in C gives us better performance. Adding parallel request support would help with this, but I don't think anyone has proposed that yet.

I'm yet to see any significant advantage in performance using cURL, doing unscientific testing (and excluding any times which were crazy-far from the rest) showed curl and streams generally within 10ms of each other (on a 400ms request) on small files, and within 200ms on downloading WordPress (~5s). When I tried on a slower connection (ironically, my VPS) cURL was averaging 33s, while Streams averaged 29s.

What I'm saying.. is any significant performance gains between the two are almost irrelevant for WordPress, if we were making thousands of requests per hour then squeezing 10ms out here and there makes a huge difference, but in this case it's more likely remote severs are the performance bottleneck.

I'm not sure special-casing it is a good idea; more options = worse.

In general I agree, however, this is a scenario where we need the most compatibility possible, if HTTPS via cURL fails, we need to try HTTPS via Streams before falling back to HTTP.
We also don't want to prevent a site from getting update notifications, which is why we have to have a HTTP fallback.

comment:14 dd325 months ago

  • Milestone changed from 3.8 to Future Release
Note: See TracTickets for help on using tickets.