Make WordPress Core

Opened 9 years ago

Closed 9 years ago

Last modified 9 years ago

#32112 closed defect (bug) (fixed)

wp_get_attachment_url returns https when it should not

Reported by: zabatonni's profile zabatonni Owned by: boonebgorges's profile boonebgorges
Milestone: 4.2.2 Priority: normal
Severity: normal Version: 4.2
Component: Media Keywords: fixed-major
Focuses: Cc:

Description

when I have set my
WP_CONTENT_URL constant as http://site.tld/wp-content
WP_HOME as http://site.tld
WP_SITEURL as https://site.tld

I expect to have only http urls, scripts and images on frontend but after updating WP I'm getting https image urls instead. I'm using https only in admin for better protection.

Attachments (3)

32112.diff (4.9 KB) - added by boonebgorges 9 years ago.
32112.2.diff (5.4 KB) - added by boonebgorges 9 years ago.
32112.3.diff (729 bytes) - added by boonebgorges 9 years ago.

Download all attachments as: .zip

Change History (35)

#1 @SergeyBiryukov
9 years ago

  • Component changed from General to Media

Related: #15928

#2 follow-up: @boonebgorges
9 years ago

See #15928 [31614]. This is, at least in part, by design.

Can you give more details about the context in which wp_get_attachment_url() is returning HTTPS links? Are these attachment links that are coded into your theme? Or were the URLs generated when writing a post (through Add Media)?

#3 in reply to: ↑ 2 ; follow-up: @zabatonni
9 years ago

Replying to boonebgorges:

See #15928 [31614]. This is, at least in part, by design.

Can you give more details about the context in which wp_get_attachment_url() is returning HTTPS links? Are these attachment links that are coded into your theme? Or were the URLs generated when writing a post (through Add Media)?

Yes, its in Add media so it influences all wysiwyg and meta boxes. Before it worked depending that WP_CONTENT_URL constant.

#4 in reply to: ↑ 3 ; follow-ups: @boonebgorges
9 years ago

Replying to zabatonni:

Yes, its in Add media so it influences all wysiwyg and meta boxes. Before it worked depending that WP_CONTENT_URL constant.

Gotcha. The previous behavior was that you'd be at https://site.tld/wp-admin, and your images would be inserted with http://site.tld/ src attributes. This was a bug, because it meant that the security of your https request was compromised by the presence of non-https resources.

The question, then, is front-end behavior. The decision in #15928 was that it's better to fix the bug just described and deliver https assets on the front end. Is this fix causing side effects for you? In other words, what is the harm in delivering images over HTTPS when it's possible to do so? You can, of course, change this behavior with a rewrite rule, or with a filter on 'the_content', for the purposes of your own site, but I'm wondering whether there's some general reason why HTTPS assets would be harmful in these cases.

#5 in reply to: ↑ 4 @zabatonni
9 years ago

Replying to boonebgorges:

Replying to zabatonni:

Yes, its in Add media so it influences all wysiwyg and meta boxes. Before it worked depending that WP_CONTENT_URL constant.

Gotcha. The previous behavior was that you'd be at https://site.tld/wp-admin, and your images would be inserted with http://site.tld/ src attributes. This was a bug, because it meant that the security of your https request was compromised by the presence of non-https resources.

The question, then, is front-end behavior. The decision in #15928 was that it's better to fix the bug just described and deliver https assets on the front end. Is this fix causing side effects for you? In other words, what is the harm in delivering images over HTTPS when it's possible to do so? You can, of course, change this behavior with a rewrite rule, or with a filter on 'the_content', for the purposes of your own site, but I'm wondering whether there's some general reason why HTTPS assets would be harmful in these cases.

Yes, side effect is that it won't load at all due to ssl cert.
Wouldn't it be better to use all attachments schemeless? like:

//domain.tld/wp-content/uploads..

domain.tld/wp-content/uploads/... then it would load correctly on both sides. I would apply same for permalinks and other stuff generating urls.

Last edited 9 years ago by zabatonni (previous) (diff)

#6 follow-up: @boonebgorges
9 years ago

Yes, side effect is that it won't load at all due to ssl cert.

Can you explain this in more detail? If https://site.tld/wp-admin/ is available, then <img src="https://site.tld/uploads/2015/whatever.jpg" /> should work.

Wouldn't it be better to use all attachments schemeless?

Yeah, we considered this, but it's hard to do it at the level of wp_get_attachment_src(). That function may be used to generate URLs used in emails, RSS feeds, and other places where clients may not understand schemeless URLs.

#7 @Ipstenu
9 years ago

I've run into a couple places where caching isn't acting as expected because it's an https link (Pagespeed and Jetpack's Photon module).

#8 in reply to: ↑ 6 @zabatonni
9 years ago

  • Keywords close added

Replying to boonebgorges:

Yes, side effect is that it won't load at all due to ssl cert.

Can you explain this in more detail? If https://site.tld/wp-admin/ is available, then <img src="https://site.tld/uploads/2015/whatever.jpg" /> should work.

Wouldn't it be better to use all attachments schemeless?

Yeah, we considered this, but it's hard to do it at the level of wp_get_attachment_src(). That function may be used to generate URLs used in emails, RSS feeds, and other places where clients may not understand schemeless URLs.

Ok, I get it. Lets close this. For now I'll be using filter to have those links back to http.

#9 follow-up: @boonebgorges
9 years ago

Thanks, zabatonni.

ipstenu - Can you provide more details about the caching issue?

#10 in reply to: ↑ 9 @Ipstenu
9 years ago

Replying to boonebgorges:

ipstenu - Can you provide more details about the caching issue?

For Photon: https://github.com/Automattic/jetpack/issues/2015 - I can logically see why they DON'T catch https there, and it's much the same reason as pagespeed. Caching and HTTPS are techy at best :)

Pagespeed (by default) doesn't process https unless you set it up to do so. Since many sites (mine included) don't use https on the front end for the site, it sees https://path.to/image.jpg and doesn't render my 'commands' to it (which are to make the image root-relative and to do some extra compression). If I change the image back to http, it works fine.

I'm going to be filtering the content on sites that don't use https on the front end (one site in the network is full https which is my own drama, if annoying).

#11 in reply to: ↑ 4 ; follow-up: @Serindu
9 years ago

Replying to boonebgorges:

Replying to zabatonni:

Yes, its in Add media so it influences all wysiwyg and meta boxes. Before it worked depending that WP_CONTENT_URL constant.

Gotcha. The previous behavior was that you'd be at https://site.tld/wp-admin, and your images would be inserted with http://site.tld/ src attributes. This was a bug, because it meant that the security of your https request was compromised by the presence of non-https resources.

The question, then, is front-end behavior. The decision in #15928 was that it's better to fix the bug just described and deliver https assets on the front end. Is this fix causing side effects for you? In other words, what is the harm in delivering images over HTTPS when it's possible to do so? You can, of course, change this behavior with a rewrite rule, or with a filter on 'the_content', for the purposes of your own site, but I'm wondering whether there's some general reason why HTTPS assets would be harmful in these cases.

This change completely breaks media for sites using self-signed certificates.

Since my users don't need to log in or otherwise provide sensitive information I use a self-signed cert to provide encryption when I log in. This change breaks media content for my users unless they manually accept my certificate or I manually go and fix generated links.

Can you be more explicit about how I can workaround this problem?

#12 in reply to: ↑ 11 @zabatonni
9 years ago

Replying to Serindu:

Replying to boonebgorges:

Replying to zabatonni:

Yes, its in Add media so it influences all wysiwyg and meta boxes. Before it worked depending that WP_CONTENT_URL constant.

Gotcha. The previous behavior was that you'd be at https://site.tld/wp-admin, and your images would be inserted with http://site.tld/ src attributes. This was a bug, because it meant that the security of your https request was compromised by the presence of non-https resources.

The question, then, is front-end behavior. The decision in #15928 was that it's better to fix the bug just described and deliver https assets on the front end. Is this fix causing side effects for you? In other words, what is the harm in delivering images over HTTPS when it's possible to do so? You can, of course, change this behavior with a rewrite rule, or with a filter on 'the_content', for the purposes of your own site, but I'm wondering whether there's some general reason why HTTPS assets would be harmful in these cases.

This change completely breaks media for sites using self-signed certificates.

Since my users don't need to log in or otherwise provide sensitive information I use a self-signed cert to provide encryption when I log in. This change breaks media content for my users unless they manually accept my certificate or I manually go and fix generated links.

Can you be more explicit about how I can workaround this problem?

add_filter('wp_get_attachment_src',YOURFN',10,1);

and the in your function do something like

func($url) {
return str_replace('https','http',$url);
}

or use wp functions to change scheme.

#13 @zabatonni
9 years ago

i just wonder which hook is the best to call this add_filter

#14 @boonebgorges
9 years ago

  • Keywords close removed
  • Milestone changed from Awaiting Review to 4.2.1
  • Owner set to boonebgorges
  • Status changed from new to assigned

You can restore wp_get_attachment_url() to its pre-4.2 behavior like this:

function wp32112_force_attachment_urls_to_non_https( $url ) {
    return set_url_scheme( $url, 'http' );
}
add_filter( 'wp_get_attachment_url', 'wp32112_force_attachment_urls_to_non_https' );

A more targeted fix is to replace src URLs in post content only (which is the place where the https is problematic):

function wp32112_force_attachment_urls_to_non_https_in_post_content( $content ) {
    if ( ! is_ssl() ) {
	$uploads = wp_upload_dir();
	$base = $uploads['basedir'];
	$base_ssl = set_url_scheme( $base, 'https' );

	$content = str_replace( 'src="' . $base_ssl, 'src="' . $base, $content );
    }
    return $content;
}
add_filter( 'the_content', 'wp32112_force_attachment_urls_to_non_https_in_post_content' );

The fixes in wp_get_attachment_url() are appropriate for the vast majority of cases, and should stay. But the issues of HTTP-only caching and self-signed certificates are important enough that we should do something about it. Since the problem is with post content that is generated in HTTPS wp-admin but then displayed on a front end that is non-HTTPS, I think a proper fix is to modify the Add Media process so that the 'src' attribute of <img> tags inserted into the editor always respects the scheme of wp_upload_dir(). The downside of this is that it'll result in mixed-content warnings on SSL-optional front-ends, but this was also the case before 4.2, and can be fixed with a filter (roughly the opposite of the 'the_content' filter I posted above). I'll work on a patch.

#15 @DrewAPicture
9 years ago

  • Keywords needs-patch added

@boonebgorges: What's the status on a patch here?

#16 @boonebgorges
9 years ago

Working on it. It's Sunday :-D

@boonebgorges
9 years ago

#17 @boonebgorges
9 years ago

  • Keywords has-patch needs-testing added; needs-patch removed

32112.diff modifies the send-attachment-to-editor AJAX callback so that image sources will always be http if the site's canonical URL is http, even when the current request is over SSL. This is how attachment inserting worked prior to 4.2. This change does not address any content that has been created since 4.2 - only new content. I would appreciate testing from anyone who is experiencing the problem (if testing in production, you only need the small patch for ajax-actions.php, not the /tests/ changes).

The change suggested in 32112.diff means that we're rolling back a couple of improvements that had been introduced in 4.2. (a) A site with non-SSL homeurl which is administered over SSL will see mixed-content warnings when using the Visual editor. (b) Front-end visitors to an SSL-optional site (non-SSL homeurl but accessible over HTTPS) will see mixed-content warnings when viewing posts with attachments. I've been working on more focused patches to address these issues within the context of the concerns raised in this ticket, but since they're not 4.2 regressions, I'll open separate tickets for them, toward a future milestone.

#18 @boonebgorges
9 years ago

#32186 was marked as a duplicate.

#19 @khlo
9 years ago

Hey @boonebgorges

Nice patch - thanks! I'm testing this in production right now.

  • Images insert as expected with a http-scheme URL.
  • On the 'Insert Media' screen:
    • If the Link to: Media file option is selected, it still links to https version. I'm not so bothered about this myself as I don't normally use this option. However, it's a little bit confusing as the hyperlink is intended for users on the front-end (hence, it should probably respect the http schema which has been specified for the front-end).
    • Same goes for the URL listed on the Insert Media screen.

Aside from that, it's generally working well and images are no longer causing a problem.

@boonebgorges
9 years ago

#20 @boonebgorges
9 years ago

Thanks for testing, khlo. You're right that those links should also go to the canonical, non-SSL URLs. See 32112.2.diff.

This issue is showing pretty clearly how it's a mess that we use the same function as a template/theme function - where it should be context-sensitive - and as a business function for getting canonical URLs - where it should not be context-sensitive.

#21 @khlo
9 years ago

Thanks @boonebgorges - will also test the updated version of the patch later in production.

Agreed about the comments using the same function for context-sensitive and non-context-sensitive stuff. The only thing that concerns me a bit about this pull request is it almost feels like the wrong place to fix it.

Firstly, because we add context-sensitivity in wp_get_attachment_url() and then we get rid of it again in wp_ajax_query_attachments().

Secondly, it seems a little difficult to maintain. Do we have any other non-contextual business functions that also depend on wp_get_attachment_url()? There's also something like the Admin Media Library. Presumably this should also show canonical URLs for the sake of consistency (but it probably has little impact there).

Finally, I wonder if there are any third-party plugins that depend on wp_get_attachment_url() to return a canonical URL (not sure if they'll be using wp_ajax_query_attachments() to insert the image).

As a total newbie (so please feel free to shoot me down!), it almost seems to make more sense to have wp_get_attachment_url() always return the canonical URL. We should then have an output filter which changes the scheme in the output as necessary (it seems to be a more logical place to put context-sensitive code). Finally, it might be possible to implement protocol-relative URLs as a filter on the content whilst leaving the canonical URLs in the original, unfiltered content (e.g. for e-mail subscription plugins, RSS output, direct database queries, etc).

#22 @boonebgorges
9 years ago

@khlo Thanks for the thoughts. See #15928 for a bunch of painful back story.

Briefly, I agree with you in theory that wp_get_attachment_url() "feels" like it ought to produce a canonical, context-insensitive URL, based on homeurl. The problem is that the function is widely used in themes to generate links and <img> tags. In situations where it's possible to view the site over SSL even though homeurl is non-SSL - as when SSL is optional, or when SSL is enforced by server redirects - the use of wp_get_attachment_src() can cause browser mixed-content warnings, as well as links that unknowingly lead users out of the HTTPS context. This was the original concern that led to #15928.

It's worth noting that other URL functions in WP - like get_permalink() - are context-specific while ! is_admin(), because they use get_home_url() to generate the URL base. Maybe the correct course of action here is for us to do the same thing in wp_get_attachment_url(). See https://core.trac.wordpress.org/ticket/15928#comment:81 and related discussion.

I think we can go with something modest for 4.2.x, as I've suggested in 2.diff.

#23 @joemcgill
9 years ago

Hey Boone,

Just had a chance to look over this issue and tend to agree that it makes sense to respect the scheme of from wp_upload_dir() when writing content to the database, as you've suggested in 32112.2.diff. Although, the two concerns you brought up in your previous comment makes me wonder if we might end up needing to add a content filter either way.

(a) A site with non-SSL homeurl which is administered over SSL will see mixed-content warnings when using the Visual editor. (b) Front-end visitors to an SSL-optional site (non-SSL homeurl but accessible over HTTPS) will see mixed-content warnings when viewing posts with attachments.

One thing we chose not to do while working on #15928, which could help this issue and the types of caching issues @Ipstenu describes, is to force a scheme switch to non-SSL urls when someone visits a non SSL-optional front end, which could apply to any case where wp_get_attachement_url() is being called in a template on the front end.

So in short, I'm suggesting:

  1. Respect the scheme of wp_upload_dir() when we know we're writing post content to the database, which your current patch handles.
  2. Conditionally add a very narrow filter to the_content that basically uses the same logic we used in #15928 to scheme switch only when a request is coming over HTTPS and the host matches the host of the uploads directory (i.e., when we know SSL is available).
  3. Optionally, we can use similar logic to scheme switch HTTPS urls back to HTTP in SSL-optional contexts.

What would be great, if possible, would be to make TinyMCE scheme switch to HTTPS when loading the image views in the WYSIWYG editor, without changing the actual content that gets saved, but that's @azaozz territory. I have no idea if we can even do such a thing.

@boonebgorges
9 years ago

#24 @boonebgorges
9 years ago

Thanks for the thoughts, joemcgill.

Another possible strategy is 32112.3.diff. It copies the scheme logic from get_home_url(): match the scheme of the current request only when ! is_admin() (and when we're not on the login page, though I'm not sure that's necessary here). Since posts are always created from the admin, this would take care of the problem of post content in the database.

Your suggestion (2) and your TinyMCE suggestion are the "more focused patches" that I mention here https://core.trac.wordpress.org/ticket/32112#comment:17. Swapping out the src attribute in Visual mode is not too hard - I have a patch mostly done. Let's not worry about this on the current ticket though, as it's not a regression.

#25 @AlphaBootis
9 years ago

As someone who uses SSL only for the back-end I definately feel this update should be fixed. I use a self signed cerificate for the back end, so random https calls on random browsers will be halted on the count of the certificate beeing untrusted therefore breaking content.
Solving this on my end would involve either investing in an actual trusted certificate (and associated additional server load from https requests) or implementing some sort of server-side workaround which seems wrong to me.

#26 @boonebgorges
9 years ago

  • Resolution set to fixed
  • Status changed from assigned to closed

In 32342:

Attachment URLs should only be forced to SSL on the front end.

Detecting SSL status on the Dashboard introduces problems when writing content
that is saved to the database and then displayed on the front end, where SSL
may be optional (or impossible, due to self-signed certificates). The new
approach parallels the logic in get_home_url() for forcing HTTPS.

See [31614] #15928 for background.

Fixes #32112 for trunk.

#27 @boonebgorges
9 years ago

  • Keywords fixed-major added; has-patch needs-testing removed
  • Resolution fixed deleted
  • Status changed from closed to reopened

Let's go with the is_admin() solution for now, as it matches get_home_url(). I'll open additional tickets for the editor and the_content improvements in the upcoming days, and link to them from here.

Reopening for 4.2.2.

#28 @dfisek
9 years ago

The post 4.2 behavior makes the use of FORCE_SSL_ADMIN meaningless (and IMHO should be fixed). This feature is used when the admin interface is served via https but the site itself is served via http.

Many popular browsers' (like Firefox, Chrome) security settings don't like mixed http/https content in a single page. In that case, they don't display the content that is served by a different protocol other than the requested one by default (if http://site is requested, additional content requested from https://site is not shown to the user).

This default behavior forces us to serve a page's all content via http or https. That's why the post-4.2 behavior is disruptive.

IMHO, the ultimate solution would be fixing the media upload function so that it doesn't insert full static url's into site content, but rather use a code to dynamically construct the site url according to site settings and requested protocol (like Drupal).

But currently we do have to fix this bug urgently, since it's actively causing mixed content trouble.

#29 @boonebgorges
9 years ago

@dfisek Thanks for the comment. Please read this ticket and #15928 thoroughly. Your concerns have been discussed in both places at length. The issue is already addressed in trunk and the fix should be part of the next maintenance release.

#30 @boonebgorges
9 years ago

  • Resolution set to fixed
  • Status changed from reopened to closed

In 32384:

Attachment URLs should only be forced to SSL on the front end.

Detecting SSL status on the Dashboard introduces problems when writing content
that is saved to the database and then displayed on the front end, where SSL
may be optional (or impossible, due to self-signed certificates). The new
approach parallels the logic in get_home_url() for forcing HTTPS.

See [31614] #15928 for background.

Fixes #32112 for 4.2 branch.

#31 @boonebgorges
9 years ago

#32268 was marked as a duplicate.

#32 @johnbillion
9 years ago

#32479 was marked as a duplicate.

Note: See TracTickets for help on using tickets.