Make WordPress Core

Opened 14 months ago

Closed 9 months ago

Last modified 9 months ago

#57627 closed defect (bug) (fixed)

The Cache-Control header for logged-in pages should include `private`

Reported by: markdoliner's profile markdoliner Owned by: johnbillion's profile johnbillion
Milestone: 6.3 Priority: normal
Severity: normal Version:
Component: Administration Keywords: has-patch has-unit-tests
Focuses: privacy Cc:

Description

I believe WordPress returns the following Cache-Control header for pages that are rendered for logged-in users:

Cache-Control: no-cache, must-revalidate, max-age=0

I think the relevant code is here and here.

For pages for logged-in users I believe this header should be modified to include the private directive to indicate that the response should not be cached by intermediary shared cache servers.

The change should not be made everywhere nocache_headers() is used--only for responses that vary based on the logged-in user. And maybe also for users who have recently left a comment (#16612 is related), though it seems like this is hard for the server to know reliably. You could key off the presence of one of the comment_author_* cookies but those aren't always set.

The Meanings of no-cache and private

You might think that no-cache would be sufficient to accomplish this, but it's not. It's a bit confusing but no-cache means "this response may be stored in a cache but it must be revalidated before it is used." And so I believe that shared cache servers are allowed to cache pages rendered for logged-in users.

I've found MDN's caching guide to be helpful while trying to understand the meaning of the various directives. The Private Caches section says, "If a response contains personalized content and you want to store the response only in the private cache, you must specify a private directive." It's reiterated in the "Do Not Share With Others" section under "Don't Cache." And MDN's Cache-Control header reference contains a similar statement.

What's the Harm?

And of course the risk isn't just that the page is cached on a shared server, but that it's served to a user other than the logged-in user. Thankfully I think the risk is minimal for a few reasons:

  1. no-cache means the cache will attempt to revalidate the page before using it. I believe revalidation is not possible by default because WordPress does not set the ETag or Last-Modified header for these responses. Though this isn't a guarantee: Someone could configure their web server or a caching reverse proxy server to set the headers and return HTTP 304 if appropriate. Or a plugin could do these things. The WP Super Cache plugin even has options for "304 Browser caching" and "Enable caching for all visitors" (even logged-in visitors), though I couldn't get it to serve a logged-in page to a non-logged-in user so it looks like it's clever enough to use different cached data based on the user's cookie (I see that Cookie is added to the Vary header), so that's great.
  1. When used as caching reverse proxies Nginx and Varnish appear to not cache responses if the Cache-Control header includes no-cache, so they won't cache pages for logged-in users. For Nginx I think it's this logic. For Varnish I think it's this logic. I think they're allowed to cache these responses and it seems possible that they will in the future, but they don't currently. And as a counter example I believe Squid is willing to cache these responses (this FAQ is related but not super clear).
  1. I suspect shared cache servers are uncommon (thought I've made no attempt to find data about it).
  1. The number of https sites has increased greatly over time and shared cache servers can't cache objects served over https (unless they decrypt and reencrypt the data, which is mostly only possible in company-managed computers where the company is able to add their own signing certificate to the browser trust store).

So Why Should We Change It?

While I think it's rare that the lack of private will cause harm, WordPress is widely used and there are many ways to configure cache-related headers. I'd guess there is a non-zero chance that this problem has surfaced at some point in time and so I feel that it's worth changing. The risk from adding the header feels low to me.

I'll caveat this ticket by saying that I'm not intimately familiar with caching behavior. I've just been looking at it a lot over the last few days. It's entirely possible that I'm wrong about all of this.

Related Tickets

  • #16612 proposes using nocache headers() for requests with comment cookies. That seems appropriate to me, and also using private.
  • #21938 proposes adding no-store to the nocache headers() list. This is a separate consideration from the issue I'm raising above. I don't know whether it's a good proposal. There's a lot to think about there.
  • #22258, #23021, and #40444 dealt with removing Last-Modified from nocache_headers().

Change History (8)

#1 @ayeshrajans
14 months ago

Hi @markdoliner, welcome to WordPress Trac.

Thank you for opening this ticket. This indeed looks like something we have to improve.

After reading #21938 (to add no-store) and yours, I also think choosing private over no-store makes more sense, because going back the browser history is an absolute valid use-case that we don't have to disallow. I also understand that the browsers probably stores the authenticated pages in cache if the user logs out, but this can also be solved with a Clear-Site-Data. I maintain a plugin (https://wordpress.org/plugins/clear-logout/) that does just that.

You are also right that the change should ideally be in wp_get_nocache_headers function. Patching that would be trivial with a single-line diff, but perhaps this is something we can add a headless browser test as well?

#2 @johnbillion
14 months ago

  • Component changed from General to Administration
  • Focuses privacy added
  • Milestone changed from Awaiting Review to 6.3
  • Owner set to johnbillion
  • Status changed from new to accepted

I'd definitely like to see this tightened up. Moving to 6.3 along with #21938.

This ticket was mentioned in PR #4284 on WordPress/wordpress-develop by rutviksavsani.


12 months ago
#3

  • Keywords has-patch has-unit-tests added

This PR adds the private directive for cache control when the user is logged in.

Trac ticket: https://core.trac.wordpress.org/ticket/57627

#4 @johnbillion
10 months ago

Thank you for the PR @rutviksavsani! I've opened a new PR at https://github.com/WordPress/wordpress-develop/pull/4570 which addresses this ticket and #21938 at the same time, and I've included an updated version of the e2e test from your PR. Would you like to do some testing?

#5 @johnbillion
10 months ago

  • Keywords needs-testing added

#6 @Dharm1025
10 months ago

  • Keywords needs-testing removed

Test Report

This report validates that the indicated patch addresses the issue.

Patch tested: https://github.com/WordPress/wordpress-develop/pull/4570

Environment

  • OS: macOS Ventura 13.0
  • Web Server: nginx/1.25.0
  • PHP: 7.4.33
  • WordPress: 6.3-alpha-55505-src
  • Browser: Chrome Version 113.0.5672.126 (Official Build) (arm64)
  • Theme: Twenty Twenty-Three
  • Active Plugins: -

Test Results

✅ Works as expected with a patch.

I have tested the patch as per testing instructions and it works as expected.

Before Patch:

Cache-Control Header:

  1. Front-end (logged in): Cache-Control: no-cache, must-revalidate, max-age=0
  2. Front-end (not logged in): No Cache-Control present
  3. Back-end (logged in): Cache-Control: no-cache, must-revalidate, max-age=0
  4. wp-login.php page: Cache-Control: no-cache, must-revalidate, max-age=0

Login to wp-admin, then logout and press the back button in the browser shows the previous wp-admin page.

After Patch:

Cache-Control Header:

  1. Front-end (logged in): Cache-Control: no-cache, must-revalidate, max-age=0, no-store, private
  2. Front-end (not logged in): No Cache-Control present
  3. Back-end (logged in): Cache-Control: no-cache, must-revalidate, max-age=0, no-store, private
  4. wp-login.php Page: Cache-Control: no-cache, must-revalidate, max-age=0

Login to wp-admin, then logout, press the back button in the browser shows the wp-login page.

Thanks

#7 @johnbillion
9 months ago

  • Resolution set to fixed
  • Status changed from accepted to closed

In 55968:

Administration: Add the no-store and private directives to the Cache-Control header when preventing caching for logged in users.

The intention behind this change is to prevent sensitive data in responses for logged in users being cached and available to others, for example via the browser history after the user logs out.

The no-store directive instructs caches in the browser or within proxies not to store the response in the cache. This is subtly different from the no-cache directive which means the response can be cached but must be revalidated before re-use. WordPress does not use ETag headers by default therefore this does not achieve the same result.

The private directive complements the no-store directive by specifying that the response contains private information that should not be stored in a public cache. Som
e proxy caches may ignore the no-store directive but respect the private directive, thus it is included.

The existing Cache-Control header for users who are not logged in remains unchanged, and the existing cache prevention directives remain in place for backwards compatib
ility.

Props soulseekah, luehrsen, Dharm1025, markdoliner, rutviksavsani, ayeshrajans, paulkevan, clorith, andy786, johnbillion

Fixes #21938, Fixes #57627

Note: See TracTickets for help on using tickets.