Make WordPress Core

Opened 15 years ago

Last modified 3 years ago

#13066 accepted defect (bug)

Last-Modified headers for individual comment feeds are incorrect

Reported by: solarissmoke's profile solarissmoke Owned by: jgci's profile jgci
Milestone: Future Release Priority: low
Severity: normal Version: 3.0
Component: Feeds Keywords: dev-feedback needs-refresh
Focuses: performance Cc:

Description

The WP::send_headers function currently uses get_lastcommentmodified() to set the Last-Modified header for all comment feeds. This is a problem when used for individual post comment feeds. The function gets the last modified comment across all blog posts. That means that every time a comment is posted anywhere, the Last-Modified header for ALL comment feeds is refreshed. Issues:

  1. This is technically incorrect, since only the global comment feed and one specific post's comment feed have changed with the last comment (not all possible comment feeds); and
  1. It means that If-Modified-Since requests for other post comment feeds will not receive a 304 response when they should do (since their content hasn't changed). On blogs with many posts and many comment feeds, this will have a large impact on bandwidth because lots of requests will receive 200 responses where 304's would have done, just because a comment was posted on some other post.

If I've understood the flow correctly, $wp_query hasn't been fully set up at the time this function is called, so changing this behaviour would require some change in the flow of things (e.g., the handling of last modified headers for feeds moves into the do_feed() function). But doing so would mean that Last-Modified headers are correct/meaningful and that many more 304 responses can be served.

Any thoughts?

Attachments (2)

get_lastcommentmodified.diff (3.5 KB) - added by jgci 14 years ago.
query.diff (734 bytes) - added by jgci 14 years ago.

Download all attachments as: .zip

Change History (15)

#1 @solarissmoke
15 years ago

Incidentally, the <lastBuildDate> tag in the individual comment feeds is also wrong because of the same use of get_lastcommentmodified().

#2 follow-up: @dd32
15 years ago

  • Milestone changed from 3.0 to Future Release

This code is run early in the execution of WordPress, At the point in time its run, All WordPress knows is, that its a query for a Comments feed.

Its a trade off between bandwidth and server load, continuing to process the request will require loading the rest of WordPress & parsing the request & database fetches. all of which, would be pointless if no comments have been posted.

At the same time, it wastes bandwidth in preference to wasting CPU cycles as feeds are re-served often when they're not needed to.

Theres a few options here:

  • Leave as-is
  • Make the code here run AFTER posts are queried, which would result in higher CPU load, yet more accurate result
  • Make the comment check specify the post for the requests using a manual SQL
    • for p, page_id, and attachment_id requests, this wouldnt require an extra SQL
    • for name and attachment requests, this would require an extra SQL query
    • for pagename requests, this would probably require a few more queries

Since this isnt a regression(and we're trying to get 3.0 done), I'm setting this to Future Release pending agreement and/or a patch.

#3 in reply to: ↑ 2 @solarissmoke
15 years ago

Replying to dd32:

Its a trade off between bandwidth and server load, continuing to process the request will require loading the rest of WordPress & parsing the request & database fetches. all of which, would be pointless if no comments have been posted.

True, but every time Wordpress sends a 200 response where 304 would have done, we end up wasting both bandwidth and server load (because a full load takes place anyway).

  • Leave as-is

I would suggest that even for semantic reasons the Last-Modified information should reflect the state of the content being served (the 304 issue aside).

  • Make the comment check specify the post for the requests using a manual SQL

As you say this would be a little messy for pagename requests etc. The easiest option would be to do it after everything has loaded, at the expense of server load.

I was also thinking whether it might be possible to cache (in the db) the last-modified data for feeds, so that this could be accessed early in the execution? That might be even more messy though.

#4 @hakre
14 years ago

Looks like there is no intention to fix this. Close as wontfix?

#5 @westi
14 years ago

  • Cc westi added
  • Keywords gci added

I have proposed this as GCI2010 Task

#6 @jgci
14 years ago

  • Owner set to jgci
  • Status changed from new to accepted

#7 @jgci
14 years ago

  • Keywords has-patch added

#8 @westi
14 years ago

  • Keywords 3.2-early added

The patch resolves the issue in my testing.

Not 100% sure this is the perfect solution but definitely a working one.

@jgci
14 years ago

#9 @jgci
14 years ago

I've just added a secound patch, which makes sure that one query isn't parsed twice.

#10 @chriscct7
10 years ago

  • Focuses performance added
  • Keywords needs-refresh added; gci has-patch 3.2-early removed

Patch would need a slight refresh (line for globals has changed in the last 4 years). Ticket goal needs to be re-evaluated

#11 @stevenkword
10 years ago

I've just posted an alternative solution in #4575

#12 @stevenkword
7 years ago

Having looked at this further, I agree with @dd32. The code that generates the HTTP headers doesn't have access to the global WP_Query object due it's early execution. The patch attached in #4575 resolves the <lastBuildDate> side of this, but does not address the HTTP headers.

For now, I'm in favor of leaving the HTTP headers as is and implementing the new get_last_build_date method proposed in #4575 as it is "free".

#13 @mukesh27
3 years ago

For the last 5 years, we haven't made any progress on this ticket. If anyone has a proposed solution or feedback on this ticket, then please share it. Do you think we can get it fixed in the upcoming version?

One question Does it improve the performance anyway?

Note: See TracTickets for help on using tickets.