Opened 12 years ago
Last modified 9 months ago
#22301 new defect (bug)
Performance problem with Recent Comments widget
Reported by: | pento | Owned by: | |
---|---|---|---|
Milestone: | Priority: | normal | |
Severity: | normal | Version: | 2.8 |
Component: | Comments | Keywords: | has-patch needs-refresh |
Focuses: | performance | Cc: |
Description
When a comment is posted (or the status of a comment changes), the widget_recent_comments
cache item is invalidated, which the Recent Comments widget uses to populate the widget content. On the next widget display, it will call get_comments()
to repopulate the cache.
The problem occurs when you have a very large number of comments, the MySQL query will use the (comment_approved, comment_date_gmt)
index, but if MySQL has to scan too many rows in an index, it'll switch to table scan instead. As the comment_approved
column is mostly the same value, this will almost always happen. This is compounded by the query occurring on every page load until the cache is re-populated - if the query takes 60 seconds to run, there could potentially be hundreds of instances of the same query running.
So, we need a solution that either hides or eliminates how slow this query can be, and only runs it (at most) once per new comment.
After discussing this with @matt, we have a couple of ideas:
- Move this query to a
wp_schedule_single_event()
call, which has the bonus of ensuring only one is scheduled at any given time. The downside is that it may cause the cache to be outdated on a low traffic site.
- Keep a queue of recent comments in cache, and push a new one onto the queue when posted. This avoids the query entirely, but there would be a race condition if two comments were posted at nearly the same time - one of them could be left out of the queue entirely.
Attachments (2)
Change History (17)
#7
@
10 years ago
- Keywords has-patch added
attachment:22301.diff goes with Nacin's suggestion: using a lock so only one reader will update the cache, everyone else will use the old cache.
#8
@
10 years ago
- Keywords commit added; dev-feedback removed
- Milestone changed from Awaiting Review to 4.1
#9
@
10 years ago
- Keywords commit removed
22301.diff won't work because regardless of whether the lock was acquired, the widget()
method will output the data from the _old cache and return, leaving us with a lock that will never be released.
Also, I think that this approach will not work when multiple instances of the same widget are used. We're storing all the cached data under the same key which holds an array with multiple widget IDs, so with the proposed patch it looks like we'll acquire a lock and update only the first instance of the widget.
#10
@
10 years ago
Took a stab in 22301.2.diff. The idea is the same and it also accounts for multiple instances of the widget.
It's not perfect. I think there's still potential for race conditions because the lock is not guaranteed, i.e. if the hard work (comment query) has been done, we save the data in cache, regardless of any locks + the fact that there's no easy way to force wp_cache_get
not to use local cache.
Also, what happens if the comment query times out and halts the php process? We'll end up with a lock that will never be released. How can we clean that up or should we even attempt to? Would scheduling a single event instead solve these issues? Are there other areas in core that could benefit from a similar approach?
#11
@
10 years ago
- Milestone changed from 4.1 to Future Release
The latest patch has some missing braces and needs some testing. Unfortunately we're late in the cycle. Let's try 4.2.
It isn't likely that a low-traffic site is using a persistent cache, so the concern of (1) is minor. In the case of a persistent cache, we could schedule an event, yes, but we could also try to store a "lock" in the persistent cache store. Whichever process successfully sets the lock gets to run the query, and until the query is complete, the old cache gets used. Just an idea.