Opened 13 years ago
Closed 13 years ago
#17560 closed defect (bug) (fixed)
Severe bug in XMLRPC and Windows Live Writer integration related to to_ping
Reported by: | archon810 | Owned by: | ryan |
---|---|---|---|
Milestone: | 3.4 | Priority: | normal |
Severity: | major | Version: | 3.1.2 |
Component: | XML-RPC | Keywords: | has-patch needs-testing 2nd-opinion early |
Focuses: | Cc: |
Description
I found a pretty serious performance bug with XMLRPC and Windows Live Writer, which is a really popular post software made by Microsoft in use by many large blogs.
In my continuous fight to improve performance, or rather figure out what the heck is badly slowing down my server, I noticed that I get a lot of stuck doing_cron apache processes after posting and the load shoots up.
I ran strace on some apache processes and found that a lot of them are stuck doing do_pings(), rolling through tons of posts using something like:
SELECT to_ping FROM wp_posts WHERE ID = 27823 SELECT pinged FROM wp_posts WHERE ID = 27823
To figure out why a bunch of my wp cron processes are rolling through random posts trying to ping them all the time, I ran a db query and found that 800+ out of 2900 posts have to_ping field set to
(comma separated)
Exactly as above, 2 lines.
So, what I think happens is a bunch of wp cron processes roll through 800 posts all the time and try to ping non-existent servers, which fails. Then the field is not cleared and this happens again and again.
At first, I wasn't able to figure out where this value is coming from, but then it hit me - I was able to trace it back to Windows Live Writer's interface. Take a look: http://farm6.static.flickr.com/5029/5760026936_50895cd36e_b.jpg.
The value here is grayed out, meaning it disappears as soon as you click into it. However, it seems like WLW sends it along anyway, even if the user doesn't fill it in. Then it gets put in the database exactly as above and here we are.
I realize it's not a Wordpress bug per se, but I believe it affects a lot of Wordpress installations that use Windows Live Writer and it's already too late even if MS fixes it. Getting MS to fix it is a whole different business.
So, here are my questions:
- is do_all_pings() really run every time cron runs?
- is it possible that multiple doing_cron processes end up running at the same time due to concurrency issues on very busy servers like mine? I see a ton of them during high load, all running simultaneously. They should really be using a semaphore of some sort.
- can this bug be worked around in Wordpress?
Thank you.
Attachments (5)
Change History (24)
#1
@
13 years ago
- Cc admin@… added
- Summary changed from Severe bug in XMLRPC and Windows Live Writer integration to Severe bug in XMLRPC and Windows Live Writer integration related to to_ping
#3
@
13 years ago
I think WP should clean all the trackback urls that do not start with a protocol when saving the posts.
If nobody proposes a patch, I'll try to make one by tomorrow.
#4
@
13 years ago
- Keywords needs-patch added
- Milestone changed from Awaiting Review to Future Release
What xknown said.
#5
@
13 years ago
Sounds like a good solution, which will be easy to verify using the steps above (post, then update using WLW).
#6
@
13 years ago
Opened up a bug report at WLW forums: http://social.microsoft.com/Forums/en/writergeneral/thread/10354161-2f8b-4b16-a288-fde9e455799c
@
13 years ago
Introduce sanitize_trackback_urls function. I only consider http and https as the protocols that allow trackbacks.
#7
@
13 years ago
Bug #17210 seems to be caused directly by this bug. I've left a comment describing my findings there.
#8
@
13 years ago
17560.first.diff works, except it doesn't alleviate the problem of re-selecting all the same rows every time. Bad data should be cleaned in the posts table after the ping, shouldn't it?
#10
@
13 years ago
I've confirmed that fixing this bug and #17210 that it causes solved my massive load after each post is published problem that's been getting worse and worse over the past months and made me lose a lot of sleep. Wordpress makes sense again!
#12
@
13 years ago
I like this. It probably has hardening benefits. Perhaps esc_url_raw() could be used in some way?
We should probably remove the trim(), though, and do that in the function.
#13
@
13 years ago
Will it make it into 3.3? Combined with the wp-cron locking fixes, which are still very much problematic in 3.2, this would make 3.3 the most exciting release I can remember.
#15
@
13 years ago
Need to be careful with \s
which can break some UTF-8 characters.
17560.4.diff uses \r\n\t
and space.
#16
@
13 years ago
17560.5.diff adds esc_url_raw()
.
I narrowed this down to updates done via Windows Live Writer. Upon publishing, the to_ping value is actually blank, but updates are resulting in to_ping changing to