#23231 closed defect (bug) (fixed)
preprocess_comment should get the normalised comment data
Reported by: | westi | Owned by: | dkotter |
---|---|---|---|
Milestone: | 4.0 | Priority: | low |
Severity: | minor | Version: | 3.5 |
Component: | Comments | Keywords: | westi-like has-patch good-first-bug |
Focuses: | Cc: |
Description
In wp_new_comment we normalise some data like user_ID/user_id so that we support people providing both versions but we fire the filter to preprocess the comment first.
This means if someone writes a filter they have to re-implement this normalisation if they want to filter the comment data properly.
An example of where this currently goes wrong is when Akismet tries to fetch the roles of the user submitting the comment - http://plugins.trac.wordpress.org/browser/akismet/tags/2.5.7/akismet.php#L332
This only works if the comment comes via a route that supplies the data correctly and not one that doesn't.
Attachments (3)
Change History (16)
#2
@
11 years ago
- Keywords has-patch added; needs-patch removed
Added a patch that normalizes the user_ID data before sending on to the preprocess_comment filter. I wasn't sure whether we should leave the normalizing of the same data after the filter, as it ends up with duplicate code just a few lines apart, but for now I've left that there. My thought was someone could change the data in the filter call, so re-normalizing it might make sense.
#3
@
11 years ago
- Keywords good-first-bug added; easy removed
- Owner set to dkotter
- Status changed from new to assigned
#5
@
11 years ago
Hi dkotter, this patch looks like a good start. I think we could probably split the code after the filter into two parts — normalization and sanitization. Standardize on user_id before the filter, cast it to an integer after. Basically, take the if statement and move it up, while leave the elseif there.
#6
@
11 years ago
Alright, added a new patch, made the changes as requested, which I think is a better approach. Again, let me know if more is needed here. Thanks!
#7
@
11 years ago
This is exactly what I was envisioning. The patch is perfect. I did however catch an odd situation. Imagine:
- Function is passed user_id 1
- Filter sets user_ID 2
With the current code, the function takes user_ID 2 and overrides user_id 1 with it. With the patched code, the function will end up with user_id 1. By normalizing user_id we lose the fact that user_ID could have been set by the filter.
(Also, with the initial patch, a user_ID would have become a user_id, then have been passed to the filter. The original user_ID will then override any changes the filter makes to user_id.)
I'll be honest, I don't entirely care about breaking this, but we still probably shouldn't. Open to suggestions.
#8
follow-up:
↓ 9
@
11 years ago
Hmm.. yeah. That's probably (hopefully?) an edge case and seems more like someone shooting themselves in the foot, trying to change the user ID of a comment, but it is a possibility and something we should probably try and avoid.
The only thought I had was to keep the normalization part above the filter, setting both user_id and user_ID there, then after the filter, check if those values are equal. If they're not, then one of those was changed in the filter so at that point, normalize the data again, so we don't end up with two different values. The problem I see there is there's not a way I could think of to tell which one was changed and which one wasn't, so as you mention, if someone changes user_id and not user_ID, and we normalize back to user_ID, they lose their changes.
So not sure if there is a good answer here. We don't want two separate ID's set, but not sure if we want to "guess" on what value (either user_id or user_ID) someone actually wanted to use.
#9
in reply to:
↑ 8
;
follow-up:
↓ 10
@
11 years ago
Replying to dkotter:
The problem I see there is there's not a way I could think of to tell which one was changed and which one wasn't
We could probably just save the initial values to separate variables and then compare them to filtered values.
#10
in reply to:
↑ 9
@
11 years ago
Replying to SergeyBiryukov:
We could probably just save the initial values to separate variables and then compare them to filtered values.
Ah, yeah, not sure why I didn't think of that. Most recent patch takes that approach. Normalizes data before filter, saving user_id to a variable, then after filter, checks if it's been changed, and if so, re-normalizes things.
#12
@
11 years ago
There's a PHP warning when $commentdata
doesn't contain the user_ID or user_id
fields. For example, in this line there's likely a PHP warning https://core.trac.wordpress.org/browser/trunk/src/wp-trackback.php#L113
There is some history as to why we support both in #11222 / [12267] / [12300]