Make WordPress Core

Opened 14 years ago

Closed 14 months ago

Last modified 7 weeks ago

#12939 closed defect (bug) (invalid)

Counterpart to content_save_pre hook not called when getting post content via API

Reported by: redsweater's profile redsweater Owned by: josephscott's profile josephscott
Milestone: Priority: normal
Severity: normal Version: 2.9.2
Component: XML-RPC Keywords: dev-feedback 2nd-opinion
Focuses: Cc:

Description

When a post is submitted either through the web editor interface or from an API call to newPost or editPost, the content of the post is inevitably passed through the content_save_pre filter.

And when a post is opened for editing in the web editor, the content is passed through filters such as the_editor_content and content_edit_pre.

However, when a post is fetched for editing via XMLRPC API calls such as getPost or getRecentPosts, the content_edit_pre filter is never reached.

This leads to a situation where whatever massaging of the content that a plugin may peform on the way into the database is not reversed on the way back out, for clients of the API. A concrete example of this problem is with the popular Syntax Highlighter Evolved:

http://www.viper007bond.com/wordpress-plugins/syntaxhighlighter/

This plugin performs encoding of the post content before it is stored in the database, and it counts on being able to decode that content by adding filters to hooks such as the_editor_content. However, none of these filters are reached via the API, leading to "corrupted content" when users try to edit posts from API clients such as the iPhone WordPress app, or my desktop editor, MarsEdit.

(My previous bug report #10802 exhibits the same symptoms of this bug, but this is a different cause).

In summary:

  1. WordPress needs to establish a clear, baseline hooks for massaging content before it is saved to the database, and for un-massaging the content on the way out of the database. Currently there seems to be uncertainty about which hooks need to be overridden and under what circumstances. It seems to me that content_save_pre and content_edit_pre are probably good candidates for this.
  1. Whatever hooks are established as the guarantees need to be applied once and only once in both the web-based editing scenario, and in the API editing scenario.

I think that having a well documented pair of hooks for this purpose that works identically in the web editing and API editing cases will do a lot to ensure correct behavior when plugins are installed that massage content, and will make it easy for plugin developers to "do the right thing" without relying on hooks that are specific only to the web editor, or to the API.

Change History (18)

#1 @redsweater
14 years ago

For testing purposes, you can use the syntax highlighter plugin and provide this literal HTML content from the web editor or from the API:

Testing "<

[sourcecode]Testing "<[/sourcecode]

When the Syntax Highlighter plugin saves the post, it encodes the content inside [sourcecode] short tag. When you read the post back into the wordpress HTML editor, it is preserved exactly, because the plugin has the opportunity to decode the content back to original forma.t

If you load the text via the API e.g. wtih getRecentPosts or with getPost, you end up with content:

Testing "<

[sourcecode]Testing "<[/sourcecode]

The encoding that was done by the plugin has not been reversed. Note well that this is NOT simply XML encoding that you would expect from the API. The literal content as it is transferred over the wire is further encoded and then decoded by the client.

#2 @redsweater
14 years ago

I have confirmed that the same problem affects WordPress's AtomPub API implementation.

#3 @mikeschinkel
13 years ago

  • Cc mikeschinkel@… added

#4 @mdawaffe
13 years ago

  • Milestone changed from Awaiting Review to Future Release
  • Status changed from new to reviewing

#5 @solarissmoke
13 years ago

  • Keywords dev-feedback added

The problem is that when done using the API, the filter that gets called is post_content_save_pre and not content_save_pre, and we have no default hooks on that filter. Is there a reason we don't have the same filters on content_save_pre as post_content_save_pre.

#6 @chriscct7
9 years ago

  • Keywords dev-feedback removed
  • Resolution set to invalid
  • Status changed from reviewing to closed

post_content_save_pre was removed but I can't find the commit it was removed or any discussion about the post_content_save_pre on trac to know when it was actually removed. If someone knows the commit or trac issue it was removed in, please add it.

As the filter no longer exists in core, the trac ticket is invalid

#7 @DrewAPicture
9 years ago

  • Keywords dev-feedback added
  • Milestone changed from Future Release to Awaiting Review
  • Resolution invalid deleted
  • Status changed from closed to reopened

To my knowledge, there has never been a post_content_save_pre hook, that is to say {$field}_save_pre. In the case where the field is passed into sanitize_post_field() with a prefix, two filter hooks are evaluated, pre_{$field} -- in this case pre_post_content -- followed by {field_no_prefix}_save_pre, in this case content_save_pre.

Based on the comments, I don't think it was ever actually confirmed whether the correct (or incorrect) filter hooks were fired for XML-RPC requests or not, so moving back to Awaiting Review for investigation.

#8 @redsweater
9 years ago

I can confirm the bug still exists. I just tested with WP trunk from SVN and against the latest version of Syntax Highlighter Evolved.

This is still an issue that my customers run into, and I have to imagine any users of WordPress's iOS app are vulnerable to it. The behavior is subtle but the problems could span a very wide range of severities from mild to extreme, depending on the expectations and assumptions of what filtering will be done in the "edit_pre" filters.

To elaborate on @DrewAPicture's comments in the last update about whether it has even been confirmed that the filters are not called, I can confirm that they are not and I have done a little bit of investigation explaining why not. Note that this is not my area of expertise but it would be valuable to finally see this fixed, so if there's anything I can do to help nudge the bug along I'd like to help.

The core of the problem seems to be that in fact when any posts are fetched for the service of an API call such as wp.getPosts, the lower-level wp_get_recent_posts in post.php, and ultimately the get_post() function call, is reached in such a way that the "context" parameter will always be raw, and thus none of the usual filtering is applied.

By contrast, the get_post() call is reached when editing a post through the web UI, through the explicit call in wp-admin/post.php's "edit:" state case to:

$post = get_post($post_id, OBJECT, 'edit');

So an important question that I think needs to be answered is: should fetches of posts via the API default to massaging the content for 'edit' context? I can see an argument against this, as it all depends on what the purpose of using the API is. However, given that a good number of API clients are in fact using the API to provide a similar editing console to what you find in wp-admin/post.php, I think the API should either default to providing post content with 'edit' context filters applied, or else provide a means for API clients to specify which context mode they are seeking the post content for. For example an API client that uses the data strictly for display may in fact want to specify that the 'display' context is used in fetching the desired posts.

In summary: currently the post content retrieved via the API is 'raw' and not filtered for any context, but this makes it impossible to participate via the API e.g. as a suitable alternative for reliably editing post content.

#9 @redsweater
9 years ago

I am glad to see that the wp-json API under development addresses this issue with the support of a "context" parameter: http://wp-api.org/#posts_retrieve-a-post_input. So it seems likely that for the longer term, this problem will be solved for clients of the JSON API.

The question then in the mean time is whether it would be pragmatic to support a similar kind of "context" option for the WordPress XMLRPC API.

This ticket was mentioned in Slack in #mobile by redsweater. View the logs.


9 years ago

This ticket was mentioned in Slack in #core-restapi by redsweater. View the logs.


9 years ago

#12 @markoheijnen
7 years ago

I will check the code of the REST API to see what efforts need to be done to get this in.

#13 @danielbachhuber
5 years ago

  • Keywords 2nd-opinion added

@redsweater Is this still worth pursuing or should we punt it?

#14 @redsweater
5 years ago

@danielbachhuber I just confirmed again that the bug still exists. It's frustrating, but it's been so many years, maybe in the big scheme of things people just don't care enough for it to be fixed? Over the years, I don't know if the lack of complaints are because people have decided not to use 3rd party apps like MarsEdit when they need this kind of data integrity, or if they are just used to working around the problems.

I would be more comfortable with punting if there was a rosier outlook for moving 3rd party apps to the JSON API. For the foreseeable future, many of us using the XML-RPC API have to count on it as the canonical method for interacting with WordPress. IMO that makes the validity of that API more important than it otherwise would be.

#15 @redsweater
14 months ago

I saw there is an "old tickets" triage going on soon, so in case it helps, I will add that I think that can be closed now. The relevance of the XMLRPC API is diminishing quickly. Even though I and many other still rely on it, I think that the future will be best focused on REST API improvements.

#16 @psykro
14 months ago

  • Resolution set to fixed
  • Status changed from reopened to closed

#17 @psykro
14 months ago

  • Resolution changed from fixed to invalid

#18 @swissspidy
7 weeks ago

  • Milestone Awaiting Review deleted
Note: See TracTickets for help on using tickets.