Make WordPress Core

Opened 5 years ago

Last modified 5 years ago

#48106 new defect (bug)

Revisit post GUID sanitization on `&`

Reported by: zzxiang's profile zzxiang Owned by:
Milestone: Awaiting Review Priority: normal
Severity: normal Version: 5.2.3
Component: Posts, Post Types Keywords:
Focuses: Cc:

Description

The source code of core which needs to be revisit

When a new post/attachment is inserted into the database, the post GUID is sanitized and the & character in GUID is converted to &.

More specifically, in wp-includes/default-filters.php, function wp_filter_kses is added as a default pre_post_guid filter.

// Save URL
foreach ( array( 'pre_comment_author_url', 'pre_user_url', 'pre_link_url', 'pre_link_image',
        'pre_link_rss', 'pre_post_guid' ) as $filter ) {
        add_filter( $filter, 'wp_strip_all_tags' );
        add_filter( $filter, 'esc_url_raw'       );
        add_filter( $filter, 'wp_filter_kses'    );
}

Before a post GUID is saved, function wp_filter_kses in wp-includes/kses.php is called, and eventually function wp_kses_normalize_entities does the real conversion, so that & is converted to &.

The problem it causes

The plugin External Media without Import (https://wordpress.org/plugins/external-media-without-import/) inserts external image URLs as post GUIDs into database so that users can add external images into their media libraries without actually uploading the image files to their WordPress servers. If the image URL contains &, such as

https://pbs.twimg.com/media/D_NKa3yWkAYwZwn?name=900x900&format=png

it is converted to

https://pbs.twimg.com/media/D_NKa3yWkAYwZwn?name=900x900&format=png

The result is that the image is not correctly displayed in some places, such as the media library page of the admin dashboard.

There're also other plugins, such as Imposer (https://github.com/dirtsimple/imposer/) and Postmark (https://github.com/dirtsimple/postmark/), encountering the same issue. Imposer fixes the issue by forcing to save post GUIDs again with the unsanitized version. I think it is equivalent to removing wp_filter_kses from the default pre_post_guid filters.

The reason of post GUID sanitization

Post GUID sanitization was added with a commit in 2011: https://github.com/WordPress/WordPress/commit/81a5f821fbfb63be6c5517d033b8e7a0a4172f07. The commit log message does not state why post GUIDs need to be sanitized on save and display. Also, the commit is so long time ago that seems that even the members of the core channel of WordPress Slack group can't tell the reason.

At first it was thought that it is because when exporting RSS feeds, & needs to be converted due to XML specification. But I did some experiments and inspected the core source code, and found that in fact WordPress core does convert & to & while exporting RSS2 feed, even if I changed the & back to & in the database via MySQL client. The convertion is done by function wptexturize in wp-includes/formatting.php. The function is added as a default the_content filter.

So I really don't understand why post GUIDs should be sanitized, especially for the & issue. This might be a core issue rather than a plugin issue. It might be fine to not add wp_filter_kses as a default pre_post_guid, i.e. not do the post GUID sanitization.

This issue has also been discussed here: https://github.com/zzxiang/external-media-without-import/issues/17

Change History (2)

#1 in reply to: ↑ description @SergeyBiryukov
5 years ago

  • Component changed from Post Formats to Posts, Post Types

Hi there, welcome to WordPress Trac! Thanks for the report.

Replying to zzxiang:

Post GUID sanitization was added with a commit in 2011: https://github.com/WordPress/WordPress/commit/81a5f821fbfb63be6c5517d033b8e7a0a4172f07. The commit log message does not state why post GUIDs need to be sanitized on save and display. Also, the commit is so long time ago that seems that even the members of the core channel of WordPress Slack group can't tell the reason.

Per the release post for WordPress 3.1.3, this appears to be a part of "Various security hardening" and "Media security fixes" items on the list. This predates our current security program, so getting more details is indeed a non-trivial task.

I guess contacting the Security Team would be the way forward here, since any changes would need to be carefully reviewed to avoid regressions.

#2 @zzxiang
5 years ago

Thanks for your reply.

How to contact the security team? I didn't find the contact information in the security team blog page you provided.

Note: See TracTickets for help on using tickets.