#52250 closed defect (bug) (fixed)
Standardize sanitization of post title during export
Reported by: | jmdodd | Owned by: | pento |
---|---|---|---|
Milestone: | 5.7 | Priority: | normal |
Severity: | normal | Version: | 5.7 |
Component: | Export | Keywords: | has-patch commit has-dev-note |
Focuses: | Cc: |
Description
Export currently uses apply_filters( 'the_title_rss', $post->post_title ) to sanitize the post title when generating a WXR. This has the side effect of stripping valid HTML tags (reported in #50540) and also creates artificial misses in post_exists tests because newly-encoded characters in the export file do not match those that may already exist in valid titles in the posts table.
An example post title that would have this behavior is one containing an ampersand, such as Alice & Bob. The export file encodes the ampersand differently from the value stored in the posts table, resulting in a near-duplicate post on import.
Most other character data in the export is wrapped with wxr_cdata(), and both post content and excerpts have a special export-ready filter:
wxr_cdata( apply_filters( 'the_content_export', $post->post_content ) )
This changeset treats post titles like other character data and provides a filter if additional handlers are needed.
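The difference between the two approaches can be sketched in standalone PHP. This is only an illustration under stated assumptions: sketch_cdata() is a hypothetical stand-in for wxr_cdata(), the strip_tags() plus htmlspecialchars() call is an approximation of title sanitization rather than the actual the_title_rss filter chain, and the stored title is made up for the example.

```php
<?php
// Hypothetical stand-in for wxr_cdata(): wrap a string in a CDATA section,
// splitting any "]]>" so it cannot terminate the section early.
function sketch_cdata( string $str ): string {
	return '<![CDATA[' . str_replace( ']]>', ']]]]><![CDATA[>', $str ) . ']]>';
}

// Read the <title> text back the way an XML parser would see it on import.
function sketch_read_title( string $xml ): string {
	$doc = new DOMDocument();
	$doc->loadXML( $xml );
	return $doc->getElementsByTagName( 'title' )->item( 0 )->textContent;
}

// Hypothetical title as it might be stored in the posts table.
$stored = 'Alice &amp; <em>Bob</em>';

// Approximation of the old behaviour (an assumption, not the real filter
// chain): tags are stripped and the text is escaped without double-encoding.
$sanitized = htmlspecialchars( strip_tags( $stored ), ENT_QUOTES, 'UTF-8', false );
$old_xml   = '<item><title>' . $sanitized . '</title></item>';

// CDATA approach: the title is written unmodified inside a CDATA section.
$new_xml = '<item><title>' . sketch_cdata( $stored ) . '</title></item>';

// The sanitized title no longer matches the stored value after parsing.
var_dump( sketch_read_title( $old_xml ) === $stored ); // bool(false)

// The CDATA-wrapped title round-trips unchanged.
var_dump( sketch_read_title( $new_xml ) === $stored ); // bool(true)
```

In this sketch the sanitized title comes back from the parser as Alice & Bob, which differs from the stored string, while the CDATA-wrapped title is returned byte for byte, so an exact-match lookup on the title would succeed.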
Attachments (2)
Change History (9)
#2 @ 4 years ago
- Keywords commit needs-dev-note added
This is a nice enhancement, and it also brings better consistency across the data sent to the exporter.
Adding needs-dev-note to make sure it's mentioned in the Miscellaneous Changes dev note.
#5 @ 4 years ago
Interestingly, this bug does point to the behaviour of the_title_rss being incorrect: per the RSS spec, the <title> field can contain HTML tags; clients should just treat it as plain text.
I don't think it's worth changing the behaviour of that filter, though. 🙂
Patch refreshed - @since mention added