Opened 4 years ago

Closed 2 years ago

Last modified 2 years ago

#15203 closed defect (bug) (fixed)

Export function does not properly escape ]]> (CDATA)

Reported by: ceefour
Owned by: duck_
Milestone: 3.4
Priority: normal
Severity: normal
Version: 3.0.1
Component: Export
Keywords: has-patch

Description

  1. Create a post whose content contains a raw <![CDATA[ ... ]]> section.
  2. Export the WordPress data as WXR.

The resulting WXR file is not well-formed XML, so XML parsers cannot read it.
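The failure can be reproduced outside WordPress. The following is a minimal Python sketch, not the WordPress export code: `naive_cdata` and the `<content>` element are hypothetical stand-ins for a wrapper that adds the CDATA markers without escaping anything.

```python
import xml.etree.ElementTree as ET


def naive_cdata(text: str) -> str:
    """Hypothetical stand-in: wrap text in a CDATA section without escaping."""
    return "<![CDATA[" + text + "]]>"


# Post content that itself contains a raw CDATA section.
post_content = "<p>see <![CDATA[ x ]]> here</p>"
document = "<content>" + naive_cdata(post_content) + "</content>"

# The inner "]]>" ends the outer CDATA section early, so the remainder
# (" here</p>]]>") is parsed as ordinary markup and the parse fails.
try:
    ET.fromstring(document)
    well_formed = True
except ET.ParseError:
    well_formed = False

print("well-formed:", well_formed)
```

Any conforming XML parser should reject this document for the same reason: the first "]]>" always closes the CDATA section.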

Attachments (4)

wordpress_export_cdata.patch (459 bytes) - added by ceefour 4 years ago.
Patch to fix export (this makes WXR readable to XML-compliant importers, but perhaps not WordPress's own importer)
15203.importer.diff (697 bytes) - added by duck_ 3 years ago.
Regex parser needs to understand escaped ]]>
15203.diff (654 bytes) - added by duck_ 3 years ago.
15203.importer.2.diff (865 bytes) - added by duck_ 3 years ago.
It needs to work for old exports too

Change History (15)

comment:1 ceefour 4 years ago

  • Cc ceefour added

Related to #7400 (Import hardcodes CDATA syntax)

comment:2 ceefour 4 years ago

]]>

should be escaped as

]]]]><![CDATA[>
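The replacement works by splitting the terminator across two adjacent CDATA sections: the first section ends after "]]" and the next one begins with ">". A minimal Python sketch of that escaping follows, with a round trip through a standard XML parser; `cdata_escape` and the `<content>` element are hypothetical names for illustration, not the WordPress function.

```python
import xml.etree.ElementTree as ET

CDATA_END = "]]>"
CDATA_END_ESCAPED = "]]]]><![CDATA[>"


def cdata_escape(text: str) -> str:
    """Wrap text in CDATA, splitting any embedded "]]>" across two sections."""
    return "<![CDATA[" + text.replace(CDATA_END, CDATA_END_ESCAPED) + "]]>"


post_content = "<p>see <![CDATA[ x ]]> here</p>"
document = "<content>" + cdata_escape(post_content) + "</content>"

# The parser joins the adjacent CDATA sections back into one text node.
round_tripped = ET.fromstring(document).text

print(round_tripped == post_content)
```

An XML-compliant reader needs no special handling for this form, since adjacent CDATA sections are simply concatenated character data.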

comment:3 ceefour 4 years ago

The file to fix is wp-admin/includes/export.php, in the wxr_cdata() function.

ceefour 4 years ago

Attachment wordpress_export_cdata.patch added: Patch to fix export (this makes WXR readable to XML-compliant importers, but perhaps not WordPress's own importer)

comment:4 ceefour 4 years ago

  • Keywords patch added

comment:5 duck_ 4 years ago

  • Keywords has-patch added; wxr export cdata escape backup patch removed

I had thought of this whilst preparing for #15197, but obviously forgot about it. Though I cannot reproduce using your steps above, since the > is encoded as &gt; in the output file, there is a possibility of this happening in other circumstances (see #15294).

Before this can go in, I believe the regular expression based importer might have to be updated.
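A regex-based reader does not get the CDATA joining for free, which is why it would need a change. The following Python sketch shows one way such a reader could accept both escaped and unescaped files. It is an assumption-laden illustration, not the WordPress importer: `get_tag` and the `<content>` element are hypothetical.

```python
import re

CDATA_START = "<![CDATA["
CDATA_END = "]]>"
CDATA_END_ESCAPED = "]]]]><![CDATA[>"


def get_tag(xml_text: str, tag: str) -> str:
    """Hypothetical regex-based extraction of a single tag's text."""
    name = re.escape(tag)
    match = re.search(rf"<{name}[^>]*>(.*?)</{name}>", xml_text, re.DOTALL)
    if match is None:
        return ""
    value = match.group(1)
    if value.startswith(CDATA_START) and value.endswith(CDATA_END):
        # Strip only the outermost CDATA wrapper.
        value = value[len(CDATA_START):-len(CDATA_END)]
        # Undo the split terminator; files without it pass through unchanged.
        value = value.replace(CDATA_END_ESCAPED, CDATA_END)
    return value


new_export = "<content><![CDATA[a ]]]]><![CDATA[> b]]></content>"
old_export = "<content><![CDATA[a ]]> b]]></content>"

print(get_tag(new_export, "content"))
print(get_tag(old_export, "content"))
```

One caveat of this approach: an older, unescaped file whose content happens to contain the literal escaped sequence would be altered on import, so a reader may want to check the file's format version before un-escaping.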

comment:6 westi 3 years ago

  • Owner set to duck_
  • Status changed from new to assigned

duck_, is this ticket still relevant, or can it be closed?

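duck_ 3 years ago

Attachment 15203.importer.diff added: Regex parser needs to understand escaped ]]>

duck_ 3 years ago

Attachment 15203.diff added

duck_ 3 years ago

Attachment 15203.importer.2.diff added: It needs to work for old exports too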

comment:7 duck_ 2 years ago

In [19858]:

Export valid XML by escaping the closing CDATA sequence "]]>". Props ceefour. See #15203.

comment:8 duck_ 2 years ago

  • Milestone changed from Awaiting Review to 3.4

comment:9 duck_ 2 years ago

In [19859]:

Bump WXR_VERSION because of r19858 which affects the regex based importer. See #15203.

comment:10 duck_ 2 years ago

  • Resolution set to fixed
  • Status changed from assigned to closed

comment:11 archon810 2 years ago

  • Cc admin@… added

Just wanted to stop by and say that I hit this issue because CDATA tags in post content broke the exported XML. I applied the fix manually (as WP 3.4 isn't out yet) and verified that the final XML validates. Thanks, everyone.