WordPress.org

Make WordPress Core

Opened 7 years ago

Closed 7 years ago

Last modified 7 years ago

#4452 closed defect (bug) (fixed)

wpx can include invalid named entities in comment author name

Reported by: tellyworth
Owned by:
Milestone: 2.2.2
Priority: normal
Severity: normal
Version: 2.2.1
Component: Administration
Keywords:
Focuses:
Cc:

Description

Hi,

WP's XML export doesn't currently escape the contents of many fields, including the comment author. If those fields include named HTML entities, the result is invalid XML. The importer handles it just fine, but some browsers will complain with an error or refuse to download the export file if the XML doesn't validate.
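
To illustrate why a named HTML entity breaks the file, here is a minimal sketch in Python (not the WordPress PHP code). The `<author>` element and the `is_well_formed` helper are hypothetical names chosen for the example. XML predefines only `amp`, `lt`, `gt`, `quot` and `apos`, so a parser with no DTD should reject `&eacute;`.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text: str) -> bool:
    """Return True if xml_text parses as XML, False otherwise (hypothetical helper)."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# &eacute; is an HTML named entity that XML does not predefine.
named = "<author>Jos&eacute;</author>"
# A numeric character reference for the same character needs no declaration.
numeric = "<author>Jos&#233;</author>"
# Inside CDATA the ampersand is plain text, so the parser never sees an entity.
cdata_wrapped = "<author><![CDATA[Jos&eacute;]]></author>"

for label, doc in [("named", named), ("numeric", numeric), ("cdata", cdata_wrapped)]:
    print(label, is_well_formed(doc))
```

Under these assumptions only the first document is rejected; the numeric reference and the CDATA-wrapped value both parse.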

Attached is an example of the problem output, and a patch that uses CDATA escaping on the comment author field. Other fields could be escaped too, but I've limited the change to the one that I've seen cause a problem in the wild.
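
The export-side idea can be sketched roughly as follows. This is an illustration in Python, not the attached patch: `cdata` and `export_field` are hypothetical helpers, and the unprefixed `comment_author` tag is used only so the fragment parses without a namespace declaration. The one subtlety is that a CDATA section may not contain `]]>`, so that sequence is split across two adjacent sections.

```python
import xml.etree.ElementTree as ET


def cdata(text: str) -> str:
    """Wrap text in a CDATA section (hypothetical helper).

    "]]>" would end the section early, so it is split into "]]" and ">"
    carried by two adjacent CDATA sections.
    """
    return "<![CDATA[" + text.replace("]]>", "]]]]><![CDATA[>") + "]]>"


def export_field(tag: str, value: str) -> str:
    """Emit one export element with its value CDATA-escaped (hypothetical helper)."""
    return f"<{tag}>{cdata(value)}</{tag}>"


original = "Jos&eacute; ]]> & friends"
field = export_field("comment_author", original)

# The fragment is well-formed, and a parser returns the value unchanged.
parsed = ET.fromstring(field)
print(parsed.text == original)
```

Because the value is passed through verbatim, the stored entity text survives the round trip and no per-entity translation table is needed.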

On the import side, get_tag() will accept CDATA on any field now. It should retain backwards compatibility with export files created prior to this patch.
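
A rough sketch of what "accept CDATA on any field" could look like on the import side, again in Python rather than the importer's PHP. The function reuses the name `get_tag` for readability, but its body and regular expressions are assumptions for illustration, not the code from the patch. A value is unwrapped if it is enclosed in CDATA and returned as-is otherwise, which is what keeps older export files readable.

```python
import re


def get_tag(fragment: str, tag: str) -> str:
    """Return the content of the first <tag> element in fragment.

    Illustrative only: unwraps a CDATA-enclosed value, and returns a plain
    value unchanged so that pre-CDATA export files still import.
    """
    name = re.escape(tag)
    # Require ">" or whitespace after the name so "wp:comment" does not
    # match "wp:comment_author".
    match = re.search(rf"<{name}(?:\s[^>]*)?>(.*?)</{name}>", fragment, re.DOTALL)
    if match is None:
        return ""
    value = match.group(1)
    wrapped = re.fullmatch(r"<!\[CDATA\[(.*)\]\]>", value, re.DOTALL)
    if wrapped:
        # Rejoin any "]]>" that the exporter split across two sections.
        value = wrapped.group(1).replace("]]]]><![CDATA[>", "]]>")
    return value


old_export = "<wp:comment_author>Jane</wp:comment_author>"
new_export = "<wp:comment_author><![CDATA[Jos&eacute;]]></wp:comment_author>"

print(get_tag(old_export, "wp:comment_author"))
print(get_tag(new_export, "wp:comment_author"))
```

Both the old and the new format yield the bare field value, so the caller does not need to know which exporter produced the file.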

Attachments (3)

import-cdata-r5694.patch (2.6 KB) - added by tellyworth 7 years ago.
export-error.xml (2.8 KB) - added by tellyworth 7 years ago.
4452-2.diff (554 bytes) - added by foolswisdom 7 years ago.
tellyworth found a problem; this fix from tellyworth fixes the problem importing the post body


Change History (15)

tellyworth 7 years ago

comment:1 rob1n 7 years ago

  • Milestone set to 2.2.2

comment:2 ryan 7 years ago

Looks okay to me.

comment:3 ryan 7 years ago

  • Resolution set to fixed
  • Status changed from new to closed

(In [5711]) Use CDATA escaping on fields. Props tellyworth. fixes #4452

comment:4 ryan 7 years ago

  • Resolution fixed deleted
  • Status changed from closed to reopened

Committed for 2.3. Let's see how it handles and then schedule it for 2.2.2.

foolswisdom 7 years ago

tellyworth found a problem; this fix from tellyworth fixes the problem importing the post body

comment:5 ryan 7 years ago

  • Resolution set to fixed
  • Status changed from reopened to closed

(In [5718]) Regex fix. Props tellyworth. fixes #4452

comment:6 foolswisdom 7 years ago

  • Resolution fixed deleted
  • Status changed from closed to reopened
  • Version set to 2.2.1

Re-opening; this is currently only fixed on trunk.

comment:7 jhodgdon 7 years ago

I am not sure whether this should go on the same ticket or a different one, but the comment content is another field that might contain entities. As of [5744], if you add a comment to a post with an entity, such as &eacute; or &ntilde; (common in Spanish for accents), your XML export file will not validate, as described in this bug report. So the wp:comment field in the export probably needs to be escaped with CDATA too.

comment:8 jhodgdon 7 years ago

  • Cc jhodgdon added

comment:9 foolswisdom 7 years ago

Marked #4684 as a duplicate. Can we get this checked into the 2.2 branch? Current WordPress.com exports imported into 2.2.1 are broken because of this fix to trunk.

comment:10 markjaquith 7 years ago

  • Resolution set to fixed
  • Status changed from reopened to closed

(In [5822]) Use CDATA escaping/unescaping for comment_author. props tellyworth. fixes #4452 for 2.2.x

comment:11 markjaquith 7 years ago

(In [5846]) Roll back export portion of #4452 for 2.2.x, see #4452, see #4686

comment:12 markjaquith 7 years ago

Note: the current status of 2.2.x (starting with 2.2.2) is that its export format is unchanged, but it can handle exports from trunk/WP.com.
