Ticket #9633 (closed defect (bug): fixed)

Opened 3 years ago

Last modified 2 years ago

wp:meta_value does not escape correctly

Reported by: gslin Owned by: westi
Priority: normal Milestone: 3.0
Component: Export Version:
Severity: normal Keywords: has-patch commit early
Cc:

Description

In wp-admin/includes/export.php:

<wp:meta_value><?Php echo $meta->meta_value; ?></wp:meta_value>

Since meta_value might have "&" or "<" this kind of special chars, we need to escape it:

<wp:meta_value><?php echo htmlspecialchars($meta->meta_value); ?></wp:meta_value>

Attachments

9633.patch Download (531 bytes) - added by hakre 3 years ago.
9633.2.patch Download (524 bytes) - added by hakre 3 years ago.
CDATA to the rescue

Change History

  • Milestone changed from Unassigned to 2.8

hakre3 years ago

  • Keywords has-patch added; needs-patch removed

shouldn't that be html entities, even?

Well, send me the DTD for that export-xml and I can tell you. normally there is no problem with having elements inside other elements within xml.

I got an export file from our customer, the file's wp:meta_value has serialized value, like this one:

<wp:meta_value>a:9:{s:5:"title";s:61:"[keep private]";s:11:"description";s:744:"[some content with & char]";s:4:"date";s:14:"[keep private]";s:6:"author";s:11:"[keep private]";s:14:"author_profile";s:40:"[keep private]"";s:8:"duration";s:5:"05:10";s:9:"embedcode";s:198:"[some content with < char]";}</wp:meta_value>

  • Keywords needs-patch added; has-patch removed

I believe htmlentities should be used, or the thing should be put within a cdata. else, any meta value with invalid chars and invalid named entities will break the feed just as well

  • Status changed from new to closed
  • Resolution set to wontfix

generally html_specialchars() should be sufficent because the export names the charset used and therefore html_enteties() is not needed.

but this is quite theoretical, most important is the import. there is not a problem with that invalid xml export, wordpress does not take care for validity on the import side nor does it have a problem with & characters inside the content:

// Now for post meta
preg_match_all('|<wp:postmeta>(.*?)</wp:postmeta>|is', $post, $postmeta);

/wordpress-trunk/wp-admin/import/wordpress.php ~ line 533

so there is no bug with the export. additionally, the import does not decode anything. therefore, the content should not be encoded on export, wether be it with html_specialchar() nor with html_entities(). this would break the import.

i suggest to close as won't fix now. the other suggestion is to improve the import as well but that needs a bigger patch and more developer feedback. without adoption in the immport a fix here should not considered to be usefull because it will break the import.

gerenally i think this should be fixed in import and export. feel free to reopen and provide more information.

  • Milestone 2.8 deleted

agreed to a large extent -- importer and exporter should get fixed together in 2.9 or 3.0.

hakre3 years ago

CDATA to the rescue

  • Milestone set to 2.8

The propper way in 2.8 is CDATA. For the rewrite of import and export I suggest to open another ticket.

  • Milestone 2.8 deleted

there is another ticket for import/export already, with a milestone of future. it initially dealt with beautifying the export, and Ryan highlighted that it might break the importer, which is regex based. he also hinted that there is no point in changing something that works, even if it doesn't technically validate against a dtd.

  • Keywords has-patch added; needs-patch removed
  • Status changed from closed to reopened
  • Resolution wontfix deleted
  • Milestone set to 2.8

-1 for close. CDATA does not break the importer. +1 for the rest as in #8471.

  • Keywords commit added

let's see if it generates any interest

traction?

  • Keywords early added
  • Milestone changed from 2.8 to 2.9

supposedly not for 2.8

patch still applies clean

  • Owner set to westi
  • Status changed from reopened to accepted
  • Milestone changed from 2.9 to 3.0

Probably relates to #10619 as well.

We should fix both of these early in 3.0

This patch looks ok and tests out well for me.

Old versions of WordPress back to at least 2.7.0 (I only tested this far back) will import these encoded values fine.

I added some simple test cases for the import of these files to wordpress-tests and also discovered #11574 in the process.

  • Status changed from accepted to closed
  • Resolution set to fixed

(In [12506]) Wrap the content of <wp:meta_value></wp:meta_value> in CDATA for WXR exports. Fixes #9633 props hakre.

Note: See TracTickets for help on using tickets.