Opened 6 years ago

Closed 6 years ago

#4613 closed defect (bug) (fixed)

HTML entities cause problems in WXR import

Reported by: momo360modena
Owned by: westi
Priority: normal
Milestone: 2.3
Component: Administration
Version: 2.2.1
Severity: major
Keywords: import, wxr, xml has-patch
Cc:

Description

I have a problem with the importer.

If category names contain HTML entities, such as "&amp;",
then during import the importer creates "a zillion duplicate categories".

This happens because the importer cleans the category name with:

$categories[$cat_index] = $wpdb->escape($this->unhtmlentities(str_replace(array ('<![CDATA[', ']]>'), '', $category)));

After that, the lookup query returns false...

Example: in the DB you have
Category name: 'Toto &amp; Blurps'
In the query (l.319): 'Toto & Blurps'

"SELECT cat_ID FROM $wpdb->categories WHERE cat_name = '$category'"
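
The mismatch can be reproduced outside WordPress. The following is only a minimal sketch, not the importer's code: it assumes unhtmlentities() behaves roughly like PHP's html_entity_decode(), and it uses a plain array ($stored_names, a hypothetical name) as a stand-in for the categories table.

```php
<?php
// Hypothetical stand-in for the stored category names (not real $wpdb data).
$stored_names = array('Toto &amp; Blurps');

// Category name as it might appear in the WXR file, wrapped in CDATA.
$raw = '<![CDATA[Toto &amp; Blurps]]>';

// Same cleaning steps as the importer line quoted in this ticket, with
// html_entity_decode() assumed to approximate unhtmlentities().
$cleaned = html_entity_decode(str_replace(array('<![CDATA[', ']]>'), '', $raw));

// Exact-match lookup, comparable to WHERE cat_name = '$category'.
$found = in_array($cleaned, $stored_names, true);

var_dump($cleaned); // string(13) "Toto & Blurps"
var_dump($found);   // bool(false)
```

Because the decoded name never equals the stored, entity-encoded name, every import run would treat the category as new.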

See patch for one possible solution.
See XML for an example of the problem...

Attachments (1)

fix_import.patch (760 bytes) - added by momo360modena 6 years ago.


Change History (9)

  • Keywords wxr added; wsr removed
  • Milestone changed from 2.2.2 to 2.2.3
  • Keywords has-patch added
  • Milestone changed from 2.2.3 to 2.3 (trunk)
  • Owner changed from anonymous to westi
  • Status changed from new to assigned

I'll take a look at this.

Is it possible to have an example import file with the issue?

comment:5   ryan, 6 years ago

The WP importer needs to be updated to use taxonomy and the sanitize term API.

comment:6   ryan, 6 years ago

(In [5937]) Update WP importer to use taxonomy and query cat based on slug. see #4613
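
As a rough illustration of why matching on a slug sidesteps the entity problem: example_slug() below is a hypothetical helper written for this sketch. It is not WordPress's own sanitizing function, and the actual changeset may work differently.

```php
<?php
// Hypothetical helper for illustration only: reduces a name to a slug.
function example_slug($name) {
    $name = html_entity_decode($name);                // '&amp;' becomes '&'
    $name = strtolower($name);
    $name = preg_replace('/[^a-z0-9]+/', '-', $name); // runs of other chars become '-'
    return trim($name, '-');
}

$from_db  = example_slug('Toto &amp; Blurps'); // name as stored
$from_wxr = example_slug('Toto & Blurps');     // name after the importer decodes it

var_dump($from_db);               // string(11) "toto-blurps"
var_dump($from_db === $from_wxr); // bool(true)
```

Both spellings reduce to the same slug, so a slug-based lookup would find the existing category instead of creating a duplicate.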

comment:7   ryan, 6 years ago

Try that out.

  • Resolution set to fixed
  • Status changed from assigned to closed