WordPress.org

Make WordPress Core

Opened 7 years ago

Closed 7 years ago

#3570 closed defect (bug) (worksforme)

WXR imports partial entries only

Reported by: pdenapo
Owned by: foolswisdom
Milestone:
Priority: low
Severity: normal
Version: 2.1
Component: Administration
Keywords: WXR
Focuses:
Cc:

Description

I tried to import the RSS file with my posts from http://byteslibres.wordpress.com into a local server running wordpress-2.0.6 on a Gentoo Linux x86 host.
(I don't know which version of WordPress the wordpress.com site is running.)

The problem is that the posts get truncated at the accented characters (which are encoded in ISO-8859-1). Apparently, this confuses the parser during the import process.

The encoding used in the RSS file is not specified in the XML file. Shouldn't there be a tag specifying it?

In the attached files you will find two screenshots, one from my wordpress.com site and the other from my local server, so that you can see the problem.

Attachments (5)

wordpress.com-snapshot.png (69.9 KB) - added by pdenapo 7 years ago.
wordpress.com-snapshot.2.png (69.9 KB) - added by pdenapo 7 years ago.
local-wordpress-snapshot.png (60.1 KB) - added by pdenapo 7 years ago.
wordpress.2007-01-10.xml (14.9 KB) - added by pdenapo 7 years ago.
The file exported by wordpress.com
wordpress.2007-01-12.xml (20.7 KB) - added by pdenapo 7 years ago.
Another file exported by wordpress.com


Change History (19)

comment:1 pdenapo 7 years ago

Also, the categories didn't get imported correctly!

comment:2 foolswisdom 7 years ago

  • Milestone set to 2.1
  • Version changed from 2.0.6 to 2.1

Thank you for participating in WordPress!

WordPress.com runs trunk (WP 2.1)

It is best to focus on one specific issue in each ticket. Here we could first look at whether, and why, WordPress (2.1) is not including the encoding in the feed.

WORKAROUND
Export your blog at WordPress.com and use http://www.technosailor.com/wordpress-to-wordpress-import/ to import it into 2.0.6.

comment:3 matt 7 years ago

We need two things:

  • Could you upload the export file so we can test it?
  • Let us know what version of PHP you're using.

comment:4 foolswisdom 7 years ago

  • Milestone changed from 2.1 to 2.2
  • Owner changed from anonymous to foolswisdom

pdenapo 7 years ago

The file exported by wordpress.com

pdenapo 7 years ago

Another file exported by wordpress.com

comment:5 pdenapo 7 years ago

I'm using PHP 5.1.6-r6.
Of the two XML files exported by wordpress.com, the one I tried is the second one.

comment:6 rob1n 7 years ago

  • Keywords WXR added
  • Summary changed from Problems importing data from my wordpress.com blog into my local server runing wordpress to WXR trips up on accented characters

comment:7 rob1n 7 years ago

  • Summary changed from WXR trips up on accented characters to WXR imports partial entries only

comment:8 rob1n 7 years ago

So, I've narrowed it down to the fact that WXR somehow messes up UTF-8 special char encoding. I can't import the entries completely either. They stop at the first accented character.
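
For illustration only, here is a minimal Python sketch of why ISO-8859-1 bytes can stop a consumer that expects UTF-8. The sample text and the "keep everything before the first bad byte" behaviour are assumptions; this ticket does not establish where in WordPress the truncation actually happens.

```python
# An accented word as it might appear in an ISO-8859-1 export
# (sample text chosen for illustration, not taken from the ticket).
original = "canción completa"
latin1_bytes = original.encode("iso-8859-1")  # "ó" becomes the single byte 0xF3

# A strict UTF-8 decoder rejects that byte because no valid
# continuation byte follows it.
try:
    latin1_bytes.decode("utf-8")
    decoded_ok = True
except UnicodeDecodeError as err:
    decoded_ok = False
    bad_offset = err.start  # index of the first offending byte

# Hypothetical consumer that keeps only the bytes before the first
# invalid one, which mimics an entry cut off at the accent.
truncated = latin1_bytes[:bad_offset].decode("utf-8")
print(truncated)
```

In this sketch the text is cut just before the "ó", which matches the symptom described above of entries stopping at the first accented character.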

comment:9 ryan 7 years ago

I used iconv to convert the file to UTF-8 and got it to work.

iconv -f ISO_8859-1 -t UTF-8 wordpress.2007-01-12.xml > wordpress-iconv.xml

I think we need to put the <?xml?> header in the export file so that we get the encoding specification, and then use iconv (if installed) to convert to the blog_charset of the target blog.
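
The import-side idea above could be sketched roughly as follows. This is a Python illustration, not WordPress code: it uses Python's built-in codecs in place of iconv, and the function name `convert_export`, the regular expression, and the fallback charset are all hypothetical choices made for the example.

```python
import re

# Matches the encoding pseudo-attribute of an XML declaration at the
# very start of the file, e.g. <?xml version="1.0" encoding="ISO-8859-1"?>
_DECL_RE = re.compile(
    rb'^<\?xml[^>]*?encoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\']'
)


def convert_export(raw: bytes, target_charset: str = "UTF-8",
                   fallback: str = "UTF-8") -> bytes:
    """Re-encode an export file into target_charset (hypothetical helper).

    The source encoding is read from the XML declaration; if there is
    no declaration, the assumed fallback charset is used instead.
    """
    match = _DECL_RE.match(raw)
    if match:
        source = match.group(1).decode("ascii")
        # Point the declaration at the new charset before re-encoding.
        raw = (raw[:match.start(1)]
               + target_charset.encode("ascii")
               + raw[match.end(1):])
    else:
        source = fallback
    return raw.decode(source).encode(target_charset)
```

Called on the bytes of an export that declares ISO-8859-1, this returns UTF-8 bytes whose declaration names UTF-8. Without a declaration the helper can only guess, which is why the header matters for this approach.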

comment:10 follow-up: rob1n 7 years ago

Export now has <?xml?> header with the encoding of the charset option. ([5263])
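
The contents of [5263] are not shown in this ticket, so the following is only a Python sketch of the general idea: take the charset from a setting and write it into the XML declaration of the export. The names `xml_declaration` and `build_export` are hypothetical.

```python
import codecs


def xml_declaration(charset: str) -> str:
    """Build an XML declaration naming the given charset (hypothetical helper)."""
    codecs.lookup(charset)  # raises LookupError for an unknown charset
    return f'<?xml version="1.0" encoding="{charset}"?>'


def build_export(body: str, charset: str) -> bytes:
    """Prefix the body with the declaration and encode both in that charset."""
    return (xml_declaration(charset) + "\n" + body).encode(charset)


export_bytes = build_export("<title>canción</title>", "ISO-8859-1")
```

The point of the design is that the declared charset and the bytes actually written come from the same setting, so an importer can rely on the declaration.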

comment:11 ryan 7 years ago

  • Milestone changed from 2.2 to 2.3

comment:12 in reply to: ↑ 10 Nazgul 7 years ago

Replying to rob1n:

Export now has <?xml?> header with the encoding of the charset option. ([5263])

Does this fix this issue?

comment:13 ryan 7 years ago

  • Milestone changed from 2.3 to 2.4 (next)

comment:14 foolswisdom 7 years ago

  • Milestone 2.4 (next) deleted
  • Resolution set to worksforme
  • Status changed from new to closed

Closing as worksforme for now, because the original reporter has not replied since the implementation changed.
