Opened 19 years ago
Closed 18 years ago
#3570 closed defect (bug) (worksforme)
WXR imports partial entries only
| Reported by: | | Owned by: | |
|---|---|---|---|
| Milestone: | | Priority: | low |
| Severity: | normal | Version: | 2.1 |
| Component: | Administration | Keywords: | WXR |
| Focuses: | | Cc: | |
Description
I've tried to import the RSS file with my posts from http://byteslibres.wordpress.com into a local server running wordpress-2.0.6 on a Gentoo Linux x86 host.
(I don't know which version of WordPress the wordpress.com site is running.)
The problem is that the posts get truncated at the accents (which are encoded in the iso-8859-1 encoding). Apparently, this confuses the parser during the import process.
The encoding used in the RSS is not specified in the XML file... shouldn't there be a tag specifying it?
In the attached files you will find two screenshots, one from my wordpress.com site and the other from my local server, so that you can see the problem.
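To illustrate why a missing encoding declaration matters, here is a minimal sketch in Python. It is not the WordPress importer's code, and the importer may fail for a different reason; the sample item and the helper `try_parse` are hypothetical. It only shows the general behaviour of an XML parser that assumes UTF-8 when no encoding is declared.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: one item whose title holds an accented character,
# stored as ISO-8859-1 bytes (the byte 0xE9 is "é" in that encoding).
latin1_body = "<item><title>caf\u00e9 libre</title></item>".encode("iso-8859-1")


def try_parse(raw):
    """Return the title text, or None if the parser rejects the input."""
    try:
        return ET.fromstring(raw).findtext("title")
    except ET.ParseError:
        return None


# No declaration: the parser assumes UTF-8 and rejects the lone 0xE9 byte.
undeclared = try_parse(latin1_body)

# With a declaration naming the real encoding, the same bytes parse fine.
declared = try_parse(b'<?xml version="1.0" encoding="ISO-8859-1"?>\n' + latin1_body)
```

Here the parser refuses the whole document rather than truncating it, so this is only an analogy for the truncated posts, not a reproduction of them.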
Attachments (5)
Change History (19)
#2
@
19 years ago
- Milestone set to 2.1
- Version changed from 2.0.6 to 2.1
Thank you for participating in WordPress!
WordPress.com runs trunk (WP 2.1)
It is best to focus on one specific issue in each ticket. Here we could first look at whether, and why, WordPress (2.1) is not including the encoding in the feed.
WORKAROUND
Export your blog at WordPress.com and use http://www.technosailor.com/wordpress-to-wordpress-import/ to import into 2.0.6.
#3
@
19 years ago
We need two things:
- Could you upload the export file so we can test it?
- Let us know what version of PHP you're using.
#5
@
19 years ago
I'm using PHP 5.1.6-r6.
Of the two XML files exported by wordpress.com, the one that I've tried is the second one.
#6
@
19 years ago
- Keywords WXR added
- Summary changed from Problems importing data from my wordpress.com blog into my local server runing wordpress to WXR trips up on accented characters
#7
@
19 years ago
- Summary changed from WXR trips up on accented characters to WXR imports partial entries only
#8
@
19 years ago
So, I've narrowed it down to the fact that WXR somehow messes up UTF-8 special char encoding. I can't import the entries completely either. They stop at the first accented character.
#9
@
19 years ago
I used iconv to convert the export file to UTF-8 and got it to work.
iconv -f ISO_8859-1 -t UTF-8 wordpress.2007-01-12.xml > wordpress-iconv.xml
I think we need to put the <?xml> header in the export file so that we get the encoding specification and use iconv (if installed) to convert to the blog_charset of the target blog.
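The idea above (read the declared encoding, then convert to the target charset) could be sketched as follows. This is an illustration in Python, not the WordPress implementation. The function names `declared_encoding` and `convert_export` are hypothetical, and the fallback to ISO-8859-1 when no declaration is present is an assumption that matches this ticket's export file only.

```python
import re


def declared_encoding(raw: bytes, default: str = "iso-8859-1") -> str:
    """Return the encoding named in the XML declaration, else `default`.

    The ISO-8859-1 default is an assumption made for this ticket's file.
    """
    match = re.match(rb'''<\?xml[^>]*?encoding=["']([A-Za-z0-9._-]+)["']''', raw)
    return match.group(1).decode("ascii") if match else default


def convert_export(raw: bytes, target: str = "utf-8") -> bytes:
    """Re-encode an export file to `target` and declare that encoding."""
    text = raw.decode(declared_encoding(raw))
    # Drop any existing declaration so the new one is the only one.
    text = re.sub(r"^<\?xml[^>]*\?>\s*", "", text)
    header = f'<?xml version="1.0" encoding="{target.upper()}"?>\n'
    return (header + text).encode(target)


# Hypothetical input: ISO-8859-1 bytes with no declaration, as in this ticket.
sample = "<rss><title>caf\u00e9</title></rss>".encode("iso-8859-1")
converted = convert_export(sample)
```

In an importer, `target` would come from the blog_charset option of the target blog rather than being hard-coded.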
#10
follow-up:
↓ 12
@
19 years ago
Export now has <?xml?> header with the encoding of the charset option. ([5263])
Also the categories didn't get imported correctly!