Opened 18 years ago
Closed 18 years ago
#4040 closed defect (bug) (fixed)
WXR importing duplicates categories
Reported by: | | Owned by: | |
---|---|---|---|
Milestone: | 2.2 | Priority: | low |
Severity: | normal | Version: | 2.2 |
Component: | Administration | Keywords: | has-patch commit |
Focuses: | Cc: |
Description
In the admin panel under [Manage] - [Import], importing a WordPress eXtended RSS (WXR) file incorrectly generates duplicate categories with the same cat_name.
For example, this occurs under the following conditions.
At export time:
cat_ID | cat_name | category_nicename |
3 | 書評 | book-review |
(書評 means book review in Japanese.)
After importing this WXR file, the categories look like this:
cat_ID | cat_name | category_nicename |
3 | 書評 | book-review |
4 | 書評 | %e6%9b%b8%e8%a9%95 |
Two 書評 categories are generated. One has the original "book-review" nicename and the other has "%e6%9b%b8%e8%a9%95", which is the sanitized form of "書評". All posts in the 書評 category are assigned to the new 書評 category with the nicename "%e6%9b%b8%e8%a9%95".
It seems that this duplication occurs only when a category's category_nicename is not equal to its sanitized cat_name.
To make this behavior easy to reproduce, I'll attach an example WXR file. I'll also attach a patch for this.
Attachments (3)
Change History (12)
#3
@
18 years ago
category_exists() relies on category_nicename to check whether a category exists, and it derives that category_nicename by calling sanitize_title(cat_name).
In the example described above, sanitize_title("書評") returns "%e6%9b%b8%e8%a9%95". Because no category with the category_nicename "%e6%9b%b8%e8%a9%95" exists yet, another 書評 category (cat_ID:4) is created.
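The mechanism described above can be modelled in a few lines of Python. This is only a sketch, not the WordPress code: `sanitize_title_model`, `category_exists_by_slug`, and `import_category_buggy` are hypothetical names, categories are plain dicts, and the slug function only approximates sanitize_title() (it hyphenates spaces and percent-encodes non-ASCII characters).

```python
from urllib.parse import quote


def sanitize_title_model(name):
    """Rough stand-in for sanitize_title() (assumption, not the real function)."""
    return quote(name.strip().lower().replace(" ", "-")).lower()


def category_exists_by_slug(categories, cat_name):
    """Look a category up by the slug derived from its name; 0 means not found."""
    slug = sanitize_title_model(cat_name)
    for cat in categories:
        if cat["category_nicename"] == slug:
            return cat["cat_ID"]
    return 0


def import_category_buggy(categories, cat_name):
    """Create the category whenever the slug lookup fails, as in the report."""
    cat_id = category_exists_by_slug(categories, cat_name)
    if cat_id == 0:
        cat_id = max((c["cat_ID"] for c in categories), default=0) + 1
        categories.append({
            "cat_ID": cat_id,
            "cat_name": cat_name,
            "category_nicename": sanitize_title_model(cat_name),
        })
    return cat_id
```

Starting from a single category with cat_ID 3, cat_name "書評", and category_nicename "book-review", calling `import_category_buggy(categories, "書評")` in this model appends a second 書評 entry, because the derived slug never matches "book-review".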
My patch uses cat_name instead of category_nicename to check category existence.
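As a rough illustration of the name-based check (a Python model, not the patch itself), the lookup compares cat_name directly and ignores the nicename. The function name `category_exists_by_name` and the dict layout are assumptions made for this sketch.

```python
def category_exists_by_name(categories, cat_name):
    """Return the cat_ID of the category with this exact name, or 0 if none."""
    for cat in categories:
        if cat["cat_name"] == cat_name:
            return cat["cat_ID"]
    return 0


# The category from the report: its nicename differs from the sanitized name.
existing = [
    {"cat_ID": 3, "cat_name": "書評", "category_nicename": "book-review"},
]
found_id = category_exists_by_name(existing, "書評")
```

Here `found_id` is the existing cat_ID, so an importer built on this check would have no reason to create a second 書評 category.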
#8
@
18 years ago
- Resolution fixed deleted
- Status changed from closed to reopened
I reopened this ticket to upload an additional patch that corrects the two points below:
- $post_cats should always include $cat_ID, regardless of whether the category already exists.
- Replace $post_ID with $post_id.
$post_ID is the old ID from the WXR file and $post_id is the new ID. The naming is confusing (I'm sorry, since I was the one who named $post_ID). It would be good to give them better names.
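The first point can be sketched as follows. This is a Python model under stated assumptions, not the importer code: `resolve_post_categories` is a hypothetical helper, and categories are plain dicts matched by cat_name.

```python
def resolve_post_categories(categories, names):
    """Map a post's category names to IDs, creating any that are missing."""
    post_cats = []
    for name in names:
        cat_id = next(
            (c["cat_ID"] for c in categories if c["cat_name"] == name), 0
        )
        if cat_id == 0:
            # Category not found: create it with the next free ID.
            cat_id = max((c["cat_ID"] for c in categories), default=0) + 1
            categories.append({"cat_ID": cat_id, "cat_name": name})
        # Appended on both paths, so existing categories are not dropped.
        post_cats.append(cat_id)
    return post_cats
```

The append sits outside the "not found" branch on purpose: if it ran only when a category was newly created, a post whose categories all existed already would end up with an empty list.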
Example WXR file, UTF-8 encoded.