Ticket #8019 (closed defect (bug): fixed)

Opened 16 months ago

Last modified 7 weeks ago

WXR import incorrectly handles comment ID's, Results in threading being lost

Reported by: DD32 Owned by: joostdevalk
Priority: high Milestone: 2.9
Component: Import Version: 2.7
Severity: major Keywords: needs-testing
Cc: WordPress@…

Description

The Export'd WXR files & Import process contains the original Comment ID, and Comment Parent ID allready. However, Upon import, the Comment ID's are re-assigned, however the parent ID's are not.

The result is that the comment threading is lost entirely.

Eg: Previously comment 1234 was in reply to comment 1222, however, comment 1222 is now comment 22, but comment 1234 still has '1222' as its parent_id.

2 options are possible:

  • Remapping the comment parent ID's
  • Preserving the comment ID's

The only issue i see with preserving the comment ID's is that importing into another blog with comments allready could cause conflicts, which would require remapping of some comment ID's regardless.

Attachments

wordpress.php.diff Download (3.5 KB) - added by joostdevalk 8 months ago.
Patch
testwxr.xml Download (9.7 KB) - added by joostdevalk 8 months ago.
WXR File I used to test the patch
8019.diff Download (3.6 KB) - added by ryan 2 months ago.

Change History

 

ryan15 months ago

[9100] is how we fixed this for posts.

 

ryan15 months ago
  • keywords early added
  • milestone changed from 2.7 to 2.8

Too big a job for 2.7. Let's shoot for early 2.8 with a possible backport to a 2.7 dot release.

  • milestone changed from 2.8 to Future Release

Punting due to feature freeze. Reconsider with next release.

 

tott9 months ago

I am currently testing a patch that preserves the comment_IDs where possible. This solves most of the issues. I will attach the patch after some more testing.

 

lloydbudd9 months ago
  • owner set to tott
  • priority changed from normal to high
  • status changed from new to assigned
  • milestone changed from Future Release to 2.9

With threaded commenting in core with 2.7, now more threading in the wild. Raising priority. Depending on 2.8 release status and nature of the solution should consider getting in 2.8 .

Patch

  • keywords has-patch added; needs-patch early removed
  • owner changed from tott to joostdevalk

WXR File I used to test the patch

  • keywords needs-testing added
  • milestone changed from 2.9 to 2.8

I'd like to get this in in 2.8, I think it's a pretty simple and solid patch which solves a major annoyance...

  • keywords commit added

IRC discussion for reference:

[09:30] <jdevalk> basically threw all comments into an array, sort them by id, remapped the wxr id's to the new id's and attach the correct parent
[09:31] <jdevalk> also working on a plugin that let's you import wxr files that are on the server, so you're not capped by HTTP upload restraints
[09:31] <ddebernardy> I take it you tested before/after with your data already, no?
[09:31] <jdevalk> yes.
[09:33] <ddebernardy> patch's logic seems correct
[09:33] <jdevalk> yeah it's not too hard
[09:34] <ddebernardy> an alternative method, which might be more robust, could be re-key the comments array based on parents
[09:34] <ddebernardy> array ( comment_parent => comments )
[09:34] <ddebernardy> and then do a double foreach loop
[09:34] <jdevalk> not sure i follow? :)
[09:35] <ddebernardy> well, it would cover the case where a weirdo plays with his database
[09:35] <ddebernardy> and ends up with comment IDs in an order than hasn't much to do with the comments date or hierarchy
[09:35] <ddebernardy> but it's such an edge case that we can ignore it imo
[09:36] <jdevalk> yeah i think so too

Safe to commit imo.

  • milestone changed from 2.8 to 2.9

can wait until 2.9

  • milestone changed from 2.9 to 2.8

unless someone wants to try it...

  • keywords early added
  • milestone changed from 2.8 to 2.9

per discussion in IRC

  • keywords needs-patch added; has-patch needs-testing commit early removed

let's give the double array approach a try, just in case.

http://core.trac.wordpress.org/ticket/8019#comment:9

follow-up: ↓ 15 | 6 months ago  

ocean906 months ago

I want to test the patch but I get an error on WP and WPMU 2.8.4:

Parse error: syntax error, unexpected T_VARIABLE, expecting T_FUNCTION in .../wordpress.php on line 529

Is there something new? Would be nice if the bug can be fixed.

in reply to: ↑ 14 | 6 months ago  

ocean906 months ago

Sorry, my mistake.

It works now. Thanks.

follow-up: ↓ 17 | 5 months ago  

michaelh5 months ago

Assuming the correct advice was provided, this fix didn't solve the problem according to this Forum thread:

 http://wordpress.org/support/topic/312843

in reply to: ↑ 16 | 5 months ago  

michaelh5 months ago

Replying to michaelh:

Assuming the correct advice was provided, this fix didn't solve the problem

Retracting that statement as the patch wordpress.php.diff does in fact fix the threaded comment problem at 2.8.4:

 http://wordpress.org/support/topic/312843?replies=17#post-1225031

follow-up: ↓ 20 | 2 months ago  

azaozz2 months ago
  • keywords needs-testing added; needs-patch removed

Is this good to go?

if we make the assumption that they're tossed in, in the correct order, it worked fine a few months back.

there were some talks of using a double foreach loop in the event they weren't. but it's mostly an improvement over the current patch.

in reply to: ↑ 18 | 2 months ago  

tott2 months ago

Replying to azaozz:

Is this good to go?

I tested against current trunk. Works fine as import on an empty blog as well as when importing on a blog which already has post/comment ids provided in the import. This should cover most cases and all further items would be improvements to this patch.

ryan2 months ago

 

ryan2 months ago

Refreshed patch that applies against trunk. Needs testing.

 

ryan2 months ago
  • status changed from assigned to closed
  • resolution set to fixed

(In [12301]) Preserve parentage when importing comments. Props joostdevalk. fixes #8019

  • cc WordPress@… added

I assume that blogs which already have imported content will have to delete and re-import their comments to receive any benefits from this patch?

Note: See TracTickets for help on using tickets.