
Opened 16 years ago

Closed 13 years ago

#7612 closed enhancement (fixed)

Tumblr importer

Reported by: hailin Owned by: Otto42
Milestone: WordPress.org Priority: normal
Severity: normal Version:
Component: Import Keywords:
Focuses: Cc:

Description

Hao wrote a tool that converts a Tumblr blog into a WordPress XML file, which can then be imported into WordPress blogs.

The tool is at http://haochen.me/tumblr/ and Hao blogged about it at
http://haochen.wordpress.com/2008/08/19/export-your-tumblr-blog-to-wordpress/

We want to incorporate this into WordPress core eventually.

Attachments (1)

tumblr_export.php (15.0 KB) - added by hailin 16 years ago.
export code using XML parser


Change History (12)

#1 @hailin
16 years ago

  • Type changed from defect to enhancement

I've reviewed and experimented with the Tumblr export code from Hao.

The current wordpress.org and wordpress.com import code relies on
preg_match to extract tags, largely for legacy reasons:
powerful XML parsing modules such as SimpleXMLElement are not
available in PHP 4.x. Using the XML parsers built into PHP 5.x can produce
much cleaner and faster code. I can envision significantly
improving our import code by rewriting the XML parsing logic once we
switch to PHP 5.x.

Hao's current Tumblr export code relies on SimpleXMLElement, which is
the preferred approach. I don't think it's worthwhile to rewrite it
using our existing, old preg_match approach.
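
To make the contrast concrete, here is a minimal sketch of the two styles of tag extraction. The XML sample is a hypothetical, simplified stand-in for Tumblr's export (the real format may differ), and the function names are made up for illustration; only preg_match and SimpleXMLElement are real PHP APIs.

```php
<?php
// Hypothetical, simplified Tumblr-style XML; the real export format may differ.
$xml = <<<XML
<tumblr>
  <posts>
    <post id="1" type="regular">
      <regular-title>Hello</regular-title>
      <regular-body>First post</regular-body>
    </post>
  </posts>
</tumblr>
XML;

// Legacy style: pull the first title out with a regular expression.
function title_via_regex($xml) {
    if (preg_match('|<regular-title>(.*?)</regular-title>|s', $xml, $m)) {
        return $m[1];
    }
    return null;
}

// Parser style: let SimpleXMLElement handle the document structure.
function titles_via_parser($xml) {
    $doc = new SimpleXMLElement($xml);
    $titles = array();
    foreach ($doc->posts->post as $post) {
        // Hyphenated element names need the brace syntax.
        $titles[] = (string) $post->{'regular-title'};
    }
    return $titles;
}
```

The parser version walks every post element by name, whereas the regex version has to be repeated, and kept in sync, for each tag of interest.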

I've suggested a better alternative approach:

Since Tumblr has a simple format, we can parse its XML directly and
create WordPress posts and categories, thus eliminating the intermediate step of exporting to a WordPress XML file.

The approach would be similar to
wp-admin/import/rss.php

where the following functions are used to create post/category:

wp_insert_post($post);
wp_create_categories($categories, $post_id);
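
A rough sketch of what such a direct import might look like follows. It assumes a hypothetical, simplified Tumblr XML layout and a made-up function name, import_tumblr_xml. The insert step is passed in as a callback so the sketch runs outside WordPress; inside WordPress the callback would presumably be 'wp_insert_post'.

```php
<?php
// Convert hypothetical Tumblr-style XML directly into post arrays and hand
// each one to an insert callback, with no intermediate WordPress XML file.
function import_tumblr_xml($xml, $insert_post) {
    $doc = new SimpleXMLElement($xml);
    $ids = array();
    foreach ($doc->posts->post as $item) {
        // Array keys follow the fields wp_insert_post() accepts.
        $post = array(
            'post_title'   => (string) $item->{'regular-title'},
            'post_content' => (string) $item->{'regular-body'},
            'post_status'  => 'publish',
        );
        $ids[] = call_user_func($insert_post, $post);
    }
    return $ids;
}

// Standalone demo with a stand-in for wp_insert_post that records its input.
$sample = '<tumblr><posts>'
    . '<post id="1"><regular-title>Hello</regular-title><regular-body>First post</regular-body></post>'
    . '<post id="2"><regular-title>Again</regular-title><regular-body>Second post</regular-body></post>'
    . '</posts></tumblr>';

$stored = array();
$fake_insert = function ($post) use (&$stored) {
    $stored[] = $post;
    return count($stored); // pretend this is the new post ID
};

$ids = import_tumblr_xml($sample, $fake_insert);
```

In a real importer, each returned ID could then be passed to wp_create_categories($categories, $post_id) as described above.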

I would suggest that we wait until we migrate to PHP 5.x to incorporate the Tumblr export code into WordPress core.

#2 @hailin
16 years ago

Ryan's comment:

preg_match() is subject to the backtrack limits in PHP 5, one of the
things that tripped us up last time we migrated to PHP 5. I too think it
best to wait for PHP 5 and use a real parser.

Barry's comment:

I think it was subject to that in PHP 4 as well; it's just that the default backtrack limit was reduced 10x or 100x in PHP 5.

I too think it best to wait for php 5 and use a real parser.
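
For readers unfamiliar with the failure mode discussed here, the following sketch shows how a lazy pattern over a large chunk of content can hit the limit. The limit is deliberately lowered to an artificial value so the failure is easy to reproduce, the sample content is made up, and the exact threshold depends on the PCRE version in use.

```php
<?php
// Lower the limit for this demo only; real defaults are far higher.
ini_set('pcre.backtrack_limit', '100');
// Turn off the PCRE JIT where that setting exists, so the interpreter limit applies.
ini_set('pcre.jit', '0');

// Made-up content standing in for a long post body inside an export file.
$subject = '<content>' . str_repeat('x', 10000) . '</content>';
$result  = preg_match('|<content>(.*?)</content>|s', $subject, $m);

// preg_match() returns false on error, which is easy to mistake for "no match".
if ($result === false && preg_last_error() === PREG_BACKTRACK_LIMIT_ERROR) {
    echo "preg_match failed: backtrack limit exceeded\n";
}
```

Because the failure surfaces as a false return value rather than an exception, regex-based import code that does not check preg_last_error() can drop content silently.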

#3 @hailin
16 years ago

  • Owner changed from anonymous to hailin
  • Status changed from new to assigned

Matt's comment:

This doesn't mean it shouldn't go in core, just do a detect at the beginning for the needed functions and show a friendly error message if not available. You should drop the patch on a Trac ticket.

Notes by Hailin:

I thought about using an XML parsing class such as http://www.criticaldevelopment.net/xml/doc.php. However, this violates the principle of keeping core small and clean.

Let's see how the alternative approach goes (I believe that is better as it eliminates the middle step, and achieves one-click import).
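
The up-front detection Matt suggests could be sketched roughly as below. The helper name and the message wording are hypothetical; function_exists() and simplexml_load_string() are standard PHP.

```php
<?php
// Hypothetical helper: returns a friendly error message when required PHP
// functions are missing, or null when the importer can proceed.
function tumblr_importer_requirements_error($required = array('simplexml_load_string')) {
    $missing = array();
    foreach ($required as $function_name) {
        if (!function_exists($function_name)) {
            $missing[] = $function_name;
        }
    }
    if (empty($missing)) {
        return null;
    }
    return 'This importer needs PHP functions that are not available on this server: '
        . implode(', ', $missing) . '.';
}
```

An importer would call this before doing any work and display the returned message instead of failing with a fatal error on servers that lack XML support.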

@hailin
16 years ago

export code using XML parser

#4 @Detect
16 years ago

Just leaving a comment here to join in on the discussion.

#5 @jacobsantos
16 years ago

  • Milestone changed from 2.7 to 2.8

#6 @ryan
16 years ago

  • Component changed from General to Import

#7 @thee17
16 years ago

  • Milestone changed from 2.8 to Future Release

I don't think this is going anywhere in the 2.8 realm.

#8 @Denis-de-Bernardy
16 years ago

closed #7917 as dup (has patches)

#9 @Otto42
14 years ago

  • Owner changed from hailin to Otto42

Working on a real importer for this. Will release it soon. Coordinating with nacin this weekend.

#11 @SergeyBiryukov
13 years ago

  • Milestone changed from Future Release to WordPress.org
  • Resolution set to fixed
  • Status changed from accepted to closed

This seems fixed. Since 3.0, importers are plugins, and an importer for Tumblr is now available.
