Ticket #7612 (closed enhancement: fixed)

Opened 3 years ago

Last modified 5 months ago

Tumblr importer

Reported by: hailin
Owned by: Otto42
Priority: normal
Milestone: WordPress.org
Component: Import
Version:
Severity: normal
Keywords:
Cc:

Description

Hao wrote a tool that converts a Tumblr blog into a WordPress XML file, which can then be imported into WordPress blogs.

The tool is available at http://haochen.me/tumblr/ and he blogged about it at http://haochen.wordpress.com/2008/08/19/export-your-tumblr-blog-to-wordpress/

We want to incorporate this into wp core eventually.

Attachments

tumblr_export.php (15.0 KB) - added by hailin 3 years ago.
export code using XML parser

Change History

  • Type changed from defect to enhancement

I've reviewed and experimented with the Tumblr export code from Hao.

The current wordpress.org and wordpress.com import code relies on preg_match to extract tags, largely for legacy reasons: powerful XML parsing modules such as SimpleXMLElement are not available in PHP 4.x. Using the XML parsers built into PHP 5.x can produce much cleaner and faster code. Once we switch to PHP 5.x, I expect we can significantly improve our import code by rewriting the XML parsing logic.

Hao's current Tumblr export code relies on SimpleXMLElement, which is the preferred approach. I don't think it's worthwhile to rewrite it using our existing, old preg_match approach.
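To make the contrast concrete, here is a minimal sketch of the two styles. The sample XML and both helper names (get_tag_regex, get_tag_simplexml) are hypothetical and written only for illustration; this is not the actual importer code nor Hao's export code.

```php
<?php
// Hypothetical sample input, not taken from a real export file.
$xml = '<item><title>Hello</title><category><![CDATA[News]]></category></item>';

// Legacy style (works on PHP 4): extract a tag with a regular expression.
function get_tag_regex( $string, $tag ) {
    if ( ! preg_match( "|<$tag.*?>(.*?)</$tag>|is", $string, $m ) ) {
        return '';
    }
    // CDATA wrappers have to be stripped by hand.
    return trim( preg_replace( '|^<!\[CDATA\[(.*)\]\]>$|s', '$1', $m[1] ) );
}

// PHP 5 style: let the parser deal with nesting, CDATA, and entities.
function get_tag_simplexml( $string, $tag ) {
    $node = simplexml_load_string( $string, 'SimpleXMLElement', LIBXML_NOCDATA );
    if ( false === $node || ! isset( $node->$tag ) ) {
        return '';
    }
    return trim( (string) $node->$tag );
}
```

Both helpers return the same text for this input, but only the second one is backed by a real parser; the regex version needs extra handling for every XML feature it runs into.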

I've suggested taking a better alternative approach:

Since Tumblr has simple formats, we can directly parse its XML and create WordPress posts and categories, thus eliminating the intermediary step of exporting it to a WordPress XML file.

The approach would be similar to wp-admin/import/rss.php, where the following functions are used to create the post and its categories:

wp_insert_post($post); wp_create_categories($categories, $post_id);
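A rough sketch of what that direct approach could look like follows. The Tumblr element and attribute names in the sample (posts, post, regular-title, regular-body, tag, date-gmt) are assumptions made for illustration, and tumblr_xml_to_posts() and tumblr_import() are hypothetical names. Only "regular" text posts are handled here.

```php
<?php
// Assumed sample of Tumblr XML; the real feed may differ.
$tumblr_xml = <<<XML
<tumblr version="1.0">
  <posts>
    <post id="1" type="regular" date-gmt="2008-08-19 12:00:00 GMT">
      <regular-title>Hello</regular-title>
      <regular-body>First post</regular-body>
      <tag>news</tag>
      <tag>intro</tag>
    </post>
  </posts>
</tumblr>
XML;

// Parse the XML into arrays shaped for wp_insert_post() / wp_create_categories().
function tumblr_xml_to_posts( $xml_string ) {
    $doc = simplexml_load_string( $xml_string );
    if ( false === $doc ) {
        return array();
    }
    $posts = array();
    foreach ( $doc->posts->post as $p ) {
        if ( 'regular' !== (string) $p['type'] ) {
            continue; // Other post types are left out of this sketch.
        }
        $categories = array();
        foreach ( $p->tag as $tag ) {
            $categories[] = (string) $tag;
        }
        $posts[] = array(
            'post'       => array(
                'post_title'    => (string) $p->{'regular-title'},
                'post_content'  => (string) $p->{'regular-body'},
                'post_date_gmt' => str_replace( ' GMT', '', (string) $p['date-gmt'] ),
                'post_status'   => 'publish',
            ),
            'categories' => $categories,
        );
    }
    return $posts;
}

// Requires WordPress to be loaded (wp_create_categories() lives in
// wp-admin/includes/taxonomy.php), so it is defined here but not called.
function tumblr_import( $xml_string ) {
    foreach ( tumblr_xml_to_posts( $xml_string ) as $item ) {
        $post_id = wp_insert_post( $item['post'] );
        if ( $post_id && ! empty( $item['categories'] ) ) {
            wp_create_categories( $item['categories'], $post_id );
        }
    }
}
```

Keeping the parsing step separate from the insert step means the parser can be exercised without a WordPress install.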

I would suggest that we wait till we migrate to PHP 5.x to incorporate the Tumblr export code into wp core.

Ryan's comment:

preg_match() is subject to the backtrack limits in php 5, one of the things that tripped us last time we migrated to php 5. I too think it best to wait for php 5 and use a real parser.

Barry's comment:

I think it was subject to that in php4 as well, just that in php 5 the default backtrack limit was reduced 10x or 100x.

I too think it best to wait for php 5 and use a real parser.
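For reference, a small sketch of how regex-based code can at least notice when it hits such a limit. checked_preg_match() is a hypothetical helper name; preg_last_error() and the pcre.backtrack_limit ini setting are available from PHP 5.2.

```php
<?php
// Wrap preg_match() so a PCRE failure is not mistaken for "no match".
function checked_preg_match( $pattern, $subject ) {
    $matches = array();
    $result  = preg_match( $pattern, $subject, $matches );
    if ( false === $result ) {
        // preg_last_error() reports why the last PCRE call failed,
        // for example PREG_BACKTRACK_LIMIT_ERROR.
        return array( 'ok' => false, 'error' => preg_last_error(), 'matches' => array() );
    }
    return array( 'ok' => true, 'error' => PREG_NO_ERROR, 'matches' => $matches );
}

// The limit currently in effect can be read from the ini settings.
$backtrack_limit = ini_get( 'pcre.backtrack_limit' );
```

A caller would check 'ok' first and compare 'error' against PREG_BACKTRACK_LIMIT_ERROR before deciding whether to retry or give up.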

  • Owner changed from anonymous to hailin
  • Status changed from new to assigned

Matt's comment:

This doesn't mean it shouldn't go in core, just do a detect at the beginning for the needed functions and show a friendly error message if not available. You should drop the patch on a Trac ticket.
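A minimal sketch of that kind of up-front detection, assuming the importer depends on simplexml_load_string(). The helper names and the message wording are hypothetical.

```php
<?php
// Return the names of required functions that this PHP build lacks.
function tumblr_importer_missing_functions( $required ) {
    $missing = array();
    foreach ( $required as $function_name ) {
        if ( ! function_exists( $function_name ) ) {
            $missing[] = $function_name;
        }
    }
    return $missing;
}

// Build a friendly message, or an empty string when nothing is missing.
function tumblr_importer_requirements_message( $missing ) {
    if ( empty( $missing ) ) {
        return '';
    }
    return 'Sorry, the Tumblr importer needs PHP features that are not available on this server: '
        . implode( ', ', $missing ) . '.';
}

// Assumed requirement list for this sketch.
$missing = tumblr_importer_missing_functions( array( 'simplexml_load_string' ) );
$message = tumblr_importer_requirements_message( $missing );
```

The importer would show $message and stop before parsing anything when it is non-empty.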

Notes by Hailin:

I thought about using an XML parsing class like http://www.criticaldevelopment.net/xml/doc.php. However, that would violate the principle of keeping core small and clean.

Let's see how the alternative approach goes (I believe that is better as it eliminates the middle step, and achieves one-click import).

hailin 3 years ago

export code using XML parser

Just leaving a comment here to join in on the discussion.

  • Milestone changed from 2.7 to 2.8

comment:6 ryan 3 years ago

  • Component changed from General to Import
  • Milestone changed from 2.8 to Future Release

I don't think this is going anywhere in the 2.8 realm.

closed #7917 as dup (has patches)

  • Owner changed from hailin to Otto42

Working on a real importer for this. Will release it soon. Coordinating with nacin this weekend.

  • Status changed from accepted to closed
  • Resolution set to fixed
  • Milestone changed from Future Release to WordPress.org

This seems fixed. Since 3.0, importers are plugins, and an importer for Tumblr is now available.
