#21710 closed enhancement (fixed)

More hooks for WordPress Importer

Reported by: batmoo
Owned by:
Milestone: WordPress.org
Priority: normal
Severity: normal
Version: 3.4.1
Component: Import
Keywords: has-patch 2nd-opinion
Focuses:
Cc:

Description

It would be nice to add a number of additional hooks to the WordPress Importer so that plugins and other tools (like CLI scripts) can hook and modify imports on-the-fly.

Attachments (1)

21710.1.diff (4.8 KB) - added by batmoo 20 months ago.

Change History (24)

comment:1 batmoo 20 months ago

Patch adds a number of new filters:

  • wp_import_categories (before categories are processed)
  • wp_import_tags (before tags are processed)
  • wp_import_terms (before terms are processed)
  • wp_import_posts (before posts are processed)
  • wp_import_post_data_raw (modify the raw post data pre-insert)
  • wp_import_post_data_processed (modify the processed post data pre-insert)
  • wp_import_post_terms (modify terms for a post)
  • wp_import_post_comments (modify comments for a post)
  • wp_import_post_meta (modify postmeta for a post)

New actions introduced:

  • wp_import_post_exists (when a post already exists)
  • wp_import_insert_post (when a post is successfully inserted)
  • wp_import_insert_term (when a term is successfully inserted)
  • wp_import_insert_term_failed (when a term insert fails)
  • wp_import_set_post_terms (when terms for a post are added)
  • wp_import_insert_comment (when a comment for a post is added)

Modified one existing filter:

  • import_post_meta_key (gets the $post_id and $post params passed in)
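As a rough illustration of what the extra parameters on import_post_meta_key allow, here is a minimal sketch. It assumes the filter receives the meta key, the new $post_id, and the raw parsed $post array (with a 'post_type' entry), and that the filter's return value is used as the key to store. The function name and both meta key names are hypothetical.

```php
<?php
/**
 * Hypothetical callback for the 'import_post_meta_key' filter.
 *
 * Assumption: $post is the importer's raw parsed post array and
 * contains a 'post_type' entry. Both meta key names are made up.
 */
function myprefix_import_meta_key( $key, $post_id = 0, $post = array() ) {
	$post_type = isset( $post['post_type'] ) ? $post['post_type'] : '';

	// Rename a legacy key, but only for pages.
	if ( 'page' === $post_type && 'legacy_subtitle' === $key ) {
		return 'subtitle';
	}

	// Leave every other key untouched.
	return $key;
}

// Register only when WordPress is loaded, so the callback can also be exercised on its own.
if ( function_exists( 'add_filter' ) ) {
	add_filter( 'import_post_meta_key', 'myprefix_import_meta_key', 10, 3 );
}
```

Keeping the callback a plain function that only reads its arguments makes it easy to check outside of an actual import run.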

comment:2 batmoo 20 months ago

Sample code that hooks into these filters (excuse the PHP 5.3-ness).

Make the import more verbose:

add_action( 'wp_import_insert_post', function( $post_id, $original_post_ID, $postdata, $post ) {
	if ( is_wp_error( $post_id ) ) {
		echo "-- Error importing post: " . $post_id->get_error_code() . PHP_EOL;
	} else {
		echo "-- Imported post as post_id #{$post_id}" . PHP_EOL;
	}
}, 10, 4 );

Modify post data on the fly:

add_filter( 'wp_import_post_meta', function( $postmeta, $post_id, $post ) {
	$postmeta[] = array(
		'key' => '_imported_author',
		'value' => sanitize_user( $post['post_author'], true ),
	);
	return $postmeta;
}, 10, 3 );

comment:3 follow-up: scribu 20 months ago

  • Keywords has-patch 2nd-opinion added

Exporter hooks: #19863

I think this is a bit too much. Why can't you do all these things after the initial import step?

comment:4 batmoo 20 months ago

I guess these 3 are a bit redundant and can be handled using wp_import_post_data_processed:

  • wp_import_post_terms
  • wp_import_post_comments
  • wp_import_post_meta
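For reference, a single pre-insert callback on wp_import_post_data_processed might look like the sketch below. It assumes the filter passes the processed $postdata array (the one later handed to wp_insert_post()) followed by the raw $post array; the ticket itself does not spell out the arguments. The function name is hypothetical.

```php
<?php
/**
 * Hypothetical callback for the 'wp_import_post_data_processed' filter.
 *
 * Assumption: $postdata is the processed array that the importer is
 * about to insert, and it carries a 'post_status' entry.
 */
function myprefix_import_as_draft( $postdata, $post = array() ) {
	// Hold back anything that would otherwise go live immediately.
	if ( isset( $postdata['post_status'] ) && 'publish' === $postdata['post_status'] ) {
		$postdata['post_status'] = 'draft';
	}

	return $postdata;
}

// Register only when WordPress is loaded.
if ( function_exists( 'add_filter' ) ) {
	add_filter( 'wp_import_post_data_processed', 'myprefix_import_as_draft', 10, 2 );
}
```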

comment:5 in reply to: ↑ 3 batmoo 20 months ago

Replying to scribu:

I think this is a bit too much. Why can't you do all these things after the initial import step?

Do you mean using the wp_import_step* actions?

comment:6 follow-up: scribu 20 months ago

I don't know what wp_import_step* actions you're talking about.

I was referring to this workflow:

  1. Import data.
  2. Go through all data and make the necessary adjustments (delete unwanted posts, change the tags, etc).

comment:7 danielbachhuber 20 months ago

  • Cc danielbachhuber added

comment:8 in reply to: ↑ 6 Viper007Bond 20 months ago

Replying to scribu:

I was referring to this workflow:

  1. Import data.
  2. Go through all data and make the necessary adjustments (delete unwanted posts, change the tags, etc).

After-the-fact fixer scripts are super slow when you're dealing with hundreds of thousands of posts. It can literally take multiple days to run and the cache invalidation is not fun.

Doing the fixing in an automated fashion during the import process really is the fastest way.

Can you really have too many hooks?

comment:9 scribu 20 months ago

Each call to apply_filters() takes a bit of time, which adds up "when you're dealing with hundreds of thousands of posts".

But more importantly, each hook needs to be maintained as the code around it changes, which gets really tricky sometimes.

So, I'm not saying that more hooks aren't useful here; just that each one needs to be considered carefully.

comment:10 Viper007Bond 20 months ago

Not to derail this ticket, but apply_filters() is insanely fast. Like, you can run 250,000 calls a second, no joke.

But anyway, the maintenance burden of filters is an understandable concern, but I don't think it's a blocker. These are power-user filters and actions, and it would be understandable if they broke.

But right now the alternative is not pretty.

comment:11 DrewAPicture 20 months ago

  • Cc xoodrew@… added

comment:12 SergeyBiryukov 20 months ago

  • Milestone changed from Awaiting Review to WordPress.org

comment:13 Viper007Bond 20 months ago

  • Milestone changed from WordPress.org to Awaiting Review

Unless I'm mistaken, the WordPress.org milestone is for things affecting the actual WordPress.org website. This ticket is about WordPress the software, not WordPress the website.

comment:14 scribu 20 months ago

Actually, the WordPress.org milestone is used for plugins supported by the core team as well.

We could have a separate milestone just for those plugins, but what should we call it?

  • "WordPress.org plugins"
  • "Core plugins"
  • "Canonical plugins"

You see the difficulty. :)

comment:15 SergeyBiryukov 19 months ago

  • Milestone changed from Awaiting Review to WordPress.org

comment:16 stephenh1988 19 months ago

  • Cc contact@… added

Currently it's very difficult to import custom tables where one of the columns references a core WordPress table. Hooks like wp_import_insert_post would help a lot in keeping check of how posts have been imported (i.e. Old ID -> New ID).

In fact, all that would be required is one action, preferably passing the $wp_import object (as this tracks ID changes), just before wp_import_cleanup() is called in the import_end() method.

At the risk of adding more hooks - another one that I think is needed is an action inside WXR_Parser_SimpleXML::parse to allow the imported file to be parsed for custom data, rather than having to use import_start, access the global $wp_import and re-parse the file.

Couldn't some of the above suggested filters be made redundant by passing $wp_import by reference to import_start?
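The Old ID -> New ID bookkeeping mentioned in this comment could be sketched on top of the wp_import_insert_post action roughly as follows. It assumes the action passes the same four arguments as the sample code in comment:2, and that a failed insert arrives as something other than a positive integer (for example a WP_Error object). The class name and variable name are hypothetical.

```php
<?php
/**
 * Hypothetical helper that records how post IDs changed during an import.
 *
 * Assumption: 'wp_import_insert_post' passes
 * ( $post_id, $original_post_ID, $postdata, $post ), as in comment:2.
 */
class MyPrefix_Import_Id_Map {
	/** @var array Map of original post ID => newly assigned post ID. */
	private $map = array();

	public function record( $post_id, $original_post_ID, $postdata = array(), $post = array() ) {
		// Failed inserts do not yield a positive integer ID; skip them.
		if ( ! is_int( $post_id ) || $post_id <= 0 ) {
			return;
		}

		$this->map[ (int) $original_post_ID ] = $post_id;
	}

	/** Returns the new ID for an old one, or null if it was never imported. */
	public function new_id( $old_id ) {
		$old_id = (int) $old_id;

		return isset( $this->map[ $old_id ] ) ? $this->map[ $old_id ] : null;
	}

	/** Returns the whole old => new map. */
	public function all() {
		return $this->map;
	}
}

$myprefix_id_map = new MyPrefix_Import_Id_Map();

// Register only when WordPress is loaded.
if ( function_exists( 'add_action' ) ) {
	add_action( 'wp_import_insert_post', array( $myprefix_id_map, 'record' ), 10, 4 );
}
```

A custom-table importer could then call $myprefix_id_map->new_id( $old_id ) once the import finishes to rewrite any columns that reference posts.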

comment:17 follow-up: danielbachhuber 18 months ago

Just to echo the original request — I've been actively using these filters for over six months now and they're indispensable. I look forward to seeing them in the actual plugin so I don't have to maintain the fork :)

comment:18 JustinSainton 17 months ago

  • Cc justinsainton@… added

comment:19 tmtrademark 15 months ago

  • Cc toby.mckes@… added

comment:20 in reply to: ↑ 17 aaroncampbell 15 months ago

  • Cc aaroncampbell added

Replying to danielbachhuber:

Just to echo the original request — I've been actively using these filters for over six months now and they're indispensable. I look forward to seeing them in the actual plugin so I don't have to maintain the fork :)

We've had the need for these recently too. Do you maintain the fork someplace public?

comment:21 danielbachhuber 15 months ago

I just went ahead and committed to wpcom trunk :) Of all the ways we've hacked core, I think this is the least egregious

comment:23 westi 13 months ago

  • Resolution set to fixed
  • Status changed from new to closed