Opened 3 years ago

Closed 2 years ago

#21710 closed enhancement (fixed)

More hooks for WordPress Importer

Reported by: batmoo
Owned by:
Milestone: WordPress.org
Priority: normal
Severity: normal
Version: 3.4.1
Component: Import
Keywords: has-patch 2nd-opinion
Focuses:
Cc:

Description

It would be nice to add a number of additional hooks to the WordPress Importer so that plugins and other tools (like CLI scripts) can hook and modify imports on-the-fly.

Attachments (1)

21710.1.diff (4.8 KB) - added by batmoo 3 years ago.

Change History (24)

comment:1 @batmoo 3 years ago

Patch adds a number of new filters:

  • wp_import_categories (before categories are processed)
  • wp_import_tags (before tags are processed)
  • wp_import_terms (before terms are processed)
  • wp_import_posts (before posts are processed)
  • wp_import_post_data_raw (modify the raw post data pre-insert)
  • wp_import_post_data_processed (modify the processed post data pre-insert)
  • wp_import_post_terms (modify terms for a post)
  • wp_import_post_comments (modify comments for a post)
  • wp_import_post_meta (modify postmeta for a post)

New actions introduced:

  • wp_import_post_exists (when a post already exists)
  • wp_import_insert_post (when a post is successfully inserted)
  • wp_import_insert_term (when a term is successfully inserted)
  • wp_import_insert_term_failed (when a term insert fails)
  • wp_import_set_post_terms (when terms for a post are added)
  • wp_import_insert_comment (when a comment for a post is added)

Modified one existing filter:

  • import_post_meta_key (gets the $post_id and $post params passed in)

comment:2 @batmoo 3 years ago

Sample code that hooks into these filters and actions (excuse the PHP 5.3-ness).

Make the import more verbose:

add_action( 'wp_import_insert_post', function( $post_id, $original_post_ID, $postdata, $post ) {
	if ( is_wp_error( $post_id ) ) {
		echo "-- Error importing post: " . $post_id->get_error_code() . PHP_EOL;
	} else {
		echo "-- Imported post as post_id #{$post_id}" . PHP_EOL;
	}
}, 10, 4 );

Modify post data on the fly:

add_filter( 'wp_import_post_meta', function( $postmeta, $post_id, $post ) {
	$postmeta[] = array(
		'key' => '_imported_author',
		'value' => sanitize_user( $post['post_author'], true ),
	);
	return $postmeta;
}, 10, 3 );

comment:3 follow-up: @scribu 3 years ago

  • Keywords has-patch 2nd-opinion added

Exporter hooks: #19863

I think this is a bit too much. Why can't you do all these things after the initial import step?

comment:4 @batmoo 3 years ago

I guess these 3 are a bit redundant and can be handled using wp_import_post_data_processed:

  • wp_import_post_terms
  • wp_import_post_comments
  • wp_import_post_meta

comment:5 in reply to: ↑ 3 @batmoo 3 years ago

Replying to scribu:

I think this is a bit too much. Why can't you do all these things after the initial import step?

Do you mean using the wp_import_step* actions?

comment:6 follow-up: @scribu 3 years ago

I don't know what wp_import_step* actions you're talking about.

I was referring to this workflow:

  1. Import data.
  2. Go through all data and make the necessary adjustments (delete unwanted posts, change the tags, etc).

comment:7 @danielbachhuber 3 years ago

  • Cc danielbachhuber added

comment:8 in reply to: ↑ 6 @Viper007Bond 3 years ago

Replying to scribu:

I was referring to this workflow:

  1. Import data.
  2. Go through all data and make the necessary adjustments (delete unwanted posts, change the tags, etc).

After-the-fact fixer scripts are super slow when you're dealing with hundreds of thousands of posts. It can literally take multiple days to run and the cache invalidation is not fun.

Doing the fixing in an automated fashion during the import process really is the fastest way.

Can you really have too many hooks?

comment:9 @scribu 3 years ago

Each call to apply_filters() takes a bit of time, which adds up "when you're dealing with hundreds of thousands of posts".

But more importantly, each hook needs to be maintained as the code around it changes, which gets really tricky sometimes.

So, I'm not saying that more hooks aren't useful here; just that each one needs to be considered carefully.

comment:10 @Viper007Bond 3 years ago

Not to derail this ticket, but apply_filters() is insanely fast. You can run something like 250,000 calls a second, no joke.

But anyway, the concern about maintaining filters is understandable, but I don't think it's a blocker. These are power-user filters and actions, and it would be understandable if they broke.

But right now the alternative is not pretty.
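
The per-call cost debated in comment:9 and comment:10 can be measured rather than argued. The following is a minimal sketch, not a verification of the 250,000 figure: calls_per_second() and the hook name my_bench_filter are hypothetical names made up for this example, and the result will vary with hardware, PHP version, and how many callbacks are attached to the hook.

```php
<?php
/**
 * Hypothetical timing helper: runs $call $iterations times and
 * returns the observed number of calls per second.
 */
function calls_per_second( callable $call, $iterations = 100000 ) {
	$start = microtime( true );
	for ( $i = 0; $i < $iterations; $i++ ) {
		$call( $i );
	}
	$elapsed = microtime( true ) - $start;

	// Guard against a zero reading on very fast runs.
	return $iterations / max( $elapsed, 1e-9 );
}

// Only measure apply_filters() when WordPress is actually loaded.
if ( function_exists( 'apply_filters' ) ) {
	// 'my_bench_filter' is a made-up hook with one trivial callback attached.
	add_filter( 'my_bench_filter', function ( $value ) {
		return $value;
	} );

	$rate = calls_per_second( function ( $i ) {
		return apply_filters( 'my_bench_filter', $i );
	} );

	printf( "apply_filters(): %d calls/second\n", $rate );
}
```

Run inside a loaded WordPress environment, this prints one rate for a filter with a single callback. Outside WordPress the helper still works with any callable, which is useful for a baseline comparison.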

comment:11 @DrewAPicture 3 years ago

  • Cc xoodrew@… added

comment:12 @SergeyBiryukov 3 years ago

  • Milestone changed from Awaiting Review to WordPress.org

comment:13 @Viper007Bond 3 years ago

  • Milestone changed from WordPress.org to Awaiting Review

Unless I'm mistaken, the WordPress.org milestone is for things affecting the actual WordPress.org website. This ticket is about WordPress the software, not WordPress the website.

comment:14 @scribu 3 years ago

Actually, the WordPress.org milestone is used for plugins supported by the core team as well.

We could have a separate milestone just for those plugins, but what should we call it?

  • "WordPress.org plugins"
  • "Core plugins"
  • "Canonical plugins"

You see the difficulty. :)

comment:15 @SergeyBiryukov 3 years ago

  • Milestone changed from Awaiting Review to WordPress.org

comment:16 @stephenh1988 3 years ago

  • Cc contact@… added

Currently it's very difficult to import custom tables where one of the columns references a core WordPress table. Hooks like wp_import_insert_post would help a lot in keeping track of how posts have been imported (i.e. Old ID -> New ID).

In fact, all that would be required is one action, preferably passing the $wp_import object (as this tracks ID changes), just before wp_import_cleanup() is called in the import_end() method.

At the risk of adding more hooks - another one that I think is needed is an action inside WXR_Parser_SimpleXML::parse to allow the imported file to be parsed for custom data, rather than having to use import_start, access the global $wp_import and re-parse the file.

Couldn't some of the above suggested filters be made redundant by passing $wp_import by reference to import_start?
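
The Old ID -> New ID tracking described in this comment could be sketched with the proposed wp_import_insert_post action. This is a minimal sketch under assumptions: the callback argument order is taken from the sample in comment:2, the Import_ID_Map class is a hypothetical helper that is not part of the WordPress Importer, and the action itself only exists once the patch is applied.

```php
<?php
/**
 * Hypothetical helper (not part of the WordPress Importer): records the
 * old-ID => new-ID mapping as posts are imported.
 */
class Import_ID_Map {
	/** @var array Original post ID => newly inserted post ID. */
	private $map = array();

	/**
	 * Callback for the proposed wp_import_insert_post action.
	 * Argument order follows the sample in comment:2.
	 */
	public function record( $post_id, $original_post_ID, $postdata = array(), $post = array() ) {
		// Failed inserts arrive as WP_Error objects; skip anything that is not a usable ID.
		if ( ! is_numeric( $post_id ) || (int) $post_id <= 0 ) {
			return;
		}
		$this->map[ (int) $original_post_ID ] = (int) $post_id;
	}

	/** Returns the new ID for an old one, or null if it was not imported. */
	public function new_id( $old_id ) {
		$old_id = (int) $old_id;
		return isset( $this->map[ $old_id ] ) ? $this->map[ $old_id ] : null;
	}
}

$import_id_map = new Import_ID_Map();

// Only register the callback when running inside WordPress.
if ( function_exists( 'add_action' ) ) {
	add_action( 'wp_import_insert_post', array( $import_id_map, 'record' ), 10, 4 );
}
```

Once the import has finished, code that imports a custom table could call $import_id_map->new_id( $old_id ) to rewrite each column that references a post.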

comment:17 follow-up: @danielbachhuber 3 years ago

Just to echo the original request — I've been actively using these filters for over six months now and they're indispensable. I look forward to seeing them in the actual plugin so I don't have to maintain the fork :)

comment:18 @JustinSainton 2 years ago

  • Cc justinsainton@… added

comment:19 @tmtrademark 2 years ago

  • Cc toby.mckes@… added

comment:20 in reply to: ↑ 17 @aaroncampbell 2 years ago

  • Cc aaroncampbell added

Replying to danielbachhuber:

Just to echo the original request — I've been actively using these filters for over six months now and they're indispensable. I look forward to seeing them in the actual plugin so I don't have to maintain the fork :)

We've had the need for these recently too. Do you maintain the fork someplace public?

comment:21 @danielbachhuber 2 years ago

I just went ahead and committed to wpcom trunk :) Of all the ways we've hacked core, I think this is the least egregious

comment:23 @westi 2 years ago

  • Resolution set to fixed
  • Status changed from new to closed