Opened 19 months ago

Last modified 19 months ago

#21913 new enhancement

Detecting MIME Types in WXR Files

Reported by: ReadyMadeWeb
Owned by:
Milestone: WordPress.org
Priority: normal
Severity: normal
Version: 3.4.2
Component: Import
Keywords: has-patch dev-feedback 2nd-opinion
Focuses:
Cc:

Description

In the process of creating a service to convert TypePad data to WXR formatted files, we've encountered some unique problems with TypePad data. Namely, many TypePad files are saved without file extensions, which prevents the existing importer from importing those files into the wp-content/uploads folder.

In order to import and rename these otherwise ignored files, we've created a patch for the WordPress importer that does the following:

  1. If there is an attachment in the WXR and the importer cannot determine the file type from the file name (i.e., the extension is missing), the patched version makes a light (body-less) request to the web server hosting the file to get information about it. We're interested in the file type, size, and filename.
  2. If the importer is processing an attachment in the above situation and is able to determine the file type, it renames the local copy of the file to have the appropriate file extension.
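The two steps above can be sketched roughly as follows. This is a minimal illustration in Python, not the code from the attached patch (which targets the PHP importer). The function names, the `PREFERRED_EXTENSIONS` table, and the example URL in the usage note are all hypothetical, and the sketch only covers the file type, not the size or filename headers.

```python
import mimetypes
import os
from urllib.request import Request, urlopen

# Assumed mapping for a few common types, so the result does not depend on
# the platform's MIME database. Extend as needed.
PREFERRED_EXTENSIONS = {
    "image/jpeg": ".jpg",
    "image/png": ".png",
    "image/gif": ".gif",
    "application/pdf": ".pdf",
}


def fetch_headers(url, timeout=10):
    """Step 1: send a body-less HEAD request and return the response headers."""
    request = Request(url, method="HEAD")
    with urlopen(request, timeout=timeout) as response:
        return response.headers


def extension_for_content_type(content_type):
    """Map a Content-Type header value to a file extension, or None."""
    if not content_type:
        return None
    # Drop parameters such as "; charset=utf-8" and normalise case.
    mime = content_type.split(";", 1)[0].strip().lower()
    if not mime:
        return None
    return PREFERRED_EXTENSIONS.get(mime) or mimetypes.guess_extension(mime)


def filename_with_extension(filename, content_type):
    """Step 2: add an extension only if the name has none and the type is known."""
    if os.path.splitext(filename)[1]:
        return filename  # already has an extension; leave it alone
    extension = extension_for_content_type(content_type)
    return filename + extension if extension else filename
```

A caller would combine the pieces along the lines of `headers = fetch_headers("https://example.com/some-attachment")` followed by `filename_with_extension("some-attachment", headers.get("Content-Type"))`. When the type cannot be determined, the name is returned unchanged, which matches the importer's current behaviour of leaving such files alone.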

This is a simple bit of code, but it makes a big difference because TypePad regularly saves files without extensions.

We've attached our patch and a sample WXR file from ragsgupta.com, the Brightcove co-founder's blog.

Attachments (3)

readymadeweb-filetype.patch (3.5 KB) - added by ReadyMadeWeb 19 months ago.
Patch to WordPress Importer
www.ragsgupta.com-16.zip (519.6 KB) - added by ReadyMadeWeb 19 months ago.
Sample WXR File with Missing File Extensions
readymadeweb-filetype-HEAD.patch (4.1 KB) - added by ReadyMadeWeb 19 months ago.

Change History (13)

ReadyMadeWeb 19 months ago

Patch to WordPress Importer

ReadyMadeWeb 19 months ago

Sample WXR File with Missing File Extensions

comment:1 SergeyBiryukov 19 months ago

  • Milestone changed from Awaiting Review to WordPress.org

comment:2 follow-ups: ↓ 6 ↓ 9 nacin 19 months ago

Rather than using cURL, we have an HTTP API. You can use wp_remote_head() to send a HEAD request, then use wp_remote_retrieve_header() on the response to get a specific header.

We also use lowercase true/false/null.

Also, please use svn diff rather than the GNU diff tool to create patches. It can still be applied with patch but contains SVN metadata (and is properly recognized by our bug tracker).

comment:3 ReadyMadeWeb 19 months ago

We'll adjust and submit a new patch soon.

comment:4 ReadyMadeWeb 19 months ago

Nacin, I've uploaded a new version for your review.

comment:5 ReadyMadeWeb 19 months ago

A patch file readymadeweb-filetype-HEAD.patch has been uploaded and is ready for review.

comment:6 in reply to: ↑ 2 ReadyMadeWeb 19 months ago

Replying to nacin:

Rather than using cURL, we have an HTTP API. You can use wp_remote_head() to send a HEAD request, then use wp_remote_retrieve_header() on the response to get a specific header.

We also use lowercase true/false/null.

Also, please use svn diff rather than the GNU diff tool to create patches. It can still be applied with patch but contains SVN metadata (and is properly recognized by our bug tracker).

All of these issues have been addressed and a new version of the patch is uploaded. Still not sure if I'm using this forum correctly. Please excuse the excess comments.

comment:7 readymadeweb 19 months ago

  • Keywords dev-feedback 2nd-opinion added
  • Version set to 3.4.2

comment:8 readymadeweb 19 months ago

  • Cc readymadeweb added

comment:9 in reply to: ↑ 2 readymadeweb 19 months ago

New patch is posted!

comment:10 nacin 19 months ago

#22157 was marked as a duplicate.
