Importer tries to open XML file in wrong location on Windows

I downloaded an export file from my WordPress.com sandbox and attempted to import it into my localhost trunk install. Doing so threw the following error:

Warning: gzopen(D:\Webserver\htdocs\wordpress-dev/wp-content/uploads/D:Webserverhtdocswordpress-dev/wp-content/uploads/2010/11/wordpress.2010-11-05.xml_.txt) [function.gzopen]: failed to open stream: Invalid argument in D:\Webserver\htdocs\wordpress-dev\wp-content\plugins\wordpress-importer\wordpress-importer.php on line 88
Call Stack
#	Time	Memory	Function	Location
1	0.0008	158712	{main}( )	..\admin.php:0
2	0.2823	24185272	call_user_func ( )	..\admin.php:207
3	0.2823	24185456	WP_Import->dispatch( )	..\admin.php:0
4	0.2828	24186504	WP_Import->import( )	..\wordpress-importer.php:892
5	0.2847	24193184	WP_Import->import_file( )	..\wordpress-importer.php:838
6	1.7980	24426448	WP_Import->get_entries( )	..\wordpress-importer.php:847
7	1.7980	24428304	WP_Import->fopen( )	..\wordpress-importer.php:116
8	1.7980	24428552	gzopen ( )	..\wordpress-importer.php:88

Note the totally whacko path that it's trying to open my XML file from.

I'm using version 0.2 of the importer, which my install downloaded from the WordPress.org repository.

UPDATE: Fails with the new importer too (see below). Also I'm running PHP 5.2.13.

Fails with the new importer too:

0:0 failed to load external entity "file:///D:/Webserver/htdocs/wordpress-dev/wp-content/uploads/D%3AWebserverhtdocswordpress-dev/wp-content/uploads/2010/11/wordpress.2010-11-05.xml_3.txt"

There was an error when reading this WXR file
Details are shown above. The importer will now try again with a different parser...

( ! ) Warning: gzopen(D:\Webserver\htdocs\wordpress-dev/wp-content/uploads/D:Webserverhtdocswordpress-dev/wp-content/uploads/2010/11/wordpress.2010-11-05.xml_3.txt) [function.gzopen]: failed to open stream: Invalid argument in D:\Webserver\htdocs\wordpress-dev\wp-content\plugins\wordpress-importer\parsers.php on line 575
Call Stack
#	Time	Memory	Function	Location
1	0.0008	157952	{main}( )	..\admin.php:0
2	0.2810	24397488	call_user_func ( )	..\admin.php:207
3	0.2810	24397672	WP_Import->dispatch( )	..\admin.php:0
4	0.2831	24403808	WP_Import->import( )	..\wordpress-importer.php:79
5	0.2832	24405432	WP_Import->import_start( )	..\wordpress-importer.php:89
6	0.2832	24406224	WP_Import->parse( )	..\wordpress-importer.php:109
7	0.2832	24406632	WXR_Parser->parse( )	..\wordpress-importer.php:748
8	0.2840	24411152	WXR_Parser_Regex->parse( )	..\parsers.php:48
9	0.2840	24413504	WXR_Parser_Regex->fopen( )	..\parsers.php:374
10	0.2840	24413728	gzopen ( )	..\parsers.php:575

Sorry, there has been an error.

This does not appear to be a WXR file, missing/invalid WXR version number

This isn't a problem with the WXR importer itself, but with the supporting core functions used by all importers.

This is mainly just guesswork based on the code since... It looks like the problem is occurring in get_attached_file, where the condition on line 165 evaluates to true. I believe the problem is introduced by the use of addslashes in wp_import_handle_upload. The extra slashes stop _wp_relative_upload_path from removing the upload dir part of the path, so update_metadata gets the full slashed path, runs stripslashes on it, and sends it on to add_metadata, which runs stripslashes again and stores the full path but without the Windows /s. That leads to the failed preg_match in the get_attached_file condition mentioned at the beginning.

tl;dr: the root cause seems to be addslashes in wp_import_handle_upload; we should probably emulate media_handle_upload more.
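
To make the chain described above easier to follow, here is a minimal standalone sketch of it. It does not call WordPress at all: the sketch_* functions are hypothetical stand-ins for the relative-path step and the absolute-path check, the two stripslashes calls are assumed to happen as described above, and the path is modeled on the one in the warning.

```php
<?php
// Hypothetical values modeled on the paths in the warning.
// In single-quoted PHP strings, \W, \h and \w stay literal backslash + letter.
$basedir = 'D:\Webserver\htdocs\wordpress-dev/wp-content/uploads';
$file    = $basedir . '/2010/11/wordpress.2010-11-05.xml_.txt';

// Stand-in for the relative-path step: strip the upload dir prefix if present.
function sketch_relative_path( $path, $basedir ) {
    if ( strpos( $path, $basedir ) === 0 ) {
        return ltrim( substr( $path, strlen( $basedir ) ), '/' );
    }
    return $path; // Prefix not found: the full path is kept.
}

// Stand-in for the absolute-path check: leading "/" or a "X:\" drive prefix.
function sketch_is_absolute( $path ) {
    return strpos( $path, '/' ) === 0 || preg_match( '#^.:\\\\#', $path ) === 1;
}

// Stand-in for storing and reading back the attached file path.
// Assumption: the value is unslashed twice on its way into the metadata.
function sketch_store_and_resolve( $path, $basedir ) {
    $stored = stripslashes( stripslashes( sketch_relative_path( $path, $basedir ) ) );
    return sketch_is_absolute( $stored ) ? $stored : $basedir . '/' . $stored;
}

// With addslashes: the prefix no longer matches, the backslashes are
// stripped away, and the upload dir is prepended a second time.
$broken = sketch_store_and_resolve( addslashes( $file ), $basedir );
echo $broken, "\n";
// D:\Webserver\htdocs\wordpress-dev/wp-content/uploads/D:Webserverhtdocswordpress-dev/wp-content/uploads/2010/11/wordpress.2010-11-05.xml_.txt

// Without addslashes: the relative path survives and resolves to the original file.
$fixed = sketch_store_and_resolve( $file, $basedir );
echo $fixed, "\n";
```

Under those assumptions the first result matches the doubled path from the gzopen warning, and dropping addslashes gives back the original location.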

Windows /s.

That was supposed to be a backslash...

15325.diff removes addslashes.

For reference, addslashes was introduced here in changeset [3770].

(In [16608]) Remove unnecessary addslashes. Props duck_. fixes #15325

