Make WordPress Core

Opened 6 years ago

Last modified 6 years ago

#42517 new defect (bug)

get_file_data() doesn't support the single-line variant of post template headers

Reported by: gschoppe's profile gschoppe Owned by:
Milestone: Awaiting Review Priority: normal
Severity: normal Version: 4.9
Component: General Keywords: has-patch needs-unit-tests
Focuses: Cc:

Description

get_file_data() is the common function used to retrieve information stored in a file's header comment. The function works fine for Plugin files and Theme style.css files, but is currently not used by the functions enumerating post templates, despite their header format being a clear subset of the same grammar.

The issue comes from the fact that post templates have traditionally supported single-line header comments, of the form:

<?php // Template Name: Full-Width ?>

or even

<? # Template Name: Full-Width ?>

Addressing this inconsistency will allow post templates to use the same file-handling function that other theme/plugin files use, and will significantly reducing the amount of potential data parsed by WP_Theme::get_post_templates() see issue #42513

Attachments (2)

refactor_get_file_data_regex.patch (3.4 KB) - added by gschoppe 6 years ago.
refactoring of get_file_data regex to support single line headers, as seen in template files
42517.diff (4.7 KB) - added by birgire 6 years ago.

Download all attachments as: .zip

Change History (5)

@gschoppe
6 years ago

refactoring of get_file_data regex to support single line headers, as seen in template files

#1 @birgire
6 years ago

  • Keywords has-patch needs-unit-tests added

It sounds reasonable to extend it to support single headers.

Thanks for the patch @gschoppe

The refactor_get_file_data_regex.patch also introduces a new support for opening and closing php tags in multiple lines, like:

<?php // Plugin Name: Test ?>
<?php // Version: 1.2.3 ?>

<?php /* Plugin Name: Test */ ?>
<?php /* Version: 1.2.3 */ ?>

<?php # Plugin Name: Test ?>
<?php # Version: 1.2.3 ?>

I'm not sure what the header-concensus is regarding that or how the submitted plugin/theme headers are parsed in wordpress.org? But maybe that could be adapted if needed?

Another approach that comes to mind, if we only want to allow a single opening php tag and a possible single closing php tag, around the header, is to replace the first php opening and the first possible closing tag, before running the unchanged match-regex in get_file_data().

Example:

$file_data = preg_replace( '#<\?(php)?#i', '', $file_data, 1 );
$file_data = preg_replace( '#\?>#', '', $file_data, 1 );

We could try to combine it into a single preg replace and the first that comes to mind is something like:

$file_data = preg_replace( '#\s*<\?(?:php)?(.*?)(?:\?>)?#i', '$1', $file_data, 1 );

but that would not replace the optional ?> ;-)

Last edited 6 years ago by birgire (previous) (diff)

#2 @gschoppe
6 years ago

If we were to go that route, I'd recommend removing all php tags, rather than just the first, as currently some php files certainly start with:

<?php if ( ! defined( ABSPATH ) ) die()! ?>
<?php // Template Name: My Template ?>

We could go oldschool and simply run

<?php
str_replace( array( '<?php', '<?', '?>' ), '', $file_data );

The end result shouldn't be effected by whether we remove one or 100 tags, though it may effect speed of algorithm.

Not sure of the exact profiling, but my guess would be that a single initialization of the regex engine would be faster than multiple, or than a str_replace and a regex, but I am not certain of that.

I'm also not certain of the weight of loading the regex engine multiple times vs running a regex that includes look-ahead queries, like mine.

My knee jerk is to believe that a single regex is faster over an 8k dataset, even with look-aheads.

@birgire
6 years ago

#3 @birgire
6 years ago

Good point @gschoppe regarding the defined-check.

Profiling would be interesting.

42517.diff contains the "old school" replacements. Not all files are php files, so maybe an additional check before the replacements?

It contains a unit test, draft example, that could be extended further. It also merged the two existing get_file_data() tests, into the new data provider.

It would be handy if something like vfsStream existed for core tests. Some discussions about it in #29544.

Another thought was to write to a temp file, but probably the way to go would be to just include new test files in DIR_TESTDATA.

Note: See TracTickets for help on using tickets.