WordPress.org

Make WordPress Core

Opened 2 years ago

Closed 15 months ago

Last modified 15 months ago

#23587 closed defect (bug) (fixed)

url_to_postid() needs to use # or @ for preg_match start/end character

Reported by: coreygilmore Owned by: nacin
Milestone: 3.9 Priority: normal
Severity: normal Version: 1.5.2
Component: Rewrite Rules Keywords: has-patch commit
Focuses: Cc:

Description

url_to_postid() uses ! for the start/end characters of regular expressions, which doesn't allow you to use expressions containing !, like (?!negativelookahead).

The solution is to use preg_match("#^$match#", $request_match, $matches) or preg_match("@^$match@", $request_match, $matches)

See also: #7486 and WP::parse_request()

Attachments (3)

23587_regex_delimiters.patch (966 bytes) - added by stevenkword 2 years ago.
miqro-23587.patch (780 bytes) - added by miqrogroove 15 months ago.
This would allow any chars in the pattern.
miqro-23587.2.patch (897 bytes) - added by miqrogroove 15 months ago.
Extra paranoid version.

Download all attachments as: .zip

Change History (20)

comment:1 @SergeyBiryukov2 years ago

  • Milestone changed from Awaiting Review to 3.6
  • Version changed from 3.5.1 to 1.5.2

Related: [2609]

comment:2 @stevenkword2 years ago

  • Keywords has-patch added; needs-patch removed

Attached a patch that replaces the regular expression delimiters of "!" with "#" for an instance of preg_match. I also noticed the same pattern with an instance of preg_replace and performed the same substituion. This was done to allow for regular expressions containing the "!" character.

Version 0, edited 2 years ago by stevenkword (next)

comment:3 @stevenkword2 years ago

  • Cc stevenkword added

comment:4 @ryan2 years ago

  • Milestone changed from 3.6 to Future Release

comment:5 follow-up: @nacin17 months ago

#26970 was marked as a duplicate.

comment:6 in reply to: ↑ 5 @Denis-de-Bernardy17 months ago

Replying to nacin:

#26970 was marked as a duplicate.

Replying here

the # delimiter doesn't break anything. It's used in wp-includes/class-wp.php, in the parse_request() function:

foreach ( (array) $rewrite as $match => $query ) {
	// If the requesting file is the anchor of the match, prepend it to the path info.
	if ( ! empty($req_uri) && strpos($match, $req_uri) === 0 && $req_uri != $request )
		$request_match = $req_uri . '/' . $request;

	if ( preg_match("#^$match#", $request_match, $matches) ||
		preg_match("#^$match#", urldecode($request_match), $matches) ) {

		if ( $wp_rewrite->use_verbose_page_rules && preg_match( '/pagename=\$matches\[([0-9]+)\]/', $query, $varmatch ) ) {
			// this is a verbose page match, lets check to be sure about it
			if ( ! get_page_by_path( $matches[ $varmatch[1] ] ) )
		 		continue;
		}

		// Got a match.
		$this->matched_rule = $match;
		break;
	}
}

comment:7 follow-up: @nacin17 months ago

  • Keywords commit added
  • Milestone changed from Future Release to 3.9

23587_regex_delimiters.patch looks OK to me. Is this something we could unit test?

What if someone wrongly includes '#' in an existing rewrite rule? What happens currently? If it has minimal side effects currently, switching to an E_WARNING doesn't sound very nice. See also #12271.

comment:8 in reply to: ↑ 7 @Denis-de-Bernardy17 months ago

Replying to nacin:

23587_regex_delimiters.patch looks OK to me. Is this something we could unit test?

Methinks there are plenty already.

What if someone wrongly includes '#' in an existing rewrite rule? What happens currently?

It doesn't work, per #12271 which got closed as wontfix.

If it has minimal side effects currently, switching to an E_WARNING doesn't sound very nice.

They're already present with a # in a permalink structure anyway, when rewrite rules try to match pages on the front end.

comment:9 @ircbot17 months ago

This ticket was mentioned in IRC in #wordpress-dev by ddebernardy. View the logs.

comment:10 @samuelsidler16 months ago

#26827 was marked as a duplicate.

comment:11 follow-up: @ircbot15 months ago

This ticket was mentioned in IRC in #wordpress-dev by miqrogroove. View the logs.

@miqrogroove15 months ago

This would allow any chars in the pattern.

comment:12 @miqrogroove15 months ago

Another way is to switch the delimeter to @ and then slash @ within the pattern. Then there will be no syntax conflicts. Patch attached.

comment:13 in reply to: ↑ 11 ; follow-up: @Denis-de-Bernardy15 months ago

Replying to ircbot:

This ticket was mentioned in IRC in #wordpress-dev by miqrogroove. View the logs.

Seeing those logs, I'm at a loss as to what could possibly have a core dev be worried about when it comes to changing the delimiter to #. Per my further comment further up:

https://core.trac.wordpress.org/ticket/23587#comment:6

@Nacin: which part of "the # delimiter is ALREADY used on the exact SAME data in another part of WP (aka parse_request())" do you not understand and could possibly be worrying about?

Last edited 15 months ago by Denis-de-Bernardy (previous) (diff)

comment:14 in reply to: ↑ 13 ; follow-up: @nacin15 months ago

Replying to Denis-de-Bernardy:

@Nacin: which part of "the # delimiter is ALREADY used on the exact SAME data in another part of WP (aka parse_request())" do you not understand and could possibly be worrying about?

What part of that IRC conversation did you not read? Oh, right, the part where I said "#s don't work cause of #12271. I forgot about that. So # seems to be fine, then."

I'm juggling a few hundred tickets a week, I don't remember every little detail about each of them.

Attitude matters, Denis.

@miqrogroove15 months ago

Extra paranoid version.

comment:15 @miqrogroove15 months ago

We were merely discussing alternatives. I updated my patch again based on nacin's thought that the delimeter could already be slashed. This becomes fairly convoluted, as you can see in the regexp. Switching straight to # remains the simplest solution.

comment:16 @nacin15 months ago

  • Owner set to nacin
  • Resolution set to fixed
  • Status changed from new to closed

In 27835:

Rewrite: Use same delimiter in url_to_postid() we use in WP::parse_request().

props stevenkword.
fixes #23587.

comment:17 in reply to: ↑ 14 @Denis-de-Bernardy15 months ago

Replying to nacin:

What part of that IRC conversation did you not read? Oh, right, the part where I said "#s don't work cause of #12271. I forgot about that. So # seems to be fine, then."

That came later in the conversation and, incidentally, had been highlighted already by yours truly in this very ticket:

https://core.trac.wordpress.org/ticket/23587#comment:8

I'm juggling a few hundred tickets a week

So do I, fwiw.

I don't remember every little detail about each of them.

Can't blame you for that: things would be a lot simpler if the WP bug tracker had less bugs open each time WP ships.

Attitude matters, Denis.

So does reading feedback to objections you raise in tickets or IRC when someone is helpfully researching a ticket in order to provide you with the needed context to make the right decision asap.

Or addressing obvious bugs with obvious fixes proactively, for that matter.

(See #27550 for another example of such. Create a post with a bunch of slashes in its title, then trash it or untrash it to see for yourself. The fix is utterly obvious, too: wp_slash() the data before calling wp_update_post().)

Note: See TracTickets for help on using tickets.