WordPress.org

Make WordPress Core

Opened 2 years ago

Closed 11 months ago

Last modified 11 months ago

#23587 closed defect (bug) (fixed)

url_to_postid() needs to use # or @ for preg_match start/end character

Reported by: coreygilmore Owned by: nacin
Milestone: 3.9 Priority: normal
Severity: normal Version: 1.5.2
Component: Rewrite Rules Keywords: has-patch commit
Focuses: Cc:

Description

url_to_postid() uses ! for the start/end characters of regular expressions, which doesn't allow you to use expressions containing !, like (?!negativelookahead).

The solution is to use preg_match("#^$match#", $request_match, $matches) or preg_match("@^$match@", $request_match, $matches)

See also: #7486 and WP::parse_request()

Attachments (3)

23587_regex_delimiters.patch (966 bytes) - added by stevenkword 2 years ago.
miqro-23587.patch (780 bytes) - added by miqrogroove 11 months ago.
This would allow any chars in the pattern.
miqro-23587.2.patch (897 bytes) - added by miqrogroove 11 months ago.
Extra paranoid version.

Download all attachments as: .zip

Change History (20)

comment:1 @SergeyBiryukov2 years ago

  • Milestone changed from Awaiting Review to 3.6
  • Version changed from 3.5.1 to 1.5.2

Related: [2609]

comment:2 @stevenkword2 years ago

  • Keywords has-patch added; needs-patch removed

Attached a patch that replaces the regular expression delimiters of "!" with "#" for an instance of preg_match(). I also noticed the same pattern with an instance of preg_replace() and performed the same substitution. This was done to allow for regular expressions containing the "!" character.

Last edited 2 years ago by stevenkword (previous) (diff)

comment:3 @stevenkword2 years ago

  • Cc stevenkword added

comment:4 @ryan20 months ago

  • Milestone changed from 3.6 to Future Release

comment:5 follow-up: @nacin13 months ago

#26970 was marked as a duplicate.

comment:6 in reply to: ↑ 5 @Denis-de-Bernardy13 months ago

Replying to nacin:

#26970 was marked as a duplicate.

Replying here

the # delimiter doesn't break anything. It's used in wp-includes/class-wp.php, in the parse_request() function:

foreach ( (array) $rewrite as $match => $query ) {
	// If the requesting file is the anchor of the match, prepend it to the path info.
	if ( ! empty($req_uri) && strpos($match, $req_uri) === 0 && $req_uri != $request )
		$request_match = $req_uri . '/' . $request;

	if ( preg_match("#^$match#", $request_match, $matches) ||
		preg_match("#^$match#", urldecode($request_match), $matches) ) {

		if ( $wp_rewrite->use_verbose_page_rules && preg_match( '/pagename=\$matches\[([0-9]+)\]/', $query, $varmatch ) ) {
			// this is a verbose page match, lets check to be sure about it
			if ( ! get_page_by_path( $matches[ $varmatch[1] ] ) )
		 		continue;
		}

		// Got a match.
		$this->matched_rule = $match;
		break;
	}
}

comment:7 follow-up: @nacin13 months ago

  • Keywords commit added
  • Milestone changed from Future Release to 3.9

23587_regex_delimiters.patch looks OK to me. Is this something we could unit test?

What if someone wrongly includes '#' in an existing rewrite rule? What happens currently? If it has minimal side effects currently, switching to an E_WARNING doesn't sound very nice. See also #12271.

comment:8 in reply to: ↑ 7 @Denis-de-Bernardy13 months ago

Replying to nacin:

23587_regex_delimiters.patch looks OK to me. Is this something we could unit test?

Methinks there are plenty already.

What if someone wrongly includes '#' in an existing rewrite rule? What happens currently?

It doesn't work, per #12271 which got closed as wontfix.

If it has minimal side effects currently, switching to an E_WARNING doesn't sound very nice.

They're already present with a # in a permalink structure anyway, when rewrite rules try to match pages on the front end.

comment:9 @ircbot13 months ago

This ticket was mentioned in IRC in #wordpress-dev by ddebernardy. View the logs.

comment:10 @samuelsidler11 months ago

#26827 was marked as a duplicate.

comment:11 follow-up: @ircbot11 months ago

This ticket was mentioned in IRC in #wordpress-dev by miqrogroove. View the logs.

@miqrogroove11 months ago

This would allow any chars in the pattern.

comment:12 @miqrogroove11 months ago

Another way is to switch the delimeter to @ and then slash @ within the pattern. Then there will be no syntax conflicts. Patch attached.

comment:13 in reply to: ↑ 11 ; follow-up: @Denis-de-Bernardy11 months ago

Replying to ircbot:

This ticket was mentioned in IRC in #wordpress-dev by miqrogroove. View the logs.

Seeing those logs, I'm at a loss as to what could possibly have a core dev be worried about when it comes to changing the delimiter to #. Per my further comment further up:

https://core.trac.wordpress.org/ticket/23587#comment:6

@Nacin: which part of "the # delimiter is ALREADY used on the exact SAME data in another part of WP (aka parse_request())" do you not understand and could possibly be worrying about?

Last edited 11 months ago by Denis-de-Bernardy (previous) (diff)

comment:14 in reply to: ↑ 13 ; follow-up: @nacin11 months ago

Replying to Denis-de-Bernardy:

@Nacin: which part of "the # delimiter is ALREADY used on the exact SAME data in another part of WP (aka parse_request())" do you not understand and could possibly be worrying about?

What part of that IRC conversation did you not read? Oh, right, the part where I said "#s don't work cause of #12271. I forgot about that. So # seems to be fine, then."

I'm juggling a few hundred tickets a week, I don't remember every little detail about each of them.

Attitude matters, Denis.

@miqrogroove11 months ago

Extra paranoid version.

comment:15 @miqrogroove11 months ago

We were merely discussing alternatives. I updated my patch again based on nacin's thought that the delimeter could already be slashed. This becomes fairly convoluted, as you can see in the regexp. Switching straight to # remains the simplest solution.

comment:16 @nacin11 months ago

  • Owner set to nacin
  • Resolution set to fixed
  • Status changed from new to closed

In 27835:

Rewrite: Use same delimiter in url_to_postid() we use in WP::parse_request().

props stevenkword.
fixes #23587.

comment:17 in reply to: ↑ 14 @Denis-de-Bernardy11 months ago

Replying to nacin:

What part of that IRC conversation did you not read? Oh, right, the part where I said "#s don't work cause of #12271. I forgot about that. So # seems to be fine, then."

That came later in the conversation and, incidentally, had been highlighted already by yours truly in this very ticket:

https://core.trac.wordpress.org/ticket/23587#comment:8

I'm juggling a few hundred tickets a week

So do I, fwiw.

I don't remember every little detail about each of them.

Can't blame you for that: things would be a lot simpler if the WP bug tracker had less bugs open each time WP ships.

Attitude matters, Denis.

So does reading feedback to objections you raise in tickets or IRC when someone is helpfully researching a ticket in order to provide you with the needed context to make the right decision asap.

Or addressing obvious bugs with obvious fixes proactively, for that matter.

(See #27550 for another example of such. Create a post with a bunch of slashes in its title, then trash it or untrash it to see for yourself. The fix is utterly obvious, too: wp_slash() the data before calling wp_update_post().)

Note: See TracTickets for help on using tickets.