Make WordPress Core

Opened 11 years ago

Closed 10 years ago

Last modified 10 years ago

#23587 closed defect (bug) (fixed)

url_to_postid() needs to use # or @ for preg_match start/end character

Reported by: coreygilmore's profile coreygilmore Owned by: nacin's profile nacin
Milestone: 3.9 Priority: normal
Severity: normal Version: 1.5.2
Component: Rewrite Rules Keywords: has-patch commit
Focuses: Cc:

Description

url_to_postid() uses ! for the start/end characters of regular expressions, which doesn't allow you to use expressions containing !, like (?!negativelookahead).

The solution is to use preg_match("#^$match#", $request_match, $matches) or preg_match("@^$match@", $request_match, $matches)

See also: #7486 and WP::parse_request()

Attachments (3)

23587_regex_delimiters.patch (966 bytes) - added by stevenkword 11 years ago.
miqro-23587.patch (780 bytes) - added by miqrogroove 10 years ago.
This would allow any chars in the pattern.
miqro-23587.2.patch (897 bytes) - added by miqrogroove 10 years ago.
Extra paranoid version.

Download all attachments as: .zip

Change History (20)

#1 @SergeyBiryukov
11 years ago

  • Milestone changed from Awaiting Review to 3.6
  • Version changed from 3.5.1 to 1.5.2

Related: [2609]

#2 @stevenkword
11 years ago

  • Keywords has-patch added; needs-patch removed

Attached a patch that replaces the regular expression delimiters of "!" with "#" for an instance of preg_match(). I also noticed the same pattern with an instance of preg_replace() and performed the same substitution. This was done to allow for regular expressions containing the "!" character.

Last edited 11 years ago by stevenkword (previous) (diff)

#3 @stevenkword
11 years ago

  • Cc stevenkword added

#4 @ryan
11 years ago

  • Milestone changed from 3.6 to Future Release

#5 follow-up: @nacin
10 years ago

#26970 was marked as a duplicate.

#6 in reply to: ↑ 5 @Denis-de-Bernardy
10 years ago

Replying to nacin:

#26970 was marked as a duplicate.

Replying here

the # delimiter doesn't break anything. It's used in wp-includes/class-wp.php, in the parse_request() function:

foreach ( (array) $rewrite as $match => $query ) {
	// If the requesting file is the anchor of the match, prepend it to the path info.
	if ( ! empty($req_uri) && strpos($match, $req_uri) === 0 && $req_uri != $request )
		$request_match = $req_uri . '/' . $request;

	if ( preg_match("#^$match#", $request_match, $matches) ||
		preg_match("#^$match#", urldecode($request_match), $matches) ) {

		if ( $wp_rewrite->use_verbose_page_rules && preg_match( '/pagename=\$matches\[([0-9]+)\]/', $query, $varmatch ) ) {
			// this is a verbose page match, lets check to be sure about it
			if ( ! get_page_by_path( $matches[ $varmatch[1] ] ) )
		 		continue;
		}

		// Got a match.
		$this->matched_rule = $match;
		break;
	}
}

#7 follow-up: @nacin
10 years ago

  • Keywords commit added
  • Milestone changed from Future Release to 3.9

23587_regex_delimiters.patch looks OK to me. Is this something we could unit test?

What if someone wrongly includes '#' in an existing rewrite rule? What happens currently? If it has minimal side effects currently, switching to an E_WARNING doesn't sound very nice. See also #12271.

#8 in reply to: ↑ 7 @Denis-de-Bernardy
10 years ago

Replying to nacin:

23587_regex_delimiters.patch looks OK to me. Is this something we could unit test?

Methinks there are plenty already.

What if someone wrongly includes '#' in an existing rewrite rule? What happens currently?

It doesn't work, per #12271 which got closed as wontfix.

If it has minimal side effects currently, switching to an E_WARNING doesn't sound very nice.

They're already present with a # in a permalink structure anyway, when rewrite rules try to match pages on the front end.

This ticket was mentioned in IRC in #wordpress-dev by ddebernardy. View the logs.


10 years ago

#10 @samuelsidler
10 years ago

#26827 was marked as a duplicate.

This ticket was mentioned in IRC in #wordpress-dev by miqrogroove. View the logs.


10 years ago

@miqrogroove
10 years ago

This would allow any chars in the pattern.

#12 @miqrogroove
10 years ago

Another way is to switch the delimeter to @ and then slash @ within the pattern. Then there will be no syntax conflicts. Patch attached.

#13 in reply to: ↑ 11 ; follow-up: @Denis-de-Bernardy
10 years ago

Replying to ircbot:

This ticket was mentioned in IRC in #wordpress-dev by miqrogroove. View the logs.

Seeing those logs, I'm at a loss as to what could possibly have a core dev be worried about when it comes to changing the delimiter to #. Per my further comment further up:

https://core.trac.wordpress.org/ticket/23587#comment:6

@Nacin: which part of "the # delimiter is ALREADY used on the exact SAME data in another part of WP (aka parse_request())" do you not understand and could you possibly be worrying about?

Version 1, edited 10 years ago by Denis-de-Bernardy (previous) (next) (diff)

#14 in reply to: ↑ 13 ; follow-up: @nacin
10 years ago

Replying to Denis-de-Bernardy:

@Nacin: which part of "the # delimiter is ALREADY used on the exact SAME data in another part of WP (aka parse_request())" do you not understand and could possibly be worrying about?

What part of that IRC conversation did you not read? Oh, right, the part where I said "#s don't work cause of #12271. I forgot about that. So # seems to be fine, then."

I'm juggling a few hundred tickets a week, I don't remember every little detail about each of them.

Attitude matters, Denis.

@miqrogroove
10 years ago

Extra paranoid version.

#15 @miqrogroove
10 years ago

We were merely discussing alternatives. I updated my patch again based on nacin's thought that the delimeter could already be slashed. This becomes fairly convoluted, as you can see in the regexp. Switching straight to # remains the simplest solution.

#16 @nacin
10 years ago

  • Owner set to nacin
  • Resolution set to fixed
  • Status changed from new to closed

In 27835:

Rewrite: Use same delimiter in url_to_postid() we use in WP::parse_request().

props stevenkword.
fixes #23587.

#17 in reply to: ↑ 14 @Denis-de-Bernardy
10 years ago

Replying to nacin:

What part of that IRC conversation did you not read? Oh, right, the part where I said "#s don't work cause of #12271. I forgot about that. So # seems to be fine, then."

That came later in the conversation and, incidentally, had been highlighted already by yours truly in this very ticket:

https://core.trac.wordpress.org/ticket/23587#comment:8

I'm juggling a few hundred tickets a week

So do I, fwiw.

I don't remember every little detail about each of them.

Can't blame you for that: things would be a lot simpler if the WP bug tracker had less bugs open each time WP ships.

Attitude matters, Denis.

So does reading feedback to objections you raise in tickets or IRC when someone is helpfully researching a ticket in order to provide you with the needed context to make the right decision asap.

Or addressing obvious bugs with obvious fixes proactively, for that matter.

(See #27550 for another example of such. Create a post with a bunch of slashes in its title, then trash it or untrash it to see for yourself. The fix is utterly obvious, too: wp_slash() the data before calling wp_update_post().)

Note: See TracTickets for help on using tickets.