WordPress.org

Make WordPress Core

Opened 3 years ago

Closed 3 years ago

#17267 closed defect (bug) (fixed)

twentyeleven_url_grabber() needs work

Reported by: nacin Owned by:
Milestone: 3.2 Priority: normal
Severity: normal Version: 3.2
Component: Bundled Theme Keywords: has-patch close
Focuses: Cc:

Description

/**
 * Grab the first URL from a Link post
 */
function twentyeleven_url_grabber() {
	global $post, $posts;

	$first_url = '';

	ob_start();
	ob_end_clean();

	$output = preg_match_all('/<a.+href=[\'"]([^\'"]+)[\'"].*>/i', $post->post_content, $matches);

	$first_url = $matches [1] [0];

	if ( empty( $first_url ) )
		return false;

	return $first_url;
}

The regex needs some work:

  • a.+href would match area href. While that isn't much of a concern, it would also skip to the second link as it's greedy.
  • Likewise we should probably use a non-greedy (.+?) rather than ([\'"]+) for what we're capturing.
  • We probably want the s pattern modifier there, in case there's an \n somewhere such as after <a and before href.
  • No need for the .*> at the end, just grab the href and bail.

Something like this might work: '/<a\s[^>]+href=[\'"](.*?)[\'"]/is.

Also:

No need for the $posts global to be called upon.

Rather than preg_match_all, we should just use preg_match, since we want the first one. If ! preg_match, then we should return false. No need to assign the result to $output.

We should arguably use get_the_content() since this is being called in within a loop, rather than the raw, unfiltered data.

Anyone know what the output buffering was designed to do?

Attachments (2)

17267.diff (951 bytes) - added by duck_ 3 years ago.
17267-2.diff (868 bytes) - added by lancewillett 3 years ago.
Refreshed patch with better comment text

Download all attachments as: .zip

Change History (10)

comment:1 nacin3 years ago

We also need to make sure the link is safe. Perhaps esc_url_raw() before returning.

duck_3 years ago

comment:2 duck_3 years ago

  • Keywords has-patch added; needs-patch removed

How's this?

Slight change to nacin's suggested regex -- can have zero non > characters after the space and matching at least one character in the href attribute.

This also fixes a notice if there is no link present.

Related: #17279

comment:3 lancewillett3 years ago

Nice improvements!

Output buffering is definitely not needed here since the "haystack" is the post content string; it was probably a leftover from a different function.

The updated regex works well in my testing.

lancewillett3 years ago

Refreshed patch with better comment text

comment:4 nacin3 years ago

Note that /** et al. is phpdoc syntax. So any changes to comment text should happen to the text only, not the syntax around it.

duck_'s patch tests well here and looks like a good start. Let's roll with it.

comment:5 nacin3 years ago

In [17968]:

Fix twentyeleven_url_grabber(). props duck_, see #17267.

comment:6 westi3 years ago

  • Component changed from Themes to Bundled Theme

comment:7 westi3 years ago

  • Keywords close added

This looks good enough to me.

All done here for 3.2?

comment:8 nacin3 years ago

  • Resolution set to fixed
  • Status changed from new to closed
Note: See TracTickets for help on using tickets.