Opened 12 years ago
Closed 12 years ago
#23572 closed enhancement (fixed)
Add functions to parse Audio / Video data out of arbitrary content or a post
Reported by: | wonderboymusic | Owned by: | markjaquith |
---|---|---|---|
Milestone: | 3.6 | Priority: | normal |
Severity: | normal | Version: | |
Component: | Post Formats | Keywords: | needs-unit-tests has-patch commit needs-codex |
Focuses: | Cc: |
Description
Scenarios
Audio
After realizing that a post has a format of audio but no metadata for audio:
- Theme does not support Audio, but the content contains an <audio>, <object>, <embed>, or <iframe>
- Theme does not support Audio, but the content is a URL
- Theme does not support Audio, but the first line of content is a URL followed by a new line(s) and some commentary
- Theme does not support Audio, but the post has attached audio, and core or a plugin has an [audio] shortcode
- Theme does not support Audio, but the post has attached audio, and core or a plugin does not have an [audio] shortcode
Introduce the_audio()
, get_the_audio( $id = 0 )
, get_post_audio( $post_id = 0 )
, get_content_audio( &$content, $remove = false )
Video
After realizing that a post has a format of video but no metadata for video:
- Theme does not support Video, but the content contains an <video>, <object>, <embed>, or <iframe>
- Theme does not support Video, but the content is a URL
- Theme does not support Video, but the first line of content is a URL followed by a new line(s) and some commentary
- Theme does not support Video, but the post has attached video, and core or a plugin has an [video] shortcode
- Theme does not support Video, but the post has attached video, and core or a plugin does not have an [video] shortcode
Introduce the_video()
, get_the_video( $id = 0 )
, get_post_video( $post_id = 0 )
, get_content_video( &$content, $remove = false )
Helper functions
Introduce get_tag_regex()
and shortcode_exists( $tag )
Attachments (13)
Change History (67)
#5
@
12 years ago
- Keywords needs-refresh added; has-patch removed
This ticket really requires #23282 and needs to be refreshed after it is committed
#6
@
12 years ago
- Keywords needs-unit-tests added
The extraction code definitely needs unit tests.
Probably no need for so many functions. wp_get_post_videos( $post ) and wp_get_post_audio( $post ) both returning arrays should be a sufficient starting point. In general I would start with general utility functions and then add more as we realize how best these will be used in a theme.
#7
follow-up:
↓ 8
@
12 years ago
- Keywords has-patch added; needs-refresh removed
Probably no need for so many functions.
This is similar to slurping images / galleries from a content blob. There are so many different ways it can be in there, we need 6 extraction functions, and here they are:
get_post_audio()
- wrapsget_children()
, returns attached audio, impl'd in #23282get_post_video()
- wrapsget_children()
, returns attached video, impl'd in #23282get_content_audio()
- slurps shortcodes out of the passed content and returns a list of lists: each shortcode can have multiple<source>
s (the primary audio<source>
and its fallbacks)get_content_video()
- slurps shortcodes out of the passed content and returns a list of lists: each shortcode can have multiple<source>
s (the primary video<source>
and its fallbacks)get_embedded_audio()
- returns a list: slurps embed HTML (<object>
,<embed>
,<iframe>
) and HTML5<audio>
tags out of content, or if the content is a URL or starts with a URL, returns that as wellget_embedded_video()
- returns a list: slurps embed HTML (<object>
,<embed>
,<iframe>
) and HTML5<video>
tags out of content, or if the content is a URL or starts with a URL, returns that as well
Patch updated. Unit Tests forthcoming.
#8
in reply to:
↑ 7
@
12 years ago
- Keywords dev-feedback added
Replying to wonderboymusic:
Probably no need for so many functions.
This is similar to slurping images / galleries from a content blob. There are so many different ways it can be in there, we need 6 extraction functions, and here they are:
Looking at these six functions, seems like there's a lot of duplication between the pairs. I could see reducing six to three and adding a media flag with a switch:
get_post_media( 'audio|video' )
(#23282)get_content_media ( 'audio|video' )
get_embedded_media( 'audio|video' )
Also happy to make patch(es) to do just that.
#13
follow-up:
↓ 16
@
12 years ago
@wonderboymusic -- two questions as we dig into integrating this into Twenty Thirteen.
- For
get_content_images()
andget_content_image()
could we add an option to return the full <img /> HTML string instead of just thesrc
attribute value?
Right now we'll constrain the display with CSS, but ideally we'd be able to use the HTML that was inserted into the editor so we get the correctly sized img
output HTML without having to rebuild it.
Or, even better: return the attachment IDs of the images, if the images strings looks like a WP attachment. That way we can use wp_get_attachment_image()
on the image instead of a) trusting the returned HTML and b) trying to build the image HTML string from scratch.
Reasoning: In Twenty Thirteen after we get the bare image src URL we have to turn around and rebuild an img
tag again.
get_content_media()
-- should this function be outputting the final HTML or just returning bare URLs?
@return array A list of lists. Each item has a list of sources corresponding * to a [{media type}]'s primary src and specified fallbacks
Looks to me like this is only returning a URL now, right? We get the "Primary src" back. (Also, what would be "specified callbacks" here)?
Reasoning: In Twenty Thirteen after we get the bare URL we have to then turn around and re-parse with wp_oembed_get()
or similar to make it into the correct output HTML again.
#14
@
12 years ago
yes, I can see how that is weird - its probably easy to change it to just return the HTML and skip the src
regex - I will take a look tonight
#15
@
12 years ago
Looking at Lance's patch for 2013 there is two more things that come to mind for me:
- You probably talked about this earlier, but I think it would be beneficial to have template tags a la
the_video()
as a single point of reference for themes. Apparently there are up to four ways to find a video - we should leave it to themes to come up with an abstraction layer. - It's a pain in the butt to remove a
[video]
shortcode from the post's content throughget_content_media()
. Can we come up with a (default) way to remove it so that it is removed fromthe_content()
?
#16
in reply to:
↑ 13
@
12 years ago
Replying to lancewillett:
Right now we'll constrain the display with CSS, but ideally we'd be able to use the HTML that was inserted into the editor so we get the correctly sized
img
output HTML without having to rebuild it.
Replying to wonderboymusic:
yes, I can see how that is weird - its probably easy to change it to just return the HTML and skip the
src
regex - I will take a look tonight
I think we need to be careful about pigeonholing ourselves into only returning the source or a built HTML tag.
Sometimes people want the source, sometimes they want the markup and using a bool $src
parameter or similar could accomplish that.
I'd just like to see us avoid a situation where we need two functions to do basically the same thing with a different return, such as with wp_get_attachment_image()
and wp_get_attachment_image_src()
#17
@
12 years ago
- Keywords commit added
attachment:23572-html.diff changes the function signatures for the following functions:
get_content_image( &$content, $html = true, $remove = false )
get_content_images( &$content, $html = true, $remove = false, $limit = 0 )
get_content_video( &$content, $html = true, $remove = false )
get_content_audio( &$content, $html = true, $remove = false )
get_content_media( $type, &$content, $html = true, $remove = false )
By default now, you get the HTML for the matched items, set $html
to false
to get the src
s
#18
follow-up:
↓ 19
@
12 years ago
yes, there def needs to be a the_video()
and the_audio()
type thing, just needs to be thorough. Has to check for the following things:
- Is the theme structured and is there meta?
- Is there a video attached to the post instead? Is there more than one?
- Is there an attachments AND meta? Is is the same or different?
- Is there a
[video]
in the content? - Is there embed code(s) in the content?
- does the content contain a video URL on a line by itself?
In order of precedence, I would say:
- Metadata
- Attached video
- Parse from content
- Shortcode
- URL by itself
- Embed codes
#19
in reply to:
↑ 18
@
12 years ago
Replying to wonderboymusic:
yes, there def needs to be a
the_video()
andthe_audio()
type thing, just needs to be thorough.
+1
This would remove almost all the complex code currently in the Twenty Thirteen patch to support video post formats.
#20
@
12 years ago
I agree with your order of precedence, and for these wrapper functions I'd argue that "removing from content" should be the default behavior. Theme can call the_video()
once and get back the HTML output for that video item -- and the post content should be automatically filtered without having to change the_content()
call in that same template.
I'd love to start off with:
the_video()
the_audio()
the_image()
Those are the most common use cases for media-related post formats.
#21
follow-up:
↓ 22
@
12 years ago
Sorry, I didn't make myself clear: I meant, if remove == true, remove from content so the_content()
works, unless a content to remove it from was passed.
#22
in reply to:
↑ 21
@
12 years ago
Replying to obenland:
Sorry, I didn't make myself clear: I meant, if remove == true, remove from content so
the_content()
works, unless a content to remove it from was passed.
+1
So if no content string passed as an argument (which would be the case for a filter) the core function should remove it from the globally scoped $post
object (the current post's content)?
Also: should these functions check for being in a post format? has_post_format( 'video' )
before executing? Does that matter?
#23
follow-up:
↓ 24
@
12 years ago
Ok, I updated attachment:23572-html.diff and added 3 functions:
get_the_media( $type )
the_audio()
the_video()
the_audio()
and the_video()
are just wrappers for get_the_media()
, which contains all of the black voodoo to return the HTML for an audio or video and remove it from the global post's content when appropriate. Does a post format check for meta only
Comments?
#24
in reply to:
↑ 23
@
12 years ago
Replying to wonderboymusic:
added 3 functions:
the_audio()
andthe_video()
are just wrappers forget_the_media()
, which contains all of the black voodoo to return the HTML for an audio or video and remove it from the global post's content when appropriate. Does a post format check for meta only
Lookin' awesome.
One thing noticed in testing, if the video output is a bare URL it probably needs to be converted to embed before returning. @obenland noted in IRC, we should probably check there in get_the_media()
whether the post meta value is an embeddable URL, similar to how we handle oEmbed in regular content.
Could you also add a case for the_image()
?
Since it doesn't involve post meta, would be first checking the first attachment, then fallback to first image found in the post content.
get_attached_images( ... )
-- take the first result, return as output HTMLget_content_image( ... )
-- take 1st result, return as output HTML
@obenland Can you please add your notes here about the $pages
global versus $post->post_content
?
#25
@
12 years ago
if the video output is a bare URL
should Line 2042 in patch be [embed]
instead of [{type}]
shortcode?
I will work on the_image()
right now
#26
@
12 years ago
Updating $post->post_content
seems to be only one step in a two step process. We would need to call setup_postdata( $post )
again after that, so that all globals get updated for the template tags.
Also I noticed, an attached video (for example) would not get stripped from the post content. In my test case I had it in there with a video shortcode that made it even more challenging to come up with a way to remove it :(
Scott, the next time I see you, beer's on me!
#27
@
12 years ago
do_shortcode( '[embed]http://www.youtube.com/watch?v=9yEtf-r08uM[/embed]' )
did not work for me. Is my syntax wrong?
#28
@
12 years ago
Patch updated.
Added the_image()
Also fixed get_the_media()
so that the shortcode is removed from the content. The problem was nesting of functions that all passed the same variable by reference. Super weird. The content would be correct within the function, but not when the function returned.
do_shortcode( '[embed]http://www.youtube.com/watch?v=9yEtf-r08uM[/embed]' ) did not work for me.
I think [embed]
only works during the_content()
- it works in post_format_compat()
because that filter runs before the embed filter runs.
#29
@
12 years ago
Updated patch - the_image()
will now return HTML and remove the found image from the content when present, even when the image is the result of an Insert Into HTML image attachment.
Few things left to do, and why this is a nightmare:
- Found an image, the image is from the Media Library, but the image's post parent is another post.
- The found image is wrapped in a link that could either be the image
src
or the attachment page - All of this nonsense could be wrapped in a caption shortcode
- Even more awesome, have to loop through every possible size of the image
Probably have to do the following:
- Extract
[caption]
s, and for each one, look for the existence of the image URL in all of its sizes, also looping through the attachment URLs (they will be in the same$urls
array) - if the URL is present, remove the[caption]
- Do the same, but for
<a>
then<img>
Brutal.
#30
@
12 years ago
Patch updated - strips associated captions now, which takes care of swallowing their contents
#31
@
12 years ago
So there's some strange behavior here. The wp_video_shortcode()
has a printf()
. Paste in a YouTube URL and setup_postdata()
will echo a link onto the page. Huh?
Seems like it's checking for a local URL before it's checking for an OEmbed capable URL.
#32
@
12 years ago
Testing .9.diff with images and video -- works nicely, w00t. Using the_extra_content()
works beautifully, using attachment:ticket:23620:23620.7.diff for testing with Twenty Thirteen.
#33
@
12 years ago
@wonderboymusic Tiny code style comments on .9.diff, generally with PHP code we avoid camel-cased variable names; you use hasTeaser
for example. Better: has_teaser
. And something like stripteaser
could be strip_teaser
, both per Reference.
#34
@
12 years ago
those snippets also lovingly reside in get_the_content()
which is what we copied to make that func. We knew it was duping code, so left an @internal
block that says we need to abstract more of it - this code happened at like 3am, pretty sure Jaquith was calling me names as it happened. Will peruse again right now
#35
@
12 years ago
attachment:23572-html.10.diff modernizes the get_the_content()
-inspired code
#36
@
12 years ago
- Owner set to markjaquith
- Resolution set to fixed
- Status changed from new to closed
In 23819:
#37
@
12 years ago
I wanted to point out that the get_the_image()
function will conflict with the Get the Image plugin here: http://wordpress.org/extend/plugins/get-the-image/
The main function of that plugin is get_the_image()
. Introducing that function into core will break 1,000s of WordPress sites that currently use it.
#40
@
12 years ago
- Cc gabriel.koen@… added
Just a heads up that r23819 contains a function naming conflict with the (as far as I can tell) widely-used plugin http://wordpress.org/extend/plugins/get-the-image/
#43
@
12 years ago
- Resolution fixed deleted
- Status changed from closed to reopened
I'd been holding back on saying this until the dust settled and someone else found a true issue but I think that these functions have really generic names and are likely to clash with other stuff.
Should we not wp_ prefix them all while we can?
#44
follow-up:
↓ 45
@
12 years ago
Perhaps not oddly, I have the opposite opinion.
Plugin authors have a responsibility to namespace their functions and classes, so as not to collide with the non-namespaced environment they run in.
Another cringe-worthy approach, is moving get_the_image() into pluggable.php, and Tadlock making sure there's feature parity between core and his plugin.
#45
in reply to:
↑ 44
@
12 years ago
Replying to johnjamesjacoby:
Perhaps not oddly, I have the opposite opinion.
Plugin authors have a responsibility to namespace their functions and classes, so as not to collide with the non-namespaced environment they run in.
I agree. Especially when it comes to template tags.
#46
follow-up:
↓ 47
@
12 years ago
Replying to westi:
Should we not wp_ prefix them all while we can?
I actually wish we'd do that with all WordPress functions, but that's a different discussion. I say we should definitely do that for these functions.
Replying to johnjamesjacoby:
Plugin authors have a responsibility to namespace their functions and classes, so as not to collide with the non-namespaced environment they run in.
The name of the plugin is "Get the Image". At the time I created the plugin, using get_the_image
made a lot of sense. If I had created the plugin today, I'd probably name it different. Five years ago when the plugin was created, I didn't know about prefixing stuff. It wasn't something that was preached that much.
Unfortunately, it's not something that can change in the plugin because the plugin code, specifically the get_the_image()
call, requires theme integration.
Another cringe-worthy approach, is moving get_the_image() into pluggable.php, and Tadlock making sure there's feature parity between core and his plugin.
One thing I didn't mention above is the the Get the Image plugin is included with many themes and is used on millions of Web sites in that manner. Themes can't overwrite pluggable functions.
It's also used on WordPress.com, which is problematic.
#47
in reply to:
↑ 46
@
12 years ago
Replying to greenshady:
The name of the plugin is "Get the Image". At the time I created the plugin, using
get_the_image
made a lot of sense. If I had created the plugin today, I'd probably name it different. Five years ago when the plugin was created, I didn't know about prefixing stuff. It wasn't something that was preached that much.
I understand, and assumed as much.
One thing I didn't mention above is the the Get the Image plugin is included with many themes and is used on millions of Web sites in that manner. Themes can't overwrite pluggable functions. It's also used on WordPress.com, which is problematic.
At the risk of sounding dramatic, it's another example of why including plugins in themes is such a bad idea; functions.php ends up being a booby-trap.
I actually wish we'd do that with all WordPress functions, but that's a different discussion. I say we should definitely do that for these functions.
Seems like the most logical choice given the circumstance. A shame to break the convention, since most other loop-oriented template tags aren't prefixed.
#50
@
12 years ago
get_the_image()
conflict was also mentioned in ticket:23593:2.
FWIW, "Avoiding Function Name Collisions" section was added to both "Plugin API" and "Writing a Plugin" Codex articles back in 2005:
https://codex.wordpress.org/index.php?title=Plugin_API&diff=6312&oldid=5289
https://codex.wordpress.org/index.php?title=Writing_a_Plugin&diff=9452&oldid=9448
For support in default themes, see #23620 -- only Twenty Thirteen needs this right now.