Make WordPress Core

Opened 13 years ago

Closed 11 years ago

Last modified 10 years ago

#19110 closed enhancement (invalid)

Media File Url

Reported by: braydonf Owned by:
Milestone: Priority: normal
Severity: normal Version:
Component: Media Keywords:
Focuses: Cc:

Description

The media file URL includes the site URL. This makes running two identical copies of the same site problematic, since each synchronization requires a search and replace through a dumped database.

The following is stored in the guid column:
http://braydon.com/wp-content/uploads/2011/10/BGF028_500x606.jpg

My recommendation is to store in the guid column:
/wp-content/uploads/2011/10/BGF028_500x606.jpg

Which then could become:
http://braydon.com/wp-content/uploads/2011/10/BGF028_500x606.jpg
http://mirror1-of-braydon.com/wp-content/uploads/2011/10/BGF028_500x606.jpg
http://mirron2-of-braydon.com/wp-content/uploads/2011/10/BGF028_500x606.jpg
http://braydon.localhost/wp-content/uploads/2011/10/BGF028_500x606.jpg

However, this could conflict with how the guid field is meant to be handled, in which case a new column would need to be added.
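
The recommendation above can be sketched in a few lines. This is a hypothetical Python illustration, not WordPress code: `resolve_media_url` and `STORED_PATH` are names invented here, and the only inputs are the relative path and the hosts listed in the description.

```python
from urllib.parse import urljoin

# Hypothetical: the root-relative path that would be stored instead of a full URL.
STORED_PATH = "/wp-content/uploads/2011/10/BGF028_500x606.jpg"


def resolve_media_url(site_url: str, stored_path: str) -> str:
    """Join a site's base URL with a stored root-relative media path."""
    return urljoin(site_url, stored_path)


if __name__ == "__main__":
    # The same stored value resolves against whichever copy of the site serves it.
    for site in ("http://braydon.com", "http://braydon.localhost"):
        print(resolve_media_url(site, STORED_PATH))
```

Because the host is supplied at read time, nothing in the stored value would need rewriting when the database is copied to a mirror.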

Change History (12)

#1 follow-up: @SergeyBiryukov
13 years ago

Related: #6492

#2 in reply to: ↑ 1 @braydonf
13 years ago

Replying to SergeyBiryukov:

Related: #6492

That is also related to another ticket I just added, #19111, which suggests using a hash for the ID. It looks like the discussion there was settling on using a UUID value for the GUID instead of the permalink.

As far as I can see, the GUID for media *has* to be the URL; there isn't another place for it.

#3 @Otto42
13 years ago

The attachments are probably using the GUID incorrectly. This is something that should be fixed.

#4 @braydonf
13 years ago

Possible Solutions

A: New Column

A new post_media column in the wp_posts table. This would be useful not only for attachments but for any custom post type that requires video, audio, or photos. Currently, plugins like NextGen ( http://wordpress.org/extend/plugins/nextgen-gallery/ ) define an entirely new table to store photos, so their photos, galleries, and albums do not support many features of posts, such as post status and publish date.

ID
post_author
post_date
post_date_gmt
post_content
post_media
post_title
post_excerpt
post_status
comment_status
ping_status
post_password
post_name
to_ping
pinged
post_modified
post_modified_gmt
post_content_filtered
post_parent
guid
menu_order
post_type
post_mime_type
comment_count

This would leave the GUID column free to store the attachment permalink, consistent with the GUID of posts. A GUID is not supposed to be a permalink, though; a permalink just happens to be a collision-free identifier, but it cannot move between domains. The GUID field could also be used to store a random UUID version 4.

B: Post Meta

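Add new functions to handle setting and retrieving the media URL, in the same way that there are functions for dealing with post thumbnails. The first group below is the proposed set; the second is the existing post thumbnail set it mirrors.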

get_the_post_media
has_post_media
the_post_media
get_post_media_id

get_the_post_thumbnail
has_post_thumbnail
the_post_thumbnail
get_post_thumbnail_id

C: New Table

To simplify the namespace of post thumbnails and media, use an object to handle more than one type of media for a post. Keys such as "thumbnail" could be used. The API could handle images, video, and audio, as well as plain files, with each type handled differently. The data would be stored in a new table ( http://codex.wordpress.org/Database_Description ) that is specifically for media. This new table could be wp_media:

media_id - incremented integer
media_key
post_id - relational to wp_posts ID
media_type - image, video or audio
media_mime_type - file type
media_path - path to the media relative to the upload directory
media_size - size in bytes
media_dimensions - length of the media or dimensions for images
media_order - for tracks
media_meta - any metadata, such as ID3 or EXIF

The media library and attachments could be rewritten to use this table, which would also be available to custom post types for handling audio, video, and images. Post thumbnails could also be rewritten to use it for simplicity.
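
As a rough sketch of option C, the column list above could be expressed as a table definition. The example uses Python's built-in sqlite3 module purely for illustration. The column names come from the list; the SQL types, the `media_for_post` helper, and the sample row are assumptions, not part of the proposal or of WordPress.

```python
import sqlite3

# Column names follow the list above; every SQL type here is an assumption.
SCHEMA = """
CREATE TABLE wp_media (
    media_id         INTEGER PRIMARY KEY AUTOINCREMENT,
    media_key        TEXT NOT NULL,
    post_id          INTEGER NOT NULL,
    media_type       TEXT NOT NULL,
    media_mime_type  TEXT NOT NULL,
    media_path       TEXT NOT NULL,
    media_size       INTEGER,
    media_dimensions TEXT,
    media_order      INTEGER DEFAULT 0,
    media_meta       TEXT
)
"""


def create_media_db() -> sqlite3.Connection:
    """Create an in-memory database holding the sketched wp_media table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    return conn


def media_for_post(conn: sqlite3.Connection, post_id: int, media_key: str) -> list:
    """Return the relative paths stored for one post under one key, in track order."""
    rows = conn.execute(
        "SELECT media_path FROM wp_media"
        " WHERE post_id = ? AND media_key = ? ORDER BY media_order",
        (post_id, media_key),
    )
    return [row[0] for row in rows]


def add_media(conn, post_id, media_key, media_type, mime_type, path):
    """Insert one media row; optional columns are left at their defaults."""
    conn.execute(
        "INSERT INTO wp_media"
        " (media_key, post_id, media_type, media_mime_type, media_path)"
        " VALUES (?, ?, ?, ?, ?)",
        (media_key, post_id, media_type, mime_type, path),
    )
```

A lookup such as `media_for_post(conn, 1, "thumbnail")` would then play the role that the post thumbnail functions play today.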

#5 @DrewAPicture
13 years ago

braydonf: What are the processing and speed implications of these 3 solutions? A new column is attractive because it expands on an existing table. A wp_media table is attractive because it fully segregates media out of the postmeta and wp_posts tables; the downside is also that it segregates the media into a new table.

+1 for this line of thinking, either way.

#6 follow-ups: @dd32
13 years ago

The Guid field simply stores the publicly accessible URL to the resource at present; this can be either the file URL (for attachments) or the post/page permalink. If you clear/md5 the Guid column, WordPress should operate as normal (or at least the default theme should; plugins and other themes may be _doing_it_wrong()).

WordPress does not use the Guid field for anything other than RSS feeds. (It used to use it for the URL to media, and still does for backwards compatibility, but if, and only if, all the other WordPress attachment metadata is missing, or was never added because the image was uploaded pre-2.3-ish.) The URL to the media is dynamically generated from the site URL (well, the content URL) and the attachment metadata (stored as postmeta for the media posts).
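
The lookup order described above can be sketched roughly as follows. This is a hypothetical Python illustration, not the actual WordPress implementation: the function name, its parameters, and the fixed "uploads" sub-directory (taken from the example URL in the ticket description) are all assumptions.

```python
from typing import Optional


def attachment_url(content_url: str, attached_file: Optional[str], guid: str) -> str:
    """Prefer a URL built from the content URL and the stored relative path.

    Fall back to the guid only when no attachment metadata is available,
    mirroring the backwards-compatibility case described above.
    """
    if attached_file:
        base = content_url.rstrip("/")
        # Assumption: uploads live in an "uploads" directory under the content URL.
        return f"{base}/uploads/{attached_file.lstrip('/')}"
    return guid
```

With metadata present, the guid value never influences the result, which is why a stale host name in that column would be harmless for media URLs.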

There is a metadata generation hook where you can insert/extract metadata from uploaded files and store it as meta. WordPress extracts image Exif data, for example (but unfortunately stores it as a serialized field, which limits its query-ability).

The problem with adding another table (aside from the obvious: adding a new table) is primarily the migration to it, and the fact that for most things it's not needed. Post meta and taxonomies can group and/or store meta for a specific upload. Not only that, but the specific metadata one might want to store will change over time, requiring new columns and generally scaling out of control.

The only downside to the current media situation (IMO) is the one-to-one relationship of attachment:post_parent -> post:id. An attachment can only have one parent post at present, which limits including media in multiple posts using the current [gallery] shortcodes. (Neither of the above suggestions addresses this limitation, I should add, and in reality they tie it into the current structure harder.)

The GUID field could also be used to store a random UUID version 4.

This seems to imply you're thinking of WordPress 2, 3, and 4. WordPress uses a decimal versioning system: version 3.2 -> 3.3 is more of a version 32 to version 33 jump, if you subscribe to the notion that the first number represents a milestone in a product. When we reach 3.9, we'll then go to 4.0, which will be as major as 2.2 to 2.3 was.

#7 in reply to: ↑ 6 @braydonf
13 years ago

Replying to dd32:

The Guid field simply stores the publicly accessible URL to the resource at present; this can be either the file URL (for attachments) or the post/page permalink. If you clear/md5 the Guid column, WordPress should operate as normal (or at least the default theme should; plugins and other themes may be _doing_it_wrong()).

That's great; doing that at some point would be worth it, if only to avoid the confusion of having the URL for the media stored in two places.

WordPress does not use the Guid field for anything other than RSS feeds. (It used to use it for the URL to media, and still does for backwards compatibility, but if, and only if, all the other WordPress attachment metadata is missing, or was never added because the image was uploaded pre-2.3-ish.) The URL to the media is dynamically generated from the site URL (well, the content URL) and the attachment metadata (stored as postmeta for the media posts).

An md5/sha1 hash for each post would seem useful for more than just RSS, such as version control, though the ID for posts would be only part of the issue.

There is a metadata generation hook where you can insert/extract metadata from uploaded files and store it as meta. WordPress extracts image Exif data, for example (but unfortunately stores it as a serialized field, which limits its query-ability).

The problem with adding another table (aside from the obvious: adding a new table) is primarily the migration to it, and the fact that for most things it's not needed. Post meta and taxonomies can group and/or store meta for a specific upload. Not only that, but the specific metadata one might want to store will change over time, requiring new columns and generally scaling out of control.

The properties of video, audio, and images don't change. There will be different codecs, but there will be codecs.

The only downside to the current media situation (IMO) is the one-to-one relationship of attachment:post_parent -> post:id. An attachment can only have one parent post at present, which limits including media in multiple posts using the current [gallery] shortcodes. (Neither of the above suggestions addresses this limitation, I should add, and in reality they tie it into the current structure harder.)

There should be a one-to-one relationship because the post should be the media. The post_parent is a different issue: media shouldn't have to be put into a post in order to be published, so it shouldn't be encouraged to have a post_parent property at all. Including the media in different posts, if needed, wouldn't set the post_parent property but could use another property that accepts multiple IDs.

The GUID field could also be used to store a random UUID version 4.

This seems to imply you're thinking of WordPress 2, 3, and 4. WordPress uses a decimal versioning system: version 3.2 -> 3.3 is more of a version 32 to version 33 jump, if you subscribe to the notion that the first number represents a milestone in a product. When we reach 3.9, we'll then go to 4.0, which will be as major as 2.2 to 2.3 was.

WordPress is unique in that major and minor version numbers don't carry different weight, since the change from 2.2 to 2.3 was larger than 2.9 to 3.0. A version 4 UUID is the randomly generated UUID. There are several types of UUIDs: version 1 (MAC address), version 2 (DCE Security), version 3 (MD5 hash), version 4 (random), and version 5 (SHA-1 hash).
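
For illustration, most of these UUID versions can be generated with Python's standard uuid module (which has no version 2 generator). This is only a sketch: the `PERMALINK` value is borrowed from the ticket description as an example name to hash, and choosing the URL namespace for it is an assumption.

```python
import uuid

# Example input only: any stable string could serve as the name to hash.
PERMALINK = "http://braydon.com/wp-content/uploads/2011/10/BGF028_500x606.jpg"

# Version 4: random, so every call gives a different value.
random_id = uuid.uuid4()

# Versions 3 and 5: derived from a namespace plus a name via MD5 and SHA-1,
# so the same input always yields the same UUID.
md5_id = uuid.uuid3(uuid.NAMESPACE_URL, PERMALINK)
sha1_id = uuid.uuid5(uuid.NAMESPACE_URL, PERMALINK)

# Version 1: built from a timestamp and a node identifier.
time_id = uuid.uuid1()

if __name__ == "__main__":
    for label, value in (("v4", random_id), ("v3", md5_id), ("v5", sha1_id), ("v1", time_id)):
        print(label, value, "version", value.version)
```

The practical difference for a GUID column is that a version 4 value carries no domain at all, while version 3 and 5 values stay reproducible from whatever name is hashed.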

#8 @ericlewis
11 years ago

  • Resolution set to invalid
  • Status changed from new to closed

Media URLs are built on the fly; their base is relative to WP_CONTENT_URL, which is in turn relative to the site_url option.

If you're mirroring your site, you're already filtering the site_url somehow (I assume), so the content directories will be automagically set correctly.

Closing out. Feel free to reopen with more follow-up.

#9 in reply to: ↑ 6 @braydonf
11 years ago

Replying to dd32:

WordPress does not use the Guid field for anything other than RSS feeds. (It used to use it for the URL to media, and still does for backwards compatibility, but if, and only if, all the other WordPress attachment metadata is missing, or was never added because the image was uploaded pre-2.3-ish.) The URL to the media is dynamically generated from the site URL (well, the content URL) and the attachment metadata (stored as postmeta for the media posts).

At some point it could be a good idea to use a hash instead of the 'permalink' as the GUID, and to drop support for using the GUID as a permalink, although that might break backwards compatibility. This would likely need to be part of another change that brings larger advantages to justify it.

#10 @braydonf
11 years ago

A case to use a hash for the GUID could be:

If you're publishing to multiple installs which are sharing data in that way, you shouldn't be relying on the post ID to stay continuous, it should be updated on merging upstream/downstream. You could turn the Guid field into a hash/UUID for that specific purpose if it helped.

https://core.trac.wordpress.org/ticket/19111#comment:7

#11 @braydonf
11 years ago

As mentioned earlier, there is discussion of using a hash as the GUID at #6492.

#12 @SergeyBiryukov
10 years ago

  • Milestone Awaiting Review deleted