WordPress.org

Make WordPress Core

Opened 2 years ago

Closed 2 years ago

Last modified 2 years ago

#19111 closed enhancement (wontfix)

Post ID by Hash

Reported by: braydonf Owned by:
Milestone: Priority: normal
Severity: normal Version:
Component: Database Keywords:
Focuses: Cc:

Description

Post IDs are currently stored as auto-incrementing values. When merging posts from two separate WordPress databases, this causes ID collisions. Merging then means importing posts under new IDs, which breaks continuity: the original ID cannot be kept, so changes cannot be synced back.

I'm suggesting using hashed IDs instead of incremented ID values, similar to how version control systems are currently designed.
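As a rough illustration of the idea (not anything WordPress implements), an ID could be derived from a post's initial fields in roughly the way a version control system names objects. The function name `post_hash_id`, the choice of fields, and the sample values below are all assumptions made for this sketch.

```python
import hashlib


def post_hash_id(author: str, created_utc: str, content: str) -> str:
    """Derive a hex ID from a post's initial fields (hypothetical scheme)."""
    h = hashlib.sha1()
    for field in (author, created_utc, content):
        data = field.encode("utf-8")
        # Length-prefix each field so ("ab", "c") and ("a", "bc") hash differently.
        h.update(str(len(data)).encode("ascii") + b":" + data)
    return h.hexdigest()


# Two posts created offline on different machines get different IDs
# without either machine consulting a shared counter.
id_a = post_hash_id("alice", "2011-01-01T10:00:00Z", "Draft from the airport")
id_b = post_hash_id("bob", "2011-01-01T10:00:00Z", "Draft from the bus")
```

The ID would have to be fixed at creation time; if it were recomputed after every edit, the post would change identity whenever its content changed.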

GUID values are useful for site-to-site reference, but they don't account for multiple simultaneous systems that can be easily merged and branched, as in distributed version control.

Change History (11)

comment:1 braydonf 2 years ago

  • Component changed from General to Database

comment:2 braydonf 2 years ago

Looks like there is a ticket discussing using a UUID for the GUID value, which could help with this. See #6492.

comment:3 follow-up: scribu 2 years ago

  • Milestone Awaiting Review deleted

Why not just use a custom field to store a unique hash?

Also, what's the use-case for "merging and branching" posts between sites?

comment:4 scribu 2 years ago

  • Milestone set to Awaiting Review

comment:5 in reply to: ↑ 3 braydonf 2 years ago

Replying to scribu:

Why not just use a custom field to store a unique hash?

  1. Functions like get_post work on the ID and not on the custom fields.
  2. It would require another query to determine which hash is for what ID.
  3. Why have two IDs when there could be one?
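Point 2 can be sketched with an in-memory SQLite database standing in for MySQL. The tables are heavily simplified versions of the posts and post meta tables, and the meta key `_sync_hash`, the helper `get_post_by_hash`, and the sample row are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE posts (ID INTEGER PRIMARY KEY, post_title TEXT);
    CREATE TABLE postmeta (post_id INTEGER, meta_key TEXT, meta_value TEXT);
    INSERT INTO posts VALUES (42, 'Field notes');
    INSERT INTO postmeta VALUES (42, '_sync_hash', 'a3f1c9');
""")


def get_post_by_hash(conn, sync_hash):
    """Look up a post by its hash stored in a custom field."""
    # Query 1: resolve the hash to the numeric post ID.
    row = conn.execute(
        "SELECT post_id FROM postmeta WHERE meta_key = ? AND meta_value = ?",
        ("_sync_hash", sync_hash),
    ).fetchone()
    if row is None:
        return None
    # Query 2: fetch the post itself by numeric ID.
    return conn.execute(
        "SELECT ID, post_title FROM posts WHERE ID = ?", (row[0],)
    ).fetchone()


post = get_post_by_hash(conn, "a3f1c9")
```

The two lookups could be combined into one statement with a JOIN, so the extra cost is mostly the additional table access rather than a second round trip.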

Also, what's the use-case for "merging and branching" posts between sites?

Use-case: There is a website at http://sample.com/ run by documentary journalists and photographers who are offline most of the time: at airports, on buses, and traveling constantly. They keep identical copies of the entire website locally so that they can prepare posts while offline. Once they are back online, they can quickly send their changes and new articles, and receive new articles and updates from others. IDs should remain the same for each writer, because some posts may not be published immediately and other writers can edit and add to the story before it is finally published. Examples could be a writer and a photographer with different skills, or two people with the same skills but different locations and points of view.

comment:6 scribu 2 years ago

Related plugin: Local Storage Backup.

So, if we switched to hashes instead of numerical ids, what would happen to all the existing sites when they upgrade to the new version of WordPress?

Also, won't using hashes slow down complex queries, such as multi-taxonomy queries?

Last edited 2 years ago by scribu

comment:7 follow-up: dd32 2 years ago

Post IDs are unique to a single install, and that's the way they should stay.

If you're publishing to multiple installs which are sharing data in that way, you shouldn't be relying on the post ID to stay continuous, it should be updated on merging upstream/downstream. You could turn the Guid field into a hash/UUID for that specific purpose if it helped.

One alternative would be to start the post IDs for blog1 at 750,000, the second at 1,500,000, etc. (done by setting the auto-increment value on the database).
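The arithmetic behind that suggestion can be sketched as follows. The block size comes from the figures above; the function names are made up for this sketch, and the `wp_posts` table name assumes the default `wp_` prefix.

```python
BLOCK_SIZE = 750_000  # spacing between installs, taken from the figures above


def starting_id(site_index: int) -> int:
    """First post ID for the install with the given 1-based index."""
    return site_index * BLOCK_SIZE


def owning_site(post_id: int) -> int:
    """Index of the install whose block contains post_id (0 = unassigned)."""
    return post_id // BLOCK_SIZE


def auto_increment_sql(site_index: int, table: str = "wp_posts") -> str:
    """MySQL statement that moves the table's counter to the block start."""
    return f"ALTER TABLE {table} AUTO_INCREMENT = {starting_id(site_index)};"
```

Nothing in this scheme stops an install from running past the end of its block, so it only holds up if each install stays under 750,000 posts.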

In the end, post IDs are the least of your worries in that scenario: you'll have to remap more than just the post ID (term IDs, etc.) and deal with multiple versions of content.
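A minimal sketch of that remapping, assuming posts are matched across installs by GUID and that the only cross-reference is a parent ID. Real data would also carry term IDs, meta, and conflicting versions, none of which is handled here. The dictionary layout and the name `merge_posts` are hypothetical.

```python
def merge_posts(local, incoming):
    """Merge incoming posts into local; return {incoming_id: local_id}."""
    by_guid = {p["guid"]: p["id"] for p in local}
    next_id = max((p["id"] for p in local), default=0) + 1
    id_map = {}
    new_posts = []
    # Pass 1: decide the local ID for every incoming post.
    for post in incoming:
        if post["guid"] in by_guid:
            id_map[post["id"]] = by_guid[post["guid"]]  # already known locally
        else:
            id_map[post["id"]] = next_id  # new post gets a fresh local ID
            new_posts.append(post)
            next_id += 1
    # Pass 2: insert new posts, rewriting parent references through the map.
    # A parent that is not part of the incoming set falls back to 0 (no parent).
    for post in new_posts:
        local.append({
            "id": id_map[post["id"]],
            "guid": post["guid"],
            "parent": id_map.get(post["parent"], 0),
        })
    return id_map


local = [{"id": 1, "guid": "g-a", "parent": 0},
         {"id": 2, "guid": "g-b", "parent": 0}]
incoming = [{"id": 1, "guid": "g-b", "parent": 0},
            {"id": 2, "guid": "g-c", "parent": 1}]
id_map = merge_posts(local, incoming)
```

In this example the incoming post with ID 1 is the same post as local ID 2, so the new post's parent reference is rewritten from 1 to 2.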

comment:8 in reply to: ↑ 7 braydonf 2 years ago

Replying to scribu:

Related plugin: Local Storage Backup.

That looks useful for writing, as the post will be stored in the browser's local storage, but uploading media is still tied to the server. You also wouldn't be able to view any posts you weren't already looking at. It's a good proof of concept of using local storage in the browser.

So, if we switched to hashes instead of numerical ids, what would happen to all the existing sites when they upgrade to the new version of WordPress?

That would be a huge problem. GUIDs are currently stored using the ID, and many blogs that are not using pretty permalinks would be affected by the change. Internally, updating all of the IDs in the database could be automated. The change would be significant enough to warrant some sort of opt-out, to avoid potential problems. There are also likely many plugins that depend on the ID being numerical rather than a hash, and these would be affected too. So if this were to change, it would need to come with other changes that make it worthwhile. The biggest example I can see would be an offline admin interface built around localStorage, or some other type of database that can be used on a user's local machine, where plugins likely couldn't run without being updated. How we would run PHP locally is probably a larger part of this than IDs.

Also, won't using hashes slow down complex queries, such as multi-taxonomy queries?

I don't know enough about MySQL queries, unfortunately, to know if a numerical ID or hash would be faster or slower. I do know that numerical and incremented IDs are very common with relational databases, so I'll assume that there is a reason.

Replying to dd32:

Post IDs are unique to a single install, and that's the way they should stay.

I'm not seeing any way around it at this time, especially considering that terms and taxonomies all use numerical incremented IDs.

If you're publishing to multiple installs which are sharing data in that way, you shouldn't be relying on the post ID to stay continuous, it should be updated on merging upstream/downstream. You could turn the Guid field into a hash/UUID for that specific purpose if it helped.

Yes, this could be useful for that, at the level of feeds for sure. #19110 #6492

One alternative would be to start the post IDs for blog1 at 750,000, the second at 1,500,000, etc. (done by setting the auto-increment value on the database).

That could work assuming that there is enough organization.

In the end, post IDs are the least of your worries in that scenario: you'll have to remap more than just the post ID (term IDs, etc.) and deal with multiple versions of content.

I didn't realize this, that's a great point.

comment:9 scribu 2 years ago

  • Milestone Awaiting Review deleted
  • Resolution set to wontfix
  • Status changed from new to closed

I had a feeling there was a larger problem with this proposal, but dd32 nailed it.

We're not changing post ids to hashes any time soon.

comment:10 follow-up: scribu 2 years ago

Continuing the discussion, though:

Without using a desktop or mobile client, the only other way to have an offline admin area is through JavaScript and some form of local storage.

Collaboration with other users could only happen through the server, after a sync is made.

comment:11 in reply to: ↑ 10 braydonf 2 years ago

Replying to scribu:

Continuing the discussion, though:

Without using a desktop or mobile client, the only other way to have an offline admin area is through JavaScript and some form of local storage.

Collaboration with other users could only happen through the server, after a sync is made.

Unless there was a server they could run locally to send changes to each other. The easiest way would be to run WordPress on a local machine in the same way as on the server, and connect to each other over a LAN. That doesn't require a total JavaScript rewrite; everything just works as normal. Though it still wouldn't be possible with MySQL/PHP without breaking lots of stuff. C'est la vie.
