WordPress.org

Make WordPress Core

Opened 10 years ago

Closed 7 months ago

Last modified 8 weeks ago

#14558 closed enhancement (wontfix)

Separate Database Table Support for Custom Post Types

Reported by: rahul286 Owned by:
Milestone: Priority: normal
Severity: normal Version:
Component: Posts, Post Types Keywords:
Focuses: Cc:

Description

While working on custom post types, I felt need for this enhancements.

This can be achieved by adding an extra argument to the register_post_type function like below...

register_post_type( 'acme_product',
    array(
      'labels' => array(
        'name' => __( 'Products' ),
        'singular_name' => __( 'Product' )
      ),
      'public' => true,

      /* Database separation */
     'db_tables' => array(
        'prefix' => '', //by default, value of $table_prefix will be used. If user sets this value to something, it will be used as prefix for both of following tables
        'base_prefix' => '' , //this will control it tables are to be kept sitewide or per blog 
        'posts_name' => 'acme',
        'postmeta_name' => 'acmemeta',
      ),
  );

This small enhancement (not from coding perspective) will help more plugins authors go for custom post type.
Reasons are - first they will get option to have separate data storage.
Second - if some other badly coded plugin manipulates wp_posts table in some wrong way, it won't have sideeffect on third-party data.
Third - Plugin authors will get more space to experiment as at any time they will be dealing with their own plugin's data.

Of course, one of the goal of this nice feature must be to abstract database layer, but as a developer I feel it would be better if I can have some control over database without loosing power of this new (custom post type) feature.

Change History (29)

#1 follow-up: @westi
10 years ago

  • Resolution set to wontfix
  • Status changed from new to closed

This is not something we would likely consider.

We are happy with storing the custom post types in the existing table structures.

We don't recommend on adding extra tables - in some installation scenarios it isn't even possible.

#2 in reply to: ↑ 1 @rahul286
10 years ago

@Peter (westi)

Of course you know better and I respect ur opinion.

But can register_post_type be extended (as a class) or atleast changed to have enough actions and filter so experienced developers can take some extra efforts where this kind of requirement is critical for this.

I really do not believe in copying lots of codes from core and then patching them up to get desired results. It becomes harder to maintain.

Anyway, thanks for being so quick to reply. :-)

#3 @nacin
10 years ago

  • Milestone Awaiting Review deleted

#4 follow-up: @martin.krcho
6 years ago

  • Keywords changed from post type, database to post type database
  • Resolution wontfix deleted
  • Status changed from closed to reopened

This is something I would like to see in the future versions of WordPress as well. I am currently deciding between custom post types and something completely separate.

#5 follow-up: @dd32
6 years ago

  • Resolution set to wontfix
  • Status changed from reopened to closed

Re-closing due to the first comment still holding true.

Storing the posts in a different table will ultimately make very little difference, future WP_Post child classes will handle the requirements devs need with easier access to a different set of fields.

#6 @danieliser
5 years ago

This may not be the best place but what we really need isn't CPT in a CT, but rather an API to easily generate the post type editor page with custom data feeds.

The CPT UI, Permalinks, Taxonomy Integration etc are the factors we are struggling to keep, but the data constraints make things like e-commerce or any other data that requires reporting very difficult without Custom Tables.

I am considering the implications at this point of a Hybrid option. Use CPT to store the primary info & ID, then modify the query to JOIN the CT data. It's definately not the way I would prefer to do it but even with the extended query it should be fast, and easily searchable using a JOIN in the other direction.

#7 @SergeyBiryukov
5 years ago

  • Keywords post type database removed

#8 @DrewAPicture
5 years ago

#28519 was marked as a duplicate.

#9 follow-up: @CodeBard
4 years ago

Writing a forum plugin, i don't want to store tens of thousands of posts in wp's posts table and their metas in the postmeta. Leaving aside the incurred bloat, searching for multiple criteria from among meta becomes a nightmare. This is an intrinsic problem of EAV table structure, and until that fundamental concept is solved either through WP or through SQL enhancements, that problem is here to stay with us.

So my situation is, i want to use custom post types, but with my own tables which will be strictly relational and flat, but with the intention of using WP post functions. So far, there doesn't seem to be a way. While researching i got to this topic, this wasnt of help either.

#10 in reply to: ↑ 9 @danieliser
4 years ago

@CodeBard This may be of interest to you. I am using it in a couple plugins and it works well. You can easily add your own caching or search methods on the custom data as well.

https://github.com/danieliser/WP-Post-Partner-Tables

#11 in reply to: ↑ description @greplay
2 years ago

does this work will with out any problems? terms&taxonomy$cat and so...

#12 in reply to: ↑ 4 @greplay
2 years ago

Did you get the solution

#13 in reply to: ↑ description @greplay
2 years ago

It is just your imagine,I get it as a api to solve my problem...

#14 in reply to: ↑ 5 @greplay
2 years ago

@dd32 Im wating this solution for 3 years, I need your help.

Replying to dd32:

Re-closing due to the first comment still holding true.

Storing the posts in a different table will ultimately make very little difference, future WP_Post child classes will handle the requirements devs need with easier access to a different set of fields.

#15 @cuteantonyraj
20 months ago

Definitely it will be useful for lot of developers

#16 @Mte90
8 months ago

Just to reopen the floor, this is starting to be something urgent to do.
Now every plugin handles that in a custom way with bloating of resources, different standards and developers that ignore that part.
It is easy now using some plugins that add post types likes for log, discussions, learning systems, revisions (that with gitenberg are vary a lot) and so on to get easily thousands of rows in the wp_posts table.
It is quite difficult to keep the database healthy if WordPress itself doesn't offer an API, like did when the custom post types was created.
Right now there are different tools that are merging (liks berlindb presented at WCEU 2019) and many others.
Also as logic handle in that table different kind of post types that are not posts but maybe products is not so good.

Basically there are a lot of reasons to do this:

  • Create a better database organization
  • Improve the caching
  • Teach to developers better development habits

I don't think that we need to create a different schema but the same, only that use a different table for that specific post type.

#17 @samjco
7 months ago

  • Resolution wontfix deleted
  • Status changed from closed to reopened

I am 1+ for this to be done.

Especially when you have CPT, WC Orders, WC Products, Posts, Pages, etc, etc, etc all sharing the same table!!!
The flatten the table is, the better the querying speed Is.

ex:

wp_posts_cptname
wp_posts_orders
wp_postmeta_orders

Please. Registered CPTS should create tables. OR atleast have an OPTION to create new or store in default

'public' => true,
'separate_table' => true, //false to use wp_posts

Here is a solution made by Tinkler:
https://github.com/hannahtinkler/wordpress-custom-post-type-tables

This is a similar issue that came up for discussion in Woocommerce:
https://github.com/woocommerce/woocommerce/issues/9735

Please notice the experts commenting

And here is a solution for WC made by Liquidweb:
https://github.com/liquidweb/woocommerce-custom-orders-table

Last edited 7 months ago by samjco (previous) (diff)

#18 @pento
7 months ago

  • Resolution set to wontfix
  • Status changed from reopened to closed

Thank you for the extra thoughts, folks.

I appreciate that wp_posts can be a bottleneck for sites that create a lot of CPTs. I don't think having separate tables for CPTs is the answer, though. While this would likely give some performance boost in some areas, it would significantly complicate other common tasks (for example, search, looking up posts by slug, as well as any sort of bulk post_author management).

Such a change would also redefine what the post_id means, given that each table would have its own autoincrement column: post_id has been a unique identifier for the entire life of WordPress, I don't think it would be possible for us to change that without breaking just about every WordPress site in weird and subtle ways.

Finally, the complexity of migrating existing posts to this schema change would be wildly error prone.

Since this ticket was originally created, hosts have generally relaxed their restrictions around creating tables, so plugins that need it can reliably create a table appropriate to them, if needed.

In the case of WooCommerce, they've explored splitting off tables in the past, but have run into many of the problems I've mentioned here. It's still something they're looking into, but it's not a simple problem to solve, and that's just for one plugin.

As such, these kinds of large scale performance changes really need to be done on a site-by-site or plugin-by-plugin basis, rather than drastically increasing the complexity of all WordPress sites.

#19 @rahul286
7 months ago

@pento I was excited to see this ticket reopened after nine years.

I opened this ticket when CPT was introduced. I wish we had a custom database table abstraction from the beginning. Looking back at how things have evolved, it should have been a separate API rather than a flag in CPT.

I do understand that a lot has been changed over time. CPT's are now everywhere. So any refactoring will result in migration insanity, and WooCommerce case is an unfortunate example of that!

I see many times developers use CPT when they should be using custom tables. I hope that someday the core will have a better solution for custom database table management, which will be as easy to use as CPT. And once that is achieved, plugin author's who benefit from it will write their own migrations.

#20 @samjco
7 months ago

@pento What about db Views? That is only created during the CPT registering. This will allow users to optionally pull in their rendered data from the db View itself that is made from the Post table. However, when a CPT post is updated, it is updated within the post table and the process is repeated.

OR

Dynamically creating local CPT JSON files off of the post table. And when a post is updated the JSON file is updated. But within the CPT registration process, we can define if we want to use the local JSON or MySQL as a data source.

What do you think of these options?
This way no change has to be made to the post table.

Last edited 7 months ago by samjco (previous) (diff)

#21 @pento
7 months ago

There's no performance benefit for views, every time you query the view, it effectively adds a join with the query that generates the view. MySQL doesn't have native support for materialised views.

Writing CPT JSON files would suffer pretty terrible performance if there are regular writes (either posts created or update). It would effectively lock the entire CPT while the file is being written to disk.

#22 @samjco
7 months ago

@pento But what if on viewing the rendered table (wp-admin/edit.php?post_type=myCPT), there is a function (before table renders) that checks to see if there was any modification to the post table, if true, the myCPT.JSON file gets overwritten, if false, the table renders from the last saved myCPT.JSON file.

This way when a post is updated, it updates as normal using SQL. But the rendered table source is JSON (or even a flatten table and not json). What do you think?

Having a flatten form of data might prove very helpful if the search is searching the flatten data.

Last edited 7 months ago by samjco (previous) (diff)

#23 @pento
7 months ago

I don't understand how that would help. If you're checking for changes to the post, don't you need to select it from the database?

If you're caching a "last changed" value for all posts in the CPT, which you can look up a lot faster, there are a few problems that come to mind:

  • Dealing with race conditions, as multiple page requests could all try to update the JSON file at the same time (each of those page requests would have to wait for the JSON file to finish writing before they could read it, effectively a full table lock).
  • Allowing for code that doesn't use the core APIs to query the database, so they bypass any cache invalidation check, would be nearly impossible, I think. Even the best case would be extremely fragile.
  • This doesn't address that reading and parsing JSON is really slow, relative to how fast a database is. MySQL deprecated their more generic version of this (the old query cache), because it had minimal benefits, and a whole lot of down sides.

For providing a fast, automatically updated, minimally locked copy of the posts table, your best options are:

  • Use MySQL replication, and direct all of your read queries to the replica server. WPDB dropins like HyperDB can handle this automatically for you.
  • Use an in-memory cache, like memcached, which are super fast, and reduce load on the MySQL servers, by offloading the vast majority of read queries.
  • If you need to do full text searches, set up an Elasticsearch server to handle those requests.

There are plugins to help you set up all of these options, they're heavily tested on extremely high load sites (eg, these are all things that WordPress.com does), and are much more reliable than any sort of process that would shard the posts table by arbitrary criteria.

#24 @samjco
7 months ago

Thank you @pento for your insightful response.
I may continue to investigate this further.

In a nutshell, what I was trying to explain is:

Optionally, What if just the "Source" of the data tech (that being currently SQL) of the data in the default WP list table, of any post type, could be changed to a different tech (e,g: Flatten db table, JSONfile, local REST API, etc) (though being the same exact data rendered). But keep the wpdb tech "as-is" for updating/creating posts. The source tech can be made from the wp_posts table.

Just looking for a faster way to just render data without having to ticker with the server

But I understand a lot of functions would have to be mimic to work with other techs.

Maybe this should be a plugin, but wish it was in core.

Last edited 7 months ago by samjco (previous) (diff)

#25 @pento
7 months ago

Right, I see. For most uses, MySQL is the fastest option for what you're describing. It's pretty good about allowing a large number of concurrent reads, even while writing data. Retrieving a single post by ID, for example, is only a few ms, you may be able to shave off a 1-2 ms by adding memcached in this instance, but it's not really worth it.

Once you scale beyond MySQL's limits, the idea is to add existing tools in front of the database. MySQL remains the definitive source, and then you have various caching layers which hold objects, rather than DB rows. You'd store an entire post, with postmeta in memcached, saving multiple queries. In front of that, you'd have a HTML cache, which caches sections of HTML pages, or even entire HTML pages.

#26 follow-up: @binh
6 months ago

I also need this. Because I work on site with thousands of products. And storing them togehter with post and pages, media, etc in 1 table cause it so hard to do bulk editing/copying/export/import.

#27 in reply to: ↑ 26 @knutsp
6 months ago

Replying to binh:

I also need this. Because I work on site with thousands of products. And storing them togehter with post and pages, media, etc in 1 table cause it so hard to do bulk editing/copying/export/import.

Any plugin can put their data in custom tables, and hundreds do, like WooCommerce do for many objects. The post_type column is indexed, so I don't see unsolvable problems with export/import/bulk actions.

I can't see why core should support this, given the complexity with unique IDs that has to be solved. Each plugin will have their way of dealing with this, have tables with other columns and indexes. So even if core supported this natively, it would not be sufficient.

Core provides database layer API for anyone to use to make special tables, without having to establish their own database connection.

Custom post types can be stored in custom tables, and Pods Framework Advanced Content Types does this their way. So can therefore any plugin, or rely on their or other plugin's solutions.

Definitely plugin territory.

#28 @ReneHermi
8 weeks ago

The post_type column is indexed, so I don't see unsolvable problems with export/import/bulk actions.

Export/Import not but as you probably know we can not migrate and merge data from one site just in time to another due to shared data that should be separated.

Definitely plugin territory.
I can't see why core should support this, given the complexity with unique IDs that have to be solved.

That's the point: Due to the given complexity with the unique UID
we need to have a - preferably fully backward-compatible - API for CPT's for custom tables that can be used by every plugin. This should start with a plugin but it needs to get the full power and support of the core team like the Gutenberg project has.

It's the core responsibility to show plugin developers the correct ways how to store data in a table.
As the core was and as far as I know still recommends the use of CPT it's core responsibility to fix what it has been caused by this short-sighted implementation to denormalize the wp_posts table in such a heavy way.

We need to work on this together and WordPress core needs to include the solution for storing custom post types in separate tables.

Core and plugin developers need to pull the same strings and work on a common concept for reaching this. Otherwise, we will never be successful with storing data the right way as it is much too complex to solve without breaking millions of other sites.

First of all we need to stop recommending the use of CPT's as they are immediate. WooCommerce is the best example: If they had foreseen that storing orders and customer data in _post/_postmeta makes it impossible to migrate staging/development sites they would never have done this.

Now they are locked in technical dept as many other plugin developers as well.

Having plugin developers storing all of their plugin data in CPT's in one table indexed by a global unifier was never the right decision in terms of good database design. It was a curse and a blessing. In the first years, we thought it was a good decision as it made storing data just simply for hobbyist developers and helped growing the WordPress ecosystem. Then it turned out that it led to the technical debt that we have today. Professional companies have a hard time to use WordPress because they are not able to easily migrate data from one test environment another.

It should be especially in WooCommerce/Automatic intentions to get this solved as their users would benefit heavily from this. This issue can not be solved by a single person.

#29 @binaryfire
8 weeks ago

I agree with @ReneHermi. The performance and scaling issues with storing everying in postmeta have become mainstream enough for both ACF (via Hookturn's custom tables plugin) and Meta Box to have custom table options now. And the fact that WooCommerce had to add lookup tables to work around this, plus the increasing interest in Liquid Web's custom orders table plugin, further highlights the problem.

I don't think we should be worried about migration- just introducing this functionality in core so plugin devs can tap into it would make a big difference. It would be very easy to add and would be a solid step towards a full custom tables solution like @johnjamesjacoby's BerlinDB in the future.

Matt keeps talking about WP being the "OS of the web". Custom tables are going to be a necessity for that. It don't think it makes sense to keep putting off dealing with it.

Last edited 8 weeks ago by binaryfire (previous) (diff)
Note: See TracTickets for help on using tickets.