Opened 14 years ago
Last modified 9 months ago
#14558 reopened enhancement
Separate Database Table Support for Custom Post Types
Reported by: | rahul286 | Owned by: | |
---|---|---|---|
Milestone: | Awaiting Review | Priority: | normal |
Severity: | normal | Version: | |
Component: | Posts, Post Types | Keywords: | 2nd-opinion |
Focuses: | Cc: |
Description
While working on custom post types, I felt the need for this enhancement.
It could be achieved by adding an extra argument to the register_post_type function, like below:
    register_post_type( 'acme_product', array(
        'labels' => array(
            'name'          => __( 'Products' ),
            'singular_name' => __( 'Product' ),
        ),
        'public' => true,
        /* Database separation */
        'db_tables' => array(
            'prefix'        => '', // By default, the value of $table_prefix is used. If set, it is used as the prefix for both of the following tables.
            'base_prefix'   => '', // Controls whether the tables are kept sitewide or per blog.
            'posts_name'    => 'acme',
            'postmeta_name' => 'acmemeta',
        ),
    ) );
This small enhancement (small in concept, if not in coding effort) will help more plugin authors adopt custom post types.
The reasons: first, they get the option of separate data storage.
Second, if some other badly coded plugin manipulates the wp_posts table in a wrong way, it won't have side effects on third-party data.
Third, plugin authors get more room to experiment, since at any time they are dealing only with their own plugin's data.
Of course, one goal of this feature must be to abstract the database layer, but as a developer I feel it would be better to have some control over the database without losing the power of this new custom post type feature.
Change History (44)
#2
in reply to:
↑ 1
@
14 years ago
@Peter (westi)
Of course you know better, and I respect your opinion.
But can register_post_type be extended (as a class), or at least be given enough actions and filters, so that experienced developers can take some extra effort where this kind of requirement is critical?
I really don't believe in copying lots of code from core and then patching it up to get the desired result. It becomes harder to maintain.
Anyway, thanks for being so quick to reply. :-)
#4
follow-up:
↓ 12
@
10 years ago
- Keywords changed from post type, database to post type database
- Resolution wontfix deleted
- Status changed from closed to reopened
This is something I would like to see in future versions of WordPress as well. I am currently deciding between custom post types and something completely separate.
#5
follow-up:
↓ 14
@
10 years ago
- Resolution set to wontfix
- Status changed from reopened to closed
Re-closing due to the first comment still holding true.
Storing the posts in a different table will ultimately make very little difference; future WP_Post child classes will handle the requirements devs need, with easier access to a different set of fields.
#6
@
10 years ago
This may not be the best place, but what we really need isn't CPTs in a custom table so much as an API to easily generate the post type editor page with custom data feeds.
The CPT UI, permalinks, taxonomy integration, etc. are the features we are struggling to keep, but the data constraints make things like e-commerce, or any other data that requires reporting, very difficult without custom tables.
I am considering the implications of a hybrid option: use a CPT to store the primary info and ID, then modify the query to JOIN the custom table data. It's definitely not the way I would prefer to do it, but even with the extended query it should be fast, and easily searchable using a JOIN in the other direction.
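For reference, the hybrid JOIN described above can be wired up with the posts_join and posts_fields filters that core already provides. A minimal sketch, assuming a hypothetical {$wpdb->prefix}acme_product_data table keyed by post_id (the table and column names are examples, not anything from this ticket):

```php
<?php
// Sketch only: JOIN a hypothetical custom table onto the main query
// for a specific post type. Table name and columns are assumptions.
add_filter( 'posts_join', function ( $join, $query ) {
    global $wpdb;
    if ( 'acme_product' === $query->get( 'post_type' ) ) {
        $join .= " LEFT JOIN {$wpdb->prefix}acme_product_data apd ON apd.post_id = {$wpdb->posts}.ID";
    }
    return $join;
}, 10, 2 );

add_filter( 'posts_fields', function ( $fields, $query ) {
    global $wpdb;
    if ( 'acme_product' === $query->get( 'post_type' ) ) {
        // Expose the custom columns on the resulting post objects.
        $fields .= ', apd.price, apd.sku';
    }
    return $fields;
}, 10, 2 );
```

The "JOIN in the other direction" (querying the custom table first, then joining wp_posts) would be a hand-written $wpdb query rather than a WP_Query.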
#9
follow-up:
↓ 10
@
9 years ago
Writing a forum plugin, I don't want to store tens of thousands of posts in WP's posts table and their meta in postmeta. Leaving aside the incurred bloat, searching on multiple criteria across meta becomes a nightmare. This is an intrinsic problem of the EAV table structure, and until that fundamental concept is solved, either in WP or through SQL enhancements, the problem is here to stay.
So my situation is: I want to use custom post types, but with my own tables, which will be strictly relational and flat, while still using the WP post functions. So far, there doesn't seem to be a way. While researching I got to this topic, but it wasn't of help either.
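For context on the EAV complaint above: each extra meta criterion in a query becomes another self-join on the postmeta table. A search over two meta fields ends up shaped roughly like this (illustrative SQL; the key names are examples, not the exact statement WP_Query emits):

```sql
-- Illustrative: two meta criteria require two self-joins on the EAV
-- table, and any numeric comparison must CAST the untyped meta_value.
SELECT p.ID
FROM wp_posts p
INNER JOIN wp_postmeta m1 ON m1.post_id = p.ID AND m1.meta_key = 'reply_count'
INNER JOIN wp_postmeta m2 ON m2.post_id = p.ID AND m2.meta_key = 'last_active'
WHERE p.post_type = 'forum_topic'
  AND CAST(m1.meta_value AS UNSIGNED) > 10
  AND m2.meta_value > '2015-01-01';
```

In a flat relational table, those would be two indexed column predicates with no joins at all.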
#10
in reply to:
↑ 9
@
9 years ago
@CodeBard This may be of interest to you. I am using it in a couple plugins and it works well. You can easily add your own caching or search methods on the custom data as well.
#11
in reply to:
↑ description
@
7 years ago
Does this work without any problems? What about terms, taxonomies, categories, and so on?
#13
in reply to:
↑ description
@
7 years ago
That is just an idea of yours; I need it as an API to solve my problem...
#14
in reply to:
↑ 5
@
7 years ago
@dd32 I have been waiting for this solution for 3 years; I need your help.
Replying to dd32:
Re-closing due to the first comment still holding true.
Storing the posts in a different table will ultimately make very little difference; future WP_Post child classes will handle the requirements devs need, with easier access to a different set of fields.
#16
@
5 years ago
Just to reopen the floor: this is starting to become urgent.
Right now every plugin handles this in its own custom way, with bloated resources, differing standards, and developers who simply ignore that part.
Using plugins that add post types for logs, discussions, learning systems, revisions (which vary a lot with Gutenberg), and so on, it is now easy to end up with thousands of rows in the wp_posts table.
It is quite difficult to keep the database healthy if WordPress itself doesn't offer an API, as it did when custom post types were created.
Right now different tools are emerging (like BerlinDB, presented at WCEU 2019) and many others.
Also, as a matter of design, keeping different kinds of post types that are not posts (products, for example) in that table is not good.
Basically there are a lot of reasons to do this:
- Create a better database organization
- Improve the caching
- Teach to developers better development habits
I don't think we need to create a different schema; the same schema, just in a different table for that specific post type, would do.
#17
@
5 years ago
- Resolution wontfix deleted
- Status changed from closed to reopened
I am +1 for this to be done.
Especially when you have CPTs, WC orders, WC products, posts, pages, etc., all sharing the same table!
The flatter the table is, the better the querying speed.
e.g.:
wp_posts_cptname, wp_posts_orders, wp_postmeta_orders
Please: registered CPTs should create tables, or at least have an OPTION to create a new table or store in the default one:
    'public' => true,
    'separate_table' => true, // false to use wp_posts
Here is a solution made by Tinkler:
https://github.com/hannahtinkler/wordpress-custom-post-type-tables
This is a similar issue that came up for discussion in Woocommerce:
https://github.com/woocommerce/woocommerce/issues/9735
Please note the experts commenting there.
And here is a solution for WC made by Liquidweb:
https://github.com/liquidweb/woocommerce-custom-orders-table
#18
@
5 years ago
- Resolution set to wontfix
- Status changed from reopened to closed
Thank you for the extra thoughts, folks.
I appreciate that wp_posts can be a bottleneck for sites that create a lot of CPTs. I don't think having separate tables for CPTs is the answer, though. While this would likely give some performance boost in some areas, it would significantly complicate other common tasks (for example, search, looking up posts by slug, as well as any sort of bulk post_author management).
Such a change would also redefine what the post_id means, given that each table would have its own autoincrement column. post_id has been a unique identifier for the entire life of WordPress, and I don't think it would be possible for us to change that without breaking just about every WordPress site in weird and subtle ways.
Finally, the complexity of migrating existing posts to this schema change would be wildly error prone.
Since this ticket was originally created, hosts have generally relaxed their restrictions around creating tables, so plugins that need it can reliably create a table appropriate to them, if needed.
In the case of WooCommerce, they've explored splitting off tables in the past, but have run into many of the problems I've mentioned here. It's still something they're looking into, but it's not a simple problem to solve, and that's just for one plugin.
As such, these kinds of large scale performance changes really need to be done on a site-by-site or plugin-by-plugin basis, rather than drastically increasing the complexity of all WordPress sites.
#19
@
5 years ago
@pento I was excited to see this ticket reopened after nine years.
I opened this ticket when CPT was introduced. I wish we had a custom database table abstraction from the beginning. Looking back at how things have evolved, it should have been a separate API rather than a flag in CPT.
I do understand that a lot has changed over time. CPTs are now everywhere, so any refactoring will result in migration insanity, and the WooCommerce case is an unfortunate example of that!
I see many times that developers use CPTs when they should be using custom tables. I hope that someday core will have a better solution for custom database table management, one as easy to use as CPTs. Once that is achieved, plugin authors who benefit from it will write their own migrations.
#20
@
5 years ago
@pento What about DB views, created only when the CPT is registered? This would allow users to optionally pull their rendered data from the DB view, which is built from the posts table. When a CPT post is updated, it is updated in the posts table and the process is repeated.
OR
Dynamically creating local CPT JSON files off of the post table. And when a post is updated the JSON file is updated. But within the CPT registration process, we can define if we want to use the local JSON or MySQL as a data source.
What do you think of these options?
This way no change has to be made to the post table.
#21
@
5 years ago
There's no performance benefit for views, every time you query the view, it effectively adds a join with the query that generates the view. MySQL doesn't have native support for materialised views.
Writing CPT JSON files would suffer pretty terrible performance if there are regular writes (either posts created or updated). It would effectively lock the entire CPT while the file is being written to disk.
#22
@
5 years ago
@pento But what if, on viewing the rendered table (wp-admin/edit.php?post_type=myCPT), a function (run before the table renders) checks whether there was any modification to the posts table: if true, the myCPT.json file gets overwritten; if false, the table renders from the last saved myCPT.json file.
This way, when a post is updated, it updates as normal using SQL, but the rendered table's source is JSON (or even a flattened table rather than JSON). What do you think?
Having a flattened form of the data might prove very helpful if the search runs against the flattened data.
#23
@
5 years ago
I don't understand how that would help. If you're checking for changes to the post, don't you need to select it from the database?
If you're caching a "last changed" value for all posts in the CPT, which you can look up a lot faster, there are a few problems that come to mind:
- Dealing with race conditions, as multiple page requests could all try to update the JSON file at the same time (each of those page requests would have to wait for the JSON file to finish writing before they could read it, effectively a full table lock).
- Allowing for code that doesn't use the core APIs to query the database, so they bypass any cache invalidation check, would be nearly impossible, I think. Even the best case would be extremely fragile.
- This doesn't address that reading and parsing JSON is really slow, relative to how fast a database is. MySQL deprecated their more generic version of this (the old query cache), because it had minimal benefits, and a whole lot of down sides.
For providing a fast, automatically updated, minimally locked copy of the posts table, your best options are:
- Use MySQL replication, and direct all of your read queries to the replica server. WPDB dropins like HyperDB can handle this automatically for you.
- Use an in-memory cache, like memcached, which are super fast, and reduce load on the MySQL servers, by offloading the vast majority of read queries.
- If you need to do full text searches, set up an Elasticsearch server to handle those requests.
There are plugins to help you set up all of these options, they're heavily tested on extremely high load sites (eg, these are all things that WordPress.com does), and are much more reliable than any sort of process that would shard the posts table by arbitrary criteria.
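As a concrete illustration of the in-memory cache option mentioned above: core's object cache API (wp_cache_get / wp_cache_set) becomes persistent across requests once a memcached or Redis drop-in is installed. A sketch, with a hypothetical helper function and cache group:

```php
<?php
// Sketch: read-through caching of an expensive lookup via the
// WordPress object cache API. The function name, cache group, and
// post type are illustrative. With a memcached/Redis drop-in
// installed, the cached value survives across page requests.
function acme_get_product_ids() {
    $ids = wp_cache_get( 'acme_product_ids', 'acme' );
    if ( false === $ids ) {
        // Cache miss: hit MySQL once, then store the result.
        $ids = get_posts( array(
            'post_type'   => 'acme_product',
            'fields'      => 'ids',
            'numberposts' => -1,
        ) );
        wp_cache_set( 'acme_product_ids', $ids, 'acme', 5 * MINUTE_IN_SECONDS );
    }
    return $ids;
}
```

Without a persistent drop-in, the same code still works but the cache only lasts for the current request.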
#24
@
5 years ago
Thank you @pento for your insightful response.
I may continue to investigate this further.
In a nutshell, what I was trying to explain is:
Optionally, what if just the "source" of the data (currently SQL) for the default WP list table of any post type could be changed to a different technology (e.g. a flattened DB table, a JSON file, a local REST API, etc.), while still rendering the exact same data? The wpdb layer would stay as-is for updating and creating posts, and the alternate source would be built from the wp_posts table.
I'm just looking for a faster way to render data without having to tinker with the server.
But I understand a lot of functions would have to be mimicked to work with other technologies.
Maybe this should be a plugin, but I wish it were in core.
#25
@
5 years ago
Right, I see. For most uses, MySQL is the fastest option for what you're describing. It's pretty good about allowing a large number of concurrent reads, even while writing data. Retrieving a single post by ID, for example, takes only a few ms; you may be able to shave off 1-2 ms by adding memcached in this instance, but it's not really worth it.
Once you scale beyond MySQL's limits, the idea is to add existing tools in front of the database. MySQL remains the definitive source, and then you have various caching layers which hold objects, rather than DB rows. You'd store an entire post, with postmeta in memcached, saving multiple queries. In front of that, you'd have a HTML cache, which caches sections of HTML pages, or even entire HTML pages.
#26
follow-up:
↓ 27
@
5 years ago
I also need this, because I work on a site with thousands of products. Storing them together with posts, pages, media, etc. in one table makes it very hard to do bulk editing/copying/export/import.
#27
in reply to:
↑ 26
@
5 years ago
Replying to binh:
I also need this, because I work on a site with thousands of products. Storing them together with posts, pages, media, etc. in one table makes it very hard to do bulk editing/copying/export/import.
Any plugin can put its data in custom tables, and hundreds do, as WooCommerce does for many objects. The post_type column is indexed, so I don't see unsolvable problems with export/import/bulk actions.
I can't see why core should support this, given the complexity with unique IDs that would have to be solved. Each plugin will have its own way of dealing with this, with tables that have other columns and indexes. So even if core supported this natively, it would not be sufficient.
Core provides a database layer API for anyone to use to make special tables, without having to establish their own database connection.
Custom post types can be stored in custom tables, and Pods Framework Advanced Content Types does this in its own way. So can any plugin, or they can rely on their own or other plugins' solutions.
Definitely plugin territory.
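The database layer API referred to above boils down to $wpdb plus dbDelta(). A sketch of a plugin creating its own table on activation (the table name and columns are illustrative, not from this ticket):

```php
<?php
// Sketch: a plugin creating its own table via the core database
// layer ($wpdb + dbDelta) on activation. Table and columns are
// illustrative assumptions.
register_activation_hook( __FILE__, function () {
    global $wpdb;
    require_once ABSPATH . 'wp-admin/includes/upgrade.php';

    $table   = $wpdb->prefix . 'acme_products';
    $charset = $wpdb->get_charset_collate();

    // Note dbDelta's formatting quirks: two spaces after PRIMARY KEY,
    // and each field on its own line.
    dbDelta( "CREATE TABLE {$table} (
        id bigint(20) unsigned NOT NULL AUTO_INCREMENT,
        post_id bigint(20) unsigned NOT NULL,
        sku varchar(64) NOT NULL,
        price decimal(10,2) NOT NULL DEFAULT 0,
        PRIMARY KEY  (id),
        KEY post_id (post_id)
    ) {$charset};" );
} );
```

Because dbDelta diffs the declared schema against the live table, rerunning it on upgrade also handles added columns and indexes.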
#28
@
5 years ago
The post_type column is indexed, so I don't see unsolvable problems with export/import/bulk actions.
Export/import may not be the problem, but as you probably know, we cannot migrate and merge data from one site to another on demand, because of shared data that should be separated.
Definitely plugin territory.
I can't see why core should support this, given the complexity with unique IDs that have to be solved.
That's the point: because of the given complexity with the unique ID,
we need a - preferably fully backward-compatible - API for storing CPTs in custom tables that can be used by every plugin. This should start as a plugin, but it needs the full power and support of the core team, like the Gutenberg project has.
It is core's responsibility to show plugin developers the correct way to store data in a table.
As core recommended, and as far as I know still recommends, the use of CPTs, it is core's responsibility to fix what has been caused by this short-sighted implementation, which denormalized the wp_posts table so heavily.
We need to work on this together and WordPress core needs to include the solution for storing custom post types in separate tables.
Core and plugin developers need to pull the same strings and work on a common concept for reaching this. Otherwise, we will never be successful with storing data the right way as it is much too complex to solve without breaking millions of other sites.
First of all, we need to immediately stop recommending the use of CPTs as they are. WooCommerce is the best example: if they had foreseen that storing orders and customer data in _posts/_postmeta makes it impossible to migrate staging/development sites, they probably never would have done it.
Now they are locked into technical debt, as are many other plugin developers.
Having plugin developers store all of their plugin data as CPTs in one table, indexed by a global identifier, was never the right decision in terms of good database design.
It was a curse and a blessing.
In the first years, we thought it was a good decision, as it made storing data simple for hobbyist developers and helped grow the WordPress ecosystem. Then it turned out that it led to the technical debt we have today. Professional companies have a hard time using WordPress because they cannot easily migrate data from one test environment to another.
It should especially be in WooCommerce's/Automattic's interest to get this solved, as their users would benefit heavily. This issue cannot be solved by a single person.
#29
@
5 years ago
I agree with @ReneHermi. The performance and scaling issues with storing everything in postmeta have become mainstream enough for both ACF (via Hookturn's custom tables plugin) and Meta Box to have custom table options now. And the fact that WooCommerce had to add lookup tables to work around this, plus the increasing interest in Liquid Web's custom orders table plugin, further highlights the problem.
I don't think we should be worried about migration; just introducing this functionality in core so plugin devs can tap into it would make a big difference. It would be very easy to add and would be a solid step towards a full custom tables solution like @johnjamesjacoby's BerlinDB in the future.
Matt keeps talking about WP being the "OS of the web". Custom tables are going to be a necessity for that. I don't think it makes sense to keep putting off dealing with it.
#30
@
5 years ago
I think the simplest solution would be to allow the register_post_type function to optionally specify a different table to save and retrieve data from. That way, plugin authors who know they are bloating the posts and postmeta tables could opt to send data to a custom table.
#31
@
4 years ago
I am +1 for this to be done.
I believe we should return to discussing this matter: when the ticket was opened 10 years ago, the technology worked one way; the data landscape is not the same today.
Database technology has changed, the data we store today is bigger, information comes and goes all the time, and custom post types are created to serve a more complex market. I believe it is time to evolve the way WordPress handles the database.
It is true that some points are worrying, mainly all the legacy of the standard WordPress structure that we have, but my point of view is geared towards extending rather than modifying. I believe it would be great to create something we can define natively when we create a custom post type.
We could extend this feature, for example, by using a custom table for the post type. At the same time we could also create a custom table for postmeta and start storing information horizontally - that is, custom fields would become columns, so the post_id column would reference the posts_custom table, and the others would be post_field_custom_1, post_field_custom_2, post_field_custom_3, and so on.
I know that today we can do both with third-party resources:
Custom tables for posts: https://github.com/hannahtinkler/wordpress-custom-post-type-tables
Custom table for postmeta: https://metabox.io/plugins/mb-custom-table/
But imagine the world of possibilities with this native in WordPress; I believe we would bring a new wave of developers over to WordPress.
Greetings Luiz Araujo
This ticket was mentioned in Slack in #core by luizarraujoofficial. View the logs.
4 years ago
#33
@
2 years ago
- Type changed from enhancement to feature request
- Version set to 6.0
Hello, I'm writing to ask whether this will be implemented.
At the moment, my team and I are implementing a forum and an online school (without plugins), and the database cries on every query (lol)...
We think this would be a good solution for the general performance of the applications we build in WordPress (we use WordPress as a framework, and sometimes headless).
Thanks for your attention.
Juan Iriart :)
This ticket was mentioned in Slack in #core by sergey. View the logs.
21 months ago
#35
@
21 months ago
- Focuses performance added
- Keywords 2nd-opinion changes-requested added
- Resolution wontfix deleted
- Status changed from closed to reopened
- Version changed from 6.0 to 6.0.3
WOW, @rahul286 called it 9 years ago - thank you! It's unfortunate how the WP team felt about it back then. I fully agree with @ReneHermi and @binaryfire. I hope this will be considered in the near future to allow better scaling for larger sites with millions of records.
#36
@
21 months ago
I will add that my original hack solution is still viable for plugin authors.
My use case was address radius searches, which were impossible from meta, but this framework I created lets you use custom tables that support querying in optimized ways while acting like meta tables for posts.
Further, they piggybacked on the *_post_meta functions to load and update records (in many cases).
#37
@
21 months ago
My use case was address radius searches
This is precisely what I'm looking to do as well.
Additionally, our primary ecommerce site has a postmeta table with ~3,870,129 rows, weighing in at 1.7 GB!
Looking forward to the day we can move to the new WooCommerce table system, but not all our plugins have been updated to handle the change yet. Going to take a look at your plugin. Thanks in advance.
#38
@
17 months ago
- Keywords changes-requested removed
- Milestone set to Awaiting Review
- Version 6.0.3 deleted
#39
@
16 months ago
- Focuses performance removed
This ticket was discussed during yesterday's Performance bug scrub.
@SergeyBiryukov, do you have a plan to work on this or a path for moving this issue forward? As discussed, this does not need to have a performance focus.
Additional props to @joemcgill.
#40
@
16 months ago
Hi. Is there a way to unsubscribe from this ticket? I’m not watching it but I still get emails whenever a new comment is posted.
#41
@
16 months ago
After all these years, I'm now 50/50 on this. The backwards compatibility concerns are major, and even if custom post types are redirected to another table, that table would also choke once millions of rows were inserted into it. So there may not be much gain from doing this, since the performance concerns would need to be addressed eventually in either case.
However, having the possibility of redirecting all DB operations for any given post type to another table could work, and it could address both the backwards compatibility concern and the performance concern for custom post type setups with millions of rows. That is, if an arg is supplied in the custom post type definition naming a different table, WP redirects all DB operations for that post type to tables other than wp_posts, wp_postmeta, etc. This could be made to happen in other ways, of course; this is just an idea...
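One fragile way to prototype the redirection idea today is the `query` filter that wpdb applies to every SQL statement it runs. This is only a proof of concept: string-rewriting SQL is brittle, the matching condition is naive, and the wp_forum_posts table name is hypothetical:

```php
<?php
// Fragile proof-of-concept: rewrite the posts table name in queries
// that target a specific post type, so reads/writes land in a
// different table. Real code would need far more careful SQL
// matching; 'forum_topic' and the target table are assumptions.
add_filter( 'query', function ( $sql ) {
    global $wpdb;
    if ( false !== strpos( $sql, "post_type = 'forum_topic'" ) ) {
        $sql = str_replace( $wpdb->posts, $wpdb->prefix . 'forum_posts', $sql );
    }
    return $sql;
} );
```

A core-supported version would presumably resolve the table inside WP_Query and the CRUD functions instead of rewriting finished SQL.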
#42
@
16 months ago
Even though lots of posts would still cause a slowdown, there would still be plenty of performance gains to be had as the slowdown would now be isolated to a specific post type.
For example, I had a client who kept records of each project they were commissioned to do for their customers. There were hundreds of thousands of projects this company had done over their many years of operation, and they wanted to be able to search through all of them. I imported all of these projects as a custom post type, and the entire site immediately became sluggish. If this data was in a separate table, the only slowdown that would happen would be when a search was made on the project search page.
The only backwards compatibility issue that I can think of with allowing a CPT to be moved to its own table would be with any plugins that are directly querying the wp_posts and wp_postmeta tables instead of using the built in WordPress functions. What other problems are there with backwards compatibility?
Besides the performance benefits, it would become much easier to exclude CPTs when migrating from staging to production which is actually a huge benefit. I'm excited to see this ticket starting to be seriously considered.
#43
@
9 months ago
The initial proposal was for a plugin/theme developer to leverage the existing core parts for the editor, display, etc., while separating the data in the DB. Existing post types can stay as they are, so there is no backwards compatibility issue at all.
Not even the slug is a problem - register_post_type already has a default behavior of using the post type as the pre-slug.
Cross-type searching is also irrelevant: if a developer chooses to do this, they will have no problem creating a special paginated WP_Query for that post type. So I would skip mixing post types in WP_Query and simply default to wp_posts when no table is specified, or use the specified table. This could be a start; later the table could be derived automatically (multiple post type tables => document that only wp_posts will be used; one post type table => use that).
The main use case is specific post types with tens or hundreds of thousands of posts. This can range from a forum, to a daily program listing, to a website with a long history (10+ years) that emits about 5 posts a day for one of its specific post types.
This would be most useful for post types that by their nature have nothing to do with the core content of the site (blog posts, static pages, ...).
#44
@
9 months ago
- Type changed from feature request to enhancement
I wonder if MySQL table partitioning may help.
So, What is MySQL Partitioning? (Percona.com)
Partitioning is a way in which a database (MySQL in this case) splits its actual data down into separate tables but still gets treated as a single table by the SQL layer. https://www.percona.com/blog/what-is-mysql-partitioning/
If this is plugin territory
I think a simple button could fire an ALTER TABLE on wp_posts and add partitions based on the existing custom post types. For instance (I haven't tested this; use at your own risk):
    <?php
    // Untested sketch (use at your own risk). Note: MySQL requires every
    // unique key, including the primary key, to contain all partitioning
    // columns, so wp_posts' PRIMARY KEY (ID) would first have to become
    // (ID, post_type) for any of this to work.
    ?>
    <!-- HTML submit button -->
    <form method="post">
        <button type="submit" name="runpart" value="1">Submit</button>
    </form>
    <?php
    if ( isset( $_POST['runpart'] ) && 1 === (int) $_POST['runpart'] ) :
        global $wpdb;
        $table_name = $wpdb->prefix . 'posts';

        // Get a distinct count of post types.
        $curr_pt_count = (int) $wpdb->get_var( "SELECT COUNT(DISTINCT post_type) FROM $table_name;" );

        // Check if this is the first time running.
        if ( get_option( 'option_old_pt_count' ) ) :
            $old_pt_count = (int) get_option( 'option_old_pt_count' );

            // If any custom post types were removed.
            if ( $curr_pt_count < $old_pt_count ) :
                $diff = $old_pt_count - $curr_pt_count;
                // Remove post type partitions no longer needed.
                $wpdb->query( "ALTER TABLE $table_name COALESCE PARTITION $diff;" );
            endif;

            // If any new custom post types were added.
            if ( $curr_pt_count > $old_pt_count ) :
                // Re-partition. HASH needs an integer expression, so partition
                // by KEY on the string column instead (this may error out if
                // the existing partitioning is not dropped first).
                $wpdb->query( "ALTER TABLE $table_name PARTITION BY KEY(post_type) PARTITIONS $curr_pt_count;" );
            endif;

            // Update the stored count.
            update_option( 'option_old_pt_count', $curr_pt_count );
        else :
            // First run: add partitions to the posts table.
            $wpdb->query( "ALTER TABLE $table_name PARTITION BY KEY(post_type) PARTITIONS $curr_pt_count;" );
            add_option( 'option_old_pt_count', $curr_pt_count );
        endif;
    endif;
If this is core territory
Maybe MySQL partitioning should be included in core's database and table creation script.
https://dev.mysql.com/doc/refman/8.0/en/partitioning-types.html
For instance:
    <?php
    // Note: the partitioning column must be part of the primary key,
    // hence PRIMARY KEY (ID, post_type), and LIST partitioning rejects
    // inserts for any post_type not enumerated below.
    $sql = "CREATE TABLE {$wpdb->prefix}posts (
        ID bigint(20) NOT NULL AUTO_INCREMENT,
        post_author bigint(20),
        post_date datetime DEFAULT '0000-00-00 00:00:00',
        post_date_gmt datetime DEFAULT '0000-00-00 00:00:00',
        post_content longtext,
        post_title text,
        post_excerpt text,
        post_status varchar(20),
        comment_status varchar(20),
        ping_status varchar(20),
        post_password varchar(20),
        post_name varchar(200),
        to_ping text,
        pinged text,
        post_modified datetime DEFAULT '0000-00-00 00:00:00',
        post_modified_gmt datetime DEFAULT '0000-00-00 00:00:00',
        post_content_filtered longtext,
        post_parent bigint(20),
        guid varchar(255),
        menu_order int(11),
        post_type varchar(20),
        post_mime_type varchar(100),
        comment_count bigint(20),
        PRIMARY KEY (ID, post_type)
    ) PARTITION BY LIST COLUMNS(post_type) (
        PARTITION ppost VALUES IN ('post'),
        PARTITION ppage VALUES IN ('page'),
        PARTITION pcpt_1 VALUES IN ('custom-post-type1'),
        PARTITION pcpt_2 VALUES IN ('custom-post-type2')
    );";
Using MySQL partitions, we can run SELECT queries like:
    "SELECT * FROM {$wpdb->prefix}posts PARTITION (pcpt_1);"
Or as normal; it will work both ways:
    "SELECT * FROM {$wpdb->prefix}posts WHERE post_type = 'custom-post-type1';"
This is not something we would likely consider.
We are happy with storing the custom post types in the existing table structures.
We don't recommend adding extra tables - in some installation scenarios it isn't even possible.