#8958 closed enhancement (duplicate)
Optimize rewrite rule generation
Reported by: | matthijs | Owned by: | |
---|---|---|---|
Milestone: | Priority: | normal | |
Severity: | major | Version: | 2.7 |
Component: | Optimization | Keywords: | |
Focuses: | Cc: |
Description
The problem was introduced after the upgrade from 2.6.5 to 2.7.
The site has a permalink structure of /%category%/%year%/%monthnum%/%day%/%postname%/
After the upgrade the blog runs more then 2400 queries per page load. The site works normal, only very very slow (of course).
Research into the queries being run shows wordpress is trying to insert the rewrite_rules in the wp_options table. Only problem is: the field rewrite_rules contains about 20,000 lines of code/rules, putting that single field in a text file is >1.2 Mb large. So it chokes on that.
Disabling plugins, re-importing the db file, renewing the wp files, nothing helps. Going back to 2.6.5 and then again upgrading reintroduces the problem.
Changing the permalink structure to something else, for example:
/%year%/%monthnum%/%day%/%postname%/
also solves the problem temporarily. As soon as I change back to
/%category%/%year%/%monthnum%/%day%/%postname%/
wordpress again tries to insert the huge query into the field rewrite_rules, which isn't possible.
Looking at the rewrite_rules it seems it repeats the same rules for every page and attachment, like
s:35:\"vacancies/473/attachment/([^/]+)/?$\";
s:32:\"index.php?attachment=$matches[1]\";
s:45:\"vacancies/473/attachment/([^/]+)/trackback/?$\";
s:37:\"index.php?attachment=$matches[1]&tb=1\";
s:65:\"vacancies/473/attachment/([^/]+)/feed/(feed|rdf|rss|rss2|atom)/?$\";
s:49:\"index.php?attachment=$matches[1]&feed=$matches[2]\";
s:60:\"vacancies/473/attachment/([^/]+)/(feed|rdf|rss|rss2|atom)/?$\";
s:49:\"index.php?attachment=$matches[1]&feed=$matches[2]\";
s:60:\"vacancies/473/attachment/([^/]+)/comment-page-([0-9]{1,})/?$\";
s:50:\"index.php?attachment=$matches[1]&cpage=$matches[2]\";
s:28:\"(vacancies/473)/trackback/?$\";
Interestingly:
1- when I do an export of the posts to the xml file, install a fresh wp and then re-import the posts, the problem goes away. In that case, the field rewrite_rules, suddenly "only" contains around 5000 rewrite_rules.
2 - all the rules seem to relate to (older) versions of the pages and attachments
So maybe the problem is that wordpress is writing rules for every page revision and every page attachment it has ever had. The website is not that big, but with a few hundred pages, a few hundred attachments it seems to go into the 20,000 rules with this specific permalink structure. That could also explain that when I do the import of xml in a fresh install, there are no page revisions anymore and the rewrite rules are not that many.
I cannot imagine it is something else, because besides the fact that that one query is too large to be run, everything is fine. Database tables are in good condition, files are correct, I tried it on different servers, etc so I can almost certainly rule out another causes. Besides the fact that wp cannot run the query to insert the rewrite_rules, everything works fine, front- and backend.
Attachments (2)
Change History (42)
#2
@
16 years ago
Maybe it's not for the revisions. Looking at the rules, it seems all the attachments are the problem. Each one has about 11 rules (22 lines), like:
s:35:\"vacancies/474/attachment/([^/]+)/?$\";
s:32:\"index.php?attachment=$matches[1]\";
s:45:\"vacancies/474/attachment/([^/]+)/trackback/?$\";
s:37:\"index.php?attachment=$matches[1]&tb=1\";
s:65:\"vacancies/474/attachment/([^/]+)/feed/(feed|rdf|rss|rss2|atom)/?$\";
s:49:\"index.php?attachment=$matches[1]&feed=$matches[2]\";
s:60:\"vacancies/474/attachment/([^/]+)/(feed|rdf|rss|rss2|atom)/?$\";
s:49:\"index.php?attachment=$matches[1]&feed=$matches[2]\";
s:60:\"vacancies/474/attachment/([^/]+)/comment-page-([0-9]{1,})/?$\";
s:50:\"index.php?attachment=$matches[1]&cpage=$matches[2]\";
s:28:\"(vacancies/474)/trackback/?$\";
s:37:\"index.php?attachment=$matches[1]&tb=1\";
s:48:\"(vacancies/474)/feed/(feed|rdf|rss|rss2|atom)/?$\";
s:49:\"index.php?attachment=$matches[1]&feed=$matches[2]\";
s:43:\"(vacancies/474)/(feed|rdf|rss|rss2|atom)/?$\";
s:49:\"index.php?attachment=$matches[1]&feed=$matches[2]\";
s:36:\"(vacancies/474)/page/?([0-9]{1,})/?$\";
s:50:\"index.php?attachment=$matches[1]&paged=$matches[2]\";
s:43:\"(vacancies/474)/comment-page-([0-9]{1,})/?$\";
s:50:\"index.php?attachment=$matches[1]&cpage=$matches[2]\";
s:28:\"(vacancies/474)(/[0-9]+)?/?$\";
s:49:\"index.php?attachment=$matches[1]&page=$matches[2]\";
That page (vacancies) currently has only 2 attachments, but apparently wordpress remembers all previous attachments (which are still in the media gallery) and writes rules for all of them.
So this is an expected behavior? Then does that mean that this will not scale to anything behind a few hundred pages or attachments?
#3
@
16 years ago
Ok so it looks like we could improve the situation by only build rules for published attachments.
I'll see if I can put something into trunk for you to test with.
In general by having %category% at the beginning of your permalink structure you are going to have a lot more rules because of the way it is matched.
#4
@
16 years ago
- Keywords needs-patch added; reporter-feedback removed
- Milestone set to 2.8
Looking into this a little more it's not easy I don't think at the moment to tell if a post_attachment is used or not :-(
The offending code in wp-includes/rewrite.php is:
$attachments = $wpdb->get_results( $wpdb->prepare( "SELECT ID, post_name, post_parent FROM $wpdb->posts WHERE post_type = 'attachment' AND post_parent = %d", $id ));
Which pulls all attachment into the mix - attachments have a post_status of inherit at the moment so it's not easy to modify that query to pick out the visible ones.
#5
@
16 years ago
But it should be able to query for the post status right? I don't know the intricacies of the code, but if you take the above query and take into account the field post_status == publish, it should be possible to get out only the currently active ones, I would think.
Maybe as simple as
WHERE post_type = 'attachment' AND post_parent = %d AND post_status = publish
#6
@
16 years ago
But it should be able to query for the post status right?
As westi said, -All- Attachments have the post_status of 'inherit' right now, So its impossible to separate visible from non-visible attachments.
#7
@
16 years ago
So then the conclusion would be that not all attachments should get the status of 'inherit'? It seems like the current situation doesn't scale at all. I wouldn't call a website with a few hundred posts/attachments big.
Or are there other solutions?
Why is it necessary to have rewrite rules for all media files? Wouldn't it make more sense to just :
- upload media to folders
- link to them
I thought the whole point of the htaccess rewrite rules in wordpress is "unless the file exists, channel the request through index".
#8
@
16 years ago
Having %category% or %postname% at the beginning of the permalink structure means either including explicit rewrite rules for all pages or doing extra queries when resolving permalinks to determine if a page is being requested.
#9
@
15 years ago
- Keywords dev-feedback added; permalinks rewrite_rules revisions removed
couldn't this:
$attachments = $wpdb->get_results( $wpdb->prepare( "SELECT ID, post_name, post_parent FROM $wpdb->posts WHERE post_type = 'attachment' AND post_parent = %d", $id ));
be replaced, by code that looks something this before the foreach loop:
$all_attachments = $wpdb->get_results("SELECT * FROM $wpdb->posts WHERE post_type = 'attachment'); update_post_caches($all_attachments);
and then in the foreach loop:
$attachments = get_children(...)
#10
@
15 years ago
the query could then be optimized with a join on posts as parents where parents.post_type = page and parents.post_status = published, too.
#11
@
15 years ago
- Milestone changed from 2.8 to Future Release
Moving to future, from lack of patch.
#12
@
15 years ago
A potential and temporary workaround is to insert some other string into the permalink structure. This is the permalink structure I'm using that seems to avoid the problems with this bug:
/view/%category%/%postname%/
And for the category base, use "list". Seems to make fairly human-understandable URLs, too.
#13
@
15 years ago
that work work for very long, though. there is a separate ticket that highlights that as an issue, the patch is in the commit candidates.
#14
@
15 years ago
- Cc robinmarshall added
What about just getting the id's of pages with attachments then checking if the page id is in the list before running the query?
e.g.
$all_attachments = $wpdb->get_results("SELECT post_parent FROM $wpdb->posts WHERE post_type = 'attachment'"); foreach ($posts as $id => $post) { if (!in_array($id,$all_attachments)) continue;
#15
@
15 years ago
might work too. the keyword, at the end of the day, is needs-patch. it won't get fixed unless someone gives it some love.
#19
@
15 years ago
- Keywords needs-patch added; has-patch needs-testing dev-feedback removed
indeed it should. but the patch is clunky. there is a faster way to get this all in one go.
@
15 years ago
do the same thing in two queries, instead of two queries per page and one per attachment plus two
#20
@
15 years ago
- Keywords has-patch needs-testing added; needs-patch removed
@robin: please give the new patch a try
#22
follow-up:
↓ 23
@
15 years ago
To test this patches I used the result of "SELECT LENGTH(option_value) FROM wp_options
WHERE option_name LIKE 'rewrite%'" to benchmark. Along with a walk through the website to notice differences. The installation in which tests are performed has 200 posts, 100 pages, 35 categories, and started out in 2008.
The original situation was the custom permalink setting "/%category%/%year%/%postname%", the result of the query was +/- 1.400.000 chars. After applying patch 8958.diff the result was +/- 1.400.000 chars. After applying patch reduce_attach_query.8958.diff the result was +/- 145.000 chars. A significant difference.
Following the hint of justinph (see above) to use a fixed prefix ("/list/%category%/%year%/%postname%") the result in the original (unpatched) situation is +/- 8.000 chars. An even more dramatic improvement than the results achieved by the fixes, but at a cost: (slightly) different URLs. An equal result was achieved by moving a less variable parameter to the front of the setting: "/%year%/%category%/%postname%"): +/- 7.500 chars.
#23
in reply to:
↑ 22
@
15 years ago
- Component changed from Permalinks to Optimization
- Keywords tested added; needs-testing removed
- Owner changed from ryan to Denis-de-Bernardy
- Status changed from new to accepted
Replying to onezero:
The original situation was the custom permalink setting "/%category%/%year%/%postname%", the result of the query was +/- 1.400.000 chars. After applying patch 8958.diff the result was +/- 1.400.000 chars. After applying patch reduce_attach_query.8958.diff the result was +/- 145.000 chars. A significant difference.
Yes, but you're missing the point. If anything, you've just highlighted that 8958.diff has no bugs, as opposed to reduce_attach_query.8958.diff. The question here is not reduce the amount of rewrite rules, but rather to make sure that they're generated more quickly.
Re the number of rules, it cannot be trimmed because you're using a permalink structure that requires verbose page rewrite rules.
Following the hint of justinph (see above) to use a fixed prefix ("/list/%category%/%year%/%postname%") the result in the original (unpatched) situation is +/- 8.000 chars. An even more dramatic improvement than the results achieved by the fixes, but at a cost: (slightly) different URLs. An equal result was achieved by moving a less variable parameter to the front of the setting: "/%year%/%category%/%postname%"): +/- 7.500 chars.
FYI, this prefix trick should no longer work in 2.8 beta-2. And as you highlight, the rule that starts with /%year%/ is the best -- precisely because it's no longer using verbose page rewrite rules.
Anyway, marking 8958.diff as tested, since it significantly reduces the number of queries needed to generate those rules.
#27
@
15 years ago
- Summary changed from Huge amount rewrite rules for page revisions after upgrade to Optimize rewrite rule generation
- Type changed from defect (bug) to enhancement
#29
in reply to:
↑ 28
@
15 years ago
Replying to ryan:
So what can one expect when upgrading to 2.9? Will the 8959.diff patch still work? Anyone try this out yet?
#30
@
15 years ago
- Keywords health-check added; has-patch tested removed
- Milestone changed from Future Release to 3.0
#32
@
15 years ago
- Keywords needs-patch added; has-patch tested health-check removed
Patch does not apply clean any longer against head in trunk.
#34
follow-up:
↓ 35
@
15 years ago
is health-check for those tickets where the patches do not apply clean any longer?
#35
in reply to:
↑ 34
@
15 years ago
Replying to hakre:
is health-check for those tickets where the patches do not apply clean any longer?
No, that's what needs-patch is for :-P
health-check is a keyword that westi has been assigning to tickets relating to a health check of sorts for a WP install and the server it is running on. I imagine this will begin to pick up steam as a big new feature in 3.0. See also #11466 and a few others (I guess you could search for health-check, kind of the idea).
#36
@
15 years ago
@hakre: it's mostly a marker so that we remember what to highlight to users in the health check plugin. in this case, the check would tell users that verbose rewrite rules are very bad.
#37
@
15 years ago
- Milestone changed from 3.0 to Future Release
This item does not appear to be beta ready.
#38
@
14 years ago
- Cc mikeschinkel@… added
Consider resolving this issue using this approach instead #12935.
#39
@
13 years ago
- Keywords needs-patch health-check removed
- Resolution set to duplicate
- Status changed from assigned to closed
This is being addressed in #16687 (see Otto's comment).
You have to have verbose rules because you have %category% at the beginning of your permalink structure. See #3614)
In my local install with %category% at the start of my permalinks I get verbose rules but not ones for all the revisions.
Could you supply the rules you see generated for a revision?