Opened 4 years ago

Closed 8 months ago

Last modified 8 months ago

#11884 closed enhancement (wontfix)

mod_rewrite optimization

Reported by: Denis-de-Bernardy
Owned by:
Milestone:
Priority: normal
Severity: normal
Version: 3.0
Component: Optimization
Keywords: close
Focuses:
Cc:

Description (last modified by Denis-de-Bernardy)

Slightly edited version of the one suggested in:

http://wordpress.org/extend/ideas/topic.php?id=3524

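# BEGIN WordPress
RewriteEngine on
RewriteBase /

# Static-looking extensions: stop here, without any filesystem check
RewriteRule \.(gif|jpe?g|png|css|js|ico)$ - [NC,L]

# Real files and directories are served as-is
RewriteCond %{REQUEST_FILENAME} -f [OR]
RewriteCond %{REQUEST_FILENAME} -d
RewriteRule ^ - [L]

# Everything else goes to WordPress
RewriteRule . /index.php [L]

# END WordPress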

Attachments (1)

wordpress.16670.patch (1.3 KB) - added by g1smd 3 years ago.
More efficient default mod_rewrite rules


Change History (22)

comment:1 Denis-de-Bernardy 4 years ago

  • Description modified (diff)

comment:2 Denis-de-Bernardy 4 years ago

I've been testing this on my own site since this morning:

RewriteRule \.(gif|png|jpe?g|ico)$ - [NC,L]

RewriteCond %{REQUEST_FILENAME} -f [OR]
RewriteCond %{REQUEST_FILENAME} -d
RewriteRule ^ - [L]

RewriteRule . /index.php [L]

Performance is indeed slightly improved.

comment:4 follow-up: sivel 4 years ago

-1 from me. I can see a situation where someone may want to actually send content that looks like a static file using wp_rewrite

comment:5 in reply to: ↑ 4 nacin 4 years ago

  • Resolution set to wontfix
  • Status changed from new to closed

Replying to sivel:

-1 from me. I can see a situation where someone may want to actually send content that looks like a static file using wp_rewrite

I agree. The htaccess rules should be as light as possible. If you want to configure this on your own then you should do so. I have plenty of specific htaccess rules but none are good for core. I do like #11845 so maybe that can be a start.

comment:6 nacin 4 years ago

  • Milestone 3.0 deleted

comment:7 scribu 3 years ago

  • Cc scribu added

g1smd 3 years ago

More efficient default mod_rewrite rules

comment:8 g1smd 3 years ago

  • Resolution wontfix deleted
  • Status changed from closed to reopened

I can see a situation where someone may want to actually send content that looks like a static file using wp_rewrite.

Would that content be an HTML page or would it be an image?

If HTML then the proposed patch still allows for that to happen.


comment:9 sivel 3 years ago

  • Keywords close 2nd-opinion added

I am of the mindset that adding this by default adds restrictions that could potentially cause issues. I have no idea how many plugins in the wild it would break, but I have 3 applications (built using themes and plugins) that use real file extensions that would be delivered via WP_Rewrite. These cover images, documents, js and css.

The decision previously was that the rules should remain as simple as possible to not create adverse restrictions that would hamper the capabilities of WP_Rewrite.

This functionality can easily be wrapped into a plugin that utilizes add_external_rule(). I vote to close as wontfix and suggest that this be implemented via a plugin.

comment:10 follow-up: g1smd 3 years ago

What is the mechanism for a URL request serviced by "WP_rewrite"? You're saying those are URL requests that are rewritten to the index.php script and then handled from there?

Just to be clear, you are saying that a request for /someimage.jpg (for example) is rewritten to the index.php script and the index.php script sends an image back?

comment:11 in reply to: ↑ 10 ; follow-up: sivel 3 years ago

Replying to g1smd:

What is the mechanism for a URL request serviced by "WP_rewrite"? You're saying those are URL requests that are rewritten to the index.php script and then handled from there?

Just to be clear, you are saying that a request for /someimage.jpg (for example) is rewritten to the index.php script and the index.php script sends an image back?

In the only public example I have to show currently, http://paste.sivel.net/embed/24.js is not a real file. It is generated similar to the way a post is. The request comes in, the file doesn't exist so it is sent to index.php and processed. WP_Rewrite has a regex pattern looking for this type of URL, and in the end gets the request to a custom template file within my theme, based off of internal query vars the same way that a post is rendered. In this case I send the correct HTTP headers for a JS file, and output some JS.

I have other applications built off of WP which are not publicly accessible which generate documents, css and images in this manner.

So basically yes, requests to non-existent files such as image files, PDFs, JS are rewritten to index.php via the current .htaccess, which then processes the request using custom rewrite rules via WP_Rewrite, and delivers back the expected type of content.

Adding rewrite rules to exclude images, js, css, etc would not permit this to work.
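One way the two positions could coexist is sketched below. This is not something proposed in the ticket: it assumes the virtual .js files live only under /embed/ (as in the paste.sivel.net URL above), and a site with virtual files elsewhere would need its own exceptions.

```apache
RewriteEngine On
RewriteBase /

# Assumption: virtual files are generated only under /embed/.
# Requests outside that path with a static-looking extension stop
# here, with no filesystem check and no trip through index.php.
RewriteCond %{REQUEST_URI} !^/embed/
RewriteRule \.(gif|jpe?g|png|css|js|ico)$ - [NC,L]

# Everything else, including /embed/24.js, takes the usual path:
# real files and directories are served as-is...
RewriteCond %{REQUEST_FILENAME} -f [OR]
RewriteCond %{REQUEST_FILENAME} -d
RewriteRule ^ - [L]

# ...and the rest is handed to index.php, where WP_Rewrite takes over.
RewriteRule . /index.php [L]
```

The trade-off is the one debated here: every site that serves generated files has to know about, and maintain, such an exception.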

comment:12 g1smd 3 years ago

Hang on, does WP_rewrite add customised mod_rewrite rules to the .htaccess file or does it sit behind index.php examining the parts of the URL that was requested? I am not 100% sure what all these different modules do.

" and in the end gets the request to a custom template file "

If index.php is handling the apache request, I assume that the custom template is "included" into the index.php file in some way?

Or does Apache use a mod_rewrite rewrite to send the external request directly to that template file?

"processes the request using custom rewrite rules via WP_Rewrite"

OK. Which one is it? Is it "request is handled by index.php" or is it "custom rewrite rules invoke some other file to handle the request"?

These "custom rewrite rules via WP_Rewrite", they are rules in the .htaccess file are they, or are they some other sort of functionality?

If these are "custom rules in the .htaccess file", it is far too late for these to be invoked once index.php is handling the request.

I hope you can clarify this. Sorry, the language and terminology are not straightforward here.

comment:13 scribu 3 years ago

There are two sets of rules: .htaccess rules and WP_Rewrite rules.

The .htaccess rules merely direct all non-file requests to index.php.

The WP_Rewrite rules are built internally by WP and are used to figure out what to display: a list of posts, a single page etc.

See Otto's article on WP_Rewrite: http://ottopress.com/2010/category-in-permalinks-considered-harmful/

comment:14 in reply to: ↑ 11 jvleis 3 years ago

Replying to sivel:

In the only public example I have to show currently, http://paste.sivel.net/embed/24.js is not a real file. It is generated similar to the way a post is. The request comes in, the file doesn't exist so it is sent to index.php and processed. WP_Rewrite has a regex pattern looking for this type of URL, and in the end gets the request to a custom template file within my theme, based off of internal query vars the same way that a post is rendered. In this case I send the correct HTTP headers for a JS file, and output some JS.

I have other applications built off of WP which are not publicly accessible which generate documents, css and images in this manner.

So basically yes, requests to non-existent files such as image files, PDFs, JS are rewritten to index.php via the current .htaccess, which then processes the request using custom rewrite rules via WP_Rewrite, and delivers back the expected type of content.

Adding rewrite rules to exclude images, js, css, etc would not permit this to work.

This argument seems to be the one that prevents more optimization of htaccess. If one of the proposed modifications significantly improves page load times and decreases server load, then I ask you to consider this logic:

  1. 90%+ of WP users are likely not aware of htaccess, much less how to regenerate .js or .css files depending on conditional statements.
  2. Those users who do understand how to do that are highly skilled in htaccess.
  3. Therefore Sivel is arguing that millions of users should run slower WP installations because a small minority of programmers (a very small minority, I suppose) do not wish to change an htaccess file they are already changing to make their modified systems work.

I would ask the participants of these tickets to consider the foundational customer. If WP can work out of the box at significantly less cost to hardware and reduced load times, then that should be the default. It must be the responsibility of sophisticated programmers to alter that baseline scenario, not the other way around.

comment:15 follow-up: sivel 3 years ago

The real concern here is that WordPress should be as unrestrictive as possible, and adding these rules defeats that.

In addition, if we begin adding excludes here, to what end do we do so? There are hundreds of thousands of file types that people are going to want excluded. I personally have the following list that I use for caching purposes that is pulled from the allowed list of mime types that WP uses for uploads plus some extras and minus a few:

jpg|jpeg|jpe|gif|png|bmp|tif|tiff|ico|asf|asx|wax|wmv|wmx|avi|divx|flv|mov|qt|mpeg|mpg|mpe|txt|asc|c|cc|h|csv|tsv|rtx|css|mp3|m4a|m4b|mp4|m4v|ra|ram|wav|ogg|oga|ogv|mid|midi|wma|mka|mkv|rtf|js|pdf|doc|docx|pot|pps|ppt|pptx|ppam|pptm|sldm|ppsm|potm|wri|xla|xls|xlsx|xlt|xlw|xlam|xlsb|xlsm|xltm|mdb|mpp|docm|dotm|pptx|sldx|ppsx|potx|xlsx|xltx|docx|dotx|onetoc|onetoc2|onetmp|onepkg|swf|class|tar|zip|gz|gzip|exe|odt|odp|ods|odg|odc|odb|odf|wp|wpd|diff|patch|sh|conf|xsl|bz2|dv

That totals 110 file types.

Then we start telling people, no, we won't add that file extension; we then make it easier to add them via a filter, at which point removing those lines via a plugin using the mod_rewrite_rules filter can become more problematic.

As another data point, afaik Drupal still uses basically the same rules we do.

Working at a hosting company, one that hosts quite a lot of WP sites, I have never seen these rewrites cause issues. For those people who are concerned with performance in this aspect, you wouldn't want to have .htaccess files enabled anyway, and would be placing this in your vhost configuration, in which case you can do whatever you want since WordPress isn't managing your mod_rewrite rules.
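For the vhost approach mentioned above, a minimal sketch might look like the following. The server name and paths are placeholders, access-control directives are left out, and the rewrite block is just the conventional WordPress rule set moved into a <Directory> section; none of it comes from the ticket itself.

```apache
<VirtualHost *:80>
    # Placeholder host name and document root.
    ServerName example.com
    DocumentRoot /var/www/example

    <Directory /var/www/example>
        # Stop Apache from looking for .htaccess files on every request.
        AllowOverride None

        # mod_rewrite in a per-directory context needs FollowSymLinks.
        Options FollowSymLinks

        RewriteEngine On
        RewriteBase /
        RewriteRule ^index\.php$ - [L]
        RewriteCond %{REQUEST_FILENAME} !-f
        RewriteCond %{REQUEST_FILENAME} !-d
        RewriteRule . /index.php [L]
    </Directory>
</VirtualHost>
```

With AllowOverride None, WordPress can no longer update the rules by writing to .htaccess, so any extension exclusions are added here by hand.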

I still think it is better to implement the additional rules via a plugin, and leave the mod_rewrite rules light.

comment:16 in reply to: ↑ 15 jvleis 3 years ago

I believe I now understand the reason for our different perspectives.
Replying to sivel:

The real concern here is that WordPress should be as unrestrictive as possible, and adding these rules defeats that.

Sorry, but I do not see how.

In addition, if we begin adding excludes here, to what end do we do so? There are hundreds of thousands of file types that people are going to want excluded. I personally have the following list that I use for caching purposes that is pulled from the allowed list of mime types that WP uses for uploads plus some extras and minus a few:

jpg|jpeg|jpe|gif|png|bmp|tif|tiff|ico|asf|asx|wax|wmv|wmx|avi|divx|flv|mov|qt|mpeg|mpg|mpe|txt|asc|c|cc|h|csv|tsv|rtx|css|mp3|m4a|m4b|mp4|m4v|ra|ram|wav|ogg|oga|ogv|mid|midi|wma|mka|mkv|rtf|js|pdf|doc|docx|pot|pps|ppt|pptx|ppam|pptm|sldm|ppsm|potm|wri|xla|xls|xlsx|xlt|xlw|xlam|xlsb|xlsm|xltm|mdb|mpp|docm|dotm|pptx|sldx|ppsx|potx|xlsx|xltx|docx|dotx|onetoc|onetoc2|onetmp|onepkg|swf|class|tar|zip|gz|gzip|exe|odt|odp|ods|odg|odc|odb|odf|wp|wpd|diff|patch|sh|conf|xsl|bz2|dv

That totals 110 file types.

jpg|jpe?g|gif|png|css|js
What percentage of files would you estimate I have just described? Would it be 3 standard deviations? The rest don't matter, and their sheer number yields an increasingly lower return on web page load times, which defeats the goal.

Then we start telling people, no, we won't add that file extension; we then make it easier to add them via a filter, at which point removing those lines via a plugin using the mod_rewrite_rules filter can become more problematic.

See the earlier point. You would be quibbling over nothing since you are already covering 99% of all occurrences with 6 files. You would only change the default setting when adding a file extension is required to remain above 3 standard deviations.

But I suspect there is another issue here. You appear to be saying that the more 'dense' code would increase programming requests. This is a common error by programmers: measuring code utility by the requests it inspires (how much work it makes for the programmer) as opposed to the utility to the customer.

Let us look at that utility. Let's say there are 4M self-hosted WP sites. If the code reduces hosting loads by 20%, that is a conservative saving of $8-12M / month (not to mention the unquantified benefit of speed=fun on the Internet). Notice that the saving is divided between the hosting firm and their customers. IOW, lower loads means higher capacity for servers. Customers only enjoy the savings when it delays upgrades.
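The savings figure above implies a per-site hosting cost that the comment does not state. Working backwards, and assuming (as the comment appears to) that cost falls in proportion to load, with c an illustrative symbol for the average monthly hosting cost per site:

```latex
\[
  \underbrace{4 \times 10^{6}}_{\text{sites}} \times c \times 0.20
  = \$8\text{M to } \$12\text{M per month}
  \quad\Longrightarrow\quad
  c = \frac{\$8 \times 10^{6} \text{ to } \$12 \times 10^{6}}{0.8 \times 10^{6}}
    = \$10 \text{ to } \$15 \text{ per site per month}
\]
```

So the estimate holds only if the average site costs roughly $10 to $15 a month to host and the full 20% load reduction translates directly into cost.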

As another data point, afaik Drupal still uses basically the same rules we do.

Joomla and Drupal are considering it. But ultimately, why is that pertinent?

Working at a hosting company, one that hosts quite a lot of WP sites, I have never seen these rewrites cause issues. For those people who are concerned with performance in this aspect, you wouldn't want to have .htaccess files enabled anyway, and would be placing this in your vhost configuration, in which case you can do whatever you want since WordPress isn't managing your mod_rewrite rules.

As in your previous post, I believe you are arguing that millions of WP users uninterested in learning code are free to spend their precious time getting to know htaccess rules. Since the slower, 'simpler' code works, we should keep it. Do I have that right? If so, I urge you to consider whether you mean it.

WP is great open source software. I have no idea how you folks figure out what to program. I just wanted you to be aware that there are enormous dollar values being spent by real people when you choose to save programming code.

Thank you for your time. The decisions, as always, are up to the WP team. Good luck.

comment:17 g1smd 3 years ago

Avoiding the two slow and inefficient disk reads (for requests that are generally not going to be rewritten) sees a measurable increase in server efficiency.

Although there may be "110 file types", applying this thinking purely to the requests for images already sees improved handling for 90% or more of non-HTML server requests.

Mod_rewrite offers a lot of powerful features, written to be highly efficient, but most Open Source developers ignore all of this and simply code

# If the requested path and file doesn't directly match a physical file

RewriteCond %{REQUEST_FILENAME} !-f

# and the requested path doesn't directly match a physical folder

RewriteCond %{REQUEST_FILENAME} !-d

# internally rewrite the request to the index.php script

RewriteRule .* index.php [L]

which is absolutely the worst way of doing it.

With this method you then have to write PHP code to "understand" what "type of" request was made and call the right module to deal with that request.

This functionality could have been far more efficiently coded using a compact RegEx and rewrite function in .htaccess. However, this can only be done if there is a cohesive plan for how the URL space used by a site is divided up: and therefore a simple way for requests for different types of content to each specifically match a single RegEx.
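As a rough illustration of that idea, the sketch below assumes a site whose URL space is planned so that each content type has its own prefix. The URL layout and the handler script names (archive.php, post.php) are hypothetical; WordPress does not ship such files, and this is not how its routing works today.

```apache
RewriteEngine On
RewriteBase /

# Hypothetical URL plan: date archives always live under /archive/YYYY/MM/.
# A single pattern identifies the content type, so no PHP code has to
# inspect the URL to work out what kind of request it is.
RewriteRule ^archive/([0-9]{4})/([0-9]{2})/?$ /archive.php?year=$1&month=$2 [QSA,L]

# Hypothetical URL plan: single posts always live under /post/<slug>/.
RewriteRule ^post/([a-z0-9-]+)/?$ /post.php?slug=$1 [QSA,L]
```

Neither rule needs a -f or -d check, because the patterns can only match URLs that the plan reserves for generated content.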

It seems that database coding comes first, and "how this impacts the URL space of the site" comes last, if indeed it is even addressed at all.


comment:18 ocean90 3 years ago

  • Milestone set to Awaiting Review

comment:19 c3mdigital 8 months ago

  • Keywords 2nd-opinion removed
  • Resolution set to wontfix
  • Status changed from reopened to closed

Suggesting close unless anyone else thinks there should be further discussion on this.

comment:20 alex-ye 8 months ago

  • Cc nashwan.doaqan@… added

comment:21 nacin 8 months ago

  • Milestone Awaiting Review deleted