Make WordPress Core

Opened 13 years ago

Last modified 5 years ago

#20109 new defect (bug)

Valid htaccess rule causes 404 after upgrade to 3.3.1

Reported by: ronnieg Owned by:
Milestone: Priority: normal
Severity: normal Version: 3.3.1
Component: Rewrite Rules Keywords: needs-patch needs-testing
Focuses: Cc:

Description

On 2/18/2012, I upgraded WP from 3.2.1 to 3.3.1 on www.denverhomevalue.com. I later noticed that Google Webmaster Tools started reporting 404 errors on lots of pages that were fine before. WP was still returning the proper pages and content, not my custom 404 page, so the issue was not readily apparent except in WMT reports. Running HTTP header response checkers confirmed 404 responses, despite good content being returned.
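The kind of header check described above can be reproduced with a short script. This is only a minimal sketch using Python's standard library, not the tool actually used for this report. The function names and the `expected_text` marker are assumptions introduced for illustration: the marker is any string known to appear in the real page but not in the site's 404 template.

```python
import urllib.error
import urllib.request


def fetch_status_and_body(url, timeout=10.0):
    """Return (status_code, body_text) for a GET request, including 4xx/5xx responses."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status, response.read().decode("utf-8", errors="replace")
    except urllib.error.HTTPError as error:
        # urllib raises on 4xx/5xx, but the exception still carries the status and the body.
        return error.code, error.read().decode("utf-8", errors="replace")


def is_content_served_with_404(status, body, expected_text):
    """True when the response says 404 but the body contains the real page content.

    expected_text is a hypothetical marker chosen by the person running the check.
    """
    return status == 404 and expected_text in body
```

Calling `fetch_status_and_body()` on an affected URL and passing the result to `is_content_served_with_404()` flags the mismatch that a browser alone hides, since the browser renders the body regardless of the status line.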

By deleting sections of the .htaccess file until the problem went away, I isolated the issue to two rules meant to bypass the subsequent rewrite rules for certain URLs:

RewriteRule ^city(.*) - [L]
RewriteRule ^areas(.*) - [L]

This is the same family of URLs that started returning 404s after the 3.3.1 upgrade. The rules had been in place since 12/5/2011 under WP 3.2.1 and never caused a problem.

I was able to revert a backup of the site to 3.2.1, on the same exact server environment, and confirm that these same valid htaccess rewrite rules were not a problem in that release.

Change History (6)

#1 follow-up: @duck_
13 years ago

  • Keywords reporter-feedback added

If rewrite rules are going to stop being processed like that, then the WordPress rewrite rules (which rewrite to index.php) will not be reached. This will obviously give a 404 unless the requested file exists on disk.
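To illustrate why an earlier `[L]` rule keeps the request from reaching index.php, here is a deliberately simplified toy model of a single pass over the rules. It is a sketch under stated assumptions, not Apache's actual algorithm: the `Rule` type, the `rewrite` function and the `existing` set are all hypothetical names, the pair of `!-f` / `!-d` conditions is collapsed into one flag, and the re-processing that real per-directory rewrites perform after a substitution is ignored.

```python
import re
from typing import NamedTuple


class Rule(NamedTuple):
    pattern: str                   # regex matched against the path without a leading slash
    substitution: str              # "-" means "leave the path unchanged"
    last: bool                     # models the [L] flag
    only_if_missing: bool = False  # models the !-f / !-d conditions as one flag


def rewrite(path, rules, existing=frozenset()):
    """Run one simplified pass over the rules and return the resulting path."""
    for rule in rules:
        if not re.search(rule.pattern, path):
            continue
        if rule.only_if_missing and path in existing:
            continue  # conditions failed: the file exists, so this rule is skipped
        if rule.substitution != "-":
            path = rule.substitution.lstrip("/")
        if rule.last:
            break  # [L]: no later rule is considered in this pass
    return path


WORDPRESS_RULES = [
    Rule(r"^index\.php$", "-", last=True),
    Rule(r".", "/index.php", last=True, only_if_missing=True),
]
BYPASS_RULE = Rule(r"^areas(.*)", "-", last=True)

# Without the bypass rule the request is handed to WordPress.
print(rewrite("areas/denver-central-communities.html", WORDPRESS_RULES))
# prints "index.php"

# With the bypass rule first, the path is left alone and index.php is never reached.
print(rewrite("areas/denver-central-communities.html", [BYPASS_RULE] + WORDPRESS_RULES))
# prints "areas/denver-central-communities.html"
```

In this model the second request is left pointing at a path that would have to exist on disk, which is the situation the 404 expectation above is based on. The model does not explain why content is still rendered in the reported case.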

I cannot use those rules on trunk or 3.2.1 without getting a 404 with *no* content returned. Either I'm missing something or there are other factors at work. Could we see those rewrite rules in context (i.e. are they before the WordPress rules)?

#2 in reply to: ↑ 1 @ronnieg
13 years ago

  • Keywords reporter-feedback removed

Replying to duck_:

The rewrite bypass rules above were in the middle of .htaccess, before the WP index.php rules. Because all of the rules in the file referred to pages, not posts, the rules were all working, and are still working, converting incoming old URLs to new page URLs prior to hitting the WP index.php rules.

The behavior I was seeing was really bizarre, because the good and intended content is definitely being rendered, not an empty page and not the standard or custom 404 page. For the purpose of demonstrating this, I am putting just the areas rule above back into the .htaccess file of my 3.3.1 development environment, with no other url specific rules. So to see this behavior in action, go to: http://dev1.denverhomevalue.com/areas/denver-central-communities.html.

You will see the full and correct content for that page, but with a 404 header response. You should see the same kind of behavior on all pages under the "Metro Areas" menu category, but normal 200 responses to everything else in the other menu categories.

Really, regardless of which earlier releases this affects (my own experience and testing show it was not an issue in the production/stable 3.2.1 I was on), 3.3 renders the proper content while improperly returning a 404 header. That, to me, is the root issue, regardless of whether the WP index.php rewrite rules are going to be hit or not. Essentially, if content is successfully rendered, as my dev1 site proves is happening, then the response should be 200. Only if content cannot be successfully rendered should the response be 404, and that should also trigger the standard or custom 404 action for the site.

The complete .htaccess currently on the above dev site follows:

RewriteEngine On
RewriteBase /
<IfModule mod_rewrite.c>
########## Begin - Rewrite rules to block out some common exploits
## This attempts to block the most common type of exploit `attempts` 
#
# Block out any script trying to set a mosConfig value through the URL
RewriteCond %{QUERY_STRING} mosConfig_[a-zA-Z_]{1,21}(=|\%3D) [OR]
# Block out any script trying to base64_encode crap to send via URL
RewriteCond %{QUERY_STRING} base64_encode.*\(.*\) [OR]
# Block out any script that includes a <script> tag in URL
RewriteCond %{QUERY_STRING} (\<|%3C).*script.*(\>|%3E) [NC,OR]
# Block out any script trying to set a PHP GLOBALS variable via URL
RewriteCond %{QUERY_STRING} GLOBALS(=|\[|\%[0-9A-Z]{0,2}) [OR]
# Block out any script trying to modify a _REQUEST variable via URL
RewriteCond %{QUERY_STRING} _REQUEST(=|\[|\%[0-9A-Z]{0,2})
# Send all blocked request to homepage with 403 Forbidden error!
RewriteRule ^(.*)$ index.php [F,L]
RewriteRule ^areas(.*) - [L]
# BEGIN WordPress
RewriteEngine On
RewriteBase /
RewriteRule ^index\.php$ - [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]
</IfModule>

#3 @chriscct7
10 years ago

  • Keywords needs-patch reporter-feedback added
  • Severity changed from critical to normal

Like @duck_, I can't reproduce this. @ronnieg, can you still reproduce it?
Given that we haven't had another report of this in 3 years, I'm downgrading to normal severity.

#4 @ronnieg
10 years ago

Yes, I can still reproduce the issue. I just restored those same htaccess rules, immediately before the WP rules just as before and as shown below. I then re-tried the same url as in the original report, and same result. Header returned is 404 per Firebug net trace below, yet proper page results are returned. Site is currently on WP 4.0.

The original purpose and need for those rewrite rules is no longer applicable, at least on my site, so normal is probably proper priority.

However, the issue remains: If a 404 header is being returned, and Firebug can see it, why can't WP? And why is a 404 page not being returned? And at the root of this issue: Why and where is a 404 header being generated at all when the page actually exists and is being rendered? I think part of the reason it hasn't been reported again is that probably very few WP site owners would be using and monitoring Webmaster tools like I do, so it could be happening but would never be noticed since the correct page is still being rendered so no red flags.

.htaccess:

RewriteRule ^city(.*) - [L]
RewriteRule ^areas(.*) - [L]
##########  End 301 Rewrite Rules for DenverHomeValue.com  ##########
# BEGIN WordPress
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /
RewriteRule ^index\.php$ - [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]
</IfModule>

# END WordPress
###############################################

Firebug says:

http://dev1.denverhomevalue.com/wp-content/uploads/404_issue.jpg

Resulting page displayed:

http://dev1.denverhomevalue.com/wp-content/uploads/404_result.jpg

#5 @ronnieg
10 years ago

  • Keywords reporter-feedback removed

#6 @chriscct7
9 years ago

  • Keywords needs-testing added