Opened 14 months ago
Last modified 11 months ago
#59608 new defect (bug)
[bug] insert_with_markers with a UTF-8 translated marker causing .htaccess file broken with garbled characters
Reported by: | sammyhk | Owned by: | |
---|---|---|---|
Milestone: | Awaiting Review | Priority: | normal |
Severity: | normal | Version: | 6.3.2 |
Component: | I18N | Keywords: | |
Focuses: | Cc: |
Description
We encountered an issue when the site language is set to Chinese (Hong Kong). The string with the translator comment "translators: 1: Marker." has a translation (https://translate.wordpress.org/projects/wp/dev/admin/zh-hk/default/?filters%5Boriginal_id%5D=8431526&filters%5Bstatus%5D=either&filters%5Btranslation_id%5D=91677098) that contains multi-byte UTF-8 characters. When insert_with_markers(...) is called to write values to the .htaccess file, the file ends up containing garbled characters, so Apache cannot parse it and returns HTTP 500.
Sample of the broken .htaccess:
# BEGIN WP Cloudflare Super Page Cache
# 在含有 BEGIN WP Cloudflare Super Page Cache 及 END WP Cloudflare Super Page Cache 標記的這� �行間的指示詞� �容為動� �產生,
# 且應� 有 WordPress 篩選器能進行修改。對這� �行間任何指示詞� �容的變更,
# 都會遭到系統覆寫。
<IfModule mod_expires.c>
...
The expected output should be:
# BEGIN WP Cloudflare Super Page Cache
# 在含有 BEGIN WP Cloudflare Super Page Cache 及 END WP Cloudflare Super Page Cache 標記的這兩行間的指示詞內容為動態產生,
# 且應僅有 WordPress 篩選器能進行修改。對這兩行間任何指示詞內容的變更,
# 都會遭到系統覆寫。
<IfModule mod_expires.c>
...
Ref: https://build.trac.wordpress.org/browser/trunk/wp-admin/includes/misc.php?marks=141#L140
Change History (2)
#2
@
11 months ago
I am also having this issue with version 6.4.2 in Chinese (Taiwan). The problem is strange in that only some characters cause it. To narrow it down, I tried manually inserting individual Chinese characters into the misc.php code.
Changing this code:
$instructions = sprintf( __( 'The directives (lines) between "BEGIN %1$s" and "END %1$s" are dynamically generated, and should only be modified via WordPress filters. Any changes to the directives between these markers will be overwritten.' ), $marker );
to this breaks the .htaccess file:
$instructions = '# 兩';
but changing it to this does not:
$instructions = '# 我';
My php.ini has default_charset = "UTF-8"; input_encoding and output_encoding are both unset.
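One way to compare the characters that break the file with the one that does not is to dump their UTF-8 bytes. The following is only a diagnostic sketch: utf8_hex is a hypothetical helper, the character list is taken from the garbled positions in the sample in the description, and the script file itself is assumed to be saved as UTF-8.

```php
<?php
// Hypothetical helper: return the UTF-8 bytes of a string as spaced hex pairs.
function utf8_hex(string $text): string
{
    return implode(' ', str_split(bin2hex($text), 2));
}

// Characters that were garbled in the sample .htaccess, plus one that survived.
$garbled  = ['兩', '內', '態', '僅'];
$survived = ['我'];

foreach (array_merge($garbled, $survived) as $char) {
    $hex = utf8_hex($char);
    // Flag whether byte 0x85 occurs anywhere in the encoded character.
    $has85 = in_array('85', explode(' ', $hex), true) ? 'contains 0x85' : 'no 0x85';
    echo $char, ' => ', $hex, ' (', $has85, ")\n";
}
```

For these five characters, each garbled one contains the byte 0x85 (for example 兩 is e5 85 a9), while 我 (e6 88 91) does not. That is an observation from a small sample, not a confirmed cause.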
As another test case, I manually inserted # 兩 somewhere else in the .htaccess file. The problem also appears there after insert_with_markers is executed. From reading the code, it appears to read and rewrite the entire file, which suggests to me that the corruption happens when the strings are written back into .htaccess.
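To check whether a plain read-and-rewrite cycle alone alters the bytes in a given environment, a round trip on a throwaway file can be compared byte for byte. This is a minimal sketch that only mimics the general pattern (read lines, strip line endings, join, write back); it is not the core function, and the file contents are made up for the test.

```php
<?php
// Sketch of a read-then-rewrite cycle on a throwaway file (not the core code).
$path = tempnam(sys_get_temp_dir(), 'htaccess_');
$original = "# 兩\n# 我\nOptions -Indexes\n";
file_put_contents($path, $original);

// Read line by line, stripping only CR/LF from the end of each line.
$lines = [];
$fp = fopen($path, 'r');
while (($line = fgets($fp)) !== false) {
    $lines[] = rtrim($line, "\r\n");
}
fclose($fp);

// Write the lines back, joined with LF.
$rewritten = implode("\n", $lines) . "\n";
file_put_contents($path, $rewritten);

// Compare the bytes before and after the round trip.
$roundTripOk = (bin2hex($original) === bin2hex(file_get_contents($path)));
echo $roundTripOk ? "bytes unchanged\n" : "bytes changed\n";
unlink($path);
```

Running the same script both on the command line and through the web server would show whether the difference lies in this basic cycle or somewhere else in the request.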
Running mb_convert_encoding($line, 'UTF-8') on every single line, as suggested in multiple posts on StackExchange, does not help either.
I also tried forcing the code to write a UTF-8 BOM at the beginning of the file, but Apache fails with HTTP 500 on the BOM as well.
Lastly, I copied the function into a stand-alone PHP file and executed it on the command line in the server environment. This does NOT reproduce the problem and results in a usable .htaccess file. For the stand-alone script I had to remove the switch_to_locale call, but that does not seem to matter: removing it in misc.php as well made no difference.
So there must be something different between executing this code on the command line and executing it via CGI. phpinfo did not reveal any difference in default_charset or the other encoding values.
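Besides the charset values, the process locale is another setting that can differ between the two contexts. A small report such as the sketch below could be run both on the command line and through the web server and the results compared. The encoding_report function name is hypothetical, and whether the locale is actually involved here is an assumption.

```php
<?php
// Collect settings that can differ between the CLI and the web SAPI.
function encoding_report(): array
{
    return [
        'sapi'            => PHP_SAPI,
        'default_charset' => ini_get('default_charset'),
        // Passing '0' queries the current locale without changing it.
        'lc_ctype'        => setlocale(LC_CTYPE, '0'),
        'lc_all'          => setlocale(LC_ALL, '0'),
        'mb_internal'     => function_exists('mb_internal_encoding')
            ? mb_internal_encoding()
            : 'mbstring not loaded',
        'lang_env'        => getenv('LANG'),
        'lc_all_env'      => getenv('LC_ALL'),
    ];
}

// Print one setting per line so two runs can be diffed.
foreach (encoding_report() as $key => $value) {
    echo str_pad($key, 16), var_export($value, true), "\n";
}
```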
This could belong in the Rewrite Rules component, but it also might relate to switching the locale.