Opened 5 years ago

Last modified 5 months ago

#28727 new defect (bug)

plugin editor content empty when source contains an invalid character

Reported by: bobbingwide
Owned by:
Milestone:
Priority: normal
Severity: normal
Version: 3.9.1
Component: Plugins
Keywords:
Focuses: administration
Cc:
PR Number:

Description

I happened to create a plugin source file containing a pound sterling character (£) copied and pasted from a web page. It was stored as the single byte hex A3 (decimal 163), which is not a valid sequence on its own in UTF-8, and it appeared in my Windows text editor as lower case u acute (ú).

esc_textarea() calls htmlspecialchars(), which returns an empty string as the safe text:

$safe_text = htmlspecialchars( $text, ENT_QUOTES, get_option( 'blog_charset' ) );

Note: blog_charset is UTF-8

So the plugin editor displayed nothing at all for the source.

Question: Is this really the expected behaviour?

The documentation for htmlspecialchars() says:

If the input string contains an invalid code unit sequence within the given encoding an empty string will be returned, unless either the ENT_IGNORE or ENT_SUBSTITUTE flags are set.
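
The documented behaviour can be reproduced outside WordPress. The following is a minimal sketch, assuming PHP 5.4 or later (where ENT_SUBSTITUTE is available); the sample string is made up for illustration and stands in for the plugin source.

```php
<?php
// Hypothetical sample text: "Price: ", a lone byte 0xA3, then "5".
// A lone 0xA3 is not a valid UTF-8 sequence.
$text = "Price: \xA3" . "5";

// Same flags and charset as the call quoted above: the whole result is empty.
$default = htmlspecialchars( $text, ENT_QUOTES, 'UTF-8' );

// ENT_IGNORE silently drops the invalid byte.
$ignored = htmlspecialchars( $text, ENT_QUOTES | ENT_IGNORE, 'UTF-8' );

// ENT_SUBSTITUTE replaces the invalid byte with U+FFFD (bytes EF BF BD).
$substituted = htmlspecialchars( $text, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8' );

var_dump( $default );              // string(0) ""
var_dump( $ignored );              // string(8) "Price: 5"
var_dump( bin2hex( $substituted ) );
```

Of the two flags, ENT_SUBSTITUTE at least leaves a visible marker where the invalid byte was, whereas ENT_IGNORE removes it without trace.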

Shouldn't the plugin editor pass ENT_IGNORE, or otherwise show a message advising the user not to save the empty file when the safe content is nothing like the original?
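
A warning of this kind could be driven by a validity check along the following lines. This is only a sketch: the function name source_is_valid_utf8() and the notice text are hypothetical and not part of WordPress. It relies on preg_match() with the "u" modifier failing when the subject is not valid UTF-8.

```php
<?php
/**
 * Hypothetical helper (not a WordPress function).
 * Returns true when $content is valid UTF-8. With the "u" modifier,
 * preg_match() does not return 1 for a subject that is invalid UTF-8.
 */
function source_is_valid_utf8( $content ) {
    return 1 === @preg_match( '//u', $content );
}

// Made-up file content containing the lone byte 0xA3.
$content = "echo 'Price: \xA3" . "5';";

if ( ! source_is_valid_utf8( $content ) ) {
    // Illustrative wording only.
    $notice = 'This file is not valid UTF-8. Saving it from the editor may overwrite it with empty content.';
}
```

Checking the raw file content, instead of comparing the escaped result with an empty string, also covers the case where the file itself is legitimately empty.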

Attachments (2)

trac28727.php (273 bytes) - added by bobbingwide 5 years ago.
Sample plugin file demonstrating the problem
28727.1.diff (1.4 KB) - added by jipmoors 4 years ago.
No implicit encoding for htmlentities in plugin source code editor


Change History (7)

#1 @SergeyBiryukov
5 years ago

  • Component changed from Administration to Plugins
  • Focuses administration added

Related: #20368

@bobbingwide
5 years ago

Sample plugin file demonstrating the problem

#2 @bobbingwide
5 years ago

A simple fix to this problem is to remove the call to esc_textarea() in plugin-editor.php.
The invalid character then shows as U+FFFD (a question mark in a black diamond).

#3 @Ov3rfly
5 years ago

Same problem in theme editor, wp-admin/theme-editor.php line 121, WP 4.0

+Component: Themes

#4 @DrewAPicture
4 years ago

@pento @dd32 Any suggestions on how we could mitigate this issue without removing the esc_textarea() call (for obvious reasons)?

#5 @jipmoors
4 years ago

The problem lies in the implicit character encoding assumption.
esc_textarea() assumes the supplied content was written inside the CMS and therefore conforms to the selected charset.

A quick test with a duplicate of the function, without the implicit encoding, shows the file as it is saved on disk.
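
One way to see why dropping the charset assumption helps: under a single-byte charset every byte is a valid character, so htmlspecialchars() has nothing to reject and only rewrites the HTML-special characters. The sketch below passes 'ISO-8859-1' explicitly purely as an illustration; that choice is an assumption made for this example and is not necessarily what the attached patch does.

```php
<?php
// Made-up sample: markup around the lone byte 0xA3.
$text = "<b>\xA3" . "5</b>";

// Treated as UTF-8, the invalid byte makes the whole result empty.
$as_utf8 = htmlspecialchars( $text, ENT_QUOTES, 'UTF-8' );

// Treated as a single-byte charset, the byte passes through untouched
// and only <, >, &, and quotes are escaped.
$as_single_byte = htmlspecialchars( $text, ENT_QUOTES, 'ISO-8859-1' );

var_dump( $as_utf8 );                    // string(0) ""
var_dump( bin2hex( $as_single_byte ) );  // contains "a3" between the escaped tags
```

The byte then reaches the browser unchanged, so on a page served as UTF-8 it would presumably still render as U+FFFD, as described in comment #2, but the rest of the file stays visible and the markup is still escaped.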

@jipmoors
4 years ago

No implicit encoding for htmlentities in plugin source code editor
