Opened 7 years ago
Closed 7 years ago
#40817 closed defect (bug) (invalid)
WordCounter removeRegExp maybe broken
Reported by: | DrLightman | Owned by: | |
---|---|---|---|
Milestone: | Priority: | normal | |
Severity: | normal | Version: | 4.6.4 |
Component: | Editor | Keywords: | |
Focuses: | javascript, administration | Cc: |
Description
In the file \wp-admin\js\word-count.js, at around line 27 in my WP 4.6.6 install, there is the removeRegExp setting for WordCounter.prototype.settings:
    removeRegExp: new RegExp( [
        '[',
            // Basic Latin (extract)
            '\u0021-\u0040\u005B-\u0060\u007B-\u007E',
            // Latin-1 Supplement (extract)
            '\u0080-\u00BF\u00D7\u00F7',
            // General Punctuation
            // Superscripts and Subscripts
            // Currency Symbols
            // Combining Diacritical Marks for Symbols
            // Letterlike Symbols
            // Number Forms
            // Arrows
            // Mathematical Operators
            // Miscellaneous Technical
            // Control Pictures
            // Optical Character Recognition
            // Enclosed Alphanumerics
            // Box Drawing
            // Block Elements
            // Geometric Shapes
            // Miscellaneous Symbols
            // Dingbats
            // Miscellaneous Mathematical Symbols-A
            // Supplemental Arrows-A
            // Braille Patterns
            // Supplemental Arrows-B
            // Miscellaneous Mathematical Symbols-B
            // Supplemental Mathematical Operators
            // Miscellaneous Symbols and Arrows
            '\u2000-\u2BFF',
            // Supplemental Punctuation
            '\u2E00-\u2E7F',
        ']'
    ].join( '' ), 'g' ),
But according to the JavaScript docs at https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/RegExp, when using string notation the backslashes should be escaped:
When using the constructor function, the normal string escape rules (preceding special characters with \ when included in a string) are necessary. For example, the following are equivalent:
var re = /\w+/;
var re = new RegExp('\\w+');
So shouldn't this be the correct way to build that regexp, since it uses the second form (the constructor with a string), with \\u in place of \u?
    removeRegExp: new RegExp( [
        '[',
            // Basic Latin (extract)
            '\\u0021-\\u0040\\u005B-\\u0060\\u007B-\\u007E',
            // Latin-1 Supplement (extract)
            '\\u0080-\\u00BF\\u00D7\\u00F7',
            // General Punctuation
            // Superscripts and Subscripts
            // Currency Symbols
            // Combining Diacritical Marks for Symbols
            // Letterlike Symbols
            // Number Forms
            // Arrows
            // Mathematical Operators
            // Miscellaneous Technical
            // Control Pictures
            // Optical Character Recognition
            // Enclosed Alphanumerics
            // Box Drawing
            // Block Elements
            // Geometric Shapes
            // Miscellaneous Symbols
            // Dingbats
            // Miscellaneous Mathematical Symbols-A
            // Supplemental Arrows-A
            // Braille Patterns
            // Supplemental Arrows-B
            // Miscellaneous Mathematical Symbols-B
            // Supplemental Mathematical Operators
            // Miscellaneous Symbols and Arrows
            '\\u2000-\\u2BFF',
            // Supplemental Punctuation
            '\\u2E00-\\u2E7F',
        ']'
    ].join( '' ), 'g' ),
Change History (1)
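I think you're mixing up the character class shortcuts \w, \d, \s, etc. with the Unicode character escape sequences \u#### (where #### are four hexadecimal digits). A JavaScript string literal resolves \u#### itself, so the RegExp constructor already receives the actual characters and the backslashes do not need to be doubled. Also, note that the Unicode characters are in an array that is joined before being used as a string in the regex.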