WordPress.org

Make WordPress Core

Opened 3 years ago

Closed 3 years ago

Last modified 3 years ago

#17926 closed defect (bug) (wontfix)

Wordpress replacing single quotes, dashes, and other basic characters with extended search-unfriendly ones

Reported by: archon810
Owned by:
Milestone:
Priority: normal
Severity: normal
Version:
Component: General
Keywords:
Focuses:
Cc:

Description

I noticed this problem a while ago, so I don't remember which WordPress version introduced it. All of my sites (specifically AndroidPolice.com) exhibit it, and even a high-profile site like TechCrunch.com does too.

Pretty much all punctuation characters are, for some reason, replaced by HTML entities, and the entities are not for the same characters but for typographic look-alikes.

My biggest gripe is with the single quote character ', which is replaced with &#8217;. &#8217;, which shows up as ’, is not the same as ', and if you do an in-page search in something like Firefox, you won't be able to find "won't" because it's spelled as "won’t".

Other entities are similar: - is replaced with &#8211;, which shows up as – (although not every time?), etc.

When I click Edit, open the post in HTML mode (I switched off WYSIWYG a long time ago), and then view the source, I see regular characters, like the single quote and dash, so WordPress seems to convert them on the fly.

This is highly annoying and makes in-page search using some punctuation characters almost impossible.
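
To illustrate the search mismatch described above, here is a minimal, self-contained PHP sketch (no WordPress required). It assumes the rendered markup contains &#8217; where the post source had a plain apostrophe. `fold_punctuation()` is a hypothetical helper written for this example, not a WordPress function.

```php
<?php
// What the browser ends up displaying for the converted markup
// (assumption: the page source contains the numeric entity &#8217;).
$rendered = html_entity_decode( 'won&#8217;t', ENT_QUOTES | ENT_HTML5, 'UTF-8' );

// What a reader types into the browser's find bar.
$typed = "won't";

// U+2019 (right single quotation mark) is a different code point from
// U+0027 (apostrophe), so a literal substring search finds nothing.
var_dump( strpos( $rendered, $typed ) ); // bool(false)

// Hypothetical helper: fold typographic punctuation back to ASCII
// so that the two strings can be compared.
function fold_punctuation( string $text ): string {
    return strtr( $text, array(
        "\u{2018}" => "'", // left single quotation mark
        "\u{2019}" => "'", // right single quotation mark
        "\u{201C}" => '"', // left double quotation mark
        "\u{201D}" => '"', // right double quotation mark
        "\u{2013}" => '-', // en dash
        "\u{2014}" => '-', // em dash
    ) );
}

var_dump( strpos( fold_punctuation( $rendered ), $typed ) ); // int(0)
```

The first search fails only because the two strings differ by one code point; after folding, the match is found at offset 0.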

What's going on?

Thanks!

Change History (3)

comment:1 archon810 3 years ago

  • Cc admin@… added

comment:2 duck_ 3 years ago

  • Milestone Awaiting Review deleted
  • Resolution set to wontfix
  • Status changed from new to closed

This is intended behaviour: the conversion is a feature, not a bug. If you wish to disable it, you can remove the hooks that apply the transformation, either in a custom plugin or in your theme's functions.php. E.g.

remove_filter( 'the_title', 'wptexturize' );
remove_filter( 'the_content', 'wptexturize' );
remove_filter( 'the_excerpt', 'wptexturize' );
remove_filter( 'comment_text', 'wptexturize' );

comment:3 archon810 3 years ago

I appreciate the solution, but what exactly is the benefit of changing perfectly fine characters into different ones? If WordPress wants to replace them with HTML entities, can it not use direct translations of the same characters, rather than these Word-like "enhanced" ones?
