Make WordPress Core

Opened 13 years ago

Closed 10 years ago

Last modified 8 years ago

#17926 closed defect (bug) (duplicate)

Wordpress replacing single quotes, dashes, and other basic characters with extended search-unfriendly ones

Reported by: archon810
Owned by:
Milestone:
Priority: normal
Severity: normal
Version:
Component: Formatting
Keywords:
Focuses:
Cc:

Description

Noticed this problem a while ago, so I don't remember which Wordpress version it was introduced in. All of my sites (specifically AndroidPolice.com) exhibit it and even a high-profile site like TechCrunch.com does too.

Pretty much all special characters are for some reason replaced by HTML entities, but not by entities for the same characters.

My biggest gripe is with the single quote character ' which is replaced with &#8217;. &#8217;, which shows up as ’, is not the same as ' and if you do an in-page search in something like Firefox, you won't be able to find "won't" because it's spelled as "won’t".

Other entities are similar: - is replaced with &#8211;, which shows up as – (although not every time?), etc.
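
The mismatch can be reproduced outside WordPress. The following is a minimal standalone PHP sketch, assuming UTF-8 throughout; it does not call any WordPress code, and the variable names are only illustrative.

```php
<?php
// Decode the entity that ends up in the rendered page (U+2019, right single quote).
$curly = html_entity_decode( '&#8217;', ENT_QUOTES | ENT_HTML5, 'UTF-8' );

$typed    = "won't";               // what was typed in the editor (U+0027)
$rendered = 'won' . $curly . 't';  // what the page shows after the replacement

// The strings differ, so a literal search for the typed form finds nothing.
var_dump( $typed === $rendered );                  // bool(false)
var_dump( strpos( $rendered, $typed ) === false ); // bool(true)
var_dump( bin2hex( $curly ) );                     // string(6) "e28099"
```

The last line shows that the curly quote is a three-byte UTF-8 sequence rather than the single byte 0x27, so a search that compares characters literally will not match the typed apostrophe.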

When I click Edit and go to the Edit post in HTML mode (I switched off WYSIWYG a long time ago), then View Source, I see regular characters, like single quote and dash, so Wordpress seems to convert them on the fly.

This is highly annoying and makes in-page search using some punctuation characters almost impossible.

What's going on?

Thanks!

Change History (7)

#1 @archon810
13 years ago

  • Cc admin@… added

#2 @duck_
13 years ago

  • Milestone Awaiting Review deleted
  • Resolution set to wontfix
  • Status changed from new to closed

This is intended behaviour; it is a feature. If you wish to disable it you can remove the hooks that apply the transformation in a custom plugin or your theme's functions.php. E.g.

// Unhook wptexturize from each field whose characters should stay as typed.
remove_filter( 'the_title', 'wptexturize' );
remove_filter( 'the_content', 'wptexturize' );
remove_filter( 'the_excerpt', 'wptexturize' );
remove_filter( 'comment_text', 'wptexturize' );

#3 follow-up: @archon810
13 years ago

Appreciate the solution, but what exactly is the benefit of mutating perfectly fine characters to different ones? If Wordpress wants to replace them with HTML entities, can it not do that using direct translations, rather than these special Word-like enhanced ones?

#4 @_doherty
10 years ago

  • Resolution wontfix deleted
  • Status changed from closed to reopened

Can we at least get a rationale for this misfeature? It'd be nice to at least offer to stop corrupting users' inputs if they want you to.

#5 in reply to: ↑ 3 @SergeyBiryukov
10 years ago

  • Component changed from General to Formatting
  • Resolution set to duplicate
  • Status changed from reopened to closed

Replying to archon810:

Appreciate the solution, but what exactly is the benefit of mutating perfectly fine characters to different ones?

Better typography. See the Codex page for the list of replacements.

Replying to _doherty:

It'd be nice to at least offer to stop corrupting users' inputs if they want you to.

You can disable it using the run_wptexturize filter introduced in #19550.
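
A minimal sketch of that approach, assuming a WordPress version that includes the run_wptexturize filter and that the line lives in a custom plugin or the theme's functions.php. __return_false is a core helper that simply returns false.

```php
<?php
// Assumption: this is loaded by WordPress (plugin file or functions.php)
// early enough to be registered before the first call to wptexturize().
add_filter( 'run_wptexturize', '__return_false' );
```

Unlike removing wptexturize from individual hooks, this switches the transformation off everywhere it would otherwise run.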

#6 follow-up: @_doherty
10 years ago

Sure, *I* can fix (and have, in fact, fixed) this bug, but most users who are having their inputs corrupted aren't capable of figuring out how to do so. When I said it'd be nice to offer to stop doing that to them, I meant that Wordpress should offer a simple toggle in the "writing" settings for "leave my text alone".

You marked this bug as a duplicate, but I don't see any mention of what the other bug is.

#7 in reply to: ↑ 6 @SergeyBiryukov
8 years ago

Sure, *I* can fix (and have, in fact, fixed) this bug, but most users who are having their inputs corrupted aren't capable of figuring out how to do so.

There are a number of plugins to disable wptexturize() per post or globally.

You marked this bug as a duplicate, but I don't see any mention of what the other bug is.

See ticket #19550.
