Make WordPress Core

Opened 9 years ago

Closed 9 years ago

#32335 closed enhancement (fixed)

Update convert_chars()

Reported by: azaozz
Owned by: azaozz
Milestone: 4.3
Priority: normal
Severity: normal
Version:
Component: General
Keywords: has-patch commit
Focuses:
Cc:


This function is used a lot as a "display filter". It hasn't been updated in a long time, and most of what it does no longer seems necessary:

  • The $wp_htmltranswinuni HTML entity replacement seems good, but I can't find a way to paste any of these entities any more. It seems this used to fix a problem with pasting from Word into TinyMCE a long time ago. For many years now, all HTML entities have been converted to characters when saving from TinyMCE, so these entities never get into the post content. In all other cases, when pasting into a textarea or a text field, there aren't any HTML entities.
  • The <title> and <category> "meta" tags haven't been used for about 10 years. We can probably stop trying to remove them on every page load.
  • The <br> to <br /> and <hr> to <hr /> conversions are redundant in HTML 5.0.
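For illustration, the entity replacement described in the first bullet can be sketched as follows. This is a minimal Python sketch, not the WordPress implementation. It assumes the invalid entities are the numeric references &#128; through &#159;, read as Windows-1252 bytes (the encoding old Word output relied on). It derives the replacements from Python's cp1252 codec rather than from the actual $wp_htmltranswinuni table. The function names are hypothetical.

```python
import re

# Matches three-digit numeric entities from &#120; to &#159;; the range
# check below narrows this to the 128-159 block discussed in the ticket.
_CANDIDATE_ENTITY = re.compile(r"&#(1[2-5][0-9]);")


def cp1252_entity_to_unicode(code: int) -> str:
    """Map a Windows-1252 byte value to a numeric Unicode entity.

    Returns an empty string for byte values that Windows-1252 leaves
    undefined (assumption: such entities are simply dropped).
    """
    try:
        char = bytes([code]).decode("cp1252")
    except UnicodeDecodeError:
        return ""
    return f"&#{ord(char)};"


def replace_invalid_entities(text: str) -> str:
    """Rewrite &#128;..&#159; as the Unicode entities they were meant to be."""

    def _replace(match: re.Match) -> str:
        code = int(match.group(1))
        if 128 <= code <= 159:
            return cp1252_entity_to_unicode(code)
        return match.group(0)  # outside the invalid block: leave untouched

    return _CANDIDATE_ENTITY.sub(_replace, text)
```

Under these assumptions, the input &#147;Hi&#148; becomes &#8220;Hi&#8221;, and an entity with no Windows-1252 meaning, such as &#129;, is dropped. Valid entities such as &#160; pass through unchanged.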

Attachments (2)

32335.patch (961 bytes) - added by azaozz 9 years ago.
32335.2.patch (2.8 KB) - added by boonebgorges 9 years ago.

Change History (12)

#1 @azaozz
9 years ago

  • Keywords has-patch 2nd-opinion added

Still keeping $wp_htmltranswinuni in 32335.patch, pending more research.

As far as I can see, all current browsers display the invalid HTML entities properly. Even if some of them still remain in posts created over 10 years ago by pasting from Word, they won't look bad on non-Windows operating systems. Are there any other cases where these invalid HTML entities could be added?

This ticket was mentioned in Slack in #core by azaozz 9 years ago.

#3 @jorbin
9 years ago

  • Keywords commit added; 2nd-opinion removed

These changes all make sense. This has the potential to speed up display of content.

#4 @azaozz
9 years ago

There is a (rare) use case that might still add the invalid entities: the user types in (old) Word, saves as HTML, opens the saved file in a browser, goes to "View source", copies the HTML, and finally pastes it into the Text editor and saves the post (yeah, some users do that).

There is no good reason to have these entities in post_content, so we should replace them on save.
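The trade-off between replacing on save and replacing on display can be sketched as below. This is a hypothetical Python model, not WordPress code: the PostStore class, its methods, and the three-entry replacement table are invented for illustration. It only shows that normalizing at save time runs once per write, while display returns the stored content untouched.

```python
# Hypothetical stand-in for the entity table: three Windows-1252 entities only.
REPLACEMENTS = {
    "&#128;": "&#8364;",  # euro sign
    "&#147;": "&#8220;",  # left double quotation mark
    "&#148;": "&#8221;",  # right double quotation mark
}


def normalize(content: str) -> str:
    """Replace the invalid entities listed in REPLACEMENTS."""
    for bad, good in REPLACEMENTS.items():
        content = content.replace(bad, good)
    return content


class PostStore:
    """Toy post storage that cleans content when saving, not when displaying."""

    def __init__(self) -> None:
        self.posts: dict[int, str] = {}
        self.normalize_calls = 0  # counts how often the cleanup actually runs

    def save(self, post_id: int, content: str) -> None:
        self.normalize_calls += 1
        self.posts[post_id] = normalize(content)

    def display(self, post_id: int) -> str:
        # No per-view processing: stored content is already clean.
        return self.posts[post_id]
```

In this model, saving a post once and displaying it many times calls normalize() a single time. Because normalize() leaves already-clean content unchanged, running it again on a later save is harmless.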

#5 @azaozz
9 years ago

  • Owner set to azaozz
  • Resolution set to fixed
  • Status changed from new to closed

In 32896:

Update convert_chars():

  • Stop trying to remove <title> and <category> meta tags. They have not been used for many many years.
  • Replacement of <br> with <br /> and <hr> with <hr /> is not needed for HTML 5.0. Also, these tags are formatted like that by the visual editor.
  • Replace invalid HTML entities that might be pasted in the Text editor on save instead of on display.

Fixes #32335.

#6 @iseulde
9 years ago


#7 @ocean90
9 years ago

In 32897:

Use 3-digit x.x.x style for 4.3.0 @since versions.

see #32335, #32430.

#8 @boonebgorges
9 years ago

  • Resolution fixed deleted
  • Status changed from closed to reopened

[32896] broke a couple of unit tests. 32335.2.patch suggests some fixes:

  • Eliminate the tests for <br /> and <hr /> conversion, and title/category stripping.
  • For the character-conversion tests, test the new convert_invalid_entities() instead of convert_chars(), and rename the file accordingly.

These changes make the tests pass again. azaozz, could you have a look to make sure that these changes to the tests accurately reflect the changes in expected behavior?

#9 @obenland
9 years ago

@ocean90, can you take a look at the new patch?

#10 @wonderboymusic
9 years ago

  • Resolution set to fixed
  • Status changed from reopened to closed

In 32947:

After [32896], update ConvertChars.php unit tests and rename to ConvertInvalidEntries.php.

Props boonebgorges.
Fixes #32335.