WordPress.org

Make WordPress Core

Opened 21 months ago

Last modified 3 weeks ago

#47912 accepted defect (bug)

Non-visible chars in title creates some encoded values in permalink

Reported by: dhanendran Owned by: SergeyBiryukov
Milestone: 5.8 Priority: normal
Severity: major Version: 5.2.2
Component: Permalinks Keywords: needs-patch
Focuses: ui, rtl, administration Cc:

Description

When someone copy past some values from word documents, some other editor or some site on the web, we end-up copying some non visible chars and pasting it in the WordPress post title. This creates some encoded values in the permalink.
https://dhanendranrajagopal.me/wp-content/uploads/2019/08/non-visible-chars-in-title.png

[In this link|https://www.utf8-chartable.de/unicode-utf8-table.pl?start=8192&number=128] there are so many non visible characters which can be easily added in to the post title. There are other examples as well for this issue one of them is [Byte Order Mark|https://en.wikipedia.org/wiki/Byte_order_mark].

sanitize_title() function only removes the accents from the title not these kind of characters.

I have searched all the open tickets in trac and couldn't find anything related to this issue, so opening one.

Attachments (1)

47912.diff (1.4 KB) - added by dhanendran 21 months ago.
added patch and unit test

Download all attachments as: .zip

Change History (28)

#1 @SergeyBiryukov
21 months ago

  • Milestone changed from Awaiting Review to 5.3
  • Owner set to SergeyBiryukov
  • Status changed from new to accepted

Hi @dhanendran, welcome to WordPress Trac! Thanks for the report.

sanitize_title() indeed does not remove this kind of characters, but sanitize_title_with_dashes() does.

Looks like we need to add some more characters there.

#2 @SergeyBiryukov
21 months ago

  • Keywords needs-patch needs-unit-tests added

#3 @SergeyBiryukov
21 months ago

The characters are generally added on a case-by-case basis, not all at once.

Let's start with these two here:

@dhanendran
21 months ago

added patch and unit test

#4 @dhanendran
21 months ago

  • Keywords has-patch has-unit-tests added; needs-patch needs-unit-tests removed

@sergeybiryukov I have added the patch with unit test.

Last edited 21 months ago by dhanendran (previous) (diff)

This ticket was mentioned in Slack in #core by marybaum. View the logs.


20 months ago

#7 @garrett-eclipse
20 months ago

  • Keywords needs-testing added

@SergeyBiryukov will you have time to review for 5.3?

#8 @SergeyBiryukov
20 months ago

  • Milestone changed from 5.3 to 5.4

#9 @audrasjb
15 months ago

  • Milestone changed from 5.4 to Future Release

Hi,

With 5.4 Beta 3 approaching and the Beta period reserved for bugs introduced during the cycle, this is being moved to Future Release. If any maintainer or committer feels this should be included or wishes to assume ownership during a specific cycle, feel free to update the milestone accordingly.

#11 @SergeyBiryukov
15 months ago

  • Milestone changed from Future Release to 5.5

#12 @earnjam
11 months ago

@SergeyBiryukov You milestoned this for 5.5. Is it on your list before the beta, or do you want to punt?

#13 @SergeyBiryukov
10 months ago

  • Milestone changed from 5.5 to 5.6

#15 @helen
6 months ago

  • Milestone changed from 5.6 to 5.7

Moving to 5.7 to go with #50924 and #47594.

#16 @hellofromTonya
4 months ago

  • Keywords needs-testing-info added

Scheduling this ticket for Testing Scrub, but need more information for the testers to QA it:

  • What are the steps to reproduce the problem?
  • Are there any testing dependencies such as a plugin or script?
  • What is the expected behavior after applying the patch?

#17 @dhanendran
4 months ago

@hellofromTonya

What are the steps to reproduce the problem?

"Non visible char" you can copy-paste this text in the post title and can notice the permalink for some encoded chars or copy-paste some non-visible chars form https://www.utf8-chartable.de/unicode-utf8-table.pl?start=8192&number=128 in the title.

Are there any testing dependencies such as a plugin or script?

There are no dependencies for this.

What is the expected behavior after applying the patch?

When we paste these non-visible chars in the title, it should be removed and save only the valid chars.

#18 @hellofromTonya
4 months ago

  • Keywords needs-testing-info removed

Thank you @dhanendran. This ticket is ready for Test Scrub and manual testing.

This ticket was mentioned in Slack in #core by hellofromtonya. View the logs.


4 months ago

#20 @paaljoachim
4 months ago

I wanted to see what this would look like before testing it in localhost:8889. So I opened one of my local test sites. Running WP 5.6. Gutenberg 9.8. Using Twenty Twenty One. Pasted in "Non visible char" (copy and pasted what was mentioned above) as a post title. Noticed that the post title added 1 space extra but that the URL slug looks like this: non-visible-char

This ticket was mentioned in Slack in #core by monikarao. View the logs.


4 months ago

This ticket was mentioned in Slack in #core by monikarao. View the logs.


4 months ago

#23 @poena
3 months ago

Result before patch:
/non%e2%80%80visible-char-you-can-copy-paste-this-text/

Result with the patch applied:
/nonvisible-char-you-can-copy-paste-this-text/

Looks like there is a character there that needs to be replaced with a hyphen, not removed?

Please remove the duplicate '%e2%80%9f' from the list.

#24 @peterwilsoncc
3 months ago

  • Keywords needs-refresh added

Adding needs-refresh per the comment above.

For characters that display without a width, that is no apparent content in to the viewer, then the current block they've been added to (// Strip these characters entirely.) is correct.

For characters that display with a width, that is apparently a space to the viewer, then the better approach would to be to replace them with a hyhpen as 2poena says.

This ticket was mentioned in Slack in #core by hellofromtonya. View the logs.


3 months ago

#26 @hellofromTonya
3 months ago

  • Keywords needs-patch added; has-patch has-unit-tests needs-refresh removed
  • Milestone changed from 5.7 to 5.8

Changing to needs-patch to reflect the changes that need to be made to the original patch.

With 5.7 RC1 landing tomorrow and no activity on this ticket for a new patch, punting this ticket to 5.8.

If any maintainer or committer feels this can be resolved in time, or wishes to assume ownership during a specific cycle, feel free to update the milestone accordingly.

#27 @Boniu91
3 weeks ago

  • Keywords needs-testing removed
Note: See TracTickets for help on using tickets.