Make WordPress Core

Opened 9 months ago

Last modified 3 months ago

#50924 accepted defect (bug)

Do not use ZWNJ and ZWJ in post name (slug and url)

Reported by: behzadian Owned by: SergeyBiryukov
Milestone: 5.8 Priority: normal
Severity: normal Version:
Component: Permalinks Keywords: has-patch has-unit-tests needs-testing-info
Focuses: Cc:


If I use Zero Width Non Joiner char in post title (and post title will be used for post slug and name as default), url will be ugly.
ZWNJ is a character used to split two words without real space, for example I in work (پیامک‌ها) there is a ZWNJ between (پیامک) and (ها). If I remove ZWNJ, it will be "پیامکها" which is wrong in Persian (and similar languages). Now if I use ZWNJ in slug, makes url so ugly, for example "/پیامک%e2%80%8cها/".
I changed function remove_accents( $string ) and add two replacement as below:
Grave accent.
'Ǜ' => 'U',
'ǜ' => 'u',
'‌' => '-', added
'‍' => '-',
to replace these chars to space

Attachments (2)

50924.1.diff (648.1 KB) - added by Mista-Flo 7 months ago.
Added some unit tests + farsi pack for tests
50924.2.diff (2.7 KB) - added by Mista-Flo 7 months ago.
Fix huge po/mo file

Download all attachments as: .zip

Change History (15)

This ticket was mentioned in PR #475 on WordPress/wordpress-develop by behzadian.

9 months ago

  • Keywords has-patch added

I do not know if it is right place or wrong, but I added two lines to remove ZWNJ and ZWJ from url.
Right? because it works
Wrong? because Persian a language not an accent!
I'm not Wordpress developer, so I don't know where to add codes
I created a ticket and explained what I done here (https://core.trac.wordpress.org/ticket/50924)

Trac ticket:

#2 @johnbillion
9 months ago

  • Component changed from I18N to Charset
  • Milestone changed from Awaiting Review to Future Release
  • Type changed from feature request to enhancement
  • Version 5.5 deleted

Welcome @behzadian. Thanks for the ticket and the PR!

This would be better done by using the hexadecimal representation of the Unicode codepoint instead of the actual character in each array key, so it's visible to readers of the code.

I think these are correct but I've not done any testing at all:

  • "\xe2\x80\x8c"
  • "\xe2\x80\x8d"

#3 @SergeyBiryukov
9 months ago

  • Component changed from Charset to Permalinks
  • Milestone changed from Future Release to 5.6
  • Owner set to SergeyBiryukov
  • Status changed from new to accepted

Related: #44958, #47594, #47912.

This ticket was mentioned in Slack in #core by metalandcoffee. View the logs.

7 months ago

#5 @Mista-Flo
7 months ago

  • Keywords needs-refresh needs-unit-tests added

#6 @hellofromTonya
7 months ago

Per Sergey during today's scrub:

Will try to address before beta, or punt otherwise.

7 months ago

Added some unit tests + farsi pack for tests

#7 @Mista-Flo
7 months ago

  • Keywords has-unit-tests added; needs-refresh needs-unit-tests removed

Hi there,

I tried to drop a new patch for this issue.

The thing is, I had to add the persian translation in the test suite in order to properly switch to fa_IR locale. After that the RTL was ok and I could test.

That said, I tried to implement both @behzadian and @johnbillion suggestion, I'm not sure about the last one, but the unit tests work.

7 months ago

Fix huge po/mo file

This ticket was mentioned in Slack in #core by helen. View the logs.

7 months ago

#9 @SergeyBiryukov
7 months ago

  • Milestone changed from 5.6 to 5.7

Didn't get a chance to review this in time for 5.6 Beta 1, moving to the next release for now.

This ticket was mentioned in Slack in #core by lukecarbis. View the logs.

4 months ago

#11 @noisysocks
4 months ago

  • Keywords 2nd-opinion removed
  • Type changed from enhancement to defect (bug)

#12 @hellofromTonya
4 months ago

  • Keywords needs-testing-info added

Scheduling this ticket for Testing Scrub, but need more information for the testers to QA it:

  • What are the steps to reproduce the problem?
  • Are there any testing dependencies such as a plugin or script?
  • What is the expected behavior after applying the patch?

#13 @lukecarbis
3 months ago

  • Milestone changed from 5.7 to 5.8

Shifting this to 5.8, since 5.7 beta 3 has a soft string freeze.

Note: See TracTickets for help on using tickets.