WordPress.org

Make WordPress Core

Opened 12 months ago

Last modified 6 weeks ago

#50924 accepted defect (bug)

Do not use ZWNJ and ZWJ in post name (slug and url)

Reported by: behzadian Owned by: SergeyBiryukov
Milestone: 5.9 Priority: normal
Severity: normal Version:
Component: Permalinks Keywords: has-patch has-unit-tests needs-testing-info
Focuses: Cc:

Description

If I use Zero Width Non Joiner char in post title (and post title will be used for post slug and name as default), url will be ugly.
ZWNJ is a character used to split two words without real space, for example I in work (پیامک‌ها) there is a ZWNJ between (پیامک) and (ها). If I remove ZWNJ, it will be "پیامکها" which is wrong in Persian (and similar languages). Now if I use ZWNJ in slug, makes url so ugly, for example "/پیامک%e2%80%8cها/".
I changed function remove_accents( $string ) and add two replacement as below:
`
Grave accent.
'Ǜ' => 'U',
'ǜ' => 'u',
Persian.
'‌' => '-', added
'‍' => '-',
added
`
to replace these chars to space

Attachments (2)

50924.1.diff (648.1 KB) - added by Mista-Flo 10 months ago.
Added some unit tests + farsi pack for tests
50924.2.diff (2.7 KB) - added by Mista-Flo 10 months ago.
Fix huge po/mo file

Download all attachments as: .zip

Change History (19)

This ticket was mentioned in PR #475 on WordPress/wordpress-develop by behzadian.


12 months ago

  • Keywords has-patch added

I do not know if it is right place or wrong, but I added two lines to remove ZWNJ and ZWJ from url.
Right? because it works
Wrong? because Persian a language not an accent!
I'm not Wordpress developer, so I don't know where to add codes
I created a ticket and explained what I done here (https://core.trac.wordpress.org/ticket/50924)

Trac ticket:

#2 @johnbillion
12 months ago

  • Component changed from I18N to Charset
  • Milestone changed from Awaiting Review to Future Release
  • Type changed from feature request to enhancement
  • Version 5.5 deleted

Welcome @behzadian. Thanks for the ticket and the PR!

This would be better done by using the hexadecimal representation of the Unicode codepoint instead of the actual character in each array key, so it's visible to readers of the code.

I think these are correct but I've not done any testing at all:

  • "\xe2\x80\x8c"
  • "\xe2\x80\x8d"

#3 @SergeyBiryukov
11 months ago

  • Component changed from Charset to Permalinks
  • Milestone changed from Future Release to 5.6
  • Owner set to SergeyBiryukov
  • Status changed from new to accepted

Related: #44958, #47594, #47912.

This ticket was mentioned in Slack in #core by metalandcoffee. View the logs.


10 months ago

#5 @Mista-Flo
10 months ago

  • Keywords needs-refresh needs-unit-tests added

#6 @hellofromTonya
10 months ago

Per Sergey during today's scrub:

Will try to address before beta, or punt otherwise.

@Mista-Flo
10 months ago

Added some unit tests + farsi pack for tests

#7 @Mista-Flo
10 months ago

  • Keywords has-unit-tests added; needs-refresh needs-unit-tests removed

Hi there,

I tried to drop a new patch for this issue.

The thing is, I had to add the persian translation in the test suite in order to properly switch to fa_IR locale. After that the RTL was ok and I could test.

That said, I tried to implement both @behzadian and @johnbillion suggestion, I'm not sure about the last one, but the unit tests work.

@Mista-Flo
10 months ago

Fix huge po/mo file

This ticket was mentioned in Slack in #core by helen. View the logs.


9 months ago

#9 @SergeyBiryukov
9 months ago

  • Milestone changed from 5.6 to 5.7

Didn't get a chance to review this in time for 5.6 Beta 1, moving to the next release for now.

This ticket was mentioned in Slack in #core by lukecarbis. View the logs.


7 months ago

#11 @noisysocks
7 months ago

  • Keywords 2nd-opinion removed
  • Type changed from enhancement to defect (bug)

#12 @hellofromTonya
6 months ago

  • Keywords needs-testing-info added

Scheduling this ticket for Testing Scrub, but need more information for the testers to QA it:

  • What are the steps to reproduce the problem?
  • Are there any testing dependencies such as a plugin or script?
  • What is the expected behavior after applying the patch?

#13 @lukecarbis
5 months ago

  • Milestone changed from 5.7 to 5.8

Shifting this to 5.8, since 5.7 beta 3 has a soft string freeze.

This ticket was mentioned in Slack in #core-test by hellofromtonya. View the logs.


2 months ago

This ticket was mentioned in Slack in #core-test by dariak. View the logs.


2 months ago

#16 @dariak
2 months ago

QA feedback: in order we test the fix correctly, we would like to clarify the expected results.
To reproduce the bug:

  1. Start creating new post.
  2. Enter the following title "پیامک‌ها"
  3. Publish your post and check the slug, it is "پیامک%e2%80%8cها/"

Is this correct? This is current behaviour on 5.7.2 version.

To test the fix:

  1. Start creating new post.
  2. Enter the following title "پیامک‌ها"
  3. Publish your post and check the slug.

What is exactly expected to see in the slug?

#17 @audrasjb
6 weeks ago

  • Milestone changed from 5.8 to 5.9

The provided patch still needs some testing.
Given we're about to release 5.8 beta 1, let's move it for 5.9 consideration.

Note: See TracTickets for help on using tickets.