Make WordPress Core

Opened 4 years ago

Closed 3 years ago

Last modified 3 years ago

#50924 closed defect (bug) (duplicate)

Do not use ZWNJ and ZWJ in post name (slug and url)

Reported by: behzadian's profile behzadian Owned by: sergeybiryukov's profile SergeyBiryukov
Milestone: Priority: normal
Severity: normal Version:
Component: Permalinks Keywords: has-patch has-unit-tests needs-testing-info
Focuses: Cc:

Description

If I use Zero Width Non Joiner char in post title (and post title will be used for post slug and name as default), url will be ugly.
ZWNJ is a character used to split two words without real space, for example I in work (پیامک‌ها) there is a ZWNJ between (پیامک) and (ها). If I remove ZWNJ, it will be "پیامکها" which is wrong in Persian (and similar languages). Now if I use ZWNJ in slug, makes url so ugly, for example "/پیامک%e2%80%8cها/".
I changed function remove_accents( $string ) and add two replacement as below:
`
Grave accent.
'Ǜ' => 'U',
'ǜ' => 'u',
Persian.
'‌' => '-', added
'‍' => '-',
added
`
to replace these chars to space

Attachments (2)

50924.1.diff (648.1 KB) - added by Mista-Flo 4 years ago.
Added some unit tests + farsi pack for tests
50924.2.diff (2.7 KB) - added by Mista-Flo 4 years ago.
Fix huge po/mo file

Download all attachments as: .zip

Change History (22)

This ticket was mentioned in PR #475 on WordPress/wordpress-develop by behzadian.


4 years ago
#1

  • Keywords has-patch added

I do not know if it is right place or wrong, but I added two lines to remove ZWNJ and ZWJ from url.
Right? because it works
Wrong? because Persian a language not an accent!
I'm not Wordpress developer, so I don't know where to add codes
I created a ticket and explained what I done here (https://core.trac.wordpress.org/ticket/50924)

Trac ticket:

#2 @johnbillion
4 years ago

  • Component changed from I18N to Charset
  • Milestone changed from Awaiting Review to Future Release
  • Type changed from feature request to enhancement
  • Version 5.5 deleted

Welcome @behzadian. Thanks for the ticket and the PR!

This would be better done by using the hexadecimal representation of the Unicode codepoint instead of the actual character in each array key, so it's visible to readers of the code.

I think these are correct but I've not done any testing at all:

  • "\xe2\x80\x8c"
  • "\xe2\x80\x8d"

#3 @SergeyBiryukov
4 years ago

  • Component changed from Charset to Permalinks
  • Milestone changed from Future Release to 5.6
  • Owner set to SergeyBiryukov
  • Status changed from new to accepted

Related: #44958, #47594, #47912.

This ticket was mentioned in Slack in #core by metalandcoffee. View the logs.


4 years ago

#5 @Mista-Flo
4 years ago

  • Keywords needs-refresh needs-unit-tests added

#6 @hellofromTonya
4 years ago

Per Sergey during today's scrub:

Will try to address before beta, or punt otherwise.

@Mista-Flo
4 years ago

Added some unit tests + farsi pack for tests

#7 @Mista-Flo
4 years ago

  • Keywords has-unit-tests added; needs-refresh needs-unit-tests removed

Hi there,

I tried to drop a new patch for this issue.

The thing is, I had to add the persian translation in the test suite in order to properly switch to fa_IR locale. After that the RTL was ok and I could test.

That said, I tried to implement both @behzadian and @johnbillion suggestion, I'm not sure about the last one, but the unit tests work.

@Mista-Flo
4 years ago

Fix huge po/mo file

This ticket was mentioned in Slack in #core by helen. View the logs.


4 years ago

#9 @SergeyBiryukov
4 years ago

  • Milestone changed from 5.6 to 5.7

Didn't get a chance to review this in time for 5.6 Beta 1, moving to the next release for now.

This ticket was mentioned in Slack in #core by lukecarbis. View the logs.


4 years ago

#11 @noisysocks
4 years ago

  • Keywords 2nd-opinion removed
  • Type changed from enhancement to defect (bug)

#12 @hellofromTonya
4 years ago

  • Keywords needs-testing-info added

Scheduling this ticket for Testing Scrub, but need more information for the testers to QA it:

  • What are the steps to reproduce the problem?
  • Are there any testing dependencies such as a plugin or script?
  • What is the expected behavior after applying the patch?

#13 @lukecarbis
4 years ago

  • Milestone changed from 5.7 to 5.8

Shifting this to 5.8, since 5.7 beta 3 has a soft string freeze.

This ticket was mentioned in Slack in #core-test by hellofromtonya. View the logs.


3 years ago

This ticket was mentioned in Slack in #core-test by dariak. View the logs.


3 years ago

#16 @dariak
3 years ago

QA feedback: in order we test the fix correctly, we would like to clarify the expected results.
To reproduce the bug:

  1. Start creating new post.
  2. Enter the following title "پیامک‌ها"
  3. Publish your post and check the slug, it is "پیامک%e2%80%8cها/"

Is this correct? This is current behaviour on 5.7.2 version.

To test the fix:

  1. Start creating new post.
  2. Enter the following title "پیامک‌ها"
  3. Publish your post and check the slug.

What is exactly expected to see in the slug?

#17 @audrasjb
3 years ago

  • Milestone changed from 5.8 to 5.9

The provided patch still needs some testing.
Given we're about to release 5.8 beta 1, let's move it for 5.9 consideration.

This ticket was mentioned in Slack in #core by audrasjb. View the logs.


3 years ago

#19 @audrasjb
3 years ago

  • Milestone changed from 5.9 to Future Release

As per today's bug scrub, let's move this ticket to Future release pending testing info/feedback.

#20 @SergeyBiryukov
3 years ago

  • Milestone Future Release deleted
  • Resolution set to duplicate
  • Status changed from accepted to closed

Thanks everyone! This appears to be resolved as part of [51984] / #47912.

Last edited 3 years ago by SergeyBiryukov (previous) (diff)
Note: See TracTickets for help on using tickets.