Opened 2 years ago
Last modified 2 years ago
#56530 new defect (bug)
Combining tilde passes `sanitize_title_with_dashes()` and so do most other diacritics
Reported by: | anrghg | Owned by: | |
---|---|---|---|
Milestone: | Awaiting Review | Priority: | normal |
Severity: | major | Version: | |
Component: | Formatting | Keywords: | |
Focuses: | Cc: |
Description
Combining diacritics that are not acute, grave, hacek or macron (or the deprecated Vietnamese acute tone mark) are not removed by sanitize_title_with_dashes()
removing characters individually and explicitly after URL-encoding.
If the goal is to remove accents, best would be removing the full class before URL-encoding:
$title = preg_replace( '/\p{M}/u', '', $title );
But isn’t remove_accents()
called when removing diacritics?
So I’d suggest moving the combining diacritics part to remove_accents()
.
Change History (3)
#2
follow-up:
↓ 3
@
2 years ago
- Keywords changes-requested removed
changes-requested
refers to: "Feedback has been provided, and the attached patch needs to be updated." Reference.
I'm not sure if such a change would need a dev note, so I'll leave this to others to give their thoughts.
#3
in reply to:
↑ 2
@
2 years ago
- Keywords needs-dev-note needs-patch removed
Replying to costdev:
changes-requested
refers to: "Feedback has been provided, and the attached patch needs to be updated." Reference.
I'm not sure if such a change would need a dev note, so I'll leave this to others to give their thoughts.
Sorry, indeed fixing a bug does not require any dev note, and as I’m to provide a patch, I’m removing all other keywords as well.
Addenda
sanitize_title_with_dashes()
also removes some format controls, some quotation marks and other punctuation, some modifier letters, and some symbols, among which the percent sign.I think that here too, best is to do the full job by removing classes extensively before URL-encoding:
To avoid screwing up the percent sign removal, URL-decoding first seems to be the way to go. As by
urldecode()
the plus sign is converted to space, it could be maintained: