#3782 closed enhancement (fixed)
Use Digraphs for Umlauts in German Permalinks
Reported by: |
|
Owned by: | |
---|---|---|---|
Milestone: | 3.6 | Priority: | normal |
Severity: | normal | Version: | 2.1 |
Component: | I18N | Keywords: | has-patch commit has-unit-tests |
Focuses: | Cc: |
Description
There’s this pretty useful plugin:
http://wiki.wordpress.org/GermanPermalinks
To quote from the author’s site:
In German, the convention is to transliterate umlauts into two letters (ä=ae, ö=oe, ü=ue). The CVS code has some two-character conversions, so I added the umlauts to that list in my plugin, and the result was schroeder-sucks, which is just what I wanted.
I wouldn’t want Schröder to suck, but I think that it shouldn’t take a plugin to fix this bug. There are more characters that are wrongly sanitized (e.g., å = aa, æ = ae, …)
(Note that in Spanish and French, the ü is not transliterated in this matter. Not sure about Turkish. Maybe WordPress should check the locale before creating the post slug)
Attachments (6)
Change History (29)
#2
@
18 years ago
- Type changed from defect to enhancement
I think this should stay a plugin, because you stated yourself that there's differences between languages/regions on how to interpret/translate these characters.
We can't assume everybody wants to use the German convention.
#3
@
18 years ago
- Resolution set to wontfix
- Status changed from new to closed
Yeah, this is definitely plugin material. Adding locale-specific checks (and maintaining them !!) would be waaay too much to put into the core.
#5
@
12 years ago
- Cc Bernhard Reiter added
- Resolution wontfix deleted
- Status changed from closed to reopened
Now that Wordpress seems to become more and more i18n/l10n aware, isn't it time to reconsider this? Having an umlaut normalise to only one letter seriously pisses of most German wordpress users I know of (including myself) that care about their permalinks, which is maybe something that latin-1-charset-language developers don't realise.
#6
@
12 years ago
- Milestone set to 3.6
I agree, this is doable by changing the remove_accents() mapping based on the locale.
We need a full list of all affected characters, and what their sanitized characters should be.
#10
@
12 years ago
3782.patch handles the characters from WP Permalauts plugin.
#11
@
12 years ago
In 1178/tests:
#12
@
12 years ago
In 1179/tests:
#13
@
12 years ago
Let's assign get_locale() to a variable so if/when we add more locale-specific rules, we don't call it additional times, the diff is a bit cleaner, and so the commit here looks like it is written to be extended.
#14
@
12 years ago
Ä in Äpfel should be Ae. Of course, when all characters are uppercased AE seems fine. But that's usually not the case.
#15
@
12 years ago
In 1180/tests:
#17
@
12 years ago
Looks quite solid. Three things:
Probably no need to ever call get_locale() if ! seems_utf8().
I wonder if array_merge() is particularly slow or inefficient here, as this isn't a small array. The union operator (+) has a bit less overhead, as might just adding on the individual key/value pairs.
I would tend to agree that A should become Ae, not AE. remove_accents() does follow the same uppercase pattern elsewhere in the file. Since this is a language-specific rule, I am not too worried about the inconsistency. Note the only other usage of remove_accents() beyond permalinks, is sanitize_user(), which does not enforce strtolower (it used to in multisite).
#18
@
12 years ago
Performed a few tests with 20,000 iterations of remove_accents()
.
Without the patch | 10.290s |
array_merge() (3782.3.patch) | 14.624s |
Union operator (+) (3782.4.patch) | 11.342s |
Direct assignment (3782.5.patch) | 10.619s |
So, 3782.5.patch shows the best performance.
This ticket was mentioned in PR #3783 on WordPress/wordpress-develop by @costdev.
2 years ago
#21
- Keywords has-unit-tests added
Now that the minimum PHP version has been raised, this test and its data provider can be removed.
The datasets from the data provider have been relocated to other data providers in the test class.
N.B. This PR has expected test failures in PHP 5.6 tests until #3782 is merged.
This ticket was mentioned in PR #3786 on WordPress/wordpress-develop by @costdev.
2 years ago
#22
The IMAGETYPE_WEBP
constant is defined in PHP 7.1 and later.
Now that the minimum PHP version has been raised, the IMAGETYPE_WEBP
constant definition can be removed.
Ref:
PHP manual - Migrating from PHP 7.0.x to PHP 7.1.x
https://www.php.net/manual/en/migration71.constants.php
N.B. This PR has expected test failures in PHP 5.6 and 7.0 tests until #3782 is merged.
This ticket was mentioned in PR #3787 on WordPress/wordpress-develop by @costdev.
2 years ago
#23
The random_compat
library polyfills functions from the Random PHP extension in PHP 7.x to PHP 5.x. Random is a core PHP extension, and will be included in all PHP binaries.
Now that the minimum PHP version has been raised, the random_compat
library can be removed.
This change also removes the inclusion of random_int()
in wp-includes/compat.php
.
Ref:
- PHP manual - Membership - Core Extensions
https://www.php.net/manual/en/extensions.membership.php
- Paragonie/random_compat GitHub repository
https://github.com/paragonie/random_compat
N.B. This PR has expected test failures for PHP 5.6 tests until #3782 is merged.
Another option would be to use IRIs, and then normalise them to URIs for compatibility with UAs that don't support IRIs.