Opened 5 years ago
Last modified 5 years ago
#49129 new enhancement
Incorrect German Umlaut substitutions
Reported by: | bmuessig | Owned by: | |
---|---|---|---|
Milestone: | Awaiting Review | Priority: | normal |
Severity: | minor | Version: | 5.4 |
Component: | Formatting | Keywords: | 2nd-opinion |
Focuses: | Cc: |
Description
Hello,
as a native speaker, I find the German Umlaut substitutions quite strange.
Correctly, ü is turned into ue, but Ü is turned into Ue.
Since the added e is part of the transliterated umlaut, it should follow the capitalization of the original letter.
This is especially strange in uppercase text:
FRÖHLICH -> FROeHLICH
KÖNNEN -> KOeNNEN
If the substitution were all uppercase, it would work much better:
FRÖHLICH -> FROEHLICH
KÖNNEN -> KOENNEN
At the start of a capitalized word, the all-uppercase substitution would also work fine:
ÖFFENTLICH -> OEffentlich
ÜBERGANG -> UEbergang
Therefore, I would propose changing the table located in wp-includes/formatting.php:1941 (https://github.com/WordPress/WordPress/blob/master/wp-includes/formatting.php#L1941) to the following:
if ( 'de_DE' == $locale || 'de_DE_formal' == $locale || 'de_CH' == $locale || 'de_CH_informal' == $locale ) {
    $chars['Ä'] = 'AE';
    $chars['ä'] = 'ae';
    $chars['Ö'] = 'OE';
    $chars['ö'] = 'oe';
    $chars['Ü'] = 'UE';
    $chars['ü'] = 'ue';
    $chars['ß'] = 'ss';
}
Though, to be entirely correct, the surrounding characters would have to be checked, which would be difficult, given the current architecture.
There is even a capital ß (ẞ) now, which would be substituted with SS.
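To make the effect of the proposed table easier to judge, here is a minimal stand-alone sketch that applies it with plain strtr(). The function name proposed_german_substitutions() is hypothetical and not part of WordPress; the 'ẞ' => 'SS' entry follows the remark about the capital ß and is an assumption, not something in the current table.

```php
<?php
// Hypothetical stand-alone helper, not part of WordPress.
// Applies the substitution table proposed in this ticket.
function proposed_german_substitutions( string $text ): string {
    $chars = array(
        'Ä' => 'AE',
        'ä' => 'ae',
        'Ö' => 'OE',
        'ö' => 'oe',
        'Ü' => 'UE',
        'ü' => 'ue',
        'ß' => 'ss',
        'ẞ' => 'SS', // Assumption: capital sharp s mapped as suggested above.
    );

    // strtr() with an array replaces each key once and never rescans its output.
    return strtr( $text, $chars );
}

echo proposed_german_substitutions( 'FRÖHLICH' ), "\n"; // FROEHLICH
echo proposed_german_substitutions( 'KÖNNEN' ), "\n";   // KOENNEN
echo proposed_german_substitutions( 'Übergang' ), "\n"; // UEbergang
```

The last call shows the trade-off: without looking at the neighbouring letters, a mixed-case word also gets the all-uppercase replacement.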
I am happy to hear any second opinions on this.
Best regards,
Benedikt
Change History (3)
#3 in reply to: ↑ 2, 5 years ago
Replying to tobifjellner:
Perhaps de_AT should also be included? @pputzer ?
Yes. de_AT needs to be included as well.
Regarding the topic of the ticket, is this a problem in practice? Where are those transformation rules used that don't also convert to lower case (i.e., for slugs)?
Hi there, welcome to WordPress Trac! Thanks for the ticket.
Just noting this was originally introduced in [23361] / #3782, and extended to other locales in [33027] and [37698].
There was an argument against that in comment:14:ticket:3782, hence the current list.
I guess we'll have to find a way to check the surrounding characters to match the case correctly.
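One possible way to check the surrounding characters, as a rough sketch only: treat an uppercase umlaut as part of an all-caps run when the next character is an uppercase letter, or when it ends a word and the previous character is an uppercase letter. The function name german_umlauts_by_context() and this rule are assumptions for illustration and do not reflect how remove_accents() is structured; the 'ẞ' => 'SS' entry is likewise assumed.

```php
<?php
// Hypothetical helper, not part of WordPress: chooses 'OE' or 'Oe' for an
// uppercase umlaut depending on the letters around it.
function german_umlauts_by_context( string $text ): string {
    $base = array( 'Ä' => 'A', 'Ö' => 'O', 'Ü' => 'U' );

    // Step 1: uppercase umlauts inside an all-caps run become 'AE', 'OE', 'UE'.
    // First alternative: the next character is an uppercase letter.
    // Second alternative: the umlaut ends a word and follows an uppercase letter.
    $replaced = preg_replace_callback(
        '/[ÄÖÜ](?=\p{Lu})|(?<=\p{Lu})[ÄÖÜ](?!\p{L})/u',
        static function ( array $m ) use ( $base ): string {
            return $base[ $m[0] ] . 'E';
        },
        $text
    );

    // preg_replace_callback() returns null on failure (e.g. invalid UTF-8);
    // fall back to the untouched input in that case.
    $text = $replaced ?? $text;

    // Step 2: everything else keeps the current mixed-case behaviour.
    return strtr(
        $text,
        array(
            'Ä' => 'Ae',
            'ä' => 'ae',
            'Ö' => 'Oe',
            'ö' => 'oe',
            'Ü' => 'Ue',
            'ü' => 'ue',
            'ß' => 'ss',
            'ẞ' => 'SS', // Assumption: capital sharp s.
        )
    );
}

echo german_umlauts_by_context( 'FRÖHLICH' ), "\n"; // FROEHLICH
echo german_umlauts_by_context( 'Übergang' ), "\n"; // Uebergang
```

A single uppercase umlaut standing alone (for example as an initial) still falls through to the mixed-case form under this rule, so the heuristic would need review by native speakers before being considered for core.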