WordPress.org

Make WordPress Core

#28197 closed enhancement (maybelater)

Fallback Languages

Reported by: downstairsdev
Owned by:
Milestone:
Priority: normal
Severity: normal
Version: 4.0
Component: I18N
Keywords:
Focuses:
Cc:

Description

We should do a better job of loading translation files in the user's language if they are available.

For instance, if a Spanish speaker has their locale set to es_MX (Spanish, Mexico), it would be preferable to load any available Spanish translation files (es_ES, es_CO, etc.) before falling back to the default (generally English).

I wrote up a quick patch, tester plugin, and plugin that demonstrate this idea. If a $mofile is not available in the user's current locale, it will search for and return the first available translation that is also in the same language.
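
As a rough illustration of the search described above (this is not the attached patch), here is a minimal Python sketch. The function name and the "<textdomain>-<locale>.mo" file naming are assumptions made for the example.

```python
import os
from typing import Optional


def find_same_language_mofile(mofile: str) -> Optional[str]:
    """Return mofile if it exists; otherwise the first .mo file in the same
    directory whose locale shares the language code; otherwise None."""
    if os.path.isfile(mofile):
        return mofile

    directory, filename = os.path.split(mofile)
    directory = directory or "."
    if not filename.endswith(".mo") or not os.path.isdir(directory):
        return None

    # Assumed naming: "<textdomain>-<locale>.mo", or just "<locale>.mo".
    prefix, sep, locale = filename[:-3].rpartition("-")
    language = locale.split("_", 1)[0]  # "es_MX" -> "es"
    wanted = f"{prefix}{sep}{language}_"

    # Sorted so that "first available" is deterministic (alphabetical).
    for candidate in sorted(os.listdir(directory)):
        if candidate.startswith(wanted) and candidate.endswith(".mo"):
            return os.path.join(directory, candidate)
    return None
```

Note that the directory listing happens on every miss. That is the kind of cost a cached or precomputed list of candidates would avoid.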

A better option might be to create a filterable stack rank of locales for WordPress to search for within the language before returning the default. Other suggestions are also welcome.

This idea was also discussed in an "ideas" thread:
http://wordpress.org/ideas/topic/fallback-to-generic-language-file-when-specific-locale-file-absent

Attachments (4)

fallback-languages.diff (1.6 KB) - added by downstairsdev 15 months ago.
WordPress Patch
fallback-language-tester.zip (5.8 KB) - added by downstairsdev 15 months ago.
Tester plugin.
fallback-languages-2.diff (1.3 KB) - added by downstairsdev 15 months ago.
Fallback languages (generic) patch.
fallback-language-tester-2.zip (7.1 KB) - added by downstairsdev 15 months ago.
Plugin to test generic fallback languages.

Change History (21)

@downstairsdev 15 months ago

WordPress Patch

@downstairsdev 15 months ago

Tester plugin.

comment:1 @downstairsdev 15 months ago

The tester plugin displays two strings in the dashboard next to the "screen options" tab. The first one is the locale for the site, e.g. "es_MX". The second one displays the translation for the string "default". So, if no translation file is loaded, "default" is displayed.

There are two bundled translation files, es_ES and es_MX. If those files are loaded, the translation of "default" is displayed, which I've just set to the locale of each file ('es_ES' and 'es_MX' respectively).

To test, set a Spanish locale other than one of the two translations (e.g. define('WPLANG', 'es_CO'); ). If everything is working correctly, the first string will return the locale ( 'es_CO' ), and the second string will display the translation for the actual translation file that got loaded. Since 'es_CO' is not available, it will be one of the other Spanish translations ( 'es_ES' ).

Hopefully this makes sense!

--

The filter also allows us to do this through a plugin. Suggestions and pull requests welcome there as well:
https://github.com/devinsays/fallback-languages

comment:2 follow-up: ↓ 9 @nacin 15 months ago

This is a cool idea, though it's going to be really slow as written. We have some plans to do this on the side of language packs (and WordPress.org), versus at the system level.

This also doesn't necessarily work linguistically for every language out there. A proper mapping of language codes that can be fallen back to could help a bit with both this and performance.
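
To illustrate what such a mapping could look like, here is a minimal Python sketch. The table entries are purely hypothetical placeholders, not a recommended or agreed ordering, and the names are invented for the example.

```python
# Hypothetical fallback table: each locale maps to an ordered list of
# locales to try next. The entries are illustrative only.
FALLBACKS = {
    "es_MX": ["es_ES", "es_CO"],
    "es_CO": ["es_ES", "es_MX"],
}


def locale_chain(locale, fallbacks=FALLBACKS):
    """Return the locales to try, most preferred first, without duplicates."""
    chain = [locale]
    for candidate in fallbacks.get(locale, []):
        if candidate not in chain:
            chain.append(candidate)
    return chain
```

With a fixed table, a lookup costs at most one file check per entry in the chain instead of a directory scan, and languages with no sensible fallback simply have no entry.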

comment:3 @toscho 15 months ago

I like the idea. I do the same in Multilingual Press for language negotiation.

substr( get_locale(), 0, 2) will fail. For many languages the code has three letters, because it is taken from ISO 639-2. Just use strtok( get_locale(), '_' ).
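
The difference is easy to see with a three-letter code. The Python sketch below mirrors the two PHP calls only approximately (slicing for substr, splitting for strtok); the function names are made up for the example.

```python
def language_by_prefix(locale):
    # Rough equivalent of PHP substr( $locale, 0, 2 ): always two characters.
    return locale[:2]


def language_by_separator(locale):
    # Rough equivalent of PHP strtok( $locale, '_' ): everything before the
    # first underscore, or the whole string if there is none.
    return locale.split("_", 1)[0]
```

For a locale string such as "haw_US", the first function returns "ha" and the second returns "haw". For "es_MX" both return "es".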

comment:4 @cvedovini 15 months ago

Hi, nice work.

However, my initial proposition was to fall back to a generic language file: if fr_FR is not present, then fall back to fr, not fr_CA or any other random variation that happens to be there.

This is much simpler to code, but it would also require some work at the language pack level in order to define a "default" language (though I can see where that could become a political issue).

@downstairsdev 15 months ago

Fallback languages (generic) patch.

@downstairsdev 15 months ago

Plugin to test generic fallback languages.

comment:5 follow-up: ↓ 10 @downstairsdev 15 months ago

I attached an updated patch that implements @cvedovini's idea. I was trying to work within the existing framework for translations, but a generic fallback could work better.

1) Translation teams from different locales could focus on the generic fallback version first, and then quickly copy it to make country-specific localizations. Hopefully it would not get political, as any translator who disagreed with a translation in the default could just change it for their specific locale.

2) This avoids having to build and maintain an ordered list of fallbacks for each language.

Instead of defaulting to "generic", we could simply default to the locale with the largest population, e.g. English > United States, Spanish > Mexico, Portuguese > Brazil. But then you might get a Spanish (es_ES) translation for an es_MX localization simply because it would be used more. So for this reason, a generic fallback would probably be better.

Thanks for the coding tip @toscho. I am using that instead for this updated patch.

Nacin, could you outline where the performance issues are? (I don't have much experience in that area). Would it be helpful to cache translation file paths in a transient?

Last edited 15 months ago by downstairsdev

comment:6 @grapplerulrich 15 months ago

I am not sure a generic language fallback is best. I think you are best asking on the Polyglots make blog.

Who defines what the generic form is? Take the spelling of colour/color in English. How can you convince both parties which is the more generic form? If we say for the sake of argument that the en_US form is more generic, then we could just have en_UK fall back to en_US.
(I know there is no en_US translation file. This is just an example that most should be able to understand.)

I was told by @jenialaszlo at WordCamp Switzerland that Ukrainian could use a fallback to Russian. You would need to take that into account also.

Introducing a generic file creates extra work for core translators, and also for theme/plugin translators and developers, who would need to create a generic file in order to use this new feature.

comment:7 @cvedovini 15 months ago

@grapplerulrich There's no additional work required from the translators; on the contrary. First, it's a totally optional solution: the "en_US" (or "en_UK") files don't have to be renamed to "en" (and even then, that's not much work). Second, it simplifies work for plugin and theme developers who want to provide localised versions.

I can see some theoretical or political reasons why locales should be kept separate, but from a purely practical standpoint, me having to provide 4+ versions of Spanish files where one would be enough to make most people happy is just ludicrous, and it actually prevents them from using my plugins in their language.

Contrary to what I said earlier, there's also no actual requirement for WP to decide on a default locale per language; it would just be better. And it can also be done on a per-language basis. I can imagine people agreeing on en_UK being the default English, and if Ukrainian is an issue then leave things separate. (Though I should note that the idea is to fall back on a default locale, not another language, as in your Ukrainian/Russian example.)

Last edited 15 months ago by cvedovini

comment:8 @vanillalounge 15 months ago

Answered on the Polyglots' P2. tl;dr: either no fallback at all, or else let the user choose from *all* available languages.

comment:9 in reply to: ↑ 2 @vanillalounge 15 months ago

Replying to nacin:

A proper mapping of language codes that can be fallen back to could help a bit with both this and performance.

This might prove a task of epic proportions. Ethnologue has been trying, for years, and it's still not completely accurate, despite the thousands of entries. I understand the engineering drive to normalize this stuff, but suspect that there are simply too many shades and nuances (not to mention the fact that it's constantly changing) for a usable, discrete classification. I'm not even sure it's desirable.

Last edited 15 months ago by vanillalounge

comment:10 in reply to: ↑ 5 @vanillalounge 15 months ago

Replying to downstairsdev:

1) Translation teams from different locales could focus on the generic fallback version first, and then quickly copy it to make country-specific localizations. Hopefully it would not get political, as any translator who disagreed with a translation in the default could just change it for their specific locale.

The problem with this is that there is no such thing as a "generic" version everyone would agree on, for any language, even if vaguely, and you'd end up with a total of 0 volunteers to translate that version. And of course, it would immediately get political, before anything else.

comment:11 @grapplerulrich 15 months ago

I can see some theoretical or political reasons why locales should be kept separate, but from a purely practical standpoint, me having to provide 4+ versions of Spanish files where one would be enough to make most people happy is just ludicrous, and it actually prevents them from using my plugins in their language.

You could just as easily rename the Spanish files instead of having the rest of the users create an "es" file.

@vanillalounge mentioned that this should be optional. We could always allow users to define the fallback language when defining the language that they need, e.g. define ( 'WPLANG', 'es_ES', 'es_MX' ). Most people need to make the change in wp-config.php anyway, so they may just as well define the fallback languages there. We can always provide a few examples in the codex that people can copy.

comment:12 follow-up: ↓ 13 @cvedovini 15 months ago

There's a misunderstanding, I think. First, my initial proposition is not to consider a "generic" language file but eventually to consider one of the locales to be the default for one given language, like es_ES would be the default if an es_MX file is not present. You do that by providing an es file (renaming the es_ES.mo file to es.mo).

Second, nobody is being asked to change what they do (including WP translators) if they don't want to. But including that fallback mechanism would allow me (and others) to provide a unique language file for all locales of that language without the user doing anything (they can still define es_MX and get the MX locale when available), and without me having to copy the same file over under different names (like es_ES, es_MX, es_CL, es_PE, es_VE, etc.) or add the code I described here: http://vedovini.net/2013/12/smart-fallback-mechanism-for-loading-text-domains-in-wordpress/

comment:13 in reply to: ↑ 12 ; follow-up: ↓ 15 @vanillalounge 15 months ago

Replying to cvedovini:

...but eventually to consider one of the locales to be the default for one given language, like es_ES would be the default if an es_MX file is not present.

I'm sorry, but this is a recipe for disaster. Who gets to decide which one is the default? The es_ES translators? The es_MX ones? Whichever option you choose will inevitably raise a (probably nasty) discussion. Who will moderate that discussion (keeping in mind that it would be someone who a) is qualified for that language and b) is inevitably more familiar with one of the declinations)?

...including that fallback mechanism would allow me (and others) to provide a unique language file for all locales of that language...

Again, how will you provide that? Whose translation will it be? Will you be the one deciding which es_* is the generic one?

Look, I don't mean to sound irascible and unwilling to compromise, but you are approaching this from an angle that's completely irrelevant to translators or native speakers of another language. I may be wrong, but this ticket seems to aim to solve a problem of the developers, not of the translators or native speakers. We couldn't care less about fallback languages; from our point of view, meaning no offense, it's a solution looking for a problem to solve.

I would much rather see these efforts directed towards a procedure where a translator can easily and readily supply a current translation to a developer (instead of "oh, I'll include it in the next release", which is flawed), that we can trust that the code is properly gettexted and that .pot files are current with said code. Sadly, in 90% of the code I look at, this is most definitely not the case.

Also see http://make.wordpress.org/polyglots/2014/05/12/hi-folks-we-started-a-conversation-about-fallback/#comment-251797

comment:14 follow-up: ↓ 16 @downstairsdev 15 months ago

@cvedovini I generally agree with the criticisms of a generic language fallback. If the intention is to fall back to a specific locale (es_ES > es_MX for example), there is no need to create an additional language file or have translators change their workflow. WordPress would just load the fallback locale automatically.

It also appears to be too politically and culturally sensitive to stack rank fallback locales based on native speaker population sizes (or other more arbitrary measures).

@grapplerulrich If the user needs to define the fallback rank order in wp-config.php, I'm not sure it's worth doing. The idea is that non-technical users and non-English speakers would have a better experience by providing a fallback automatically. A plugin would likely be easier for non-technical users to install and configure than editing code.

@vanillalounge This ticket is not aimed at solving a problem for translators or developers; it's looking to provide a better experience for non-English-speaking users.

That said, I think some exciting changes are on the way in terms of language packs and better tools for translators. I agree this is super important and would resolve some of the pain points in terms of getting plugin/theme authors to release translations and keep them up to date.

-

I appreciate all the discussion! I'm leaning towards closing the core ticket and releasing this idea as a plugin instead.

comment:15 in reply to: ↑ 13 @cvedovini 15 months ago

Replying to vanillalounge:

Replying to cvedovini:

...but eventually to consider one of the locales to be the default for one given language, like es_ES would be the default if an es_MX file is not present.

I'm sorry, but this is a recipe for disaster. Who gets to decide which one is the default? The es_ES translators? The es_MX ones? Whichever option you choose will inevitably raise a (probably nasty) discussion. Who will moderate that discussion (keeping in mind that it would be someone who a) is qualified for that language and b) is inevitably more familiar with one of the declinations)?

First I meant "may be" not "eventually" (typical French speaker error), sorry for that. Now, like I said several times before:

  1. I know it's going to be a political issue (despite the fact it's detrimental to the end user experience and that other organisations actually manage to do it)
  2. The core translators don't have to do anything if they don't want to.

...including that fallback mechanism would allow me (and others) to provide a unique language file for all locales of that language...

Again, how will you provide that? Whose translation will it be? Will you be the one deciding which es_* is the generic one?

I am the one who provides the Spanish translations for my plugins, and I am using translations done by an American guy who happens to speak Spanish. I don't know which locale this is and I don't care. That's the point: I am practical, and those translations are good enough for the Spanish speakers who use my plugins.

Look, I don't mean to sound irascible and unwilling to compromise, but you are approaching this from an angle that's completely irrelevant to translators or native speakers of another language. I may be wrong, but this ticket seems to aim to solve a problem of the developers, not of the translators or native speakers. We couldn't care less about fallback languages; from our point of view, meaning no offense, it's a solution looking for a problem to solve.

It may be irrelevant to a translator, but not to a Chilean who will have the core translated in his own locale of Spanish but none of the plugins or themes he is using, despite some of them having an es_ES or es_MX file. If you let plugin and theme developers provide an es file (instead of forcing them to choose a locale or provide all of them), then that Chilean user may have more of his WordPress in his language (or close enough).

And you are right about one thing: that solution does not aim at solving any translator's problem. But you're blind if you can't see the problem (I just explained it again in the previous paragraph).

I would much rather see these efforts directed towards a procedure where a translator can easily and readily supply a current translation to a developer (instead of "oh, I'll include it in the next release", which is flawed), that we can trust that the code is properly gettexted and that .pot files are current with said code. Sadly, in 90% of the code I look at, this is most definitely not the case.

I don't think this is relevant here. My plugins are "properly gettexted" and I don't see translators rushing to offer their services. I have to chase after friends who have a command of another language or accept translations from plugin users, and either is hard enough to keep up to date: I end up with a Mexican volunteer who creates the first Spanish version, an Argentinian who maintains that file for the next version, and a Colombian for the one after that.

comment:16 in reply to: ↑ 14 @cvedovini 15 months ago

Replying to downstairsdev:

@cvedovini I generally agree with the criticisms of a generic language fallback. If the intention is to fall back to a specific locale (es_ES > es_MX for example), there is no need to create an additional language file or have translators change their workflow. WordPress would just load the fallback locale automatically.

That solution (falling back to es.mo when es_MX.mo is not there) is simple and only requires that the chosen default locale file (say es_ES.mo) be provided without a country code (es.mo). That's how it works in Java, for example, and I guess on many other platforms, and I don't see an uprising. It doesn't require complex calculations or configuration; it's straightforward.
Now, choosing a default locale is the responsibility of those who provide the component. I get to choose what the default Spanish is in my plugins or themes (more like I take what's given to me, actually), and if the WordPress guys don't want to choose, then they don't provide a default; your problem.
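
For illustration, the lookup order described here (the exact locale first, then the bare language file) could be sketched in Python as follows. The function names and the "<textdomain>-<locale>.mo" naming are assumptions for the example, not existing WordPress code.

```python
import os


def fallback_candidates(mofile):
    """List the paths to try, most specific first,
    e.g. demo-es_MX.mo and then demo-es.mo."""
    directory, filename = os.path.split(mofile)
    stem, ext = os.path.splitext(filename)
    # Assumed naming: "<textdomain>-<locale>.mo", or just "<locale>.mo".
    prefix, sep, locale = stem.rpartition("-")
    language = locale.split("_", 1)[0]

    candidates = [mofile]
    if language != locale:  # only add the generic file if a region was given
        candidates.append(os.path.join(directory, f"{prefix}{sep}{language}{ext}"))
    return candidates


def resolve_mofile(mofile, exists=os.path.isfile):
    """Return the first candidate that exists, or None."""
    for candidate in fallback_candidates(mofile):
        if exists(candidate):
            return candidate
    return None
```

The chain has at most two entries and needs no directory scan or per-language table. Which translation ends up in the generic file is left to whoever ships the component.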

@vanillalounge This ticket is not aimed at solving a problem for translators or developers; it's looking to provide a better experience for non-English-speaking users.

Exactly. If I wanted to solve the problem for myself, I would tell my users to translate the plugin themselves and be done with it.

I appreciate all the discussion! I'm leaning towards closing the core ticket and releasing this idea as a plugin instead.

I thought about releasing a plugin already, but that would be useless. To have my plugins in the right language, my users would need to install another plugin. I am better off coding the hack into each and every one of them, like I do now.

Last edited 15 months ago by cvedovini

comment:17 @nacin 14 months ago

  • Milestone Awaiting Review deleted
  • Resolution set to maybelater
  • Status changed from new to closed

For the moment, this sounds like a good plugin.