Help talk:I18n

From ArchWiki

Regarding ArchWiki internationalization

There have been a number of discussions about this over the years: 2006, 2007, 2009, and 2010. In short, there are a number of potential solutions; none are perfect. Currently, the interwiki implementation is considered "best" because it provides non-English users with a fully-localized experience and isolates each language. Other "good" solutions include the creation of language-specific namespaces or migration to a different wiki which provides "better" internationalization options -- but require more effort to implement. -- pointone 16:07, 21 October 2011 (EDT)

See also #Language namespace(s) in place of suffixes? for a more recent discussion. -- Kynikos (talk) 16:06, 3 June 2012 (UTC)

MediaWiki translation extension

Speaking of multi-language support for MediaWiki: it does have an extension to support translation. See: http://translatewiki.net/wiki/Main_Page. Here are the forum proposal [1] and bug FS#26638. As a user of KDE userbase and techbase, I think this extension is exactly what the Arch wiki needs. But again, there is a lack of manpower to do it.

Exactly, the time's not ripe for talking about this. For now, please let's use the suffix method as consistently as possible: if another method is enforced one day, it will be much easier to handle at least some parts of the transition automatically with bots or other scripts. -- Kynikos 06:01, 30 March 2012 (EDT)

Language namespace(s) in place of suffixes?

This discussion is about the possibility of replacing the current system of classifying articles by language, which uses suffixes in the title, with a namespace-based system. This issue currently has a lower priority than #"Dummy" interlanguage links and deprecation of Template:i18n.

The main advantage would be that it would be possible to have only English articles as results when using the search engine, and, depending on the implementation of this idea, it may be possible also to select the language of the search.

Another advantage would be that in article-list pages (such as those in Special:SpecialPages) that list articles alphabetically, all the articles for a language would be grouped together.

There are many ways we can implement this solution, and each has its advantages and disadvantages; I'd like to also keep the current suffix solution in the discussion, for comparison and also because it has its advantages too.

1) Every language has its own namespace

  • This can be done either with local or English language names. Note that it's not possible to have namespaces named like interlanguage links! For example an article named Ru:Some Title could currently be created, but once the ru interlanguage links are activated, that article won't be accessible anymore, and it will be possible to edit/delete it only via the API using its ID (this has already happened with an article that was named with "tr:...").
  • This solution would create 2*N namespaces (where N is the number of languages) because every namespace must have a _talk namespace; I don't know what effect this would have on select menus and other interfaces that list the namespaces (e.g. in special pages filters).
  • Examples:
Dansk:Some Article, Dansk talk:Some Article, Magyar:Some Article, ...
Danish:Some Article, Danish talk:Some Article, Hungarian:Some Article, ...

2) There's one big namespace for non-English languages

  • There are various possible choices for the name of the namespace: "Lang", "Local", "i18n", ???...
  • The language can be separated from the title with a colon, a slash or some other punctuation mark
  • We could use language tags or full language names
  • Language names could still be suffixes or be part of the prefix
  • This solution just adds 1 namespace and its associated talk
  • Examples:
Lang:pl/Some Article, Lang talk:pl/Some Article, Lang:zh-CN/Some Article, ...
Local:Some Article (Polski), Local talk:Some Article (Polski), Local:Some Article (简体中文), ...

3) Some languages can have a proper namespace according to some objective rules based on the number of translations


Note that the namespace solution wouldn't be able to separate the languages completely, in fact we'd have to keep mixed Template and Category namespaces: how would we deal with those cases? Case 2) may have the simplest solution by using Template:es/Lorem Ipsum and Category:es/Lorem Ipsum or something like that, and we'd still have the advantage of having templates and categories grouped by language in alphabetical lists. About the Help and ArchWiki namespaces we could do something similar.

Note that solution 2) would break the use of Template:Lowercase title in non-English articles. The only way to solve that problem would be using an extension that can parse substrings, or force using {{DISPLAYTITLE:...}}.

The bot algorithm to implement such a change should avoid creating redirects for every title, and instead it should update all the backlinks of every article (Wiki Monkey should be able to do that, it has already done a similar thing when removing the English suffix from category names, although in this case it would be a much bigger job).
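As a rough illustration of the backlink update described above, here is a minimal Python sketch. It is not Wiki Monkey code: the `LANGUAGE_TAGS` table, the `Lang:TAG/Title` target format and the function names are assumptions made for the example, and the link pattern only covers plain `[[Title]]`, `[[Title|label]]` and `[[Title#Section]]` links.

```python
import re

# Hypothetical mapping from today's language suffixes to language tags.
LANGUAGE_TAGS = {"Español": "ES", "Polski": "PL", "简体中文": "ZH-CN"}

# "Some Article (Polski)" -> title "Some Article", language "Polski".
SUFFIX_RE = re.compile(r"^(?P<title>.+) \((?P<language>[^()]+)\)$")

# Internal links: target, then an optional "#section" and/or "|label" part.
LINK_RE = re.compile(r"\[\[(?P<target>[^\]|#]+)(?P<rest>[^\]]*)\]\]")


def new_title(old_title, namespace="Lang"):
    """Return the namespaced title for a suffixed title, else the title as-is."""
    if ":" in old_title:
        # Simplification: templates, categories etc. are handled separately.
        return old_title
    match = SUFFIX_RE.match(old_title)
    if match is None or match.group("language") not in LANGUAGE_TAGS:
        return old_title
    tag = LANGUAGE_TAGS[match.group("language")]
    return f"{namespace}:{tag}/{match.group('title')}"


def update_backlinks(wikitext):
    """Rewrite link targets in place, so that no redirects are needed."""
    def replace(match):
        return "[[" + new_title(match.group("target")) + match.group("rest") + "]]"
    return LINK_RE.sub(replace, wikitext)
```

Titles with localized subpages (e.g. a suffix followed by `/Subpage`) are left untouched by this sketch and would need their own rule.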


I think this is enough for now. As you can see, it's quite intricate, and I don't even have a clear idea of what my preference is at the moment; let's see if somebody can help sort out the ideas.

-- Kynikos (talk) 20:48, 2 June 2012 (UTC)

I like (2) -- a single non-English namespace. I had never even considered this option before! This will solve the biggest problem with our current implementation -- non-English articles polluting search results and other special pages -- whilst still promoting external wikis with interlanguage links as the "ideal" solution.
(We must keep in mind that, in the end, separate external wikis is the only complete solution to provide non-English readers with a fully-internationalized interface (as long as we are running MediaWiki, that is). Everything else at this point is simply a stepping-stone toward that goal.)
Creating separate namespaces for each language would quickly complicate maintenance, as you note, and add little benefit over the single-namespace solution. -- pointone (talk) 14:27, 3 June 2012 (UTC)
Yeah I too tend to prefer solution (2), especially in the form of Lang:pt/Lorem Ipsum because that would group articles by language in alphabetical lists.
I'd use Template:pt/Lorem Ipsum and Category:pt/Lorem Ipsum, but Lang:pt/Help:Lorem Ipsum and Lang:pt/ArchWiki:Lorem Ipsum for special namespaces.
The bot should be able to convert {{Lowercase title}} to {{DISPLAYTITLE:...}} in existing articles, but when users copy an English article to translate it, they should remember to do that conversion themselves. Alternatives would be abolishing Template:Lowercase_title or using parser functions to detect the actual title (without the prefix).
-- Kynikos (talk) 16:16, 3 June 2012 (UTC)
Note that the format Lang:es/Title wouldn't be possible, only Lang:Es/Title or Lang:ES/Title would. -- Kynikos (talk) 14:42, 23 September 2012 (UTC)
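The restriction comes from MediaWiki capitalizing the first letter of the page name that follows the namespace. A minimal Python sketch of that normalization, assuming the default capitalization behaviour (`$wgCapitalLinks` enabled) and using a hypothetical helper name:

```python
def normalize_title(full_title):
    """Uppercase the first letter of the page name, after any namespace prefix."""
    namespace, separator, page = full_title.partition(":")
    if not separator:
        # No namespace prefix: the whole string is the page name.
        return full_title[:1].upper() + full_title[1:]
    return f"{namespace}:{page[:1].upper()}{page[1:]}"
```

With this, `normalize_title("Lang:es/Title")` gives `Lang:Es/Title`, which is why a lowercase tag cannot appear right after the namespace.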
Alternative formats (better isolate the title from the prefix, for readability when displayed in the h1 at the top of the page, especially with short titles): Lang:UK / KDE, Lang:UK KDE, Lang:UK - KDE, Lang:UK ~ KDE, Lang:(uk) KDE (parentheses should allow lowercase tags, note that square brackets would require html entities to be used in links), Lang:Українська KDE, Lang:Українська - KDE... -- Kynikos (talk) 16:11, 27 September 2012 (UTC)
Some considerations about restricting searches to a particular language:
  • both solutions (1) and (2) would give English-only results by default;
  • solution (1) would allow selecting the right language namespace in the advanced search interface;
  • solution (2) would require adding the name of the language to the search keywords (this is how it already works), but only if the full language names are retained in the titles (i.e. they aren't replaced by language tags, as in Title (Español) -> Lang:ES/Title)
-- Kynikos (talk) 04:08, 16 January 2013 (UTC)
Moving here the considerations about interlanguage links and Templates/Categories (interlanguage links cannot be used directly with Templates and Categories if language namespaces are implemented):
-- Kynikos (talk) 19:36, 15 June 2012 (UTC)
More random notes:
  • {{Lowercase title}} shouldn't be converted to {{DISPLAYTITLE:...}} if the language tags are at the start of the title (right after the namespace), in fact the real title could be directly converted to the correct case.
  • A Template:Local title or similar could be created and put on every translation to prettify the localized titles through mw:Help:Magic words#Page names (would require setting $wgRestrictDisplayTitle to false).
    • For example if the titles were of the form Lang:Zh-CN/Local title/Subpage the template could be {{DISPLAYTITLE:{{lcfirst:{{PAGENAME}}}}}} to display "zh-CN/Local title/Subpage".
    • If we wanted to keep the current format, the titles could be of the form Lang:Local title/Subpage (简体中文) or Lang:Local title (简体中文)/Subpage (see also Help talk:Style#Localized subpages), and the template could be {{DISPLAYTITLE:{{PAGENAME}}}} to display "Local title/Subpage (简体中文)" or "Local title (简体中文)/Subpage". This template would require explicitly making the first letter of the title lowercase when necessary, like in the English page.
    • Quite cool and flexible, if the titles were of the form Lang:Local title/Subpage/zh-CN we could play with the tag through {{SUBPAGENAME}} and put it wherever we want (and change it in the future); for example the template could be {{DISPLAYTITLE:{{BASEPAGENAME}} [{{SUBPAGENAME}}]}} to display "Local title/Subpage [zh-CN]", or {{DISPLAYTITLE:[{{SUBPAGENAME}}] {{BASEPAGENAME}}}} to display "[zh-CN] Local title/Subpage". This template would require explicitly making the first letter of the title lowercase when necessary, like in the English page. Also, there won't be the advantage of having the pages grouped by language in article lists.
Also note that {{DISPLAYTITLE}} properly affects the HTML page title, so that the pretty format is displayed in browser tab, window title... and web search engine results!
-- Kynikos (talk) 13:50, 5 July 2014 (UTC) (Last edit: 03:31, 27 June 2015 (UTC))
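To make the three DISPLAYTITLE variants above easier to compare, here is a small Python sketch that mimics the magic words involved. The helper functions are simplified stand-ins written for this example, not MediaWiki code, and they assume that subpages are enabled in the namespace so that the base and subpage parts are split on the last slash.

```python
def pagename(full_title):
    """Title without the namespace prefix, like {{PAGENAME}}."""
    return full_title.partition(":")[2] or full_title


def basepagename(full_title):
    """Everything before the last slash, like {{BASEPAGENAME}}."""
    page = pagename(full_title)
    return page.rpartition("/")[0] or page


def subpagename(full_title):
    """Everything after the last slash, like {{SUBPAGENAME}}."""
    return pagename(full_title).rpartition("/")[2]


def lcfirst(text):
    """Lowercase the first character, like {{lcfirst:...}}."""
    return text[:1].lower() + text[1:]


# Variant 1: {{DISPLAYTITLE:{{lcfirst:{{PAGENAME}}}}}}
prefixed = "Lang:Zh-CN/Local title/Subpage"
print(lcfirst(pagename(prefixed)))

# Variant 3: {{DISPLAYTITLE:{{BASEPAGENAME}} [{{SUBPAGENAME}}]}}
tagged = "Lang:Local title/Subpage/zh-CN"
print(f"{basepagename(tagged)} [{subpagename(tagged)}]")
```

The two `print` calls reproduce the "zh-CN/Local title/Subpage" and "Local title/Subpage [zh-CN]" results quoted in the list above.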

Back-end changes

This reform would require two changes to the back-end: creating the new namespace and updating the local interlanguage links. The latter is trivial, but we have to make a final decision on the new namespace name, so let's put it to vote in #Poll. Only one vote per user, you can change your vote as many times as you want before the poll is closed. You can add more options if you want. Please add comments here, not under the poll. — Kynikos (talk) 01:00, 14 March 2015 (UTC)

In the discussion above, most of the considerations about DISPLAYTITLE are false, unless $wgRestrictDisplayTitle is set to false. -- Lahwaacz (talk) 08:08, 14 March 2015 (UTC)
I'd go with the "L10n" prefix, as it makes clear that translated pages are adaptations of their respective English versions [2] and must be kept as true to the original as possible. Both "I18n" and "L10n" are also common in Arch packaging [3] [4]. If we go with L10n, we should likely rename this page as well.
That said, as we're the only ones likely to vote here, we should agree on a name instead of all voting on a different one. :P
-- Alad (talk) 11:46, 25 June 2015 (UTC)
I think what also matters is how readable localized titles would look on web engine search results, e.g. in https://www.google.com/search?q=archwiki+main+page , and this probably also depends on the language prefix/suffix format we choose to enforce. If we decide for the "NS:CODE/Title" format, probably "Lang" is indeed a better-looking solution; on the other hand, if we kept language suffixes, e.g. "NS:Title (Language)", "Lang" wouldn't introduce the language code anymore, and it wouldn't look as good, so in that case "L10n" may be the best option. For this reason I think we should open a poll for the language format first. — Kynikos (talk) 04:03, 27 June 2015 (UTC)
Regarding #Poll (syntax), what does the "Local title" part mean? Shouldn't it be just "Title", i.e. the title of the relevant English page as it is now? -- Lahwaacz (talk) 12:57, 28 June 2015 (UTC)
Sorry, I blindly copy-pasted the examples above, I've fixed the poll. Allowing localized titles would be something that I wanted to discuss later on. — Kynikos (talk) 03:39, 29 June 2015 (UTC)

MediaWiki has a new related feature since 1.24, see "Manually changing page language" in mw:Manual:Language#Page_content_language. It is not enabled by default, but since we don't use the Translate extension, it would come in handy for us. -- Lahwaacz (talk) 13:29, 28 June 2015 (UTC)

I've just tested it in my local testing wiki, but I don't see much use for that feature honestly... Or maybe I haven't understood what it should do? (Not much according to the "What does it define?" section in mw:Manual:Language#Page_content_language) Also, the language doesn't seem to be set automatically based on the title, but it should be set with periodic bot runs (with an ad-hoc script). — Kynikos (talk) 13:46, 30 June 2015 (UTC)
I thought that it would produce better search results when using a search engine like Google, but it appears that the language is already recognized correctly. Anyway, fixing the lang attribute of the html tag manually "just because" is probably not worth the effort. -- Lahwaacz (talk) 14:23, 30 June 2015 (UTC)
I'm not surprised that Google detects the language of a page without relying on html attributes, anyway I didn't make the connection between your observation and my concerns about search engine results, sorry for the misunderstanding :)
But does the lang attribute really change? I've just checked in my local wiki and it stays "en" no matter what language I set for my test page... O_o
Kynikos (talk) 11:01, 1 July 2015 (UTC)

Poll (namespace)

Lang

  • Lahwaacz (talk) 08:14, 14 March 2015 (UTC)
  • [vote here with ~~~~]

Int

  • [vote here with ~~~~]

Local

  • [vote here with ~~~~]

Loc

  • [vote here with ~~~~]

I18n

  • [vote here with ~~~~]

L10n

  • [vote here with ~~~~]

Poll (syntax)

Lang:ES/Title/Subpage

Lang:ZH-CN/Title/Subpage

  • Fengchao (talk) 12:45, 28 June 2015 (UTC)
  • [vote here with ~~~~]


L10n:Title/Subpage (Español)

L10n:Title/Subpage (简体中文)

  • [vote here with ~~~~]


L10n:Title/Subpage/Español

L10n:Title/Subpage/简体中文

[+ set $wgRestrictDisplayTitle to false and prettify with DISPLAYTITLE]

  • [vote here with ~~~~]

Status update

I am sorry if this is off-topic here. I would like to know whether any conclusion has been reached about changing the back-end of all I18n articles. More importantly, is translation work still encouraged here in ArchWiki? --NonerKao (talk) 13:16, 21 October 2015 (UTC)

  1. No, for the moment there's no change to internationalization rules, however any change would probably be carried out with bots, so you can keep translating normally. This also means that you're still in time to express your opinion, if you have something to add.
  2. Yes, translation work is fully supported: in the past the policy was to encourage the establishment of external wikis for non-English articles, but more recently this view seems to have changed, and for example I tend to support local translations because translators tend to bring additional improvements to English articles too, although an official stand on this issue has never been made. Note however that local translations have the disadvantage that many parts of the wiki interface aren't affected by the language preferences set by each user, so they will always be displayed in English.
Kynikos (talk) 04:05, 22 October 2015 (UTC)
OK, let me share some rough thoughts here. As a Traditional Chinese (TC below) user, I actually feel that the current I18n policy is quite good and acceptable, because some of the TC users have established some conventions and newcomers like me can easily follow them. I can maintain and translate as many pages as I want.
However, I also acknowledge that it might be a totally different story for administrators, for management purposes. So I am kind of neutral about future changes on this issue. Everything is fine now, but if the current state really bothers managers, maintainers and normal archers, then I would vote for change as well.
Maybe the long silence of this discussion indicates that this issue does not, in fact, bother many people. NonerKao (talk) 04:43, 22 October 2015 (UTC)
Having seen external i18n wiki translation sites come and go, I think current practice shows that we should encourage keeping a united Arch wiki for all translations. Running a wiki site takes a great effort, which most maintainers of external translation wikis underestimate. We should save our energy for actual content. --Fengchao (talk) 07:17, 24 November 2017 (UTC)

Incorrect metadata and RTL style issues

MediaWiki pages have a content language (under Page Information), which we currently do not use, i.e. it's English even for non-English pages. This means that the div containing article content (id mw-content-text) always claims dir="ltr" class="mw-content-ltr" lang="en". This is not just embarrassing but also screws up articles written in a right-to-left (RTL) language, case in point ar:Arch User Repository. Notice how the table of contents is left-aligned because #mw-content-text has the wrong CSS class, and how the Related articles box is floated right, which cannot be fixed in MediaWiki:Common.css for the same reason.

Unless we suddenly decide to revive the dead #Language namespace(s) in place of suffixes?, I propose that we write a bot to fix the page content languages.

--Larivact (talk) 14:46, 11 August 2018 (UTC)
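A minimal sketch of what such a bot could send, assuming a MediaWiki version that provides the `action=setpagelanguage` API module and a wiki where changing the page language is enabled. The function name and the default reason are made up for the example, and the code only builds the request parameters without contacting any wiki.

```python
def setpagelanguage_params(title, lang, csrf_token,
                           reason="Fix page content language"):
    """Build the POST parameters for one action=setpagelanguage request."""
    return {
        "action": "setpagelanguage",
        "title": title,
        "lang": lang,            # language code, e.g. "ar"
        "reason": reason,
        "token": csrf_token,     # CSRF token obtained from the wiki beforehand
        "format": "json",
    }


def plan_requests(pages, csrf_token):
    """Turn (title, language code) pairs into a list of request parameters."""
    return [setpagelanguage_params(title, lang, csrf_token)
            for title, lang in pages]
```

A bot would then POST each dictionary to the wiki's `api.php` as a logged-in user with the required permission.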

Right now we cannot manually change the page language, because it is not enabled with $wgPageLanguageUseDB = true, see mw:Manual:Language#Page_content_language.
However, there is also a PageContentLanguage hook so we could write a MediaWiki extension which parses the ArchWiki-specific localized titles properly. Besides the RTL issue, this would also allow us to fix the issue with the breadcrumb links (see Help_talk:Style#Localized_subpages) by parsing Title (Language)/Sub-page. The only remaining problem would be with the interlanguage links, but I think that would be possible with the help of some magic words like {{PAGELANGUAGE}} and {{#language:...}}.
-- Lahwaacz (talk) 07:36, 13 August 2018 (UTC)
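The title parsing such an extension would need can be sketched in Python as follows. This is only an illustration of the parsing rule, not extension code: the `LANGUAGES` table, its language codes and the function name are assumptions for the example, and a real hook would have to be written in PHP against the PageContentLanguage hook.

```python
import re

# Hypothetical table: title suffix -> (language code, text direction).
LANGUAGES = {
    "العربية": ("ar", "rtl"),
    "עברית": ("he", "rtl"),
    "Español": ("es", "ltr"),
    "简体中文": ("zh-hans", "ltr"),
}

# Matches "Title (Language)" and "Title (Language)/Sub-page".
TITLE_RE = re.compile(
    r"^(?P<base>.+?) \((?P<language>[^()/]+)\)(?P<subpage>/.*)?$"
)


def page_language(title):
    """Return (language code, direction) for a title, defaulting to English."""
    match = TITLE_RE.match(title)
    if match and match.group("language") in LANGUAGES:
        return LANGUAGES[match.group("language")]
    return ("en", "ltr")
```

Because the suffix is matched before an optional `/Sub-page` part, the same rule could also give the base title needed for the breadcrumb links.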