Help talk:I18n

From ArchWiki

Regarding ArchWiki internationalization

There have been a number of discussions about this over the years: 2006, 2007, 2009, and 2010. In short, there are a number of potential solutions; none are perfect. Currently, the interwiki implementation is considered "best" because it provides non-English users with a fully-localized experience and isolates each language. Other "good" solutions include the creation of language-specific namespaces or migration to a different wiki which provides "better" internationalization options, but these require more effort to implement. -- pointone 16:07, 21 October 2011 (EDT)

See also #Language namespace(s) in place of suffixes? for a more recent discussion. -- Kynikos (talk) 16:06, 3 June 2012 (UTC)

MediaWiki translation extension

Speaking of multi-language support for MediaWiki: it does have an extension to support translation. See: http://translatewiki.net/wiki/Main_Page. Here is a forum proposal [1] and bug FS#26638. As a user of KDE userbase and techbase, I think this extension is exactly what the Arch wiki needs. But again, we lack the manpower to do it.

Exactly, the time's not ripe for talking about this. For now, please let's use the suffix method as consistently as possible: if one day another method is enforced, it will be much easier to handle at least some parts of the transition automatically with bots or other scripts. -- Kynikos 06:01, 30 March 2012 (EDT)

Language namespace(s) in place of suffixes?

This discussion is about the possibility of replacing the current system of classifying articles by language, which uses suffixes in the title, with a namespace-based system. This issue currently has a lower priority than #"Dummy" interlanguage links and deprecation of Template:i18n.

The main advantage would be that it would be possible to have only English articles as results when using the search engine, and, depending on the implementation of this idea, it may be possible also to select the language of the search.

Another advantage would be that in article-list pages (such as those in Special:SpecialPages) that list articles alphabetically, all the articles for a language would be grouped together.

There are many ways we can implement this solution, and each has its advantages and disadvantages; I'd like to also keep the current suffix solution in the discussion, for comparison and also because it has its advantages too.

1) Every language has its own namespace

  • This can be done either with local or English language names. Note that it's not possible to have namespaces named like interlanguage links! For example an article named Ru:Some Title could currently be created, but once the ru interlanguage links are activated, that article won't be accessible anymore, and it will be possible to edit/delete it only via the API using its ID (this has already happened with an article that was named with "tr:...").
  • This solution would create 2*N namespaces (where N is the number of languages) because every namespace must have a _talk namespace; I don't know what effect this would have on select menus and other interfaces that list the namespaces (e.g. in special pages filters).
  • Examples:
Dansk:Some Article, Dansk talk:Some Article, Magyar:Some Article, ...
Danish:Some Article, Danish talk:Some Article, Hungarian:Some Article, ...

2) There's one big namespace for non-English languages

  • There are various possible choices for the name of the namespace: "Lang", "Local", "i18n", ???...
  • The language can be separated from the title with a colon, a slash or some other punctuation mark
  • We could use language tags or full language names
  • Language names could still be suffixes or be part of the prefix
  • This solution just adds 1 namespace and its associated talk
  • Examples:
Lang:pl/Some Article, Lang talk:pl/Some Article, Lang:zh-CN/Some Article, ...
Local:Some Article (Polski), Local talk:Some Article (Polski), Local:Some Article (简体中文), ...

3) Some languages can have a proper namespace according to some objective rules based on the number of translations


Note that the namespace solution wouldn't be able to separate the languages completely, in fact we'd have to keep mixed Template and Category namespaces: how would we deal with those cases? Case 2) may have the simplest solution by using Template:es/Lorem Ipsum and Category:es/Lorem Ipsum or something like that, and we'd still have the advantage of having templates and categories grouped by language in alphabetical lists. About the Help and ArchWiki namespaces we could do something similar.

Note that solution 2) would break the use of Template:Lowercase title in non-English articles. The only way to solve that problem would be using an extension that can parse substrings, or force using {{DISPLAYTITLE:...}}.

The bot algorithm to implement such a change should avoid creating redirects for every title, and instead it should update all the backlinks of every article (Wiki Monkey should be able to do that, it has already done a similar thing when removing the English suffix from category names, although in this case it would be a much bigger job).

I think this can be enough for now, as you can see it's quite intricate, I don't even have a clear idea about what's my preference at the moment, let's see if somebody can help sort out the ideas.

-- Kynikos (talk) 20:48, 2 June 2012 (UTC)

I like (2) -- a single non-English namespace. I had never even considered this option before! This will solve the biggest problem with our current implementation -- non-English articles polluting search results and other special pages -- whilst still promoting external wikis with interlanguage links as the "ideal" solution.
(We must keep in mind that, in the end, separate external wikis is the only complete solution to provide non-English readers with a fully-internationalized interface (as long as we are running MediaWiki, that is). Everything else at this point is simply a stepping-stone toward that goal.)
Creating separate namespaces for each language would quickly complicate maintenance, as you note, and add little benefit over the single-namespace solution. -- pointone (talk) 14:27, 3 June 2012 (UTC)
Yeah I too tend to prefer solution (2), especially in the form of Lang:pt/Lorem Ipsum because that would group articles by language in alphabetical lists.
I'd use Template:pt/Lorem Ipsum and Category:pt/Lorem Ipsum, but Lang:pt/Help:Lorem Ipsum and Lang:pt/ArchWiki:Lorem Ipsum for special namespaces.
The bot should be able to convert {{Lowercase title}} to {{DISPLAYTITLE:...}} in existing articles, but when a user copies an English article for translating it, he should remember to do that conversion by himself. Alternatives can be abolishing Template:Lowercase_title or using parser functions to detect the actual title (without the prefix).
-- Kynikos (talk) 16:16, 3 June 2012 (UTC)
Note that the format Lang:es/Title wouldn't be possible, only Lang:Es/Title or Lang:ES/Title would. -- Kynikos (talk) 14:42, 23 September 2012 (UTC)
Alternative formats (better isolate the title from the prefix, for readability when displayed in the h1 at the top of the page, especially with short titles): Lang:UK / KDE, Lang:UK KDE, Lang:UK - KDE, Lang:UK ~ KDE, Lang:(uk) KDE (parentheses should allow lowercase tags, note that square brackets would require html entities to be used in links), Lang:Українська KDE, Lang:Українська - KDE... -- Kynikos (talk) 16:11, 27 September 2012 (UTC)
Some considerations about restricting searches to a particular language:
  • both solutions (1) and (2) would give English-only results by default;
  • solution (1) would allow selecting the right language namespace in the advanced search interface;
  • solution (2) would require adding the name of the language to the search keywords (this is how it already works), but only if the full language names are retained in the titles (i.e. they aren't replaced by language tags like in Title (Español) -> Lang:ES/Title)
-- Kynikos (talk) 04:08, 16 January 2013 (UTC)
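To make the Title (Español) -> Lang:ES/Title conversion mentioned above more concrete, here is a minimal Python sketch of the renaming rule a bot could apply. The function name, the "Lang" default and the small language table are assumptions for illustration only, not an agreed mapping; Template and Category pages, which would keep their own namespace, and localized subpages are not handled.

```python
import re

# Hypothetical mapping from localized language names (the current title
# suffixes) to the tags that would follow the proposed "Lang:" prefix.
LANGUAGE_TAGS = {
    "Español": "ES",
    "Polski": "PL",
    "简体中文": "ZH-CN",
}

# "Some Title (Language)": lazy title, then a final parenthesized suffix.
SUFFIX_RE = re.compile(r"^(?P<title>.+?) \((?P<language>[^()]+)\)$")


def to_namespace_title(title, namespace="Lang"):
    """Convert 'Title (Language)' to 'Lang:TAG/Title'; leave other titles unchanged."""
    match = SUFFIX_RE.match(title)
    if match is None or match["language"] not in LANGUAGE_TAGS:
        # English article, or a suffix that is not a known language name.
        return title
    tag = LANGUAGE_TAGS[match["language"]]
    return f"{namespace}:{tag}/{match['title']}"
```

With this rule a title such as "Help:Style (Español)" would become "Lang:ES/Help:Style", which matches the form suggested earlier for the Help and ArchWiki namespaces.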
Moving here the considerations about interlanguage links and Templates/Categories (interlanguage links cannot be used directly with Templates and Categories if language namespaces are implemented):
-- Kynikos (talk) 19:36, 15 June 2012 (UTC)
More random notes:
  • {{Lowercase title}} shouldn't be converted to {{DISPLAYTITLE:...}} if the language tags are at the start of the title (right after the namespace), in fact the real title could be directly converted to the correct case.
  • A Template:Local title or similar could be created and put on every translation to prettify the localized titles through mw:Help:Magic words#Page names (would require setting $wgRestrictDisplayTitle to false).
    • For example if the titles were of the form Lang:Zh-CN/Local title/Subpage the template could be {{DISPLAYTITLE:{{lcfirst:{{PAGENAME}}}}}} to display "zh-CN/Local title/Subpage".
    • If we wanted to keep the current format, the titles could be of the form Lang:Local title/Subpage (简体中文) or Lang:Local title (简体中文)/Subpage (see also Help talk:Style#Localized subpages), and the template could be {{DISPLAYTITLE:{{PAGENAME}}}} to display "Local title/Subpage (简体中文)" or "Local title (简体中文)/Subpage". This template would require explicitly making the first letter of the title lowercase when necessary, like in the English page.
    • Quite cool and flexible, if the titles were of the form Lang:Local title/Subpage/zh-CN we could play with the tag through {{SUBPAGENAME}} and put it wherever we want (and change it in the future); for example the template could be {{DISPLAYTITLE:{{BASEPAGENAME}} [{{SUBPAGENAME}}]}} to display "Local title/Subpage [zh-CN]", or {{DISPLAYTITLE:[{{SUBPAGENAME}}] {{BASEPAGENAME}}}} to display "[zh-CN] Local title/Subpage". This template would require explicitly making the first letter of the title lowercase when necessary, like in the English page. Also, there won't be the advantage of having the pages grouped by language in article lists.
Also note that {{DISPLAYTITLE}} properly affects the HTML page title, so that the pretty format is displayed in browser tab, window title... and web search engine results!
-- Kynikos (talk) 13:50, 5 July 2014 (UTC) (Last edit: 03:31, 27 June 2015 (UTC))
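For readers who want to check what the templates above would display, the following Python sketch mimics the relevant magic words on a page name without its namespace. It is only a model of {{lcfirst:}}, {{BASEPAGENAME}} and {{SUBPAGENAME}}, assuming subpages are enabled in the namespace; the function names are made up for this example.

```python
def lcfirst(text):
    """Mimic the {{lcfirst:}} parser function: lowercase only the first character."""
    return text[:1].lower() + text[1:]


def split_subpage(pagename):
    """Mimic {{BASEPAGENAME}} and {{SUBPAGENAME}} for a page name without namespace."""
    base, sep, sub = pagename.rpartition("/")
    if not sep:
        # No slash: both magic words return the whole page name.
        return pagename, pagename
    return base, sub


def display_prefix_tag(pagename):
    """Model of {{DISPLAYTITLE:{{lcfirst:{{PAGENAME}}}}}} for tag-first titles."""
    return lcfirst(pagename)


def display_trailing_tag(pagename):
    """Model of {{DISPLAYTITLE:[{{SUBPAGENAME}}] {{BASEPAGENAME}}}} for tag-last titles."""
    base, tag = split_subpage(pagename)
    return f"[{tag}] {base}"
```

For example, display_prefix_tag("Zh-CN/Local title/Subpage") gives "zh-CN/Local title/Subpage", and display_trailing_tag("Local title/Subpage/zh-CN") gives "[zh-CN] Local title/Subpage".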

Back-end changes

This reform would require two changes to the back-end: creating the new namespace and updating the local interlanguage links. The latter is trivial, but we have to make a final decision on the new namespace name, so let's put it to vote in #Poll. Only one vote per user, you can change your vote as many times as you want before the poll is closed. You can add more options if you want. Please add comments here, not under the poll. — Kynikos (talk) 01:00, 14 March 2015 (UTC)

In the discussion above, most of the considerations about DISPLAYTITLE are false, unless $wgRestrictDisplayTitle is set to false. -- Lahwaacz (talk) 08:08, 14 March 2015 (UTC)
I'd go with the "L10n" prefix, as it makes clear that translated pages are adaptations of their respective English versions [2] and must be kept as true to the original as possible. Both "I18n" and "L10n" are also common in Arch packaging [3] [4]. If we go with L10n, we should likely rename this page as well.
That said, as we're the only ones likely to vote here, we should agree on a name instead of all voting on a different one. :P
-- Alad (talk) 11:46, 25 June 2015 (UTC)
I think what also matters is how readable localized titles would look on web engine search results, e.g. in https://www.google.com/search?q=archwiki+main+page , and this probably also depends on the language prefix/suffix format we choose to enforce. If we decide for the "NS:CODE/Title" format, probably "Lang" is indeed a better-looking solution; on the other hand, if we kept language suffixes, e.g. "NS:Title (Language)", "Lang" wouldn't introduce the language code anymore, and it wouldn't look as good, so in that case "L10n" may be the best option. For this reason I think we should open a poll for the language format first. — Kynikos (talk) 04:03, 27 June 2015 (UTC)
Regarding #Poll (syntax), what does the "Local title" part mean? Shouldn't it be just "Title", i.e. the title of the relevant English page as it is now? -- Lahwaacz (talk) 12:57, 28 June 2015 (UTC)
Sorry, I blindly copy-pasted the examples above, I've fixed the poll. Allowing localized titles would be something that I wanted to discuss later on. — Kynikos (talk) 03:39, 29 June 2015 (UTC)

MediaWiki has a new related feature since 1.24, see "Manually changing page language" in mw:Manual:Language#Page_content_language. It is not enabled by default, but since we don't use the Translate extension, it would come in handy for us. -- Lahwaacz (talk) 13:29, 28 June 2015 (UTC)

I've just tested it in my local testing wiki, but I don't see much use for that feature honestly... Or maybe I haven't understood what it should do? (Not much according to the "What does it define?" section in mw:Manual:Language#Page_content_language) Also, the language doesn't seem to be set automatically based on the title, but it should be set with periodic bot runs (with an ad-hoc script). — Kynikos (talk) 13:46, 30 June 2015 (UTC)
I thought that it would produce better search results when using a search engine like Google, but it appears that the language is already recognized correctly. Anyway, fixing the lang attribute of the html tag manually "just because" is probably not worth the effort. -- Lahwaacz (talk) 14:23, 30 June 2015 (UTC)
I'm not surprised that Google detects the language of a page without relying on html attributes, anyway I didn't make the connection between your observation and my concerns about search engine results, sorry for the misunderstanding :)
But does the lang attribute really change? I've just checked in my local wiki and it stays "en" no matter what language I set for my test page... O_o
Kynikos (talk) 11:01, 1 July 2015 (UTC)

Poll (namespace)

Lang

  • Lahwaacz (talk) 08:14, 14 March 2015 (UTC)
  • [vote here with ~~~~]

Int

  • [vote here with ~~~~]

Local

  • [vote here with ~~~~]

Loc

  • [vote here with ~~~~]

I18n

  • [vote here with ~~~~]

L10n

  • [vote here with ~~~~]

Poll (syntax)

Lang:ES/Title/Subpage

Lang:ZH-CN/Title/Subpage

  • Fengchao (talk) 12:45, 28 June 2015 (UTC)
  • [vote here with ~~~~]


L10n:Title/Subpage (Español)

L10n:Title/Subpage (简体中文)

  • [vote here with ~~~~]


L10n:Title/Subpage/Español

L10n:Title/Subpage/简体中文

[+ set $wgRestrictDisplayTitle to false and prettify with DISPLAYTITLE]

  • [vote here with ~~~~]

Status update

I am sorry if this is not related here. I would like to know whether any conclusion has been reached about changing the back-end of all I18n articles. More importantly, is translation work still encouraged here in ArchWiki? --NonerKao (talk) 13:16, 21 October 2015 (UTC)

  1. No, for the moment there's no change to internationalization rules, however any change would probably be carried out with bots, so you can keep translating normally. This also means that you're still in time to express your opinion, if you have something to add.
  2. Yes, translation work is fully supported: in the past the policy was to encourage the establishment of external wikis for non-English articles, but more recently this view seems to have changed, and for example I tend to support local translations because translators tend to bring additional improvements to English articles too, although an official stand on this issue has never been made. Note however that local translations have the disadvantage that many parts of the wiki interface aren't affected by the language preferences set by each user, so they will always be displayed in English.
Kynikos (talk) 04:05, 22 October 2015 (UTC)
OK, let me share some rough thoughts here. As a Traditional Chinese (TC below) user, I actually feel that the current I18n policy is quite good and acceptable, because some of the TC users have established conventions and newcomers like me can easily follow them. I can maintain and translate as many pages as I want.
However I also acknowledge that it might be a totally different story for administrators, for management purposes. So, I am kind of neutral about future changes on this issue. Everything is fine now, but if the current state really bothers managers, maintainers and normal archers, then I would vote for change as well.
Maybe the long silence of this discussion indicates that this issue does not, in fact, bother many people. NonerKao (talk) 04:43, 22 October 2015 (UTC)
Having seen external i18n wiki translation sites come and go, I think current practice shows that we should encourage keeping a united Arch wiki for all translations. Keeping a wiki site running needs a great effort, which is underestimated by most maintainers of external translation wikis. We should save our energy for actual content. --Fengchao (talk) 07:17, 24 November 2017 (UTC)

I think we can close this, unless someone wants to update #Regarding ArchWiki internationalization... — Lahwaacz (talk) 08:39, 5 June 2021 (UTC)

Incorrect metadata and RTL style issues

MediaWiki pages have a content language (under Page Information), which we currently do not use, i.e. it's English even for non-English pages. This means that the div containing article content (id mw-content-text) always claims dir="ltr" class="mw-content-ltr" lang="en". This is not just embarrassing but also screws up articles written in a right-to-left (RTL) language, case in point ar:Arch User Repository. Notice how the table of contents is left aligned because #mw-content-text has the wrong CSS class and the Related articles box is right floated, which cannot be fixed in MediaWiki:Common.css because of the same reason.

Unless we suddenly decide to revive the dead #Language namespace(s) in place of suffixes?, I propose that we write a bot to fix the page content languages.

--Larivact (talk) 14:46, 11 August 2018 (UTC)

Right now we cannot manually change the page language, because it is not enabled with $wgPageLanguageUseDB = true, see mw:Manual:Language#Page_content_language.
However, there is also a PageContentLanguage hook so we could write a MediaWiki extension which parses the ArchWiki-specific localized titles properly. Besides the RTL issue, this would also allow us to fix the issue with the breadcrumb links (see Help_talk:Style#Localized_subpages) by parsing Title (Language)/Sub-page. The only remaining problem would be with the interlanguage links, but I think that would be possible with the help of some magic words like {{PAGELANGUAGE}} and {{#language:...}}.
-- Lahwaacz (talk) 07:36, 13 August 2018 (UTC)
There is a pull request to enable $wgPageLanguageUseDB, then we will be able to change the content language manually or with a bot. — Lahwaacz (talk) 07:59, 5 June 2021 (UTC)
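As a rough illustration of the title parsing that such a hook or bot would need, here is a Python sketch that derives a language tag, a text direction and the English title from the "Title (Language)/Sub-page" form. It is not the MediaWiki hook itself; the language table, the set of RTL tags and the function name are assumptions for this example.

```python
import re

# Illustrative subset; the tags and the set of RTL languages are assumptions.
LANGUAGE_TAGS = {
    "العربية": "ar",
    "Español": "es",
    "Română": "ro",
    "正體中文": "zh-hant",
}
RTL_TAGS = {"ar", "he", "fa"}

# Matches "Title (Language)" optionally followed by "/Sub-page".
LOCALIZED_RE = re.compile(
    r"^(?P<title>.+?) \((?P<language>[^()/]+)\)(?P<subpage>/.*)?$"
)


def page_language(title):
    """Return (language tag, text direction, English title) for a page title."""
    match = LOCALIZED_RE.match(title)
    if match and match["language"] in LANGUAGE_TAGS:
        tag = LANGUAGE_TAGS[match["language"]]
        english = match["title"] + (match["subpage"] or "")
    else:
        # No recognized language suffix: treat the page as English.
        tag, english = "en", title
    direction = "rtl" if tag in RTL_TAGS else "ltr"
    return tag, direction, english
```

The returned English title ("Title/Sub-page" for "Title (Español)/Sub-page") is the piece that could also help with the breadcrumb links mentioned above.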

Add new language

Belarusian

I want to start translating the wiki into my language (Belarusian), but I didn't understand the checklist and the minimum number of translated articles. Where can I find more about it, and how can I start? -- JustPieceOfMeat (talk) 18:10, 4 January 2020 (UTC)

Hi, we started a checklist above in #Draft, I'm not sure if that's what you're referring to.
You've done point 1 here.
You need to do point 2, I guess using the "Беларуская" spelling.
We can then do point 3.
Meanwhile start translating articles (point 4), since we'll require at least those when considering Help:i18n#Adding local interlanguage links (point 6).
Let us know here when ready to do point 5.
-- Kynikos (talk) 04:22, 5 January 2020 (UTC)

Romanian

Hello there! I want to add translations in my language, Romanian. Any basic advice? I'm pretty new at this wiki editing thing Andy3153 (talk) 18:26, 13 May 2020 (UTC)

Hi, we started a draft procedure in #Draft, and this is step 1 done, congratulations :)
When creating translation pages, append "_(Română)" to the end of every title.
You can try step 2 by yourself, we'll do step 3, and definitely you have to do step 4.
Don't worry about steps 5+ for the moment.
If you're new to wiki editing, you can start from Help:Editing.
-- Kynikos (talk) 16:00, 14 May 2020 (UTC)
Oh, okay! How do I do step 2 though? And, how do I get started with step 4? Andy3153 (talk) 10:12, 18 May 2020 (UTC)
To perform step 2, for your language, create Category:Română.
For step 4, please check ArchWiki:Translation Team for help.
-- Blackteahamburger (talk) 10:20, 18 May 2020 (UTC)
Alright, thank you. I'll keep you updated if I am not able to do something Andy3153 (talk) 10:23, 18 May 2020 (UTC)
Hello! I just created the category and I translated the main page, and I wanted to ask if I've done everything right so far. Here it is : https://wiki.archlinux.org/index.php/Ro:ArchWiki:Main_page Andy3153 (talk) 13:48, 18 May 2020 (UTC)
The name of your translated main page is wrong. You should use Main page (Română). Since there is no Romanian interlanguage link yet, you should not follow the steps in ArchWiki:Translation Team completely (once there is a Romanian interlanguage link, you can follow them): you need to add (Română) at the end of the title of the page to be translated (note that there should be a space between the page name and the (Română) added at the end).
This should not be the above problem (which is also a useful tip, of course), because the page name you translated is completely wrong.
The wrong page has been moved, please use the correct title in the future, thank you for your translation. -- Blackteahamburger (talk) 23:34, 18 May 2020 (UTC)
Okay! Thank you for helping me Andy3153 (talk) 06:07, 19 May 2020 (UTC)
It has been some time since the creation of Category:Română, @Kynikos can you update the Help:i18n#Languages table first? -- Blackteahamburger (talk) 09:48, 30 May 2020 (UTC)
You're right, done. -- Kynikos (talk) 11:06, 31 May 2020 (UTC)

Classical Chinese

I want to start the translation of Classical Chinese. Since this language is not listed in Help:i18n#Languages, I don’t know the localized name of Classical Chinese, please tell me, thanks! -- Blackteahamburger (talk) 15:33, 27 May 2020 (UTC)

Pardon my ignorance, and I'm genuinely interested to know, are you talking about w:Classical Chinese? The article says it is a literary variant of Chinese: I totally do not want to curb your enthusiasm, but usually we translate the wiki with the goal of making Arch Linux more easily accessible also to users speaking other languages, so what audience would be the target of Classical Chinese articles? :) -- Kynikos (talk) 11:23, 31 May 2020 (UTC)
This is actually the language used by ancient Chinese (but modern Chinese will also learn). I just think that since Wikipedia has this language [5], archwiki can have it too. -- Blackteahamburger (talk) 12:02, 31 May 2020 (UTC)
Thanks, well, I guess there's no problem if you want to have some fun translating into Classical Chinese, but regarding your original question and given your goal, you should know the localized name better than anyone else here ;) Wikipedia says 文言文 or 古文, or using my very limited knowledge of Chinese, maybe 古中文 would make sense? Or perhaps you meant the language subtag? We could just use zh-classical, or maybe I just didn't get what you meant? -- Kynikos (talk) 15:16, 2 June 2020 (UTC)
Thank you for your answers. What I want to ask is the localized name. I think it’s better to use Wikipedia’s standard, and use “文言文” because this name is standard and common. -- Blackteahamburger (talk) 09:13, 3 June 2020 (UTC)
Can you add this language to Help:i18n#Languages? I have translated the Main page: Main page (文言文), and I plan to translate more pages. -- Blackteahamburger (talk) 11:35, 19 June 2020 (UTC)
No worries, done. -- Kynikos (talk) 14:03, 20 June 2020 (UTC)
I can't help but notice that the Classical Chinese pages are in Simplified Chinese, where technically they should be in Traditional Chinese, like on Wikipedia. Windowsboy111 (talk) 03:50, 29 November 2020 (UTC)
Thanks, it's done. -- Blackteahamburger (talk) 12:20, 5 June 2021 (UTC)

Bangla

Hello! I want to translate Archwiki pages into my mother tongue, Bangla. But my language isn't listed in Help:i18n#Languages. How do I start?

—This unsigned comment is by FOSS ভক্ত (talk) 16:30, 2 September 2020‎. Please sign your posts with ~~~~!

Please see #Add new language for instructions. – Lahwaacz (talk) 17:47, 2 September 2020 (UTC)
How do I create a category? Would you please help? I'm new to wiki editing. - FOSS ভক্ত (talk) 05:23, 3 September 2020 (UTC)
Categories are like normal pages, but their title starts with the "Category:" prefix. So if you want to create a category named "Some name", you need to create the Category:Some name page. Of course, replace "Some name" with the actual name relevant for you. -- Lahwaacz (talk) 08:03, 3 September 2020 (UTC)
I have created Category:বাংলা and also created Main Page (বাংলা). Please tell me if I did right and pardon my mistakes :) -- FOSS ভক্ত (talk) 09:08, 3 September 2020 (UTC)
Please note that it is Main page (বাংলা), not Main Page (বাংলা). -- Blackteahamburger (talk) 11:35, 4 September 2020 (UTC)
Oh, I see. Noted. And, I also saw that the English interlanguage link has been removed from my page. What was the reason for that? --FOSS ভক্ত (talk) 11:47, 4 September 2020 (UTC)
This should be a problem with the bot. I think this problem should be fixed. -- Blackteahamburger (talk) 11:58, 4 September 2020 (UTC)
Oh. So, should I re-add English interlanguage link to Main page (বাংলা) now? I will now perform step 4 of #Draft now. Will notify you when completed. --FOSS ভক্ত (talk) 12:07, 4 September 2020 (UTC)
No. Since the bot runs every day, your change would be overwritten. So this can only be solved by fixing the bot. -- Blackteahamburger (talk) 03:22, 5 September 2020 (UTC)
Hey, can you please help me with a pronunciation? How to pronounce initcpio? I need it to translate Arch Linux -- FOSS ভক্ত (talk) 16:40, 15 September 2020 (UTC)
In fact, you do not need to translate "initcpio". -- Blackteahamburger (talk) 10:17, 16 September 2020 (UTC)
Actually, I want to write it in Bangla. I think, it will be great if I put the pronunciation in Bangla there. Because It will help people learn the pronunciation at the same time. Please tell me the pronunciation if you can allow this. -- FOSS ভক্ত (talk) 11:40, 16 September 2020 (UTC)
Sorry, I don’t know... (My native language is not English) -- Blackteahamburger (talk) 10:33, 17 September 2020 (UTC)
Most people likely pronounce it as "init. C. P. I. O." (init see pee eye oh) /ɪnɪt siː piː aɪ oʊ/ -- Kynikos (talk) 01:32, 18 September 2020 (UTC)

Cantonese

Hello, I'd like to translate ArchWiki into w:Cantonese. Should I first create Category:粵語? (Step 1 in #Draft is done?) Windowsboy111 (talk) 04:32, 17 February 2021 (UTC)

What is the correct language tag for Cantonese? It seems complicated... -- Lahwaacz (talk) 18:23, 27 April 2021 (UTC)
I actually have no idea either, but in the wiki settings, the "zh-Hant-HK" language is just pure Traditional Chinese, and "yue" is Cantonese. I don't see "zh-yue" or similar forms. Windowsboy111 (talk) 00:49, 28 April 2021 (UTC)
yue is the correct ISO639-3 code for Cantonese. zh-HK/zh-Hant-HK is Traditional Chinese (Hong Kong). zh-yue is, well, incorrect, since Cantonese is not a "dialect of Chinese". PyanayaLuna (talk) 00:56, 28 April 2021 (UTC)
OK, I've added the yue tag to wiki-scripts and Cantonese to the table in Help:I18n#Languages. -- Lahwaacz (talk) 06:42, 28 April 2021 (UTC)

Hindi

Hello, I'd like to translate ArchWiki into w:Hindi. I have created the Category:हिन्दी and translated the Main page to Main page (हिन्दी). I think that is all for step 1 and 2. --PulkitKrishna (talk) 12:11, 23 June 2021 (UTC)

Hi, thanks for your contributions, however in the title suffixes you need to use the localized spelling of "Hindi", see Help:I18n#Page titles (is it हिन्दी?). -- Kynikos (talk) 14:14, 30 June 2021 (UTC)
Oh Sorry. Now I have corrected it. I have also completed the translation of About Arch page. Now working on Installation guide. --PulkitKrishna (talk) 12:46, 2 July 2021 (UTC)

Automation of i18n

There are a lot of things to consider when we have to translate a page. For example, a lot of times we may forget to use 。 instead of . when translating to Chinese. And I believe there are a lot of language-related things that could very conveniently be "fixed" or "edited" by bots.

Issues about Chinese variants

Note:

Translations between Traditional and Simplified Chinese

From what I know (I mean, I am a native Chinese speaker anyway), zh-Hant and zh-Hans literally use the same words, just with different character sets. The translation between the two can really be fully automated by bots. The exceptions might be just some words like 電腦 instead of 计算器, which are still easy to automate。
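A minimal Python sketch of how a bot might do such a conversion: word-level exceptions are tried first (longest first), then single characters. The two tables are tiny and purely illustrative; the word pair is simply copied from the comment above, and a real converter would need far larger, reviewed tables.

```python
# Tiny illustrative tables; a real converter needs far larger ones.
PHRASES = {"電腦": "计算器"}  # word-level exception quoted in the comment above
CHARS = {"電": "电", "腦": "脑", "體": "体", "檔": "档"}

# Try longer phrases first so that they win over shorter overlapping ones.
PHRASES_BY_LENGTH = sorted(PHRASES, key=len, reverse=True)


def to_simplified(text):
    """Convert Traditional to Simplified text, trying phrases before single characters."""
    out = []
    i = 0
    while i < len(text):
        for phrase in PHRASES_BY_LENGTH:
            if text.startswith(phrase, i):
                out.append(PHRASES[phrase])
                i += len(phrase)
                break
        else:
            # No phrase matched here: fall back to the per-character table.
            out.append(CHARS.get(text[i], text[i]))
            i += 1
    return "".join(out)
```

Characters missing from both tables, including ASCII text such as package names, pass through unchanged.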

Classical Chinese

Classical Chinese is an old form of written Chinese. By that, I want to point out the fact that lzh should be, is, and always has been, written using zh-Hant character set, not zh-Hans. We might have to thoroughly discuss to draw a conclusion on which to use. (Currently, pages of Classical Chinese are in zh-Hant.)

Links of pages

Let's say I am having an English link of Wikipedia:Arch Linux. These might not be able to be changed to Wikipedia:zh:Arch Linux automatically. But surely links like Main Page can, right? (When bots can detect interlanguage links — see below.)

Bugs caused by pending tag-adding

I've noticed that Lahwaacz.bot has treated pages in non-English languages whose tag is still pending (pages whose titles end with (Lang)) as English pages and removed their [[en:Page]] interlanguage links, like this one. I am not sure if this is intended, but it's probably best to detect such pages and not remove the links.
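One possible way to detect such pages, sketched in Python under the assumption that the bot keeps a second list of languages that are in use but do not have a tag yet. This is not how Lahwaacz.bot or wiki-scripts actually work; the names and both lists are hypothetical.

```python
import re

# Hypothetical lists: languages with an interlanguage tag, and languages
# that are already used in title suffixes but are still waiting for a tag.
TAGGED_LANGUAGES = {"Español", "Русский"}
PENDING_LANGUAGES = {"বাংলা", "Română"}

SUFFIX_RE = re.compile(r" \(([^()]+)\)$")


def classify(title):
    """Return 'tagged', 'pending' or 'english' for a page title."""
    match = SUFFIX_RE.search(title)
    suffix = match.group(1) if match else None
    if suffix in TAGGED_LANGUAGES:
        return "tagged"
    if suffix in PENDING_LANGUAGES:
        return "pending"
    # No suffix, or a suffix that is not a language name, e.g. "Foo (software)".
    return "english"


def may_remove_english_link(title):
    """Only real English pages should lose their [[en:...]] link."""
    return classify(title) == "english"
```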

Templates

Template:Translateme (正體中文) is apparently the zh-Hant version of Template:Translateme. Yet, there are still zh-Hant pages using Template:Translateme instead, as you can see here. This can be fixed automatically.
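A small Python sketch of the automatic fix, assuming the bot already has the page title and its wikitext; the function name is made up for this example. It only rewrites plain {{Translateme}} calls on pages whose title ends with the language suffix and leaves already localized calls alone.

```python
import re

# "{{Translateme" followed (after optional spaces) by "|" or "}", so that
# "{{Translateme (正體中文)}}" is not matched a second time.
TRANSLATEME_RE = re.compile(r"\{\{\s*[Tt]ranslateme\s*(?=[|}])")


def localize_translateme(title, wikitext, language="正體中文"):
    """Use the localized Translateme template on pages titled '... (language)'."""
    if not title.endswith(f" ({language})"):
        return wikitext
    return TRANSLATEME_RE.sub("{{Translateme (" + language + ")", wikitext)
```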

windowsboy111 (talk) 01:04, 29 April 2021 (UTC)

This is interesting. Fengchao also had some ideas about updating links, but we never got to implementing them. The first thing needed is identifying issues that can be automated and this discussion is an excellent start. Then we will need to discuss more deeply what and how (and also in which context) exactly should be replaced and assemble rules that the bot should follow. Unfortunately I can't help with non-English languages, so you or other translators would need to help a lot with this. Then, of course, we will need to actually implement the rules for the bot, which should be relatively easy since most of the page-editing stuff is already prepared in wiki-scripts (but there is no natural-language parsing, which might get tricky quite easily). And finally, we will need to test everything before it is used on the wiki.
There are also some general (or English-specific) ideas for wiki-scripts, see the wishlist. I have some work-in-progress code for an English spell-checker, which might be close to what we would need here (of course with different rules, but I think it would be a good example to get started). I noticed you asked about wiki-scripts on the IRC, it would be awesome if you wanted to contribute some code. Let me know if you need help to get started.
-- Lahwaacz (talk) 20:03, 29 April 2021 (UTC)
I am not sure if this is a previously rejected idea, but mw:Extension:Translate seems like a perfect solution to the i18n linking automation issues to me. (or even better, the entire mw:MediaWiki Language Extension Bundle) PyanayaLuna (talk) 01:53, 30 April 2021 (UTC)
I'm pretty sure we already discussed mw:Extension:Translate, but I can't find the discussion now. We're not fans of what the extension does to the original page, the markers get messy pretty quickly and they require every editor, not just translators, to be aware of the translation workflow and to deal with the markers correctly. The other extensions from the language bundle don't help with translating content, they just improve the user interface. We can discuss enabling more extensions, but in a new discussion – it's not mutually exclusive with making automated edits with bots. — Lahwaacz (talk) 06:59, 30 April 2021 (UTC)
Hmm. For the Chinese translation automation part, what about a merger of the zh-hans and zh-hant versions, enable $wgPageLanguageUseDB and change the content language of the new Chinese pages? Then we could make use of LanguageConverter for automated conversion. PyanayaLuna (talk) 18:06, 30 April 2021 (UTC)
I am not sure how it works, but if words like 電腦 and 计算器 are interchangeable with that feature, it should be ok. -- windowsboy111 (talk) 01:36, 1 May 2021 (UTC)
Live demo at w:zh:电子计算机. Using the language conversion tab to the right of 條目 and 討論, you can switch between Chinese variants. Clicking on the 汉漢 button to the right would reveal more phrase conversion "magic". Documentation is available at w:zh:WP:LC, but it is only in Chinese. The conversion table source is available at [6]. PyanayaLuna (talk) 02:12, 1 May 2021 (UTC)
I've created a pull request to enable $wgPageLanguageUseDB. — Lahwaacz (talk) 07:56, 5 June 2021 (UTC)
It's been merged and we now have a new script for setting the page language property in the database, so the correct language should appear in the page info. — Lahwaacz (talk) 16:19, 5 July 2021 (UTC)