Locale: Difference between revisions

From ArchWiki
(→‎LC_TIME: date and time format: suggest C.UTF-8 instead for generic 24h locale)
(→‎LC_ALL: troubleshooting: the issue with LC_ALL=C.UTF-8 not overriding LANGUAGE was fixed in glibc 2.39)
 
Latest revision as of 15:26, 2 March 2024

Locales are used by glibc and other locale-aware programs or libraries for rendering text, correctly displaying regional monetary values, time and date formats, alphabetic idiosyncrasies, and other locale-specific standards.

Generating locales

Locale names are typically of the form language[_territory][.codeset][@modifier], where language is an ISO 639 language code, territory is an ISO 3166 country code, and codeset is a character set or encoding identifier like ISO-8859-1 or UTF-8. See setlocale(3).
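
As a rough illustration of the naming scheme above, the following sketch splits a locale name into its parts using only POSIX shell parameter expansion. The function name parse_locale is hypothetical and not part of glibc or any Arch tool; it does not check that the locale actually exists.

```shell
# Split a locale name of the form language[_territory][.codeset][@modifier].
# parse_locale is a hypothetical helper for illustration only.
parse_locale() {
    name=$1
    modifier='' codeset='' territory=''
    # Strip the optional parts from right to left: @modifier, .codeset, _territory.
    case $name in *@*) modifier=${name#*@};  name=${name%%@*} ;; esac
    case $name in *.*) codeset=${name#*.};   name=${name%%.*} ;; esac
    case $name in *_*) territory=${name#*_}; name=${name%%_*} ;; esac
    printf 'language=%s territory=%s codeset=%s modifier=%s\n' \
        "$name" "$territory" "$codeset" "$modifier"
}

parse_locale 'en_US.UTF-8'
parse_locale 'sr_RS@latin'
```

Parts that are absent from the name are printed as empty values.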

For a list of enabled locales, run:

$ locale -a

Before a locale can be enabled on the system, it must be generated. This can be achieved by uncommenting applicable entries in /etc/locale.gen, and running locale-gen. Conversely, commenting out entries disables their respective locales. While making changes, consider any localisations required by other users on the system, as well as specific #Variables.

For example, uncomment en_US.UTF-8 UTF-8 for American-English:

/etc/locale.gen
...
#en_SG ISO-8859-1
en_US.UTF-8 UTF-8
#en_US ISO-8859-1
...

Save the file, and generate the locale:

# locale-gen

Note:
  • locale-gen also runs with every update of glibc. [1]
  • UTF-8 is recommended over other character sets. [2]
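
The uncommenting step can also be scripted. The sketch below is one possible approach, not an official tool: the hypothetical function uncomment_locale filters locale.gen-style text from standard input, so it can be tried on sample text before touching the real file. On a real system the result would still have to be written to /etc/locale.gen as root, followed by running locale-gen.

```shell
# Uncomment one exact entry (e.g. "en_US.UTF-8 UTF-8") in locale.gen-style
# text read from standard input; all other lines pass through unchanged.
# uncomment_locale is a hypothetical helper for illustration only.
uncomment_locale() {
    # Escape dots so they match literally in the regular expression.
    entry=$(printf '%s\n' "$1" | sed 's/\./\\./g')
    sed "s/^#\(${entry}\)[[:space:]]*\$/\1/"
}

printf '%s\n' '#en_SG ISO-8859-1' '#en_US.UTF-8 UTF-8' '#en_US ISO-8859-1' |
    uncomment_locale 'en_US.UTF-8 UTF-8'
```

Only the line that matches the whole entry is changed, so the similarly named en_US ISO-8859-1 entry stays commented.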

Setting the locale

To display the currently set locale and its related environmental settings, type:

$ locale

The locale to be used, chosen among the previously generated ones, is set in locale.conf files. Each of these files must contain a newline-separated list of environment variable assignments, in the same format as the output of locale.

To list available locales which have been previously generated, run:

$ localedef --list-archive

Alternatively, using localectl(1):

$ localectl list-locales

Setting the system locale

To set the system locale, write the LANG variable to /etc/locale.conf, where en_US.UTF-8 belongs to the first column of an uncommented entry in /etc/locale.gen:

/etc/locale.conf
LANG=en_US.UTF-8

Alternatively, run:

# localectl set-locale LANG=en_US.UTF-8

See #Variables and locale.conf(5) for details.

Overriding system locale per user session

The system-wide locale can be overridden in each user session by creating or editing $XDG_CONFIG_HOME/locale.conf (usually ~/.config/locale.conf).

The precedence of these locale.conf files is defined in /etc/profile.d/locale.sh.

Tip:
  • This can also allow keeping the logs in /var/log/ in English while using the local language in the user environment.
  • You can create a /etc/skel/.config/locale.conf file so that any new users added using useradd and the -m option will have ~/.config/locale.conf automatically generated.
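
The lookup between the per-user and system-wide files can be pictured with the sketch below. It is an assumption about the order (user file under $XDG_CONFIG_HOME, then ~/.config, then the system file), written as a hypothetical function pick_locale_conf; consult /etc/profile.d/locale.sh itself for the authoritative logic.

```shell
# Print the first readable locale.conf, checking the per-user locations
# before the system-wide file. The third argument stands in for
# /etc/locale.conf so the function can be tried on scratch files.
# pick_locale_conf is a hypothetical helper; the order is an assumption.
pick_locale_conf() {
    xdg_dir=$1 home_dir=$2 system_conf=$3
    if [ -n "$xdg_dir" ] && [ -r "$xdg_dir/locale.conf" ]; then
        printf '%s\n' "$xdg_dir/locale.conf"
    elif [ -n "$home_dir" ] && [ -r "$home_dir/.config/locale.conf" ]; then
        printf '%s\n' "$home_dir/.config/locale.conf"
    elif [ -r "$system_conf" ]; then
        printf '%s\n' "$system_conf"
    fi
}

pick_locale_conf "${XDG_CONFIG_HOME:-}" "${HOME:-}" /etc/locale.conf
```

The function prints nothing if none of the candidate files is readable.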

Make locale changes immediate

Once system and user locale.conf files have been created or edited, their new values will take effect for new sessions at login. To have the current environment use the new settings, unset LANG and source /etc/profile.d/locale.sh:

$ unset LANG
$ source /etc/profile.d/locale.sh

Note: The LANG variable has to be unset first, otherwise locale.sh will not update the values from locale.conf. Only new and changed variables will be updated; variables removed from locale.conf will still be set in the session.

Other uses

Locale variables can also be defined with the standard methods as explained in Environment variables.

For example, in order to test or debug a particular application during development, it could be launched with something like:

$ LANG=C ./my_application.sh

Similarly, to set the locale for all processes run from the current shell (for example, during system installation):

$ export LANG=C

Variables

locale.conf files support the following environment variables.

  • LANG
  • LANGUAGE
  • LC_ADDRESS
  • LC_COLLATE
  • LC_CTYPE
  • LC_IDENTIFICATION
  • LC_MEASUREMENT
  • LC_MESSAGES
  • LC_MONETARY
  • LC_NAME
  • LC_NUMERIC
  • LC_PAPER
  • LC_TELEPHONE
  • LC_TIME

The full meaning of the above LC_* variables is described in the locale(7) man page, whereas the details of their definition are described in locale(5).

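Note: Programs look up locale-dependent values in a fixed priority order: LC_ALL first, then the relevant LC_* variable, then LANG. Programs that use gettext additionally consult LANGUAGE for message translations.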
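
The usual lookup order for a single category (LC_ALL, then the category's own variable, then LANG) can be sketched as a small shell function. resolve_locale is a hypothetical name, the values are passed as arguments so that the shell's own locale is left alone, and the final fallback to C when nothing is set is an assumption based on glibc's behaviour. LANGUAGE is deliberately left out, since it only affects message translations.

```shell
# Print the locale that applies to one category, given the values of
# LC_ALL, the category's own variable (e.g. LC_TIME) and LANG, in that order.
# resolve_locale is a hypothetical helper for illustration only.
resolve_locale() {
    lc_all=$1 category_value=$2 lang=$3
    if   [ -n "$lc_all" ];         then printf '%s\n' "$lc_all"
    elif [ -n "$category_value" ]; then printf '%s\n' "$category_value"
    elif [ -n "$lang" ];           then printf '%s\n' "$lang"
    else                                printf '%s\n' C   # assumed default
    fi
}

# Effective LC_TIME locale of the current environment:
resolve_locale "${LC_ALL:-}" "${LC_TIME:-}" "${LANG:-}"

# LC_MESSAGES=en_US.UTF-8 with LANG=es_ES.UTF-8 and no LC_ALL:
resolve_locale '' 'en_US.UTF-8' 'es_ES.UTF-8'
```

The second call prints en_US.UTF-8, because the category variable takes precedence over LANG when LC_ALL is empty.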

LANG: default locale

The locale set for this variable will be used for all the LC_* variables that are not explicitly set.

Tip: Assume that you are an English user in Spain, and you want your programs to handle numbers and dates according to Spanish conventions, and only the messages should be in English. Then you could set the LANG variable to es_ES.UTF-8 and the LC_MESSAGES (user interface for message translation) variable to en_US.UTF-8.

LANGUAGE: fallback locales

Programs which use gettext for translations respect the LANGUAGE option in addition to the usual variables. This allows users to specify a list of locales that will be used in that order. If a translation for the preferred locale is unavailable, another from a similar locale will be used instead of the default. For example, an Australian user might want to fall back to British rather than US spelling:

locale.conf
LANG=en_AU.UTF-8
LANGUAGE=en_AU:en_GB:en

Note: Many applications do not name or alias their English locale as en or en_US, but instead make it the default locale, which is C. If in LANGUAGE a non-English locale is placed after English, e.g. LANGUAGE=en_US:en:es_ES, then applications may choose the secondary locale despite English strings being available.[3] The solution is to always explicitly place the C locale after English. E.g. LANGUAGE=en_US:en:C:es_ES.
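
The pitfall described in the note can be checked mechanically. The following sketch uses a hypothetical function needs_c_fallback that succeeds when a LANGUAGE value lists a non-English locale after an English one without C in between. It treats only en and en_* entries as English, which is a simplification.

```shell
# Succeed (exit status 0) if the colon-separated LANGUAGE value in $1 places
# a non-English locale after an English one without an explicit C before it.
# The body runs in a subshell so that changing IFS does not leak out.
# needs_c_fallback is a hypothetical helper for illustration only.
needs_c_fallback() (
    IFS=:
    seen_english=0
    for entry in $1; do
        case $entry in
            C)       exit 1 ;;                         # C listed first: fine
            en|en_*) seen_english=1 ;;
            *)       [ "$seen_english" -eq 1 ] && exit 0 ;;
        esac
    done
    exit 1
)

if needs_c_fallback 'en_US:en:es_ES'; then
    echo 'Consider LANGUAGE=en_US:en:C:es_ES instead'
fi
```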

LC_TIME: date and time format

If LC_TIME is set to en_US.UTF-8, for example, the date format will be "MM/DD/YYYY". To use the ISO 8601 date format of "YYYY-MM-DD", use:

locale.conf
LC_TIME=en_DK.UTF-8

You can print the current timestamp using your locale date and time format with date +"%c".

glibc 2.29 fixed a bug, after which en_US.UTF-8 shows the time in 12-hour format, as was intended. To use the 24-hour format, set LC_TIME=C.UTF-8.

Note: Programs do not necessarily respect this variable when formatting dates. For example, date(1) uses its own parameters to do so, and Firefox did not honour LC_TIME in versions 57 to 84 (Bug 1429578).
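
To compare how different locales format the same timestamp, a fixed point in time is easier to read than the current time. The sketch below formats the Unix epoch using a hypothetical helper epoch_in_locale. It sets LC_ALL for the single command so that nothing else in the environment overrides the choice, and it relies on the GNU date extension -d @0. A locale that has not been generated is expected to fall back to the C format.

```shell
# Format the Unix epoch (1970-01-01 00:00:00 UTC) with the given locale's
# preferred date and time representation. "-d @0" is a GNU date extension.
# epoch_in_locale is a hypothetical helper for illustration only.
epoch_in_locale() {
    LC_ALL=$1 date -u -d @0 +%c
}

epoch_in_locale C            # prints: Thu Jan  1 00:00:00 1970
epoch_in_locale en_DK.UTF-8  # output depends on the generated locales
```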

LC_COLLATE: collation

This variable governs the collation rules used for sorting and regular expressions.

For example, setting the value to C makes the ls command sort dotfiles first, followed by uppercase and then lowercase filenames:

locale.conf
LC_COLLATE=C

See also [4].

To get around potential issues, Arch used to set LC_COLLATE=C in /etc/profile, but this method is now deprecated.
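
The effect of C collation can be observed without changing any configuration by sorting a few sample names. In this sketch the hypothetical helper sort_c sets LC_ALL rather than LC_COLLATE for the single command, so that an LC_ALL value already present in the environment cannot interfere.

```shell
# Sort standard input by byte value, as the C locale does.
# sort_c is a hypothetical helper for illustration only.
sort_c() {
    LC_ALL=C sort
}

printf '%s\n' b .a B a | sort_c
# prints, one per line: .a B a b
```

The dot sorts first, then the uppercase letter, then the lowercase letters, which matches the ls ordering described above.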

LC_ALL: troubleshooting

The locale set for this variable will always override LANG and all the other LC_* variables, whether they are set or not. If LC_ALL is set to C or C.UTF-8, it will also override LANGUAGE.

LC_ALL is the only LC_* variable which cannot be set in locale.conf files: it is meant to be used only for testing or troubleshooting purposes, for example in /etc/profile.

Troubleshooting

For encoding problems, check Character encoding#Troubleshooting.

My system is still using wrong language

It is possible that the environment variables are redefined in files other than locale.conf. See Environment variables#Defining variables for details.

If you are using a desktop environment, such as GNOME, its language settings may be overriding the settings in locale.conf.

KDE Plasma also allows changing the UI's language through the system settings. If the desktop environment is still using the default language after the modification, deleting the file at ~/.config/plasma-localerc (previously: ~/.config/plasma-locale-settings.sh) should resolve the issue.

If you are using a display manager in combination with accountsservice, follow the instructions in Display manager#Set language for user session.

LightDM will automatically use accountsservice to set a user's locale if it is installed. Otherwise, LightDM stores the user session configuration in ~/.dmrc. It is possible that an unwanted locale setting is retrieved from there as well.

Using a custom locale causes problems

When installing a locale that is not officially supported (e.g., locale-en_xx from the AUR), some problems can occur, like dead/compose keys not working in some applications or applications reporting missing locales. After installing a custom locale, manual intervention is required to resolve these problems. There are two approaches (replace en_XX.UTF-8 with the identifier of your custom locale):

Set LC_CTYPE

Set LC_CTYPE to an officially supported locale (like en_US.UTF-8), e.g.:

/etc/locale.conf
LANG=en_XX.UTF-8
LC_CTYPE=en_US.UTF-8

Modify the Xlib database

Modify the Xlib database by adding the following:

/usr/share/X11/locale/locale.dir
en_US.UTF-8/XLC_LOCALE en_XX.UTF-8
en_US.UTF-8/XLC_LOCALE: en_XX.UTF-8

/usr/share/X11/locale/compose.dir
en_US.UTF-8/Compose en_XX.UTF-8
en_US.UTF-8/Compose: en_XX.UTF-8

See also