Difference between revisions of "Input Japanese using uim"

From ArchWiki

Revision as of 05:28, 3 January 2011


This page explains how to set up Japanese input using uim.
If you use SCIM, see Smart Common Input Method platform.


You need the following packages to input Japanese.

  • Japanese fonts
  • Japanese input method (Kana to Kanji conversion engine): this article covers Anthy and Mozc.
  • Input method framework: uim

Japanese fonts

See also Fonts for more details.

Recommended Japanese fonts are as follows.

A high-quality, formal-style open source font set including Gothic (sans-serif) and Mincho (serif) glyphs. Default font of openSUSE-ja.
A community-developed derivative of the IPA fonts. It aims to fix problems in the IPA fonts promptly. Default of Ubuntu-ja.
Default Gothic font of Debian-ja, Fedora-ja, Vine Linux, et al.
Default Gothic font of Mandriva Linux ja environment.

If you want to show 2channel Shift JIS art properly, use one of the following fonts:


Using pacman

Install uim with pacman:

# pacman -S uim

Compiling uim from source using PKGBUILD

For example, you should compile from source in the following cases:

  • You cannot wait for the Arch repository to be updated.
  • You want to use Anthy(UTF-8): As of uim 1.6.0, Anthy(UTF-8) support is disabled by default (default text encoding is EUC-JP).
    Note: From uim 1.6.0-2 in Arch repo, Anthy(UTF-8) support is enabled.
  • You use KDE and want to use uim-qt-tools: All tools for Qt are disabled by default.

One easy way to build from source is to use ABS.
First, install ABS:

# pacman -S abs

Update ABS:

# abs

Then, copy the uim directory to somewhere under your $HOME. For example:

$ cp -R /var/abs/extra/uim ~/sources/

Next, edit the PKGBUILD. Typical build options are as follows:

Enable Anthy(UTF-8) support
Do not build gnome-applet. You can drop gnome-panel from makedepends (also optdepends).
Build uim-tools for Qt (needs Qt)
Build UimQt (Qt immodule support) for Qt4 (needs Qt)

Here is a sample custom PKGBUILD that enables anthy-utf8 support and disables gnome-applet (based on uim 1.6.0 in extra).

pkgdesc='Multilingual input method library'
arch=('i686' 'x86_64')
depends=('m17n-lib' 'ncurses' 'gtk2')
makedepends=('pkg-config' 'gettext' 'intltool')
#makedepends=('pkg-config' 'gettext' 'intltool' 'gnome-panel')
optdepends=('gnome-panel: for using the GNOME applet')

build() {
  cd "${srcdir}/${_pkgname}-${pkgver}"

  # makechrootpkg runs build() as "nobody", which has HOME=/
  # However, UIM's make needs $HOME to be writable.
  patch -p0 < ../uim-home.patch
  export HOME="`pwd`"

  ./configure --prefix=/usr --libexecdir=/usr/lib/uim \
              --with-anthy-utf8 \
              --disable-gnome-applet
  make
}

package() {
  cd "${srcdir}/${_pkgname}-${pkgver}"
  make DESTDIR="${pkgdir}" install
  install -D -m644 COPYING "${pkgdir}/usr/share/licenses/${_pkgname}/COPYING"
}

Finally, run makepkg in the uim directory to build and install the package:

$ makepkg -s -i

Input method


Anthy is one of the most popular Japanese input methods in the open source world. The original upstream has not been maintained for a long time, but Debian has taken over its maintenance since May 2010.

To install Anthy:

# pacman -S anthy

Extra dictionary

The default dictionary of the original Anthy does not include several characters that are not defined in EUC-JP (JIS X 0208), such as "①" and "♥". alt-cannadic provides extra dictionaries that include those characters.

Get the alt-cannadic dictionary and put its extra dictionary files under ~/.anthy/imported_words_default.d/.

$ tar jxvf alt-cannadic-091230.tar.bz2
$ mkdir -p ~/.anthy/imported_words_default.d
$ cp alt-cannadic-091230/extra/*.t ~/.anthy/imported_words_default.d/
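The copy step above can also be wrapped in a small shell function that fails loudly when no dictionary files are found. This is only a sketch: the function name install_extra_dicts is hypothetical, and the default destination is taken from the commands above.

```shell
# Hypothetical helper: copy alt-cannadic extra dictionaries (*.t) into
# Anthy's import directory. $1 = source directory, $2 = optional destination.
install_extra_dicts() {
    src_dir=$1
    dest_dir=${2:-"$HOME/.anthy/imported_words_default.d"}

    # Create the destination if it does not exist yet.
    mkdir -p "$dest_dir" || return 1

    found=0
    for f in "$src_dir"/*.t; do
        # Skip the unexpanded pattern when no *.t file matches.
        [ -e "$f" ] || continue
        cp "$f" "$dest_dir"/ || return 1
        found=1
    done

    # Succeed only if at least one dictionary file was copied.
    [ "$found" -eq 1 ]
}
```

For example, after extracting the tarball: install_extra_dicts alt-cannadic-091230/extra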

Please see the official wiki for more details (in Japanese).

Warning: If you use this extra dictionary, choose Anthy (UTF-8) as the default input method in uim.

Modified Anthy (anthy-ut)

Modified Anthy is a set of patches and large extended dictionaries that aims to improve the Kana to Kanji conversion quality of the original Anthy.

Modified Anthy consists of two different upstreams:

  • Patched source of Anthy by G-HAL
  • Large extended dictionaries by UTSUMI
Warning: Modified Anthy applies only to Anthy (UTF-8), so you have to choose Anthy (UTF-8) as the default input method in uim.
Warning: The dictionaries and learning data of Modified Anthy are not compatible with those of the original Anthy.

Compiling Modified Anthy using PKGBUILD

Modified Anthy is available in the AUR as anthy-ut.

Get the anthy-ut tarball and run makepkg to build and install the package:

$ wget http://aur.archlinux.org/packages/anthy-ut/anthy-ut.tar.gz
$ tar xvf anthy-ut.tar.gz
$ cd anthy-ut
$ makepkg -s -i

If you have already used the original Anthy, you have to convert the format of the existing learning data.

$ rm ~/.anthy/last-record1_*.bin
$ anthy-agent --update-base-record
$ rm ~/.anthy/last-record1_*.bin
$ anthy-agent --update-base-record

(The same commands are intentionally run twice; this is not a typo.)

Anthy Kaomoji

Anthy Kaomoji is a modified version of Anthy that converts Hiragana text to mixed Kana and Kanji text and includes emoticon (顔文字) and 2ch dictionaries. It can be found in the AUR.


Mozc (on AUR) is a Japanese open source input method that originates from Google Japanese Input. It is generally considered to give better conversion quality than Anthy when converting multiple segments at once (e.g. a whole sentence), although its dictionary is less complete. Although Mozc itself supports only the ibus and scim input method frameworks, macuim provides a uim-mozc plugin, which you can use with mozc-svn from the AUR.

Compiling Mozc for uim using PKGBUILD

First, get the mozc-svn tarball from the AUR and edit the PKGBUILD to enable uim-mozc: uncomment the _uim_mozc line, and optionally comment out the _ibus_mozc line to disable the ibus module if you do not need it:

## You can choose the input method framework to use either ibus, uim or both.
## If you will be not using ibus, comment out below.
## If you will be using uim, uncomment below.
## If you will be using scim, uncomment below.

Configure Japanese zip code dictionary (optional)

Mozc can import the Japanese zip code data published by Japan Post as a dictionary seed.

To add the zip codes, edit the following lines in the PKGBUILD:

## You can add Japanese zip code provided by Japan Post.
## If you want to use it, uncomment below.

Build, install and register

Next, build and install:

$ makepkg -s -i

Finally, register Mozc with uim.

# uim-module-manager --register mozc
Note: You must run this command whenever you upgrade or (re-)install uim.

When you have uninstalled Mozc, unregister it from uim.

# uim-module-manager --unregister mozc


Add the following to ~/.xprofile, ~/.xinitrc or ~/.xsession:

Environment variables

export GTK_IM_MODULE='uim'
export QT_IM_MODULE='uim'
uim-xim &
export XMODIFIERS=@im='uim'
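The settings above can be made a little more defensive so that a missing uim-xim binary does not produce errors at login and a second instance is not started. This is a minimal sketch for ~/.xprofile; the function name start_uim_session is hypothetical, and the use of pgrep to detect a running instance is an assumption about the system.

```shell
# Hypothetical helper for ~/.xprofile: export the uim environment variables
# and start the XIM bridge only when it is installed and not yet running.
start_uim_session() {
    export GTK_IM_MODULE='uim'
    export QT_IM_MODULE='uim'
    export XMODIFIERS='@im=uim'

    if command -v uim-xim >/dev/null 2>&1 &&
       ! pgrep -x uim-xim >/dev/null 2>&1; then
        uim-xim &
    fi
}

start_uim_session
```

The exported values are the same as in the plain version; only the uim-xim startup is guarded.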

Toolbar utilities

If you want to use the UimToolbar utilities, which show and control the uim mode, add one of the following as well.


To show the toolbar as a window:

uim-toolbar-gtk &

or, if you built uim with the Qt tools enabled, you can add:

uim-toolbar-qt &


To show the toolbar in the system tray:

uim-toolbar-gtk-systray &

Panel applet

Alternatively, if you use GNOME, KDE or Xfce, you can use the uim-toolbar panel applet (Xfce requires xfce4-xfapplet-plugin to use uim-applet-gnome, and KDE requires uim to be built with Qt support).

uim preferences

Configure uim preferences by running:

$ uim-pref-gtk

which brings forth a GUI.
Choose Anthy, Anthy (UTF-8) or Mozc for 'Default input method'.

Note: Mozc is not listed in 'Default input method' at first, so you need to add it to 'Enabled input methods' before you can use it.

You can run Template:Codeline or restart X to test your settings.
Provided everything went well you should be able to input Japanese in X.
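If Japanese input does not work after restarting X, a quick first check is whether the environment variables actually reached the session. The following sketch prints the three variables from the 'Environment variables' section and reports any that do not mention uim; the function name check_uim_env is hypothetical.

```shell
# Hypothetical helper: verify that the IM-related variables point at uim.
# Returns 0 when all three mention "uim", 1 otherwise.
check_uim_env() {
    rc=0
    for var in GTK_IM_MODULE QT_IM_MODULE XMODIFIERS; do
        # Indirectly read the value of the variable named in $var.
        eval "value=\${$var:-}"
        case "$value" in
            *uim*) echo "$var=$value" ;;
            *)     echo "$var is not set for uim (current: '$value')"
                   rc=1 ;;
        esac
    done
    return $rc
}
```

Run check_uim_env in a terminal inside the X session. Note that it will also flag a deliberate override such as QT_IM_MODULE='xim'.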


Using uim on Emacs

uim provides uim.el, the bridge software between Emacs and uim. Here is a sample configuration for using uim on Emacs with UTF-8 encoding.

Please see the official wiki for more details.

Anthy and Mozc also provide their own frontends for Emacs, i.e. anthy.el and mozc.el. One of the distinctive features of uim.el is its inline candidate display mode, which displays conversion candidates vertically just below (or above) the preedit text.

LEIM or minor-mode

You can call uim.el from Emacs in two ways: directly (as a minor mode) or through the LEIM framework. Their settings differ, but the basic functions are the same. If you want to switch between uim.el and other Emacs input methods frequently, you should use the LEIM framework.

Settings for the minor-mode

If you use the minor mode, write the following settings into your Emacs init file or another Emacs customization file.

;; read uim.el
(require 'uim)
;; uncomment next and comment out previous to load uim.el on-demand
;; (autoload 'uim-mode "uim" nil t)

;; key-binding for activate uim (ex. C-\)
(global-set-key "\C-\\" 'uim-mode)

Settings for the LEIM

If you use uim via LEIM, write the following settings into your Emacs init file or another Emacs customization file, and choose the default input method.

;; read uim.el with LEIM initializing
(require 'uim-leim)

;; Set the default IM. Uncomment one of the following.
;(setq default-input-method "japanese-anthy-utf8-uim") ; Anthy (UTF-8)
;(setq default-input-method "japanese-mozc-uim")       ; Mozc

Preferred character encoding

uim.el uses the euc-jp character encoding by default. To set UTF-8 as the preferred encoding, add the following to your Emacs init file or another Emacs customization file.

;; Set UTF-8 as preferred character encoding (default is euc-jp).
(setq uim-lang-code-alist
      (cons '("Japanese" "Japanese" utf-8 "UTF-8")
            (delete (assoc "Japanese" uim-lang-code-alist)
                    uim-lang-code-alist)))

Enable inline candidates displaying mode by default

To enable inline candidate display mode by default, add the following.

;; set inline candidates displaying mode as default
(setq uim-candidate-display-inline t)

Set Hiragana input mode by default

To start in Hiragana input mode when uim is activated, add one of the following to your Emacs init file or another Emacs customization file.

Anthy (UTF-8):

;; Set Hiragana input mode at activating uim with anthy (utf-8).
(setq uim-default-im-prop '("action_anthy_utf8_hiragana"))


Mozc:

;; Set Hiragana input mode at activating uim with mozc.
(setq uim-default-im-prop '("action_mozc_hiragana"))

Disabling XIM on Emacs

If you use an input method on your desktop and have assigned its activation/deactivation to C-SPC, you will not be able to use C-SPC/C-@ as set-mark-command in Emacs. To avoid this problem, add the following to your X resources file (typically ~/.Xresources or ~/.Xdefaults). This disables XIM in Emacs.

Emacs*UseXIM: false
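The resource line can be added idempotently from the shell, so that repeating the setup does not duplicate it. This is a sketch only: the function name ensure_resource_line is hypothetical, and it assumes ~/.Xresources is the X resources file in use on your system.

```shell
# Hypothetical helper: append the Emacs XIM resource to a file unless the
# exact line is already present. $1 = path to the X resources file.
ensure_resource_line() {
    file=$1
    line='Emacs*UseXIM: false'

    touch "$file" || return 1
    # -x: match the whole line, -F: treat the pattern as a fixed string.
    grep -qxF "$line" "$file" || printf '%s\n' "$line" >> "$file"
}
```

For example: ensure_resource_line ~/.Xresources && xrdb -merge ~/.Xresources (xrdb reloads the resources into the running X session; restart Emacs afterwards).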


Cannot input Japanese on Opera

If you use Opera and cannot input Japanese with uim, try setting the environment variable as follows:

export QT_IM_MODULE='xim'

uim-toolbar-gtk-systray: tray icon is crushed

uim-toolbar-gtk-systray is not compliant with the freedesktop.org system tray specification, so by default several icons are squeezed into the space of one icon. To solve this, choose just one of them. For example, the steps to display only the 'Input mode' icon are as follows:

  1. Run uim-pref-gtk.
  2. Click 'Toolbar' in the 'Group' list.
  3. Clear all the checkmarks.
  4. Click 'Anthy', 'Anthy (UTF-8)' or 'Mozc', whichever you are using, in the 'Group' list.
  5. Click the Edit button on the 'Enable toolbar buttons' line in the 'Toolbar' box.
  6. Enable only 'Input mode' and click the 'Close' button.
  7. Click the 'OK' button to close uim-pref-gtk.

The tray icon will be displayed as "あ" (Hiragana mode) or "ー" (Direct mode).

I use darker theme, I cannot read the uim mode icons

You can choose icons designed for dark backgrounds (uim 1.6.0 or later).

  1. Run uim-pref-gtk.
  2. Click 'Toolbar' on 'Group' list.
  3. Check 'Use icon for dark background'.

Useful literature

uim official document
uim on wikibooks
Japanese fonts showcase
modified Japanese fonts