Input Japanese using uim

From ArchWiki
Revision as of 15:33, 15 February 2010 by Hattori Shinsan (talk | contribs)

This page explains how to set up Japanese input using uim.
If you use SCIM, see Smart Common Input Method platform.

Installation

You need the following packages to input Japanese.

  • Japanese fonts
  • Kana-Kanji conversion engine: this article uses Anthy (UTF-8).
  • Input method: uim

Japanese fonts

See also Fonts for more detail.

Recommended Japanese fonts are as follows.

A community-developed derivative of IPA Fonts that fixes some of their problems.
IPA Fonts are a high-quality, formal-style open source font set that includes Gothic (sans-serif) and Mincho (serif) glyphs.
Default Gothic font of Debian-ja, Fedora-ja, Ubuntu-ja, Vine Linux, and others.
Default Gothic font of the Mandriva Linux Japanese environment.

If you want to display 2channel Shift JIS art properly, use one of the following fonts:

Anthy

First, install Anthy:

# pacman -S anthy

Extra dictionary

The default dictionary of the original Anthy does not include some characters that are not defined in EUC-JP (JIS X 0208), such as "①" and "♥". alt-cannadic provides extra dictionaries for Anthy and Canna that include these UTF-8 characters.

Download alt-cannadic and copy its extra dictionaries to ~/.anthy/imported_words_default.d:

$ tar jxvf alt-cannadic-091230.tar.bz2
$ mkdir -p ~/.anthy/imported_words_default.d
$ cp alt-cannadic-091230/extra/*.t ~/.anthy/imported_words_default.d/

Some of the default conversion rules are defined as hiragana followed by "t", e.g. "あおうみがめt" → "黿" (green turtle). However, by default uim accepts only hiragana and operational characters, so remove the "t" from these rules to make them usable:

$ cd ~/.anthy/imported_words_default.d
$ sed -i 's/^\([^ ]\+\)t /\1 /' *.t
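
To see what the sed expression does before running it on the real dictionaries, it can be tried on sample lines first. The sketch below wraps the same expression in a hypothetical helper, strip_trailing_t. The sample entries and their "#T35" field are made up for illustration and may not match the real alt-cannadic file format.

```shell
# Hypothetical helper: applies the same sed expression as above to stdin,
# so the effect can be previewed without touching any dictionary file.
strip_trailing_t() {
    sed 's/^\([^ ]\+\)t /\1 /'
}

# Made-up sample entries (not taken from alt-cannadic).
printf '%s\n' 'あおうみがめt #T35 黿' 'あおうみがめ #T35 青海亀' | strip_trailing_t
```

Only a "t" that directly precedes the first space is removed, so the first sample line loses its trailing "t" and the second is printed unchanged.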

Improve conversion quality with G-HAL patch

G-HAL provides patches that improve Anthy's conversion quality.

Compiling Anthy from source using PKGBUILD

One easy way to build from source is to use ABS.
First, install ABS:

# pacman -S abs

Update ABS:

# abs

Then copy the anthy directory to a location under your $HOME. For example:

$ cp -R /var/abs/extra/anthy ~/sources/

Next, edit anthy/PKGBUILD. Here is a sample custom PKGBUILD that applies G-HAL patch13:

Warning: Applying this patch breaks compatibility of dictionaries and learning data with the original Anthy.

pkgname=anthy
pkgver=9100h
_filecode=37536
pkgrel=1
pkgdesc="Hiragana text to Kana Kanji mixed text Japanese input method"
arch=("i686" "x86_64")
url="http://sourceforge.jp/projects/anthy/"
license=('LGPL' 'GPL')
depends=('glibc')
options=('!libtool' 'force')
source=("http://downloads.sourceforge.jp/anthy/${_filecode}/$pkgname-$pkgver.tar.gz"
        "http://www.fenix.ne.jp/~G-HAL/soft/nosettle/anthy-9100h.patch13Bptn23.iconv.2009Y15.bz2"
        "http://www.fenix.ne.jp/~G-HAL/soft/nosettle/anthy-9100h.mkworddic_fix.tar.bz2")
md5sums=('1f558ff7ed296787b55bb1c6cf131108'
         '592c4452e9012c438d925e064d1a2939'
         'a4b81528b95bcaed2b62d30d6584ac8a')

build() {
  cd "${srcdir}/${pkgname}-${pkgver}"

  # G-HAL patch13
  patch -Np1 -i "${srcdir}/anthy-9100h.patch13Bptn23.iconv.2009Y15"
  cp -f "${srcdir}"/mkworddic/* mkworddic/

  ./configure --prefix=/usr --sysconfdir=/etc || return 1
  make EMACS=emacs sysconfdir=/etc || return 1
  make EMACS=emacs DESTDIR="${pkgdir}" install || return 1
}

Finally, run makepkg in the anthy directory to build and install the package:

$ makepkg -s -i

If downloading these patches fails with a "permission denied" error, download them manually:

  1. Open http://www.fenix.ne.jp/~G-HAL/soft/nosettle/#anthy
  2. Search for "anthy-9100h.patch13Bptn23.iconv.2009Y15.bz2" and click the link.
  3. Search for "anthy-9100h.mkworddic_fix.tar.bz2" and click the link.
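
After a manual download, the files should sit in the same directory as the PKGBUILD so that makepkg uses them instead of trying to download again. It can also be reassuring to confirm that they match the md5sums listed in the PKGBUILD. A minimal sketch, assuming md5sum from coreutils is available; check_md5 is a hypothetical helper name, not part of makepkg.

```shell
# Hypothetical helper: succeeds only if FILE's MD5 digest equals EXPECTED.
check_md5() {
    file=$1
    expected=$2
    # md5sum prints "<digest>  <filename>"; keep only the digest.
    actual=$(md5sum "$file" | cut -d ' ' -f 1)
    if [ "$actual" = "$expected" ]; then
        echo "OK: $file"
    else
        echo "MISMATCH: $file (got '$actual')" >&2
        return 1
    fi
}

# Usage, run in the directory that contains the PKGBUILD:
# check_md5 anthy-9100h.patch13Bptn23.iconv.2009Y15.bz2 592c4452e9012c438d925e064d1a2939
# check_md5 anthy-9100h.mkworddic_fix.tar.bz2 a4b81528b95bcaed2b62d30d6584ac8a
```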

If the original Anthy was already installed, you have to convert the learning data to the new format:

$ rm ~/.anthy/last-record1_*.bin
$ anthy-agent --update-base-record
$ rm ~/.anthy/last-record1_*.bin
$ anthy-agent --update-base-record

(These steps intentionally repeat the same commands twice; this is not a typo.)
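
Because these commands delete learning data, it may be prudent to keep a copy of ~/.anthy before running them. A minimal sketch; backup_anthy_dir is a hypothetical helper, and the backup location next to the original directory is an arbitrary choice.

```shell
# Hypothetical helper: copies ~/.anthy to a timestamped sibling directory
# and prints the backup path. Returns non-zero if ~/.anthy does not exist.
backup_anthy_dir() {
    src="$HOME/.anthy"
    if [ ! -d "$src" ]; then
        echo "no $src to back up" >&2
        return 1
    fi
    dest="$HOME/.anthy.bak.$(date +%Y%m%d%H%M%S)"
    cp -R "$src" "$dest" && echo "$dest"
}
```

Running backup_anthy_dir once before the first rm is enough; the printed path can be used to restore the old data if the conversion does not work out.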

UIM

Using pacman

Install uim with pacman:

# pacman -S uim

Compiling uim from source using PKGBUILD

You should compile from source in cases such as the following:

  • The package in the Arch repositories is out of date, which happens from time to time.
  • You want to use Anthy (UTF-8): as of uim 1.5.7, Anthy (UTF-8) support is disabled by default.
  • You use KDE and want to use uim-qt-tools: all tools for Qt are disabled by default.

One easy way to build from source is to use ABS.
Set up ABS if you have not done so yet (see #Compiling Anthy from source using PKGBUILD). Then copy the uim directory to a location under your $HOME. For example:

$ cp -R /var/abs/extra/uim ~/sources/

Next, edit uim/PKGBUILD. Typical build options are as follows:

  • --with-anthy-utf8
    Enable Anthy (UTF-8) support.
  • --disable-gnome-applet
    Do not build the GNOME applet. You can then drop gnome-panel from makedepends (and optdepends).
  • --with-qt4
    Build the uim tools for Qt (needs Qt).
  • --with-qt4-immodule
    Build UimQt (Qt immodule support) for Qt4 (needs Qt).
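
As an illustration, the options above could be combined in the ./configure call of the PKGBUILD. The sketch below only prints the resulting command line and builds nothing. uim_configure_args and its "qt" argument are hypothetical; the flags are the ones listed above, plus the --prefix and --libexecdir values used in this article's sample PKGBUILD.

```shell
# Hypothetical helper: prints configure arguments for a uim build.
# Pass "qt" as the first argument to add the Qt4 options as well.
uim_configure_args() {
    args="--prefix=/usr --libexecdir=/usr/lib/uim --with-anthy-utf8 --disable-gnome-applet"
    if [ "${1:-}" = "qt" ]; then
        args="$args --with-qt4 --with-qt4-immodule"
    fi
    echo "$args"
}

# Print the command line for a KDE-oriented build.
echo "./configure $(uim_configure_args qt)"
```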

Here is a sample custom PKGBUILD (enabling anthy-utf8 support and dropping the GNOME applet), based on uim 1.5.7 in extra:

pkgname=uim
pkgver=1.5.7
pkgrel=1
pkgdesc="Multilingual input method library"
arch=('i686' 'x86_64')
url="http://code.google.com/p/uim/"
license=('custom')
depends=('m17n-lib' 'ncurses' 'gtk2')
makedepends=('pkgconfig' 'gettext' 'intltool')
optdepends=()
options=('!libtool')
install=uim.install
source=(http://uim.googlecode.com/files/${pkgname}-${pkgver}.tar.bz2)
md5sums=('b84a43fb92d7ceb4bd801a76120c2a71')
sha1sums=('fbea2590286ddc857a7824d8544cb08842f4299f')

build() {
  cd "${srcdir}/${pkgname}-${pkgver}"
  ./configure --prefix=/usr --libexecdir=/usr/lib/uim \
              --disable-gnome-applet --with-anthy-utf8 \
  || return 1
  make || return 1
  make DESTDIR="${pkgdir}" install || return 1
  install -D -m644 COPYING "${pkgdir}/usr/share/licenses/${pkgname}/COPYING"
}

Finally, run makepkg in the uim directory to build and install the package:

$ makepkg -s -i

Setting the environment variables

Add the following to your ~/.xsession (for XDM, KDM or GDM) or ~/.xinitrc (for startx from the command line):

export GTK_IM_MODULE='uim'
export QT_IM_MODULE='uim'
uim-xim &
export XMODIFIERS=@im='uim'

Note: On KDE 4.4, one contributor reported that these lines only took effect when placed in ~/.xprofile; the reason is unknown.
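
After logging in again, a quick check can show whether the variables actually reached the session. A minimal sketch; check_uim_env is a hypothetical helper, and the expected values are simply the ones exported above.

```shell
# Hypothetical helper: reports input-method variables that do not have the
# values exported above. Returns 0 only if all three match.
check_uim_env() {
    status=0
    if [ "${GTK_IM_MODULE:-}" != "uim" ]; then
        echo "GTK_IM_MODULE is '${GTK_IM_MODULE:-}', expected 'uim'"
        status=1
    fi
    if [ "${QT_IM_MODULE:-}" != "uim" ]; then
        echo "QT_IM_MODULE is '${QT_IM_MODULE:-}', expected 'uim'"
        status=1
    fi
    if [ "${XMODIFIERS:-}" != "@im=uim" ]; then
        echo "XMODIFIERS is '${XMODIFIERS:-}', expected '@im=uim'"
        status=1
    fi
    return "$status"
}
```

Run check_uim_env in a terminal inside the X session. If it prints nothing, the variables are set as described; otherwise the startup file is probably not being read by the login method in use.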

Toolbar utilities

If you want to use the UimToolbar utilities, also add one of the following.

uim-toolbar-gtk/qt

To show the toolbar as a standalone window:

uim-toolbar-gtk &

or, if you built uim with --with-qt4:

uim-toolbar-qt &

uim-toolbar-gtk-systray

To show the toolbar in the system tray:

uim-toolbar-gtk-systray &
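
If the same startup file is shared between machines with different uim builds, the toolbar could be chosen at login depending on what is installed. A minimal sketch; first_available is a hypothetical helper, and the order of preference in the usage line is arbitrary.

```shell
# Hypothetical helper: prints the first program found in PATH,
# trying the candidates in the order given. Returns non-zero if none exist.
first_available() {
    for prog in "$@"; do
        if command -v "$prog" >/dev/null 2>&1; then
            echo "$prog"
            return 0
        fi
    done
    return 1
}

# Usage in ~/.xsession or ~/.xinitrc (prefers the system tray variant):
# toolbar=$(first_available uim-toolbar-gtk-systray uim-toolbar-gtk uim-toolbar-qt) && "$toolbar" &
```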

Panel applet

Alternatively, if you use GNOME, KDE or Xfce, you can use the uim-toolbar panel applet (Xfce requires xfce4-xfapplet-plugin to use uim-applet-gnome).

UIM preferences

You can configure uim preferences by running:

$ uim-pref-gtk

which opens a GUI.

To test your settings, run uim-xim or log out and back in.
If everything went well, you should now be able to input Japanese in X.

お疲れ様です! (Good work!)

Using UIM on Emacs

uim provides uim.el, a bridge between Emacs and uim.

Settings for the minor-mode

Here is a sample configuration for using uim with Anthy (UTF-8) and UTF-8 encoding. Add the following to your .emacs or another Emacs customization file.

;; Load uim.el
(require 'uim)
;; To load uim.el on demand instead, comment out the line above and uncomment the next one.
;; (autoload 'uim-mode "uim" nil t)

;; Key binding to toggle uim-mode (here C-\)
(global-set-key "\C-\\" 'uim-mode)

;; Set UTF-8 as preferred character encoding (default is euc-jp).
(setq uim-lang-code-alist
      (cons '("Japanese" "Japanese" utf-8 "UTF-8")
           (delete (assoc "Japanese" uim-lang-code-alist) 
                   uim-lang-code-alist)))

;; Start in Hiragana input mode when uim is activated.
(setq uim-default-im-prop '("action_anthy_utf8_hiragana"))

You may also add the following to your ~/.Xresources or ~/.Xdefaults so that Emacs does not use XIM directly:

Emacs.UseXIM: false

Troubleshooting

Cannot input Japanese on Opera

If you use Opera and cannot input Japanese with uim, try setting the environment variable as follows:

export QT_IM_MODULE='xim'
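
If other Qt applications work fine with uim, the variable could instead be overridden for a single program rather than for the whole session. A minimal sketch; with_xim is a hypothetical helper name, and launching Opera as "opera" is an assumption about the command name on your system.

```shell
# Hypothetical helper: runs a command with QT_IM_MODULE=xim for that
# process only, leaving the session-wide setting untouched.
with_xim() {
    env QT_IM_MODULE=xim "$@"
}

# Usage (assumed command name):
# with_xim opera &
```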

uim-toolbar-gtk-systray: tray icon is crushed

uim-toolbar-gtk-systray is not compliant with the freedesktop.org system tray specification. To work around this, show only one item in the toolbar. For example, to display only the 'Input mode' icon:

  1. Run uim-pref-gtk.
  2. Click 'Toolbar' in the 'Group' list.
  3. Clear all the checkboxes.
  4. Click 'Anthy' or 'Anthy (UTF-8)', whichever you are using, in the 'Group' list.
  5. Click the Edit button on the 'Enable toolbar buttons' line in the 'Toolbar' box.
  6. Enable only 'Input mode' and click the 'Close' button.
  7. Click the 'OK' button to close uim-pref-gtk.

The tray icon will then show "あ" (Hiragana mode) or "ー" (Direct mode).

Useful literature

http://code.google.com/p/uim/wiki/OfficialUserDocument
http://en.wikibooks.org/wiki/Uim