Input Methods

Apple IMs

Simplified Chinese

Update: In OS X 10.5, the Simplified Chinese Input Method comes with built-in English-language Help. See What's New in Leopard.

Traditional Chinese

Update: In OS X 10.5, the Traditional Chinese Input Method comes with built-in English-language Help. See What's New in Leopard.

Character Palette

Other IMs

QuickCore Input Method (QIM)

QIM is a Hanyu Pinyin input method with support for both Simplified and Traditional Chinese.

Simply put, QIM 1.4 is the best Pinyin input method available for Mac OS X. Its underlying data structure has been completely redesigned to support the powerful Heima shenpin [黑马神拼] technology. Currently, this provides two giant steps forward. The first is intelligent sentence parsing: the ability to type long sentences as input strings while maintaining a high rate of accuracy in transforming them into hanzi. The second is a very large and highly efficient database structure, containing more than 22 million entries across the entire spectrum of professional usage. The dictionary of the Sogou [搜狗] search engine has also been included, providing a frequently updated database of contemporary terms.

Other useful features include the ability to limit the display of candidates to standard hanzi in the Xinhua Dictionary [新华字典], optional support for filtering candidates by Pinyin tone, and support for all keyboard layouts, such as French and German. QIM also allows the user to inspect and edit the User Dictionary, with import and export capabilities.

For OS X 10.3 and above. Localized for both Chinese and English. It comes in a demo mode which lasts for 10,000 hanzi or 30 days, whichever comes first.

Download and registration: http://glider.ismac.cn/RegQIME.html

User forum: http://www.sinomac.com/modules/newbb/viewforum.php?forum=24

Fun Input Toy (FIT)

Free! OS X 10.4 and above. A good, basic input method for Pinyin input of words and phrases (without tones).

http://fit.coollittlethings.com/

DerYi

DerYi [奇易, originally 得意 as Macintosh shareware] is based in Taiwan. First developed in 1994, its main input modes are Zhuyin, Pinyin, and Cangjie, along with user-defined phrase input. Like Hanin, it uses a rolling approach to converting input strings as you type, but DerYi gives the user more control over the process.

Version 4xxx (OS X 10.4 and above) comes with a Unicode option, along with Big5 and GB.

http://www.smartinside.com.tw/

MacKEY

MacKEY's built-in input method can convert long strings of Pinyin text, with tones optional. For best results, you should parse the Pinyin into words and phrases as you type. See Study Tools.

Wenlin

Wenlin's built-in input method allows you to input any character or compound (word or phrase) in the dictionary, with tones optional. See Study Tools.

Cantonese

OpenVanilla provides CIN-format data tables for Cantonese using the Jyutping [粵拼] romanization system developed by the Linguistic Society of Hong Kong.

Dominic Yu's Apple-format plug-ins for Cantonese also use the Jyutping system. Versions are available for OS X 10.2 and above, as well as for OS 9 to OS X 10.1. He also provides a plug-in for the Yale system (OS X 10.2 and above only). See: http://rescomp.stanford.edu/~domingo2/Chinese.html

IM Frameworks

Input-method frameworks generate input methods from data tables. In the past, these source files had to be converted into a format the framework could handle, such as a binary data file with the ".dat" extension. More recent frameworks, like those in Mac OS X 10.5 and OpenVanilla, can handle plain-text data tables directly.
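
As a rough illustration of what "generating an input method from a data table" means, here is a minimal Python sketch. It assumes a hypothetical two-column plain-text table (input string, whitespace, output string) and is not the code of any actual framework; the function names load_table and candidates are invented for this example.

```python
from collections import defaultdict


def load_table(lines):
    """Build a lookup table from 'input output' lines (hypothetical format)."""
    table = defaultdict(list)
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        code, output = line.split(None, 1)
        # Fold case so that "MA" and "ma" select the same candidates.
        table[code.lower()].append(output)
    return table


def candidates(table, typed):
    """Return the candidates for the typed string, in table order."""
    return table.get(typed.lower(), [])


# A toy Pinyin table; a real table would hold thousands of entries.
sample = ["# toy Pinyin table", "ma 妈", "ma 马", "hao 好"]
table = load_table(sample)
print(candidates(table, "MA"))
```

A real framework adds candidate windows, selection keys, and phrase handling on top of this kind of lookup.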

OpenVanilla

OpenVanilla is a free, open-source, Unicode-based Chinese input method framework based in Taiwan. It supports data tables in the CIN format for its generic input module. A variety of tables are available, including data for CangJie [倉頡], Dayi [大易], Jianyi [簡易], Wubizixing [五笔字形], and Four-corner [四角號碼], among others. The tables can easily be edited, and other tables can be adapted for use in OpenVanilla. For example, tables from the popular but proprietary Boshiamy [嘸蝦米] input method can be installed. See here for a more detailed discussion of this topic (in Chinese).

OpenVanilla also includes more sophisticated modules, like the open-source, intelligent Chewing [酷音] input method. It is similar to Hanin, but is Zhuyin-only and supports an even wider array of keyboards. The well-known Array [行列] input method comes with a dedicated input module to use instead of the generic one. Taiwanese (POJ) and Tibetan (Sambhota and TCC) also have their own modules, among others.

For OS X 10.4 and above.

http://openvanilla.org/

Help: http://docs.google.com/View?docid=ah6d8th954vw_1896zrnrb

CIN plug-ins

In OS X 10.5 (only), you can create a plain-text source file using the CIN plug-in data format described below, change the file extension to ".cin", and then place it in the /Library/Input Methods folder or your Home ~/Library/Input Methods folder. See the Plug-in Input Method Help.

For samples, see: http://openvanilla.googlecode.com/svn/trunk/Modules/SharedData/

  • The plain-text source file must be in UTF-8 encoding.
    • Windows-style CR+LF newlines are not allowed.
    • Mac OS X and UNIX-style \n (LF) newlines are required.
  • For comments, use # at the start of a line and the line will be ignored.
    • Comments are not allowed in the %chardef block.
  • Blank lines are ignored.
  • %gen_inp
    • Required. Tells the input-method framework what to do.
  • %cname
    • Determines the name that will appear in the Input menu.
    • Only %cname is valid in OS X 10.5.
    • %cname, %ename, %tcname, and %scname are all valid in OpenVanilla.
  • %encoding
    • UTF-8 is required for both OS X 10.5 and OpenVanilla.
  • %selkey
    • Defines keys used to select candidates.
    • In Mac OS X 10.5, this setting is ignored and defaults to 123456789.
  • %endkey
    • %endkey is designed to provide punctuation and other symbols in input methods.
    • Keys declared here also need to be defined in both the %keyname and %chardef blocks.
    • See OpenVanilla's bpmf.cin [注音] and cj.cin [倉頡] for examples of how this works. See also Biaoyin 2.
  • %keyname begin
    • Required. Key definitions begin on the next line.
    • These definitions map keys typed to characters displayed in the inline/input window. They are used mostly to display radicals in radical-based input methods like CangJie, Dayi, and Wubi, but they can also be used in other ways.
  • %keyname end
    • Required. Key definitions end on the previous line.
  • %chardef begin
    • Required. The data table begins on the next line.
    • Unlike the Apple format, the CIN format does not use a delimiter. Instead, each entry is listed on a new line, with a space (or tab) between the input string and the output string.
    • Input strings are not case-sensitive.
  • %chardef end
    • Required. The data table ends on the previous line.
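
The rules above can be sketched as a small Python script that builds a minimal CIN table and writes it as UTF-8 with LF newlines. This is only an illustration: the helper names build_cin and write_cin, the sample name, and the three CangJie-style entries are assumptions made for the example, and the directive syntax (a directive followed by a space and its value) follows the OpenVanilla sample tables.

```python
def build_cin(cname, keynames, chardef, selkey="123456789"):
    """Return the text of a minimal CIN table (hypothetical helper)."""
    lines = [
        "# Generated sample table",
        "%gen_inp",
        f"%cname {cname}",
        "%encoding UTF-8",
        f"%selkey {selkey}",
        "%keyname begin",
    ]
    lines += [f"{key} {label}" for key, label in keynames]
    lines += ["%keyname end", "%chardef begin"]
    # No comments inside the %chardef block; one entry per line,
    # with a space between the input string and the output string.
    lines += [f"{code} {output}" for code, output in chardef]
    lines.append("%chardef end")
    return "\n".join(lines) + "\n"


def write_cin(path, text):
    """Write the table as UTF-8 with LF newlines."""
    # newline="\n" stops Python from writing CR+LF on Windows.
    with open(path, "w", encoding="utf-8", newline="\n") as handle:
        handle.write(text)


# Sample data (an assumption for illustration): keys a and b shown as radicals.
sample_cin = build_cin(
    "示例",
    keynames=[("a", "日"), ("b", "月")],
    chardef=[("a", "日"), ("b", "月"), ("ab", "明")],
)
print(sample_cin)
```

Calling write_cin("sample.cin", sample_cin) would produce a file ready to be copied into one of the Input Methods folders mentioned above.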

Apple plug-ins

Plain-text source files in the Apple plug-in data format described below can be used with the Input Method Plug-in Converter utility in OS 9, and OS X 10.3 and 10.4. In OS X 10.5, you simply change the file extension to ".inputplugin" and then place the file in the /Library/Input Methods folder or your Home ~/Library/Input Methods folder. See the Plug-in Input Method Help.

For samples, see Biaoyin/Biauyin and the Apple-format Cantonese plug-ins listed above.

  • In OS X 10.3, 10.4, and 10.5, the plain-text source file must be in UTF-16 encoding.
  • In OS 9, it must be in GB 2312 (Simplified Chinese only) or Big Five (Traditional Chinese only) encoding.
  • For comments, use # at the start of a line and the line will be ignored.
  • Blank lines are ignored.
  • Don't leave fields blank. Either omit the line (if it is not required and the default meets your needs) or define a value.
  • METHOD:
    • TABLE
    • Required. Defines the format of the data.
  • ENCODE:
    • SC [Simplified Chinese]
    • TC [Traditional Chinese]
    • Unicode [Unicode, UTF-16 also valid]
    • Required. Determines the character set and script/framework used in the plug-in.
    • Legacy values, like GB2312 and BIG5, are deprecated.
  • PROMPT:
    • Required. Determines the name that will appear in the Input menu.
    • 32 characters maximum.
  • VERSION:
    • For the version number of the data.
    • 8 characters maximum.
  • DELIMITER:
    • Required. Determines the character used to delimit multiple output strings for the same input string.
  • MAXINPUTCODE:
    • Limits the length of input strings.
    • In OS 9 to OS X 10.4, the maximum length of input strings is seven (the default).
    • In OS X 10.5, there is no maximum length of input strings.
  • VALIDINPUTKEY:
    • Required. Determines the characters that are valid in input strings.
    • Input strings are not case-sensitive.
  • BEGINCHARACTER
    • Required. The body of the data table begins on the next line.
    • For lines of data, the tab key is the delimiter between the input string and the output string(s).
    • Output strings can vary in length, up to 100 characters. [?]
  • ENDCHARACTER
    • Required. The data table ends on the previous line.
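
As a hedged sketch of the format above, the following Python script assembles a small Apple-format table and encodes it as UTF-16. The helper name build_inputplugin and the sample data are invented for this example. Writing each header as the keyword, a space, and the value, using LF newlines, and letting Python emit a byte-order mark are all assumptions that the description above does not settle.

```python
def build_inputplugin(prompt, entries, valid_keys, encode="Unicode",
                      version="1.0", delimiter=",", max_input=None):
    """Return the text of an Apple-format plug-in table (hypothetical helper)."""
    if len(prompt) > 32:
        raise ValueError("PROMPT is limited to 32 characters")
    if len(version) > 8:
        raise ValueError("VERSION is limited to 8 characters")
    lines = [
        "METHOD: TABLE",
        f"ENCODE: {encode}",
        f"PROMPT: {prompt}",
        f"VERSION: {version}",
        f"DELIMITER: {delimiter}",
    ]
    if max_input is not None:  # omit the line rather than leave it blank
        lines.append(f"MAXINPUTCODE: {max_input}")
    lines += [f"VALIDINPUTKEY: {valid_keys}", "BEGINCHARACTER"]
    allowed = set(valid_keys.lower())  # input strings are not case-sensitive
    for code, outputs in entries:
        if not set(code.lower()) <= allowed:
            raise ValueError(f"{code!r} uses keys outside VALIDINPUTKEY")
        if max_input is not None and len(code) > max_input:
            raise ValueError(f"{code!r} is longer than MAXINPUTCODE")
        # A tab separates the input string from its output string(s).
        lines.append(code + "\t" + delimiter.join(outputs))
    lines.append("ENDCHARACTER")
    return "\n".join(lines) + "\n"


sample_plugin = build_inputplugin(
    "Sample",
    [("ma", ["妈", "马"]), ("hao", ["好"])],
    valid_keys="abcdefghijklmnopqrstuvwxyz",
    max_input=7,
)
# Python's "utf-16" codec prepends a byte-order mark (an assumption here).
plugin_bytes = sample_plugin.encode("utf-16")
```

Writing plugin_bytes to a file in binary mode, with the ".inputplugin" extension, gives a file in the shape described above for OS X 10.5.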

Handwriting Input

PenPower HandWriter

HandWriter 6.0 comes with a Tooya Pro graphics tablet when you buy it through PenPower. Requires OS X 10.4 or 10.5. Universal binary. See:

http://store.penpowerchinese.com/mac.html

Wenlin

Wenlin includes a simple, mouse-based handwriting input tool. Wenlin does not distinguish one pointing device from another, and most tablets can be set up so they have the same effect as a mouse, so you can also use it with a tablet (e.g., Wacom). Set the options so that "List characters to choose" is turned on. See Study Tools.

Optical Character Recognition (OCR)

Readiris Pro Asian

Readiris Pro comes in an Asian edition that supports Japanese, Simplified Chinese, Traditional Chinese, and Korean, along with a few dozen other languages. However, Readiris is a Latin-based OCR application. While the Asian edition functions well within this limitation and provides basic Chinese OCR, it would need Chinese capabilities built into the core application in order to improve. The major drawbacks are the absence of learning for Chinese and the lack of any effective proofing scheme for Chinese, both essential features of good Chinese OCR.

As of September 2006, Readiris Pro 11 Asian for the Mac OS does not have its own web page, but you will find it in the I.R.I.S. online shop:

http://www.irislink.com/c2-562/OCR-Software-pricelist-shop-US.aspx

IRISPen Express Asian

IRISPen Express Asian is a pen scanner that supports Japanese, Simplified Chinese, Traditional Chinese, and Korean, along with a few dozen other languages. The software runs as a foreground or background process. In the foreground, a window displays the bitmapped scan of the line of text and the OCR interpretation of the scan. As a background process, the program sends the OCR result to wherever the cursor is in your foreground application or to the Clipboard.

http://www.irislink.com/

Troubleshooting: