OS X 10.4 and above. UnicodeChecker allows you to browse the Unicode character set on your machine. For any character, it will tell you the decimal Unicode number, hexadecimal Unicode scalar value, hexadecimal UTF-8, UTF-16 and UTF-32 code, Unicode name, and more. You can also install Unihan.txt for direct access to the information from the Unihan database. See: http://www.earthlingsoft.net/UnicodeChecker/
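For readers who want to see how the values listed above relate to each other, here is a minimal Python sketch. It is not part of UnicodeChecker; the function name `describe` and the sample character are illustrative assumptions, and only the Python standard library is used.

```python
import unicodedata


def describe(ch: str) -> dict:
    """Return the kinds of values a Unicode inspector shows for one character."""
    cp = ord(ch)  # the code point as a decimal integer
    return {
        "decimal": cp,
        "scalar": f"U+{cp:04X}",                          # hexadecimal scalar value
        "utf8": ch.encode("utf-8").hex(" ").upper(),      # UTF-8 bytes
        "utf16": ch.encode("utf-16-be").hex(" ").upper(), # UTF-16, big-endian, no BOM
        "utf32": ch.encode("utf-32-be").hex(" ").upper(), # UTF-32, big-endian, no BOM
        "name": unicodedata.name(ch, "<unnamed>"),        # Unicode character name
    }


info = describe("中")
for key, value in info.items():
    print(f"{key:8} {value}")
```

The big-endian codecs are chosen here only so that the byte order mark does not appear in the output.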
OS X 10.5 and above. Allows you to change the user interface of any given application to any of the languages that the application supports.
Marcel Bresink's TinkerTool System gives you access to advanced system settings and internal maintenance features built into Mac OS X. It allows you to change the language used for system startup and login. This does not affect the language used in the Finder (which you can change in System Preferences).
- OS X 10.3–10.5: http://www.bresink.com/osx/TinkerToolSys.html
- OS X 10.6 and above: http://www.bresink.com/osx/TinkerToolSys2.html
Comes with OS X. Handles Chinese text well. However, Preview garbles the font MingLiU/PMingLiU (the traditional-Chinese system font in Windows XP and earlier) when it is embedded in PDF documents.
There is a simple workaround: Adobe Reader (and Acrobat) handles this font with no problems.
Free. Handles Chinese text well, including searches. If the author of a PDF file embeds Chinese fonts in the document, Reader will be able to display and print the document on any system. If the author uses Chinese fonts but does not embed them in the document, then you will need to take two steps in order to view and print the file:
- Use the Custom Installation option during installation of Reader. This allows you to install Chinese, Japanese, and Korean fonts inside the Reader package.
- To print the file, select the Download Asian Fonts option in the Advanced Print Setup dialog box (requires a PostScript Level 2 or higher printer).
docXConverter allows you to open files saved in the Word docx, Excel xlsx, and AppleWorks/ClarisWorks cwk formats. It reads the file you want to access and converts it to the RTF or CSV formats. Handles batch conversions seamlessly. Panergy's icWord and icExcel have similar capabilities. See: http://www.panergy-software.com/products/docxconverter/features.html
Note: Most OCR programs are designed to run at 300 dpi. Some can handle 600 dpi, but it is much slower and, in our collective experience, not much better than 300 dpi. If the type on the page is very small, you can scan at a higher resolution and then downsample to 600 or 300 dpi for a better result. Acrobat can do this itself; otherwise, you will have to use Photoshop or another image editor.
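To make the downsampling arithmetic concrete, the following Python sketch computes pixel dimensions for a scanned page. The page size (8.5 x 11 inches), the 1200 dpi scan resolution, and the function name `pixel_size` are assumptions chosen for illustration only.

```python
def pixel_size(width_in: float, height_in: float, dpi: int) -> tuple:
    """Pixel dimensions of a page scanned at the given resolution."""
    return round(width_in * dpi), round(height_in * dpi)


scan_dpi, target_dpi = 1200, 300          # assumed scan and OCR resolutions

scanned = pixel_size(8.5, 11, scan_dpi)   # size of the high-resolution scan
target = pixel_size(8.5, 11, target_dpi)  # size the OCR program expects
factor = scan_dpi / target_dpi            # linear reduction along each axis

print(scanned, "->", target, "reduce each side by a factor of", factor)
```

The image editor does the actual resampling; the sketch only shows what target size to ask for.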
Acrobat 8 and above (both Pro and Standard) do Chinese OCR. You can run it on an already-scanned PDF document via Document > OCR Text Recognition, or you can run it as you scan a new document via Document > Scan to PDF (check the "Make Searchable (Run OCR)" box and set the Options to the appropriate Chinese setting). See: http://www.adobe.com/products/acrobat/main.html
Readiris Pro comes in an Asian edition that supports Simplified and Traditional Chinese, Japanese, and Korean, along with other languages. However, Readiris is a Latin-based OCR application. While the Asian edition functions well within this limitation and provides basic Chinese OCR, to improve, it would need Chinese capabilities built into the core application. The major drawbacks are the absence of learning for Chinese and the lack of any effective proofing scheme for Chinese, both essential features of good Chinese OCR.
Readiris Pro 12 Asian does not have its own web page, but you will find it in the I.R.I.S. online shop:
IRISPen Express Asian supports Simplified and Traditional Chinese, Japanese, and Korean, along with other languages. There are no CJK-specific proofing tools, although it can do vertical CJK text. For a review, see here.
WorldPenScan supports Simplified and Traditional Chinese, Japanese, and Korean, along with other languages. There are no CJK-specific proofing tools, although it can do vertical CJK text. The Pro version includes the Babylon translation software and has a somewhat different design for the pen, with a transparent tip.
There are currently two applications that can do Chinese OCR using the camera on an iPhone 4 or an iPad 2:
There are two levels of Chinese support that must be addressed. The first is in the Unix terminal emulation software, which needs to be set up to emulate a localized Chinese Unix terminal in order to handle double-byte Chinese character set encodings (as opposed to handling Chinese with Unicode). See the entries below for some details.
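The difference between the double-byte Chinese encodings and Unicode can be seen by encoding the same text several ways. This is a minimal Python sketch using codecs from the Python standard library; the sample string is an arbitrary choice, and nothing here is specific to any terminal emulator.

```python
text = "中文"

# Encode the same two characters in each encoding and record the bytes.
encoded = {codec: text.encode(codec) for codec in ("big5", "gb2312", "utf-8")}

for codec, data in encoded.items():
    print(f"{codec:7} {len(data)} bytes: {data.hex(' ')}")

# Bytes only make sense under the encoding that produced them: reading
# Big5 bytes as GB 2312 does not give back the original text.
misread = encoded["big5"].decode("gb2312", errors="replace")
print("Big5 bytes read as GB 2312:", misread)
```

Big5 and GB 2312 each use two bytes per Chinese character but assign different byte values, which is why the terminal's encoding has to match the one the remote system uses.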
The second level is in the Unix operating system. The shell environment variables have to be set to the corresponding Chinese locale (when connecting to a localized Chinese server, for example). You can either type in the commands manually at the command line or add them to your ~/.cshrc file. Both terminal emulators start up with the default shell (tcsh, for example). When tcsh starts, it looks for and reads a number of initialization files. One of those files is ~/.cshrc. Placing setenv commands in the ~/.cshrc file ensures that the locale is set at startup and saves you the trouble of manually typing the commands every time.
Commands for the tcsh shell and Traditional Chinese locale (Big Five character set):
setenv LC_CTYPE zh_TW.Big5
setenv LANG zh_TW.Big5
Commands for the tcsh shell and Simplified Chinese locale (GB 2312 character set):
setenv LC_CTYPE zh_CN.EUC
setenv LANG zh_CN.EUC
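The locale names in these commands follow the pattern language_TERRITORY.codeset. The Python sketch below splits such a name into its parts and looks up a matching Python codec. The mapping table `CODESET_TO_CODEC` and both function names are hypothetical, and the assumption that "EUC" under zh_CN means EUC-CN (GB 2312) is taken from the text above rather than from any system's locale database.

```python
import codecs

# Hypothetical mapping from the codeset suffixes used above to Python codecs.
# "EUC" is assumed to mean EUC-CN (GB 2312), as it does under zh_CN.
CODESET_TO_CODEC = {"Big5": "big5", "EUC": "gb2312"}


def parse_locale(name: str) -> tuple:
    """Split 'zh_TW.Big5' into ('zh', 'TW', 'Big5')."""
    lang_territory, _, codeset = name.partition(".")
    language, _, territory = lang_territory.partition("_")
    return language, territory, codeset


def codec_for(name: str) -> str:
    """Return the canonical Python codec name for a locale's codeset."""
    _, _, codeset = parse_locale(name)
    return codecs.lookup(CODESET_TO_CODEC[codeset]).name


for locale_name in ("zh_TW.Big5", "zh_CN.EUC"):
    print(locale_name, parse_locale(locale_name), codec_for(locale_name))
```

Locale names vary between Unix systems, so the exact strings accepted by a given server should be checked there (for example with `locale -a`).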
To upload or download files containing Chinese, you must set the FTP transfer mode to binary format before you enter the
OS X only, in the /Applications/Utilities folder. This discussion is based on Terminal 1.4.1 in OS X 10.3.
UTF-8 is the default setting, and you can enter Unicode Chinese characters using the Character Palette without changing any settings. Other standard Chinese character set encodings are available in Terminal > Window Settings... Display, and thus Terminal shell windows can be set up to emulate localized Chinese Unix terminals.
OS X only. iTerm is an open-source project focused on multilingual support, including Unicode and all standard East Asian encodings. Change the encoding in Preferences > Shell settings to set up iTerm to emulate a localized Chinese Unix terminal.