People on translator forums sometimes ask for recommendations on Japanese OCR (optical character recognition) software. Our experience at Localization Ninja is that no single OCR package consistently outperforms all the others on Japanese text. Performance varies widely with image quality, scanning resolution, fonts, layout, and file type (GIF, PNG, JPG, PDF, etc.).
We use all of the following for Japanese OCR:
- Adobe Acrobat: This is an obvious choice for translators because you generally need an Adobe Acrobat subscription for your work anyway. To perform OCR in Adobe Acrobat, open the image file in Acrobat and choose Tools -> Scan & OCR -> Open.
- Google: This is one of the best OCR tools, and best of all it’s completely free. Upload the image file to Google Drive, right-click on it, and choose Open with -> Google Docs. The image is displayed at the top of the document, and the recognized text appears below it. Unfortunately, Google Docs makes no effort to preserve the appearance and formatting of the text, which is a major drawback compared to the other software listed here.
- Readiris 17: Readiris is commercial OCR software for Windows and Mac sold by IRIS, a Canon company. Starting at just $49, it is a relative bargain. You load the image file into Readiris, specify the language, and then save it as a searchable PDF. Note that it does not accept GIF files.
- 読取革命 (Yomitori Kakumei): This is the only software listed here that is specific to the Japanese language. The interface and documentation are also Japanese only. With a price tag of 12,980 JPY, it’s the most expensive option here. 読取革命 was originally developed by Panasonic and is now sold by SourceNext.
Again, our experience is that there’s no single clear winner, and it’s difficult to predict which software will yield the best OCR results on a given image file. Typically we run the scanned image file through all four, and it quickly becomes obvious which one handled it best.
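
We compare the four results by eye, but if you have pasted each tool's recognized text into a string, a rough first pass can be scripted. The Python sketch below is only an illustration and not part of our actual workflow. The function names `japanese_ratio` and `mean_agreement` are made up for this example, the sample strings are invented, and both measures are crude heuristics: the share of characters that are kana or kanji, and how closely each tool's text agrees with the others. A low score on either is a hint that a tool struggled, not proof.

```python
from __future__ import annotations

from difflib import SequenceMatcher


def japanese_ratio(text: str) -> float:
    """Fraction of non-whitespace characters that are kana or CJK ideographs."""
    chars = [c for c in text if not c.isspace()]
    if not chars:
        return 0.0
    japanese = sum(
        1
        for c in chars
        if "\u3040" <= c <= "\u30ff"  # hiragana and katakana blocks
        or "\u4e00" <= c <= "\u9fff"  # CJK unified ideographs (kanji)
    )
    return japanese / len(chars)


def mean_agreement(outputs: dict[str, str]) -> dict[str, float]:
    """Average similarity of each tool's text to every other tool's text."""
    scores: dict[str, float] = {}
    for name, text in outputs.items():
        others = [t for n, t in outputs.items() if n != name]
        if not others:
            scores[name] = 0.0
            continue
        total = sum(SequenceMatcher(None, text, o).ratio() for o in others)
        scores[name] = total / len(others)
    return scores


if __name__ == "__main__":
    # Invented sample strings standing in for text pasted from each tool.
    outputs = {
        "Acrobat": "日本語のテキスト",
        "Google": "日本語のテキスト",
        "Readiris": "日木語のテ午スト",
    }
    agreement = mean_agreement(outputs)
    for name, text in outputs.items():
        print(name, round(japanese_ratio(text), 2), round(agreement[name], 2))
```

Agreement with the other tools can mislead when several tools make the same mistake, so treat the numbers as a way to decide which output to read first, not as a replacement for checking against the image.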