Supported OCR Languages
OCR & PDF Tools supports over 100 languages for text recognition, powered by the Tesseract OCR engine. All language processing happens locally on your computer.
Major Languages
These languages have the highest recognition accuracy and are the most commonly used:
| Language | Code | Script |
|---|---|---|
| English | eng | Latin |
| Spanish | spa | Latin |
| French | fra | Latin |
| German | deu | Latin |
| Italian | ita | Latin |
| Portuguese | por | Latin |
| Dutch | nld | Latin |
| Russian | rus | Cyrillic |
| Chinese (Simplified) | chi_sim | CJK |
| Chinese (Traditional) | chi_tra | CJK |
| Japanese | jpn | CJK |
| Korean | kor | CJK |
| Arabic | ara | Arabic |
| Hindi | hin | Devanagari |
| Thai | tha | Thai |
| Vietnamese | vie | Latin (extended) |
| Turkish | tur | Latin |
| Polish | pol | Latin |
| Swedish | swe | Latin |
| Czech | ces | Latin |
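The codes in the table are the identifiers Tesseract uses for its language data. As a minimal sketch of how they combine, the snippet below maps a few language names to their codes and joins them with `+`, which is Tesseract's syntax for naming more than one language. The `LANGUAGE_CODES` dictionary and the `to_lang_spec` helper are hypothetical names used for illustration; they are not part of OCR & PDF Tools.

```python
# A subset of the codes from the table above, for illustration only.
LANGUAGE_CODES = {
    "English": "eng",
    "Spanish": "spa",
    "French": "fra",
    "German": "deu",
    "Chinese (Simplified)": "chi_sim",
    "Japanese": "jpn",
    "Arabic": "ara",
}


def to_lang_spec(names):
    """Return a Tesseract language spec such as 'eng+fra' for the given names.

    Hypothetical helper: raises KeyError if a name is not in LANGUAGE_CODES.
    """
    missing = [name for name in names if name not in LANGUAGE_CODES]
    if missing:
        raise KeyError("No code known for: " + ", ".join(missing))
    # Tesseract accepts several languages joined with '+'.
    return "+".join(LANGUAGE_CODES[name] for name in names)


print(to_lang_spec(["English", "French"]))  # eng+fra
```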
Additional Languages
Beyond the major languages listed above, OCR & PDF Tools supports many more, including:
- European: Danish, Finnish, Norwegian, Hungarian, Romanian, Greek, Bulgarian, Croatian, Slovak, Slovenian, Lithuanian, Latvian, Estonian, Serbian, Ukrainian, Catalan, Galician, Basque, Icelandic, Maltese, Albanian, Macedonian, Bosnian
- Asian: Bengali, Tamil, Telugu, Kannada, Malayalam, Gujarati, Marathi, Nepali, Sinhala, Burmese (Myanmar), Khmer, Lao, Tibetan, Urdu
- African: Amharic, Swahili, Afrikaans
- Other scripts: Hebrew, Georgian, Armenian
Installing Language Packs
English is installed by default. To add more languages:
- Open Settings > Language Packs
- Browse or search for the language you need
- Select the checkbox next to each language you want to install
- Click Download to install the selected language data files
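Tesseract stores each language as a file named `<code>.traineddata`, so one way to confirm which packs are present is to list those files. The sketch below assumes you know the folder that holds the data files; where OCR & PDF Tools keeps them is not documented here, so the path in the usage line is a placeholder, and `installed_languages` is a hypothetical helper.

```python
from pathlib import Path


def installed_languages(tessdata_dir):
    """Return the sorted language codes found in a Tesseract data folder.

    Hypothetical helper: each '<code>.traineddata' file counts as one pack.
    Some Tesseract installs also ship 'osd.traineddata', which handles
    orientation and script detection rather than a language.
    """
    folder = Path(tessdata_dir)
    if not folder.is_dir():
        return []
    return sorted(path.stem for path in folder.glob("*.traineddata"))


# Placeholder path: replace it with the actual data folder on your system.
print(installed_languages("path/to/tessdata"))
```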
Disk Space
Each language pack takes between 1 and 15 MB of disk space. CJK languages (Chinese, Japanese, Korean) have larger data files because of the size and complexity of their character sets.
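For a rough idea of how much space a set of packs needs, multiply the number of packs by the 1 MB and 15 MB bounds quoted above. This is only a back-of-the-envelope sketch; `estimate_disk_mb` is a hypothetical helper, and actual file sizes depend on the languages chosen.

```python
# Per-pack size bounds in MB, taken from the range stated above.
MIN_PACK_MB = 1
MAX_PACK_MB = 15


def estimate_disk_mb(pack_count):
    """Return (lowest, highest) total size in MB for pack_count packs."""
    if pack_count < 0:
        raise ValueError("pack_count must not be negative")
    return pack_count * MIN_PACK_MB, pack_count * MAX_PACK_MB


low, high = estimate_disk_mb(4)
print(f"4 packs need roughly {low} to {high} MB")  # 4 to 60 MB
```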
Selecting a Language for OCR
Before running OCR on an image or document:
- Click the Language dropdown in the toolbar or settings
- Select the primary language of the text in your image
- For documents with multiple languages, see Multilingual OCR
Accuracy Tip
Always select the correct language before running OCR. Using the wrong language setting will produce poor results even on clear, high-quality images.
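To illustrate what the language setting controls, the sketch below builds the equivalent command for the Tesseract command-line tool, where the language is passed with the `-l` flag. It assumes the `tesseract` executable is installed separately and available on your PATH; how OCR & PDF Tools calls the engine internally is not described here, and `build_tesseract_command` and `run_ocr` are hypothetical helpers.

```python
import shutil
import subprocess


def build_tesseract_command(image_path, lang="eng"):
    """Build the argument list for a Tesseract run that prints to stdout."""
    # 'stdout' as the output base tells Tesseract to print the recognized text.
    return ["tesseract", image_path, "stdout", "-l", lang]


def run_ocr(image_path, lang="eng"):
    """Run Tesseract on image_path and return the recognized text."""
    if shutil.which("tesseract") is None:
        raise RuntimeError("The tesseract executable was not found on PATH")
    result = subprocess.run(
        build_tesseract_command(image_path, lang),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


print(build_tesseract_command("scan.png", "fra"))
```

Running the same image with `lang="eng"` and then `lang="fra"` is a quick way to see how much the language choice affects the output.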