Supported languages matrix
This table displays each language supported by a Relativity feature and its corresponding functionality status. The features include OCR, Assisted Review, Structured Analytics, Processing, and the Viewer. Stemming, date recognition, and querying on abbreviations (that is, a single letter followed by a period) are available only for English text in a dtSearch index. The SQL Server settings determine which languages are available for the word-break characters used in the full text index.
Use the following resources for more information on SQL Server and dtSearch supported languages:
- See this site for a list of search features for languages supported by dtSearch.
- See this site for a list of Unicode supported languages also supported by dtSearch.
- See this site for a list of SQL Server supported languages.
See Command line import for a complete list of alternate language encoding values. See Importing document metadata, files, and extracted text for instructions on importing documents with the Relativity Desktop Client and selecting the appropriate file encoding value.
- √ - indicates that the language is supported.
- √* - indicates that the language must be installed in the Microsoft operating system for the viewer to function. Specifically, you must install the language on the web server, the conversion agent server, and the local workstation.
- If the cell is empty, the feature is not supported.
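The legend above can also be applied programmatically when the matrix is consumed by a script. The following is a minimal sketch in Python; the names `Support` and `interpret_cell` are hypothetical and are not part of any Relativity API. It only assumes that each cell contains one of the three markers described in the legend.

```python
from enum import Enum


class Support(Enum):
    """Hypothetical status values that mirror the legend."""
    SUPPORTED = "supported"
    NEEDS_OS_LANGUAGE = "supported once the language is installed in the operating system"
    UNSUPPORTED = "not supported"


def interpret_cell(cell: str) -> Support:
    """Map the text of one table cell to a support status."""
    marker = cell.strip()
    if marker == "√*":
        # The language must be installed on the web server,
        # conversion agent server, and local workstation.
        return Support.NEEDS_OS_LANGUAGE
    if marker == "√":
        return Support.SUPPORTED
    if marker == "":
        # An empty cell means the feature is not supported.
        return Support.UNSUPPORTED
    raise ValueError(f"Unrecognized marker: {cell!r}")
```

For example, `interpret_cell("√*")` returns `Support.NEEDS_OS_LANGUAGE`, and an empty string returns `Support.UNSUPPORTED`.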
Special considerations
Note the following details about the supported languages:
- dtSearch in Relativity is accent-insensitive by default. This means characters with accent marks and other diacritics are stored in the same fashion as those without those marks. If you need to perform a search that includes accents, change the Create Accent Sensitive setting on the dtSearch index to Yes.
- Indexing in SQL is based on the character set of the language you select. Western languages are similar grammatically, which means that you should experience no issues when searching for English words with SQL. In addition, SQL tokenization is only used for symbols that mean one thing when they are alone, but something else when they are put together with other symbols, such as with CJK languages.
- Conceptual Analytics and Classification indexes are language-agnostic and therefore support all languages. Categorization does not display Unicode choices in the field tree properly.
- The Arabic, Hebrew, Thai, Vietnamese, and Belarusian languages are displayed as selectable for the Default OCR language on the Processing Profile but they are not supported and will not work if you select them. They will be removed in Server 2023 patch 1.
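The accent-insensitive behavior described in the dtSearch item above can be illustrated with a short sketch. This is not dtSearch's implementation. It assumes that accent folding amounts to removing Unicode combining marks, and it shows why a search for "resume" would match "résumé" in an accent-insensitive comparison but not in an accent-sensitive one. The function names `fold_accents` and `matches` are hypothetical.

```python
import unicodedata


def fold_accents(text: str) -> str:
    """Return text with combining diacritical marks removed."""
    # NFD splits each accented character into a base character
    # followed by one or more combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    # Keep only the characters that are not combining marks.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return unicodedata.normalize("NFC", stripped)


def matches(term: str, word: str, accent_sensitive: bool = False) -> bool:
    """Compare a search term with an indexed word, ignoring case."""
    if accent_sensitive:
        # Normalize only the representation so that accents still count.
        left = unicodedata.normalize("NFC", term)
        right = unicodedata.normalize("NFC", word)
    else:
        left, right = fold_accents(term), fold_accents(word)
    return left.casefold() == right.casefold()
```

With the default setting, `matches("resume", "résumé")` is true. Passing `accent_sensitive=True`, which loosely corresponds to setting Create Accent Sensitive to Yes, makes the same comparison false.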
Supported languages
Language | OCR | Processing | Native Imaging | Structured Analytics Language Identification | Viewer |
---|---|---|---|---|---|
English | √ | √ | √ | √ | √ |
Abkhazian | √ | ||||
Afar | √ | ||||
Afrikaans | √ | √ | √ | √ | √ |
Akan | √ | ||||
Albanian | √ | √ | √ | √ | √ |
Amharic | √ | ||||
Arabic | √ | √ | √ | √* | |
Armenian | √ | √ | √ | √* | |
Assamese | √ | ||||
Aymara | √ | √ | √ | √ | √ |
Azerbaijani | √ | ||||
Bashkir | √ | ||||
Basque | √ | √ | √ | √ | √ |
Belarusian | √ | √ | √ | ||
Bengali | √ | ||||
Bemba | √ | √ | √ | √ | |
Bihari | √ | ||||
Bislama | √ | ||||
Blackfoot | √ | √ | √ | √ | |
Bosnian | √ | ||||
Breton | √ | √ | √ | √ | √ |
Bugotu | √ | √ | √ | √ | |
Bulgarian (Cyrillic) | √ | √ | √ | √ | √ |
Byelorussian (Cyrillic) | √ | √ | √ | √ | √ |
Burmese | √ | ||||
Catalan | √ | √ | √ | √ | √ |
Cebuano | √ | ||||
Chamorro | √ | √ | √ | √ | |
Chechen | √ | √ | √ | √ | |
Cherokee | √ | ||||
Chinese (Simplified) | √ | √ | √ | √ | √* |
Chinese (Traditional) | √ | √ | √ | √ | √* |
Chuana or Tswana | √ | √ | √ | √ | |
Corsican | √ | √ | √ | √ | √ |
Croatian | √ | √ | √ | √ | √ |
Crow | √ | √ | √ | √ | |
Czech | √ | √ | √ | √ | √ |
Danish | √ | √ | √ | √ | √ |
Dhivehi | √ | ||||
Dholuo | √ | ||||
Dutch | √ | √ | √ | √ | √ |
Dzongkha | √ | ||||
Eskimo | √ | √ | √ | √ | |
Esperanto | √ | √ | √ | √ | √ |
Estonian | √ | √ | √ | √ | √ |
Ewe | √ | ||||
Faroese | √ | √ | √ | √ | √ |
Fijian | √ | √ | √ | √ | √ |
Finnish | √ | √ | √ | √ | √ |
French | √ | √ | √ | √ | √ |
Frisian | √ | √ | √ | √ | √ |
Friulian | √ | √ | √ | √ | |
Ga | √ | ||||
Gaelic Irish | √ | √ | √ | √ | |
Gaelic Scottish | √ | √ | √ | √ | |
Galician | √ | √ | √ | √ | √ |
Ganda or Luganda | √ | √ | √ | √ | √ |
Georgian | √ | √ | √* | ||
German | √ | √ | √ | √ | √ |
Greek | √ | √ | √ | √ | √ |
Greenlandic | √ | ||||
Guarani | √ | √ | √ | √ | √ |
Gujarati | √ | ||||
Haitian Creole | √ | ||||
Hani | √ | √ | √ | √ | |
Hausa | √ | ||||
Hawaiian | √ | √ | √ | √ | √ |
Hebrew | √ | √ | √ | √* | |
Hindi | √ | ||||
Hmong | √ | ||||
Hungarian | √ | √ | √ | √ | √ |
Icelandic | √ | √ | √ | √ | √ |
Ido | √ | √ | √ | √ | |
Igbo | √ | ||||
Indic Languages | √* | ||||
Indonesian | √ | √ | √ | √ | √ |
Interlingua | √ | √ | √ | √ | √ |
Interlingue | √ | ||||
Inuktitut | √ | ||||
Inupiak | √ | ||||
Irish | √ | ||||
Italian | √ | √ | √ | √ | √ |
Japanese | √ | √ | √ | √ | √* |
Javanese | √ | ||||
Kabardian | √ | √ | √ | √ | |
Kannada | √ | ||||
Kashmiri | √ | ||||
Kashubian | √ | √ | √ | √ | |
Kawa | √ | √ | √ | √ | |
Kazakh | √ | ||||
Khasi | √ | ||||
Khmer | √ | ||||
Kikuyu | √ | √ | √ | √ | |
Kinyarwanda | √ | ||||
Kongo | √ | √ | √ | √ | |
Korean | √ | √ | √ | √ | √* |
Kpelle | √ | √ | √ | √ | |
Krio | √ | ||||
Kurdish | √ | √ | √ | √ | √ |
Kyrgyz | √ | ||||
Laothian | √ | ||||
Latin | √ | √ | √ | √ | √ |
Latvian | √ | √ | √ | √ | √ |
Limbu | √ | ||||
Lingala | √ | ||||
Lithuanian | √ | √ | √ | √ | √ |
Lozi | √ | ||||
Luba | √ | √ | √ | √ | |
Lule Sami | √ | √ | √ | √ | |
Luxembourgian | √ | √ | √ | √ | √ |
Macedonian (Cyrillic) | √ | √ | √ | √ | √ |
Malagasy | √ | √ | √ | √ | √ |
Malay | √ | √ | √ | √ | √ |
Malayalam | √ | ||||
Malinke | √ | √ | √ | √ | |
Maltese | √ | √ | √ | √ | √ |
Manx | √ | ||||
Maori | √ | √ | √ | √ | √ |
Marathi | √ | ||||
Mauritian Creole | √ | ||||
Mayan | √ | √ | √ | √ | |
Miao | √ | √ | √ | √ | |
Minankabaw | √ | √ | √ | √ | |
Mohawk | √ | √ | √ | √ | |
Moldavian (Cyrillic) | √ | √ | √ | √ | |
Mongolian | √ | ||||
Montenegrin | √ | ||||
Nahuatl | √ | √ | √ | √ | |
Nauru | √ | ||||
Nepali | √ | ||||
Newari | √ | ||||
Northern Sami | √ | √ | √ | √ | |
Norwegian | √ | √ | √ | √ | √ |
Norwegian Nynorsk | √ | ||||
Nyanja | √ | √ | √ | √ | √ |
Occidental | √ | √ | √ | √ | |
Occitan | √ | ||||
Ojibway | √ | √ | √ | √ | |
Oriya | √ | ||||
Oromo | √ | ||||
Ossetian | √ | ||||
Pampanga | √ | ||||
Papiamento | √ | √ | √ | √ | |
Pashto | √ | ||||
Pedi | √ | ||||
Persian | √ | ||||
Pidgin English | √ | √ | √ | √ | |
Polish | √ | √ | √ | √ | √ |
Portuguese | √ | √ | √ | √ | √ |
Portuguese (Brazilian) | √ | √ | √ | √ | √ |
Provencal | √ | √ | √ | √ | |
Punjabi | √ | ||||
Quechua | √ | √ | √ | √ | √ |
Rajasthani | √ | ||||
Rhaetic | √ | √ | √ | √ | |
Rhaeto - Romance | √ | ||||
Romanian | √ | √ | √ | √ | √ |
Romany | √ | √ | √ | √ | |
Ruanda | √ | √ | √ | √ | |
Rundi | √ | √ | √ | √ | √ |
Russian (Cyrillic) | √ | √ | √ | √ | √ |
Sami | √ | √ | √ | √ | |
Samoan | √ | √ | √ | √ | √ |
Sango | √ | ||||
Sanskrit | √ | ||||
Sardinian | √ | √ | √ | √ | |
Scots | √ | ||||
Scottish Gaelic | √ | ||||
Serbian (Cyrillic) | √ | √ | √ | √ | √ |
Serbian (Latin) | √ | √ | √ | √ | √ |
Seselwa | √ | ||||
Sesotho | √ | ||||
Shona | √ | √ | √ | √ | √ |
Sindhi | √ | ||||
Sinhalese | √ | ||||
Sioux | √ | √ | √ | √ | |
Siswant | √ | ||||
Slovak | √ | √ | √ | √ | √ |
Slovenian | √ | √ | √ | √ | √ |
Somali | √ | √ | √ | √ | √ |
Sotho, Suto, or Sesuto | √ | √ | √ | √ | |
Southern Sami | √ | √ | √ | √ | |
Spanish | √ | √ | √ | √ | √ |
Sudanese | √ | √ | √ | √ | √ |
Swahili | √ | √ | √ | √ | √ |
Swazi | √ | √ | √ | √ | |
Swedish | √ | √ | √ | √ | √ |
Syriac | √ | ||||
Tagalog | √ | √ | √ | √ | √ |
Tahitian | √ | √ | √ | √ | |
Tajik | √ | ||||
Tamil | √ | ||||
Tatar | √ | ||||
Telugu | √ | ||||
Thai | √ | √ | √ | √* | |
Tibetan | √ | ||||
Tigrinya | √ | ||||
Tinpo | √ | √ | √ | √ | |
Tonga | √ | ||||
Tongan | √ | √ | √ | √ | |
Tshiluba | √ | ||||
Tsonga | √ | ||||
Tswana | √ | ||||
Tumbuka | √ | ||||
Tun | √ | √ | √ | √ | |
Turkish | √ | √ | √ | √ | √ |
Turkmen | √ | ||||
Twi | √ | ||||
Uighur | √ | ||||
Ukrainian (Cyrillic) | √ | √ | √ | √ | √ |
Urdu | √ | ||||
Uzbek | √ | ||||
Venda | √ | ||||
Vietnamese | √ | √ | √* | ||
Visayan | √ | √ | √ | √ | |
Volapuk | √ | ||||
Waray-Waray | √ | ||||
Welsh | √ | √ | √ | √ | √ |
Wend or Sorbian | √ | √ | √ | √ | |
Wolof | √ | √ | √ | √ | √ |
Xhosa | √ | √ | √ | √ | √ |
Yiddish | √ | ||||
Yoruba | √ | ||||
Zapotec | √ | √ | √ | √ | |
Zhuang | √ | ||||
Zulu | √ | √ | √ | √ | √ |
See Command line import for a complete list of supported language encoding values.