Supported languages matrix

The table below lists each language and shows which Relativity features support it. The features covered are OCR, Processing, Native Imaging, Structured Analytics language identification, and the Viewer. In a dtSearch index, stemming, date recognition, and querying on abbreviations (i.e., a single letter followed by a period) are available only for English text. The SQL Server settings determine the languages available for word-break characters used in the full text index.

Use the following resources for more information on SQL Server and dtSearch supported languages:

See dtSearch - International Languages for a list of search features for languages supported by dtSearch.
See dtSearch Support for Unicode for a list of Unicode-supported languages that dtSearch also supports.
See sys.fulltext_languages (Transact-SQL) for a list of SQL Server supported languages.

Note the following about the table below:

√ - indicates that the language is supported.
√* - indicates that the language must be installed in the Microsoft operating system for the viewer to function. Specifically, you must install the language on your local workstation.
If the cell is empty, the feature is not supported.

Special considerations

Note the following details about the supported languages:

dtSearch in Relativity is accent-insensitive by default. This means characters with accent marks and other diacritics are stored in the same fashion as those without those marks. If you need to perform a search that includes accents, change the Create Accent Sensitive setting on the dtSearch index to Yes.
Analytics indexes are language-agnostic and therefore support all languages. Categorization does not display Unicode choices in the field tree properly.
The Processing column reflects languages supported during OCR in Processing. Processing's text extraction is natively Unicode and supports the full Unicode spectrum.
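
To illustrate what accent-insensitive matching means in practice, the following Python sketch folds accented characters to their unaccented base letters before comparing a search term with a word. It is not dtSearch's implementation. The function names fold_accents and matches are hypothetical, and the accent_sensitive parameter only mimics the effect of the Create Accent Sensitive setting described above.

```python
import unicodedata


def fold_accents(text: str) -> str:
    """Return text with combining marks (accents, diacritics) removed."""
    # NFD splits a character such as 'é' into 'e' plus a combining accent.
    decomposed = unicodedata.normalize("NFD", text)
    # Drop every combining mark and keep the base characters.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def matches(term: str, word: str, accent_sensitive: bool = False) -> bool:
    """Compare a search term to a word, optionally ignoring accents.

    accent_sensitive is a hypothetical stand-in for an index setting;
    it is an assumption made for this illustration only.
    """
    if accent_sensitive:
        return term == word
    return fold_accents(term) == fold_accents(word)


if __name__ == "__main__":
    print(matches("resume", "résumé"))                         # accent-insensitive
    print(matches("resume", "résumé", accent_sensitive=True))  # accent-sensitive
```

With the default setting, a search for "resume" also finds "résumé". With accent sensitivity turned on, the two are treated as different words.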

Language support in aiR products

The underlying large language model (LLM) used by Relativity's aiR products has been evaluated for use with 83 languages. For a list of those languages, see Language support for Azure AI Content Safety on the Microsoft website.

Relativity's aiR products have been primarily tested on English-language documents, and unofficial testing with non-English datasets has resulted in the following recommendations:

Rigorously follow best practices for writing and iterating on the Prompt Criteria. For more information, see Best practices and Developing prompt criteria in the aiR for Review documentation.
Analyze the extracted text as-is. You do not need to translate it into English.
When possible, write the Prompt Criteria in the same language as the documents being analyzed. This should also be the subject matter expert's native language. If that is not possible, write the Prompt Criteria in English.

When you view the results of the analysis, all citations stay in the same language as the document they cite. By default, the rationales and considerations are in English.

If you want the rationales and considerations to be in a different language, type “Write rationales and considerations in [desired language]” in the Additional Context field of the Prompt Criteria.

For the study used to evaluate Azure OpenAI across languages, see MEGAVERSE: Benchmarking Large Language Models Across Languages, Modalities, Models and Tasks on the arXiv website.

Language support in AV transcription

The audio-visual (AV) transcription application recognizes audio from over 100 languages and regional variants. For a full list, see AV transcription languages.

Supported languages

Language	OCR	Processing	Native Imaging	Structured Analytics Language Identification	Viewer
English	√	√	√	√	√
Abkhazian				√
Afar				√
Afrikaans	√	√	√	√	√
Akan				√
Albanian	√	√	√	√	√
Amharic				√
Arabic	√	√	√	√	√*
Armenian		√	√	√	√*
Assamese				√
Aymara	√	√	√	√	√
Azerbaijani				√
Bashkir				√
Basque	√	√	√	√	√
Belarusian			√	√
Bengali				√
Bemba	√	√	√		√
Bihari				√
Bislama				√
Blackfoot	√	√	√		√
Bosnian				√
Breton	√	√	√	√	√
Bugotu	√	√	√		√
Bulgarian (Cyrillic)	√	√	√	√	√
Byelorussian (Cyrillic)	√	√	√	√	√
Burmese				√
Catalan	√	√	√	√	√
Cebuano				√
Chamorro	√	√	√		√
Chechen	√	√	√		√
Cherokee				√
Chinese (Simplified)	√	√	√	√	√*
Chinese (Traditional)	√	√	√	√	√*
Chuana or Tswana	√	√	√		√
Corsican	√	√	√	√	√
Croatian	√	√	√	√	√
Crow	√	√	√		√
Czech	√	√	√	√	√
Danish	√	√	√	√	√
Dhivehi				√
Dholuo				√
Dutch	√	√	√	√	√
Dzongkha				√
Eskimo	√	√	√		√
Esperanto	√	√	√	√	√
Estonian	√	√	√	√	√
Ewe				√
Faroese	√	√	√	√	√
Fijian	√	√	√	√	√
Finnish	√	√	√	√	√
French	√	√	√	√	√
Frisian	√	√	√	√	√
Friulian	√	√	√		√
Ga				√
Gaelic Irish	√	√	√		√
Gaelic Scottish	√	√	√		√
Galician	√	√	√	√	√
Ganda or Luganda	√	√	√	√	√
Georgian			√	√	√*
German	√	√	√	√	√
Greek	√	√	√	√	√
Greenlandic				√
Guarani	√	√	√	√	√
Gujarati				√
Haitian Creole				√
Hani	√	√	√		√
Hausa				√
Hawaiian	√	√	√	√	√
Hebrew	√	√	√	√	√*
Hindi				√
Hmong				√
Hungarian	√	√	√	√	√
Icelandic	√	√	√	√	√
Ido	√	√	√		√
Igbo				√
Indic Languages					√*
Indonesian	√	√	√	√	√
Interlingua	√	√	√	√	√
Interlingue				√
Inuktitut				√
Inupiak				√
Irish				√
Italian	√	√	√	√	√
Japanese	√	√	√	√	√*
Javanese				√
Kabardian	√	√	√		√
Kannada				√
Kashmiri				√
Kashubian	√	√	√		√
Kawa	√	√	√		√
Kazakh				√
Khasi				√
Khmer				√
Kikuyu	√	√	√		√
Kinyarwanda				√
Kongo	√	√	√		√
Korean	√	√	√	√	√*
Kpelle	√	√	√		√
Krio				√
Kurdish	√	√	√	√	√
Kyrgyz				√
Laothian				√
Latin	√	√	√	√	√
Latvian	√	√	√	√	√
Limbu				√
Lingala				√
Lithuanian	√	√	√	√	√
Lozi				√
Luba	√	√	√		√
Lule Sami	√	√	√		√
Luxembourgian	√	√	√	√	√
Macedonian (Cyrillic)	√	√	√	√	√
Malagasy	√	√	√	√	√
Malay	√	√	√	√	√
Malayalam				√
Malinke	√	√	√		√
Maltese	√	√	√	√	√
Manx				√
Maori	√	√	√	√	√
Marathi				√
Mauritian Creole				√
Mayan	√	√	√		√
Miao	√	√	√		√
Minankabaw	√	√	√		√
Mohawk	√	√	√		√
Moldavian (Cyrillic)	√	√	√		√
Mongolian				√
Montenegrin				√
Nahuatl	√	√	√		√
Nauru				√
Nepali				√
Newari				√
Northern Sami	√	√	√		√
Norwegian	√	√	√	√	√
Norwegian Nynorsk				√
Nyanja	√	√	√	√	√
Occidental	√	√	√		√
Occitan				√
Ojibway	√	√	√		√
Oriya				√
Oromo				√
Ossetian				√
Pampanga				√
Papiamento	√	√	√		√
Pashto				√
Pedi				√
Persian				√
Pidgin English	√	√	√		√
Polish	√	√	√	√	√
Portuguese	√	√	√	√	√
Portuguese (Brazilian)	√	√	√	√	√
Provencal	√	√	√		√
Punjabi				√
Quechua	√	√	√	√	√
Rajasthani				√
Rhaetic	√	√	√		√
Rhaeto-Romance				√
Romanian	√	√	√	√	√
Romany	√	√	√		√
Ruanda	√	√	√		√
Rundi	√	√	√	√	√
Russian (Cyrillic)	√	√	√	√	√
Sami	√	√	√		√
Samoan	√	√	√	√	√
Sango				√
Sanskrit				√
Sardinian	√	√	√		√
Scots				√
Scottish Gaelic				√
Serbian (Cyrillic)	√	√	√	√	√
Serbian (Latin)	√	√	√	√	√
Seselwa				√
Sesotho				√
Shona	√	√	√	√	√
Sindhi				√
Sinhalese				√
Sioux	√	√	√		√
Siswant				√
Slovak	√	√	√	√	√
Slovenian	√	√	√	√	√
Somali	√	√	√	√	√
Sotho, Suto, or Sesuto	√	√	√		√
Southern Sami	√	√	√		√
Spanish	√	√	√	√	√
Sundanese	√
Swahili	√	√	√	√	√
Swazi	√	√	√		√
Swedish	√	√	√	√	√
Syriac				√
Tagalog	√	√	√	√	√
Tahitian	√	√	√		√
Tajik				√
Tamil				√
Tatar				√
Telugu				√
Thai	√	√	√	√	√*
Tibetan				√
Tigrinya				√
Tinpo	√	√	√		√
Tonga				√
Tongan	√	√	√		√
Tshiluba				√
Tsonga				√
Tswana				√
Tumbuka				√
Tun	√	√	√		√
Turkish	√	√	√	√	√
Turkmen				√
Twi				√
Uighur				√
Ukrainian (Cyrillic)	√	√	√	√	√
Urdu				√
Uzbek				√
Venda				√
Vietnamese	√	√		√	√*
Visayan	√	√	√		√
Volapuk				√
Waray-Waray				√
Welsh	√	√	√	√	√
Wend or Sorbian	√	√	√		√
Wolof	√	√	√	√	√
Xhosa	√	√	√	√	√
Yiddish				√
Yoruba				√
Zapotec	√	√	√		√
Zhuang				√
Zulu	√	√	√	√	√
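
If you want to work with the table programmatically, the following Python sketch shows one way to read a row. It assumes the rows are tab-separated, that the columns appear in the order shown in the header row, and that trailing empty cells may be missing from a row. The names COLUMNS and parse_row and the status labels are hypothetical and are not part of any Relativity API.

```python
from typing import Dict, Tuple

# Feature columns, in the order of the table's header row.
COLUMNS = [
    "OCR",
    "Processing",
    "Native Imaging",
    "Structured Analytics Language Identification",
    "Viewer",
]


def parse_row(line: str) -> Tuple[str, Dict[str, str]]:
    """Split one tab-separated table row into a language and its feature status."""
    cells = line.rstrip("\n").split("\t")
    language, values = cells[0].strip(), cells[1:]
    # Pad rows whose trailing empty cells were dropped.
    values += [""] * (len(COLUMNS) - len(values))

    status = {}
    for column, value in zip(COLUMNS, values):
        value = value.strip()
        if value == "√":
            status[column] = "supported"
        elif value == "√*":
            # Per the legend: the language must be installed on the workstation.
            status[column] = "supported, language install required"
        else:
            status[column] = "not supported"
    return language, status


if __name__ == "__main__":
    language, status = parse_row("Arabic\t√\t√\t√\t√\t√*")
    print(language, status["Viewer"])
```

Padding short rows matters because a row such as Abkhazian ends at its last marked cell, so the Viewer cell is absent rather than empty.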