Redact Language Support
While Redact only supports English when it is initially installed, it can support over 100 languages. If you are interested in having non-English languages added to Redact, please
Language packs can be used in an image project individually or combined to cover multiple languages. In some cases, results are more accurate with a single language pack, even if the documents contain some English words. Because results can vary, we recommend testing non-English language support on a sample set of documents before doing a full project run.
Language settings in Relativity
Documents should be loaded into Relativity with the proper language settings. This is an important component of interacting with images for non-English languages and a critical requirement for non-Romance languages.
For example, if you run Redact on documents with Japanese text, load those documents from a processing engine that supports Japanese, using the appropriate language settings.
If using Relativity Processing, select the Japanese language in either the processing set or processing profile. This ensures that the extracted text is encoded correctly and can be used for comparison when determining how much quality control is needed.
Entering language codes in an image markup project
If non-English languages have been enabled, please note the following:
- By default, projects run with the eng code, which makes English the only supported language.
- If the documents may contain more than one language, you can ensure each language receives the appropriate markups by adding a + between the language codes. For example, entering spa+eng causes the project to apply markups to both Spanish and English text.
- Language codes are prioritized from left to right, so the leftmost code is given the highest priority.
- Redact only supports language codes that are exactly three characters long. For example, if a user enters chi_sim_vert as the language code for the trained data file chi_sim_vert.traineddata in an image project, the project fails. Instead, rename the trained data file to a three-character name, such as csv, before uploading it to Relativity so that the trained data file can be used successfully.
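The language code rules above can be checked before a project is run. The following is a minimal sketch, not part of Redact: the helper names parse_language_codes and renamed_traineddata are hypothetical, and it assumes a valid code is exactly three lowercase letters (the source only states three characters) and that a renamed file keeps the .traineddata extension.

```python
import re
from typing import List

# Assumption: a valid code is exactly three lowercase ASCII letters (e.g. "eng", "spa").
_CODE_RE = re.compile(r"[a-z]{3}")


def parse_language_codes(value: str) -> List[str]:
    """Split a '+'-separated string such as 'spa+eng' into codes.

    The returned list keeps the left-to-right order, so index 0 is the
    highest-priority language.
    """
    codes = [part.strip() for part in value.split("+")]
    invalid = [code for code in codes if not _CODE_RE.fullmatch(code)]
    if invalid:
        raise ValueError(f"Codes must be exactly three characters: {invalid}")
    return codes


def renamed_traineddata(filename: str, new_code: str) -> str:
    """Return the file name to use after renaming a trained data file.

    Hypothetical helper: it only computes the new name and does not
    touch the file system or upload anything.
    """
    if not filename.endswith(".traineddata"):
        raise ValueError(f"Not a trained data file: {filename}")
    if not _CODE_RE.fullmatch(new_code):
        raise ValueError(f"New code must be exactly three characters: {new_code}")
    return f"{new_code}.traineddata"


if __name__ == "__main__":
    print(parse_language_codes("spa+eng"))
    print(renamed_traineddata("chi_sim_vert.traineddata", "csv"))
```

With this sketch, a string such as chi_sim_vert is rejected up front instead of causing the project to fail, and the order of the returned list mirrors the left-to-right priority described above.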