Last date modified: 2026-May-08
Contracts OCR
Contracts OCR (Optical Character Recognition) converts images of text—such as scanned and redacted documents—into machine-readable text characters. When you run Contracts OCR, it also captures word-level location data, which maps extracted text to the document image so that terms are highlighted accurately in the Contracts Image Viewer.
You must run Contracts OCR before you can use the Contracts Image Viewer and its search and navigation features.
Prerequisites
Before you run Contracts OCR, verify the following:
- Run Imaging first—You must run Imaging and successfully generate images for each document. If no images exist, there is nothing for Contracts OCR to process. For more information, see Imaging.
- Verify the Contracts Extracted Text field is empty—Contracts OCR skips documents that already have a value in the Contracts Extracted Text field. To overwrite existing values, see Rerun Contracts OCR.
- Check image dimensions—Images must not exceed 32,000 pixels in either dimension. The Contracts OCR engine does not support larger images.
- Exclude Section documents—Contracts OCR and Analysis do not process Section documents. These documents, created by Contracts Segmentation, are automatically excluded. You can identify them by Document Identifiers ending in .htk.
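The prerequisites above can be sketched as a pre-flight check. This is a minimal illustration, not part of the Contracts product: the `Document` record, its attribute names, and the `preflight_issues` function are hypothetical, and the sketch assumes you have already exported each document's identifier, image dimensions, and current Contracts Extracted Text value by some other means.

```python
from dataclasses import dataclass, field

MAX_IMAGE_DIMENSION = 32_000  # pixel limit per dimension, from the prerequisites
SECTION_SUFFIX = ".htk"       # Document Identifier suffix of Section documents


@dataclass
class Document:
    """Hypothetical local record of a document; not a Contracts API object."""
    identifier: str
    image_sizes: list[tuple[int, int]] = field(default_factory=list)  # (width, height)
    extracted_text: str = ""


def preflight_issues(doc: Document) -> list[str]:
    """Return the reasons Contracts OCR would not process this document."""
    issues = []
    if doc.identifier.endswith(SECTION_SUFFIX):
        issues.append("Section document (excluded automatically)")
    if not doc.image_sizes:
        issues.append("no images; run Imaging first")
    for width, height in doc.image_sizes:
        # The limit applies to each dimension independently.
        if width > MAX_IMAGE_DIMENSION or height > MAX_IMAGE_DIMENSION:
            issues.append(f"image {width}x{height} exceeds {MAX_IMAGE_DIMENSION} pixels")
    if doc.extracted_text:
        issues.append("Contracts Extracted Text already has a value")
    return issues
```

An empty list means the document meets every prerequisite listed above. For example, `preflight_issues(Document("DOC001", [(2550, 3300)]))` returns an empty list, while a document with a 32,001-pixel-wide image returns one issue.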
Run Contracts OCR
To run Contracts OCR:
- Go to the Contracts OCR Sets tab.
- Click New Contracts OCR Set.
- Enter a Name for the Contracts OCR Set.
- For Data Source, select a saved search that contains the documents you want to OCR.
- For Language, select the language or languages you want the OCR engine to recognize.
For a list of languages supported by Contracts OCR, see Supported languages matrix.
- (Optional) Under OCR Set Notification, enter the email addresses of users you want to notify when the OCR job completes.
- Click Save.

- Click Run OCR in the right console.
If errors occur, click Download Errors in the console to download and review the error report.
Fields auto-populated by Contracts
Contracts auto-populates the following fields when you run Imaging and Contracts OCR.
| Field name | Field type | Auto-populated value |
|---|---|---|
| Contracts Extracted Text | Long-text | Contains the OCR-extracted text of the document. |
If you manually edit the Contracts Extracted Text field and then rerun Imaging and OCR, the new OCR results replace your manual edits.
Rerun Contracts OCR
By default, Contracts OCR skips documents that already have a value in the Contracts Extracted Text field. To overwrite existing extracted text, you must enable the overwrite setting.
To rerun Contracts OCR:
- Go to your Contracts OCR Set and click Edit.
- Set Overwrite Contracts Extracted Text to Yes.

- Click Save.
- Click Run OCR in the right console.
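The skip-or-overwrite rule described in this section can be modeled in a few lines. This is a sketch of the documented behavior only: `should_ocr` and its parameters are hypothetical names, and the `overwrite` argument stands in for the Overwrite Contracts Extracted Text setting.

```python
def should_ocr(extracted_text: str, overwrite: bool = False) -> bool:
    """Return True if Contracts OCR would process the document.

    A document is processed when its Contracts Extracted Text field is
    empty, or when the overwrite setting is enabled.
    """
    return overwrite or not extracted_text
```

With the default setting, `should_ocr("existing text")` is `False`, so the document is skipped. Passing `overwrite=True` makes it `True`, which corresponds to setting Overwrite Contracts Extracted Text to Yes.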