Supported file types for processing
Relativity supports many different file types for processing. There are also a number of file types that are incompatible with the processing engine. Before you begin to process your data, it may be helpful to note which types are supported and unsupported, as well as any caveats involved with processing those types of files.
Supported file types
The following file types and extensions are supported by Relativity for processing.
Note: Renaming a file extension has little effect on how Relativity identifies the file type. When processing a file type, Relativity looks at the actual file properties, such as digital signature, regardless of the named extension. Relativity only uses the named extension as a tie-breaker if the actual file properties indicate multiple extensions.
File type | Extensions |
---|---|
Adobe files | PDF, FM, PS, EPS |
AppleDouble | AppleDouble-encoded attachments in e-mails |
CAD files | DXF, DWG, SLDDRW, SLDPRT, 3DXML, SLDASM, PRTDOT, ASMDOT, DRWDOT, STL, EPRT, EASM, EDRW, EPRTX, EDRWX, EASMX |
Compressed files | 7Z, ZIP, TAR, GZ, BZ2, RAR, Z, CAB, ALZIP |
Database files | DBF |
Email | PST, OST, NSF, MSG, P7M, P7S, ICS, VCF, MBOX, EML, EMLX, TNEF, DBX, Bloomberg Mail XML |
EnCase | E01, Ex01, L01, LX01 |
Excel | XLSX, XLSM, XLSB, XLAM, XLTX, XLTM, XLS, XLT, XLA, XLM, XLW, UXDC Note: If you save a PowerPoint or Excel document in pre-2007 format (e.g., .PPT or .XLS) and the document is read-only, we use the default known password to decrypt the document, regardless of whether or not the password exists in the Password Bank. |
HTML | HTML, MHT, HTM, MHTML, XHTM, XHTML |
Image files | JPG, JPEG, ICO, BMP, GIF, TIFF, TIF, JNG, KOALA, LBM, PBM, IFF, PCD, PCX, PGM, PPM, RAS, TARGA, TGA, WBMP, PSD, CUT, XBM, DDS, FAX, SGI, PNG, EXF, EXIF, WEBP, WDP |
JungUm Global | GUL |
OneNote | ONE |
OpenOffice | ODC, ODS, ODT, ODP, XPS |
PowerPoint | PPTX, PPTM, PPSX, PPSM, POTX, POTM, PPT, PPS, POT Note: If you save a PowerPoint or Excel document in pre-2007 format (e.g., .PPT or .XLS) and the document is read-only, we use the default known password to decrypt the document, regardless of whether or not the password exists in the Password Bank. |
Publisher | PUB |
Project | MPP, MPT, MPD, MPX Note: The text extracted from Project files is from the Gantt chart view and will include Task Notes. |
Relativity Collection Container | RCC |
Short message | RSMF |
Text files | TXT, CSV, and others Note: Relativity Processing supports any file type whose underlying storage is ASCII or Unicode text and thus supports all text file types and extensions. |
Vector files | SVG, SVGZ, WMF, PLT, EMF, SNP, HPGL, HPG, PLO, PRN, EMZ, WMZ |
Visio | VSD, VDX, VSS, VSX, VST, VSW, VSDX, VSDM |
Word | DOCX, DOCM, DOTX, DOTM, DOC, DOT, RTF |
WordPerfect | WPD, WPS |
Note: Relativity currently doesn't support the extraction of embedded images or objects from Visio, Project, or OpenOffice files. In addition, Relativity never extracts any embedded objects or images that were added to any files as links. For a detailed list of the Office file extensions from which Relativity does and does not extract embedded objects and images, see Microsoft Office child extraction support.
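The note at the start of this section says that Relativity identifies a file by its actual properties and uses the named extension only as a tie-breaker. The following Python sketch illustrates that general idea with a handful of widely documented leading-byte signatures. It is not Relativity's implementation: the `SIGNATURES` list, the `identify` function, and the fallback rule are all assumptions made for illustration.

```python
from typing import Optional

# A few widely documented leading-byte signatures. Each maps to the
# extensions it could indicate. Illustrative only, not Relativity's list.
SIGNATURES = [
    (b"%PDF", ["PDF"]),
    (b"\x89PNG\r\n\x1a\n", ["PNG"]),
    # OLE2 compound file: shared by several legacy Office formats and MSG.
    (b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1", ["DOC", "XLS", "PPT", "MSG"]),
    # ZIP local file header: also the container for Office Open XML files.
    (b"PK\x03\x04", ["ZIP", "DOCX", "XLSX", "PPTX"]),
]


def identify(header: bytes, named_extension: str) -> Optional[str]:
    """Return a file type from content, using the extension only to break ties."""
    ext = named_extension.lstrip(".").upper()
    for magic, candidates in SIGNATURES:
        if header.startswith(magic):
            # The named extension matters only when it is one of the
            # candidates that the signature already allows.
            if ext in candidates:
                return ext
            # Otherwise ignore the extension and take the first candidate
            # (an arbitrary choice made for this sketch).
            return candidates[0]
    return None
```

In this sketch, a PDF renamed to `.txt` is still reported as `PDF`, while a file with the OLE2 signature and an `.xls` extension is reported as `XLS` because the signature alone cannot distinguish DOC, XLS, PPT, and MSG.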
Excel file considerations
Due to Excel specifications and limits, when you process a database (DBF) file with the Native text extraction method, the extracted text may be missing data. For example, if a DBF file contains more than 1,048,576 rows or more than 16,384 columns, the extracted text won't contain text from row 1,048,577 onward or from column 16,385 onward. For more information, see Excel specifications and limits on the Microsoft website.
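To make the limits concrete, the following Python sketch estimates how much of a DBF file would fall outside the extracted text. The two constants come from the paragraph above; the function name and its return shape are hypothetical and are not part of any Relativity API.

```python
# Worksheet limits quoted in the paragraph above.
EXCEL_MAX_ROWS = 1_048_576
EXCEL_MAX_COLUMNS = 16_384


def missing_from_extracted_text(rows: int, columns: int) -> dict:
    """Report how many rows and columns fall beyond the Excel limits."""
    return {
        "rows_missing": max(0, rows - EXCEL_MAX_ROWS),
        "columns_missing": max(0, columns - EXCEL_MAX_COLUMNS),
    }
```

For example, a file with 1,048,577 rows and 10 columns would lose one row and no columns under these limits.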
RSMF mapping considerations
Generally, Relativity maps all metadata based on EML, but the following RSMF-specific mappings are considered non-standard.
Note: All EML Header strings are case insensitive, which isn't unique to RSMF files.
EML Header | Metadata Field |
---|---|
X-RSMF-BeginDate | Rsmf/BeginDate, EmailSentOn, CreatedOn, InternalCreatedOn |
X-RSMF-EndDate | Rsmf/EndDate, LastModified |
X-RSMF-EventCount | Rsmf/MessageCount |
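The mappings in the table above, together with the case-insensitive header matching, can be sketched in Python as follows. The header and field names come from the table; the `RSMF_HEADER_MAP` dictionary and the `map_rsmf_headers` function are hypothetical helpers written for illustration, not part of the processing engine.

```python
# Header names are stored in lowercase so that lookups are case insensitive.
RSMF_HEADER_MAP = {
    "x-rsmf-begindate": ["Rsmf/BeginDate", "EmailSentOn", "CreatedOn", "InternalCreatedOn"],
    "x-rsmf-enddate": ["Rsmf/EndDate", "LastModified"],
    "x-rsmf-eventcount": ["Rsmf/MessageCount"],
}


def map_rsmf_headers(headers: dict) -> dict:
    """Copy each RSMF-specific header value to every metadata field it maps to."""
    fields = {}
    for name, value in headers.items():
        # Headers that are not RSMF-specific are ignored in this sketch.
        for field in RSMF_HEADER_MAP.get(name.lower(), []):
            fields[field] = value
    return fields
```

A header written as `x-rsmf-BEGINDATE` populates the same four fields as `X-RSMF-BeginDate`, because the lookup lowercases the header name first.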
Multi-part forensic file considerations
When processing a multi-part forensic image, make sure that the Source location points to the root folder that contains all of the files that make up the image. If you select only the first file of the image (E01, L01, EX01, LX01), inventory and discovery will fail with an unrecoverable error.
This happens because inventory reads files where they reside in the processing source folder and does not copy them to the repository. If only the first file is selected, discovery copies that file alone to the repository. The workers then attempt to extract from it and fail because the rest of the archive is not available.
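A pre-flight check along these lines can catch the mistake before a job is submitted. The following Python sketch is a hypothetical helper, not a Relativity feature: it only tests whether a candidate Source location is the first file of an image rather than a folder, using the four extensions named above.

```python
from pathlib import Path

# Extensions of the first file of a forensic image, as named above.
FIRST_SEGMENT_EXTENSIONS = {".e01", ".l01", ".ex01", ".lx01"}


def source_location_problem(source: Path) -> str:
    """Return a warning for a risky Source location, or an empty string."""
    if source.is_file() and source.suffix.lower() in FIRST_SEGMENT_EXTENSIONS:
        return (
            f"{source.name} is only the first file of an image; "
            f"point the Source location at {source.parent} instead"
        )
    if not source.is_dir():
        return f"{source} is not a folder"
    return ""
```

The check does not verify that every part of the image is present in the folder, since the naming of the remaining parts is not described here.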
When processing E01 files, the following NTFS file system files are skipped:
- Unallocated space files
- Index $I30 files
- $TXF_DATE files
Tracking inline/embedded images
It may be helpful for you to understand when Relativity treats an image that is attached to an email as an inline, or embedded, image and not as an actual attachment. The following table breaks down when this occurs based on email format and image characteristics:
Email format | Attachments that are inline (embedded) images |
---|---|
Plain text | None |
Rich text | IPicture-based OLE embedded images |
HTML |
|
You can arrange for the discovery of inline images when creating Processing profiles, specifically through the field called When extracting children, do not extract. If you discover inline images, Invariant denotes them using a field called HiddenAttachment when it publishes them to your workspace. This field is not typically mapped by default, which means you need to create a new field in your workspace and set HiddenAttachment as the source of the field before processing data. If you've done this, you can then create filters and searches that reference this field to pull up the inline images. If you don't do this before processing data, you won't be able to identify inline images through searching and filtering.
Native text extraction and OCR
Relativity Processing distinguishes between text and line art in the documents you process. For these documents, processing will only OCR the line art. This means that Relativity doesn’t skip OCR if a page has electronic text.
Accordingly, Relativity performs both native text extraction and OCR on the following file formats:
- All vector formats (SVG, CAD files, Metafiles [WMF, EMF], Postscript, Encapsulated postscript)
- PDF, Visio, Publisher, MS Project, Hancom and JungUm files
All image formats (TIFF/JPEG/GIF/BMP/PNG etc.) do not have native text, so only OCR is performed. If the file has electronic text and images, native text extraction and OCR will be performed.
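The rules in this section can be summarized as a small lookup. The Python sketch below is only an illustration: the extension sets are a partial selection taken from the lists on this page, the function name is hypothetical, and whether OCR actually runs also depends on your processing profile.

```python
# Image formats have no native text, so only OCR applies.
IMAGE_ONLY = {"TIFF", "TIF", "JPEG", "JPG", "GIF", "BMP", "PNG"}

# A partial selection of formats that can mix electronic text and line art.
TEXT_AND_LINE_ART = {
    "SVG", "WMF", "EMF", "PS", "EPS", "DXF", "DWG",
    "PDF", "VSD", "VSDX", "PUB", "MPP", "GUL",
}


def extraction_steps(extension: str) -> list:
    """Return the extraction steps this section describes for an extension."""
    ext = extension.lstrip(".").upper()
    if ext in IMAGE_ONLY:
        return ["ocr"]
    if ext in TEXT_AND_LINE_ART:
        return ["native text extraction", "ocr"]
    # Formats not covered by this section are left undecided in this sketch.
    return []
```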
Support for password-protected RAR archives
Relativity Processing doesn't decrypt a file that gets its encryption directly from the RAR file that contains it. This means that if you attempt to process a password-protected RAR archive on which the Encrypt file names property is checked, Relativity Processing is unable to extract the files inside that archive.
In addition, note that Relativity Processing can extract a single password-protected file from a RAR archive, but not multiple password-protected files in the same archive.
The following table breaks down Relativity Processing's support of password-protected RAR archives.
- √ - Relativity Processing will decrypt the file.
- Empty - Relativity Processing won't decrypt the file.
Archive type | Single password-protected file | Multiple password-protected files | Encrypt File Names property |
---|---|---|---|
RAR | √ | | |
Multi-part RAR | √ | | |
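The support matrix above reduces to two conditions, which the following Python sketch encodes. The function and parameter names are hypothetical, and the sketch applies the same rule to single and multi-part RAR archives because both rows of the table are identical.

```python
def can_extract_from_rar(protected_files: int, encrypt_file_names: bool) -> bool:
    """Apply the RAR support table to one archive."""
    # Archives with the Encrypt file names property checked are not extracted.
    if encrypt_file_names:
        return False
    # A single password-protected file is supported; several are not.
    return protected_files <= 1
```

An archive with no password-protected files returns `True` here, since there is nothing to decrypt.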
MSG to MHT conversion considerations
The following table provides details on the differences between how Relativity handles MSG and MHT files. This information may be especially useful if you plan on setting the Email Output field on the processing profile to MHT.
Category | Field/Attribute | MSG | MHT |
---|---|---|---|
Metadata fields | Show Time As | This field sometimes appears in the extracted text from MSG files when not explicitly stated in the MSG file itself. The default for a calendar invite is to show time as "busy;" the default for a cancellation is to show time as "free." | "Show Time As" will not appear in the extracted text if the default value is populated. |
Metadata fields | On behalf of | This field is sometimes present in text from MSG. In some cases, this field is populated with the same value as the From field. | "On behalf of" will not appear in the extracted text. |
Interline spacing | N/A | The expected number of blank lines will appear in the extracted text. Line wrapping for long paragraphs will also be present. | In some cases, the text in MHT format has fewer blank lines than the text from MSG. In addition, there is no built-in line wrapping for long paragraphs. |
Intraline spacing | N/A | White-space characters are converted to standard space characters. | White-space characters may remain as non-breaking spaces. |
Email addresses | N/A | When an MSG is converted to MHT, the text is extracted from the MHT using OutsideIn. This can lead to a loss of data. | If joe.smith@acme.com renders as Joe Smith in the MHT, the email address is not captured in the extracted text. |
Microsoft Office child extraction support

The following table displays which Office file extensions will have their embedded objects and images extracted by Relativity and which will not.
- √ - Relativity fully extracts the embedded object and image.
- √* - Relativity partially extracts the embedded object or image.
- Empty - Relativity does not extract the embedded object or image.
Office program | File extension | Embedded object extraction | Embedded image extraction |
---|---|---|---|
Excel | XLSX | √ | √ |
Excel | XLSM | √ | √ |
Excel | XLSB | √ | √ |
Excel | XLAM | √ | √ |
Excel | XLTX | √ | √ |
Excel | XLTM | √ | √ |
Excel | XLS | √ | √* |
Excel | XLT | √ | √* |
Excel | XLA | √ | √* |
Excel | XLM | √ | √* |
Excel | XLW | √ | √* |
Excel | UXDC | | |
Outlook | MSG | √ | √ |
Word | DOCX | √ | √ |
Word | DOCM | √ | √ |
Word | DOTX | √ | √ |
Word | DOTM | √ | √ |
Word | DOC | √ | √* |
Word | DOT | √ | √* |
Word | RTF | √ | √ |
Visio | VSD | | |
Visio | VDX | | |
Visio | VSS | | |
Visio | VSX | | |
Visio | VST | | |
Visio | VSW | | |
Visio | VSDX | √ | √ |
Visio | VSDM | √ | √ |
Project | MPP | | |
Publisher | PUB | √ | |
PowerPoint | PPTX | √ | √ |
PowerPoint | PPTM | √ | √ |
PowerPoint | PPSX | √ | √ |
PowerPoint | PPSM | √ | √ |
PowerPoint | POTX | √ | √ |
PowerPoint | PPT | √ | √ |
PowerPoint | PPS | √ | √ |
PowerPoint | POT | √ | √ |
OneNote | ONE | √ | |
Notable unsupported file types
Processing doesn't support files created with the following programs and versions:
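Product category | Product name and version |
---|---|
DOS Word Processors | DEC WPS Plus (DX) Through 4.0 |
DOS Word Processors | DEC WPS Plus (WPL) Through 4.1 |
DOS Word Processors | DisplayWrite 2 and 3 (TXT) All versions |
DOS Word Processors | DisplayWrite 4 and 5 Through Release 2.0 |
DOS Word Processors | Enable 3.0, 4.0, and 4.5 |
DOS Word Processors | First Choice Through 3.0 |
DOS Word Processors | Framework 3.0 |
DOS Word Processors | IBM Writing Assistant 1.01 |
DOS Word Processors | Lotus Manuscript Version 2.0 |
DOS Word Processors | MASS11 Versions through 8.0 |
DOS Word Processors | MultiMate Versions through 4.0 |
DOS Word Processors | Navy DIF All versions |
DOS Word Processors | Nota Bene Version 3.0 |
DOS Word Processors | Office Writer Versions 4.0 through 6.0 |
DOS Word Processors | PC-File Letter Versions through 5.0 |
DOS Word Processors | PC-File+ Letter Versions through 3.0 |
DOS Word Processors | PFS:Write Versions A, B, and C |
DOS Word Processors | Professional Write Versions through 2.1 |
DOS Word Processors | Q&A Version 2.0 |
DOS Word Processors | Samna Word IV+ Versions through Samna Word |
DOS Word Processors | SmartWare II Version 1.02 |
DOS Word Processors | Sprint Versions through 1.0 |
DOS Word Processors | Total Word Version 1.2 |
DOS Word Processors | Volkswriter 3 and 4 Versions through 1.0 |
DOS Word Processors | Wang PC (IWP) Versions through 2.6 |
DOS Word Processors | WordMARC Plus Versions through Composer |
DOS Word Processors | WordStar Versions through 7.0 |
DOS Word Processors | WordStar 2000 Versions through 3.0 |
DOS Word Processors | XyWrite Versions through III Plus |
Windows Word Processors | Adobe FrameMaker (MIF) Version 6.0 |
Windows Word Processors | Hangul Version 97, 2002 |
Windows Word Processors | JustSystems Ichitaro Versions 5.0, 6.0, 8.0, 13.0, 2004 |
Windows Word Processors | JustWrite Versions through 3.0 |
Windows Word Processors | Legacy Versions through 1.1 |
Windows Word Processors | Lotus AMI/AMI Professional Versions through 3.1 |
Windows Word Processors | Lotus Word Pro Millennium Versions 96 through Edition 9.6, text only |
Windows Word Processors | Novell Perfect Works Version 2.0 |
Windows Word Processors | Professional Write Plus Version 1.0 |
Windows Word Processors | Q&A Write Version 3.0 |
Windows Word Processors | WordStar Version 1.0 |
Mac Word Processors | MacWrite II Version 1.1 |
Disk Images | Symantec Ghost |
Spreadsheets | Enable Versions 3.0, 4.0, and 4.5 |
Spreadsheets | First Choice Versions through 3.0 |
Spreadsheets | Framework Version 3.0 |
Spreadsheets | Lotus 1-2-3 (DOS and Windows) Versions through 5.0 |
Spreadsheets | Lotus 1-2-3 (OS/2) Versions through 2.0 |
Spreadsheets | Lotus 1-2-3 Charts (DOS and Windows) Versions through 5.0 |
Spreadsheets | Lotus 1-2-3 for SmartSuite Versions 97 and Millennium 9.6 |
Spreadsheets | Lotus Symphony Versions 1.0, 1.1, and 2.0 |
Spreadsheets | Microsoft MultiPlan Version 4.0 |
Spreadsheets | Mosaic Twin Version 2.5 |
Spreadsheets | Novell Perfect Works Version 2.0 |
Spreadsheets | PFS: Professional Plan Version 1.0 |
Spreadsheets | Quattro Pro (DOS) Versions through 5.0 |
Spreadsheets | Quattro Pro (Windows) Versions through 12.0, X3 |
Spreadsheets | SmartWare II Version 1.02 |
Spreadsheets | SuperCalc 5 Version 4.0 |
Spreadsheets | VP Planner 3D Version 1.0 |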
In addition, processing doesn't support the following files:
- Self-extracting RAR files
- PEM certificate files
- Apple iWork suite (Pages, Numbers, Keynote)
- Apple Mail:
  - .emlxpart
  - .partial.emlx
  Note: The .emlxpart and .partial.emlx extensions are distinct from the .emlx file extension, which is supported by processing.
- Audio/Video files
  - .wav
- iCloud backup files
- Microsoft Access
- Microsoft Works
- Raw partition files:
  - ISO
  - NTFS
  - HFS
Supported container file types
The following file types can act as containers:
File type | Extensions |
---|---|
Bloomberg | XML |
Cabinet | CAB |
EnCase | E01, L01, LX01 |
AccessData Logical Image | AD1 |
iCalendar | ICS |
Lotus Notes Database | NSF |
MBOX Email Store | MBOX |
Outlook Offline Storage | OST |
Outlook Mail Folder | PST |
Outlook Express Mail Folder | DBX |
PDF Portfolio | |
RAR | RAR |
Relativity Collection Container | RCC |
TAR (Tape Archive) | TAR |
Zip | 7Z |
Zip | ALZIP |
Zip | BZ2 |
Zip | GZ |
Zip | ZIP |
Zip | Z |
Lotus Notes considerations
Note the following about how Relativity Processing handles NSF files:
- Relativity Processing doesn't perform intermediate conversion on NSF files, meaning that we don't convert them to PST or DXL before discovering them. This ensures that we don't miss any document metadata during processing.
- Relativity Processing preserves the original formatting and attachments of the NSF file. In addition, forms are not applied, since they are designed to hide information.
- Relativity Processing extracts the contents of NSF files and puts them into individual MSG files using the Lotus Notes C/C++ API directly. This is because NSF doesn't have its own individual document entry file format. All of the original Lotus Notes metadata is embedded in the MSG, meaning if you look at the document metadata in an NSF within Lotus, all of the metadata listed is embedded in the MSG. In addition, the original RTF/HTML/Plaintext document body is written to the MSG. Relativity handles the conversion from NSF to MSG files itself, and any errors regarding metadata or the inability to translate content are logged to the processing Errors tab. Relativity can process the following NSF items as MSGs:
  - Contacts
  - Distribution lists
  - Calendar items
  - Emails and non-emails
[Image: example of an original NSF file before being submitted to the processing engine]
[Image: example of an NSF file that has been converted to an MSG]
Multi-part container considerations
When processing a multi-part container, the first part of the container must be included. If the first part of the container is not included, the Processing engine will ignore the file.
ICS/VCF file considerations
ICS/VCF files are deduplicated not as emails but as loose files, based on the SHA256 hash. Because the system now treats these as loose files, Relativity no longer captures the email-specific metadata that it previously obtained when ICS/VCF files went through the system's email handler.
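As a rough illustration of hash-based deduplication of loose files, the following Python sketch groups files by the SHA256 digest of their content. It assumes the hash is computed over the full file bytes, which this page does not specify, and the function names are hypothetical.

```python
import hashlib
from collections import defaultdict


def group_by_sha256(files: dict) -> dict:
    """Group file names by the SHA256 digest of their content."""
    groups = defaultdict(list)
    for name, content in files.items():
        groups[hashlib.sha256(content).hexdigest()].append(name)
    return dict(groups)


def duplicates(files: dict) -> list:
    """Return the groups of names whose content hashes are identical."""
    return [names for names in group_by_sha256(files).values() if len(names) > 1]
```

Two ICS files with byte-identical content land in the same group, regardless of any sender or recipient information that an email handler would have compared.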
The following table breaks down which metadata values the system will populate for ICS files:

Processing engine property name | Relativity property name |
---|---|
Author | Author |
DocTitle | Title |
Email/AllDayEvent | [other metadata] |
Email/AllowNewTimeProposal | [other metadata] |
Email/BusyStatus | [other metadata] |
Email/CommonEnd | [other metadata] |
Email/CommonStart | [other metadata] |
Email/ConversationTopic | [other metadata] |
Email/CreatedOn | Email Created Date/Time |
Email/DisplayTo | [other metadata] |
Email/DomainParsedBCC | Recipient Domains (BCC) |
Email/DomainParsedCC | Recipient Domains (CC) |
Email/DomainParsedFrom | Sender Domain |
Email/DomainParsedTo | Recipient Domains (To) |
Email/Duration | [other metadata] |
Email/EndDate | Meeting End Date/Time |
Email/IntendedBusyStatus | [other metadata] |
Email/IsRecurring | [other metadata] |
Email/LastModified | Email Last Modified Date/Time |
Email/Location | [other metadata] |
Email/MessageClass | Message Class |
Email/MessageType | Message Type |
Email/NetMeetingAutoStart | [other metadata] |
Email/ReminderMinutesBeforeStart | [other metadata] |
Email/SentRepresentingEmail | [other metadata] |
Email/SentRepresentingName | [other metadata] |
Email/StartDate | Meeting Start Date/Time |
EmailBCC | [other metadata] |
EmailBCCName | [other metadata] |
EmailBCCSmtp | BCC (SMTP Address) |
EmailCC | [other metadata] |
EmailCCName | [other metadata] |
EmailCCSmtp | CC (SMTP Address) |
EmailConversation | Conversation |
EmailFrom | [other metadata] |
EmailImportance | Importance |
EmailSenderName | Sender Name |
EmailSenderSmtp | From (SMTP Address) |
EmailSensitivity | Email Sensitivity |
EmailSubject | Subject |
EmailTo | [other metadata] |
EmailToName | Recipient Name (To) |
EmailToSmtp | To (SMTP Address) |
SortDate | Sort Date/Time |
Subject | [other metadata] |
The following table breaks down which metadata values the system will populate for VCF files:

Processing engine property name | Relativity property name |
---|---|
DocTitle | Title |
Email/BusinessAddress | [other metadata] |
Email/BusinessAddressCity | [other metadata] |
Email/BusinessAddressCountry | [other metadata] |
Email/BusinessAddressPostalCode | [other metadata] |
Email/BusinessAddressState | [other metadata] |
Email/BusinessAddressStreet | [other metadata] |
Email/BusinessPostOfficeBox | [other metadata] |
Email/BusinessTitle | [other metadata] |
Email/CellNumber | [other metadata] |
Email/CompanyName | [other metadata] |
Email/ConversationTopic | [other metadata] |
Email/Country | [other metadata] |
Email/Department | [other metadata] |
Email/DisplayName | [other metadata] |
Email/DisplayNamePrefix | [other metadata] |
Email/Email2AddrType | [other metadata] |
Email/Email2EmailAddress | [other metadata] |
Email/Email2OriginalDisplayName | [other metadata] |
Email/Email3AddrType | [other metadata] |
Email/Email3EmailAddress | [other metadata] |
Email/Email3OriginalDisplayName | [other metadata] |
Email/EmailAddrType | [other metadata] |
Email/EmailEmailAddress | [other metadata] |
Email/EmailOriginalDisplayName | [other metadata] |
Email/FileUnder | [other metadata] |
Email/Generation | [other metadata] |
Email/GivenName | [other metadata] |
Email/HomeAddress | [other metadata] |
Email/HomeAddressCity | [other metadata] |
Email/HomeAddressCountry | [other metadata] |
Email/HomeAddressPostalCode | [other metadata] |
Email/HomeAddressState | [other metadata] |
Email/HomeAddressStreet | [other metadata] |
Email/HomeNumber | [other metadata] |
Email/HomePostOfficeBox | [other metadata] |
Email/Locality | [other metadata] |
Email/MessageClass | Message Class |
Email/MessageType | Message Type |
Email/MiddleName | [other metadata] |
Email/OfficeNumber | [other metadata] |
Email/OtherAddress | [other metadata] |
Email/OtherAddressCity | [other metadata] |
Email/OtherAddressCountry | [other metadata] |
Email/OtherAddressPostalCode | [other metadata] |
Email/OtherAddressState | [other metadata] |
Email/OtherAddressStreet | [other metadata] |
Email/OtherPostOfficeBox | [other metadata] |
Email/PostOfficeBox | [other metadata] |
Email/PostalAddress | [other metadata] |
Email/PostalCode | [other metadata] |
Email/PrimaryFaxNumber | [other metadata] |
Email/PrimaryNumber | [other metadata] |
Email/State | [other metadata] |
Email/StreetAddress | [other metadata] |
Email/Surname | [other metadata] |
EmailConversation | Conversation |
EmailSubject | Subject |
Subject | [other metadata] |
Container file types supported for the Password Bank
The following container file types are supported by Relativity for Password Bank in Inventory.
File type | Extensions |
---|---|
Lotus Notes Database | NSF |
PDF Portfolio | |
PST | PST |
RAR | RAR |
Zip | 7Z |
Zip | ALZIP |
Zip | ZIP |
Zip | Z |
Zip | BZ2 |
Zip | GZ |
Non-container file types supported for Password Bank in Inventory
The Password Bank also supports the following non-container formats:
- Excel*
- Word*
- PowerPoint*
- S/MIME
- P7M
* Except DRM or custom encryption
