A load file is used to transfer data and its associated metadata into a database. During import, the application reads the load file to determine what metadata should be written into each field and to copy it to the workspace. If your organization uses a processing vendor, you’ll need to upload case data with a load file. You'll also use load files when you receive a subset of data from another party, such as a production from opposing counsel.
Below are the load file specifications for Import/Export.
Supported file types
Import/Export supports the following file types.
- ZIP and PST files—for transferring data to the server side.
- Native files for processing—see the list of supported file types for processing.
- Document Load File import—.dat, .csv, and .txt load files.
- Image Load File import—Opticon-formatted page-level files. Supported formats: single-page TIFF (Group IV) files, single-page JPG files, and single-page and multi-page PDF files.
Note: RelativityOne allows users to restrict given file types from being imported into their instances with the RestrictedFileTypes instance setting. Import/Export reads and applies this instance setting when importing materials into RelativityOne, so files of any type listed as restricted are skipped.
Import/Export uses a flat, document-level load file to load metadata, document-level extracted text, and native files. Each line should represent one document.
The following load file encodings are supported:
- Arabic (ASMO 708)
- Arabic (ISO)
- Arabic (Windows)
- Baltic (ISO)
- Baltic (Windows)
- Central European (ISO)
- Central European (Windows)
- Chinese Simplified (GB18030)
- Chinese Simplified (GB2312)
- Chinese Traditional (Big5)
- Cyrillic (DOS)
- Cyrillic (ISO)
- Cyrillic (KOI8-R)
- Cyrillic (KOI8-U)
- Cyrillic (Mac)
- Cyrillic (Windows)
- Estonian (ISO)
- Greek (ISO)
- Greek (Windows)
- Hebrew (ISO-Logical)
- Hebrew (ISO-Visual)
- Hebrew (Windows)
- Japanese (EUC)
- Japanese (JIS 0208-1990 and 0212-1990)
- Japanese (JIS)
- Japanese (JIS-Allow 1 byte Kana - SO/SI)
- Japanese (JIS-Allow 1 byte Kana)
- Japanese (Shift-JIS)
- Korean (EUC)
- Latin 3 (ISO)
- Latin 9 (ISO)
- Thai (Windows)
- Turkish (ISO)
- Turkish (Windows)
- Ukrainian (Mac)
- Unicode (UTF-16)
- Unicode (Big-Endian)
- Unicode (UTF-8)
- Vietnamese (Windows)
- Western European (ISO)
- Western European (Mac)
- Western European (Windows)
Import/Export does not require load file header rows. However, they are strongly recommended to ensure accuracy.
The field names in your header do not need to match the field names in your workspace.
RelativityOne doesn’t require any specific load file field order. You can create any number of workspace fields to store metadata or coding.
During the load process, you can match your load file fields to the fields in your workspace. The identifier field is required for each load. When loading new records, this is your workspace identifier.
When performing an overlay, you can use the workspace identifier or select another field as the identifier. This is useful when overlaying production data. For example, you could use the Bates number field rather than the document identifier in the workspace.
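The field matching described above can be pictured as a lookup from load file column names to workspace field names. The following is a minimal Python sketch, not the Import/Export API: the column names (`BEGDOC`, `FAMILY`, `HASH`, `CUSTODIAN`), the workspace field name `Control Number`, and the `map_row` helper are all hypothetical.

```python
# Hypothetical mapping from load file header names to workspace field names.
FIELD_MAP = {
    "BEGDOC": "Control Number",    # assumed identifier field for this load
    "FAMILY": "Group Identifier",
    "HASH": "MD5 Hash",
}
IDENTIFIER_COLUMN = "BEGDOC"


def map_row(row: dict) -> dict:
    """Rename mapped columns to workspace field names and drop unmapped ones."""
    if not row.get(IDENTIFIER_COLUMN):
        raise ValueError("every record needs a value in the identifier column")
    return {FIELD_MAP[col]: value for col, value in row.items() if col in FIELD_MAP}


record = map_row({"BEGDOC": "AS00001", "FAMILY": "AS00001", "CUSTODIAN": "Unmapped"})
print(record)
```

For an overlay on production data, the same idea would apply with `IDENTIFIER_COLUMN` pointing at the column that holds the Bates number instead.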
All fields except Identifier are optional; however, you may find some of the following system fields beneficial.
- Identifier—the unique identifier of the record.
- Group Identifier—the identifier of a document’s family group.
  - The group identifier repeats for all records in the group.
  - Usually, this is the document identifier of the group’s parent document. For example:
    - If an email with the document identifier of AS00001 has several attachments, the email and its attachments have a group identifier of AS00001.
  - If a group identifier for a record is not set, the document identifier populates the group identifier field in the case. This effectively creates a “group” of one document.
- MD5 Hash—the duplicate hash value of the record.
  - You can enter any type of hash value (and rename the field in your case).
  - If documents share the same hash value, the software identifies the documents as a duplicate group.
  - If a hash field for a record is not set, the document identifier populates the hash field in the case. This effectively creates a “group” of one document.
- Extracted Text—the text of the document, either the OCR or Full Text. The extracted text appears in the viewer and is added to search indexes. This field can contain either:
  - The actual OCR or Full Text.
  - The path to a document-level text file containing the OCR or Full Text. Both relative and absolute (full) paths are supported. To import load file data that contains absolute paths, you must activate Express Transfer. Use Windows-style formatting containing backslashes "\". A sample relative path is folder\filename. A sample absolute path is C:\folder\filename.
- Native File Path—the path to any native files you would like to load. Both relative and absolute (full) paths are supported. To import load file data that contains absolute paths, you must activate Express Transfer. A sample relative path is folder\filename. A sample absolute path is C:\folder\filename.
- Folder Info—builds the folder browser structure for the documents.
  - This field is backslash “\” delimited.
  - If not set, the documents load to the root of the case.
  - Each entry between backslashes is a folder in the system's folder browser.
  - Each backslash indicates a new subfolder in the browser.
Note: If the load file contained the entry “Slinger, Ryan\Email\deleted_items” in the folder information field, the software would build the following folder structure:
Slinger, Ryan
    Email
        deleted_items
Each document with the above entry would be stored in the “deleted_items” folder.
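The way a Folder Info value maps to nested folders can be sketched in a few lines of Python. This is only an illustration of the backslash rule described above; the `folder_levels` helper is hypothetical, and skipping empty segments is an assumption rather than documented behavior.

```python
def folder_levels(folder_info: str) -> list[str]:
    """Split a Folder Info value into folder names, outermost folder first."""
    # Empty segments (for example from a trailing backslash) are skipped;
    # that handling is an assumption of this sketch.
    return [part for part in folder_info.split("\\") if part]


levels = folder_levels("Slinger, Ryan\\Email\\deleted_items")

# Print one folder per line, indented by depth.
for depth, name in enumerate(levels):
    print("    " * depth + name)
```

The last element of `levels` is the folder that would hold the document; an empty Folder Info value yields an empty list, which corresponds to loading at the root of the case.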
- Relativity Native Time Zone Offset—RelativityOne's native viewer technology displays all email header dates and times in GMT. This numeric field offsets how email header dates and times appear in the viewer.
  - If the value in this field is blank or 0 for a document, the email header date and time appear in GMT.
  - You can enter a whole number in this field (positive or negative) to offset the time from GMT to the local time zone of the document. For example, if the document was collected in US Central Daylight Time (CDT), enter “-5” in the field, because the CDT offset from GMT is -5.
  - This applies ONLY when viewing email header dates and times in the RelativityOne Native File Viewer. Your metadata fields display as imported.
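The arithmetic behind the time zone offset field is a plain whole-hour shift from GMT. A minimal Python sketch follows; `apply_offset` is a hypothetical helper that only mirrors the rule stated above, not the viewer's actual implementation.

```python
from datetime import datetime, timedelta
from typing import Optional


def apply_offset(gmt_time: datetime, offset_hours: Optional[int]) -> datetime:
    """Shift a GMT time by a whole-hour offset; blank (None) or 0 keeps GMT."""
    return gmt_time + timedelta(hours=offset_hours or 0)


sent_gmt = datetime(2023, 6, 30, 18, 23)
print(apply_offset(sent_gmt, -5))    # document collected in CDT
print(apply_offset(sent_gmt, None))  # blank offset: still GMT
```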
RelativityOne accepts date and time as one field. For example, Date Sent and Time Sent should be one field. If date sent and time sent ship separately, you must create a new field for time. You can format date fields to accept the date without the time, but not the time without the date. Dates cannot have a zero value. Format dates in a standard date format such as “6/30/2023 1:23 PM” or “6/30/2023 13:23”.
Note: To import or export data with a date/time format that differs from the US format, be sure to select the correct Regional Settings option when creating a new Import/Export job.
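If a delivery contains the date and the time in separate columns, one possible pre-processing step is to merge them into a single value before import. The Python sketch below assumes US-format dates (month/day/year) and 24-hour times; the `combine_date_time` helper is hypothetical.

```python
from datetime import datetime


def combine_date_time(date_text: str, time_text: str = "") -> str:
    """Join separate date and time values into one date/time string."""
    day = datetime.strptime(date_text.strip(), "%m/%d/%Y")
    date_part = f"{day.month}/{day.day}/{day.year}"
    if not time_text.strip():
        # A date without a time is acceptable; a time without a date is not.
        return date_part
    clock = datetime.strptime(time_text.strip(), "%H:%M")
    return f"{date_part} {clock.hour}:{clock.minute:02d}"


print(combine_date_time("6/30/2023", "13:23"))
print(combine_date_time("06/30/2023"))
```

Because `strptime` raises `ValueError` on values such as 00/00/0000, zero dates surface before the load file is written.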
The table below lists the date formats recognized by Import/Export and Import Service (IAPI). It contains both valid and invalid date formats:
| Entry in Load File | Object Type | Definition |
|---|---|---|
| Monday January 4 2023 | 1/4/2023 0:00 | |
| 05/28/2023 7:11 AM | 05/28/2023 7:11 AM | |
| 5.08:40 PM | 6/30/2023 17:08 | The current date will be entered if the date is missing. For this example, assume the import was done on 6/30/2023. |
| 17:08:33 | 6/30/2023 17:08 | The current date will be entered if the date is missing. For this example, assume the import was done on 6/30/2023. |
| 17:08 | 6/30/2023 17:08 | The current date will be entered if the date is missing. For this example, assume the import was done on 6/30/2023. |
| 5:08 PM | 6/30/2023 17:08 | |
| 14-Apr | 4/14/2023 0:00 | The current year will be entered if the year is missing. |
| 9-Apr | 4/9/2023 0:00 | The current year will be entered if the year is missing. |
| 14-Mar | 3/14/2023 0:00 | The current year will be entered if the year is missing. |
| 1-Mar | 3/1/2023 0:00 | The current year will be entered if the year is missing. |
| 22-Feb | 2/22/2023 0:00 | The current year will be entered if the year is missing. |
| 20230420 2:22:00 AM | 4/20/2023 0:00 | |
| 4/9/2023 16:13 | 4/9/2023 16:13 | |
| 4/9/2023 8:49 | 4/9/2023 8:49 | |
| Apr. 9, 23 | 4/9/2023 0:00 | |
| Wednesday, 09 April 2023 | 4/9/2023 0:00 | |
| 12-31-2023 | 12/31/2023 12:00 AM | |
| 4/9/23 13:30 PM | Results in an error | |
| 2023-044-09 | Results in an error | |
| 4/9/2023 10:22:00 a.m. | Results in an error | |
| 00/00/0000 | Results in an error unless the CreateErrorForInvalidDate value is set to false. | |
During import, you can designate which delimiters are used in your load file. You can select each delimiter from the ASCII characters, 001 – 255.
The delimiter characters have the following functions:
- Column—separates load file columns.
- Quote—marks the beginning and end of each load file field (also known as a text qualifier).
- Newline—marks the end of a line in any extracted or long text field.
- Multi-value—separates distinct values in a column. This delimiter is only used when importing into a RelativityOne multi-choice field.
- Nested-values—denotes the hierarchy of a choice. This delimiter is only used when importing into a RelativityOne multi-choice field.
For example, say a load file contained the following entry and was being imported into a multi-choice field: “Hot\Really Hot\Super Hot; Look at Later”
With the multi-value delimiter set as “;” and the nested value delimiter set as “\”, the choices would appear in RelativityOne as:
Hot
    Really Hot
        Super Hot
Look at Later
All checkboxes are automatically selected under each nested value. The full path to each multi-choice element is required. For example:
04. Redact;01. Yes\b.
To select "01. Yes/a. Litigation," add it to the record after ";".
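How the multi-value and nested-value delimiters divide a field value can be sketched as follows. This Python example only illustrates the splitting rule described above; `parse_choices` is a hypothetical helper, and trimming whitespace around choice names is an assumption.

```python
def parse_choices(value: str, multi: str = ";", nested: str = "\\") -> list[list[str]]:
    """Return one path per selected value, listing choice names parent first."""
    paths = []
    for entry in value.split(multi):
        # Whitespace around names is trimmed (an assumption of this sketch).
        path = [name.strip() for name in entry.split(nested) if name.strip()]
        if path:
            paths.append(path)
    return paths


paths = parse_choices("Hot\\Really Hot\\Super Hot; Look at Later")
for path in paths:
    print(" > ".join(path))
```

Each inner list is the full path to one choice, which is why a nested choice has to be written with all of its parents in the load file.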
If you generate your own load files, you may choose to use the system defaults:
- Column—Unicode 020 (ASCII 020 in the application)
- Quote—Unicode 254 (ASCII 254 in the application)
- Newline—Unicode 174 (ASCII 174 in the application)
- Multi-Value—Unicode 059 (ASCII 059 in the application)
- Nested Values—Unicode 092 (ASCII 092 in the application)
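For anyone generating load files with a script, the defaults above translate into character codes as in the Python sketch below. The `format_row` and `parse_row` helpers are hypothetical, the sample values are illustrative, and the sketch assumes field values never contain the delimiter characters themselves.

```python
COLUMN = chr(20)    # column delimiter, code 020
QUOTE = chr(254)    # quote (text qualifier), code 254
NEWLINE = chr(174)  # stands in for line breaks inside long text, code 174


def format_row(values: list[str]) -> str:
    """Build one load file line from a list of field values."""
    cells = []
    for value in values:
        text = value.replace("\r\n", NEWLINE).replace("\n", NEWLINE)
        cells.append(f"{QUOTE}{text}{QUOTE}")
    return COLUMN.join(cells)


def parse_row(line: str) -> list[str]:
    """Split one load file line back into field values."""
    values = []
    for cell in line.rstrip("\r\n").split(COLUMN):
        # Remove exactly one pair of surrounding quote characters, if present.
        if len(cell) >= 2 and cell.startswith(QUOTE) and cell.endswith(QUOTE):
            cell = cell[1:-1]
        values.append(cell.replace(NEWLINE, "\n"))
    return values


row = format_row(["AS00001", "line one\nline two"])
print(parse_row(row))
```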
For image imports, Import/Export requires Opticon load files with ANSI/Western European encoding. This .opt text file references the Control ID on a page level. The first page should match up to any data you intend to load. You can use this same process for importing page-level extracted text.
Import/Export does not support Unicode .opt files for image imports. When you have a Unicode .opt file, you must save it in ANSI/Western European encoding.
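One way to re-save a Unicode .opt file is a short script. The Python sketch below assumes that "ANSI/Western European" corresponds to the Windows-1252 code page (`cp1252`) and that the source file is UTF-16 with a byte order mark; for a UTF-8 file, `source_encoding` would be `utf-8-sig` instead. The `convert_opt_to_ansi` helper and the file names are hypothetical.

```python
import tempfile
from pathlib import Path


def convert_opt_to_ansi(source: Path, target: Path, source_encoding: str = "utf-16") -> None:
    """Re-encode an Opticon file as Windows-1252, keeping line endings as-is."""
    # newline="" disables newline translation on both read and write.
    with open(source, "r", encoding=source_encoding, newline="") as src:
        text = src.read()
    # errors="strict" raises UnicodeEncodeError instead of silently dropping
    # characters that Windows-1252 cannot represent.
    with open(target, "w", encoding="cp1252", errors="strict", newline="") as dst:
        dst.write(text)


# Demo with a throwaway file; the line content is illustrative only.
with tempfile.TemporaryDirectory() as tmp:
    unicode_opt = Path(tmp) / "images_unicode.opt"
    ansi_opt = Path(tmp) / "images_ansi.opt"
    line = "REL00001,REL01,D:\\IMAGES\\001\\REL00001.TIF,Y,,,3\r\n"
    unicode_opt.write_bytes(line.encode("utf-16"))
    convert_opt_to_ansi(unicode_opt, ansi_opt)
    converted_bytes = ansi_opt.read_bytes()

print(converted_bytes)
```

If the conversion raises `UnicodeEncodeError`, a path or identifier in the file contains a character outside Windows-1252 and would need renaming first.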
You must convert images in unsupported formats using a third-party conversion tool before Import/Export can successfully upload them.
Import/Export accepts only the following file types for image loads:
- Single-page, Group IV TIF (1 bit, B&W)
- Single-page JPG
- Single-page PDF
- Multi-page PDF
Multi-page TIF files can be imported into the system, but you must load them as native files.
Only one PDF per document is supported.
Load file format
The Opticon load file is a page-level load file, with each line representing one image.
Below is a sample line:
REL00001,REL01,D:\IMAGES\001\REL00001.TIF,Y,,,3
The fields are, from left to right:
- Field One – (REL00001) – the page identifier
- Field Two – (REL01) – the volume identifier (not required)
- Field Three – (D:\IMAGES\001\REL00001.TIF) – a path to the image to be loaded
- Field Four – (Y) – Document marker – a “Y” indicates the start of a unique document.
- Field Five – (blank) – can be used to indicate folder
- Field Six – (blank) – can be used to indicate box
- Field Seven – (3) – often used to store page count, but unused in Import/Export
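The seven fields listed above can be read with a short parser. The Python sketch below assumes the usual comma-separated Opticon layout and image paths that contain no commas; `OpticonPage`, `parse_opticon_line`, and `group_pages` are hypothetical names, and the sample lines after the first are illustrative.

```python
from typing import NamedTuple


class OpticonPage(NamedTuple):
    page_id: str
    volume: str
    image_path: str
    starts_document: bool
    folder: str
    box: str
    page_count: str


def parse_opticon_line(line: str) -> OpticonPage:
    """Split one page-level line into its seven fields."""
    fields = line.rstrip("\r\n").split(",")
    if len(fields) != 7:
        raise ValueError(f"expected 7 comma-separated fields, got {len(fields)}")
    page_id, volume, image_path, marker, folder, box, page_count = fields
    return OpticonPage(page_id, volume, image_path, marker.upper() == "Y",
                       folder, box, page_count)


def group_pages(lines: list[str]) -> dict[str, list[str]]:
    """Group image paths by document, starting a new document at each Y marker."""
    documents: dict[str, list[str]] = {}
    current = None
    for line in lines:
        page = parse_opticon_line(line)
        if page.starts_document or current is None:
            current = page.page_id
            documents[current] = []
        documents[current].append(page.image_path)
    return documents


sample = [
    "REL00001,REL01,D:\\IMAGES\\001\\REL00001.TIF,Y,,,3",
    "REL00002,REL01,D:\\IMAGES\\001\\REL00002.TIF,,,,",
    "REL00003,REL01,D:\\IMAGES\\001\\REL00003.TIF,,,,",
    "REL00004,REL01,D:\\IMAGES\\001\\REL00004.TIF,Y,,,1",
]
documents = group_pages(sample)
print({doc: len(pages) for doc, pages in documents.items()})
```

The page count in field seven is kept as text and never consulted; the grouping relies only on the document marker.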
You can also import extracted text during the image import process by setting an option in Import/Export.
No changes are needed in the Opticon load file. If the aforementioned setting is active, Import/Export looks for page-level .txt files that are named identically to their corresponding TIF files. For example, the text for an image named REL00001.TIF is read from a file named REL00001.txt.
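The naming rule above can be checked before an import with a small helper that derives the expected text file name from each image path. In this Python sketch, `page_text_path` is hypothetical, and placing the text file in the same folder as the image is an assumption.

```python
from pathlib import PureWindowsPath


def page_text_path(image_path: str) -> str:
    """Return the path of the page-level .txt file matching an image path."""
    # Same folder and same base name, with the extension swapped to .txt.
    return str(PureWindowsPath(image_path).with_suffix(".txt"))


expected = page_text_path("D:\\IMAGES\\001\\REL00001.TIF")
print(expected)
```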
Some data originates from client files and needs processing to extract the metadata. The following table shows the delimiters that your internal processing software must use to present data as fields.
You can provide this list to your vendor to help communicate the required delivery format for load files. The fielded data should be delivered as delimited files with column or field names located in the first line.