

You can use the RPC to import data into a store. The RPC makes extensive use of the job queue to distribute the import workload across multiple machines so that processing resources are used efficiently.
Before you can import data, you must create a data store in the RPC, which adds a new database to the SQL Server. Data stores allow you to organize your data by store, group, project, and custodian. This organization helps you use the RPC efficiently when you perform text extraction, generate images, filter, and export.
Note: You can't create a data store enabled for DataGrid.
Any databases that you create in Relativity also appear in the RPC. See Working with Relativity-generated data stores.
Note: The RPC is designed to work in a distributed environment across multiple SQL, file, and worker servers. You may see multiple SQL Servers listed in the Data Store window.
To create a data store:
Note: You can delete groups but not data stores. You can hide a data store by right-clicking the data store and selecting Hidden.
Once you've created a data store, you can import data. See Importing data.
You can use the RPC export feature to generate a load file from a Relativity-generated data store. For example, you may want to create a load file for Relativity import. See Exporting data.
The RPC generally uses the following workflow to import data (such as a PST file) from a hard drive:
The RPC extracts only metadata from the files during an import job. See Extracting text for more information.
Create an import job to organize your data in a single database (data store). For example, you can create a project and designate different custodians under that job. Use jobs to determine your export sequence.
Note: Prioritizing custodians for deduplication is not supported in the RPC. Custodian order is determined during ingestion. These settings can be changed at any time, including during the import process.
To create an import job:
Custodian - the custodian that you want to specify in the project.
Note: You can import data without defining a project or custodian. You can change these settings at any time, including during the import process.
Note: It is possible to add a SQL setting if you have a fixed import location that you would like to see as a location to import data from. In the Invariant database add an entry to the AppSettings table with a category of ‘MapVolume’ and enter the UNC path you would like to import from in the Value2 column. The other columns should stay as NULL. Multiple MapVolume entries can be added to the table.
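As a rough illustration of the setting described in the note above, the sketch below adds two MapVolume rows to a stand-in AppSettings table. It uses an in-memory SQLite database only so that it runs anywhere; the real Invariant database lives on SQL Server. The column names other than Value2 (here Category and Value1), the UNC paths, and the helper name add_map_volume are assumptions made for this example, so check the actual AppSettings schema before running anything against it.

```python
import sqlite3


def add_map_volume(conn, unc_path):
    """Insert one MapVolume entry; every other column is left as NULL."""
    conn.execute(
        "INSERT INTO AppSettings (Category, Value2) VALUES (?, ?)",
        ("MapVolume", unc_path),
    )
    conn.commit()


# Stand-in schema (assumed); the real table may have different columns.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE AppSettings (Category TEXT, Value1 TEXT, Value2 TEXT)")

# Hypothetical UNC paths; multiple MapVolume entries are allowed.
add_map_volume(conn, r"\\fileserver01\imports")
add_map_volume(conn, r"\\fileserver02\collections")

rows = conn.execute(
    "SELECT Category, Value1, Value2 FROM AppSettings ORDER BY Value2"
).fetchall()
for row in rows:
    print(row)
```

The parameterized INSERT names only the category and Value2 columns, so the remaining columns stay NULL as the note requires.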
Options | Select this option if you want to... |
---|---|
File Handling | |
De-NIST | Remove known system files from a set of data. The RPC uses the database provided by the National Institute of Standards and Technology (NIST), which includes approximately 14 million unique SHA-1 hashes for known system files, such as kernel32.dll, user32.dll, and others. These files can automatically be eliminated from a forensic collection because they are not custodian-created files. (There is discoverable information in files on the NIST list, but none of it is pertinent to the case/matter unless you're using NIST.) |
Do not import embedded images in Office docs | Excludes various image file types found inside Microsoft Office files. For example, .JPG, .BMP, or .PNG in a Word file. |
Do not import embedded objects in Office docs | Excludes various files found inside Microsoft Office files. For example, an Excel spreadsheet in a Word file. |
Do not import children | Performs a one-for-one import of the files selected for import, extracting no children or embedded items. |
Do not import inline images in e-mails | Excludes various files found inside emails. For example, .JPG, .BMP, or .PNG in an email file. |
Filter by extensions box | Exclude or include file types added to this text box during import. You can either include or exclude file types in an import, but not both. The RPC filters on the detected file extension rather than the original extension. To include a file type, enter a plus sign, the file extension, and a semicolon. To exclude a file type, use a minus sign in place of the plus sign. |
Only filter parent documents | Applies filter only to top-level (or parent) and loose documents. Since this filters only top-level documents, all the associated attachments will be returned. |
Enable logging | Debug or troubleshoot a job that crashed or that has other minor performance issues. The log file is created on the C: drive of the worker machine processing the file, and log file names use the storage ID of the worker. The file contains subject names, entry IDs, and other information. |
Ignore PST/OST CRC errors | Ignores errors generated when a cyclic redundancy check is performed on PST files. |
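
The include-or-exclude rule for the Filter by extensions box in the table above can be sketched in a few lines. The functions below are hypothetical helpers, not part of the RPC. They assume entries look like +pdf; or -exe; (the exact syntax the RPC accepts, for example whether a leading dot is allowed, is an assumption) and they reject a filter that mixes both signs.

```python
def parse_extension_filter(text):
    """Return (mode, extensions), where mode is 'include' or 'exclude'."""
    entries = [e.strip() for e in text.split(";") if e.strip()]
    if not entries:
        raise ValueError("filter is empty")
    signs = set()
    extensions = []
    for entry in entries:
        sign = entry[0]
        ext = entry[1:].strip().lstrip(".").lower()
        if sign not in ("+", "-") or not ext:
            raise ValueError(f"malformed entry: {entry!r}")
        signs.add(sign)
        extensions.append(ext)
    if len(signs) > 1:
        # Mirrors the rule that one import cannot both include and exclude.
        raise ValueError("cannot include and exclude in the same import")
    mode = "include" if signs == {"+"} else "exclude"
    return mode, extensions


def keep_file(detected_extension, mode, extensions):
    """Decide on the detected extension, not the original file name."""
    matched = detected_extension.lower().lstrip(".") in extensions
    return matched if mode == "include" else not matched


mode, exts = parse_extension_filter("+pdf;+docx;")
print(mode, exts)
print(keep_file("PDF", mode, exts))
```

Passing the detected extension to keep_file reflects the table's point that filtering is based on what the file actually is rather than on how it was named.
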
Note: Time zone is applicable to OCR/imaging, text extraction, exporting, and publishing, but not importing. By default, the RPC stores date type metadata in the database in Coordinated Universal Time (UTC). While you can optionally set the time zone on the Job Settings General tab when importing, this information is required prior to running text extraction, imaging or exporting.
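The note above can be illustrated with a generic example: a value stored in UTC identifies one instant, and the chosen time zone changes only how that instant is rendered. This is a plain Python sketch, not RPC code. The sample timestamp and the fixed UTC-5 offset are assumptions chosen for the example.

```python
from datetime import datetime, timedelta, timezone

# A date value as it would be stored: in UTC (sample value, assumed).
stored_utc = datetime(2023, 1, 15, 2, 30, tzinfo=timezone.utc)

# Hypothetical job time zone: a fixed UTC-5 offset with no daylight saving.
job_zone = timezone(timedelta(hours=-5))

# Converting changes the rendering, not the instant itself.
rendered = stored_utc.astimezone(job_zone)
print(stored_utc.isoformat())
print(rendered.isoformat())
print(rendered == stored_utc)
```

In this example the rendered value falls on the previous calendar day, which suggests why the time zone needs to be set before steps that display or write out dates, such as text extraction, imaging, or exporting.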
Once you've imported data to a data store, you can extract text. See Extracting text.
The RPC imports all documents into the system, even those with unsupported file types. This comprehensive import allows you to run exception reports that list any files that the RPC couldn't process.
While the RPC can identify most file types, it doesn't have corresponding handlers for all types. The RPC imports unsupported file types, but they aren't added to the job queue and undergo no further processing. You can generate a detailed error report that lists these files as well as any password-protected files.
The RPC also collects the following import information found in various reports:
The following reports are available for troubleshooting an import job:
Note: Each error listed in the summary report has a corresponding detailed version in the error report. If you reprocess error files and they don't encounter new errors, they will still appear on the report when you run that report again. If the reprocessed files encounter errors on the retry, these new errors are listed along with the original errors. The bad container report is dynamic, which means if a container was successfully re-imported, it doesn't appear on the bad container report when you run that report again.
See Running standard reports for information about other reports.