

Note: Incorporate Feedback has been moved to the Data Analysis tab. The Incorporate Feedback page in Privacy Workflow was deprecated on March 17, 2025. Please begin using the Data Analysis page prior to this date.
Data Analysis is a combination of machine stages that make predictions, perform calculations, curate machine output, and generate reports.
There are two types of data, structured and unstructured, and Data Breach Response treats each one differently when finding PI.
Stage | Description | Can documents be reviewed while the stage is running? | When to run the stage |
---|---|---|---|
Run Blocklisting | The blocklist consists of terms added from the Blocklisting tool that will not be detected as PI. PI detectors will ignore new detections that match the blocklisted terms, and prior detections matching blocklisted terms will be marked as Blocklisted and have their links broken. Manually added detections that match terms in the blocklist will not be removed. If blocklisting is run when no changes have been made to the blocklist, the stage status will show as Skipped in the Progress section. | No | |
Run Unstructured Detectors | Identifies PI by running all enabled PI detectors on unlocked unstructured documents. As soon as unstructured detectors have finished processing on a document, the document becomes available for review. | Yes | |
Run Structured Detectors | Identifies PI by running all enabled PI detectors on unlocked structured documents. In addition, all names and PI from structured documents are automatically linked. As soon as structured detectors have finished processing on a document, the document becomes available for review. | Yes | |
Run Normalization | Standardizes names and PI into consolidated entities and generates an updated entity report. | No | |
Compile Insights | Calculates and consolidates PI and entity statistics for reporting. | No | |
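The blocklisting rules in the table can be illustrated with a minimal Python sketch. This is not Relativity's implementation: the `Detection` class, its fields, the `apply_blocklist` function, the exact-match comparison, and the `Completed` status label are all hypothetical and only model the behavior described above. Only the `Skipped` and `Blocklisted` labels come from this page.

```python
from dataclasses import dataclass, field


@dataclass
class Detection:
    """Hypothetical model of a PI detection; not a Relativity type."""
    text: str
    manual: bool = False                        # True if a reviewer added it by hand
    status: str = "Active"                      # assumed default label
    links: list = field(default_factory=list)   # linked entities (illustrative)


def apply_blocklist(detections, blocklist, blocklist_changed):
    """Model the Run Blocklisting rules described in the table."""
    if not blocklist_changed:
        # No changes to the blocklist: the stage is reported as Skipped.
        return "Skipped"
    terms = set(blocklist)  # exact-match comparison is an assumption
    for det in detections:
        if det.manual:
            continue        # manually added detections are not removed
        if det.text in terms:
            det.status = "Blocklisted"
            det.links.clear()   # prior detections have their links broken
    return "Completed"          # assumed label for a finished stage
```

In this model, a detector-generated hit on a blocklisted term ends up marked Blocklisted with no links, while a manually added hit on the same term is left untouched.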
This section provides instructions for running Data Analysis and lists common use cases when running Data Analysis.
To run Data Analysis:
On each run you can configure Data Analysis to run all, or some, stages.
Depending on the goal of the run, it may be helpful to select only some stages. Common use cases when running Data Analysis include:
Note: The initial Entity Centric Report is generated from spreadsheet entities only. Entities from unstructured documents will appear on the Entity Centric Report when they are linked in the document viewer.
QC review primarily focuses on refining detectors and potentially blocklisting false hits. At this stage, having an up-to-date Entity Centric Report is not the priority. For this reason and to reduce runtime, run the following stages only:
Just as in the QC process, detectors may be refined during Review. You can run the same stages as in Case 2 if you wish to make detector or blocklist updates during Review.
If you wish to just generate updated versions of the Reviewer Progress and/or Document Report, run the following stage only:
During the deduplication process, entities may be merged, entities may be unmerged, or Deduplication Settings may be updated. It is not typical that detectors are updated at this stage.
If changes have been made to structured documents, for example adding or removing PI, since the entity report was last generated, include Run Structured Detectors. This stage is responsible for automatically linking names and PI in structured documents to create entities. Running Structured Detectors ensures those links are up to date for the entity report.
A run’s progress can be monitored on the Data Analysis page. Data Analysis breaks each stage down into sections that include dashboard summaries, sub-job details, and counts.
Overall progress can be monitored using the Progress section. Statuses can be:
Dashboard numbers
If Blocklisting is run but there have been no changes to the blocklist, the status section displays a Skipped status.
Dashboard numbers
You can retry these errors. See Canceling and retrying Data Analysis for details.
Note: When running Data Analysis, you can choose to run only unstructured or structured detection. If one is not run, its Documents Completed count will remain zero. For example, if only unstructured detection is run, Structured Documents Completed will display zero documents because the structured detectors were not run.
Dashboard numbers
You can retry these errors. See Canceling and retrying Data Analysis for details.
Dashboard numbers
You can retry these errors. See Canceling and retrying Data Analysis for details.
While Data Analysis is running, reviewers cannot add, edit, or delete entities or PI on documents. However, to reduce time to review, the Unstructured and Structured PI Detection stages follow a document streaming approach: as individual documents finish the PI Detection stage, they become available for review. Blocklisting, Entity Normalization & Deduplication, and Compile Insights do not follow this approach; all documents must finish processing before they become available for review. Take this into consideration when selecting which stages to run.
Each document is assigned a Data Analysis Status to indicate its availability for review:
Not ready for review: the document is currently being processed through Blocklisting, PI Detection, or Compile Insights. The status changes to Not ready for review when Data Analysis is run, and reviewers cannot edit the document.
Running normalizer: the document is currently being processed through Entity Normalization & Deduplication. The status changes to Running normalizer when Entity Normalization & Deduplication is running, and reviewers cannot edit the document.
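The relationship between the running stage, document streaming, and the Data Analysis Status can be sketched as a small lookup. This is an illustrative model only, not Relativity code. The `data_analysis_status` function is hypothetical, and the `Ready for review` label is a placeholder for whatever status an available document carries, since that value is not listed on this page.

```python
from typing import Optional

# Stages that stream documents to reviewers as each document finishes.
STREAMING_STAGES = {"Run Unstructured Detectors", "Run Structured Detectors"}
NORMALIZATION_STAGE = "Run Normalization"


def data_analysis_status(running_stage: Optional[str], doc_finished_stage: bool) -> str:
    """Return an illustrative status for one document.

    running_stage: the stage currently running, or None if no run is active.
    doc_finished_stage: whether this document has finished that stage.
    """
    if running_stage is None:
        return "Ready for review"        # placeholder label (assumption)
    if running_stage in STREAMING_STAGES and doc_finished_stage:
        return "Ready for review"        # streamed out as soon as it finishes
    if running_stage == NORMALIZATION_STAGE:
        return "Running normalizer"      # locked until every document finishes
    return "Not ready for review"
```

Under this model, a document that has finished unstructured detection is reviewable immediately, whereas a document in a non-streaming stage stays locked even after its own processing is done.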
You can view a document’s Data Analysis Status on the Project Dashboard Document List, and you can search on the field using PI and Entity Search.
You can stop Data Analysis at any time while it is in progress. To stop it, select the Cancel Run button in the Project Actions console.
If a stage fails or Data Analysis is manually stopped, it can be restarted using the Retry button in the Progress card. Data Analysis will restart from the failed or interrupted stage when retrying.
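The retry behavior described above can be pictured with a short sketch. It assumes, without confirmation from this page, that selected stages execute in the order shown in the stage table and that a retry continues through the remaining selected stages. `STAGE_ORDER` and `stages_to_retry` are hypothetical names, not part of any Relativity API.

```python
# Assumed execution order, taken from the order of the stage table.
STAGE_ORDER = [
    "Run Blocklisting",
    "Run Unstructured Detectors",
    "Run Structured Detectors",
    "Run Normalization",
    "Compile Insights",
]


def stages_to_retry(selected, stopped_at):
    """Return the selected stages from the failed or interrupted one onward."""
    ordered = [stage for stage in STAGE_ORDER if stage in selected]
    # Stages that completed before the stop point are not repeated.
    return ordered[ordered.index(stopped_at):]
```

For example, if a run with Run Blocklisting, Run Unstructured Detectors, and Compile Insights selected is canceled during Run Unstructured Detectors, this model resumes with Run Unstructured Detectors and then Compile Insights, without repeating Run Blocklisting.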
To start a new run, select Run Data Analysis in the Project Actions console.
The Errors field will indicate the number of documents that have errored or encountered an issue during that stage. Select View Errors or the View Errored Documents button in the console to view these documents and their specific errors on the Project Dashboard. For more information on how to address these, see Document Flags.
Like other complex features in Relativity, Data Analysis lets you gather audits of its runs through the View Run History modal.
The following information is available in View Run History: