Document flags
Data Breach Response uses document flags in a number of ways to denote information about a document. There are two types of document flags:
- System Flags
- TIQ Flags
Many flag types are available by default; however, lead users can create new User Flag types in a project.
System flags
The application logic creates system flags during an Ingestion or Data Analysis run, and users cannot remove them. System flags denote documents that the application had issues processing. The presence of a system flag may prevent the document from being processed in certain pipeline stages.
System default flag types
The following table describes default flag types:
Flag name | Description |
---|---|
Exceeds Excel Size Limit | The Excel file exceeded the maximum file size supported by Data Breach Response and was not processed. This document will not have PI predictions and will not be available for review. |
System Technical Issue | The file was impacted by a technical issue that prevented the document from being processed for review and reporting. |
The document failed to convert to PDF | A non-spreadsheet native was not converted to PDF and will not be viewable in the PDFTron viewer, but PI predictions are still made. The Document Report will contain information on the document. |
File Extension is not currently supported | The file extension is not supported. |
Exceeds Native File Size Limit | The file is too large. |
Empty Text | The associated text file is empty. |
TIQ flags
The application logic creates TIQ flags during an Ingestion or Data Analysis run, and users can remove them. They draw attention to issues encountered when processing a document that may require user action. Each flag is generated by a stage in Data Analysis.
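The difference between the two flag types can be modeled in a few lines. The following Python sketch is illustrative only and is not the Data Breach Response implementation; `FlagKind`, `Flag`, and `remove_flag` are hypothetical names introduced for this example.

```python
from __future__ import annotations

from dataclasses import dataclass
from enum import Enum


class FlagKind(Enum):
    SYSTEM = "system"  # created by the application; users cannot remove it
    TIQ = "tiq"        # created by the application; users can remove it


@dataclass(frozen=True)
class Flag:
    name: str
    kind: FlagKind


def remove_flag(flags: list[Flag], name: str) -> list[Flag]:
    """Return the flags left after a user asks to remove `name`.

    Only TIQ flags are dropped; system flags are kept, mirroring the
    rule that users cannot remove them.
    """
    return [f for f in flags if not (f.name == name and f.kind is FlagKind.TIQ)]


flags = [
    Flag("Empty Text", FlagKind.SYSTEM),
    Flag("Detector regex timed out", FlagKind.TIQ),
]
after_tiq = remove_flag(flags, "Detector regex timed out")  # TIQ flag is removed
after_system = remove_flag(flags, "Empty Text")             # system flag stays
```

In this sketch, a removal request for a system flag is silently ignored; how the application itself responds to such a request is not specified here.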
TIQ default flag types
The following table describes TIQ default flag types and resolutions:
Flag name | Stage | Description | Resolution |
---|---|---|---|
Document deduplication could not run on this document | Deduplication | Application runs into an error during document deduplication. | No immediate resolution is available for this error. If it impedes the output of the project, please contact Relativity Support. |
Document text could not be extracted | Extraction | Application runs into an error extracting the raw text from a document. | No immediate resolution is available for this error. If it impedes the output of the project, please contact Relativity Support. |
Document splitting could not be performed on this document | Document Splitting | Application runs into an error splitting the document into sub-documents. This may prevent PI detections from running on the document. | Review the document manually to ensure PI detections are complete. If it impedes the output of the project, please contact Relativity Support. |
Excel header detection did not complete on this document | Excel Header Detection | Application runs into an error performing header detection. This may prevent PI predictions from completing on the document. | Manually review the document to identify personal information. If it impedes the output of the project, please contact Relativity Support. |
Sheet contains conflicting name headers | Excel Header Detection | Multiple columns were tagged as the same name type in a single table. Documents with these results are blocked from the Entity Report. For example, a table cannot have two columns that are "Full Name". This can occur for Full Name, First Name, Middle Name, and Last Name fields. | Review documents or use Spreadsheet QC to remove duplicate labels (e.g., two columns labeled as "Full Name"). A table can only contain a single entity per row, and the entity's name is generated from the labels. Multiple labels create conflicting information, and the affected results will be excluded from the Entity Report. If all name fields need to be captured correctly, try to create multiple tables so there is only one instance of a label per table. If two tables cannot be created to capture the entity's information, remove the duplicate labels and upload another version of the document in Data Breach Response to capture the other entity's information. Alternatively, the documents can be reviewed in Relativity. |
Sheet contains conflicting PI Types within single column | Excel Header Detection | A single column has been tagged as multiple conflicting PI types. | Review documents or use Spreadsheet QC to remove duplicate labels (e.g., two columns labeled as "Address 1"). A table can only contain a single instance of a label per row. Multiple labels create conflicting information, and the affected results will be excluded from the Entity Report. If all PI fields need to be captured correctly, try to create multiple tables so there is only one instance of a label per table. If two tables cannot be created to capture the PI, remove the duplicate labels and upload another version of the document in Data Breach Response to capture the other PI. Alternatively, the documents can be reviewed in Relativity. |
Excel detection process not complete due to timeout | Excel Header Detection | Excel document takes too long to process. | Manually review the document and/or turn off any unnecessary detectors. If it impedes the output of the project, please contact Relativity Support. |
Excel header assignment did not complete on this document. | Excel Header Detection | Application runs into an error applying header assignments. | Manually review the document and/or turn off any unnecessary detectors. If it impedes the output of the project, please contact Relativity Support. |
Spreadsheet tag statistics could not be fully generated | Excel Tag Stats Processing | An error occurred while calculating the spreadsheet statistics used for the Spreadsheet QC experience. | If it impedes the output of the project, please contact Relativity Support. |
Spreadsheet content extraction error | Excel Content Detection | An error occurred while extracting information from the spreadsheet to generate entity records. Individuals from this document may not be available in the Entity Report. | Manually review the document to ensure that the tables and columns are properly labelled. If the issue persists, please contact Relativity Support. |
Cell values were excluded from the Entity report | Excel Content Detection | Cell values were excluded from the Normalization process and the Entity Report because they failed to match the format of the assigned PI type. For example, a Social Security Number has 5 digits instead of 9. | Use the Spreadsheet Blocklisting tool to determine if some terms should be allowed. Rerun the normalization process to show these results in the Entity Report. If it impedes the output of the project, please contact Relativity Support. |
Detector regex timed out | Keyword and Regex Detection | Keyword or regex takes too long to run. | Review documents manually and/or update the detector that is timing out. Turn off PI types that are not necessary for the project. |
Detection process not complete due to timeout | Keyword and Regex Detection | Processing a document takes too long. | Modify the detector regex to be less computationally expensive. |
Contains errors from the NER Detector | Machine Learning Detection | Application runs into an error during Name Detection. | Manually review the impacted documents. |
Error applying document classifications | Machine Learning Detection | An error occurred while applying document classifications that may impact the reported accuracy of predictions. | Manually review the impacted documents. |
Personal Information counts could not be calculated | Document Statistics | An error occurred that may cause the PI and name counts to be inaccurately reported. | Manually review the impacted documents and rerun Data Analysis. If it impedes the output of the project, please contact Relativity Support. |
The document could not be scored | Document Scoring | Application runs into an error scoring the document. | Manually review the impacted documents and rerun Data Analysis. If it impedes the output of the project, please contact Relativity Support. |
Overlapped annotation removal could not complete | Overlap Removal | An error occurred while removing overlapping annotations from machine predictions. | Manually review the impacted documents and rerun Data Analysis. If it impedes the output of the project, please contact Relativity Support. |
PDF annotations could not be generated for document | PDF Annotation Generation | Application runs into an error trying to generate annotations on the PDF view of the document. | Contact Relativity Support. |
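The condition behind the "Sheet contains conflicting name headers" flag, where more than one column in a table carries the same name label, can be checked on a list of column labels. The Python sketch below is a minimal illustration under stated assumptions, not the application's detection logic: it assumes the labels for one table are available as a list of strings, and `NAME_LABELS` and `find_conflicting_name_labels` are hypothetical names.

```python
from collections import Counter

# Name labels that may appear at most once per table, per the flag description.
NAME_LABELS = {"Full Name", "First Name", "Middle Name", "Last Name"}


def find_conflicting_name_labels(column_labels):
    """Return the name labels assigned to more than one column in a table."""
    counts = Counter(label for label in column_labels if label in NAME_LABELS)
    return sorted(label for label, n in counts.items() if n > 1)


# Hypothetical table: "Full Name" is assigned to two columns.
labels = ["Full Name", "Address 1", "Full Name", "Last Name"]
conflicts = find_conflicting_name_labels(labels)
```

A non-empty result suggests the table should be split or the duplicate label removed, as described in the resolution for that flag.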