Errors and flags

Data Breach Response uses document flags to record information about a document. There are three types of document flags:

  • System Flags
  • TIQ Flags
  • User Flags

Many flag types are available by default; however, lead users can create new User Flag types in a project.

System flags

The application logic creates system flags during an Ingestion or Incorporate Feedback (IF) run, and users cannot remove them. System flags mark documents that the application had trouble processing. The presence of a system flag may prevent that document from being processed in certain pipeline stages.

System default flag types

The following table describes the default system flag types.

Flag name | Description
Exceeds Excel Size Limit | The Excel file exceeded the maximum file size supported by Data Breach Response and was not processed. This document will not have PI predictions and will not be available for review.
System Technical Issue | The file was affected by a technical issue that prevented it from being processed for review and reporting.
The document failed to convert to PDF | A non-spreadsheet native was not converted to PDF and will not be viewable in the PDFTron viewer, but PI predictions are still made. The Document Report will contain information on the document.
File Extension is not currently supported | The file extension is not supported.
Exceeds Native File Size Limit | The file is too large.
Empty Text | The associated text file is empty.

TIQ flags

The application logic creates TIQ flags during an Ingestion or IF run, and users can remove them. They draw attention to issues encountered when processing a document that may require user action. Each flag is generated by a stage in the pipeline.

TIQ default flag types

Flag name | Stage | Description | Resolution
Document deduplication could not run on this document | Deduplication | The application ran into an error during document deduplication. | No immediate resolution is available for this error. If it impedes the output of the project, please contact Relativity Support.
Document text could not be extracted | Extraction | The application ran into an error extracting the raw text from a document. | No immediate resolution is available for this error. If it impedes the output of the project, please contact Relativity Support.
Document splitting could not be performed on this document | Document Splitting | The application ran into an error splitting the document into sub-documents. This may prevent PI detections from running on the document. | Review the document manually to ensure PI detections are complete. If it impedes the output of the project, please contact Relativity Support.
Excel header detection did not complete on this document | Excel Header Detection | The application ran into an error performing header detection. This may prevent PI predictions from completing on the document. | Manually review the document to identify personal information. If it impedes the output of the project, please contact Relativity Support.
Sheet contains conflicting name headers | Excel Header Detection | Multiple columns were tagged as the same name type in a single table. Documents with these results are blocked from the Entity Report. For example, a table cannot have two columns labeled "Full Name". This can occur for Full Name, First Name, Middle Name, and Last Name fields. | Review the documents or use Spreadsheet QC to remove duplicate labels (for example, two columns labeled "Full Name"). A table can only contain a single entity per row, and the entity's name is generated from the labels. Multiple labels create conflicting information and will be excluded from the Entity Report. If all name fields need to be captured correctly, try to create multiple tables so there is only one instance of a label per table. If two tables cannot be created to capture the entity's information, remove the duplicate labels and upload another version of the document in Data Breach Response to capture the other entity's information. Alternatively, the documents can be reviewed in Relativity.
Sheet contains conflicting PI Types within single column | Excel Header Detection | A single column has been tagged as multiple conflicting PI types. | Review the documents or use Spreadsheet QC to remove duplicate labels (for example, two columns labeled "Address 1"). A table can only contain a single instance of a label per row. Multiple labels create conflicting information and will be excluded from the Entity Report. If all PI fields need to be captured correctly, try to create multiple tables so there is only one instance of a label per table. If two tables cannot be created to capture the PI, remove the duplicate labels and upload another version of the document in Data Breach Response to capture the other PI. Alternatively, the documents can be reviewed in Relativity.
Excel detection process not complete due to timeout | Excel Header Detection | The Excel document took too long to process. | Manually review the document and/or turn off any unnecessary detectors. If it impedes the output of the project, please contact Relativity Support.
Excel header assignment did not complete on this document. | Excel Header Detection | The application ran into an error applying header assignments. | Manually review the document and/or turn off any unnecessary detectors. If it impedes the output of the project, please contact Relativity Support.
Spreadsheet tag statistics could not be fully generated | Excel Tag Stats Processing | An error occurred while calculating the spreadsheet statistics used for the Spreadsheet QC experience. | If it impedes the output of the project, please contact Relativity Support.
Spreadsheet content extraction error | Excel Content Detection | An error occurred while extracting information from the spreadsheet to generate entity records. Individuals from this document may not be available in the Entity Report. | Manually review the document to ensure that the tables and columns are properly labeled. If the issue persists, please contact Relativity Support.
Cell values were excluded from the Entity report | Excel Content Detection | Cell values were excluded from the Normalization process and the Entity Report because they failed to match the format of the assigned PI type. For example, a Social Security Number has 5 digits instead of 9. | Use the Spreadsheet Blocklisting tool to determine if some terms should be allowed. Rerun the normalization process to show these results in the Entity Report. If it impedes the output of the project, please contact Relativity Support.
Detector regex timed out | Keyword and Regex Detection | A keyword or regex took too long to run. | Review documents manually and/or update the detector that is timing out. Turn off PI types that are not necessary for the project.
Detection process not complete due to timeout | Keyword and Regex Detection | Processing a document took too long. | Modify the detector regex to be less computationally expensive.
Contains errors from the NER Detector | Machine Learning Detection | The application ran into an error during Name Detection. | Manually review the impacted documents.
Error applying document classifications | Machine Learning Detection | An error occurred while applying document classifications that may impact the reported accuracy of predictions. | Manually review the impacted documents.
Personal Information counts could not be calculated | Document Statistics | An error occurred that may cause the PI and name counts to be inaccurately reported. | Manually review the impacted documents and rerun the Incorporate Feedback Pipeline. If it impedes the output of the project, please contact Relativity Support.
The document could not be scored | Document Scoring | The application ran into an error scoring the document. | Manually review the impacted documents and rerun the Incorporate Feedback Pipeline. If it impedes the output of the project, please contact Relativity Support.
Overlapped annotation removal could not complete | Overlap Removal | An error occurred while removing overlapping annotations from machine predictions. | Manually review the impacted documents and rerun the Incorporate Feedback Pipeline. If it impedes the output of the project, please contact Relativity Support.
PDF annotations could not be generated for document | PDF Annotation Generation | The application ran into an error generating annotations on the PDF view of the document. | Contact Relativity Support.
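
The "Sheet contains conflicting name headers" condition can be checked by hand before re-uploading a spreadsheet. The following Python sketch is only an illustration and is not the application's own logic. It assumes the column labels of a single table are available as a plain list of strings, and the function name find_conflicting_name_labels is hypothetical.

```python
from collections import Counter

# Name types that, per the table above, may be assigned to only one column per table.
NAME_LABELS = {"Full Name", "First Name", "Middle Name", "Last Name"}


def find_conflicting_name_labels(column_labels):
    """Return the name labels that are assigned to more than one column.

    `column_labels` is a hypothetical list holding one label per column of a
    single table; unlabeled columns can be represented as None.
    """
    counts = Counter(label for label in column_labels if label in NAME_LABELS)
    return sorted(label for label, n in counts.items() if n > 1)


# Two "Full Name" columns in one table is the situation the flag describes.
labels = ["Full Name", "Address 1", "Full Name", "Last Name", None]
print(find_conflicting_name_labels(labels))  # prints ['Full Name']
```

An empty result means no name type is repeated in that table. A non-empty result lists the labels to remove or to move into a separate table.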

User flags

Users create user flags and can delete them. User flags are meant for managing document sets, and their presence or absence has no bearing on application logic.

Flag name | Description | Intended usage
Needs Further Review | A user has identified a document as needing additional review by a project lead. | The project lead batches out these documents for review, or reviews them one by one.
Technical Issue | A user has identified a document as having a technical issue that prevented them from completing their review. | Documents get reviewed outside of Data Breach Response.
Illegible | A user has identified a document as not being readable and, therefore, not reviewable. | Leads should ensure that all documents that were not reviewed are flagged to stakeholders.
Duplicate | A user has identified a document as a duplicate and may not have completed the review of the document. | Leads should ensure that all documents that were flagged as duplicates, and likely not reviewed, are brought to the attention of project stakeholders.
Bulk Document | A user has identified a document as a bulk document with a large amount of PI that will need to be reviewed in an alternative review process. | Leads should determine how to handle bulk documents based on the project requirements and either batch out the bulk documents for review or review them outside of Data Breach Response.
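
The three flag categories differ in who creates them and whether users can remove them. The Python sketch below restates those rules as a small lookup. It is an illustrative model only: the class and function names (FlagCategory, FlagRules, can_user_remove) are hypothetical and do not correspond to any Data Breach Response API.

```python
from dataclasses import dataclass
from enum import Enum


class FlagCategory(Enum):
    SYSTEM = "System Flag"
    TIQ = "TIQ Flag"
    USER = "User Flag"


@dataclass(frozen=True)
class FlagRules:
    created_by: str        # who creates flags of this category
    user_removable: bool   # whether users can remove the flag


# Rules as stated in this section; the structure itself is an assumption.
RULES = {
    FlagCategory.SYSTEM: FlagRules(created_by="application", user_removable=False),
    FlagCategory.TIQ: FlagRules(created_by="application", user_removable=True),
    FlagCategory.USER: FlagRules(created_by="user", user_removable=True),
}


def can_user_remove(category: FlagCategory) -> bool:
    """Return True if users may remove flags of the given category."""
    return RULES[category].user_removable


for category in FlagCategory:
    print(category.value, "-> removable by users:", can_user_remove(category))
```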

Stage errors

Stage errors appear in the error message column of the Document-based or Non-Document-based error tabs on the Incorporate Feedback page.

Constant name | Type | Readable error message | Stage | Item (for non-document errors) | Error flag | Description
METADATA_GENERATION_ERROR | Document Error | Document meta data generation error. | Document Generator | | UNSUPPORTED_EXTENSION, EMPTY_TEXT, EXCEEDS_SIZE_LIMIT | Failed to generate document statistics about the document's PI.
KEYWORD_REGEX_ERROR | Document Error | Unable to process initial keyword/regexes. | PI Text Detection | | DETECTION_PROCESS_TIMEOUT_FAILURE, JAVA_DETECTORS_FAILURE | An exception was caught during the process, which may result in missing PI.
PROCESS_SETUP_ERROR | Document Error | An error occurred during the process setup. | PI Text Detection | | DETECTION_PROCESS_TIMEOUT_FAILURE, JAVA_DETECTORS_FAILURE | Before running the global keyword filter process, the application clears the entire cache and reloads detector regexes and keywords. This error is logged if something goes wrong during that setup.
DETECTION_GENERATION_ERROR | Document Error | Unable to generate detections. | Spreadsheet Regex Processing, Excel PI Contents Detector | | EXCEL_ANNOTATION_GENERATION_FAILURE, EXCEL_HEADER_DETECTOR_FAILURE, SPREADSHEET_REGEX_FAILURE |
DETECTION_PROCESS_TIMEOUT | Document Error | The detection process timed out. | PI Text Detection | | DETECTION_PROCESS_TIMEOUT_FAILURE, JAVA_DETECTORS_FAILURE |
DUPLICATE_DOC_CHECK_ERROR | Document Error | Unable to determine if the document is a duplicate. | Deduplication | | DEDUPLICATION_FAILURE |
DOC_HAS_DUPLICATES_CHECK_ERROR | Document Error | Unable to check the document for duplicates. | Deduplication | | DEDUPLICATION_FAILURE |
DAT_TEXT_PATH_ERROR | Document Error | DAT entry does not include a text path. | Minimal Extractor | | EXTRACTION_FAILURE |
DAT_TEXT_EXTRACTION_ERROR | Document Error | Unable to extract text from the DAT record. | Minimal Extractor | | EXTRACTION_FAILURE |
DOC_SIZE_LIMIT_EXCEEDED | Document Error | Document size exceeds the maximum processing limit. | Minimal Extractor | | EXTRACTION_FAILURE |
NATIVE_FILE_NOT_FOUND | Document Error | Native file not found. | PDF Conversion | | PDF_CONVERSION_FAILURE |
NATIVE_FILE_INACCESSIBLE | Document Error | The native file at the given path is inaccessible. | PDF Conversion | | PDF_CONVERSION_FAILURE |
NATIVE_TO_PDF_ERROR | Document Error | The native file is not available in the Viewer. Please use the extracted text. | PDF Conversion | | PDF_CONVERSION_FAILURE |
NATIVE_PDF_CORRUPT_ERROR | Document Error | The native file is corrupted and not available in the Viewer. Please use the extracted text. | PDF Conversion | | PDF_CONVERSION_FAILURE |
NATIVE_PDF_CORRUPT_OR_ENCRYPTED | Document Error | The native file is either corrupted or encrypted. Please check if the file can be opened before re-ingesting the data. | PDF Conversion | | PDF_CONVERSION_FAILURE |
NATIVE_PDF_ENCRYPTED_ERROR | Document Error | The file is encrypted. Please decrypt the file and re-ingest the data or use the extracted text. | PDF Conversion | | PDF_CONVERSION_FAILURE |
NATIVE_PDF_INVALID_FILE_ERROR | Document Error | The file is not in a valid PDF format. Please use the extracted text. | PDF Conversion | | PDF_CONVERSION_FAILURE |
NATIVE_PDF_CANNOT_CONVERT_FORMAT | Document Error | The format is not directly supported on this platform. Please manually convert and re-ingest the data. | PDF Conversion | | PDF_CONVERSION_FAILURE |
BLOCKLISTING_ERROR | Non-Document Error | Unable to process blocklisted terms. | Blocklisting | blacklist_id (from PiiBlacklistedData) | | An interruption occurred during the blocklisting stage, and the term could not be blocklisted.
DETECTION_CLASSIFICATION_ERROR | Non-Document Error | Unable to classify the detections for the document. | Machine Learning Detection | classification_id (from PiiClassificationRows) | | The process for verifying the PI and creating annotations was interrupted, so the detection entry will not appear.
DETECTION_MODEL_RUN_ERROR | Non-Document Error | Prediction models failed to run for document: | Machine Learning Detection | classification_id (from PiiClassificationRows) | | An exception occurred when trying to load the prediction model. Please contact Relativity Support.
DETECTION_MODEL_LOAD_ERROR | Non-Document Error | Detection models failed to load. | Machine Learning Detection | classification_id (from PiiClassificationRows), detectorType | | An error occurred when trying to load the detector model. Please contact Relativity Support.
DETECTION_MODEL_TRAINING_ERROR | Non-Document Error | Detectors failed to train. | Detector Training | detectorType | | An interruption occurred during detector training. A model cannot be generated for the detector.
APPLY_ANNOTATIONS_ERROR | Document Error | Unable to apply document-level annotations. | Machine Learning Detection | | JAVA_DETECTORS_FAILURE |
DOCUMENT_SCORER_ERROR | Document Error | Unable to score the document. | Document Scoring | | DOCUMENT_SCORER_FAILURE |
PII_DOCUMENT_NOT_FOUND | Document Error | Document not found in the database. | Validation, Overlap Removal | | OVERLAP_DETECTOR_FAILURE |
OVERLAP_REMOVAL_ERROR | Document Error | Overlapping annotations could not be removed. | Overlap Removal | | OVERLAP_DETECTOR_FAILURE |
DOCUMENT_STAT_ERROR | Document Error | Unable to update document statistics. | Document Statistics | | DOCUMENT_STATS_FAILURE |
DAT_ENTRY_NOT_FOUND | Document Error | DAT entry not found. | Validation | | |
TEXT_NOT_FOUND | Document Error | Text file for the document could not be found. | Validation, PDF Conversion | | PDF_CONVERSION_FAILURE |
MISSING_EXCEL_NATIVE | Document Error | Native file for Excel document could not be found. | Validation | | |
PDF_ANNOTATIONS_ERROR | Document Error | PDF coordinates were not added to annotations. | PDF Annotation Generation | | PDF_ANNOTATION_FAILURE |
REPORT_GENERATION_ERROR | Non-Document Error | Unable to generate reports. | Report Generation | Report type name | | An interruption occurred during report generation. The specific report could not be generated.
EXCEL_HEADER_STATS_ERROR | Document Error | Unable to generate header statistics. | Excel Tag Stats Processing | | |
EXCEL_HEADER_STAT_WRITER_ERROR | Document Error | Unable to write header statistics to the database. | Excel Tag Stats Processing | | |
EXCEL_HEADER_ASSIGNMENTS_ERROR | Non-Document Error | Unable to process header assignments. | Excel Header Detection | headerValue from PiiExcelHeaderAssignment | | An interruption occurred while inserting or deleting an annotation for a spreadsheet, and the annotation could not be added or deleted.
EXCEL_STATS_UPDATE_ERROR | Document Error | Unable to update spreadsheet statistics. | Excel Header Detection | | EXCEL_HEADER_ASSIGNMENT_FAILURE |
PI_MAPPING_ERROR | Document Error | Unable to retrieve PI types and detector data. | Excel PI Contents Detector | | EXCEL_ANNOTATION_GENERATION_FAILURE |
NATIVE_PII_DOCUMENT_NOT_FOUND | Document Error | Unable to find the native file or document in the database. | Excel PI Contents Detector | | EXCEL_ANNOTATION_GENERATION_FAILURE |
FILE_READ_ERROR | Document Error | Failed to read the file. | Excel PI Contents Detector | | EXCEL_ANNOTATION_GENERATION_FAILURE |
UNSUPPORTED_FIL_EXTENSION | Document Error | Unsupported file extension provided. | Excel PI Contents Detector | | EXCEL_ANNOTATION_GENERATION_FAILURE |
EXCEL_SEARCH_TERMS_ERROR | Non-Document Error | Unable to process search terms. | Excel Header Detection | SearchTermToProcess | | An interruption occurred when processing search term annotations on a spreadsheet.
EXCEL_SEARCH_TERM_STATS_ERROR | Document Error | Unable to update search term statistics. | Excel Header Detection | | |
CHANGE_RELATIVITY_STATE_FAILURE | Non-Document Error | Failed to change Relativity state. | Report Generation | "" | | Unable to reset Relativity state for report generation.
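
When many stage errors appear after a run, grouping them by pipeline stage can help decide where to look first. The Python sketch below is an illustration only. It copies a few rows from the table above into a dictionary and groups a list of constant names by stage. The input list is assumed to have been collected by hand from the error tabs, and the names STAGE_ERRORS and group_by_stage are hypothetical.

```python
from collections import defaultdict

# A few rows copied from the table above: constant name -> (type, stage).
STAGE_ERRORS = {
    "DUPLICATE_DOC_CHECK_ERROR": ("Document Error", "Deduplication"),
    "NATIVE_FILE_NOT_FOUND": ("Document Error", "PDF Conversion"),
    "NATIVE_PDF_ENCRYPTED_ERROR": ("Document Error", "PDF Conversion"),
    "REPORT_GENERATION_ERROR": ("Non-Document Error", "Report Generation"),
}


def group_by_stage(constants):
    """Group error constant names by the pipeline stage that raised them.

    Constants missing from STAGE_ERRORS are collected under "Unknown" so
    that nothing is silently dropped.
    """
    grouped = defaultdict(list)
    for name in constants:
        error_type, stage = STAGE_ERRORS.get(name, ("Unknown", "Unknown"))
        grouped[stage].append((name, error_type))
    return dict(grouped)


observed = ["NATIVE_FILE_NOT_FOUND", "REPORT_GENERATION_ERROR", "NATIVE_PDF_ENCRYPTED_ERROR"]
for stage, errors in group_by_stage(observed).items():
    print(stage, errors)
```

Extending STAGE_ERRORS with the remaining rows of the table follows the same pattern.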