Errors and flags

PI Detect uses document flags to record information about a document. There are three types of document flags:

System Flags
TIQ Flags
User Flags

Many flag types are available by default. Lead users can also create new User Flag types in a project.

System flags

The application logic creates system flags during an Ingestion or Incorporate Feedback (IF) run, and users cannot remove them. System flags mark documents that the application had trouble processing. A system flag may prevent the document from being processed in certain pipeline stages.

System default flag types

The following table describes the default system flag types.

Flag name	Description
Exceeds Excel Size Limit	The Excel file exceeds the maximum file size supported by PI Detect and was not processed. The document will not have PI predictions and will not be available for review.
System Technical Issue	The file was affected by a technical issue that prevented the document from processing for review and reporting.
The document failed to convert to PDF	A non-spreadsheet native file was not converted to PDF and will not be viewable in the PDFTron viewer, but PI predictions are still made. The Document Report will contain information on the document.
File Extension is not currently supported	The file extension is not supported.
Exceeds Native File Size Limit	The file is too large.
Empty Text	The associated text file is empty.

TIQ flags

The application logic creates TIQ flags during an Ingestion or IF run, and users can remove them. They draw attention to issues encountered while processing a document that may require user action. Each flag is generated by a specific stage in the pipeline.

TIQ default flag types

Flag name	Stage	Description	Resolution
Document deduplication could not run on this document	Deduplication	Application runs into an error during document deduplication.	No immediate resolution is available for this error. If it impedes the output of the project, please contact Relativity Support.
Document text could not be extracted	Extraction	Application runs into an error extracting the raw text from a document.	No immediate resolution is available for this error. If it impedes the output of the project, please contact Relativity Support.
Document splitting could not be performed on this document	Document Splitting	Application runs into an error splitting the document into sub-documents. This may prevent PI detections from running on the document.	Review the document manually to make sure PI detections are complete. If it impedes the output of the project, please contact Relativity Support.
Excel header detection did not complete on this document	Excel Header Detection	Application runs into an error performing header detection. This may prevent PI predictions from completing on the document.	Manually review the document to identify personal information. If it impedes the output of the project, please contact Relativity Support.
Sheet contains conflicting name headers	Excel Header Detection	Multiple columns were tagged as the same name type in a single table. Documents with these results are blocked from the Entity Report. For example, a table cannot have two columns that are "Full Name." This can occur for Full Name, First Name, Middle Name, and Last Name fields.	Review the documents or use Spreadsheet QC to remove duplicate labels; for example, two columns labeled as "Full Name." A table can only contain a single entity per row, and the entity's name is generated from the labels. Multiple labels create conflicting information and will be excluded from the Entity Report. If all name fields need to be captured correctly, try to create multiple tables so there is only one instance of a label per table. If two tables cannot be created to capture the entity's information, remove the duplicate labels and upload another version of the document in PI Detect to capture the other entity's information. Alternatively, review the documents in Relativity.
Sheet contains conflicting PI Types within single column	Excel Header Detection	A single column has been tagged as multiple conflicting PI types.	Review the documents or use Spreadsheet QC to remove duplicate labels; for example, two columns labeled as "Address 1." A table can only contain a single instance of a label per row. Multiple labels create conflicting information and will be excluded from the Entity Report. If all PI fields need to be captured correctly, try to create multiple tables so there is only one instance of a label per table. If two tables cannot be created to capture the PI, remove the duplicate labels and upload another version of the document in PI Detect to capture the other PI. Alternatively, review the documents in Relativity.
Excel detection process not complete due to timeout	Excel Header Detection	Excel document takes too long to process.	Manually review the document and turn off any unnecessary detectors. If it impedes the output of the project, please contact Relativity Support.
Excel header assignment did not complete on this document.	Excel Header Detection	Application runs into an error applying header assignments.	Manually review the document and turn off any unnecessary detectors. If it impedes the output of the project, please contact Relativity Support.
Spreadsheet tag statistics could not be fully generated	Excel Tag Stats Processing	Error occurred while calculating spreadsheet statistics used for Spreadsheet QC experience.	If it impedes the output of the project, please contact Relativity Support.
Spreadsheet content extraction error	Excel Content Detection	An error occurred while extracting information from the spreadsheet to generate entity records. Individuals from this document may not be available in the Entity Report.	Manually review the document to ensure that the tables and columns are labeled correctly. If the issue persists, please contact Relativity Support.
Cell values were excluded from the Entity report	Excel Content Detection	Cell values were excluded from the Normalization process and the Entity Report because they failed to match the format of the assigned PI type. For example, a Social Security Number has 5 digits instead of 9.	Use the Spreadsheet Blocklisting tool to decide if some terms should be allowed. Rerun the normalization process to show these results in the Entity Report. If it impedes the output of the project, please contact Relativity Support.
Detector regex timed out	Keyword and Regex Detection	Keyword or regex takes too long to run.	Review documents manually and update the detector that is timing out. Turn off PI types that are not necessary for the project.
Detection process not complete due to timeout	Keyword and Regex Detection	Processing a document takes too long.	Change the detector regex to be less computationally expensive.
Contains errors from the NER Detector	Machine Learning Detection	Application runs into an error during Name Detection.	Manually review the impacted documents.
Error applying document classifications	Machine Learning Detection	Error occurred while applying document classifications that may impact the reported accuracy of predictions.	Manually review the impacted documents.
Personal Information counts could not be calculated	Document Statistics	An error occurred that may cause the PI and name counts to be inaccurately reported.	Manually review the impacted documents and rerun the Incorporate Feedback Pipeline. If it impedes the output of the project, please contact Relativity Support.
The document could not be scored	Document Scoring	Application runs into an error scoring the document.	Manually review the impacted documents and rerun the Incorporate Feedback Pipeline. If it impedes the output of the project, please contact Relativity Support.
Overlapped annotation removal could not complete	Overlap Removal	An error occurred while removing overlapping annotations from machine predictions.	Manually review the impacted documents and rerun the Incorporate Feedback Pipeline. If it impedes the output of the project, please contact Relativity Support.
PDF annotations could not be generated for document	PDF Annotation Generation	Application runs into an error trying to generate annotations on the PDF view of the document.	Contact Relativity Support.
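
The resolution for the timeout flags above suggests making a detector regex less computationally expensive. One common cause of slow regexes in backtracking engines is a nested quantifier. The following is a minimal sketch of that idea, written in Python purely for illustration: the patterns `SLOW` and `FAST` and the sample strings are hypothetical, they are not PI Detect detectors, and the regex engine used by PI Detect may behave differently.

```python
import re

# Hypothetical detector pattern: digit groups, each optionally followed by a dash.
# The nested quantifier (\d+-?)+ lets a backtracking engine split one run of
# digits in exponentially many ways when the overall match fails.
SLOW = re.compile(r"^(\d+-?)+$")

# Accepts the same strings, but each character can be consumed in only one way,
# so a failed match is rejected after roughly one pass over the input.
FAST = re.compile(r"^\d+(?:-\d+)*-?$")

# Short, made-up sample values (kept short so SLOW stays quick to evaluate).
samples = ["123-45-6789", "123456789", "12-", "12--3", "-123", "123-45-678x"]

for text in samples:
    # Confirm both patterns agree before swapping one for the other.
    assert bool(SLOW.match(text)) == bool(FAST.match(text)), text

matches = [text for text in samples if FAST.match(text)]
print(matches)
```

Checking that the rewritten pattern accepts and rejects the same sample values as the original is a reasonable precaution before updating a detector, since a cheaper pattern that changes what is detected would affect PI predictions.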

User flags

Users can create and delete User Flags. They help manage document sets, but the presence of User Flags has no bearing on application logic.
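
The differences between the three flag types can be summarized as a small data structure. This is only an illustrative sketch based on the descriptions in this topic: the names `FlagCategory`, `FlagRules`, `RULES`, and `removable_categories` are hypothetical and do not correspond to any PI Detect API.

```python
from dataclasses import dataclass
from enum import Enum


class FlagCategory(Enum):
    SYSTEM = "System"
    TIQ = "TIQ"
    USER = "User"


@dataclass(frozen=True)
class FlagRules:
    created_by: str       # "application" or "user"
    user_removable: bool  # whether a user can remove the flag


# Values taken from the description of each flag type in this topic.
RULES = {
    FlagCategory.SYSTEM: FlagRules(created_by="application", user_removable=False),
    FlagCategory.TIQ: FlagRules(created_by="application", user_removable=True),
    FlagCategory.USER: FlagRules(created_by="user", user_removable=True),
}


def removable_categories() -> list[str]:
    """Return the names of the flag types whose flags a user can remove."""
    return [cat.value for cat, rule in RULES.items() if rule.user_removable]


print(removable_categories())
```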

User default flag types

Flag name	Description	Intended usage
Needs Further Review	A user has identified a document as needing extra review by a project lead.	Project lead batches out these documents for review, or reviews one by one.
Technical Issue	A user has identified a document as having some technical issue that prevented them from completing their review.	Documents get reviewed outside of PI Detect.
Illegible	A user has identified a document as not readable and, therefore, not reviewable.	Leads should make sure that all unreviewed documents are flagged to stakeholders.
Duplicate	A user has identified a document as a duplicate and may not have completed the review of the document.	Leads should make sure all documents flagged as duplicates, and likely not reviewed, are brought to the attention of project stakeholders.
Bulk Document	A user has identified a document as a bulk document with lots of PI. It should be reviewed during an alternative review process.	Leads should decide how to handle bulk documents based on the project requirements and either batch out the bulk documents for review or review them outside of PI Detect.

Stage errors

Stage errors appear in the error message column in the Document or Non-Document based error tabs on the Incorporate Feedback page.

Constant name	Type	Readable error message	Stage	Item (for non-document errors)	Error flag	Description
METADATA_GENERATION_ERROR	Document Error	Document meta data generation error.	Document Generator		UNSUPPORTED_EXTENSION | EMPTY_TEXT | EXCEEDS_SIZE_LIMIT	Failed to generate document statistics about the document's PI.
KEYWORD_REGEX_ERROR	Document Error	Unable to process initial keyword/regexes.	PI Text Detection		DETECTION_PROCESS_TIMEOUT_FAILURE | JAVA_DETECTORS_FAILURE	An exception was caught during the process, which may cause PI to be missed.
PROCESS_SETUP_ERROR	Document Error	An error occurred during the process setup.	PI Text Detection		DETECTION_PROCESS_TIMEOUT_FAILURE | JAVA_DETECTORS_FAILURE	Before running the global keyword filter process, the application clears all caches and reloads detector regexes and keywords. This error is logged if something goes wrong during that setup.
DETECTION_GENERATION_ERROR	Document Error	Unable to generate detections.	Spreadsheet Regex Processing, Excel PI Contents Detector		EXCEL_ANNOTATION_GENERATION_FAILURE | EXCEL_HEADER_DETECTOR_FAILURE | SPREADSHEET_REGEX_FAILURE
DETECTION_PROCESS_TIMEOUT	Document Error	The detection process timed out.	PI Text Detection		DETECTION_PROCESS_TIMEOUT_FAILURE | JAVA_DETECTORS_FAILURE
DUPLICATE_DOC_CHECK_ERROR	Document Error	Unable to determine if the document is a duplicate.	Deduplication		DEDUPLICATION_FAILURE
DOC_HAS_DUPLICATES_CHECK_ERROR	Document Error	Unable to check the document for duplicates.	Deduplication		DEDUPLICATION_FAILURE
DAT_TEXT_PATH_ERROR	Document Error	DAT entry does not include a text path.	Minimal Extractor		EXTRACTION_FAILURE
DAT_TEXT_EXTRACTION_ERROR	Document Error	Unable to extract text from the DAT record.	Minimal Extractor		EXTRACTION_FAILURE
DOC_SIZE_LIMIT_EXCEEDED	Document Error	Document size exceeds the maximum processing limit.	Minimal Extractor		EXTRACTION_FAILURE
NATIVE_FILE_NOT_FOUND	Document Error	Native file not found.	PDF Conversion		PDF_CONVERSION_FAILURE
NATIVE_FILE_INACCESSIBLE	Document Error	The native file at the given path is inaccessible.	PDF Conversion		PDF_CONVERSION_FAILURE
NATIVE_TO_PDF_ERROR	Document Error	The native file is not available in the Viewer. Please use the extracted text.	PDF Conversion		PDF_CONVERSION_FAILURE
NATIVE_PDF_CORRUPT_ERROR	Document Error	The native file is corrupted and not available in the Viewer. Please use the extracted text.	PDF Conversion		PDF_CONVERSION_FAILURE
NATIVE_PDF_CORRUPT_OR_ENCRYPTED	Document Error	The native file is either corrupted or encrypted. Please check if the file can be opened before re-ingesting the data.	PDF Conversion		PDF_CONVERSION_FAILURE
NATIVE_PDF_ENCRYPTED_ERROR	Document Error	The file is encrypted. Please decrypt the file and re-ingest the data or use the extracted text.	PDF Conversion		PDF_CONVERSION_FAILURE
NATIVE_PDF_INVALID_FILE_ERROR	Document Error	The file is not in a valid PDF format. Please use the extracted text.	PDF Conversion		PDF_CONVERSION_FAILURE
NATIVE_PDF_CANNOT_CONVERT_FORMAT	Document Error	The format is not directly supported on this platform. Please manually convert and re-ingest the data.	PDF Conversion		PDF_CONVERSION_FAILURE
BLOCKLISTING_ERROR	Non-Document Error	Unable to process blocklisted terms.	Blocklisting	blacklist_id (from PiiBlacklistedData)		An interruption occurred during the blocklisting stage, and the term has failed to be blocklisted.
DETECTION_CLASSIFICATION_ERROR	Non-Document Error	Unable to classify the detections for the document.	Machine Learning Detection	classification_id (from PiiClassificationRows)		The process for verifying the PI and creating annotations was interrupted, so the detection entry will not show up.
DETECTION_MODEL_RUN_ERROR	Non-Document Error	Prediction models failed to run for document:	Machine Learning Detection	classification_id (from PiiClassificationRows)		An exception occurred when trying to load the prediction model. Please contact Relativity Support.
DETECTION_MODEL_LOAD_ERROR	Non-Document Error	Detection models failed to load.	Machine Learning Detection	classification_id (from PiiClassificationRows) | detectorType		An error occurred when trying to load the detector model. Please contact Relativity Support.
DETECTION_MODEL_TRAINING_ERROR	Non-Document Error	Detectors failed to train.	Detector Training	detectorType		An interruption occurred during detector training. A model cannot be generated for the detector.
APPLY_ANNOTATIONS_ERROR	Document Error	Unable to apply document-level annotations.	Machine Learning Detection		JAVA_DETECTORS_FAILURE
DOCUMENT_SCORER_ERROR	Document Error	Unable to score the document.	Document Scoring		DOCUMENT_SCORER_FAILURE
PII_DOCUMENT_NOT_FOUND	Document Error	Document not found in the database.	Validation, Overlap Removal		OVERLAP_DETECTOR_FAILURE
OVERLAP_REMOVAL_ERROR	Document Error	Overlapping annotations could not be removed.	Overlap Removal		OVERLAP_DETECTOR_FAILURE
DOCUMENT_STAT_ERROR	Document Error	Unable to update document statistics.	Document Statistics		DOCUMENT_STATS_FAILURE
DAT_ENTRY_NOT_FOUND	Document Error	DAT entry not found.	Validation
TEXT_NOT_FOUND	Document Error	Text file for the document could not be found.	Validation, PDF Conversion		PDF_CONVERSION_FAILURE
MISSING_EXCEL_NATIVE	Document Error	Native file for Excel document could not be found.	Validation
PDF_ANNOTATIONS_ERROR	Document Error	PDF coordinates were not added to annotations.	PDF Annotation Generation		PDF_ANNOTATION_FAILURE
REPORT_GENERATION_ERROR	Non-Document Error	Unable to generate reports.	Report Generation	Report type name		An interruption occurred during report generation. The specific report failed to generate.
EXCEL_HEADER_STATS_ERROR	Document Error	Unable to generate header statistics.	Excel Tag Stats Processing
EXCEL_HEADER_STAT_WRITER_ERROR	Document Error	Unable to write header statistics to the database.	Excel Tag Stats Processing
EXCEL_HEADER_ASSIGNMENTS_ERROR	Non-Document Error	Unable to process header assignments.	Excel Header Detection	headerValue from PiiExcelHeaderAssignment		An interruption occurred while inserting or deleting an annotation for a spreadsheet, so the annotation was not added or deleted.
EXCEL_STATS_UPDATE_ERROR	Document Error	Unable to update spreadsheet statistics.	Excel Header Detection		EXCEL_HEADER_ASSIGNMENT_FAILURE
PI_MAPPING_ERROR	Document Error	Unable to retrieve PI types and detector data.	Excel PI Contents Detector		EXCEL_ANNOTATION_GENERATION_FAILURE
NATIVE_PII_DOCUMENT_NOT_FOUND	Document Error	Unable to find the native file or document in the database.	Excel PI Contents Detector		EXCEL_ANNOTATION_GENERATION_FAILURE
FILE_READ_ERROR	Document Error	Failed to read the file.	Excel PI Contents Detector		EXCEL_ANNOTATION_GENERATION_FAILURE
UNSUPPORTED_FIL_EXTENSION	Document Error	Unsupported file extension provided.	Excel PI Contents Detector		EXCEL_ANNOTATION_GENERATION_FAILURE
EXCEL_SEARCH_TERMS_ERROR	Non-Document Error	Unable to process search terms.	Excel Header Detection	SearchTermToProcess		An interruption occurred when processing search term annotations on the spreadsheet.
EXCEL_SEARCH_TERM_STATS_ERROR	Document Error	Unable to update search term statistics.	Excel Header Detection
CHANGE_RELATIVITY_STATE_FAILURE	Non-Document Error	Failed to change Relativity state.	Report Generation	""		Unable to reset Relativity state for report generation.
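
Because one error flag can be associated with several stage errors, it can help to index the table by error flag when investigating a flagged document. The following is a minimal sketch of that lookup, assuming the rows have been copied by hand from the table above. The names `ROWS`, `split_flags`, and `errors_by_flag` are hypothetical; PI Detect does not expose this table through an API described here.

```python
from collections import defaultdict

# A few rows copied from the table above:
# (constant name, stage, contents of the Error flag column)
ROWS = [
    ("KEYWORD_REGEX_ERROR", "PI Text Detection",
     "DETECTION_PROCESS_TIMEOUT_FAILURE | JAVA_DETECTORS_FAILURE"),
    ("DETECTION_PROCESS_TIMEOUT", "PI Text Detection",
     "DETECTION_PROCESS_TIMEOUT_FAILURE | JAVA_DETECTORS_FAILURE"),
    ("DUPLICATE_DOC_CHECK_ERROR", "Deduplication", "DEDUPLICATION_FAILURE"),
    ("DOC_HAS_DUPLICATES_CHECK_ERROR", "Deduplication", "DEDUPLICATION_FAILURE"),
    ("DAT_ENTRY_NOT_FOUND", "Validation", ""),  # no error flag listed
]


def split_flags(cell: str) -> list[str]:
    """Split an Error flag cell on the pipe character, dropping empty parts."""
    return [part.strip() for part in cell.split("|") if part.strip()]


def errors_by_flag(rows) -> dict[str, list[str]]:
    """Map each error flag to the stage error constants that list it."""
    index = defaultdict(list)
    for constant, _stage, cell in rows:
        for flag in split_flags(cell):
            index[flag].append(constant)
    return dict(index)


index = errors_by_flag(ROWS)
print(index["DEDUPLICATION_FAILURE"])
```

Rows with an empty Error flag column, such as `DAT_ENTRY_NOT_FOUND`, simply do not appear in the index.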