Detector QC
Detector QC is the process by which leads adjust detectors for precision and recall before the review team begins their review. You can use the following suggested workflow to complete this task.
QCing batches containing a certain PI type
QC batches typically fall into one of two categories: batches containing PI and batches not containing PI. The following steps assume you have created Review Queues as outlined in Quality control.
To QC documents containing PI:
- Navigate to the Review Queues tab and locate the card with the same name as the queue you want.
- Click Start Review.
This opens the document viewer.
- Open the PI Detection panel on the left-hand side of the document viewer. This will show all PI predictions in the document.
- Review the detected PI. If a PI prediction is a false positive, we recommend taking note of the over-prediction in a way that makes sense to you, without logging the actual PI. You can refer to the note later when you adjust detectors or add to the blocklist.
- After determining the accuracy of predicted PI, review the rest of the document for additional PI that should be captured.
If you see missed PI, take note of the under-prediction in your records, as you will likely want to update a detector to include it going forward.
- When you are satisfied reviewing the document, click Save and Next.
Do not lock the document, as locking it prevents detector changes from being applied the next time DA is run.
- Continue through the documents in your batch. Look out for the document ID you started with. When you see it again, you have finished the batch.
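The notes taken during these steps can be tallied to estimate precision and recall per PI type. The following is a minimal sketch, not a product feature: the `QCNote` record, its field names, and the outcome labels `"correct"`, `"over"`, and `"under"` are all hypothetical, chosen only for illustration. Each note records the document and PI type but never the PI value itself.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class QCNote:
    """One QC observation (hypothetical structure). Holds no PI values."""
    document_id: str
    pi_type: str   # e.g. "SSN" or "DOB"
    outcome: str   # "correct", "over" (false positive), or "under" (missed PI)


def precision_recall(notes, pi_type):
    """Return (precision, recall) for one PI type; None where undefined."""
    counts = Counter(n.outcome for n in notes if n.pi_type == pi_type)
    tp, fp, fn = counts["correct"], counts["over"], counts["under"]
    precision = tp / (tp + fp) if tp + fp else None
    recall = tp / (tp + fn) if tp + fn else None
    return precision, recall


# Example notes from a small batch (document IDs are made up).
notes = [
    QCNote("DOC-001", "SSN", "correct"),
    QCNote("DOC-001", "SSN", "over"),
    QCNote("DOC-002", "SSN", "correct"),
    QCNote("DOC-002", "SSN", "correct"),
    QCNote("DOC-003", "SSN", "under"),
]

print(precision_recall(notes, "SSN"))  # (0.75, 0.75)
```

A low precision for a PI type suggests tightening the detector or adding to the blocklist, while a low recall suggests broadening the detector.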
QCing batches containing no PI
If you are QCing a No PI batch, review the document to get an idea of what type of document you are handling. Some types of documents, such as training manuals, advertisements, business receipts, or academic papers, are less likely to contain PI. Emails, internal reports, and customer-related spreadsheets are more likely to contain PI.
You can use the Search feature to look up common phrases that typically appear near PI. Searching for phrases such as “SSN,” “DOB,” “cell,” and “mobile” can help you find potential under-predictions.
If you see PI that should be captured but isn't, take note of the under-prediction so that you can update the appropriate detector.
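To illustrate the idea behind the phrase search, here is a rough sketch of scanning exported document text for the same cue phrases. It does not use the product's Search feature or any product API; the `find_pi_cues` function and the `PI_CUES` list are hypothetical names. It returns only the cue and its position, so no surrounding PI is copied into your notes.

```python
import re

# Cue phrases taken from the examples above; extend the list as needed.
PI_CUES = ["SSN", "DOB", "cell", "mobile"]

# Match whole words only, ignoring case, so "excellent" does not match "cell".
CUE_PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(cue) for cue in PI_CUES) + r")\b",
    re.IGNORECASE,
)


def find_pi_cues(text):
    """Return a list of (cue, offset) pairs for each cue phrase in the text."""
    return [(m.group(1), m.start()) for m in CUE_PATTERN.finditer(text)]


sample = "Please send your DOB and a mobile number to HR."
for cue, offset in find_pi_cues(sample):
    print(f"Found cue {cue!r} at offset {offset}")
```

Each hit marks a spot worth checking by hand. A cue phrase alone does not confirm that PI is present.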