Prioritize documents for review

Once Populate Privilege Results is complete, aiR for Privilege populates various Results fields to help you filter and prioritize categories of documents for review.

When prioritizing documents for aiR for Privilege, it is important to understand the concepts of Precision and Recall:

  • Precision – Measures the accuracy of the model's positive predictions; that is, the percentage of documents in the predicted privileged population that are truly privileged.
  • Recall – Measures the model's ability to identify all positive results; that is, the percentage of truly privileged documents that the model predicted as Privileged.
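The two definitions above can be illustrated with a small worked example. The document labels below are made up for illustration; this is not product code.

```python
# Each pair is (predicted_privileged, truly_privileged) for one document.
docs = [
    (True, True), (True, True), (True, False), (True, False),  # 4 predicted Privileged
    (False, False), (False, False), (False, True),             # 3 predicted Not Privileged
]

true_positives = sum(1 for pred, truth in docs if pred and truth)
predicted_positives = sum(1 for pred, _ in docs if pred)
actual_positives = sum(1 for _, truth in docs if truth)

precision = true_positives / predicted_positives  # share of positive predictions that are correct
recall = true_positives / actual_positives        # share of truly privileged docs that were found

print(f"Precision: {precision:.2f}")  # 2 of 4 predictions correct -> 0.50
print(f"Recall: {recall:.2f}")        # 2 of 3 privileged docs found -> 0.67
```

In this toy set, one truly privileged document was missed, so recall is below 1.0 even though half of the positive predictions were correct.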

aiR for Privilege was designed to optimize for Recall to ensure all privileged documents in a dataset are captured. This provides confidence that, if all annotations are completed correctly, all truly privileged documents are predicted as Privileged. However, because aiR for Privilege optimizes for Recall, over-predictions are expected, and the documents predicted as Privileged will require review to validate that they are truly privileged.

Note: aiR for Privilege analyzes the whole email and all segments. It also considers which segment the privileged content is in.

  • If the bottom segment contains privileged content and the segments above it include communicating privilege breakers, the whole document will be Not Privileged.
  • If the bottom segment contains a communicating privilege breaker, but that individual drops off the conversation and the top segment contains privileged content, the whole document will be Privileged.
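The two scenarios above follow one underlying pattern: privilege is broken when a communicating privilege breaker appears in a segment at or above the one containing privileged content, because that person receives the content. The sketch below illustrates that pattern under this assumption; the function and field names are hypothetical, not aiR for Privilege internals.

```python
def document_is_privileged(segments):
    """segments: list ordered bottom (earliest) to top (most recent).
    Each segment: {"privileged_content": bool, "has_breaker": bool}."""
    seen_privileged_content = False
    for seg in segments:  # walk from the earliest segment upward
        if seg["privileged_content"]:
            seen_privileged_content = True
        # A communicating privilege breaker at or above the privileged
        # content receives it, which breaks privilege for the document.
        if seen_privileged_content and seg["has_breaker"]:
            return False
    return seen_privileged_content

# Bottom segment privileged, breaker added on top -> Not Privileged
print(document_is_privileged([
    {"privileged_content": True, "has_breaker": False},
    {"privileged_content": False, "has_breaker": True},
]))  # False

# Breaker only on the bottom segment, privileged content added after
# that individual drops off -> Privileged
print(document_is_privileged([
    {"privileged_content": False, "has_breaker": True},
    {"privileged_content": True, "has_breaker": False},
]))  # True
```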

To ensure an efficient review, this documentation provides a series of strategies for using the various Results fields to validate aiR for Privilege results.

Predicted Privileged documents

Because aiR for Privilege optimizes for Recall, over-predictions are expected. To help ensure efficient review of documents predicted as Privileged, aiR for Privilege populates the Priv::Category field to help drive QC of results.

Certain categories are designed to help indicate the likelihood of presence of Privileged documents and where to prioritize review time.

High Precision
  • Categories: Wholly Privileged, Privilege Redaction, Attorney Involvement, Related to Legal Matter
  • Explanation: aiR for Privilege has high confidence that documents in these categories contain privileged content.
  • Suggested Review: Validate the Rationale and Considerations and make any necessary updates to the Draft Log Description.

Medium Precision
  • Categories: Borderline
  • Explanation: aiR for Privilege places documents in this category when it was uncertain, without outside context, whether privileged content is present.
  • Suggested Review: Approach these documents with the highest level of scrutiny, using the Rationale, Considerations, and Citations for each document to assign the proper privilege coding. If privileged, use the Draft Log Description to create a privilege log description.

Low Precision
  • Categories: Privileged Individual / No Privileged Content, Legalese / No Privileged Content
  • Explanation: Documents in these categories would normally be false positives captured by privilege screens but, according to the models and algorithms, usually do not contain privileged information.
  • Suggested Review: Validate the Rationale, Considerations, and Citations to ensure there is no privileged content.

Predicted Not Privileged documents

aiR for Privilege has an estimated recall of >99%. Therefore, it is expected that all privileged documents in the dataset are predicted as Privileged.

However, underpredictions may occur for various reasons, and it is important to validate the documents predicted as Not Privileged. To conduct a check of these documents:

  1. Filter for documents where Priv::Prediction = Not Privileged.
  2. Create a sample of documents.
  3. Review the sample to validate that the Rationale and Considerations for documents predicted as Not Privileged are correct.
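The filter-and-sample steps above can be sketched as follows. This is illustrative only; in Relativity you would use a saved search and the sampling feature rather than code, and the document IDs and prediction values below are made up.

```python
import random

# Hypothetical documents with their Priv::Prediction values
predictions = {f"DOC-{i:04d}": ("Not Privileged" if i % 3 else "Privileged")
               for i in range(1, 101)}

# Step 1: filter for Priv::Prediction = Not Privileged
not_privileged = [doc for doc, pred in predictions.items()
                  if pred == "Not Privileged"]

# Step 2: draw a random sample of documents for QC review
random.seed(42)  # fixed seed so the sample is reproducible
sample = random.sample(not_privileged, k=10)

print(len(not_privileged))  # documents available for sampling
print(sample[:3])           # first few sampled IDs
```

A random sample gives an unbiased check of the Not Privileged population; reviewing only documents that look suspicious would understate the underprediction rate.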

If an underprediction is identified, a privilege-conferring attorney may have been accidentally annotated as Privilege Neutral or Privilege Breaking, or one of the attorney's aliases may not have been properly merged with the Entity. To validate the Entity presence, add the following fields to the View and confirm that the Entities on any underpredicted documents were correctly annotated:

  • Priv Sender/Recipient Entities (all email segments)
  • Priv Sender/Recipient Entities (all email segments)::Annotator Decision
  • Priv Sender/Recipient Entities (all email segments)::Final Privilege Status

If the entity was annotated incorrectly, use the Priv Sender/Recipient Entities (all email segments) chain to identify other documents that may have been impacted.
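Identifying other impacted documents amounts to finding every document whose Sender/Recipient entities include the misannotated entity. A minimal sketch, with made-up entity and document names:

```python
# Hypothetical mapping of documents to their Sender/Recipient entities
doc_entities = {
    "DOC-0001": {"J. Smith (Counsel)", "A. Jones"},
    "DOC-0002": {"A. Jones", "B. Lee"},
    "DOC-0003": {"J. Smith (Counsel)", "C. Wu"},
}

def impacted_documents(bad_entity, doc_entities):
    """Return documents whose entities include the misannotated entity."""
    return sorted(doc for doc, ents in doc_entities.items()
                  if bad_entity in ents)

print(impacted_documents("J. Smith (Counsel)", doc_entities))
# ['DOC-0001', 'DOC-0003']
```

Every document returned by such a check should be re-evaluated after the entity's annotation is corrected, since its prediction may change.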