Iterating on the Prompt Criteria
After you run an aiR for Review job for the first time, you can use the initial results as feedback for improving the Prompt Criteria. The cycle of examining the results, fine-tuning the Prompt Criteria, then running a new job on the sample documents is known as iterating on the Prompt Criteria.
We recommend the following workflow:
1. For your first analysis, run the Prompt Criteria on a saved search of 50 test documents that are a mix of relevant and not relevant.
2. Compare the results to human coding. In particular, look for documents that aiR coded differently than the humans did and investigate possible reasons. These could include unclear instructions, an undefined acronym or code word, or other blind spots in the Prompt Criteria.
3. Tweak the Prompt Criteria to adjust for blind spots.
4. Repeat steps 1 through 3 until aiR predicts coding decisions accurately for the test documents.
5. Test the Prompt Criteria on a sample of 50 more documents and compare results. Continue tweaking and adding documents until you are satisfied with the results for a diverse range of documents.
6. Finally, run the Prompt Criteria on a larger set of documents.
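As a rough illustration of step 2, the comparison between aiR's predictions and human coding can be sketched in Python. This is not part of aiR for Review or any Relativity API. The function name, the label strings, and the control numbers are hypothetical, and the sketch assumes both sets of decisions have already been exported into plain dictionaries.

```python
def find_disagreements(human_coding, air_predictions):
    """Return control numbers where aiR and the human reviewer disagree.

    human_coding: dict of control number -> True (relevant) / False (not relevant)
    air_predictions: dict of control number -> "relevant" / "borderline" / "not_relevant"
    The dict layout and the label strings are assumptions made for this sketch.
    """
    disagreements = []
    for control_number, human_relevant in sorted(human_coding.items()):
        prediction = air_predictions.get(control_number)
        if prediction is None or prediction == "borderline":
            # No prediction, or a borderline call: treated as neither
            # agreement nor disagreement in this sketch.
            continue
        if (prediction == "relevant") != human_relevant:
            disagreements.append(control_number)
    return disagreements


human = {"DOC-001": True, "DOC-002": False, "DOC-003": True}
air = {"DOC-001": "relevant", "DOC-002": "relevant", "DOC-003": "not_relevant"}
print(find_disagreements(human, air))  # ['DOC-002', 'DOC-003']
```

The returned control numbers are the documents worth opening first when you look for blind spots in the Prompt Criteria.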
See these related pages:
- aiR for Review
- Creating an aiR for Review project
- Managing aiR for Review jobs
- aiR for Review results
- aiR for Review security permissions
- aiR for Review Analysis
Navigating the aiR for Review dashboard
When you select a project from the aiR for Review Projects tab, a dashboard appears that shows the project's Prompt Criteria, the list of documents, and controls for editing the project. If the project has been run, it also displays the results.
Project details strip
At the top of the dashboard, the project details strip displays:
- Project name
- Version selector: if this is the first version of the Prompt Criteria, this will say Version 1 as a text-only field. For later versions, the version number becomes clickable, and you can choose older versions to view their statistics. For more information, see How Prompt Criteria versioning works.
- Data source: name of the saved search chosen at project creation, as well as the document count.
  - If you add or remove documents from the saved search, those changes are not reflected in the aiR for Review project until you refresh the data source.
  - To refresh the data source, click the refresh symbol beside the name.
- Run button: press this to analyze the selected documents using the current version of the Prompt Criteria. If no documents are selected or filtered, it will analyze all documents in the data source.
  - If you are viewing the newest version of the Prompt Criteria and no job is currently running, this says Analyze [X] documents.
  - If an analysis job is currently running or queued, this button becomes disabled and a Cancel option appears.
  - If you are viewing older versions of the Prompt Criteria, this button becomes disabled.
Prompt Criteria panel
On the left side of the dashboard, the Prompt Criteria panel displays tabs that match the project type you chose when creating the project. These tabs contain fields for writing the criteria you want aiR for Review to use when analyzing the documents.
Possible tabs include:
- Case Summary: appears for all analysis types.
- Relevance: appears for Relevance and Relevance and Key Documents analysis types.
- Key Documents: appears for the Relevance and Key Documents analysis type.
- Issues: appears for the Issues analysis type.
For information on filling out the Prompt Criteria tabs, see Step 2: Writing the Prompt Criteria.
Aspect selector bar
For projects that use Issues analysis or Relevance and Key Documents analysis, an aspect selector appears in the upper middle section of the dashboard. This lets you choose which metrics, citations, and other results to view for the analysis results.
For a Relevance and Key Documents analysis, two options appear: one for the field you selected as the Relevant Choice, and one for the field you selected as the Key Document Choice. For Issues analysis, an option appears for every Issues field choice that has choice criteria.
When you select one of the aspects in the bar, the project metrics section and Analysis Results both update to show results related to that aspect. For example, if you choose the key document field, the project metrics section shows how many documents have been coded as key. The Analysis Results grid updates to show predictions, rationales, citations, and all other fields related to whether the document is key. If you choose an issue from the aspect selector, the metrics section and results grid both update to show results related to that specific issue.
Project metrics section
In the middle section of the dashboard, the project metrics section shows the results of previous analysis jobs. There are two tabs: one for the current version's most recent results, and one for a list of historical results for all previous versions.
Version Metrics tab
The Version [X] Metrics tab shows these metrics:
- Reviewer Uncoded Docs (Relevance or Relevance and Key Documents analysis only): documents that do not have a value assigned in the Relevance Field. When viewing the Key aspect, this shows documents that do not have a value assigned in the Key Field.
- Reviewer Coded Issues (Issues analysis only): total number of documents that reviewers coded as having the selected issue.
- Docs w/aiR Predictions: documents in the data source that have an aiR prediction attached from this Prompt Criteria version.
- Docs w/o aiR Predictions: documents in the data source that do not have an aiR prediction attached from this Prompt Criteria version.
- Errored Docs: documents that received an error code during analysis. For more information, see How document errors are handled.
- aiR Not Relevant: documents that aiR predicted not relevant to the current aspect.
- aiR Borderline: documents that aiR predicted as bordering between relevant and not relevant to the current aspect.
- aiR Relevant: documents that aiR predicted relevant or very relevant to the current aspect.
- Total Conflicts: total number of documents that have a different coding decision from aiR's predicted result. This is the sum of the Relevant Conflicts and Not Relevant Conflicts fields.
- Relevant Conflicts: documents that aiR predicted as relevant or very relevant to the current aspect, but the coding decision in the related field says something else.
- Not Relevant Conflicts: documents that aiR predicted as not relevant to the current aspect, but the coding decision in the related field says relevant.
Depending on which type of results you view, the metrics base their counts on different fields:
- When viewing Relevance results, the relevance-related metrics base their counts on the Relevance Field.
- When viewing Key Document results, the relevance-related metrics base their counts on the Key Document Field.
- When viewing results for Issues analysis, the relevance-related metrics base their counts on whether documents were marked for the selected issue.
For example, if you view results for an issue called Fraud, the aiR Relevant metric will show documents that aiR predicted as relating to Fraud. If you view Key Document results, the aiR Relevant metric will show documents that aiR predicted as being key.
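To make the relationship between these metrics concrete, here is a minimal Python sketch that tallies them for a single aspect. It is only an illustration of the definitions above, not how aiR for Review computes them. The function name, dictionary keys, and label strings are hypothetical, and the sketch assumes that a document with no reviewer coding is not counted as a conflict.

```python
from collections import Counter

AIR_RELEVANT = {"relevant", "very_relevant"}  # hypothetical label strings


def version_metrics(docs):
    """Tally metrics for one aspect.

    docs: list of dicts with 'human' (True, False, or None if uncoded)
    and 'air' (a label string, or None if there is no prediction).
    """
    m = Counter()
    for doc in docs:
        human, air = doc["human"], doc["air"]
        if human is None:
            m["reviewer_uncoded"] += 1
        if air is None:
            m["docs_without_predictions"] += 1
            continue
        m["docs_with_predictions"] += 1
        if air in AIR_RELEVANT:
            m["air_relevant"] += 1
            # Assumption: an uncoded document is not counted as a conflict.
            if human is False:
                m["relevant_conflicts"] += 1
        elif air == "borderline":
            m["air_borderline"] += 1
        elif air == "not_relevant":
            m["air_not_relevant"] += 1
            if human is True:
                m["not_relevant_conflicts"] += 1
    # Total Conflicts is the sum of the two conflict counts.
    m["total_conflicts"] = m["relevant_conflicts"] + m["not_relevant_conflicts"]
    return m


sample = [
    {"human": True, "air": "relevant"},        # agreement
    {"human": False, "air": "very_relevant"},  # relevant conflict
    {"human": True, "air": "not_relevant"},    # not relevant conflict
    {"human": None, "air": "borderline"},      # uncoded by reviewer
    {"human": False, "air": None},             # no aiR prediction yet
]
metrics = version_metrics(sample)
print(metrics["total_conflicts"])  # 2
```

For Key Document or Issues results, the same tally would apply with `human` read from the Key Document Field or from the selected issue's coding instead of the Relevance Field.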
Filtering the Analysis Results using metrics
To filter the Analysis Results table below, click on any of the metrics. This narrows the results shown in the table to only documents that are part of the metric. It also auto-selects those documents for the next analysis job. This makes it easier to analyze a subset of the document set, instead of selecting all documents every time.
To remove filtering, click Clear selection underneath the Run button.
History tab
The History tab shows results for all previous versions of the Prompt Criteria. This table includes all fields from the Version Metrics tab, sorted into rows by version.
For a list of all Version Metrics fields and their definitions, see Version Metrics tab.
Analysis Results section
In the lower middle section of the dashboard, the Analysis Results section shows a list of all documents in the project. If the documents have aiR for Review analysis results, those results appear beside them in the grid.
The fields that appear in the grid vary depending on what type of analysis was chosen. For a list of all results fields and their definitions, see aiR for Review results.
Note: aiR's predictions do not overwrite the Relevance, Key, or Issues fields chosen during Prompt Criteria setup. Instead, the predictions are held in other fields. This makes it easier to distinguish between human coding choices and aiR's predictions.
To view inline highlighting and citations for an individual document, click on the Control Number. This opens the Viewer and shows results for the selected Prompt Criteria version. For more information on using aiR for Review in the Viewer, see aiR for Review Analysis.
Filtering and selecting documents
Checking the box beside individual documents in the Analysis Results grid manually selects those documents for the next analysis run. The number of checked documents is reflected in the Run button's text.
To un-check all documents, click Clear selection underneath the Run button. This resets the selections, and the next analysis will run on all documents in the data source.
You can also filter the Analysis Results grid by clicking the metrics in the Version Metrics section. For more information, see Version Metrics tab.
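The selection behavior described in this section can be summarized in a small Python sketch. The function names are hypothetical and nothing here calls a real Relativity API. It also assumes that, with nothing selected, the button text shows the full data source count.

```python
def documents_to_analyze(data_source, selected):
    """Return the documents the next run would cover.

    If nothing is checked or filtered, the run falls back to every
    document in the data source.
    """
    return list(selected) if selected else list(data_source)


def run_button_label(data_source, selected):
    """Build the button text in the 'Analyze [X] documents' pattern."""
    count = len(documents_to_analyze(data_source, selected))
    return f"Analyze {count} documents"


data_source = ["DOC-001", "DOC-002", "DOC-003"]
print(run_button_label(data_source, ["DOC-002"]))  # Analyze 1 documents
print(run_button_label(data_source, []))           # Analyze 3 documents
```

Clearing the selection corresponds to passing an empty list, which is why the next run covers the whole data source again.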
How Prompt Criteria versioning works
Each aiR for Review project comes with automatic versioning controls, so that you can compare results from running different versions of the Prompt Criteria. Each analysis job that uses a unique set of Prompt Criteria counts as a new version.
When you run aiR for Review analysis for the first time, the Prompt Criteria you use are saved under the name Version 1. This is the initial version of the Prompt Criteria.
After that, if you edit the Prompt Criteria and save your changes, these changes are saved under Version 2. Version 2 is not finalized until you run the analysis, so you can edit the Prompt Criteria as many times as you like. When you have finished editing and are ready to see results, run the analysis again. This finalizes Version 2. Later edits are saved as Version 3 until you run the analysis the third time, then as Version 4 until you run the analysis the fourth time, and so on.
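The versioning rule above can be modeled with a short Python sketch. This is a simplified mental model rather than the product's implementation. The class and method names are hypothetical, and it assumes that two versions are considered identical when their criteria text matches exactly.

```python
class PromptCriteriaHistory:
    """Toy model: edits stay in a draft until an analysis run finalizes them."""

    def __init__(self, text):
        self.finalized = []  # criteria text per finalized version; index 0 is Version 1
        self.draft = text    # the criteria currently shown in the editor

    @property
    def current_version(self):
        """Version number the draft is, or will be, saved under."""
        if self.finalized and self.draft == self.finalized[-1]:
            return len(self.finalized)
        return len(self.finalized) + 1

    def edit(self, text):
        # Repeated edits keep overwriting the same unfinalized version.
        self.draft = text

    def run_analysis(self):
        # Only a run with changed criteria creates a new finalized version.
        if not self.finalized or self.draft != self.finalized[-1]:
            self.finalized.append(self.draft)
        return len(self.finalized)


history = PromptCriteriaHistory("initial criteria")
print(history.run_analysis())   # 1
history.edit("define acronym XYZ")
history.edit("define acronym XYZ and key people")
print(history.current_version)  # 2
print(history.run_analysis())   # 2
print(history.run_analysis())   # 2
```

Running the analysis twice without editing in between leaves the version number unchanged, which matches the rule that only a unique set of Prompt Criteria counts as a new version.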
To see dashboard results from a previous version, click the arrow next to the version name in the project details strip. From there, select the version you want to see.
How version controls affect the Viewer
When you select a Prompt Criteria version from the dashboard, this also changes the version results you see when you click on individual documents from the dashboard. For example, if you are viewing results from Version 2, clicking on the Control Number for a document brings you to the Viewer with the results and citations from Version 2. If you select Version 1 on the dashboard, clicking the Control Number for that document brings you to the Viewer with results and citations from Version 1.
When you access the Viewer from other parts of Relativity, it defaults to showing the aiR for Review results from the most recent version of the Prompt Criteria. However, you can change which results appear by using the linking controls on the aiR for Review Jobs tab. For more information, see Managing aiR for Review jobs.
Revising the Prompt Criteria
After you run the analysis for the first time on a sample set, use the dashboard to examine the results and refine the Prompt Criteria.
In particular, ask the following questions about each document:
- Did aiR for Review and the human reviewer agree on the relevance of the document?
- Read the aiR for Review rationale and considerations. Do they make sense?
- Do the citations make sense?
For all of these, if you see something incorrect, make notes on where aiR seems to be confused. Here are the most common sources of confusion:
- Insufficient context. For example, an internal acronym, key person, or code word may not have been defined. To fix this, add it to the proper section of the Case Summary tab.
- Ambiguous instructions or unclear language. To fix this, edit the instructions on the Relevance, Key Documents, or Issues tabs.
In general, consider how you would help a human reviewer making the same mistakes. For example, if aiR for Review is having trouble identifying a specific issue, try explaining the criteria for that issue with simpler language.
After you have revised the Prompt Criteria to address any weak points, run the analysis again. Continue refining the Prompt Criteria until aiR accurately predicts the human coding decisions for all test documents in the sample.
Note: aiR for Review only looks at the extracted text of each document. If a human reviewer marked a document as relevant because of an attachment or other criteria beyond the extracted text, aiR for Review will not be able to match that relevance decision.
Increasing the job size
When aiR for Review accurately matches human coding decisions on the initial sample documents, increase the sample size. Typically, we recommend starting with an initial sample of about 50 documents, then increasing it to include another 50. However, you may find a different number works better for your project.
To increase the aiR for Review job size:
1. Add the fresh documents to the saved search that acts as the project's data source. For more information about saved searches, see Creating or editing a saved search.
2. Have a skilled human reviewer review the fresh documents. We recommend doing this before running aiR for Review, so that the reviewer is not biased by aiR's predictions.
3. On the aiR for Review Projects tab, select the project.
4. At the top of the project dashboard, click the refresh symbol next to the data source's name.
5. In the Project Metrics section, click Docs w/o aiR Predictions. This selects the new documents.
6. After the document count has updated, click Analyze [X] documents.
The analysis job runs on the new documents, while the previously run documents keep their old results.
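The effect of these steps can be sketched in Python: only documents without a prediction are analyzed, and earlier results are carried over untouched. The function name and the `analyze` callable are hypothetical stand-ins for an analysis job, not a Relativity API.

```python
def run_on_new_documents(data_source, results, analyze):
    """Analyze only documents that lack a prediction; keep earlier results.

    data_source: list of control numbers after the data source refresh
    results: dict of control number -> earlier prediction
    analyze: callable standing in for an analysis job (hypothetical)
    """
    new_docs = [doc for doc in data_source if doc not in results]
    updated = dict(results)  # earlier results are copied over unchanged
    for doc in new_docs:
        updated[doc] = analyze(doc)
    return new_docs, updated


earlier = {"DOC-001": "relevant"}
new_docs, updated = run_on_new_documents(
    ["DOC-001", "DOC-002"], earlier, lambda doc: "not_relevant"
)
print(new_docs)  # ['DOC-002']
print(updated)   # {'DOC-001': 'relevant', 'DOC-002': 'not_relevant'}
```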
After you have run aiR for Review on the larger sample, continue revising the Prompt Criteria until aiR returns satisfactory results. Continue to increase the job size incrementally until you feel satisfied with the Prompt Criteria. After that, use the refined Prompt Criteria on the larger set of documents. You can do this either from the dashboard, or as a mass operation.
For more information, see the following articles on the Community site:
- Selecting a Prompt Criteria Iteration Sample for aiR for Review
- Evaluating aiR for Review Prompt Criteria Performance
Running aiR for Review as a mass operation
If you want to run previously refined Prompt Criteria on a set of documents, you have the option of running aiR for Review as a mass operation from the document list page.
To run aiR for Review as a mass operation:
1. From the Documents tab, select the documents you want to analyze.
2. Under Mass Actions, select aiR for Review. An options modal appears.
3. Under Prompt Criteria, select one of the following:
  - Select Existing: load a set of previously created Prompt Criteria from your workspace. This only shows Prompt Criteria that have been run at least once, and it selects the most recent version of them.
  - Create New: this closes the modal and redirects you to the aiR for Review Projects tab. At this time, it does not save your selected documents.
4. After you have loaded a set of Prompt Criteria, click Start Analysis.
A banner appears at the top of the page, confirming that the analysis job has been queued. This banner also updates to show when the job is complete.
To view and manage jobs that are not part of an existing project, use the aiR for Review Jobs tab. For more information, see Managing aiR for Review jobs.