Project setup

A Sample-Based Learning project is an object in which you define the document set you plan to use as the basis for your predictive coding job, as well as the coding fields you want Relativity to reference so that it can differentiate between responsive and non-responsive documents in that set. Once you create a project, you run rounds on it to further determine how documents should be coded.

The page contains the following sections:

  • Creating a Sample-Based Learning project
  • Fields
  • Error scenarios when saving or editing a project
  • Console on the saved project layout
  • Project Home
  • Viewing categorized and uncategorized document saved searches
  • Viewing round sample set document saved searches
  • Viewing a round's overturned documents (and seeds causing the overturn)
  • Viewing a RAR project's categorized documents in the Field tree
  • Modifying issue importance
  • Deleting a project

Creating a Sample-Based Learning project

To create a Sample-Based Learning project:

  1. Click the Sample-Based Learning tab and then click the Projects tab.
  2. Click New Assisted Review Project.
  3. Complete all required fields on the project settings page. See Fields.
  4. Click Save. This stores the new project.
  5. When you create a new project, Sample-Based Learning creates a saved search folder for the project containing the saved search you specified in the project console for the Documents to be Categorized field.

      Notes:
    • When you create your first round and generate a sample set, a sample sets folder is created under the project saved search folder. This folder contains a saved search for the sample set of documents generated for each round you create. Sample-Based Learning uses the sample set saved search as the data source when automatically creating batch sets for reviewers. See Rounds for more information.
    • The default category searches (e.g., Categorized, Not Responsive, Responsive, and Uncategorized) are automatically created in the project-specific saved search folder when you finish your first training round and categorize documents for the first time. After categorization, the documents in each category search have, by default, a minimum coherence score of 0.7 in relation to the example documents that categorized them. See Viewing categorized and uncategorized document saved searches for more information.

Fields

The project settings layout contains the following fields:

  • Project name - the name of the project. Make this unique to differentiate it from other projects in the workspace.
  • Project prefix - the group of characters used to identify the project in various reports and layouts, for example, P01. This is automatically populated when you create the project, but it is editable. You can't use some special characters, such as / and \, in the project name or prefix, or you will receive an error.
  • Project status - an automatically populated read-only display of the project's current state. Status values and their meanings are covered later on this page.
  • Project description - a text box for entering a description for the project.
  • Analytics index - the index used to categorize documents during the project. Click the ellipsis button to display a list of all Analytics indexes. If there are no active indexes in the workspace, or if you want to create an index specifically for this project, you must navigate to the Search Indexes tab and create an index before beginning the project. This index does not need to be active when you add it to the project; however, it must be active when you finish a round, or you will receive an error.
  • Documents to be categorized - the saved search containing all the documents included when categorizing the project.
  • Note: The saved search you use should exclude documents that cannot be categorized.

  • Designation field - the field used to code each document. Click the ellipsis button to display a popup picker of available single choice fields. If the desired field is not available, you must create it in the Fields tab. This value is typically a standard designation or responsiveness field. If you have multiple projects in the same workspace, you must use a different Designation field for each project. This single choice field must contain exactly two choices; more than two choices results in an error when you attempt to save the project. Once you save the project, you can't add, delete, or rename a choice for the Designation field. This ensures that all statistics, graphs, and reports are calculated and rendered correctly. If you delete the project, the object rule that prevents you from adding or removing a choice from the Designation field is also deleted.
  • Positive choice for designation - select the positive choice radio button for the Designation single choice field that you selected above. The positive choice for this field refers to the set of documents you are trying to find (e.g., the responsive or relevant documents). Sample-Based Learning uses this selection when calculating precision and recall for the control set. This field is required and populated with two choices after you select a Designation field above. You can't edit this field after you save the project. For more information on how this field correlates to precision and recall, see precision and recall considerations.
  • Designation excerpt field - used for applying text excerpts from documents to the coding layout during manual review for designation. Using excerpts while manually reviewing enhances the training of the system because the relevant part of the document is included as an example. The field you select here is available in the viewer's context menu for reviewers' coding. Click the ellipsis button to display a list of available long text fields.
  • Use as an example field - used to indicate which documents are good examples for training the system. This field is set to yes by default on all documents in the sample set, except for control set rounds, to indicate that those documents are examples. The reviewer should de-select this field on a document if the document is not a good example. Doing this prevents poor examples from being included in the Analytics examples. You can't use an example field across projects.
  • Note: For more information on adding example documents, see Best practices for adding example documents.

  • Key issue field - the field reviewers use to code documents with a single issue. Click the ellipsis button to display a list of available single choice fields. You can't edit this field once you save the project. A second categorization set is created and run for this issue field once you save the project. If you don't add an issue field when you first create the project, you can add one later, but only between rounds. You can use the same Key issue field across multiple projects. See Modifying issue importance for more information on how to weight issue importance.
  • Key issue excerpt field - used for applying text excerpts from documents to the coding layout during manual review for issues. Using excerpts while manually reviewing enhances the training of the system because the relevant part of the document is included as an example. The field you select here is available in the viewer's context menu for reviewers' coding. Click the ellipsis button to display a list of available long text fields.
  • Confidence level - the probability that the rate in the sample is a good measure of the rate in the project universe. This value is the default for the confidence level field when you start a round. The choices are: 99%, 95%, and 90%. Selecting a higher confidence level requires a larger sample size.
  • Sampling type - the method used to create the sample set. The sample set is the randomly selected group of documents produced by Sample-Based Learning to be used for manual review as a means of training the system. Select one of the following:

    • Statistical sampling - creates a sample set based on statistical sample calculations, which determine how many documents your reviewers need to code in order to get results that reflect the project universe as precisely as needed. Selecting this option makes the Margin of error field required.
    • Percentage - creates a sample set based on a specific percentage of documents from the project universe. Selecting this option makes the Sampling percentage field required.
    • Fixed sample size - creates a sample set based on a specific number of documents from the project universe. Selecting this option makes the Fixed sample size field below required.

    Note: To execute sampling, Sample-Based Learning uses a randomization algorithm called the Fisher-Yates shuffle, which guarantees an efficient and unbiased result. A rough illustration of the sampling calculation and shuffle appears after this list.

  • Margin of error - the predicted difference between the observed rate in the sample and the true rate in the project universe. This is the amount of random sampling error you can allow for the project when you select a sampling type of statistical sampling. The options are +/-0.5%, +/-1.0%, +/-1.5%, +/-2.0%, +/-2.5%, and +/-5.0%. Selecting a lower margin of error requires a larger sample size.
  • Sampling percentage - the percentage of the universe you want to act as your sample. Enter any percentage from 1 to 100 when you select Percentage as the sampling type.
  • Fixed sample size - the number of documents out of the universe that you want to act as your sample when you select Fixed sample size as the sampling type.
  • Automatically create batches - determines whether or not batch sets and batches are automatically created for the project's sample set(s) to expedite review kickoff. Selecting Yes here makes the Maximum batch size field below required. The batch set and batches created from this project are editable after you create the project. By default, this field is empty. The value you select here appears as the default value when you are starting the first round.
  • Enter email addresses - an optional text box where you list the email addresses of all recipients you want to receive notifications when various parts of the project have completed. Separate email addresses with a semi-colon. Email notifications are sent if the project encounters an error and after the following parts of the project have completed:
    • Sample set creation
    • Manual document review of the sample set
    • Categorization
    • Saving categorization results
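
For statistical sampling, the sample size grows as you raise the confidence level or lower the margin of error. The sketch below is illustrative only and is not Relativity's internal implementation: it applies a standard sample-size formula for a proportion (with a finite population correction) and then draws the sample with a partial Fisher-Yates shuffle, as referenced in the note above. The function names, the z-score table, and the worst-case proportion of 0.5 are assumptions for the example.

    import math
    import random

    # Illustrative z-scores for the supported confidence levels.
    Z_SCORES = {"99%": 2.576, "95%": 1.960, "90%": 1.645}

    def sample_size(universe_size, sampling_type, confidence="95%",
                    margin_of_error=0.025, percentage=None, fixed_size=None):
        """Approximate the number of documents each sampling type would select."""
        if sampling_type == "statistical":
            z = Z_SCORES[confidence]
            p = 0.5  # worst-case variability yields the largest (safest) sample
            n0 = (z ** 2) * p * (1 - p) / margin_of_error ** 2
            # Finite population correction keeps the sample within the universe.
            return math.ceil(n0 / (1 + (n0 - 1) / universe_size))
        if sampling_type == "percentage":
            return math.ceil(universe_size * percentage / 100.0)
        if sampling_type == "fixed":
            return min(fixed_size, universe_size)
        raise ValueError("unknown sampling type")

    def draw_sample(doc_ids, k):
        """Select k documents with a partial Fisher-Yates shuffle (unbiased)."""
        docs = list(doc_ids)
        for i in range(k):
            j = random.randrange(i, len(docs))
            docs[i], docs[j] = docs[j], docs[i]
        return docs[:k]

    # Example: 100,000 documents, 95% confidence, +/-2.5% margin of error
    k = sample_size(100_000, "statistical", "95%", 0.025)  # 1,514 documents
    sample = draw_sample(range(100_000), k)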

Error scenarios when saving or editing a project

To preserve the unique settings of each project in the workspace, Relativity prevents you from performing the following actions. If you attempt to do any of these, you get an error:

  • Save a new project under the name of an existing project.
  • Save a new project with an existing project prefix or a prefix already used on a Categorization Set. For example, if you deleted a Sample-Based Learning project but did not delete the associated Categorization Sets and you then try to reuse the same prefix, you get an error.
  • Save a project with a Designation field that does not contain exactly two choices.

Console on the saved project layout

The Sample-Based Learning console sits on the right side of the saved project layout. Upon first saving the project, limited options are available to select until the project creates the necessary Analytics categorization sets.

Click Refresh Page to update the console.

The console provides the following options:

  • View Project Home - takes you to the project home layout, where you can view the project status, progress, and a list of all rounds in the project. For details, see Project Home.
  • View Project Settings - takes you to the project settings layout, where you can view, edit, or delete the settings you specified when you created the project.
  • Start Round - starts a new round for the project with the round and sampling settings you select.
  • Finish Round - marks the completion of the round at its current state and categorizes and saves results, depending on how many rounds you’ve completed and where you are in the project. This changes to Finish Round only after the round has been started.
  • View Round Summary - launches a graphical summary of the project categorization results by round.
  • View Rank Distribution - launches a graphical summary of the percentage of documents categorized for each designation choice in rank ranges.
  • View Overturn Summary - launches a graphical summary of overturned documents and percentages per round. This is only available after the second round has started.
  • View Project Summary - launches a report that provides a consolidated set of quality metrics from the entire project so that you can see the state of the project based on the last round completed. This includes manually coded documents, project categorization results, project overturn results, and control set statistics.
  • View Overturned Documents - launches details on documents that reviewers manually coded differently from the value the system applied. This report is only available after the second round has started.
  • View Saved Categorization Results - launches details on all saved categorization results for issues and designation. This report is only available if results were saved while finishing a previous round.
  • View Issue Summary - launches a graphical summary of the percentage of documents in the project coded with each issue.
  • View Designation-Issue Comparison - launches a graphical representation of documents' designation versus issue categorization results.
  • View Control Set Statistics - launches a graph and set of tables breaking down categorization results for all documents selected for a control set round, as well as data on precision, recall, and F1 (a rough illustration of these metrics follows this list). This is only available if you have completed a control set round.
  • View Control Set Documents - launches a view that reflects how all documents in the control set were categorized each round. This is only available if you have completed a control set round.
  • Retry Errors - kicks off an attempt to retry errors encountered during the project. If the error occurred on a categorization set, this option is disabled, and you have to go to the categorization set to resolve the error.
  • View Errors - takes you to a layout containing details on all errors encountered during the project.
  • View Audit History - takes you to a layout containing all audited actions executed during the project.
  • Refresh Page - updates the page to its current status. After first saving the new project, you must click this at least once to enable the Start Round button. Refreshing the page after you start a round also shows the current state of the round's review progress in the round list.
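
The control set statistics compare the system's categorization of control set documents against reviewers' manual coding of the designation field, using the positive choice you selected for the project. As a rough illustration only, and not Relativity's reporting code, the sketch below computes precision, recall, and F1 from the standard true positive, false positive, and false negative counts; the function name and the example numbers are assumptions.

    def control_set_metrics(true_pos, false_pos, false_neg):
        """Standard precision, recall, and F1 definitions.
        'Positive' means the project's positive designation choice (e.g., Responsive)."""
        precision = true_pos / (true_pos + false_pos)  # agreement among system-positive docs
        recall = true_pos / (true_pos + false_neg)     # share of reviewer-positive docs the system found
        f1 = 2 * precision * recall / (precision + recall)
        return precision, recall, f1

    # Example: the system categorized 800 control set documents as Responsive;
    # reviewers agreed on 700 of them and coded 150 more Responsive documents
    # that the system missed.
    p, r, f1 = control_set_metrics(700, 100, 150)  # precision 0.875, recall ~0.82, F1 ~0.85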

For more information on reports, see Sample-Based Learning reports.

Project Home

When you navigate to the Project Home layout from the Sample-Based Learning console, you can view the Sample-Based Learning Round object, which provides a number of fields to help you understand the progress of your project.

  • Round name - the name given to the round.
  • Round type - the type of round: Training, Quality Control, Pre-coded Seeds, or Control Set.
  • Round description - the text, if any, entered to describe the round.
  • Saved search name - the name of the saved search used in the round.
  • Sampling type - the sampling type used for the round.
  • Sample set size - the number of documents in the round's sample set.
  • Review count for designation - the number of documents that have been coded with the designation field.
  • Seed count - the number of documents in the round's sample set that are being used to train the system. This includes documents that have been excerpted.
  • Eligible sample docs - the number of documents in the round eligible to be included in the sample set.
  • Docs in round - the total number of documents in the round.
  • Docs in project - the total number of documents in the project list.

Viewing categorized and uncategorized document saved searches

When you create your first project, Sample-Based Learning creates an unsecured RAR Saved Searches folder. For every project you create, Relativity creates a project-specific folder beneath the RAR Saved Searches folder containing the saved search you specified in the project console for Documents to be Categorized. When you create your first round, a RAR Sample Sets folder is created that will contain saved searches for each round's sample set of documents that are batched out to reviewers.

After the first categorization occurs during the round finish process, the category saved searches are automatically created in the project-specific folder (e.g., saved searches that return Categorized, Not Responsive, Responsive, and Uncategorized documents in the project universe).

Note the following:

  • You can add a search to the RAR Saved Searches folder, and you can select it when you start a round.
  • Saved searches inherit security from the folder they are in. The RAR Saved Searches folder is unsecured by default.
  • A saved search will still be created if there are no documents for a designation choice.
  • If the system cancels the round, no saved search is created.

Note: All automatically-created RAR saved searches contain the phrase RAR created in the Keywords field to distinguish them from other searches.

All of the automatically-created searches listed below include the saved search criteria for your project's original saved search that you specified in the Documents to be Categorized field on the project console (i.e., they're all rooted in your project's original document universe).

Each search contains the following as its first field condition:

  • Field: (Saved Search)
  • Operator: Document is in
  • Value: <Your project saved search>

You can select these searches for future rounds of the project in the Saved search for sampling field when you're starting the round. The automatically-created saved searches are:

  • <Project Saved Search> - this is the saved search you specified in the Documents to be Categorized field on the project console. The conditions/criteria are:
    • Field: (Saved Search)
    • Operator: Document is in
    • Value: Your project saved search
  • <Project Saved Search> - Categorized - returns all categorized documents; this focuses the round's sampling on categorization results for QC purposes. This search includes the <Project Saved Search> criteria and the following additional criteria:
    • Field: Categories - <Project Prefix> RAR Designation Cat. Set
    • Operator: these conditions
    • Value: [Categories - <Project Prefix> RAR Designation Cat. Set] is set
  • <Project Saved Search> - Not Responsive - returns all documents that were categorized as not responsive during designation coding. This search includes the <Project Saved Search> criteria and the following additional criteria:
    • Field: Categories - <Project Prefix> RAR Designation Cat. Set
    • Operator: these conditions
    • Value: [Categories - <Project Prefix> RAR Designation Cat. Set] is any of these: Not Responsive
  • <Project Saved Search> - Responsive - returns all documents that were categorized as responsive during designation coding. This search includes the <Project Saved Search> criteria and the following additional criteria:
    • Field: Categories - <Project Prefix> RAR Designation Cat. Set
    • Operator: these conditions
    • Value: [Categories - <Project Prefix> RAR Designation Cat. Set] is any of these: Responsive
  • <Project Saved Search> - Uncategorized - returns all uncategorized documents and focuses the round's sampling on documents that haven't been categorized yet. This search includes the <Project Saved Search> criteria and the following additional criteria:
    • Field: Categories - <Project Prefix> RAR Designation Cat. Set
    • Operator: not these conditions
    • Value: [Categories - <Project Prefix> RAR Designation Cat. Set] is set

Viewing round sample set document saved searches

Sample-Based Learning automatically creates a saved search for each round's sample set in the project. Sample-Based Learning then uses those searches as data sources when automatically creating batch sets for reviewers. Sample-Based Learning puts sample set saved searches into the RAR Sample Sets subfolder of the project's saved searches folder.

Note: Sample set saved searches do not show up as choices in the Saved Search for Sampling field on the Start Round layout.

A sample set saved search is identified as <Round Name> Sample Set and contains the following criteria:

  • Field: RAR Sample Set
  • Operator: these conditions
  • Value: [RAR Sample Set] is any of these: <Round Name>

Note: The RAR Sample Sets folder and the searches it contains are deleted when the project is deleted. An individual sample set saved search is deleted when the round that created it is deleted.

Viewing a round's overturned documents (and seeds causing the overturn)

When you save a project, Relativity creates a multi-choice Document field called RAR overturn status - <Project prefix> that identifies overturned documents and the seed documents and excerpts that caused overturns per round. Each round that you start in the project automatically creates choices for overturned document, seed document, and seed excerpt, except for the first round, which can't contain overturns. You can select these choices in the field tree or create a view or saved search with them as criteria.

Overturn status in field tree

This field is valuable because it makes it easy to aggregate the documents found in the Overturned Documents report on the project, which reviewers might not have access to view.

Note the following about the seed excerpt choice in the overturn status field:

  • For every overturn caused by a seed excerpt, the seed document containing that excerpt is tagged with the seed excerpt choice for the round in which the overturn occurred.
  • If you don't select a Designation excerpt field for the project, the seed excerpt choices are not created.
  • The seed excerpt choice is still created even if no overturns were caused by seed excerpts.

Viewing a RAR project's categorized documents in the Field tree

You can also view documents that have been categorized by RAR for both issues and designation in the field tree.

RAR field tree categories are identified as Categories - <RAR project prefix> RAR Designation Cat. Set or Categories - <RAR project prefix> RAR Issue Cat. Set.

When you expand the list, you see the available designation or issue categories, as well as a [Not Set] node, which you can use to view documents that have not been categorized in your project.

Click on the desired field tag to view the corresponding documents.

Modifying issue importance

When you save a project, Relativity automatically creates the Assisted Review Issue Importance RDO and populates it with the issue choices attached to the key issue field selected on the project. You can find this object and the issue choices at the bottom of the project settings layout.

By default, all issues have the same Medium Importance value. Any issue choices added to or removed from the key issue field are automatically reflected in this list.

If you want to make an issue more or less important than its default value of Medium, you can modify the relative importance of that issue. Doing this can provide more insight later when you refer to the Designation-Issue Comparison report.

To modify issue importance:

  1. Click the Sample-Based Learning tab and select the project you wish to edit.
  2. Scroll down to the Assisted Review Issue Importance list on the project settings page.
  3. Click Edit next to a listed issue to change the level of importance.
  4. In the Importance field, select one of the following weight values to apply to the issue:
    • High (500%)
    • Medium (100%)
    • Low (10%)

Changing the importance causes more important issues to be weighted more heavily than less important issues when the total Issues Rank is calculated for the Designation-Issue Comparison report, as illustrated in the sketch below.

See Designation-Issue Comparison report for more information.
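
The exact formula behind the total Issues Rank is not documented on this page. As a rough illustration only, the sketch below shows how percentage weights like those above could scale each issue's contribution to a combined score; the weighted-average approach, the function name, and the example ranks are assumptions, not Relativity's implementation.

    # Illustrative only; the actual Issues Rank calculation may differ.
    IMPORTANCE_WEIGHTS = {"High": 5.0, "Medium": 1.0, "Low": 0.1}  # 500% / 100% / 10%

    def combined_issue_rank(issue_ranks, issue_importance):
        """Weight each issue's categorization rank by its importance and
        normalize so the combined score stays between 0 and 1."""
        weighted = sum(IMPORTANCE_WEIGHTS[issue_importance[issue]] * rank
                       for issue, rank in issue_ranks.items())
        total_weight = sum(IMPORTANCE_WEIGHTS[issue_importance[issue]]
                           for issue in issue_ranks)
        return weighted / total_weight

    # Example: a document's ranks for three issues, one weighted High
    ranks = {"Fraud": 0.9, "HR": 0.4, "Travel": 0.2}
    importance = {"Fraud": "High", "HR": "Medium", "Travel": "Low"}
    score = combined_issue_rank(ranks, importance)  # ~0.81, dominated by the High issue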

Deleting a project

To delete a project, you must have the Delete Object Dependencies permission, which is under Admin Operations. To confirm or to get this permission:

  1. Navigate to the Workspace Details tab.
  2. Click the Manage Workspace Permissions link in the Relativity Utilities console.
  3. Click Edit Permissions.
  4. If the box next to Delete Object Dependencies under Admin Operations is unselected, select it.
  5. Select the Delete radio button for all Assisted Review objects. You must have the delete permission on all Assisted Review objects to delete a project.

To delete a project from the project layout:

  1. Click Delete.
  2. Click OK on the confirmation message. This message states that you are about to delete all rounds and audits associated with the project and that the categorization sets associated with the project won't be automatically deleted.
  Note: A project with saved categorization results may take longer to delete than a project that has not been finalized. Allow extra time for a finalized project to be deleted.

To delete a project through a mass operation, perform the following:

  1. Select the checkbox next to the project you want to delete.
  2. Select Delete from the mass operations drop-down menu in the lower left corner of the view.
  3. You will receive a pop-up stating that deleting this project will delete children and unlink associative objects. Click Delete.