Find similar documents

You can use Find similar documents to identify documents that are conceptually similar to the one you are viewing. Relativity ranks the documents based on the conceptual similarity of their content in the concept space rather than a strict word comparison.

When you click Find Similar Documents in the Viewer, the entire document is submitted as the query. The process is similar to a concept search, except that instead of a query string, the document's position in the concept space is used as the query. A hit sphere with a minimum concept rank of 60 is drawn around the document, and any documents within that hit sphere are returned as search results. This minimum rank value is not configurable.
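The hit sphere can be pictured with a small sketch. This is not Relativity's implementation: it assumes each document is a vector in the concept space, that rank is cosine similarity scaled to 0-100, and that the primary document is left out of its own results. The names find_similar, cosine, and the sample control numbers are hypothetical.

```python
import math

MIN_RANK = 60  # fixed minimum concept rank described above


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def find_similar(primary_id, vectors, min_rank=MIN_RANK):
    """Return (doc_id, rank) pairs inside the hit sphere, highest rank first.

    Assumption: rank = cosine similarity * 100, rounded. The real mapping
    from concept-space distance to rank is not documented here.
    """
    query = vectors[primary_id]
    hits = []
    for doc_id, vec in vectors.items():
        if doc_id == primary_id:
            continue  # assumption: the primary document is not returned
        rank = round(cosine(query, vec) * 100)
        if rank >= min_rank:
            hits.append((doc_id, rank))
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Hypothetical two-dimensional concept space
vectors = {
    "DOC-001": [1.0, 0.0],  # primary document
    "DOC-002": [1.0, 0.0],  # same position as the primary document
    "DOC-003": [0.8, 0.6],  # nearby
    "DOC-004": [0.0, 1.0],  # unrelated, falls outside the hit sphere
}
results = find_similar("DOC-001", vectors)
```

In this sketch, DOC-004 is excluded because its rank falls below 60, which mirrors how documents outside the hit sphere are not returned.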

Special considerations

Note the following special considerations about running conceptual analytics operations:

  • The following security permissions are required to run the operations:
    Object Security
    • Document - View
    • Analytics Index - View
    • Analytics Categorization Set - View, Edit, Add
    • Analytics Categorization Category - View, Edit, Add
    • Analytics Example - View, Edit, Add
    Tab Visibility
    • Documents
  • To run an operation from the viewer, the document must be in the data set of an active Analytics index.
  • You can only run operations in the Native Viewer and Extracted Text Viewer.
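The last two considerations amount to a simple precondition check, sketched below. This is an illustration only: can_run_from_viewer and the data-set representation (one set of control numbers per active index) are hypothetical, and security permissions are not modeled.

```python
# Viewers named in the considerations above
SUPPORTED_VIEWERS = {"Native Viewer", "Extracted Text Viewer"}


def can_run_from_viewer(doc_id, viewer, active_index_data_sets):
    """Return True if an operation could be run on doc_id from this viewer.

    active_index_data_sets: iterable of sets of document identifiers,
    one set per active Analytics index (hypothetical representation).
    """
    in_active_index = any(doc_id in data_set for data_set in active_index_data_sets)
    return viewer in SUPPORTED_VIEWERS and in_active_index


# Hypothetical data sets for two active indexes
data_sets = [{"DOC-001", "DOC-002"}, {"DOC-003"}]
```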

Best practices

  • Large documents with many topics are not optimal for finding similar documents. Instead of using this feature, select the text that is relevant to your query, and then submit that text as a concept search.

Running find similar documents from the viewer

To find similar documents, perform the following steps:

  1. Select a document from the document list and open it in the Native Viewer or Extracted Text Viewer. This is your primary document.
  2. Click the View Similar Documents button at the bottom of the viewer. Alternatively, right-click in the white space of the document and select Find Similar Documents.

When the operation is executed, all of the unfiltered text of the document is used as the query. The Documents list pane opens and displays the Similar Documents tab, which contains other conceptually similar documents. This tab contains the following information about the results:

  • Rank - the conceptual similarity of the document to the primary document. The higher the rank, the higher the relevance to the query. A rank of 100 represents the closest possible distance. The rank doesn't indicate the percentage of shared terms or the percentage of the document that isn't relevant.
  • Control Number - the control number of the document.

Navigating results

Once the conceptual analytics operation is executed, the following takes place:

  1. The breadcrumb navigation includes Conceptual Analytics if you have run a concept search, find similar documents, or keyword expansion. If you navigate back to the Documents tab, this breadcrumb is removed.
  2. The Documents list panel updates to display the results of the operation.
  3. The document navigation updates to display the number of documents returned by the operation.

If you have more than one active index, the oldest active index (lowest Artifact ID) is chosen by default. Click the drop-down to select a different active index.
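The default selection rule can be expressed as a short sketch. The AnalyticsIndex class and default_index function are hypothetical names used for illustration; only the rule itself (the active index with the lowest Artifact ID) comes from the text above.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class AnalyticsIndex:
    artifact_id: int
    name: str
    active: bool


def default_index(indexes: List[AnalyticsIndex]) -> Optional[AnalyticsIndex]:
    """Pick the oldest active index, i.e. the one with the lowest Artifact ID."""
    active = [index for index in indexes if index.active]
    if not active:
        return None  # no active index, so nothing can be selected
    return min(active, key=lambda index: index.artifact_id)


# Hypothetical indexes: the inactive one is ignored even though its ID is lowest
indexes = [
    AnalyticsIndex(1040123, "Retired index", active=False),
    AnalyticsIndex(1052311, "Email index", active=True),
    AnalyticsIndex(1049876, "All documents index", active=True),
]
chosen = default_index(indexes)
```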