dtSearch

Relativity's dtSearch engine provides advanced search functionality such as proximity, stemming, and fuzzy searches across any field type. It also supports the use of Boolean operators and custom noise word lists as well as the basic searching features available in keyword searches. After building your dtSearch index, the Dictionary search option is also available.

Creating a dtSearch index

You can build custom dtSearch indexes for a subset of documents or for certain document fields in a workspace. You must have the appropriate permissions to complete this task. See Workspace security.

Before you begin, create a saved search that includes the fields you want to index. You can then name the index after the saved search used to create it.

Note: Within a field, dtSearch truncates any string longer than 32 characters that doesn't contain a space character. It indexes only the first 32 characters of the string. For more information, see Searching for words longer than 32 characters.
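The truncation rule in the note above can be checked against sample text before you build. The following Python sketch is illustrative only: it assumes whitespace is the only word break, whereas the real engine also applies the index's alphabet file, and the function names are hypothetical rather than part of Relativity or dtSearch.

```python
MAX_TOKEN_LENGTH = 32  # limit stated in the note above


def indexed_form(text: str) -> list[str]:
    """Return each whitespace-delimited string cut to its first 32 characters."""
    return [token[:MAX_TOKEN_LENGTH] for token in text.split()]


def truncated_tokens(text: str) -> list[str]:
    """Return the strings that are longer than 32 characters and would be cut."""
    return [token for token in text.split() if len(token) > MAX_TOKEN_LENGTH]


if __name__ == "__main__":
    sample = "ref ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789 end"
    print(truncated_tokens(sample))
    print(indexed_form(sample))
```

Running a check like this over field values that hold long identifiers, such as hashes or file paths, shows which terms would only be searchable by their first 32 characters.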

To create a new dtSearch index:

  1. Navigate to the Search Indexes tab and click New dtSearch Index. The dtSearch index form appears with required fields in orange.
  2. Complete the fields on the dtSearch index form. See Fields.
  3. Click Save to display the index details page. The index details page now displays three additional read-only fields and the dtSearch index console. See Fields and dtSearch console.
  4. Click Build Index: Full. A dialog box asks you to verify that you want to run a full build. You can also select Activate this index upon completion. Indexes must be active in order to search them. Click OK to build your index.

    Note: Network problems can slow down your dtSearch builds. If a dtSearch manager or worker agent encounters a network-related error during the build process, it retries up to three times at 30-second intervals.

  5. If you didn't select Activate this index upon completion in the dialog box, click Activate Index on the console. The index won't activate if there are errors. Activating an index makes it available in the Search menu.
  6. (Optional) Click Refresh Page at any point in the build to see the index's current build status. If errors occur during the build, the Retry Errors button enables on the console under the Errors and Status heading. Click this button to attempt to resolve any errors.

Once the index is built, the console enables additional options. See dtSearch console.
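The retry behavior described in the note in the build steps (up to three retries, 30 seconds apart) follows a common fixed-interval retry pattern. The Python sketch below illustrates that pattern only. It is not the agent's actual code: the function name is hypothetical, and treating ConnectionError as the network-related error is an assumption.

```python
import time

MAX_RETRIES = 3              # "up to three retry attempts"
RETRY_INTERVAL_SECONDS = 30  # "at 30 second intervals"


def run_with_retries(step, retries=MAX_RETRIES,
                     interval=RETRY_INTERVAL_SECONDS, sleep=time.sleep):
    """Call step(); after a ConnectionError, wait and retry up to `retries` times.

    ConnectionError stands in for a network-related error (an assumption).
    The last error is re-raised once all retries are used up.
    """
    attempt = 0
    while True:
        try:
            return step()
        except ConnectionError:
            if attempt == retries:
                raise
            attempt += 1
            sleep(interval)
```

With these values, a build step that keeps failing is attempted four times in total (the first attempt plus three retries) before the error surfaces.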

Accent-insensitive indexes

By default, Relativity builds an accent-insensitive index. In an accent-insensitive index, some characters are translated to the base character, which causes those characters and any terms containing those characters to be treated the same in a Search Terms Report.

Note: dtSearch uses .ABC files, but only for characters in the range from 33 to 127. All other characters are handled according to the definitions in the Unicode character tables.

Example: accented characters like á or ñ are converted to the unaccented versions, a or n.

Example: If you are searching for the term fröhlich, searching for either fröhlich or frohlich returns the hit. However, highlighting in the Viewer may not display both variations.
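The translation to base characters can be approximated with Unicode normalization. The Python sketch below is an approximation for illustration, not the engine's own mapping: it decomposes each character and drops the combining marks, and the function name is hypothetical.

```python
import unicodedata


def fold_accents(term: str) -> str:
    """Approximate accent-insensitive matching by removing combining marks."""
    # NFD splits an accented character into its base character plus marks.
    decomposed = unicodedata.normalize("NFD", term)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


if __name__ == "__main__":
    print(fold_accents("fröhlich"))  # frohlich
    print(fold_accents("fröhlich") == fold_accents("frohlich"))  # True
```

Two terms that fold to the same string, such as fröhlich and frohlich, are the kind of pair that an accent-insensitive index treats as the same term.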

Fields

The dtSearch index page includes the following fields:

  • Name - the dtSearch index name. This name appears within the "search with" menu in the Documents tab.
  • Order - the integer value (positive or negative) representing the position of the index in the search indexes list. Indexes sort from lowest (top) to highest (bottom) order number. Those with the same order number sort alphanumerically.
  • Searchable set - the saved search of documents to be indexed. Relativity indexes the documents returned by the search as well as the returned documents' fields. The saved search may itself use a dtSearch or Analytics index; if it does, make sure that index is active.

      Note: When creating a dtSearch index, it's best practice to only index the long text fields you want to search. Move all other fields to the Fields (Required) left column. Typically, you only index the extracted text field if you're searching the body of emails. If you select a saved search that doesn’t contain a long text field, a warning message appears.

  • Index share - populated by default by a system admin.
  • Auto recognize date, email, and credit card numbers - a yes/no field. See Auto-recognition for details.
  • Create accent sensitive - a yes/no field. Setting this field to Yes allows dtSearch index builds to be sensitive to accents and other language-specific characters.
  • Send Email Notification upon Completion or Failure to - send email notifications when your index population fails or completes. Enter the email address(es) of the recipient(s). Separate multiple entries with a semicolon.
  • Sub-index size - determines the size of each sub-index created when you generate a dtSearch index. The minimum value is 1000.

    Note: To set a new default for this field, a system admin can edit the dtSearchDefaultSubIndexSize instance setting. See Instance settings' descriptions.

  • Sub-index fragmentation threshold - determines the fragmentation level at which the system automatically compresses a dtSearch sub-index during an incremental build. An incremental build automatically compresses any sub-index equal to or greater than the fragmentation threshold. The Sub-index fragmentation threshold value must be equal to or greater than one.

    Note: The dtSearchDefaultSubIndexFragmentationThreshold instance setting value determines the default Sub-index fragmentation threshold. It is set to 9 by default.

  • Noise Words - edit the list of words that are ignored during indexing.
  • Alphabet - edit the index’s alphabet file. See Making a special character searchable.

    Note: If you search for long, uninterrupted strings that have no spaces or word breaks, such as when you've made a character searchable, dtSearch truncates the string after 32 characters and inserts a wildcard. For more information, see Searching for words longer than 32 characters.
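The numeric limits and the recipient format described in the field list can be checked before you fill in the form. The Python sketch below is a hypothetical pre-check, not part of Relativity. The function names are invented for illustration, and only the limits stated above (minimum sub-index size of 1000, fragmentation threshold of at least one, semicolon-separated recipients) come from this page.

```python
MIN_SUB_INDEX_SIZE = 1000        # minimum stated for Sub-index size
MIN_FRAGMENTATION_THRESHOLD = 1  # threshold must be equal to or greater than one


def split_recipients(raw: str) -> list[str]:
    """Split a semicolon-separated recipient string into trimmed addresses."""
    return [part.strip() for part in raw.split(";") if part.strip()]


def check_settings(sub_index_size: int, fragmentation_threshold: int) -> list[str]:
    """Return a list of problems; an empty list means both values are in range."""
    problems = []
    if sub_index_size < MIN_SUB_INDEX_SIZE:
        problems.append(f"Sub-index size must be at least {MIN_SUB_INDEX_SIZE}.")
    if fragmentation_threshold < MIN_FRAGMENTATION_THRESHOLD:
        problems.append(
            f"Sub-index fragmentation threshold must be at least {MIN_FRAGMENTATION_THRESHOLD}."
        )
    return problems


if __name__ == "__main__":
    print(split_recipients("a@example.com; b@example.com"))
    print(check_settings(sub_index_size=999, fragmentation_threshold=9))
```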

Index status fields:

  • Active - indicates whether the index is active (Yes) or inactive (No).

Note: File type fields, linked fields, and HTML enabled fields may have text associated with them that is not visible in your document views. This includes the system FileIcon field, which is populated with the original file name upon import. See Fields.

dtSearch console

The dtSearch index console includes the following options:

  • Build Index: Full - creates a full build of the index. During the build, the button toggles to Cancel Build. If you add an additional field to your index or change the auto-recognize or accent sensitive settings, you must perform a full build.

    Note: Canceling the build aborts the indexing thread, leaving the index in an unstable state. Relativity deletes these indexes from the population table and gives them an inactive status. You can't search against an index with an inactive status until you run a full build. Canceling also deletes the index files in the index share.

  • Build Index: Incremental - updates an index after adding or removing documents. During an incremental build the existing index remains available for searching, but changes to the index are not reflected in search results until the incremental build is complete. Canceling an incremental build returns the index to its previous state.

    Note: The incremental build process copies each sub-index that requires modification, updates the copy, then replaces existing sub-indexes with the updated copies. Duplicate sub-indexes are removed when the Case manager agent runs. The system automatically compresses a sub-index during an incremental build only if the sub-index fragmentation level is equal to or above the Sub-index fragmentation threshold value.

  • Compress Index - compresses the dtSearch index, returning all sub-indexes with a fragmentation level greater than zero to a fragmentation level of zero. You can search against the original (uncompressed) dtSearch index while compression is in progress. Once compression is complete, the system automatically replaces the old sub-indexes with the defragmented sub-indexes. Duplicate sub-indexes are removed when the Case manager agent runs.

    Note: The Compress Index button only runs compression against sub-indexes that have a fragmentation level greater than zero. Canceling compression returns the index to its original fragmented state before compression began.

  • Deactivate Index - deactivates the index and removes it from the "search with" menu in the Documents tab (but not from the database).
  • Swap Index - swaps your index with a replacement index so that you can use the replacement's resources while your index builds or is inactive or disabled for any reason. This enables you to keep searching while your primary index experiences downtime. In the Replacement Index drop-down list, you can only select indexes with an Active status. The index you swap to doesn't automatically run an incremental update.

    Selecting the index from the drop-down list and clicking OK completes the index swap. You can't reverse the swap in the current dialog box. To swap back or swap again, close the dialog box and run Swap Index again. This functionality is useful only in limited cases, for example, if you are doing a full rebuild on a very large index. Because dtSearch incremental builds are online, users can still search documents that are already indexed while an incremental build runs.

    Note: The Swap Index function updates anything in the Views table, which affects batches, saved searches, and nested searches.

  • Retry Errors - enables only if errors occur. Click this button to attempt to resolve the errors.
  • Show Document Errors - enables only if document errors occur. This button creates an exportable list of document-level errors.
  • Show Detailed Status - shows you statistical data for the index, including:
    • Doc Count - the total number of documents in the index
    • Index Size - the size of the index in bytes
    • Created Date - the date you created the index
    • Updated Date - the date you updated the index
    • Last Build Duration - how long the last build took to complete in hours, minutes, and seconds
  • Refresh Page - shows the index's current build status.
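The two compression options above select sub-indexes differently: an incremental build compresses only sub-indexes at or above the Sub-index fragmentation threshold, while Compress Index handles every sub-index with a fragmentation level greater than zero. The Python sketch below contrasts the two rules. The sub-index names and levels are made-up sample data, and the function names are hypothetical.

```python
def compressed_by_incremental_build(levels: dict[str, int], threshold: int) -> list[str]:
    """Sub-indexes an incremental build compresses: level >= threshold."""
    return [name for name, level in levels.items() if level >= threshold]


def compressed_by_compress_index(levels: dict[str, int]) -> list[str]:
    """Sub-indexes Compress Index handles: any level greater than zero."""
    return [name for name, level in levels.items() if level > 0]


if __name__ == "__main__":
    # Hypothetical fragmentation levels, keyed by a made-up sub-index name.
    levels = {"sub_1": 0, "sub_2": 4, "sub_3": 9, "sub_4": 12}
    print(compressed_by_incremental_build(levels, threshold=9))  # ['sub_3', 'sub_4']
    print(compressed_by_compress_index(levels))  # ['sub_2', 'sub_3', 'sub_4']
```

In this sample, sub_2 is fragmented but below the threshold, so it stays fragmented through incremental builds until someone runs Compress Index.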

dtSearch index page

After you create and build a dtSearch index, the dtSearch page contains several sections where you can view details about your index.

Index Status

The Index Status section of the layout shows the current state of your dtSearch index. The name of the Index Status section is populated with the name of your dtSearch index. While an index is building, this section changes to a progress bar where you can track the build's progress in real time. When the build is no longer in progress, this section changes to a static display of the following fields.

  • Status - the status of the index. For example, "Active - Indexed" or "Inactive - Indexed".
  • Document Breakdown - the number of indexed documents.

dtSearch Index Information

The dtSearch Index Information section provides general details about the settings applied to your dtSearch index. This section contains the following information:

  • Name - the name of your index.
  • Order - the integer value (positive or negative) representing the position of the index in the search indexes list. Indexes sort from lowest (top) to highest (bottom) order number. Those with the same order number sort alphanumerically.
  • Searchable set - the set of documents to be indexed. You can choose from any saved search in the workspace.
  • Index share - populated by default by a system admin.
  • Auto-recognize date, email, and credit card numbers - a yes/no field.
  • Email notification recipients - the emails that receive an email notification when your index population fails or completes.
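The Order rule (lowest number at the top, ties sorted alphanumerically) corresponds to a two-part sort key. The Python sketch below shows one way to reproduce that ordering. The index names are sample data, and comparing names case-insensitively is an assumption, since this page does not define the exact collation.

```python
def sort_indexes(indexes: list[tuple[str, int]]) -> list[tuple[str, int]]:
    """Sort (name, order) pairs by order number, then by name for ties."""
    # Case-insensitive name comparison is an assumption about "alphanumerically".
    return sorted(indexes, key=lambda item: (item[1], item[0].lower()))


if __name__ == "__main__":
    sample = [("Emails", 10), ("All Docs", -5), ("Custodian A", 10)]
    print(sort_indexes(sample))
    # [('All Docs', -5), ('Custodian A', 10), ('Emails', 10)]
```

Negative order numbers sort ahead of positive ones, so an index you want pinned to the top of the list can take a value lower than any other index.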

Advanced Settings

The Advanced Settings section provides sub-index details about your dtSearch index. This section contains the following information:

  • Sub-index size - determines the maximum size of each sub-index created when you generate a dtSearch index. The minimum value is 1000.
  • Sub-index fragmentation threshold - the fragmentation level at which the system automatically compresses a dtSearch sub-index during an incremental build.
  • Sub-indexes scheduled for compression - the number of sub-indexes at or above the sub-index fragmentation threshold. The system automatically compresses those sub-indexes during the next incremental build.

Temporary Index Details

The Temporary Index Details section only appears during an incremental build. This table displays sub-indexes that were copied from your original index and are in the process of modification during the incremental build. Once the sub-indexes in this table are updated, they replace the original sub-indexes from which they were copied. This section contains the following information:

  • Population Table - the name of the table that a sub-index is populating.
  • Build Status - the state that the sub-index is currently in.
  • Worker Agent - the name of the agent that's handling the sub-index.
  • Worker Agent Status - the state that the worker agent is currently in.
  • Index File Share - the location where your sub-index is stored.
  • Document count - the number of documents assigned to the sub-index.
  • Error(s) - any errors encountered by the sub-index.
  • Fragmentation Level - the fragmentation level of the sub-index. Any sub-index at or above the Sub-index fragmentation threshold appears in red.

Current Index Details

The Current Index Details section displays the sub-indexes that make up your dtSearch index. This section contains the following information:

  • Population Table - the name of the table that a sub-index is populating.
  • Build Status - the state that the sub-index is currently in.
  • Worker Agent - the name of the agent that's handling the sub-index.
  • Worker Agent Status - the state that the worker agent is currently in.
  • Index File Share - the location where your sub-index is stored.
  • Document count - the number of documents assigned to the sub-index.
  • Error(s) - any errors encountered by the sub-index.
  • Fragmentation Level - the fragmentation level of the sub-index. Any sub-index at or above the Sub-index fragmentation threshold appears in red.

View Audit

Using the View Audit button, you can see when dtSearch index settings were modified. The View Audit layout contains the following fields:

  • Action
  • Field
  • Old Value
  • New Value
  • User Name
  • Timestamp
  • Details

Temporary storage

If you specify a temporary storage location, dtSearch builds the index in this directory and then copies the index over to the final index share when the build completes. Using a temporary storage location can speed up the build and reduce network contention. See Servers.
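The build-then-copy flow can be pictured as the following Python sketch. It illustrates the general pattern under stated assumptions and is not how the agents are implemented: the function and parameter names are hypothetical, the build step is supplied by the caller, and deleting the temporary directory afterward is an assumption that this page does not confirm.

```python
import shutil
import tempfile
from pathlib import Path


def build_then_copy(build, temp_root: Path, final_share: Path, index_name: str) -> Path:
    """Build index files in a temporary directory, then copy them to the share.

    `build` is a caller-supplied function that writes index files into the
    directory it receives (a stand-in for the real build).
    """
    temp_dir = Path(tempfile.mkdtemp(prefix=index_name + "_", dir=temp_root))
    try:
        build(temp_dir)  # all build writes go to temporary storage
        destination = final_share / index_name
        shutil.copytree(temp_dir, destination)  # one copy to the final share
        return destination
    finally:
        # Cleanup of the temporary copy is an assumption in this sketch.
        shutil.rmtree(temp_dir, ignore_errors=True)
```

The point of the pattern is that the many small writes made during a build hit fast local storage, and the network share sees only a single bulk copy at the end.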