dtSearch index

You can build custom dtSearch indexes for a subset of documents or for certain document fields in a workspace. You must have the appropriate permissions to complete this task. See Workspace security.

Before you begin, you need to create a saved search that includes the fields that you want to include in the index. You can then name the index based on the document search set used to create it.

Note: Within a field, dtSearch truncates any string longer than 32 characters that does not contain a space character. It indexes only the first 32 characters of the string. For more information, see Searching for words longer than 32 characters.
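As a rough illustration of the truncation described in the note above, the following sketch keeps only the first 32 characters of any space-free string. The function name and the whitespace-only tokenization are assumptions made for illustration; the real dtSearch tokenizer also depends on the index's alphabet file.

```python
MAX_TOKEN_LENGTH = 32  # limit stated in the note above


def truncate_long_tokens(text: str, limit: int = MAX_TOKEN_LENGTH) -> list[str]:
    """Split on whitespace and keep only the first `limit` characters of each token.

    Hypothetical helper: it models the documented truncation only,
    not dtSearch's actual word-breaking rules.
    """
    return [token[:limit] for token in text.split()]


sample = "ref " + "A" * 40 + " end"
tokens = truncate_long_tokens(sample)
print(tokens)
```

In this model, the 40-character string is stored as its first 32 characters, so a search for the full 40-character string would have nothing to match beyond character 32.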

Accent-insensitive indexes

By default, Relativity builds an accent-insensitive index. In an accent-insensitive index, some characters translate to the base character, which causes those characters and any terms containing those characters to be treated the same in a search terms report.

Note: dtSearch uses .ABC files, but only for characters in the range 33 to 127. Relativity handles all other characters according to the definitions in the Unicode character tables.

When searching for characters in an accent-insensitive index, accented characters are typically treated as their base, unaccented form. This simplifies searching because it does not matter whether a character has an accent.

What to expect:

  • Accented characters will show up as their base character. For example, “ṽ” will be shown as “v”.
  • The search will not show you the accented character, only the base character.
  • You will not see the decomposition, that is, the breakdown of how a character splits into simpler parts. You can still find the right base character.
  • Example: Relativity converts accented characters like á or ñ to the unaccented versions, a or n.
  • Example: If you search for the term fröhlich, searching that term as fröhlich or frohlich would both return the hit. However, highlighting in the Viewer may not display both variations.

If you want more detailed information about how characters are broken down or mapped, you can use a tool such as the Compart Unicode tool. Compart indicates that a character is a base character by omitting the decomposition row entirely, as it does for the letter v.
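The idea of translating an accented character to its base character can be approximated with Unicode canonical decomposition. The sketch below uses Python's standard `unicodedata` module. It is an illustration of the concept only: the `fold_accents` helper is hypothetical, and Relativity's actual mapping may differ from plain decomposition.

```python
import unicodedata


def fold_accents(term: str) -> str:
    """Decompose each character (NFD) and drop the combining marks.

    Hypothetical helper that approximates accent-insensitive matching.
    """
    decomposed = unicodedata.normalize("NFD", term)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


# Both spellings fold to the same term, so either one would find the hit.
print(fold_accents("fr\u00f6hlich") == fold_accents("frohlich"))

# An accented character has a decomposition; a base character has none.
print(repr(unicodedata.decomposition("\u1e7d")))  # the character "ṽ"
print(repr(unicodedata.decomposition("v")))
```

The empty decomposition for the letter v corresponds to the missing decomposition row that marks a base character.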

Creating a dtSearch index

To create a new dtSearch index:

  1. Use the search feature to navigate to the Search Indexes tab.
  2. Click New Search Index.
    The dtSearch Index Information form appears, with required fields marked by an orange asterisk.
  3. Complete the fields on the dtSearch index form. For more details, see Fields.
  4. Click Save to display the index details page. The index details page now displays three additional read-only fields and the dtSearch index console. See Fields and dtSearch console.
  5. Click Build Index: Full. A dialog box asks you to verify that you want to run a full build. You can also select Activate this index upon completion. Indexes must be active to search them.
  6. Click OK to build your index.
    Note: Network problems can slow down your dtSearch builds. If a dtSearch manager or worker agent encounters a network-related error during the build process, it executes up to three retry attempts at 30-second intervals.
  7. If you did not select Activate this index upon completion in the dialog box, click Activate Index on the console. The index will not activate if there are errors. Activating an index makes it available in the Search menu.
  8. (Optional) Click Refresh Page at any point in the build to see the index's current build status. If errors occur during the build, the Retry Errors button becomes enabled on the console under the Errors and Status heading. Click this button to try to resolve any errors.

Once the index builds, the console enables more options. See dtSearch console.
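The retry behavior described in the note under step 6 (up to three retries at 30-second intervals) can be sketched as follows. This is a generic retry loop, not Relativity's agent code: `NetworkError`, `run_with_retries`, and the injectable `sleep` parameter are all hypothetical names chosen for illustration.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class NetworkError(Exception):
    """Hypothetical stand-in for a network-related build error."""


def run_with_retries(
    step: Callable[[], T],
    retries: int = 3,
    delay_seconds: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run `step`; on NetworkError, retry up to `retries` times with a fixed delay."""
    attempt = 0
    while True:
        try:
            return step()
        except NetworkError:
            if attempt >= retries:
                raise  # retries exhausted: surface the error
            attempt += 1
            sleep(delay_seconds)


calls = {"count": 0}
waits: list[float] = []


def flaky_build_step() -> str:
    """Fails twice with a network error, then succeeds."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise NetworkError("share unreachable")
    return "built"


# Record the waits instead of sleeping so the demo finishes immediately.
result = run_with_retries(flaky_build_step, sleep=waits.append)
print(result, calls["count"], waits)
```

Passing `sleep` as a parameter is a design choice for testability; a real agent would simply wait between attempts.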

Fields

The dtSearch index page includes the following fields:

dtSearch Index Information

Add information to the following dtSearch index fields:

  • Name—the dtSearch index name. This name appears within the search with menu in the Documents tab.
  • Order—the integer value, positive or negative, representing the position of the index in the search indexes list. Indexes sort from the lowest order number (top) to the highest (bottom). Those with the same order number sort alphanumerically.
  • Searchable set—the saved search of documents for indexing. Relativity indexes the documents returned by the search as well as the returned documents' fields. The saved search may itself use a dtSearch or an Analytics index. If it does, make sure that index is active.
    Note: When creating a saved search for a dtSearch index, the best practice is to use long text fields. When you create a dtSearch index, you will see a warning message if you select a saved search that does not contain at least one long text field.
  • Email notification recipients—specifies recipients to send an email notification to when your dtSearch index finishes running. Enter the email addresses of the recipients. Separate entries with a semicolon.
  • Notes—enter any considerations or thoughts about the search index.
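The ordering rule for the Order field (lowest number at the top, ties sorted alphanumerically) amounts to a two-part sort key. The sketch below assumes a hypothetical `SearchIndex` record and made-up index names, and it treats "alphanumerically" as a case-insensitive comparison of names, which is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SearchIndex:
    """Hypothetical record holding only the two fields that affect ordering."""
    name: str
    order: int


def sort_for_display(indexes: list[SearchIndex]) -> list[SearchIndex]:
    # Lowest order number first; equal order numbers fall back to the name.
    return sorted(indexes, key=lambda idx: (idx.order, idx.name.casefold()))


indexes = [
    SearchIndex("Privilege Review", 10),
    SearchIndex("All Documents", 10),
    SearchIndex("Custodian Set", -5),
    SearchIndex("Archive", 20),
]
display_names = [idx.name for idx in sort_for_display(indexes)]
print(display_names)
```

A negative order number, as with the third entry, places an index above every index with a zero or positive value.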

Searchable Set - Current Indexable Fields

This section is a list of the fields and field types in the saved search used for indexing.

Note: The fields listed may not match those indexed if someone made updates after indexing the searchable set.

If a long text field is not included in the searchable set, searches may return no results. For better search results, check the saved search to make sure that it includes a field with the Long Text field type, for example, Extracted Text.

Advanced Settings

Add information to the following advanced setting fields:

  • Auto recognize date, email, and credit card numbers—a yes/no field. See Auto-recognition for details.
  • Create accent sensitive—a yes/no field. When set to Yes, dtSearch indexes are sensitive to accents and other language-specific characters.
  • Skip malicious files—a yes/no field. Select Yes to skip potentially malicious quarantined files and continue processing the index. The final index breakdown includes the number of skipped documents.
  • Use Default Sub-index Size—select Yes to use the default sub-index size of 16 GB.

    Relativity automatically optimizes the sub-index size for system performance unless you select No, in which case you must enter a custom sub-index size.

  • Custom Sub-Index Size (GB)—determines the size of each sub-index created when you generate a dtSearch index. The field must contain a numerical value from 1 to 16 GB.

  • Index share—the location of the file share for storing the search index. Select the bold (default) location. You will see a warning about system performance if you select a location other than the default.
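The interaction between Use Default Sub-index Size and Custom Sub-Index Size (GB) can be sketched as a small validation routine. The function and constant names are hypothetical; only the 16 GB default and the 1 to 16 GB range come from the field descriptions above.

```python
from typing import Optional

DEFAULT_SUB_INDEX_SIZE_GB = 16   # default stated above
MIN_CUSTOM_SIZE_GB = 1           # lower bound of the allowed range
MAX_CUSTOM_SIZE_GB = 16          # upper bound of the allowed range


def resolve_sub_index_size(use_default: bool, custom_size_gb: Optional[float] = None) -> float:
    """Return the sub-index size in GB implied by the two form fields."""
    if use_default:
        return DEFAULT_SUB_INDEX_SIZE_GB
    if custom_size_gb is None:
        raise ValueError("A custom sub-index size is required when the default is not used.")
    if not MIN_CUSTOM_SIZE_GB <= custom_size_gb <= MAX_CUSTOM_SIZE_GB:
        raise ValueError("Custom sub-index size must be from 1 to 16 GB.")
    return custom_size_gb


print(resolve_sub_index_size(True))
print(resolve_sub_index_size(False, 4))
```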

Noise Words

Edit the list of words that Relativity ignores during indexing.

Alphabet

Edit the index’s alphabet file. See Making a character searchable.

Note: If you search for long, uninterrupted strings that have no spaces or word breaks, such as when you have made a character searchable, dtSearch truncates the string after 32 characters and inserts a wildcard. For more information, see Searching for words longer than 32 characters.

dtSearch index page

After you create and build a dtSearch index, the dtSearch page has several sections where you can view details about your index.

Index Status

You can view the state of your dtSearch index from the Index Status section. The name of the Index Status section populates with the name of your dtSearch index. When you are building an index, this section changes to a progress bar where you can track your index's progress in real time. When the index is no longer in progress, this section changes to a static field that displays the following fields.

  • Status—the status of the index. For example, Active - Indexed or Inactive - Indexed.
  • Document Breakdown—the number of indexed documents.

dtSearch Index Information

The dtSearch Index Information section provides general details about the settings applied to your dtSearch index. This section has the following information:

  • Name—the name of your index.
  • Order—the integer value, positive or negative, representing the position of the index in the search indexes list. Indexes sort from the lowest order number (top) to the highest (bottom). Those with the same order number sort alphanumerically.
  • Searchable set—the set of documents to be indexed. You can choose from any saved search in the workspace.
  • Email notification recipients—the emails that receive an email notification when your index population fails or completes.
  • Notes—the notes added by the user that created the index.

Searchable Set - Current Indexable Fields

The fields listed may not match those indexed if someone made updates after indexing the searchable set.

Advanced Settings

The Advanced Settings section provides sub-index details about your dtSearch index. This section has the following information:

  • Auto-recognize date, email, and credit card numbers—a yes/no field.
  • Create accent sensitive—a yes/no field. When set to Yes, dtSearch indexes are sensitive to accents and other language-specific characters.
  • Skip malicious files—a yes/no field. Select Yes to skip potentially malicious quarantined files and continue processing the index. The final index breakdown includes the number of skipped documents.
  • Use Default Sub-Index Size—a yes/no field that determines whether to use the default 16 GB size or a custom size. See the Custom Sub-Index Size field.
  • Index share—the location of the file share for storing the search index. Select the bold (default) location. You will see a warning about system performance if you select a location other than the default.

Noise Words

The list of words that Relativity ignored during indexing.

Alphabet

The index’s alphabet file. See Making a character searchable.

Note: If you search for long, uninterrupted strings that have no spaces or word breaks, such as when you have made a character searchable, dtSearch truncates the string after 32 characters and inserts a wildcard. For more information, see Searching for words longer than 32 characters.

Temporary Index Details

The Temporary Index Details section only appears during an incremental build. This table displays sub-indexes that copy from your original index and are in the process of modification during the incremental build. Once the sub-indexes in this table update, they replace the original sub-indexes from which they were copied. This section has the following information:

  • Population Table—the name of the table that a sub-index is populating.
  • Build Status—the state that the sub-index is currently in.
  • Worker Agent—the name of the agent that is handling the sub-index.
  • Worker Agent Status—the current state of the worker agent.
  • Index File Share—the location where you store your sub-index.
  • Document count—the number of documents assigned to the sub-index.
  • Error(s)—any errors encountered by the sub-index.
  • Fragmentation Level—the fragmentation level of the sub-index. Any index equal to or greater than the sub-index fragmentation threshold appears in red.

Current Index Details

The Current Index Details section displays the sub-indexes that make up your dtSearch index. This section has the following information:

  • Population Table—the name of the table that a sub-index is populating.
  • Build Status—the current state of the sub-index.
  • Worker Agent—the name of the agent that is handling the sub-index.
  • Worker Agent Status—the current state of the worker agent.
  • Index File Share—the location where you store your sub-index.
  • Document count—the number of documents assigned to the sub-index.
  • Error(s)—any errors encountered by the sub-index.
  • Fragmentation Level—the fragmentation level of the sub-index. Any index equal to or greater than the sub-index fragmentation threshold appears in red.
  • Rebuild Selected Sub-Indexes—manually rebuilds selected sub-indexes. Do not use this option unless directed by the Support team.

View Audit

Using the View Audit button, you can see when a user modified dtSearch index settings. The View Audit layout has the following fields:

  • Action
  • Field
  • Old Value
  • New Value
  • User Name
  • Timestamp
  • Details

Temporary storage

If you specify a temporary storage location, dtSearch builds the index in this directory and then copies the index over to the final index share when the build completes. Using a temporary storage location could speed up the build time and reduce network contention.
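The build-then-copy pattern described above can be illustrated with Python's standard library. This is a generic sketch, not dtSearch's implementation: the `build_then_copy` function, the directory names, and the file names are all hypothetical, and writing a few bytes stands in for the real build.

```python
import shutil
import tempfile
from pathlib import Path


def build_then_copy(temp_root: Path, final_share: Path, files: dict[str, bytes]) -> None:
    """Write index files in a scratch directory under temp_root, then copy them to final_share."""
    with tempfile.TemporaryDirectory(dir=temp_root) as scratch:
        scratch_dir = Path(scratch)
        for name, payload in files.items():
            # Stand-in for the real build: every write goes to temporary storage only.
            (scratch_dir / name).write_bytes(payload)
        # One copy to the final share after the "build" has finished.
        shutil.copytree(scratch_dir, final_share, dirs_exist_ok=True)


with tempfile.TemporaryDirectory() as root:
    temp_root = Path(root) / "temp_storage"
    temp_root.mkdir()
    final_share = Path(root) / "index_share" / "sub_index_1"
    build_then_copy(temp_root, final_share, {"part_a.ix": b"aaa", "part_b.ix": b"bbb"})
    copied_names = sorted(p.name for p in final_share.iterdir())
    leftover_scratch = list(temp_root.iterdir())

print(copied_names, leftover_scratch)
```

Because the many small writes of a build stay on temporary storage, the final share sees only one bulk copy, which is the reason the pattern may reduce network contention.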

dtSearch console

The dtSearch index console includes the following options:

  • Build Index: Full—creates a full build of the index. During the build, the button toggles to Cancel Build.
    Note: You must perform a full build when you add an additional field to the index, change any index settings, change fields of the searchable set, or overlay text on existing fields.
    • Cancel build—canceling the build stops the indexing thread, leaving the index in an unstable state.
      • Relativity deletes these indexes from the population table and gives them an inactive status.
      • You cannot search on an index with an inactive status until you run a full build.
      • Canceling also deletes the index files in the index share.
  • Build Index: Incremental—updates an index after adding or removing documents. 
    • Incremental builds do not modify the text of any documents included in previous builds, even if the original fields are no longer returned in the source search.
    • Note: If you are modifying the text of already indexed documents, you must either run a full build or re-index documents with an incremental build. If you do not want to perform a full build, you can follow the Re-index documents with incremental builds workflow to remove the impacted documents from the index. Then add them back with the new text.
    • During an incremental build the existing index remains available for searching, but changes to the index are not reflected in search results until the incremental build completes.
    • Canceling an incremental build returns the index to its previous state.
    • The incremental build process copies each sub-index that requires modification, updates the copy, then replaces existing sub-indexes with the updated copies.
      • When run, the Case manager agent removes duplicate sub-indexes.
      • The system automatically compresses a sub-index during an incremental build only if the sub-index fragmentation level is equal to or greater than the sub-index fragmentation threshold value.
  • Compress Index—compresses the dtSearch index, returning all sub-indexes with a fragmentation level greater than zero to a fragmentation level of zero.
    • The Compress Index button only runs compression on sub-indexes that have a fragmentation level greater than zero.
    • You can search on the original, uncompressed, dtSearch index while compression is in progress.
    • Once compression completes, the system automatically replaces the old sub-indexes with the defragmented sub-indexes.
    • When run, the Case manager agent removes duplicate sub-indexes.
    • Canceling compression returns the index to its original fragmented state before compression began.
  • Deactivate Index—deactivates the index and removes it from the search with menu in the Documents tab, but not from the database.
  • Swap Index—swaps your index with a replacement index to use its resources while your index builds or is inactive or disabled for any reason.
    • This enables you to keep searching while your primary index experiences downtime.
    • The Swap Index function updates anything in the Views table, which affects batches, saved searches, nested searches, and more.
    • In the Replacement Index drop-down list, you can select only indexes with an Active status.
    • The index you swap to does not automatically run an incremental update.
    • Selecting the index from the drop-down list and clicking OK completes the index swap.
    • You cannot reverse the swap results in the current dialog box.
    • You must close this swap and run it again to swap back or swap another time.
    • This functionality is useful in limited cases. For example, if you are performing a full rebuild on a very large index.
    • Since dtSearch incremental builds are online, you can search documents once indexed.
  • Retry Errors—enables only if errors occur. You can use this button to try to resolve errors.
  • Show Document Errors—enables only if document errors occur. This button creates an exportable list of document-level errors.
  • Show Detailed Status—shows you statistical data for the index, including:
    • Doc Count—the total number of documents in the index.
    • Index Size—the size of the index in bytes.
    • Created Date—the date you created the index.
    • Updated Date—the date you updated the index.
    • Last Build Duration—how long the last build took to complete in hours, minutes, and seconds.
  • Refresh Page—shows the index's current build status.
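The console options above describe two different compression rules: an incremental build compresses a sub-index only when its fragmentation level is equal to or greater than the sub-index fragmentation threshold, while Compress Index acts on every sub-index with a fragmentation level greater than zero. The sketch below contrasts the two. The `SubIndex` record, the sample names, the numeric levels, and the threshold of 40 are all hypothetical values chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SubIndex:
    """Hypothetical record: a sub-index name and its fragmentation level."""
    name: str
    fragmentation: int


def compressed_by_incremental_build(sub_indexes: list[SubIndex], threshold: int) -> list[str]:
    # Incremental builds compress only at or above the threshold.
    return [s.name for s in sub_indexes if s.fragmentation >= threshold]


def compressed_by_compress_index(sub_indexes: list[SubIndex]) -> list[str]:
    # Compress Index handles anything with fragmentation above zero.
    return [s.name for s in sub_indexes if s.fragmentation > 0]


sub_indexes = [SubIndex("sub_1", 0), SubIndex("sub_2", 12), SubIndex("sub_3", 40), SubIndex("sub_4", 75)]
during_incremental = compressed_by_incremental_build(sub_indexes, threshold=40)
during_compress = compressed_by_compress_index(sub_indexes)
print(during_incremental, during_compress)
```

The same threshold comparison (equal to or greater than) determines which sub-indexes appear in red in the Fragmentation Level column.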

Re-index documents with incremental builds

In large workspaces, dtSearch indexes can take a long time to build, which leaves users unable to run dtSearch search terms. In these situations, incremental builds are preferable to full builds because users can run searches while the index builds. However, incremental builds only add or remove documents. They do not update the text of existing documents. You can use the following workflow as a workaround.

Due to Relativity's structure of sub-indexes, the following workflow will re-index existing documents to bring in updated extracted text or other modified fields for already indexed documents:

  1. Change the dtSearch index's searchable set to a search that returns only the documents you want to keep, that is, a search that excludes the documents you want to re-index.
  2. Perform an incremental build. As of Search Grid’s implementation, incremental builds remove documents that are no longer in the searchable set.
  3. Add the documents you want to re-index back to the searchable set so that they can be indexed again.
  4. Perform an incremental build to index the new documents that were added back to the search.

Note: The dtSearch index will be active and searchable during the incremental builds. This means search terms will not contain hits for the documents that are being re-indexed until step 4 is completed.
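The reason this workflow works can be shown with a toy model of an incremental build. The model below is an assumption-laden simplification: the index is a plain dictionary of document ID to indexed text, `incremental_build` is a hypothetical function, and sub-indexes are ignored. It captures only the three documented behaviors: new documents are added, documents no longer in the searchable set are removed, and the text of already indexed documents is left untouched.

```python
def incremental_build(index: dict[int, str], searchable_set: dict[int, str]) -> dict[int, str]:
    """Toy model of an incremental build (hypothetical, for illustration only)."""
    # Keep documents still in the searchable set, with their OLD indexed text.
    updated = {doc_id: text for doc_id, text in index.items() if doc_id in searchable_set}
    # Add documents that are in the searchable set but not yet indexed.
    for doc_id, text in searchable_set.items():
        if doc_id not in updated:
            updated[doc_id] = text
    return updated


workspace = {1: "alpha", 2: "beta", 3: "gamma"}
index = incremental_build({}, workspace)

# Document 2's text changes after it was indexed.
workspace[2] = "beta revised"

# A plain incremental build does not pick up the new text.
stale = incremental_build(index, workspace)

# Steps 1 and 2: searchable set without document 2, then incremental build.
without_doc_2 = {doc_id: text for doc_id, text in workspace.items() if doc_id != 2}
index = incremental_build(index, without_doc_2)
after_removal = dict(index)

# Steps 3 and 4: add document 2 back, then incremental build.
index = incremental_build(index, workspace)

print(stale[2], sorted(after_removal), index[2])
```

Between the two builds, document 2 is absent from the index, which matches the note above: its hits are missing until the final incremental build completes.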