

dtSearch includes special characters and other operators used to define search criteria. The following table lists the syntax options available for queries that run against a dtSearch index. Click the search functionality name for more details on the syntax use.
Search functionality | Special characters or operators |
---|---|
Auto-recognition of dates, emails, credit cards | date(), mail(), creditcard() |
Boolean operators | AND, OR, NOT |
Built-in search words | xfirstword, xlastword |
Connector words | and, or, not, to, contains |
Exact phrase - double quotes | " " |
Operator precedence | () |
Exact phrase - no double quotes | |
Fuzzy searching | % |
Noise words and the alphabet file | Noise Words, Alphabet |
Numerical patterns | = |
Phonic searching | # |
Regular expressions (Redirects to another topic.) | "##" |
Stemming | ~ |
Wildcards | ?, * |
W/N operator | W/N |
Proximity with terms order | PRE |
Words and phrases | |
For the list of the special characters recognized as spaces that cause word breaks, see Noise words and the alphabet file.
Auto-recognition provides you with the ability to search for various date formats, email addresses, and credit card numbers. However, it can dramatically affect indexing and searching performance. You must activate auto-recognition before you can use it in your workspace. Contact your system administrator for more information.
Date recognition searches for strings that appear to be dates. It uses English-language months, including common abbreviations, and numerical formats. For example, dtSearch recognizes the following date formats:
Note: The short month format (Jan, Feb, and so forth) can be problematic and is occasionally rejected by Relativity. To avoid errors, use the full name of the month, such as January or February.
Note the following date and date range search strings:
dtSearch recognizes numeric strings as dates, as long as they can be interpreted as valid dates. This includes formats common in the US and UK, including:
In the case of ambiguous dates, such as 01/05/10, dtSearch defaults to MM/DD/YY. If the date contains words, dtSearch converts the words to numeric values to help interpret the date. For example, 30 must be a day and not a month, and 2015 must be a year, not a day or month.
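The disambiguation rule above can be pictured with a short Python sketch. It is an illustration only, not dtSearch's implementation: the function name `interpret_numeric_date` is hypothetical, only slash-separated dates with the year last are handled, and mapping two-digit years to the 2000s is an assumption made for the example.

```python
from datetime import date

def interpret_numeric_date(text: str) -> date:
    """Interpret 'A/B/Y', preferring MM/DD when both readings are possible."""
    first, second, year = (int(part) for part in text.split("/"))
    if year < 100:
        year += 2000  # assumption for this sketch: two-digit years are 20YY
    if first > 12:
        # A value above 12 cannot be a month, so read the string as DD/MM.
        day, month = first, second
    else:
        # Ambiguous or clearly month-first: default to MM/DD.
        month, day = first, second
    return date(year, month, day)

ambiguous = interpret_numeric_date("01/05/10")      # read as January 5
day_first = interpret_numeric_date("30/01/2015")    # 30 can only be a day
```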
Email address recognition searches for text with the syntax of a valid email address, such as sales@example.com. With this feature, you can search for a specific email address regardless of the alphabet settings for "@", ".", or other punctuation in the email address.
You can also use the word listing functions in dtSearch to enumerate all email addresses in a document collection. To do so, you must include either the * or ? wildcard expression.
Credit card number recognition searches for any sequence of numbers that matches the syntax for a valid credit card number issued by a major company, such as Visa and MasterCard. dtSearch recognizes a credit card number regardless of the pattern of spaces or punctuation embedded in the number:
Credit card issuers use numerical tests to exclude sequences of numbers that are not valid credit card numbers. Since these tests do not detect all invalid numbers, the feature for credit card number recognition may find additional invalid numbers.
To search for a credit card number, enter a credit card number between the parentheses in creditcard() as exemplified in creditcard(1234*).
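The source does not name the numerical tests involved. The Luhn checksum is the check commonly applied to card numbers, so the following Python sketch uses it as an assumed example of such a test. The function name `luhn_valid` is hypothetical. Note that passing the checksum only means a number is well formed, which is why invalid numbers can still appear as hits.

```python
def luhn_valid(candidate: str) -> bool:
    """Return True if the digits in candidate pass the Luhn checksum."""
    # Ignore spaces and punctuation, mirroring recognition "regardless of
    # the pattern of spaces or punctuation embedded in the number".
    digits = [int(ch) for ch in candidate if ch.isdigit()]
    if not digits:
        return False
    total = 0
    for position, digit in enumerate(reversed(digits)):
        if position % 2 == 1:
            # Double every second digit, counting from the right.
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0
```

For instance, the widely used test number 4111 1111 1111 1111 passes this check with or without the spaces, while changing its last digit makes it fail.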
The dtSearch engine supports Boolean operators, including AND, OR, and NOT. You can use these operators to connect multiple phrases or terms in a single search expression.
Note: When using Boolean operators in a proximity search, dtSearch includes noise words. Although noise words are not searchable, a proximity search still counts them.
Note: For details on parsing proximity and Boolean strings in search conditions, see the knowledge base article dtSearch - How are Proximity and Boolean (AND/OR) parsed in search conditions? on the Relativity Community site.
When you use the AND operator to connect expressions, only documents that contain all the expressions in the search string return in the result set. The following search strings illustrate how to use this operator:
dtSearch includes the following built-in search words:
You can use these terms to limit a search to the beginning or end of a file. For example, apple W/10 xlastword searches for apple within 11 words of the end of a document.
The dtSearch connector words include:
To search for a phrase that contains one of the dtSearch connector words, quote a connector word or the phrase it is in, or put a tilde after the connector. The following search strings work in returning phrases that contain connector words:
Note the following:
You must use double quotes when searching for exact phrases that contain dtSearch reserved operator words, such as the Boolean connectors AND and OR. For example:
Note: Connector words such as and and not are in the noise word list by default. Because dtSearch does not index noise words, you must remove these words from the list before dtSearch can index them.
Search string: clear and present danger
Search string: "clear and present danger"
Note: Do not confuse the parentheses function for order of precedence with the double quotes function.
You can combine a search for required search terms with other optional terms. The words before the AndAny connector constitute required search terms, and the words after the AndAny connector are optional. A document only returns if it contains at least the required search terms. For example, (apple and pear) AndAny (grape or banana) finds any document that contains apple and pear. The terms grape and banana also count as hits, but only if apple and pear are also present in the document.
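As a rough model of this behavior, the Python sketch below treats a document as a set of words. The function name `and_any_hits` and the set-based document model are assumptions made for illustration. They are not how dtSearch stores or evaluates documents.

```python
def and_any_hits(doc_words, required, optional):
    """Model 'required AndAny optional' for a simple all-terms requirement.

    Returns the set of terms counted as hits, or None when the document
    would not be returned because a required term is missing.
    """
    words = set(doc_words)
    if not all(term in words for term in required):
        return None
    # Optional terms count as hits only once the required terms are present.
    return set(required) | (words & set(optional))

required = ["apple", "pear"]
optional = ["grape", "banana"]
with_optional = and_any_hits(["apple", "pear", "grape"], required, optional)
required_only = and_any_hits(["apple", "pear"], required, optional)
missing_required = and_any_hits(["apple", "grape"], required, optional)
```

In this model, a document with only apple and grape is not returned at all, even though it contains one of the optional terms.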
The following example further explains the AndAny operator:
You have three documents, each containing the terms specified below:
Note the following behavior:
When you use the OR operator to connect expressions in a search string, documents that contain one or more of these expressions return in the result set. For example, the search string apple pie or poached pear returns documents that contain apple pie, poached pear, or both phrases.
In a dtSearch, you can use the NOT operator at the beginning of a search expression to negate its meaning and exclude documents from a result set. For example, the search expression applesauce and NOT pear returns documents that contain the word applesauce, but not those documents that contain both the words applesauce and pear.
Note: You can also use NOT in a proximity search as illustrated by the NOT W/N, NOT Within N words, operator.
Document | OCR | Recipient | Author |
---|---|---|---|
AS00001 | From: Ryan To: Will | Will | Ryan |
You can perform a dtSearch using the search string Ryan AND NOT Will and return results that do not include document AS00001.
The dtSearch engine combines the text of all fields identified for inclusion in an index into a single pool. A search string using the AND NOT operator queries the index that includes the combined text from all indexed fields, rather than querying the content of individual fields. This behavior ensures consistent result sets when querying with the AND NOT operator.
Note: A keyword search is an SQL full text search, which queries individual fields. Keyword searches do not return the same results as dtSearch when using the NOT operator to query across multiple fields. See NOT operator.
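The difference can be sketched in Python using document AS00001 from the table above. Both functions are simplified, assumed models: `pooled_and_not` stands in for a search over the combined text of all indexed fields, and `per_field_and_not` stands in for a search that evaluates each field separately. Neither reflects the actual dtSearch or SQL implementation.

```python
AS00001 = {
    "OCR": "From: Ryan To: Will",
    "Recipient": "Will",
    "Author": "Ryan",
}

def words_of(text):
    # Minimal tokenizer for this sketch: lowercase, treat ':' as a space.
    return text.lower().replace(":", " ").split()

def pooled_and_not(fields, wanted, unwanted):
    """Evaluate 'wanted AND NOT unwanted' over one pool of all field text."""
    pool = words_of(" ".join(fields.values()))
    return wanted.lower() in pool and unwanted.lower() not in pool

def per_field_and_not(fields, wanted, unwanted):
    """Evaluate the same expression one field at a time."""
    for text in fields.values():
        words = words_of(text)
        if wanted.lower() in words and unwanted.lower() not in words:
            return True
    return False

pooled_result = pooled_and_not(AS00001, "Ryan", "Will")
per_field_result = per_field_and_not(AS00001, "Ryan", "Will")
```

In the pooled model the document is excluded because Will appears somewhere in the combined text. In the per-field model the Author field alone satisfies the expression, which illustrates why the two search types can disagree.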
The precedence, or order of evaluation, determines how a group of expressions evaluates in a query.
Note: By default, dtSearch evaluates OR expressions before AND expressions: A AND (B OR C). Unlike dtSearch, the order of precedence for a keyword search evaluates AND expressions before OR expressions: (A AND B) OR C. See Keyword search.
Evaluation order for the search string: apple AND pear OR grape
Documents containing the following terms return:
Parentheses allow you to group expressions and control the order of query string execution where the query string contains both AND and OR operators. dtSearch requires both AND and OR operators for the parentheses to affect query results and ignores parentheses when the query string does not contain both operators.
For query strings containing both AND and OR operators, dtSearch evaluates OR first before AND. However, expressions contained within parentheses take precedence. If you want AND evaluated before OR, place the AND expression within parentheses.
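To see how the two default evaluation orders differ for apple AND pear OR grape, the following Python sketch hard-codes both groupings and applies them to three hypothetical documents. The function names and the sample documents are invented for illustration.

```python
def dtsearch_order(words):
    # OR is evaluated first: apple AND (pear OR grape)
    return "apple" in words and ("pear" in words or "grape" in words)

def keyword_order(words):
    # AND is evaluated first: (apple AND pear) OR grape
    return ("apple" in words and "pear" in words) or "grape" in words

docs = {
    "doc1": {"apple", "pear"},
    "doc2": {"apple", "grape"},
    "doc3": {"grape"},
}

dtsearch_hits = sorted(name for name, words in docs.items() if dtsearch_order(words))
keyword_hits = sorted(name for name, words in docs.items() if keyword_order(words))
```

Only the grouping that evaluates AND first returns doc3, which contains grape alone. Writing grape OR (apple AND pear) in dtSearch produces that grouping explicitly.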
Evaluation order for the search string: grape OR (apple AND pear)
dtSearch returns documents containing the following terms:
Use a proximity operator to separate query expressions. For example, insert a PRE proximity operator between each expression of the search string.
Evaluation of the search phrase: (grape OR apple) PRE/1 (banana OR pear)
dtSearch returns documents containing the following terms:
Evaluation of the search phrase: (grape OR apple) (banana OR pear)
dtSearch ignores the parentheses and analyzes the query as grape OR apple banana OR pear and returns documents with the following terms:
Searching for words next to each other with no operator between them constitutes an exact phrase in dtSearch. For example, if you search for apple pear, dtSearch returns documents that contain the exact phrase apple pear. There is no rule that requires double quotes around a phrase of any number of words. You only need to use double quotes when searching for a word that is a dtSearch operator. For more details, see Exact phrase - double quotes.
Search string: pear orange
Search string: apple grape banana
Returns the exact phrase: apple grape banana
Does not return partial phrase: apple grape
Does not return standalone word: grape banana
Using the dtSearch engine, you can perform fuzzy searches, which return documents containing spelling variations of a specified term. You may want to use fuzzy searching when querying documents that contain misspelled terms, typographical errors, or text scanned with Optical Character Recognition (OCR).
The percent sign (%) is the character used for fuzzy searches. The number of % characters indicates how many characters in the search term the dtSearch engine ignores when it runs the query. The position of the first % indicates the number of characters from the beginning of the term that must match exactly with words in the result set. The following search strings illustrate how to use this character:
In Relativity, you can use the fuzziness character (%) or the Fuzziness Level menu to perform fuzzy searches. The availability of these search options depends on the location where you are running a dtSearch:
In the Fuzziness Level menu, you can select a value from 1 to 10, which applies to all terms in the text box. Larger numbers return terms with more variation. We recommend using values from 1 to 3 for moderate error tolerance. The following table describes the expected results for sample settings.
Fuzziness level | Description of search results |
---|---|
Blank | Only returns the entered term. |
1 | Returns slight variations of the entered term. |
4 | Returns multiple variations of the entered term. |
Note: The Fuzziness Level menu is independent of the fuzziness (%) character that you can enter in the text box. A search for appl% without a Fuzziness Level setting may return documents containing apple or apply, since these terms have the stem appl and differ by one character.
Fuzzy searching uses term length and fuzziness level to decide how many % characters to add, so a level does not map directly to a number of characters. For example, a level seven fuzziness search does not necessarily return terms with up to seven additional characters.
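For the % character itself (as opposed to the Fuzziness Level menu), the position and count rules can be sketched in Python. This is a simplified model under stated assumptions: each % is treated as one permitted single-character edit (Levenshtein distance), which the source does not specify, and the names `parse_fuzzy`, `edit_distance`, and `fuzzy_match` are hypothetical.

```python
def parse_fuzzy(term):
    """Split a term such as 'appl%' into (word, exact_prefix_length, tolerance)."""
    prefix_len = term.index("%") if "%" in term else len(term)
    tolerance = term.count("%")
    return term.replace("%", ""), prefix_len, tolerance

def edit_distance(a, b):
    """Classic Levenshtein distance between two strings."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]

def fuzzy_match(term, candidate):
    word, prefix_len, tolerance = parse_fuzzy(term)
    # Characters before the first % must match exactly.
    if candidate[:prefix_len] != word[:prefix_len]:
        return False
    # Assumption: each % allows one edit anywhere in the word.
    return edit_distance(word, candidate) <= tolerance
```

Under this model, appl% matches apple and apply, as in the note above, but not ample, because the first four characters must match exactly.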
The dtSearch engine references a default list of noise words and an alphabet file when it creates a new index. The dtSearch index excludes the noise words to improve query performance and prevent unnecessary index growth. When you run a query, dtSearch ignores words such as AND, THE, and WILL. The alphabet file determines how queries handle characters and spaces.
Note: If your dtSearches do not return expected results, you may want to contact your system administrator to adjust the noise word list or alphabet file.
The dtSearch engine uses an alphabet file to define which characters to treat as text, which cause word breaks, and which to ignore. System administrators can modify the default alphabet file when they create or edit a dtSearch index. See Making a special character searchable.
The alphabet file determines which characters to treat as text, which cause spaces, which cause word breaks, and which to ignore. The categories of items in the alphabet file include:
\09—horizontal tab
\0a—line feed
\0c—form feed
\0d—carriage return
\5c—backslash (\)
Note: Do not remove these Unicode characters from your alphabet file.
Note: dtSearch does not recognize the underscore (_) as a space by default. Check the [Spaces] section to ensure that any character you want to treat as a word separator is properly defined in dtSearch.
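The effect of the space category on word breaks can be pictured with a small Python tokenizer. This is a sketch under assumptions: it treats a plain space plus the control characters and backslash listed above as separators, and every other character as text. The names `DEFAULT_SPACES` and `tokenize` are hypothetical, and a real alphabet file has more categories than this model.

```python
# Assumed separator set for this sketch: space, tab, line feed, form feed,
# carriage return, and backslash.
DEFAULT_SPACES = frozenset({" ", "\t", "\n", "\f", "\r", "\\"})

def tokenize(text, spaces=DEFAULT_SPACES):
    """Split text into words, breaking only on characters in `spaces`."""
    words, current = [], []
    for ch in text:
        if ch in spaces:
            if current:
                words.append("".join(current))
                current = []
        else:
            current.append(ch)
    if current:
        words.append("".join(current))
    return words

default_words = tokenize("docs\\file_name\tfinal")
# Adding the underscore to the separators changes where words break.
underscore_words = tokenize("docs\\file_name\tfinal", DEFAULT_SPACES | {"_"})
```

With the assumed defaults, file_name stays a single word. Once the underscore is treated as a space, it becomes two words, file and name, which changes what a search for either term can find.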
The following table shows the default noise words list. System administrators can modify this list when they create or edit a dtSearch index. If you need to search for a phrase that contains a term in the noise words list, you must remove the term from the list and rebuild your index.
Begins with... | Noise words |
---|---|
A | a, about, after, all, also, an, and, another, any, are, as, at |
B | be, because, been, before, being, between, both, but, by |
C | came, can, come, could |
D | did, do |
E | each, even |
F | for, from, further, furthermore |
G | get, got |
H | had, has, have, he, her, here, hi, him, himself, his, how, however |
I | i, if, in, indeed, into, is, it, its |
J | just |
L | like |
M | made, many, me, might, more, moreover, most, much, must, my |
N | never, not, now |
O | of, on, only, or, other, our, out, over |
S | said, same, see, she, should, since, some, still, such |
T | take, than, that, the, their, them, then, there, therefore, these, they, this, those, through, thus, to, too |
U | under, up |
V | very |
W | was, way, we, well, were, what, when, where, which, while, who, will, with, would |
Y | you, your |
Note: You can make special characters searchable in a dtSearch index. However, you must escape some characters when using regular expressions. For more information, see Searching for symbols.
Note: You must also begin with a space.
Note: If you make any symbol a searchable character in your dtSearch index and then build an index on a long, uninterrupted search string, such as a file path, dtSearch truncates the string after the 32nd character. For more information, see Searching for words longer than 32 characters.
To search for other numerical patterns such as social security numbers, you can use the = wildcard, which matches any single digit. For example, if you include hyphens as spaces, then the following search request would find U.S. social security numbers:
=== == ====
This search pattern can return false hits. For example, no valid social security number begins with nine. However, this is the only way to find social security numbers written with spaces instead of dashes.
Note: dtSearch support notes that the === == ==== notation is higher performing than a regular expression for the same pattern, assuming you are comfortable with getting some false hits.
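The following Python sketch models how the === == ==== pattern behaves when hyphens are treated as spaces. It is an approximation for illustration: the function name `find_digit_pattern` is hypothetical, the pattern is assumed to contain only = characters, and the text is split on whitespace and hyphens only.

```python
import re

def find_digit_pattern(text, pattern="=== == ===="):
    """Find runs of consecutive words whose digit counts match the pattern."""
    # Each group of '=' characters stands for a word of that many digits.
    shape = [len(group) for group in pattern.split()]
    # Treat hyphens as spaces, as the search request above assumes.
    tokens = re.split(r"[\s-]+", text.strip())
    hits = []
    for start in range(len(tokens) - len(shape) + 1):
        window = tokens[start:start + len(shape)]
        if all(tok.isdigit() and len(tok) == size
               for tok, size in zip(window, shape)):
            hits.append(" ".join(window))
    return hits

hits = find_digit_pattern("SSN 123-45-6789 or 987 65 4321")
```

Both numbers in the sample text match, including the one that begins with nine, which shows the kind of false hit described above.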
Using the dtSearch engine, you can perform phonic searching, which returns documents containing words that sound like the word you are searching for and begin with the same letter. The pound sign (#) is the character used for phonic searches when added to the front of a word. For example, a phonic search for #pear also finds pair and pare.
You can also use phonic searching in Dictionary searches.
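The source does not say which phonetic algorithm dtSearch uses. To illustrate the idea of matching words that sound alike and begin with the same letter, the Python sketch below swaps in classic Soundex, a well-known phonetic code that keeps the first letter. Treat it as an analogy, not as the engine's actual method. The function name `soundex` is chosen for this example.

```python
CODES = {}
for letters, digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                       ("l", "4"), ("mn", "5"), ("r", "6")):
    for letter in letters:
        CODES[letter] = digit

def soundex(word):
    """Return the four-character classic Soundex code for a word."""
    letters = [ch for ch in word.lower() if ch.isalpha()]
    if not letters:
        return ""
    code = letters[0].upper()
    previous = CODES.get(letters[0], "")
    for ch in letters[1:]:
        digit = CODES.get(ch, "")
        # Skip repeated codes; vowels reset the comparison, h and w do not.
        if digit and digit != previous:
            code += digit
        if ch not in "hw":
            previous = digit
    return (code + "000")[:4]

def sounds_like(a, b):
    return soundex(a) == soundex(b)
```

With this stand-in, pear, pair, and pare share one code, while peach does not.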
Using the dtSearch engine, you can perform stemming searches, which return documents containing grammatical variations of a root word. Stemming is limited to English. The tilde (~) is the character used for stemming searches when added at the end of the root word. For example, a search on apply~ returns documents containing the words apply, applying, applies, and applied. After you perform a stemming search, you can enter applied in the Find Next box, and then click the Find Next icon to locate hits or grammatical variations.
Because stemming only works with the root word, it generally does not return irregular variations of a verb. For example, a search on run~ would not return ran. The dtSearch engine only supports stemming for the English language.
In Relativity, you can use the stemming character (~) or the Enable Stemming checkbox to perform stemming searches. The availability of these search options depends on where you are running a dtSearch:
The Enable Stemming checkbox is independent of the stemming (~) character that you can enter in the Search Terms box or Dictionary Search text box. A search for apply~ with the Enable Stemming checkbox unselected returns apply, applied, applies, or applying. A search for apply with the Enable Stemming checkbox selected returns the same results.
With both fuzzy searching and stemming enabled, dtSearch checks for a fuzzy match twice: once on the original term, and once comparing the stemmed search word with the stemmed word in the index. A match on either counts as a hit.
The dtSearch engine supports special characters that you can use as wildcards. It also supports the use of leading wildcards, or those added to the beginning of a word. The following characters represent wildcards in dtSearches:
Special character | Function |
---|---|
? | Matches any single character. |
* | Matches any number of characters. Note: This character slows searches when used near the beginning or middle of a word. |
~ | Matches words containing grammatical variations of a root word. The tilde (~) is the stemming character available in dtSearches. See Stemming. |
= | Matches any single digit (for example, === == ==== for Social Security Numbers). See Numerical Patterns. |
As illustrated in the following table, you can add wildcards to the root of any word to return matching terms from a dtSearch.
Sample search string | Description of search results |
---|---|
appl* | Matches apple, application. |
*cipl* | Matches principle, participle. |
appl? | Matches apply and apple, but not apples. |
ap*ed | Matches applied, approved. |
apply~ | Matches apply, applied, applies. |
=th | Matches 4th, 5th, 6th, 7th, 8th. |
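
One way to reason about the ?, *, and = wildcards in the table above is to translate them into a regular expression applied to a single word. The Python sketch below does that as an approximation. The names `wildcard_to_regex` and `matches` are hypothetical, the stemming character (~) is left out, and case-insensitive matching is an assumption.

```python
import re

def wildcard_to_regex(term):
    """Translate ?, *, and = into an equivalent whole-word regex."""
    parts = []
    for ch in term:
        if ch == "?":
            parts.append(".")      # any single character
        elif ch == "*":
            parts.append(".*")     # any number of characters
        elif ch == "=":
            parts.append(r"\d")    # any single digit
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.IGNORECASE)

def matches(term, word):
    # fullmatch: the pattern must cover the entire word.
    return wildcard_to_regex(term).fullmatch(word) is not None
```

Checking the sample rows against this model: appl? accepts apple but rejects apples, and =th accepts 4th but rejects nth.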
You can use the W/N (within N words) operator to return documents in which two words or phrases occur within a certain proximity of each other. When using Boolean operators in a proximity search with the W/N operator, dtSearch includes noise words. The N value represents the number of intervening words. For example, the search expression apple W/5 pear returns documents that contain apple only when it occurs within five words of pear. The documents returned by the search must contain the terms within the required proximity, such as five words.
The W/N operator is symmetrical. The search expression apple W/5 pear returns exactly the same documents as pear W/5 apple.
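The symmetry of W/N can be seen in a small Python model that works on a list of words. The function name `within` is hypothetical, and measuring distance as the difference between word positions is an assumption of this sketch. Noise words stay in the list because proximity searches still count them.

```python
def within(tokens, left, right, n):
    """Model 'left W/n right' over a list of lowercase words."""
    left_positions = [i for i, tok in enumerate(tokens) if tok == left]
    right_positions = [i for i, tok in enumerate(tokens) if tok == right]
    # The absolute difference makes the test independent of word order.
    return any(abs(i - j) <= n
               for i in left_positions
               for j in right_positions)

tokens = "the apple fell near the old pear tree".split()
forward = within(tokens, "apple", "pear", 5)
backward = within(tokens, "pear", "apple", 5)
too_far = within(tokens, "apple", "tree", 2)
```

Swapping the two terms never changes the result in this model, which mirrors the symmetry described above.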
Note: dtSearch treats single characters as full words when using this operator. For instance, if you search for Harry W/2 Truman, your search retrieves documents that include Harry S Truman or Harry S. Truman.
Note: Relativity does not support the WI operator. Use the W/N syntax to search for documents having words or phrases within a certain proximity of each other.
You can use the NOT W/N, not within N words, operator to exclude documents from a result set when two words or phrases are within a certain proximity of each other.
For example, the search expression apple NOT W/20 pear returns documents that contain apple when separated from pear by at least 20 words. It also returns documents that do not contain pear. Documents that contain apple separated from pear multiple times with varying proximity return as long as there is at least one occurrence where apple is separated from pear by at least 20 words.
The NOT W/N is not symmetrical. The search expression apple NOT W/20 pear does not return the same documents as pear NOT W/20 apple.
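The asymmetry follows from the fact that the first term must be present while the second term only restricts it. The Python sketch below models that reading. The function name `not_within` is hypothetical, and treating "separated by at least N words" as a position difference greater than N is an assumption of this example.

```python
def not_within(tokens, left, right, n):
    """Model 'left NOT W/n right' over a list of lowercase words."""
    right_positions = [j for j, tok in enumerate(tokens) if tok == right]
    # A hit needs at least one occurrence of `left` with no `right` nearby.
    # If `right` never occurs, every occurrence of `left` qualifies.
    return any(all(abs(i - j) > n for j in right_positions)
               for i, tok in enumerate(tokens) if tok == left)

apple_only = ["apple"]
close_pair = ["apple", "pear"]
mixed = ["apple", "pear"] + ["filler"] * 25 + ["apple"]
```

A document containing only apple matches apple NOT W/20 pear but not pear NOT W/20 apple, because the second expression requires pear to be present. The mixed document matches because its last apple is far from any pear, even though its first apple is not.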
You can create complex expressions with the W/N operator by connecting words or phrases. At least one of these expressions must be a single word, phrase, or group of words and phrases connected by an OR operator as illustrated by the following:
Note: You can break up complex expressions with OR connectors into separate searches. Search apple w/10 "orange tree" OR banana w/10 "orange tree" to return the same results as (apple OR banana) W/10 "orange tree".
Avoid creating complex expressions that produce ambiguous results as illustrated in the following examples:
Note: dtSearch displays a warning message when you enter an ambiguous search request.
You can also use the Boolean operators AND and OR to connect proximity expressions as illustrated in the following examples:
Note: When connecting proximity expressions using Boolean operators, you must use parentheses.
You can use the PRE operator to search for a word that appears within a certain number of words before another word.
For example, the search string apple PRE/5 pear returns documents where apple appears within five words before pear.
Note: Relativity does not use the POST operator. However, you can mimic this functionality by reversing the order of the terms, and using the PRE operator.
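The ordered proximity test, and the way reversing the terms mimics POST, can be sketched in Python. As with the other sketches, the function name `pre` is hypothetical and distance is assumed to be the difference between word positions.

```python
def pre(tokens, first, second, n):
    """Model 'first PRE/n second': first occurs within n words before second."""
    first_positions = [i for i, tok in enumerate(tokens) if tok == first]
    second_positions = [j for j, tok in enumerate(tokens) if tok == second]
    # Unlike W/N, the difference is signed: `first` must come earlier.
    return any(0 < j - i <= n
               for i in first_positions
               for j in second_positions)

apple_first = "apple and a ripe pear".split()
pear_first = "pear and a ripe apple".split()
```

For the pear_first text, apple PRE/5 pear does not match, but the reversed expression pear PRE/5 apple does, which is the substitute for a POST search.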
With a dtSearch, you can use double quotes to search for a phrase. For example, the search string apple w/5 "fruit salad" treats fruit salad as a single phrase. The following list outlines how dtSearch queries on words or phrases with noise words or punctuation: