

A processing profile is an object that stores the numbering, deNIST, extraction, and deduplication settings that the processing engine refers to when publishing the documents in each data source that you attach to your processing set. You can create a profile specifically for one set or you can reuse the same profile for multiple sets.
Relativity provides a Default profile upon installation of processing.
You're a litigation support specialist, and your firm has asked you to bring a custodian's data into Relativity without bringing in any embedded Microsoft Office objects or images. You have to create a new processing profile for this because none of the profiles in the workspace are set to exclude embedded images or objects when extracting children from a data set.
To do this, you simply create a new profile with those specifications and select that profile when creating the processing set that you want to use to bring the data into Relativity.
To create or edit a processing profile:
The Processing Profile Information category of the profile layout provides the following fields:
The Numbering Settings category of the profile layout provides the following fields.
When Level numbering is selected as the Numbering Type in the Processing Profile, the prefix corresponds to the PPP section in the PPP.BBB.FFF.NNNN format. It can be used to identify the source or owner of the documents, also known as the ‘party code’ or ‘source’.
In the Number of digits section, you can determine the number of digits to use in each level. For example, selecting 4 in the drop-down list will allow for the following range in that level: 0001 - 9999.
Level 2 (box number)—corresponds to the BBB level in the PPP.BBB.FFF.NNNN format. Default value is 3 digits.
Level 3 (folder number)—corresponds to the FFF level in the PPP.BBB.FFF.NNNN format. Default value is 3 digits.
Level 4 (document number)—corresponds to the NNNN level in the PPP.BBB.FFF.NNNN format. Default value is 4 digits.
Once published, Numbering Type cannot be changed. Thus, Level numbering and data source cannot be changed upon publish, retry, or republish. Non-level numbering cannot be changed to level numbering on a published processing set and then republished.
Create a new Processing Set and add the data sources that you need. If the profile used by the Processing Set is Level Numbering, you can define the start number for each Data Source when adding or modifying data sources to the Processing Set.
When you create a new data source, the system will use # to indicate how many digits were configured for that level in the Processing Profile used on the current Processing Set. If a level was configured to take up to 3 digits, you can enter a start number with no padding (e.g., 1) or with padding (e.g., 001).
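The padded and unpadded forms of a start number describe the same value. A minimal sketch of that normalization follows; the function name `normalize_start_number` is hypothetical and is not part of Relativity, and the lower bound of 1 is an assumption based on the 0001 - 9999 range mentioned earlier.

```python
def normalize_start_number(raw: str, digits: int) -> str:
    """Return a start number zero-padded to the width configured for a level.

    Hypothetical helper for illustration only; not a Relativity API.
    """
    value = int(raw)  # "1" and "001" both parse to the integer 1
    highest = 10**digits - 1  # e.g. 999 for a 3-digit level
    if not 1 <= value <= highest:
        raise ValueError(f"{raw!r} does not fit in a {digits}-digit level")
    return f"{value:0{digits}d}"


print(normalize_start_number("1", 3))
print(normalize_start_number("001", 3))
```

Both calls print the same padded value, because the padding only affects how the number is displayed.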
By using Level Numbering, you can define a prefix text and three numbering levels as the control number to be used on documents that are published. For example:
When creating the control number, each level will be separated by a dot symbol, e.g., REL.001.001.0001.
Each level has a range of numbers that it can support. For example, 01 supports from 01 to 99. On the other hand, 001 supports from 001 to 999.
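As a rough illustration of how the prefix, the dot separator, and the per-level digit counts combine, here is a small sketch. The names `level_max` and `format_control_number` are hypothetical, and the default widths of 3, 3, and 4 digits simply mirror the default values listed above.

```python
from typing import Sequence


def level_max(digits: int) -> int:
    """Highest number a level with the given digit count can hold."""
    return 10**digits - 1


def format_control_number(
    prefix: str,
    levels: Sequence[int],
    digits: Sequence[int] = (3, 3, 4),
) -> str:
    """Join the prefix and the zero-padded level values with dots.

    Illustrative only; this is not how Relativity is implemented.
    """
    if len(levels) != len(digits):
        raise ValueError("one digit width is needed per level")
    parts = [prefix]
    for value, width in zip(levels, digits):
        if not 1 <= value <= level_max(width):
            raise ValueError(f"{value} is outside the range of a {width}-digit level")
        parts.append(f"{value:0{width}d}")
    return ".".join(parts)


print(level_max(2), level_max(3))
print(format_control_number("REL", (1, 1, 1)))
```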
Fields like Family/Group Identifier, Attachments, and Parent ID are created based on the new control number.
The Level Numbering applies only at the document level. For example, if a data source is processing data using 01 for Level 1 numbering, 001 for Level 2 numbering, and 0001 for level 3 numbering, then the corresponding control numbers will be as follows:
Example List of Documents to Process | Resulting Control Number |
---|---|
Doc 1: document with 3 pages | PREFIX.001.001.0001 |
Doc 2: a one-page document | PREFIX.001.001.0002 |
Doc 3: a one-page document | PREFIX.001.001.0003 |
Doc 4: a 5-page document | PREFIX.001.001.0004 |
Doc 5: a 2-page document | PREFIX.001.001.0005 |
Doc 6: an email with no attachments | PREFIX.001.001.0006 |
When a family does not fit on the current level, the whole family rolls over to next level to keep the family together. See the example below:
Number | Document |
---|---|
REL.001.0001.9999 | Excel document |
REL.001.0002.0001 | Word document |
REL.001.0002.0002 | Word document |
9,995 documents later
Number | Document |
---|---|
REL.001.0002.9997 | email with no attachments |
REL.001.0003.0001 | email with 4 attachments |
REL.001.0003.0002 | attachment 1 |
REL.001.0003.0003 | attachment 2 |
REL.001.0003.0004 | attachment 3 |
REL.001.0003.0005 | attachment 4 |
The email with 4 attachments couldn’t use 9998 because the current level had only 2 values left (9998 - 9999). Families are required to stay together in the same level, so the whole family rolls over to the next level.
A family is every document that can be traced to the same parent. Grandchildren are in the same family as children, so grandchildren stay in the same level as the rest of the family.
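The rollover rule can be sketched as follows. This is a simplified model under stated assumptions, not Relativity's implementation: only the last two levels are tracked, `assign_with_rollover` and `label` are hypothetical names, the fixed `REL.001` prefix is taken from the example above, and families larger than one level are rejected rather than suffixed.

```python
from typing import List, Sequence, Tuple


def assign_with_rollover(
    family_sizes: Sequence[int],
    folder: int,
    next_doc: int,
    capacity: int = 9999,
) -> List[List[Tuple[int, int]]]:
    """Assign (folder, document) numbers while keeping each family in one level."""
    assigned = []
    for size in family_sizes:
        if size > capacity:
            raise ValueError("family is larger than one level; not modelled here")
        # Roll the whole family over when it does not fit in the numbers left.
        if next_doc + size - 1 > capacity:
            folder += 1
            next_doc = 1
        assigned.append([(folder, next_doc + i) for i in range(size)])
        next_doc += size
    return assigned


def label(folder: int, doc: int) -> str:
    """Render a pair in the REL.001.FFFF.NNNN shape used in the example."""
    return f"REL.001.{folder:04d}.{doc:04d}"


# A single email at 9997, followed by an email with 4 attachments.
for family in assign_with_rollover([1, 5], folder=2, next_doc=9997):
    print([label(f, d) for f, d in family])
```

The second family needs 5 numbers but only 9998 and 9999 remain, so it starts at the first number of the next level.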
Publish scenario:
Number | Document |
---|---|
REL.001.0001.9999 | Excel document |
REL.001.0002.0001 | Word document |
REL.001.0002.0002 | Word document |
9,993 documents later
Number | Document |
---|---|
REL.001.0002.9995 | email with no attachments |
REL.001.0003.0001 | email with 4 attachments |
REL.001.0003.0002 | attachment 1 from REL.001.0003.0001 |
REL.001.0003.0003 | attachment 2 from REL.001.0003.0002 |
REL.001.0003.0004 | attachment 3 from REL.001.0003.0001 |
REL.001.0003.0005 | attachment 4 from REL.001.0003.0001 |
If a family has more child documents than can fit in a single level, Relativity appends a suffix to the children that overflow.
Publish scenario:
Number | Document |
---|---|
REL.001.0003.0001 | email with 10,000 attachments |
REL.001.0003.0002 | attachment 1 |
REL.001.0003.0003 | attachment 2 |
REL.001.0003.0004 | attachment 3 |
REL.001.0003.0005 | attachment 4 |
[...] | |
REL.001.0003.9999 | attachment 9998 |
REL.001.0003.0001_0001 | attachment 9999 |
REL.001.0003.0001_0002 | attachment 10,000 |
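The overflow behavior in this scenario can be sketched as below. The function `number_large_family` is hypothetical, and it assumes, as in the example, that the parent takes the first number of the level and that overflowing children receive the parent's control number plus a 4-digit suffix.

```python
from typing import List


def number_large_family(level_base: str, n_children: int, capacity: int = 9999) -> List[str]:
    """Number a parent and its children, suffixing children that overflow the level.

    level_base is the control number without its last level, e.g. "REL.001.0003".
    Illustrative sketch only; not a Relativity API.
    """
    parent = f"{level_base}.{1:04d}"
    numbers = [parent]
    for child in range(1, n_children + 1):
        doc = child + 1  # the parent already used number 1
        if doc <= capacity:
            numbers.append(f"{level_base}.{doc:04d}")
        else:
            # No room left in the level: reuse the parent's number with a suffix.
            numbers.append(f"{parent}_{doc - capacity:04d}")
    return numbers


numbers = number_large_family("REL.001.0003", 10_000)
print(numbers[0], numbers[9998], numbers[9999], numbers[10000])
```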
If during Retry-Discover, Relativity finds new children from a password-protected file, then Relativity will publish these children using the parent control number and a suffix appended to it. See the example below:
Initial Publish:
Number | Document |
---|---|
REL.001.001.0001 | |
REL.001.001.0002 | |
REL.001.001.0003 | (Password-protected file) |
REL.001.001.0004 |
Republish:
Number | Document |
---|---|
REL.001.001.0001 | |
REL.001.001.0002 | |
REL.001.001.0003 | (Password-protected file) |
REL.001.001.0003_0001 | new child found in REL.001.001.0003 |
REL.001.001.0003_0002 | new child found in REL.001.001.0003 |
If during Retry-Discover, Relativity finds new children in a document that holds the highest control number in the last level, then Relativity will publish these children with their parent's control number and a suffix appended to it. The family will not be moved to a new folder. See example below:
Initial Publish:
Number | Document |
---|---|
REL.001.001.9997 | |
REL.001.001.9998 | |
REL.001.001.9999 | (Password-protected file) |
REL.001.002.0001 | |
REL.001.002.0002 |
Republish:
Number | Document |
---|---|
REL.001.001.9997 | |
REL.001.001.9998 | |
REL.001.001.9999 | (Password-protected file) |
REL.001.001.9999_0001 | new child found in password-protected file |
REL.001.002.0001 | |
REL.001.002.0002 |
If during Retry-Discover, Relativity finds new children in a document that holds the highest control number in a level, and those children also have children, then Relativity will publish all of these descendants with the original parent's control number and a suffix appended to it. The family will not be moved to a new folder. See the example below:
Initial publish:
Number | Document |
---|---|
REL.001.001.9997 | |
REL.001.001.9998 | |
REL.001.001.9999 | (Password-protected file) |
REL.001.002.0001 | |
REL.001.002.0002 |
Republish:
Number | Document |
---|---|
REL.001.001.9997 | |
REL.001.001.9998 | |
REL.001.001.9999 | (Password-protected file) |
REL.001.001.9999_0001 | new child found in REL.001.001.9999 |
REL.001.001.9999_0002 | new child found in REL.001.001.9999_0001 |
REL.001.001.9999_0003 | new child found in REL.001.001.9999_0001 |
REL.001.001.9999_0004 | new child found in REL.001.001.9999 |
REL.001.002.0001 | |
REL.001.002.0002 |
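The suffixing of children and grandchildren found on retry can be sketched as a depth-first walk that always appends to the original parent's control number. The traversal order is inferred from the example table above and is an assumption, as are the name `suffix_new_descendants` and the nested-tuple input shape.

```python
from typing import List, Tuple

# A node is (description, list of child nodes); this shape is an assumption.
Node = Tuple[str, list]


def suffix_new_descendants(parent_control_number: str, children: List[Node]) -> List[Tuple[str, str]]:
    """Give every new descendant the original parent's number plus a running suffix."""
    result: List[Tuple[str, str]] = []
    counter = 0

    def walk(nodes: List[Node]) -> None:
        nonlocal counter
        for description, grandchildren in nodes:
            counter += 1
            result.append((f"{parent_control_number}_{counter:04d}", description))
            walk(grandchildren)  # grandchildren are numbered right after their parent

    walk(children)
    return result


tree = [
    ("child A", [("grandchild 1", []), ("grandchild 2", [])]),
    ("child B", []),
]
for number, description in suffix_new_descendants("REL.001.001.9999", tree):
    print(number, description)
```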
When Relativity finds new root level documents, Relativity will not suffix them. Instead, Relativity will assign them to the next control number available.
The initial publish received an error on the ZIP container and could publish only two documents from it:
Number | Document |
---|---|
REL.001.001.9997 | |
REL.001.001.9998 | |
REL.001.001.9999 | SourceFolder/containerFile.Zip |
REL.001.002.0001 | SourceFolder/containerFile.Zip/1.txt |
REL.001.002.0002 | SourceFolder/containerFile.Zip/2.txt |
REL.001.002.0003 | SourceFolder/flatDocument |
When retry-discover yields two more documents from the ZIP container:
Number | Document |
---|---|
REL.001.001.9997 | |
REL.001.001.9998 | |
REL.001.001.9999 | SourceFolder/containerFile.Zip |
REL.001.002.0001 | SourceFolder/containerFile.Zip/1.txt |
REL.001.002.0002 | SourceFolder/containerFile.Zip/2.txt |
REL.001.002.0003 | SourceFolder/flatDocument |
REL.001.002.0004 | SourceFolder/containerFile.Zip/3.txt (new doc) |
REL.001.002.0005 | SourceFolder/containerFile.Zip/4.txt (new doc) |
If new documents are published with a start number that is within a range that has unused numbers, the new documents will be published in those gaps. See the example below:
The initial publish started at REL.001.001.001. The first 998 documents are single documents with no families or attachments. Document 999 is a family with 30 documents, so it is published on the next level:
Number | Document |
---|---|
REL.001.001.001 | |
REL.001.001.998 | Next document is a family with 30 documents that rolls over. |
REL.001.002.001 | Family with 30 documents. |
REL.001.002.030 | Last document published. |
Republish finds 3 new root documents, each of which is a single-document family. The new documents are therefore published using any numbering gaps.
Number | Document |
---|---|
REL.001.001.999 | First document is published using 999. |
REL.001.002.031 | Second document is published in the next available number. |
REL.001.002.032 | Third document uses next available number and so on. |
Let's assume that documents were already published using numbers REL.001.001.001 to REL.001.001.010. If a new data source is created with a start number that was already used (e.g., REL.001.001.008), then the new data source start number is the next available number: REL.001.001.011.
If a user adds 3 data sources and each data source has 10 documents and the same start number, when published, each data source start number will be the next available number. For example:
Data source 1: REL.001.001.0001 - REL.001.001.0010
Data source 2: REL.001.001.0011 - REL.001.001.0020
Data source 3: REL.001.001.0021 - REL.001.001.0030
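A simplified sketch of how the effective start number of each data source could be worked out is shown below. It only tracks the highest number already used in the last level, so it deliberately ignores gap filling and family rollover; `allocate_start_numbers` is a hypothetical name.

```python
from typing import List, Sequence, Tuple


def allocate_start_numbers(
    requested_start: int,
    doc_counts: Sequence[int],
    highest_used: int = 0,
) -> List[Tuple[int, int]]:
    """Return the (first, last) document number each data source ends up with."""
    ranges = []
    for count in doc_counts:
        # A requested start that is already taken moves to the next free number.
        start = max(requested_start, highest_used + 1)
        end = start + count - 1
        ranges.append((start, end))
        highest_used = end
    return ranges


# Three data sources, 10 documents each, all requesting start number 1.
print(allocate_start_numbers(1, [10, 10, 10]))
# Numbers 1 to 10 are already used and a new source requests 8.
print(allocate_start_numbers(8, [5], highest_used=10))
```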
If the number of new children found during republish is higher than the maximum allowed by the suffix padding digits, Relativity uses the next consecutive number without increasing the padding of the previously published children.
Initial Publish:
Number | Document |
---|---|
REL.001.001.9998 | |
REL.001.001.9999 | (Password-protected file) |
REL.001.002.0001 | |
REL.001.002.0002 |
Republish:
Number | Document |
---|---|
REL.001.001.9998 | |
REL.001.001.9999 | (Password-protected file) |
REL.001.001.9999_0001 | new child found in REL.001.001.9999 |
REL.001.001.9999_0002 | new child found in REL.001.001.9999_0001 |
REL.001.001.9999_0003 | new child found in REL.001.001.9999_0001 |
... | |
REL.001.001.9999_9999 | new child found in REL.001.001.9999 (uses 4-digit padding) |
REL.001.001.9999_10000 | new child found in REL.001.001.9999 (uses 5-digit padding) |
REL.001.002.0001 |
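The padding behavior in this example matches ordinary minimum-width zero padding: the suffix is padded to at least 4 digits and simply grows once the value needs more. A minimal sketch, with the hypothetical helper name `retry_suffix`:

```python
def retry_suffix(parent_control_number: str, n: int, min_digits: int = 4) -> str:
    """Append the n-th suffix, zero-padded to at least min_digits digits."""
    if n < 1:
        raise ValueError("suffix numbers start at 1")
    # The format width is a minimum, so 10000 is printed in full with 5 digits.
    return f"{parent_control_number}_{n:0{min_digits}d}"


print(retry_suffix("REL.001.001.9999", 1))
print(retry_suffix("REL.001.001.9999", 9999))
print(retry_suffix("REL.001.001.9999", 10000))
```

Earlier suffixes keep their 4-digit form because each one is formatted independently.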
The Inventory | Discovery Settings category of the profile layout provides the following fields.
For example, the following entries will create a list of DWG, XML, ISO, EXE, and D to exclude from Discovery:
DWG
XML
ISO
EXE
D
The Extraction Settings category of the profile layout provides the following fields.
Import short message files in their native format directly into Relativity for processing. This feature eliminates having to convert short message files to RSMF (Relativity Short Message Format) before processing. You can define conversion settings in the processing profile's Short Message Conversion Settings section. The short message conversion settings you define only apply to processing jobs where RSMF conversion occurs during processing. The settings do not impact data already in RSMF format before processing takes place.
Use the following settings to define short message conversion extraction parameters.
The Deduplication Settings category of the profile layout provides the following fields:
The Publish Settings category of the profile layout provides the following fields.
The following sections describe other considerations for numbering, prioritizing publishing speed, and dtSearch.
To better understand how each parent/child numbering option appears for published documents, consider the following scenario.
Your data source includes an MSG file containing three Word documents, one of which is password protected:
When you process the .msg file, three documents are discovered and published, and there’s an error on the one password-protected child document. You retry discovery, and two additional sub-child documents are discovered. You then republish the processing set, and the two new documents are published to the workspace.
If you’d chosen Suffix Always for the Parent/Child Numbering field on the profile, the identifiers of the published documents would appear as the following:
If you’d chosen Continuous Always for the Parent/Child Numbering field on the profile, the identifiers of the published documents would appear as the following:
If you’d chosen Continuous, Suffix on Retry for the Parent/Child Numbering field on the profile, the identifiers of the published documents would appear as the following:
Publishing speed can be prioritized by performing one of the following actions:
Note the following details regarding how Relativity uses suffixes:
When you publish Word, Excel, and PowerPoint files with the text extraction method set to dtSearch on the profile, you'll typically see faster extraction speeds, but note that some file properties may or may not be populated in their corresponding metadata fields or included in the Extracted Text value.
The dtSearch text extraction method does not populate the following properties:
The following table breaks down which file properties are populated in corresponding metadata fields and/or Extracted Text for the dtSearch text extraction method:
File type | Property | Included in dtSearch Corresponding metadata field | Included in dtSearch Extracted text |
---|---|---|---|
Excel (.xls, .xlsx) | Has Hidden Data | ✓ | ✓ |
Excel (.xls, .xlsx) | Track Changes (Inserted cell, moved cell, modified cell, clear cell, inserted column, deleted column, inserted row, deleted row, inserted sheet, renamed sheet) | ✓ | |
Word (.doc, .docx) | Has Hidden Data | ✓ | |
Word (.doc, .docx) | Track Changes (Insertions, deletions, moves) | ✓ | |
PowerPoint (.ppt, .pptx) | Has Hidden Data | ✓ | |
PowerPoint (.ppt, .pptx) | Speaker Notes | ✓ | |
As text extraction directly impacts search results, the following table lists which features are supported by the Relativity, Native, and dtSearch methods:
Features | Relativity: Excel | Relativity: Word | Relativity: PowerPoint | Native: Excel | Native: Word | Native: PowerPoint | dtSearch: Excel | dtSearch: Word | dtSearch: PowerPoint |
---|---|---|---|---|---|---|---|---|---|
FEATURE DIFFERENCES | | | | | | | | | |
Math equations. For more information, see Math equations. | Not Supported | Not Supported | Not Supported | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
Math formulas (sum, avg, etc.) | Not Supported | Not Supported | Not Supported | Not Supported | Not Supported | Not Supported | ✓ | ✓ | ✓ |
SmartArt | ✓ * | ✓ * | ✓ * | ✓ * | ✓ * | ✓ * | ✓ * | ✓ * | ✓ * |
Speaker notes | N/A | N/A | ✓ ** | N/A | N/A | ✓ | N/A | N/A | ✓ *** |
Track changes | ✓ | ✓ | N/A | ✓ | ✓ | N/A | ✓ *** | ✓ *** | N/A |
Hidden data | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ *** | ✓ *** | ✓ *** |
2016+ new chart styles | Not Supported | ✓ | ✓ | Not Supported | ✓ | ✓ | Not Supported | ✓ | ✓ |
* Pre-2007 Office SmartArt are considered attachments and will be extracted and OCRed. ** When a header or footer is in the Speaker Notes section, field codes are not extracted. *** For more information, see dtSearch special considerations. |
FULLY COMPATIBLE AND SUPPORTED FEATURES | | | | | | | | | |
Bullet lists | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
Chart box | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
CJK and other foreign language characters | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
Clip art | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
Comments and replies | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
Currency format | ✓ | ✓ | N/A | ✓ | ✓ | N/A | ✓ | ✓ | N/A |
Date / Time format | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
Field codes | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
Footer | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
Header | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
Hidden slide | N/A | N/A | ✓ | N/A | N/A | ✓ | N/A | N/A | ✓ |
Macros | N/A | ✓ | N/A | N/A | ✓ | N/A | N/A | ✓ | N/A |
Margins / Alignment Format | N/A | ✓ | N/A | N/A | ✓ | N/A | N/A | ✓ | N/A |
Merged cell (horizontal) | ✓ | N/A | N/A | ✓ | N/A | N/A | ✓ | N/A | N/A |
Merged cell (vertical) | ✓ | N/A | N/A | ✓ | N/A | N/A | ✓ | N/A | N/A |
Number format (positive / negative) | ✓ | N/A | N/A | ✓ | N/A | N/A | ✓ | N/A | N/A |
Number format (fraction) | ✓ | N/A | N/A | ✓ | N/A | N/A | ✓ | N/A | N/A |
Number format (with comma) | ✓ | N/A | N/A | ✓ | N/A | N/A | ✓ | N/A | N/A |
Number format (with decimal point) | ✓ | N/A | N/A | ✓ | N/A | N/A | ✓ | N/A | N/A |
Password protected (cell level) | ✓ | N/A | N/A | ✓ | N/A | N/A | ✓ | N/A | N/A |
Password protected (column level) | ✓ | N/A | N/A | ✓ | N/A | N/A | ✓ | N/A | N/A |
Password protected (file level) | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
Password protected (row level) | ✓ | N/A | N/A | ✓ | N/A | N/A | ✓ | N/A | N/A |
Password protected (sheet / page level) | ✓ | N/A | N/A | ✓ | N/A | N/A | ✓ | N/A | N/A |
Phone number format | ✓ | N/A | N/A | ✓ | N/A | N/A | ✓ | N/A | N/A |
Pivot table | ✓ | N/A | N/A | ✓ | N/A | N/A | ✓ | N/A | N/A |
Right to left text format | N/A | ✓ | N/A | N/A | ✓ | N/A | N/A | ✓ | N/A |
Slide numbers | N/A | N/A | ✓ | N/A | N/A | N/A | N/A | N/A | ✓ |
Table | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
Text box | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
Transitions | N/A | N/A | ✓ | N/A | N/A | ✓ | N/A | N/A | ✓ |
WordArt | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
Word wrapping format | N/A | ✓ | N/A | N/A | ✓ | N/A | N/A | ✓ | N/A |
The following table includes examples of what the extracted text would look like if Native or dtSearch are used rather than Relativity:
Original Document | Relativity | Native | dtSearch |
---|---|---|---|
(image) | NO TEXT | (image) | (image) |