Conflicts
Once the Entity normalization process is complete, review the results in the Conflicts subtab of the Entity Analysis tab.
Conflicts fields
The following are definitions of fields found within the Conflicts subtab.
Field | Description |
---|---|
Conflict ID | Unique identifier for that conflict |
Number of Entities | Number of entities in the conflict cluster |
Total Docs | Total documents containing records affected by this cluster |
Conflict Reason | Reason there is a conflict for this group of entities |
Total Conflicts | Number of conflicts |
Primary Conflicts | Specific list of conflicting PI Types that are also primary PI types based on your Data Analysis categorization. |
Secondary Conflicts | Specific list of conflicting PI Types that are also secondary PI types based on your Data Analysis categorization. |
Tertiary Conflicts | Specific list of conflicting PI Types that are also tertiary PI types based on your Data Analysis categorization. |
Status | Review status of the conflict cluster: Not Reviewed (the conflict cluster has not yet been reviewed by a human user), Review in Progress (the conflict cluster is being reviewed by a human user), or Review Complete (the conflict cluster has been reviewed by a human user and marked as reviewed in the Conflict Details view) |
Resolved By | The name of the user who reviewed or is reviewing the conflict cluster |
Resolved On | Time stamp of when the user marked the conflict cluster as reviewed |
Example conflict logic
Scenario description | Name | Primary | Secondary | Tertiary | Example Alias 1 | Example Alias 2 |
---|---|---|---|---|---|---|
Names match or are similar, but primary identifiers do not match. Tertiary identifiers match. | Match | No match | - | Match | Name: John Smith; SSN: 223-55-6788; Email: jsmith@gmail.com | Name: John Smith; SSN: 123-45-6789; Email: jsmith@gmail.com |
Names match or are similar. There is not enough information on primary identifiers to compare. Secondary identifiers do not match. Tertiary identifiers match. | Match | Not enough info | No match | Match | Name: John Smith; Email: jsmith@gmail.com; DOB: 01/20/1990 | Name: John Smith; Email: jsmith@gmail.com; DOB: 12/10/1970 |
Names do not match and are not similar, but primary identifiers match. | No match | Match | - | - | Name: John Smith; SSN: 123-45-6789; DOB: 01/20/1990 | Name: David Johnson; SSN: 123-45-6789 |
Names are not similar and primary identifiers do not match. Tertiary matches. | No match | No match | - | Match | Name: John Smith; SSN: 123-45-6789; Email: jsmith@gmail.com | Name: Adam Brown; SSN: 623-65-6678; Email: jsmith@gmail.com |
Names are not similar and there is not enough information on primary identifiers to compare. Tertiary matches. | No match | Not enough info | - | Match | Name: John Smith; Email: jsmith@gmail.com | Name: David Johnson; Email: jsmith@gmail.com |
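The comparison logic in the table above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the product's implementation: the tier assignment in `TIERS` (SSN as primary, DOB as secondary, email as tertiary) is inferred from the examples, the function names are hypothetical, and an exact case-insensitive name comparison stands in for whatever name-similarity logic the normalization process actually uses. The sketch reports "Not enough info" whenever a tier cannot be compared, whereas the table shows "-" for tiers that play no role in a scenario.

```python
from typing import Dict, List

# Hypothetical tier assignment; the real one comes from your Data Analysis categorization.
TIERS: Dict[str, List[str]] = {
    "primary": ["ssn"],
    "secondary": ["dob"],
    "tertiary": ["email"],
}


def compare_tier(a: Dict[str, str], b: Dict[str, str], fields: List[str]) -> str:
    """Return 'Match', 'No match', or 'Not enough info' for one identifier tier."""
    # Only fields present on BOTH aliases can be compared.
    shared = [f for f in fields if a.get(f) and b.get(f)]
    if not shared:
        return "Not enough info"
    return "Match" if all(a[f] == b[f] for f in shared) else "No match"


def compare_aliases(a: Dict[str, str], b: Dict[str, str]) -> Dict[str, str]:
    """Compare two aliases by name and by each identifier tier."""
    # Assumption: exact, case-insensitive equality stands in for name similarity.
    names_equal = a["name"].strip().casefold() == b["name"].strip().casefold()
    result = {"name": "Match" if names_equal else "No match"}
    for tier, fields in TIERS.items():
        result[tier] = compare_tier(a, b, fields)
    return result


# First scenario from the table: same name, different SSN, same email.
alias_1 = {"name": "John Smith", "ssn": "223-55-6788", "email": "jsmith@gmail.com"}
alias_2 = {"name": "John Smith", "ssn": "123-45-6789", "email": "jsmith@gmail.com"}
print(compare_aliases(alias_1, alias_2))
# {'name': 'Match', 'primary': 'No match', 'secondary': 'Not enough info', 'tertiary': 'Match'}
```

Comparing only the fields that both aliases carry is what separates "Not enough info" from "No match": a missing SSN on one side says nothing about whether the two records describe the same person.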
Recommended workflow
We recommend using the following workflow for review:
- Sort the table to review all clusters with Primary Conflicts > 0.
- Sort the table to review all clusters with Secondary Conflicts > 0.
- Sort the table to review all clusters with Tertiary Conflicts > 0. This may not be necessary based on project requirements.
- If there are too many unnecessary conflicts, update the Identifier Types in Project Settings to reduce the number of conflicts.
- After Conflicts are resolved or Clusters with Conflicts are reviewed, sort Clusters by size in descending order and review for similarly named entities.
  - This is best for large clusters.
  - Reviewing every cluster with a size > 1 may not be worthwhile unless time permits.
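If you work from an exported copy of the Conflicts table rather than sorting in the interface, the review order above could be approximated as follows. This is a minimal sketch: the row layout and the key names (`conflict_id`, `primary`, `secondary`, `tertiary`, `entities`) are hypothetical stand-ins for the Conflict ID, Primary Conflicts, Secondary Conflicts, Tertiary Conflicts, and Number of Entities fields, and the sample values are made up for illustration.

```python
from typing import Dict, List, Tuple

# Made-up rows standing in for an export of the Conflicts table.
clusters: List[Dict[str, int]] = [
    {"conflict_id": 101, "primary": 0, "secondary": 1, "tertiary": 0, "entities": 4},
    {"conflict_id": 102, "primary": 2, "secondary": 0, "tertiary": 1, "entities": 2},
    {"conflict_id": 103, "primary": 0, "secondary": 0, "tertiary": 1, "entities": 5},
    {"conflict_id": 104, "primary": 1, "secondary": 1, "tertiary": 0, "entities": 6},
    {"conflict_id": 105, "primary": 0, "secondary": 0, "tertiary": 0, "entities": 9},
]


def review_priority(cluster: Dict[str, int]) -> Tuple[int, int]:
    """Sort key: highest conflict tier first, then larger clusters first."""
    if cluster["primary"] > 0:
        tier = 0
    elif cluster["secondary"] > 0:
        tier = 1
    elif cluster["tertiary"] > 0:
        tier = 2
    else:
        tier = 3  # no conflicts; reviewed last, by size
    return (tier, -cluster["entities"])


def review_order(rows: List[Dict[str, int]]) -> List[Dict[str, int]]:
    """Return the rows in the recommended review order."""
    return sorted(rows, key=review_priority)


print([c["conflict_id"] for c in review_order(clusters)])
# [104, 102, 101, 103, 105]
```

Using a tuple as the sort key keeps the tier ordering and the size ordering in one pass; dropping rows with `tier == 2` would mirror projects where tertiary conflicts do not need review.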
For Cluster review:
- To merge two entities, select both entities, then select the Merge button.
- Entities left separate in the Cluster will remain as separate line items in the Entity List and Entity Export.
Once cluster review is complete, rerun normalization so that the entity report is updated.