Blocklisting

A blocklist is a list of banned or excluded PI Values. Using the blocklisting tool, project leads can identify potential PI tagged within the application and blocklist those that were falsely tagged. Blocklisting such information removes tagged annotations within the document review screen, instances of that information in the final report, and any linkages attached to that annotation. You can use blocklisting to remove false positives across the dataset without reviewing every document.

With the Blocklisting tool you can:

  • See a list of all annotations that the machine or reviewers made on non-spreadsheet documents.
  • Select one or more annotations and blocklist them, removing them from the documents and preventing that string of text from being predicted by the machine again.
  • Open documents that contain annotations to check whether those annotations should be blocklisted.
  • Search, sort, and filter information within the table.
  • Change the PI type in focus, so that you can blocklist items of any PI type.
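
The effect of blocklisting described above can be pictured with a small conceptual model. This is only a sketch under stated assumptions, not the application's implementation: the `Annotation` and `Linkage` record shapes, the `apply_blocklist` function, and the choice to match on exact text plus PI type are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    # Hypothetical record shape for a tagged piece of personal information.
    annotation_id: int
    document_id: str
    text: str
    pi_type: str


@dataclass(frozen=True)
class Linkage:
    # Hypothetical link between two annotations (for example, a name and an address).
    source_id: int
    target_id: int


def apply_blocklist(annotations, linkages, blocklist):
    """Drop annotations whose (text, pi_type) is blocklisted, plus their linkages."""
    kept = [a for a in annotations if (a.text, a.pi_type) not in blocklist]
    kept_ids = {a.annotation_id for a in kept}
    # A linkage survives only if both of its endpoints survive.
    kept_links = [
        link for link in linkages
        if link.source_id in kept_ids and link.target_id in kept_ids
    ]
    return kept, kept_links


annotations = [
    Annotation(1, "DOC-1", "Acme Corp", "Name"),
    Annotation(2, "DOC-1", "12 High Street", "Address"),
    Annotation(3, "DOC-1", "Jane Doe", "Name"),
    Annotation(4, "DOC-2", "Acme Corp", "Name"),
]
linkages = [Linkage(1, 2), Linkage(3, 2)]
blocklist = {("Acme Corp", "Name")}  # assumed: exact match on text and PI type

kept, kept_links = apply_blocklist(annotations, linkages, blocklist)
print([a.annotation_id for a in kept])  # [2, 3]
print(kept_links)                       # [Linkage(source_id=3, target_id=2)]
```

In this model, one blocklist entry removes every matching annotation across the dataset, which is why a single action can clear a false positive without a document-by-document review.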

Permissions

The Blocklisting page is only available for users assigned the role of Lead.

To open the Blocklisting page, expand the Quality Control icon in the left-hand pane, and then select the Blocklisting symbol.

Available PI detections

This table shows all tagged personal information that is available to be blocklisted. It does not include information that is already blocklisted. The Available PI Detections table has the following columns:

  • Text—the personal information tagged within the document
  • Sources—the source of the tag. This field can have one of three values:
    • Python—indicates that the machine predicted the tag.
    • Annotator—indicates that a user created the tag.
    • Excel—indicates that the value exists within a spreadsheet table.
  • PII Count—the number of times the personal information appears across the data set.
  • Document Count—the number of documents the personal information appears in.
  • PI Type—the PI Type the personal information has been tagged with.
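
To make the difference between PII Count and Document Count concrete, the sketch below derives both from a flat list of annotations. The tuple layout, the sample values, and the `build_detection_rows` function are hypothetical and only illustrate the column definitions above; they are not taken from the application.

```python
from collections import defaultdict

# Hypothetical annotation records: (document_id, text, pi_type, source)
annotations = [
    ("DOC-1", "Jane Doe", "Name", "Python"),
    ("DOC-1", "Jane Doe", "Name", "Annotator"),
    ("DOC-2", "Jane Doe", "Name", "Python"),
    ("DOC-2", "555-0100", "Phone", "Python"),
]


def build_detection_rows(records):
    """Group annotations by (text, pi_type) and compute the table columns."""
    groups = defaultdict(lambda: {"sources": set(), "count": 0, "docs": set()})
    for document_id, text, pi_type, source in records:
        group = groups[(text, pi_type)]
        group["sources"].add(source)
        group["count"] += 1          # every occurrence counts toward PII Count
        group["docs"].add(document_id)  # each document counts once
    return [
        {
            "Text": text,
            "Sources": sorted(group["sources"]),
            "PII Count": group["count"],
            "Document Count": len(group["docs"]),
            "PI Type": pi_type,
        }
        for (text, pi_type), group in groups.items()
    ]


rows = build_detection_rows(annotations)
for row in rows:
    print(row)
```

Here "Jane Doe" has a PII Count of 3 but a Document Count of 2, because two of its occurrences are in the same document.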

Excel blocklisted by documents

The Excel Blocklisted By Documents table lists the document IDs of spreadsheets that contain personal information already blocklisted by the machine. In contrast, the Available PI Detections table lists only information that has not been blocklisted yet.

The Excel Blocklisted By Documents table has the following columns:

  • Document—the document ID of the spreadsheet.

  • Frequency—the number of PI values blocklisted within the document.

  • PI Types—the PI types of the information blocklisted within the document.

  • PI Values—the PI values blocklisted within the document.

  • Actions—the Actions column has two icons:

    • View Document in New Tab—open the document in a new tab.

    • View More—view a breakdown of the blocklisted terms in the spreadsheet.

      From the View More screen, you can add a blocklisted item contained within a spreadsheet to the allowlist via the icon in the Actions column.

Excel blocklisted by term

The Excel Blocklisted By Terms table lists the terms that the machine blocklisted in spreadsheets.

The table contains the following columns:

  • Terms—blocklisted terms.

  • Frequency—the number of times the terms appear across the document set.

  • PI Types—the PI type assigned to the term before blocklisting.

  • Document Count—the count of documents that the term appears in.

    By opening the Document Count dropdown, you can view a list of the impacted document IDs.

    Clicking a document ID within this list opens a table that lists all blocklisted terms for that document ID. You can add a blocklisted item contained within a spreadsheet to the allowlist via the icon in the Actions column.

Blocklisting PI text

To blocklist text:

  1. Open the Available PI Detections table.

    Review the Text column to identify PI values that require blocklisting.

  2. If you are unsure where to start, sort or filter the table so that items with the highest document count appear first.
    Scan the highest-frequency items and decide whether any of them clearly should be blocklisted.

  3. Focus on high-priority PI types and analyze their results by filtering the table to the specific PI type.

  4. Once you locate an item to block, select the check box next to it.
    The orange Blocklist button shows how many items you are about to blocklist. Check that this count matches the number of items you selected.

  5. After you have selected all items to blocklist, click the Blocklist button to confirm the blocklisting action.

Blocklisting changes do not appear in the UI until you run Incorporate Feedback.
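
The triage in steps 2 to 4 can be sketched outside the UI as follows. This is an illustrative model only: the row dictionaries, the sample values, and the `triage` function are hypothetical and do not correspond to an export format or API of the application.

```python
# Hypothetical rows mirroring the Available PI Detections table.
rows = [
    {"text": "Acme Corp", "pi_type": "Name", "document_count": 120},
    {"text": "Jane Doe", "pi_type": "Name", "document_count": 3},
    {"text": "Main Office", "pi_type": "Address", "document_count": 85},
    {"text": "Support Team", "pi_type": "Name", "document_count": 40},
]


def triage(rows, pi_type=None, top_n=10):
    """Return the rows with the highest document count, optionally for one PI type."""
    candidates = [r for r in rows if pi_type is None or r["pi_type"] == pi_type]
    ranked = sorted(candidates, key=lambda r: r["document_count"], reverse=True)
    return ranked[:top_n]


# Step 2: start with the most widespread items across all PI types.
print([r["text"] for r in triage(rows, top_n=2)])
# ['Acme Corp', 'Main Office']

# Step 3: narrow the view to one high-priority PI type.
name_candidates = triage(rows, pi_type="Name", top_n=2)
print([r["text"] for r in name_candidates])
# ['Acme Corp', 'Support Team']

# Step 4: the number of selected items should equal the count you expect to blocklist.
selected = {r["text"] for r in name_candidates}
assert len(selected) == 2
```

Ranking by document count first surfaces the false positives that affect the most documents, so each blocklist action removes as much noise as possible.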

Viewing documents

To view documents from the blocklisting table:

  1. To review the documents in which the text appears, click the Document Count field.

  2. A dropdown will appear with the document ID and the number of times the text appears in the document.

  3. Select a row in the dropdown to open a new tab with the document.
    To keep your place in the blocklisting table, hold Control (or Command on macOS) and click the row so that the document opens in a separate tab.

Table navigation

There are three ways to adjust the blocklisting table to better assist your review.

  • Option 1—use the Search feature in the top-left corner to type in a specific phrase and see whether it returns any results within the dataset.
  • Option 2—filter the table. Click the inverted pyramid icon in the top-right corner, and filter by:

    • Specific text

    • Annotation source

    • Specific PI types

  • Option 3—adjust the sort order of the columns within the Blocklisting table.

Hover the cursor over a column header to reveal a sort arrow. Once sorting is active, the arrow stays visible next to the column header.
To sort a column, click its header:

  • One click: Ascending order
  • Two clicks: Descending order
  • Three clicks: Deactivates sorting

You can sort by several columns at once. Each sorted column shows an arrow next to its header that indicates ascending or descending order.
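
The three-click cycle and multi-column sorting can be modeled as a small state machine. The sketch below is an assumption-laden illustration, not the table's actual logic: `click_header`, `sort_rows`, and the row fields are hypothetical, and it assumes that the column sorted first takes priority when several columns are active.

```python
# Click cycle for one column: unsorted -> ascending -> descending -> unsorted.
NEXT_STATE = {None: "asc", "asc": "desc", "desc": None}


def click_header(sort_state, column):
    """Return a new sort state after one click on a column header."""
    new_state = dict(sort_state)
    next_direction = NEXT_STATE[new_state.get(column)]
    if next_direction is None:
        new_state.pop(column, None)  # third click deactivates sorting
    else:
        new_state[column] = next_direction
    return new_state


def sort_rows(rows, sort_state):
    """Apply every active column sort; columns activated earlier take priority (assumed)."""
    result = list(rows)
    # Python's sort is stable, so sorting by the lowest-priority column first
    # and the highest-priority column last yields a multi-column ordering.
    for column, direction in reversed(list(sort_state.items())):
        result.sort(key=lambda r: r[column], reverse=(direction == "desc"))
    return result


rows = [
    {"text": "Jane Doe", "pi_type": "Name", "document_count": 3},
    {"text": "Main Office", "pi_type": "Address", "document_count": 85},
    {"text": "Acme Corp", "pi_type": "Name", "document_count": 120},
]

state = {}
state = click_header(state, "pi_type")         # one click: ascending
state = click_header(state, "document_count")  # one click: ascending
state = click_header(state, "document_count")  # two clicks: descending
print(state)  # {'pi_type': 'asc', 'document_count': 'desc'}
print([r["text"] for r in sort_rows(rows, state)])
# ['Main Office', 'Acme Corp', 'Jane Doe']
```

With this state, rows are grouped by PI type in ascending order, and within each PI type the items with the highest document count come first.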