Google Workspace

This page provides additional information about processing Google Workspace data. To view the tables of file types that Relativity supports and does not support, see Supported file types for processing.

Note: This documentation contains references to third-party software or technologies. While efforts are made to keep third-party references updated, the images, documentation, or guidance in this topic may not accurately represent the current behavior or user interfaces of the third-party software. For more considerations regarding third-party software, such as copyright and ownership, see Terms of Use.

When you gather Google Workspace data, Google provides a supplemental metadata file. For Google Mail (Gmail), the metadata file is a .csv file. For Google Drive, the metadata file is an .xml file.

Processing automatically identifies when the supplemental metadata file is present in your file share, links the metadata fields to the processed files, and makes them available as mappable fields.

There are two methods for collecting Google Workspace data:

  • Collect in Relativity. For more information, see the Google Workspace documentation.
  • Manual download via Google Vault.

Collect

When using Collect to gather Gmail data, the metadata file is provided as a .csv file. An .xml file is also provided but is not used. When using Collect to gather Google Drive data, the metadata file is provided as an .xml file.

[Images: Relativity Collect: Gmail; Relativity Collect: Google Drive]

Note: Collect places additional files in the root of the data source. They are processed as loose individual files if they are not removed prior to processing.

Other notes

  • Google Chat data is converted to Relativity's short message format (RSMF) for processing when using Collect to gather data.
  • Do not edit the supplemental metadata files or Processing will not recognize them.

Google Vault

When using Google Vault to gather Gmail data, download the [EXPORT_NAME].zip and [EXPORT_NAME]-metadata.csv files. For Google Drive data, download the [EXPORT_NAME].zip and [EXPORT_NAME]-metadata.xml files. When collecting data manually from Google Vault, you must export Gmail data in MBOX format. Relativity does not sync the additional metadata if the data is exported as a .pst file.

[Images: Google Vault: Gmail; Google Vault: Google Drive]

Note: If you have files saved in your root folder other than the supplemental metadata files, they are processed as loose individual files.
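
Before processing, it can help to confirm that the root folder holds only the export archive and its supplemental metadata file. The following is a minimal Python sketch, assuming the file names described above. The helper check_export_root and the "gmail"/"drive" keys are hypothetical names used for illustration and are not part of Relativity or Google Vault.

```python
from pathlib import Path

# Suffixes follow the file names described above; the keys are hypothetical labels.
METADATA_SUFFIX = {"gmail": "-metadata.csv", "drive": "-metadata.xml"}


def check_export_root(root: Path, export_name: str, source: str) -> dict:
    """Report expected files that are missing and any other files in the root folder."""
    expected = {
        f"{export_name}.zip",
        f"{export_name}{METADATA_SUFFIX[source]}",
    }
    # Only look at files directly in the root folder, not in subfolders.
    present = {p.name for p in root.iterdir() if p.is_file()}
    return {
        "missing": sorted(expected - present),
        "extra": sorted(present - expected),
    }
```

Any file reported under "extra" is a candidate for removal before processing, since it would otherwise be processed as a loose individual file. The sketch only lists files; it does not open or modify the supplemental metadata files.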

Other notes

  • Google Chat data must be converted to Relativity's short message format (RSMF) prior to processing when using Google Vault. Google Chat data does not need to be converted to RSMF when using Collect.
  • Do not edit the supplemental metadata files or Processing will not recognize them.

Google Workspace metadata field lists

The Google supplemental metadata files contain the fields listed below, all of which are available for mapping to Document object fields.

Note: Google may add, remove, or edit fields at any time. You can find the most current lists on Google's Vault export contents web page.

Mapping Google Drive fields

The following sections describe Relativity's behavior when processing Google Drive files with deduplication enabled or disabled. Whether or not to enable deduplication is up to you; Relativity processes and publishes the files regardless of the deduplication status. However, field mapping has some differences depending on the deduplication status.

Mapping Google Drive fields with deduplication turned off

Relativity uses a combination of hashes for deduplication when processing. With deduplication turned off, two files with differing cloud-based metadata (sidecar metadata) can have the same physical file hash (of a native parent file). Since Relativity associates sets of metadata with a physical (parent) file hash, parent files with identical hashes can be populated with identical sidecar metadata. Relativity has developed a set of source fields specific to Google Drive to resolve this issue. When processing Google Drive files with deduplication turned off, Relativity forces unique hashes for each set of sidecar metadata. See the table below this section for a description of Google Drive-specific fields.
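
The idea of forcing a unique hash for each set of sidecar metadata can be illustrated conceptually. The following Python sketch is not Relativity's implementation, which is not described here. It assumes SHA-256 and a JSON serialization purely for illustration, and the function names and sample values are hypothetical.

```python
import hashlib
import json


def file_hash(content: bytes) -> str:
    """Hash of the native file's bytes only."""
    return hashlib.sha256(content).hexdigest()


def combined_hash(content: bytes, sidecar: dict) -> str:
    """Hash of the file bytes plus a canonical form of its sidecar metadata."""
    canonical = json.dumps(sidecar, sort_keys=True).encode("utf-8")
    return hashlib.sha256(content + b"\x00" + canonical).hexdigest()


# Two Drive files with identical bytes but different cloud-based metadata.
content = b"quarterly report"
sidecar_a = {"DocID": "doc-1", "#Author": "user1@company.com"}
sidecar_b = {"DocID": "doc-2", "#Author": "user2@company.com"}

# The physical file hash alone cannot tell the two files apart.
same_file_hash = file_hash(content) == file_hash(content)

# Mixing the sidecar metadata into the hash yields a distinct key per file.
distinct_keys = combined_hash(content, sidecar_a) != combined_hash(content, sidecar_b)
```

With a key that depends on both the file bytes and the sidecar metadata, each set of metadata stays attached to its own file even when the native files are byte-for-byte identical.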

Mapping Google Drive fields with deduplication turned on

With deduplication on, Relativity processes files like other data types, meaning it publishes all the metadata it finds. When deduplication takes place, Relativity adds additional metadata fields to prevent different native (parent) files with the same hash from receiving identical sidecar metadata. If two native (parent) files with identical hashes have associated sidecar metadata (each having different hashes), Relativity deduplicates one of the files, and you will only see one file.

Google Workspace fields

The following table lists the metadata fields found in the Google Drive .xml file. Use these fields when processing Google Drive files with deduplication turned off.

| Google Drive .xml field | Relativity source field name | Field type | Description |
| --- | --- | --- | --- |
| DocID | GoogleDrive/DocID | Long Text | A unique identifier for the file. For site exports, the value is the page ID. |
| #Author | GoogleDrive/Author | Long Text | The email address of the person who owns the file in Drive. For a shared drive file, it shows the shared drive name. |
| Collaborators | GoogleDrive/Collaborators | Multiple Choice | The accounts and groups that have direct permission to edit the file or add comments. Also includes users with indirect access to the file if you chose this option during export. |
| Viewers | GoogleDrive/Viewers | Multiple Choice | The accounts and groups that have direct permission to view the file. Also includes users with indirect access to the file if you chose this option during export. |
| #DateCreated | GoogleDrive/DateCreated | Date | The date a Google file was created in Drive. For non-Google files, the date the file was uploaded to Drive. |
| #DateModified | GoogleDrive/DateModified | Date | The date the file was last modified. |
| #Title | Google/Title | Long Text | The file name as assigned by the user. Because some operating systems cannot expand zip files with extremely long file names, Vault truncates the file name at 128 characters during export. The value shown by the #Title tag isn't truncated. |
| DocumentType | GoogleDrive/DocumentType | Long Text | The file type for Google files. Possible values are: DOCUMENT (a document created in Google Docs); SPREADSHEET (a spreadsheet created in Google Sheets); PRESENTATION (a presentation created in Google Slides); FORM (a form created in Google Forms); DRAWING (a drawing created in Google Drawings); SITES_PAGE (a page from a site created in new Google Sites). |
| Others | GoogleDrive/Others | Multiple Choice | The accounts from your query that have indirect access to the file if you opted to exclude access level information during export. May also include users for whom Vault could not determine permission levels at the time of export. |
| DocParentID | Google/DocParentID | Long Text | For sites, the site ID. |
| SharedDriveID | Google/SharedDriveID | Long Text | The identifier of the shared drive that contains the file, if applicable. |
| SourceHash | Google/SourceHash | Long Text | A unique hash value for each version of a file. Can be used to deduplicate file exports and verify the exported file is an exact copy of the source file. Supported by Google Docs, Sheets, and Slides files only. |
| FileName | GoogleDrive/FileName | Long Text | The file name. Use this value to correlate the metadata with the file in the export ZIP file. |
| FileSize | GoogleDrive/FileSize | Whole Number | The size of the file in bytes. |
| Hash | GoogleDrive/FileHash | Long Text | The MD5 hash of the file. |
| UserQuery | Google/UserQuery | Long Text | The query submitted by the Vault user that retrieved the files included in this export. |
| TimeZone | Google/TimeZone | Long Text | The time zone used for date-based searches. |
| Custodians | Google/Custodians | Long Text | The email addresses of the users whose accounts were searched. If you searched for content rather than individual user accounts, there are no custodians listed here. |
| Labels | GoogleDrive/Labels | Multiple Choice | Labels applied to the file by Google Drive or the user. |
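
To preview how Drive fields line up with Relativity source field names, you can read a copy of the .xml file without changing it. The following Python sketch relies on an assumed XML layout (Document elements with a DocID attribute and Tag elements with TagName and TagValue attributes). That layout, the sample values, and the read_drive_metadata helper are assumptions for illustration only, so check an actual export before relying on them. Only the field names in SOURCE_FIELDS come from the table above.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; the real export layout may differ.
SAMPLE_XML = """<Root>
  <Document DocID="doc-1">
    <Tags>
      <Tag TagName="#Author" TagValue="user1@company.com"/>
      <Tag TagName="#Title" TagValue="Q3 plan"/>
      <Tag TagName="DocumentType" TagValue="DOCUMENT"/>
    </Tags>
  </Document>
</Root>"""

# A subset of the Drive field to Relativity source field pairs listed above.
SOURCE_FIELDS = {
    "DocID": "GoogleDrive/DocID",
    "#Author": "GoogleDrive/Author",
    "#Title": "Google/Title",
    "DocumentType": "GoogleDrive/DocumentType",
}


def read_drive_metadata(xml_text: str) -> list:
    """Return one dict per document, keyed by Relativity source field name."""
    records = []
    for doc in ET.fromstring(xml_text).iter("Document"):
        record = {SOURCE_FIELDS["DocID"]: doc.get("DocID")}
        for tag in doc.iter("Tag"):
            name = tag.get("TagName")
            if name in SOURCE_FIELDS:  # skip fields not in the mapping
                record[SOURCE_FIELDS[name]] = tag.get("TagValue")
        records.append(record)
    return records


records = read_drive_metadata(SAMPLE_XML)
```

The sketch parses text in memory and never writes to the metadata file, in keeping with the guidance not to edit supplemental metadata files.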

The following table lists the metadata fields found in the Gmail .csv file. Use these fields for mapping Gmail data.

| Google Mail .csv field | Relativity source field name | Field type | Description | Notes |
| --- | --- | --- | --- | --- |
| Rfc822MessageId | Google/Rfc822MessageId | Long Text | A message ID that is the same for the receiver's and sender's messages. Use this value to correlate metadata with the message in an MBOX export. For classic Hangouts, the value is for the first message in the thread. | |
| GmailMessageId | Google/GmailMessageId | Long Text | A unique message ID. Use this value to manage specific messages with the Gmail API. For classic Hangouts, the value is for the first message in the thread. | |
| Account | Google/Account | Long Text | The account that had the message in their inbox. For example, user1@company.com received a message sent to groupA@company.com because they are a member of the group. If a search returns that message because it was in user1's Inbox, then the value of To is groupA@company.com while the value of Account is user1@company.com. | |
| From | Google/From | Long Text | The sender account. | |
| To | Google/To | Long Text | The recipient account. Multiple recipients are comma-separated and the list is in double quotes. | Gmail only |
| CC | Google/CC | Multiple Choice | Accounts in the cc: field. | Gmail only |
| BCC | Google/BCC | Multiple Choice | Accounts in the bcc: field. | Gmail only |
| Subject | Google/Subject | Long Text | The message subject. | Gmail only |
| Labels | Google/Labels | Multiple Choice | Labels applied to the message by Gmail or the user. | Gmail only |
| DateSent | Google/DateSent | Date | The message send date in UTC, yyyy-MM-dd'T'HH:mm:ssZZZZ. | Gmail only |
| DateRecieved | Google/DateRecieved | Date | The message received date, yyyy-MM-dd'T'HH:mm:ssZZZZ. | Gmail only |
| SubjectAtStart | Google/SubjectAtStart | Long Text | The subject of the conversation when the first message was sent. | Classic Hangouts only |
| SubjectAtEnd | Google/SubjectAtEnd | Long Text | The subject of the conversation when the last message was sent. | Classic Hangouts only |
| DateFirstMessageSent | Google/DateFirstMessageSent | Date | The time stamp for when the first message in a conversation was sent. | Classic Hangouts only |
| DateLastMessageSent | Google/DateLastMessageSent | Date | The time stamp for when the last message in a conversation was sent. | Classic Hangouts only |
| DateFirstMessageReceived | Google/DateFirstMessageReceived | Date | The time stamp for when the first message in a conversation was received. | Classic Hangouts only |
| DateLastMessageReceived | Google/DateLastMessageReceived | Date | The time stamp for when the last message in a conversation was received. | Classic Hangouts only |
| ThreadedMessageCount | Google/ThreadedMessageCount | Decimal | The number of messages in the conversation. | Classic Hangouts only |
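
To inspect Gmail metadata outside of Relativity, you can read a copy of the .csv file with a standard CSV parser. The following Python sketch uses column names from the table above. The sample row, the rendering of the timestamp as 2021-03-04T15:30:00+0000, and the read_gmail_metadata helper are assumptions for illustration only.

```python
import csv
import io
from datetime import datetime

# Hypothetical sample row; real exports contain more columns.
SAMPLE_CSV = (
    "Rfc822MessageId,Account,From,To,Subject,DateSent\n"
    "<abc@mail.example.com>,user1@company.com,sender@company.com,"
    '"groupA@company.com,user2@company.com",Status update,'
    "2021-03-04T15:30:00+0000\n"
)


def read_gmail_metadata(csv_text: str) -> list:
    """Parse rows, splitting the quoted To list and converting DateSent."""
    rows = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # The To value is one quoted, comma-separated list of recipients.
        row["To"] = [addr.strip() for addr in row["To"].split(",") if addr.strip()]
        # %z accepts a numeric UTC offset such as +0000.
        row["DateSent"] = datetime.strptime(row["DateSent"], "%Y-%m-%dT%H:%M:%S%z")
        rows.append(row)
    return rows


rows = read_gmail_metadata(SAMPLE_CSV)
```

Because the To list is enclosed in double quotes, the CSV parser returns it as a single value, which the sketch then splits into individual recipient addresses. The sketch reads from a string and does not modify the metadata file.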