Google Workspace

This page provides additional information about processing supported file types for Google Workspace. To view the tables of file types that Relativity supports and does not support, see Supported file types for processing.

When you gather Google Workspace data, Google provides a supplemental metadata file. For Google Mail (Gmail), the metadata file is a .csv file. For Google Drive, the metadata file is an .xml file.

Processing automatically identifies when the supplemental metadata file is present in your file share, links the metadata fields to the processed files, and makes them available as mappable fields.

There are two methods for collecting Google Workspace data:

  • Collect in Relativity. For more information, see the Google Workspace documentation.
  • Manual download via Google Vault.

Collect

When using Collect to gather Gmail data, the metadata file is provided as a .csv file. An .xml file is also provided but is not used. When using Collect to gather Google Drive data, the metadata file is provided as an .xml file.

[Figures: Relativity Collect: Gmail; Relativity Collect: Google Drive]

Note: Collect places additional files in the root of the data source. They are processed as loose individual files if they are not removed prior to processing.

Other notes
  • Google Chat data is converted to Relativity's short message format (RSMF) for processing when using Collect to gather data.
  • Do not edit the supplemental metadata files or Processing will not recognize them.

Google Vault

When using Google Vault to gather Gmail data, download the [EXPORT_NAME].zip and [EXPORT_NAME]-metadata.csv files. For Google Drive data, download the [EXPORT_NAME].zip and [EXPORT_NAME]-metadata.xml files. When collecting data manually from Google Vault, you must export Gmail data in MBOX format. Relativity does not sync the additional metadata if the data is exported as a .pst file.
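The file pairing described above can be checked before processing. The following is a minimal sketch, not part of Relativity or Google Vault: the function name `missing_sidecars` and the `service` argument are hypothetical, and the check assumes the export follows the [EXPORT_NAME].zip and [EXPORT_NAME]-metadata naming exactly as described here.

```python
def missing_sidecars(filenames, service):
    """Return export names whose supplemental metadata file is absent.

    service: "gmail" expects <name>-metadata.csv,
             "drive" expects <name>-metadata.xml.
    Both the function and its arguments are illustrative assumptions.
    """
    suffix = {"gmail": "-metadata.csv", "drive": "-metadata.xml"}[service]
    names = set(filenames)
    missing = []
    for name in sorted(names):
        if name.lower().endswith(".zip"):
            export_name = name[:-4]  # strip the ".zip" extension
            if export_name + suffix not in names:
                missing.append(export_name)
    return missing
```

For example, `missing_sidecars(["matter-01.zip", "matter-01-metadata.csv"], "gmail")` returns an empty list, while the same file list checked with `"drive"` reports `matter-01` because no .xml metadata file is present.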

[Figures: Google Vault: Gmail; Google Vault: Google Drive]

Note: If you have files saved in your root folder other than the supplemental metadata files, they are processed as loose individual files.

Other notes
  • Google Chat data must be converted to Relativity's short message format (RSMF) prior to processing when using Google Vault. Google Chat data does not need to be converted to RSMF when using Collect.
  • Do not edit the supplemental metadata files or Processing will not recognize them.

Google Workspace metadata field lists

The Google supplemental metadata files contain the fields listed below, all of which are available for mapping to Document object fields.

Note: Google may add, remove, or edit fields at any time. You can find the most current lists on Google's Vault export contents web page.

Mapping Google Drive fields with deduplication turned off

This section only applies when processing Google Drive data with deduplication turned off.

Relativity uses the MD5 hash for deduplication during processing. With deduplication turned off, two files with differing cloud-based metadata (sidecar metadata) can have the same physical MD5 hash. Because Relativity associates sets of metadata with a physical file hash, files with identical file hashes can be populated with identical sidecar metadata. To resolve this issue, Relativity has developed a set of source fields specific to Google Drive. When you process with deduplication turned off, these fields ensure that files with the same hash each keep their own unique set of metadata.
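The hash collision described above can be illustrated with a short sketch. This is not Relativity's implementation. The sample items are invented, and keying the metadata by DocID is an assumption made only to show why a per-document key keeps the metadata sets separate.

```python
import hashlib

# Two hypothetical Drive items: byte-identical content, different sidecar metadata.
items = [
    {"DocID": "doc-A", "content": b"quarterly report", "Author": "user1@company.com"},
    {"DocID": "doc-B", "content": b"quarterly report", "Author": "user2@company.com"},
]


def md5_hex(data: bytes) -> str:
    """Return the MD5 digest of the file content as a hex string."""
    return hashlib.md5(data).hexdigest()


# Keyed by content hash: both items share one key, so the second entry
# overwrites the first and one set of sidecar metadata is lost.
by_hash = {md5_hex(item["content"]): item["Author"] for item in items}

# Keyed by a per-document identifier (assumed here to be DocID):
# each item keeps its own metadata.
by_doc_id = {item["DocID"]: item["Author"] for item in items}

print(len(by_hash))    # 1
print(len(by_doc_id))  # 2
```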

Google Workspace fields

The following table lists the metadata fields found in the Google Drive .xml file. Use these fields when processing Google Drive files with deduplication turned off.

Google Drive .xml field | Relativity source field name | Description
DocID | GoogleDrive/DocID | A unique identifier for the file. For site exports, the value is the page ID.
#Author | GoogleDrive/Author | The email address of the person who owns the file in Drive. For a shared drive file, it shows the shared drive name.
Collaborators | GoogleDrive/Collaborators | The accounts and groups that have direct permission to edit the file or add comments. Also includes users with indirect access to the file if you chose this option during export.
Viewers | Google/Viewers | The accounts and groups that have direct permission to view the file. Also includes users with indirect access to the file if you chose this option during export.
#DateCreated | GoogleDrive/DateCreated | The date a Google file was created in Drive. For non-Google files, the date the file was uploaded to Drive.
#DateModified | GoogleDrive/DateModified | The date the file was last modified.
#Title | Google/Title | The file name as assigned by the user. Because some operating systems cannot expand zip files with extremely long file names, Vault truncates the file name at 128 characters during export. The value shown by the #Title tag isn't truncated.
DocumentType | GoogleDrive/DocumentType | The file type for Google files. Possible values are:
  • DOCUMENT: a document created in Google Docs.
  • SPREADSHEET: a spreadsheet created in Google Sheets.
  • PRESENTATION: a presentation created in Google Slides.
  • FORM: a form created in Google Forms.
  • DRAWING: a drawing created in Google Drawings.
  • SITES_PAGE: a page from a site created in new Google Sites.
Others | Google/Others | The accounts from your query that have indirect access to the file if you opted to exclude access level information during export. May also include users for whom Vault could not determine permission levels at the time of export.
DocParentID | Google/DocParentID | For sites, the site ID.
SharedDriveID | Google/SharedDriveID | The identifier of the shared drive that contains the file, if applicable.
SourceHash | Google/SourceHash | A unique hash value for each version of a file. Can be used to deduplicate file exports and verify the exported file is an exact copy of the source file. Supported by Google Docs, Sheets, and Slides files only.
FileName | GoogleDrive/FileName | The file name. Use this value to correlate the metadata with the file in the export ZIP file.
FileSize | GoogleDrive/FileSize | The size of the file in bytes.
Hash | GoogleDrive/FileHash | The MD5 hash of the file.
UserQuery | Google/UserQuery | The query submitted by the Vault user that retrieved the files included in this export.
TimeZone | Google/TimeZone | The time zone used for date-based searches.
Custodians | Google/Custodians | The email addresses of the users whose accounts were searched. If you searched for content rather than individual user accounts, there are no custodians listed here.
Labels | GoogleDrive/Labels | Labels applied to the message by Google Drive or the user.
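
To show how the field names in the table above relate to the Relativity source field names, the sketch below reads a small XML sample and relabels the values. The XML layout in `SAMPLE` (the `Document`, `Tag`, and `ExternalFile` elements and their attributes) is an assumption made for illustration and may not match the file Google actually produces. The function name `read_documents` is hypothetical, and the mapping covers only a subset of the table. Processing performs this linking automatically, so this is only a way to inspect a copy of the file. Do not edit the original.

```python
import xml.etree.ElementTree as ET

# Subset of the table above: Drive .xml field -> Relativity source field name.
SOURCE_FIELDS = {
    "DocID": "GoogleDrive/DocID",
    "#Author": "GoogleDrive/Author",
    "#DateCreated": "GoogleDrive/DateCreated",
    "#DateModified": "GoogleDrive/DateModified",
    "#Title": "Google/Title",
    "FileName": "GoogleDrive/FileName",
    "FileSize": "GoogleDrive/FileSize",
    "Hash": "GoogleDrive/FileHash",
}

# ASSUMED sample layout, for illustration only. Check a real export before
# relying on any element or attribute name used here.
SAMPLE = """<Root>
  <Document DocID="doc-A">
    <Tag TagName="#Author" TagValue="user1@company.com"/>
    <Tag TagName="#Title" TagValue="Quarterly report"/>
    <ExternalFile FileName="Quarterly report_doc-A.docx" FileSize="1024" Hash="0123abcd"/>
  </Document>
</Root>"""


def read_documents(xml_text):
    """Return one dict per Document, keyed by Relativity source field name."""
    root = ET.fromstring(xml_text)
    rows = []
    for doc in root.iter("Document"):
        raw = {"DocID": doc.get("DocID")}
        for tag in doc.iter("Tag"):
            raw[tag.get("TagName")] = tag.get("TagValue")
        # If a document lists several files, later ones overwrite earlier
        # ones in this sketch.
        for ext in doc.iter("ExternalFile"):
            raw.update(ext.attrib)
        rows.append({SOURCE_FIELDS[k]: v for k, v in raw.items() if k in SOURCE_FIELDS})
    return rows
```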

The following table lists the metadata fields found in the Gmail .csv file. Use these fields for mapping Gmail data.

Google Mail .csv field | Relativity source field name | Description | Notes
Rfc822MessageId | Google/Rfc822MessageId | A message ID that is the same for the receiver's and sender's messages. Use this value to correlate metadata with the message in an MBOX export. For classic Hangouts, the value is for the first message in the thread. |
GmailMessageId | Google/GmailMessageId | A unique message ID. Use this value to manage specific messages with the Gmail API. For classic Hangouts, the value is for the first message in the thread. |
Account | Google/Account | The account that had the message in their inbox. For example, user1@company.com received a message sent to groupA@company.com because they are a member of the group. If a search returns that message because it was in user1's Inbox, then the value of To is groupA@company.com while the value of Account is user1@company.com. |
From | Google/From | The sender account. |
To | Google/To | The recipient account. Multiple recipients are comma-separated and the list is in double quotes. | Gmail only
CC | Google/CC | Accounts in the cc: field. | Gmail only
BCC | Google/BCC | Accounts in the bcc: field. | Gmail only
Subject | Google/Subject | The message subject. | Gmail only
Labels | Google/Labels | Labels applied to the message by Gmail or the user. | Gmail only
DateSent | Google/DateSent | The message send date in UTC, yyyy-MM-dd'T'HH:mm:ssZZZZ. | Gmail only
DateRecieved | Google/DateRecieved | The message received date, yyyy-MM-dd'T'HH:mm:ssZZZZ. | Gmail only
SubjectAtStart | Google/SubjectAtStart | The subject of the conversation when the first message was sent. | Classic Hangouts only
SubjectAtEnd | Google/SubjectAtEnd | The subject of the conversation when the last message was sent. | Classic Hangouts only
DateFirstMessageSent | Google/DateFirstMessageSent | The time stamp for when the first message in a conversation was sent. | Classic Hangouts only
DateLastMessageSent | Google/DateLastMessageSent | The time stamp for when the last message in a conversation was sent. | Classic Hangouts only
DateFirstMessageReceived | Google/DateFirstMessageReceived | The time stamp for when the first message in a conversation was received. | Classic Hangouts only
DateLastMessageReceived | Google/DateLastMessageReceived | The time stamp for when the last message in a conversation was received. | Classic Hangouts only
ThreadedMessageCount | Google/ThreadedMessageCount | The number of messages in the conversation. | Classic Hangouts only
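
The To and DateSent descriptions above can be made concrete with a short sketch that reads a copy of the Gmail .csv file. The sample row, the function name `read_gmail_rows`, and the column subset are invented for illustration. The sketch also assumes the UTC offset is written as +0000, which is one plausible rendering of the yyyy-MM-dd'T'HH:mm:ssZZZZ pattern, so check a real export before relying on it.

```python
import csv
import io
from datetime import datetime

# Invented sample with a subset of the columns listed above.
SAMPLE = (
    "Rfc822MessageId,Account,From,To,Subject,DateSent\n"
    "<abc@mail.example>,user1@company.com,sender@company.com,"
    '"groupA@company.com,user2@company.com",Status,2023-05-01T14:30:00+0000\n'
)


def read_gmail_rows(csv_text):
    """Parse the metadata rows, splitting To and converting DateSent."""
    rows = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # The csv module removes the surrounding double quotes, leaving a
        # comma-separated recipient list to split. Addresses that contain
        # commas themselves would need more careful handling.
        row["To"] = [addr.strip() for addr in row["To"].split(",") if addr.strip()]
        # ASSUMPTION: the offset looks like +0000; %z also accepts +00:00.
        row["DateSent"] = datetime.strptime(row["DateSent"], "%Y-%m-%dT%H:%M:%S%z")
        rows.append(row)
    return rows
```

Calling `read_gmail_rows(SAMPLE)` yields one row whose To value is a list of two addresses and whose DateSent value is a timezone-aware datetime.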