

This page covers additional supported file type (for processing) information for Google Workspace. To view the Relativity supported and unsupported file type tables, see Supported file types for processing.
When you gather Google Workspace data, Google provides a supplemental metadata file. For Google Mail (Gmail), the metadata file is a .csv file. For Google Drive, the metadata file is an .xml file.
Processing automatically identifies when the supplemental metadata file is present in your file share, links the metadata fields to the processed files, and makes them available as mappable fields.
There are two methods for collecting Google Workspace data: Relativity Collect and Google Vault.
When you use Collect to gather Gmail data, the metadata file is provided as a .csv file. An .xml file is also provided, but it is not used. When you use Collect to gather Google Drive data, the metadata file is provided as an .xml file.
[Figures: Relativity Collect: Gmail; Relativity Collect: Google Drive]
Note: Collect places additional files in the root of the data source. They are processed as loose individual files if they are not removed prior to processing.
When using Google Vault to gather Gmail data, download the [EXPORT_NAME].zip and [EXPORT_NAME]-metadata.csv files. For Google Drive data, download the [EXPORT_NAME].zip and [EXPORT_NAME]-metadata.xml files. When collecting data manually from Google Vault, you must export Gmail data in MBOX format. Relativity does not sync the additional metadata if the data is exported as a .pst file.
[Figures: Google Vault: Gmail; Google Vault: Google Drive]
Note: If you have files saved in your root folder other than the supplemental metadata files, they are processed as loose individual files.
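The naming convention above can be checked before processing. The following is a minimal sketch, not part of Relativity or Google Vault: the function names `find_metadata_pairs` and `find_loose_files` are hypothetical, and it assumes that each export ZIP and its metadata file sit together in one root folder and follow the `[EXPORT_NAME].zip` / `[EXPORT_NAME]-metadata.csv` or `.xml` pattern described above.

```python
from pathlib import Path


def find_metadata_pairs(root):
    """Map each export ZIP in root to the sidecar metadata files found next to it.

    Assumes the [EXPORT_NAME].zip / [EXPORT_NAME]-metadata.(csv|xml) naming
    described in the text; this helper is illustrative only.
    """
    root = Path(root)
    pairs = {}
    for zip_path in sorted(root.glob("*.zip")):
        export_name = zip_path.stem
        candidates = [
            root / f"{export_name}-metadata.csv",  # Gmail sidecar
            root / f"{export_name}-metadata.xml",  # Google Drive sidecar
        ]
        pairs[zip_path.name] = [p.name for p in candidates if p.is_file()]
    return pairs


def find_loose_files(root):
    """List files in root that are neither an export ZIP nor a matched sidecar.

    Per the note above, such files would be processed as loose individual files.
    """
    root = Path(root)
    expected = set()
    for zip_name, sidecars in find_metadata_pairs(root).items():
        expected.add(zip_name)
        expected.update(sidecars)
    return sorted(
        p.name for p in root.iterdir() if p.is_file() and p.name not in expected
    )
```

An export ZIP that maps to an empty list has no sidecar file beside it, and any name returned by `find_loose_files` is a candidate for removal from the root folder before processing.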
The Google supplemental metadata files contain the fields listed below, all of which are available for mapping to Document object fields.
Note: Google may add, remove, or edit fields at any time. You can find the most current lists on their Vault export contents web page.
The following sections describe Relativity's behavior when processing Google Drive files with deduplication enabled or disabled. Whether or not to enable deduplication is up to you; Relativity processes and publishes the files regardless of the deduplication status. However, field mapping has some differences depending on the deduplication status.
Relativity uses a combination of hashes for deduplication when processing. With deduplication turned off, two files with differing cloud-based metadata (sidecar metadata) can have the same physical file hash (of a native parent file). Since Relativity associates sets of metadata with a physical (parent) file hash, parent files with identical hashes can be populated with identical sidecar metadata. Relativity has developed a set of source fields specific to Google Drive to resolve this issue. When processing Google Drive files with deduplication turned off, Relativity forces unique hashes for each set of sidecar metadata. See the table below this section for a description of Google Drive-specific fields.
With deduplication on, Relativity processes files like other data types, meaning it publishes all the metadata it finds. When deduplication takes place, Relativity adds additional metadata fields to prevent different native (parent) files with the same hash from receiving identical sidecar metadata. If two native (parent) files with identical hashes have associated sidecar metadata (each having different hashes), Relativity deduplicates one of the files, and you will only see one file.
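The hash collision described above can be illustrated with a short sketch. This is not Relativity's implementation, which is not documented here; the helper names, the choice of SHA-256 for the combined key, and the sample metadata values are all assumptions made for illustration. It shows two copies of the same bytes with different sidecar metadata sharing one file hash, and a combined key keeping them distinct.

```python
import hashlib
import json


def file_hash(content: bytes) -> str:
    """MD5 of the physical file bytes (the Drive metadata also reports an MD5)."""
    return hashlib.md5(content).hexdigest()


def sidecar_hash(metadata: dict) -> str:
    """Hash of the sidecar metadata, serialized in a stable key order."""
    canonical = json.dumps(metadata, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def combined_key(content: bytes, metadata: dict) -> str:
    """Hypothetical key that is unique per (file bytes, sidecar metadata) pair."""
    joined = file_hash(content) + ":" + sidecar_hash(metadata)
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()


# Two Drive copies with identical bytes but different sidecar metadata.
content = b"quarterly budget"
copy_a = {"DocID": "doc-001", "#Author": "user1@company.com"}
copy_b = {"DocID": "doc-002", "#Author": "user2@company.com"}

# Keyed on the file hash alone, the two copies collapse into one entry.
by_file_hash = {file_hash(content) for _ in (copy_a, copy_b)}

# Keyed on the combined value, each set of sidecar metadata stays separate.
by_combined = {combined_key(content, m) for m in (copy_a, copy_b)}
```

Here `by_file_hash` holds a single value while `by_combined` holds two, which mirrors the difference between deduplicating on the file hash and forcing a unique hash for each set of sidecar metadata.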
The following table lists the metadata fields found in the Google Drive .xml file. Use these fields when processing Google Drive files with deduplication turned off.
Google Drive .xml field | Relativity source field name | Field type | Description |
---|---|---|---|
DocID | GoogleDrive/DocID | Long Text | A unique identifier for the file. For site exports, the value is the page ID. |
#Author | GoogleDrive/Author | Long Text | The email address of the person who owns the file in Drive. For a shared drive file, it shows the shared drive name. |
Collaborators | GoogleDrive/Collaborators | Multiple Choice | The accounts and groups that have direct permission to edit the file or add comments. Also includes users with indirect access to the file if you chose this option during export. |
Viewers | GoogleDrive/Viewers | Multiple Choice | The accounts and groups that have direct permission to view the file. Also includes users with indirect access to the file if you chose this option during export. |
#DateCreated | GoogleDrive/DateCreated | Date | The date a Google file was created in Drive. For non-Google files, the date the file was uploaded to Drive. |
#DateModified | GoogleDrive/DateModified | Date | The date the file was last modified. |
#Title | Google/Title | Long Text | The file name as assigned by the user. Because some operating systems cannot expand zip files with extremely long file names, Vault truncates the file name at 128 characters during export. The value shown by the #Title tag isn't truncated. |
DocumentType | GoogleDrive/DocumentType | Long Text | The file type for Google files. For the possible values, see Google's Vault export contents web page. |
Others | GoogleDrive/Others | Multiple Choice | The accounts from your query that have indirect access to the file if you opted to exclude access level information during export. May also include users for whom Vault could not determine permission levels at the time of export. |
DocParentID | Google/DocParentID | Long Text | For sites, the site ID. |
SharedDriveID | Google/SharedDriveID | Long Text | The identifier of the shared drive that contains the file, if applicable. |
SourceHash | Google/SourceHash | Long Text | A unique hash value for each version of a file. Can be used to deduplicate file exports and verify the exported file is an exact copy of the source file. Supported by Google Docs, Sheets, and Slides files only. |
FileName | GoogleDrive/FileName | Long Text | The file name. Use this value to correlate the metadata with the file in the export ZIP file. |
FileSize | GoogleDrive/FileSize | Whole Number | The size of the file in bytes. |
Hash | GoogleDrive/FileHash | Long Text | The MD5 hash of the file. |
UserQuery | Google/UserQuery | Long Text | The query submitted by the Vault user that retrieved the files included in this export. |
TimeZone | Google/TimeZone | Long Text | The time zone used for date-based searches. |
Custodians | Google/Custodians | Long Text | The email addresses of the users whose accounts were searched. If you searched for content rather than individual user accounts, there are no custodians listed here. |
Labels | GoogleDrive/Labels | Multiple Choice | Labels applied to the file by Google Drive or the user. |
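To preview how the .xml fields line up with the Relativity source field names, the metadata file can be read with a short script. The sketch below is illustrative only. The XML layout in `SAMPLE_XML` (a `Document` element with a `DocID` attribute and `Tag` children carrying `TagName` and `TagValue` attributes) is an assumption about the export format and should be checked against a real export; the sample values and the function name `read_drive_metadata` are hypothetical, and only a few rows of the table are included in the mapping.

```python
import xml.etree.ElementTree as ET

# Assumed layout of the Drive metadata file; verify against an actual export.
SAMPLE_XML = """<Root>
  <Batch>
    <Documents>
      <Document DocID="doc-001">
        <Tags>
          <Tag TagName="#Author" TagValue="owner@example.com"/>
          <Tag TagName="#Title" TagValue="Budget"/>
          <Tag TagName="Collaborators" TagValue="a@example.com,b@example.com"/>
        </Tags>
      </Document>
    </Documents>
  </Batch>
</Root>"""

# Subset of the table above: .xml field -> Relativity source field name.
SOURCE_FIELDS = {
    "DocID": "GoogleDrive/DocID",
    "#Author": "GoogleDrive/Author",
    "#Title": "Google/Title",
    "Collaborators": "GoogleDrive/Collaborators",
}


def read_drive_metadata(xml_text):
    """Return one dict per Document, keyed by Relativity source field name."""
    root = ET.fromstring(xml_text)
    records = []
    for doc in root.iter("Document"):
        record = {SOURCE_FIELDS["DocID"]: doc.get("DocID")}
        for tag in doc.iter("Tag"):
            name = tag.get("TagName")
            if name in SOURCE_FIELDS:  # skip fields outside the sample mapping
                record[SOURCE_FIELDS[name]] = tag.get("TagValue")
        records.append(record)
    return records


records = read_drive_metadata(SAMPLE_XML)
```

Each entry in `records` is keyed by the source field names from the table, which makes it easy to compare against the fields you plan to map.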
The following table lists the metadata fields found in the Gmail .csv file. Use these fields for mapping Gmail data.
Google Mail .csv field | Relativity source field name | Field type | Description | Notes |
---|---|---|---|---|
Rfc822MessageId | Google/Rfc822MessageId | Long Text | A message ID that is the same for the receiver's and sender's messages. Use this value to correlate metadata with the message in an MBOX export. For classic Hangouts, the value is for the first message in the thread. | |
GmailMessageId | Google/GmailMessageId | Long Text | A unique message ID. Use this value to manage specific messages with the Gmail API. For classic Hangouts, the value is for the first message in the thread. | |
Account | Google/Account | Long Text | The account that had the message in their inbox. For example, user1@company.com received a message sent to groupA@company.com because they are a member of the group. If a search returns that message because it was in user1's Inbox, then the value of To is groupA@company.com while the value of Account is user1@company.com. | |
From | Google/From | Long Text | The sender account. | |
To | Google/To | Long Text | The recipient account. Multiple recipients are comma-separated and the list is in double quotes. | Gmail only |
CC | Google/CC | Multiple Choice | Accounts in the cc: field. | Gmail only |
BCC | Google/BCC | Multiple Choice | Accounts in the bcc: field. | Gmail only |
Subject | Google/Subject | Long Text | The message subject. | Gmail only |
Labels | Google/Labels | Multiple Choice | Labels applied to the message by Gmail or the user. | Gmail only |
DateSent | Google/DateSent | Date | The message send date in UTC, yyyy-MM-dd'T'HH:mm:ssZZZZ. | Gmail only |
DateRecieved | Google/DateRecieved | Date | The message received date, yyyy-MM-dd'T'HH:mm:ssZZZZ. | Gmail only |
SubjectAtStart | Google/SubjectAtStart | Long Text | The subject of the conversation when the first message was sent. | Classic Hangouts only |
SubjectAtEnd | Google/SubjectAtEnd | Long Text | The subject of the conversation when the last message was sent. | Classic Hangouts only |
DateFirstMessageSent | Google/DateFirstMessageSent | Date | The time stamp for when the first message in a conversation was sent. | Classic Hangouts only |
DateLastMessageSent | Google/DateLastMessageSent | Date | The time stamp for when the last message in a conversation was sent. | Classic Hangouts only |
DateFirstMessageReceived | Google/DateFirstMessageReceived | Date | The time stamp for when the first message in a conversation was received. | Classic Hangouts only |
DateLastMessageReceived | Google/DateLastMessageReceived | Date | The time stamp for when the last message in a conversation was received. | Classic Hangouts only |
ThreadedMessageCount | Google/ThreadedMessageCount | Decimal | The number of messages in the conversation. | Classic Hangouts only |
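The Gmail .csv file can be inspected in a similar way before mapping. The following sketch uses only the Python standard library. The sample row, the function name `read_gmail_metadata`, and the `+0000` offset style in the date are assumptions for illustration; the column names come from the table above, and the handling of `To` follows its description (comma-separated recipients inside double quotes).

```python
import csv
import io
from datetime import datetime

# Hypothetical sample row; a real export contains more columns.
SAMPLE_CSV = (
    "Rfc822MessageId,Account,From,To,Subject,DateSent\n"
    "<abc@mail.example.com>,user1@company.com,sender@company.com,"
    '"groupA@company.com,user2@company.com",Status,2023-01-05T10:15:30+0000\n'
)


def read_gmail_metadata(csv_text):
    """Parse the metadata rows, splitting recipients and parsing the send date."""
    rows = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # The csv module removes the surrounding quotes; split the list itself.
        row["To"] = [addr.strip() for addr in row["To"].split(",") if addr.strip()]
        # Assumes an offset such as +0000 at the end of the time stamp.
        row["DateSent"] = datetime.strptime(row["DateSent"], "%Y-%m-%dT%H:%M:%S%z")
        rows.append(row)
    return rows


rows = read_gmail_metadata(SAMPLE_CSV)
```

The `Rfc822MessageId` value in each row is the one to use when correlating a metadata row with its message in the MBOX export.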