Supported email header formats

Email threading and name normalization rely on well-formed email messages, so results may be incorrect if the extracted text isn't formatted properly. Poorly formed messages are typically produced by processing tools or email software that don't adhere to email standards.

Analytics uses a best-effort approach to parse email messages, but it cannot handle every case of a badly formed message.

Supported email header formats

The primary email is the most recent email segment, found at the very top of the document. Analytics uses the primary email header only if the Email Header fields are not set on the Analytics profile, or if a given email has no data in the linked Email Header fields. If the Email Header fields are linked on the Analytics profile and populated for a given document, the fielded data is used instead of the primary email header. Extracted text without a primary email header is also supported, as long as the Email Header fields are linked on the Analytics profile and set on the documents.
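The precedence described above can be sketched as a small decision function. This is only an illustration of the stated rules, not the actual Analytics logic; the function name, parameter names, and return values are hypothetical.

```python
from typing import Optional


def choose_header_source(
    fields_linked: bool,
    document_has_field_data: bool,
    text_has_primary_header: bool,
) -> Optional[str]:
    """Return which source would supply the primary email metadata.

    All names here are hypothetical; they only mirror the rules in the text.
    """
    # Fielded data wins when the Email Header fields are linked on the
    # profile and populated for this document.
    if fields_linked and document_has_field_data:
        return "fielded"
    # Otherwise fall back to the primary email header in the extracted text.
    if text_has_primary_header:
        return "primary_header"
    # Neither source is available for this document.
    return None
```

For example, a document with linked, populated fields uses the fielded data even when its extracted text also starts with a header, while a document with empty fields falls back to the header in the text.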

Note: Do not insert any text into the extracted text before the primary email header. Any text before the primary email header is analyzed as if it were a reply to the email header below it.

Embedded emails are those found below the primary email. Embedded headers are always used for email threading and name normalization, and they must be in a supported format for the email segment to be identified properly.

Supported email header fields

The following is a list of email header fields currently supported by email threading and name normalization for primary email headers. A header line that begins with one of these field names followed by a colon indicates an email header field. If a field spans more than one line, the continuation must immediately follow and be indented with white space. Field names are not case-sensitive, but any diacritics are required. The order of the fields in the primary email header does not matter.
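As a rough illustration of these rules, the sketch below reads header lines from the top of a text. It is not the Analytics parser. The field names in ASSUMED_FIELD_NAMES are placeholders chosen for the example and are not the supported list, and stopping at the first unrecognized line is an assumption of this sketch.

```python
# Placeholder field names for illustration only; these are NOT the
# supported list, which must come from the product documentation.
ASSUMED_FIELD_NAMES = {"from", "to", "cc", "bcc", "subject", "date", "sent"}


def parse_primary_header(text, field_names=ASSUMED_FIELD_NAMES):
    """Parse header lines at the top of text into (name, value) pairs."""
    fields = []
    for line in text.splitlines():
        if not line.strip():
            break  # a blank line ends the header
        if line[0] in " \t" and fields:
            # An indented line continues the previous field.
            name, value = fields[-1]
            fields[-1] = (name, value + " " + line.strip())
            continue
        name, sep, value = line.partition(":")
        # casefold() ignores case but keeps diacritics, so accented
        # field names must still be spelled with their accents.
        if sep and name.strip().casefold() in field_names:
            fields.append((name.strip(), value.strip()))
        else:
            break  # assumption: the first unrecognized line ends the header
    return fields
```

Given the text "From: Alice", "TO: Bob," and an indented "Carol" on the next line, the sketch returns the To field with the value "Bob, Carol", regardless of how the field name is capitalized.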

Supported date formats

The following is a list of date formats currently supported by email threading and name normalization. Note that not all date formats are supported in all languages.

Reformatting extracted text

Extracted text may require reformatting to comply with the email header format requirements. For the most reliable parsing, reconstruct emails using the following guidelines:

  1. Request that the processing vendor send a version of the extracted text without headers, or request that the extracted text be altered to meet the requirements listed under Supported email header formats.
  2. Place header fields, one per line, at the top of the file. Do not place any blank lines between fields. Limit header fields to those listed under Supported email header fields.
  3. At the end of the header, include a single blank line.
  4. Place the body of the email below the blank line. Do not include any additional text, markers, spacing, or indentation that could hide the structure of the email.

Note: Use consistent end-of-line delimiters. Typical choices are \n for Unix systems and \r\n for Windows systems.
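The reformatting guidelines above can be sketched as a small helper that assembles the header, a single blank line, and the body with one end-of-line delimiter throughout. This is a minimal sketch under stated assumptions: the function name and parameters are hypothetical, and the caller is responsible for passing only supported header fields.

```python
def rebuild_extracted_text(headers, body, eol="\r\n"):
    """Assemble extracted text: header fields, one blank line, then the body.

    headers is a list of (field_name, value) pairs; the caller should limit
    it to supported header fields. All names here are hypothetical.
    """
    header_lines = []
    for name, value in headers:
        # Collapse internal line breaks so each field occupies one line.
        single_line = " ".join(str(value).split())
        header_lines.append(f"{name}: {single_line}")

    # Normalize body line endings, then drop leading blank lines so that
    # exactly one blank line separates the header from the body.
    body_lines = body.replace("\r\n", "\n").replace("\r", "\n").split("\n")
    while body_lines and not body_lines[0].strip():
        body_lines.pop(0)

    # Join everything with the same delimiter for consistent line endings.
    return eol.join(header_lines + [""] + body_lines)
```

Passing eol="\n" produces Unix-style output; the default produces Windows-style output. Either is acceptable as long as it is used consistently across the file.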