Analytics engine heap memory exceeds 95% on at least one host

Description

Monitor the heap memory usage of the Analytics engine. Alert if heap usage reaches or exceeds a defined threshold while a job is actively running, indicating potential performance degradation or memory leaks.

Resolution Guidance

Impact When Active

When this alert is active, it indicates that the heap memory on the Analytics engine has reached or exceeded a critical threshold. This can result in:

  • Slower processing of large data sets
  • Job failures or unexpected behavior due to insufficient memory
  • Potential memory leaks causing long-term stability issues

How To Resolve

  1. Identify Active Analytics Jobs
    • Check which jobs are currently running on the Analytics engine. Focus on:
      • Large index builds
      • Structured analytics jobs
      • Concurrent operations
    • Use Relativity Job Monitor or database queries to identify active workloads.
  2. Review Heap Usage
    • Query the memory_used_pct field in the metrics-* index to confirm actual heap usage.
    • If memory usage remains consistently above 90-95%, continue with the steps below.
  3. Evaluate JVM Heap Size Configuration
    • Check the current JVM settings (-Xmx and -Xms) in the env.cmd file: \CAAT\bin\env.cmd

General Sizing Guidelines (from Relativity):

| Server Role | Recommended -Xmx |
| --- | --- |
| Structured Analytics only | ~85% of total RAM (leave 10 GB for DB) |
| Indexing only | ~85% of total RAM (leave 10 GB for DB) |
| Combined (Indexing + Structured) | ~85% of total RAM (leave 10 GB for DB) |
Adjust -Xmx accordingly and restart the CAAT service for changes to apply.

Follow the instructions provided in the [Relativity documentation](https://help.relativity.com/Server2024/Content/System_Guides/Environment_Optimization_Guide/Configuring_the_Analytics_server.htm#JavaheapsizeJVM) for configuring the CAAT environment.
  4. Optimize or Restructure Workloads
    • Break up large analytics jobs into smaller batches.
    • Optimize training sets and reduce the number of documents per job.
    • Avoid concurrent resource-intensive jobs on the same server.
  5. Disable Unused Analytics Indexes
    • Navigate to any unused Analytics indexes and click "Disable Queries" to free RAM.
    • Use the MaxAnalyticsIndexIdleDays setting to automate this.
  6. Restart CAAT if Memory Is Fully Consumed
    • If the Analytics engine becomes unresponsive, restart the Relativity Analytics Engine (CAAT) Windows service.
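
The heap check in step 2 can be sketched in Python as follows. This is a minimal illustration, not part of the Relativity or Elastic tooling: the function names, the 90% default, and the sample values are assumptions chosen for the example. The inputs correspond to the jvm.memory.used and jvm.memory.limit values described under Alert Metric Details.

```python
from typing import Sequence


def heap_used_pct(used_bytes: float, limit_bytes: float) -> float:
    """Return heap usage as a percentage of the JVM heap limit."""
    if limit_bytes <= 0:
        raise ValueError("limit_bytes must be positive")
    return used_bytes * 100.0 / limit_bytes


def is_sustained_high(samples_pct: Sequence[float], threshold_pct: float = 90.0) -> bool:
    """True only if at least one sample exists and every sample is above the threshold."""
    return len(samples_pct) > 0 and all(p > threshold_pct for p in samples_pct)


# Hypothetical readings against a hypothetical 32 GiB heap limit.
limit = 32 * 1024**3
used_samples = [30 * 1024**3, 31 * 1024**3, 30.5 * 1024**3]
samples = [heap_used_pct(u, limit) for u in used_samples]
print(is_sustained_high(samples))  # prints True: 93.75, 96.875 and 95.3125 are all above 90
```

Requiring every sample in the window to be above the threshold avoids reacting to a single garbage-collection spike; a stricter or looser rule may suit a given environment better.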

Long-Term Recommendations

  • Monitor heap usage trends using telemetry or APM tools.
  • Increase physical memory if usage consistently trends high.
  • Scale horizontally by adding dedicated servers for indexing or structured analytics.
  • Follow Relativity's memory formula: Documents × 6000 = JVM bytes required (e.g., 1M docs ≈ 6 GB heap)
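
As a worked example of the memory formula, the sketch below multiplies a document count by 6000 bytes. The function names are illustrative, and the conversion assumes decimal gigabytes (1 GB = 1,000,000,000 bytes); in binary units the same figure is roughly 5.6 GiB.

```python
BYTES_PER_DOCUMENT = 6000  # multiplier taken from the formula above


def estimated_heap_bytes(document_count: int) -> int:
    """Estimate the JVM heap bytes required for a given number of documents."""
    if document_count < 0:
        raise ValueError("document_count must not be negative")
    return document_count * BYTES_PER_DOCUMENT


def bytes_to_gb(byte_count: int) -> float:
    """Convert bytes to decimal gigabytes (assumption: 1 GB = 10**9 bytes)."""
    return byte_count / 1_000_000_000


required = estimated_heap_bytes(1_000_000)
print(required, bytes_to_gb(required))  # prints 6000000000 6.0
```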

Alert Details

Alert Condition Details

| Name | Value | Description |
| --- | --- | --- |
| Rule Type | Elasticsearch query | |
| Data View | metrics-* | |
| Filter Query | (doc['jvm.memory.used'].value / doc['jvm.memory.limit'].value) * 100 > 95 | Selects data points where Analytics engine heap memory usage exceeds 95% |
| Threshold | > 95% | The alert triggers when Analytics engine heap memory usage exceeds 95% |
| Time Window | 5 min | Evaluates data from the last 5 minutes |
| Rule schedule | 1 minute | Runs the check every 1 minute |
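
For ad hoc verification, the alert condition can be approximated as an Elasticsearch query body, built here as a Python dictionary. This is a sketch under stated assumptions, not the rule's actual definition: the @timestamp field name and the bool/filter layout are assumed, while the script source, the 5-minute window, and the 95% threshold are copied from the table above.

```python
import json

# Script source copied from the Filter Query row.
HEAP_PCT_SCRIPT = (
    "(doc['jvm.memory.used'].value / doc['jvm.memory.limit'].value) * 100 > 95"
)


def build_alert_query(window: str = "now-5m", timestamp_field: str = "@timestamp") -> dict:
    """Build a query body matching documents in the time window whose heap usage exceeds 95%."""
    return {
        "query": {
            "bool": {
                "filter": [
                    # Assumption: documents carry a date field named @timestamp.
                    {"range": {timestamp_field: {"gte": window}}},
                    # Script query evaluated per document in Painless.
                    {"script": {"script": {"lang": "painless", "source": HEAP_PCT_SCRIPT}}},
                ]
            }
        }
    }


print(json.dumps(build_alert_query(), indent=2))
```

The printed body could be sent to the _search endpoint of the metrics-* indices. If both fields happen to be mapped as integer types, Painless would apply integer division to the ratio, so it is worth confirming the field mappings before relying on the result.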

Alert Metric Details

Metric Name: jvm.memory.used

Metric Description: The alert triggers when jvm.memory.used exceeds 95% of jvm.memory.limit, the maximum heap memory available to the JVM.

Metric Attributes:

| Attribute Name | Description | Value |
| --- | --- | --- |
| jvm.memory.limit | Indicates the maximum memory allocation for the JVM in bytes | Amount of memory available to the JVM |
| jvm.memory.used | Indicates the amount of memory used by the JVM in bytes | Amount of memory used by the JVM |