2153195e-2fff-49ee-a636-75d85c4910ed
Analytics engine heap memory exceeds 95% on at least one host
Description
Monitors heap memory usage of the Analytics engine. The alert fires when heap usage exceeds the defined threshold (95%) while a job is actively running, which can indicate performance degradation or a memory leak.
Resolution Guidance
Impact When Active
While this alert is active, heap memory on the Analytics engine is above the critical threshold on at least one host.
This can result in:
- Slower processing of large data sets
- Job failures or unexpected behavior due to insufficient memory
- Potential memory leaks causing long-term stability issues
How To Resolve
- Identify Active Analytics Jobs
  - Check which jobs are currently running on the Analytics engine. Focus on:
    - Large index builds
    - Structured analytics jobs
    - Concurrent operations
  - Use Relativity Job Monitor or database queries to identify active workloads.
- Review Heap Usage
  - Query the memory_used_pct field in the metrics-* index to confirm actual heap usage.
  - If memory usage remains consistently over 90–95%, continue with the steps below.
- Evaluate JVM Heap Size Configuration
  - Check the current JVM settings (-Xmx and -Xms) in the env.cmd file:
\CAAT\bin\env.cmd
General Sizing Guidelines (from Relativity):

| Server Role | Recommended -Xmx |
| --- | --- |
| Structured Analytics only | ~85% of total RAM (leave 10 GB for DB) |
| Indexing only | ~85% of total RAM (leave 10 GB for DB) |
| Combined (Indexing + Structured) | ~85% of total RAM (leave 10 GB for DB) |
Adjust -Xmx accordingly and restart the CAAT service for changes to apply.
Follow the instructions provided in the [Relativity documentation](https://help.relativity.com/Server2024/Content/System_Guides/Environment_Optimization_Guide/Configuring_the_Analytics_server.htm#JavaheapsizeJVM) for configuring the CAAT environment.
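As a rough illustration of the sizing guideline above, the sketch below computes 85% of total RAM and reports whether the remainder still leaves 10 GB free. This is only one possible reading of "~85% of total RAM (leave 10 GB for DB)"; the function name `suggest_xmx_gb` and its parameters are hypothetical, and the Relativity documentation remains the authority on the actual value to set.

```python
from typing import Tuple


def suggest_xmx_gb(
    total_ram_gb: float,
    heap_fraction: float = 0.85,   # "~85% of total RAM" from the table
    reserve_gb: float = 10.0,      # "leave 10 GB for DB" from the table
) -> Tuple[float, bool]:
    """Return (candidate -Xmx in GB, whether the reserve is still met).

    Assumption: the heap is sized as a fraction of total RAM, and the
    reserve is checked against whatever RAM is left over afterwards.
    """
    if total_ram_gb <= 0:
        raise ValueError("total_ram_gb must be positive")
    xmx_gb = total_ram_gb * heap_fraction
    remaining_gb = total_ram_gb - xmx_gb
    return xmx_gb, remaining_gb >= reserve_gb


if __name__ == "__main__":
    for ram in (64, 128):
        xmx, reserve_ok = suggest_xmx_gb(ram)
        print(f"{ram} GB RAM: candidate -Xmx ~{xmx:.1f} GB, reserve met: {reserve_ok}")
```

When the reserve check fails, as it does for a 64 GB host under this reading, a smaller heap fraction is likely needed to keep 10 GB free.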
- Optimize or Restructure Workloads
  - Break up large analytics jobs into smaller batches.
  - Optimize training sets and reduce number of documents per job.
  - Avoid concurrent resource-intensive jobs on the same server.
- Disable Unused Analytics Indexes
  - Navigate to any unused Analytics indexes and click “Disable Queries” to free RAM.
  - Use the MaxAnalyticsIndexIdleDays setting to automate this.
- Restart CAAT if Memory Is Fully Consumed
  - If the Analytics engine becomes unresponsive:
    - Restart the Relativity Analytics Engine (CAAT) Windows service.
Long-Term Recommendations
- Monitor heap usage trends using telemetry or APM tools.
- Increase physical memory if usage consistently trends high.
- Scale horizontally by adding dedicated servers for indexing or structured analytics.
- Follow Relativity's memory formula: Documents × 6000 = JVM bytes required (for example, 1M documents ≈ 6 GB heap).
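The memory formula in the last recommendation can be worked through with a short sketch. It assumes the 6000 bytes-per-document factor quoted above and decimal gigabytes (10^9 bytes); the function names are illustrative only.

```python
BYTES_PER_DOCUMENT = 6000  # factor quoted in the recommendation above


def estimate_heap_bytes(document_count: int) -> int:
    """Estimate JVM heap bytes as Documents x 6000."""
    if document_count < 0:
        raise ValueError("document_count must be non-negative")
    return document_count * BYTES_PER_DOCUMENT


def bytes_to_gb(byte_count: int) -> float:
    """Convert bytes to decimal gigabytes (1 GB = 10**9 bytes)."""
    return byte_count / 1_000_000_000


if __name__ == "__main__":
    docs = 1_000_000
    heap = estimate_heap_bytes(docs)
    print(f"{docs} documents -> {heap} bytes (~{bytes_to_gb(heap):.1f} GB)")
```

For 1,000,000 documents this yields 6,000,000,000 bytes, matching the 6 GB example in the text.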
Alert Details
Alert Condition Details
| Name | Value | Description |
| --- | --- | --- |
| Rule Type | Elasticsearch query | |
| Data View | metrics-* | |
| Filter Query | `FROM metrics-* \| EVAL memory_used_pct = (jvm.memory.used / jvm.memory.limit) * 100 \| WHERE memory_used_pct > 95 \| KEEP jvm.memory.used, jvm.memory.limit, memory_used_pct, *` | Fetches the data where Analytics engine heap memory usage exceeds 95% |
| Threshold | > 95% | The alert triggers when Analytics engine heap memory usage exceeds 95% |
| Time Window | 5 minutes | Evaluates data from the last 5 minutes |
| Rule schedule | 1 minute | Runs the check every minute |
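To make the structure of the filter query easier to see, the sketch below assembles it from its individual ES|QL commands, joined by pipes, and wraps it in a JSON body of the form `{"query": ...}` that the Elasticsearch ES|QL query API accepts. The helper names and the parameterised threshold are assumptions for illustration; the sketch only builds strings and does not contact a cluster.

```python
import json


def build_heap_alert_query(index_pattern: str = "metrics-*", threshold_pct: float = 95) -> str:
    """Join the rule's ES|QL commands with pipes, one command per stage."""
    commands = [
        f"FROM {index_pattern}",
        "EVAL memory_used_pct = (jvm.memory.used / jvm.memory.limit) * 100",
        f"WHERE memory_used_pct > {threshold_pct}",
        "KEEP jvm.memory.used, jvm.memory.limit, memory_used_pct, *",
    ]
    return " | ".join(commands)


def build_request_body(query: str) -> str:
    """Serialise the query into a JSON request body (not sent anywhere here)."""
    return json.dumps({"query": query})


if __name__ == "__main__":
    print(build_request_body(build_heap_alert_query()))
```

One point worth verifying against the index mappings, which are not stated here: if both jvm.memory fields are mapped as integer types, ES|QL division truncates toward zero, so wrapping one operand in `TO_DOUBLE(...)` may be needed for the percentage to come out as expected.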
Alert Metric Details
Metric Name: memory_used_pct
Metric Description: Calculates the percentage of memory used by dividing jvm.memory.used by jvm.memory.limit and multiplying by 100. An alert is triggered when memory_used_pct exceeds 95%, indicating that the Analytics engine heap memory usage is critically high.
Metric Attributes:
| Attribute Name | Description | Value |
| --- | --- | --- |
| jvm.memory.limit | Indicates the maximum memory allocation for the JVM in bytes | Amount of memory available to the JVM |
| jvm.memory.used | Indicates the amount of memory used by the JVM in bytes | Amount of memory used by the JVM |
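The metric calculation described above can be reproduced outside the rule with a minimal sketch, for example to sanity-check a pair of raw values. The function names are hypothetical, and the strict greater-than comparison mirrors the `> 95` condition in the filter query.

```python
def memory_used_pct(used_bytes: int, limit_bytes: int) -> float:
    """Percentage of JVM memory in use: used / limit * 100."""
    if limit_bytes <= 0:
        # Guard against a missing or undefined limit value.
        raise ValueError("limit_bytes must be positive")
    return used_bytes / limit_bytes * 100


def breaches_threshold(used_bytes: int, limit_bytes: int, threshold_pct: float = 95.0) -> bool:
    """True when usage is strictly above the threshold, as in the rule."""
    return memory_used_pct(used_bytes, limit_bytes) > threshold_pct


if __name__ == "__main__":
    gib = 1024 ** 3
    used, limit = 31 * gib, 32 * gib
    print(f"{memory_used_pct(used, limit):.3f}% used, alert: {breaches_threshold(used, limit)}")
```

With 31 GiB used out of a 32 GiB limit, usage is 96.875%, which is above the 95% threshold.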