18778d10-60f1-4703-ba94-759175f04ce4
One or more agent servers have not been responding for n minutes
Description
The alert is active when the 'kCura EDDS Agent Manager' Windows service is not running.
Alert Details
Alert ID: 18778d10-60f1-4703-ba94-759175f04ce4
Tags:
Each tag should follow "key:value" format.
- FeatureDomain:Agents
- PageType:Dashboard
- PageID:cd200ee0-1e61-4645-8220-83ce82914a71
- CreatedBy:Relativity
- ResolutionText:Go to 'Windows Services' and restart 'kCura EDDS Agent Manager'
- Resolution
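The "key:value" rule for tags can be checked mechanically. The following is a minimal sketch, not part of any Relativity tooling: `parse_tags` is a hypothetical helper that splits each tag on its first colon and collects any entry that does not fit the format.

```python
def parse_tags(lines):
    """Split "key:value" tag lines into a dict; collect lines that do not conform."""
    tags, malformed = {}, []
    for line in lines:
        # Drop the leading list marker ("- ") and surrounding whitespace.
        text = line.lstrip("- ").strip()
        # Split on the first colon only, so values may contain colons.
        key, sep, value = text.partition(":")
        if sep and key.strip() and value.strip():
            tags[key.strip()] = value.strip()
        else:
            malformed.append(text)
    return tags, malformed


# Tag lines copied from the list above.
tag_lines = [
    "- FeatureDomain:Agents",
    "- PageType:Dashboard",
    "- PageID:cd200ee0-1e61-4645-8220-83ce82914a71",
    "- CreatedBy:Relativity",
    "- ResolutionText:Go to 'Windows Services' and restart 'kCura EDDS Agent Manager'",
    "- Resolution",
]

tags, malformed = parse_tags(tag_lines)
print(tags["ResolutionText"])
print(malformed)
```

With the list as written, the last entry has no value part, so it is reported as malformed instead of being stored as a tag.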
Metric/Log/Trace Details
Metric Name: relsvr.agent.status
Metric Attributes:
| Attribute Name | Description | Value |
| --- | --- | --- |
| labels.agent_name | Relativity Agent Name | |
| labels.agent_type_name | Relativity Agent Type | |
| labels.application_name | Application Name | Environment Administration & Operations |
| labels.exception_message | Any exception message on Agent | |
| labels.message | Message describes the issue | Agent Manager is not responding. |
| labels.name | Name of metric | Agent Disabled |
| labels.relsvr_artifact_id | Relativity agent artifact Id | |
| labels.relsvr_subsystem | Agent Name | |
| labels.relsvr_system | System name | Agents |
| labels.relsvr_agent_status | The current status of the agent | not responding |
| labels.relsvr_agent_type | The name of the agent type of the stale agent | |
| labels.relsvr_resource_server_name | The name of the server | |
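As an illustration of how the attributes above might appear on a single metric document, here is a hedged sketch. It assumes the labels are stored as flat dotted keys, which the source does not confirm. The server name, agent name, and the helper `mismatched_labels` are hypothetical. Only the values in `EXPECTED_LABELS` come from the table.

```python
# Fixed values listed in the attribute table above.
EXPECTED_LABELS = {
    "labels.application_name": "Environment Administration & Operations",
    "labels.message": "Agent Manager is not responding.",
    "labels.name": "Agent Disabled",
    "labels.relsvr_system": "Agents",
    "labels.relsvr_agent_status": "not responding",
}


def mismatched_labels(doc):
    """Return the documented labels whose value in doc differs from the table."""
    return {
        key: doc.get(key)
        for key, expected in EXPECTED_LABELS.items()
        if doc.get(key) != expected
    }


# Hypothetical document: the two values below are placeholders, not real data.
sample_doc = {
    **EXPECTED_LABELS,
    "labels.relsvr_resource_server_name": "AGENT-SERVER-01",  # placeholder
    "labels.agent_name": "Example Agent (1)",                 # placeholder
}

print(mismatched_labels(sample_doc))
```

An empty result means the document carries every fixed value the table lists. Attributes with no documented value are not checked.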
Rule details
Alert Condition Description: The alert triggers when the count of agent servers that have not been responding is greater than 0 over the last 1 minute.
| Name | Value | Description |
| --- | --- | --- |
| Rule Type | Elastic Query | |
| Data View | metrics-* | |
| Filter Query | relsvr.agent.disabled:0 and labels.relsvr_agent_status : "not responding" | Agent not responding |
| Group | Count | Number of agents not responding |
| Threshold | > 0 | The alert triggers when the count is greater than 0 |
| Time Window | 1 min | Evaluates data from the last 1 minute |
| Frequency | 30 sec | Checks every 30 seconds |
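The rule above can be read as "count the matching documents in the last minute and fire if there is at least one". The sketch below mirrors that logic in plain Python over in-memory documents. It is not how Elastic evaluates the rule internally. The `@timestamp` field name, the flat dotted keys, and the sample documents are assumptions made for illustration.

```python
from datetime import datetime, timedelta, timezone

TIME_WINDOW = timedelta(minutes=1)  # "Time Window: 1 min"
THRESHOLD = 0                       # "Threshold: > 0"
# The rule itself is re-evaluated every 30 seconds ("Frequency: 30 sec").


def matches_filter(doc):
    """Mirror of the filter query: agent not disabled and status 'not responding'."""
    return (
        doc.get("relsvr.agent.disabled") == 0
        and doc.get("labels.relsvr_agent_status") == "not responding"
    )


def rule_fires(docs, now):
    """Count matching documents inside the time window and compare to the threshold."""
    window_start = now - TIME_WINDOW
    count = sum(
        1
        for doc in docs
        if window_start <= doc["@timestamp"] <= now and matches_filter(doc)
    )
    return count > THRESHOLD


now = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)  # arbitrary fixed time

recent = {
    "@timestamp": now - timedelta(seconds=20),
    "relsvr.agent.disabled": 0,
    "labels.relsvr_agent_status": "not responding",
}
stale = {
    "@timestamp": now - timedelta(minutes=5),
    "relsvr.agent.disabled": 0,
    "labels.relsvr_agent_status": "not responding",
}

print(rule_fires([recent, stale], now))  # True: one match inside the window
print(rule_fires([stale], now))          # False: the only match is too old
```

A document older than one minute does not count, so the alert can clear once no fresh "not responding" documents arrive.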
Requires User Intervention
- Yes: alert immediately
- Min time before the alert is active/inactive: 90 seconds
Visualization link
Windows Services Dashboard
The Host Heartbeat alert should not be in an active state.