Windows Service is stopped for at least one Resource Server
Description
This alert is triggered when a monitored Windows Service is stopped on at least one Resource Server. The following services are monitored:
- kCura EDDS Web Processing Manager
- kCura Service Host Manager
- kCura EDDS Agent Manager
- Relativity Analytics Engine
- W3SVC (World Wide Web Publishing Service)
- Elasticsearch
- apm-server
- Relativity Secret Store
- QueueManager (Invariant Queue Manager)
- RabbitMQ
Note: This alert can also include custom Windows services that have been configured through Custom JSON Configuration.
Resolution Guidance
Impact When Active
When a monitored Windows Service is stopped on a Resource Server, that server may not function as expected and may be unable to process the tasks or workloads assigned to it.
How To Resolve
From the alert in Relativity, navigate to the linked Kibana dashboard to identify:
- The affected server
- The specific Windows service that has stopped
Log in to the affected server and perform the following steps:
- Open the Windows Services management console.
- Locate the Windows service identified in Kibana.
- Start the stopped Windows service.
- Return to Kibana and verify that the service status has updated to Running.
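The manual steps above can also be scripted. The following is a minimal sketch, not an official Relativity tool. It assumes Python is available on the affected server, that the built-in Windows `sc.exe` utility prints its output in English, and that you know the service's key name, which `sc.exe` expects and which can differ from the display name shown in Kibana. The helper names `parse_service_state` and `start_if_stopped` are hypothetical.

```python
import subprocess


def parse_service_state(sc_output: str) -> str:
    """Return the state word (e.g. 'RUNNING', 'STOPPED') from `sc.exe query` output."""
    for line in sc_output.splitlines():
        stripped = line.strip()
        if stripped.startswith("STATE"):
            # The line looks like: "STATE              : 1  STOPPED"
            _, _, value = stripped.partition(":")
            parts = value.split()
            if len(parts) >= 2:
                return parts[1]
    return "UNKNOWN"


def start_if_stopped(service_name: str, run=subprocess.run) -> str:
    """Query a service and request a start only if it reports STOPPED.

    `run` is injectable so the logic can be exercised without touching
    a real server.
    """
    query = run(["sc.exe", "query", service_name], capture_output=True, text=True)
    state = parse_service_state(query.stdout)
    if state == "STOPPED":
        run(["sc.exe", "start", service_name], capture_output=True, text=True)
        return "start requested"
    return f"no action (state: {state})"
```

Calling `start_if_stopped("W3SVC")` from an elevated prompt on the affected server would request a start only when the service is stopped. Confirm the result in Kibana as described in the last step above.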
You can also try
If the Windows service fails to start or stops again after being started:
- Verify that the service Startup Type is configured correctly (e.g., Automatic).
- Review the Windows Event Logs for error messages or warnings related to the service.
- Consider restarting the server if the service continues to fail.
- Contact Relativity Support for further assistance if the issue persists.
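The first two checks above can be sketched in code. This is an illustrative example under stated assumptions, not a supported procedure. It assumes English output from `sc.exe qc` and uses the built-in `wevtutil` utility to read the System log. The Service Control Manager event IDs 7000 (service failed to start), 7031, and 7034 (service terminated unexpectedly) are chosen here as a reasonable starting filter. The helper names are hypothetical, and the query is not narrowed to a single service.

```python
def parse_start_type(sc_qc_output: str) -> str:
    """Return the startup type (e.g. 'AUTO_START') from `sc.exe qc` output."""
    for line in sc_qc_output.splitlines():
        stripped = line.strip()
        if stripped.startswith("START_TYPE"):
            # The line looks like: "START_TYPE         : 2   AUTO_START"
            _, _, value = stripped.partition(":")
            parts = value.split()
            if len(parts) >= 2:
                return " ".join(parts[1:])
    return "UNKNOWN"


def build_event_query_command(max_events: int = 20) -> list:
    """Build a wevtutil command listing recent service failure events.

    The returned list can be passed to subprocess.run on the affected server.
    """
    xpath = (
        "*[System[Provider[@Name='Service Control Manager'] "
        "and (EventID=7000 or EventID=7031 or EventID=7034)]]"
    )
    return [
        "wevtutil", "qe", "System",
        f"/q:{xpath}",        # XPath filter on provider and event IDs
        f"/c:{max_events}",   # maximum number of events to return
        "/rd:true",           # newest events first
        "/f:text",            # human-readable output
    ]
```

A startup type other than `AUTO_START` for a service that should start with the server is worth correcting before you restart the machine.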
Alert Details
Alert Condition Details
| Name | Value |
|---|---|
| Rule Type | Elasticsearch Query |
| Data View | metrics-* |
| Filter Query | relsvr.windows_service.running : 0 |
| Group | Count |
| Threshold | > 0 |
| Time Window | 90 sec |
| Frequency | 30 sec |
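The condition in the table above can be approximated as an Elasticsearch query body. The following is a rough sketch rather than the rule Kibana actually stores. It assumes the metric documents carry a standard `@timestamp` field and that the KQL filter maps to a `term` clause. The function names are hypothetical.

```python
def build_stopped_service_query(window_seconds: int = 90) -> dict:
    """Build a query body counting 'stopped' samples inside the time window."""
    return {
        "size": 0,                 # only the hit count is needed
        "track_total_hits": True,
        "query": {
            "bool": {
                "filter": [
                    # Mirrors the filter query: relsvr.windows_service.running : 0
                    {"term": {"relsvr.windows_service.running": 0}},
                    # Assumption: documents are timestamped in @timestamp
                    {"range": {"@timestamp": {"gte": f"now-{window_seconds}s"}}},
                ]
            }
        },
    }


def alert_fires(matching_count: int, threshold: int = 0) -> bool:
    """Apply the 'Count > 0' threshold from the alert condition."""
    return matching_count > threshold
```

Under these assumptions, the rule would run this check every 30 seconds against the `metrics-*` data view and fire as soon as one matching document falls inside the 90-second window.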
Alert Metric Details
Metric Name: relsvr.windows_service.running
Metric Description: The alert triggers when at least one Windows service has been stopped for at least 90 seconds.
Metric Attributes:
| Attribute Name | Description |
|---|---|
| Display Name | Name of the Windows service |
| host.name | Name of the host on which the service runs |
| Is Service Running | 0 (stopped) or 1 (running) |
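To illustrate how these attributes combine, the sketch below groups metric samples by host and lists the services reported as stopped. The record layout is an assumption made for the example: the key `display_name` and the host names are hypothetical, and real documents in `metrics-*` may name or nest these fields differently.

```python
from collections import defaultdict

# Made-up sample records; the host names are placeholders.
SAMPLE_RECORDS = [
    {"host.name": "agent01", "display_name": "kCura EDDS Agent Manager",
     "relsvr.windows_service.running": 0},
    {"host.name": "agent01", "display_name": "kCura Service Host Manager",
     "relsvr.windows_service.running": 1},
    {"host.name": "web01", "display_name": "W3SVC",
     "relsvr.windows_service.running": 0},
    # A repeated sample for the same stopped service is reported only once.
    {"host.name": "agent01", "display_name": "kCura EDDS Agent Manager",
     "relsvr.windows_service.running": 0},
]


def stopped_services_by_host(records) -> dict:
    """Map each host name to a sorted list of services whose value is 0."""
    stopped = defaultdict(set)
    for record in records:
        if record.get("relsvr.windows_service.running") == 0:
            stopped[record["host.name"]].add(record["display_name"])
    return {host: sorted(names) for host, names in stopped.items()}
```

With the sample records above, `stopped_services_by_host(SAMPLE_RECORDS)` returns `{"agent01": ["kCura EDDS Agent Manager"], "web01": ["W3SVC"]}`.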