Custom Page Deployment Manager has not updated its status recently

Description

This alert is active when at least one Custom Page Deployment Manager agent has not updated for 3 or more minutes. Normally, the "kCura Web Processing Manager" Windows service periodically updates these agents. Each update checks whether the agent has been deleted; sets the enabled status, logging level, and interval; updates the Agents table in the EDDS database; and replaces the currently running agent if a new version has been uploaded to Relativity. If the alert is active, these updates have not taken place within the last 3 minutes, and the agent may be running the wrong version or have the wrong settings.
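The staleness condition described above can be sketched as follows. This is a minimal illustration, not Relativity's implementation: the function name `is_stale`, the `last_update` and `now` parameters, and the use of UTC timestamps are assumptions made for the example. Only the 3-minute threshold comes from the description.

```python
from datetime import datetime, timedelta, timezone

# Threshold taken from the alert description; all names here are hypothetical.
STALE_AFTER = timedelta(minutes=3)


def is_stale(last_update: datetime, now: datetime) -> bool:
    """Return True when the agent has not updated for 3 or more minutes."""
    return now - last_update >= STALE_AFTER


# Fixed reference time so the example is deterministic.
now = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
print(is_stale(now - timedelta(minutes=2), now))  # False
print(is_stale(now - timedelta(minutes=3), now))  # True
```

An agent whose last update is exactly 3 minutes old counts as stale here, matching the "3 or more minutes" wording.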

Alert Details

Alert ID: 62c62344-a15e-404b-8057-48e5ae7eb9a9

Tags:

  • FeatureDomain:Custom Pages
  • PageType:SavedSearch
  • PageID:cada4792-9aae-449e-832b-936b68f99816
  • CreatedBy:Relativity
  • Resolution

Metric/Log/Trace Details

Metric Name: relsvr.agent.updating

Metric Description: Whether the agent has updated in the last 3 minutes

Metric Attributes:

Attribute Name | Description | Value
labels.application_name | Application Name | Name of the Relativity application the agent comes from or Environment Administration & Operations for default applications
labels.name | Name of the metric |
labels.relsvr_system | System Name | Agents
labels.relsvr_subsystem | Subsystem Name |
labels.relsvr_agent_name | Name of the agent that is not updating |
labels.message | A string containing the server name, agent name, time of the last update from the agent, whether the agent is enabled, and the agent run interval |
labels.relsvr_server_name | The name of the Web Background Processing server |
labels.relsvr_server_state | Whether the resource server is active |
labels.relsvr_agent_status | The current status of the agent | Not updating
labels.relsvr_agent_type_name | The name of the agent type of the stale agent |
labels.relsvr_agent_type_guid | The GUID of the agent type of the stale agent |

Rule details

Alert Condition Description:

Name | Value | Description
Rule Type | Elastic Query |
Data View | metrics-* |
Filter Query | relsvr.agent.updating: 0 and labels.relsvr_agent_type_guid: BC5A8102-C038-432E-B3A7-C34C86412996 | If the agent has not updated in the last 3 min
Group | Count |
Threshold | > 0 | If any agents are stale
Time Window | Last 1 min |
Frequency | 3 min |
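
The rule above amounts to filtering metric documents, counting the matches, and comparing the count with the threshold. The sketch below mimics that logic in plain Python purely for illustration; it does not call Elasticsearch or Kibana. The document shape (a flat `relsvr.agent.updating` key plus a nested `labels` dictionary), the function names, and the sample records are assumptions. Only the field names, the GUID, and the `> 0` threshold are taken from the rule.

```python
from typing import Any, Dict, List

# Agent type GUID copied from the Filter Query in the rule.
AGENT_TYPE_GUID = "BC5A8102-C038-432E-B3A7-C34C86412996"


def matches_filter(doc: Dict[str, Any]) -> bool:
    """Mirror the Filter Query: metric value 0 and the expected agent type GUID."""
    labels = doc.get("labels", {})
    return (
        doc.get("relsvr.agent.updating") == 0
        and labels.get("relsvr_agent_type_guid") == AGENT_TYPE_GUID
    )


def alert_fires(docs: List[Dict[str, Any]]) -> bool:
    """Group = Count, Threshold = '> 0': fire if any document matches."""
    count = sum(1 for doc in docs if matches_filter(doc))
    return count > 0


# Hypothetical documents from one time window; server names are made up.
sample = [
    {"relsvr.agent.updating": 1,
     "labels": {"relsvr_agent_type_guid": AGENT_TYPE_GUID,
                "relsvr_server_name": "SERVER-01"}},
    {"relsvr.agent.updating": 0,
     "labels": {"relsvr_agent_type_guid": AGENT_TYPE_GUID,
                "relsvr_server_name": "SERVER-02"}},
]
print(alert_fires(sample))  # True
```

A single stale agent is enough to satisfy the threshold, which is why one matching document in the sample triggers the alert.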

Requires User Intervention

  • No: define time window before the alert fires
    • Min time before the alert is active/inactive: 3 minutes

Kibana saved search: "[Relativity] Custom Page Deployment Managers without recent updates"

The "One or more agents are disabled" alert and/or the "One or more resource servers are inactive" alert are likely to also be triggered.