Elasticsearch system requirements
The server specifications and recommendations for your Elasticsearch cluster depend on your infrastructure tier.
Elasticsearch is built on a distributed architecture made up of many servers, or nodes. A node is a single running instance of Elasticsearch in a JVM. Every node in an Elasticsearch cluster can serve one or more of three roles.
- Master nodes: responsible for managing the cluster.
- Data nodes: responsible for indexing and searching the stored data.
- Client nodes: load balancers that redirect operations to the node that holds the relevant data, while offloading other tasks.
Set up an entirely separate cluster to monitor Elasticsearch with one node that serves all three roles: master, data, and client. While this setup does not take advantage of the distributed architecture, it acts as an isolated logging system that will not affect the main cluster.
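As a rough illustration of how the three roles combine on a node, the sketch below builds a settings dictionary per node. It assumes the legacy boolean settings `node.master` and `node.data` used by older Elasticsearch releases (newer releases use a single `node.roles` list instead) and leaves every other setting out. The `settings_for` helper and the role names are hypothetical and are not part of any Relativity or Elastic tooling.

```python
# Hypothetical mapping from the role names used on this page to the legacy
# boolean node settings (an assumption; newer releases use `node.roles`).
ROLE_SETTINGS = {
    "master": {"node.master": True, "node.data": False},
    "data": {"node.master": False, "node.data": True},
    # A client node is neither master-eligible nor a data holder;
    # it only routes requests to the nodes that hold the data.
    "client": {"node.master": False, "node.data": False},
}


def settings_for(*roles):
    """Combine the settings for one or more roles on a single node."""
    unknown = [role for role in roles if role not in ROLE_SETTINGS]
    if unknown:
        raise ValueError(f"unknown role(s): {unknown}")
    combined = {"node.master": False, "node.data": False}
    for role in roles:
        for key, enabled in ROLE_SETTINGS[role].items():
            # A setting is enabled if any of the requested roles needs it.
            combined[key] = combined[key] or enabled
    return combined
```

Under these assumptions, the single-node monitoring cluster described above corresponds to `settings_for("master", "data", "client")`, which enables both settings on the same node.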
Infrastructure considerations
Consider the following factors when determining the infrastructure requirements for creating an Elasticsearch environment:
- Infrastructure tier: when you build out your initial Relativity environment, we determine a tier level of 1, 2, or 3 based on the number of users, SQL sizes, and the amount of data and activity in your system.
- Virtual versus physical servers: although Elastic recommends physical servers, our implementation does not require them. You can use virtual servers for all nodes.
- Storage type: Elasticsearch is a distributed system, and you should run it on storage local to each server. SSDs are not required.
- Network connectivity: because of the distributed architecture, network connectivity can impact performance, especially during peak activity. Consider 10 Gb networking as you move up to the higher tiers.
- Client nodes: larger clusters that do not perform heavy aggregations or searches against your data may perform better without client nodes. In that case, use a master and data node configuration with a load balancer to distribute requests across your cluster.
Note: Elasticsearch will not allocate new shards to a node once it has more than 85% of its disk used.
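The 85% limit in the note can be turned into a simple capacity check when planning disk sizes. The following is a minimal sketch that only does the arithmetic; the function names and the `limit` parameter are hypothetical, and the code does not query a live cluster.

```python
# Threshold taken from the note above; all names here are illustrative only.
SHARD_ALLOCATION_DISK_LIMIT = 0.85


def disk_used_fraction(total_gb, free_gb):
    """Return the fraction of the disk that is in use (0.0 to 1.0)."""
    if total_gb <= 0:
        raise ValueError("total_gb must be positive")
    if not 0 <= free_gb <= total_gb:
        raise ValueError("free_gb must be between 0 and total_gb")
    return (total_gb - free_gb) / total_gb


def accepts_new_shards(total_gb, free_gb, limit=SHARD_ALLOCATION_DISK_LIMIT):
    """True while disk usage has not exceeded the limit."""
    return disk_used_fraction(total_gb, free_gb) <= limit
```

For example, a 2000 GB data node with 400 GB free is 80% full and would still receive new shards, while the same node with 200 GB free is 90% full and would not.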
Elasticsearch cluster system requirements
The number of nodes required and the specifications for the nodes change depending on both your infrastructure tier and the amount of data that you plan to store in Elasticsearch.
Notes:
- These recommendations are for audit only.
- Disk specs for data nodes reflect the maximum size allowed per node. Smaller disks can be used for the initial setup, with plans to expand on demand.
Test (500 GB)
Node type | # of nodes needed | CPU | RAM (GB) | DISK (GB)
Primary/Data | 1 | 4 | 32 | 500
Tier 1 (1 TB)
Node type | # of nodes needed | CPU | RAM (GB) | DISK (GB)
Primary/Data | 1 | 4 | 32 | 1000
Data | 1 | 4 | 32 | 1000
Tier 2 (3 TB)
Node type | # of nodes needed | CPU | RAM (GB) | DISK (GB)
Primary/Data | 3 | 4 | 32 | 2000
Tier 3 (4-15 TB)
Node type | # of nodes needed | CPU | RAM (GB) | DISK (GB)
Data | 1-15 (scale on demand) | 4 | 32 | 2000
Primary/Data | 3 | 4 | 8 | 2000
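To compare the tiers side by side, the tables above can be captured in a small data structure. The sketch below transcribes the rows as listed and adds up the provisioned disk per tier. The `NodeSpec` class, the `TIERS` dictionary, and the helper function are hypothetical names introduced for illustration. Because the disk figures are per-node maximums, the totals are upper bounds on provisioned disk, not usable capacity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeSpec:
    node_type: str
    min_nodes: int
    max_nodes: int
    cpu: int
    ram: int
    disk_gb: int  # maximum disk size allowed per node


# Rows transcribed from the tables above.
TIERS = {
    "Test": [NodeSpec("Primary/Data", 1, 1, 4, 32, 500)],
    "Tier 1": [
        NodeSpec("Primary/Data", 1, 1, 4, 32, 1000),
        NodeSpec("Data", 1, 1, 4, 32, 1000),
    ],
    "Tier 2": [NodeSpec("Primary/Data", 3, 3, 4, 32, 2000)],
    "Tier 3": [
        NodeSpec("Data", 1, 15, 4, 32, 2000),  # scale on demand
        NodeSpec("Primary/Data", 3, 3, 4, 8, 2000),
    ],
}


def provisioned_disk_range_gb(tier):
    """Return (minimum, maximum) total disk in GB across all nodes of a tier."""
    specs = TIERS[tier]
    low = sum(spec.min_nodes * spec.disk_gb for spec in specs)
    high = sum(spec.max_nodes * spec.disk_gb for spec in specs)
    return low, high


if __name__ == "__main__":
    for name in TIERS:
        print(name, provisioned_disk_range_gb(name))
```

Only Tier 3 produces a range rather than a single value, because its data node count scales from 1 to 15.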
To assess the sizes of a workspace’s activity data and extracted text, contact Relativity Support and request the AuditRecord and ExtractedText Size Gatherer script.
If you have further questions after running the script, our team can review the amount of activity and monitoring data you want to store in Elasticsearch and provide a personalized recommendation of monitoring nodes required.