Kingsoft Cloud-Documentation-Preparations

Elasticsearch（KES）

Estimate the storage capacity

The following table describes the major factors that affect the disk storage of KES.

Factor	Description
Number of replicas	The default and recommended number of replicas is 1. In scenarios that can withstand unexpected data loss, you can configure 0 replicas.
Indexes	In addition to the raw data, KES needs to store indexes and column store data. The required storage space in KES is usually 10% larger than the source data in size. This estimation does not consider fields such as _all.
Internal tasks	Segment merging, KES translogs, and other logs occupy about 20% of the disk space.
Resources reserved by the operating system	The Linux operating system reserves 5% of the disk space for the root user by default. The reserved disk space is used to handle key processes, recover the system, and prevent disk fragmentation.
Safety threshold	Reserve 20% of the disk space as a safety margin.

Based on the preceding factors and the default of one replica, the minimum disk space is about 3.6 times the source data size. The minimum disk space is calculated based on the following formula:

Disk space = Source data size × (1 + Number of replicas) × (1 + Index space) / (1 - Linux reserved space) / (1 - Internal task space) / (1 - Safety threshold)
= Source data size × (1 + Number of replicas) × 1.1 / 0.95 / 0.8 / 0.8
≈ Source data size × (1 + Number of replicas) × 1.8
= Source data size × 3.6 (with 1 replica)
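The calculation can be sketched in Python as follows. This is only an illustration: the function name `estimate_disk_space` and its parameter names are hypothetical, and the default values are the percentages listed in the table above.

```python
def estimate_disk_space(source_gb, replicas=1, index_overhead=0.10,
                        linux_reserved=0.05, internal_tasks=0.20,
                        safety_threshold=0.20):
    """Return the estimated minimum disk space in GB.

    All ratios are fractions (0.10 means 10%). The defaults mirror the
    factors described in the table; adjust them for your own workload.
    """
    copies = 1 + replicas                      # primary plus replicas
    indexed = source_gb * copies * (1 + index_overhead)
    # Each reserved share reduces the usable part of the disk,
    # so divide by the fraction that remains available.
    usable_fraction = ((1 - linux_reserved)
                       * (1 - internal_tasks)
                       * (1 - safety_threshold))
    return indexed / usable_fraction


# Example: 100 GB of source data with the default single replica.
needed = estimate_disk_space(100)
print(f"{needed:.1f} GB")  # about 3.6 times the source data size
```

Without rounding, the multiplier per copy is about 1.81 rather than 1.8, so the result is slightly above 3.6 times the source data size. With 0 replicas the estimate is halved.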

Estimate the size and number of shards

The size and number of shards greatly affect the stability and performance of a KES cluster. Each index in the KES cluster requires a proper shard plan. By default, five shards are planned for each index.

Keep each shard no larger than 50 GB. We recommend a shard size of 10 GB to 50 GB.
The total number of primary and replica shards is preferably equal to the number of nodes, or an integer multiple of it.
Too many shards make the cluster state difficult to manage. We recommend that you plan up to 20 shards per 1 GB of memory on each instance in a KES cluster. For example, if an instance has 10 GB of memory, the number of shards on this instance should not exceed 200.
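The recommendations above can be turned into a rough planning aid. The following Python sketch is one possible interpretation, not an official KES tool: the function names are hypothetical, and rounding the primary shard count up to a multiple of the node count is an assumption made so that the shards spread evenly across nodes.

```python
import math


def max_shards_per_instance(memory_gb, shards_per_gb=20):
    """Upper bound on shards for one instance (20 shards per 1 GB of memory)."""
    return memory_gb * shards_per_gb


def suggest_primary_shards(index_size_gb, node_count, max_shard_gb=50):
    """Suggest a primary shard count for one index.

    1. Use enough shards that none exceeds max_shard_gb.
    2. Round up to the next multiple of node_count (an assumption,
       so that shards can be distributed evenly across the nodes).
    """
    by_size = max(1, math.ceil(index_size_gb / max_shard_gb))
    return math.ceil(by_size / node_count) * node_count


# Example: a 500 GB index on a 4-node cluster.
shards = suggest_primary_shards(500, 4)
print(shards, "primary shards,", round(500 / shards, 1), "GB each")
print(max_shards_per_instance(10), "shards at most on a 10 GB instance")
```

After choosing a shard count, check that the resulting shard size still falls in the recommended 10 GB to 50 GB range, and that the total number of shards placed on each instance stays below the memory-based limit.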

Estimate the cluster specifications

The computing resources of KES are mainly consumed by writes and queries. The complexity and proportions of writes and queries vary in different scenarios. Therefore, it is more difficult to estimate computing resources than to estimate storage resources. We recommend that you first estimate the amount of storage resources and then preliminarily select computing resources. You can determine whether the computing resources are sufficient during testing.

We recommend that you start with at least three nodes to avoid split-brain and to tolerate the failure of a node.

After you preliminarily select the instance type, you can test it with real data. You can determine whether the instance type is appropriate by observing monitoring metrics such as the CPU usage, write performance, QPS, and the number of rejected writes or queries. In addition, we recommend that you configure alarms for these metrics so that resource shortages are identified promptly in production.

Understand the suggestions on model selection

Local SSD: This model is cost-effective and suitable for customers that require a low storage capacity and high performance.
Elastic Block Storage (EBS): This model is suitable for customers that require a high storage capacity, stable performance, and high data availability. For example, this model is ideal for processing audit logs and financial data. The cost of the EBS model is higher than that of the Local SSD model.
Elastic Physical Compute (EPC): This model is suitable for customers with large-scale business that require maximum performance.
