Last updated: 2021-07-21 14:16:56
EBS uses a triplicate (three-replica) storage mechanism to ensure data reliability. By default, data is divided into blocks of equal size, and each block is stored as three replicas that a distributed storage algorithm places on different nodes in the cluster.
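The splitting and placement described above can be sketched as follows. This is only an illustration under stated assumptions: the block size, the node names, and the hash-based placement rule are hypothetical, since the actual block size and distributed storage algorithm used by EBS are not described here.

```python
import hashlib

BLOCK_SIZE = 4   # bytes; deliberately tiny for illustration (the real size is not stated)
REPLICAS = 3     # triplicate storage: three replicas per block


def split_into_blocks(data, block_size=BLOCK_SIZE):
    """Cut data into consecutive blocks of block_size bytes (the last one may be shorter)."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def place_replicas(block_id, nodes, replicas=REPLICAS):
    """Pick `replicas` distinct nodes for a block.

    Hypothetical rule: hash the block ID to choose a starting node, then take
    the next nodes in ring order. This is NOT the real EBS algorithm.
    """
    if len(nodes) < replicas:
        raise ValueError("need at least as many nodes as replicas")
    start = int(hashlib.sha256(str(block_id).encode()).hexdigest(), 16) % len(nodes)
    return [nodes[(start + k) % len(nodes)] for k in range(replicas)]


nodes = ["server-a", "server-b", "server-c", "server-d", "server-e"]
blocks = split_into_blocks(b"hello, block storage")
placement = {i: place_replicas(i, nodes) for i in range(len(blocks))}
```

Whatever the real placement rule is, the property that matters is the one this sketch preserves: the three replicas of a block never share a node.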
The main features of EBS triplicate storage are as follows:
The EBS storage system automatically distributes the three replicas to different physical disks of different servers, so that the failure of any of them does not interrupt your business.
The data of the three replicas is consistent.
For example, data block P2’ is stored on physical disk A of Server A. The system also stores this block as P2’’ on physical disk B of Server B and as P2 on physical disk C of Server C, so the same data block has three replicas: P2, P2’, and P2’’. If disk C fails, P2’ and P2’’ on the other two disks remain available, so your business is not affected. The following figure shows how the replicas of the data block are stored on the servers.
Data consistency means that when an application writes data to the storage system, data in the three replicas must be synchronized, so that the application obtains the same data no matter which replica it reads.
The EBS triplicate storage mechanism ensures data consistency in the following ways:
Simultaneously write data to all three replicas
When an application writes data, the storage system simultaneously writes data to all three replicas, and returns a success response to the application only when the write operation is completed on all replicas.
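As a rough illustration of this write path, the sketch below issues the write to all three replicas concurrently and reports success only if every replica stored the data. The `Replica` class and `replicated_write` function are hypothetical in-memory stand-ins, not an EBS API.

```python
from concurrent.futures import ThreadPoolExecutor


class Replica:
    """Hypothetical in-memory stand-in for one replica on one node."""

    def __init__(self, name, healthy=True):
        self.name = name
        self.healthy = healthy
        self.blocks = {}

    def write(self, block_id, data):
        if not self.healthy:
            raise IOError(f"{self.name} is unavailable")
        self.blocks[block_id] = data


def replicated_write(replicas, block_id, data):
    """Write to all replicas at once; return True only if every write succeeded."""
    with ThreadPoolExecutor(max_workers=len(replicas)) as pool:
        futures = [pool.submit(r.write, block_id, data) for r in replicas]
    # Leaving the `with` block waits for all submitted writes to finish.
    return all(f.exception() is None for f in futures)


replicas = [Replica("a"), Replica("b"), Replica("c")]
ok = replicated_write(replicas, 7, b"payload")
```

The sketch does not undo a partial write when one replica fails. How the real system handles that case is not described here.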
Automatically fix damaged replicas in the case of a data read failure
When an application fails to read data, the storage system diagnoses the type of fault. If a sector of the physical disk cannot be read, the storage system automatically reads the data from a replica stored on another node and rewrites it to the node with the faulty sector. This way, the number of replicas remains unchanged, and the data in all replicas stays identical.
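A minimal sketch of this read-and-repair behavior, assuming each replica is modeled as a plain dictionary in which `None` marks an unreadable sector. The names `read_with_repair` and `SectorReadError` are hypothetical.

```python
class SectorReadError(Exception):
    """Raised when no replica can return the requested block."""


def read_with_repair(replicas, block_id):
    """Read a block, falling back to other replicas and repairing the failed ones.

    `replicas` is a list of dicts mapping block_id -> bytes; a value of None
    models a sector that can no longer be read (an assumption of this sketch).
    """
    failed = []
    for store in replicas:
        data = store.get(block_id)
        if data is None:
            failed.append(store)      # remember the replica that could not be read
            continue
        for bad in failed:
            bad[block_id] = data      # rewrite the good copy over the faulty one
        return data
    raise SectorReadError(f"block {block_id} is unreadable on every replica")


stores = [{1: None}, {1: b"data"}, {1: b"data"}]
result = read_with_repair(stores, 1)
```

After the call, the first store holds the block again, so the replica count stays at three.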
Each physical disk in the storage system stores multiple data blocks, whose replicas are placed on different nodes in the cluster according to a specific policy. When the storage system detects a hardware fault, such as a failed server or physical disk, it automatically restores the affected data. Data is restored on multiple nodes in parallel, and only a small amount of data needs to be restored on each of these nodes. This mechanism avoids the performance bottleneck that occurs when a large amount of data is restored on a single node, and minimizes the impact on the upper-layer business.
The following figure shows the principle of data protection in EBS. When Server F in the cluster encounters a hardware failure, data blocks on its physical disks are reconstructed in parallel on the disks of other servers.
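The idea of spreading reconstruction across the cluster can be sketched as a planning step. In this simplified model, each block lost on the failed server is assigned a surviving replica to copy from and the least-loaded eligible server to copy to. The `plan_rebuild` function, the placement table, and the least-loaded rule are assumptions for illustration, not the actual EBS policy.

```python
from collections import Counter


def plan_rebuild(placement, failed, nodes):
    """Return a list of (block_id, source_node, target_node) copy tasks.

    placement maps block_id -> list of nodes holding a replica.
    Targets are chosen so that no single node receives most of the data.
    """
    load = Counter()
    plan = []
    for block_id, holders in sorted(placement.items()):
        if failed not in holders:
            continue                                  # block is unaffected
        survivors = [n for n in holders if n != failed]
        candidates = [n for n in nodes if n != failed and n not in holders]
        if not survivors or not candidates:
            raise RuntimeError(f"cannot rebuild block {block_id}")
        target = min(candidates, key=lambda n: load[n])  # least-loaded node so far
        load[target] += 1
        plan.append((block_id, survivors[0], target))
    return plan


nodes = ["A", "B", "C", "D", "E", "F"]
placement = {
    0: ["A", "B", "F"],
    1: ["B", "C", "F"],
    2: ["C", "D", "F"],
    3: ["D", "E", "F"],
    4: ["A", "E", "F"],
    5: ["A", "B", "C"],
}
plan = plan_rebuild(placement, "F", nodes)
```

Because the copy tasks in the plan target different servers, they can run concurrently, and each server only has to receive a small share of the lost blocks.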