Diving into Red Hat Ceph Storage

Ekta Mishra
Apr 19, 2020 · 4 min read

A platform for petabyte-scale storage.

Red Hat Ceph Storage is open-source software that implements object storage on a single distributed computer cluster and provides interfaces for object-, block-, and file-level storage in a single platform. Ceph's software-defined architecture can be integrated into your existing hardware and infrastructure with lower capital costs and more flexibility. It is a storage platform highly recommended for cloud and virtualization workloads such as Red Hat Enterprise Linux and Red Hat OpenStack Platform.

Ceph Storage Cluster

Architecture of Ceph Storage Cluster

Ceph is designed to be both self-healing and self-managing. A Ceph Storage Cluster accommodates a large number of storage nodes that communicate with each other to replicate and redistribute data dynamically. This gives a distributed architecture without a single point of failure, because the cluster maintains multiple replicas of the data it stores.

To use Ceph Storage services, you first need to set up a Ceph Storage Cluster. A Ceph Storage Cluster is built around two types of daemons: the Ceph Object Storage Daemons (OSDs), which store the data as objects on the storage nodes, and the Ceph Monitors, which keep track of the cluster by maintaining a copy of the cluster map. In practice, every Ceph Storage Cluster requires at least one of each of the following (a short client sketch follows the list):

  • Ceph Monitor: Monitors maintain cluster membership and report on the health of the storage nodes. They keep track of the cluster state but do not serve data to clients. A minimum of three monitors is required for high availability, so the cluster stays dependable enough to respond to client requests. Authenticating clients and daemons is also the monitors' job.
  • Ceph Manager: The manager keeps track of the runtime state of the cluster, monitoring system load, utilization of storage resources, and cluster performance metrics. The details it collects can be viewed in the Ceph Dashboard.
  • Ceph OSDs: The OSDs are what actually provide access to the data; typically one OSD is deployed per disk. Each OSD peers with other OSDs for replication and recovery tasks, working peer to peer so that queries can still be served without a performance hit even if some storage nodes go down. OSDs also report state back to the Ceph Monitors and Managers by communicating with and checking on their peer OSDs.
  • Ceph Metadata Server: This comes into the picture with the Ceph File System (CephFS). When a CephFS client requests data, the request first goes to a metadata server, which performs the POSIX checks and tells the client where in the directory tree the data lives, but the data itself is still served by the OSDs. In short, the metadata server maintains the directory hierarchy and stores the filesystem metadata.
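
To make these roles a little more concrete, here is a minimal client sketch using the librados Python bindings (the rados module). The default /etc/ceph/ceph.conf, the client keyring, and the pool name "mypool" are assumptions for the example, not something the cluster creates for you. Connecting goes through the monitors, which authenticate the client and hand it the cluster map; the object write and read then go straight to the OSDs.

```python
# Minimal client sketch with the librados Python bindings (the "rados" module).
# Assumes a reachable cluster, a default /etc/ceph/ceph.conf plus keyring,
# and an existing pool named "mypool" (placeholder name).
import rados

cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
cluster.connect()   # talks to the monitors: authentication + cluster map

try:
    # High-level usage numbers gathered by the monitors/managers.
    stats = cluster.get_cluster_stats()
    print("kB used: {kb_used} of {kb}".format(**stats))

    # Object reads and writes go directly to the OSDs responsible for the object.
    ioctx = cluster.open_ioctx('mypool')
    try:
        ioctx.write_full('hello-object', b'hello ceph')
        print(ioctx.read('hello-object'))
    finally:
        ioctx.close()
finally:
    cluster.shutdown()
```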

CRUSH Algorithm

Ceph uses the CRUSH algorithm to decide which OSDs a particular piece of data should be stored on. It is a pseudo-random algorithm that ensures a statistically even distribution of data across the OSDs, with decentralized placement of the replicated data. It is the CRUSH algorithm that enables the scalability and the self-healing/recovery properties of Ceph Storage.

How it works

When an object is to be stored, CRUSH first hashes the object name to decide which placement group (PG) it belongs to, and then computes the list of devices on which the replicas of that PG should be stored. This is not the traditional approach of pre-assigning directories and devices based on some up-front calculation: CRUSH provides a stable mapping, which means that when data is added or removed there is minimal data migration. The calculation is very fast, and most of it is done on the client side.
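
As a rough illustration of that two-step mapping, here is a toy Python sketch. It is deliberately not Ceph's real implementation (Ceph uses an rjenkins hash and a hierarchical CRUSH map, not a plain hash over a flat list of OSD ids), but it shows the idea: hash the object name into a placement group, then deterministically rank OSDs for that placement group, so every client computes the same placement without any central lookup table.

```python
# Toy illustration of the two-step placement idea; NOT Ceph's real code.
# Ceph uses an rjenkins hash and a hierarchical CRUSH map; this sketch uses
# a plain hash over a flat list of OSD ids purely to show the mapping steps.
import hashlib

PG_NUM = 128              # placement groups in our pretend pool
OSDS = list(range(12))    # pretend cluster with 12 OSDs
REPLICAS = 3              # replica count

def object_to_pg(obj_name):
    """Step 1: hash the object name into a placement group."""
    digest = int(hashlib.sha1(obj_name.encode()).hexdigest(), 16)
    return digest % PG_NUM

def pg_to_osds(pg):
    """Step 2: deterministically rank OSDs for this PG and keep the first
    REPLICAS of them. Every client computes the same answer, so no central
    lookup table is needed."""
    ranked = sorted(OSDS, key=lambda osd: hashlib.sha1(
        "{}-{}".format(pg, osd).encode()).hexdigest())
    return ranked[:REPLICAS]

pg = object_to_pg("my-object")
print("my-object -> pg {} -> osds {}".format(pg, pg_to_osds(pg)))
```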

Data Storage using CRUSH

Whenever a storage node is damaged, the OSDs, working peer to peer, re-replicate the data that was stored on that node and update the cluster map. The cluster map comprises the monitor map, OSD map, PG map, CRUSH map, and MDS map. So whenever the OSD layout changes, or data is re-replicated, the cluster map is updated so that the CRUSH algorithm returns up-to-date results the next time it is consulted.
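
If you want to peek at these maps yourself, the librados Python bindings expose mon_command, which sends the same requests the ceph CLI does (ceph mon dump, ceph osd dump, ceph pg stat). A short sketch, again assuming a default /etc/ceph/ceph.conf and client keyring:

```python
# Peeking at pieces of the cluster map through the monitors, using the
# librados Python bindings' mon_command (mirrors "ceph mon dump",
# "ceph osd dump", "ceph pg stat" on the CLI).
import json
import rados

cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
cluster.connect()
try:
    for prefix in ("mon dump", "osd dump", "pg stat"):
        cmd = json.dumps({"prefix": prefix, "format": "json"})
        ret, outbuf, errs = cluster.mon_command(cmd, b'')
        if ret == 0:
            # Each reply is one piece of the overall cluster map
            # (monitor map, OSD map, PG summary, ...).
            print(prefix, "->", json.loads(outbuf) if outbuf else None)
        else:
            print(prefix, "failed:", errs)
finally:
    cluster.shutdown()
```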

Contribute to Ceph

If you like the philosophy and ideas behind Ceph and want to join or contribute to the community, have a look at these links:

GitHub page: https://github.com/ceph

References and other useful links:

https://docs.ceph.com/docs/master/

https://www.youtube.com/watch?v=7I9uxoEhUdY

https://www.supportsages.com/ceph-part-3-technical-architecture-and-components/
