Ultra-Large Scale Storage
Status
We have developed a prototype implementation of Ceph, a distributed file system based on our research. Its metadata server (MDS) uses Dynamic Subtree Partitioning, an architecture that adaptively redistributes metadata across a cluster in response to the current workload. Intelligent OSDs manage data replication, failure detection, and data migration during failure recovery or system expansion. Each OSD stores its data using EBOFS, an object file system that builds on our prior experience with OBFS. Data is distributed with CRUSH, a hash-like distribution function that lets any party calculate, rather than look up, the location of data. CRUSH is designed to cope with device failure and cluster expansion while separating object replicas across failure domains for improved data safety.
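CRUSH itself is considerably more sophisticated, but the core idea of computing data placement instead of looking it up can be sketched with a simple rendezvous-style hash. The function name and parameters below are illustrative assumptions, not Ceph's API:

```python
import hashlib

def place_object(obj_id: str, osds: list[str], replicas: int = 3) -> list[str]:
    """Illustrative placement function (not CRUSH itself): rank OSDs by a
    deterministic hash of (object, OSD) and take the top `replicas` as the
    object's locations. Any client holding the same OSD list computes the
    same answer, so no central lookup table is needed."""
    ranked = sorted(
        osds,
        key=lambda osd: hashlib.sha256(f"{obj_id}:{osd}".encode()).hexdigest(),
        reverse=True,
    )
    return ranked[:replicas]
```

A useful property of this style of placement is stability: removing an OSD that does not appear in an object's replica set leaves that object's placement unchanged, so only data on the failed device needs to move.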
The Ceph source code is available on SourceForge.
Large-scale storage systems that hold sensitive or confidential data require security measures to protect it. We have designed and implemented Horus, a system that offers fine-grained, encryption-based security for large-scale storage. Horus encrypts large datasets using keyed hash trees (KHTs) to generate a different key for each region of a dataset, providing fine-grained security. Performance evaluation shows that our prototype's key distribution is highly scalable and robust.
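The keyed-hash-tree idea can be sketched as follows: each node's key is derived from its parent's key with a keyed hash, so handing out a single subtree key grants access to exactly the regions under that subtree and nothing else. This is a minimal illustration under assumed conventions (HMAC-SHA-256, a `level:index` derivation label), not the Horus implementation:

```python
import hmac
import hashlib

def region_key(root_key: bytes, level: int, index: int, branching: int = 2) -> bytes:
    """Derive the key for region `index` at depth `level` of a keyed hash
    tree by walking down from the root: each child's key is
    HMAC(parent_key, "level:index"). Knowing only a subtree's key lets a
    client derive keys below it, but never its siblings or the root."""
    key = root_key
    for lvl in range(1, level + 1):
        # Index of this region's ancestor at depth `lvl`.
        idx = index // (branching ** (level - lvl))
        key = hmac.new(key, f"{lvl}:{idx}".encode(), hashlib.sha256).digest()
    return key
```

Because derivation only walks downward, a key server can hand a client one intermediate key covering a file region, and the client derives all per-block keys beneath it locally, which is what makes the key distribution scale.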