Magellan: A Searchable Metadata Architecture for Large-Scale File Systems
Published as Storage Systems Research Center Technical Report UCSC-SSRC-09-07.
Abstract
As file systems continue to grow, metadata search is becoming an increasingly important way to access and manage files. However, existing solutions that build a separate metadata database outside of the file system face consistency and management challenges at large scales. To address these issues, we developed Magellan, a new large-scale file system metadata architecture that enables the file system’s metadata to be efficiently and directly searched. This allows Magellan to avoid the consistency and management challenges of a separate database, while providing performance comparable to that of other large file systems. Magellan enables metadata search by introducing several techniques to metadata server design. First, Magellan uses a new on-disk inode layout that makes metadata retrieval efficient for searches. Second, Magellan indexes inodes in data structures that enable fast, multi-attribute search and allow all metadata lookups, including directory searches, to be handled as queries. Third, a query routing technique helps keep the search space small, even at large scales. Fourth, a new journaling mechanism enables efficient update performance and metadata reliability. An evaluation with real-world metadata from a file system shows that, by combining these techniques, Magellan is capable of searching millions of files in under a second, while providing metadata performance comparable to, and sometimes better than, other large-scale file systems.
Publication date:
November 2009
Authors:
Andrew Leung
Ian Adams
Ethan L. Miller
Projects:
Scalable File System Indexing
HECURA: Scalable Data Management
Ultra-Large Scale Storage
Available media
Full paper text: PDF
BibTeX entry
@techreport{leung-ssrctr09-07,
  author       = {Andrew Leung and Ian Adams and Ethan L. Miller},
  title        = {Magellan: A Searchable Metadata Architecture for Large-Scale File Systems},
  institution  = {University of California, Santa Cruz},
  number       = {UCSC-SSRC-09-07},
  month        = nov,
  year         = {2009},
}
    
