Geomancy: Automated Performance Enhancement through Data Layout Optimization
Appeared in Proceedings of the Conference on Mass Storage Systems and Technologies (MSST '20).
Abstract
Large distributed storage systems, such as the high-performance computing (HPC) systems used by national or international laboratories, require sufficient performance and scale for demanding scientific workloads and must handle shifting workloads with ease. Ideally, data is placed where it optimizes performance, but the size and complexity of large storage systems inhibit rapid, effective restructuring of data layouts to maintain performance as workloads shift.
To address these issues, we have developed Geomancy, a tool that models the placement of data within a distributed storage system and reacts to drops in performance. Using a combination of machine learning techniques suitable for temporal modeling, Geomancy determines when and where bottlenecks may arise due to changing workloads and suggests layout changes that mitigate or prevent them. Our approach to optimizing throughput offers benefits for storage systems such as avoiding potential bottlenecks and increasing overall I/O throughput by 11% to 30%.
Publication date:
October 2020
Authors:
Oceane Bel
Kenneth Chang
Nathan Tallent
Dirk Duellman
Ethan L. Miller
Faisal Nawab
Darrell D. E. Long
Projects:
Scalable High-Performance QoS
Prediction and Grouping
Storage QoS
Bibtex entry
@inproceedings{bel-msst20,
  author       = {Oceane Bel and Kenneth Chang and Nathan Tallent and Dirk Duellman and Ethan L. Miller and Faisal Nawab and Darrell D. E. Long},
  title        = {Geomancy: Automated Performance Enhancement through Data Layout Optimization},
  booktitle    = {Proceedings of the Conference on Mass Storage Systems and Technologies (MSST '20)},
  month        = oct,
  year         = {2020},
}
    
