Apr 9 Colloquium: "Automating Distributed Tiered Storage Management in Cluster Computing" -- Herodotos Herodotou

Series: Computer Science Colloquium 

Title: Automating Distributed Tiered Storage Management in Cluster Computing

Date: April 9, 2021 at 2:00pm

Presenter: Herodotos Herodotou
Assistant Professor, Department of Electrical Engineering, Computer Engineering and Informatics at the Cyprus University of Technology 

Place: Virtual meeting *  
* The above link works for those affiliated with the University of Pittsburgh. If you are not affiliated with the University of Pittsburgh, please email Heidi Davis at hld46@pitt.edu to obtain the meeting information.

Abstract:
Data-intensive platforms such as Hadoop and Spark are routinely used to process massive amounts of data residing on distributed file systems like HDFS. Increasing memory sizes and new hardware technologies (e.g., NVRAM, SSDs) have recently led to the introduction of storage tiering in such settings. However, users are now burdened with the additional complexity of managing the multiple storage tiers and the data residing on them, while trying to optimize their workloads. In this talk, I will present OctopusFS, a novel distributed file system that is aware of heterogeneous storage media (e.g., memory, SSDs, HDDs, NAS) with different capacities and performance characteristics. The system offers a variety of pluggable policies for automating data management across the storage tiers and cluster nodes. Smart placement and retrieval policies employ multi-objective optimization techniques for making intelligent data management decisions based on the requirements of fault tolerance, data and load balancing, and throughput maximization. In addition, redistribution policies employ machine learning for tracking and predicting file access patterns, which are used to decide when and which data to move up or down the storage tiers for increasing system performance. The approach uses incremental learning to dynamically refine the models with new file accesses, allowing them to naturally adjust and adapt to workload changes over time. Our extensive evaluation using realistic workloads derived from Facebook and CMU traces compares our approach with several other policies and showcases significant benefits in terms of both workload performance and cluster efficiency.

Biography:
Herodotos Herodotou is an Assistant Professor in the Department of Electrical Engineering, Computer Engineering and Informatics at the Cyprus University of Technology, where he is leading the Data Intensive Computing Research Lab. He received his Ph.D. in Computer Science from Duke University. His Ph.D. dissertation work on building a self-tuning system for big data analytics received the ACM SIGMOD Jim Gray Doctoral Dissertation Award Honorable Mention as well as the Outstanding Ph.D. Dissertation Award in Computer Science at Duke. Before joining CUT, he held research positions at Microsoft Research, Yahoo! Labs, and Aster Data. His research interests are in large-scale Data Processing Systems, Database Systems, and Cloud Computing. In particular, his work focuses on ease-of-use, manageability, and automated tuning of both centralized and distributed data-intensive computing systems. In addition, he is interested in applying database techniques in other areas like maritime informatics, scientific computing, bioinformatics, and social computing. His research work to date has been published in several top scientific conferences and journals, two books, and two book chapters, while he is actively participating in multiple European and nationally funded projects.

Host: Constantinos Costa