Redwan Ibne Seraj Khan

Machine Learning and Systems Researcher, PhD Candidate


redwan@vt.edu

PhD Candidate, CS@VT

ML Systems Researcher

I am a PhD Candidate at CS@VT. My advisor is Dr. Ali R. Butt. I am affiliated with the Distributed Systems and Storage Lab (DSSL).

My research spans two broad categories: Sys4ML, i.e., designing more scalable systems (computing and storage) to improve the performance and efficiency of (distributed) ML applications, and ML4Sys, i.e., leveraging ML or data-driven approaches to improve systems software and resource management.

Currently, I am working on the following projects, which target several ML domains, namely Distributed Deep Learning (DDL), Large Language Models (LLMs), and Federated Learning (FL):

(1) Building novel data sampling and caching policies and hybrid storage systems to improve the training performance of large-scale DDL workloads. (2) Developing intelligent procedures for efficient scheduling and resource utilization in DDL/LLM training and inference jobs. (3) Constructing novel mechanisms for client scheduling and sampling in privacy-aware, high-performing FL workloads.

Before joining CS@VT, I graduated with the highest distinction (Summa Cum Laude) in Computer Engineering from the University at Buffalo (SUNY Buffalo) in 2019.

news

Mar 2024 Gave a guest lecture on Deep Learning Caching Systems for the Big Data Systems course at the University of Virginia. Thanks to Dr. Yue Cheng for inviting me!
Nov 2023 Attended the Supercomputing Conference (SC’23). Thanks to VT and Dr. Ali Butt for providing the travel grants ($1000).
Apr 2023 Presented a technical talk at IBM Research. Title: Insights into Managing Machine Learning Applications Using Optimal System Resources
Apr 2023 Presented a technical talk at ByteDance. Title: Navigating the Tricky Path to Optimal Performance - Coordinating System Resources with ML Application Needs
Feb 2023 Presented our work, SHADE, at USENIX FAST 2023.

selected publications

2023

  1. USENIX FAST’23
    SHADE: Enable Fundamental Cacheability for Distributed Deep Learning Training
    Redwan Ibne Seraj Khan, Ahmad Hossein Yazdani, Yuqi Fu, and 5 more authors
    In 21st USENIX Conference on File and Storage Technologies (FAST 23), Feb 2023

2020

  1. IEEE CLOUD’20
    On the use of containers in high performance computing environments
    Subil Abraham, Arnab K Paul, Redwan Ibne Seraj Khan, and 1 more author
    In 2020 IEEE 13th International Conference on Cloud Computing (CLOUD), Feb 2020