About Me
Welcome! I am a first-year Ph.D. student in Computer Science at New York University's Courant Institute, advised by Prof. Sai Qian Zhang. My research lies at the intersection of computer architecture and machine learning systems, with a focus on hardware-software co-design for efficient AI across algorithms, architectures, and hardware. Before joining NYU, I had the privilege of working with Prof. Yang (Katie) Zhao at the University of Minnesota and Prof. Yingyan (Celine) Lin at Georgia Tech.
Education
- New York University, Courant Institute
  Ph.D. in Computer Science
- University of Minnesota, Twin Cities
  M.S. in Electrical and Computer Engineering
- Sichuan University
  B.Eng. in Telecommunications Engineering
Research Interests
- Efficient AI
- Hardware-Software Co-design
- Computer Architecture
Publications & Manuscripts
RTGS: Real-Time 3D Gaussian Splatting SLAM via Multi-Level Redundancy Reduction
58th IEEE/ACM International Symposium on Microarchitecture (MICRO), 2025
Real-time 3D Gaussian Splatting SLAM system with multi-level redundancy reduction for edge devices.
Pocket-SLAM: Rendering-Area-Aware Pruning for Memory-Efficient 3DGS-SLAM
International Conference on Robotics and Automation (ICRA), 2026
Memory-efficient 3DGS-SLAM with rendering-area-aware pruning for large-scale deployment.
Gaussian Blending Unit: An Edge GPU Plug-in for Real-Time Gaussian-Based Rendering in AR/VR
31st IEEE International Symposium on High-Performance Computer Architecture (HPCA), 2025
Dedicated hardware module for efficient Gaussian-based rendering on edge devices for AR/VR applications.
3D Gaussian Rendering Can Be Sparser: Efficient Rendering via Learned Fragment Pruning
Conference on Neural Information Processing Systems (NeurIPS), 2024
Efficient 3D Gaussian rendering through learned fragment pruning techniques.
LAMB: A Training-Free Method to Enhance the Long-Context Understanding of SSMs via Attention-Guided Token Filtering
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Training-free method for enhancing long-context understanding in State Space Models.
DSD: A Distributed Speculative Decoding Solution for Edge-Cloud Agile Large Model Serving
Conference on Machine Learning and Systems (MLSys), 2026 (Under Review)
Distributed speculative decoding approach for efficient large model serving across edge and cloud infrastructure.
Experience
Research Experience
Research Assistant
Zhao Lab, University of Minnesota | Jun. 2024 – Present
Advisor: Prof. Yang (Katie) Zhao | Minneapolis, MN, USA
- Conducting research on optimizing 3DGS-based SLAM for edge devices
- Developing memory-efficient 3DGS SLAM for large-scale outdoor deployment
Research Intern
EIC Lab, Georgia Institute of Technology | Mar. 2024 – May 2025
Advisor: Prof. Yingyan (Celine) Lin | Atlanta, GA, USA
- Improved 3DGS rendering performance on edge devices
- Enhanced diffusion large language models (dLLMs, including LLaDA)
Research Intern
Sai Lab, New York University | Aug. 2025 – Present
Advisors: Prof. Sai Qian Zhang & Prof. Bradley McDanel | New York, NY, USA
- Designing a distributed speculative decoding system for efficient LLM inference
- Working on efficient generative AI for drug delivery with human-in-the-loop strategies
Industry Experience
LLM Pre-training Intern
REDstar@hi Lab, Xiaohongshu (REDnote) | Sep. 2025 – Present
Shanghai, China
- Collaborating with Infra team on training and inference bottlenecks
- Exploring efficient scaling strategies and GPU-friendly architectures
- Investigating Attention, MoE, and optimizer strategies using Megatron and FSDP
Curriculum Vitae
You can download my full CV in PDF format below:
Download CV (PDF)
Contact
Feel free to reach out if you're interested in collaboration or have any questions about my research.
University of Minnesota, Twin Cities
Minneapolis, MN, USA
Please reach out via email or connect on social media below.