Yuzhe Gu

A Graduate Student interested in Machine Learning and Natural Language Processing


I am a first-year master's student in the Department of Electrical and Systems Engineering at the University of Pennsylvania, with a concentration in Machine Learning and Data Science. I received dual B.Sc. degrees in Data Science from Duke Kunshan University and Duke University, where I worked closely with Peng Sun and Enmao Diao.

My research interests cover a range of topics related to representation learning and generative models:

  • Discrete Representation Learning: vector quantization networks for generative modeling, data compression, and multi-modal learning
  • Efficient LLMs: leveraging quantization techniques for efficient LLM inference and fine-tuning
  • Trustworthy LLMs: re-examining bias, fairness, and robustness in LLMs

I am currently seeking a CS/ECE Ph.D. position related to Machine Learning and Natural Language Processing, starting Fall 2025.

Email: tracygu (at) seas (dot) upenn (dot) edu
Google Scholar: @scholar
Github: @github/yzGuu830
LinkedIn: @in/yuzheguu




Publications

ESC: Efficient Speech Coding with Cross-Scale Residual Vector Quantized Transformers
Yuzhe Gu, Enmao Diao
Under Review
paper / code
We propose the Efficient Speech Codec (ESC), a lightweight, parameter-efficient speech codec based on a cross-scale residual vector quantization scheme and transformers. Our model employs mirrored hierarchical window transformer blocks and performs step-wise decoding from coarse-to-fine feature representations. ESC achieves high-fidelity speech reconstruction with significantly lower complexity than state-of-the-art convolutional codecs.



How Did We Get Here? Summarizing Conversation Dynamics
Yilun Hua, Nicholas Chernogor, Yuzhe Gu, Seoyeon Julie Jeong, Miranda Luo, Cristian Danescu-Niculescu-Mizil
Proceedings of NAACL, 2024
paper / code
We introduce the task of summarizing the dynamics of conversations by constructing a dataset of human-written summaries and exploring several automated baselines. We evaluate whether such summaries can capture the trajectory of conversations via an established downstream task: forecasting whether an ongoing conversation will eventually derail into toxic behavior. We show that these summaries help both humans and automated systems with this forecasting task.



Towards Quantification of COVID-19 Intervention Policies from Machine Learning-based Time Series Forecasting Approaches
Yuzhe Gu, Peng Sun, Azzedine Boukerche
Proceedings of IEEE International Conference on Communications (ICC), 2024
paper / code
We design a policy-aware time series forecasting model to estimate COVID-19 trends by incorporating temporal information from 16 policy indicators. Through counterfactual analysis, we quantify the causal effect of the indicators and propose two static metrics: lag period and average effect. Our model verifies the effectiveness of all 16 policy indicators in controlling virus transmission in the US.