Research
My research focuses on improving the efficiency and effectiveness of deep learning algorithms, particularly for modern generative models in vision and language. Previously, I worked on deep learning-driven audio processing, including compression and speech synthesis.
|
|
ESC: Efficient Speech Coding with Cross-Scale Residual Vector Quantized Transformers
Yuzhe Gu, Enmao Diao
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
paper / code
We present ESC, a parameter-efficient speech foundation codec with cross-scale vector-quantized transformer architectures, which achieves coding performance comparable to state-of-the-art models while reducing decoding latency by 6.4x.
|
|
How Did We Get Here? Summarizing Conversation Dynamics
Yilun Hua, Nicholas Chernogor, Yuzhe Gu, Seoyeon Julie Jeong, Miranda Luo, Cristian Danescu-Niculescu-Mizil
Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL), 2024
paper / code
We introduce the task of summarizing conversation dynamics: we construct a dataset of human-written summaries of conversation dynamics (SCDs) and explore several automated baselines. We demonstrate that SCDs help both humans and automated systems forecast whether an ongoing conversation will eventually derail into toxic behavior.
|
|
Towards Quantification of Covid-19 Intervention Policies from Machine Learning-based Time Series Forecasting Approaches
Yuzhe Gu, Peng Sun, Azzedine Boukerche
IEEE International Conference on Communications (ICC), 2024
paper
We develop a policy-aware time-series model for epidemic forecasting and perform causal analysis to quantify the effects of government interventions during COVID-19 through counterfactual estimation.
|
|
Visual Autoregressive Language Models
project: an open-source PyTorch implementation
2024/10
code
An implementation of autoregressive generative models for vision with discrete latent variables. Reconstruction and class-conditional synthesis results are reproduced on Stanford Dogs, a dataset covering 120 dog breeds.
|
|
Robustness Evaluation on LLM Bias and Fairness Metrics
project: UPenn CIS700 - Trustworthy ML
2024/05
slides
A re-evaluation of the robustness of text continuation-based bias metrics for quantifying group fairness in LLMs. Empirical results indicate that existing approaches are sensitive to the non-determinism that decoding setups introduce into language model outputs.
|
|