Yiqiao Zhong


Department of Statistics,
University of Wisconsin–Madison
Medical Science Center 1122
1300 University Ave, Madison, WI 53706
E-mail: yiqiao.zhong [@] wisc [DOT] edu

Bio

I am a tenure-track assistant professor in the Department of Statistics at the University of Wisconsin–Madison. I started my appointment in Fall 2022. My research is primarily motivated by advances in data science. I enjoy working on modern statistics and machine learning problems, especially deep learning theory and high-dimensional statistics.


Most recently, my research interests have centered on analyses of deep learning, especially Large Language Models (LLMs). I am interested both in understanding the theoretical underpinnings of LLMs and in practical techniques and concerns such as training and interpretability.


Previously, I was a postdoc at Stanford University, as part of the Collaboration on the Theoretical Foundations of Deep Learning, where I was advised by Prof. Andrea Montanari and Prof. David Donoho. Prior to that, I obtained my Ph.D. in 2019 from Princeton University, where I was advised by Prof. Jianqing Fan. I received my B.S. in mathematics from Peking University in 2014.

Research agenda

Large Language Models. There have been many exciting advances since ChatGPT. LLMs and foundation models offer new opportunities and challenges for society, but we currently know little about the inner workings of their building blocks. Enhancing the interpretability of black-box models such as Transformers is therefore an important and urgent task.

  • In a recent paper, we studied how large language models generalize to distributions they have not seen during training (known as out-of-distribution generalization). Multiple attention layers must work together to complete tasks such as in-context learning; by tracing how subspaces across layers match, we proposed a new structure we call the common subspace representation hypothesis (a toy illustration of subspace matching appears after this list).

  • In another paper, we examined a variety of pretrained Transformer models and explored the hidden geometry inside these black-box models. The geometric structure we found seems to hold many stories yet to be told!
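
To make subspace matching concrete, here is a minimal sketch of the kind of computation involved: extract hidden states from a small pretrained Transformer and compare the principal subspaces of two layers via principal angles. The model ("gpt2"), the layer pair, and the subspace dimension are illustrative assumptions, not the setup of the papers above.

    # Illustrative sketch: principal angles between per-layer token subspaces.
    # Angles near zero suggest the two layers share a common subspace.
    import numpy as np
    from scipy.linalg import subspace_angles
    from transformers import AutoModel, AutoTokenizer

    tok = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModel.from_pretrained("gpt2", output_hidden_states=True)

    text = "In-context learning lets a model infer a task from a few examples."
    out = model(**tok(text, return_tensors="pt"))
    hidden = out.hidden_states                # tuple of (1, seq_len, dim) tensors

    def top_pcs(h, k=5):
        """Orthonormal basis for the top-k principal directions of the tokens."""
        X = h[0].detach().numpy()             # (seq_len, dim)
        X = X - X.mean(axis=0)                # center the token embeddings
        _, _, vt = np.linalg.svd(X, full_matrices=False)
        return vt[:k].T                       # (dim, k)

    # Compare layers 4 and 8 (arbitrary choices for illustration).
    angles = subspace_angles(top_pcs(hidden[4]), top_pcs(hidden[8]))
    print(np.degrees(angles))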


Statistical foundations of deep learning. A fundamental question in modern machine learning (e.g., deep learning) concerns the generalization properties of complex, over-parametrized models. The impressive empirical performance of deep networks has driven active research on this question in the past few years. A useful introduction to deep learning can be found in a course I co-instructed in 2019 (course link).
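
As a toy illustration of the phenomenon (a sketch, not taken from any particular paper): in ridgeless regression on random ReLU features, the test error of the minimum-norm interpolant can behave non-monotonically as the number of features grows past the sample size, a pattern often called double descent. The dimensions and noise level below are arbitrary choices.

    # Toy double-descent sketch: min-norm least squares on random ReLU features.
    import numpy as np

    rng = np.random.default_rng(0)
    n, d, n_test = 100, 20, 2000
    w = rng.standard_normal(d) / np.sqrt(d)            # true linear signal
    X = rng.standard_normal((n, d))
    X_test = rng.standard_normal((n_test, d))
    y = X @ w + 0.1 * rng.standard_normal(n)           # noisy training labels
    y_test = X_test @ w

    for p in [20, 50, 90, 100, 110, 200, 1000]:        # number of random features
        W = rng.standard_normal((d, p)) / np.sqrt(d)   # fixed random first layer
        F = np.maximum(X @ W, 0)                       # ReLU features (train)
        F_test = np.maximum(X_test @ W, 0)             # ReLU features (test)
        beta = np.linalg.pinv(F) @ y                   # minimum-norm solution
        print(p, float(np.mean((F_test @ beta - y_test) ** 2)))

The test error typically spikes near the interpolation threshold p ≈ n and then decreases again as p grows, which is the kind of behavior this line of research seeks to explain.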

A recent paper gives an excellent survey of progress on the statistical foundations of deep learning. I presented a brief introduction to several of its key ideas in a lecture for CS762 in October 2022. You can find my slides here.


Related research topics. Some other topics I am interested in:

  • Self-supervised learning, especially contrastive learning (e.g., a recent paper)

  • Neighborhood embedding and visualization


Older projects. Several older projects:

  • Spectral methods, PCA and factor models

  • Statistical networks, matrix completion and synchronization problems

  • Nonconvex optimization and SDP relaxation

  • Eigenvector perturbation analysis, entrywise/ℓ∞ bounds (see the sketch after this list)
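
To give a flavor of the last item: classical perturbation theory controls eigenvector error in the ℓ2 or angular sense, while entrywise analysis asks for uniform control over all coordinates. A sketch in standard notation (one common form of the Davis–Kahan sin θ theorem, with the entrywise goal stated loosely):

    % For a symmetric matrix A with top eigenvector v, eigengap
    % delta = lambda_1(A) - lambda_2(A), and a perturbation \tilde{A} = A + E
    % with top eigenvector \tilde{v}, Davis--Kahan (in one common form) gives
    \[
      \sin \theta(\tilde{v}, v) \;\le\; \frac{2 \, \|E\|_{\mathrm{op}}}{\delta},
    \]
    % whereas entrywise analysis seeks coordinatewise control, ideally showing
    % that the error is delocalized across all n coordinates:
    \[
      \|\tilde{v} - v\|_{\infty} \;\approx\; \frac{\|\tilde{v} - v\|_{2}}{\sqrt{n}}.
    \]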


My Google Scholar profile.

Interested in working with me?

I am looking for motivated students (in statistics, applied math, CS, etc.) to work on any aspect of statistics, machine learning, or applied problems. I'd be happy to chat if you want to learn about my research, start working on a research project, or are looking for a summer internship. The best way to reach me is by email.