Research Interests:
-
My research interests span statistical learning theory and methods in computational neuroscience, biostatistics, and financial econometrics, as well as the analysis of imaging, spatial, and temporal data. My work also explores dimension reduction, high-dimensional inference, multiple hypothesis testing, large-scale simultaneous inference, non-parametric and semi-parametric modeling and inference, functional and longitudinal data analysis, robust statistics, and traffic forecasting in transportation.
-
Specifically, I am dedicated to developing new statistical and computational methodologies for analyzing large-scale, complex-structured medical imaging data (see the website for more details). This involves the examination of various neuroimaging datasets, including
- spatial-temporal Functional Magnetic Resonance Imaging (fMRI),
- Diffusion Tensor Imaging (DTI) data,
- multiple neuron spike trains (sequences of action potentials generated by neurons),
- multi-channel EEG (electroencephalogram) recordings, and
- data from the Alzheimer's Disease Neuroimaging Initiative (ADNI).
As a statistician, I actively engage in interdisciplinary research, collaborating with faculty members, research scientists, and students across diverse fields such as neuroscience, psychology, computer science, engineering, physics, and radiology.
Representative Journal Publications:
-
Fan, N.(s), Zhang, C.M., and Zhang, Z.J. (2024).
"Dynamic modeling via autoregressive conditional GB2 for cross-sectional maxima of financial time series data,"
Journal of Business & Economic Statistics, accepted.
(This paper focuses on financial time series data.)
-
Zhang, C.M., Gao, M.(s), and Jia, S.J.(s) (2024).
"DAG-informed structure learning from multi-dimensional point processes,"
Journal of Machine Learning Research, 25(352):1-56.
(This paper focuses on neuron spike train data.)
-
Gao, M.(s), Zhang, C.M., and Zhou, J. (2024).
"Learning network-structured dependence from non-stationary multivariate point process data,"
IEEE Transactions on Information Theory, 70(8):5935-5968.
(This paper focuses on neuron spike train data.)
-
Zhang, C.M., Zhu, L.X., and Shen, Y.B.(s) (2023).
"Robust estimation in regression and classification methods for large dimensional data,"
Machine Learning, 112(9):3361-3411.
[MATLAB codes available on GitHub]
(This paper focuses on the classification of Lymphoma and Colon cancer data.)
-
Guo, R.S.(s), Zhang, C.M., and Zhang, Z.J. (2020).
"Maximum Independent Component Analysis with application to EEG data,"
Statistical Science, 35(1):145-157.
[PDF]
(Special Issue on Statistics and Science, with the Guest Editor David Siegmund).
(This paper focuses on brain EEG data.)
-
Liu, J.(s), Zhang, C.M., and Page, D. (2016).
"Multiple testing under dependence via graphical models,"
Annals of Applied Statistics, 10(3):1699-1724.
[PDF]
(This paper focuses on GWAS on breast cancer.)
-
Zhang, C.M., Chai, Y.(s), Guo, X.(s), Gao, M.(s), Devilbiss, D.M., and Zhang, Z. (2016).
"Statistical learning of neuronal functional connectivity,"
Technometrics, 58(3):350-359.
[PDF]
(Special Issue on Big Data)
[MATLAB codes available on GitHub]
(This paper focuses on neuron spike train data.)
-
Du, L.(s) and Zhang, C.M. (2014).
"Single-index modulated multiple testing,"
Annals of Statistics, 42(4):1262-1311.
[PDF]
(This paper focuses on prostate cancer data.)
-
Yu, T.(s), Zhang, C.M., Alexander, A.L., and Davidson, R.J. (2013).
"Local tests for identifying anisotropic diffusion areas in human brain with DTI,"
Annals of Applied Statistics, 7(1):201-225.
[PDF]
(This paper focuses on brain Diffusion Tensor Imaging data.)
-
Zhang, C.M., Fan, J.(a), and Yu, T.(s) (2011).
"Multiple testing via FDRL for large-scale imaging data,"
Annals of Statistics, 39(1):613-642.
[PDF]
(This paper focuses on brain fMRI data.)
-
Zhang, C.M., Jiang, Y.(s), and Chai, Y.(s) (2010).
"Penalized Bregman divergence for large-dimensional regression and classification,"
Biometrika, 97(3):551-566.
[PDF]
(This paper focuses on cardiac arrhythmia data.)
-
Zhang, C.M. and Yu, T.(s) (2008).
"Semiparametric detection of significant activation for brain fMRI,"
Annals of Statistics, 36(4):1693-1725.
[PDF]
(This paper focuses on brain fMRI data.)
-
Hall, P., Minnotte, M.C., and Zhang, C.M. (2004).
"Bump hunting with non-Gaussian kernels,"
Annals of Statistics, 32(5):2124-2141.
[PDF]
-
Zhang, C.M. (2003).
"Calibrating the degrees of freedom for automatic data smoothing and effective curve checking,"
Journal of the American Statistical Association, 98(463):609-628.
[PDF]
-
Fan, J.(a) and Zhang, C.M. (2003).
"A reexamination of diffusion estimators with applications to financial model validation,"
Journal of the American Statistical Association, 98(461):118-134.
[PDF]
(This paper focuses on financial time series data.)
-
Fan, J.(a), Zhang, C.M., and Zhang, Jian (2001).
"Generalized likelihood ratio statistics and Wilks phenomenon,"
Annals of Statistics, 29(1):153-193.
[PDF]
(correction, Annals of Statistics, 2002, 30(6):1811-1811.
[PDF])
Research Publications:
-
Fan, N.(s), Zhang, C.M., and Zhang, Z.J. (2024).
"Dynamic modeling via autoregressive conditional GB2 for cross-sectional maxima of financial time series data,"
Journal of Business & Economic Statistics, accepted.
-
Zhang, C.M., Gao, M.(s), and Jia, S.J.(s) (2024).
"DAG-informed structure learning from multi-dimensional point processes,"
Journal of Machine Learning Research, 25(352):1-56.
-
Gao, M.(s), Zhang, C.M., and Zhou, J. (2024).
"Learning network-structured dependence from non-stationary multivariate point process data,"
IEEE Transactions on Information Theory, 70(8):5935-5968.
-
Zhong, R.(s), Zhang, C.M., and Zhang, J.X. (2024).
"Locally sparse estimator of generalized varying coefficient model for asynchronous longitudinal data,"
Statistica Sinica, 34(4):1903-1921.
[PDF]
[R package available on CRAN]
-
Liu, S.S., Zhang, C.M., Zhang, H., Zhong, R.(s), and Zhang, J.X. (2023).
"Model averaging estimation for partially linear functional score models,"
Statistica Sinica, accepted.
-
Zhang, C.M., Zhu, L.X., and Shen, Y.B.(s) (2023).
"Robust estimation in regression and classification methods for large dimensional data,"
Machine Learning, 112(9):3361-3411.
[PDF]
[MATLAB codes available on GitHub]
-
Zhang, C.M., Guo, X.(s), Chen, M., and Du, X.Z.(s) (2023).
"Semi-parametric inference for large-scale data with temporally dependent noise,"
Electronic Journal of Statistics, 17(2):2962-3007.
[PDF]
-
Wu, Z.X.(s) and Zhang, C.M. (2023).
"Assessment of projection pursuit index for classifying high dimension low sample size data in R,"
Journal of Data Science, 21(2):310-332.
[PDF]
[R package available on GitHub]
-
Shen, Y.B.(s), Park, Y.H., Chakraborty, S., and Zhang, C.M. (2023).
"Bayesian simultaneous partial envelope model with application to an imaging genetics analysis,"
The New England Journal of Statistics in Data Science, 1(2):237-269.
[PDF]
[R package available on GitHub]
-
Zhang, C.M., Ye, J.M., and Wang, X.M. (2023).
"A computational perspective on projection pursuit in high dimensions: feasible or infeasible feature extraction,"
International Statistical Review, 91(1):140-161.
[PDF]
[MATLAB codes available on GitHub]
-
Jia, S.J.(s), Zhang, C.M., and Lu, H.R.(s) (2022).
"Covariance function versus covariance matrix estimation in efficient semi-parametric regression for longitudinal data analysis,"
Journal of Multivariate Analysis, 187, 104900.
[PDF]
-
Han, Y. and Zhang, C.M. (2022).
"Empirical likelihood inference in autoregressive models with time-varying variances,"
Statistical Theory and Related Fields, 6(2):129-138.
[PDF]
-
Zhang, Y.Q.(s), Zhang, C.M., and Tang, N.S. (2022).
"Estimation and variable selection on sparse model with group structure,"
Acta Mathematicae Applicatae Sinica, 45(1):31-46.
[PDF]
-
Zhang, C.M. (2021).
"Further examples related to correlations between variables and ranks,"
The American Statistician, 75(2):226-229.
[PDF]
-
Zhang, C.M., Jia, S.J.(s), and Wu, Y.F.(s) (2021).
"On simultaneous calibration of two-sample t-tests for high-dimension low-sample-size data,"
Statistica Sinica, 31(3):1189-1214.
[PDF]
-
Guo, R.S.(s), Zhang, C.M., and Zhang, Z.J. (2020).
"Maximum Independent Component Analysis with application to EEG data,"
Statistical Science, 35(1):145-157.
[PDF]
(Special Issue on Statistics and Science, with the Guest Editor David Siegmund).
-
Jia, S.J.(s), Zhang, C.M., and Wu, H.L. (2019).
"Efficient semiparametric regression for longitudinal data with regularised estimation of error covariance function,"
Journal of Nonparametric Statistics, 31(4):867-886.
[PDF]
-
Zhang, B.X., Cheng, G.H.(s), Zhang, C.M., and Zheng, S.R. (2019).
"Variable selection procedure from multiple testing,"
Science China Mathematics, 62(4):771-782.
[PDF]
-
Guo, X.(s) and Zhang, C.M. (2018).
"Robustness property of Robust-BD Wald-type test for varying-dimensional general linear models,"
Entropy, 20(3):168, 1-28.
[PDF]
-
Zhang, C.M. and Zhang, Z.J. (2017).
"Robust-BD estimation and inference for general partially linear models,"
Entropy, 19(11):625, 1-30.
[PDF]
-
Guo, X.(s) and Zhang, C.M. (2017).
"The effect of L_1 penalization on condition number constrained estimation of precision matrix,"
Statistica Sinica, 27(3):1299-1317.
[PDF]
-
Du, L.(s) and Zhang, C.M. (2017).
"Estimation of false discovery proportion in multiple testing: from normal to chi-squared test statistics,"
Electronic Journal of Statistics, 11(1):1048-1091.
[PDF]
-
Zhang, Z.J., Zhang, C.M., and Cui, Q.R.(s) (2017).
"Random threshold driven tail dependence measures with application to precipitation data analysis,"
Statistica Sinica, 27(2):685-709.
[PDF]
-
Liu, J.(s), Zhang, C.M., and Page, D. (2016).
"Multiple testing under dependence via graphical models,"
Annals of Applied Statistics, 10(3):1699-1724.
[PDF]
-
Zhang, C.M., Guo, X.(s), and Chai, Y.(s) (2016).
"Screening-based Bregman divergence estimation with NP-dimensionality,"
Electronic Journal of Statistics, 10(2):2039-2065.
[PDF]
-
Zhang, C.M., Chai, Y.(s), Guo, X.(s), Gao, M.(s), Devilbiss, D.M., and Zhang, Z. (2016).
"Statistical learning of neuronal functional connectivity,"
Technometrics, 58(3):350-359.
[PDF]
(Special Issue on Big Data)
[MATLAB codes available on GitHub]
-
Zhang, C.M., Han, Y., and Jia, S.(s) (2016).
"Accounting for time series errors in partially linear model with single- or multiple-runs,"
Journal of Computational and Graphical Statistics, 25(1):123-143.
[PDF]
[MATLAB codes available on GitHub]
-
Feng, L.(s), Wang, Z.J., Zhang, C.M., and Zou, C.L. (2016).
"Nonparametric testing in regression models with Wilcoxon-type generalized likelihood ratio,"
Statistica Sinica, 26(1):137-155.
[PDF]
-
Guo, X.(s) and Zhang, C.M. (2015).
"Estimation of the error auto-correlation matrix in semiparametric model for fMRI data,"
Statistica Sinica, 25(2):475-498.
[PDF]
-
Du, L.(s) and Zhang, C.M. (2014).
"Single-index modulated multiple testing,"
Annals of Statistics, 42(4):1262-1311.
[PDF]
-
Liu, J.(s), Zhang, C.M., Burnside, E., and Page, D. (2014).
"Multiple testing under dependence via semiparametric graphical models,"
Proceedings of the 31st International Conference on Machine Learning (ICML 2014), Beijing, China.
PMLR (Proceedings of Machine Learning Research) 32(2):955-963, 2014.
[PDF]
-
Liu, J.(s), Zhang, C.M., Burnside, E., and Page, D. (2014).
"Learning heterogeneous hidden Markov random fields,"
Proceedings of the Seventeenth International Conference on Artificial Intelligence and Statistics (AISTATS 2014), Reykjavik, Iceland.
PMLR (Proceedings of Machine Learning Research) 33:576-584, 2014.
[PDF]
-
Zhang, C.M., Guo, X.(s), Cheng, C.(s), and Zhang, Z.J. (2014).
"Robust-BD estimation and inference for varying-dimensional general linear models,"
Statistica Sinica, 24(2):653-673.
[PDF]
-
Zhang, C.M. (2014).
"Assessing mean and median filters in multiple testing for large-scale imaging data,"
TEST, 23(1):51-71.
[PDF]
-
Kim, D.(s) and Zhang, C.M. (2014).
"Adaptive linear step-up multiple testing procedure with the bias-reduced estimator,"
Statistics and Probability Letters, 87:31-39.
[PDF]
-
Zheng, S.M., Knisley, J., and Zhang, C.M. (2013).
"Moments of matrix variate skew elliptically contoured distributions,"
Advances and Applications in Statistics, 36(1):13-27.
[PDF]
-
Zheng, S.M., Zhang, C.M., and Knisley, J. (2013).
"Stochastic representations of the matrix variate skew elliptically contoured distributions,"
Advances and Applications in Statistics, 33(2):83-98.
[PDF]
-
Jiang, Y.(s) and Zhang, C.M. (2013).
"High-dimensional regression and classification under a class of convex loss functions,"
Statistics and Its Interface, 6(2):285-299.
[PDF]
-
Yu, T.(s), Zhang, C.M., Alexander, A.L., and Davidson, R.J. (2013).
"Local tests for identifying anisotropic diffusion areas in human brain with DTI,"
Annals of Applied Statistics, 7(1):201-225.
[PDF]
-
Liu, J.(s), Zhang, C.M., McCarty, C., Peissig, P., Burnside, E., and Page, D. (2012).
"Graphical-model based multiple testing under dependence with applications to genome-wide association studies,"
Proceedings of the 28th Conference on Uncertainty in Artificial Intelligence (UAI 2012), Catalina Island, United States. 511-522.
[PDF]
-
Liu, J.(s), Zhang, C.M., McCarty, C., Peissig, P., Burnside, E., and Page, D. (2012).
"High-dimensional structured feature screening using binary Markov random fields,"
Proceedings of the 15th International Conference on Artificial Intelligence and Statistics (AISTATS 2012), La Palma, Canary Islands.
PMLR (Proceedings of Machine Learning Research) 22:712-721, 2012.
[PDF]
-
Zhang, C.M., Zhang, Z., and Chai, Y.(s) (2011).
"Penalized Bregman divergence estimation via coordinate descent,"
Journal of the Iranian Statistical Society, 10(2):125-140.
[PDF]
-
Zhang, C.M., Fan, J.(a), and Yu, T.(s) (2011).
"Multiple testing via FDRL for large-scale imaging data,"
Annals of Statistics, 39(1):613-642.
[PDF]
-
Zhang, C.M., Jiang, Y.(s), and Chai, Y.(s) (2010).
"Penalized Bregman divergence for large-dimensional regression and classification,"
Biometrika, 97(3):551-566.
[PDF]
-
Zhang, C.M. (2010).
"Statistical inference of minimum BD estimators and classifiers for varying-dimensional models,"
Journal of Multivariate Analysis, 101(7):1574-1593.
[PDF]
-
Zhang, C.M. and Zhang, Z.J. (2010).
"Regularized estimation of hemodynamic response function for fMRI data,"
Statistics and Its Interface, 3(1):15-31.
[PDF]
-
Li, J.(s), Zhang, C.M., Doksum, K.A., and Nordheim, E.V. (2010).
"Simultaneous confidence intervals for semiparametric logistic regression and confidence regions for the multi-dimensional effective dose,"
Statistica Sinica, 20(2):637-659.
[PDF]
-
Zhang, C.M., Jiang, Y.(s), and Shang, Z.(s) (2009).
"New aspects of Bregman divergence in regression and classification with parametric and nonparametric estimation,"
Canadian Journal of Statistics, 37(1):119-139.
[PDF]
-
Zhang, C.M., Li, J.(s), and Meng, J.(s) (2008).
"On Stein's lemma, dependent covariates and functional monotonicity in multi-dimensional modeling,"
Journal of Multivariate Analysis, 99(10):2285-2303.
[PDF]
-
Zhang, C.M. (2008).
"Prediction error estimation under Bregman divergence for non-parametric regression and classification,"
Scandinavian Journal of Statistics, 35(3):496-523.
[PDF]
-
Zhang, C.M. and Yu, T.(s) (2008).
"Semiparametric detection of significant activation for brain fMRI,"
Annals of Statistics, 36(4):1693-1725.
[PDF]
-
Zhang, C.M., Lu, Y.(s), Johnstone, T., Oaks, T., and Davidson, R.J. (2008).
"Efficient modeling and inference for event-related functional MRI data,"
Computational Statistics and Data Analysis, 52(10):4859-4871.
[PDF]
-
Li, J.(s), Zhang, C.M., Nordheim, E.V., and Lehner, C.E. (2008).
"On the multivariate predictive distribution of multi-dimensional effective dose: a Bayesian approach,"
Journal of Statistical Computation and Simulation, 78(5):429-442.
[PDF]
-
Li, J.(s), Nordheim, E.V., Zhang, C.M., and Lehner, C.E. (2008).
"Estimation and confidence regions for multi-dimensional effective dose,"
Biometrical Journal, 50(1):110-122.
[PDF]
-
Zhang, C.M., Jiang, Y.(s), and Yu, T.(s) (2007).
"A comparative study of one-level and two-level semiparametric estimation of hemodynamic response function for fMRI data,"
Statistics in Medicine, 26(21):3845-3861.
[PDF]
(Special Issue on statistical analysis of neuronal data.)
-
Zhang, C.M., Fu, H.(s), Jiang, Y.(s), and Yu, T.(s) (2007).
"High-dimensional pseudo logistic regression and classification with applications to gene expression data,"
Computational Statistics and Data Analysis, 52(1):452-470.
[PDF]
-
Zhang, C.M. (2005).
"Book Review of Ranked Set Sampling: Theory and Applications by Zehua Chen, Zhidong Bai, and Bimal K. Sinha, Springer-Verlag, 2004,"
Technometrics, 47(1):100-101.
[PDF]
-
Sun, H.(s), Zhang, C.M., and Ran, B. (2004).
"Interval prediction for traffic time series using local linear predictor,"
Proceedings. The 7th International IEEE Conference on Intelligent Transportation Systems (ITSC 2004), 410-415, October 3-6, 2004, Washington D.C.
[PDF]
-
Sun, H.(s), Zhang, C.M., Ran, B., and Choi, K. (2004).
"Prediction intervals for traffic time series,"
Proceedings of the 83rd Annual Meeting of the Transportation Research Board, Preprint CD-ROM (#04-4602), January 11-15, 2004, Washington D.C.
[PDF]
-
Zhang, C.M. (2004).
"Comment on "The estimation of prediction error: covariance penalties and cross-validation" by Bradley Efron,"
Journal of the American Statistical Association, 99(467):637-640.
[PDF]
-
Hall, P., Minnotte, M.C., and Zhang, C.M. (2004).
"Bump hunting with non-Gaussian kernels,"
Annals of Statistics, 32(5):2124-2141.
[PDF]
-
Zhang, C.M. and Dette, H. (2004).
"A power comparison between nonparametric regression tests,"
Statistics and Probability Letters, 66(3):289-301.
[PDF]
-
Zhang, C.M. and Lu, Y.(s) (2004).
"A note on beta function, Parseval identity, and a family of integrals in non-parametric regression,"
International Journal of Mathematical Education in Science and Technology, 35(2):303-309.
[PDF]
-
Zhang, C.M. (2004).
"Assessing the equivalence of nonparametric regression tests based on spline and local polynomial smoothers,"
Journal of Statistical Planning and Inference, 126(1):73-95.
[PDF]
-
Zhang, C.M. (2003).
"Calibrating the degrees of freedom for automatic data smoothing and effective curve checking,"
Journal of the American Statistical Association, 98(463):609-628.
[PDF]
-
Zhang, C.M. (2003).
"Adaptive tests of regression functions via multi-scale generalized likelihood ratios,"
Canadian Journal of Statistics, 31(2):151-171.
[PDF]
-
Zhang, C.M. and Cheng, B.(s) (2003).
"Binning methodology for nonparametric goodness-of-fit test,"
Journal of Statistical Computation and Simulation, 73(1):71-82.
[PDF]
-
Fan, J.(a), Jiang, J., Zhang, C.M., and Zhou, Z. (2003).
"Time-dependent diffusion models for term structure dynamics,"
Statistica Sinica, 13(4):965-992.
[PDF]
-
Fan, J.(a) and Zhang, C.M. (2003).
"A reexamination of diffusion estimators with applications to financial model validation,"
Journal of the American Statistical Association, 98(461):118-134.
[PDF]
-
Fan, J.(a), Zhang, C.M., and Zhang, Jian (2002).
"Correction: Generalized likelihood ratio statistics and Wilks phenomenon,"
Annals of Statistics, 2002, 30(6):1811-1811.
[PDF]
-
Fan, J.(a), Zhang, C.M., and Zhang, Jian (2001).
"Generalized likelihood ratio statistics and Wilks phenomenon,"
Annals of Statistics, 29(1):153-193.
[PDF]
-
Fan, J.(a) and Zhang, C.M. (1999).
"Comment on "Adjusting for non-ignorable drop-out using semiparametric non-response models" by D.O. Scharfstein, A. Rotnitzky, and J.M. Robins,"
Journal of the American Statistical Association, 94(448):1122-1125.
[PDF]
-
Yang, Z.Q. and Zhang, C.M. (1997).
"Dimension reduction and L1 approximation for evaluation of multivariate normal integrals,"
(Chinese) Mathematica Numerica Sinica, 19(1):91-102;
translation in Chinese Journal of Numerical Mathematics and Applications, 19(2):82-95.
[PDF]
-
Zhang, Z.J., Yang, Z.Q., Zhang, C.M., and Feng, Y.C. (1996).
"An approximate algorithm of generating variates with arbitrary continuous statistical distribution,"
Journal of Systems Engineering and Electronics, 7(1):35-42.
[PDF]
-
Zhang, C.M., Yang, Z.Q., and Zhang, Z.J. (1995).
"Monte-Carlo method and its applications implemented on JN-3 parallel computer,"
Journal of Xidian University (in Chinese), 22(Sup.):21-25.
-
Zhang, Z.J., Yang, Z.Q., and Zhang, C.M. (1994).
"Monotone piecewise curve fitting algorithms,"
Journal of Computational Mathematics, 12(2):163-172.
[PDF]
Technical Reports:
-
Zhang, C.M. (2000). "Topics in generalized likelihood ratio test".
Ph.D. Dissertation, The University of North Carolina at Chapel Hill. 125 pp. ISBN: 978-0599-73536-1.
Funding:
The research is supported by the U.S. National Science Foundation (NSF), the Wisconsin Alumni Research Foundation (WARF), and the Association for Women in Mathematics (AWM).