Validating Medical Treatment Effects by Projected F-tests under High Dimension with a Small Sample Size

Authors

  • Jiajuan Liang, Department of Statistics and Data Science, Beijing Normal-Hong Kong Baptist University, 2000 Jintong Road, Tang Jia Wan, Zhuhai 519087, China; and Guangdong Provincial/Zhuhai Key Laboratory of Interdisciplinary Research and Application for Data Science, Beijing Normal-Hong Kong Baptist University, 2000 Jintong Road, Tang Jia Wan, Zhuhai 519087, China

DOI:

https://doi.org/10.6000/1929-6029.2025.14.73

Keywords:

Dimension reduction, F-test, Monte Carlo study, Multiple mean comparison, Principal component analysis

Abstract

This paper introduces a statistical method for validating treatment effects in high-dimensional medical data with small sample sizes. The method compares multiple multivariate population means under multivariate normality, using spherical matrix distribution theory and principal component analysis (PCA) for dimension reduction. The resulting test statistic follows an exact F-distribution under the null hypothesis of equal means, even when the sample size is smaller than the data dimension. Unlike classical MANOVA, the approach does not require equal covariance matrices across groups, making it more robust for real-world biomedical data where variance-covariance homogeneity rarely holds. Monte Carlo simulations show the test achieves accurate type I error control and favorable power. Application to real medical datasets with high-dimensional biomarkers further demonstrates its practicality and interpretability. This work provides a rigorous and versatile advancement for high-dimensional inference in biomedical research and related fields.
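The abstract does not give the test statistic itself, so the following is only an illustrative sketch of the general idea of a PCA-projected F-test, not the paper's procedure. It follows the classical Läuter-type construction: the projection direction is computed from the total sum-of-products matrix, which ignores group labels, and an ordinary one-way ANOVA F statistic is then computed on the projected scores. Under multivariate normality with a common covariance matrix (an assumption of this sketch that the paper says it relaxes), that statistic is F-distributed with (k - 1, n - k) degrees of freedom under equal means, even when n < p. The function name `projected_f_statistic` and the choice of a single principal direction are assumptions made for illustration.

```python
import numpy as np

def projected_f_statistic(groups):
    """One-way ANOVA F statistic on scores projected onto the leading
    principal axis of the label-free total sum-of-products matrix.

    groups: list of (n_i x p) arrays, one per treatment group.
    Illustrative sketch only; not the statistic defined in the paper.
    """
    groups = [np.asarray(g, dtype=float) for g in groups]
    X = np.vstack(groups)                  # n x p, where n may be smaller than p
    n, k = X.shape[0], len(groups)

    # Centre by the grand mean, so the direction does not use group labels.
    centered = X - X.mean(axis=0)

    # The first right singular vector equals the leading eigenvector
    # of centered.T @ centered (the total sum-of-products matrix).
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    d = vt[0]

    # Project every group onto the direction and run a univariate ANOVA.
    scores = [g @ d for g in groups]
    grand = np.concatenate(scores).mean()
    ss_between = sum(len(s) * (s.mean() - grand) ** 2 for s in scores)
    ss_within = sum(((s - s.mean()) ** 2).sum() for s in scores)

    df1, df2 = k - 1, n - k
    f_stat = (ss_between / df1) / (ss_within / df2)
    return f_stat, df1, df2
```

With three groups of five observations in 50 dimensions, the call returns the F statistic together with degrees of freedom (2, 12); a p-value can then be read from the upper tail of the corresponding F-distribution.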

Published

2025-12-09

How to Cite

Liang, J. (2025). Validating Medical Treatment Effects by Projected F-tests under High Dimension with a Small Sample Size. International Journal of Statistics in Medical Research, 14, 811–817. https://doi.org/10.6000/1929-6029.2025.14.73

Section

Special Issue: New Advances in Multiple Statistical Comparison and Its Applications in Medicine