Mathematics, Vol. 11, Pages 2398: Quantile-Composited Feature Screening for Ultrahigh-Dimensional Data

Mathematics doi: 10.3390/math11102398

Authors:
Shuaishuai Chen
Jun Lu

Ultrahigh-dimensional grouped data are frequently encountered by biostatisticians working on multi-class categorical problems. To rapidly screen out the null predictors, this paper proposes a quantile-composited feature screening procedure. The new method first transforms each continuous predictor into a Bernoulli variable by thresholding the predictor at a certain quantile. Independence between the response and each predictor can then be assessed with the Pearson chi-square statistic. The proposed method has the following salient features: (1) it is robust against high-dimensional heterogeneous data; (2) it is model-free, requiring no regression structure to be specified between the covariates and the outcome variable; (3) it has a low computational cost, with computational complexity on the order of the sample size. Under some mild conditions, the new method is shown to achieve the sure screening property without imposing any moment condition on the predictors. Numerical studies and real data analyses further confirm the effectiveness of the new screening procedure.
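The thresholding-then-chi-square idea described above can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names (`chi2_stat`, `quantile_screen`), the quantile grid `taus`, the `top_k` cutoff, and the choice to composite by summing the statistics over the grid are all assumptions made for illustration, since the abstract does not specify how the quantiles are combined.

```python
import numpy as np

def chi2_stat(b, y_codes, n_classes):
    """Pearson chi-square statistic for a 0/1 vector b against integer class codes."""
    n = b.shape[0]
    # Observed 2 x K contingency table (rows: b = 0 or 1, columns: classes).
    observed = np.zeros((2, n_classes))
    np.add.at(observed, (b.astype(int), y_codes), 1)
    # Expected counts under independence: row total * column total / n.
    expected = observed.sum(axis=1, keepdims=True) * observed.sum(axis=0, keepdims=True) / n
    mask = expected > 0  # skip empty cells to avoid division by zero
    return float(((observed[mask] - expected[mask]) ** 2 / expected[mask]).sum())

def quantile_screen(X, y, taus=(0.25, 0.5, 0.75), top_k=10):
    """Rank predictors by a summed chi-square score over a grid of quantiles.

    ASSUMPTION: summing over `taus` is one plausible way to composite;
    the source abstract does not state the exact rule.
    """
    X = np.asarray(X, dtype=float)
    classes, y_codes = np.unique(np.asarray(y).ravel(), return_inverse=True)
    n, p = X.shape
    scores = np.zeros(p)
    for j in range(p):
        for tau in taus:
            # Bernoulli transform: indicator that the predictor exceeds its tau-quantile.
            b = X[:, j] > np.quantile(X[:, j], tau)
            scores[j] += chi2_stat(b, y_codes, len(classes))
    ranked = np.argsort(-scores)  # largest score first
    return ranked[:top_k], scores
```

Because each predictor enters only through indicators of the form "above its own quantile", the score depends on ranks rather than raw values, which is consistent with the abstract's claim that no moment condition on the predictors is needed. A hypothetical call would look like `top, scores = quantile_screen(X, y, top_k=20)`, where `X` is an n-by-p array and `y` holds the class labels.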

MDPI Publishing.