摘要:This article considers the problem of multi-group classification in the setting where the number of variables $p$ is larger than the number of observations $n$. Several methods have been proposed in the literature that address this problem, however their variable selection performance is either unknown or suboptimal to the results known in the two-group case. In this work we provide sharp conditions for the consistent recovery of relevant variables in the multi-group case using the discriminant analysis proposal of Gaynanova et al. [7]. We achieve the rates of convergence that attain the optimal scaling of the sample size $n$, number of variables $p$ and the sparsity level $s$. These rates are significantly faster than the best known results in the multi-group case. Moreover, they coincide with the minimax optimal rates for the two-group case. We validate our theoretical results with numerical analysis.
关键词:Classification;Fisher’s discriminant analysis, group penalization;high-dimensional statistics.