Abstract This paper introduces two deep convolutional neural network training techniques that lead to more robust feature subspace separation in comparison to traditional training. Assume that the dataset has M labels. The first method creates M deep convolutional neural networks, denoted $$\{\text {DCNN}_i\}_{i=1}^{M}$$. Each network $$\text {DCNN}_i$$ is composed of a convolutional neural network ($$\text {CNN}_i$$) and a fully connected neural network ($$\text {FCNN}_i$$). During training, a set of projection matrices $$\{\mathbf {P}_i\}_{i=1}^{M}$$ is created and adaptively updated as representations of the feature subspaces $$\{\mathcal {S}_i\}_{i=1}^{M}$$. A rejection value is computed for each training sample based on its projections onto the feature subspaces. Each $$\text {FCNN}_i$$ acts as a binary classifier with a cost function whose main parameter is the rejection value. A threshold value $$t_i$$ is determined for the $$i^{\text{th}}$$ network $$\text {DCNN}_i$$, and a testing strategy utilizing $$\{t_i\}_{i=1}^{M}$$ is also introduced. The second method creates a single DCNN and computes a cost function whose parameters depend on subspace separation, measured as the geodesic distance on the Grassmannian manifold between each subspace $$\mathcal {S}_i$$ and the sum of all remaining subspaces $$\{\mathcal {S}_j\}_{j=1,j\ne i}^{M}$$. The proposed methods are tested using multiple network topologies. It is shown that while the first method works better for smaller networks, the second method performs better for complex architectures.
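For concreteness, the sketch below illustrates the two quantities the abstract builds on: a projection-based rejection value for the first method, and the Grassmannian geodesic distance (computed from principal angles) for the second. The residual-norm definition of the rejection value and the random-feature setup are assumptions for illustration; the abstract does not give the paper's exact formulas, and all function names here are hypothetical.

```python
import numpy as np

def orthonormal_basis(F, dim):
    """Orthonormal basis (d x dim) of the subspace spanned by a d x n feature matrix F."""
    U, _, _ = np.linalg.svd(F, full_matrices=False)
    return U[:, :dim]

def projection_matrix(U):
    """Projection matrix P = U U^T onto the subspace with orthonormal basis U."""
    return U @ U.T

def rejection_value(x, P):
    """One plausible rejection value: normalized residual of x w.r.t. the
    subspace represented by P (assumed definition, not the paper's exact one)."""
    r = x - P @ x
    return np.linalg.norm(r) / np.linalg.norm(x)

def grassmann_geodesic_distance(U1, U2):
    """Arc-length geodesic distance on the Grassmannian between the subspaces
    spanned by orthonormal bases U1 and U2, via their principal angles."""
    s = np.linalg.svd(U1.T @ U2, compute_uv=False)
    theta = np.arccos(np.clip(s, -1.0, 1.0))  # principal angles
    return np.linalg.norm(theta)

# Illustrative usage with random features standing in for CNN outputs.
d, n, M, k = 64, 200, 3, 10                       # feature dim, samples, labels, subspace dim
feats = [np.random.randn(d, n) for _ in range(M)]  # per-class feature matrices
bases = [orthonormal_basis(F, k) for F in feats]
P = [projection_matrix(U) for U in bases]

x = np.random.randn(d)
rejections = [rejection_value(x, Pi) for Pi in P]  # inputs to each binary classifier's cost

# Distance between S_0 and the sum (span) of the remaining subspaces,
# truncated to dimension k so both subspaces are comparable (a modeling choice).
U_rest = orthonormal_basis(np.hstack([bases[j] for j in range(1, M)]), k)
dist = grassmann_geodesic_distance(bases[0], U_rest)
```

The geodesic distance is the Euclidean norm of the principal angles between the two subspaces, obtained from the singular values of $$\mathbf{U}_1^{\top}\mathbf{U}_2$$; this is the standard arc-length metric on the Grassmannian and matches the separation measure the second method's cost function depends on.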