Publisher: Vilnius University, University of Latvia, Latvia University of Agriculture, Institute of Mathematics and Informatics of University of Latvia
Abstract: Convolutional neural networks (CNNs) have become the state-of-the-art solution for
image classification and related problems. This paper investigates the use of CNN features
for on-line classification of television streams by programme genre. As most existing offline
classification solutions rely on low-level audio-visual video descriptors, this paper
compares the precision achieved by simple multi-layer perceptrons (MLPs) and long
short-term memory (LSTM) recurrent neural networks (RNNs) using either low-level visual and
audio descriptors or activations of the InceptionV3 CNN's global pooling layer as features. The best
real-time classification accuracy on the evaluation data set, 71.6%, was achieved by an LSTM RNN
using CNN features, supporting the use of CNNs for television genre classification.
Keywords: television genre classification; television stream classification; video classification;
neural networks; InceptionV3
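
The on-line setting described in the abstract, where a sequence model labels a stream as per-frame features arrive, can be illustrated with a minimal sliding-window sketch. The abstract does not say how frames are buffered, so the window length, the genre labels, and the `toy_classifier` stand-in for a trained LSTM are all hypothetical; in practice each feature vector would be an InceptionV3 global pooling activation (2048 values) rather than the two-element toy vectors used here.

```python
from collections import deque
from typing import Callable, Deque, Iterable, Iterator, List, Sequence

FeatureVector = Sequence[float]  # one per-frame feature vector (e.g. CNN pooling activations)
SequenceClassifier = Callable[[List[FeatureVector]], str]


def classify_stream(
    frames: Iterable[FeatureVector],
    classifier: SequenceClassifier,
    window: int = 4,  # hypothetical window length, not taken from the paper
) -> Iterator[str]:
    """Yield one genre label per incoming frame once the window is full."""
    buffer: Deque[FeatureVector] = deque(maxlen=window)
    for features in frames:
        buffer.append(features)  # the oldest frame is dropped automatically
        if len(buffer) == window:
            yield classifier(list(buffer))


def toy_classifier(sequence: List[FeatureVector]) -> str:
    """Placeholder for a trained sequence model: thresholds the mean activation."""
    total = sum(sum(vector) for vector in sequence)
    count = sum(len(vector) for vector in sequence)
    return "news" if total / count < 0.5 else "sports"  # hypothetical genre labels


# Toy stream: low activations first, then high activations.
stream = [[0.1, 0.2], [0.2, 0.1], [0.3, 0.3], [0.2, 0.2],
          [0.9, 0.9], [0.9, 0.8], [1.0, 1.0]]
labels = list(classify_stream(stream, toy_classifier, window=4))
print(labels)
```

Replacing `toy_classifier` with a function that wraps a trained recurrent model would keep the streaming loop unchanged, since the loop only assumes a callable that maps a list of feature vectors to a label.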