Basic article information

  • Title: A Feature Selection Method for Large-Scale Network Traffic Classification Based on Spark
  • Authors: Yong Wang; Wenlong Ke; Xiaoling Tao
  • Journal: Information
  • Electronic ISSN: 2078-2489
  • Year of publication: 2016
  • Volume: 7
  • Issue: 1
  • Pages: 6-16
  • DOI: 10.3390/info7010006
  • Publisher: MDPI Publishing
  • Abstract: With the rapid growth of data scale in network traffic classification, selecting traffic features efficiently is becoming a major challenge. Although a number of traditional feature selection methods using the Hadoop-MapReduce framework have been proposed, their execution time remains unsatisfactory because of the numerous iterative computations involved in processing. To address this issue, this paper proposes an efficient feature selection method for network traffic based on Spark, a newer parallel computing framework. In our approach, the complete feature set is first preprocessed based on Fisher score, and a sequential forward search strategy is employed to search for feature subsets. The optimal feature subset is then selected through continuous iterations on the Spark computing framework. The implementation demonstrates that, while maintaining classification accuracy, our method reduces the time cost of modeling and classification and significantly improves the execution efficiency of feature selection.
  • Keywords: feature selection; Fisher score; sequential forward search; MapReduce; Spark
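
The two steps named in the abstract, Fisher-score preprocessing followed by a sequential forward search, can be illustrated with a minimal single-machine sketch. This is not the authors' implementation and does not use Spark. It assumes the commonly used Fisher score definition (between-class scatter divided by within-class scatter, per feature), which may differ in detail from the paper's formula. The names `fisher_scores`, `centroid_accuracy`, and `forward_search`, the nearest-centroid evaluator, and the toy data are all hypothetical choices made for illustration.

```python
from collections import defaultdict
from statistics import fmean, pvariance


def fisher_scores(rows, labels):
    """One score per feature: between-class scatter / within-class scatter.

    Assumption: the common Fisher score definition, not necessarily the paper's.
    """
    by_class = defaultdict(list)
    for row, label in zip(rows, labels):
        by_class[label].append(row)
    scores = []
    for j in range(len(rows[0])):
        overall_mean = fmean(row[j] for row in rows)
        between = within = 0.0
        for members in by_class.values():
            column = [row[j] for row in members]
            between += len(column) * (fmean(column) - overall_mean) ** 2
            within += len(column) * pvariance(column)
        if within > 0:
            scores.append(between / within)
        else:
            # No spread inside any class: perfectly separating or constant feature.
            scores.append(float("inf") if between > 0 else 0.0)
    return scores


def centroid_accuracy(rows, labels, subset):
    """Training-set accuracy of a nearest-centroid classifier on `subset`.

    Hypothetical stand-in for whatever classifier evaluates a candidate subset.
    """
    by_class = defaultdict(list)
    for row, label in zip(rows, labels):
        by_class[label].append([row[j] for j in subset])
    centroids = {
        c: [fmean(col) for col in zip(*members)] for c, members in by_class.items()
    }
    correct = 0
    for row, label in zip(rows, labels):
        point = [row[j] for j in subset]
        predicted = min(
            centroids,
            key=lambda c: sum((p - q) ** 2 for p, q in zip(point, centroids[c])),
        )
        correct += predicted == label
    return correct / len(rows)


def forward_search(rows, labels):
    """Visit features in descending Fisher score; keep each one that improves accuracy."""
    scores = fisher_scores(rows, labels)
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    selected, best = [], 0.0
    for j in ranked:
        accuracy = centroid_accuracy(rows, labels, selected + [j])
        if accuracy > best:
            selected, best = selected + [j], accuracy
    return selected, best


# Toy data (made up): feature 0 separates the classes, feature 1 does not.
rows = [
    [1.0, 5.0, 0.2],
    [1.2, 3.0, 0.1],
    [0.9, 4.0, 0.3],
    [3.0, 5.0, 0.2],
    [3.1, 3.0, 0.4],
    [2.9, 4.0, 0.3],
]
labels = ["a", "a", "a", "b", "b", "b"]
selected, accuracy = forward_search(rows, labels)
print(selected, accuracy)
```

In a distributed setting the per-class counts, means, and variances would presumably be computed by aggregating over partitions rather than by looping over in-memory lists. The loop over candidate features is the iterative part that the abstract says benefits from Spark.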