Article Information

Title: Feature Optimization and Performance Evaluation of Machine Learning Algorithms for Identification of P2P Traffic
Authors: Agrawal, Sunil; Sohi, Balwinder S.
Journal: Journal of Advances in Information Technology
Print ISSN: 1798-2340
Publication year: 2012
Volume: 3
Issue: 2
Pages: 107-114
DOI: 10.4304/jait.3.2.107-114
Language: English
Publisher: Academy Publisher
Abstract: P2P applications supposedly constitute a substantial proportion of today's Internet traffic. The ability to accurately identify different P2P applications in Internet traffic is important to a broad range of network operations, including application-specific traffic engineering, capacity planning, resource provisioning, and service differentiation. However, current P2P applications use several obfuscation techniques, including dynamic port numbers, port hopping, and encrypted payloads. As P2P applications continue to evolve, robust and effective methods are needed to identify them. It is general practice to reduce the cost of classification by reducing the number of features with a feature selection algorithm. But such algorithms are highly data-dependent and do not yield good results when applied to other data sets. In this paper, we propose an optimized set of features and compare five supervised ML algorithms for identification of P2P traffic. It is found that NBTree outperforms the other ML algorithms, with 96.6% precision and 99.7% recall, when they are trained and tested on the same data set. As far as training time is concerned, BayesNet is the best, with precision and recall very close to those of NBTree.
Keywords: Flow features; Feature selection; Machine learning (ML) algorithms; Traffic classification