Journal: International Journal of Future Generation Communication and Networking
Print ISSN: 2233-7857
Publication year: 2016
Volume: 9
Issue: 5
Pages: 75-82
DOI: 10.14257/ijfgcn.2016.9.5.08
Publisher: SERSC
Abstract: In high-speed backbone networks, the number of network flows grows rapidly as link speeds increase. Because hardware computing and storage resources are limited, identifying and measuring large flows promptly and accurately in massive data has become a hot issue in high-speed network flow measurement. The MF algorithm is prone to false positives, and its frequent updates place heavy pressure on the system. To address these defects, this paper proposes a new algorithm, based on a double hash algorithm, for identifying large-flow frequent items. The complexity and false positive rate of the algorithm are analyzed, and the effect of parameter configuration on the statistical accuracy and discard rate of large-flow frequent items is examined through simulation. The theoretical analysis and simulation results indicate that, compared with the MF algorithm, the proposed algorithm identifies large-flow frequent items more accurately and meets the needs of practical measurement.
Keywords: network measurement; massive data; data mining; frequent item; hash method