Journal: International Journal of Innovative Research in Computer and Communication Engineering
Print ISSN: 2320-9798
Electronic ISSN: 2320-9801
Publication year: 2014
Volume: 2
Issue: 2
Publisher: S&S Publications
Abstract: The deep web is database-based: for many search engines, the data encoded in the returned result pages come from underlying structured databases. Such search engines are often referred to as Web databases (WDBs). A typical result page returned from a WDB contains multiple search result records (SRRs). Unfortunately, the semantic labels of data units are often not provided in result pages. Having semantic labels for data units is important not only for record linkage, but also for storing collected SRRs in a database table. Early applications required tremendous human effort to annotate data units manually, which severely limited their scalability. In this paper, we consider how to automatically assign labels to the data units within the SRRs returned from WDBs, and we improve the results with a new kernel function that increases the accuracy of Support Vector Machine (SVM) classification. The proposed kernel function is stated in general form and is called the Gaussian Radial Basis Polynomials Function (GRPF); it combines the Gaussian Radial Basis Function (RBF) and Polynomial (POLY) kernels. We implement the proposed kernel with a number of parameters associated with the use of the SVM algorithm that can impact the results.
Keywords: Data alignment; data annotation; web database; wrapper generation
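
Note: The abstract says that GRPF combines the Gaussian RBF and polynomial kernels but does not give the formula. The sketch below is therefore not the paper's GRPF. It only illustrates one common way to combine the two kernels, namely their product, which remains a valid kernel when the polynomial kernel has a non-negative offset and a positive integer degree. The function names and the parameters `gamma`, `degree`, and `coef0` are assumptions chosen for illustration.

```python
import math


def rbf_kernel(x, y, gamma=0.5):
    """Gaussian RBF kernel: exp(-gamma * ||x - y||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def poly_kernel(x, y, degree=2, coef0=1.0):
    """Polynomial kernel: (<x, y> + coef0) ** degree."""
    dot = sum(a * b for a, b in zip(x, y))
    return (dot + coef0) ** degree


def combined_kernel(x, y, gamma=0.5, degree=2, coef0=1.0):
    """Illustrative combination (an assumption, not the paper's GRPF):
    the product of the RBF and polynomial kernel values."""
    return rbf_kernel(x, y, gamma) * poly_kernel(x, y, degree, coef0)


def gram_matrix(points, kernel):
    """Pairwise kernel values for a list of feature vectors."""
    return [[kernel(p, q) for q in points] for p in points]


if __name__ == "__main__":
    # Hypothetical 2-D feature vectors standing in for data-unit features.
    samples = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    for row in gram_matrix(samples, combined_kernel):
        print([round(v, 4) for v in row])
```

A Gram matrix built this way could be handed to any SVM implementation that accepts precomputed kernels. The parameters `gamma`, `degree`, and `coef0` are the kind of kernel settings that, as the abstract notes, can affect classification results.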