Abstract: We apply machine learning and Convex-Hull algorithms to separate RR Lyrae stars from other stars, such as main-sequence stars, white dwarf stars, carbon stars, cataclysmic variables (CVs), and carbon-lines stars, based on data from the Sloan Digital Sky Survey (SDSS) and the Galaxy Evolution Explorer (GALEX). In low-dimensional spaces, the Convex-Hull algorithm is applied to select RR Lyrae stars. Different convex hulls can be built for RR Lyrae stars from the input patterns (u−g, g−r), (g−r, r−i), (r−i, i−z), (u−g, g−r, r−i), (g−r, r−i, i−z), (u−g, g−r, i−z), and (u−g, r−i, i−z). A comparison of these input patterns shows that (u−g, g−r, i−z) performs best. For this input pattern, the efficiency (the fraction of true RR Lyrae stars in the predicted RR Lyrae sample) is 4.2% at a completeness (the fraction of recovered RR Lyrae stars in the whole RR Lyrae sample) of 100%; after some outliers are removed, the efficiency rises to 9.9% at 97% completeness and to 16.1% at 53% completeness. In high-dimensional spaces, machine learning algorithms are used with the input patterns (u−g, g−r, r−i, i−z), (u−g, g−r, r−i, i−z, r), (NUV−u, u−g, g−r, r−i, i−z), and (NUV−u, u−g, g−r, r−i, i−z, r). RR Lyrae stars, the class of interest in this paper, are rare compared to other stars. To handle these highly imbalanced data, cost-sensitive Support Vector Machine, cost-sensitive Random Forest, and Fast Boxes are used. The results show that information from GALEX is helpful for identifying RR Lyrae stars, and that Fast Boxes is the best performer on the skewed data in our case.
Keywords: astronomical databases: miscellaneous; methods: data analysis; methods: statistical; stars: general; stars: variables: RR Lyrae
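
The convex-hull selection and the two quality measures defined in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a two-dimensional colour space such as (u−g, g−r), builds the hull with a textbook monotone-chain routine in pure Python, and uses made-up toy coordinates. The function names `convex_hull`, `inside_hull`, and `efficiency_completeness` are hypothetical and chosen only for this illustration.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]  # one star in an assumed 2D colour space, e.g. (u-g, g-r)


def cross(o: Point, a: Point, b: Point) -> float:
    """Z-component of (a - o) x (b - o); positive for a left turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points: Sequence[Point]) -> List[Point]:
    """Monotone-chain convex hull, vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower: List[Point] = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper: List[Point] = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def inside_hull(hull: Sequence[Point], p: Point) -> bool:
    """True if p lies inside or on the boundary of a counter-clockwise hull.

    Assumes the hull has at least three non-collinear vertices.
    """
    n = len(hull)
    return all(cross(hull[i], hull[(i + 1) % n], p) >= 0 for i in range(n))


def efficiency_completeness(
    hull: Sequence[Point],
    candidates: Sequence[Point],
    is_rr_lyrae: Sequence[bool],
) -> Tuple[float, float]:
    """Efficiency and completeness as defined in the abstract.

    efficiency   = true RR Lyrae inside the hull / all stars inside the hull
    completeness = true RR Lyrae inside the hull / all true RR Lyrae
    """
    selected = [inside_hull(hull, p) for p in candidates]
    n_selected = sum(selected)
    n_true = sum(is_rr_lyrae)
    n_hit = sum(s and t for s, t in zip(selected, is_rr_lyrae))
    efficiency = n_hit / n_selected if n_selected else 0.0
    completeness = n_hit / n_true if n_true else 0.0
    return efficiency, completeness


# Toy data (assumed, not from the paper): colours of known RR Lyrae stars.
known_rr_lyrae = [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0), (0.0, 2.0), (1.0, 1.0)]
hull = convex_hull(known_rr_lyrae)

# Toy test sample: two RR Lyrae stars and two contaminants.
candidates = [(1.0, 1.0), (0.5, 1.5), (3.0, 3.0), (1.0, 0.0)]
labels = [True, False, False, True]
efficiency, completeness = efficiency_completeness(hull, candidates, labels)
```

In this toy sample one contaminant falls inside the hull, so every RR Lyrae star is recovered while the selected sample stays contaminated. This is the same trade-off the abstract reports: trimming outliers before building the hull shrinks it, which raises efficiency at the cost of completeness. The three-colour patterns in the abstract would need a three-dimensional hull, which this sketch does not cover.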