Journal: International Journal of Advanced Computer Science and Applications (IJACSA)
Print ISSN: 2158-107X
Electronic ISSN: 2156-5570
Publication year: 2016
Volume: 7
Issue: 5
DOI: 10.14569/IJACSA.2016.070525
Publisher: Science and Information Society (SAI)
Abstract: Predicting gender from names is an interesting problem in information retrieval and expert finding. In this paper, we propose a machine learning approach to gender prediction. We introduce a new feature, combinations of letters in names, which gives 86.54% accuracy. Our data collection consists of 3000 Urdu-language names written in the English alphabet. The technique can also be used with names extracted from email addresses and is therefore applicable to emails. To the best of our knowledge, this is the first attempt to predict gender from Pakistani (Urdu) names written in the English alphabet.
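The abstract does not say which letter combinations the authors use or which classifier consumes them. As a minimal sketch only, one common way to realise such a feature is contiguous character n-grams with start and end markers, so that prefixes and suffixes of a name stay distinguishable. The function name `letter_combinations`, the n-gram sizes, the `^`/`$` markers, and the sample name are all illustrative assumptions, not details taken from the paper.

```python
from collections import Counter


def letter_combinations(name, sizes=(2, 3)):
    """Count contiguous letter combinations (character n-grams) in a name.

    Hypothetical helper: the sizes and padding markers are assumptions
    made for illustration, not the paper's actual feature definition.
    """
    # Normalise: lower-case and keep alphabetic characters only.
    letters = "".join(ch for ch in name.lower() if ch.isalpha())
    # Pad with start/end markers so "a$" (name ends in 'a') differs from "a" mid-name.
    padded = "^" + letters + "$"
    features = Counter()
    for n in sizes:
        for i in range(len(padded) - n + 1):
            features[padded[i:i + n]] += 1
    return features


# Illustrative name, not drawn from the paper's data set.
features = letter_combinations("Ayesha")
print(sorted(features))
```

Feature dictionaries of this kind could then be passed to any standard classifier. The 86.54% accuracy reported in the abstract refers to the authors' own setup, not to this sketch.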