Abstract: With the advent of the Internet and the spread of computer use, many applications have been developed that millions of users rely on for everyday tasks, such as office applications and web browsers. Software companies spend over 45% of their costs on dealing with software bugs. An inevitable step in fixing bugs is bug triage, which aims to correctly assign a developer to a new bug. Bug-tracking and issue-tracking systems tend to be populated with bugs, issues, or tickets written by a wide variety of reporters with different levels of training and knowledge about the system being discussed. Many bug reporters lack the skills, vocabulary, knowledge, or time to efficiently search the issue tracker for similar issues. As a result, issue trackers are often full of duplicate issues and bugs, and bug triaging is time-consuming and error-prone. Software bugs occur for a wide range of reasons. Bug reports can be generated automatically or drafted by a user of the software. Bug reports can also accompany other malfunctions of the software, mostly in beta or unstable versions. Most often, these bug reports are supplemented with the users' own accounts of what they actually experienced. Addressing these bugs accounts for the majority of the effort spent in the maintenance phase of a software project life cycle. Several bug reports, sent by different users, often correspond to the same defect. Nevertheless, every bug report has to be analyzed separately and carefully for the possibility of a potential bug. The person responsible for processing newly reported bugs, checking for duplicates, and passing them to suitable developers to be fixed is called a triager, and this process is called triaging. The utility of bug tracking systems is hindered by a large number of duplicate bug reports. In many open source software projects, as many as one third of all reports are duplicates.
Identifying duplicate bug reports is time-consuming and adds to the already high cost of software maintenance. To reduce the time spent on manual work, text classification techniques are applied to perform automatic bug triage. This study presents an overview of the work that has been conducted on open source data sets to better detect duplicate bugs.
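
As a rough illustration of how text-based techniques can flag likely duplicates, the sketch below scores pairs of bug reports by the cosine similarity of their TF-IDF vectors. It is a minimal example under stated assumptions, not a method taken from the surveyed works: the function names, the smoothed IDF formula, the 0.5 threshold, and the sample reports are all hypothetical choices made for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a report and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_vectors(reports):
    """Build one sparse TF-IDF vector (a dict of term -> weight) per report."""
    token_lists = [tokenize(report) for report in reports]
    n_docs = len(token_lists)
    # Document frequency: the number of reports that contain each term.
    doc_freq = Counter(term for tokens in token_lists for term in set(tokens))
    vectors = []
    for tokens in token_lists:
        counts = Counter(tokens)
        # Smoothed IDF (an assumption of this sketch) keeps every weight positive.
        vectors.append({
            term: count * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1.0)
            for term, count in counts.items()
        })
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def find_duplicate_candidates(reports, threshold=0.5):
    """Return (i, j, score) for report pairs scoring at or above the threshold."""
    vectors = tfidf_vectors(reports)
    pairs = []
    for i in range(len(vectors)):
        for j in range(i + 1, len(vectors)):
            score = cosine(vectors[i], vectors[j])
            if score >= threshold:
                pairs.append((i, j, score))
    # Most similar pairs first, so a triager can review them in order.
    return sorted(pairs, key=lambda pair: pair[2], reverse=True)


if __name__ == "__main__":
    # Hypothetical reports, invented for this example.
    sample_reports = [
        "Browser crashes when opening a new tab",
        "Crash when opening new tab in browser",
        "Spell checker ignores custom dictionary",
    ]
    for i, j, score in find_duplicate_candidates(sample_reports):
        print(f"reports {i} and {j} look similar (score {score:.2f})")
```

A pairwise comparison like this grows quadratically with the number of reports, so a real issue tracker would likely need an index or a candidate-filtering step first. The threshold would also have to be tuned on labeled duplicates from the project in question.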