Basic Article Information

Title: Object Recognition for Visually Impaired People
Authors: Kavitha Srinivasan; Shanmuga Velayutham V; Vignesh G; et al.
Journal: International Journal of Computer Trends and Technology
Electronic ISSN: 2231-2803
Publication year: 2020
Volume: 68
Issue: 8
Pages: 33-38
DOI: 10.14445/22312803/IJCTT-V68I8P105
Publisher: Seventh Sense Research Group
Abstract: Deep learning techniques are evolving rapidly in computer vision for many real-time applications, namely object detection, recognition, classification, segmentation, prediction and analysis. In this paper, an object recognition model for visually impaired people is proposed and validated on multiple datasets using deep learning techniques. The proposed model identifies multiple objects in a frame, maps each to its corresponding text, and converts the identified objects into speech to guide visually impaired people in real time. Object identification is carried out using a bounding box technique and a single convolutional neural network. Bounding boxes whose probability falls below the threshold are eliminated, and the remaining objects are identified using a pretrained Darkflow model. The identified objects are then mapped to relevant text and converted to speech using a Text-to-Speech (TTS) tool. The proposed model has been validated on four datasets: the Pascal VOC dataset, the COCO dataset, the BROID challenge dataset and the Auto Rickshaw detection challenge dataset. The novelty of this work lies in a modified intersection over union algorithm for better recognition, the choice of datasets with different sets of images, and a weight file modified to recognize the objects of the challenge dataset. OpenCV and Compute Unified Device Architecture (CUDA) are used for image manipulation and graphics processing along with TensorFlow. The final output is obtained in audio format by applying TTS to the identified objects using Pyttsx, a Python package that converts simple text to a speech signal.
Keywords: Object identification; Object recognition; YOLO; SSD; Intersection over Union; Darkflow; Text to speech.
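
Illustrative sketch: the abstract describes discarding bounding boxes below a probability threshold, comparing boxes by intersection over union, and turning the surviving labels into text for speech. The Python below is a minimal sketch of that flow under stated assumptions. It uses the standard intersection over union formula, not the authors' modified algorithm (which the abstract does not specify), and the names `Detection`, `iou`, `filter_detections` and `to_sentence`, along with both 0.5 thresholds, are hypothetical. The detector itself and the speech call are left out; the returned sentence is the kind of string that would be handed to a TTS engine.

```python
from typing import List, NamedTuple, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


class Detection(NamedTuple):
    """Hypothetical container for one detector output."""
    label: str
    score: float
    box: Box


def iou(a: Box, b: Box) -> float:
    """Standard intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def filter_detections(
    detections: List[Detection],
    score_threshold: float = 0.5,  # assumed value, not from the paper
    iou_threshold: float = 0.5,    # assumed value, not from the paper
) -> List[Detection]:
    """Drop low-score boxes, then suppress same-label boxes that overlap a kept one."""
    candidates = sorted(
        (d for d in detections if d.score >= score_threshold),
        key=lambda d: d.score,
        reverse=True,
    )
    kept: List[Detection] = []
    for det in candidates:
        overlaps_kept = any(
            det.label == k.label and iou(det.box, k.box) >= iou_threshold
            for k in kept
        )
        if not overlaps_kept:
            kept.append(det)
    return kept


def to_sentence(detections: List[Detection]) -> str:
    """Build the text that a TTS engine would be asked to speak."""
    if not detections:
        return "No objects detected"
    return "Detected: " + ", ".join(d.label for d in detections)


if __name__ == "__main__":
    frame = [
        Detection("person", 0.9, (0, 0, 10, 10)),
        Detection("person", 0.6, (1, 1, 10, 10)),   # overlaps the first person
        Detection("dog", 0.3, (20, 20, 30, 30)),    # below the score threshold
        Detection("car", 0.8, (50, 50, 80, 80)),
    ]
    print(to_sentence(filter_detections(frame)))
```

Sorting by score before suppression means the highest-confidence box for each object is the one retained, which is the usual convention for this kind of filtering.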