Journal: International Journal of Computers and Communications
Print ISSN: 2074-1294
Publication year: 2014
Volume: 8
Pages: 7-16
Publisher: University Press
Abstract: Several systems with multimodal interfaces are already available, and they allow a more natural and more advanced exchange of information between humans and machines. Nevertheless, the television domain is still in an innovation and development phase in which standard linear television is being enhanced with several novel technologies. It is thus already being transformed into a fully interactive entertainment environment that can be customized with a range of applications and services. Moreover, the TV set is one of the most common household devices and can therefore also serve as a common platform for the smart-home environment. The current level of personalization and the available interactive possibilities are still quite limited, especially in terms of context-awareness, recommendation, and support for multiple user-control devices (e.g. smartphones, tablets, gamepads, keyboards, mice). Fusing the evolving IPTV services with natural modalities can therefore be an effective solution for users who would like to access these services and IPTV content in a more natural way. This paper proposes a novel IMS-based UMB-SmartTV system that fuses traditional IPTV services with multimodal services, including a text-to-speech synthesis engine, a speech recognition engine, and embodied conversational agents, which are available to several users, even remotely. The platform enables flexible migration from the often closed and purpose-oriented nature of multimodal systems to the wider scope that the IPTV environment can offer. It is designed to overcome the interoperability, compatibility, and integration problems that often accompany migration to multiservice (and resource-limited) networks. The UMB-SmartTV architecture is built on an IMS core and a distributed DATA architecture. In this way it flexibly merges IPTV and non-IPTV services into a uniform and highly modular solution that provides entertainment, ambience control, and many other services to users interacting through different devices and speech.