Article information

Title: UNC-Corpus Corpus de diagramas UML para la solución de problemas de completitud en ingeniería de software
Authors: Carlos M. Zapata J.; Juan C. Hernández P.; Raúl A. Zuluaga et al.
Journal: Revista Universidad EAFIT
Print ISSN: 0120-341X
Year: 2011
Volume: 44
Issue: 151
Pages: 93-106
Language: English
Publisher: Universidad EAFIT
Abstract: Computational corpora are used as tools in Natural Language Processing (NLP) to solve disambiguation, translation, and automated text generation problems. To accomplish these tasks, the main feature of computational corpora (the fact that they contain proven uses of a language) is combined with statistical analysis and with information extraction methods based on neural networks or genetic algorithms. In software engineering, there is no evidence of the use of computational corpora of diagrams. Diagram repositories have a similar application, working with real examples of diagrams (mainly for reuse purposes), but they use neither statistics nor heuristic methods for information extraction. This paper proposes the UNC-Corpus, a tool for managing a corpus of UML (Unified Modelling Language) diagrams that applies traditional NLP techniques to solve completeness problems in software engineering.
Keywords: annotated corpus; UML diagrams; XMI; repository; metamodelling; NLP; information extraction; corpus anotado; diagramas UML; XMI; repositorio; metamodelado; PLN; extracción de información