文章基本信息

标题：TGLL: A Fast Threaded Nondeterministic Ll (*) Parsing
本地全文：下载
作者：Amr M. Abdel Latif ; Amr Kamel ; Reem Bahgat 等
期刊名称：ARPN Journal of Systems and Software
电子版ISSN：2222-9833
出版年度：2015
卷号：5
期号：2
页码：27-39
出版社：ARPN Publishers
摘要：
Parsers play important role in compilers, NLP and other applications. Building a parser by hand is a complex task, thus a generator is required. Nowadays computers contain processors with multi-core that encourage programmers to develop programs in parallel environments, while the parsing process is still in many cases sequential and hence can抰 benefit from the parallelization. Most grammars are written in highly non-deterministic rules. Some solutions are set to handle non-determinism such as parsing with backtracking, look ahead or by prediction. These techniques are only sequential. This paper presents a new algorithm for solving non-determinism using multi-threading technology. It is named TGLL (Threaded General LL) parsing method. It handles all context free grammars and resolves non-determinism by launching multiple threads to parse the different alternative rules. The threads are augmented with backtracking to avoid explosive thread creation. A practical parser generator is developed. The generator can generate a recursive descent parser with multi-threaded or Fork/Join model. Left recursion is converted to right recursion automatically by the generator. The TGLL parser is simple and can be built by hand, but for lengthy grammars it is preferred to use the generator. The experimental work shows that working with the multi-threaded Fork/Join model can be faster than famous existing parser generators such as Java CC and Antler V4 and even with smaller memory. Creating a parser that executes with a single thread and backtracking has approximately the same parsing time as the Fork/Join model but due to the thread allocation constant time, the Fork/Join model takes a small extra constant time than the one-threaded model. However, the Fork/Join model consumes less memory than the single-threaded model.
关键词：Parallel parsing; Parser generator; non-deterministic grammar; LL grammar and compilers