摘要:Among the more recent applications for natural language processing algorithms has been the analysis of spoken language data for diagnostic and remedial purposes, fueled by the demand for simple, objective, and unobtrusive screening tools for neurological disorders such as dementia. The automated analysis of narrative retellings in particular shows potential as a component of such a screening tool since the ability to produce accurate and meaningful narratives is noticeably impaired in individuals with dementia and its frequent precursor, mild cognitive impairment, as well as other neurodegenerative and neurodevelopmental disorders. In this article, we present a method for extracting narrative recall scores automatically and highly accurately from a word-level alignment between a retelling and the source narrative. We propose improvements to existing machine translation–based systems for word alignment, including a novel method of word alignment relying on random walks on a graph that achieves alignment accuracy superior to that of standard expectation maximization–based techniques for word alignment in a fraction of the time required for expectation maximization. In addition, the narrative recall score features extracted from these high-quality word alignments yield diagnostic classification accuracy comparable to that achieved using manually assigned scores and significantly higher than that achieved with summary-level text similarity metrics used in other areas of NLP. These methods can be trivially adapted to spontaneous language samples elicited with non-linguistic stimuli, thereby demonstrating the flexibility and generalizability of these methods.