Abstract: Language modeling aims to capture the general knowledge expressed in natural
language. To this end, automatic sentence generation is an important operation in
natural language processing. It can serve as the basis for various
applications, such as machine translation and continuous speech recognition.
In this article, we present a stochastic model that measures the
probability of generating an Arabic sentence from a set of words. The model rests
on the premise that the syntactic and semantic levels of a sentence are independent,
which allows each level to be described by its own appropriate model. The
parameters of the model are estimated on a training corpus manually annotated with
syntactic labels.