Abstract: We describe Genesis, a language for generating synthetic programs. Users annotate a template program with statistical distributions that control how its code is customized, and Genesis generates program instances based on those distributions. The generated programs therefore have characteristics that vary in a statistically controlled fashion, which improves on existing program generators and alleviates the difficulties of ad hoc program generation. We describe the language constructs, a prototype preprocessor for the language, and five case studies that show the ability of Genesis to express a range of programs. We evaluate the preprocessor’s performance and the statistical quality of the samples it generates. These results show that Genesis is a useful tool that eases the expression and creation of large and diverse program sets.