Article information

  • Title: Sequence based prediction of enhancer regions from DNA random walk
  • Authors: Anand Pratap Singh ; Sarthak Mishra ; Suraiya Jabin
  • Journal: Scientific Reports
  • Electronic ISSN: 2045-2322
  • Publication year: 2018
  • Volume: 8
  • Issue: 1
  • Pages: 15912
  • DOI: 10.1038/s41598-018-33413-y
  • Language: English
  • Publisher: Springer Nature
  • Abstract: Regulatory elements play a critical role in the development of eukaryotic organisms by controlling the spatio-temporal pattern of gene expression. Enhancers are one class of these elements; they contribute to the regulation of gene expression through chromatin looping or eRNA expression. Experimental identification of a novel enhancer is costly, which has motivated interest in computational approaches to predicting enhancer regions in a genome. Existing computational approaches have primarily been based on training on high-throughput data such as transcription factor binding sites (TFBS), DNA methylation, and histone modification marks. Purely sequence based approaches, on the other hand, are promising because they are not biased by the complexity or context specificity of such datasets. In sequence based approaches, machine learning models are trained either directly on sequences or on sequence features to classify sequences as enhancers or non-enhancers. In this paper, we derive statistical and nonlinear dynamic features, along with k-mer features, from experimentally validated sequences taken from the Vista Enhancer Browser through a random walk model, and apply different machine learning methods to predict whether an input test sequence is an enhancer. Experimental results demonstrate the success of the proposed model based on an ensemble method, with an area under the curve (AUC) of 0.86, 0.89, and 0.87 in B cells, T cells, and natural killer cells, respectively, for the histone marks dataset.
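
The feature extraction described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the step rule of the random walk, so the purine/pyrimidine mapping below (A, G = +1; C, T = -1) is an assumption, and the function names `dna_walk`, `walk_stats`, and `kmer_frequencies` are hypothetical. Only simple statistical features and k-mer frequencies are shown; the nonlinear dynamic features and the classifiers are omitted.

```python
from collections import Counter
from itertools import product
from statistics import mean, pstdev

# Assumed step rule (purine = +1, pyrimidine = -1); the abstract does not specify it.
STEP = {"A": 1, "G": 1, "C": -1, "T": -1}


def dna_walk(seq):
    """Return the cumulative position of a 1-D walk along the sequence."""
    position, path = 0, []
    for base in seq.upper():
        position += STEP.get(base, 0)  # symbols outside ACGT (e.g. N) add no step
        path.append(position)
    return path


def walk_stats(path):
    """Simple statistical summary of a non-empty walk."""
    return {
        "mean": mean(path),
        "std": pstdev(path),  # population standard deviation
        "min": min(path),
        "max": max(path),
    }


def kmer_frequencies(seq, k=2):
    """Relative frequency of every ACGT k-mer, counted with a sliding window."""
    seq = seq.upper()
    keys = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = max(sum(counts[key] for key in keys), 1)  # avoid division by zero
    return {key: counts[key] / total for key in keys}
```

Under the assumed mapping, `dna_walk("AGCT")` returns `[1, 2, 1, 0]`. The dictionaries from `walk_stats` and `kmer_frequencies` could be concatenated into one feature vector per sequence before being passed to a classifier.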