Basic article information

Title: Application of kernel principal component analysis and computational machine learning to exploration of metabolites strongly associated with diet
Authors: Yuka Shiokawa; Yasuhiro Date; Jun Kikuchi et al.
Journal: Scientific Reports
Electronic ISSN: 2045-2322
Publication year: 2018
Volume: 8
Issue: 1
Pages: 3426
DOI: 10.1038/s41598-018-20121-w
Language: English
Publisher: Springer Nature
Abstract: Computer-based technological innovation provides advancements in sophisticated and diverse analytical instruments, enabling massive amounts of data collection with relative ease. This is accompanied by a fast-growing demand for technological progress in data mining methods for analysis of big data derived from chemical and biological systems. From this perspective, use of a general "linear" multivariate analysis alone limits interpretations due to "non-linear" variations in metabolic data from living organisms. Here we describe a kernel principal component analysis (KPCA)-incorporated analytical approach for extracting useful information from metabolic profiling data. To overcome the limitation of important variable (metabolite) determinations, we incorporated a random forest conditional variable importance measure into our KPCA-based analytical approach to demonstrate the relative importance of metabolites. Using a market basket analysis, hippurate, the most important variable detected in the importance measure, was associated with high levels of some vitamins and minerals present in foods eaten the previous day, suggesting a relationship between increased hippurate and intake of a wide variety of vegetables and fruits. Therefore, the KPCA-incorporated analytical approach described herein enabled us to capture input-output responses, and should be useful not only for metabolic profiling but also for profiling in other areas of biological and environmental systems.