
Article Information

  • Title: A Comparative Study on the Influence of Undersampling and Oversampling Techniques for the Classification of Physical Activities Using an Imbalanced Accelerometer Dataset
  • Authors: Dong-Hwa Jeong; Se-Eun Kim; Woo-Hyeok Choi
  • Journal: Healthcare
  • Electronic ISSN: 2227-9032
  • Publication year: 2022
  • Volume: 10
  • Issue: 7
  • DOI: 10.3390/healthcare10071255
  • Language: English
  • Publisher: MDPI Publishing
  • Abstract: Accelerometer data collected from wearable devices have recently been used to monitor physical activities (PAs) in daily life. While the intensity of PAs can be distinguished with a cut-off approach, it is important to discriminate different behaviors with similar accelerometry patterns to estimate energy expenditure. We aim to overcome the data imbalance problem that negatively affects machine learning-based PA classification by extracting well-defined features and applying undersampling and oversampling methods. We extracted various temporal, spectral, and nonlinear features from wrist-, hip-, and ankle-worn accelerometer data. Then, the influences of undersampling and oversampling were compared using various machine learning (ML) and deep learning (DL) approaches. Among these models, ensemble methods including random forest (RF) and adaptive boosting (AdaBoost) performed well in differentiating sedentary behavior (driving) and three walking types (walking on level ground, ascending stairs, and descending stairs), even in a cross-subject paradigm. The undersampling approach, which has a low computational cost, produced classification results that were not biased toward the majority class. In addition, we found that RF could automatically select relevant features for PA classification depending on the sensor location by examining the importance of each node in multiple decision trees (DTs). This study proposes that ensemble learning using well-defined feature sets combined with the undersampling approach is robust for imbalanced datasets in PA classification. This approach will be useful for PA classification in free-living situations, where data imbalance between classes is common.
  • Keywords: physical activity; accelerometer; ensemble method; random forest; bootstrap aggregating (bagging); adaptive boosting; undersampling; oversampling
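
The undersampling idea mentioned in the abstract can be illustrated with a minimal sketch. It assumes plain random undersampling, in which each class is reduced to the size of the rarest class; the abstract does not state which undersampling variant the authors used. The function name `random_undersample`, the activity labels, and the toy feature values are hypothetical and chosen only for illustration.

```python
import random
from collections import Counter, defaultdict


def random_undersample(samples, labels, seed=0):
    """Randomly drop samples so that every class keeps as many as the rarest class."""
    rng = random.Random(seed)

    # Group sample indices by class label.
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    # Every class is cut down to the size of the smallest class.
    target = min(len(indices) for indices in by_class.values())

    kept = []
    for indices in by_class.values():
        kept.extend(rng.sample(indices, target))
    kept.sort()  # keep the original sample order

    return [samples[i] for i in kept], [labels[i] for i in kept]


# Hypothetical imbalanced toy data: one feature value per sample.
labels = ["walking"] * 8 + ["driving"] * 3 + ["stairs_up"] * 2
samples = [[float(i)] for i in range(len(labels))]

balanced_samples, balanced_labels = random_undersample(samples, labels)
print(Counter(balanced_labels))
```

After this step each class contributes the same number of samples, so a classifier trained on the balanced set has no class-frequency incentive to favor the majority class. The trade-off is that the majority-class samples that were dropped are never seen during training.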