
Article Information

  • Title: Multi-armed Bandit Online Learning Based on POMDP in Cognitive Radio
  • Authors: Juan Zhang ; Hesong-Jiang ; Hong Jiang
  • Journal: International Journal of Smart Home
  • Print ISSN: 1975-4094
  • Year of publication: 2014
  • Volume: 8
  • Issue: 3
  • Pages: 151-162
  • DOI: 10.14257/ijsh.2014.8.3.14
  • Publisher: SERSC
  • Abstract: In cognitive radio, most existing research on spectrum sharing has two weaknesses. First, the problem is largely formulated as a Markov decision process (MDP), which requires complete knowledge of the channel. Second, most studies perform online learning based on the perceived channel. To address these problems, this paper proposes a new algorithm: if the authorized user is present in the current channel, the secondary user sends conservatively at a low rate; otherwise it sends aggressively. When the secondary user sends conservatively, the state of the channel is not directly observable, so the problem becomes a Partially Observable Markov Decision Process (POMDP). We first establish the optimal threshold when the channel is known, then consider optimal transmission when the channel is unknown and model it as a multi-armed bandit. We obtain the optimal K-conservative policy through the UCB algorithm and improve the convergence speed with the UCB-TUNED algorithm. Simulation and analysis results show that multi-armed bandit online learning under a channel that is not fully known yields the same K-conservative policy as the optimal threshold policy under a known channel. The UCB-TUNED algorithm also improves the convergence speed.
  • Keywords: spectrum sharing; multi-armed bandit; online learning; Partially Observable Markov Decision Process
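
The abstract describes selecting a K-conservative policy with the UCB algorithm. The following is a minimal sketch of the standard UCB1 index rule, not the authors' implementation. It assumes each arm k stands for one candidate "k-conservative" policy and that a pull returns a reward in [0, 1]. The function name `ucb1`, the `success_prob` values, and the Bernoulli reward model are all hypothetical, and the UCB-TUNED variant mentioned in the abstract is not shown.

```python
import math
import random


def ucb1(pull, n_arms, horizon):
    """Run UCB1 for `horizon` rounds; `pull(arm)` must return a reward in [0, 1]."""
    counts = [0] * n_arms   # number of times each arm has been played
    means = [0.0] * n_arms  # running mean reward of each arm
    for t in range(1, horizon + 1):
        if t <= n_arms:
            arm = t - 1     # play every arm once to initialise the estimates
        else:
            # Choose the arm with the largest mean plus exploration bonus.
            arm = max(
                range(n_arms),
                key=lambda a: means[a] + math.sqrt(2.0 * math.log(t) / counts[a]),
            )
        reward = pull(arm)
        counts[arm] += 1
        means[arm] += (reward - means[arm]) / counts[arm]  # incremental mean
    return counts, means


# Hypothetical setup: arm k is a candidate policy whose transmissions succeed
# with a made-up probability; these numbers are not taken from the paper.
rng = random.Random(0)
success_prob = [0.2, 0.5, 0.8, 0.4]
counts, means = ucb1(
    lambda k: 1.0 if rng.random() < success_prob[k] else 0.0,
    n_arms=len(success_prob),
    horizon=5000,
)
best_k = max(range(len(counts)), key=counts.__getitem__)
```

Under these assumptions, `best_k` is the index of the most frequently played arm, which serves as the learned policy choice. Replacing the exploration bonus with a variance-aware term would give a UCB-TUNED style rule.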