The original paper is in English. Non-English content has been machine-translated and may contain typographical errors or mistranslations. ex. Some numerals are expressed as "XNUMX".
Copyrights notice
In this study, a contextual multi-armed bandit (CMAB)-based decentralized channel exploration framework disentangling a channel utility function (i.e., reward) with respect to contending neighboring access points (APs) is proposed. The proposed framework enables APs to evaluate observed rewards compositionally for contending APs, allowing both robustness against reward fluctuation due to neighboring APs' varying channels and assessment of even unexplored channels. To realize this framework, we propose contention-driven feature extraction (CDFE), which extracts the adjacency relation among APs under contention and forms the basis for expressing reward functions in disentangled form (that is, a linear combination of parameters associated with neighboring APs under contention). This allows the CMAB to be leveraged with a joint linear upper confidence bound (JLinUCB) exploration and to delve into the effectiveness of the proposed framework. Moreover, we address the problem of non-convergence, namely the channel exploration cycle, by proposing a penalized JLinUCB (P-JLinUCB) based on the key idea of introducing a discount parameter to the reward for exploiting a different channel before and after the learning round. Numerical evaluations confirm that the proposed method allows APs to assess the channel quality robustly against reward fluctuations by CDFE and achieves better convergence properties by P-JLinUCB.
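The abstract above names the ingredients but gives no formulas, so the following is only a minimal sketch of how they could fit together, not the authors' implementation. It assumes a standard LinUCB-style learner with one parameter vector shared across channels, a feature vector made of a bias term plus one 0/1 indicator per neighboring AP that would contend on the candidate channel, nonnegative rewards, and a multiplicative discount applied to the score of any channel other than the current one. All names (contention_features, PenalizedLinUCB, alpha, discount) are hypothetical, and the paper may apply the discount differently.

```python
import math


def contention_features(neighbor_channels, channel):
    """Bias term plus one 0/1 indicator per neighbor that shares `channel`.

    Assumed feature design: an illustrative reading of "adjacency relation
    among APs under contention", not the paper's exact definition.
    """
    return [1.0] + [1.0 if ch == channel else 0.0 for ch in neighbor_channels]


class PenalizedLinUCB:
    """LinUCB with a shared parameter vector and a channel-switching discount."""

    def __init__(self, dim, alpha=1.0, discount=0.9):
        self.alpha = alpha        # exploration weight (assumed hyperparameter)
        self.discount = discount  # in (0, 1]; 1.0 disables the penalty
        # Inverse of the ridge-regularized design matrix A = I + sum(x x^T).
        self.a_inv = [[1.0 if i == j else 0.0 for j in range(dim)]
                      for i in range(dim)]
        self.b = [0.0] * dim      # running sum of reward * x

    def _matvec(self, x):
        return [sum(r * v for r, v in zip(row, x)) for row in self.a_inv]

    def score(self, x):
        """Upper confidence bound: estimated reward plus exploration bonus."""
        theta = self._matvec(self.b)
        a_inv_x = self._matvec(x)
        mean = sum(t * v for t, v in zip(theta, x))
        width = math.sqrt(max(sum(u * v for u, v in zip(a_inv_x, x)), 0.0))
        return mean + self.alpha * width

    def select(self, channels, neighbor_channels, current=None):
        """Pick the best-scoring channel, discounting any switch away from `current`."""
        best, best_score = None, -math.inf
        for ch in channels:
            s = self.score(contention_features(neighbor_channels, ch))
            if current is not None and ch != current:
                s *= self.discount  # assumes nonnegative scores
            if s > best_score:
                best, best_score = ch, s
        return best

    def update(self, x, reward):
        """Sherman-Morrison rank-one update of A^{-1}, then accumulate b."""
        u = self._matvec(x)
        denom = 1.0 + sum(ui * xi for ui, xi in zip(u, x))
        n = len(x)
        for i in range(n):
            for j in range(n):
                self.a_inv[i][j] -= u[i] * u[j] / denom
        for i in range(n):
            self.b[i] += reward * x[i]


if __name__ == "__main__":
    # One neighbor sits on channel 1; contention there halves the reward.
    learner = PenalizedLinUCB(dim=2, alpha=0.0, discount=0.9)
    for _ in range(50):
        learner.update(contention_features([1], 1), 0.5)
        learner.update(contention_features([1], 6), 1.0)
    print(learner.select([1, 6], [1], current=1))
```

Because the parameter vector is shared across channels in this sketch, a channel that has never been tried can still be scored from the neighbors expected to contend on it, which is the property the abstract attributes to the disentangled reward form. A smaller discount makes the AP more reluctant to leave its current channel.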
Kota YAMASHITA
Kyoto University
Shotaro KAMIYA
Sony Corporation
Koji YAMAMOTO
Kyoto University
Yusuke KODA
University of Oulu
Takayuki NISHIO
Tokyo Institute of Technology
Masahiro MORIKURA
Kyoto University
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Kota YAMASHITA, Shotaro KAMIYA, Koji YAMAMOTO, Yusuke KODA, Takayuki NISHIO, Masahiro MORIKURA, "Penalized and Decentralized Contextual Bandit Learning for WLAN Channel Allocation with Contention-Driven Feature Extraction" in IEICE TRANSACTIONS on Communications,
vol. E105-B, no. 10, pp. 1268-1279, October 2022, doi: 10.1587/transcom.2021EBP3197.
Abstract: In this study, a contextual multi-armed bandit (CMAB)-based decentralized channel exploration framework disentangling a channel utility function (i.e., reward) with respect to contending neighboring access points (APs) is proposed. The proposed framework enables APs to evaluate observed rewards compositionally for contending APs, allowing both robustness against reward fluctuation due to neighboring APs' varying channels and assessment of even unexplored channels. To realize this framework, we propose contention-driven feature extraction (CDFE), which extracts the adjacency relation among APs under contention and forms the basis for expressing reward functions in disentangled form (that is, a linear combination of parameters associated with neighboring APs under contention). This allows the CMAB to be leveraged with a joint linear upper confidence bound (JLinUCB) exploration and to delve into the effectiveness of the proposed framework. Moreover, we address the problem of non-convergence, namely the channel exploration cycle, by proposing a penalized JLinUCB (P-JLinUCB) based on the key idea of introducing a discount parameter to the reward for exploiting a different channel before and after the learning round. Numerical evaluations confirm that the proposed method allows APs to assess the channel quality robustly against reward fluctuations by CDFE and achieves better convergence properties by P-JLinUCB.
URL: https://global.ieice.org/en_transactions/communications/10.1587/transcom.2021EBP3197/_p
@ARTICLE{e105-b_10_1268,
author={Kota YAMASHITA and Shotaro KAMIYA and Koji YAMAMOTO and Yusuke KODA and Takayuki NISHIO and Masahiro MORIKURA},
journal={IEICE TRANSACTIONS on Communications},
title={Penalized and Decentralized Contextual Bandit Learning for WLAN Channel Allocation with Contention-Driven Feature Extraction},
year={2022},
volume={E105-B},
number={10},
pages={1268-1279},
abstract={In this study, a contextual multi-armed bandit (CMAB)-based decentralized channel exploration framework disentangling a channel utility function (i.e., reward) with respect to contending neighboring access points (APs) is proposed. The proposed framework enables APs to evaluate observed rewards compositionally for contending APs, allowing both robustness against reward fluctuation due to neighboring APs' varying channels and assessment of even unexplored channels. To realize this framework, we propose contention-driven feature extraction (CDFE), which extracts the adjacency relation among APs under contention and forms the basis for expressing reward functions in disentangled form (that is, a linear combination of parameters associated with neighboring APs under contention). This allows the CMAB to be leveraged with a joint linear upper confidence bound (JLinUCB) exploration and to delve into the effectiveness of the proposed framework. Moreover, we address the problem of non-convergence, namely the channel exploration cycle, by proposing a penalized JLinUCB (P-JLinUCB) based on the key idea of introducing a discount parameter to the reward for exploiting a different channel before and after the learning round. Numerical evaluations confirm that the proposed method allows APs to assess the channel quality robustly against reward fluctuations by CDFE and achieves better convergence properties by P-JLinUCB.},
keywords={},
doi={10.1587/transcom.2021EBP3197},
ISSN={1745-1345},
month={October},
}
TY - JOUR
TI - Penalized and Decentralized Contextual Bandit Learning for WLAN Channel Allocation with Contention-Driven Feature Extraction
T2 - IEICE TRANSACTIONS on Communications
SP - 1268
EP - 1279
AU - Kota YAMASHITA
AU - Shotaro KAMIYA
AU - Koji YAMAMOTO
AU - Yusuke KODA
AU - Takayuki NISHIO
AU - Masahiro MORIKURA
PY - 2022
DO - 10.1587/transcom.2021EBP3197
JO - IEICE TRANSACTIONS on Communications
SN - 1745-1345
VL - E105-B
IS - 10
JA - IEICE TRANSACTIONS on Communications
Y1 - October 2022
AB - In this study, a contextual multi-armed bandit (CMAB)-based decentralized channel exploration framework disentangling a channel utility function (i.e., reward) with respect to contending neighboring access points (APs) is proposed. The proposed framework enables APs to evaluate observed rewards compositionally for contending APs, allowing both robustness against reward fluctuation due to neighboring APs' varying channels and assessment of even unexplored channels. To realize this framework, we propose contention-driven feature extraction (CDFE), which extracts the adjacency relation among APs under contention and forms the basis for expressing reward functions in disentangled form (that is, a linear combination of parameters associated with neighboring APs under contention). This allows the CMAB to be leveraged with a joint linear upper confidence bound (JLinUCB) exploration and to delve into the effectiveness of the proposed framework. Moreover, we address the problem of non-convergence, namely the channel exploration cycle, by proposing a penalized JLinUCB (P-JLinUCB) based on the key idea of introducing a discount parameter to the reward for exploiting a different channel before and after the learning round. Numerical evaluations confirm that the proposed method allows APs to assess the channel quality robustly against reward fluctuations by CDFE and achieves better convergence properties by P-JLinUCB.
ER -