The original paper is in English. Non-English content has been machine-translated and may contain typographical errors or mistranslations. ex. Some numerals are expressed as "XNUMX".
This paper focuses on the "data collection period" for training a better Just-In-Time (JIT) defect prediction model, that is, whether to train on early commit data or on recent commit data, and conducts a large-scale comparative study to explore an appropriate data collection period. Since there are many possible machine learning algorithms for training defect prediction models, the choice of algorithm can become a threat to validity. Hence, this study adopts the automatic machine learning method to mitigate selection bias in the comparative study. The empirical results on 122 open-source software projects show a trend in which a dataset composed of recent commits makes a better training set for JIT defect prediction models.
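To make the early-versus-recent comparison in the abstract concrete, the following minimal sketch shows one way a project's commit history could be cut into an early training window, a recent training window, and a held-out test set. It is an illustration under stated assumptions, not the authors' procedure: the abstract does not say how the windows are sized or where the test data comes from, so the `Commit` fields, the function name `split_by_period`, and the `train_size` and `test_size` parameters are all hypothetical. Model training, including the automatic machine learning step, is left out.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Commit:
    # Hypothetical record: the abstract does not specify the commit features used.
    timestamp: int        # commit time, e.g. Unix seconds
    is_defective: bool    # label: whether the commit was later found to introduce a defect


def split_by_period(
    commits: List[Commit], train_size: int, test_size: int
) -> Tuple[List[Commit], List[Commit], List[Commit]]:
    """Return (early_train, recent_train, test) from one project's history.

    Assumption: the newest `test_size` commits are held out for testing, and
    both training windows are taken from the remaining, older commits.
    """
    if train_size <= 0 or test_size <= 0:
        raise ValueError("train_size and test_size must be positive")
    ordered = sorted(commits, key=lambda c: c.timestamp)
    if len(ordered) < train_size + test_size:
        raise ValueError("not enough commits for the requested window sizes")

    test = ordered[-test_size:]        # newest commits, never used for training
    history = ordered[:-test_size]     # everything older than the test set
    early_train = history[:train_size]     # oldest commits in the history
    recent_train = history[-train_size:]   # commits closest in time to the test set
    return early_train, recent_train, test
```

With both windows the same size, a model trained on `early_train` and a model trained on `recent_train` can be evaluated on the same `test` commits, so any difference in accuracy is attributable to the collection period rather than to the amount of training data.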
Kosuke OHARA
Ehime University
Hirohisa AMAN
Ehime University
Sousuke AMASAKI
Okayama Prefectural University
Tomoyuki YOKOGAWA
Okayama Prefectural University
Minoru KAWAHARA
Ehime University
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Kosuke OHARA, Hirohisa AMAN, Sousuke AMASAKI, Tomoyuki YOKOGAWA, Minoru KAWAHARA, "A Comparative Study of Data Collection Periods for Just-In-Time Defect Prediction Using the Automatic Machine Learning Method" in IEICE TRANSACTIONS on Information, vol. E106-D, no. 2, pp. 166-169, February 2023, doi: 10.1587/transinf.2022MPL0002.
Abstract: This paper focuses on the “data collection period” for training a better Just-In-Time (JIT) defect prediction model — the early commit data vs. the recent one —, and conducts a large-scale comparative study to explore an appropriate data collection period. Since there are many possible machine learning algorithms for training defect prediction models, the selection of machine learning algorithms can become a threat to validity. Hence, this study adopts the automatic machine learning method to mitigate the selection bias in the comparative study. The empirical results using 122 open-source software projects prove the trend that the dataset composed of the recent commits would become a better training set for JIT defect prediction models.
URL: https://global.ieice.org/en_transactions/information/10.1587/transinf.2022MPL0002/_p
@ARTICLE{e106-d_2_166,
author={Kosuke OHARA and Hirohisa AMAN and Sousuke AMASAKI and Tomoyuki YOKOGAWA and Minoru KAWAHARA},
journal={IEICE TRANSACTIONS on Information},
title={A Comparative Study of Data Collection Periods for Just-In-Time Defect Prediction Using the Automatic Machine Learning Method},
year={2023},
volume={E106-D},
number={2},
pages={166-169},
abstract={This paper focuses on the “data collection period” for training a better Just-In-Time (JIT) defect prediction model — the early commit data vs. the recent one —, and conducts a large-scale comparative study to explore an appropriate data collection period. Since there are many possible machine learning algorithms for training defect prediction models, the selection of machine learning algorithms can become a threat to validity. Hence, this study adopts the automatic machine learning method to mitigate the selection bias in the comparative study. The empirical results using 122 open-source software projects prove the trend that the dataset composed of the recent commits would become a better training set for JIT defect prediction models.},
keywords={},
doi={10.1587/transinf.2022MPL0002},
ISSN={1745-1361},
month={February},}
TY - JOUR
TI - A Comparative Study of Data Collection Periods for Just-In-Time Defect Prediction Using the Automatic Machine Learning Method
T2 - IEICE TRANSACTIONS on Information
SP - 166
EP - 169
AU - Kosuke OHARA
AU - Hirohisa AMAN
AU - Sousuke AMASAKI
AU - Tomoyuki YOKOGAWA
AU - Minoru KAWAHARA
PY - 2023
DO - 10.1587/transinf.2022MPL0002
JO - IEICE TRANSACTIONS on Information
SN - 1745-1361
VL - E106-D
IS - 2
JA - IEICE TRANSACTIONS on Information
Y1 - February 2023
AB - This paper focuses on the “data collection period” for training a better Just-In-Time (JIT) defect prediction model — the early commit data vs. the recent one —, and conducts a large-scale comparative study to explore an appropriate data collection period. Since there are many possible machine learning algorithms for training defect prediction models, the selection of machine learning algorithms can become a threat to validity. Hence, this study adopts the automatic machine learning method to mitigate the selection bias in the comparative study. The empirical results using 122 open-source software projects prove the trend that the dataset composed of the recent commits would become a better training set for JIT defect prediction models.
ER -