The original paper is in English. Non-English content has been machine-translated and may contain typographical errors or mistranslations. ex. Some numerals are expressed as "XNUMX".
Copyrights notice
The original paper is in English. Non-English content has been machine-translated and may contain typographical errors or mistranslations. Copyrights notice
Nous étudions le problème de recherche de similarité continue pour des requêtes évolutives qui a été récemment formulé. Étant donné un flux de données et une base de données composée de n ensembles d'éléments, le but de ce problème est de maintenir le top-k les ensembles les plus similaires à la requête qui évolue dans le temps et se compose des dernières W éléments du flux de données. Pour ce problème, l’algorithme exact précédent adopte une stratégie d’élagage qui, à l’heure actuelle T, décide des candidats du top-k ensembles les plus similaires à partir des valeurs de similarité passées et calcule les valeurs de similarité uniquement pour elles. Cet article propose un nouvel algorithme exact qui raccourcit le temps d'exécution en calculant les valeurs de similarité uniquement pour les ensembles dont les valeurs de similarité à T peut changer avec le temps T-1. Nous identifions ces ensembles très rapidement avec des listes inversées basées sur la fréquence (FIL). De plus, nous dérivons les valeurs de similarité à T in O(1) heure en mettant à jour les valeurs précédentes calculées à l'heure T-1. Expérimentalement, notre algorithme exact s'exécute plus rapidement que l'algorithme exact précédent d'un ordre de grandeur et aussi vite que l'algorithme d'approximation précédent.
Tomohiro YAMAZAKI
Engineering, the Univeristy of Electro-Communications
Hisashi KOGA
Engineering, the Univeristy of Electro-Communications
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Copier
Tomohiro YAMAZAKI, Hisashi KOGA, "Exact Algorithm to Solve Continuous Similarity Search for Evolving Queries and Its Variant" in IEICE TRANSACTIONS on Information,
vol. E105-D, no. 5, pp. 898-908, May 2022, doi: 10.1587/transinf.2021DAP0003.
Abstract: We study the continuous similarity search problem for evolving queries which has recently been formulated. Given a data stream and a database composed of n sets of items, the purpose of this problem is to maintain the top-k most similar sets to the query which evolves over time and consists of the latest W items in the data stream. For this problem, the previous exact algorithm adopts a pruning strategy which, at the present time T, decides the candidates of the top-k most similar sets from past similarity values and computes the similarity values only for them. This paper proposes a new exact algorithm which shortens the execution time by computing the similarity values only for sets whose similarity values at T can change from time T-1. We identify such sets very fast with frequency-based inverted lists (FIL). Moreover, we derive the similarity values at T in O(1) time by updating the previous values computed at time T-1. Experimentally, our exact algorithm runs faster than the previous exact algorithm by one order of magnitude and as fast as the previous approximation algorithm.
URL: https://global.ieice.org/en_transactions/information/10.1587/transinf.2021DAP0003/_p
Copier
@ARTICLE{e105-d_5_898,
author={Tomohiro YAMAZAKI, Hisashi KOGA, },
journal={IEICE TRANSACTIONS on Information},
title={Exact Algorithm to Solve Continuous Similarity Search for Evolving Queries and Its Variant},
year={2022},
volume={E105-D},
number={5},
pages={898-908},
abstract={We study the continuous similarity search problem for evolving queries which has recently been formulated. Given a data stream and a database composed of n sets of items, the purpose of this problem is to maintain the top-k most similar sets to the query which evolves over time and consists of the latest W items in the data stream. For this problem, the previous exact algorithm adopts a pruning strategy which, at the present time T, decides the candidates of the top-k most similar sets from past similarity values and computes the similarity values only for them. This paper proposes a new exact algorithm which shortens the execution time by computing the similarity values only for sets whose similarity values at T can change from time T-1. We identify such sets very fast with frequency-based inverted lists (FIL). Moreover, we derive the similarity values at T in O(1) time by updating the previous values computed at time T-1. Experimentally, our exact algorithm runs faster than the previous exact algorithm by one order of magnitude and as fast as the previous approximation algorithm.},
keywords={},
doi={10.1587/transinf.2021DAP0003},
ISSN={1745-1361},
month={May},}
Copier
TY - JOUR
TI - Exact Algorithm to Solve Continuous Similarity Search for Evolving Queries and Its Variant
T2 - IEICE TRANSACTIONS on Information
SP - 898
EP - 908
AU - Tomohiro YAMAZAKI
AU - Hisashi KOGA
PY - 2022
DO - 10.1587/transinf.2021DAP0003
JO - IEICE TRANSACTIONS on Information
SN - 1745-1361
VL - E105-D
IS - 5
JA - IEICE TRANSACTIONS on Information
Y1 - May 2022
AB - We study the continuous similarity search problem for evolving queries which has recently been formulated. Given a data stream and a database composed of n sets of items, the purpose of this problem is to maintain the top-k most similar sets to the query which evolves over time and consists of the latest W items in the data stream. For this problem, the previous exact algorithm adopts a pruning strategy which, at the present time T, decides the candidates of the top-k most similar sets from past similarity values and computes the similarity values only for them. This paper proposes a new exact algorithm which shortens the execution time by computing the similarity values only for sets whose similarity values at T can change from time T-1. We identify such sets very fast with frequency-based inverted lists (FIL). Moreover, we derive the similarity values at T in O(1) time by updating the previous values computed at time T-1. Experimentally, our exact algorithm runs faster than the previous exact algorithm by one order of magnitude and as fast as the previous approximation algorithm.
ER -