The original paper is in English. Non-English content has been machine-translated and may contain typographical errors or mistranslations. ex. Some numerals are expressed as "XNUMX".
Copyrights notice
The original paper is in English. Non-English content has been machine-translated and may contain typographical errors or mistranslations. Copyrights notice
Un cadre de transformation grammaticale sensible au contexte pour accélérer la correspondance de modèles compressés (CPM) est proposé. Un algorithme de compression glouton avec le modèle de transformation est présenté ainsi qu'un algorithme de correspondance de motifs compressé de type Knuth-Morris-Pratt (KMP). Le taux de compression correspond à gzip et Re-Pair, et la vitesse de recherche de notre algorithme CPM est presque deux fois plus rapide que l'algorithme CPM de type KMP sur Byte-Pair-Encoding de Shibata et al., et dans le cas de courts modèles, plus rapide que l'algorithme de Boyer-Moore-Horspool avec le codage stopper de Rautio et al., qui est considéré comme l'une des meilleures combinaisons permettant une recherche pratiquement rapide.
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Copier
Shirou MARUYAMA, Youhei TANAKA, Hiroshi SAKAMOTO, Masayuki TAKEDA, "Context-Sensitive Grammar Transform: Compression and Pattern Matching" in IEICE TRANSACTIONS on Information,
vol. E93-D, no. 2, pp. 219-226, February 2010, doi: 10.1587/transinf.E93.D.219.
Abstract: A framework of context-sensitive grammar transform for speeding-up compressed pattern matching (CPM) is proposed. A greedy compression algorithm with the transform model is presented as well as a Knuth-Morris-Pratt (KMP)-type compressed pattern matching algorithm. The compression ratio is a match for gzip and Re-Pair, and the search speed of our CPM algorithm is almost twice faster than the KMP-type CPM algorithm on Byte-Pair-Encoding by Shibata et al., and in the case of short patterns, faster than the Boyer-Moore-Horspool algorithm with the stopper encoding by Rautio et al., which is regarded as one of the best combinations that allows a practically fast search.
URL: https://global.ieice.org/en_transactions/information/10.1587/transinf.E93.D.219/_p
Copier
@ARTICLE{e93-d_2_219,
author={Shirou MARUYAMA, Youhei TANAKA, Hiroshi SAKAMOTO, Masayuki TAKEDA, },
journal={IEICE TRANSACTIONS on Information},
title={Context-Sensitive Grammar Transform: Compression and Pattern Matching},
year={2010},
volume={E93-D},
number={2},
pages={219-226},
abstract={A framework of context-sensitive grammar transform for speeding-up compressed pattern matching (CPM) is proposed. A greedy compression algorithm with the transform model is presented as well as a Knuth-Morris-Pratt (KMP)-type compressed pattern matching algorithm. The compression ratio is a match for gzip and Re-Pair, and the search speed of our CPM algorithm is almost twice faster than the KMP-type CPM algorithm on Byte-Pair-Encoding by Shibata et al., and in the case of short patterns, faster than the Boyer-Moore-Horspool algorithm with the stopper encoding by Rautio et al., which is regarded as one of the best combinations that allows a practically fast search.},
keywords={},
doi={10.1587/transinf.E93.D.219},
ISSN={1745-1361},
month={February},}
Copier
TY - JOUR
TI - Context-Sensitive Grammar Transform: Compression and Pattern Matching
T2 - IEICE TRANSACTIONS on Information
SP - 219
EP - 226
AU - Shirou MARUYAMA
AU - Youhei TANAKA
AU - Hiroshi SAKAMOTO
AU - Masayuki TAKEDA
PY - 2010
DO - 10.1587/transinf.E93.D.219
JO - IEICE TRANSACTIONS on Information
SN - 1745-1361
VL - E93-D
IS - 2
JA - IEICE TRANSACTIONS on Information
Y1 - February 2010
AB - A framework of context-sensitive grammar transform for speeding-up compressed pattern matching (CPM) is proposed. A greedy compression algorithm with the transform model is presented as well as a Knuth-Morris-Pratt (KMP)-type compressed pattern matching algorithm. The compression ratio is a match for gzip and Re-Pair, and the search speed of our CPM algorithm is almost twice faster than the KMP-type CPM algorithm on Byte-Pair-Encoding by Shibata et al., and in the case of short patterns, faster than the Boyer-Moore-Horspool algorithm with the stopper encoding by Rautio et al., which is regarded as one of the best combinations that allows a practically fast search.
ER -