Collocation is a ubiquitous phenomenon in language, and accurate collocation recognition and extraction is important for many natural language processing tasks. Collocations range from simple bigram collocations to collocation frames, that is, distant multi-gram collocations. So far, collocation frames have received little attention. Oriented toward translation and parsing, this study aims to recognize and extract the longest possible collocation frames from given sentences. We first extract bigram collocations with a method based on distributional semantics, introducing collocation patterns and integrating several state-of-the-art association measures. From the bigram collocations extracted by the proposed method, we then obtain the longest collocation frames according to the recursive nature and the linguistic rules of collocations. Compared with the baseline systems, the proposed method performs significantly better in bigram collocation extraction in both precision and recall. In extracting collocation frames, the proposed method performs even better, with precision similar to that of its bigram collocation extraction results.
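As a rough illustration of the two stages described in the abstract, the Python sketch below scores bigram candidates with pointwise mutual information (PMI) and then merges bigram collocations that share a word into a longer frame. It is not the authors' method: the paper combines collocation patterns, distributional semantics, and several association measures that are not specified here. The function names `pmi_scores` and `longest_frames`, the `window` parameter, and the merge-by-shared-word rule are all assumptions made for this example.

```python
import math
from collections import Counter


def pmi_scores(sentences, window=3):
    """Score ordered word pairs that co-occur within `window` tokens by PMI.

    Assumption: PMI stands in for the association measures the paper uses.
    """
    word_counts = Counter()
    pair_counts = Counter()
    for tokens in sentences:
        word_counts.update(tokens)
        for i, left in enumerate(tokens):
            # Pair each word with the next `window` words to its right.
            for right in tokens[i + 1 : i + 1 + window]:
                pair_counts[(left, right)] += 1

    total_words = sum(word_counts.values())
    total_pairs = sum(pair_counts.values())
    scores = {}
    for (left, right), count in pair_counts.items():
        p_pair = count / total_pairs
        p_left = word_counts[left] / total_words
        p_right = word_counts[right] / total_words
        scores[(left, right)] = math.log2(p_pair / (p_left * p_right))
    return scores


def longest_frames(tokens, collocations, window=3):
    """Merge bigram collocations that share a token into longer frames.

    Assumption: a simple union-find over token positions replaces the
    recursive, rule-based merging described in the paper.
    """
    parent = list(range(len(tokens)))

    def find(i):
        # Follow parent links to the root, shortening the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(tokens)):
        for j in range(i + 1, min(i + 1 + window, len(tokens))):
            if (tokens[i], tokens[j]) in collocations:
                parent[find(j)] = find(i)

    # Group positions by root; positions are visited in sentence order.
    groups = {}
    for i in range(len(tokens)):
        groups.setdefault(find(i), []).append(i)
    return [[tokens[i] for i in idxs] for idxs in groups.values() if len(idxs) > 1]
```

In practice, one would keep only pairs whose score exceeds a chosen threshold and pass that set to `longest_frames`. For the sentence "she made a quick decision today" with the collocations ("made", "decision") and ("quick", "decision"), the shared word "decision" links both pairs into a single three-word frame.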
Xiaoxia LIU
Dalian University of Technology
Degen HUANG
Dalian University of Technology
Zhangzhi YIN
Dalian University of Technology
Fuji REN
Tokushima University
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Xiaoxia LIU, Degen HUANG, Zhangzhi YIN, Fuji REN, "Recognition of Collocation Frames from Sentences" in IEICE TRANSACTIONS on Information, vol. E102-D, no. 3, pp. 620-627, March 2019, doi: 10.1587/transinf.2018EDP7255.
Abstract: Collocation is a ubiquitous phenomenon in languages and accurate collocation recognition and extraction is of great significance to many natural language processing tasks. Collocations can be differentiated from simple bigram collocations to collocation frames (referring to distant multi-gram collocations). So far little focus is put on collocation frames. Oriented to translation and parsing, this study aims to recognize and extract the longest possible collocation frames from given sentences. We first extract bigram collocations with distributional semantics based method by introducing collocation patterns and integrating some state-of-the-art association measures. Based on bigram collocations extracted by the proposed method, we get the longest collocation frames according to recursive nature and linguistic rules of collocations. Compared with the baseline systems, the proposed method performs significantly better in bigram collocation extraction both in precision and recall. And in extracting collocation frames, the proposed method performs even better with the precision similar to its bigram collocation extraction results.
URL: https://global.ieice.org/en_transactions/information/10.1587/transinf.2018EDP7255/_p
@ARTICLE{e102-d_3_620,
author={Xiaoxia LIU and Degen HUANG and Zhangzhi YIN and Fuji REN},
journal={IEICE TRANSACTIONS on Information},
title={Recognition of Collocation Frames from Sentences},
year={2019},
volume={E102-D},
number={3},
pages={620-627},
abstract={Collocation is a ubiquitous phenomenon in languages and accurate collocation recognition and extraction is of great significance to many natural language processing tasks. Collocations can be differentiated from simple bigram collocations to collocation frames (referring to distant multi-gram collocations). So far little focus is put on collocation frames. Oriented to translation and parsing, this study aims to recognize and extract the longest possible collocation frames from given sentences. We first extract bigram collocations with distributional semantics based method by introducing collocation patterns and integrating some state-of-the-art association measures. Based on bigram collocations extracted by the proposed method, we get the longest collocation frames according to recursive nature and linguistic rules of collocations. Compared with the baseline systems, the proposed method performs significantly better in bigram collocation extraction both in precision and recall. And in extracting collocation frames, the proposed method performs even better with the precision similar to its bigram collocation extraction results.},
keywords={},
doi={10.1587/transinf.2018EDP7255},
ISSN={1745-1361},
month={March},}
TY - JOUR
TI - Recognition of Collocation Frames from Sentences
T2 - IEICE TRANSACTIONS on Information
SP - 620
EP - 627
AU - Xiaoxia LIU
AU - Degen HUANG
AU - Zhangzhi YIN
AU - Fuji REN
PY - 2019
DO - 10.1587/transinf.2018EDP7255
JO - IEICE TRANSACTIONS on Information
SN - 1745-1361
VL - E102-D
IS - 3
JA - IEICE TRANSACTIONS on Information
Y1 - March 2019
AB - Collocation is a ubiquitous phenomenon in languages and accurate collocation recognition and extraction is of great significance to many natural language processing tasks. Collocations can be differentiated from simple bigram collocations to collocation frames (referring to distant multi-gram collocations). So far little focus is put on collocation frames. Oriented to translation and parsing, this study aims to recognize and extract the longest possible collocation frames from given sentences. We first extract bigram collocations with distributional semantics based method by introducing collocation patterns and integrating some state-of-the-art association measures. Based on bigram collocations extracted by the proposed method, we get the longest collocation frames according to recursive nature and linguistic rules of collocations. Compared with the baseline systems, the proposed method performs significantly better in bigram collocation extraction both in precision and recall. And in extracting collocation frames, the proposed method performs even better with the precision similar to its bigram collocation extraction results.
ER -