The original paper is in English. Non-English content has been machine-translated and may contain typographical errors or mistranslations; for example, some numerals are rendered as "XNUMX".
In this paper, we present a new dependency parsing method for languages that have a very small annotated corpus and for which methods of segmentation and morphological analysis producing a unique (automatically disambiguated) result are very unreliable. Our method works on a morphosyntactic lattice factorizing all possible segmentation and part-of-speech tagging results. The quality of the input to syntactic analysis is hence much better than that of an unreliable unique sequence of lemmatized and tagged words. We propose an adaptation of Eisner's algorithm for finding the k-best dependency trees in a morphosyntactic lattice structure encoding multiple results of morphosyntactic analysis. Moreover, we present how to use Dependency Insertion Grammar to adjust the scores and filter out invalid trees, the use of a language model to rescore the parse trees, and the k-best extension of our parsing model. The highest parsing accuracy reported in this paper is 74.32%, which represents a 6.31% improvement compared to the model taking its input from the unreliable morphosyntactic analysis tools.
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Sutee SUDPRASERT, Asanee KAWTRAKUL, Christian BOITET, Vincent BERMENT, "Dependency Parsing with Lattice Structures for Resource-Poor Languages" in IEICE TRANSACTIONS on Information,
vol. E92-D, no. 10, pp. 2122-2136, October 2009, doi: 10.1587/transinf.E92.D.2122.
Abstract: In this paper, we present a new dependency parsing method for languages which have very small annotated corpus and for which methods of segmentation and morphological analysis producing a unique (automatically disambiguated) result are very unreliable. Our method works on a morphosyntactic lattice factorizing all possible segmentation and part-of-speech tagging results. The quality of the input to syntactic analysis is hence much better than that of an unreliable unique sequence of lemmatized and tagged words. We propose an adaptation of Eisner's algorithm for finding the k-best dependency trees in a morphosyntactic lattice structure encoding multiple results of morphosyntactic analysis. Moreover, we present how to use Dependency Insertion Grammar in order to adjust the scores and filter out invalid trees, the use of language model to rescore the parse trees and the k-best extension of our parsing model. The highest parsing accuracy reported in this paper is 74.32% which represents a 6.31% improvement compared to the model taking the input from the unreliable morphosyntactic analysis tools.
URL: https://global.ieice.org/en_transactions/information/10.1587/transinf.E92.D.2122/_p
@ARTICLE{e92-d_10_2122,
author={Sutee SUDPRASERT and Asanee KAWTRAKUL and Christian BOITET and Vincent BERMENT},
journal={IEICE TRANSACTIONS on Information},
title={Dependency Parsing with Lattice Structures for Resource-Poor Languages},
year={2009},
volume={E92-D},
number={10},
pages={2122-2136},
abstract={In this paper, we present a new dependency parsing method for languages which have very small annotated corpus and for which methods of segmentation and morphological analysis producing a unique (automatically disambiguated) result are very unreliable. Our method works on a morphosyntactic lattice factorizing all possible segmentation and part-of-speech tagging results. The quality of the input to syntactic analysis is hence much better than that of an unreliable unique sequence of lemmatized and tagged words. We propose an adaptation of Eisner's algorithm for finding the k-best dependency trees in a morphosyntactic lattice structure encoding multiple results of morphosyntactic analysis. Moreover, we present how to use Dependency Insertion Grammar in order to adjust the scores and filter out invalid trees, the use of language model to rescore the parse trees and the k-best extension of our parsing model. The highest parsing accuracy reported in this paper is 74.32% which represents a 6.31% improvement compared to the model taking the input from the unreliable morphosyntactic analysis tools.},
keywords={},
doi={10.1587/transinf.E92.D.2122},
ISSN={1745-1361},
month={October},}
TY - JOUR
TI - Dependency Parsing with Lattice Structures for Resource-Poor Languages
T2 - IEICE TRANSACTIONS on Information
SP - 2122
EP - 2136
AU - Sutee SUDPRASERT
AU - Asanee KAWTRAKUL
AU - Christian BOITET
AU - Vincent BERMENT
PY - 2009
DO - 10.1587/transinf.E92.D.2122
JO - IEICE TRANSACTIONS on Information
SN - 1745-1361
VL - E92-D
IS - 10
JA - IEICE TRANSACTIONS on Information
Y1 - October 2009
AB - In this paper, we present a new dependency parsing method for languages which have very small annotated corpus and for which methods of segmentation and morphological analysis producing a unique (automatically disambiguated) result are very unreliable. Our method works on a morphosyntactic lattice factorizing all possible segmentation and part-of-speech tagging results. The quality of the input to syntactic analysis is hence much better than that of an unreliable unique sequence of lemmatized and tagged words. We propose an adaptation of Eisner's algorithm for finding the k-best dependency trees in a morphosyntactic lattice structure encoding multiple results of morphosyntactic analysis. Moreover, we present how to use Dependency Insertion Grammar in order to adjust the scores and filter out invalid trees, the use of language model to rescore the parse trees and the k-best extension of our parsing model. The highest parsing accuracy reported in this paper is 74.32% which represents a 6.31% improvement compared to the model taking the input from the unreliable morphosyntactic analysis tools.
ER -