Named Entity Recognition for the Mongolian Language / Zoljargal, Munkhjargal; Bella, Gabor; Altangerel, Chagnaa; Giunchiglia, Fausto. - Electronic. - 9302:(2015), pp. 1-8. (Paper presented at the 18th International Conference on Text, Speech, and Dialogue, held in Pilsen, Czech Republic, in 2015) [Pavel Král, Václav Matoušek].
Named Entity Recognition for the Mongolian Language
Bella, Gabor;Giunchiglia, Fausto
2015-01-01
Abstract
This paper presents pioneering work on building a Named Entity Recognition (NER) system for the Mongolian language, which has an agglutinative morphology and a subject-object-verb word order. Our work explores the fittest feature set from a wide range of features, together with a method that refines the machine learning approach using gazetteers with approximate string matching, in an effort to handle out-of-vocabulary words robustly. We also applied various existing machine learning methods and searched for an optimal ensemble of classifiers using a genetic algorithm. The classifiers use different feature representations. The resulting system constitutes the first-ever usable software package for Mongolian NER, while our experimental evaluation will also serve as a much-needed basis of comparison for further research.
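As an illustration of the gazetteer idea mentioned in the abstract, the following is a minimal sketch of approximate string matching against a gazetteer. The abstract does not say which matching algorithm the authors use; this sketch substitutes the similarity ratio from Python's standard `difflib` module. The gazetteer entries, the Latin transliterations, the function name `gazetteer_lookup`, and the 0.8 threshold are all hypothetical choices made for the example.

```python
from difflib import SequenceMatcher

# Hypothetical toy gazetteer (Latin transliterations, for illustration only).
GAZETTEER = {
    "Ulaanbaatar": "LOCATION",
    "Darkhan": "LOCATION",
    "Erdenet": "LOCATION",
}


def gazetteer_lookup(token, gazetteer=GAZETTEER, threshold=0.8):
    """Return (entry, label, score) for the best fuzzy match, or None.

    The score is difflib's similarity ratio in [0, 1]; the threshold is an
    assumed value, not one taken from the paper.
    """
    best = None
    for entry, label in gazetteer.items():
        score = SequenceMatcher(None, token.lower(), entry.lower()).ratio()
        if score >= threshold and (best is None or score > best[2]):
            best = (entry, label, score)
    return best


# A suffixed surface form still matches its base entry approximately.
print(gazetteer_lookup("Ulaanbaatart"))
# An unrelated token falls below the threshold and yields None.
print(gazetteer_lookup("xyz"))
```

In an agglutinative language, a name often appears with suffixes attached, so exact dictionary lookup can miss it; a similarity threshold is one simple way to tolerate such variation. In a full system the match result would typically be passed to the classifier as a feature rather than used as a final decision.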
File | Size | Format
---|---|---
Named Entity Recognition.pdf (open access; type: refereed author's manuscript, post-print; license: all rights reserved) | 1.11 MB | Adobe PDF
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.