CIC-FBK Approach to Native Language Identification


We present the CIC-FBK system, which took part in the Native Language Identification (NLI) Shared Task 2017. Our approach combines features commonly used in previous NLI research (word n-grams, lemma n-grams, part-of-speech n-grams, and function words) with the recently introduced character n-grams from misspelled words and with features that are novel in this task, such as typed character n-grams and syntactic n-grams of words and of syntactic relation tags. We use a log-entropy weighting scheme and perform classification using the Support Vector Machines (SVM) algorithm. Our system achieved a macro-averaged F1-score of 0.8808 and shared first rank in the NLI Shared Task 2017 scoring.
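The abstract names log-entropy weighting without giving a formula. As a minimal sketch, assuming the standard formulation (local weight log(1 + tf), global weight 1 + Σ p·log p / log N, with p = tf / gf), the weighting could look like the following. The function names `char_ngrams` and `log_entropy_weights` are hypothetical, and the variant actually used by the authors may differ.

```python
import math
from collections import Counter


def char_ngrams(text, n=3):
    """Return all overlapping character n-grams of a string (illustrative helper)."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def log_entropy_weights(docs_features):
    """Weight features with the standard log-entropy scheme (an assumption here).

    docs_features: list of documents, each a list of feature strings.
    Returns one dict per document mapping feature -> weight.
    """
    n_docs = len(docs_features)
    counts = [Counter(features) for features in docs_features]

    # Global frequency of each feature over the whole collection.
    global_freq = Counter()
    for doc_counts in counts:
        global_freq.update(doc_counts)

    # Sum of p * log(p) per feature, where p = tf / global frequency.
    entropy_sum = {feature: 0.0 for feature in global_freq}
    for doc_counts in counts:
        for feature, tf in doc_counts.items():
            p = tf / global_freq[feature]
            entropy_sum[feature] += p * math.log(p)

    # Global weight: 1 for a feature concentrated in one document,
    # 0 for a feature spread evenly over all documents.
    log_n = math.log(n_docs) if n_docs > 1 else 1.0
    global_weight = {f: 1.0 + s / log_n for f, s in entropy_sum.items()}

    # Final weight: log-scaled local frequency times the global weight.
    return [
        {f: math.log(1 + tf) * global_weight[f] for f, tf in doc_counts.items()}
        for doc_counts in counts
    ]


docs = [char_ngrams("the cat"), char_ngrams("the dog")]
weights = log_entropy_weights(docs)
```

Under this scheme, features that occur evenly in every document (here the trigrams shared by both strings) receive weight zero, while features specific to one document keep their log-scaled frequency. The resulting vectors would then be passed to an SVM classifier, which this sketch does not cover.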

CIC-FBK Approach to Native Language Identification / Markov, I., Chen, L., Strapparava, C., Sidorov, G. - (2017), pp. 374-381. (12th Workshop on Innovative Use of NLP for Building Educational Applications, Copenhagen, Denmark, September) [10.18653/v1/W17-5042].


Ilia Markov;Lingzhen Chen;Carlo Strapparava;Grigori Sidorov

2017-01-01


Date of publication: 2017
Proceedings title: Proceedings of the 12th Workshop on Innovative Use of NLP for Building Educational Applications
Place of publication: USA
Publisher: Association for Computational Linguistics
ISBN: 978-1-945626-85-2
Scopus identifier: 2-s2.0-85096916226
All authors: Markov, Ilia; Chen, Lingzhen; Strapparava, Carlo; Sidorov, Grigori
Type: 04.1 Paper in Proceedings


Use this identifier to cite or link to this document: https://hdl.handle.net/11572/343173
