To apply opinion mining (OM) successfully to the large amounts of user-generated content produced every day, we need robust models that handle noisy input well yet can easily be adapted to a new domain or language. Here we focus on opinion mining for YouTube by (i) modeling classifiers that predict the type of a comment and its polarity, while distinguishing whether the polarity is directed towards the product or the video; (ii) proposing a robust shallow syntactic structure (STRUCT) that adapts well when tested across domains; and (iii) evaluating the effectiveness of the proposed structure on two languages, English and Italian. We rely on tree kernels to automatically extract and learn features with better generalization power than traditionally used bag-of-words models. Our extensive empirical evaluation shows that (i) STRUCT outperforms the bag-of-words model within the same domain (absolute improvements of up to 2.6% and 3% for Italian and English, respectively); (ii) it is particularly useful when tested across domains (more than 4% absolute improvement for both languages), especially when little training data is available (up to 10% absolute improvement); and (iii) the proposed structure is also effective in a lower-resource language scenario, where only less accurate linguistic processing tools are available. © 2015 Elsevier Ltd. All rights reserved.

Multilingual opinion mining on YouTube / Severyn, Aliaksei; Moschitti, Alessandro; Uryupina, Olga; Plank, Barbara; Filippova, Katja. - In: INFORMATION PROCESSING & MANAGEMENT. - ISSN 0306-4573. - PRINT. - 52:1(2016), pp. 46-60. [10.1016/j.ipm.2015.03.002]

Multilingual opinion mining on YouTube

Aliaksei Severyn; Alessandro Moschitti; Olga Uryupina; Barbara Plank; Katja Filippova
2016-01-01

Files in this item:

Multilingual opinion mining on YouTube.pdf
Access: open access
Type: Non-refereed preprint
License: All rights reserved
Size: 640.69 kB
Format: Adobe PDF

1-s2.0-S0306457315000400-main.pdf
Access: archive managers only
Type: Publisher's layout
License: All rights reserved
Size: 1.69 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11572/101799