Using large corpora of chronologically ordered language, it is possible to explore diachronic phenomena, identifying previously unknown correlations between language usage and time periods, or epochs. We focused on a statistical approach to epoch delimitation and introduced the task of epoch characterization. We investigated the significant changes in the distribution of terms in the Google N-gram corpus and their relationships with emotion words. The results show that the method is reliable and the task is feasible.
Behind the Times: Detecting Epoch Changes using Large Corpora / Popescu, O.; Strapparava, C. - (2013), pp. 347-355. (Paper presented at the 6th International Joint Conference on Natural Language Processing, IJCNLP 2013, held in Nagoya, Japan, in 2013).
Behind the Times: Detecting Epoch Changes using Large Corpora
Popescu, O.; Strapparava, C.
2013-01-01