Accurate Sentence Matching with Hybrid Siamese Networks / Nicosia, Massimo; Moschitti, Alessandro. - Electronic. - (2017), pp. 2235-2238. [10.1145/3132847.3133156]
Accurate Sentence Matching with Hybrid Siamese Networks
Massimo Nicosia; Alessandro Moschitti
2017-01-01
Abstract
Recent neural network approaches to sentence matching compute the probability of two sentences being similar by minimizing a logistic loss. In this paper, we learn sentence representations by means of a Siamese network, which: (i) uses encoders that share parameters; and (ii) enables the comparison of two sentences in terms of their Euclidean distance, by minimizing a contrastive loss. Moreover, we add a multilayer perceptron to the architecture to simultaneously optimize the contrastive and the logistic losses. In this way, our network can exploit a more informative feedback signal, given by the logistic loss together with the Euclidean distance between the two sentence representations. We show that jointly minimizing the two losses yields higher accuracy than minimizing them independently. We verify this finding by evaluating several baseline architectures on two sentence matching tasks: question paraphrasing and textual entailment recognition. Our network approaches the state of the art, while being much simpler and faster to train, and having fewer parameters than its competitors.
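The joint objective described in the abstract can be illustrated with a minimal PyTorch sketch: a shared encoder produces one vector per sentence, a contrastive loss acts on their Euclidean distance, and an MLP over the pair of vectors is trained with a logistic loss; the two losses are combined into a single objective. The GRU encoder, the MLP input features, and the equal loss weighting below are illustrative assumptions, not the exact configuration used in the paper.

```python
# Hedged sketch of a hybrid Siamese objective: contrastive loss on the
# Euclidean distance between sentence embeddings + logistic loss from an MLP.
# Encoder choice, pair features and loss weighting are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class HybridSiamese(nn.Module):
    def __init__(self, vocab_size, emb_dim=100, hidden_dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        # One encoder, applied to both sentences (shared parameters).
        self.encoder = nn.GRU(emb_dim, hidden_dim, batch_first=True)
        # Classifier over the pair representation, feeding the logistic loss.
        self.mlp = nn.Sequential(
            nn.Linear(4 * hidden_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, 1),
        )

    def encode(self, tokens):
        _, h = self.encoder(self.embed(tokens))   # h: (1, batch, hidden_dim)
        return h.squeeze(0)

    def forward(self, s1, s2):
        u, v = self.encode(s1), self.encode(s2)
        dist = F.pairwise_distance(u, v)          # Euclidean distance
        pair = torch.cat([u, v, torch.abs(u - v), u * v], dim=-1)
        logit = self.mlp(pair).squeeze(-1)
        return dist, logit

def joint_loss(dist, logit, label, margin=1.0, alpha=0.5):
    """label = 1 for matching pairs, 0 otherwise."""
    label = label.float()
    # Contrastive loss: pull matching pairs together, push others past the margin.
    contrastive = label * dist.pow(2) + (1 - label) * F.relu(margin - dist).pow(2)
    # Logistic (binary cross-entropy) loss on the MLP output.
    logistic = F.binary_cross_entropy_with_logits(logit, label)
    return alpha * contrastive.mean() + (1 - alpha) * logistic
```

Jointly backpropagating `joint_loss` through both branches is what lets the distance-based and probability-based signals shape the same sentence representations, which is the effect the paper evaluates against minimizing each loss independently.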
File | Access | Type | License | Size | Format
---|---|---|---|---|---
2017_CIKM_Moschitti_Siamese.pdf | Open access | Refereed author's manuscript (post-print) | All rights reserved | 537.24 kB | Adobe PDF
3132847.3133156.pdf | Repository managers only | Publisher's layout | All rights reserved | 992.5 kB | Adobe PDF