Extracting conceptual structures from multiple sources / Barbu, Eduard. - (2010), pp. 1-181.
Extracting conceptual structures from multiple sources
Barbu, Eduard
2010-01-01
Abstract
This thesis extracts conceptual structures from multiple sources: WordNet, Web corpora and Wikipedia. The conceptual structures extracted from WordNet and Web corpora are inspired by the feature norm effort in cognitive psychology. The conceptual structure extracted from Wikipedia marks the transition between feature-norm structures and theory-like structures. The main contributions of this thesis can be grouped into two categories:

1. Novel methods for the extraction of conceptual structures. More precisely, we developed three new methods:
(a) Conceptual structure extraction from WordNet. We devise a procedure for property extraction from WordNet using the notion of semantic neighborhood. The procedure exploits the main relations organizing the nouns, the information in glosses and the principle of property inheritance.
(b) Feature-norm-like extraction from corpora. We propose a method to acquire feature-norm-like structures from corpora using weakly supervised methods.
(c) Conceptual structure from Wikipedia. We put forward a novel unsupervised method for extracting conceptual structures from the Wikipedia entries of similar concepts. The main idea is that similar concepts (i.e., those classified under the same node in a taxonomy) are described in a comparable way in Wikipedia. Moreover, to understand the kind of information extracted from Wikipedia, we annotate this knowledge with a set of property types.
2. Evaluation. We evaluate WordNet as a model of semantic memory and suggest the addition of new semantic relations. We also assess the properties extracted from all sources on a unified test set, in a clustering experiment.

File | Size | Format | |
---|---|---|---|
eduardThesis.pdf (open access; type: Doctoral Thesis; license: All rights reserved) | 2.13 MB | Adobe PDF | |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.