Relational Learning with Graph Neural Networks: Interpretability and handling incomplete data

Ferrini, Francesco

Graph-structured data provide a natural representation for many real-world domains, including relational databases, knowledge graphs, and complex information systems. Learning from such data requires models that can effectively exploit relational structure while remaining robust, interpretable, and reliable under realistic data imperfections. Although Graph Neural Networks (GNNs) have emerged as a powerful paradigm for graph-based learning, their application to complex relational settings raises fundamental challenges related to scalability, interpretability, and robustness to incomplete information. This thesis addresses these challenges through two complementary lines of research.
The first part focuses on learning from heterogeneous graphs. It introduces a principled framework for learning informative meta-paths directly from data, enabling GNNs to reason explicitly over semantically meaningful relational patterns. Building on this idea, the thesis further proposes a self-explainable heterogeneous GNN that extends meta-path reasoning by incorporating learnable statistics over relational paths, yielding faithful explanations that quantify both which relational structures are relevant and how they influence predictions.
The second part of the thesis studies learning under missing node features, a pervasive characteristic of real-world relational data that has received limited principled treatment in the graph learning literature. Rather than directly addressing this problem in fully multi-relational settings, the thesis deliberately considers the simpler yet non-trivial case of homogeneous graphs as a foundational step. This controlled setting allows for a systematic analysis of how missing features interact with message passing, existing GNN architectures, and common evaluation practices. The thesis revisits standard benchmarks and missingness assumptions, showing that they often fail to reflect realistic scenarios, and introduces improved evaluation protocols based on structured and feature-dependent missingness mechanisms. In addition, it proposes a simple yet effective modeling strategy that explicitly exposes missing information within the message passing process. Extensive empirical results demonstrate improved robustness and more stable training dynamics across diverse missingness patterns and train–test distribution shifts.
Overall, this thesis advances graph learning by combining relational modeling, interpretability, and a principled analysis of robustness to incomplete data. By jointly addressing expressive relational reasoning and controlled investigations of missing-feature learning, the proposed contributions lay the groundwork for more transparent and reliable graph neural networks in real-world relational domains.

Relational Learning with Graph Neural Networks: Interpretability and handling incomplete data / Ferrini, Francesco. - (2026 Apr 27).