From GNNs to Sparse Transformers: Graph-based architectures for Multi-hop Question Answering

Acton, Shane and Buys, Jan (2022). From GNNs to Sparse Transformers: Graph-based Architectures for Multi-hop Question Answering. In: Proceedings of the Third Southern African Conference for AI Research (SACAIR 2022), December 2022, Stellenbosch, South Africa. Artificial Intelligence Research, Communications in Computer and Information Science, vol. 1734, pp. 154-168. Springer, Cham.

Abstract

Sparse Transformers have surpassed Graph Neural Networks (GNNs) as the state-of-the-art architecture for multi-hop question answering (MHQA). Noting that the Transformer is a particular message-passing GNN, in this paper we perform an architectural analysis and evaluation to investigate why the Transformer outperforms other GNNs on MHQA. We simplify existing GNN-based MHQA models and use the resulting system to compare GNN architectures in a lower-compute setting than token-level models. Our results support the superiority of the Transformer architecture as a GNN in MHQA. We also investigate the role of graph sparsity, graph structure, and edge features in our GNNs. We find that task-specific graph structuring rules outperform the random connections used in Sparse Transformers. We also show that utilising edge type information alleviates performance losses introduced by sparsity.

Item Type: Conference paper
Additional Information: The Version of Record is available online at: http://dx.doi.org/10.1007/978-3-031-22321-1_11
Subjects: Information systems > Information retrieval > Retrieval tasks and goals > Question answering
Computing methodologies > Artificial intelligence > Natural language processing
Date Deposited: 09 Nov 2023 10:08
Last Modified: 09 Nov 2023 10:08
URI: https://pubs.cs.uct.ac.za/id/eprint/1633
