Acton, Shane and Buys, Jan (2022). From GNNs to Sparse Transformers: Graph-based Architectures for Multi-hop Question Answering. In Proceedings of the Third Southern African Conference for AI Research (SACAIR 2022), December 2022, Stellenbosch, South Africa. Artificial Intelligence Research, Communications in Computer and Information Science, vol. 1734, pp. 154-168. Springer, Cham.
Abstract
Sparse Transformers have surpassed Graph Neural Networks (GNNs) as the state-of-the-art architecture for multi-hop question answering (MHQA). Noting that the Transformer is a particular message-passing GNN, in this paper we perform an architectural analysis and evaluation to investigate why the Transformer outperforms other GNNs on MHQA. We simplify existing GNN-based MHQA models and leverage this system to compare GNN architectures in a lower-compute setting than token-level models. Our results support the superiority of the Transformer architecture as a GNN in MHQA. We also investigate the role of graph sparsity, graph structure, and edge features in our GNNs. We find that task-specific graph structuring rules outperform the random connections used in Sparse Transformers. We also show that utilising edge type information alleviates the performance losses introduced by sparsity.
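The abstract's central observation, that self-attention is a particular form of message passing on a graph, can be made concrete in a few lines. The sketch below is illustrative only (it is not code from the paper; the function name, shapes, and mask choices are assumptions): tokens are nodes, a boolean adjacency matrix plays the role of the attention mask, messages are value vectors, and edge weights come from softmax-normalised query-key scores. A fully connected adjacency recovers vanilla self-attention; a sparser adjacency gives a Sparse-Transformer-style layer.

```python
# Illustrative sketch (not the authors' code): single-head self-attention
# viewed as message passing on a graph. Nodes are tokens, the boolean
# adjacency matrix is the attention mask, and each node aggregates
# value-vector "messages" from its neighbours, weighted by attention.
import torch
import torch.nn.functional as F

def attention_message_passing(h, Wq, Wk, Wv, adj):
    """h: (n, d) node states; adj: (n, n) boolean adjacency over nodes."""
    q, k, v = h @ Wq, h @ Wk, h @ Wv                  # per-node projections
    scores = (q @ k.T) / (k.shape[-1] ** 0.5)         # pairwise compatibility
    scores = scores.masked_fill(~adj, float("-inf"))  # drop non-edges
    alpha = F.softmax(scores, dim=-1)                 # normalised edge weights
    return alpha @ v                                  # aggregate neighbour messages

n, d = 6, 8
h = torch.randn(n, d)
Wq, Wk, Wv = (torch.randn(d, d) for _ in range(3))

dense = torch.ones(n, n, dtype=torch.bool)            # fully connected graph:
out_dense = attention_message_passing(h, Wq, Wk, Wv, dense)  # vanilla attention

idx = torch.arange(n)                                 # banded local-window graph,
sparse = (idx[:, None] - idx[None, :]).abs() <= 1     # a Sparse-Transformer-style mask
out_sparse = attention_message_passing(h, Wq, Wk, Wv, sparse)
print(out_dense.shape, out_sparse.shape)              # both (n, d)
```

On this view, the paper's findings map directly onto the mask and the scores: task-specific graph structuring rules correspond to choosing `adj` from the task rather than randomly, and incorporating edge type information would amount to adding per-edge-type parameters to the score or message computation.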
| Item Type | Conference paper |
|---|---|
| Additional Information | The Version of Record is available online at: http://dx.doi.org/10.1007/978-3-031-22321-1_11 |
| Subjects | Information systems > Information retrieval > Retrieval tasks and goals > Question answering; Computing methodologies > Artificial intelligence > Natural language processing |
| Date Deposited | 09 Nov 2023 10:08 |
| Last Modified | 09 Nov 2023 10:08 |
| URI | https://pubs.cs.uct.ac.za/id/eprint/1633 |