Link to paper The full paper is available here.
You can also find the paper on PapersWithCode here.
Abstract GNNs have potential in graph representation learning Standard GNNs have two major limitations ViT/MLP-Mixer architectures can solve these limitations but increase computational cost Graph MLP-Mixer captures long-range dependency and mitigates over-squashing Graph MLP-Mixer is faster and more memory efficient than related models Graph MLP-Mixer is highly expressive and can distinguish non-isomorphic graphs Paper Content Generalizing vit/ml-mixer to graphs overcome mp-gnn limitations MLP-Mixer architecture is designed to capture long-range interaction while keeping low computational cost Generalizing MLP-Mixer from grids and sequences to arbitrary graph topology is challenging Main contribution is to design a novel GNN architecture that captures long-range interaction, keeps low computational complexity, and is isomorphically expressive GNNs have linear learning/inference complexities but low representation power and poor long-range dependency Graph MLP-Mixer overcomes the computational bottleneck of Graph Transformers and solves the issue of long-distance dependency Competitive results on multiple benchmarks Capacity to capture long-range dependency with SOTA performance while keeping low complexity Forms a bridge between CV, NLP and graphs under a unified architecture Generalization challenges MLP-Mixer is adapted from images to graphs Table 1 summarizes the differences between standard MLP-Mixer and Graph MLP-Mixer Graphs cannot be uniformly divided into similar patches across all examples in the dataset Graph patches need to be transformed into a fixed-length vectorial representation Graph patches are unordered and nodes in graph tokens are naturally unordered MLP-Mixer architectures are known to be strong overfitters Overview Graph MLP-Mixer is a computer science architecture Graph MLP-Mixer is composed of a patch extraction module, patch embedding module, mixer layers, global average pooling layer, and a fully-connected layer Graphs are represented by a set of nodes (V) and edges (E) Graphs have a pre-defined number of patches (P) Graph-level vectorial representation (h G ) and graph-level target (y G ) are used for prediction Patch extraction MLP-Mixer can be generalized to graphs by extracting patches....