Key Papers and Formulas in Static Graph Neural Networks (GNNs)

Answered using GPT-4o

Question:

Explain static-graph GNN papers, with real, existing references and with formulas.

Answer:

Certainly! Static Graph Neural Networks (GNNs) are a class of neural networks designed to operate on graph-structured data where the graph topology does not change over time. Below are some key papers in this area, along with explanations and relevant formulas.

1. Graph Convolutional Networks (GCNs)

Paper: Kipf, T. N., & Welling, M. (2017). "Semi-Supervised Classification with Graph Convolutional Networks." ICLR 2017.

Explanation: GCNs are one of the foundational models in the field of GNNs. They generalize the convolution operation to graph-structured data. The core idea is to perform a convolution-like operation on the graph, aggregating feature information from a node's neighbors.

Formula: The layer-wise propagation rule for GCNs is:

\[ H^{(l+1)} = \sigma\left( \tilde{D}^{-1/2} \tilde{A} \tilde{D}^{-1/2} H^{(l)} W^{(l)} \right) \]

where:

  • \( \tilde{A} = A + I \) is the adjacency matrix with added self-loops.
  • \( \tilde{D} \) is the degree matrix of \( \tilde{A} \).
  • \( H^{(l)} \) is the matrix of activations in the \( l \)-th layer.
  • \( W^{(l)} \) is the weight matrix for the \( l \)-th layer.
  • \( \sigma \) is an activation function, such as ReLU.
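For concreteness, here is a minimal NumPy sketch of one GCN layer (forward pass only). The function name `gcn_layer`, the toy path graph, and the random weights are illustrative assumptions, and ReLU stands in for \( \sigma \):

```python
import numpy as np

def gcn_layer(A, H, W):
    """One GCN propagation step: sigma(D^{-1/2} (A + I) D^{-1/2} H W)."""
    A_tilde = A + np.eye(A.shape[0])             # A~ = A + I (self-loops)
    d = A_tilde.sum(axis=1)                      # degrees of A~
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))       # D~^{-1/2}
    A_hat = D_inv_sqrt @ A_tilde @ D_inv_sqrt    # symmetric normalization
    return np.maximum(A_hat @ H @ W, 0.0)        # sigma = ReLU

# Toy 4-node path graph with random features and weights (illustrative only).
rng = np.random.default_rng(0)
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
H0 = rng.normal(size=(4, 3))                     # 4 nodes, 3 input features
W0 = rng.normal(size=(3, 2))                     # 3 -> 2 output features
print(gcn_layer(A, H0, W0).shape)                # (4, 2)
```

Stacking two such layers recovers the usual two-layer GCN used for semi-supervised node classification.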

2. Graph Attention Networks (GATs)

Paper: Veličković, P., Cucurull, G., Casanova, A., Romero, A., Liò, P., & Bengio, Y. (2018). "Graph Attention Networks." ICLR 2018.

Explanation: GATs introduce attention mechanisms to GNNs, allowing the model to assign different importance to different nodes in a neighborhood. This is particularly useful for dealing with graphs where the importance of neighbors varies.

Formula: The attention mechanism in GATs computes an unnormalized score for each edge \( (i, j) \):

\[ e_{ij} = \text{LeakyReLU}\left( \mathbf{a}^\top \left[ W h_i \,\|\, W h_j \right] \right) \]

where:

  • \( h_i \) and \( h_j \) are the feature vectors of nodes \( i \) and \( j \).
  • \( W \) is a shared weight matrix.
  • \( \mathbf{a} \) is a learnable weight vector.
  • \( \| \) denotes concatenation.
  • \( e_{ij} \) is the (unnormalized) attention coefficient.

The normalized attention coefficients are computed with a softmax over each node's neighborhood:

\[ \alpha_{ij} = \frac{\exp(e_{ij})}{\sum_{k \in \mathcal{N}(i)} \exp(e_{ik})} \]

The final output features are then a weighted sum of the transformed neighbor features:

\[ h_i' = \sigma\left( \sum_{j \in \mathcal{N}(i)} \alpha_{ij} W h_j \right) \]
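A minimal single-head sketch in NumPy follows, using a dense adjacency matrix and the standard decomposition \( \mathbf{a}^\top [W h_i \,\|\, W h_j] = \mathbf{a}_1^\top W h_i + \mathbf{a}_2^\top W h_j \). The names, toy graph, and LeakyReLU slope of 0.2 are assumptions; multi-head attention and training are omitted:

```python
import numpy as np

def gat_layer(A, H, W, a):
    """Single-head GAT layer on a dense adjacency (self-loops included in A)."""
    Wh = H @ W                                   # (n, d_out) transformed features
    d_out = Wh.shape[1]
    # e_ij = LeakyReLU(a^T [Wh_i || Wh_j]), with a split into two halves
    e = (Wh @ a[:d_out])[:, None] + (Wh @ a[d_out:])[None, :]
    e = np.where(e > 0, e, 0.2 * e)              # LeakyReLU (slope 0.2)
    e = np.where(A > 0, e, -np.inf)              # keep only real edges
    alpha = np.exp(e - e.max(axis=1, keepdims=True))
    alpha /= alpha.sum(axis=1, keepdims=True)    # row-wise softmax over N(i)
    return np.maximum(alpha @ Wh, 0.0)           # sigma = ReLU

rng = np.random.default_rng(0)
A = np.eye(4) + np.array([[0, 1, 0, 0],          # path graph + self-loops
                          [1, 0, 1, 0],
                          [0, 1, 0, 1],
                          [0, 0, 1, 0]], dtype=float)
H0, W0 = rng.normal(size=(4, 3)), rng.normal(size=(3, 2))
a0 = rng.normal(size=4)                          # length 2 * d_out
print(gat_layer(A, H0, W0, a0).shape)            # (4, 2)
```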

3. GraphSAGE

Paper: Hamilton, W. L., Ying, R., & Leskovec, J. (2017). "Inductive Representation Learning on Large Graphs." NeurIPS 2017.

Explanation: GraphSAGE (SAmple and aggreGatE) is designed for inductive learning, where the model can generalize to nodes unseen during training. It samples a fixed-size neighborhood for each node and aggregates the features of the sampled neighbors.

Formula: The general update rule in GraphSAGE is:

\[ h_i^{(l+1)} = \sigma\left( W^{(l)} \cdot \text{AGGREGATE}^{(l)} \left( \{ h_i^{(l)} \} \cup \{ h_j^{(l)} : j \in \mathcal{N}(i) \} \right) \right) \]

where:

  • \( h_i^{(l)} \) is the feature vector of node \( i \) at layer \( l \).
  • \( \mathcal{N}(i) \) is the set of neighbors of node \( i \).
  • \( \text{AGGREGATE}^{(l)} \) is a neighborhood aggregation function, such as mean, LSTM, or pooling.
  • \( W^{(l)} \) is a learnable weight matrix.
  • \( \sigma \) is an activation function.
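A minimal NumPy sketch of this update with a mean aggregator is below. Neighborhood sampling is omitted (all neighbors are used), the node itself is folded into the mean as in the formula above, and the names and toy graph are illustrative:

```python
import numpy as np

def sage_layer_mean(A, H, W):
    """GraphSAGE-style layer: mean over self + neighbors, then linear + ReLU."""
    A_self = A + np.eye(A.shape[0])              # include the node itself
    deg = A_self.sum(axis=1, keepdims=True)      # size of {i} plus N(i)
    agg = (A_self @ H) / deg                     # mean aggregation
    return np.maximum(agg @ W, 0.0)              # sigma = ReLU

rng = np.random.default_rng(0)
A = np.array([[0, 1, 0, 0],                      # toy 4-node path graph
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
H0, W0 = rng.normal(size=(4, 3)), rng.normal(size=(3, 2))
print(sage_layer_mean(A, H0, W0).shape)          # (4, 2)
```

Note that the paper's default variant concatenates the node's own representation with the aggregated neighbor representation before the linear map; the union-style mean above matches the formula as written.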

4. Simplified Graph Convolution (SGC)

Paper: Wu, F., de Souza Jr., A. H., Zhang, T., Fifty, C., Yu, T., & Weinberger, K. Q. (2019). "Simplifying Graph Convolutional Networks." ICML 2019.

Explanation: SGC simplifies GCNs by removing the non-linearities between successive layers and collapsing the \( K \) weight matrices into a single linear transformation. This yields a far more efficient model with fewer parameters.

Formula: A \( K \)-layer GCN then collapses into \( K \) propagation steps followed by one linear map:

\[ H = \left( \tilde{D}^{-1/2} \tilde{A} \tilde{D}^{-1/2} \right)^{K} X \Theta \]

where:

  • \( X \) is the input feature matrix.
  • \( \Theta \) is the single collapsed weight matrix.
  • \( K \) is the number of propagation steps (the depth of the collapsed GCN).
  • \( \tilde{A} \) and \( \tilde{D} \) are as defined in the GCN formula.
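Since all learnable parameters live in \( \Theta \), the \( K \) propagation steps can be precomputed once and the model trained as plain logistic regression. A minimal sketch (the function name, choice of \( K = 2 \), and toy data are illustrative):

```python
import numpy as np

def sgc_features(A, X, K=2):
    """Precompute S^K X with S = D^{-1/2} (A + I) D^{-1/2} (done once, offline)."""
    A_tilde = A + np.eye(A.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(A_tilde.sum(axis=1))
    S = A_tilde * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]  # normalized adjacency
    H = X
    for _ in range(K):                           # apply S a total of K times
        H = S @ H
    return H

rng = np.random.default_rng(0)
A = np.array([[0, 1, 0, 0],                      # toy 4-node path graph
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
X, Theta = rng.normal(size=(4, 3)), rng.normal(size=(3, 2))
print((sgc_features(A, X, K=2) @ Theta).shape)   # (4, 2) class logits
```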

These papers and their corresponding models have significantly advanced the field of graph neural networks, providing powerful tools for various applications in social networks, biological networks, recommendation systems, and more.