A theory for compressibility of graph transformers for transductive learning

H Shirzad, H Lin, A Velingker, B Venkatachalam… - arXiv preprint arXiv …, 2024 - arxiv.org
Transductive tasks on graphs differ fundamentally from typical supervised machine learning
tasks, as the independent and identically distributed (i.i.d.) assumption does not hold among …

Normalization Matters for Optimization Performance on Graph Neural Networks

A Milligan, F Kunstner, H Shirzad, M Schmidt… - OPT 2024: Optimization … - openreview.net
We show that feature normalization has a drastic impact on the performance of optimization
algorithms in the context of graph neural networks. The standard normalization scheme …