Topologies of reasoning: Demystifying chains, trees, and graphs of thoughts

M Besta, F Memedi, Z Zhang, R Gerstenberger… - arXiv preprint arXiv …, 2024 - arxiv.org
The field of natural language processing (NLP) has witnessed significant progress in recent
years, with a notable focus on improving large language models' (LLM) performance through …

Low-Depth Spatial Tree Algorithms

Y Baumann, T Ben-Nun, M Besta, L Gianinazzi… - arXiv preprint arXiv …, 2024 - arxiv.org
Contemporary accelerator designs exhibit a high degree of spatial localization, wherein
two-dimensional physical distance determines communication costs between processing …

Demystifying Chains, Trees, and Graphs of Thoughts

M Besta, F Memedi, Z Zhang… - arXiv preprint arXiv …, 2024 - aegjcef.unixer.de
The field of natural language processing (NLP) has witnessed significant progress in recent
years, with a notable focus on improving large language models' (LLM) performance through …

Demystifying Higher-Order Graph Neural Networks

M Besta, F Scheidl, L Gianinazzi, S Klaiman… - arXiv preprint arXiv …, 2024 - arxiv.org
Higher-order graph neural networks (HOGNNs) are an important class of GNN models that
harness polyadic relations between vertices beyond plain edges. They have been used to …

Communication Collectives for the Cerebras Wafer-Scale Engine

P Luczynski - 2023 - research-collection.ethz.ch
The Cerebras Wafer-Scale Engine (WSE) is a powerful architecture used initially for machine
learning training but now also for a larger variety of workloads. To achieve the best possible …

2D Collective Communication for the Cerebras Wafer-Scale Engine

L Schnyder - 2024 - research-collection.ethz.ch
The Cerebras Wafer-Scale Engine (WSE) is a powerful architecture, initially introduced for
machine learning, but also a viable tool for a larger variety of high-performance …

Graph algorithms for the spatial computer model

Y Baumann - 2022 - research-collection.ethz.ch
The spatial computer model is based on the idea that the energy used for two processors to
communicate is dependent on the physical distance between the two processors. For a …