The structure of the token space for large language models

M Robinson, S Dey, S Sweet - arXiv preprint arXiv:2410.08993, 2024 - arxiv.org
Large language models encode the correlational structure present in natural language by
fitting segments of utterances (tokens) into a high-dimensional ambient latent space upon …