Do LLMs dream of elephants (when told not to)? Latent concept association and associative memory in transformers
Large Language Models (LLMs) have the capacity to store and recall facts. Through
experimentation with open-source models, we observe that this ability to retrieve facts can …
Advanced AI framework for enhanced detection and assessment of abdominal trauma: Integrating 3D segmentation with 2D CNN and RNN models
L Jiang, X Yang, C Yu, Z Wu… - 2024 3rd International …, 2024 - ieeexplore.ieee.org
Trauma is a significant cause of mortality and diagnostic methods for traumatic injuries, such
as X-rays, CT scans, and MRI, are often time-consuming and dependent on medical …
Fast Second-order Method for Neural Networks under Small Treewidth Setting
Training neural networks is a fundamental problem in theoretical machine learning. Second-
order methods are rarely used in practice due to their high computational cost, even they …
Differentially private kernel density estimation
We introduce a refined differentially private (DP) data structure for kernel density estimation
(KDE), offering not only improved privacy-utility tradeoff but also better efficiency over prior …
Outeffhop: A principled outlier-efficient attention layer from dense associative memory models
H Luo, JYC Hu, PH Chang, HY Chen, W Li… - Workshop on Efficient …, 2024 - openreview.net
We introduce a principled approach to Outlier-Efficient Attention Layers via associative
memory models to reduce outlier emergence in large transformer-based model. Our main …
ST-MoE-BERT: A Spatial-Temporal Mixture-of-Experts Framework for Long-Term Cross-City Mobility Prediction
Predicting human mobility across multiple cities presents significant challenges due to the
complex and diverse spatial-temporal dynamics inherent in different urban environments. In …
Towards Better Adaptation of Foundation Models
Z Xu - pages.cs.wisc.edu
Foundation models have revolutionized artificial intelligence, yet fundamental challenges
remain in understanding and optimizing their capabilities in adaptation and inference. This …