- Academic Search

Future events as backdoor triggers: Investigating temporal vulnerabilities in LLMs

S Price, A Panickssery, S Bowman… - arXiv preprint arXiv …, 2024 - arxiv.org

Backdoors are hidden behaviors that are only triggered once an AI system has been
deployed. Bad actors looking to create successful backdoors must design them to avoid …


Enhancing logical reasoning in large language models through graph-based synthetic data

J Zhou, A Ghaddar, G Zhang, L Ma, Y Hu, S Pal… - arXiv preprint arXiv …, 2024 - arxiv.org

Despite recent advances in training and prompting strategies for Large Language Models
(LLMs), these models continue to face challenges with complex logical reasoning tasks that …


Graph Reasoning with LLMs (GReaL)

A Tsitsulin, B Perozzi, B Fatemi… - Proceedings of the 30th …, 2024 - dl.acm.org

Graphs are a powerful tool for representing and analyzing complex relationships in real-world
applications. Large Language Models (LLMs) have demonstrated impressive …


ActionReasoningBench: Reasoning about Actions with and without Ramification Constraints

D Handa, P Dolin, S Kumbhar, TC Son… - arXiv preprint arXiv …, 2024 - arxiv.org

Reasoning about Actions and Change (RAC) has historically played a pivotal role in solving
foundational AI problems, such as the frame problem. It has driven advancements in AI …


ChroKnowledge: Unveiling Chronological Knowledge of Language Models in Multiple Domains

Y Park, C Yoon, J Park, D Lee, M Jeong… - arXiv preprint arXiv …, 2024 - arxiv.org

Large language models (LLMs) have significantly impacted many aspects of our lives.
However, assessing and ensuring their chronological knowledge remains challenging …


Perceive the Passage of Time: A Systematic Evaluation of Large Language Model in Temporal Relativity

S Chen, Y Zheng, S Li, Q Cheng… - Proceedings of the 31st …, 2025 - aclanthology.org

Temporal perception is crucial for Large Language Models (LLMs) to effectively understand
the world. However, current benchmarks primarily focus on temporal reasoning, falling short …


VCBench: A Controllable Benchmark for Symbolic and Abstract Challenges in Video Cognition

C Li, Q Chen, Z Li, F Tao, Y Zhang - arXiv preprint arXiv:2411.09105, 2024 - arxiv.org

Recent advancements in Large Video-Language Models (LVLMs) have driven the
development of benchmarks designed to assess cognitive abilities in video-based tasks …

Time Awareness in Large Language Models: Benchmarking Fact Recall Across Time

D Herel, V Bartek, T Mikolov - arXiv preprint arXiv:2409.13338, 2024 - arxiv.org

Who is the US President? The answer changes depending on when the question is asked.
While large language models (LLMs) are evaluated on various reasoning tasks, they often …

Benchmarking and Improving Large Vision-Language Models for Fundamental Visual Graph Understanding and Reasoning

Y Zhu, X Bai, K Chen, Y Xiang, M Zhang - arXiv preprint arXiv:2412.13540, 2024 - arxiv.org

Large Vision-Language Models (LVLMs) have demonstrated remarkable performance
across diverse tasks. Despite great success, recent studies show that LVLMs encounter …