Poisoning web-scale training datasets is practical

N Carlini, M Jagielski… - … IEEE Symposium on …, 2024 - ieeexplore.ieee.org
Deep learning models are often trained on distributed, web-scale datasets crawled from the
internet. In this paper, we introduce two new dataset poisoning attacks that intentionally …

Graph neural networks for natural language processing: A survey

L Wu, Y Chen, K Shen, X Guo, H Gao… - … and Trends® in …, 2023 - nowpublishers.com
Deep learning has become the dominant approach in addressing various tasks in Natural
Language Processing (NLP). Although text inputs are typically represented as a sequence …

STaRK: Benchmarking LLM retrieval on textual and relational knowledge bases

S Wu, S Zhao, M Yasunaga, K Huang… - Advances in …, 2025 - proceedings.neurips.cc
Answering real-world complex queries, such as complex product search, often requires
accurate retrieval from semi-structured knowledge bases that involve a blend of unstructured …

Improving multi-hop question answering over knowledge graphs using knowledge base embeddings

A Saxena, A Tripathi, P Talukdar - … of the 58th annual meeting of …, 2020 - aclanthology.org
Knowledge Graphs (KG) are multi-relational graphs consisting of entities as nodes
and relations among them as typed edges. The goal of the Question Answering over KG (KGQA) …

What is semantic communication? A view on conveying meaning in the era of machine intelligence

Q Lan, D Wen, Z Zhang, Q Zeng, X Chen… - Journal of …, 2021 - ieeexplore.ieee.org
In the 1940s, Claude Shannon developed information theory, focusing on quantifying the
maximum data rate that can be supported by a communication channel. Guided by this …

TaBERT: Pretraining for joint understanding of textual and tabular data

P Yin, G Neubig, W Yih, S Riedel - arXiv preprint arXiv:2005.08314, 2020 - arxiv.org
Recent years have witnessed the burgeoning of pretrained language models (LMs) for text-
based natural language (NL) understanding tasks. Such models are typically trained on free …

TAT-QA: A question answering benchmark on a hybrid of tabular and textual content in finance

F Zhu, W Lei, Y Huang, C Wang, S Zhang, J Lv… - arXiv preprint arXiv …, 2021 - arxiv.org
Hybrid data combining both tabular and textual content (e.g., financial reports) are quite
pervasive in the real world. However, Question Answering (QA) over such hybrid data is …