AntiLeak-Bench: Preventing data contamination by automatically constructing benchmarks with updated real-world knowledge

X Wu, L Pan, Y Xie, R Zhou, S Zhao, Y Ma… - arXiv preprint arXiv …, 2024 - arxiv.org

Data contamination hinders fair LLM evaluation by introducing test data into newer models'
training sets. Existing studies solve this challenge by updating benchmarks with newly …

RAGChecker: A fine-grained framework for diagnosing retrieval-augmented generation

D Ru, L Qiu, X Hu, T Zhang, P Shi, S Chang… - arXiv preprint arXiv …, 2024 - arxiv.org

Despite Retrieval-Augmented Generation (RAG) showing promising capability in leveraging
external knowledge, a comprehensive evaluation of RAG systems is still challenging due to …

Query Routing for Homogeneous Tools: An Instantiation in the RAG Scenario

F Mu, Y Jiang, L Zhang, L Liuchu, W Li… - Findings of the …, 2024 - aclanthology.org

Current research on tool learning primarily focuses on selecting the most effective tool from
a wide array of options, often overlooking cost-effectiveness, a crucial factor in human …

Adaptive Selection for Homogeneous Tools: An Instantiation in the RAG Scenario

F Mu, Y Jiang, L Zhang, C Liu, W Li, P Xie… - arXiv preprint arXiv …, 2024 - arxiv.org

Current research on tool learning primarily focuses on selecting the most effective tool from
a wide array of options, often overlooking cost-effectiveness, a crucial factor in human …

