R²-Guard: Robust Reasoning Enabled LLM Guardrail via Knowledge-Enhanced Logical Reasoning

M Kang, B Li - arXiv preprint arXiv:2407.05557, 2024 - arxiv.org

As LLMs become increasingly prevalent across various applications, it is critical to establish
safety guardrails to moderate input/output content of LLMs. Existing guardrail models treat …

C-RAG: Certified Generation Risks for Retrieval-Augmented Language Models

M Kang, NM Gürel, N Yu, D Song, B Li - arXiv preprint arXiv:2402.03181, 2024 - arxiv.org

Despite the impressive capabilities of large language models (LLMs) across diverse
applications, they still suffer from trustworthiness issues, such as hallucinations and …

Legend: Leveraging Representation Engineering to Annotate Safety Margin for Preference Datasets

D Feng, B Qin, C Huang, Y Huang, Z Zhang… - arXiv preprint arXiv …, 2024 - arxiv.org

The success of the reward model in distinguishing between responses with subtle safety
differences depends critically on the high-quality preference dataset, which should capture …

A Survey of Generative Techniques for Spatial-Temporal Data Mining

Q Zhang, H Wang, C Long, L Su, X He, J Chang… - arXiv preprint arXiv …, 2024 - arxiv.org

This paper focuses on the integration of generative techniques into spatial-temporal data
mining, considering the significant growth and diverse nature of spatial-temporal data. With …

Adaptive Token Biaser: Knowledge Editing via Biasing Key Entities

B Bi, S Liu, Y Wang, L Mei, H Gao, Y Xu… - arXiv preprint arXiv …, 2024 - arxiv.org

The parametric knowledge memorized by large language models (LLMs) becomes outdated
quickly. In-context editing (ICE) is currently the most effective method for updating the …
