Google Scholar

J Kaddour, J Harris, M Mozes, H Bradley… - arxiv preprint arxiv …, 2023‏ - arxiv.org‏

Large Language Models (LLMs) went from non-existent to ubiquitous in the machine
learning discourse within a few years. Due to the fast pace of the field, it is difficult to identify …‏

Cited by 482 (3 versions)

[PDF] arxiv.org

Foundation Models Defining a New Era in Vision: a Survey and Outlook‏

M Awais, M Naseer, S Khan, RM Anwer… - … on Pattern Analysis …, 2025‏ - ieeexplore.ieee.org‏

Vision systems that see and reason about the compositional nature of visual scenes are
fundamental to understanding our world. The complex relations between objects and their …‏

Cited by 135 (2 versions)

[PDF] thecvf.com

Improved baselines with visual instruction tuning‏

H Liu, C Li, Y Li, YJ Lee - … of the IEEE/CVF Conference on …, 2024‏ - openaccess.thecvf.com‏

Large multimodal models (LMM) have recently shown encouraging progress with visual
instruction tuning. In this paper we present the first systematic study to investigate the design …‏

Cited by 1755 (5 versions)

[PDF] arxiv.org

Universal and transferable adversarial attacks on aligned language models‏

A Zou, Z Wang, N Carlini, M Nasr, JZ Kolter… - arxiv preprint arxiv …, 2023‏ - arxiv.org‏

Because "out-of-the-box" large language models are capable of generating a great deal of
objectionable content, recent work has focused on aligning these models in an attempt to …‏

Cited by 1070 (8 versions)

[PDF] arxiv.org

Open problems and fundamental limitations of reinforcement learning from human feedback‏

S Casper, X Davies, C Shi, TK Gilbert… - arxiv preprint arxiv …, 2023‏ - arxiv.org‏

Reinforcement learning from human feedback (RLHF) is a technique for training AI systems
to align with human goals. RLHF has emerged as the central method used to finetune state …‏

Cited by 436 (6 versions)

[PDF] arxiv.org

Fine-tuning aligned language models compromises safety, even when users do not intend to!‏

X Qi, Y Zeng, T Xie, PY Chen, R Jia, P Mittal… - arxiv preprint arxiv …, 2023 - arxiv.org

Optimizing large language models (LLMs) for downstream use cases often involves the
customization of pre-trained LLMs through further fine-tuning. Meta's open release of Llama …‏

Cited by 412 (4 versions)

[PDF] acm.org

Explainability for large language models: A survey‏

H Zhao, H Chen, F Yang, N Liu, H Deng, H Cai… - ACM Transactions on …, 2024‏ - dl.acm.org‏

Large language models (LLMs) have demonstrated impressive capabilities in natural
language processing. However, their internal mechanisms are still unclear and this lack of …‏

Cited by 418 (5 versions)

[PDF] arxiv.org

Foundational challenges in assuring alignment and safety of large language models‏

U Anwar, A Saparov, J Rando, D Paleka… - arxiv preprint arxiv …, 2024‏ - arxiv.org‏

This work identifies 18 foundational challenges in assuring the alignment and safety of large
language models (LLMs). These challenges are organized into three different categories …‏

Cited by 118 (3 versions)

[PDF] arxiv.org

Scalable extraction of training data from (production) language models‏

M Nasr, N Carlini, J Hayase, M Jagielski… - arxiv preprint arxiv …, 2023‏ - arxiv.org‏

This paper studies extractable memorization: training data that an adversary can efficiently
extract by querying a machine learning model without prior knowledge of the training …‏

Cited by 281 (8 versions)

[PDF] arxiv.org

Catastrophic jailbreak of open-source LLMs via exploiting generation

Y Huang, S Gupta, M Xia, K Li, D Chen - arxiv preprint arxiv:2310.06987, 2023 - arxiv.org

The rapid progress in open-source large language models (LLMs) is significantly advancing
AI development. Extensive efforts have been made before model release to align their …‏

Cited by 221 (3 versions)

Are aligned neural networks adversarially aligned?

Challenges and applications of large language models‏
