Google Scholar

Reasoning with transformer-based models: Deep learning, but shallow reasoning

C Helwe, C Clavel, F Suchanek - International Conference on …, 2021 - imt.hal.science

Recent years have seen impressive performance of transformer-based models on different
natural language processing tasks. However, it is not clear to what degree the transformers …

Cited by 67 - All 11 versions

[PDF] arxiv.org

Do as I can, not as I say: Grounding language in robotic affordances

M Ahn, A Brohan, N Brown, Y Chebotar… - arXiv preprint arXiv …, 2022 - arxiv.org

Large language models can encode a wealth of semantic knowledge about the world. Such
knowledge could be extremely useful to robots aiming to act upon high-level, temporally …

Cited by 1531 - All 2 versions

[PDF] arxiv.org

Evaluating large language models at evaluating instruction following

Z Zeng, J Yu, T Gao, Y Meng, T Goyal… - arXiv preprint arXiv …, 2023 - arxiv.org

As research in large language models (LLMs) continues to accelerate, LLM-based
evaluation has emerged as a scalable and cost-effective alternative to human evaluations …

Cited by 141 - All 5 versions

[PDF] arxiv.org

Language models show human-like content effects on reasoning tasks

I Dasgupta, AK Lampinen, SCY Chan… - arXiv preprint arXiv …, 2022 - arxiv.org

Reasoning is a key ability for an intelligent system. Large language models (LMs) achieve
above-chance performance on abstract reasoning tasks, but exhibit many imperfections …

Cited by 213 - All 4 versions

[PDF] arxiv.org

Chinese CLIP: Contrastive vision-language pretraining in Chinese

A Yang, J Pan, J Lin, R Men, Y Zhang, J Zhou… - arXiv preprint arXiv …, 2022 - arxiv.org

The tremendous success of CLIP (Radford et al., 2021) has promoted the research and
application of contrastive learning for vision-language pretraining. In this work, we construct …

Cited by 129 - All 2 versions

[PDF] arxiv.org

CRUXEval: A benchmark for code reasoning, understanding and execution

A Gu, B Rozière, H Leather, A Solar-Lezama… - arXiv preprint arXiv …, 2024 - arxiv.org

We present CRUXEval (Code Reasoning, Understanding, and eXecution Evaluation), a
benchmark consisting of 800 Python functions (3-13 lines). Each function comes with an …

Cited by 64 - All 8 versions

[PDF] arxiv.org

Consistency analysis of ChatGPT

ME Jang, T Lukasiewicz - arXiv preprint arXiv:2303.06273, 2023 - arxiv.org

ChatGPT has gained a huge popularity since its introduction. Its positive aspects have been
reported through many media platforms, and some analyses even showed that ChatGPT …

Cited by 94 - All 6 versions

[PDF] arxiv.org

ProsocialDialog: A prosocial backbone for conversational agents

H Kim, Y Yu, L Jiang, X Lu, D Khashabi, G Kim… - arXiv preprint arXiv …, 2022 - arxiv.org

Most existing dialogue systems fail to respond properly to potentially unsafe user utterances
by either ignoring or passively agreeing with them. To address this issue, we introduce …

Cited by 108 - All 8 versions

[PDF] arxiv.org

Negative object presence evaluation (NOPE) to measure object hallucination in vision-language models

H Lovenia, W Dai, S Cahyawijaya, Z Ji… - arXiv preprint arXiv …, 2023 - arxiv.org

Object hallucination poses a significant challenge in vision-language (VL) models, often
leading to the generation of nonsensical or unfaithful responses with non-existent objects …

Cited by 56 - All 4 versions

[PDF] arxiv.org

Small models are valuable plug-ins for large language models

C Xu, Y Xu, S Wang, Y Liu, C Zhu… - arXiv preprint arXiv …, 2023 - arxiv.org

Large language models (LLMs) such as GPT-3 and GPT-4 are powerful but their weights are
often publicly unavailable and their immense sizes make the models difficult to be tuned with …

Cited by 54 - All 3 versions

Understanding by understanding not: Modeling negation in language models

[CITATION] Reasoning with transformer-based models: Deep learning, but shallow reasoning
