Google Scholar

Starcoder: may the source be with you!

R Li, LB Allal, Y Zi, N Muennighoff, D Kocetkov… - arXiv preprint arXiv …, 2023 - arxiv.org

The BigCode community, an open-scientific collaboration working on the responsible
development of Large Language Models for Code (Code LLMs), introduces StarCoder and …

Cited by 816 · All 5 versions

StarCoder 2 and The Stack v2: The Next Generation

A Lozhkov, R Li, LB Allal, F Cassano… - arXiv preprint arXiv …, 2024 - arxiv.org

The BigCode project, an open-scientific collaboration focused on the responsible
development of Large Language Models for Code (Code LLMs), introduces StarCoder2. In …

Cited by 204 · All 2 versions

Multi-step jailbreaking privacy attacks on ChatGPT

H Li, D Guo, W Fan, M Xu, J Huang, F Meng… - arXiv preprint arXiv …, 2023 - arxiv.org

With the rapid progress of large language models (LLMs), many downstream NLP tasks can
be well solved given appropriate prompts. Though model developers and researchers work …

Cited by 326 · All 6 versions

Leak, cheat, repeat: Data contamination and evaluation malpractices in closed-source LLMs

S Balloccu, P Schmidtová, M Lango… - arXiv preprint arXiv …, 2024 - arxiv.org

Natural Language Processing (NLP) research is increasingly focusing on the use of Large
Language Models (LLMs), with some of the most popular ones being either fully or partially …

Cited by 144 · All 5 versions

Dolma: An open corpus of three trillion tokens for language model pretraining research

L Soldaini, R Kinney, A Bhagia, D Schwenk… - arXiv preprint arXiv …, 2024 - arxiv.org

Information about pretraining corpora used to train the current best-performing language
models is seldom discussed: commercial models rarely detail their data, and even open …

Cited by 121 · All 5 versions

Embers of autoregression: Understanding large language models through the problem they are trained to solve

RT McCoy, S Yao, D Friedman, M Hardy… - arXiv preprint arXiv …, 2023 - arxiv.org

The widespread adoption of large language models (LLMs) makes it important to recognize
their strengths and limitations. We argue that in order to develop a holistic understanding of …

Cited by 130 · All 4 versions

Evaluating the social impact of generative AI systems in systems and society

I Solaiman, Z Talat, W Agnew, L Ahmad… - arXiv preprint arXiv …, 2023 - arxiv.org

Generative AI systems across modalities, ranging from text (including code), image, audio,
and video, have broad social impacts, but there is no official standard for means of …

Cited by 114 · All 2 versions

Investigating data contamination in modern benchmarks for large language models

C Deng, Y Zhao, X Tang, M Gerstein… - arXiv preprint arXiv …, 2023 - arxiv.org

Recent observations have underscored a disparity between the inflated benchmark scores
and the actual performance of LLMs, raising concerns about potential contamination of …

Cited by 73 · All 3 versions

An archival perspective on pretraining data

MA Desai, IV Pasquetto, AZ Jacobs, D Card - Patterns, 2024 - cell.com

Alongside an explosion in research and development related to large language models,
there has been a concomitant rise in the creation of pretraining datasets—massive …

Cited by 10 · All 9 versions

Towards AI accountability infrastructure: Gaps and opportunities in AI audit tooling

V Ojewale, R Steed, B Vecchione, A Birhane… - arXiv preprint arXiv …, 2024 - arxiv.org

Audits are critical mechanisms for identifying the risks and limitations of deployed artificial
intelligence (AI) systems. However, the effective execution of AI audits remains incredibly …

Cited by 33 · All 2 versions

The ROOTS search tool: Data transparency for LLMs
