Pythia: A suite for analyzing large language models across training and scaling
How do large language models (LLMs) develop and evolve over the course of training?
How do these patterns change as models scale? To answer these questions, we introduce …
NusaCrowd: Open source initiative for Indonesian NLP resources
We present NusaCrowd, a collaborative initiative to collect and unify existing resources for
Indonesian languages, including opening access to previously non-public resources …
Aya model: An instruction finetuned open-access multilingual language model
Recent breakthroughs in large language models (LLMs) have centered around a handful of
data-rich languages. What does it take to broaden access to breakthroughs beyond first …
Modular deep learning
Transfer learning has recently become the dominant paradigm of machine learning. Pre-
trained models fine-tuned for downstream tasks achieve better performance with fewer …
Aya 23: Open weight releases to further multilingual progress
This technical report introduces Aya 23, a family of multilingual language models. Aya 23
builds on the recent release of the Aya model (Üstün et al., 2024), focusing on pairing a …
Exploring the benefits of training expert language models over instruction tuning
Recently, Language Models (LMs) instruction-tuned on multiple tasks, also known
as multitask-prompted fine-tuning (MT), have shown capabilities to generalize to unseen …
Pile of law: Learning responsible data filtering from the law and a 256GB open-source legal dataset
One concern with the rise of large language models lies with their potential for significant
harm, particularly from pretraining on biased, obscene, copyrighted, and private information …
Silo language models: Isolating legal risk in a nonparametric datastore
The legality of training language models (LMs) on copyrighted or otherwise restricted data is
under intense debate. However, as we show, model performance significantly degrades if …
Adapters: A unified library for parameter-efficient and modular transfer learning
We introduce Adapters, an open-source library that unifies parameter-efficient and modular
transfer learning in large language models. By integrating 10 diverse adapter methods into a …
Investigating cultural alignment of large language models
The intricate relationship between language and culture has long been a subject of
exploration within the realm of linguistic anthropology. Large Language Models (LLMs) …