Turnitin
降AI改写
早检测系统
早降重系统
Turnitin-UK版
万方检测-期刊版
维普编辑部版
Grammarly检测
Paperpass检测
checkpass检测
PaperYY检测
A survey on data selection for language models
A major factor in the recent success of large language models is the use of enormous and
ever-growing text datasets for unsupervised pre-training. However, naively training a model …
ever-growing text datasets for unsupervised pre-training. However, naively training a model …
Scaling data-constrained language models
The current trend of scaling language models involves increasing both parameter count and
training dataset size. Extrapolating this trend suggests that training dataset size may soon be …
training dataset size. Extrapolating this trend suggests that training dataset size may soon be …
Llemma: An open language model for mathematics
We present Llemma, a large language model for mathematics. We continue pretraining
Code Llama on the Proof-Pile-2, a mixture of scientific papers, web data containing …
Code Llama on the Proof-Pile-2, a mixture of scientific papers, web data containing …
Crosslingual generalization through multitask finetuning
Multitask prompted finetuning (MTF) has been shown to help large language models
generalize to new tasks in a zero-shot setting, but so far explorations of MTF have focused …
generalize to new tasks in a zero-shot setting, but so far explorations of MTF have focused …
Octopack: Instruction tuning code large language models
Finetuning large language models (LLMs) on instructions leads to vast performance
improvements on natural language tasks. We apply instruction tuning using code …
improvements on natural language tasks. We apply instruction tuning using code …
Aya model: An instruction finetuned open-access multilingual language model
Recent breakthroughs in large language models (LLMs) have centered around a handful of
data-rich languages. What does it take to broaden access to breakthroughs beyond first …
data-rich languages. What does it take to broaden access to breakthroughs beyond first …
Aya dataset: An open-access collection for multilingual instruction tuning
Datasets are foundational to many breakthroughs in modern artificial intelligence. Many
recent achievements in the space of natural language processing (NLP) can be attributed to …
recent achievements in the space of natural language processing (NLP) can be attributed to …
Aya 23: Open weight releases to further multilingual progress
This technical report introduces Aya 23, a family of multilingual language models. Aya 23
builds on the recent release of the Aya model (\" Ust\" un et al., 2024), focusing on pairing a …
builds on the recent release of the Aya model (\" Ust\" un et al., 2024), focusing on pairing a …
Having beer after prayer? measuring cultural bias in large language models
As the reach of large language models (LMs) expands globally, their ability to cater to
diverse cultural contexts becomes crucial. Despite advancements in multilingual …
diverse cultural contexts becomes crucial. Despite advancements in multilingual …
Multilingual large language model: A survey of resources, taxonomy and frontiers
Multilingual Large Language Models are capable of using powerful Large Language
Models to handle and respond to queries in multiple languages, which achieves remarkable …
Models to handle and respond to queries in multiple languages, which achieves remarkable …