ScandEval: A benchmark for Scandinavian natural language processing
DS Nielsen - arXiv preprint arXiv:2304.00906, 2023 - arxiv.org
This paper introduces a Scandinavian benchmarking platform, ScandEval, which can
benchmark any pretrained model on four different tasks in the Scandinavian languages. The …
Position: measure dataset diversity, don't just claim it
Machine learning (ML) datasets, often perceived as neutral, inherently encapsulate abstract
and disputed social constructs. Dataset curators frequently employ value-laden terms such …
RoGPT2: Romanian GPT2 for text generation
MA Niculescu, S Ruseti… - 2021 IEEE 33rd …, 2021 - ieeexplore.ieee.org
Text generation is one of the most important and challenging tasks in NLP, where models
have shown a significant performance increase in recent years. However, most generative …
IndicXNLI: Evaluating multilingual inference for Indian languages
While Indic NLP has made rapid advances recently in terms of the availability of corpora and
pre-trained models, benchmark datasets on standard NLU tasks are limited. To this end, we …
Measuring diversity in datasets
Machine learning (ML) datasets, often perceived as "neutral," inherently encapsulate
abstract and disputed social constructs. Dataset curators frequently employ value-laden …
Beyond lexical boundaries: LLM-generated text detection for Romanian digital libraries
Machine-generated content reshapes the landscape of digital information; hence, ensuring
the authenticity of texts within digital libraries has become a paramount concern. This work …
This is the way: designing and compiling LEPISZCZE, a comprehensive NLP benchmark for Polish
The availability of compute and data to train larger and larger language models increases
the demand for robust methods of benchmarking the true progress of LM training. Recent …
"Vorbești Românește?" A Recipe to Train Powerful Romanian LLMs with English Instructions
In recent years, Large Language Models (LLMs) have achieved almost human-like
performance on various tasks. While some LLMs have been trained on multilingual data …
Prompt optimization via adversarial in-context learning
We propose a new method, Adversarial In-Context Learning (adv-ICL), to optimize prompts
for in-context learning (ICL). Inspired by adversarial learning, adv-ICL is implemented as a …
Distilling the knowledge of Romanian BERTs using multiple teachers
Running large-scale pre-trained language models in computationally constrained
environments remains a challenging problem yet to be addressed, while transfer learning …