WikiMatrix: Mining 135M parallel sentences in 1620 language pairs from Wikipedia
We present an approach based on multilingual sentence embeddings to automatically
extract parallel sentences from the content of Wikipedia articles in 85 languages, including …
CCMatrix: Mining billions of high-quality parallel sentences on the web
We show that margin-based bitext mining in a multilingual sentence space can be applied to
monolingual corpora of billions of sentences. We are using ten snapshots of a curated …
CsFEVER and CTKFacts: acquiring Czech data for fact verification
In this paper, we examine several methods of acquiring Czech data for automated fact-
checking, which is a task commonly modeled as a classification of textual claim veracity wrt …
TEP: Tehran English-Persian parallel corpus
Parallel corpora are one of the key resources in natural language processing. In spite of
their importance in many multi-lingual applications, no large-scale English-Persian corpus …
Semantic orientation of crosslingual sentiments: Employment of lexicon and dictionaries
Sentiment Analysis is a modern discipline at the crossroads of data mining and natural
language processing. It is concerned with the computational treatment of public moods …
On the mono- and cross-language detection of text reuse and plagiarism
A Barrón-Cedeño - Proceedings of the 33rd international ACM SIGIR …, 2010 - dl.acm.org
Plagiarism, the unacknowledged reuse of text, has increased in recent years due to the
large amount of texts readily available. For instance, recent studies claim that nowadays a …
[PDF] JMaxAlign: A maximum entropy parallel sentence alignment tool
M Kaufmann - Proceedings of COLING 2012: Demonstration …, 2012 - aclanthology.org
Parallel corpora are an extremely useful tool in many natural language processing tasks,
particularly statistical machine translation. Parallel corpora for certain language pairs, such …
Hybrid distance-statistical-based phrase alignment for analyzing parallel texts in standard Malay and Malay dialects
JKY Min, TP Tan… - Malaysian Journal of …, 2024 - mjes.um.edu.my
Parallel texts corpora are essential resources in linguistics and natural language
processing, especially in translation and multilingual information retrieval. The publicly …
MultiWiki: Interlingual text passage alignment in Wikipedia
In this article, we address the problem of text passage alignment across interlingual article
pairs in Wikipedia. We develop methods that enable the identification and interlinking of text …
[PDF] Parallel-Wiki: A collection of parallel sentences extracted from Wikipedia
Parallel corpora are essential resources for certain Natural Language Processing tasks such
as Statistical Machine Translation. However, the existing publicly available parallel …