Massively multilingual sentence embeddings for zero-shot cross-lingual transfer and beyond
We introduce an architecture to learn joint multilingual sentence representations for 93
languages, belonging to more than 30 different families and written in 28 different scripts …
SimAlign: High quality word alignments without parallel training data using static and contextualized embeddings
Word alignments are useful for tasks like statistical and neural machine translation (NMT)
and cross-lingual annotation projection. Statistical word aligners perform well, as do …
A crosslingual investigation of conceptualization in 1335 languages
Languages differ in how they divide up the world into concepts and words; e.g., in contrast to
English, Swahili has a single concept for 'belly' and 'womb'. We investigate these differences in …
GPU-based private information retrieval for on-device machine learning inference
On-device machine learning (ML) inference can enable the use of private user data on user
devices without revealing them to remote servers. However, a pure on-device solution to …
Crosslingual transfer learning for low-resource languages based on multilingual colexification graphs
In comparative linguistics, colexification refers to the phenomenon of a lexical form
conveying two or more distinct meanings. Existing work on colexification patterns relies on …
Graph-based multilingual label propagation for low-resource part-of-speech tagging
Part-of-Speech (POS) tagging is an important component of the NLP pipeline, but many low-
resource languages lack labeled data for training. An established method for training a POS …
Topology of word embeddings: Singularities reflect polysemy
A Jakubowski, M Gašić, M Zibrowius - arXiv preprint arXiv:2011.09413, 2020 - arxiv.org
The manifold hypothesis suggests that word vectors live on a submanifold within their
ambient vector space. We argue that we should, more accurately, expect them to live on a …
A multilingual BPE embedding space for universal sentiment lexicon induction
We present a new method for sentiment lexicon induction that is designed to be applicable
to the entire range of typological diversity of the world's languages. We evaluate our method …
Graph algorithms for multiparallel word alignment
With the advent of end-to-end deep learning approaches in machine translation, interest in
word alignments initially decreased; however, they have again become a focus of research …
Learning contextualised cross-lingual word embeddings and alignments for extremely low-resource languages using parallel corpora
We propose a new approach for learning contextualised cross-lingual word embeddings
based on a small parallel corpus (e.g. a few hundred sentence pairs). Our method obtains …