Google Scholar

WikiMatrix: Mining 135M parallel sentences in 1620 language pairs from Wikipedia

H Schwenk, V Chaudhary, S Sun, H Gong… - arxiv preprint arxiv …, 2019 - arxiv.org

We present an approach based on multilingual sentence embeddings to automatically
extract parallel sentences from the content of Wikipedia articles in 85 languages, including …

Cited by 374

CCMatrix: Mining billions of high-quality parallel sentences on the web

H Schwenk, G Wenzek, S Edunov, E Grave… - arxiv preprint arxiv …, 2019 - arxiv.org

We show that margin-based bitext mining in a multilingual sentence space can be applied to
monolingual corpora of billions of sentences. We are using ten snapshots of a curated …

Cited by 245

CsFEVER and CTKFacts: acquiring Czech data for fact verification

H Ullrich, J Drchal, M Rýpar, H Vincourová… - Language Resources …, 2023 - Springer

In this paper, we examine several methods of acquiring Czech data for automated fact-
checking, which is a task commonly modeled as a classification of textual claim veracity wrt …

Cited by 17

TEP: Tehran English-Persian parallel corpus

MT Pilevar, H Faili, AH Pilevar - International conference on intelligent text …, 2011 - Springer

Parallel corpora are one of the key resources in natural language processing. In spite of
their importance in many multi-lingual applications, no large-scale English-Persian corpus …

Cited by 67

Semantic orientation of crosslingual sentiments: Employment of lexicon and dictionaries

AA Raza, A Habib, J Ashraf, B Shah, F Moreira - IEEE Access, 2023 - ieeexplore.ieee.org

Sentiment Analysis is a modern discipline at the crossroads of data mining and natural
language processing. It is concerned with the computational treatment of public moods …

Cited by 7

On the mono- and cross-language detection of text reuse and plagiarism

A Barrón-Cedeño - Proceedings of the 33rd international ACM SIGIR …, 2010 - dl.acm.org

Plagiarism, the unacknowledged reuse of text, has increased in recent years due to the
large amount of texts readily available. For instance, recent studies claim that nowadays a …

Cited by 65

JMaxAlign: A maximum entropy parallel sentence alignment tool

M Kaufmann - Proceedings of COLING 2012: Demonstration …, 2012 - aclanthology.org

Parallel corpora are an extremely useful tool in many natural language processing tasks,
particularly statistical machine translation. Parallel corpora for certain language pairs, such …

Cited by 49

Hybrid distance-statistical-based phrase alignment for analyzing parallel texts in standard Malay and Malay dialects

JKY Min, TP Tan… - Malaysian Journal of …, 2024 - mjes.um.edu.my

Parallel texts corpora are essential resources in linguistics and natural language
processing, especially in translation and multilingual information retrieval. The publicly …

Cited by 3

MultiWiki: Interlingual text passage alignment in Wikipedia

S Gottschalk, E Demidova - ACM Transactions on the Web (TWEB), 2017 - dl.acm.org

In this article, we address the problem of text passage alignment across interlingual article
pairs in Wikipedia. We develop methods that enable the identification and interlinking of text …

Cited by 26

Parallel-Wiki: A collection of parallel sentences extracted from Wikipedia

D Ştefănescu, R Ion - Proceedings of the 14th International Conference …, 2013 - cicling.org

Parallel corpora are essential resources for certain Natural Language Processing tasks such
as Statistical Machine Translation. However, the existing publically available parallel …

Cited by 29

Building bilingual parallel corpora based on wikipedia

Wikimatrix: Mining 135m parallel sentences in 1620 language pairs from wikipedia