A survey of text similarity approaches

WH Gomaa, AA Fahmy - International Journal of Computer Applications, 2013 - Citeseer
Measuring the similarity between words, sentences, paragraphs and documents is an
important component in various tasks such as information retrieval, document clustering …

Translation techniques in cross-language information retrieval

D Zhou, M Truran, T Brailsford, V Wade… - ACM Computing …, 2012 - dl.acm.org
Cross-language information retrieval (CLIR) is an active sub-domain of information retrieval
(IR). Like IR, CLIR is centered on the search for documents and for information contained …

Semantics-aware content-based recommender systems

M De Gemmis, P Lops, C Musto, F Narducci… - Recommender systems …, 2015 - Springer
Content-based recommender systems (CBRSs) rely on item and user descriptions (content)
to build item representations and user profiles that can be effectively exploited to suggest …

Learning multilingual named entity recognition from Wikipedia

J Nothman, N Ringland, W Radford, T Murphy… - Artificial Intelligence, 2013 - Elsevier
We automatically create enormous, free and multilingual silver-standard training annotations
for named entity recognition (NER) by exploiting the text and structure of Wikipedia. Most NER …

Value-sensitive algorithm design: Method, case study, and lessons

H Zhu, B Yu, A Halfaker, L Terveen - Proceedings of the ACM on human …, 2018 - dl.acm.org
Most commonly used approaches to developing automated or artificially intelligent
algorithmic systems are Big Data-driven and machine learning-based. However, these …

Understanding plagiarism linguistic patterns, textual features, and detection methods

SM Alzahrani, N Salim… - IEEE Transactions on …, 2011 - ieeexplore.ieee.org
Plagiarism can be of many different natures, ranging from copying texts to adopting ideas,
without giving credit to its originator. This paper presents a new taxonomy of plagiarism that …

Cross-language text classification using structural correspondence learning

P Prettenhofer, B Stein - Proceedings of the 48th annual meeting …, 2010 - aclanthology.org
We present a new approach to cross-language text classification that builds on structural
correspondence learning, a recently proposed theory for domain adaptation. The approach …

Wikipedia-based semantic interpretation for natural language processing

E Gabrilovich, S Markovitch - Journal of Artificial Intelligence Research, 2009 - jair.org
Adequate representation of natural language semantics requires access to vast amounts of
common sense and domain-specific world knowledge. Prior work in the field was based on …

Mining meaning from Wikipedia

O Medelyan, D Milne, C Legg, IH Witten - International Journal of Human …, 2009 - Elsevier
Wikipedia is a goldmine of information; not just for its many readers, but also for the growing
community of researchers who recognize it as a resource of exceptional scale and utility. It …

Concept-based information retrieval using explicit semantic analysis

O Egozi, S Markovitch, E Gabrilovich - ACM Transactions on Information …, 2011 - dl.acm.org
Information retrieval systems traditionally rely on textual keywords to index and retrieve
documents. Keyword-based retrieval may return inaccurate and incomplete results when …