A survey of text similarity approaches
Measuring the similarity between words, sentences, paragraphs and documents is an
important component in various tasks such as information retrieval, document clustering …
Translation techniques in cross-language information retrieval
Cross-language information retrieval (CLIR) is an active sub-domain of information retrieval
(IR). Like IR, CLIR is centered on the search for documents and for information contained …
Semantics-aware content-based recommender systems
Content-based recommender systems (CBRSs) rely on item and user descriptions (content)
to build item representations and user profiles that can be effectively exploited to suggest …
Learning multilingual named entity recognition from Wikipedia
We automatically create enormous, free and multilingual silver-standard training annotations
for named entity recognition (NER) by exploiting the text and structure of Wikipedia. Most NER …
Value-sensitive algorithm design: Method, case study, and lessons
Most commonly used approaches to developing automated or artificially intelligent
algorithmic systems are Big Data-driven and machine learning-based. However, these …
Understanding plagiarism linguistic patterns, textual features, and detection methods
Plagiarism can be of many different natures, ranging from copying texts to adopting ideas,
without giving credit to its originator. This paper presents a new taxonomy of plagiarism that …
Cross-language text classification using structural correspondence learning
We present a new approach to cross-language text classification that builds on structural
correspondence learning, a recently proposed theory for domain adaptation. The approach …
Wikipedia-based semantic interpretation for natural language processing
Adequate representation of natural language semantics requires access to vast amounts of
common sense and domain-specific world knowledge. Prior work in the field was based on …
Mining meaning from Wikipedia
Wikipedia is a goldmine of information; not just for its many readers, but also for the growing
community of researchers who recognize it as a resource of exceptional scale and utility. It …
Concept-based information retrieval using explicit semantic analysis
Information retrieval systems traditionally rely on textual keywords to index and retrieve
documents. Keyword-based retrieval may return inaccurate and incomplete results when …