Cultural cartography with word embeddings
Using the frequency of keywords is a classic approach in the formal analysis of text, but has
the drawback of glossing over the relationality of word meanings. Word embedding models …
Scalable Optimal Transport Methods in Machine Learning: A Contemporary Survey
Optimal Transport (OT) is a mathematical framework that first emerged in the eighteenth
century and has led to a plethora of methods for answering many theoretical and applied …
Linear-complexity data-parallel earth mover's distance approximations
K Atasu, T Mittelholzer - International Conference on …, 2019 - proceedings.mlr.press
Abstract The Earth Mover's Distance (EMD) is a state-of-the-art metric for comparing discrete
probability distributions, but its high distinguishability comes at a high cost in computational …
Neural word and entity embeddings for ad hoc retrieval
Learning low dimensional dense representations of the vocabularies of a corpus, known as
neural embeddings, has gained much attention in the information retrieval community. While …
Integrating semantic directions with concept mover's distance to measure binary concept engagement
In an earlier article published in this journal (“Concept Mover's Distance”, 2019), we
proposed a method for measuring concept engagement in texts that uses word embeddings …
An unsupervised semantic sentence ranking scheme for text documents
Abstract This paper presents Semantic SentenceRank (SSR), an unsupervised scheme for
automatically ranking sentences in a single document according to their relative importance …
Eliminating Negative Word Similarities for Measuring Document Distances: A Thoroughly Empirical Study on Word Mover's Distance
Document distance is a fundamental yet significant research topic in the information retrieval
community, and its accuracy dominates the performance of many text retrieval applications …
Fast paraphrase extraction in Ancient Greek literature
M Pöckelmann, J Dähne, J Ritter… - it-Information Technology, 2020 - degruyter.com
In this paper, we present a method for paraphrase extraction in Ancient Greek that can be
applied to huge text corpora in interactive humanities applications. Since lexical databases …
Semantic WordRank: Generating finer single-document summarizations
Abstract We present Semantic WordRank (SWR), an unsupervised method for generating an
extractive summary of a single document. Built on a weighted word graph with semantic and …
Information retrieval with Finnish case law embeddings
S Sarsa - University of Helsinki, Department of Computer …, 2019 - seco.tkk.fi
This thesis was created in the project “Anoppi–Automatic anonymisation and content
description of documents containing personal data” funded by Finnish Ministry of Justice as …