Cultural cartography with word embeddings

DS Stoltz, MA Taylor - Poetics, 2021 - Elsevier
Using the frequency of keywords is a classic approach in the formal analysis of text, but has
the drawback of glossing over the relationality of word meanings. Word embedding models …

Scalable Optimal Transport Methods in Machine Learning: A Contemporary Survey

A Khamis, R Tsuchida, M Tarek… - … on Pattern Analysis …, 2024 - ieeexplore.ieee.org
Optimal Transport (OT) is a mathematical framework that first emerged in the eighteenth
century and has led to a plethora of methods for answering many theoretical and applied …

Linear-complexity data-parallel earth mover's distance approximations

K Atasu, T Mittelholzer - International Conference on …, 2019 - proceedings.mlr.press
The Earth Mover's Distance (EMD) is a state-of-the-art metric for comparing discrete
probability distributions, but its high distinguishability comes at a high cost in computational …

Neural word and entity embeddings for ad hoc retrieval

E Bagheri, F Ensan, F Al-Obeidat - Information Processing & Management, 2018 - Elsevier
Learning low dimensional dense representations of the vocabularies of a corpus, known as
neural embeddings, has gained much attention in the information retrieval community. While …

Integrating semantic directions with concept mover's distance to measure binary concept engagement

MA Taylor, DS Stoltz - Journal of Computational Social Science, 2021 - Springer
In an earlier article published in this journal (“Concept Mover's Distance”, 2019), we
proposed a method for measuring concept engagement in texts that uses word embeddings …

An unsupervised semantic sentence ranking scheme for text documents

H Zhang, J Wang - Integrated Computer-Aided Engineering, 2021 - content.iospress.com
This paper presents Semantic SentenceRank (SSR), an unsupervised scheme for
automatically ranking sentences in a single document according to their relative importance …

Eliminating Negative Word Similarities for Measuring Document Distances: A Thoroughly Empirical Study on Word Mover's Distance

B Cheng, X Li, Y Chang - IEEE Transactions on Neural …, 2022 - ieeexplore.ieee.org
Document distance is a fundamental yet significant research topic in the information retrieval
community, and its accuracy dominates the performance of many text retrieval applications …

Fast paraphrase extraction in Ancient Greek literature

M Pöckelmann, J Dähne, J Ritter… - it-Information Technology, 2020 - degruyter.com
In this paper, we present a method for paraphrase extraction in Ancient Greek that can be
applied to huge text corpora in interactive humanities applications. Since lexical databases …

Semantic WordRank: Generating finer single-document summarizations

H Zhang, J Wang - Intelligent Data Engineering and Automated Learning …, 2018 - Springer
We present Semantic WordRank (SWR), an unsupervised method for generating an
extractive summary of a single document. Built on a weighted word graph with semantic and …

Information retrieval with Finnish case law embeddings

S Sarsa - University of Helsinki, Department of Computer …, 2019 - seco.tkk.fi
This thesis was created in the project “Anoppi–Automatic anonymisation and content
description of documents containing personal data” funded by Finnish Ministry of Justice as …