F-score driven max margin neural network for named entity recognition in Chinese social media

H He, X Sun - arXiv preprint arXiv:1611.04234, 2016 - arxiv.org
We focus on named entity recognition (NER) for Chinese social media. With massive
unlabeled text and quite limited labelled corpus, we propose a semi-supervised learning …

Rejection sampling for weighted jaccard similarity revisited

X Li, P Li - Proceedings of the AAAI Conference on Artificial …, 2021 - ojs.aaai.org
Efficiently computing the weighted Jaccard similarity has become an active research topic in
machine learning and theory. For sparse data, the standard technique is based on the …

Consistent sampling through extremal process

P Li, X Li, G Samorodnitsky, W Zhao - Proceedings of the Web …, 2021 - dl.acm.org
The Jaccard similarity has been widely used in search and machine learning, especially in
industrial practice. For binary (0/1) data, the Jaccard similarity is often called the …

Building K-Anonymous User Cohorts with Consecutive Consistent Weighted Sampling (CCWS)

X Zheng, W Zhao, X Li, P Li - Proceedings of the 46th International ACM …, 2023 - dl.acm.org
To retrieve personalized campaigns and creatives while protecting user privacy, digital
advertising is shifting from member-based identity to cohort-based identity. Under such …

A graph-based author name disambiguation method and analysis via information theory

Y Ma, Y Wu, C Lu - Entropy, 2020 - mdpi.com
Name ambiguity, due to the fact that many people share an identical name, often
deteriorates the performance of information integration, document retrieval and web search …

A graph-based approach to person name disambiguation in Web

H Emami - ACM Transactions on Management Information …, 2019 - dl.acm.org
This article presents a name disambiguation approach to resolve ambiguities between
person names and group web pages according to the individuals they refer to. The …

Enhanced unsupervised person name disambiguation to support alumni tracer study

H Toba, EA Wijaya, MC Wijanto… - Global Journal of …, 2017 - wiete.com.au
An alumni database is a valuable information source for the development of a university.
However, alumni databases tend to be incomplete. It is always possible for phone numbers …

Fuzzy agglomerative clustering

M Konkol - Artificial Intelligence and Soft Computing: 14th …, 2015 - Springer
In this paper, we describe fuzzy agglomerative clustering, a brand new fuzzy clustering
algorithm. The basic idea of the proposed algorithm is based on the well-known hierarchical …

On disambiguating authors: Collaboration network reconstruction in a bottom-up manner

N Li, R Zhu, X Zhou, X He, W Cai… - 2021 IEEE 37th …, 2021 - ieeexplore.ieee.org
Author disambiguation arises when different authors share the same name, which is a
critical task in digital libraries, such as DBLP, CiteULike, CiteSeerX, etc. While the state-of …

Person name disambiguation in the web using adaptive threshold clustering

AD Delgado, R Martínez, S Montalvo… - Journal of the …, 2017 - Wiley Online Library
In this article, we present a new clustering algorithm for Person Name Disambiguation in
web search results. The algorithm groups web results according to the individuals they refer …