F-score driven max margin neural network for named entity recognition in Chinese social media
We focus on named entity recognition (NER) for Chinese social media. With massive
unlabeled text and quite limited labelled corpus, we propose a semi-supervised learning …
Rejection sampling for weighted jaccard similarity revisited
X Li, P Li - Proceedings of the AAAI Conference on Artificial …, 2021 - ojs.aaai.org
Efficiently computing the weighted Jaccard similarity has become an active research topic in
machine learning and theory. For sparse data, the standard technique is based on the …
Consistent sampling through extremal process
The Jaccard similarity has been widely used in search and machine learning, especially in
industrial practice. For binary (0/1) data, the Jaccard similarity is often called the …
Building K-Anonymous User Cohorts with Consecutive Consistent Weighted Sampling (CCWS)
To retrieve personalized campaigns and creatives while protecting user privacy, digital
advertising is shifting from member-based identity to cohort-based identity. Under such …
A graph-based author name disambiguation method and analysis via information theory
Name ambiguity, due to the fact that many people share an identical name, often
deteriorates the performance of information integration, document retrieval and web search …
A graph-based approach to person name disambiguation in Web
H Emami - ACM Transactions on Management Information …, 2019 - dl.acm.org
This article presents a name disambiguation approach to resolve ambiguities between
person names and group web pages according to the individuals they refer to. The …
Enhanced unsupervised person name disambiguation to support alumni tracer study
An alumni database is a valuable information source for the development of a university.
However, alumni databases tend to be incomplete. It is always possible for phone numbers …
Fuzzy agglomerative clustering
M Konkol - Artificial Intelligence and Soft Computing: 14th …, 2015 - Springer
In this paper, we describe fuzzy agglomerative clustering, a brand new fuzzy clustering
algorithm. The basic idea of the proposed algorithm is based on the well-known hierarchical …
On disambiguating authors: Collaboration network reconstruction in a bottom-up manner
Author disambiguation arises when different authors share the same name, which is a
critical task in digital libraries, such as DBLP, CiteULike, CiteSeerX, etc. While the state-of …
Person name disambiguation in the web using adaptive threshold clustering
In this article, we present a new clustering algorithm for Person Name Disambiguation in
web search results. The algorithm groups web results according to the individuals they refer …