[BOOK][B] The data matching process

P Christen - 2012 - Springer
This chapter provides an overview of the data matching process, and describes the five
major steps involved in this process: data pre-processing (cleaning and standardisation) …

Data-Centric Systems and Applications

MJ Carey, S Ceri, P Bernstein, U Dayal, C Faloutsos… - Italy: Springer, 2006 - Springer
The rapid growth of the Web in the past two decades has made it the largest publicly
accessible data source in the world. Web mining aims to discover useful information or …

Kolmogorov complexity as a data similarity metric: application in mitochondrial DNA

R Antão, A Mota, JAT Machado - Nonlinear Dynamics, 2018 - Springer
The problem of developing a similarity index for different objects is discussed. The
limitations of current metrics are evaluated and discussed. The normalized compression …

Comparison of compression-based measures with application to the evolution of primate genomes

D Pratas, RM Silva, AJ Pinho - Entropy, 2018 - mdpi.com
An efficient DNA compressor furnishes an approximation to measure and compare
information quantities present in, between and across DNA sequences, regardless of the …

[PDF][PDF] Visual analytics of social media for situation awareness

D Thom - 2015 - researchgate.net
“It's after 2001. Where is HAL?” This question was asked in 2007 by cognitive science pioneer
Marvin Minsky in a talk about the state of artificial intelligence (AI). He was referring to the …

Artificial intelligence system employing multimodal learning for analyzing entity record relationships

X Chen, L Wang, A Dutta - US Patent 11,423,072, 2022 - Google Patents
Respective text feature sets and non-text feature sets are generated corresponding to
individual pairs of a plurality of record pairs. At least one text feature is based on whether a …

Finding a boundary between valid and invalid regions of the input space

B Marculescu, R Feldt - 2018 25th Asia-Pacific Software …, 2018 - ieeexplore.ieee.org
In the context of robustness testing, the boundary between the valid and invalid regions of
the input space can be an interesting source of erroneous inputs. Knowing where a specific …

Gem: Translation-free zero-shot global entity matcher for global catalogs

K Bouyarmane - Proceedings of the 27th ACM SIGKDD Conference on …, 2021 - dl.acm.org
We propose a modular BiLSTM/CNN/Transformer deep-learning encoder architecture,
together with a data synthesis and training approach, to solve the problem of matching …

Comparison plagiarism search algorithms implementations

S Vashchilin, H Kushnir - 2017 2nd international conference on …, 2017 - ieeexplore.ieee.org
This article familiarizes the reader with metrics that reveal the disadvantages and
advantages of plagiarism search algorithms. The reviewed algorithms have the most practical …

Iterative machine learning based techniques for value-based defect analysis in large data sets

Y Xu, X Wang, M Wichterich - US Patent 11,620,558, 2023 - Google Patents
G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY
ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR …