Continuous data cleaning

M Volkovs, F Chiang, J Szlichta… - 2014 IEEE 30th …, 2014 - ieeexplore.ieee.org
In declarative data cleaning, data semantics are encoded as constraints and errors arise
when the data violates the constraints. Various forms of statistical and logical inference can …

Privacy-aware data cleaning-as-a-service

Y Huang, M Milani, F Chiang - Information Systems, 2020 - Elsevier
Data cleaning is a pervasive problem for organizations as they try to reap value from their
data. Recent advances in networking and cloud computing technology have fueled a new …

PACAS: Privacy-aware, data cleaning-as-a-service

Y Huang, M Milani, F Chiang - 2018 IEEE International …, 2018 - ieeexplore.ieee.org
Data cleaning consumes up to 80% of the data analysis pipeline. This is a significant
overhead for organizations where data cleaning is still a manually driven process requiring …

Infoclean: Protecting sensitive information in data cleaning

F Chiang, D Gairola - Journal of Data and Information Quality (JDIQ), 2018 - dl.acm.org
Data quality has become a pervasive challenge for organizations as they wrangle with large,
heterogeneous datasets to extract value. Given the proliferation of sensitive and confidential …

Keyminer: Discovering keys for graphs

M Alipourlangouri, F Chiang - VLDB workshop TD-LSG, 2018 - tdlsg-vldb18.isima.fr
Keys allow us to uniquely identify entities in a graph database. They have applications in
object identification, entity resolution, knowledge fusion, and social network reconciliation …

Contextual data cleaning

M Alipour-Langouri, Z Zheng, F Chiang… - 2018 IEEE 34th …, 2018 - ieeexplore.ieee.org
In this paper, we motivate the need to include context in data cleaning in order to account for
the subjective nature of data quality. Based on our recent work on incorporating ontologies …

A provenance-based approach to manage long term preservation of scientific data

RB Sousa, DC Cugler, JEG Malaverri… - 2014 IEEE 30th …, 2014 - ieeexplore.ieee.org
Long term preservation of scientific data goes beyond the data, and extends to metadata
preservation and curation. While several researchers emphasize curation processes, our …

Models for distributed, large scale data cleaning

VJ Maccio, F Chiang, DG Down - … in Knowledge Discovery and Data Mining …, 2014 - Springer
Poor data quality is a serious and costly problem affecting organizations across all
industries. Real data is often dirty, containing missing, erroneous, incomplete, and duplicate …

Performance evaluation of Mobile IPv6 handover extensions in an IEEE 802.11b wireless network environment

J Lai, YA Sekercioglu, N Jordan… - 11th IEEE Symposium …, 2006 - ieeexplore.ieee.org
In order to support mobile users, the basic Internet protocols have been extended with
protocols (e.g., Mobile IPv6) for intercepting and forwarding packets to a mobile and possibly …

A data quality framework for customer relationship analytics

F Chiang, S Sitaramachandran - … , Miami, FL, USA, November 1-3, 2015 …, 2015 - Springer
Poor data quality has become an increasingly pervasive problem for organizations leading
to operational inefficiency, increased costs, and missed opportunities. As high quality data is …