BIRCH: an efficient data clustering method for very large databases

T Zhang, R Ramakrishnan, M Livny - ACM sigmod record, 1996 - dl.acm.org
Finding useful patterns in large datasets has attracted considerable interest recently, and
one of the most widely studied problems in this area is the identification of clusters, or …

[BOOK] Physical Database Design: the database professional's guide to exploiting indexes, views, storage, and more

SS Lightstone, TJ Teorey, T Nadeau - 2010 - books.google.com
The rapidly increasing volume of information contained in relational databases places a
strain on databases, performance, and maintainability: DBAs are under greater pressure …

Experiments in parallel clustering with DBSCAN

D Arlia, M Coppola - Euro-Par 2001 Parallel Processing: 7th International …, 2001 - Springer
We present a new result concerning the parallelisation of DBSCAN, a Data Mining algorithm
for density-based spatial clustering. The overall structure of DBSCAN has been mapped to a …

Combining partitional and hierarchical algorithms for robust and efficient data clustering with cohesion self-merging

CR Lin, MS Chen - IEEE Transactions on Knowledge and Data …, 2005 - ieeexplore.ieee.org
Data clustering has attracted a lot of research attention in the field of computational statistics
and data mining. In most related studies, the dissimilarity between two clusters is defined as …

Analyzing popular clustering algorithms from different viewpoints

W Qian, A Zhou - Journal of software, 2002 - Citeseer
Clustering is widely studied in the data mining community. It is used to partition a data set into
clusters so that intra-cluster data are similar and inter-cluster data are dissimilar. Different …

Pixnostics: Towards measuring the value of visualization

J Schneidewind, M Sips… - 2006 IEEE Symposium On …, 2006 - ieeexplore.ieee.org
During the last two decades a wide variety of advanced methods for the visual exploration of
large data sets have been proposed. For most of these techniques user interaction has …

Using self-similarity to cluster large data sets

D Barbará, P Chen - Data Mining and Knowledge Discovery, 2003 - Springer
Clustering is a widely used knowledge discovery technique. It helps uncover structures in
data that were not previously known. The clustering of large data sets has received a lot of …

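Clustering methods in the data mining field (1): Let's try using clustering!

神嶌敏弘 (Toshihiro Kamishima) - 人工知能 (Journal of the Japanese Society for Artificial Intelligence), 2003 - jstage.jst.go.jp
This article introduces, in two parts, the latest methods for clustering, a representative data analysis technique.
Clustering concerns internal cohesion and external isolation …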

Business process impact visualization and anomaly detection

MC Hao, DA Keim, U Dayal… - Information …, 2006 - journals.sagepub.com
Business operations involve many factors and relationships and are modeled as complex
business process workflows. The execution of these business processes generates vast …

An efficient clustering algorithm for market basket data based on small large ratios

CH Yun, KT Chuang, MS Chen - 25th Annual International …, 2001 - ieeexplore.ieee.org
In this paper we devise an efficient algorithm for clustering market-basket data items. In view
of the nature of clustering market basket data, we devise in this paper a novel measurement …