The family of MapReduce and large-scale data processing systems

S Sakr, A Liu, AG Fayoumi - ACM Computing Surveys (CSUR), 2013 - dl.acm.org
In the last two decades, the continuous increase of computational power has produced an
overwhelming flow of data which has called for a paradigm shift in the computing …

Scalable clustering algorithms for big data: A review

MA Mahdi, KM Hosny, I Elhenawy - IEEE Access, 2021 - ieeexplore.ieee.org
Clustering algorithms have become one of the most critical research areas in multiple
domains, especially data mining. However, with the massive growth of big data applications …

Graph based anomaly detection and description: a survey

L Akoglu, H Tong, D Koutra - Data mining and knowledge discovery, 2015 - Springer
Detecting anomalies in data is a vital task, with numerous high-impact applications in areas
such as security, finance, health care, and law enforcement. While numerous techniques …

Data mining with big data

X Wu, X Zhu, GQ Wu, W Ding - IEEE transactions on …, 2013 - ieeexplore.ieee.org
Big Data concern large-volume, complex, growing data sets with multiple, autonomous
sources. With the fast development of networking, data storage, and the data collection …

Social influence analysis in large-scale networks

J Tang, J Sun, C Wang, Z Yang - Proceedings of the 15th ACM SIGKDD …, 2009 - dl.acm.org
In large social networks, nodes (users, entities) are influenced by others for various reasons.
For example, the colleagues have strong influence on one's work, while the friends have …

Copycatch: stopping group attacks by spotting lockstep behavior in social networks

A Beutel, W Xu, V Guruswami, C Palow… - Proceedings of the 22nd …, 2013 - dl.acm.org
How can web services that depend on user generated content discern fraudulent input by
spammers from legitimate input? In this paper we focus on the social network Facebook and …

Pegasus: A peta-scale graph mining system implementation and observations

U Kang, CE Tsourakakis… - 2009 Ninth IEEE …, 2009 - ieeexplore.ieee.org
In this paper, we describe PEGASUS, an open source peta graph mining library which
performs typical graph mining tasks such as computing the diameter of the graph, computing …

Gigatensor: scaling tensor analysis up by 100 times - algorithms and discoveries

U Kang, E Papalexakis, A Harpale… - Proceedings of the 18th …, 2012 - dl.acm.org
Many data are modeled as tensors, or multi dimensional arrays. Examples include the
predicates (subject, verb, object) in knowledge bases, hyperlinks and anchor texts in the …

MapReduce algorithms for big data analysis

K Shim - International workshop on databases in networked …, 2013 - Springer
As there is an increasing trend of applications being expected to deal with big data that
usually do not fit in the main memory of a single machine, analyzing big data is a …

Slashburn: Graph compression and mining beyond caveman communities

Y Lim, U Kang, C Faloutsos - IEEE Transactions on Knowledge …, 2014 - ieeexplore.ieee.org
Given a real world graph, how should we lay-out its edges? How can we compress it? These
questions are closely related, and the typical approach so far is to find clique-like …