Big data analytics: A review on theoretical contributions and tools used in literature

P Grover, AK Kar - Global Journal of Flexible Systems Management, 2017 - Springer
The importance of data science and big data analytics is growing very fast as organizations
are gearing up to leverage their information assets to gain competitive advantage. The …

Analytical platform based on Jbowl library providing text-mining services in distributed environment

M Sarnovský, P Butka, P Bednár, F Babič… - … Technology: Third IFIP …, 2015 - Springer
The paper presents Jbowl, a Java software library for data and text analysis, and various
research activities performed and implemented on top of the library. The paper describes the …

Reduction of concepts from generalized one-sided concept lattice based on subsets quality measure

P Butka, J Pócs, J Pócsová - New Research in Multimedia and Internet …, 2015 - Springer
One of the conceptual methods in the data mining area is based on one-sided concept
lattices, which belong to the approaches known as Formal Concept Analysis (FCA). It provides …

Assessment of surface water quality using a growing hierarchical self-organizing map: a case study of the Songhua River Basin, northeastern China, from 2011 to …

M Jiang, Y Wang, Q Yang, F Meng, Z Yao… - Environmental monitoring …, 2018 - Springer
The analysis of a large volume of multidimensional surface water monitoring data for
extracting potential information plays an important role in water quality management. In this …

Distributed boosting algorithm for classification of text documents

M Sarnovsky, M Vronc - 2014 IEEE 12th international …, 2014 - ieeexplore.ieee.org
The presented paper focuses on the analysis and classification of textual documents. We
present the classification of documents based on a boosting method applied to the decision …

Bivariate, cluster, and suitability analysis of NoSQL solutions for big graph applications

S Khan, X Liu, SA Ali, M Alam - Advances in Computers, 2023 - Elsevier
With the explosion of social media, the Web, Internet of Things, and the proliferation of smart
devices, large amounts of data are being generated each day. However, traditional data …

Distributed algorithm for text documents clustering based on k-means approach

M Sarnovsky, N Carnoka - … of 36th International Conference on Information …, 2016 - Springer
The presented paper describes the design and implementation of a distributed k-means
clustering algorithm for text document analysis. Motivation for the research effort presented …

Big Data Bot with a Special Reference to Bioinformatics.

AM Al-Omari, SM Tawalbeh, YH Akkam… - … Materials & Continua, 2023 - cdn.techscience.cn
There are quintillions of data on deoxyribonucleic acid (DNA) and protein in publicly
accessible data banks, and that number is expanding at an exponential rate. Many scientific …

Improvement in the efficiency of a distributed multi-label text classification algorithm using infrastructure and task-related data

M Sarnovsky, M Olejnik - Informatics, 2019 - mdpi.com
Distributed computing technologies allow a wide variety of tasks that use large amounts of
data to be solved. Various paradigms and technologies are already widely used, but many …

Multi-Tenanted Framework: Distributed Near Duplicate Detection for Big Data

P Kathiravelu, H Galhardas, L Veiga - … "On the Move to Meaningful Internet …, 2015 - Springer
Near duplicate detection algorithms have been proposed and implemented in order to
detect and eliminate duplicate entries from massive datasets. Due to the differences in data …