Information retrieval on the web
In this paper we review studies of the growth of the Internet and technologies that are useful
for information search and retrieval on the Web. We present data on the Internet from several …
A review on text mining
Y Zhang, M Chen, L Liu - 2015 6th IEEE International …, 2015 - ieeexplore.ieee.org
Because of large amounts of unstructured text data generated on the Internet, text mining is
believed to have high commercial value. Text mining is the process of extracting previously …
A survey of text clustering algorithms
Clustering is a widely studied data mining problem in the text domains. The problem finds
numerous applications in customer segmentation, classification, collaborative filtering …
A comparison of document clustering techniques
This paper presents the results of an experimental study of some common document
clustering techniques. In particular, we compare the two main approaches to document …
Fast and effective text mining using linear-time document clustering
B Larsen, C Aone - Proceedings of the fifth ACM SIGKDD international …, 1999 - dl.acm.org
Clustering is a powerful technique for large-scale topic discovery from text. It involves two
phases: first, feature extraction maps each document or record to a point in high …
Finding clusters of different sizes, shapes, and densities in noisy, high dimensional data
Finding clusters in data, especially high dimensional data, is challenging when the clusters
are of widely differing shapes, sizes, and densities, and when the data contains noise and …
Web document clustering: A feasibility demonstration
O Zamir, O Etzioni - Proceedings of the 21st annual international ACM …, 1998 - dl.acm.org
Users of Web search engines are often forced to sift through the long ordered list of
document “snippets” returned by the engines. The IR community has explored document …
Principal direction divisive partitioning
D Boley - Data mining and knowledge discovery, 1998 - Springer
We propose a new algorithm capable of partitioning a set of documents or other samples
based on an embedding in a high dimensional Euclidean space (i.e., in which every …
Efficient phrase-based document indexing for web document clustering
KM Hammouda, MS Kamel - IEEE Transactions on knowledge …, 2004 - ieeexplore.ieee.org
Document clustering techniques mostly rely on single term analysis of the document data
set, such as the vector space model. To achieve more accurate document clustering, more …
An evaluation on feature selection for text clustering
Feature selection methods have been successfully applied to text categorization but seldom
applied to text clustering due to the unavailability of class label information. In this paper, we …