Automated data processing and feature engineering for deep learning and big data applications: a survey
A Mumuni, F Mumuni - Journal of Information and Intelligence, 2024 - Elsevier
Modern approach to artificial intelligence (AI) aims to design algorithms that learn directly
from data. This approach has achieved impressive results and has contributed significantly …
Learning from data with structured missingness
Missing data are an unavoidable complication in many machine learning tasks. When data
are 'missing at random' there exist a range of tools and techniques to deal with the issue …
The effects of data quality on machine learning performance
L Budach, M Feuerpfeil, N Ihde, A Nathansen… - arXiv preprint arXiv …, 2022 - arxiv.org
Modern artificial intelligence (AI) applications require large quantities of training and test
data. This need creates critical challenges not only concerning the availability of such data …
Pervasive label errors in test sets destabilize machine learning benchmarks
We identify label errors in the test sets of 10 of the most commonly-used computer vision,
natural language, and audio datasets, and subsequently study the potential for these label …
DataPerf: Benchmarks for data-centric AI development
Machine learning research has long focused on models rather than datasets, and
prominent datasets are used for common ML tasks without regard to the breadth, difficulty …
A procedure for anomaly detection and analysis
Anomaly detection is often used to identify and remove outliers in datasets. However,
detecting and analyzing the pattern of outliers can contribute to future business decisions or …
UniDM: a Unified framework for data manipulation with large language models
Designing effective data manipulation methods is a long-standing problem in data lakes.
Traditional methods, which rely on rules or machine learning models, require extensive …
Sudowoodo: Contrastive self-supervised learning for multi-purpose data integration and preparation
Machine learning (ML) is playing an increasingly important role in data management tasks,
particularly in Data Integration and Preparation (DI&P). The success of ML-based …
Navigating data-centric artificial intelligence with DC-Check: Advances, challenges, and opportunities
Data-centric artificial intelligence (AI) is an emerging paradigm that emphasizes the critical
role of data in real-world machine learning (ML) systems—as a complement to model …
Machine learning-assisted data filtering and QSAR models for prediction of chemical acute toxicity on rat and mouse
Machine learning (ML) methods provide a new opportunity to build quantitative
structure-activity relationship (QSAR) models for predicting chemicals' toxicity based on …