Federated learning for generalization, robustness, fairness: A survey and benchmark

W Huang, M Ye, Z Shi, G Wan, H Li… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org
Federated learning has emerged as a promising paradigm for privacy-preserving
collaboration among different parties. Recently, with the popularity of federated learning, an …

Data Banzhaf: A robust data valuation framework for machine learning

JT Wang, R Jia - International Conference on Artificial …, 2023 - proceedings.mlr.press
Data valuation has wide use cases in machine learning, including improving data quality
and creating economic incentives for data sharing. This paper studies the robustness of data …

Data-driven learning for data rights, data pricing, and privacy computing

J Xu, N Hong, Z Xu, Z Zhao, C Wu, K Kuang, J Wang… - Engineering, 2023 - Elsevier
In recent years, data has become one of the most important resources in the digital
economy. Unlike traditional resources, the digital nature of data makes it difficult to value …

DataInf: Efficiently estimating data influence in LoRA-tuned LLMs and diffusion models

Y Kwon, E Wu, K Wu, J Zou - arXiv preprint arXiv:2310.00902, 2023 - arxiv.org
Quantifying the impact of training data points is crucial for understanding the outputs of
machine learning models and for improving the transparency of the AI pipeline. The …

Synthetic sample selection for generalized zero-shot learning

SN Gowda - Proceedings of the IEEE/CVF conference on …, 2023 - openaccess.thecvf.com
Generalized Zero-Shot Learning (GZSL) has emerged as a pivotal research domain
in computer vision, owing to its capability to recognize objects that have not been seen …

What is your data worth to GPT? LLM-scale data valuation with influence functions

SK Choe, H Ahn, J Bae, K Zhao, M Kang… - arXiv preprint arXiv …, 2024 - arxiv.org
Large language models (LLMs) are trained on a vast amount of human-written data, but data
providers often remain uncredited. In response to this issue, data valuation (or data …

A privacy-friendly approach to data valuation

JT Wang, Y Zhu, YX Wang, R Jia… - Advances in Neural …, 2023 - proceedings.neurips.cc
Data valuation, a growing field that aims at quantifying the usefulness of individual data
sources for training machine learning (ML) models, faces notable yet often overlooked …

Data valuation without training of a model

K Nohyun, H Choi, HW Chung - The Eleventh International …, 2022 - openreview.net
Many recent works on understanding deep learning try to quantify how much individual data
instances influence the optimization and generalization of a model. Such attempts reveal …

PINNACLE: PINN Adaptive ColLocation and Experimental points selection

GKR Lau, A Hemachandra, SK Ng… - arXiv preprint arXiv …, 2024 - arxiv.org
Physics-Informed Neural Networks (PINNs), which incorporate PDEs as soft constraints,
train with a composite loss function that contains multiple training point types: different types …

Performance scaling via optimal transport: Enabling data selection from partially revealed sources

F Kang, HA Just, AK Sahu, R Jia - Advances in Neural …, 2023 - proceedings.neurips.cc
Traditionally, data selection has been studied in settings where all samples from prospective
sources are fully revealed to a machine learning developer. However, in practical data …