Perspectives on the integration between first-principles and data-driven modeling

W Bradley, J Kim, Z Kilwein, L Blakely… - Computers & Chemical …, 2022 - Elsevier
Efficiently embedding and/or integrating mechanistic information with data-driven models is
essential for simultaneously taking advantage of both engineering principles and …
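
As a minimal illustration of one common integration pattern surveyed in such perspectives (a hybrid model where a neural network corrects a first-principles prediction), here is a sketch; the toy mechanistic model (first-order kinetics) and all names are illustrative assumptions, not from the paper.

```python
# Sketch of a parallel hybrid model: a first-principles prediction plus a
# learned data-driven correction. The mechanistic part (first-order reaction
# kinetics) and all module names are illustrative assumptions.
import torch
import torch.nn as nn

def mechanistic_rate(c, k=0.5):
    """First-principles part: first-order reaction rate r = k * c."""
    return k * c

class HybridModel(nn.Module):
    def __init__(self, hidden=32):
        super().__init__()
        # Data-driven part: a small MLP that learns the residual between
        # the observed rate and the mechanistic approximation.
        self.correction = nn.Sequential(
            nn.Linear(1, hidden), nn.Tanh(), nn.Linear(hidden, 1)
        )

    def forward(self, c):
        return mechanistic_rate(c) + self.correction(c)

model = HybridModel()
c = torch.rand(8, 1)      # concentrations
r_pred = model(c)         # hybrid rate prediction
```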

MiniLLM: Knowledge distillation of large language models

Y Gu, L Dong, F Wei, M Huang - arXiv preprint arXiv:2306.08543, 2023 - arxiv.org
Knowledge Distillation (KD) is a promising technique for reducing the high computational
demand of large language models (LLMs). However, previous KD methods are primarily …
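
For context, the classic token-level KD objective minimizes the forward KL from the teacher's next-token distribution to the student's; MiniLLM argues for the reverse direction, optimized in the paper with policy-gradient methods over sampled sequences. A minimal sketch of both directions in dense form (names are illustrative):

```python
# Token-level knowledge distillation losses. Forward KL(teacher || student)
# is the classic KD objective; MiniLLM argues for the reverse direction,
# KL(student || teacher), which it optimizes with policy gradients over
# sampled sequences rather than this dense form. Names are illustrative.
import torch
import torch.nn.functional as F

def forward_kl(teacher_logits, student_logits):
    # KL(p_teacher || p_student), averaged over positions.
    p_t = F.softmax(teacher_logits, dim=-1)
    log_p_t = F.log_softmax(teacher_logits, dim=-1)
    log_p_s = F.log_softmax(student_logits, dim=-1)
    return (p_t * (log_p_t - log_p_s)).sum(-1).mean()

def reverse_kl(teacher_logits, student_logits):
    # KL(p_student || p_teacher), averaged over positions.
    p_s = F.softmax(student_logits, dim=-1)
    log_p_s = F.log_softmax(student_logits, dim=-1)
    log_p_t = F.log_softmax(teacher_logits, dim=-1)
    return (p_s * (log_p_s - log_p_t)).sum(-1).mean()
```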

Tutorial: Deriving the standard variational autoencoder (VAE) loss function

S Odaibo - arXiv preprint arXiv:1907.08956, 2019 - arxiv.org
In Bayesian machine learning, the posterior distribution is typically computationally
intractable, hence variational inference is often required. In this approach, an evidence …
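
For reference, the standard objective the tutorial derives is the evidence lower bound (ELBO), obtained by replacing the intractable posterior with a variational approximation:

```latex
% The intractable posterior p(z|x) is approximated by q_phi(z|x):
\log p_\theta(x)
  = \mathbb{E}_{q_\phi(z\mid x)}\big[\log p_\theta(x\mid z)\big]
  - D_{\mathrm{KL}}\big(q_\phi(z\mid x)\,\|\,p(z)\big)
  + D_{\mathrm{KL}}\big(q_\phi(z\mid x)\,\|\,p_\theta(z\mid x)\big)
% Dropping the last (nonnegative, intractable) term gives the ELBO;
% the standard VAE loss is its negation:
\mathcal{L}_{\mathrm{VAE}}(\theta,\phi;x)
  = -\,\mathbb{E}_{q_\phi(z\mid x)}\big[\log p_\theta(x\mid z)\big]
  + D_{\mathrm{KL}}\big(q_\phi(z\mid x)\,\|\,p(z)\big)
```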

Adversarial uncertainty quantification in physics-informed neural networks

Y Yang, P Perdikaris - Journal of Computational Physics, 2019 - Elsevier
We present a deep learning framework for quantifying and propagating uncertainty in
systems governed by non-linear differential equations using physics-informed neural …
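
The physics-informed backbone of such a framework penalizes the differential-equation residual computed by automatic differentiation; below is a minimal deterministic sketch of that residual term only (the viscous Burgers equation and all names are illustrative choices, and the paper's latent-variable and adversarial machinery for uncertainty quantification is omitted):

```python
# Minimal PDE-residual loss for a physics-informed network in PyTorch.
# The viscous Burgers equation u_t + u*u_x - nu*u_xx = 0 is illustrative;
# the paper builds adversarial, latent-variable training on top of such
# residual terms to quantify and propagate uncertainty.
import torch

net = torch.nn.Sequential(
    torch.nn.Linear(2, 64), torch.nn.Tanh(), torch.nn.Linear(64, 1)
)

def pde_residual_loss(x, t, nu=0.01):
    x = x.requires_grad_(True)
    t = t.requires_grad_(True)
    u = net(torch.cat([x, t], dim=1))
    grad = lambda y, v: torch.autograd.grad(
        y, v, grad_outputs=torch.ones_like(y), create_graph=True)[0]
    u_t = grad(u, t)
    u_x = grad(u, x)
    u_xx = grad(u_x, x)
    return ((u_t + u * u_x - nu * u_xx) ** 2).mean()

loss = pde_residual_loss(torch.rand(128, 1), torch.rand(128, 1))
```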

Variational discriminator bottleneck: Improving imitation learning, inverse RL, and GANs by constraining information flow

XB Peng, A Kanazawa, S Toyer, P Abbeel… - arXiv preprint arXiv …, 2018 - arxiv.org
Adversarial learning methods have been proposed for a wide range of applications, but the
training of adversarial models can be notoriously unstable. Effectively balancing the …
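
The core mechanism is an information bottleneck on the discriminator: inputs are encoded into a stochastic embedding whose KL divergence to a prior is held near a budget I_c by a dual (Lagrange-multiplier) update. A minimal sketch, with illustrative names:

```python
# Sketch of the variational discriminator bottleneck: the discriminator
# operates on a stochastic encoding z ~ N(mu(x), diag(exp(logvar(x)))), and
# the KL to a standard-normal prior is kept near a budget Ic via a dual
# variable beta. Variable names are illustrative.
import torch

def kl_to_standard_normal(mu, logvar):
    # KL( N(mu, exp(logvar)) || N(0, I) ), per sample.
    return 0.5 * (mu.pow(2) + logvar.exp() - 1.0 - logvar).sum(dim=1)

def vdb_step(mu, logvar, disc_loss, beta, Ic=0.5, beta_lr=1e-5):
    kl = kl_to_standard_normal(mu, logvar).mean()
    total = disc_loss + beta * (kl - Ic)            # bottlenecked objective
    # Dual ascent on beta keeps the expected KL close to the budget Ic.
    new_beta = max(0.0, beta + beta_lr * (kl - Ic).item())
    return total, new_beta
```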

Wasserstein contrastive representation distillation

L Chen, D Wang, Z Gan, J Liu… - Proceedings of the …, 2021 - openaccess.thecvf.com
The primary goal of knowledge distillation (KD) is to encapsulate the information of a model
learned from a teacher network into a student network, with the latter being more compact …
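
A common contrastive-distillation backbone, which this work extends with a Wasserstein-distance formulation, matches teacher and student features with an InfoNCE-style loss. The generic baseline is sketched below (not the paper's exact objective; names are illustrative):

```python
# Generic contrastive representation-distillation loss: InfoNCE between
# L2-normalized teacher and student features, with in-batch negatives.
# The paper augments this kind of objective with Wasserstein-distance-based
# matching; this sketch is only the contrastive baseline.
import torch
import torch.nn.functional as F

def contrastive_kd_loss(f_student, f_teacher, tau=0.1):
    s = F.normalize(f_student, dim=1)
    t = F.normalize(f_teacher, dim=1)
    logits = s @ t.T / tau                  # (B, B) similarity matrix
    labels = torch.arange(s.size(0))        # positives on the diagonal
    return F.cross_entropy(logits, labels)
```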

Adversarial text generation via feature-mover's distance

L Chen, S Dai, C Tao, H Zhang, Z Gan… - Advances in neural …, 2018 - proceedings.neurips.cc
Generative adversarial networks (GANs) have achieved significant success in generating
real-valued data. However, the discrete nature of text hinders the application of GAN to text …
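
Feature-mover's distance is an optimal-transport cost between sets of latent sentence features; an entropically regularized variant can be approximated with a few Sinkhorn iterations. The sketch below is only an illustrative stand-in (the paper uses its own OT solver, and all names here are assumptions):

```python
# Entropic optimal-transport cost between two feature sets via Sinkhorn
# iterations, as an illustrative stand-in for feature-mover's distance.
# The regularization strength and iteration count are arbitrary choices.
import torch

def sinkhorn_distance(f_real, f_fake, eps=0.1, iters=50):
    C = torch.cdist(f_real, f_fake)          # pairwise cost matrix (n, m)
    n, m = C.shape
    mu = torch.full((n,), 1.0 / n)           # uniform marginals
    nu = torch.full((m,), 1.0 / m)
    K = torch.exp(-C / eps)
    u = torch.ones(n)
    for _ in range(iters):                   # Sinkhorn scaling updates
        v = nu / (K.T @ u)
        u = mu / (K @ v)
    P = torch.diag(u) @ K @ torch.diag(v)    # approximate transport plan
    return (P * C).sum()
```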

Triangle generative adversarial networks

Z Gan, L Chen, W Wang, Y Pu… - Advances in neural …, 2017 - proceedings.neurips.cc
A Triangle Generative Adversarial Network ($\Delta$-GAN) is developed for semi-supervised
cross-domain joint distribution matching, where the training data consists of …

Learning latent representations across multiple data domains using lifelong VAEGAN

F Ye, AG Bors - European Conference on Computer Vision, 2020 - Springer
The problem of catastrophic forgetting occurs in deep learning models trained on multiple
databases in a sequential manner. Recently, generative replay mechanisms (GRM) have …

On unifying deep generative models

Z Hu, Z Yang, R Salakhutdinov, EP Xing - arXiv preprint arXiv:1706.00550, 2017 - arxiv.org
Deep generative models have achieved impressive success in recent years. Generative
Adversarial Networks (GANs) and Variational Autoencoders (VAEs), as emerging families …