Reusing deep learning models: Challenges and directions in software engineering

JC Davis, P Jajal, W Jiang… - 2023 IEEE John …, 2023 - ieeexplore.ieee.org
Deep neural networks (DNNs) achieve state-of-the-art performance in many areas, including
computer vision, system configuration, and question-answering. However, DNNs are …

What do we know about Hugging Face? A systematic literature review and quantitative validation of qualitative claims

J Jones, W Jiang, N Synovic, G Thiruvathukal… - Proceedings of the 18th …, 2024 - dl.acm.org
Background: Software Package Registries (SPRs) are an integral part of the software supply
chain. These collaborative platforms unite contributors, users, and code for streamlined …

An empirical study of pre-trained model reuse in the Hugging Face deep learning model registry

W Jiang, N Synovic, M Hyatt… - 2023 IEEE/ACM 45th …, 2023 - ieeexplore.ieee.org
Deep Neural Networks (DNNs) are being adopted as components in software systems.
Creating and specializing DNNs from scratch has grown increasingly difficult as state-of-the …

Signing in four public software package registries: Quantity, quality, and influencing factors

TR Schorlemmer, KG Kalu, L Chigges… - … IEEE Symposium on …, 2024 - ieeexplore.ieee.org
Many software applications incorporate open-source third-party packages distributed by
public package registries. Guaranteeing authorship along this supply chain is a challenge …

BOMs away! Inside the minds of stakeholders: A comprehensive study of bills of materials for software systems

T Stalnaker, N Wintersgill, O Chaparro… - Proceedings of the 46th …, 2024 - dl.acm.org
Software Bills of Materials (SBOMs) have emerged as tools to facilitate the management of
software dependencies, vulnerabilities, licenses, and the supply chain. While significant …

Large language model supply chain: A research agenda

S Wang, Y Zhao, X Hou, H Wang - ACM Transactions on Software …, 2024 - dl.acm.org
The rapid advancement of large language models (LLMs) has revolutionized artificial
intelligence, introducing unprecedented capabilities in natural language processing and …

Models are codes: Towards measuring malicious code poisoning attacks on pre-trained model hubs

J Zhao, S Wang, Y Zhao, X Hou, K Wang… - Proceedings of the 39th …, 2024 - dl.acm.org
The proliferation of pre-trained models (PTMs) and datasets has led to the emergence of
centralized model hubs like Hugging Face, which facilitate collaborative development and …

Challenges and practices of deep learning model reengineering: A case study on computer vision

W Jiang, V Banna, N Vivek, A Goel, N Synovic… - Empirical Software …, 2024 - Springer
Context: Many engineering organizations are reimplementing and extending deep neural
networks from the research community. We describe this process as deep learning model …

PTMTorrent: A dataset for mining open-source pre-trained model packages

W Jiang, N Synovic, P Jajal… - 2023 IEEE/ACM 20th …, 2023 - ieeexplore.ieee.org
Due to the cost of developing and training deep learning models from scratch, machine
learning engineers have begun to reuse pre-trained models (PTMs) and fine-tune them for …

Ecosystem of large language models for code

Z Yang, J Shi, P Devanbu, D Lo - arXiv preprint arXiv:2405.16746, 2024 - arxiv.org
The availability of vast amounts of publicly accessible data of source code and the advances
in modern language models, coupled with increasing computational resources, have led to …