Scientific large language models: A survey on biological & chemical domains

Q Zhang, K Ding, T Lv, X Wang, Q Yin, Y Zhang… - ACM Computing …, 2024 - dl.acm.org
Large Language Models (LLMs) have emerged as a transformative power in enhancing
natural language comprehension, representing a significant stride toward artificial general …

A review of large language models and autonomous agents in chemistry

MC Ramos, CJ Collison, AD White - Chemical Science, 2025 - pubs.rsc.org
Large language models (LLMs) have emerged as powerful tools in chemistry, significantly
impacting molecule design, property prediction, and synthesis optimization. This review …

Artificial intelligence for science in quantum, atomistic, and continuum systems

X Zhang, L Wang, J Helwig, Y Luo, C Fu, Y Xie… - arXiv preprint arXiv …, 2023 - arxiv.org
Advances in artificial intelligence (AI) are fueling a new paradigm of discoveries in natural
sciences. Today, AI has started to advance natural sciences by improving, accelerating, and …

BioT5: Enriching cross-modal integration in biology with chemical knowledge and natural language associations

Q Pei, W Zhang, J Zhu, K Wu, K Gao, L Wu… - arXiv preprint arXiv …, 2023 - arxiv.org
Recent advancements in biological research leverage the integration of molecules, proteins,
and natural language to enhance drug discovery. However, current models exhibit several …

BioMedGPT: Open multimodal generative pre-trained transformer for biomedicine

Y Luo, J Zhang, S Fan, K Yang, Y Wu, M Qiao… - arXiv preprint arXiv …, 2023 - arxiv.org
Foundation models (FMs) have exhibited remarkable performance across a wide range of
downstream tasks in many domains. Nevertheless, general-purpose FMs often face …

BioT5+: Towards generalized biological understanding with IUPAC integration and multi-task tuning

Q Pei, L Wu, K Gao, X Liang, Y Fang, J Zhu… - arXiv preprint arXiv …, 2024 - arxiv.org
Recent research trends in computational biology have increasingly focused on integrating
text and bio-entity modeling, especially in the context of molecules and proteins. However …

Leveraging biomolecule and natural language through multi-modal learning: A survey

Q Pei, L Wu, K Gao, J Zhu, Y Wang, Z Wang… - arXiv preprint arXiv …, 2024 - arxiv.org
The integration of biomolecular modeling with natural language (BL) has emerged as a
promising interdisciplinary area at the intersection of artificial intelligence, chemistry and …

L+M-24: Building a dataset for Language+Molecules @ ACL 2024

C Edwards, Q Wang, L Zhao, H Ji - arXiv preprint arXiv:2403.00791, 2024 - arxiv.org
Language-molecule models have emerged as an exciting direction for molecular discovery
and understanding. However, training these models is challenging due to the scarcity of …

Pyridine-induced caused structural reconfiguration forming ultrathin 2D metal–organic frameworks for the oxygen evolution reaction

Y Liu, S Deng, S Fu, X Wang, G Liu… - Journal of Materials …, 2024 - pubs.rsc.org
Two-dimensional metal–organic frameworks (2D MOFs) as an ideal prototype material for
the electrocatalytic oxygen evolution reaction (OER) can expose more metal active sites due …

LangCell: Language-cell pre-training for cell identity understanding

S Zhao, J Zhang, Y Wu, Y Luo, Z Nie - arXiv preprint arXiv:2405.06708, 2024 - arxiv.org
Cell identity encompasses various semantic aspects of a cell, including cell type, pathway
information, disease information, and more, which are essential for biologists to gain insights …