Leveraging biomolecule and natural language through multi-modal learning: A survey

Q Pei, L Wu, K Gao, J Zhu, Y Wang, Z Wang… - arxiv preprint arxiv …, 2024 - arxiv.org
The integration of biomolecular modeling with natural language (BL) has emerged as a
promising interdisciplinary area at the intersection of artificial intelligence, chemistry and …

Large language model to multimodal large language model: A journey to shape the biological macromolecules to biological sciences and medicine

M Bhattacharya, S Pal, S Chatterjee, SS Lee… - … Therapy-Nucleic Acids, 2024 - cell.com
After ChatGPT was released, large language models (LLMs) became more popular.
Academicians use ChatGPT or LLM models for different purposes, and the use of ChatGPT …

ProteinGPT: Multimodal LLM for protein property prediction and structure understanding

Y Xiao, E Sun, Y Jin, Q Wang, W Wang - arxiv preprint arxiv:2408.11363, 2024 - arxiv.org
Understanding biological processes, drug development, and biotechnological
advancements requires detailed analysis of protein structures and sequences, a task in …

Large knowledge model: Perspectives and challenges

H Chen - arxiv preprint arxiv:2312.02706, 2023 - arxiv.org
Humankind's understanding of the world is fundamentally linked to our perception and
cognition, with human languages serving as one of the major carriers of world …

Presto: progressive pretraining enhances synthetic chemistry outcomes

H Cao, Y Shao, Z Liu, Z Liu, X Tang, Y Yao… - arxiv preprint arxiv …, 2024 - arxiv.org
Multimodal Large Language Models (MLLMs) have seen growing adoption across various
scientific disciplines. These advancements encourage the investigation of molecule-text …

InstructBioMol: Advancing biomolecule understanding and design following human instructions

X Zhuang, K Ding, T Lyu, Y Jiang, X Li, Z Xiang… - arxiv preprint arxiv …, 2024 - arxiv.org
Understanding and designing biomolecules, such as proteins and small molecules, is
central to advancing drug discovery, synthetic biology, and enzyme engineering. Recent …

ProLLaMA: A Protein Language Model for Multi-Task Protein Language Processing

L Lv, Z Lin, H Li, Y Liu, J Cui, CYC Chen… - arxiv preprint arxiv …, 2024 - arxiv.org
Large Language Models (LLMs) have achieved remarkable performance in multiple Natural
Language Processing (NLP) tasks. Under the premise that protein sequences constitute the …

Multi-modal conditional diffusion model using signed distance functions for metal-organic frameworks generation

J Park, Y Lee, J Kim - Nature Communications, 2025 - nature.com
The design of porous materials with user-desired properties has been a great interest for the
last few decades. However, the flexibility of target properties has been highly limited, and …

Logical Consistency of Large Language Models in Fact-checking

B Ghosh, S Hasan, NA Arafat, A Khan - arxiv preprint arxiv:2412.16100, 2024 - arxiv.org
In recent years, large language models (LLMs) have demonstrated significant success in
performing varied natural language tasks such as language translation, question-answering …

EvoLlama: Enhancing LLMs' Understanding of Proteins via Multimodal Structure and Sequence Representations

N Liu, C Sun, T Ji, J Tian, J Tang, Y Wu… - arxiv preprint arxiv …, 2024 - arxiv.org
Current Large Language Models (LLMs) for understanding proteins primarily treat amino
acid sequences as a text modality. Meanwhile, Protein Language Models (PLMs), such as …