Challenges and applications of large language models
Large Language Models (LLMs) went from non-existent to ubiquitous in the machine
learning discourse within a few years. Due to the fast pace of the field, it is difficult to identify …
learning discourse within a few years. Due to the fast pace of the field, it is difficult to identify …
Scientific large language models: A survey on biological & chemical domains
Large Language Models (LLMs) have emerged as a transformative power in enhancing
natural language comprehension, representing a significant stride toward artificial general …
natural language comprehension, representing a significant stride toward artificial general …
De novo design of protein structure and function with RFdiffusion
There has been considerable recent progress in designing new proteins using deep-
learning methods,,,,,,,–. Despite this progress, a general deep-learning framework for protein …
learning methods,,,,,,,–. Despite this progress, a general deep-learning framework for protein …
Bloomberggpt: A large language model for finance
The use of NLP in the realm of financial technology is broad and complex, with applications
ranging from sentiment analysis and named entity recognition to question answering. Large …
ranging from sentiment analysis and named entity recognition to question answering. Large …
Fast and accurate protein structure search with Foldseek
As structure prediction methods are generating millions of publicly available protein
structures, searching these databases is becoming a bottleneck. Foldseek aligns the …
structures, searching these databases is becoming a bottleneck. Foldseek aligns the …
Galactica: A large language model for science
Information overload is a major obstacle to scientific progress. The explosive growth in
scientific literature and data has made it ever harder to discover useful insights in a large …
scientific literature and data has made it ever harder to discover useful insights in a large …
Hyenadna: Long-range genomic sequence modeling at single nucleotide resolution
Genomic (DNA) sequences encode an enormous amount of information for gene regulation
and protein synthesis. Similar to natural language models, researchers have proposed …
and protein synthesis. Similar to natural language models, researchers have proposed …
OpenFold: Retraining AlphaFold2 yields new insights into its learning mechanisms and capacity for generalization
AlphaFold2 revolutionized structural biology with the ability to predict protein structures with
exceptionally high accuracy. Its implementation, however, lacks the code and data required …
exceptionally high accuracy. Its implementation, however, lacks the code and data required …
Automated model building and protein identification in cryo-EM maps
Interpreting electron cryo-microscopy (cryo-EM) maps with atomic models requires high
levels of expertise and labour-intensive manual intervention in three-dimensional computer …
levels of expertise and labour-intensive manual intervention in three-dimensional computer …
Foldseek: fast and accurate protein structure search
Highly accurate structure prediction methods are generating an avalanche of publicly
available protein structures. Searching through these structures is becoming the main …
available protein structures. Searching through these structures is becoming the main …