A "ChatGPT moment for biology": What is ESM3?
EvolutionaryScale, backed by Amazon and Nvidia, raises $142M for protein-generating AI
Hello Everyone,
I hope you are enjoying your summer. This week EvolutionaryScale launched their cutting-edge ESM3 model for researchers and scientists.
The company has raised $142 million for protein-generating AI from backers including Nvidia and Amazon, which could make this an important milestone for Generative AI in biotech.
Why is this Important?
EvolutionaryScale, a frontier AI research lab for biology, launched on June 25th, 2024 with ESM3, a milestone AI model capable of generating novel proteins.
ESM3 is the third-generation ESM model. What makes it unique is that it simultaneously reasons over the sequence, structure, and function of proteins, giving engineers working on protein discovery a programmable platform.
Alexander Rives, one of the company's founders, is a “Meta-AI Mafia member”. In August 2023, Meta chose to disband its LLM team working on biology models. What a titanic mistake that would end up being.
Frontier AI for the Life Sciences
It was no surprise, then, to see Yann LeCun post on LinkedIn about the company.
Using ESM3 and a simulated evolutionary process, the team has produced a new type of GFP (Green Fluorescent Protein) different from anything found in nature.
Synthetic Bio meets LLMs
Biology is fundamentally programmable. Every living organism shares the same genetic code across the same 20 amino acids—life’s alphabet. ESM3 understands all of this biological data, translates it, and speaks it fluently to be used as a generative tool.
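To make the "alphabet" idea concrete, here is a minimal sketch of how a protein sequence can be turned into the kind of integer tokens a language model reads. The vocabulary, the `encode` and `decode` helpers, and the example fragment are all illustrative assumptions; this is not ESM3's actual tokenizer, which also has to handle structure and function.

```python
# The 20 standard amino acids, written with their one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Hypothetical vocabulary for illustration: letter -> integer id.
# This is NOT ESM3's real tokenizer.
TOKEN_IDS = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def encode(sequence: str) -> list[int]:
    """Turn a protein sequence into a list of integer tokens."""
    return [TOKEN_IDS[aa] for aa in sequence.upper()]


def decode(tokens: list[int]) -> str:
    """Turn integer tokens back into a protein sequence."""
    return "".join(AMINO_ACIDS[t] for t in tokens)


# A short made-up fragment, used only to show the round trip.
fragment = "MSKGEELFTG"
tokens = encode(fragment)
print(tokens)
print(decode(tokens))
```

Once a protein is a list of tokens like this, "generating a new protein" looks a lot like an LLM generating a new sentence, one token at a time.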
ESM3 is a generative language model for programming biology. In experiments, we found ESM3 can simulate 500M years of evolution to generate new fluorescent proteins.
ESM3: A frontier language model for biology
🟢 They announced ESM3, the first generative model for biology that simultaneously reasons over the sequence, structure, and function of proteins.
ESM3 is trained across the natural diversity of the Earth—billions of proteins, from the Amazon rainforest, to the depths of the oceans, extreme environments like hydrothermal vents, and the microbes in a handful of soil.
🟢 Trained on one of the highest throughput GPU clusters in the world today, ESM3 is a frontier generative model for biology created at the leading edge of parameters, computational power, and data. We believe that ESM3 is the most compute ever applied to training a biological model, trained with over 1×10^24 FLOPS and 98B parameters.
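For a rough sense of what those two numbers imply, here is a back-of-envelope sketch. It assumes the common rule of thumb that training a dense transformer costs about 6 floating-point operations per parameter per token. That rule is my assumption, not something EvolutionaryScale states, and the result counts tokens processed during training (repeats included), not unique tokens in the dataset.

```python
# Back-of-envelope estimate, NOT a figure from EvolutionaryScale.
# Assumption: training compute C is roughly 6 * N * D, where
#   N = number of parameters, D = number of tokens processed.
TRAINING_FLOPS = 1e24   # "over 1x10^24" from the announcement
PARAMETERS = 98e9       # 98B parameters from the announcement


def tokens_processed(flops: float, params: float) -> float:
    """Solve C = 6 * N * D for D under the rule-of-thumb assumption."""
    return flops / (6 * params)


estimate = tokens_processed(TRAINING_FLOPS, PARAMETERS)
print(f"~{estimate:.1e} tokens processed")
```

Under that assumption the model would have processed on the order of 1.7 trillion tokens during training, which puts it in the same scale range as large text LLMs.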
Across AI we see the power of scaling. As model scale increases in parameters, data, and compute, larger models gain new emergent capabilities that smaller models lack. In many different domains generalist models trained on diverse data are outperforming specialist models. The incredible pace of new AI advances is being driven by increasingly large models, increasingly large datasets, and increasing computational power.
The same patterns hold true in biology. In research over the last five years, the ESM team has explored scaling in biology. We find that as language models scale they develop an understanding of the underlying principles of biology, and discover biological structure and function.
ESM3 represents a milestone model in the ESM family—the first created by our team at EvolutionaryScale, an order of magnitude larger than our previous model ESM2, and natively multimodal and generative.
The founding team at EvolutionaryScale, the group behind ESM3, are pioneers in applying AI to biology, having built what is widely considered to be the first transformer language model for proteins, ESM1.
Rives, along with Tom Sercu and Sal Candido, began developing generative AI models to explore proteins while at Meta’s AI research lab, FAIR, in 2019. After their team was disbanded, Rives, Sercu and Candido left Meta to continue building on the work they’d started.
“Launching ESM3 marks the beginning of the most exciting part of building these tools: learning from how the scientific community leverages our work.”