Arc Institute and Nvidia recently released Evo2, a jointly developed AI model for biology. The foundation model was trained on DNA data from over 100,000 species and is intended to help decode complex biological phenomena. Evo2 can identify patterns in the genetic sequences of different organisms that would typically take researchers years to discover, which improves the identification of disease-related mutations and makes it possible to design entirely new genomes comparable in length to those of simple bacteria.

Evo2 was trained on over 9.3 trillion nucleotides, far more than its predecessor, Evo1. The development team includes members from Nvidia and Arc Institute, a nonprofit biomedical research institution in Palo Alto, California, working closely with researchers from Stanford University, the University of California, Berkeley, and the University of California, San Francisco. Beyond its computational scale, Evo2 also makes progress on transparency and interpretability. To promote open scientific research, the team has publicly released Evo2's training data, code, and model weights, making it the largest fully open-source AI model for biology to date.

Patrick Hsu, co-founder of Arc Institute and assistant professor at UC Berkeley, stated that the development of Evo2 represents a significant breakthrough in the field of generative biology. With this technology, machines can "read," "write," and "think" in the language of nucleotides, which could accelerate biological research. Evo2's training scale is comparable to that of large language models, and the model shows strong potential for predicting disease-related mutations and, potentially, designing artificial life.

Evo2 could also inform the design of biotherapies, such as gene therapies that activate only in specific cell types, reducing side effects and improving treatment precision. The model is not only a technical breakthrough but could also deepen our understanding of biology.

To support responsible development of the model, researchers deliberately excluded data from pathogens that can infect humans and other complex organisms. Anthony Costa, Nvidia's director of digital biology, said that Evo2 pushes past the limits of earlier biological foundation models and gives scientists worldwide a powerful collaborative tool for addressing major health and disease challenges.