AI-Designed Phages

A few months ago, Arc Institute released a new language model, called Evo 2, that can design entire genomes. In that original paper, though, the model’s designs — for a yeast chromosome and some small bacterial genomes — were entirely confined to a computer. The AI-generated genomes were not assembled or tested in the laboratory.

Although AI models are exceptionally good at designing proteins (including, recently, highly dynamic enzymes), there had been little evidence that they could design viable genomes. Proteins are self-contained entities, each made from a single chain of amino acids. But even the simplest genomes are composed of multiple genes and regulatory elements that must work together to build a functioning, living organism. A single mutation in a genome is often enough to render it nonviable.

But today, Arc Institute and Stanford University researchers have validated their designs in the real world, reporting the first viable genomes created using generative AI. They used fine-tuned versions of both Evo 1 and Evo 2 to create 16 bacteriophages modeled on ΦX174, a virus that infects E. coli bacteria.1 Some of these AI-generated phages infect E. coli cells as well as, or better than, wild-type ΦX174. All of the fine-tuned models used in this work are also freely available on HuggingFace. The paper offers “a blueprint for the design of diverse synthetic bacteriophages,” the authors write, “and, more broadly, lays a foundation for the generative design of useful living systems at the genome scale.”

Choosing the Phage

Of the 13,000 known bacteriophage types, ΦX174 is the most widely studied. The phage was first discovered in the Paris sewers in 1935. Its genome includes only about 5,400 bases of single-stranded DNA, with eleven genes and at least seven regulatory elements (short stretches of DNA that control which genes switch on at which times). So many genes fit in such a small sequence because they physically overlap one another: the same bases are read in different reading frames, and some genes are tucked entirely inside other genes.
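
To make the idea of overlapping genes concrete, here is a minimal Python sketch that scans a short DNA string in all three forward reading frames and reports open reading frames (a start codon followed by an in-frame stop codon). Everything in it is an illustrative assumption: the sequence `TOY` is invented for this example and is not taken from ΦX174, and the helpers `find_orfs` and `overlap` are hypothetical names, not part of Evo or any genomics library. Real gene finders also handle the reverse strand, alternative start codons, and minimum lengths, which this sketch ignores.

```python
# Toy illustration of overlapping genes. The sequence below is invented;
# it is NOT real phage sequence.
STOP_CODONS = {"TAA", "TAG", "TGA"}

def find_orfs(seq):
    """Return (start, end, frame) for each ATG...stop ORF on the forward strand.

    `end` is exclusive, so seq[start:end] is the full ORF including the stop.
    """
    orfs = []
    for frame in range(3):
        i = frame
        while i + 3 <= len(seq):
            if seq[i:i + 3] == "ATG":
                # Walk codon by codon until the first in-frame stop codon.
                for j in range(i + 3, len(seq) - 2, 3):
                    if seq[j:j + 3] in STOP_CODONS:
                        orfs.append((i, j + 3, frame))
                        i = j  # resume scanning after this ORF
                        break
            i += 3
    return orfs

def overlap(a, b):
    """Number of bases shared by two (start, end, frame) intervals."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]))

# Hypothetical 21-base sequence with one ORF nested inside another.
TOY = "ATGCATGCCGAACTGACGTAA"

orfs = find_orfs(TOY)
for start, end, frame in orfs:
    print(f"frame {frame}: bases {start}-{end}: {TOY[start:end]}")

outer, inner = orfs
print("shared bases:", overlap(outer, inner))
```

In this toy case the scan finds one ORF spanning all 21 bases in frame 0 and a second, 12-base ORF in frame 1 that sits entirely inside the first. The same 12 bases therefore contribute to two different codon sequences, which is the sense in which one gene can be "tucked in the middle" of another.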

ΦX174 is often used as a model organism in molecular biology because it is easy to work with. It infects a nonpathogenic strain of E. coli, which itself divides quickly and can be readily grown in the laboratory using nutrient-laden broth and a warm incubator. These phages are also structurally simple, even by bacteriophage standards — they are made from little more than a capsid, packed with the

...