
BioByte 135: improvements to biosecurity screening, CPTAC-PROTSTRUCT and KRONOS demonstrate LLM fine-tuning for patient-specific prognoses, new methods for imaging at single-protein resolution

Welcome to Decoding Bio’s BioByte: each week our writing collective highlights notable news, from the latest scientific papers to the latest funding rounds and everything in between. All in one place.

What we read

Papers

Strengthening nucleic acid biosecurity screening against generative protein design tools [Wittman et al., Science, October 2025]

Why it matters: Nucleic acid screening companies are the watchtowers that check whether newly ordered strands of DNA could be harmful. As deep learning transforms protein design, bad actors’ ability to design harmful sequences grows with it. Researchers from Microsoft led a red-teaming collaboration to help these companies catch potentially toxic proteins that look benign in sequence space but could be dangerous because of their structure.

Historically, biosecurity screening has relied on sequence-based homology searches. Methods like BLAST enable scientists to flag sequences if they closely resemble a known toxin or pathogen gene. This approach worked because dangerous proteins were generally sequence-similar to previously cataloged ones. But with modern deep-learning tools, we can now predict 3D protein structures with remarkable accuracy and compare them using structure-based tools like FoldSeek, revealing similarities that sequence alone might miss. At the same time, generative protein-design models such as ProteinMPNN and EvoDiff can create entirely new sequences that differ dramatically in amino-acid identity yet still fold into the same functional structures as known toxins. To test and strengthen global biosecurity infrastructure, Microsoft researchers conducted a biosecurity red-team exercise with synthesis providers to “attack” and stress-test their screening systems using AI-designed proteins meant to evade detection.
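To make the limitation of "best-match" sequence screening concrete, here is a minimal Python sketch. It is not how any screening provider actually works: real pipelines use gapped alignment tools such as BLAST, whereas this toy version assumes the sequences are already aligned to equal length. The function names, the 50% threshold, and the peptide strings are all hypothetical and made up for illustration.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent of matching positions between two pre-aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be pre-aligned to equal length")
    if not a:
        return 0.0
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def best_match_flag(query: str, references: list[str], threshold: float = 50.0):
    """Return (flagged, best_identity) for the closest reference sequence.

    The threshold is a hypothetical value chosen for this sketch.
    """
    best = max(percent_identity(query, ref) for ref in references)
    return best >= threshold, best


# Made-up peptide strings, not real proteins.
reference = "MKTAYIAKQR"
close_variant = "MKTAYIAKQK"    # differs at one position
distant_variant = "MRSGFLGRNK"  # shares only the first residue

print(best_match_flag(close_variant, [reference]))
print(best_match_flag(distant_variant, [reference]))
```

The close variant is flagged while the distant one slips under the threshold. If a redesigned sequence like the distant one still folds into the same structure, an identity-only check has no way to notice, which is the gap that structure-aware comparison is meant to close.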

The team computationally generated 76,080 synthetic homologs derived from 72 wild-type proteins of concern (including ricin, botulinum neurotoxins, and viral effectors). Each variant was designed using open-source generative models at different mutation levels to test how far sequence drift could go while preserving structure. Although the study did not synthesize or experimentally validate the proteins, structure predictions with OpenFold showed that many variants retained folds strongly resembling the originals, meaning they should, in principle, trigger screening alerts. The researchers also explored a DNA-obfuscation strategy, in which protein-coding sequences were fragmented, scrambled, and reversed across reading frames to mimic how a malicious actor might further disguise intent from DNA-level screens.
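On the defensive side, one standard bioinformatics step that helps with the "reversed across reading frames" part of such obfuscation is six-frame translation: a screen translates both strands at all three offsets before comparing proteins. The sketch below is an illustration under our own assumptions, not the method used in the study or by any provider, and it does nothing about fragmentation or scrambling. All function names are hypothetical, and the example DNA is a made-up nine-base string.

```python
from itertools import product

# Standard genetic code, enumerated with bases in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna: str) -> str:
    """Translate complete codons from position 0; unknown codons become 'X'."""
    codons = (dna[i:i + 3] for i in range(0, len(dna) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def six_frame_translations(dna: str) -> list[str]:
    """Translate three offsets on the given strand, then three on its reverse complement."""
    dna = dna.upper()
    strands = (dna, reverse_complement(dna))
    return [translate(strand[offset:]) for strand in strands for offset in range(3)]


# Made-up example: a short coding sequence submitted on the opposite strand.
coding = "ATGAAAACC"
submitted = reverse_complement(coding)

print(translate(submitted))               # forward frame 0 only
print(six_frame_translations(submitted))  # all six frames
```

A screen that only reads the submitted strand in frame 0 never sees the original peptide, while the six-frame pass recovers it as one of its candidates for downstream protein-level comparison.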

The red-team exercise revealed that existing “best-match” sequence-based frameworks missed a substantial fraction of these AI-reformulated toxins. In response, three major biosecurity-screening providers (Aclid, IBBIS, and RTX

...