Import AI 418: 100b distributed training run; decentralized robots; AI myths

Welcome to Import AI, a newsletter about AI research. Import AI runs on lattes, ramen, and feedback from readers. If you’d like to support this, please subscribe.

Better video models with radial attention:
…Efficiency improvements for internet-generated media…
Researchers with MIT, NVIDIA, Princeton, UC Berkeley, Stanford and startup First Intelligence have built and released Radial Attention, an attention mechanism that can be used for training and sampling from video generation models.
"Unlike image generation, video synthesis involves an additional temporal dimension, dramatically increasing the number of tokens to process. As self attention scales quadratically with sequence length, training and inference on long videos become prohibitively expensive, limiting model practicality and scalability," they write. "The key insight of Radial Attention is that attention scores between tokens decay with increasing spatial and temporal distance. This motivates us to allocate computation based on the inherent spatiotemporal correlations in video data".
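The decay insight can be sketched as a sparsity mask: attend densely to nearby tokens and with exponentially decreasing density to distant ones, so each token attends to O(log n) others. This is a simplified 1D illustration under assumed parameters (e.g. the `base_window` size), not the paper's exact mechanism, which operates over spatiotemporal distances in video token grids:

```python
import numpy as np

def radial_mask(n, base_window=4):
    """Toy O(n log n) sparse attention mask: dense attention inside a
    local window, then roughly one kept token per power-of-two
    distance band further out."""
    mask = np.zeros((n, n), dtype=bool)
    for i in range(n):
        for j in range(n):
            d = abs(i - j)
            if d < base_window:
                mask[i, j] = True  # dense local band
            else:
                # distance band [base*2^k, base*2^(k+1)) keeps tokens
                # at stride 2^(k+1), so each band contributes O(1) entries
                stride = 2 ** (int(np.log2(d // base_window)) + 1)
                mask[i, j] = (d % stride == 0)
    return mask

m = radial_mask(64)
```

Because each row keeps a constant number of entries per distance band and there are O(log n) bands, the total mask has O(n log n) nonzeros instead of the n² of dense self-attention.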

Good performance on real world models: The results are convincing: the authors show that they're able to get a 2.78X training speedup and 2.35X inference speedup on Hunyuan Video, a good video generation model from Tencent.
They also demonstrate similarly good performance (1.78X training, 1.63X inference) on the Mochi 1 video model.
"At default video lengths, Radial Attention achieves up to a 1.9× speedup while maintaining video quality. For videos up to 4× longer, Radial Attention preserves video fidelity and delivers up to 4.4× and 3.7× speedups in training and inference, respectively, with minimal LoRA fine-tuning," they write.

Why this matters - making it cheaper to do AI entertainment: The internet has become a vast engine for the consumption of video content - see social media shorts, YouTube, the streaming services, etc. Technologies like Radial Attention will help lower the cost of training and sampling from AI video models, which will make it cheaper to produce synthetic video content. Where the internet was once the place where we stored videos gathered from the world, it will increasingly become a machine where people use internet-mediated services both to generate videos and to propagate them.
Read more: Radial Attention: O(nlogn) Sparse Attention with Energy Decay for Long Video Generation (arXiv).
Get the code for Radial Attention here (mit-han-lab, GitHub).

***

Pete Buttigieg thinks AI is a big deal:
Fellow Substacker and former presidential candidate Pete Buttigieg has written a post about how he ...
