Deep Learning Weekly: Issue 416
This week in deep learning, we bring you OpenAI's gpt-oss, Pretraining: Breaking Down the Modern LLM Training Pipeline, and a paper on Routine: A Structural Planning Framework for LLM Agent System in Enterprise.
You may also enjoy Anthropic's Claude Opus 4.1, Full-Stack Alignment: Co-Aligning AI and Institutions with Thick Models of Value, a paper on Self-Improving Language Models for Evolutionary Program Synthesis: A Case Study on ARC-AGI, and more!
As always, happy reading and hacking. If you have something you think should be in next week's issue, find us on Twitter: @dl_weekly.
Until next week!
Industry
Introducing gpt-oss
OpenAI just released gpt-oss-120b and gpt-oss-20b, two state-of-the-art open-weight language models that deliver strong real-world performance at low cost.
Claude Opus 4.1
Anthropic released Claude Opus 4.1, an upgrade to Claude Opus 4 with improved performance on agentic tasks, real-world coding, and reasoning.
Genie 3: A new frontier for world models
The DeepMind team announced Genie 3, a general-purpose world model that can generate an unprecedented diversity of interactive environments.
Introducing Command A Vision: Multimodal AI Built for Business
The Cohere team introduced Command A Vision, a new state-of-the-art generative model that brings enterprises leading performance across multimodal vision tasks while maintaining strong text capabilities.
MLOps & LLMOps
Pretraining: Breaking Down the Modern LLM Training Pipeline
The line between pretraining and fine-tuning in LLMs is increasingly blurred, making it harder to define what "training" means today. This article explores how evolving methods, inconsistent terminology, and opaque pipelines complicate understanding model behavior, emphasizing the critical role of pretraining and data curation in scaling LLMs responsibly.
AI judging AI: Scaling unstructured text analysis with Amazon Nova
A practical blog post about deploying LLM jury systems on Amazon Bedrock to scale unstructured text analysis.
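As a rough illustration of the jury pattern, here is a minimal Python sketch using the Bedrock Converse API via boto3: several models classify the same text, and a majority vote decides. The model IDs, prompt, and sentiment task are illustrative assumptions rather than details from the post, and the snippet assumes AWS credentials and model access are already configured.

```python
import boto3
from collections import Counter

# Illustrative juror model IDs; substitute models enabled in your account/region.
JUROR_MODEL_IDS = [
    "amazon.nova-micro-v1:0",
    "amazon.nova-lite-v1:0",
    "amazon.nova-pro-v1:0",
]

client = boto3.client("bedrock-runtime", region_name="us-east-1")

def juror_vote(model_id: str, text: str) -> str:
    """Ask one model to classify the text; returns its one-word label."""
    prompt = (
        "Classify the sentiment of the following text as POSITIVE, "
        f"NEGATIVE, or NEUTRAL. Reply with one word only.\n\n{text}"
    )
    response = client.converse(
        modelId=model_id,
        messages=[{"role": "user", "content": [{"text": prompt}]}],
        inferenceConfig={"maxTokens": 10, "temperature": 0.0},
    )
    return response["output"]["message"]["content"][0]["text"].strip().upper()

def jury_verdict(text: str) -> str:
    """Majority vote across jurors; ties fall back to the most common first."""
    votes = [juror_vote(model_id, text) for model_id in JUROR_MODEL_IDS]
    return Counter(votes).most_common(1)[0][0]

print(jury_verdict("The onboarding flow was smooth, but support never replied."))
```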
Remember this: Agent state and memory with ADK
A Google Cloud blog post illustrating how to implement short-term and long-term memory for AI agents using the Agent Development Kit (ADK) and Vertex AI Memory Bank.
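To make the two tiers concrete, here is a small library-free Python sketch of the pattern the post builds with ADK and Vertex AI Memory Bank: short-term memory as per-session conversation state, long-term memory as a persistent store searched on each turn. The class names and keyword-overlap retrieval are simplifications for this sketch, not ADK APIs.

```python
import re
from dataclasses import dataclass, field

def _tokens(text: str) -> set[str]:
    """Lowercase word tokens; a real memory bank would use embeddings."""
    return set(re.findall(r"[a-z]+", text.lower()))

@dataclass
class Session:
    """Short-term memory: the running turn history for one conversation."""
    history: list[str] = field(default_factory=list)

class MemoryBank:
    """Long-term memory: facts that persist across sessions."""
    def __init__(self) -> None:
        self.facts: list[str] = []

    def save(self, fact: str) -> None:
        self.facts.append(fact)

    def search(self, query: str) -> list[str]:
        # Keyword overlap stands in for semantic search.
        query_words = _tokens(query)
        return [f for f in self.facts if query_words & _tokens(f)]

def build_prompt(session: Session, bank: MemoryBank, user_msg: str) -> str:
    """Each turn merges recalled long-term facts with recent session turns."""
    recalled = bank.search(user_msg)
    recent = session.history[-6:]  # keep the short-term window bounded
    session.history.append(f"user: {user_msg}")
    context = "\n".join(recalled + recent)
    return f"{context}\nuser: {user_msg}"

bank = MemoryBank()
bank.save("The user's preferred language is French.")  # from a past session
session = Session()  # a fresh session starts with empty short-term memory
print(build_prompt(session, bank, "What language should I use?"))
```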
Learning
Full-Stack Alignment: Co-Aligning AI and Institutions with Thick Models of Value
A research paper that proposes full-stack alignment and thick models of value as an alternative to current human-AI value alignment approaches.
Why, When and How to Fine-Tune a Custom Embedding Model
A comprehensive technical article detailing the why, when, and how of fine-tuning custom text embedding models to improve retrieval performance in RAG systems.
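For a flavor of the "how," here is a hedged sketch of contrastive fine-tuning with the sentence-transformers library, using (query, relevant passage) pairs and in-batch negatives via MultipleNegativesRankingLoss. The base model and training pairs are illustrative placeholders, and the article's own recipe may differ.

```python
from torch.utils.data import DataLoader
from sentence_transformers import SentenceTransformer, InputExample, losses

# Illustrative (query, relevant passage) pairs; real training data would come
# from your retrieval corpus and logged queries.
pairs = [
    ("how do I reset my password", "Go to Settings > Security and choose Reset."),
    ("what is the refund window", "Refunds are accepted within 30 days of purchase."),
]

model = SentenceTransformer("all-MiniLM-L6-v2")  # assumed base model

train_examples = [InputExample(texts=[query, passage]) for query, passage in pairs]
train_loader = DataLoader(train_examples, shuffle=True, batch_size=16)

# MultipleNegativesRankingLoss treats the other passages in a batch as
# negatives, so only positive pairs need to be labeled.
train_loss = losses.MultipleNegativesRankingLoss(model)

model.fit(
    train_objectives=[(train_loader, train_loss)],
    epochs=1,
    warmup_steps=10,
)
model.save("custom-embedding-model")
```

After training, the saved model can be loaded in place of the original encoder when indexing documents and embedding queries for the vector store.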