
Deep Learning Weekly: Issue 428

Deep Dives

Explore related topics with these Wikipedia articles, rewritten for enjoyable reading:

  • Self-supervised learning (15 min read)

    The Concerto paper discusses joint 2D-3D self-supervised learning for spatial representations. Understanding self-supervised learning—how models learn from unlabeled data by creating their own supervisory signals—provides essential context for grasping why this approach to spatial cognition is significant and how it differs from traditional supervised methods. A minimal code sketch of the idea appears after this list.

  • Knowledge graph (1 min read)

    The ODKE+ paper focuses on automatically extracting facts into knowledge graphs from web sources. Knowledge graphs are foundational data structures in AI that represent relationships between entities, and understanding their architecture and applications illuminates why maintaining their freshness and completeness is so challenging and valuable. A small triple-based sketch appears after this list.

  • Time series (13 min read)

    Chronos-2 is described as a foundation model for forecasting tasks including univariate and multivariate predictions. Time series analysis—the study of data points collected over time intervals—provides the mathematical foundation for understanding what makes forecasting challenging and why a 'universal' forecasting model represents a significant advance.
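
To make the self-supervised learning entry above concrete, here is a minimal sketch of one common flavor, masked prediction, where the training label is simply a token hidden from the model's own input. The snippet and its example sentence are illustrative only, not taken from the Concerto paper.

import random

def make_masked_example(tokens, mask_token="[MASK]"):
    # The "label" is a token the model must recover from its own input,
    # so no human annotation is required.
    position = random.randrange(len(tokens))
    inputs = list(tokens)
    label = inputs[position]
    inputs[position] = mask_token
    return inputs, (position, label)

tokens = "point clouds and images describe the same scene".split()
print(make_masked_example(tokens))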
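
Likewise, for the knowledge graph entry: a knowledge graph is, at its simplest, a set of (subject, predicate, object) triples, and "freshness" means keeping that set in sync with the world. The facts below are made up for illustration.

triples = {
    ("Paris", "capital_of", "France"),
    ("France", "member_of", "European Union"),
    ("Paris", "population", "2.1M"),
}

def facts_about(entity, kg):
    # Every triple in which the entity appears as subject or object.
    return [t for t in kg if entity in (t[0], t[2])]

print(facts_about("Paris", triples))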

This week in deep learning, we bring you DeepSeek-OCR, Introducing Chronos-2: From univariate to universal forecasting, and a paper on InteractComp: Evaluating Search Agents With Ambiguous Queries.

You may also enjoy Advancing Claude for Financial Services, LLM Inference Economics from First Principles, a paper on Concerto: Joint 2D-3D Self-Supervised Learning Emerges Spatial Representations, and more!

As always, happy reading and hacking. If you have something you think should be in next week’s issue, find us on Twitter: @dl_weekly.

Until next week!


Industry

DeepSeek-OCR: Revolutionary Context Compression Through Optical 2D Mapping

DeepSeek AI unveiled DeepSeek-OCR, an approach to compressing long contexts via optical 2D mapping.

Advancing Claude for Financial Services

Anthropic expanded Claude for Financial Services with an Excel add-in, additional connectors to real-time market data and portfolio analytics, and new pre-built Agent Skills.

Introducing Chronos-2: From univariate to universal forecasting

Amazon introduced Chronos-2, a foundation model designed to handle arbitrary forecasting tasks (univariate, multivariate, and covariate-informed) in a zero-shot manner.

Grammarly transforms into AI-enabled productivity suite with Superhuman rebrand

Grammarly, best known for AI-powered proofreading and writing, announced its rebrand to Superhuman: a full-featured AI-native productivity platform.

MLOps & LLMOps

LLM Tracing: The Foundation of Reliable AI Applications

An article arguing that LLM tracing is the foundation of reliable AI applications: by capturing the end-to-end steps of each request, traces make non-deterministic and semantic failures diagnosable.
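
As a rough illustration of the idea (not the article's implementation), a trace is just a structured record of every step a request passes through, so a bad answer can be attributed to the step that produced it:

import json, time, uuid
from contextlib import contextmanager

TRACE = []  # in-memory span store; a real system would export these to a tracing backend

@contextmanager
def span(name, **attrs):
    # Record one pipeline step (retrieval, prompt construction, model call, ...).
    record = {"id": uuid.uuid4().hex, "name": name, "attrs": attrs, "start": time.time()}
    try:
        yield record
    finally:
        record["duration_s"] = round(time.time() - record["start"], 4)
        TRACE.append(record)

with span("retrieve", query="refund policy") as s:
    s["attrs"]["docs"] = ["doc_17", "doc_42"]          # stand-in for a retriever call
with span("generate", model="example-llm", temperature=0.2) as s:
    s["attrs"]["output"] = "Refunds are available within 30 days."  # stand-in for the model call

print(json.dumps(TRACE, indent=2))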

Build AI Agents Worth Keeping: The Canvas Framework

An article about why enterprise AI agent projects fail and how to use product-first canvas frameworks to build agents that actually reach production.

Learning

LLM Inference Economics from First Principles

A detailed article explaining LLM inference economics from first principles, focusing on how batching is the key to profitability by offsetting memory-bound costs in the token-by-token generation phase.
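
The memory-bound argument can be made concrete with back-of-the-envelope numbers (the figures below are assumptions for illustration, not the article's): during token-by-token decoding every forward pass has to stream the full model weights from memory, and that cost is shared by every sequence in the batch.

# Assumed: a 70B-parameter model in FP16 (~140 GB of weights) on an
# accelerator with ~3.3 TB/s of memory bandwidth; decode is treated as
# purely weight-bandwidth-bound and the KV cache is ignored.
weights_gb = 70e9 * 2 / 1e9          # 2 bytes per FP16 parameter -> 140 GB
bandwidth_gb_s = 3300.0              # assumed HBM bandwidth

step_time_s = weights_gb / bandwidth_gb_s   # one decode step, any batch size

for batch in (1, 8, 64):
    tokens_per_s = batch / step_time_s      # one new token per sequence per step
    print(f"batch={batch:3d}  ~{tokens_per_s:8.0f} tokens/s")

Under these assumptions throughput grows roughly linearly with batch size because the same weight traffic is amortized over more tokens; in practice KV-cache reads and compute limits eventually cap the gains, which is why batching dominates the economics of the generation phase.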

T*: Rethinking Temporal Search for Long-Form Video Understanding

An article introducing the T* temporal search algorithm, which reframes long-form video understanding as spatial search to efficiently locate relevant frames.

Learning from Failure to Tackle Extremely Hard Problems

A research blog post introducing BaNEL (Bayesian Negative Evidence Learning), an algorithm that post-trains generative models efficiently using only negative reward samples to tackle extremely sparse, hard problems.

Post-Training Generative Recommenders with Advantage-Weighted Supervised Finetuning

A study presenting Advantage-Weighted Supervised Fine-tuning (A-SFT), a novel algorithm for post-training generative recommenders.
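
The general pattern behind advantage-weighted supervised fine-tuning, shown here as a generic PyTorch sketch (the paper's exact weighting scheme and recommender setup may differ), is to keep the ordinary SFT log-likelihood loss but upweight logged sequences whose estimated advantage is high:

import torch
import torch.nn.functional as F

def advantage_weighted_sft_loss(logits, target_ids, advantages, beta=1.0):
    # logits:     (batch, seq_len, vocab) scores over candidate items/tokens
    # target_ids: (batch, seq_len) items the logged policy actually produced
    # advantages: (batch,) reward minus a baseline for each logged sequence
    nll = F.cross_entropy(
        logits.transpose(1, 2), target_ids, reduction="none"
    ).mean(dim=1)                                   # per-sequence negative log-likelihood
    # Exponential advantage weights, as in advantage-weighted regression,
    # clamped for numerical stability.
    weights = torch.exp(beta * advantages).clamp(max=20.0)
    return (weights * nll).mean()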

Artificial intelligence could dramatically improve weather forecasting

An article about how AI could dramatically ...
