The Anatomy of the Least Squares Method, Part One

By Tivadar Danka · The Palindrome · Oct 13, 2025 · 14 min read

Hi there! It’s Tivadar from The Palindrome.

Today’s post is the first in a series by the legendary Mike, educator extraordinaire.

In case you haven’t encountered him yet, Mike is an extremely prolific author; his textbooks and online courses range from time series analysis through statistics to linear algebra, all with a focus on practical implementations as well.

He also recently started on Substack, and if you enjoy The Palindrome, you’ll enjoy his publication too. So, make sure to subscribe!

The following series explores the least squares method, a foundational tool in mathematics, data science, and machine learning.

Have fun!

Cheers,
Tivadar

By the end of this post series, you will be able to confidently understand, apply, and interpret the least-squares algorithm for fitting machine learning models to data. “Least-squares” is one of the most important techniques in machine learning and statistics. It is fast, one-shot (non-iterative), easy to interpret, and mathematically optimal in the sense that it minimizes the sum of squared errors between the model and the data. Here’s a breakdown of what you’ll learn:

Post 1 (what you’re reading now 🙂): Theory and math. You’ll learn what “least-squares” means, why it works, and how to find the optimal solution. There’s some linear algebra and calculus in this post, but I’ll explain the main take-home points in case you’re not so familiar with the math bits.

Post 2: Explorations in simulations. You’ll learn how to simulate data to supercharge your intuition for least-squares, and how to visualize the results. You’ll also learn about residuals and overfitting.

Post 3: Real-data examples. There’s no substitute for real data, and that’s what you’ll work with in this post. I’ll also teach you how to use the Python statsmodels library.

Post 4: Modeling GPT activations. This post will be fun and fascinating. We’ll dissect OpenAI’s LLM GPT-2, the precursor to its state-of-the-art ChatGPT. You’ll learn more about least-squares and also about LLM mechanisms.

Following along with code

I’m a huge fan of learning math through coding. You can learn a lot of math with a bit of code.

That’s why I have Python notebook files that accompany my posts. The essential code bits are pasted directly into this post, but the complete code files, including all the code for visualization and additional explorations, are here on my GitHub.
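To give a flavor of what “a bit of code” can look like, here is a minimal sketch of a one-shot least-squares fit using NumPy. It is not taken from the accompanying notebooks; the simulated data, the variable names, and the true slope and intercept are all assumptions chosen purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated data (illustrative assumption): y = 2x + 1 plus a little noise
x = np.linspace(0, 10, 50)
y = 2.0 * x + 1.0 + rng.normal(scale=0.5, size=x.size)

# Design matrix: a column of ones (intercept) and a column of x (slope)
X = np.column_stack([np.ones_like(x), x])

# One-shot (non-iterative) solution: minimizes ||X @ beta - y||^2
beta, *_ = np.linalg.lstsq(X, y, rcond=None)

# The same coefficients from the normal equations (X^T X) beta = X^T y
beta_normal = np.linalg.solve(X.T @ X, X.T @ y)

print("intercept, slope:", beta)
```

Both routes should recover coefficients close to the simulated intercept and slope. `np.linalg.lstsq` is generally the numerically safer choice; the normal-equations line is included only to show that no iteration is involved.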

If you’re more interested in the theory/concepts, then it’s completely fine to ignore the code and just read the post. But if you want a deeper level of ...
