How d-Matrix's In-Memory Compute Tackles AI Inference Economics
Each week, I help investors and professionals stay up-to-date on the semiconductor industry. If you’re new, start here. See here for all the benefits of upgrading your subscription tier!
Paid subscribers will have access to a video discussion of this essay, key highlights, and a Google Drive link to this article to parse with LLMs.
Disclaimer: This article is entirely my own opinion. I have not been paid by d-Matrix, nor do I have any access to internal documents. All information is publicly available (references cited). I do not hold any investment position in d-Matrix, and this is not investment advice. Do your own research. This article does not reflect the views of any past, present, or future employers, nor does it directly or indirectly imply any competitors are better or worse. This is my attempt at trying to understand how core technology works and where its advantages lie. I do not endorse any products.
Disclosure: I requested that d-Matrix review the article to ensure that I do not misunderstand/misrepresent their technology. I’m grateful to them for pointing out errors in my conceptual understanding. All editorial decisions are entirely mine.
Recently, d-Matrix, a Bay Area AI inference chip startup, announced its $275M Series C funding round, which brings its total funding to $450M.
d-Matrix claims to have the “world’s highest performing, most efficient data center inference platform for hyperscale, enterprise, and sovereign customers,” and a “full-stack inference platform that combines breakthrough compute-memory integration, high-speed networking, and inference-optimized software to deliver 10× faster performance, 3× lower cost, and 3–5× better energy efficiency than GPU-based systems.”
Their main compute engine, called Corsair, is based on a different approach to inference: in-memory compute. In this post, we will look at this technology in detail, how it delivers those claimed benefits, and where it is useful.
For free subscribers:
Analog in-memory computing
d-Matrix’s digital in-memory compute solution
Four chiplets and LPDDR5
Scaling up to rack-level solutions
References
For paid subscribers:
A real-world use-case for d-Matrix DIMC hardware
Designing Hardware for Latency, Throughput, and TCO
The PCIe Advantage
Possible Uses of Small Inference Models running d-Matrix Hardware
Analog In-memory Compute (AIMC)
AI training and inference involve an enormous number of matrix multiplications, which break down into vector multiplications: element-wise multiplications whose partial products are then added together (multiply-accumulate operations). If you need a deeper understanding of what those operations are ...
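To make that concrete, here is a minimal Python/NumPy sketch (my own illustration, not taken from d-Matrix materials) showing how a matrix-vector product reduces to multiply-accumulate operations: each output element is built from pairwise multiplications whose results are summed.

```python
import numpy as np

# Tiny illustration (not d-Matrix-specific): a matrix-vector product
# is nothing more than repeated multiply-accumulate (MAC) operations.
W = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])   # weight matrix (2 x 3)
x = np.array([0.5, -1.0, 2.0])    # input activation vector (length 3)

# Each output element is a dot product: multiply each weight by the
# matching input element, then accumulate the partial products.
y_manual = np.zeros(W.shape[0])
for row in range(W.shape[0]):
    acc = 0.0
    for col in range(W.shape[1]):
        acc += W[row, col] * x[col]   # one multiply, one accumulate
    y_manual[row] = acc

# The optimized library call performs the same math.
assert np.allclose(y_manual, W @ x)
print(y_manual)   # [4.5, 9.0]
```

Inference hardware is essentially judged by how many of these multiply-accumulates it can perform per second, and at what cost in energy and data movement.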