
Breaking the Curse: How Analytics Built the 2016 Cubs

How Theo Epstein's analytics staff—Chris Moore, Jeff Greenberg, Ryan Kruse—built the infrastructure that ended baseball's longest championship drought.

On November 2, 2016, the Chicago Cubs defeated the Cleveland Indians in Game 7 of the World Series, ending the longest championship drought in North American professional sports. The final score was 8-7 in 10 innings, capping one of the most dramatic games in baseball history. Goats and curses were finally exorcised. A city wept with joy. For Cubs fans who had waited generations, the moment was transcendent. But for students of baseball analytics, the Cubs' championship represented something equally significant: the culmination of one of the most sophisticated data-driven rebuilds the sport had ever seen.

This is the story of how the Cubs used cutting-edge analytics to assemble a championship roster, why their window closed faster than expected, and what the data reveals about the elusive quest to build a dynasty in modern baseball.

Part I: The Architect Arrives

When Theo Epstein left the Boston Red Sox to become the Cubs' President of Baseball Operations in October 2011, he inherited a franchise that had become synonymous with failure. The Cubs hadn't won a World Series since 1908, hadn't even appeared in one since 1945, and had finished with a losing record in six of the previous nine seasons. The farm system was barren. The major league roster was aging. The path forward was clear: burn it down and rebuild from scratch.

Epstein brought more than just a resume that included breaking Boston's 86-year drought in 2004. He brought a philosophy rooted in data, analytics, and process over outcomes. In Boston, he had worked with Bill James, the godfather of sabermetrics. He understood that baseball's traditional evaluation methods—scouting reports, batting average, wins and losses for pitchers—were incomplete at best, misleading at worst. The Cubs rebuild would be different. It would be scientific.

The first two years were brutal by design. The Cubs lost 101 games in 2012 and 96 in 2013. But these weren't aimless losses. Every roster decision was made with an eye toward acquiring draft picks and identifying undervalued talent. The Cubs weren't just tanking; they were accumulating assets with mathematical precision.

Consider the specific moves Epstein made during those early years. In 2012, Epstein traded Ryan Dempster, the team's best pitcher, at the trade deadline for prospects. The return included Kyle Hendricks, a minor league arm with mediocre stuff but exceptional command. Traditional scouting saw a future back-of-the-rotation starter at best. The Cubs' analytics saw something more valuable. They also traded Paul Maholm and Reed Johnson for prospects, stripped the roster of any player with positive trade value, and deliberately fielded a team designed to lose games in pursuit of draft position.

The 2012 Cubs had a team WAR (Wins Above Replacement) of just 19.8, the lowest in all of baseball that year. Their starting rotation combined for a mere 3.1 WAR. Their offense produced a collective wOBA of .299, well below league average. But in the amateur draft that year, they selected Albert Almora Jr. with the sixth overall pick. In 2013, with another high pick secured, they took Kris Bryant second overall. The tank was working as designed.

What separated the Cubs' rebuild from other tanking attempts was the intentionality. They didn't just lose; they lost while building infrastructure. They invested heavily in player development, hiring coaches who could translate raw talent into major league production. They built scouting networks in Latin America and Asia. They created analytical systems that could identify market inefficiencies before anyone else. The losing seasons were investments, not surrenders.

Chart: cubs-rebuild-arc

The chart above tells the story in stark terms. Win totals bottomed out at 61 in 2012, then climbed steadily: 66, 73, 97, 103. The 2016 team won 103 games in the regular season, the most by a Cubs team since 1910. Their run differential of +252 was the highest in baseball since the 2001 Mariners.

Part II: The Analytics Revolution in the Front Office

The Cubs' rebuild wasn't just about acquiring talent. It was about evaluating talent differently than their competitors. When Epstein arrived, the Cubs were behind—under previous GM Jim Hendry, they had employed just one full-time advanced statistician. Epstein had to build an analytics infrastructure from scratch, without even the customized software ("Carmine") he had developed in Boston.

Central to this effort was Chris Moore, who joined as Director of Research and Development in October 2013. Moore held a PhD in psychology and neuroscience from Princeton, and he built and led a 10-person analytics staff. His team developed statistical models for win probability, pitch prediction, and player evaluation. When the Cubs drafted Kyle Schwarber in 2014—a pick many considered a reach—Moore's models had identified value others missed.

The Cubs also brought in Tom Tango as a part-time consultant in 2013. Tango, who had co-authored "The Book: Playing the Percentages in Baseball" (2006), worked nights and weekends while maintaining a day job elsewhere. The Cubs licensed his expertise exclusively, preventing him from consulting for other MLB teams. He departed for MLB Advanced Media in 2016.

Tango's metrics—wOBA, FIP, Leverage Index—are well-documented because he published them publicly. The Cubs' proprietary methods developed by Moore's staff remain secret. We know about Tango's contributions because he wrote about them; we know far less about what the Cubs' full-time analysts actually built. This creates a distorted picture where the published consultant gets credit while the internal team's innovations stay hidden. What follows describes Tango's public concepts, which influenced the Cubs' approach—but the day-to-day analytical work was driven by Moore's 10-person staff using methods we may never fully understand:

wOBA: Weighted On-Base Average

Traditional batting average treats all hits equally. A single counts the same as a home run. Obviously, this is wrong. Tango's wOBA assigns different weights to different offensive events based on their actual run value. A home run is worth roughly twice as much as a single. A walk is worth almost as much as a single. This metric allows teams to identify players who are actually producing runs, not just accumulating hits.
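As a rough sketch of how that weighting works, the snippet below computes wOBA from a counting-stat line. The weights are rounded, illustrative values in the neighborhood of published linear weights (the real figures are recalculated every season), and the stat line is hypothetical.

```python
# Approximate linear weights; published values shift slightly each season.
WEIGHTS = {"bb": 0.69, "hbp": 0.72, "single": 0.89,
           "double": 1.27, "triple": 1.62, "hr": 2.10}

def woba(ab, bb, hbp, sf, single, double, triple, hr, ibb=0):
    """Weighted on-base average: run-weighted events per plate-appearance opportunity."""
    numerator = (WEIGHTS["bb"] * (bb - ibb) + WEIGHTS["hbp"] * hbp
                 + WEIGHTS["single"] * single + WEIGHTS["double"] * double
                 + WEIGHTS["triple"] * triple + WEIGHTS["hr"] * hr)
    denominator = ab + bb - ibb + sf + hbp
    return numerator / denominator

# Hypothetical hitter: 500 AB, 50 BB, 100 singles, 25 doubles, 10 HR.
example = woba(ab=500, bb=50, hbp=0, sf=0, single=100, double=25, triple=0, hr=10)
print(round(example, 3))
```

Intentional walks are excluded from both the numerator and the denominator, which is the usual convention; a simplified version could ignore them entirely.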

The league average wOBA hovers around .320. In 2016, Kris Bryant posted a .396 wOBA, Anthony Rizzo hit .382, and Ben Zobrist contributed .371. The Cubs' lineup was stacked with players who created runs at elite rates, even if their traditional stats didn't always reflect it.

Consider what this means in practice. A player hitting .280 with 10 home runs and 50 walks might have a higher wOBA than a player hitting .300 with 15 home runs but only 20 walks. The first player is getting on base more often and doing so in ways that correlate with run production. Traditional scouts might prefer the .300 hitter. The Cubs' models knew better.

The Cubs used wOBA not just for player evaluation but for lineup construction. Joe Maddon's lineups were optimized to maximize run expectancy based on wOBA and platoon splits. High-wOBA hitters were clustered together to create chain-reaction scoring opportunities. The analytics department could model expected runs for any lineup configuration, allowing Maddon to make data-driven decisions that looked counterintuitive to traditionalists.

Chart: cubs-woba-2016

FIP: Fielding Independent Pitching

Traditional ERA (Earned Run Average) has a major flaw: it's heavily influenced by factors outside a pitcher's control, particularly defensive quality and luck on balls in play. FIP isolates what a pitcher can control: strikeouts, walks, hit-by-pitches, and home runs. The formula is surprisingly simple:

FIP = ((13 * HR) + (3 * (BB + HBP)) - (2 * K)) / IP + FIP constant

The FIP constant adjusts the scale to match league-average ERA, making the numbers intuitive to interpret. A pitcher with a 3.50 FIP is performing like a pitcher who "should" have a 3.50 ERA, regardless of their actual ERA.
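The formula translates directly into a few lines of code. This is a minimal sketch: the default constant of 3.10 is an assumed placeholder (the real constant is recomputed each season), and the pitcher line is hypothetical.

```python
def fip(hr, bb, hbp, k, ip, constant=3.10):
    """Fielding Independent Pitching, following the formula in the text.

    `constant` is an assumed placeholder; the real value is set each season
    so that league-average FIP matches league-average ERA.
    `ip` must be true innings (200.333..., not the box-score shorthand 200.1).
    """
    return (13 * hr + 3 * (bb + hbp) - 2 * k) / ip + constant

# Hypothetical season: 20 HR, 50 BB, 5 HBP, 200 K in 200 innings.
print(round(fip(hr=20, bb=50, hbp=5, k=200, ip=200.0), 3))
```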

This mattered enormously for player evaluation. When the Cubs acquired Jake Arrieta from Baltimore in 2013, his career ERA was 5.46. Terrible. But his FIP suggested he was significantly better than his results indicated. The Cubs bet on the underlying skills, not the outcomes. They were spectacularly right.

The gap between ERA and FIP is called "luck" in sabermetric circles, though that's a simplification. A pitcher with an ERA well above his FIP is often experiencing bad defense, bad sequencing by catchers, or random variation in when hits happen to fall. The Cubs systematically sought pitchers whose ERAs exceeded their FIPs, betting that regression would work in their favor.

In Arrieta's case, his Baltimore ERA was 5.46 but his FIP was closer to 4.50—still not good, but suggesting a league-average pitcher trapped in bad circumstances. The Cubs believed they could unlock even more by addressing his mechanics. They were vindicated beyond their wildest projections.

Chart: cubs-fip-era-2016

Leverage Index: Managing the Bullpen

Tango's Leverage Index measures the importance of a particular game situation. A tie game in the 9th inning has high leverage; a blowout in the 3rd has low leverage. The Cubs used this to optimize bullpen usage, deploying their best relievers in high-leverage situations regardless of inning, rather than saving the "closer" for the 9th.

Manager Joe Maddon became famous for his unconventional bullpen management, but it wasn't random. It was driven by Leverage Index calculations that told him exactly when each game hung in the balance. Aroldis Chapman, acquired at the trade deadline, was used aggressively in high-leverage spots throughout the playoffs, not just in save situations.

Traditional bullpen usage follows rigid roles: setup man in the 8th, closer in the 9th. But this ignores the reality that many 9th innings are low-leverage (leading by 3 runs) while many 7th innings are high-leverage (tie game, runners on base). The Cubs' analytical approach treated each plate appearance as a separate optimization problem. When should Chapman pitch? When the game matters most—regardless of the inning.
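A toy version of that decision rule might look like the sketch below. Everything here is an assumption for illustration: the `Reliever` type, the 1.5 threshold, the use of FIP as the quality measure, and the Leverage Index values passed in. By construction, an average situation has a Leverage Index of 1.0.

```python
from dataclasses import dataclass

@dataclass
class Reliever:
    name: str
    fip: float          # lower is better; any quality metric could be used
    available: bool = True

def choose_reliever(leverage_index, bullpen, high_leverage=1.5):
    """Pick an arm by game state rather than by inning."""
    rested = sorted((r for r in bullpen if r.available), key=lambda r: r.fip)
    if not rested:
        return None
    if leverage_index >= high_leverage:
        return rested[0]     # best available arm, whatever the inning
    return rested[-1]        # low leverage: preserve the better arms

bullpen = [Reliever("Closer", 2.10), Reliever("Setup", 3.20), Reliever("Long man", 4.40)]
print(choose_reliever(2.4, bullpen).name)   # hypothetical high-leverage spot
print(choose_reliever(0.4, bullpen).name)   # hypothetical low-leverage spot
```

A real system would also weigh rest days, matchups, and how likely a higher-leverage spot is to arise later in the same game.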

This approach was controversial. Baseball tradition runs deep, and many relievers bristled at losing their "defined roles." But the data was unambiguous: using your best pitchers in the highest-leverage situations wins more games. The Cubs estimated this approach was worth an additional 2-3 wins per season—not huge, but enough to matter in a sport where playoff berths are decided by single games.

In the 2016 World Series, Maddon's leverage-based management was on full display. Chapman entered Game 5 in the seventh inning with the Cubs down 3-1 in the series. It was nothing like a conventional ninth-inning save, but it was the highest-leverage moment of the season. He got the final eight outs, preserving a 3-2 win that kept the Cubs alive. Traditional management would have held him for the ninth; analytical management recognized that the lead might not survive that long.

Chart: cubs-leverage-2016

Part III: Ivy—The Scouting Revolution

The Cubs' most significant technological advantage was "Ivy," an internal operating system that became the backbone of their baseball operations. Ryan Kruse, who spent nearly seven years with the organization, was the "creator and architect" of the system. Jeff Greenberg, who joined as an intern in 2011 and rose to Assistant GM, was at "the vanguard of Ivy"—present for its developmental stages and witnessing the three-year buildout that transformed how the Cubs evaluated talent.

Ivy was organized around player names rather than teams or leagues, eliminating bias from the evaluation process. According to Matt Dorey, then the Cubs' Director of Amateur Scouting: "Once you hit that player page, there's tabs for every piece of information on that player you could ever want, whether it's pro or amateur, on the field or off."

The system aggregated data from multiple sources: traditional scouting reports, video analysis, biomechanical data, minor league statistics, and proprietary metrics. But its real innovation was in synthesis. Rather than treating scouting and analytics as separate disciplines, Ivy forced them to talk to each other.

A scout might report that a prospect had "plus bat speed." Ivy would cross-reference this with actual batted ball data. Were balls leaving the bat at velocities consistent with "plus" bat speed? If not, why the discrepancy? Maybe the scout was seeing potential that hadn't yet translated to results. Maybe the player had a mechanical flaw that could be corrected. The system generated questions as much as answers.

Ivy also tracked historical patterns. When a scout used the phrase "projectable frame," the system could compare to other players who'd received that label. Did "projectable frame" prospects actually develop as expected? At what rate? In which positions? This forced the organization to confront its own biases and assumptions with historical data.
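Ivy's internals are not public, but the kind of base-rate lookup described here can be sketched in a few lines. The report archive below is invented for illustration, and "reached MLB" is just one possible definition of a prospect developing as expected.

```python
from collections import defaultdict

# Hypothetical archive: (phrase used in the report, did the prospect reach MLB?)
REPORTS = [
    ("projectable frame", True), ("projectable frame", False),
    ("projectable frame", False), ("projectable frame", False),
    ("plus bat speed", True), ("plus bat speed", True),
    ("plus bat speed", False),
]

def hit_rate_by_phrase(reports):
    """Share of prospects carrying each scouting phrase who reached the majors."""
    counts = defaultdict(lambda: [0, 0])  # phrase -> [successes, total]
    for phrase, reached_mlb in reports:
        counts[phrase][0] += int(reached_mlb)
        counts[phrase][1] += 1
    return {phrase: hits / total for phrase, (hits, total) in counts.items()}

rates = hit_rate_by_phrase(REPORTS)
print(rates)
```

Grouping the same counts by position or by scout would answer the follow-up questions the paragraph raises.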

The integration went both ways. If the analytics showed a player was underperforming his batted ball data, scouts were dispatched to investigate. Maybe there was an injury. Maybe a mechanical hitch. Maybe a personal issue affecting performance. The data flagged anomalies; humans investigated causes. Neither approach was sufficient alone.
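The two-way check between scouting grades and measured data could be approximated as below. This is a sketch under stated assumptions: the grade-to-exit-velocity table and the 2 mph tolerance are placeholders, not anything the Cubs have published.

```python
# Hypothetical mapping from a 20-80 scouting grade to the average exit
# velocity (mph) one might expect from that grade; placeholder numbers.
EXPECTED_EV = {40: 85.0, 50: 88.0, 60: 91.0, 70: 94.0}

def flag_discrepancy(bat_speed_grade, avg_exit_velocity, tolerance=2.0):
    """Compare a scout's grade with measured contact quality."""
    gap = avg_exit_velocity - EXPECTED_EV[bat_speed_grade]
    if gap < -tolerance:
        return "data below grade: review mechanics, health, projection"
    if gap > tolerance:
        return "data above grade: send a scout for a second look"
    return "consistent"

print(flag_discrepancy(60, 87.5))   # graded plus, measured well below
print(flag_discrepancy(50, 88.9))   # grade and data agree
```

The function only raises the question; as the text notes, a person still has to find the cause.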

Ivy's most important feature may have been its accessibility. Previously, scouting reports lived in filing cabinets and the memories of veteran scouts. Minor league stats required manual compilation. Video existed but wasn't easily searchable. Ivy put everything in one place, queryable, sortable, comparable. A front office staffer could pull up every left-handed pitcher in the organization with a plus changeup in under a minute.

The system's effectiveness is demonstrated by what happened next. After the Cubs' championship, the Blackhawks hired both Greenberg and Kruse to build a similar platform for hockey, dubbed "Madhouse." Greenberg had finished as runner-up in the Blackhawks' GM search before accepting a role building out their analytics infrastructure. The Cubs' analytics staff became a talent pipeline for other organizations—both within baseball and across sports—a testament to the infrastructure Epstein built.

Like Moore's analytical methods, Ivy's specific algorithms and models remain proprietary. We know its structure from interviews, but not its substance. The Cubs never published how they weighted different data sources or what predictive models powered their evaluations. That intellectual property walked out the door with the staff who built it.

Part IV: The Trades That Built a Champion

The Cubs' championship roster wasn't built through free agency splurges. It was built through a series of trades that now look like highway robbery. The common thread: the Cubs consistently acquired players that other organizations had undervalued.

Chart: cubs-trades-net-war

Look at the numbers. The Rizzo trade netted +15 WAR (Wins Above Replacement). The Hendricks trade, in which the Cubs gave up Ryan Dempster at the 2012 deadline, added +17.2 WAR. The Arrieta trade brought in +18.3 WAR for essentially nothing (Scott Feldman and Steve Clevenger combined for 1.2 WAR after leaving Chicago).

Jake Arrieta: The Redemption Story

No trade better exemplifies the Cubs' analytical approach than the July 2013 acquisition of Jake Arrieta from Baltimore. The Orioles had given up on him. His career numbers were dismal: a 5.46 ERA, a 1.49 WHIP, and a walk rate too high for a starter.

Chart: cubs-arrieta-transformation

What happened next was remarkable. The Cubs identified that Arrieta's problems were mechanical and pitch-mix related, not stuff-related. His arm was still elite. Working with pitching coach Chris Bosio, they made specific changes to his delivery and pitch mix.

The results were immediate and staggering. In his first partial season with Chicago, Arrieta posted a 3.66 ERA. In 2014, he dropped to 2.53. In 2015, he was arguably the best pitcher in baseball: 22-6, 1.77 ERA, 236 strikeouts, and a Cy Young Award. His FIP that season was 2.35—elite by any measure.

Arrieta's transformation wasn't magic. It was the application of data to a talented arm that had been mismanaged. The Cubs saw what Baltimore couldn't, because they were looking at different numbers.

Kyle Hendricks: The Professor

If Arrieta was a reclamation project, Kyle Hendricks was a complete heist. The Cubs acquired him from Texas in 2012 as part of the Ryan Dempster trade. The Rangers included him almost as an afterthought. Texas wanted Dempster's veteran presence; the Cubs wanted the young arm with the interesting profile.

Chart: cubs-hendricks-value

Hendricks didn't throw hard—his fastball sat around 87 mph, well below league average. But he had exceptional command, a devastating changeup, and an uncanny ability to induce weak contact. The Cubs' analytical models valued these traits more than raw velocity.

The analytics department had developed models showing that command was undervalued relative to velocity. A pitcher who could locate 90 mph fastballs on the corners was often more effective than one who could throw 95 mph down the middle. Hendricks was the extreme case: below-average velocity, elite location. His pitch-by-pitch data showed he consistently hit his spots within inches of the target.

His changeup was particularly devastating. The Cubs' analysis showed that the pitch tunneled perfectly with his fastball—same arm slot, same initial trajectory, different speeds. Hitters couldn't distinguish between them until too late. In 2016, opponents hit just .183 against his changeup with a whiff rate above 35%. This wasn't stuff overpowering hitters; it was deception and sequencing, optimized by data.

In 2016, Hendricks led the National League with a 2.13 ERA. His FIP was 3.20, about a run higher, which is the kind of gap FIP tends to show for pitchers who suppress hard contact rather than pile up strikeouts. He was executing a game plan built on pitch tunneling and sequencing that left hitters flailing. The Cubs paid him less than $1 million that season. He provided 4.6 WAR. The surplus value was enormous, probably the best pitcher contract in baseball that year.
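To put a rough number on that surplus, one common back-of-envelope approach multiplies WAR by a market price per win and subtracts salary. The $8 million per WAR rate below is an assumed ballpark, not a Cubs figure, and the function name is illustrative.

```python
def surplus_value(war, salary_musd, dollars_per_war_musd=8.0):
    """Market value of production minus salary, in millions of dollars.

    `dollars_per_war_musd` is an assumed market rate, not a figure from the Cubs.
    """
    return war * dollars_per_war_musd - salary_musd

# Hendricks 2016 per the text: 4.6 WAR on a salary under $1 million.
# Using 1.0 as the salary makes this a lower bound on the surplus.
print(round(surplus_value(war=4.6, salary_musd=1.0), 1))
```

Under those assumptions the surplus comes out above $35 million for a single season.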

Part V: The 2016 Championship

By 2016, all the pieces were in place. The young core of Bryant, Rizzo, Baez, Schwarber, and Russell had arrived. The other acquisitions (Arrieta, Hendricks, Fowler, Zobrist) were contributing at peak levels. The team rolled through the regular season, winning 103 games with a +252 run differential.

The postseason was more dramatic. After beating the Giants in four games in the NLDS, a series that ended with a four-run ninth-inning comeback in Game 4, they faced the Dodgers in the NLCS. Los Angeles took a 2-1 series lead, raising the specter of another Cubs collapse. But the analytics suggested calm.

The Cubs' models showed they had actually outplayed the Dodgers in xwOBA (expected weighted on-base average, which strips out luck on batted balls). They were hitting the ball hard; the results just weren't falling. Probability models gave the Cubs a better chance of winning Games 4-7 than their 2-1 deficit suggested. The front office urged patience, and the team delivered, winning three straight to advance.

The World Series against Cleveland was a rollercoaster: the Cubs fell behind 3-1, staring at elimination. Game 5 was do-or-die, and the Cubs responded with a 3-2 victory behind Jon Lester and Aroldis Chapman. Game 6 was a blowout, 9-3, forcing a decisive Game 7.

What happened in Game 7 is etched in baseball history. The Cubs carried a 6-3 lead into the 8th inning, then collapsed. Cleveland tied it on a Rajai Davis home run off Chapman, who was pitching on fumes after heavy usage throughout the playoffs. A 17-minute rain delay before the 10th inning, which felt like hours to Cubs fans, gave both teams a chance to regroup. When play resumed, the Cubs scored two runs in the top of the 10th, then survived a Cleveland rally to win 8-7. One hundred and eight years of futility ended in extra innings, in the rain, with the tying run on base.

Key 2016 Stats

  • 103-58 regular season record (best since 1910)
  • +252 run differential (best in MLB since 2001 Mariners)
  • 7.7 WAR from Kris Bryant (MVP season)
  • 2.13 ERA from Kyle Hendricks (league-leading)
  • Game 7 win in 10 innings to end 108-year drought

But even in triumph, there were warning signs. The Cubs traded Gleyber Torres, their top prospect, to acquire Aroldis Chapman for the stretch run. Torres would go on to become an All-Star with the Yankees. The Cubs' WAR from Chapman in the playoffs was roughly 1.4; Torres has already accumulated over 22 WAR in his career. It was a trade that prioritized the present over the future, a pattern that would define the Cubs' post-championship years.

Part VI: The Draft Bonanza

The Cubs' rebuild was built on a foundation of high draft picks accumulated during the tanking years. Let's look at how those picks performed:

Chart: cubs-draft-outcomes

The headliners are obvious. Kris Bryant (2013, #2 overall) has accumulated 27 career WAR. Javier Baez (2011, #9 overall) contributed 14.9 WAR with the Cubs. Kyle Schwarber (2014, #4 overall) has been a productive major leaguer for a decade.

But look at the misses too. Daniel Vogelbach (#43 in 2011) never panned out in Chicago. Albert Almora Jr. (#6 in 2012) was a useful player but not a star. The late-round picks mostly flamed out. This is normal—even the best front offices can't identify every hit—but it meant the Cubs needed their top picks to deliver. They did, barely.

Part VII: The Window Closes

Championship windows in baseball are notoriously brief. The Cubs' window appeared massive in 2016: a young, talented core locked up for years, a deep farm system (even after the Torres trade), and an owner willing to spend. What happened?

Chart: cubs-player-decline

The chart tells a nuanced story. By 2019—when the Cubs missed the playoffs for the first time since 2014—the core had fractured. Bryant dropped from 7.7 WAR to 4.8 (still good, but not MVP-caliber). Rizzo fell from 4.7 to 2.8. Arrieta, now with the Phillies, collapsed from 4.0 to 0.2 WAR. Russell went from a solid 2.6 to a replacement-level -0.3 before being released.

Interestingly, some players improved. Javier Baez posted a career-high 5.3 WAR in 2019, finally becoming the star everyone projected. Kyle Schwarber grew from a 0.5 WAR part-timer to a solid 1.9 WAR regular. But individual gains couldn't offset the collective decline—or the holes that opened up.

Several factors contributed:

Injuries

Kris Bryant was never fully healthy after 2016. A shoulder injury in 2018 affected his swing. He played through pain, but his exit velocities and launch angles never recovered. By 2019, he was still productive but no longer the MVP-caliber player who had anchored the championship team.

The Russell Problem

Addison Russell was accused of domestic violence in 2017 and suspended for 40 games in 2018. The Cubs, to their credit, let him go after the 2019 season by declining to tender him a contract. But losing their starting shortstop, who had 3.6 WAR in 2016, created a hole they never adequately filled.

The Heyward Albatross

The Cubs signed Jason Heyward to an 8-year, $184 million contract before the 2016 season. He was supposed to be a superstar. He never came close. His 2016 wOBA was .290—well below average. His wRC+ of 72 meant he was producing offense at 72% of league average. The contract handcuffed the Cubs' payroll flexibility for years.

What went wrong? Heyward's batted ball data suggested a player whose swing had fundamentally changed. His launch angle dropped, resulting in fewer line drives and more grounders. His exit velocity declined from 90 mph to 87 mph. The Cubs tried everything—swing changes, bat changes, different batting stances—but nothing worked. The analytics that had identified undervalued players couldn't fix a player whose skills had eroded.

The Heyward signing represents the limits of analytical thinking. The Cubs' models projected him to be worth roughly $30 million in WAR over the contract. He provided closer to $15 million. The projection wasn't crazy—Heyward was 26 years old with a track record of above-average offense and elite defense. But projections are probabilities, not certainties. The Cubs lost this bet badly.

Theo's "Winner's Trap"

Perhaps most tellingly, Epstein himself coined the term "winner's trap" to describe what happened. After winning in 2016, the Cubs felt pressure to keep winning immediately. They made win-now moves—trading prospects for rentals, signing aging veterans—rather than continuing to build for sustained success. The same analytical rigor that built the champion was abandoned in the rush to repeat.

The data supports Epstein's self-diagnosis. From 2016 to 2019, the Cubs traded away several top prospects: Gleyber Torres, Eloy Jimenez, Dylan Cease. Each trade was defensible in the moment: Torres for Chapman helped win a championship, and Jimenez and Cease for José Quintana addressed a rotation need. But the cumulative effect was devastating. The pipeline that had produced Bryant and Schwarber ran dry.

The farm system that ranked top-5 in baseball in 2015 fell to the bottom third by 2019. Without homegrown talent emerging to replace aging stars, the Cubs had to rely on free agency and trades—exactly the expensive, inefficient approaches that the original rebuild had been designed to avoid.

Chart: cubs-acquisitions-war

Look at the acquisition analysis above. The early moves—Bryant, Rizzo, Arrieta, Hendricks—were home runs. The later moves were mostly pushes or losses. The Heyward signing was a bust. The Chapman trade was, in retrospect, a championship bought at the cost of future value.

Part VIII: Comparing Championship Windows

The chart below tracks playoff depth by year—how far each team advanced in October. A score of 5 means World Series champion, 4 means World Series loss, 3 means Championship Series loss, and so on down to 0 for missing the playoffs entirely.

Chart: cubs-window-comparison
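The playoff-depth scale is simple enough to write down as a lookup table. In this sketch the scores for a Division Series loss (2) and a Wild Card loss (1) are inferred from the "and so on" in the description, and the Cubs' results are the ones given in this article.

```python
# Scores 2 and 1 are inferred from "and so on" in the scale's description.
DEPTH_SCORE = {
    "won World Series": 5,
    "lost World Series": 4,
    "lost Championship Series": 3,
    "lost Division Series": 2,
    "lost Wild Card": 1,
    "missed playoffs": 0,
}

# Cubs results as described in this article.
CUBS = {
    2016: "won World Series",
    2017: "lost Championship Series",
    2018: "lost Wild Card",
    2019: "missed playoffs",
}

cubs_scores = {year: DEPTH_SCORE[result] for year, result in CUBS.items()}
print(cubs_scores)
```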

The Giants' line tells an unusual story: they went all the way three times (2010, 2012, 2014) but missed the playoffs entirely in the intervening years. Their "dynasty" was actually a series of peaks and valleys: they never sustained excellence, they just peaked at exactly the right moments. Madison Bumgarner's legendary postseason performances (a 2.11 career postseason ERA, 4 wins in 2014 alone) masked a team that was merely average in odd-numbered years.

The Dodgers show a different pattern: sustained excellence that kept falling short. They reached the World Series in 2017 and 2018, losing both times—once to a cheating Astros team, once to the Red Sox. They finally broke through in 2020's pandemic-shortened season. Their window stayed open because they kept investing: Mookie Betts, Freddie Freeman, a $200+ million payroll.

The Cubs' trajectory is the cautionary tale. They peaked in 2016 with the championship, stayed competitive in 2017 (NLCS loss to the Dodgers), then began a rapid descent. The 2018 Wild Card loss to the Rockies—a single-game elimination after 95 wins—felt like bad luck. But 2019's complete collapse (84 wins, missed playoffs) revealed the underlying decay. The window didn't slowly close; it slammed shut.

Why did the Cubs get only one shot while other teams got multiple chances? Three factors stand out. First, they traded their prospect depth for the 2016 and 2017 pushes (Torres, Jimenez, Cease), leaving no reinforcements. Second, their core aged simultaneously: Bryant, Rizzo, Russell, and Schwarber were all the same generation, so they declined together. Third, the Heyward contract consumed $184 million that could have been spent on replacements.

Part IX: The Cheating Scandals

Any discussion of this era must address the elephant in the room: cheating. Two major scandals touched teams in the Cubs' competitive orbit.

The Cardinals Hacking Scandal

In 2015, the FBI revealed that the St. Louis Cardinals had illegally accessed the Houston Astros' internal database, Ground Control. Chris Correa, the Cardinals' scouting director, was sentenced to 46 months in prison. He had guessed Jeff Luhnow's password (Luhnow had moved from St. Louis to Houston) and downloaded years of proprietary scouting data.

Importantly, this scandal involved the Cardinals stealing from the Astros, not the Cubs. There's no evidence the Cardinals used stolen data against Chicago. But it revealed the high stakes of baseball's analytics arms race. Teams were building proprietary advantages worth billions, and some people were willing to commit federal crimes to access them.

The Astros Sign-Stealing Scandal

The bigger scandal broke in 2019: the Houston Astros had used electronic equipment to steal opponents' signs during their 2017 championship run (and into 2018). A camera in center field captured the catcher's signs, analysts decoded them in real-time, and players were alerted to upcoming pitches via banging on trash cans.

The Cubs never faced the Astros in the postseason during this period. They played Cleveland in the 2016 World Series, and the Astros' documented cheating began in 2017, after the Cubs had already won. So while the Astros scandal was one of the biggest in baseball history, it didn't directly affect the Cubs' championship or their subsequent playoff runs.

However, the scandal did affect the competitive landscape. The Astros won the 2017 World Series (with cheating), beating the Dodgers and Yankees, two teams that might have been the Cubs' competition in a cleaner era. The ripple effects are impossible to quantify but worth noting.

The Astros scandal also raised uncomfortable questions about the analytical revolution more broadly. Houston had been one of the most analytics-forward organizations in baseball, hiring NASA engineers and building sophisticated systems to extract every possible edge. Where was the line between legal analytics and illegal cheating? The Astros crossed it blatantly, but the entire industry was pushing boundaries.

The Cubs, to their credit, were never implicated in any sign-stealing scheme. Their analytical advantages came from better player evaluation, development, and in-game strategy—not from cheating. In retrospect, this may have been a competitive disadvantage during the 2017-2018 window. Teams that cheated may have stolen wins from teams that didn't.

Part X: What Machine Learning Reveals

We built an actual machine learning model to predict Cubs player performance—then tested it on data the model had never seen. The goal: understand what a 2016-era front office could have predicted about their window's trajectory, and compare our modern approach to what the Cubs likely used.

Our Model: Gradient Boosting with Walk-Forward Validation

We trained a gradient boosting regressor on Cubs core players' WAR data from 2012-2016, then predicted their 2018-2019 performance (held out as "future" data). The features: player age, current-year WAR, prior-year WAR, WAR from two years ago, peak WAR achieved, years since peak, and a pitcher/position indicator.
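For concreteness, the feature list above could be assembled from a per-season WAR history roughly as follows. The function name, the dictionary layout, and the choice to fill missing prior seasons with 0.0 are assumptions made for this sketch; the WAR values are toy numbers.

```python
def build_features(age, war_by_season, season, is_pitcher):
    """One training row for `season`, using only data up to that season."""
    history = {s: w for s, w in war_by_season.items() if s <= season}
    peak_season = max(history, key=history.get)
    return {
        "age": age,
        "war_current": history[season],
        "war_prev1": history.get(season - 1, 0.0),   # missing season -> 0.0 (assumption)
        "war_prev2": history.get(season - 2, 0.0),
        "war_peak": history[peak_season],
        "years_since_peak": season - peak_season,
        "is_pitcher": int(is_pitcher),
    }

# Hypothetical player, toy WAR values. The 2017 entry must not leak into 2016 features.
toy = {2013: 1.5, 2014: 4.0, 2015: 3.0, 2016: 2.5, 2017: 9.9}
row = build_features(age=27, war_by_season=toy, season=2016, is_pitcher=False)
print(row)
```

Filtering the history to seasons at or before the target season is what keeps the peak features from peeking at the future.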

Training on 2012-2016 and testing on later seasons is a temporal holdout, the simplest form of walk-forward validation. It simulates what a front office would face: you only have historical data, and you're predicting futures you haven't seen yet. No peeking allowed.
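The mechanics of walk-forward validation can be shown without any modeling library. In the sketch below a naive persistence forecast (next year equals the most recent year) stands in for the gradient boosting regressor, and the player rows are toy values rather than real WAR.

```python
from statistics import mean

# Hypothetical (player, season, WAR) rows; toy values, not real data.
ROWS = [
    ("A", 2014, 2.0), ("A", 2015, 4.0), ("A", 2016, 6.0), ("A", 2017, 5.0), ("A", 2018, 3.0),
    ("B", 2014, 1.0), ("B", 2015, 3.0), ("B", 2016, 3.0), ("B", 2017, 2.0), ("B", 2018, 2.0),
]

def walk_forward_splits(rows, first_test_year):
    """Yield (test_year, train, test); training only ever sees earlier seasons."""
    years = sorted({season for _, season, _ in rows})
    for test_year in years:
        if test_year < first_test_year:
            continue
        train = [r for r in rows if r[1] < test_year]
        test = [r for r in rows if r[1] == test_year]
        yield test_year, train, test

def persistence_forecast(train, player):
    """Naive stand-in model: predict the player's most recent observed WAR."""
    history = [(season, war) for p, season, war in train if p == player]
    return max(history)[1]  # WAR from the latest season in the training window

def evaluate(rows, first_test_year):
    """Return {test_year: mean absolute error} across the walk-forward folds."""
    errors = {}
    for test_year, train, test in walk_forward_splits(rows, first_test_year):
        abs_errs = [abs(persistence_forecast(train, p) - war) for p, _, war in test]
        errors[test_year] = mean(abs_errs)
    return errors

print(evaluate(ROWS, first_test_year=2017))
```

Swapping in a real regressor would mean fitting it on `train` inside the loop and scoring it on `test`; the split logic stays the same.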

Chart: cubs-war-predictions

The chart shows each player's average prediction error across 2018-2019. Positive values mean we overestimated their performance; negative means we underestimated. The model captures the general direction of decline but struggles with individual variance. Russell's collapse (off-field issues) and Baez's breakout (late bloomer) were hard to predict from the numbers alone.

How the Cubs Probably Did It

The Cubs' internal projection systems—built into the Ivy platform—likely used similar but more sophisticated approaches. Key differences from our model:

More features. The Cubs had access to Statcast data (exit velocity, launch angle, sprint speed), which is shared league-wide, plus proprietary biomechanical assessments from their training staff and detailed injury histories. Our model uses only WAR and age, a fraction of their information.

Larger training sets. We trained on just 7 Cubs players over 5 years. The Cubs trained on decades of MLB history across thousands of players, allowing their models to learn subtle patterns our small dataset can't capture.

Ensemble methods. Modern front offices don't rely on one model. They blend multiple approaches: aging curves, similarity scores, regression-based projections, and machine learning models. Disagreement between models signals uncertainty worth investigating.

Human override. The Cubs' scouts and coaches provided qualitative inputs that models can't capture: work ethic, mental makeup, how a player responded to adversity. These soft factors sometimes outweigh statistical projections.
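The ensemble idea above, blending several projections and treating disagreement as a signal, can be sketched in a few lines. This is an illustration under stated assumptions, not the Cubs' method: the equal-weight average, the function name `blend_projections`, and the 1.0 WAR disagreement threshold are all hypothetical choices, and the inputs are placeholder numbers.

```python
from statistics import mean, pstdev


def blend_projections(projections, disagreement_threshold=1.0):
    """Blend per-model WAR projections and flag high disagreement.

    `projections` maps a model name to its projected WAR. The blend is a
    plain average; `spread` is the population standard deviation across
    models; `flag` is True when the spread exceeds the threshold.
    """
    values = list(projections.values())
    if not values:
        raise ValueError("need at least one projection")
    spread = pstdev(values)
    return {
        "blend": mean(values),
        "spread": spread,
        "flag": spread > disagreement_threshold,
    }


# Placeholder projections for one hypothetical player.
example_blend = blend_projections(
    {"aging_curve": 4.0, "similarity": 5.0, "regression": 3.0, "ml": 4.0}
)
```

A flagged player is not necessarily a bad bet. The flag only marks cases where the models disagree enough that a human should look at why.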

What Both Approaches Got Right

Even our simple model correctly predicted the general trajectory: decline was coming. Regression to the mean is unavoidable. A 7.7 WAR season (Bryant 2016) almost never repeats—the expected value for his next season was closer to 5.5 WAR, and our model predicted 5.3. When he actually produced 6.6, that was the outlier.
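Regression to the mean can be written as a one-line shrinkage formula: pull the observed value part of the way back toward a baseline. The sketch below is generic and is not how the article's 5.5 figure was derived; the function name `regress_to_mean` and the example baseline (2.0 WAR) and reliability weight (0.6) are illustrative assumptions only.

```python
def regress_to_mean(observed, baseline, reliability):
    """Shrink an observed value toward a baseline.

    reliability = 1.0 trusts the observation fully; 0.0 ignores it and
    returns the baseline. Values in between split the difference.
    """
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must be between 0 and 1")
    return baseline + reliability * (observed - baseline)


# Assumed, illustrative inputs: a 7.7 WAR season, a 2.0 WAR baseline,
# and a reliability weight of 0.6.
example_projection = regress_to_mean(7.7, baseline=2.0, reliability=0.6)
```

The further a season sits above the baseline, the larger the expected pullback, which is why great seasons rarely repeat at the same level.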

The Cubs' front office knew their window was finite. Theo Epstein has said explicitly that they saw the decline coming and chose to ride the wave rather than trade players at peak value. Given that they broke a 108-year curse, the decision looks justified. But it came at a cost measured in futures that never materialized.

Part XI: The Fire Sale

By 2021, the dream was over. At the trade deadline, the Cubs dealt Bryant, Rizzo, and Baez—the core of their championship team—to contenders. It was a fire sale that acknowledged what the data had been saying for years: the window was shut.

The trades happened in rapid succession. Bryant went to San Francisco for two prospects. Rizzo went to the Yankees for another pair. Baez went to the Mets along with Trevor Williams. In a matter of days, the 2016 championship core was scattered across three different teams.

The returns were modest. The Cubs got prospects, but none projected as future stars. The organizational talent pipeline that had produced Bryant and Schwarber had run dry. The rebuild needed to start again. Theo Epstein had already stepped down after the 2020 season and taken a consulting role with Major League Baseball itself. The architects of the championship were gone.

For fans who had watched the rebuild from 2012-2016, it felt like déjà vu. Another teardown. Another accumulation of draft picks and prospects. Another wait for the next window to open. The cycle of baseball rebuilding continued.

Part XII: Lessons for Building a Dynasty

What can we learn from the Cubs' experience? Several things:

1. Windows Are Shorter Than They Appear

The Cubs' competitive window lasted roughly 2015-2018, with 2016 being the clear peak. That's four years at the top, with only one championship to show for it. Modern baseball's economics, with its arbitration system, free agency, and competitive balance tax, makes sustained dominance extremely difficult.

2. Analytics Can Build Champions, But Can't Sustain Them

The Cubs used analytics brilliantly to assemble their roster. They identified undervalued players, optimized development, and made smart trade decisions. But analytics couldn't prevent injuries, couldn't stop natural aging, and couldn't overcome the structural challenges of maintaining a championship-caliber payroll year after year.

3. The Draft Is Everything

The Cubs' championship was built on draft picks (Bryant, Baez, Schwarber) and shrewd trades (Rizzo, Arrieta, Hendricks). Free agency played a supporting role at best. The Heyward and Lester signings were necessary to push the team over the top, but they weren't the foundation.

This is perhaps the most important lesson. In a sport with a luxury tax (a soft limit on payroll, not a hard cap) and significant revenue sharing, sustainable success requires a constant pipeline of cheap, controllable talent. The Dodgers have understood this: they spend big in free agency while maintaining a top farm system. The Cubs tried to do both but lost the thread after 2016.

4. Culture Matters, But Results Matter More

Jason Heyward's famous rain-delay speech in Game 7 has become legendary. He rallied the team during the 17-minute delay, and the Cubs scored two runs in the 10th to win. Culture won that game.

But culture couldn't make Heyward hit. It couldn't heal Bryant's shoulder. It couldn't overcome the structural challenges of maintaining a championship roster. Culture is necessary but not sufficient. The Cubs had elite culture in 2016; they had the same culture in 2019 when they missed the playoffs.

5. One Championship Is Still a Championship

The Cubs ended a 108-year drought. No amount of analytical second-guessing can take that away. For a franchise and a fanbase that had suffered through the Billy Goat curse, Bartman, and countless heartbreaks, 2016 was everything.

The 30 teams in Major League Baseball have won a combined 120 World Series since 1903, and 25 franchises have won at least one. The Cubs were already in that club thanks to their 1907 and 1908 titles. Returning to it after 108 years, and ending one of sports' longest droughts, was a historic achievement regardless of what came after.

Conclusion: The Ghost of Dynasty Past

The 2016 Chicago Cubs were, by the numbers, one of the best teams in baseball history. Their +252 run differential was historically elite. Their rotation, anchored by Arrieta and Hendricks, was dominant. Their lineup, featuring Bryant, Rizzo, and Zobrist, was deep and dangerous. Their defense, led by Baez's wizardry at second base, was spectacular.

They should have been a dynasty. They were built to be a dynasty. The analytical infrastructure—Tango's metrics, Kruse's Ivy system, Epstein's process-driven approach—was supposed to create sustainable success.

Instead, they won exactly one championship. It was glorious—a Game 7 victory that ended over a century of futility. But it was also singular. The dynasty that should have been became a one-year wonder.

Why? The data points to many factors: injuries, personnel decisions, the inherent volatility of playoff baseball, the difficulty of sustaining excellence in a sport designed for parity. But perhaps the deepest lesson is this: baseball resists dynasties. The sample sizes are too small, the variance too high, the margins too thin.

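Consider the mathematics. Suppose a team wins 60% of its games against every opponent, which would be historically dominant over a full regular season. If each game is treated as independent, that team wins a best-of-five series about 68% of the time and a best-of-seven about 71% of the time. To win all three rounds and claim a championship, as a division winner had to in 2016, it has only about a 34% chance in any given year. Against playoff-caliber opponents the per-game edge is smaller: at 55% per game, the title odds fall to roughly 22%. The Cubs were probably that good in 2016. They might have been that good in 2017 and 2018 too. But even the best team in baseball is more likely to go home empty-handed than to win it all.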
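The playoff arithmetic can be checked directly. The sketch below assumes every game is independent with the same win probability p, which ignores home field, pitching matchups, and opponent quality. The function names `series_win_prob` and `title_prob` are hypothetical, and the default round lengths (best-of-five, then two best-of-sevens) match the format a division winner faced in 2016.

```python
from math import comb


def series_win_prob(p, best_of):
    """Probability of winning a best-of-N series with per-game win prob p."""
    if best_of < 1 or best_of % 2 == 0:
        raise ValueError("best_of must be a positive odd number")
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    need = best_of // 2 + 1  # wins required to clinch
    q = 1.0 - p
    # Sum over k = number of losses before the clinching win. The clinching
    # game is fixed as a win, so the k losses are arranged among the
    # previous need - 1 + k games.
    return sum(comb(need - 1 + k, k) * p**need * q**k for k in range(need))


def title_prob(p, rounds=(5, 7, 7)):
    """Probability of winning every round in sequence."""
    prob = 1.0
    for best_of in rounds:
        prob *= series_win_prob(p, best_of)
    return prob


dominant = title_prob(0.60)  # about 0.34 under these assumptions
strong = title_prob(0.55)    # about 0.22 under these assumptions
```

Under these assumptions a short series barely amplifies a per-game edge, and multiplying three rounds together shrinks it further. That is the arithmetic behind baseball's resistance to dynasties.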

The Cubs did everything right to build a champion. They just couldn't do it twice. In a sport where the best team wins barely 60% of its games, maybe that's not failure. Maybe that's just baseball.

The irony is that the same analytical revolution that built the Cubs' championship has made championships harder to win. Every team now has analytics departments. Every team now optimizes roster construction. The edges the Cubs found in 2012-2016 have been arbitraged away. The next great breakthrough will require something new—and every team is searching for it.

The numbers tell us the Cubs' window closed faster than it should have. But they also tell us that window was real, and spectacular, and enough to break a curse that had lasted 108 years. In the end, the data confirms what Cubs fans already knew: 2016 was worth the wait. Every spreadsheet, every model, every optimization—it all led to that moment in Cleveland, rain falling, 10th inning, when a franchise and a city finally got to celebrate.

The dynasty that should have been wasn't. But the championship that happened was. And after 108 years of waiting, one was enough to matter. Everything after was just baseball doing what baseball does: humbling everyone who thinks they've figured it out, and reminding us why we keep watching anyway.

Data Sources

  • Baseball-Reference.com for WAR calculations and historical statistics
  • FanGraphs for advanced metrics (wOBA, FIP, wRC+)
  • MLB.com for transaction history and draft data
  • Tango, Lichtman, and Dolphin, "The Book: Playing the Percentages in Baseball" (2006)
  • Department of Justice press releases on Cardinals hacking case
  • MLB Commissioner's Report on Houston Astros sign-stealing