How MLB Prediction Models Actually Work

Most baseball prediction systems advertise accuracy numbers without explaining what is under the hood. That is a problem. A model you do not understand is a model you cannot evaluate, and a model you cannot evaluate is one you are taking on faith. Faith is not a methodology.

This section of MLBPrediction.com exists to open the hood. Every concept covered here represents a real component of modern baseball forecasting, from the foundational math of run expectancy to the unsolved problem of quantifying how wrong your model might be on any given night. None of this is theoretical. These are the building blocks that power projection systems across the sport.

Baseball is uniquely suited to prediction science. The discrete event structure of the sport, where every plate appearance produces a classifiable outcome and every game state can be described by a base-out matrix, makes it arguably more modelable than any other major sport. But more modelable does not mean easy. The interactions between components create complexity that no single metric can capture.

The Foundation: Modeling How Runs Score

Every prediction model starts with the same question: how many runs will each team score? The answer begins with run expectancy, a framework that assigns expected run values to each of the 24 possible base-out states. A runner on second with nobody out is worth more expected runs than a runner on first with two outs, and the magnitude of that difference is quantifiable across decades of play-by-play data.

Run expectancy is not a static table. It shifts based on era, park, lineup quality, and the specific pitcher on the mound. Modern models build dynamic run expectancy matrices that adjust in real time rather than relying on league-average tables from five years ago. Understanding how these matrices are constructed, and where they break down, is the first step toward understanding any prediction system.
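As a rough illustration of how a static run expectancy table is built before any dynamic adjustment, the sketch below averages the runs scored from each base-out state to the end of the half-inning. The input layout (a list of half-innings, each a list of `(state, runs_on_play)` tuples) and the function name are assumptions made for this example, not a real data format.

```python
from collections import defaultdict

def run_expectancy(half_innings):
    """Average runs scored from each base-out state to the end of the half-inning.

    half_innings: list of half-innings; each is a chronological list of
    (state, runs_on_play) tuples, where state = (bases, outs), e.g. ("-2-", 0).
    This input layout is hypothetical and chosen for the example.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for plays in half_innings:
        # Runs still to come when the first play of the half-inning begins.
        runs_remaining = sum(runs for _, runs in plays)
        for state, runs in plays:
            totals[state] += runs_remaining
            counts[state] += 1
            runs_remaining -= runs  # runs on this play are no longer "to come"
    return {state: totals[state] / counts[state] for state in counts}

# Toy input, not real play-by-play data.
toy = [
    [(("---", 0), 1), (("---", 0), 0), (("---", 1), 0), (("---", 2), 0)],
    [(("---", 0), 0), (("---", 1), 0), (("---", 2), 0)],
]
table = run_expectancy(toy)
```

A dynamic matrix of the kind described above would condition this average on era, park, lineup, and pitcher rather than pooling every half-inning together.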

Breaking Down the Plate Appearance

Runs score because batters reach base and advance. But reaching base is not a single event. It is the product of a probability tree where every plate appearance branches into a strikeout, walk, single, double, triple, home run, or batted-ball out, each with its own likelihood based on the batter-pitcher matchup, count leverage, and pitch mix.

Modeling plate appearance outcomes means estimating the probability of each branch for every possible matchup. This is where pitch-level data becomes essential. A fastball-dominant right-hander facing a left-handed batter with a 40% chase rate on sliders outside the zone produces a fundamentally different probability distribution than the same pitcher facing a disciplined right-handed contact hitter. Collapsing those differences into a single projected batting average discards most of the information that makes predictions useful.
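One simple way to estimate the branch probabilities for a matchup is a multi-outcome odds-ratio blend of batter, pitcher, and league rates. This is a swapped-in textbook technique used here for illustration, not necessarily what any particular system uses, and every rate below is a made-up number.

```python
OUTCOMES = ("K", "BB", "1B", "2B", "3B", "HR", "OUT")

def matchup_probs(batter, pitcher, league):
    """Blend batter and pitcher outcome rates relative to league average.

    Each argument maps every outcome in OUTCOMES to a per-plate-appearance rate.
    The blend multiplies batter and pitcher rates, divides by the league rate,
    then renormalizes so the seven branches sum to 1.
    """
    raw = {o: batter[o] * pitcher[o] / league[o] for o in OUTCOMES}
    total = sum(raw.values())
    return {o: raw[o] / total for o in OUTCOMES}

# Illustrative rates only (not real player or league data).
league = {"K": 0.22, "BB": 0.085, "1B": 0.14, "2B": 0.045,
          "3B": 0.004, "HR": 0.03, "OUT": 0.476}
strikeout_pitcher = {"K": 0.30, "BB": 0.07, "1B": 0.125, "2B": 0.04,
                     "3B": 0.003, "HR": 0.025, "OUT": 0.437}
contact_hitter = {"K": 0.12, "BB": 0.09, "1B": 0.18, "2B": 0.05,
                  "3B": 0.005, "HR": 0.02, "OUT": 0.535}

probs = matchup_probs(contact_hitter, strikeout_pitcher, league)
```

A pitch-level model would replace these season-level rates with estimates conditioned on handedness, pitch mix, and count, but the output has the same shape: one probability per branch.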

From Matchups to Outcomes: Game Simulation

Knowing the probability distribution for individual plate appearances is necessary but not sufficient. A baseball game is a sequence of roughly 35 to 45 plate appearances per team, and the order in which outcomes occur matters enormously. A solo home run and a grand slam are both home runs, but they produce different run totals. Simulation captures this.

Monte Carlo game simulation runs thousands of plate-appearance sequences through the inning and game structure, each time sampling from the estimated probability distributions. The result is not a single predicted score but a distribution of possible outcomes. The mean or median of that distribution serves as the point estimate, but the shape of the distribution, whether it is tight or wide, symmetric or skewed, carries just as much information.
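A minimal sketch of the idea follows, under heavy simplifying assumptions: every batter shares one outcome distribution, runners advance exactly as many bases as the batter on a hit, only forced runners move on a walk, and there are no errors, steals, double plays, or extra innings. All function names are hypothetical.

```python
import random

OUTCOMES = ("K", "BB", "1B", "2B", "3B", "HR", "OUT")
HIT_BASES = {"1B": 1, "2B": 2, "3B": 3, "HR": 4}

def advance(bases, outcome):
    """Return (new_bases, runs) for a walk or hit; bases = (first, second, third)."""
    first, second, third = bases
    if outcome == "BB":
        # Only forced runners move on a walk.
        if first and second and third:
            return bases, 1
        if first and second:
            return (True, True, True), 0
        if first:
            return (True, True, third), 0
        return (True, second, third), 0
    n = HIT_BASES[outcome]
    runs = 0
    new_bases = [False, False, False]
    # The batter starts at position 0, runners at 1 to 3; reaching 4 scores.
    starts = [0] + [i + 1 for i, occupied in enumerate(bases) if occupied]
    for start in starts:
        end = start + n
        if end >= 4:
            runs += 1
        else:
            new_bases[end - 1] = True
    return tuple(new_bases), runs

def simulate_half_inning(probs, rng):
    """Sample plate appearances until three outs; return runs scored."""
    outs, runs, bases = 0, 0, (False, False, False)
    weights = [probs[o] for o in OUTCOMES]  # P(K) + P(OUT) must be > 0
    while outs < 3:
        outcome = rng.choices(OUTCOMES, weights=weights)[0]
        if outcome in ("K", "OUT"):
            outs += 1
        else:
            bases, scored = advance(bases, outcome)
            runs += scored
    return runs

def simulate_team_runs(probs, n_sims=10000, innings=9, seed=0):
    """Return one simulated run total per simulated game."""
    rng = random.Random(seed)
    return [sum(simulate_half_inning(probs, rng) for _ in range(innings))
            for _ in range(n_sims)]
```

The list returned by `simulate_team_runs` is the distribution the text refers to: its mean or median gives a point estimate, and its spread and skew can be read directly from the samples.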

The Bullpen Problem

Starting pitcher projections get the most attention, but modern games are not nine-inning starter affairs. The average starter gets through the batting order fewer than three full times before handing the ball to the bullpen. That means three to four innings of the game, often the highest-leverage innings, are pitched by relievers whose deployment is uncertain at the time of prediction.

Bullpen usage prediction involves modeling which relievers will pitch, in what order, for how many batters, and in what game states. A closer protecting a one-run lead in the ninth is a different prediction problem than a middle reliever entering with runners on base in the sixth. Manager tendencies, recent usage patterns, handedness matchups, and rest days all factor into reliever deployment models. Getting the bullpen wrong can swing a game projection by a full run or more.
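To see how deployment uncertainty can move a projection, the sketch below blends relievers' run-prevention rates by the share of bullpen innings each is expected to cover. The usage shares and runs-allowed-per-nine figures are invented for the example, and a real deployment model would estimate the shares from manager tendencies, rest, and matchups rather than take them as given.

```python
def expected_bullpen_runs(relievers, innings_to_cover):
    """Expected runs allowed by the bullpen over the innings it must cover.

    relievers: list of (usage_share, runs_allowed_per_9) pairs, where
    usage_share is the expected fraction of bullpen innings that reliever
    pitches. Shares are renormalized, so they need not sum to exactly 1.
    """
    total_share = sum(share for share, _ in relievers)
    blended_ra9 = sum(share * ra9 for share, ra9 in relievers) / total_share
    return blended_ra9 * innings_to_cover / 9.0

# Hypothetical scenarios: the same four innings covered by different arms.
best_arms = expected_bullpen_runs([(1.0, 3.0)], 4)       # about 1.33 runs
worst_arms = expected_bullpen_runs([(1.0, 6.0)], 4)      # about 2.67 runs
even_split = expected_bullpen_runs([(0.5, 3.0), (0.5, 6.0)], 4)
```

With these toy numbers, the gap between the two extreme deployments is more than a run over four innings.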

Why Lineup Order Changes Everything

Player projection systems typically estimate production in isolation: projected slash line, expected home run rate, strikeout probability. But batters do not hit in isolation. They hit in a lineup where the hitters around them influence their opportunities and outcomes.

A number-three hitter with a high-OBP leadoff man ahead of him comes to the plate with runners on base more often than the same hitter batting behind a low-OBP counterpart. Protection effects, while debated, may also influence pitch selection: pitchers may attack the strike zone differently when the on-deck hitter represents a legitimate threat. Lineup interaction modeling captures these contextual effects and adjusts individual projections based on where each batter hits and who surrounds him. Ignoring lineup context means treating the same player as equally productive in every possible lineup configuration, which is measurably false.

When Models and Markets Disagree

Prediction models produce probability estimates. Markets produce prices. These are related but not identical, and understanding where and why they diverge is one of the most informative aspects of prediction science.

Persistent model-market disagreement can stem from several sources: the model captures information the market has not priced, the market captures information the model does not have (late scratches, clubhouse issues, undisclosed injuries), or structural features of market-making create systematic pricing patterns that do not reflect true probabilities. Studying these divergence patterns reveals the strengths and blind spots of both approaches. Neither models nor markets are consistently right. The interesting question is understanding the conditions under which each tends to be more accurate.
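Comparing a model to a market first requires turning prices into probabilities. The sketch below assumes prices are quoted as American moneyline odds and removes the bookmaker margin by proportional normalization, which is only one of several de-vigging conventions. The function names are hypothetical.

```python
def implied_prob(american_odds):
    """Convert American moneyline odds to the raw implied probability."""
    if american_odds < 0:
        return -american_odds / (-american_odds + 100.0)
    return 100.0 / (american_odds + 100.0)

def no_vig_probs(home_odds, away_odds):
    """Rescale both sides' implied probabilities so they sum to 1."""
    home, away = implied_prob(home_odds), implied_prob(away_odds)
    total = home + away  # exceeds 1 by the bookmaker's margin
    return home / total, away / total

def model_edge(model_home_prob, home_odds, away_odds):
    """Model probability minus the market's margin-free probability (home side)."""
    market_home, _ = no_vig_probs(home_odds, away_odds)
    return model_home_prob - market_home
```

A positive or negative edge on a single game says little by itself. The divergence patterns described above only emerge when these differences are tracked across many games and conditions.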

The Environment Is a Variable

Baseball is played outdoors in 22 of 30 major league parks, and the physical environment directly affects ball flight, pitcher grip, and player performance. A fly ball hit at Coors Field in July travels measurably farther than the same batted ball at Oracle Park in April. Humidity, wind speed, wind direction, barometric pressure, and temperature all influence game outcomes in ways that are quantifiable but frequently ignored.

Environmental modeling treats weather and park geometry as input variables rather than background noise. The effect sizes are not trivial: temperature alone accounts for roughly 0.1 runs per game per ten-degree change, and altitude effects at Coors Field can add a full run to the expected total. Models that treat all parks and all weather conditions identically leave predictive value on the table.
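As a back-of-the-envelope version of that adjustment, the sketch below applies the 0.1 runs per ten degrees figure quoted above as a linear shift around a reference temperature. The linear form, the 70-degree reference (assumed Fahrenheit), and the additive park term are all assumptions made for illustration.

```python
RUNS_PER_TEN_DEGREES = 0.1  # effect size quoted in the text

def adjust_total(base_total, temp, ref_temp=70.0, park_runs=0.0):
    """Shift a projected game run total for temperature and park.

    base_total: projected total runs at the reference temperature in a neutral park.
    temp, ref_temp: game-time and reference temperatures (assumed Fahrenheit).
    park_runs: additive park effect in runs (hypothetical input, e.g. altitude).
    """
    temp_effect = RUNS_PER_TEN_DEGREES * (temp - ref_temp) / 10.0
    return base_total + temp_effect + park_runs

hot_day = adjust_total(8.5, temp=90.0)                      # about 8.7
altitude = adjust_total(8.5, temp=70.0, park_runs=1.0)      # about 9.5
```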

Fatigue, Workload, and Injury Risk

A pitcher in April is not the same as that pitcher in September. Velocity decays over the course of a season as innings accumulate. Spin rates fluctuate with fatigue. Command erodes as workload builds. These are not random variations; they follow measurable patterns that workload models can capture and project forward.

Fatigue modeling tracks cumulative pitch counts, innings loads, rest intervals, and in-game workload to estimate real-time performance degradation. The same pitcher with 180 innings on his arm projects differently than that pitcher at 80 innings, even if his season-level stats look identical at the time. Injury risk models extend this further, estimating the probability that a workload trajectory places a pitcher at elevated risk of a performance-altering injury. These models do not predict specific injuries, but they can identify when a pitcher is operating in a fatigue zone where historical injury rates spike.
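The simplest possible workload model is a straight-line fit of fastball velocity against cumulative innings, projected forward. The sketch below uses ordinary least squares on invented numbers. Real fatigue curves need not be linear, and the data points are purely illustrative.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def project(slope, intercept, x):
    """Evaluate the fitted line at workload x."""
    return intercept + slope * x

# Invented example: average fastball velocity (mph) at cumulative innings marks.
innings = [0.0, 50.0, 100.0, 150.0]
velocity = [95.0, 94.5, 94.0, 93.5]
slope, intercept = fit_line(innings, velocity)
velo_at_180 = project(slope, intercept, 180.0)
```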

The Hardest Problem: Knowing How Wrong You Might Be

Every prediction is an estimate, and every estimate has uncertainty. The question that most projection systems fail to answer honestly is: how much uncertainty? A model that projects a team to score 4.3 runs is making a fundamentally different claim depending on whether the 90% confidence interval is 3.8 to 4.8 or 2.1 to 6.5.

Uncertainty quantification is arguably the most important and most neglected component of prediction science. It requires propagating error through every layer of the model: uncertainty in the input data, uncertainty in the parameter estimates, uncertainty in the model structure itself. A prediction without a confidence interval is an assertion without context. Understanding the sources and magnitudes of prediction uncertainty is what separates rigorous forecasting from educated guessing.
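One piece of this is easy to sketch: reading an interval directly off a set of simulated outcomes. The function below takes nearest-rank quantiles of the samples. Note the limitation: an interval built this way reflects only the variation in the simulation itself, not uncertainty in the inputs, parameters, or model structure, which would have to be propagated by also resampling those layers. The function name is hypothetical.

```python
def empirical_interval(samples, level=0.90):
    """Central interval covering roughly `level` of the samples (nearest rank)."""
    ordered = sorted(samples)
    n = len(ordered)
    tail = (1.0 - level) / 2.0
    lower = ordered[round(tail * (n - 1))]
    upper = ordered[round((1.0 - tail) * (n - 1))]
    return lower, upper

# Toy samples: the integers 0 through 100 stand in for simulated outcomes.
low, high = empirical_interval(list(range(101)), level=0.90)
```

Two models can share a point estimate of 4.3 runs and still return very different intervals from this function, which is exactly the distinction the interval is meant to expose.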

The Future of Baseball Prediction

The components described above represent the current state of the art, but prediction science does not stand still. Emerging areas include real-time biomechanical modeling using high-speed pitch tracking data, contextual decision-making models that predict in-game managerial strategy, and ensemble methods that combine multiple independent modeling approaches to reduce systematic bias.

The data infrastructure of baseball continues to expand. Statcast now captures bat speed, swing length, and squared-up rate in addition to exit velocity and launch angle. Hawk-Eye tracking provides granular defensive positioning data. Each new data stream creates opportunities to refine existing models and build entirely new ones.

The limiting factor in baseball prediction has never been data availability. It is the ability to integrate disparate information sources into a coherent probabilistic framework that accounts for the interactions between components. That integration problem is what this entire section explores, one piece at a time.
