Market Making
Market making is the strategy of repeatedly posting limit orders on both sides of a Limit Order Book to provide liquidity and capture the bid-ask spread, while continuously adjusting quotes in response to inventory and market conditions. A market maker does not forecast direction; it earns the spread by being on both sides, and its central problem is risk management, specifically inventory risk (the mid-price diffusing against an accumulated position) and Adverse Selection (being systematically filled by better-informed counterparties). The strategy appears in this vault because it is the cleanest real-world instance of trading framed as a sequential decision problem: state, action, reward, transition.
The mapping onto a Markov Decision Process Trading Model is direct and is stated explicitly in the literature. The state captures inventory, time, volatility and Limit Order Book features such as order-book imbalance; the action is a pair of quote placements (bid/ask price offsets, sizes, or cancel/post decisions); the reward is derived from profit-and-loss, typically a CARA utility of wealth or PnL net of a running inventory penalty. Lalor Swishchuk 2025 give a full formal MDP definition for exactly this problem and note that optimal-market-making state spaces “often include variables such as inventory, order imbalance, market quality measures, differences between bid/ask prices and many more,” with action spaces of “bid/ask price changes, bid/ask size changes, quote pairs, cancelling or posting orders.” Because inventory evolves as a controlled process and the objective is an expected cumulative reward (discounted, or a terminal utility over a finite horizon), market making is a textbook MDP application.
Market Making [part-of] Limit Order Book
Markov Decision Process Trading Model [defines] Market Making
Adverse Selection [opposes] Market Making
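The state, action and reward mapping described above can be sketched as plain Python types. This is a minimal illustration, not the formal definition from Lalor Swishchuk 2025: the names MMState, MMAction, step_reward and transition, the chosen state fields, the one-unit fill size and the quadratic inventory penalty are all assumptions made here.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class MMState:
    """Hypothetical state: a small subset of the variables named in the text."""
    inventory: int      # signed position in units of the asset
    time_left: float    # time remaining in the episode
    volatility: float   # recent mid-price volatility estimate
    imbalance: float    # order-book imbalance, assumed to lie in [-1, 1]


@dataclass(frozen=True)
class MMAction:
    """Hypothetical action: one pair of quote offsets from the mid-price."""
    bid_offset: float   # distance of the bid below the mid-price
    ask_offset: float   # distance of the ask above the mid-price


def transition(state: MMState, bid_filled: bool, ask_filled: bool,
               dt: float) -> MMState:
    """Inventory is the controlled part of the state: fills move it by one unit."""
    inventory = state.inventory + int(bid_filled) - int(ask_filled)
    return replace(state, inventory=inventory,
                   time_left=max(0.0, state.time_left - dt))


def step_reward(cash_change: float, inv_before: int, inv_after: int,
                mid_before: float, mid_after: float, penalty: float) -> float:
    """Mark-to-market PnL change net of a running inventory penalty.

    The quadratic penalty is one common choice, assumed here for illustration.
    """
    value_before = inv_before * mid_before
    value_after = cash_change + inv_after * mid_after
    return (value_after - value_before) - penalty * inv_after ** 2


# Example: flat dealer sells one unit at mid + 0.05 and the mid does not move.
s0 = MMState(inventory=0, time_left=1.0, volatility=2.0, imbalance=0.1)
a0 = MMAction(bid_offset=0.05, ask_offset=0.05)
s1 = transition(s0, bid_filled=False, ask_filled=True, dt=0.01)
r1 = step_reward(cash_change=100.0 + a0.ask_offset, inv_before=0,
                 inv_after=s1.inventory, mid_before=100.0, mid_after=100.0,
                 penalty=0.01)
```

In the example the dealer earns the half-spread of 0.05 and pays a penalty of 0.01 for ending one unit short, so the step reward is about 0.04.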
The foundational quantitative formulation is the Avellaneda-Stoikov 2008 stochastic-control model. A dealer posts bid/ask limit orders around a Brownian-motion mid-price and maximises terminal exponential utility of P&L; the solution is a two-step rule. First compute an inventory-skewed reservation price r(s,q,t) = s − q·γ·σ²·(T−t), where s is the mid-price, q the inventory, γ the risk aversion, σ the volatility and T−t the time remaining; this shifts both quotes so that fills tend to reduce accumulated inventory. Then set the spread around r from the order-arrival intensity. This is the classical, model-based route: it is a fully specified MDP solved by dynamic programming (a Hamilton-Jacobi-Bellman PDE). Its weak point is the Poisson order-arrival assumption: arrivals at rate λ(δ) = A·exp(−k·δ), where δ is the quote's distance from the mid-price, are memoryless and independent of the price process. Modern microstructure rejects this: Hawkes Process models show order arrivals cluster (self-excitation) and interact across event types (cross-excitation), so the real Limit Order Book is not Markovian in the simple state the model uses. The reinforcement-learning route, used by Lalor Swishchuk 2025, solves the same MDP from sampled experience when the dynamics are unknown: they train a Soft Actor-Critic agent and argue RL is preferable to stochastic control precisely because real markets have latent variables and high-dimensional state that the closed-form model cannot capture.
Avellaneda-Stoikov 2008 [proposes_model] Market Making
Avellaneda-Stoikov 2008 [part-of] Markov Decision Process Trading Model
Hawkes Process [contradicts] Avellaneda-Stoikov 2008
Reinforcement Learning Trading Policy [optimises_policy] Market Making
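The two-step quoting rule and the arrival assumption it depends on can be sketched in a few lines of Python. The reservation price is the formula given above; the total spread γ·σ²·(T−t) + (2/γ)·ln(1 + γ/k) is the approximation from the Avellaneda-Stoikov paper. The function names, the parameter values, and the single-type exponential-kernel Hawkes intensity used for contrast are assumptions chosen for illustration, not a calibrated model.

```python
import math


def reservation_price(s: float, q: int, gamma: float, sigma: float,
                      time_left: float) -> float:
    """r(s, q, t) = s - q * gamma * sigma^2 * (T - t)."""
    return s - q * gamma * sigma ** 2 * time_left


def total_spread(gamma: float, sigma: float, k: float,
                 time_left: float) -> float:
    """Bid-ask spread: gamma*sigma^2*(T-t) + (2/gamma)*ln(1 + gamma/k)."""
    return gamma * sigma ** 2 * time_left + (2.0 / gamma) * math.log(1.0 + gamma / k)


def quotes(s: float, q: int, gamma: float, sigma: float, k: float,
           time_left: float) -> tuple[float, float]:
    """Step 1: skew the centre by inventory. Step 2: quote symmetrically around it."""
    r = reservation_price(s, q, gamma, sigma, time_left)
    half = total_spread(gamma, sigma, k, time_left) / 2.0
    return r - half, r + half


def poisson_intensity(delta: float, A: float, k: float) -> float:
    """Memoryless fill rate: depends only on the quote distance delta."""
    return A * math.exp(-k * delta)


def hawkes_intensity(t: float, past_events: list[float], mu: float,
                     alpha: float, beta: float) -> float:
    """Self-exciting rate: each past event adds a decaying bump (exponential kernel)."""
    return mu + sum(alpha * math.exp(-beta * (t - ti))
                    for ti in past_events if ti < t)


# Illustrative parameters (assumed): mid 100, long 2 units, full horizon left.
bid, ask = quotes(s=100.0, q=2, gamma=0.1, sigma=2.0, k=1.5, time_left=1.0)
```

With a long position of two units the reservation price falls to 99.2, so both quotes sit below where a flat dealer would place them and a sale becomes more likely than a purchase. The Poisson rate returns the same value for a given δ whatever happened before, while the Hawkes rate rises after each recent event, which is the clustering the text describes.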
The honest profitability verdict for market making as evidenced in this vault is inconclusive: it is a legitimate MDP application whose tradeable edge is demonstrated only in simulation. Two structural problems block a stronger grade. First, the Markov assumption is empirically violated at the order-book level: Lalor Swishchuk 2025 title their study “Non-Markov Market-Making” because real LOB dynamics show jumps and dependence on past trades, so the memoryless MDP state is a convenient fiction, a Partial Observability / Non-Stationarity problem made concrete. Second, profitability evidence is simulation-based, not live. Avellaneda-Stoikov 2008 is a simulation study that reports lower P&L variance for the inventory-aware strategy than for a symmetric benchmark, which is a risk-management result, not a real-data profit demonstration. Lalor Swishchuk 2025 tested 200 simulated out-of-sample episodes and warn explicitly that omitting Adverse Selection (“adverse fills”) produces “large phantom gains,” that midprice-rather-than-bid/ask simulation, fixed-spread and front-of-queue assumptions all inflate results, and that such models “have often been shown to over-inflate results.” Market making is a known live business for HFT firms, but none of the sources in this vault supplies disclosed, cost-realistic, live-tradeable P&L; the academic record establishes the framework and risk-control mechanism, not a substantiated alpha.
Lalor Swishchuk 2025 [tests_strategy] Market Making
Adverse Selection [causes] Phantom Gains in Backtests
Lalor Swishchuk 2025 [contradicts] Markov Decision Process Trading Model
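The "phantom gains" warning can be made concrete with a toy backtest. The sketch below is not the Lalor Swishchuk 2025 simulator: the function name backtest_pnl, the fixed half-spread, the probability that a fill is adverse, and the size of the post-fill move are all invented for illustration. It compares a backtest that credits every fill with the half-spread against one that also charges the mid-price move that follows an informed fill.

```python
import random


def backtest_pnl(n_fills: int, half_spread: float, p_adverse: float,
                 adverse_move: float, model_adverse: bool,
                 seed: int = 0) -> float:
    """Toy PnL over n_fills one-unit fills.

    Every fill earns half_spread relative to the mid at fill time. With
    probability p_adverse the counterparty is informed and the mid then
    moves adverse_move against the new position; that loss is only
    charged when model_adverse is True.
    """
    rng = random.Random(seed)  # fixed seed so both runs see the same fills
    pnl = 0.0
    for _ in range(n_fills):
        informed = rng.random() < p_adverse
        pnl += half_spread
        if model_adverse and informed:
            pnl -= adverse_move
    return pnl


# Assumed toy numbers: 5-cent half-spread, 30% informed fills, 20-cent move.
naive = backtest_pnl(10_000, 0.05, 0.3, 0.2, model_adverse=False)
realistic = backtest_pnl(10_000, 0.05, 0.3, 0.2, model_adverse=True)
phantom_gain = naive - realistic
```

Under these assumed numbers the expected loss per fill from informed flow (0.3 × 0.20 = 0.06) exceeds the half-spread earned (0.05), so a strategy that looks profitable in the naive backtest loses money in expectation once adverse fills are charged. The gap between the two figures is the part of the backtest result that would not survive live trading.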
Connections
- Markov Decision Process Trading Model — optimises_policy, source: https://arxiv.org/html/2410.14504v2
- Limit Order Book — part-of (the trading environment), source: https://arxiv.org/html/2410.14504v2
- Avellaneda-Stoikov 2008 — proposes_model, 2008, source: https://people.orie.cornell.edu/sfs33/LimitOrderBook.pdf
- Lalor Swishchuk 2025 — tests_strategy, 2024-2025, source: https://arxiv.org/html/2410.14504v2
- Reinforcement Learning Trading Policy — optimises_policy, source: https://arxiv.org/html/2410.14504v2
- Adverse Selection — suffers_overfitting_risk (omitting it inflates backtests), source: https://www.sciencedirect.com/science/article/pii/0304405X85900443
- Hawkes Process — contradicts (non-Markovian arrivals break the model), source: https://www.maths.ox.ac.uk/system/files/attachments/Hawkes%20Process-Driven%20Models%20for%20Limit%20Order%20Book%20Dynamics_0.pdf
- Partial Observability — suffers_overfitting_risk, source: https://arxiv.org/html/2410.14504v2
- Out-of-Sample Backtesting — lacks_live_evidence, source: https://arxiv.org/html/2410.14504v2
Sources
- High-frequency trading in a limit order book — Avellaneda & Stoikov, Quantitative Finance 8(3):217-224, 2008
- Deep Reinforcement Learning in Non-Markov Market-Making — Lalor & Swishchuk (arXiv 2410.14504 / Risks 13(3):40, 2025)
- Bid, ask and transaction prices in a specialist market with heterogeneously informed traders — Glosten & Milgrom, JFE 14(1):71-100, 1985
- Hawkes Process-Driven Models for Limit Order Book Dynamics — Oxford Mathematical Institute