Chapter 7 of 7
Pricing and Elasticity
Created Apr 28, 2026 Updated Jun 7, 2026
Pricing is the worked-example application of every method covered in this track. The central problem is that price is not exogenous — a revenue manager sets it based on expected demand, a recommendation engine sets it based on user behaviour, an algorithm sets it based on inventory and competitor prices — and a naive regression of observed demand on observed price gives the wrong answer about the price-demand relationship. The whole Econometrics track was building up the tools to handle exactly this kind of problem.
This note is the integration: it walks through what elasticity means, why it cannot be read off a pooled OLS of demand on price, the standard ways to identify it, how cross-product effects (cannibalization) fit in, what the user-facing what-if surface looks like in practice, and where each of the methods from earlier in the track gets applied. If the rest of the track was the toolkit, this is the assembly.
What elasticity is
Own-price elasticity is the percentage change in demand for a product when its own price changes by 1%:
ε_own = (% Δ Q) / (% Δ P)
Own-price elasticity is almost always negative, since demand falls when price rises, and its absolute value tells you how responsive demand is. |ε| = 1 is unit-elastic (revenue insensitive to small price changes); |ε| > 1 is elastic (revenue falls when price rises); |ε| < 1 is inelastic (revenue rises with a small price increase, assuming the same demand curve and no cross-product effects). For revenue optimisation in a simple single-product setting without costs or cross-product effects, the unit-elastic point is the local revenue maximum, which is why elasticity sits at the centre of pricing optimisation. Real pricing problems usually optimise profit rather than revenue and run across a portfolio rather than a single product, so the textbook unit-elasticity result is a starting intuition rather than the final optimum.
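The definition and the three revenue regimes can be checked with a few lines of arithmetic. This is a minimal sketch: the function names and the numbers (price 100 to 101, demand 1000 to 985) are hypothetical, and the revenue rule is the first-order approximation that holds only for small price changes.

```python
def arc_elasticity(q0: float, q1: float, p0: float, p1: float) -> float:
    """Percentage change in quantity divided by percentage change in price."""
    pct_dq = (q1 - q0) / q0
    pct_dp = (p1 - p0) / p0
    return pct_dq / pct_dp


def revenue_change_pct(elasticity: float, price_change_pct: float) -> float:
    """First-order % revenue change for a small % price change.

    Revenue is P * Q, so %dR is roughly %dP + %dQ = %dP * (1 + elasticity).
    """
    return price_change_pct * (1.0 + elasticity)


# Hypothetical observation: a 1% price rise cuts demand by 1.5%.
eps = arc_elasticity(q0=1000, q1=985, p0=100, p1=101)
print(f"elasticity: {eps:.2f}")

for label, e in [("elastic", -1.5), ("unit-elastic", -1.0), ("inelastic", -0.5)]:
    print(f"{label:>12}: revenue changes by {revenue_change_pct(e, 1.0):+.1f}% per +1% price")
```

The sign of `1 + elasticity` is what separates the regimes: negative for elastic demand, zero at the unit-elastic point, positive for inelastic demand.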
Cross-price elasticity is the same idea between two products: how much demand for product A changes when the price of product B changes by 1%:
ε_AB = (% Δ Q_A) / (% Δ P_B)
The sign carries the substantive interpretation:
- ε_AB > 0 → substitutes. When B becomes more expensive, customers switch to A. Coca-Cola and Pepsi, butter and margarine, hotel-room categories, standard and premium seats in the same venue.
- ε_AB < 0 → complements. Goods consumed together. When B becomes more expensive, less of B is consumed and therefore less of A is consumed too. Cars and gasoline, printer and cartridges, cheese and crackers.
- ε_AB ≈ 0 → independent. No economic connection between the products.
A cross-elasticity of 0.4 between standard and premium seats says: a 10% price rise on premium shifts demand for standard up by 4%. Some customers who would have chosen premium switch to the cheaper section. This kind of cross-product effect is exactly what pricing optimisation has to take into account — pricing each product independently ignores the matrix of cross-elasticities and leaves money on the table.
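The seat example generalises to a matrix of own- and cross-price elasticities applied to a vector of price changes. The sketch below is illustrative only: the 0.4 entry comes from the example above, while the other three entries and all names are assumptions made up for the demonstration.

```python
# Rows: product whose demand responds. Columns: product whose price moves.
PRODUCTS = ["standard", "premium"]
ELASTICITY = [
    [-1.2, 0.4],   # standard: own-price (assumed), cross-price w.r.t. premium (from the text)
    [0.3, -0.8],   # premium: cross-price w.r.t. standard (assumed), own-price (assumed)
]


def demand_response_pct(price_changes_pct):
    """First-order % demand change per product: sum over j of eps_ij * %dP_j."""
    return [
        sum(eps_ij * dp for eps_ij, dp in zip(row, price_changes_pct))
        for row in ELASTICITY
    ]


# Raise the premium price by 10%, leave standard unchanged.
response = demand_response_pct([0.0, 10.0])
for name, pct in zip(PRODUCTS, response):
    print(f"{name}: {pct:+.1f}% demand")
```

Pricing premium in isolation would see only its own row; the standard row is the knock-on effect that per-product pricing misses.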
Revenue vs profit
In production pricing systems the objective is usually profit, not revenue. Revenue is P × Q; gross profit is (P − marginal_cost) × Q, and the optimal price is the one that maximises the latter. The unit-elastic intuition from the previous section is a revenue-side result: at unit elasticity, a small price change leaves revenue locally unchanged. With a positive marginal cost the profit-maximising price generally sits above the unit-elastic point, on the elastic part of the demand curve, because every sale given up at the higher price also saves the cost of producing that unit.
Profit optimisation also has to account for several factors that pure revenue maximisation can ignore: differing margins across products (which shifts the optimal mix), capacity (raising the price of a near-full product is profitable in a way that raising the price of an under-utilised one is not), substitution patterns (a price rise that pushes customers onto a lower-margin alternative may raise total revenue while reducing total profit), and long-term customer effects (retention, churn, lifetime value — see the short-run vs long-run section below). A price change that raises revenue can still be a bad decision if it routes customers to lower-margin substitutes or damages retention. Elasticity-based reasoning has to sit inside a profit-aware optimisation, not replace it.
With linear demand, elasticity changes as price moves along the curve: demand is inelastic at prices below the revenue peak and elastic above it. The revenue curve peaks exactly where ε = −1, the unit-elastic point. The profit curve peaks at a higher price, equal to the revenue-maximising price plus half the marginal cost, because profit weighs each sale's price against what the unit costs to make. A higher marginal cost moves the two optima further apart. That gap is why production pricing targets profit, not revenue.
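The two optima can be derived for linear demand Q = a − b · P with constant marginal cost c. Revenue P · (a − b · P) peaks at P = a / (2b); profit (P − c) · (a − b · P) peaks at P = a / (2b) + c / 2. The following sketch checks both closed forms against a brute-force grid search. The parameter values (a = 100, b = 2, c = 10) are hypothetical.

```python
def revenue_max_price(a: float, b: float) -> float:
    """Price that maximises P * (a - b*P)."""
    return a / (2 * b)


def profit_max_price(a: float, b: float, c: float) -> float:
    """Price that maximises (P - c) * (a - b*P)."""
    return a / (2 * b) + c / 2


def point_elasticity(a: float, b: float, price: float) -> float:
    """Elasticity of linear demand at a given price: (dQ/dP) * P / Q."""
    return -b * price / (a - b * price)


A, B, COST = 100.0, 2.0, 10.0   # hypothetical demand and cost parameters

p_rev = revenue_max_price(A, B)
p_profit = profit_max_price(A, B, COST)

# Brute-force check on a 1-cent price grid between 0 and the choke price a/b.
grid = [i / 100 for i in range(int(A / B * 100) + 1)]
p_rev_grid = max(grid, key=lambda p: p * (A - B * p))
p_profit_grid = max(grid, key=lambda p: (p - COST) * (A - B * p))

print(f"revenue-max price {p_rev:.2f}, elasticity there {point_elasticity(A, B, p_rev):.2f}")
print(f"profit-max price  {p_profit:.2f}, gap {p_profit - p_rev:.2f} (half of cost {COST})")
```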
The identification problem
Both kinds of elasticity look simple to estimate from data: regress demand on price, read the coefficient, and rescale it by the ratio of mean price to mean quantity to convert it to a percentage response. The problem is that this regression is biased.
In observational pricing data, price is not randomly assigned. A revenue manager sets prices high when demand is expected to be high (busy season, popular items, near-full capacity) and lower when demand is expected to be low. The price observed in the data is therefore correlated with whatever drives demand expectations, and a regression of demand on price conflates the (negative) effect of price on demand with the (positive) correlation that comes from price being set in response to demand. In the common case where prices are raised when expected demand is high, the OLS coefficient is pulled toward zero and can even take the wrong sign: demand "looks" less sensitive to price than it actually is, and downstream optimisation sets the wrong price. The exact direction of the bias depends on the pricing rule; what is general is that the coefficient is not the causal elasticity.
This is exactly the endogeneity problem, specifically its simultaneity flavour: price and demand mutually determine each other through the manager's pricing rule. The rest of the Econometrics track was about how to recover unbiased causal effects in settings like this. The next section walks through which tools apply where in pricing.
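A small simulation makes the bias concrete. It is a sketch under stated assumptions: a log-log demand curve with true elasticity −1.5, and a hypothetical pricing rule in which the manager raises log price by 0.5 for each unit of a demand shock the analyst does not observe. All names and parameter values are invented for illustration.

```python
import random


def ols_slope(x, y):
    """Slope from a simple regression of y on x (with intercept)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    var_x = sum((xi - mean_x) ** 2 for xi in x)
    return cov_xy / var_x


def simulate_endogenous_prices(n=20_000, true_elasticity=-1.5, seed=0):
    """Hypothetical pricing rule: price is raised when demand is expected to be strong."""
    rng = random.Random(seed)
    log_price, log_demand = [], []
    for _ in range(n):
        demand_shock = rng.gauss(0.0, 1.0)       # seen by the manager, not the analyst
        other_price_noise = rng.gauss(0.0, 0.5)  # price variation unrelated to demand
        lp = 0.5 * demand_shock + other_price_noise
        lq = 2.0 + true_elasticity * lp + demand_shock + rng.gauss(0.0, 0.3)
        log_price.append(lp)
        log_demand.append(lq)
    return log_price, log_demand


TRUE_ELASTICITY = -1.5
log_price, log_demand = simulate_endogenous_prices(true_elasticity=TRUE_ELASTICITY)
naive_elasticity = ols_slope(log_price, log_demand)
print(f"true: {TRUE_ELASTICITY:.2f}  naive OLS: {naive_elasticity:.2f}")
```

Under these parameters the population bias is cov(log P, shock) / var(log P) = 0.5 / 0.5 = 1, so the naive estimate lands near −0.5: demand looks far less price-sensitive than it is.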
Identification strategies for pricing
The right tool depends on what variation in price is available to exploit and what kind of confounders are in the data.
Cost shifters as instruments
The classical IV identification of demand curves: find a variable that shifts the supply side without affecting demand directly. Wholesale costs, fuel prices, exchange rates, input commodity prices — anything that moves the seller's price without independently moving the buyer's willingness to pay. With a valid cost shifter, 2SLS recovers the demand-side elasticity by isolating price variation that is exogenous to demand. This is the canonical worked example in undergraduate econometrics, and is still the cleanest identification when cost data is available.
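As an illustration of the cost-shifter idea, the sketch below simulates an endogenous price that also responds to an observed cost shifter, then compares OLS with the IV estimate. With a single instrument and a single endogenous regressor, 2SLS reduces to the ratio cov(z, log Q) / cov(z, log P), which is what the code computes. The data-generating process and all names are assumptions, and the instrument is valid here only because the simulation makes it so.

```python
import random


def covariance(x, y):
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    return sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / n


def simulate(n=20_000, true_elasticity=-1.5, seed=1):
    rng = random.Random(seed)
    cost, log_price, log_demand = [], [], []
    for _ in range(n):
        demand_shock = rng.gauss(0.0, 1.0)   # unobserved, drives both price and demand
        cost_shifter = rng.gauss(0.0, 0.5)   # moves price, not willingness to pay
        lp = 0.5 * demand_shock + cost_shifter
        lq = 2.0 + true_elasticity * lp + demand_shock + rng.gauss(0.0, 0.3)
        cost.append(cost_shifter)
        log_price.append(lp)
        log_demand.append(lq)
    return cost, log_price, log_demand


cost, log_price, log_demand = simulate()

# OLS uses all price variation, including the part driven by the demand shock.
ols_estimate = covariance(log_price, log_demand) / covariance(log_price, log_price)

# IV uses only the price variation that the cost shifter induces.
iv_estimate = covariance(cost, log_demand) / covariance(cost, log_price)

print(f"true -1.50  OLS {ols_estimate:.2f}  IV {iv_estimate:.2f}")
```

In real data the exclusion restriction cannot be verified this way; the simulation only shows what the estimator recovers when the restriction holds.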
Lagged competitor prices
A common applied IV in retail and marketplaces: previous-period competitor prices that move our price (we react to competitors) but should not move our demand directly (customers see only our price). The IV note covers the two failure modes — common demand shocks that move both competitors' prices and our demand through a shared cause, and persistence in demand shocks that lets yesterday's competitor price still correlate with today's demand through autocorrelation. Lagged competitor prices are a plausible instrument when these channels are explicitly addressed, and a misleading one when they are not.
Capacity-based instruments
A near-full-capacity indicator that triggers price increases under yield management. The IV note goes through the failure modes carefully: mechanical capping of observed sales when capacity binds, scarcity-signalling UI ("only 2 rooms left"), and capacity itself being endogenous to anticipated demand. Useful only when none of these channels is active, which is a stronger condition than the textbook framing suggests.
Price discontinuities (RD)
When pricing rules have hard cutoffs — bulk-discount thresholds, tier boundaries, age-based discounts, eligibility cutoffs — the regression discontinuity design can identify the local effect of crossing the boundary. Time-based "RD" around a price change is closer to an event study and requires the harder identifying assumptions discussed in the RD note (continuity of every other relevant factor through the cutoff is much harder to defend when the running variable is time).
Panel approaches: store × time fixed effects
When the data is a panel (many stores observed across time, or many SKUs across time), panel-data fixed effects absorb time-invariant store characteristics (location, format, brand) and common time shocks (national holidays, macro events). Two-way FE on a price experiment that varied across stores at staggered times brings in the staggered-DiD literature: vanilla two-way FE can give wrong-sign estimates under heterogeneous price effects, and the modern Callaway–Sant'Anna or Sun–Abraham estimators are the ones to reach for.
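The absorbing role of fixed effects can be sketched with a within transformation on simulated data. The assumptions are loud ones: a balanced store × period panel, a homogeneous elasticity of −1.5, and store and period effects that raise both price and demand. This is the plain two-way within estimator, not one of the staggered-DiD estimators named above, and every name is hypothetical.

```python
import random


def slope(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)


def demean_by(values, keys):
    """Subtract each group's mean (group given by keys) from its values."""
    totals, counts = {}, {}
    for v, k in zip(values, keys):
        totals[k] = totals.get(k, 0.0) + v
        counts[k] = counts.get(k, 0) + 1
    return [v - totals[k] / counts[k] for v, k in zip(values, keys)]


def two_way_within(values, stores, periods):
    """Two-way within transform; sequential demeaning is exact for a balanced panel."""
    return demean_by(demean_by(values, stores), periods)


rng = random.Random(2)
N_STORES, N_PERIODS, TRUE_ELASTICITY = 200, 50, -1.5
store_effect = [rng.gauss(0.0, 1.0) for _ in range(N_STORES)]    # e.g. location quality
period_shock = [rng.gauss(0.0, 1.0) for _ in range(N_PERIODS)]   # e.g. holidays

stores, periods, log_price, log_demand = [], [], [], []
for s in range(N_STORES):
    for t in range(N_PERIODS):
        common = store_effect[s] + period_shock[t]
        lp = 0.5 * common + rng.gauss(0.0, 0.3)   # price responds to the confounders
        lq = 1.0 + TRUE_ELASTICITY * lp + common + rng.gauss(0.0, 0.3)
        stores.append(s)
        periods.append(t)
        log_price.append(lp)
        log_demand.append(lq)

pooled = slope(log_price, log_demand)
within = slope(
    two_way_within(log_price, stores, periods),
    two_way_within(log_demand, stores, periods),
)
print(f"true {TRUE_ELASTICITY:.2f}  pooled OLS {pooled:.2f}  two-way FE {within:.2f}")
```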
High-dimensional confounders: DML
In modern pricing systems with rich behavioural data — user features, session features, product features, time-of-day patterns, recency-frequency-monetary signals — confounders are high-dimensional and the relationship between them and price is nonlinear. Double Machine Learning handles exactly this: flexible ML for the nuisance functions (predict price from confounders, predict demand from confounders), Neyman-orthogonal moments and cross-fitting on top, and a valid causal estimate of price elasticity at the end. This is the standard tool when a learning-to-price system upstream produces complex price patterns from rich features.
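The partialling-out recipe described above (predict price from confounders, predict demand from confounders, cross-fit, regress residual on residual) can be sketched without any ML library. To keep the example self-contained, a bin-means regressor on a single confounder stands in for the flexible learners a real system would use, so this illustrates the structure of the estimator rather than a production setup. The data-generating process and all names are assumptions.

```python
import math
import random


def fit_bin_means(x, y, n_bins=20):
    """Stand-in nuisance learner: mean of y within equal-width bins of x in [0, 1)."""
    sums, counts = [0.0] * n_bins, [0] * n_bins
    for xi, yi in zip(x, y):
        b = min(int(xi * n_bins), n_bins - 1)
        sums[b] += yi
        counts[b] += 1
    overall = sum(y) / len(y)
    means = [s / c if c else overall for s, c in zip(sums, counts)]
    return lambda xi: means[min(int(xi * n_bins), n_bins - 1)]


def dml_elasticity(x, log_price, log_demand, n_folds=2):
    """Cross-fitted residual-on-residual estimate of the price coefficient."""
    n = len(x)
    res_p, res_q = [0.0] * n, [0.0] * n
    for fold in range(n_folds):
        train = [i for i in range(n) if i % n_folds != fold]
        held_out = [i for i in range(n) if i % n_folds == fold]
        predict_p = fit_bin_means([x[i] for i in train], [log_price[i] for i in train])
        predict_q = fit_bin_means([x[i] for i in train], [log_demand[i] for i in train])
        for i in held_out:   # residuals use models fitted on the other fold only
            res_p[i] = log_price[i] - predict_p(x[i])
            res_q[i] = log_demand[i] - predict_q(x[i])
    return sum(p * q for p, q in zip(res_p, res_q)) / sum(p * p for p in res_p)


rng = random.Random(3)
TRUE_ELASTICITY = -1.5
x, log_price, log_demand = [], [], []
for _ in range(20_000):
    xi = rng.random()                                   # observed confounder
    lp = math.sin(2 * math.pi * xi) + rng.gauss(0.0, 0.5)
    lq = (TRUE_ELASTICITY * lp + 2 * math.sin(2 * math.pi * xi)
          + xi ** 2 + rng.gauss(0.0, 0.3))
    x.append(xi)
    log_price.append(lp)
    log_demand.append(lq)

mean_p, mean_q = sum(log_price) / len(x), sum(log_demand) / len(x)
naive = (sum((p - mean_p) * (q - mean_q) for p, q in zip(log_price, log_demand))
         / sum((p - mean_p) ** 2 for p in log_price))
dml = dml_elasticity(x, log_price, log_demand)
print(f"true {TRUE_ELASTICITY:.2f}  naive OLS {naive:.2f}  DML-style {dml:.2f}")
```

The estimate is only as good as the nuisance fits: the coarse bins leave a small residual bias here, which is the kind of error better learners and more folds are meant to shrink.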
Personalised elasticity: CATE
Different customers have very different price sensitivities — loyal customers may be insensitive (they buy regardless), price-sensitive shoppers may abandon at small increases, and some segments may even react negatively to low prices (the price-as-quality-signal effect). Causal ML methods for CATE — uplift modelling, causal forests, X-learners — estimate the elasticity function ε(X) rather than a single scalar, which is what enables personalised pricing where the offered price depends on the user's estimated sensitivity.
The uplift four-quadrant framing from the causal-ML note carries over to pricing, but with a caveat: the meaning of each quadrant depends on what the "treatment" is — a discount, a price increase, a personalised offer, exposure to a paywall. The useful idea is the same regardless of which treatment is in play: some customers are moved by the intervention (the persuadables of pricing), some would behave the same way without it (sure things and lost causes), and some may react negatively (a discount that signals a quality problem, a price increase that triggers churn). The pricing-segment design question is which of these groups the intervention should be aimed at, and CATE estimation is what tells you who falls where.
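The simplest possible version of an elasticity function ε(X) is one slope per segment, estimated from randomised price variation. The sketch below does exactly that and nothing more: it is not a causal forest or an X-learner, the two segments and their true elasticities are invented, and the randomised prices are an assumption that removes the identification problem by construction.

```python
import random


def slope(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)


# Hypothetical segments with different true price sensitivities.
TRUE_SEGMENT_ELASTICITY = {"loyal": -0.3, "price_sensitive": -2.0}

rng = random.Random(4)
rows = []
for segment, eps in TRUE_SEGMENT_ELASTICITY.items():
    for _ in range(5_000):
        lp = rng.uniform(-0.2, 0.2)                    # randomised log-price offset
        lq = 1.0 + eps * lp + rng.gauss(0.0, 0.3)
        rows.append((segment, lp, lq))


def elasticity_by_segment(rows):
    """Fit a separate log-log slope inside each segment."""
    grouped = {}
    for segment, lp, lq in rows:
        xs, ys = grouped.setdefault(segment, ([], []))
        xs.append(lp)
        ys.append(lq)
    return {segment: slope(xs, ys) for segment, (xs, ys) in grouped.items()}


estimates = elasticity_by_segment(rows)
for segment, est in estimates.items():
    print(f"{segment}: estimated {est:.2f}, true {TRUE_SEGMENT_ELASTICITY[segment]:.2f}")
```

Causal-ML methods replace the hand-picked segments with a learned partition of the feature space, but the output has the same shape: an elasticity per customer type rather than one scalar.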
Cross-price effects, cannibalization, and joint optimisation
Cannibalization is the negative side of cross-price elasticity inside a single portfolio: a price change on one product moves customers onto or away from a different product that the same company also sells, so sales are reshuffled within the portfolio instead of being captured as new revenue.
The classic retail example: launching a cheaper version of an existing product — the cheaper version sells well, but sales of the more expensive original drop by roughly the same amount. Net new revenue is small or negative once margins are accounted for. The same pattern shows up in pricing: raising the price of standard seats pushes some customers onto premium — standard is now under-utilised, premium may sell out, both bad for total revenue.
The fix is joint optimisation: price all products in the portfolio together, accounting for the full cross-price elasticity matrix. The optimiser proposes a price for each product, simulates the demand response across the entire portfolio (not just the directly-priced product), and evaluates the portfolio total (revenue, or profit when margins differ) rather than the per-product figure.
Consider two products of one firm, linked by a cross-effect. Pricing each to maximise its own profit (the independent point) quietly ignores that cutting one product's price steals customers from the other product the same firm sells. The joint optimum internalises that cross-effect and never yields lower total profit than the independent point. When the products are substitutes, joint pricing moves prices up; when they are complements, it moves them down. Either way, independent per-product pricing forfeits the gap. That forfeited gap is the cost of ignoring cross-effects such as cannibalisation, and closing it is the whole point of optimising across the cross-elasticity matrix.
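The two-product comparison has a closed form under a simple assumed model: symmetric linear demand q_i = a − b · p_i + g · p_j with common marginal cost c, where g > 0 means substitutes and g < 0 complements, and b > |g| keeps the problem concave. Independent pricing solves each product's own first-order condition taking the other price as given; joint pricing adds the term g · (p_j − c) for the effect on the sibling product. The parameter values below are hypothetical.

```python
def demand(p_own, p_other, a, b, g):
    """Linear demand with a cross-price effect g."""
    return a - b * p_own + g * p_other


def total_profit(p1, p2, a, b, g, c):
    return (p1 - c) * demand(p1, p2, a, b, g) + (p2 - c) * demand(p2, p1, a, b, g)


def independent_price(a, b, g, c):
    """Symmetric price when each product maximises only its own profit."""
    return (a + b * c) / (2 * b - g)


def joint_price(a, b, g, c):
    """Symmetric price that maximises total portfolio profit."""
    return (a + (b - g) * c) / (2 * (b - g))


A, B, COST = 100.0, 2.0, 10.0   # hypothetical parameters

results = {}
for label, g in [("substitutes", 0.5), ("complements", -0.5)]:
    p_ind = independent_price(A, B, g, COST)
    p_joint = joint_price(A, B, g, COST)
    results[label] = (
        p_ind,
        p_joint,
        total_profit(p_ind, p_ind, A, B, g, COST),
        total_profit(p_joint, p_joint, A, B, g, COST),
    )
    print(f"{label}: independent {p_ind:.2f} -> joint {p_joint:.2f}, "
          f"profit {results[label][2]:.1f} -> {results[label][3]:.1f}")
```

With N products the same idea becomes a numerical optimisation over a price vector, and the single number g becomes the off-diagonal entries of the cross-effect matrix.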
The practical challenge is the matrix scaling. For N products there are N² cross-elasticity entries, and estimating each one with sufficient precision is hard — most pairs of products do not have enough overlapping price variation to identify the cross-effect cleanly. Modern pricing systems address this with structured demand models that have far fewer parameters than the full N × N matrix:
- Nested logit — products are grouped into nests with shared substitution patterns; cross-elasticities follow a parametric structure tied to the nest hierarchy.
- Mixed-logit / BLP (Berry, Levinsohn & Pakes 1995) — random-coefficient demand models where heterogeneous consumer preferences induce realistic substitution patterns from a small set of underlying parameters.
- Hierarchical / Bayesian shrinkage — partial pooling across cross-elasticities so that pairs with little data borrow strength from pairs with similar product characteristics.
Estimation of the structured model is intertwined with the identification problem from the previous section: a structured demand model still needs identifying variation in prices to recover the parameters, and the same IV / FE / DML tools are how that variation is recovered cleanly.
Functional form: constant elasticity vs alternatives
Saying "demand has elasticity ε" implicitly assumes a functional form — most often a constant-elasticity (log-log) demand curve:
log Q = α + ε · log P + controls + noise
Under this form, elasticity is the same at every price, and the percentage change in demand for a percentage change in price is the constant ε. This is the form behind the standard what-if formula Q_new = Q_baseline × (P_new / P_baseline)^ε discussed below.
Constant-elasticity is one specific assumption, not an inevitable choice:
- Linear demand: Q = a − b · P. Elasticity changes with price: low at low prices, high at high prices. Often a better fit when demand has a clear choke price.
- Semi-log: log Q = a − b · P. Elasticity scales linearly with price.
- Logit / multinomial logit: demand for each product is the share from a discrete-choice model. Elasticities are derived from the choice probabilities and depend on price level.
- Mixed logit / BLP: heterogeneous preferences, more realistic substitution patterns at the cost of more identifying variation needed to estimate.
Which form is right depends on the product, the price range under consideration, and how local the analysis is. Constant-elasticity is the default reach for simplicity and is acceptable inside a small price range; for cross-product analysis or large-range price changes, structured choice models are usually closer to right.
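The point elasticity implied by each functional form follows from ε = (dQ/dP) · P / Q. The functions below are a sketch of those formulas; the logit case assumes the simplest setting of one product against an outside option with utility α − β · P, and all parameter values in the usage lines are hypothetical.

```python
import math


def elasticity_constant(eps: float, price: float) -> float:
    """log Q = alpha + eps * log P: the same elasticity at every price."""
    return eps


def elasticity_linear(a: float, b: float, price: float) -> float:
    """Q = a - b*P: elasticity grows in magnitude as price approaches a/b."""
    return -b * price / (a - b * price)


def elasticity_semilog(b: float, price: float) -> float:
    """log Q = a - b*P: elasticity is -b*P, linear in price."""
    return -b * price


def elasticity_logit(alpha: float, beta: float, price: float) -> float:
    """Binary logit share s = 1 / (1 + exp(-(alpha - beta*P))): elasticity -beta*P*(1 - s)."""
    share = 1.0 / (1.0 + math.exp(-(alpha - beta * price)))
    return -beta * price * (1.0 - share)


for price in (10.0, 20.0, 30.0):
    print(
        f"P={price:>4}: constant {elasticity_constant(-1.0, price):.2f}  "
        f"linear {elasticity_linear(100.0, 2.0, price):.2f}  "
        f"semi-log {elasticity_semilog(0.05, price):.2f}  "
        f"logit {elasticity_logit(2.0, 0.1, price):.2f}"
    )
```

Only the constant-elasticity column stays flat across the three prices, which is the sense in which the log-log form is an assumption rather than a neutral default.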
What-if analysis as the user-facing surface
Estimated elasticities live behind a user-facing what-if interface in most production pricing systems. A revenue manager wants to ask "what happens if I raise this price by 10%?" and get an instant answer — projected demand, revenue, knock-on effects on related products — without waiting for a full forecast pipeline to rerun.
Endpoint design
The standard implementation is a separate API endpoint serving a simplified model — the constant-elasticity formula or a closely related lightweight approximation — that responds in milliseconds. The full pricing pipeline (demand forecast, inventory simulation, optimisation) runs asynchronously and uses richer models, but the what-if surface needs to keep up with a slider in a UI.
new_demand = baseline_demand × (new_price / current_price) ** elasticity
new_revenue = new_demand × new_price
For elasticity = −1 (unit-elastic), doubling the price halves demand and revenue is unchanged. For elasticity = −0.5 (inelastic), doubling the price reduces demand by ~29% and revenue rises by ~41%. These are the canonical sanity checks the UI should display alongside the projection.
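The two formulas and both sanity checks fit in one small function. This is a sketch of the calculation only, not of any particular endpoint: the function name, signature, and baseline numbers are hypothetical.

```python
def what_if(baseline_demand: float, current_price: float,
            new_price: float, elasticity: float) -> tuple[float, float]:
    """Constant-elasticity projection of demand and revenue at a new price."""
    new_demand = baseline_demand * (new_price / current_price) ** elasticity
    new_revenue = new_demand * new_price
    return new_demand, new_revenue


BASE_DEMAND, BASE_PRICE = 1000.0, 50.0        # hypothetical baseline
base_revenue = BASE_DEMAND * BASE_PRICE

# Sanity check 1: unit-elastic demand, doubled price.
demand_unit, revenue_unit = what_if(BASE_DEMAND, BASE_PRICE, 2 * BASE_PRICE, -1.0)

# Sanity check 2: inelastic demand, doubled price.
demand_inel, revenue_inel = what_if(BASE_DEMAND, BASE_PRICE, 2 * BASE_PRICE, -0.5)

print(f"eps=-1.0: demand x{demand_unit / BASE_DEMAND:.3f}, revenue x{revenue_unit / base_revenue:.3f}")
print(f"eps=-0.5: demand x{demand_inel / BASE_DEMAND:.3f}, revenue x{revenue_inel / base_revenue:.3f}")
```

Because the projection is a single power and a multiplication, it is cheap enough to recompute on every slider movement.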
Local-validity caveat
The constant-elasticity assumption is a local approximation. Inside a small price range — say ±20% around the current price — it is usually close enough; large jumps require recomputing elasticity because:
- Customer behaviour changes qualitatively at large price moves (loyalty effects, switching to competitors, perceived-quality flips).
- Cross-product cannibalisation patterns shift when relative prices move outside their training range.
- The market context that produced the estimated elasticity (competitor prices, season, inventory) may not survive a large move.
A good pricing UI reflects this: the what-if slider is restricted to a "safe" range, and larger moves trigger a "request detailed forecast" path that runs the full pipeline. The constraint is not arbitrary UI design — it is the econometric assumption (constant elasticity in a local range) directly shaping what the interface can be trusted to answer.
The what-if formula Q·(P_new/P_base)^ε and the true demand curve touch at the base price with the same slope, so inside the ±20% band they stay close together: the formula is trustworthy and fast enough to answer a UI slider in milliseconds. For price moves out toward ±55%, the power curve and the real curve peel apart and the projected demand drifts off by tens of percent. That divergence is precisely why a good pricing interface caps the slider to a safe band and routes larger moves to the full forecast pipeline: the UI constraint is the econometric assumption made visible.
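The size of the divergence depends on the true curve, which is never known in practice. As an illustration, the sketch below assumes the true demand is linear (Q = 100 − 2 · P, a made-up curve), calibrates the constant-elasticity formula to match its level and slope at a base price of 20, and measures the relative error at several price moves. The routing function and its return labels are hypothetical; the ±20% band is the one used in the text.

```python
def true_demand(price: float, a: float = 100.0, b: float = 2.0) -> float:
    """Assumed 'true' demand curve, linear for the sake of the illustration."""
    return a - b * price


def what_if_demand(price, base_price, base_demand, elasticity):
    return base_demand * (price / base_price) ** elasticity


BASE_PRICE = 20.0
BASE_DEMAND = true_demand(BASE_PRICE)
# Local elasticity of the linear curve at the base price: (dQ/dP) * P / Q.
LOCAL_ELASTICITY = -2.0 * BASE_PRICE / BASE_DEMAND


def relative_error(pct_move: float) -> float:
    """Relative gap between the what-if projection and the assumed true demand."""
    price = BASE_PRICE * (1.0 + pct_move)
    approx = what_if_demand(price, BASE_PRICE, BASE_DEMAND, LOCAL_ELASTICITY)
    return abs(approx - true_demand(price)) / true_demand(price)


def route(pct_move: float, safe_band: float = 0.20) -> str:
    """Serve small moves from the fast formula, send large ones to the full pipeline."""
    return "what_if" if abs(pct_move) <= safe_band else "full_forecast"


for move in (-0.55, -0.20, 0.20, 0.55):
    print(f"{move:+.0%}: error {relative_error(move):.1%}, routed to {route(move)}")
```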
Dynamic and time-varying elasticity
Real-world elasticities move over time. The discussion above has implicitly treated elasticity as static, but several mechanisms make it shift:
- Seasonality. Holiday-period demand can be far less price-sensitive than off-season demand.
- Competitor moves. A competitor's price change shifts the customer's choice set and changes the effective elasticity overnight.
- Macro conditions. Recessions tighten budgets and raise sensitivity; booms loosen them.
- Inventory and scarcity signals. Low stock makes customers less price-sensitive (urgency); high stock the opposite.
Modelling time-varying elasticity is itself a research area, with structured approaches like time-varying-parameter models, regime-switching demand models, and rolling-window re-estimation as the practical workhorses. For most production systems, periodic re-estimation (daily, weekly, monthly depending on the speed of underlying changes) is the pragmatic compromise between full dynamic models and assuming static elasticity. The underlying identification tools — IV, panel FE, DML — apply just as well to time-varying elasticity, with the time dimension handled either by re-estimating in rolling windows or by structurally modelling the time variation.
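Rolling-window re-estimation can be sketched in a few lines. The simulated series has a hypothetical regime change (elasticity −1.0 for the first half, −2.0 for the second), and prices are randomised so that a plain OLS slope is unbiased inside each window; with observational prices that step would be replaced by whichever IV, FE, or DML estimator the data supports. Window and step sizes are arbitrary choices.

```python
import random


def slope(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)


def rolling_elasticity(log_price, log_demand, window, step):
    """Re-estimate the log-log slope on consecutive, overlapping windows."""
    estimates = []
    for start in range(0, len(log_price) - window + 1, step):
        end = start + window
        estimates.append((start, slope(log_price[start:end], log_demand[start:end])))
    return estimates


rng = random.Random(5)
log_price, log_demand = [], []
for t in range(2_000):
    eps_t = -1.0 if t < 1_000 else -2.0       # hypothetical regime change at t = 1000
    lp = rng.uniform(-0.3, 0.3)               # randomised price variation (assumption)
    log_price.append(lp)
    log_demand.append(1.0 + eps_t * lp + rng.gauss(0.0, 0.2))

estimates = rolling_elasticity(log_price, log_demand, window=500, step=250)
for start, est in estimates:
    print(f"window starting at t={start:>4}: elasticity {est:.2f}")
```

Windows that straddle the regime change report a blend of the two elasticities, which is the usual trade-off: shorter windows react faster but are noisier.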
Short-run vs long-run effects
Short-run elasticity is not the whole pricing problem. The ε recovered by IV, FE, DML, or CATE methods captures the immediate demand response to a price change — the change in quantity sold in the period after the price moves. That is what a what-if endpoint shows, and what most observational pricing data identifies cleanly. Several long-run effects are missed by the short-run estimate:
- Retention and churn. A price increase may not reduce sales this week but may raise churn over the next quarter as customers gradually switch to alternatives. The short-run elasticity does not see this; it has to be modelled separately, often through customer-lifetime-value forecasts.
- Reference prices. Customers anchor on prices they have seen before. A price reduction trains the market to expect lower prices and is hard to reverse without backlash; the short-run uplift can be real while the long-run revenue effect is negative.
- Perceived fairness. Price discrimination that becomes visible (different users seeing different prices for the same item) can damage brand trust in ways that surface over months rather than days. This is especially important for personalised pricing, where the fairness and trust dimensions can become reputational risks even when the short-run elasticity-based optimisation looks favourable.
- Competitive response. Competitors observe price moves and may match them, eroding the temporary advantage that a short-run model would attribute to the price change.
A what-if surface that estimates immediate demand response should not be confused with the full long-term value of a pricing policy. The general fix is to evaluate pricing policies at the policy level — long-horizon experiments, off-policy evaluation on logged decisions, holdout cohorts, customer-lifetime-value modelling — rather than treating each pricing decision as a one-shot demand-response problem.
Bringing the toolkit together
This note is the integration point for the Econometrics track. The pieces:
- The problem. Price is endogenous (revenue managers and algorithms set it conditional on expected demand), so naive demand-on-price regression mixes the price effect with the pricing rule — the canonical endogeneity problem in its simultaneity form.
- The classical fixes. Instrumental variables with cost shifters, lagged competitor prices, or capacity instruments; regression discontinuity at price-rule cutoffs; panel-data fixed effects for store-level and time-level confounders.
- The modern fixes. Control function and DML for high-dimensional confounders; causal-ML CATE estimation for personalised pricing.
- The application layer. Cross-price elasticity matrices, structured demand models for the cross-elasticity scaling problem, joint optimisation across the portfolio, and what-if interfaces with constant-elasticity caveats sitting in front of the whole thing.
Pricing is the worked example where every method in the track has a real role. Production pricing systems usually combine several of them — DML to estimate baseline elasticity from rich confounders, panel FE for store-level heterogeneity, CATE methods for segment-level personalisation, structured demand models for cross-product effects — rather than picking one. The right architecture depends on the data, the decision the elasticity is going to inform, and how much variation in price is available to exploit. But every honest pricing pipeline has to take identification seriously somewhere; the alternative is optimising on a confounded number and being surprised when the predicted revenue lift does not show up in production.