### Normalization of Market Prices - A Technical Note

Interpreting prices in prediction markets can be challenging when the predictions are required to sum to 100%. While the average traded prices of all contracts will typically sum to something close to 100% (or 100¢), they need not sum to exactly 100¢.

One school of thought is to interpret predictions nominally, attributing any violation of the summing-up constraint to noise. Another school of thought suggests that normalizing the predictions to enforce the summing-up constraint corrects for inherent problems in the microstructure of prediction markets. But which normalization provides a reasonable approximation of the true prediction?

In what follows we consider a multi-contract prediction market with contracts $i\in\{1,\ldots,n\}$ whose payoff prices $p_i$ sum to 100¢. Let $p^l_i$ denote the last-traded price of contract $i$, and $\bar{p}_i$ an average price of its recent trades.

One method of normalization averages the prices of the most recent trades in each contract over a particular reference period (e.g., the current or last trading day, or the last trading week), perhaps weighted by trading volume. The prediction is then normalized by multiplying each average price $\bar{p}_i$ by the correction factor 100¢$/\sum_{j=1}^n \bar{p}_j$.
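The proportional rescaling described above can be sketched in a few lines. This is only an illustration: the function name `normalize_average_prices` and the sample prices are hypothetical, and prices are assumed to be quoted in cents.

```python
def normalize_average_prices(avg_prices, total=100.0):
    """Rescale average traded prices so that they sum to `total` (cents)."""
    price_sum = sum(avg_prices)
    if price_sum <= 0:
        raise ValueError("average prices must have a positive sum")
    # Correction factor: 100 cents divided by the sum of average prices.
    factor = total / price_sum
    return [p * factor for p in avg_prices]


# Hypothetical average prices (in cents) for a three-contract market.
avg = [42.0, 35.0, 28.0]  # these sum to 105 cents
normalized = normalize_average_prices(avg)
print([round(p, 2) for p in normalized])
```

Every contract is scaled by the same factor, so relative prices are preserved and only the level changes.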

To capture the current outlook of the prediction market, the best bid and best ask prices are more suitable than the prices of past trades, because bid and ask prices are essentially forward-looking rather than backward-looking.

As before, consider a multi-contract prediction market with contracts $i\in\{1,\ldots,n\}$ whose payoff prices $p_i$ sum to 100¢. In the remainder of this note, prices are expressed as fractions of the payoff, so that 100¢ corresponds to 1. For each of the $n$ contracts, traders post a best bid $p^b_i$ and a best ask $p^a_i$. Further let $$A\equiv\sum_{i=1}^n p^a_i \quad\mathrm{and}\quad B\equiv\sum_{i=1}^n p^b_i\label{eq:AB}$$ denote the sum of best ask prices and the sum of best bid prices, respectively. Arbitrage ensures that $A\ge 1$ and $B\le 1$: if $A<1$, buying one unit of every contract would cost less than the guaranteed payoff, and if $B>1$, selling one unit of every contract would bring in more than the guaranteed payout. Then $S\equiv A-B$ is the overall spread.

A consistent prediction $\widehat{p}_i$ based on current market prices requires that $$\sum_{i=1}^n \widehat{p}_i =1\label{eq:sumup}$$ The bid and ask prices impose constraints on $\widehat{p}_i$. Concretely, best bid and best ask prices provide lower and upper bounds on the prediction so that $p^b_i\le \widehat{p}_i \le p^a_i$. Arguably, the prediction falls in between the prices at which traders are willing to take explicit positions. Then it is possible to calculate a common weight factor $0\le\lambda\le1$ so that $$\widehat{p}_i\equiv \lambda p^b_i + (1-\lambda) p^a_i\label{eq:pil}$$ The identifying assumption for $\lambda$ is that the weight is the same for all contracts. Systematic deviations from $\lambda=0.5$ are therefore assumed to be common across contracts.

Summing (\ref{eq:pil}) over all contracts $i$, imposing (\ref{eq:sumup}) so that $\lambda B+(1-\lambda)A=1$, and solving for $\lambda$ yields $$\lambda=\frac{A-1}{A-B}\label{eq:lambda}$$ Given $A>B$, it is immediately apparent that $\lambda\ge0$ requires $A\ge1$, and that $\lambda\le1$ requires $B\le1$. These conditions are always met because arbitrage opportunities rule out $A < 1$ or $B > 1$. Substituting (\ref{eq:lambda}) back into (\ref{eq:pil}) reveals that $$\widehat{p}_i=\frac{p^b_i (A-1) + p^a_i(1-B)}{A-B} \label{eq:prediction}$$ It is easily verified that (\ref{eq:prediction}) satisfies (\ref{eq:sumup}).
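As a minimal sketch of the weighted bid-ask prediction, the following computes $\lambda$ and the predictions $\widehat{p}_i$ from lists of best bids and best asks. The function name `weighted_bid_ask_prediction` and the quotes are hypothetical, and prices are assumed to be expressed as fractions of the payoff (100¢ = 1).

```python
def weighted_bid_ask_prediction(bids, asks):
    """Return (lam, predictions) for best bids and asks given as fractions of 1.

    lam is the common weight on the bid side; each prediction equals
    lam * bid + (1 - lam) * ask, so the predictions sum to one.
    """
    if len(bids) != len(asks):
        raise ValueError("bids and asks must have the same length")
    A = sum(asks)  # sum of best ask prices
    B = sum(bids)  # sum of best bid prices
    if A < 1.0 or B > 1.0:
        raise ValueError("expected A >= 1 and B <= 1 (no arbitrage)")
    if A == B:
        raise ValueError("overall spread is zero, so lambda is undefined")
    lam = (A - 1.0) / (A - B)
    predictions = [lam * b + (1.0 - lam) * a for b, a in zip(bids, asks)]
    return lam, predictions


# Hypothetical quotes for a three-contract market.
bids = [0.40, 0.30, 0.20]  # B = 0.90
asks = [0.44, 0.36, 0.26]  # A = 1.06
lam, preds = weighted_bid_ask_prediction(bids, asks)
print(round(lam, 4), [round(p, 4) for p in preds])
```

With these quotes the asks overshoot 1 by less than the bids undershoot it, so $\lambda<0.5$ and each prediction sits closer to the ask than to the bid.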

In some instances, a contract has no best bid or no best ask price. In these cases, assuming that there is a regularity in the bid-ask bounce, the last traded price $p_i^l$ can be used as a substitute for a missing $p^b_i$ if $p_i^l < p^a_i$, and for a missing $p^a_i$ if $p_i^l > p^b_i$.

It is also useful to characterize the influence of a change in the best ask or best bid price on the prediction. Differentiating (\ref{eq:prediction}) with respect to $p^a_i$ and $p^b_i$, and noting that $A$ and $B$ themselves depend on these prices, yields: $$\frac{\partial\widehat{p}_i}{\partial p^a_i}=(1-\lambda)\left[1-\frac{p^a_i-p^b_i}{A-B}\right]>0\label{eq:dpia}$$ and $$\frac{\partial\widehat{p}_i}{\partial p^b_i}=\lambda\left[1-\frac{p^a_i-p^b_i}{A-B}\right]>0\label{eq:dpib}$$ Increases in the bid and ask prices of contract $i$ therefore increase its prediction, provided that $0<\lambda<1$ and that contract $i$ does not account for the entire spread. However, both expressions are also smaller than one. If the bid-ask spreads $(p^a_i-p^b_i)$ are about the same across the $n$ contracts in the prediction market, then the ratio of the individual spread to the total spread is about $1/n$, and therefore the expression in square brackets in (\ref{eq:dpia}) and (\ref{eq:dpib}) is approximately $(n-1)/n$. Thus the magnitude of (\ref{eq:dpia}) and (\ref{eq:dpib}) depends primarily on $\lambda$.

If an increase in the best ask price occurs at a level where $A$ already exceeds $1$ by a fair margin, then $1-\lambda=(1-B)/(A-B)$ will be fairly small and further increases in the ask price have little influence on the predicted price. This mechanism therefore prevents predictions from becoming spurious due to distortions to individual ask prices. The mechanism works in the opposite direction for changes in the best bid price. To summarize, predictions based on the weighted bid-ask midpoint as defined in (\ref{eq:prediction}) are more robust against temporary distortions and market influences than prices that are determined through simple normalizations.
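The two sensitivities can be checked numerically. The sketch below compares the closed-form expressions with central finite differences of the prediction formula. All function names, the quotes, and the step size `h` are illustrative assumptions.

```python
def predict(bids, asks, i):
    """Weighted bid-ask prediction for contract i (prices as fractions of 1)."""
    A, B = sum(asks), sum(bids)
    return (bids[i] * (A - 1.0) + asks[i] * (1.0 - B)) / (A - B)


def analytic_sensitivities(bids, asks, i):
    """Closed-form derivatives of the prediction w.r.t. own ask and own bid."""
    A, B = sum(asks), sum(bids)
    lam = (A - 1.0) / (A - B)
    bracket = 1.0 - (asks[i] - bids[i]) / (A - B)  # one minus spread share
    return (1.0 - lam) * bracket, lam * bracket


def bumped(prices, i, delta):
    """Copy of `prices` with element i shifted by delta."""
    out = list(prices)
    out[i] += delta
    return out


# Hypothetical quotes for a three-contract market.
bids = [0.40, 0.30, 0.20]
asks = [0.44, 0.36, 0.26]
i, h = 0, 1e-6

d_ask, d_bid = analytic_sensitivities(bids, asks, i)

# Central finite differences with respect to the ask and bid of contract i.
num_ask = (predict(bids, bumped(asks, i, h), i)
           - predict(bids, bumped(asks, i, -h), i)) / (2 * h)
num_bid = (predict(bumped(bids, i, h), asks, i)
           - predict(bumped(bids, i, -h), asks, i)) / (2 * h)

print(round(d_ask, 5), round(num_ask, 5))
print(round(d_bid, 5), round(num_bid, 5))
```

In this example the three spreads are not equal, so the bracket term differs slightly from $(n-1)/n$. Both derivatives are positive and below one.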

For comparison purposes, it is useful to consider an alternative prediction based on normalized bid-ask midpoints. The latter are defined as $p^m_i\equiv(p^a_i+p^b_i)/2$. It follows that the sum of bid-ask midpoints is equal to $(A+B)/2$, which may or may not be equal to 1. The normalized prediction based on bid-ask midpoints is therefore given by $$\widehat{p}^m_i= \frac{p^a_i+p^b_i}{A+B}\label{eq:pmi}$$ Prediction (\ref{eq:pmi}) satisfies (\ref{eq:sumup}) by the definitions in (\ref{eq:AB}). Predictions based on (\ref{eq:pmi}) may be internally inconsistent because it is possible that $\widehat{p}^m_i < p^b_i$ or that $\widehat{p}^m_i > p^a_i$. For example, the former case arises when $p^a_i/p^b_i < A+B-1$, which follows from rearranging $(p^a_i+p^b_i)/(A+B) < p^b_i$. Because the ask-to-bid price ratio on the left-hand side is at least one by definition, this condition tends to hold when $A$ exceeds one by a fair margin while $B$ is close to one, and the ask-to-bid price ratio is close to one for the particular contract. Such situations can occur when one contract has an unusually large bid-ask spread, which inflates $A$ relative to $B$. Therefore, the unweighted bid-ask midpoint is a much less suitable price predictor than the weighted bid-ask midpoint.
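A small numerical illustration makes the inconsistency concrete. In the sketch below, the function names and the two-contract quotes are hypothetical; the second contract is given a deliberately wide spread. The normalized midpoint of the first contract then falls below its best bid, while the weighted prediction stays inside the bid-ask bounds.

```python
def normalized_midpoints(bids, asks):
    """Bid-ask midpoints rescaled to sum to one: (ask + bid) / (A + B)."""
    total = sum(asks) + sum(bids)
    return [(a + b) / total for b, a in zip(bids, asks)]


def weighted_predictions(bids, asks):
    """Weighted bid-ask predictions using a common weight lam on the bid."""
    A, B = sum(asks), sum(bids)
    lam = (A - 1.0) / (A - B)
    return [lam * b + (1.0 - lam) * a for b, a in zip(bids, asks)]


# Hypothetical quotes: contract 0 has a tight spread, contract 1 a wide one.
bids = [0.60, 0.30]  # B = 0.90
asks = [0.61, 0.60]  # A = 1.21
mid = normalized_midpoints(bids, asks)
wtd = weighted_predictions(bids, asks)

for i, (b, a) in enumerate(zip(bids, asks)):
    print(f"contract {i}: bid={b:.2f} ask={a:.2f} "
          f"midpoint={mid[i]:.4f} weighted={wtd[i]:.4f}")
```

Here $A+B-1=1.11$ exceeds the ask-to-bid ratio $0.61/0.60$ of the first contract, which is why its normalized midpoint drops below the bid.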