Brier score


The Brier score is the most commonly used verification metric for evaluating probabilistic forecasts of a binary outcome, such as a “chance of rainfall” forecast.

Probabilistic forecasts of binary events are expressed as values between 0 and 1, and observations are exactly 0 (event did not occur) or 1 (event occurred).

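The metric is then calculated in the same way as the mean squared error (MSE): it is the mean of the squared differences between the forecast probabilities and the binary observations. The Brier score is a strictly proper scoring rule and is negatively oriented, meaning that lower values are better: a perfect score is 0 and the worst possible score is 1.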
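To make the MSE connection concrete, here is a minimal NumPy sketch that computes the score by hand for a few toy forecasts. The helper name `brier_score_by_hand` and the toy values are hypothetical, introduced only for illustration; they are not part of the scores package.

```python
import numpy as np


def brier_score_by_hand(fcst, obs):
    """Mean squared difference between forecast probabilities and binary outcomes.

    Hypothetical helper for illustration only; not part of the scores package.
    """
    fcst = np.asarray(fcst, dtype=float)
    obs = np.asarray(obs, dtype=float)
    return float(np.mean((fcst - obs) ** 2))


# Toy observations: 1 means the event occurred, 0 means it did not.
obs = np.array([1, 0, 1, 1])

# Probability 1 for every event and 0 for every non-event: the best possible score.
perfect = brier_score_by_hand([1.0, 0.0, 1.0, 1.0], obs)

# Confidently wrong on every case: the worst possible score.
worst = brier_score_by_hand([0.0, 1.0, 0.0, 0.0], obs)

# A forecast that leans the right way without being certain lands in between.
hedged = brier_score_by_hand([0.8, 0.3, 0.6, 0.9], obs)

print(f"perfect = {perfect}, worst = {worst}, hedged = {hedged:.3f}")
```

The perfect and worst cases reproduce the bounds of 0 and 1 stated above, and any forecast that is neither certain nor always wrong scores somewhere between them.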

[1]:
from scores.probability import brier_score
from scipy.stats import beta, binom

import numpy as np
import xarray as xr
[2]:
# To learn more about the implementation of the Brier score, uncomment the following
# help(brier_score)

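We generate two synthetic forecasts. By design, fcst1 is a good forecast: the observations are drawn as Bernoulli trials whose event probability is fcst1. By contrast, fcst2 is a poor forecast, generated independently of the observations. We measure the difference in skill by calculating and comparing their Brier scores.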

[3]:

fcst1 = beta.rvs(2, 1, size=1000)
obs = binom.rvs(1, fcst1)
fcst2 = beta.rvs(0.5, 1, size=1000)
fcst1 = xr.DataArray(data=fcst1, dims="time", coords={"time": np.arange(0, 1000)})
fcst2 = xr.DataArray(data=fcst2, dims="time", coords={"time": np.arange(0, 1000)})
obs = xr.DataArray(data=obs, dims="time", coords={"time": np.arange(0, 1000)})
[4]:
brier_fcst1 = brier_score(fcst1, obs)
brier_fcst2 = brier_score(fcst2, obs)

print(f"Brier score for fcst1 = {brier_fcst1.item():.2f}")
print(f"Brier score for fcst2 = {brier_fcst2.item():.2f}")
Brier score for fcst1 = 0.16
Brier score for fcst2 = 0.43

As expected, fcst1 has the lower Brier score (0.16 compared with 0.43). The difference between the two scores quantifies how much better fcst1 is than fcst2.
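Because the cells above do not fix a random seed, the printed scores will vary slightly from run to run. The sketch below repeats the experiment using only NumPy, with a larger sample and an arbitrarily chosen seed (both are assumptions made for illustration, not part of the original tutorial). It also compares the results with the values implied by the sampling distributions: 1/6 for fcst1 and 19/45 for fcst2. These reference values are derived here from the moments of the Beta distributions and are not quoted from the tutorial.

```python
import numpy as np

# Arbitrary seed and sample size, chosen only to make this sketch reproducible.
rng = np.random.default_rng(42)
n = 200_000

# Same sampling scheme as the tutorial cells, written with NumPy's Generator API.
fcst1 = rng.beta(2, 1, size=n)    # forecast probabilities
obs = rng.binomial(1, fcst1)      # outcomes drawn with event probability fcst1
fcst2 = rng.beta(0.5, 1, size=n)  # drawn independently of obs

# The Brier score is the mean squared difference between forecast and outcome.
brier_fcst1 = np.mean((fcst1 - obs) ** 2)
brier_fcst2 = np.mean((fcst2 - obs) ** 2)

# Reference values derived from the Beta moments (an assumption of this sketch):
# fcst1: E[p(1 - p)] = 2/3 - 1/2 = 1/6 for p ~ Beta(2, 1)
# fcst2: E[f^2] - 2 E[f] E[obs] + E[obs] = 1/5 - 4/9 + 2/3 = 19/45
expected_fcst1 = 1 / 6
expected_fcst2 = 19 / 45

print(f"fcst1: simulated {brier_fcst1:.3f}, expected {expected_fcst1:.3f}")
print(f"fcst2: simulated {brier_fcst2:.3f}, expected {expected_fcst2:.3f}")
```

With a sample this large, the simulated scores should sit close to the reference values, which is consistent with the 0.16 and 0.43 reported above for a sample of 1000.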

Notes

  • If you are using the Brier score on large data with Dask, consider setting the check_args argument of brier_score to False, which skips validation of the input values.

  • In the future, the Brier score components calculation will be added.

  • You may be interested in working through the Murphy Diagram tutorial, which shows how to break down performance under the Brier score by threshold probability.

Reference: Brier, G. W., 1950. Verification of forecasts expressed in terms of probability. Monthly Weather Review, 78(1), pp. 1-3.