Online Covariance Forecast Evaluation#

This tutorial shows how to evaluate online covariance estimators with online_covariance_forecast_evaluation.

We compare EWCovariance, a plain EWMA covariance, against RegimeAdjustedEWCovariance, its regime-adjusted counterpart based on the Short-Term Volatility Update (STVU) [1].

Both support incremental updates via partial_fit, making them suitable for streaming evaluation. For estimators that do not support partial_fit, the batch counterpart covariance_forecast_evaluation can be used instead.
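To make the streaming idea concrete, here is a minimal sketch of an estimator that is updated one chunk of observations at a time. The class `ToyOnlineVariance` and its default decay of 0.94 are hypothetical: it tracks a single zero-mean asset and only mimics the partial_fit pattern, it is not how skfolio implements its estimators.

```python
class ToyOnlineVariance:
    """Hypothetical scalar EWMA variance tracker with a partial_fit-style API."""

    def __init__(self, decay=0.94):
        self.decay = decay
        self.variance_ = None  # unset until the first observation arrives

    def partial_fit(self, returns):
        # Fold each new return into the running estimate without revisiting history.
        for r in returns:
            if self.variance_ is None:
                self.variance_ = r * r
            else:
                self.variance_ = self.decay * self.variance_ + (1 - self.decay) * r * r
        return self


stream = [0.01, -0.02, 0.015, -0.005, 0.03]

# Feeding the stream in two chunks leaves the same state as feeding it at once.
chunked = ToyOnlineVariance().partial_fit(stream[:3]).partial_fit(stream[3:])
batch = ToyOnlineVariance().partial_fit(stream)
print(chunked.variance_, batch.variance_)
```

Because the state after each update depends only on the previous state and the new data, the estimator can be evaluated on a stream without refitting on the full history.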

Data#

We load the S&P 500 dataset, which contains the daily prices of 20 assets from the S&P 500 Index composition, and keep the returns from 2010-01-04 to 2022-12-28.

import numpy as np
from plotly.io import show

from skfolio.datasets import load_sp500_dataset
from skfolio.model_selection import (
    CovarianceForecastComparison,
    online_covariance_forecast_evaluation,
)
from skfolio.moments import (
    EWCovariance,
    RegimeAdjustedEWCovariance,
    RegimeAdjustmentMethod,
)
from skfolio.preprocessing import prices_to_returns

prices = load_sp500_dataset()
X = prices_to_returns(prices)
X = X["2010":]

Covariance Estimators#

We use two covariance estimators:

EWCovariance is a plain EWMA covariance. It can react slowly to volatility shocks.

RegimeAdjustedEWCovariance adds a regime adjustment via the Short-Term Volatility Update (STVU). This applies a scalar multiplier to better align predicted and realized risk when volatility regimes change faster than a plain EWMA can track.
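The effect of such a scalar multiplier can be sketched on simulated data. The formulation below (square root of an exponentially weighted mean of cross-sectional squared standardized returns) is one common way to build a volatility regime adjustment. It is an assumption made for illustration, and the data, variable names, and 20-day half-life are ours; the tutorial does not show how RegimeAdjustedEWCovariance computes its multiplier.

```python
import numpy as np

rng = np.random.default_rng(0)
n_obs, n_assets = 500, 5

# Simulated returns whose true volatility doubles halfway through the sample.
true_vol = np.where(np.arange(n_obs) < 250, 0.01, 0.02)
returns = rng.standard_normal((n_obs, n_assets)) * true_vol[:, None]

# A deliberately stale forecast that still assumes the calm-regime volatility.
forecast_vol = np.full(n_assets, 0.01)

decay = 0.5 ** (1 / 20)  # 20-day half-life for the regime signal (assumed)
ewma_b2 = 1.0  # start from a neutral multiplier
for r in returns:
    # Cross-sectional mean of squared standardized returns (about 1 when calibrated).
    b2 = np.mean((r / forecast_vol) ** 2)
    ewma_b2 = decay * ewma_b2 + (1 - decay) * b2

multiplier = np.sqrt(ewma_b2)  # root-mean-square style scalar
adjusted_vol = multiplier * forecast_vol
print(round(float(multiplier), 2))
```

Since the stale forecast understates volatility by a factor of two at the end of the sample, the multiplier should end up near 2, which scales the forecast back toward realized risk.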

We set the same variance half-life of 40 trading days for both estimators and a correlation half-life of 80 trading days for RegimeAdjustedEWCovariance. A shorter variance half-life lets the model adapt faster to volatility shifts, while a longer correlation half-life reduces estimation noise and gives more stable estimates of co-movements, which typically require more data for reliable inference. This choice also aligns with empirical evidence that volatility tends to mean-revert faster than correlation.

ew_cov = EWCovariance(half_life=40)

stvu_cov = RegimeAdjustedEWCovariance(
    half_life=40,
    corr_half_life=80,
    regime_half_life=20,
    regime_method=RegimeAdjustmentMethod.RMS,
)
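The half-lives above can be read as decay factors. Assuming the usual EWMA convention in which the weight of an observation halves every half_life periods, the conversion is shown below. The helper name `decay_from_half_life` is ours, and skfolio's internal parameterization is not shown in this tutorial.

```python
def decay_from_half_life(half_life):
    # The weight of an observation k periods old is proportional to decay**k,
    # so decay**half_life must equal 0.5.
    return 0.5 ** (1.0 / half_life)


for half_life in (20, 40, 80):
    decay = decay_from_half_life(half_life)
    # Relative weight of an observation that is 40 trading days old.
    print(half_life, round(decay, 4), round(decay**40, 4))
```

Under this convention, a 40-day-old observation keeps a relative weight of 0.25, 0.5, and about 0.71 for half-lives of 20, 40, and 80 days, which is why the longer correlation half-life produces smoother estimates.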

Evaluate Each Estimator#

We now evaluate each estimator with online_covariance_forecast_evaluation. This function performs a walk-forward evaluation. At each step, it updates the estimator with partial_fit and compares the one-step-ahead forecast with the next realized return.

Here, warmup_size=252 reserves the first year for initialization, while test_size=1 evaluates the forecast one day at a time.

ew_evaluation = online_covariance_forecast_evaluation(
    ew_cov,
    X,
    warmup_size=252,
    test_size=1,
)
stvu_evaluation = online_covariance_forecast_evaluation(
    stvu_cov,
    X,
    warmup_size=252,
    test_size=1,
)
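The walk-forward logic described above can be sketched for a single asset as follows. This is a simplified illustration under our own assumptions (a scalar EWMA variance, zero-mean simulated returns), not the internals of online_covariance_forecast_evaluation.

```python
import random

random.seed(0)
returns = [random.gauss(0.0, 0.01) for _ in range(1000)]

warmup_size = 252
decay = 0.5 ** (1 / 40)

# Warm-up: initialize the variance estimate without recording any diagnostics.
variance = sum(r * r for r in returns[:warmup_size]) / warmup_size

standardized = []
for r in returns[warmup_size:]:
    # 1. The forecast for this step only uses data observed before it.
    forecast_vol = variance**0.5
    # 2. Compare the forecast with the newly realized return.
    standardized.append(r / forecast_vol)
    # 3. Only then update the estimate with the new observation.
    variance = decay * variance + (1 - decay) * r * r

print(len(standardized))
```

The ordering of steps 1 to 3 is what keeps the evaluation out-of-sample: each return is scored against a forecast that has not yet seen it.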

Summary Table#

Let’s display the summary of the regime-adjusted covariance forecast evaluation. The four rows are:

Mahalanobis ratio evaluates whether the full covariance structure (all eigenvalue directions) is correctly specified. The target is 1.0, with values above 1.0 indicating underestimated risk and values below 1.0 indicating overestimated risk.
Diagonal ratio evaluates the individual asset variances only, with the same 1.0 target and interpretation.
Portfolio standardized returns evaluate calibration along one portfolio direction rather than across all directions. Their std column is the bias statistic, with values near 1.0 meaning well-calibrated portfolio risk.
Portfolio QLIKE evaluates portfolio variance forecasts along one portfolio direction by comparing the forecast portfolio variance with the realized sum of squared portfolio returns over the evaluation window. Lower values indicate better variance forecasts.

stvu_evaluation.summary()

	mean	median	std	p5	p95	mad_from_target	target
Mahalanobis ratio	1.410751	0.997069	1.591321	0.265389	3.735248	0.819639	1.0
Diagonal ratio	1.196073	0.817138	1.309131	0.200601	3.318981	0.744549	1.0
Portfolio standardized returns	0.070449	0.102939	1.033466	-1.669902	1.61604	0.7606	mean=0, std=1
Portfolio QLIKE	-8.622594	-9.12539	2.387038	-10.752215	-5.20276	NaN	lower is better
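The first two rows can be illustrated on simulated data. The sketch below assumes both ratios are normalized so that their expectation is 1.0 under a correct forecast (squared Mahalanobis distance divided by the number of assets, and the mean of squared returns over forecast variances). The function names and the simulated covariance are ours, and the exact definitions used by skfolio may differ.

```python
import numpy as np

rng = np.random.default_rng(0)
n_assets = 4

# A known covariance: 1% daily volatility and a uniform correlation of 0.3.
corr = np.full((n_assets, n_assets), 0.3) + 0.7 * np.eye(n_assets)
cov = corr * 0.01**2
returns = rng.multivariate_normal(np.zeros(n_assets), cov, size=5000)


def mahalanobis_ratio(r, forecast_cov):
    # r' inv(cov) r divided by the number of assets: expectation 1 if cov is right.
    return r @ np.linalg.solve(forecast_cov, r) / len(r)


def diagonal_ratio(r, forecast_cov):
    # Same idea using only the forecast variances, ignoring correlations.
    return np.mean(r**2 / np.diag(forecast_cov))


for scale, label in [(1.0, "correct"), (0.5, "risk underestimated")]:
    forecast = scale * cov
    m = np.mean([mahalanobis_ratio(r, forecast) for r in returns])
    d = np.mean([diagonal_ratio(r, forecast) for r in returns])
    print(label, round(float(m), 2), round(float(d), 2))
```

With the correct covariance both averages should sit near 1.0, and halving the forecast covariance should push them toward 2.0, matching the reading that values above 1.0 indicate underestimated risk.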

Calibration Plot#

Let’s now plot the rolling calibration diagnostics: the rolling mean of the Mahalanobis ratio, the rolling mean of the diagonal ratio, and the rolling bias statistic from the portfolio standardized returns.

stvu_evaluation.plot_calibration()

Side-by-Side Comparison#

We now compare both evaluations with CovarianceForecastComparison:

comparison = CovarianceForecastComparison(
    [ew_evaluation, stvu_evaluation], names=["EWMA Cov", "STVU Cov"]
)
comparison.summary()

estimator	EWMA Cov							STVU Cov
	mean	median	std	p5	p95	mad_from_target	target	mean	median	std	p5	p95	mad_from_target	target
Mahalanobis ratio	1.34943	0.959639	1.505114	0.343315	3.433126	0.735578	1.0	1.410751	0.997069	1.591321	0.265389	3.735248	0.819639	1.0
Diagonal ratio	1.074643	0.715882	1.370326	0.21308	2.95529	0.699377	1.0	1.196073	0.817138	1.309131	0.200601	3.318981	0.744549	1.0
Portfolio standardized returns	0.067156	0.099519	1.038122	-1.648727	1.585503	0.746656	mean=0, std=1	0.070449	0.102939	1.033466	-1.669902	1.61604	0.7606	mean=0, std=1
Portfolio QLIKE	-8.545705	-9.128461	2.799672	-10.463692	-5.208358	NaN	lower is better	-8.622594	-9.12539	2.387038	-10.752215	-5.20276	NaN	lower is better

Bias Statistic#

Let’s plot the bias statistic. It measures whether the portfolio risk forecast is well calibrated, with a target of 1.0. We expect the regime-adjusted model to remain closer to 1.0, especially during stress periods.

comparison.plot_calibration(diagnostics=["bias"])
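As a small worked example of how to read the bias statistic, the sketch below computes the standard deviation of standardized returns for a simulated portfolio with a constant true volatility. The data and the helper `bias_statistic` are hypothetical, and the rolling window used in the plot is omitted for brevity.

```python
import random
import statistics

random.seed(1)
true_vol = 0.012
portfolio_returns = [random.gauss(0.0, true_vol) for _ in range(2000)]


def bias_statistic(returns, forecast_vol):
    # Standard deviation of returns scaled by the forecast volatility.
    return statistics.stdev([r / forecast_vol for r in returns])


print(round(bias_statistic(portfolio_returns, 0.012), 2))  # forecast matches true vol
print(round(bias_statistic(portfolio_returns, 0.008), 2))  # forecast too low
print(round(bias_statistic(portfolio_returns, 0.018), 2))  # forecast too high
```

A forecast that is too low yields a bias statistic above 1.0 (risk underestimated), and a forecast that is too high yields a value below 1.0.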

QLIKE Loss#

Let’s now plot the QLIKE loss. It compares the forecast portfolio variance with the realized sum of squared portfolio returns over the evaluation window, with lower values indicating better portfolio variance forecasts. Because STVU rescales the forecast toward realized risk, we generally expect it to achieve a lower QLIKE.

comparison.plot_qlike_loss()
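To see why lower QLIKE is better, here is a sketch assuming the common form of the loss, log(h) + r²/h, where h is the forecast variance and r² the realized squared return. The exact normalization used by skfolio is not shown in this tutorial, so treat the formula as an assumption.

```python
import math


def qlike(forecast_variance, realized_squared_return):
    # QLIKE loss: log of the forecast variance plus the realized-to-forecast ratio.
    return math.log(forecast_variance) + realized_squared_return / forecast_variance


realized = 0.01**2  # squared return consistent with 1% volatility

# The loss is minimized when the forecast variance equals the realized value.
losses = {vol: qlike(vol**2, realized) for vol in (0.005, 0.01, 0.02)}
best = min(losses, key=losses.get)
print(best)

# Halving the volatility forecast costs more than doubling it.
print(losses[0.005] > losses[0.02])
```

The asymmetry is a useful property for risk models: under this form, underestimating variance is penalized more heavily than overestimating it by the same factor.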

Exceedance Rates#

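We can also display the exceedance summary. If the covariance forecast were perfectly calibrated and returns were Gaussian, the squared Mahalanobis distance would follow a chi-squared distribution. The exceedance rate measures how often this distance exceeds the chi-squared quantile at a given confidence level. Under those assumptions it would equal one minus that level, for example 0.05 at the 0.95 level, and the deviation column reports the observed rate minus this nominal rate.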

In practice, daily equity returns are fat-tailed, so in this example both estimators exceed the nominal levels. This metric is therefore more useful for comparing estimators than for making an absolute calibration statement.

comparison.exceedance_summary()

confidence_level	EWMA Cov observed_rate	EWMA Cov deviation	STVU Cov observed_rate	STVU Cov deviation
0.95	0.227966	0.177966	0.270378	0.220378
0.99	0.169649	0.159649	0.201789	0.191789
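The fat-tail effect can be reproduced in a toy setting. The sketch below uses two uncorrelated unit-variance assets, for which the chi-squared quantile with 2 degrees of freedom has a closed form, and compares Gaussian draws with Student-t draws rescaled to unit variance. All names and the choice of 4 degrees of freedom are ours, for illustration only.

```python
import math
import random

random.seed(0)
n_obs = 20000


def chi2_threshold_2dof(confidence_level):
    # With 2 degrees of freedom the chi-squared CDF is 1 - exp(-x / 2),
    # so the quantile has a closed form.
    return -2.0 * math.log(1.0 - confidence_level)


def student_t_unit_variance(df):
    # Student-t draw rescaled to unit variance (requires df > 2).
    chi2 = random.gammavariate(df / 2.0, 2.0)
    return random.gauss(0.0, 1.0) / math.sqrt(chi2 / df) * math.sqrt((df - 2) / df)


def exceedance_rate(draw, confidence_level):
    # Two uncorrelated unit-variance assets: the squared distance is x**2 + y**2.
    threshold = chi2_threshold_2dof(confidence_level)
    hits = sum(draw() ** 2 + draw() ** 2 > threshold for _ in range(n_obs))
    return hits / n_obs


gaussian = exceedance_rate(lambda: random.gauss(0.0, 1.0), 0.99)
fat_tailed = exceedance_rate(lambda: student_t_unit_variance(4), 0.99)
print(gaussian, fat_tailed)
```

Even though both samples have the correct covariance, the fat-tailed one should exceed the threshold noticeably more often than the nominal 1%, which is why the observed rates are best read in relative terms.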

Multi-Portfolio Analysis#

In the evaluations above, we use the default portfolio_weights=None, which computes dynamic inverse-volatility weights at each step as a single default portfolio direction so that high-volatility assets do not dominate the diagnostics. We can also provide explicit test portfolios to evaluate calibration along multiple portfolio directions instead of only this default one. Unlike the Mahalanobis diagnostic, which tests the full covariance structure across all directions, these portfolio diagnostics focus on selected traded directions.
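For intuition, inverse-volatility weights for a single step can be sketched as follows. The three-asset covariance is hypothetical, and the snippet only illustrates the weighting rule (weights proportional to the inverse of each forecast volatility), not how the evaluation function builds them internally.

```python
import numpy as np

# Hypothetical forecast covariance for three assets (daily units).
vols = np.array([0.01, 0.02, 0.04])
corr = np.array([[1.0, 0.3, 0.2], [0.3, 1.0, 0.4], [0.2, 0.4, 1.0]])
cov = corr * np.outer(vols, vols)

# Inverse-volatility weights: proportional to 1 / sigma_i, normalized to sum to 1.
inv_vol = 1.0 / np.sqrt(np.diag(cov))
weights = inv_vol / inv_vol.sum()

# Forecast portfolio variance along this direction.
portfolio_variance = weights @ cov @ weights
print(weights.round(4), float(np.sqrt(portfolio_variance)))
```

The asset with four times the volatility receives a quarter of the weight, so each asset contributes a comparable amount of standalone risk to the test portfolio.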

In practice, these portfolios should be representative of the allocations you trade. Here, we generate random Dirichlet draws for illustration:

n_assets = X.shape[1]
rng = np.random.default_rng(42)
portfolio_weights = rng.dirichlet(np.ones(n_assets), size=30)

ew_multi_portfolio = online_covariance_forecast_evaluation(
    ew_cov,
    X,
    warmup_size=252,
    test_size=1,
    portfolio_weights=portfolio_weights,
)
stvu_multi_portfolio = online_covariance_forecast_evaluation(
    stvu_cov,
    X,
    warmup_size=252,
    test_size=1,
    portfolio_weights=portfolio_weights,
)

Bias Statistic Distribution#

Let’s summarize the bias statistic across the 30 portfolio directions. A tight P5-P95 spread indicates that calibration does not depend strongly on the selected portfolio direction.

multi_portfolio_comparison = CovarianceForecastComparison(
    [ew_multi_portfolio, stvu_multi_portfolio], names=["EWMA Cov", "STVU Cov"]
)
multi_portfolio_comparison.bias_statistic_summary()

	p5	p25	median	p75	p95	mean	n_portfolios
EWMA Cov	1.026164	1.027146	1.027982	1.028803	1.030388	1.028066	30.0
STVU Cov	1.025547	1.028614	1.032159	1.039217	1.046619	1.034172	30.0
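A summary of this kind can be reproduced conceptually with a few lines of NumPy. The sketch below uses simulated returns with a known diagonal covariance as a perfect forecast, so the names and data are ours and the numbers are unrelated to the table above.

```python
import numpy as np

rng = np.random.default_rng(42)
n_obs, n_assets, n_portfolios = 3000, 20, 30

# Simulated returns with a known covariance, used here as a perfect forecast.
vols = rng.uniform(0.01, 0.03, size=n_assets)
cov = np.diag(vols**2)
returns = rng.standard_normal((n_obs, n_assets)) * vols

# Random long-only test portfolios, one per row.
weights = rng.dirichlet(np.ones(n_assets), size=n_portfolios)

# Realized returns of every portfolio, shape (n_obs, n_portfolios).
portfolio_returns = returns @ weights.T
# Forecast volatility of each portfolio: sqrt(w' cov w).
forecast_vol = np.sqrt(np.einsum("pi,ij,pj->p", weights, cov, weights))
# One bias statistic per portfolio direction.
bias = (portfolio_returns / forecast_vol).std(axis=0, ddof=1)

p5, median, p95 = np.percentile(bias, [5, 50, 95])
print(round(float(p5), 3), round(float(median), 3), round(float(p95), 3))
```

With a correct forecast, all three percentiles should cluster around 1.0. A wide gap between P5 and P95 would instead signal that calibration depends on the portfolio direction.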

The calibration plot shows the median bias across portfolios together with the P5-P95 bands:

fig = multi_portfolio_comparison.plot_calibration(diagnostics=["bias"])
show(fig)

The QLIKE plot also includes the P5-P95 bands from the 30 portfolios:

multi_portfolio_comparison.plot_qlike_loss()

Conclusion#

This tutorial showed how to:

Define online covariance estimators supporting partial_fit.
Evaluate them with online_covariance_forecast_evaluation.
Inspect calibration diagnostics and QLIKE.
Compare multiple estimators with CovarianceForecastComparison.
Extend the analysis to multiple portfolio directions.

In the next tutorial, we show how to tune covariance estimator hyperparameters with online search.
