Online Covariance Forecast Evaluation#

This tutorial shows how to evaluate online covariance estimators with online_covariance_forecast_evaluation.

We compare EWCovariance, a plain EWMA covariance, against RegimeAdjustedEWCovariance, its regime-adjusted counterpart based on the Short-Term Volatility Update (STVU) [1].

Both support incremental updates via partial_fit, making them suitable for streaming evaluation. For estimators that do not support partial_fit, the batch counterpart covariance_forecast_evaluation can be used instead.
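To see why incremental updates suit streaming evaluation, here is a minimal pure-Python sketch of an exponentially weighted covariance updated one observation at a time. It is an illustration under simplifying assumptions (zero-mean returns and a given initial matrix), not skfolio's implementation; `ewma_decay`, `ewma_update` and `ewma_batch` are hypothetical helper names.

```python
# Illustrative only: a hand-rolled EWMA covariance, not skfolio's EWCovariance.

def ewma_decay(half_life):
    """Decay factor whose weight halves every `half_life` observations."""
    return 0.5 ** (1.0 / half_life)


def ewma_update(cov, returns, decay):
    """One streaming step: blend the previous matrix with the new outer product."""
    n = len(returns)
    return [
        [decay * cov[i][j] + (1.0 - decay) * returns[i] * returns[j] for j in range(n)]
        for i in range(n)
    ]


def ewma_batch(observations, initial_cov, decay):
    """Same estimate computed in one pass over the whole history."""
    t = len(observations)
    n = len(initial_cov)
    cov = [[decay**t * initial_cov[i][j] for j in range(n)] for i in range(n)]
    # age 0 is the most recent observation, which receives the largest weight.
    for age, r in enumerate(reversed(observations)):
        weight = (1.0 - decay) * decay**age
        for i in range(n):
            for j in range(n):
                cov[i][j] += weight * r[i] * r[j]
    return cov


# Hypothetical two-asset returns and starting matrix.
observations = [[0.01, -0.02], [0.03, 0.01], [-0.015, 0.005]]
initial_cov = [[1e-4, 0.0], [0.0, 1e-4]]
decay = ewma_decay(half_life=40)

streamed = initial_cov
for r in observations:
    streamed = ewma_update(streamed, r, decay)

batched = ewma_batch(observations, initial_cov, decay)
```

The streamed and batched matrices agree up to floating-point error, which is what allows an incremental update to replace refitting on the full history at every step.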

Data#

We load the S&P 500 dataset, which contains the daily prices of 20 assets from the S&P 500 Index composition from 1990-01-02 to 2022-12-28, and keep the returns from 2010 onward.

import numpy as np
from plotly.io import show

from skfolio.datasets import load_sp500_dataset
from skfolio.model_selection import (
    CovarianceForecastComparison,
    online_covariance_forecast_evaluation,
)
from skfolio.moments import (
    EWCovariance,
    RegimeAdjustedEWCovariance,
    RegimeAdjustmentMethod,
)
from skfolio.preprocessing import prices_to_returns

prices = load_sp500_dataset()
X = prices_to_returns(prices)
X = X["2010":]

Covariance Estimators#

We use two covariance estimators:

  • EWCovariance is a plain EWMA covariance. It can react slowly to volatility shocks.

  • RegimeAdjustedEWCovariance adds a regime adjustment via the Short-Term Volatility Update (STVU). This applies a scalar multiplier to better align predicted and realized risk when volatility regimes change faster than a plain EWMA can track.
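As a rough illustration of how a scalar regime multiplier can work, the sketch below tracks an exponentially weighted average of squared standardized returns and takes its square root (a root-mean-square statistic). This is an assumption made for illustration and is not claimed to match the STVU formula in [1] or the internals of RegimeAdjustedEWCovariance; the helper names, the fixed 1% volatilities and the stressed returns are all hypothetical.

```python
import math


def cross_sectional_mean_square(returns, vols):
    """Mean over assets of squared standardized returns (about 1 when calibrated)."""
    z_sq = [(r / v) ** 2 for r, v in zip(returns, vols)]
    return sum(z_sq) / len(z_sq)


def update_multiplier_sq(prev_sq, returns, vols, regime_half_life):
    """EWMA of the mean-square statistic; its square root scales the volatilities."""
    decay = 0.5 ** (1.0 / regime_half_life)
    return decay * prev_sq + (1.0 - decay) * cross_sectional_mean_square(returns, vols)


# Hypothetical forecast: 1% daily volatility for three assets (held fixed here).
vols = [0.01, 0.01, 0.01]
# Hypothetical stress regime: realized moves are twice the forecast volatility.
stressed_returns = [0.02, -0.02, 0.02]

multiplier_sq = 1.0  # start from a calibrated forecast
for _ in range(60):  # three regime half-lives of 20 days
    multiplier_sq = update_multiplier_sq(multiplier_sq, stressed_returns, vols, 20)

multiplier = math.sqrt(multiplier_sq)
adjusted_vols = [multiplier * v for v in vols]
```

With returns persistently at twice the forecast volatility, the multiplier climbs from 1.0 toward 2.0 and scales the forecast volatilities up accordingly. In a real estimator the underlying EWMA volatilities would adapt as well; they are held fixed here to isolate the multiplier.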

We use the same variance half-life of 40 trading days for both estimators and a correlation half-life of 80 trading days for RegimeAdjustedEWCovariance. A shorter variance half-life lets the model adapt faster to volatility shifts, while a longer correlation half-life reduces estimation noise and gives more stable estimates of co-movements, which typically require more data for reliable inference. This choice is also consistent with empirical evidence that volatility tends to mean-revert faster than correlation. The regime adjustment is configured with regime_half_life=20 and RegimeAdjustmentMethod.RMS.

ew_cov = EWCovariance(half_life=40)

stvu_cov = RegimeAdjustedEWCovariance(
    half_life=40,
    corr_half_life=80,
    regime_half_life=20,
    regime_method=RegimeAdjustmentMethod.RMS,
)
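To make the two half-lives concrete, the sketch below converts each into a decay factor and shows how volatilities from a fast EWMA and a correlation matrix from a slow EWMA could be combined into one covariance matrix. The combination step is an assumption for illustration (the library may organize this differently), and the helper names and input values are hypothetical.

```python
import math


def decay_from_half_life(half_life):
    """Decay factor whose weight halves every `half_life` observations."""
    return 0.5 ** (1.0 / half_life)


def combine(vols, corr):
    """Covariance = diag(vols) @ corr @ diag(vols), written out element-wise."""
    n = len(vols)
    return [[vols[i] * corr[i][j] * vols[j] for j in range(n)] for i in range(n)]


var_decay = decay_from_half_life(40)   # faster: tracks volatility shifts
corr_decay = decay_from_half_life(80)  # slower: steadier co-movement estimate

# Weight of a 40-day-old observation relative to the newest one.
var_weight_40d = var_decay ** 40    # exactly one half-life
corr_weight_40d = corr_decay ** 40  # only half of a half-life

# Hypothetical inputs: volatilities from the fast EWMA, correlation from the slow one.
vols = [0.010, 0.020]
corr = [[1.0, 0.3], [0.3, 1.0]]
cov = combine(vols, corr)
```

A 40-day-old observation keeps half of its weight in the variance estimate but about 71% in the correlation estimate, so the correlation effectively averages over a longer history.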

Evaluate Each Estimator#

We now evaluate each estimator with online_covariance_forecast_evaluation. This function performs a walk-forward evaluation. At each step, it updates the estimator with partial_fit and compares the one-step-ahead forecast with the next realized return.

Here, warmup_size=252 reserves the first year for initialization, while test_size=1 evaluates the forecast one day at a time.

ew_evaluation = online_covariance_forecast_evaluation(
    ew_cov,
    X,
    warmup_size=252,
    test_size=1,
)
stvu_evaluation = online_covariance_forecast_evaluation(
    stvu_cov,
    X,
    warmup_size=252,
    test_size=1,
)
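The walk-forward logic can be sketched for a single asset as follows. This is a simplified stand-in, assuming a scalar EWMA variance and synthetic Gaussian returns, and it does not reproduce what online_covariance_forecast_evaluation does internally; it only shows the order of operations (forecast, score, then update).

```python
import math
import random

random.seed(0)
returns = [random.gauss(0.0, 0.01) for _ in range(600)]  # synthetic daily returns

half_life = 40
decay = 0.5 ** (1.0 / half_life)
warmup_size = 252

# Warm-up: initialize the variance from the first `warmup_size` observations.
variance = sum(r * r for r in returns[:warmup_size]) / warmup_size

standardized = []
for r in returns[warmup_size:]:
    # 1. Forecast made before seeing today's return.
    forecast_vol = math.sqrt(variance)
    # 2. Score the forecast against the realized return.
    standardized.append(r / forecast_vol)
    # 3. Only then update the estimator with the new observation.
    variance = decay * variance + (1.0 - decay) * r * r

n_steps = len(standardized)
```

Scoring before updating keeps every forecast strictly out of sample: the return used to judge a forecast never contributes to that forecast.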

Summary Table#

Let’s display the summary of the regime-adjusted covariance forecast evaluation. The four rows are:

  • Mahalanobis ratio evaluates whether the full covariance structure (all eigenvalue directions) is correctly specified. The target is 1.0, with values above 1.0 indicating underestimated risk and values below 1.0 indicating overestimated risk.

  • Diagonal ratio evaluates the individual asset variances only, with the same 1.0 target and interpretation.

  • Portfolio standardized returns evaluate calibration along one portfolio direction rather than across all directions. Their std column is the bias statistic, with values near 1.0 meaning well-calibrated portfolio risk.

  • Portfolio QLIKE evaluates portfolio variance forecasts along one portfolio direction by comparing the forecast portfolio variance with the realized sum of squared portfolio returns over the evaluation window. Lower values indicate better variance forecasts.

stvu_evaluation.summary()
                                     mean     median       std          p5        p95  mad_from_target  target
Mahalanobis ratio                1.410751   0.997069  1.591321    0.265389   3.735248         0.819639  1.0
Diagonal ratio                   1.196073   0.817138  1.309131    0.200601   3.318981         0.744549  1.0
Portfolio standardized returns   0.070449   0.102939  1.033466   -1.669902    1.61604           0.7606  mean=0, std=1
Portfolio QLIKE                 -8.622594   -9.12539  2.387038  -10.752215   -5.20276              NaN  lower is better
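For a single one-day test window, the first two diagnostics can be sketched as follows. The definitions are assumptions chosen so that the target is 1.0 (the squared Mahalanobis distance is divided by the number of assets) and may differ in detail from skfolio's; the two-asset covariance and the return vectors are hypothetical.

```python
def mahalanobis_ratio_2x2(returns, cov):
    """Squared Mahalanobis distance r' inv(cov) r divided by the number of assets."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    r0, r1 = returns
    # Closed-form inverse of a 2x2 matrix applied on both sides of the return vector.
    quad = (d * r0 * r0 - (b + c) * r0 * r1 + a * r1 * r1) / det
    return quad / 2


def diagonal_ratio(returns, cov):
    """Average over assets of squared return divided by forecast variance."""
    n = len(returns)
    return sum(returns[i] ** 2 / cov[i][i] for i in range(n)) / n


# Forecast: 1% daily volatility for both assets, correlation 0.5.
cov = [[1e-4, 0.5e-4], [0.5e-4, 1e-4]]

same_sign = [0.01, 0.01]       # both assets move together
opposite_sign = [0.01, -0.01]  # assets move against the forecast correlation

diag_same = diagonal_ratio(same_sign, cov)
diag_opposite = diagonal_ratio(opposite_sign, cov)
maha_same = mahalanobis_ratio_2x2(same_sign, cov)
maha_opposite = mahalanobis_ratio_2x2(opposite_sign, cov)
```

Both return vectors give a diagonal ratio of 1.0, yet the Mahalanobis ratio is about 0.67 for the same-sign move and 2.0 for the opposite-sign move. Only the Mahalanobis ratio reacts to whether the realized co-movement agrees with the forecast correlation.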


Calibration Plot#

Let’s now plot the rolling calibration diagnostics: the rolling mean of the Mahalanobis ratio, the rolling mean of the diagonal ratio, and the rolling bias statistic from the portfolio standardized returns.

stvu_evaluation.plot_calibration()


Side-by-Side Comparison#

We now compare both evaluations with CovarianceForecastComparison:

comparison = CovarianceForecastComparison(
    [ew_evaluation, stvu_evaluation], names=["EWMA Cov", "STVU Cov"]
)
comparison.summary()
estimator: EWMA Cov
                                     mean     median       std          p5        p95  mad_from_target  target
Mahalanobis ratio                 1.34943   0.959639  1.505114    0.343315   3.433126         0.735578  1.0
Diagonal ratio                   1.074643   0.715882  1.370326     0.21308    2.95529         0.699377  1.0
Portfolio standardized returns   0.067156   0.099519  1.038122   -1.648727   1.585503         0.746656  mean=0, std=1
Portfolio QLIKE                 -8.545705  -9.128461  2.799672  -10.463692  -5.208358              NaN  lower is better

estimator: STVU Cov
                                     mean     median       std          p5        p95  mad_from_target  target
Mahalanobis ratio                1.410751   0.997069  1.591321    0.265389   3.735248         0.819639  1.0
Diagonal ratio                   1.196073   0.817138  1.309131    0.200601   3.318981         0.744549  1.0
Portfolio standardized returns   0.070449   0.102939  1.033466   -1.669902    1.61604           0.7606  mean=0, std=1
Portfolio QLIKE                 -8.622594   -9.12539  2.387038  -10.752215   -5.20276              NaN  lower is better


Bias Statistic#

Let’s plot the bias statistic. It measures whether the portfolio risk forecast is well calibrated, with a target of 1.0. We expect the regime-adjusted model to remain closer to 1.0, especially during stress periods.

comparison.plot_calibration(diagnostics=["bias"])
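The bias statistic itself is simple to compute: it is the standard deviation of returns standardized by their forecast volatility. The sketch below uses synthetic Gaussian returns and constant forecasts, both of which are assumptions for illustration; `bias_statistic` is a hypothetical helper, not a skfolio function.

```python
import math
import random


def bias_statistic(realized_returns, forecast_vols):
    """Sample standard deviation of returns standardized by forecast volatility."""
    z = [r / v for r, v in zip(realized_returns, forecast_vols)]
    mean = sum(z) / len(z)
    return math.sqrt(sum((x - mean) ** 2 for x in z) / (len(z) - 1))


random.seed(1)
true_vol = 0.01
realized = [random.gauss(0.0, true_vol) for _ in range(5000)]

# Forecast equal to the true volatility: the statistic should sit near 1.0.
calibrated = bias_statistic(realized, [true_vol] * len(realized))
# Forecast at half the true volatility: the statistic doubles.
underestimated = bias_statistic(realized, [true_vol / 2] * len(realized))
```

A value above 1.0 means realized moves were larger than forecast (risk underestimated), and a value below 1.0 means the opposite.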


QLIKE Loss#

Let’s now plot the QLIKE loss. It compares the forecast portfolio variance with the realized sum of squared portfolio returns over the evaluation window, with lower values indicating better portfolio variance forecasts. Because STVU rescales the forecast toward realized risk, we generally expect it to achieve a lower QLIKE.

comparison.plot_qlike_loss()
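A common form of the QLIKE loss is the log of the forecast variance plus the ratio of realized to forecast variance. The sketch below assumes this form (the library may use a variant that differs by constants) and a hypothetical realized variance, and shows that the loss is lowest when the forecast matches the realized value.

```python
import math


def qlike(forecast_variance, realized_variance):
    """QLIKE loss: log of the forecast plus the realized-to-forecast ratio."""
    return math.log(forecast_variance) + realized_variance / forecast_variance


realized = 1e-4  # hypothetical realized variance (1% daily volatility, squared)

loss_exact = qlike(1e-4, realized)       # forecast matches realized variance
loss_too_low = qlike(0.5e-4, realized)   # risk underestimated by half
loss_too_high = qlike(2e-4, realized)    # risk overestimated by a factor of two
```

The exact forecast has the lowest loss, and underestimating variance by a factor of two costs more than overestimating it by the same factor. This asymmetry is why QLIKE penalizes forecasts that are too optimistic about risk more heavily.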


Exceedance Rates#

We can also display the exceedance summary. If the covariance forecast were perfectly calibrated and returns were Gaussian, the squared Mahalanobis distance would follow a chi-squared distribution. The exceedance rate measures how often this distance exceeds the chi-squared quantile at a given confidence level, so the nominal rate is one minus the confidence level (5% at 0.95 and 1% at 0.99). The deviation column is the observed rate minus this nominal rate.

In practice, daily equity returns are fat-tailed, so in this example both estimators exceed the nominal rates by a wide margin. This metric is therefore more useful for comparing estimators than for making an absolute calibration statement.

comparison.exceedance_summary()
                  EWMA Cov                  STVU Cov
confidence_level  observed_rate  deviation  observed_rate  deviation
0.95                   0.227966   0.177966       0.270378   0.220378
0.99                   0.169649   0.159649       0.201789   0.191789
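The effect of fat tails on exceedance rates can be reproduced on synthetic data. The sketch below assumes two assets with an identity covariance, for which the chi-squared quantile has a closed form, and compares Gaussian returns with a hypothetical scale mixture that has the same variance but heavier tails. None of these names come from skfolio.

```python
import math
import random


def chi2_threshold_2dof(confidence_level):
    """Exact chi-squared quantile for 2 degrees of freedom."""
    return -2.0 * math.log(1.0 - confidence_level)


def exceedance_rate(squared_distances, confidence_level):
    """Share of squared distances above the chi-squared threshold."""
    threshold = chi2_threshold_2dof(confidence_level)
    return sum(d2 > threshold for d2 in squared_distances) / len(squared_distances)


def fat_tailed_distance(rng, calm_scale):
    """Squared distance when 10% of days have three times the usual volatility."""
    scale = 3.0 if rng.random() < 0.1 else calm_scale
    return (scale * rng.gauss(0, 1)) ** 2 + (scale * rng.gauss(0, 1)) ** 2


rng = random.Random(2)
n_obs = 20000

# Gaussian returns with identity covariance: squared distance is chi-squared(2).
gaussian = [rng.gauss(0, 1) ** 2 + rng.gauss(0, 1) ** 2 for _ in range(n_obs)]

# Calm-day scale chosen so the mixture keeps unit variance overall.
calm_scale = math.sqrt((1.0 - 0.1 * 9.0) / 0.9)
fat_tailed = [fat_tailed_distance(rng, calm_scale) for _ in range(n_obs)]

gaussian_rate = exceedance_rate(gaussian, 0.95)
fat_tailed_rate = exceedance_rate(fat_tailed, 0.95)
```

Even though the covariance is correct in both cases, only the Gaussian sample lands near the nominal 5% rate. The fat-tailed sample exceeds it, which is why the observed rates are best read as a relative comparison between estimators.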


Multi-Portfolio Analysis#

In the evaluations above, we use the default portfolio_weights=None, which computes dynamic inverse-volatility weights at each step as a single default portfolio direction so that high-volatility assets do not dominate the diagnostics. We can also provide explicit test portfolios to evaluate calibration along multiple portfolio directions instead of only this default one. Unlike the Mahalanobis diagnostic, which tests the full covariance structure across all directions, these portfolio diagnostics focus on selected traded directions.

In practice, these portfolios should be representative of the allocations you trade. Here, we generate random Dirichlet draws for illustration:

n_assets = X.shape[1]
rng = np.random.default_rng(42)
portfolio_weights = rng.dirichlet(np.ones(n_assets), size=30)
ew_multi_portfolio = online_covariance_forecast_evaluation(
    ew_cov,
    X,
    warmup_size=252,
    test_size=1,
    portfolio_weights=portfolio_weights,
)
stvu_multi_portfolio = online_covariance_forecast_evaluation(
    stvu_cov,
    X,
    warmup_size=252,
    test_size=1,
    portfolio_weights=portfolio_weights,
)
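Both ways of choosing portfolio directions can be sketched in a few lines. The code below assumes that inverse-volatility weights are proportional to one over the forecast volatility and normalized to sum to 1, and it draws flat Dirichlet weights by normalizing unit-rate exponential draws; the helper names and the two-asset covariance are hypothetical.

```python
import math
import random


def inverse_volatility_weights(cov):
    """Weights proportional to 1 / volatility, normalized to sum to 1."""
    inv_vols = [1.0 / math.sqrt(cov[i][i]) for i in range(len(cov))]
    total = sum(inv_vols)
    return [x / total for x in inv_vols]


def flat_dirichlet_weights(n_assets, rng):
    """One draw from Dirichlet(1, ..., 1): normalized unit-rate exponentials."""
    draws = [rng.expovariate(1.0) for _ in range(n_assets)]
    total = sum(draws)
    return [d / total for d in draws]


def portfolio_variance(weights, cov):
    """Forecast variance w' cov w along one portfolio direction."""
    n = len(weights)
    return sum(weights[i] * cov[i][j] * weights[j] for i in range(n) for j in range(n))


# Hypothetical forecast: asset 0 has 1% daily volatility, asset 1 has 3%.
cov = [[1e-4, 0.0], [0.0, 9e-4]]

inv_vol = inverse_volatility_weights(cov)
inv_vol_variance = portfolio_variance(inv_vol, cov)

rng = random.Random(42)
random_portfolios = [flat_dirichlet_weights(2, rng) for _ in range(30)]
random_variances = [portfolio_variance(w, cov) for w in random_portfolios]
```

The inverse-volatility weights are 0.75 and 0.25, so each asset contributes the same volatility and the riskier asset does not dominate. The Dirichlet draws are long-only and fully invested, and each one yields a different forecast variance against which realized portfolio returns can be checked.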

Bias Statistic Distribution#

Let’s summarize the bias statistic across the 30 portfolio directions. A tight P5-P95 spread indicates that calibration does not depend strongly on the selected portfolio direction.

multi_portfolio_comparison = CovarianceForecastComparison(
    [ew_multi_portfolio, stvu_multi_portfolio], names=["EWMA Cov", "STVU Cov"]
)
multi_portfolio_comparison.bias_statistic_summary()
                p5       p25    median       p75       p95      mean  n_portfolios
EWMA Cov  1.026164  1.027146  1.027982  1.028803  1.030388  1.028066          30.0
STVU Cov  1.025547  1.028614  1.032159  1.039217  1.046619  1.034172          30.0
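To illustrate what a wide P5-P95 spread would reveal, the sketch below builds a deliberately miscalibrated forecast on synthetic data: two uncorrelated assets where the second asset's volatility is underestimated by 20%. All values and helper names are hypothetical, and the percentiles come from the standard library rather than from skfolio.

```python
import math
import random
import statistics

rng = random.Random(7)
n_obs, n_portfolios = 2000, 30
true_vols = [0.01, 0.02]       # hypothetical, uncorrelated assets
forecast_vols = [0.01, 0.016]  # second asset's risk is underestimated by 20%

returns = [[rng.gauss(0.0, v) for v in true_vols] for _ in range(n_obs)]


def random_weights():
    """One flat Dirichlet draw via normalized exponentials."""
    draws = [rng.expovariate(1.0) for _ in true_vols]
    return [d / sum(draws) for d in draws]


def bias_for(weights):
    """Bias statistic of one portfolio under the (diagonal) forecast covariance."""
    forecast_vol = math.sqrt(sum((w * v) ** 2 for w, v in zip(weights, forecast_vols)))
    z = [sum(w * r for w, r in zip(weights, row)) / forecast_vol for row in returns]
    return statistics.stdev(z)


biases = [bias_for(random_weights()) for _ in range(n_portfolios)]
cuts = statistics.quantiles(biases, n=20)  # 19 cut points: 5%, 10%, ..., 95%
p5, median, p95 = cuts[0], statistics.median(biases), cuts[-1]
```

Portfolios concentrated in the first asset have a bias near 1.0, while those concentrated in the second approach 1.25, so the P5-P95 band is wide. A tight band, as in the table above, indicates that no such direction-specific error is present.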


The calibration plot shows the median bias across portfolios together with the P5-P95 bands:

fig = multi_portfolio_comparison.plot_calibration(diagnostics=["bias"])
show(fig)

The QLIKE plot also includes the P5-P95 bands from the 30 portfolios:

multi_portfolio_comparison.plot_qlike_loss()


Conclusion#

This tutorial showed how to:

  1. Define online covariance estimators supporting partial_fit.

  2. Evaluate them with online_covariance_forecast_evaluation.

  3. Inspect calibration diagnostics and QLIKE.

  4. Compare multiple estimators with CovarianceForecastComparison.

  5. Extend the analysis to multiple portfolio directions.

In the next tutorial, we show how to tune covariance estimator hyperparameters with online search.
