skfolio.moments.RegimeAdjustedEWVariance

class skfolio.moments.RegimeAdjustedEWVariance(half_life=40, hac_lags=None, regime_method=FIRST_MOMENT, regime_half_life=None, regime_multiplier_clip=(0.7, 1.6), regime_min_observations=None, min_observations=None, assume_centered=True)

Exponentially weighted variance estimator with regime adjustment via the Short-Term Volatility Update (STVU) [1].

This is the variance-only counterpart of RegimeAdjustedEWCovariance, assuming zero correlation. This is appropriate when:

Estimating idiosyncratic (specific) risk in factor models, where residual returns are uncorrelated by construction
Working with orthogonalized or uncorrelated return series
The full covariance structure is not needed or is constructed separately

This estimator computes per-asset exponentially weighted variances and applies a scalar multiplier $\phi_t$ to improve risk calibration when volatility regimes change more quickly than a plain EWMA can track.

Additionally, this estimator supports optional Newey-West HAC (Heteroskedasticity and Autocorrelation Consistent) correction via the hac_lags parameter. This adjusts for serial correlation in returns.

NaN handling:

The estimator handles missing data (NaN returns) caused by late listings, delistings, and holidays using EWMA updates together with active_mask. An asset with active_mask=True is treated as active at time $t$. If its return is finite, the EWMA is updated normally. If its return is NaN, the observation is treated as a holiday and the previous variance is kept. An asset with active_mask=False is treated as inactive, for example during pre-listing or post-delisting periods, and its variance is set to NaN.

Active with valid return: Normal EWMA update.
Active with NaN return (holiday): Freeze; the previous variance is kept.
Inactive (active_mask=False): Variance is set to NaN.

When active_mask is not provided, trailing NaN returns are treated as holidays and the variance is frozen. When an asset becomes active again after an inactive period, its variance restarts from a zero prior and receives per-asset bias correction at output time.
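The three cases above can be sketched for a single asset as a small state update. This is an illustrative sketch, not the library's implementation: the helper `step`, the update form `lam * S + (1 - lam) * r**2`, and the choice to reset the observation count to zero while the asset is inactive are assumptions made for the example.

```python
import math


def step(S, n, r, active, lam):
    """One EWMA step for one asset (hypothetical helper).

    S is the raw accumulator, n the count of valid observations.
    """
    if not active:
        # Inactive (pre-listing or post-delisting): variance becomes NaN.
        return math.nan, 0
    if not math.isfinite(r):
        # Active with NaN return (holiday): freeze the previous state.
        return S, n
    if not math.isfinite(S):
        # Reactivation after an inactive period: restart from a zero prior.
        S, n = 0.0, 0
    # Active with a valid return: normal EWMA update.
    return lam * S + (1.0 - lam) * r * r, n + 1


lam = 2.0 ** (-1.0 / 40)
returns = [0.01, math.nan, 0.02, math.nan, 0.03]
active = [True, True, True, False, True]

S, n = 0.0, 0
history = []
for r, a in zip(returns, active):
    S, n = step(S, n, r, a, lam)
    history.append((S, n))
```

In this sketch the second step leaves the state unchanged, the fourth step produces NaN, and the fifth step restarts with a single observation.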

Late-listing bias correction:

The EWMA recursion is initialized at zero for every asset. This zero-initialization introduces a transient downward scale bias: after $n_i$ valid observations, the raw EWMA weights sum to $(1 - \lambda^{n_i})$ instead of 1. At output time, a per-asset correction removes this bias:

\[\hat{\sigma}^2_i = \frac{S_i}{1 - \lambda^{n_i}}\]

where $S_i$ is the raw internal EWMA accumulator. For assets with a long history, the correction is negligible ($\lambda^{n_i} \to 0$).
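As a minimal sketch of this correction, the function below runs a zero-initialized EWMA of squared returns for one asset and then divides by $1 - \lambda^{n_i}$. The function name and the update form `lam * S + (1 - lam) * r**2` are assumptions chosen so that the raw weights sum to $1 - \lambda^{n_i}$; they are not taken from the library source.

```python
import numpy as np


def ewma_variance_with_correction(returns, half_life):
    """Return (raw accumulator S, bias-corrected variance, n valid obs)."""
    lam = 2.0 ** (-1.0 / half_life)
    S, n = 0.0, 0
    for r in returns:
        if np.isfinite(r):
            S = lam * S + (1.0 - lam) * r**2  # zero-initialized recursion
            n += 1
    corrected = S / (1.0 - lam**n) if n > 0 else np.nan
    return S, corrected, n


# A constant 2% return has a true second moment of 0.0004.
half_life = 40
lam = 2.0 ** (-1.0 / half_life)
raw, corrected, n = ewma_variance_with_correction(np.full(10, 0.02), half_life)
print(raw, corrected, n)
```

With only 10 observations the raw accumulator is well below 0.0004, while the corrected value recovers it, because the ratio of the two is exactly $1 - \lambda^{10}$.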

The min_observations parameter controls a warm-up period: an asset’s variance estimate remains NaN in the output until it has accumulated enough valid observations for a reliable estimate.

Estimation universe for STVU:

An optional estimation_mask defines the estimation universe used for the cross-sectional STVU statistic without affecting per-asset EWMA variance updates. The STVU is computed in a one-step-ahead manner: the return observed at time $t$ is standardized by the bias-corrected variance estimate available at time $t-1$, and only assets that were already above min_observations before time $t$ contribute to the regime signal. This is important because the STVU multiplier is derived from a cross-sectional average of standardized squared returns: noisy or illiquid assets with unreliable variance estimates can inflate or deflate the statistic, distorting the regime multiplier applied to all variances.
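The cross-sectional statistic described above can be sketched for a single date as follows. This is only an illustration under stated assumptions: the function name and arguments are hypothetical, "above min_observations" is read as `n_prev >= min_observations`, and the mapping from this statistic to the multiplier $\phi$ (which depends on regime_method and on smoothing) is not reproduced.

```python
import numpy as np


def cross_sectional_statistic(r_t, var_prev, n_prev, min_observations,
                              estimation_mask_t=None):
    """Mean of one-step-ahead standardized squared returns (hypothetical)."""
    r_t = np.asarray(r_t, dtype=float)
    var_prev = np.asarray(var_prev, dtype=float)
    n_prev = np.asarray(n_prev)
    # Only assets with a usable prior variance and enough history contribute.
    eligible = (
        np.isfinite(r_t)
        & np.isfinite(var_prev)
        & (var_prev > 0)
        & (n_prev >= min_observations)
    )
    if estimation_mask_t is not None:
        eligible &= np.asarray(estimation_mask_t, dtype=bool)
    if not eligible.any():
        return np.nan
    # Return at t standardized by the variance available at t-1.
    z2 = r_t[eligible] ** 2 / var_prev[eligible]
    return float(z2.mean())


r_t = [0.02, 0.02, np.nan, 0.05]
var_prev = [0.0004, 0.0001, 0.0004, 0.0001]
n_prev = [50, 50, 50, 3]

stat_all = cross_sectional_statistic(r_t, var_prev, n_prev, 10)
stat_masked = cross_sectional_statistic(
    r_t, var_prev, n_prev, 10, estimation_mask_t=[True, False, True, True]
)
```

Here the third asset is skipped because its return is NaN and the fourth because it has too little history. Masking out the second asset changes the statistic without touching any per-asset variance.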

Parameters:

half_life : float, default=40

Half-life of the exponential weights in number of observations.

The half-life controls how quickly older observations lose their influence:

Larger half-life: More stable estimates, slower to adapt (robust to noise)
Smaller half-life: More responsive estimates, faster to adapt (sensitive to noise)

The decay factor $\lambda$ is computed as: $\lambda = 2^{-1/\text{half-life}}$

For example:

half-life = 40: $\lambda \approx 0.983$
half-life = 23: $\lambda \approx 0.970$
half-life = 11: $\lambda \approx 0.939$
half-life = 6: $\lambda \approx 0.891$
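The values listed above follow directly from the formula. A short check, where `decay_factor` is a hypothetical helper name used only for this example:

```python
def decay_factor(half_life: float) -> float:
    """Return lambda = 2 ** (-1 / half_life)."""
    if half_life <= 0:
        raise ValueError("half_life must be positive")
    return 2.0 ** (-1.0 / half_life)


for hl in (40, 23, 11, 6):
    lam = decay_factor(hl)
    # By construction, an observation's weight halves after `hl` steps.
    print(f"half_life={hl:>2}  lambda={lam:.3f}  lambda**half_life={lam ** hl:.3f}")
```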

Note

For portfolio optimization, larger half-lives (>= 20) are generally preferred to avoid excessive turnover from estimation noise.

hac_lags : int, optional

Number of lags for Newey-West HAC (Heteroskedasticity and Autocorrelation Consistent) correction. If None (default), no HAC correction is applied.

When enabled, the variance update uses HAC-adjusted squared returns instead of simple squared returns, accounting for autocorrelation:

\[\text{hac\_var}_i = r_{i,t}^2 + 2 \sum_{j=1}^{L} w_j \cdot r_{i,t} \cdot r_{i,t-j}\]

where $w_j = 1 - j/(L+1)$ is the Bartlett kernel weight.
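The formula can be evaluated for a single asset as in the sketch below. The function name is hypothetical, and truncating the sum when fewer than $L$ past returns are available is an assumption made for the example, not a statement about how the library handles short histories.

```python
import numpy as np


def hac_squared_return(history, hac_lags):
    """HAC-adjusted squared return for the latest observation.

    `history` holds one asset's returns in time order, so history[-1] is r_t.
    """
    history = np.asarray(history, dtype=float)
    r_t = history[-1]
    value = r_t**2
    for j in range(1, hac_lags + 1):
        if j >= len(history):
            break  # assumption: ignore lags that are not available yet
        w_j = 1.0 - j / (hac_lags + 1)  # Bartlett kernel weight
        value += 2.0 * w_j * r_t * history[-1 - j]
    return value


# r_t = 0.03, r_{t-1} = -0.02, r_{t-2} = 0.01, L = 2
hac_value = hac_squared_return([0.01, -0.02, 0.03], hac_lags=2)
print(hac_value)
```

For these inputs the weights are 2/3 and 1/3, giving 0.0009 - 0.0008 + 0.0002 = 0.0003. For a single observation the adjusted value can be negative when recent returns have opposite signs.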

Typical values:

Daily equity data: 3-5 lags (weak autocorrelation from microstructure)
High-frequency data: 5-10 lags (stronger autocorrelation)
Monthly data: 1-2 lags

Must be a positive integer if specified.

regime_method : RegimeAdjustmentMethod, default=RegimeAdjustmentMethod.FIRST_MOMENT

Method used to transform the update statistic into the volatility multiplier $\phi$:

LOG: Robust to outliers (log compresses extremes)
FIRST_MOMENT: Calibrates the first moment of the standardized risk statistic
RMS: $\chi^2$ calibration (sensitive to extremes)

regime_half_life : float, optional

Half-life for smoothing the volatility regime signal, in number of observations.

The regime signal is built from one-step-ahead standardized returns and then transformed into the multiplier $\phi$ according to regime_method. A shorter regime_half_life makes the multiplier react faster to abrupt changes in realized risk. A longer one produces a smoother, slower moving adjustment.

If None (default), it is automatically calibrated as: $\text{regime-half-life} = 0.5 \times \text{half-life}$

This makes the STVU more responsive (shorter half-life) than the variance, allowing it to quickly rescale risk when realized volatility deviates from the slower EWMA estimate.

regime_multiplier_clip : tuple[float, float] or None, default=(0.7, 1.6)

Bounds used to clip the regime multiplier $\phi$ and avoid extreme swings. Set to None to disable clipping. Because this estimator is variance-only, the multiplier scales each variance as $\phi^2 \hat{\sigma}^2_i$.

Default bounds rationale:

Lower bound (0.7): Limits volatility reduction to 30%, equivalent to a minimum variance scale of $0.7^2 = 0.49$
Upper bound (1.6): Limits volatility increase to 60%, equivalent to a maximum variance scale of $1.6^2 = 2.56$
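The effect of the bounds on the variance scale can be checked with a short sketch. The helper name is hypothetical and the code only illustrates the arithmetic of clipping $\phi$ and squaring it:

```python
import numpy as np


def clipped_variance_scale(phi, clip=(0.7, 1.6)):
    """Return (clipped phi, variance scale phi**2)."""
    if clip is not None:
        phi = float(np.clip(phi, clip[0], clip[1]))
    return phi, phi**2


for raw_phi in (0.5, 1.1, 2.0):
    phi, scale = clipped_variance_scale(raw_phi)
    print(f"raw phi={raw_phi:.1f}  clipped phi={phi:.1f}  variance scale={scale:.2f}")
```

A raw multiplier of 2.0 is capped at 1.6, so variances are scaled by 2.56 rather than 4.0. Passing `clip=None` leaves the multiplier unchanged.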

regime_min_observations : int, optional

Minimum number of one-step-ahead comparisons required before the STVU is enabled. Until then, the regime multiplier defaults to 1.0 (no adjustment).

If None (default), it is automatically set to int(regime_half_life), ensuring the STVU EWMA has seen roughly one half-life of data before being applied.

min_observations : int, optional

Minimum number of valid observations per asset before its variance estimate is considered reliable and exposed in the output variance_. Until this threshold is reached, the asset’s variance estimate remains NaN.

The default (None) uses int(half_life) as the threshold. After one half-life of valid observations, $\lambda^{n_i} \approx 0.5$, so the raw EWMA weights sum to about 0.5 and the per-asset correction rescales by a factor of about 2. Set to 1 to disable warm-up entirely.

assume_centered : bool, default=True

If True (default), the EWMA update uses raw returns without demeaning. This is the standard convention for EWMA variance estimation in finance. If False, returns are demeaned using an EWMA mean estimate before computing the variance update, and location_ tracks the EWMA mean.

Note

For factor model residuals, centering is typically not needed as residuals should already have zero mean by construction. Set to False only if residuals exhibit persistent non-zero means.

Attributes:

variance_ : ndarray of shape (n_assets,): Estimated regime-adjusted variances.
regime_multiplier_ : float: The volatility regime adjustment factor applied. Equal to 1.0 if insufficient data or no regime adjustment needed.
location_ : ndarray of shape (n_assets,): Estimated location, i.e. the estimated mean. When assume_centered=True, this is zero. When assume_centered=False, this is the EWMA mean estimate.
n_features_in_ : int: Number of assets seen during fit.
feature_names_in_ : ndarray of shape (n_features_in_,): Names of features seen during fit. Defined only when X has feature names that are all strings.

Methods

`fit`(X[, y, estimation_mask, active_mask])	Fit the Regime-Adjusted Exponentially Weighted Variance estimator.
`get_metadata_routing`()	Get metadata routing of this object.
`get_params`([deep])	Get parameters for this estimator.
`partial_fit`(X[, y, estimation_mask, active_mask])	Incrementally fit the estimator with new observations.
`set_fit_request`(*[, active_mask, ...])	Configure whether metadata should be requested to be passed to the `fit` method.
`set_params`(**params)	Set the parameters of this estimator.
`set_partial_fit_request`(*[, active_mask, ...])	Configure whether metadata should be requested to be passed to the `partial_fit` method.

References

[1]

“The Elements of Quantitative Investing”, Wiley Finance, Giuseppe Paleologo (2025).

[2]

“Multivariate exponentially weighted moving covariance matrix”, Technometrics, Hawkins & Maboudou-Tchao (2008).

Examples

>>> import numpy as np
>>> from skfolio.datasets import load_sp500_dataset
>>> from skfolio.moments import RegimeAdjustedEWVariance, RegimeAdjustmentMethod
>>> from skfolio.preprocessing import prices_to_returns
>>>
>>> prices = load_sp500_dataset()
>>> X = prices_to_returns(prices)
>>> # Standard EWMA with STVU
>>> model = RegimeAdjustedEWVariance(half_life=23)
>>> model.fit(X)
>>> print(model.regime_multiplier_)
>>>
>>> # With LOG method for robustness to outliers
>>> model2 = RegimeAdjustedEWVariance(
...     half_life=11,
...     regime_method=RegimeAdjustmentMethod.LOG
... )
>>> model2.fit(X)
>>>
>>> # With Newey-West HAC correction for autocorrelation
>>> model3 = RegimeAdjustedEWVariance(
...     half_life=23,
...     hac_lags=5    # 5-lag Newey-West correction
... )
>>> model3.fit(X)
>>>
>>> # With an estimation universe focused on specific assets
>>> estimation_mask = np.ones((len(X), X.shape[1]), dtype=bool)
>>> estimation_mask[:, :5] = False  # Exclude first 5 assets from STVU
>>> model4 = RegimeAdjustedEWVariance(half_life=23)
>>> model4.fit(X, estimation_mask=estimation_mask)

fit(X, y=None, *, estimation_mask=None, active_mask=None)

Fit the Regime-Adjusted Exponentially Weighted Variance estimator.

Parameters:

X : array-like of shape (n_observations, n_assets)

Idiosyncratic (specific) residual returns per asset, typically obtained from a factor model regression. NaN values are allowed and handled robustly.

y : Ignored

Not used, present for API consistency by convention.

estimation_mask : array-like of shape (n_observations, n_assets), optional

Boolean mask indicating which active assets should belong to the estimation universe for the cross-sectional STVU statistic on each day.

If None (default), all active assets with finite returns are used.
If provided, only assets where the mask is True contribute to the regime multiplier calculation on that day.

Per-asset EWMA variance updates still use all active assets with finite returns; this parameter only affects the cross-sectional regime adjustment calculation.

Use cases:

Focus on liquid assets to reduce noise from thinly traded securities
Exclude assets with suspected data quality issues
Match the estimation universe used in downstream models

active_mask : array-like of shape (n_observations, n_assets), optional

Boolean mask indicating whether each asset is structurally active at each observation. Use this to distinguish between holidays (active_mask=True and NaN return: variance is frozen) and inactive periods such as pre-listing or post-delisting (active_mask=False: variance is set to NaN). If None (default), all pairs are assumed active and NaN returns are treated as holidays (variance frozen).

When an asset becomes active again after an inactive period, its variance restarts from a zero prior with per-asset bias correction.

Returns:

self : RegimeAdjustedEWVariance: Fitted estimator.

get_metadata_routing()

Get metadata routing of this object.

Please check the User Guide on how the routing mechanism works.

Returns:

routing : MetadataRequest: A MetadataRequest encapsulating routing information.

get_params(deep=True)

Get parameters for this estimator.

Parameters:

deep : bool, default=True: If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns:

params : dict: Parameter names mapped to their values.

partial_fit(X, y=None, *, estimation_mask=None, active_mask=None)

Incrementally fit the estimator with new observations.

This method allows online/streaming updates to the variance estimates.
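The idea behind incremental updates can be illustrated with a plain NumPy sketch: because the EWMA recursion only needs the previous accumulator and observation count, feeding the data in chunks gives the same result as a single pass. `StreamingEWVariance` is a hypothetical toy class written for this example; it omits the regime adjustment, HAC correction, and masks, and it is not the library's implementation.

```python
import numpy as np


class StreamingEWVariance:
    """Toy per-asset EWMA variance with chunked updates (hypothetical)."""

    def __init__(self, half_life):
        self.lam = 2.0 ** (-1.0 / half_life)
        self.S = None  # raw accumulators, one per asset
        self.n = None  # valid observation counts, one per asset

    def partial_fit(self, X):
        X = np.atleast_2d(np.asarray(X, dtype=float))
        if self.S is None:
            self.S = np.zeros(X.shape[1])
            self.n = np.zeros(X.shape[1], dtype=int)
        for row in X:
            ok = np.isfinite(row)  # NaN returns leave the state frozen
            self.S[ok] = self.lam * self.S[ok] + (1.0 - self.lam) * row[ok] ** 2
            self.n[ok] += 1
        return self

    @property
    def variance_(self):
        out = np.full(self.S.shape, np.nan)
        seen = self.n > 0
        out[seen] = self.S[seen] / (1.0 - self.lam ** self.n[seen])
        return out


rng = np.random.default_rng(0)
X = rng.normal(scale=0.01, size=(100, 3))
X[5, 1] = np.nan

batch = StreamingEWVariance(23).partial_fit(X).variance_
chunked = StreamingEWVariance(23).partial_fit(X[:60]).partial_fit(X[60:]).variance_
```

Both arrays agree, which is the property that makes streaming updates practical: only the accumulated state has to be kept between calls, not the full return history.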

Parameters:

X : array-like of shape (n_observations, n_assets): Idiosyncratic (specific) residual returns per asset. NaN values are allowed and handled robustly.
y : Ignored: Not used, present for API consistency by convention.
estimation_mask : array-like of shape (n_observations, n_assets), optional: Boolean mask indicating which active assets belong to the estimation universe for the cross-sectional STVU statistic on each day. See fit for details.
active_mask : array-like of shape (n_observations, n_assets), optional: Boolean mask indicating whether each asset is structurally active at each observation. See fit for details.

Returns:

self : RegimeAdjustedEWVariance: Fitted estimator.

set_fit_request(*, active_mask='$UNCHANGED$', estimation_mask='$UNCHANGED$')

Configure whether metadata should be requested to be passed to the fit method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config). Please check the User Guide on how the routing mechanism works.

The options for each parameter are:

True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to fit.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:

active_mask : str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED: Metadata routing for active_mask parameter in fit.
estimation_mask : str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED: Metadata routing for estimation_mask parameter in fit.

Returns:

self : object: The updated object.

set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Parameters:

**params : dict: Estimator parameters.

Returns:

self : estimator instance: Estimator instance.

set_partial_fit_request(*, active_mask='$UNCHANGED$', estimation_mask='$UNCHANGED$')

Configure whether metadata should be requested to be passed to the partial_fit method.

The options for each parameter are:

True: metadata is requested, and passed to partial_fit if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to partial_fit.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:

active_mask : str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED: Metadata routing for active_mask parameter in partial_fit.
estimation_mask : str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED: Metadata routing for estimation_mask parameter in partial_fit.

Returns:

self : object: The updated object.