skfolio.moments.RegimeAdjustedEWVariance
- class skfolio.moments.RegimeAdjustedEWVariance(half_life=40, hac_lags=None, regime_method=FIRST_MOMENT, regime_half_life=None, regime_multiplier_clip=(0.7, 1.6), regime_min_observations=None, min_observations=None, assume_centered=True)
Exponentially weighted variance estimator with regime adjustment via the Short-Term Volatility Update (STVU) [1].
This is the variance-only counterpart of RegimeAdjustedEWCovariance, assuming zero correlation. This is appropriate when:
Estimating idiosyncratic (specific) risk in factor models, where residual returns are uncorrelated by construction
Working with orthogonalized or uncorrelated return series
The full covariance structure is not needed or is constructed separately
This estimator computes per-asset exponentially weighted variances and applies a scalar multiplier \(\phi_t\) to improve risk calibration when volatility regimes change more quickly than a plain EWMA can track.
Additionally, this estimator supports optional Newey-West HAC (Heteroskedasticity and Autocorrelation Consistent) correction via the hac_lags parameter, which adjusts for serial correlation in returns.
NaN handling:
The estimator handles missing data (NaN returns) caused by late listings, delistings, and holidays using EWMA updates together with active_mask. An asset with active_mask=True is treated as active at time \(t\). If its return is finite, the EWMA is updated normally. If its return is NaN, the observation is treated as a holiday and the previous variance is kept. An asset with active_mask=False is treated as inactive, for example during pre-listing or post-delisting periods, and its variance is set to NaN.
Active with valid return: Normal EWMA update.
Active with NaN return (holiday): Freeze; the previous variance is kept.
Inactive (active_mask=False): Variance is set to NaN.
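The three cases above can be sketched for a single time step as follows. This is a minimal NumPy illustration, not the library's implementation: the function name `ewma_variance_step` and the use of a raw accumulator array `acc` are assumptions made for the example.

```python
import numpy as np

def ewma_variance_step(acc, ret, active, decay):
    """One illustrative EWMA variance update across assets (hypothetical helper)."""
    acc = np.asarray(acc, dtype=float)
    ret = np.asarray(ret, dtype=float)
    active = np.asarray(active, dtype=bool)

    new_acc = acc.copy()  # active with NaN return: previous value is kept
    valid = active & np.isfinite(ret)
    # Assets coming back from an inactive period (NaN state) restart from zero.
    prior = np.where(np.isnan(acc), 0.0, acc)
    # Active with a finite return: normal EWMA update.
    new_acc[valid] = decay * prior[valid] + (1.0 - decay) * ret[valid] ** 2
    # Inactive: the variance is set to NaN.
    new_acc[~active] = np.nan
    return new_acc

decay = 2.0 ** (-1.0 / 40)
acc = np.array([4e-4, 4e-4, 4e-4])
ret = np.array([0.03, np.nan, 0.03])
active = np.array([True, True, False])
updated = ewma_variance_step(acc, ret, active, decay)
print(updated)
```

The first asset is updated, the second is frozen at its previous value, and the third becomes NaN.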
When active_mask is not provided, trailing NaN returns are treated as holidays and the variance is frozen. When an asset becomes active again after an inactive period, its variance restarts from a zero prior and receives per-asset bias correction at output time.
Late-listing bias correction:
The EWMA recursion is initialized at zero for every asset. This zero-initialization introduces a transient downward scale bias: after \(n_i\) valid observations, the raw EWMA weights sum to \((1 - \lambda^{n_i})\) instead of 1. At output time, a per-asset correction removes this bias:
\[\hat{\sigma}^2_i = \frac{S_i}{1 - \lambda^{n_i}}\]
where \(S_i\) is the raw internal EWMA accumulator. For assets with a long history, the correction is negligible (\(\lambda^{n_i} \to 0\)).
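As a quick numerical check of the correction, the sketch below runs a zero-initialized EWMA over a constant squared return, for which the bias is exact. The helper name `raw_ewma` and the sample values are assumptions chosen for illustration.

```python
import numpy as np

half_life = 40
decay = 2.0 ** (-1.0 / half_life)

def raw_ewma(squared_returns, decay):
    """Zero-initialized EWMA accumulator S_i (hypothetical helper)."""
    acc = 0.0
    for x in squared_returns:
        acc = decay * acc + (1.0 - decay) * x
    return acc

n_obs = 10
squared_returns = np.full(n_obs, 4e-4)  # constant squared return of (2%)^2

raw = raw_ewma(squared_returns, decay)
weight_sum = 1.0 - decay ** n_obs  # what the raw weights sum to
corrected = raw / weight_sum       # per-asset bias correction

print(f"raw={raw:.3e}, weight_sum={weight_sum:.4f}, corrected={corrected:.3e}")
```

With only 10 observations the raw accumulator is far below the true level, while the corrected value recovers it. After exactly one half-life of observations the raw weights sum to 0.5, so the correction doubles the raw value.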
The min_observations parameter controls a warm-up period: an asset’s variance estimate remains NaN in the output until it has accumulated enough valid observations for a reliable estimate.
Estimation universe for STVU:
An optional estimation_mask defines the estimation universe used for the cross-sectional STVU statistic without affecting per-asset EWMA variance updates. The STVU is computed in a one-step-ahead manner: the return observed at time \(t\) is standardized by the bias-corrected variance estimate available at time \(t-1\), and only assets that were already above min_observations before time \(t\) contribute to the regime signal. This is important because the STVU multiplier is derived from a cross-sectional average of standardized squared returns: noisy or illiquid assets with unreliable variance estimates can inflate or deflate the statistic, distorting the regime multiplier applied to all variances.
- Parameters:
- half_life : float, default=40
Half-life of the exponential weights in number of observations.
The half-life controls how quickly older observations lose their influence:
Larger half-life: More stable estimates, slower to adapt (robust to noise)
Smaller half-life: More responsive estimates, faster to adapt (sensitive to noise)
The decay factor \(\lambda\) is computed as: \(\lambda = 2^{-1/\text{half-life}}\)
- For example:
half-life = 40: \(\lambda \approx 0.983\)
half-life = 23: \(\lambda \approx 0.970\)
half-life = 11: \(\lambda \approx 0.939\)
half-life = 6: \(\lambda \approx 0.891\)
Note
For portfolio optimization, larger half-lives (>= 20) are generally preferred to avoid excessive turnover from estimation noise.
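The half-life to decay conversion is a one-liner. The snippet below is a small sketch (the function name `decay_from_half_life` is an assumption) that reproduces the values listed above.

```python
def decay_from_half_life(half_life: float) -> float:
    """Decay factor lambda such that a weight halves after `half_life` steps."""
    return 2.0 ** (-1.0 / half_life)

for hl in (40, 23, 11, 6):
    print(f"half-life = {hl}: lambda = {decay_from_half_life(hl):.3f}")
```

By construction, raising the decay factor to the power of the half-life gives 0.5.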
- hac_lags : int, optional
Number of lags for Newey-West HAC (Heteroskedasticity and Autocorrelation Consistent) correction. If None (default), no HAC correction is applied.
When enabled, the variance update uses HAC-adjusted squared returns instead of simple squared returns, accounting for autocorrelation:
\[\text{hac_var}_i = r_{i,t}^2 + 2 \sum_{j=1}^{L} w_j \cdot r_{i,t} \cdot r_{i,t-j}\]
where \(w_j = 1 - j/(L+1)\) is the Bartlett kernel weight.
- Typical values:
Daily equity data: 3-5 lags (weak autocorrelation from microstructure)
High-frequency data: 5-10 lags (stronger autocorrelation)
Monthly data: 1-2 lags
Must be a positive integer if specified.
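The HAC-adjusted squared return for a single asset can be sketched directly from the formula above. The function name `hac_squared_return` is hypothetical, and truncating the sum when fewer than `hac_lags` past returns exist is an assumption of this sketch rather than documented behaviour.

```python
import numpy as np

def hac_squared_return(returns, hac_lags):
    """Bartlett-weighted HAC-adjusted squared return at the latest observation."""
    returns = np.asarray(returns, dtype=float)
    r_t = returns[-1]
    total = r_t ** 2
    for j in range(1, hac_lags + 1):
        if j >= len(returns):
            break  # assumption: skip lags with no history available
        w_j = 1.0 - j / (hac_lags + 1)  # Bartlett kernel weight
        total += 2.0 * w_j * r_t * returns[-1 - j]
    return total

history = [0.01, -0.02, 0.03]  # oldest to newest
value = hac_squared_return(history, hac_lags=2)
print(value)
```

Here the plain squared return is 0.0009, while negative first-order autocorrelation pulls the adjusted value down. Because the cross terms can be negative, the adjusted value is not guaranteed to be positive for a single observation; this sketch leaves it unfloored.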
- regime_method : RegimeAdjustmentMethod, default=RegimeAdjustmentMethod.FIRST_MOMENT
Method used to transform the update statistic into the volatility multiplier \(\phi\):
LOG: Robust to outliers (log compresses extremes)
FIRST_MOMENT: Calibrates the first moment of the standardized risk statistic
RMS: \(\chi^2\) calibration (sensitive to extremes)
- regime_half_life : float, optional
Half-life for smoothing the volatility regime signal, in number of observations.
The regime signal is built from one-step-ahead standardized returns and then transformed into the multiplier \(\phi\) according to regime_method. A shorter regime_half_life makes the multiplier react faster to abrupt changes in realized risk. A longer one produces a smoother, slower moving adjustment.
If None (default), it is automatically calibrated as: \(\text{regime-half-life} = 0.5 \times \text{half-life}\)
This makes the STVU more responsive (shorter half-life) than the variance, allowing it to quickly rescale risk when realized volatility deviates from the slower EWMA estimate.
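To make the mechanics concrete, here is a rough sketch of a regime signal built from one-step-ahead standardized returns and smoothed with the default regime half-life. The exact transformations behind LOG, FIRST_MOMENT and RMS are not spelled out on this page, so the square-root (RMS-style) mapping, the function name `stvu_multiplier_sketch`, and the normalized EWMA are all assumptions of this example, not the library's formulas.

```python
import numpy as np

half_life = 40
regime_half_life = 0.5 * half_life  # documented default when None
regime_decay = 2.0 ** (-1.0 / regime_half_life)

def stvu_multiplier_sketch(returns, prev_vars, regime_decay):
    """Assumed RMS-style multiplier; prev_vars[t] is the variance known at t-1."""
    z2 = returns ** 2 / prev_vars   # squared standardized returns
    stat = np.nanmean(z2, axis=1)   # cross-sectional average per date
    num, den = 0.0, 0.0
    for s in stat:
        num = regime_decay * num + (1.0 - regime_decay) * s
        den = regime_decay * den + (1.0 - regime_decay)
    return float(np.sqrt(num / den))

# Realized moves of 2% against a forecast volatility of 1%.
returns = np.full((50, 3), 0.02)
prev_vars = np.full((50, 3), 0.01 ** 2)
phi = stvu_multiplier_sketch(returns, prev_vars, regime_decay)
print(phi)
```

In this toy case realized volatility is twice the forecast, so the unclipped multiplier comes out at about 2, signalling that the slower EWMA variances should be scaled up.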
- regime_multiplier_clip : tuple[float, float] or None, default=(0.7, 1.6)
Clip bounds that limit extreme swings in the regime multiplier. Set to None to disable clipping. The multiplier \(\phi\) scales volatility, so each variance is scaled by \(\phi^2\).
- Default bounds rationale:
Lower bound (0.7): Limits volatility reduction to 30%, equivalent to a minimum variance scale of \(0.7^2 = 0.49\)
Upper bound (1.6): Limits volatility increase to 60%, equivalent to a maximum variance scale of \(1.6^2 = 2.56\)
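The effect of the default bounds on volatility and variance scales can be checked with a short NumPy sketch. The raw multiplier values below are made up for illustration.

```python
import numpy as np

regime_multiplier_clip = (0.7, 1.6)
raw_phi = np.array([0.5, 1.0, 2.0])  # hypothetical unclipped multipliers

phi = np.clip(raw_phi, *regime_multiplier_clip)  # volatility scale
variance_scale = phi ** 2                        # variance scale

print(phi)
print(variance_scale)
```

The clipped multipliers are 0.7, 1.0 and 1.6, which correspond to variance scales of 0.49, 1.0 and 2.56.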
- regime_min_observations : int, optional
Minimum number of one-step-ahead comparisons required before STVU is enabled. With insufficient data, the STVU multiplier defaults to 1.0 (no adjustment).
If None (default), it is automatically set to int(regime_half_life), ensuring the STVU EWMA has seen roughly one half-life of data before being applied.
- min_observations : int, optional
Minimum number of valid observations per asset before its variance estimate is considered reliable and exposed in the output variance_. Until this threshold is reached, the asset’s variance estimate remains NaN.
The default (None) uses int(half_life) as the threshold, ensuring the late-listing initialization bias has decayed to at most 50%. Set to 1 to disable warm-up entirely.
- assume_centered : bool, default=True
If True (default), the EWMA update uses raw returns without demeaning. This is the standard convention for EWMA variance estimation in finance. If False, returns are demeaned using an EWMA mean estimate before computing the variance update, and location_ tracks the EWMA mean.
Note
For factor model residuals, centering is typically not needed as residuals should already have zero mean by construction. Set to False only if residuals exhibit persistent non-zero means.
- Attributes:
- variance_ : ndarray of shape (n_assets,)
Estimated regime-adjusted variances.
- regime_multiplier_ : float
The volatility regime adjustment factor applied. Equal to 1.0 when there is insufficient data or no regime adjustment is needed.
- location_ : ndarray of shape (n_assets,)
Estimated location, i.e. the estimated mean. When assume_centered=True, this is zero. When assume_centered=False, this is the EWMA mean estimate.
- n_features_in_ : int
Number of assets seen during fit.
- feature_names_in_ : ndarray of shape (n_features_in_,)
Names of features seen during fit. Defined only when X has feature names that are all strings.
Methods
fit(X[, y, estimation_mask, active_mask])
Fit the Regime-Adjusted Exponentially Weighted Variance estimator.
get_metadata_routing()
Get metadata routing of this object.
get_params([deep])
Get parameters for this estimator.
partial_fit(X[, y, estimation_mask, active_mask])
Incrementally fit the estimator with new observations.
set_fit_request(*[, active_mask, ...])
Configure whether metadata should be requested to be passed to the fit method.
set_params(**params)
Set the parameters of this estimator.
set_partial_fit_request(*[, active_mask, ...])
Configure whether metadata should be requested to be passed to the partial_fit method.
References
[1] “The Elements of Quantitative Investing”, Wiley Finance, Giuseppe Paleologo (2025).
[2] “Multivariate exponentially weighted moving covariance matrix”, Technometrics, Hawkins & Maboudou-Tchao (2008).
Examples
>>> import numpy as np
>>> from skfolio.datasets import load_sp500_dataset
>>> from skfolio.moments import RegimeAdjustedEWVariance, RegimeAdjustmentMethod
>>> from skfolio.preprocessing import prices_to_returns
>>>
>>> prices = load_sp500_dataset()
>>> X = prices_to_returns(prices)
>>> # Standard EWMA with STVU
>>> model = RegimeAdjustedEWVariance(half_life=23)
>>> model.fit(X)
>>> print(model.regime_multiplier_)
>>>
>>> # With LOG method for robustness to outliers
>>> model2 = RegimeAdjustedEWVariance(
...     half_life=11,
...     regime_method=RegimeAdjustmentMethod.LOG
... )
>>> model2.fit(X)
>>>
>>> # With Newey-West HAC correction for autocorrelation
>>> model3 = RegimeAdjustedEWVariance(
...     half_life=23,
...     hac_lags=5  # 5-lag Newey-West correction
... )
>>> model3.fit(X)
>>>
>>> # With an estimation universe focused on specific assets
>>> estimation_mask = np.ones((len(X), X.shape[1]), dtype=bool)
>>> estimation_mask[:, :5] = False  # Exclude first 5 assets from STVU
>>> model4 = RegimeAdjustedEWVariance(half_life=23)
>>> model4.fit(X, estimation_mask=estimation_mask)
- fit(X, y=None, *, estimation_mask=None, active_mask=None)
Fit the Regime-Adjusted Exponentially Weighted Variance estimator.
- Parameters:
- X : array-like of shape (n_observations, n_assets)
Idiosyncratic (specific) residual returns per asset, typically obtained from a factor model regression. NaN values are allowed and handled robustly.
- y : Ignored
Not used, present for API consistency by convention.
- estimation_mask : array-like of shape (n_observations, n_assets), optional
Boolean mask indicating which active assets should belong to the estimation universe for the cross-sectional STVU statistic on each day.
If None (default), all active assets with finite returns are used.
If provided, only assets where the mask is True contribute to the regime multiplier calculation on that day.
Per-asset EWMA variance updates still use all active assets with finite returns; this parameter only affects the cross-sectional regime adjustment calculation.
- Use cases:
Focus on liquid assets to reduce noise from thinly traded securities
Exclude assets with suspected data quality issues
Match the estimation universe used in downstream models
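One way to build such a mask for the first use case is to threshold a liquidity measure. The sketch below uses made-up traded volumes and a hypothetical `min_volume` cutoff; neither is part of the library.

```python
import numpy as np

rng = np.random.default_rng(0)
n_observations, n_assets = 5, 4

# Hypothetical inputs: residual returns and daily traded volume per asset.
returns = rng.normal(0.0, 0.01, size=(n_observations, n_assets))
returns[0, 3] = np.nan  # e.g. a holiday for the last asset
volume = rng.uniform(0.0, 2e6, size=(n_observations, n_assets))

min_volume = 1e6  # hypothetical liquidity threshold
# Keep only liquid assets with a finite return in the STVU universe.
estimation_mask = (volume >= min_volume) & np.isfinite(returns)

print(estimation_mask)
```

The resulting boolean array has the same shape as the returns and can be passed as the estimation_mask argument of fit, leaving the per-asset EWMA updates unaffected.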
- active_mask : array-like of shape (n_observations, n_assets), optional
Boolean mask indicating whether each asset is structurally active at each observation. Use this to distinguish between holidays (active_mask=True and NaN return: variance is frozen) and inactive periods such as pre-listing or post-delisting (active_mask=False: variance is set to NaN). If None (default), all pairs are assumed active and NaN returns are treated as holidays (variance frozen).
When an asset becomes active again after an inactive period, its variance restarts from a zero prior with per-asset bias correction.
- Returns:
- self : RegimeAdjustedEWVariance
Fitted estimator.
- get_metadata_routing()
Get metadata routing of this object.
Please check the User Guide on how the routing mechanism works.
- Returns:
- routing : MetadataRequest
A MetadataRequest encapsulating routing information.
- get_params(deep=True)
Get parameters for this estimator.
- Parameters:
- deep : bool, default=True
If True, will return the parameters for this estimator and contained subobjects that are estimators.
- Returns:
- params : dict
Parameter names mapped to their values.
- partial_fit(X, y=None, *, estimation_mask=None, active_mask=None)
Incrementally fit the estimator with new observations.
This method allows online/streaming updates to the variance estimates.
- Parameters:
- X : array-like of shape (n_observations, n_assets)
Idiosyncratic (specific) residual returns per asset. NaN values are allowed and handled robustly.
- y : Ignored
Not used, present for API consistency by convention.
- estimation_mask : array-like of shape (n_observations, n_assets), optional
Boolean mask indicating which active assets belong to the estimation universe for the cross-sectional STVU statistic on each day. See fit for details.
- active_mask : array-like of shape (n_observations, n_assets), optional
Boolean mask indicating whether each asset is structurally active at each observation. See fit for details.
- Returns:
- self : RegimeAdjustedEWVariance
Fitted estimator.
- set_fit_request(*, active_mask='$UNCHANGED$', estimation_mask='$UNCHANGED$')
Configure whether metadata should be requested to be passed to the fit method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to fit.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- Parameters:
- active_mask : str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED
Metadata routing for active_mask parameter in fit.
- estimation_mask : str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED
Metadata routing for estimation_mask parameter in fit.
- Returns:
- self : object
The updated object.
- set_params(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
- Parameters:
- **params : dict
Estimator parameters.
- Returns:
- self : estimator instance
Estimator instance.
- set_partial_fit_request(*, active_mask='$UNCHANGED$', estimation_mask='$UNCHANGED$')
Configure whether metadata should be requested to be passed to the partial_fit method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to partial_fit if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to partial_fit.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- Parameters:
- active_mask : str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED
Metadata routing for active_mask parameter in partial_fit.
- estimation_mask : str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED
Metadata routing for estimation_mask parameter in partial_fit.
- Returns:
- self : object
The updated object.