skfolio.moments.EWMu
- class skfolio.moments.EWMu(half_life=None, min_observations=None, window_size=None, alpha=None)
Exponentially Weighted Expected Returns (Mu) estimator.
This estimator uses the recursive EWMA formula:
\[\mu_t = \lambda \mu_{t-1} + (1-\lambda) r_t\]
where \(\lambda\) is the decay factor, which determines how much weight is given to past observations. It is computed from the half-life parameter:
\[\lambda = 2^{-1/\text{half-life}}\]
The half-life is the number of observations after which the weight of an observation has decayed to 50%.
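The two formulas above can be sketched in plain Python. This is only an illustration of the recursion, not the library's implementation; the helper names `decay_from_half_life` and `ewma_update` and the sample returns are hypothetical.

```python
def decay_from_half_life(half_life: float) -> float:
    """Decay factor lambda = 2 ** (-1 / half_life)."""
    return 2.0 ** (-1.0 / half_life)


def ewma_update(mu_prev: float, r_t: float, lam: float) -> float:
    """One step of mu_t = lam * mu_{t-1} + (1 - lam) * r_t."""
    return lam * mu_prev + (1.0 - lam) * r_t


lam = decay_from_half_life(40)

# After exactly `half_life` steps, the weight of an observation has halved.
weight_after_half_life = lam ** 40

# Run the recursion over a few made-up returns, starting from zero.
mu = 0.0
for r in [0.01, -0.02, 0.015]:
    mu = ewma_update(mu, r, lam)
```

Unrolling the loop shows that the most recent return carries weight \((1-\lambda)\), the one before it \((1-\lambda)\lambda\), and so on.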
This estimator supports both batch fitting via fit and incremental updates via partial_fit, making it suitable for online learning scenarios.
NaN handling:
The estimator handles missing data (NaN returns) caused by late listings, delistings, and holidays using EWMA updates together with active_mask. An asset with active_mask=True is treated as active at time \(t\). If its return is finite, the EWMA is updated normally. If its return is NaN, the observation is treated as a holiday and the previous estimate is kept. An asset with active_mask=False is treated as inactive, for example during pre-listing or post-delisting periods, and its estimate is set to NaN.
Active with valid return: Normal EWMA update.
Active with NaN return (holiday): Freeze; the previous estimate is kept.
Inactive (active_mask=False): The estimate is set to NaN.
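The three cases can be written as a single per-asset update rule. The sketch below is a simplified illustration under stated assumptions: the function `update_estimate` is hypothetical, and it ignores both re-initialization after an inactive period and the bias correction applied at output time.

```python
import math


def update_estimate(prev: float, r_t: float, active: bool, lam: float) -> float:
    """Apply the three-case rule to one asset at one time step."""
    if not active:
        # Inactive (active_mask=False): the estimate is set to NaN.
        return math.nan
    if math.isnan(r_t):
        # Active with NaN return (holiday): keep the previous estimate.
        return prev
    # Active with a valid return: normal EWMA update.
    return lam * prev + (1.0 - lam) * r_t


lam = 2.0 ** (-1.0 / 40)
normal = update_estimate(0.001, 0.02, True, lam)
frozen = update_estimate(0.001, math.nan, True, lam)
inactive = update_estimate(0.001, 0.02, False, lam)
```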
When active_mask is not provided, trailing NaN returns are ambiguous: they could correspond either to holidays, in which case the mean is frozen, or to inactive periods, in which case the mean is set to NaN.
Late-listing bias correction:
When an asset becomes active (late listing), the EWMA recursion is initialized at zero rather than at the first return. This zero-initialization introduces a transient bias toward zero: after \(n_i\) valid observations, the raw EWMA weights sum to \((1 - \lambda^{n_i})\) instead of 1. At output time, a per-asset correction removes this bias:
\[\hat{\mu}_i = \frac{S_i}{1 - \lambda^{n_i}}\]
where \(S_i\) is the raw internal EWMA accumulator. For assets with a long history, the correction is negligible (\(\lambda^{n_i} \to 0\)).
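A quick numerical check of the correction, assuming a constant (made-up) return so that the true mean is known exactly. The variable names are illustrative and do not correspond to the library's internal attributes.

```python
lam = 2.0 ** (-1.0 / 40)   # half-life of 40 observations
true_mean = 0.01           # made-up constant return for illustration

s = 0.0                    # raw accumulator S_i, initialized at zero
n = 0                      # number of valid observations n_i
for _ in range(10):
    s = lam * s + (1.0 - lam) * true_mean
    n += 1

weight_sum = 1.0 - lam ** n   # the raw EWMA weights sum to this, not to 1
mu_hat = s / weight_sum       # bias-corrected estimate
```

After 10 observations the raw accumulator `s` is still well below `true_mean`, while `mu_hat` recovers it up to floating-point error.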
The min_observations parameter controls a warm-up period: an asset’s mean estimate remains NaN in the output until it has accumulated enough valid observations for a reliable estimate.
- Parameters:
- half_life : float, default=40
Half-life of the exponential weights in number of observations.
The half-life controls how quickly older observations lose their influence:
Larger half-life: More stable estimates, slower to adapt (robust to noise)
Smaller half-life: More responsive estimates, faster to adapt (sensitive to noise)
The decay factor \(\lambda\) is computed as: \(\lambda = 2^{-1/\text{half-life}}\)
- For example:
half-life = 40: \(\lambda \approx 0.983\)
half-life = 23: \(\lambda \approx 0.970\)
half-life = 11: \(\lambda \approx 0.939\)
half-life = 6: \(\lambda \approx 0.891\)
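The listed decay factors follow directly from the formula and can be reproduced with a short calculation (a sketch; rounding to three decimals is chosen here only to match the values shown above).

```python
half_lives = [40, 23, 11, 6]

# lambda = 2 ** (-1 / half_life), rounded to three decimals
decay_factors = {h: round(2.0 ** (-1.0 / h), 3) for h in half_lives}
```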
Note
For portfolio optimization, larger half-lives (>= 20) are generally preferred to avoid excessive turnover from estimation noise.
- min_observations : int, optional
Minimum number of valid observations per asset before its mean estimate is considered reliable and exposed in the output mu_. Until this threshold is reached, the asset’s mean estimate remains NaN.
The default (None) uses int(half_life) as the threshold, ensuring the late-listing initialization bias has decayed to at most 50%. Set to 1 to disable warm-up entirely.
- window_size : int, optional
Window size to truncate data to the last window_size observations before fitting. Only applies to the initial fit call (or equivalently, the first partial_fit call); subsequent partial_fit calls use all provided data.
This is a computational optimization for very long time series. Due to exponential decay, observations far in the past contribute negligibly to the current estimate. For example, with half-life = 23 (\(\lambda \approx 0.97\)), observations beyond ~150 periods together carry only about 1% of the total weight. Truncating to a reasonable window (e.g., 252 trading days) speeds up computation without materially affecting results.
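The size of the discarded tail can be estimated with a short calculation. The sketch assumes a long history, so that the normalized weights \((1-\lambda)\lambda^k\) sum to 1 and the observations older than the last \(n\) together carry weight \(\lambda^n\). The helper `tail_weight` is hypothetical.

```python
def tail_weight(n_obs: int, lam: float) -> float:
    """Total normalized weight of observations older than the last n_obs."""
    return lam ** n_obs


lam = 2.0 ** (-1.0 / 23)       # half-life of 23 observations

w150 = tail_weight(150, lam)   # weight dropped by a 150-observation window
w252 = tail_weight(252, lam)   # weight dropped by a 252-observation window
```

Under these assumptions the weight beyond 150 observations is roughly 1%, and beyond 252 observations it is below 0.1%.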
The default (None) uses all available data.
- alpha : float, optional
Deprecated since version 0.17.0: alpha is deprecated and will be removed in a future version. Use half_life instead. Note: alpha = 1 - decay_factor and half_life = -ln(2) / ln(1 - alpha).
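When migrating from alpha, the equivalent half-life can be computed from the relation stated in the note. A minimal sketch; the helper name `half_life_from_alpha` and the value of `alpha` are illustrative only.

```python
import math


def half_life_from_alpha(alpha: float) -> float:
    """half_life = -ln(2) / ln(1 - alpha), valid for 0 < alpha < 1."""
    return -math.log(2.0) / math.log(1.0 - alpha)


alpha = 0.03
half_life = half_life_from_alpha(alpha)

# Round trip: the decay factor implied by this half-life equals 1 - alpha.
decay_factor = 2.0 ** (-1.0 / half_life)
```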
- Attributes:
- mu_ : ndarray of shape (n_assets,)
Estimated expected returns of the assets. Contains NaN for assets that are inactive or have not yet accumulated min_observations valid observations.
- n_features_in_ : int
Number of assets seen during fit.
- feature_names_in_ : ndarray of shape (n_features_in_,)
Names of assets seen during fit. Defined only when X has asset names that are all strings.
Methods
fit(X[, y, active_mask]): Fit the EWMu estimator model.
get_metadata_routing(): Get metadata routing of this object.
get_params([deep]): Get parameters for this estimator.
partial_fit(X[, y, active_mask]): Incrementally fit the EWMu estimator.
set_fit_request(*[, active_mask]): Configure whether metadata should be requested to be passed to the fit method.
set_params(**params): Set the parameters of this estimator.
set_partial_fit_request(*[, active_mask]): Configure whether metadata should be requested to be passed to the partial_fit method.
See also
- Online Evaluation of Portfolio Optimization
Online evaluation of portfolio optimization using MeanRisk with EWMu and exponentially weighted covariance estimators.
Examples
>>> from skfolio.datasets import load_sp500_dataset
>>> from skfolio.moments import EWMu
>>> from skfolio.preprocessing import prices_to_returns
>>>
>>> prices = load_sp500_dataset()
>>> X = prices_to_returns(prices)
>>>
>>> # Batch fitting
>>> model = EWMu(half_life=40)
>>> model.fit(X)
>>> print(model.mu_.shape)
>>>
>>> # Streaming updates with partial_fit
>>> model2 = EWMu(half_life=20)
>>> model2.partial_fit(X[:100])  # Initial fit
>>> model2.partial_fit(X[100:200])  # Update with new data
>>> model2.partial_fit(X[200:])  # Continue updating
>>>
>>> # NaN-aware fitting with active_mask
>>> import numpy as np
>>> # Asset 2 is listed starting from observation 50
>>> active_mask = np.ones(X.shape, dtype=bool)
>>> active_mask[:50, 2] = False
>>> X_nan = X.copy()
>>> # X is a DataFrame here, so positional assignment goes through .iloc
>>> X_nan.iloc[:50, 2] = np.nan
>>> model3 = EWMu(half_life=40)
>>> model3.fit(X_nan, active_mask=active_mask)
- fit(X, y=None, *, active_mask=None)
Fit the EWMu estimator model.
- Parameters:
- X : array-like of shape (n_observations, n_assets)
Price returns of the assets. May contain NaN for missing data (holidays, late listings, delistings).
- y : Ignored
Not used, present for API consistency by convention.
- active_mask : array-like of shape (n_observations, n_assets), optional
Boolean mask indicating whether each asset is structurally active at each observation. Use this to distinguish between holidays (active_mask=True and NaN return: mean is frozen) and inactive periods such as pre-listing or post-delisting (active_mask=False: mean is set to NaN). If None (default), all assets are assumed active.
- Returns:
- self : EWMu
Fitted estimator.
- get_metadata_routing()
Get metadata routing of this object.
Please check User Guide on how the routing mechanism works.
- Returns:
- routing : MetadataRequest
A MetadataRequest encapsulating routing information.
- get_params(deep=True)
Get parameters for this estimator.
- Parameters:
- deep : bool, default=True
If True, will return the parameters for this estimator and contained subobjects that are estimators.
- Returns:
- params : dict
Parameter names mapped to their values.
- partial_fit(X, y=None, *, active_mask=None)
Incrementally fit the EWMu estimator.
This method allows for streaming/online updates to the expected returns estimate. Each call updates the internal state with new observations.
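Because the estimate is defined by a recursion, feeding the data in chunks produces the same accumulator as feeding it in one pass. The sketch below illustrates this for the bare recursion only. The helper `ewma_accumulate` and the sample returns are hypothetical, and it is an assumption that the library's internal state behaves the same way once NaN handling, warm-up, and window_size are involved.

```python
def ewma_accumulate(state: float, returns: list[float], lam: float) -> float:
    """Run the EWMA recursion over `returns`, starting from `state`."""
    for r in returns:
        state = lam * state + (1.0 - lam) * r
    return state


lam = 2.0 ** (-1.0 / 20)
returns = [0.01, -0.005, 0.02, 0.0, -0.01, 0.007]   # made-up returns

# One pass over all observations.
batch = ewma_accumulate(0.0, returns, lam)

# Same observations delivered in three chunks, carrying the state forward.
streamed = 0.0
for chunk in (returns[:2], returns[2:4], returns[4:]):
    streamed = ewma_accumulate(streamed, chunk, lam)
```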
- Parameters:
- X : array-like of shape (n_observations, n_assets)
Price returns of the assets. May contain NaN for missing data (holidays, late listings, delistings).
- y : Ignored
Not used, present for API consistency by convention.
- active_mask : array-like of shape (n_observations, n_assets), optional
Boolean mask indicating whether each asset is structurally active at each observation. Use this to distinguish between holidays (active_mask=True and NaN return: mean is frozen) and inactive periods such as pre-listing or post-delisting (active_mask=False: mean is set to NaN). If None (default), all assets are assumed active.
- Returns:
- self : EWMu
Fitted estimator.
- set_fit_request(*, active_mask='$UNCHANGED$')
Configure whether metadata should be requested to be passed to the fit method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to fit.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- Parameters:
- active_mask : str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED
Metadata routing for active_mask parameter in fit.
- Returns:
- self : object
The updated object.
- set_params(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
- Parameters:
- **params : dict
Estimator parameters.
- Returns:
- self : estimator instance
Estimator instance.
- set_partial_fit_request(*, active_mask='$UNCHANGED$')
Configure whether metadata should be requested to be passed to the partial_fit method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to partial_fit if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to partial_fit.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- Parameters:
- active_mask : str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED
Metadata routing for active_mask parameter in partial_fit.
- Returns:
- self : object
The updated object.