skfolio.model_selection.online_predict

skfolio.model_selection.online_predict(estimator, X, y=None, warmup_size=252, test_size=1, freq=None, freq_offset=None, previous=False, purged_size=0, reduce_test=False, params=None, portfolio_params=None)

Generate out-of-sample portfolios using online learning.

Walks forward through the data, updating the estimator incrementally via partial_fit and predicting on each subsequent test window. Unlike cross_val_predict, which clones the estimator for each fold, this function maintains a single stateful estimator that accumulates knowledge over time.

The algorithm:

  1. Clone the estimator to ensure a clean, unfitted starting state.

  2. Initialize the estimator on the first warmup_size observations via partial_fit.

  3. At each step, predict on the test window, then update the model with the newly observed data via partial_fit.

If the estimator declares needs_previous_weights=True, portfolio weights are automatically propagated from one step to the next.
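The three steps above can be sketched with a toy model. This is only an illustration of the predict-then-update order, not skfolio's implementation: `RunningMeanModel` and `toy_online_predict` are hypothetical names, `copy.deepcopy` stands in for cloning the estimator, and plain lists of numbers replace return data and portfolios.

```python
import copy


class RunningMeanModel:
    """Hypothetical stand-in for an estimator exposing partial_fit and predict."""

    def __init__(self):
        self.n_seen = 0
        self.mean = 0.0

    def partial_fit(self, batch):
        # Incrementally update the running mean, one observation at a time.
        for value in batch:
            self.n_seen += 1
            self.mean += (value - self.mean) / self.n_seen
        return self

    def predict(self, batch):
        # Forecast every point of the test window with the current mean.
        return [self.mean for _ in batch]


def toy_online_predict(model, data, warmup_size, test_size):
    model = copy.deepcopy(model)            # step 1: work on a clean copy
    model.partial_fit(data[:warmup_size])   # step 2: warm up, no predictions
    predictions = []
    start = warmup_size
    while start + test_size <= len(data):   # full test windows only
        window = data[start:start + test_size]
        predictions.append(model.predict(window))  # step 3a: predict first
        model.partial_fit(window)                  # step 3b: then update
        start += test_size
    return predictions


original = RunningMeanModel()
data = [1.0, 3.0, 5.0, 7.0, 9.0, 11.0]
preds = toy_online_predict(original, data, warmup_size=2, test_size=2)
print(preds)  # [[2.0, 2.0], [4.0, 4.0]]
```

Each prediction uses only data seen before its test window, and the model passed in by the caller is left unfitted because all updates happen on the copy.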

Parameters:
estimator : BaseOptimization

Portfolio optimization estimator. It must implement partial_fit. Pipelines are not supported.

X : array-like of shape (n_observations, n_assets)

Price returns of the assets. Must be a DataFrame with a DatetimeIndex when freq is provided.

y : array-like of shape (n_observations, n_targets), optional

Target data to pass to partial_fit.

warmup_size : int, default=252

Number of initial observations (or periods when freq is set) used for the first partial_fit call. No predictions are made during warmup.

test_size : int, default=1

Length of each test set, which also sets the rebalancing frequency. If freq is None (default), it represents the number of observations. Otherwise, it represents the number of periods defined by freq.

freq : str | pandas.offsets.BaseOffset, optional

If provided, it must be a frequency string or a pandas DateOffset, and X must be a DataFrame with an index of type DatetimeIndex. In that case, warmup_size and test_size represent the number of periods defined by freq instead of the number of observations.

freq_offset : pandas.offsets.BaseOffset | datetime.timedelta, optional

Only used if freq is provided. Offsets freq by a pandas DateOffset or a datetime timedelta offset.

previous : bool, default=False

Only used if freq is provided. If set to True, and if the period start or period end is not in the DatetimeIndex, the previous observation is used; otherwise, the next observation is used.

purged_size : int, default=0

The number of observations to exclude from the end of each training window before the test window. Use purged_size >= 1 when execution is delayed relative to observation.

reduce_test : bool, default=False

If set to True, the last test window is returned even if it is partial; otherwise, a partial last window is ignored.

params : dict, optional

Parameters to pass to the underlying estimator’s partial_fit through metadata routing.

portfolio_params : dict, optional

Additional parameters forwarded to the resulting MultiPeriodPortfolio.
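As a rough illustration of how warmup_size, test_size, purged_size and reduce_test interact when freq is None, the sketch below lists index boundaries for each step. `toy_windows` is a hypothetical helper, not part of skfolio. It assumes that the first test window starts purged_size observations after the warmup and that, at each step, the model has only seen observations before `test_start - purged_size`; the library's exact boundary handling may differ.

```python
def toy_windows(n_observations, warmup_size=252, test_size=1,
                purged_size=0, reduce_test=False):
    """Return (train_end, test_start, test_end) triples; all ends are exclusive."""
    if warmup_size < 1 or test_size < 1:
        raise ValueError("warmup_size and test_size must be >= 1")
    windows = []
    # Assumption: the purge gap sits between the training data and the test window.
    test_start = warmup_size + purged_size
    while test_start < n_observations:
        test_end = test_start + test_size
        if test_end > n_observations:
            if not reduce_test:
                break                      # drop the partial last window
            test_end = n_observations      # keep it, truncated
        windows.append((test_start - purged_size, test_start, test_end))
        test_start += test_size
    if not windows:
        raise ValueError("data is too short for at least one test window")
    return windows


full_only = toy_windows(12, warmup_size=4, test_size=3)
with_partial = toy_windows(12, warmup_size=4, test_size=3, reduce_test=True)
with_purge = toy_windows(12, warmup_size=4, test_size=3, purged_size=1)
print(full_only)     # [(4, 4, 7), (7, 7, 10)]
print(with_partial)  # [(4, 4, 7), (7, 7, 10), (10, 10, 12)]
print(with_purge)    # [(4, 5, 8), (7, 8, 11)]
```

With purged_size=1, each test window begins one observation after the last observation used for training, which mimics a one-step delay between observing data and executing the rebalance.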

Returns:
prediction : MultiPeriodPortfolio

A MultiPeriodPortfolio containing one Portfolio per test window, ordered chronologically.

Raises:
TypeError

If the estimator is not a portfolio optimization estimator, does not implement partial_fit, or is a pipeline.

ValueError

If warmup_size < 1, test_size < 1, or the data is too short for at least one test window.

See also

Online Evaluation of Portfolio Optimization

Online evaluation of portfolio optimization using online_predict.

Examples

>>> from skfolio.datasets import load_sp500_dataset
>>> from skfolio.model_selection import online_predict
>>> from skfolio.moments import EWCovariance, EWMu
>>> from skfolio.optimization import MeanRisk
>>> from skfolio.preprocessing import prices_to_returns
>>> from skfolio.prior import EmpiricalPrior
>>>
>>> prices = load_sp500_dataset()
>>> X = prices_to_returns(prices)
>>>
>>> model = MeanRisk(
...     prior_estimator=EmpiricalPrior(
...         mu_estimator=EWMu(half_life=40),
...         covariance_estimator=EWCovariance(half_life=40),
...     ),
... )
>>> pred = online_predict(model, X, warmup_size=252, test_size=5)