skfolio.model_selection.online_predict
- skfolio.model_selection.online_predict(estimator, X, y=None, warmup_size=252, test_size=1, freq=None, freq_offset=None, previous=False, purged_size=0, reduce_test=False, params=None, portfolio_params=None)
Generate out-of-sample portfolios using online learning.
Walks forward through the data, updating the estimator incrementally via partial_fit and predicting on each subsequent test window. Unlike cross_val_predict, which clones the estimator for each fold, this function maintains a single stateful estimator that accumulates knowledge over time.

The algorithm:

1. Clone the estimator to ensure a clean, unfitted starting state.
2. Initialize the estimator on the first warmup_size observations via partial_fit.
3. At each step, predict on the test window, then update the model with the newly observed data via partial_fit.
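
The clone, warmup, and predict-then-update sequence described above can be sketched in plain Python. This is an illustration only, not skfolio's implementation: RunningMeanEstimator and online_walk_forward are hypothetical names, copy.deepcopy stands in for estimator cloning, and the toy model forecasts with a running mean instead of producing portfolios.

```python
import copy


class RunningMeanEstimator:
    """Hypothetical toy estimator exposing partial_fit and predict."""

    def __init__(self):
        self.n_seen = 0
        self.total = 0.0

    def partial_fit(self, X):
        # Accumulate state instead of refitting from scratch.
        for x in X:
            self.total += x
            self.n_seen += 1
        return self

    def predict(self, X):
        # Forecast every test observation with the mean seen so far.
        mean = self.total / self.n_seen
        return [mean for _ in X]


def online_walk_forward(estimator, X, warmup_size, test_size):
    """Illustrative walk-forward loop (assumption, not the library code)."""
    model = copy.deepcopy(estimator)    # clean copy; the caller's object is untouched
    model.partial_fit(X[:warmup_size])  # warmup: no predictions are made here
    predictions = []
    start = warmup_size
    while start + test_size <= len(X):  # a partial last window is skipped
        window = X[start:start + test_size]
        predictions.append(model.predict(window))  # predict out-of-sample first
        model.partial_fit(window)                  # then learn from the new data
        start += test_size
    return predictions


base = RunningMeanEstimator()
X = [1.0, 3.0, 5.0, 7.0, 9.0, 11.0]
preds = online_walk_forward(base, X, warmup_size=2, test_size=2)
print(preds)  # [[2.0, 2.0], [4.0, 4.0]]
```

The key point is the ordering inside the loop: each window is predicted before the model sees it, so every prediction is out-of-sample even though one stateful model is reused throughout.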
If the estimator declares needs_previous_weights=True, portfolio weights are automatically propagated from one step to the next.

- Parameters:
- estimator : BaseOptimization
Portfolio optimization estimator. It must implement partial_fit. Pipelines are not supported.
- X : array-like of shape (n_observations, n_assets)
Price returns of the assets. Must be a DataFrame with a DatetimeIndex when freq is provided.
- y : array-like of shape (n_observations, n_targets), optional
Target data to pass to partial_fit.
- warmup_size : int, default=252
Number of initial observations (or periods when freq is set) used for the first partial_fit call. No predictions are made during warmup.
- test_size : int, default=1
Length of each test set. If freq is None (default), it represents the number of observations. Otherwise, it represents the number of periods defined by freq. Controls the rebalancing frequency.
- freq : str | pandas.offsets.BaseOffset, optional
If provided, it must be a frequency string or a pandas DateOffset, and X must be a DataFrame with an index of type DatetimeIndex. In that case, warmup_size and test_size represent the number of periods defined by freq instead of the number of observations.
- freq_offset : pandas.offsets.BaseOffset | datetime.timedelta, optional
Only used if freq is provided. Offsets freq by a pandas DateOffset or a datetime timedelta offset.
- previous : bool, default=False
Only used if freq is provided. If set to True, and if the period start or period end is not in the DatetimeIndex, the previous observation is used; otherwise, the next observation is used.
- purged_size : int, default=0
The number of observations to exclude from the end of each training window before the test window. Use purged_size >= 1 when execution is delayed relative to observation.
- reduce_test : bool, default=False
If set to True, the last test window is returned even if it is partial; otherwise, it is ignored.
- params : dict, optional
Parameters to pass to the underlying estimator’s partial_fit through metadata routing.
- portfolio_params : dict, optional
Additional parameters forwarded to the resulting MultiPeriodPortfolio.
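
To make the interaction between warmup_size, test_size, and reduce_test concrete, the sketch below enumerates test-window positions for the observation-count case (freq is None). The helper iter_test_windows is hypothetical and not part of skfolio; it also leaves out purged_size and the freq-based period logic, so treat it as a reading aid under those assumptions.

```python
def iter_test_windows(n_observations, warmup_size, test_size, reduce_test=False):
    """Yield (start, stop) positions of each test window (hypothetical helper)."""
    start = warmup_size  # the first warmup_size observations are never predicted
    while start < n_observations:
        stop = start + test_size
        if stop > n_observations:
            if not reduce_test:
                break              # partial last window is ignored
            stop = n_observations  # partial last window is kept, shortened
        yield start, stop
        start = stop


full_only = list(iter_test_windows(10, warmup_size=4, test_size=4))
with_partial = list(iter_test_windows(10, warmup_size=4, test_size=4, reduce_test=True))
print(full_only)     # [(4, 8)]
print(with_partial)  # [(4, 8), (8, 10)]
```

With 10 observations, a warmup of 4 and a test size of 4, only one full window fits; setting reduce_test=True additionally keeps the trailing two-observation window.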
- Returns:
- prediction : MultiPeriodPortfolio
A MultiPeriodPortfolio containing one Portfolio per test window, ordered chronologically.
- Raises:
- TypeError
If the estimator is not a portfolio optimization estimator, does not implement partial_fit, or is a pipeline.
- ValueError
If warmup_size < 1, test_size < 1, or the data is too short for at least one test window.
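
The ValueError conditions can be read as a simple pre-flight check. The sketch below is an assumption about what such a check might look like for the observation-count case: check_sizes is a hypothetical helper, the error messages are invented for illustration, and it assumes one full test window is required while ignoring purged_size, reduce_test, and freq.

```python
def check_sizes(n_observations, warmup_size, test_size):
    """Raise ValueError for invalid sizes (hypothetical helper, not skfolio's)."""
    if warmup_size < 1:
        raise ValueError("warmup_size must be at least 1")
    if test_size < 1:
        raise ValueError("test_size must be at least 1")
    # Assumption: at least one full test window must fit after the warmup.
    if n_observations < warmup_size + test_size:
        raise ValueError("data is too short for at least one test window")


check_sizes(300, warmup_size=252, test_size=5)  # valid: returns silently

try:
    check_sizes(252, warmup_size=252, test_size=1)
except ValueError as exc:
    print(exc)  # data is too short for at least one test window
```

Running a check like this before calling online_predict can surface sizing mistakes early, but the library's own validation remains the authority on what is accepted.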
See also
- Online Evaluation of Portfolio Optimization
Online evaluation of portfolio optimization using online_predict.
Examples
>>> from skfolio.datasets import load_sp500_dataset
>>> from skfolio.model_selection import online_predict
>>> from skfolio.moments import EWCovariance, EWMu
>>> from skfolio.optimization import MeanRisk
>>> from skfolio.preprocessing import prices_to_returns
>>> from skfolio.prior import EmpiricalPrior
>>>
>>> prices = load_sp500_dataset()
>>> X = prices_to_returns(prices)
>>>
>>> model = MeanRisk(
...     prior_estimator=EmpiricalPrior(
...         mu_estimator=EWMu(half_life=40),
...         covariance_estimator=EWCovariance(half_life=40),
...     ),
... )
>>> pred = online_predict(model, X, warmup_size=252, test_size=5)