skfolio.model_selection.online_score

skfolio.model_selection.online_score(estimator, X, y=None, warmup_size=252, test_size=1, freq=None, freq_offset=None, previous=False, purged_size=0, reduce_test=False, scoring=None, params=None, per_step=False, portfolio_params=None)

Score an online estimator using walk-forward evaluation.

Walks forward through the data, updating the estimator incrementally via partial_fit and scoring on each subsequent test window. This is the scoring counterpart of online_predict.
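The walk-forward loop described above can be pictured with a toy estimator. This is a minimal sketch, not the library's implementation: RunningMean, walk_forward_scores and the sample data are hypothetical, and the order "score the window, then update on it" is an assumption about how out-of-sample scoring is kept clean.

```python
class RunningMean:
    """Toy online estimator (hypothetical): tracks the mean of all values seen."""

    def __init__(self):
        self.n_ = 0
        self.mean_ = 0.0

    def partial_fit(self, values):
        # Incrementally update the running mean with a new batch.
        for v in values:
            self.n_ += 1
            self.mean_ += (v - self.mean_) / self.n_
        return self

    def score(self, values):
        # Higher is better: negative mean squared error around the current mean.
        return -sum((v - self.mean_) ** 2 for v in values) / len(values)


def walk_forward_scores(estimator, data, warmup_size, test_size):
    """Score each test window out-of-sample, then absorb it into the estimator."""
    estimator.partial_fit(data[:warmup_size])  # warmup: no score is produced
    scores = []
    start = warmup_size
    while start + test_size <= len(data):
        window = data[start:start + test_size]
        scores.append(estimator.score(window))  # score before seeing the window
        estimator.partial_fit(window)           # then update incrementally
        start += test_size
    return scores


data = [1.0, 3.0, 2.0, 2.0, 4.0, 0.0]
step_scores = walk_forward_scores(RunningMean(), data, warmup_size=2, test_size=2)
average_score = sum(step_scores) / len(step_scores)
```

Here step_scores plays the role of the per-step scores and average_score the role of the default aggregate for a non-predictor estimator.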

The function handles both non-predictor estimators (e.g. covariance, expected returns, prior) and portfolio optimization estimators:

  • Non-predictor estimators are scored on each test window independently. By default, the average of the per-step scores is returned.

  • Portfolio optimization estimators are evaluated by collecting out-of-sample predictions into a MultiPeriodPortfolio and computing the requested measure on the full multi-period portfolio.
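The two aggregation modes generally give different numbers. The sketch below illustrates this with a toy mean-over-standard-deviation ratio; mean_over_std and the return values are made up for illustration and are not the library's Sharpe ratio implementation.

```python
from statistics import mean, pstdev


def mean_over_std(returns):
    """Toy risk-adjusted measure (hypothetical stand-in for a ratio measure)."""
    return mean(returns) / pstdev(returns)


# Out-of-sample returns from two consecutive test windows (made-up numbers).
window_returns = [[0.01, 0.03], [-0.02, 0.02, 0.06]]

# Per-window scoring followed by averaging.
averaged = mean([mean_over_std(w) for w in window_returns])

# Multi-period style: concatenate all windows, then compute the measure once.
all_returns = [r for w in window_returns for r in w]
aggregated = mean_over_std(all_returns)
```

With these numbers the averaged per-window ratio is larger than the ratio computed on the concatenated series, which is why the choice of aggregation matters when comparing scores across tools.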

Parameters:
estimator : BaseEstimator

Estimator instance to use to fit the data. It must implement partial_fit. Pipelines are not supported.

X : array-like of shape (n_observations, n_assets)

Price returns of the assets. Must be a DataFrame with a DatetimeIndex when freq is provided.

y : array-like of shape (n_observations, n_targets), optional

Target data to pass to partial_fit.

warmup_size : int, default=252

Number of initial observations (or periods when freq is set) used for the first partial_fit call. No scores are produced during warmup.

test_size : int, default=1

Length of each test set. If freq is None (default), it represents the number of observations. Otherwise, it represents the number of periods defined by freq.

freq : str | pandas.offsets.BaseOffset, optional

If provided, it must be a frequency string or a pandas DateOffset, and X must be a DataFrame with an index of type DatetimeIndex. In that case, warmup_size and test_size represent the number of periods defined by freq instead of the number of observations.

freq_offset : pandas.offsets.BaseOffset | datetime.timedelta, optional

Only used if freq is provided. Shifts freq by a pandas DateOffset or a datetime.timedelta.

previous : bool, default=False

Only used if freq is provided. If set to True, and if the period start or period end is not in the DatetimeIndex, the previous observation is used; otherwise, the next observation is used.

purged_size : int, default=0

The number of observations to exclude from the end of each training window before the test window.

reduce_test : bool, default=False

If set to True, the last test window is included even if it is partial; otherwise, it is ignored.

scoring : callable, dict, BaseMeasure, or None

Scoring specification. Semantics depend on the estimator type:

  • Non-predictor estimators (e.g. covariance, expected returns, prior): None uses estimator.score; otherwise pass a callable scorer(estimator, X_test) or a dict of such callables.

  • Portfolio optimization estimators: a BaseMeasure or a dict of measures. None defaults to SHARPE_RATIO.

Note

For portfolio optimization estimators, online evaluation scores the aggregated out-of-sample MultiPeriodPortfolio, rather than scoring each test window independently and averaging as in GridSearchCV. Pass the measure enum directly; make_scorer is not supported.

params : dict, optional

Parameters to pass to the underlying estimator’s partial_fit through metadata routing.

per_step : bool, default=False

If True, return per-step score arrays instead of aggregated scalars. Only supported for non-predictor estimators; raises ValueError for portfolio optimization estimators.

portfolio_params : dict, optional

Additional parameters forwarded to the resulting MultiPeriodPortfolio when scoring a portfolio optimization estimator.
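The interaction of warmup_size, test_size, purged_size and reduce_test (with freq=None) can be pictured as index arithmetic. The generator below is an illustrative sketch under assumed conventions, not the library's splitter: walk_forward_windows is a hypothetical helper, and it assumes training data always starts at observation 0 and that purging trims the end of the training span.

```python
def walk_forward_windows(n_observations, warmup_size, test_size,
                         purged_size=0, reduce_test=False):
    """Yield (train_end, test_start, test_end) index triples.

    Training data is observations [0, train_end); the test window is
    [test_start, test_end). This layout is an assumption for illustration.
    """
    if warmup_size < 1 or test_size < 1:
        raise ValueError("warmup_size and test_size must be >= 1")
    test_start = warmup_size
    while test_start < n_observations:
        test_end = test_start + test_size
        if test_end > n_observations:
            if not reduce_test:
                break  # drop the partial last window
            test_end = n_observations  # keep it, shortened
        # Purging removes the last observations of the training span.
        train_end = test_start - purged_size
        yield train_end, test_start, test_end
        test_start = test_end


full_only = list(walk_forward_windows(11, warmup_size=4, test_size=3))
with_partial = list(walk_forward_windows(11, 4, 3, reduce_test=True))
purged = list(walk_forward_windows(11, 4, 3, purged_size=1))
```

With 11 observations, full_only contains two complete test windows, with_partial adds a final one-observation window, and purged leaves a one-observation gap between each training span and its test window.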

Returns:
score : float | dict[str, float] | ndarray | dict[str, ndarray]

By default, an aggregate float (or dict for multi-metric). When per_step=True, a np.ndarray of per-step scores (or dict thereof).
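For the multi-metric case with a non-predictor estimator, each dict entry is a callable with the scorer(estimator, X_test) signature, and the result is keyed by the same names. The sketch below only illustrates that calling convention on a single test window; ToyVariance, neg_abs_error and variance_ratio are hypothetical names, not part of skfolio.

```python
from dataclasses import dataclass


@dataclass
class ToyVariance:
    """Hypothetical fitted estimator exposing a single variance estimate."""
    variance_: float


def realized_variance(X_test):
    # Population variance of the test window.
    m = sum(X_test) / len(X_test)
    return sum((x - m) ** 2 for x in X_test) / len(X_test)


def neg_abs_error(estimator, X_test):
    """Higher is better: negative absolute error of the variance estimate."""
    return -abs(estimator.variance_ - realized_variance(X_test))


def variance_ratio(estimator, X_test):
    """Realized variance divided by estimated variance (1.0 is ideal)."""
    return realized_variance(X_test) / estimator.variance_


scoring = {"neg_abs_error": neg_abs_error, "variance_ratio": variance_ratio}
estimator = ToyVariance(variance_=1.0)
X_test = [0.0, 4.0]

# One score per named scorer, mirroring the dict-shaped result.
results = {name: scorer(estimator, X_test) for name, scorer in scoring.items()}
```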

Raises:
TypeError

If the estimator does not implement partial_fit or is a pipeline.

ValueError

If per_step=True is used with a portfolio optimization estimator, or if warmup_size < 1, test_size < 1, or the data is too short for at least one test window.

See also

Online Covariance Hyperparameter Tuning

Programmatic comparison of covariance estimators with online_score.

Online Evaluation of Portfolio Optimization

Portfolio-level evaluation with online_score.

Examples

Non-predictor estimator (default estimator.score):

>>> from skfolio.datasets import load_sp500_dataset
>>> from skfolio.model_selection import online_score
>>> from skfolio.moments import EWCovariance
>>> from skfolio.preprocessing import prices_to_returns
>>>
>>> prices = load_sp500_dataset()
>>> X = prices_to_returns(prices)
>>> score = online_score(EWCovariance(), X, warmup_size=252)

Portfolio optimization estimator:

>>> from skfolio.measures import RatioMeasure
>>> from skfolio.moments import EWMu
>>> from skfolio.optimization import MeanRisk
>>> from skfolio.prior import EmpiricalPrior
>>>
>>> model = MeanRisk(
...     prior_estimator=EmpiricalPrior(
...         mu_estimator=EWMu(half_life=40),
...         covariance_estimator=EWCovariance(half_life=40),
...     ),
... )
>>> score = online_score(
...     model,
...     X,
...     warmup_size=252,
...     test_size=5,
...     scoring=RatioMeasure.SHARPE_RATIO,
... )
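Period-based sizing (freq set):

When freq is provided, warmup_size and test_size count periods rather than observations, so windows can differ in length. The standard-library sketch below illustrates the idea with calendar months on a made-up weekday index; it does not call skfolio and is not how the library resolves freq, freq_offset or previous.

```python
from datetime import date, timedelta
from itertools import groupby

# Hypothetical daily index: weekdays from 2024-01-01 to 2024-04-30.
start, end = date(2024, 1, 1), date(2024, 4, 30)
days = [start + timedelta(days=d) for d in range((end - start).days + 1)]
index = [d for d in days if d.weekday() < 5]

# Group observation positions by calendar month (one group per period).
periods = [
    [pos for pos, _ in group]
    for _, group in groupby(
        enumerate(index), key=lambda item: (item[1].year, item[1].month)
    )
]

warmup_size, test_size = 2, 1  # counted in periods, not observations
warmup_positions = [p for period in periods[:warmup_size] for p in period]
test_windows = periods[warmup_size:]  # one test window per remaining month
```

In this sketch the two-period warmup spans January and February, and the March and April test windows contain different numbers of observations.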