skfolio.model_selection.OnlineGridSearch

class skfolio.model_selection.OnlineGridSearch(estimator, param_grid, *, scoring=None, warmup_size=252, test_size=1, freq=None, freq_offset=None, previous=False, purged_size=0, reduce_test=False, refit=True, error_score=nan, return_predictions=False, portfolio_params=None, n_jobs=None, verbose=0)

Online exhaustive hyperparameter search over a parameter grid.

Each parameter combination is evaluated by running a full online walk-forward pass. The best estimator is selected based on the aggregate out-of-sample score.
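
As a rough illustration of what one walk-forward pass per candidate involves, the sketch below uses a toy estimator with a `partial_fit` method. `EWMean` and `walk_forward_score` are hypothetical names introduced for this example and are not part of skfolio; purging, frequency handling and parallelism are left out.

```python
class EWMean:
    """Toy online estimator (hypothetical): exponentially weighted mean."""

    def __init__(self, alpha=0.5):
        self.alpha = alpha
        self.mean_ = None

    def partial_fit(self, xs):
        # Update the running mean one observation at a time.
        for x in xs:
            if self.mean_ is None:
                self.mean_ = x
            else:
                self.mean_ = self.alpha * x + (1 - self.alpha) * self.mean_
        return self


def walk_forward_score(estimator, data, warmup_size, test_size):
    """Negative mean squared forecast error over all test windows."""
    estimator.partial_fit(data[:warmup_size])
    errors = []
    start = warmup_size
    while start + test_size <= len(data):
        window = data[start:start + test_size]
        # Score on unseen data first, then update the model with it.
        errors.extend((x - estimator.mean_) ** 2 for x in window)
        estimator.partial_fit(window)
        start += test_size
    return -sum(errors) / len(errors)


# Made-up series with a level shift after the warm-up period.
data = [1.0, 1.2, 0.9, 1.1, 3.0, 3.1, 2.9, 3.2, 3.0, 3.1]
scores = {
    alpha: walk_forward_score(EWMean(alpha), data, warmup_size=4, test_size=2)
    for alpha in (0.1, 0.9)
}
best_alpha = max(scores, key=scores.get)
```

Because every candidate has been updated with each test window by the end of its pass, the winning candidate has already seen the full sample when the search finishes.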

Parameters:
estimator : BaseEstimator

Estimator that supports partial_fit.

param_grid : dict or list[dict]

Dictionary with parameter names (str) as keys and lists of parameter settings to try as values, or a list of such dictionaries, in which case the grids spanned by each dictionary in the list are explored. This enables searching over any sequence of parameter settings.
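
The expansion of a grid into candidates can be sketched with the standard library. `expand_grid` is a hypothetical helper written for this illustration, and the ordering of candidates it produces is an assumption, not necessarily the order used by the search.

```python
from itertools import product


def expand_grid(param_grid):
    """Return the list of candidate parameter dicts spanned by param_grid."""
    grids = [param_grid] if isinstance(param_grid, dict) else list(param_grid)
    candidates = []
    for grid in grids:
        names = sorted(grid)
        # Cartesian product of the value lists within one dictionary.
        for values in product(*(grid[name] for name in names)):
            candidates.append(dict(zip(names, values)))
    return candidates


# One dictionary: every combination of the two value lists.
single = expand_grid({"half_life": [20, 40, 60], "alpha": [0.1, 0.5]})

# A list of dictionaries: each grid is expanded separately, then concatenated.
multiple = expand_grid([{"half_life": [20, 40]}, {"alpha": [0.1, 0.5, 0.9]}])
```

With a list of dictionaries, parameters from different dictionaries are never combined with each other, which keeps the number of candidates down when some settings only make sense together.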

scoring : callable, dict, BaseMeasure, or None

Scoring specification. Semantics depend on the estimator type:

  • Component estimators (e.g. covariance, expected returns): None uses estimator.score; otherwise pass a callable scorer(estimator, X_test) or a dict of such callables.

  • Portfolio optimization estimators: a BaseMeasure or a dict of measures. None defaults to SHARPE_RATIO.

For portfolio optimization estimators, online evaluation scores the aggregated out-of-sample MultiPeriodPortfolio, rather than scoring each test window independently and averaging as in GridSearchCV. Pass the measure enum directly; make_scorer is not supported.
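
The difference between scoring the aggregated out-of-sample path and averaging per-window scores can be seen on a small numeric example. This is only a sketch: the returns are made up, and `sharpe` is a simplified, unannualized ratio defined here for illustration rather than the library's measure.

```python
from statistics import mean, stdev


def sharpe(returns):
    """Unannualized Sharpe ratio: mean over sample standard deviation."""
    return mean(returns) / stdev(returns)


# Out-of-sample returns from three consecutive test windows (made-up numbers).
windows = [
    [0.010, 0.012, 0.011],
    [-0.020, 0.015, 0.005],
    [0.003, -0.004, 0.008],
]

# Score each window independently, then average the scores.
averaged = mean(sharpe(w) for w in windows)

# Concatenate the windows into one path and score it once.
aggregated = sharpe([r for w in windows for r in w])
```

In this example a single quiet window dominates the average, while the aggregated score weighs every out-of-sample return in the same way.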

warmup_size : int, default=252

Number of initial observations (or periods when freq is set) used for the first partial_fit call.

test_size : int, default=1

Number of observations (or periods when freq is set) per test window.

freq : str | pandas.offsets.BaseOffset, optional

Rebalancing frequency. When provided, warmup_size and test_size are interpreted as period counts rather than observation counts, and X must be a DataFrame with a DatetimeIndex. See WalkForward for details and examples.

freq_offset : pandas.offsets.BaseOffset | datetime.timedelta, optional

Offset applied to the freq boundaries. Only used when freq is provided.

previous : bool, default=False

Only used when freq is provided. If True, period boundaries that fall between observations snap to the previous observation; otherwise they snap to the next.
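
The snapping behavior can be pictured with a sorted list of observation dates. `snap` is a hypothetical helper, and treating a boundary that coincides with an observation as already snapped is an assumption of this sketch.

```python
from bisect import bisect_left, bisect_right
from datetime import date


def snap(boundary, observations, previous=False):
    """Snap a boundary date to an observation date in a sorted list."""
    if previous:
        # Last observation on or before the boundary.
        idx = bisect_right(observations, boundary) - 1
    else:
        # First observation on or after the boundary.
        idx = bisect_left(observations, boundary)
    if idx < 0 or idx >= len(observations):
        raise ValueError("boundary lies outside the observed range")
    return observations[idx]


# Trading days around a long weekend; the month-end boundary is a Saturday.
observations = [date(2024, 8, 29), date(2024, 8, 30), date(2024, 9, 3)]
boundary = date(2024, 8, 31)

snapped_previous = snap(boundary, observations, previous=True)
snapped_next = snap(boundary, observations, previous=False)
```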

purged_size : int, default=0

Number of observations (or periods when freq is set) to skip between the last data the model sees and the start of the test window.

reduce_test : bool, default=False

If True, the last test window is included even when it contains fewer observations than test_size.
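
How warmup_size, test_size, purged_size and reduce_test interact can be sketched with plain observation indices. `walk_forward_windows` is a hypothetical generator written for this illustration; it assumes the model advances by test_size observations at each step and ignores freq, so the exact boundaries used by the library may differ.

```python
def walk_forward_windows(
    n_observations, warmup_size, test_size, purged_size=0, reduce_test=False
):
    """Yield (train_end, test_start, test_end) index triples."""
    train_end = warmup_size
    while True:
        # Skip purged_size observations after the last one the model saw.
        test_start = train_end + purged_size
        test_end = test_start + test_size
        if test_start >= n_observations:
            break
        if test_end > n_observations:
            if not reduce_test:
                break
            test_end = n_observations  # shortened final window
        yield train_end, test_start, test_end
        train_end += test_size


full = list(walk_forward_windows(12, warmup_size=5, test_size=3))
reduced = list(
    walk_forward_windows(12, warmup_size=5, test_size=3, reduce_test=True)
)
purged = list(
    walk_forward_windows(12, warmup_size=5, test_size=3, purged_size=1)
)
```

Here the model has seen observations `[0, train_end)` when it is evaluated on `[test_start, test_end)`.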

refit : bool, str, or callable, default=True

Controls how the best candidate is selected and whether the selected fitted candidate is exposed as best_estimator_.

This parameter is named for API alignment with scikit-learn. Unlike scikit-learn search estimators, enabling refit does not trigger an additional fit after model selection because each candidate is already evaluated through a full online walk-forward pass and updated through the full sample.

  • Single-metric scoring: True and False are both supported. If False, best_estimator_ is not stored, but best_index_, best_params_, and best_score_ remain available.

  • Multi-metric scoring: set to a scorer name to select the best candidate for that metric, or to False to disable best-candidate selection and storage of best_estimator_.

  • A callable receives cv_results_ and must return the best candidate index.
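
A callable for refit might look like the sketch below, which trades a little score for speed. `fastest_near_best` and its `tolerance` argument are hypothetical; the sketch assumes single-metric scoring (so the score key is `mean_score`) and does not handle NaN scores.

```python
def fastest_near_best(cv_results, tolerance=0.05):
    """Return the index of the quickest candidate within tolerance of the best."""
    scores = list(cv_results["mean_score"])
    times = list(cv_results["fit_time"])
    best = max(scores)
    # Keep candidates whose score is close enough to the best one.
    eligible = [i for i, s in enumerate(scores) if s >= best - tolerance]
    return min(eligible, key=lambda i: times[i])


# Stand-in for cv_results_ (made-up values; the real values are arrays).
cv_results = {
    "params": [{"half_life": 20}, {"half_life": 40}, {"half_life": 60}],
    "mean_score": [0.91, 0.95, 0.93],
    "fit_time": [1.2, 3.4, 0.8],
}
best_index = fastest_near_best(cv_results)
```

Such a function would be passed as `refit=fastest_near_best` when constructing the search.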

error_score : "raise" or float, default=np.nan

Value to assign to the score if an error occurs during fitting. If set to "raise", the error is raised.
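
The effect of error_score can be sketched as a wrapper around the evaluation of a single candidate. `safe_score` and `evaluate` are hypothetical helpers for illustration only and do not reflect how the library is implemented.

```python
import math


def safe_score(evaluate, params, error_score=math.nan):
    """Evaluate one candidate, substituting error_score if it fails."""
    try:
        return evaluate(params)
    except Exception:
        if error_score == "raise":
            raise
        return error_score


def evaluate(params):
    # Hypothetical evaluation that rejects non-positive half-lives.
    if params["half_life"] <= 0:
        raise ValueError("half_life must be positive")
    return 1.0 / params["half_life"]


# The failing candidate gets NaN instead of aborting the whole search.
scores = [safe_score(evaluate, {"half_life": h}) for h in (20, 0, 40)]
```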

return_predictions : bool, default=False

If True, store MultiPeriodPortfolio objects per candidate in cv_results_["predictions"]. Only applies to portfolio optimization estimators.

portfolio_params : dict, optional

Parameters forwarded to MultiPeriodPortfolio when scoring portfolio estimators.

n_jobs : int or None, default=None

Number of parallel jobs. None means 1.

verbose : int, default=0

Verbosity level for joblib.Parallel.

Attributes:
cv_results_ : dict[str, ndarray]

A dict with keys:

  • params: list of candidate parameter dicts.

  • mean_score: array of aggregate scores (or mean_score_<name> for multi-metric).

  • rank: array of ranks where 1 is best (or rank_<name> for multi-metric).

  • fit_time: array of wall-clock times.

  • predictions: object array of MultiPeriodPortfolio or None aligned with candidates (only when return_predictions=True and the estimator is portfolio-based).
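
Since the entries of cv_results_ are aligned by candidate, they can be zipped together for inspection. The sketch below uses a hand-written stand-in with the single-metric keys listed above; the values are made up and `ranked_candidates` is a hypothetical helper.

```python
def ranked_candidates(cv_results):
    """Return (rank, params, score) tuples ordered from best to worst."""
    rows = zip(
        cv_results["rank"], cv_results["params"], cv_results["mean_score"]
    )
    return sorted(rows, key=lambda row: row[0])


# Stand-in for a single-metric cv_results_ (made-up values).
cv_results = {
    "params": [{"half_life": 20}, {"half_life": 40}, {"half_life": 60}],
    "mean_score": [0.91, 0.95, 0.93],
    "rank": [3, 1, 2],
    "fit_time": [1.2, 3.4, 0.8],
}
table = ranked_candidates(cv_results)
```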

best_estimator_ : BaseEstimator

Estimator with the best parameters, updated through the full sample during its online walk-forward pass (no additional fit is performed). Only available when refit is not False.

best_score_ : float

Aggregate score of the selected best candidate. Available when best_index_ is defined and refit is not callable.

best_params_ : dict

Parameter setting that gave the selected best score. Available when best_index_ is defined.

best_index_ : int

Index into cv_results_ of the best candidate. Available for single-metric scoring and for multi-metric scoring when refit is not False.

multimetric_ : bool

Whether or not the scorers compute several metrics.

is_portfolio_estimator_ : bool

Whether or not the estimator is a portfolio optimization estimator.

Methods

fit(X[, y])

Run the online search over all candidate parameter combinations.

get_metadata_routing()

Get metadata routing of this object.

get_params([deep])

Get parameters for this estimator.

predict(X)

Predict using the best estimator found during search.

score(X[, y])

Score using the best estimator found during search.

set_params(**params)

Set the parameters of this estimator.

See also

Online Covariance Hyperparameter Tuning

Exhaustive online tuning of covariance estimator hyperparameters.

Online Evaluation of Portfolio Optimization

Exhaustive online tuning of a MeanRisk estimator.

Examples

>>> from skfolio.datasets import load_sp500_dataset
>>> from skfolio.model_selection import OnlineGridSearch
>>> from skfolio.moments import EWCovariance, EWMu
>>> from skfolio.optimization import MeanRisk
>>> from skfolio.preprocessing import prices_to_returns
>>> from skfolio.prior import EmpiricalPrior
>>>
>>> prices = load_sp500_dataset()
>>> X = prices_to_returns(prices)
>>>
>>> model = MeanRisk(
...     prior_estimator=EmpiricalPrior(
...         mu_estimator=EWMu(),
...         covariance_estimator=EWCovariance(),
...     ),
... )
>>> search = OnlineGridSearch(
...     model,
...     param_grid={
...         "prior_estimator__mu_estimator__half_life": [20, 40, 60],
...         "prior_estimator__covariance_estimator__half_life": [20, 40, 60],
...     },
...     warmup_size=252,
...     test_size=5,
...     n_jobs=-1,
... )
>>> search.fit(X)
>>> search.best_params_
>>> search.best_estimator_
fit(X, y=None, **fit_params)

Run the online search over all candidate parameter combinations.

Parameters:
X : array-like of shape (n_observations, n_assets)

Price returns.

y : array-like, optional

Optional target.

**fit_params

Additional parameters routed via metadata routing.

Returns:
self
get_metadata_routing()

Get metadata routing of this object.

See the User Guide for how the routing mechanism works.

Returns:
routing : MetadataRequest

A MetadataRequest encapsulating routing information.

get_params(deep=True)

Get parameters for this estimator.

Parameters:
deep : bool, default=True

If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns:
params : dict

Parameter names mapped to their values.

predict(X)

Predict using the best estimator found during search.

Parameters:
X : array-like of shape (n_observations, n_assets)

Price returns.

Returns:
prediction : Portfolio | Population
score(X, y=None)

Score using the best estimator found during search.

Parameters:
X : array-like of shape (n_observations, n_assets)

Price returns.

y : Ignored

Present for scikit-learn API compatibility.

Returns:
score : float
set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Parameters:
**params : dict

Estimator parameters.

Returns:
self : estimator instance

Estimator instance.
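
The `<component>__<parameter>` naming used by set_params can be illustrated with a toy class that resolves one path segment at a time. `Component` is a hypothetical stand-in written for this sketch, not a skfolio or scikit-learn class.

```python
class Component:
    """Minimal stand-in for an estimator with nested parameters (hypothetical)."""

    def __init__(self, **params):
        self.__dict__.update(params)

    def set_params(self, **params):
        for key, value in params.items():
            name, delim, sub_key = key.partition("__")
            if delim:
                # Delegate the remainder of the path to the nested component.
                getattr(self, name).set_params(**{sub_key: value})
            else:
                setattr(self, name, value)
        return self


model = Component(
    prior_estimator=Component(mu_estimator=Component(half_life=20)),
)
model.set_params(prior_estimator__mu_estimator__half_life=40)
```

The same double-underscore paths are what the keys of param_grid refer to when tuning a nested estimator.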