skfolio.model_selection.OnlineRandomizedSearch
- class skfolio.model_selection.OnlineRandomizedSearch(estimator, param_distributions, *, n_iter=10, scoring=None, warmup_size=252, test_size=1, freq=None, freq_offset=None, previous=False, purged_size=0, reduce_test=False, refit=True, random_state=None, error_score=nan, return_predictions=False, portfolio_params=None, n_jobs=None, verbose=0)
Online randomized search over hyperparameters.
Each sampled parameter combination is evaluated by running a full online walk-forward pass. Unlike OnlineGridSearch, not all parameters are tried out, but a fixed number of parameter settings are sampled from the specified distributions. The number of parameter settings that are tried is given by n_iter.
If all parameters are presented as a list, sampling without replacement is performed. If at least one parameter is given as a distribution, sampling with replacement is used. It is highly recommended to use continuous distributions for continuous parameters.
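The sampling rule described above can be sketched in plain Python. This is a simplified illustration under stated assumptions, not skfolio's implementation: sample_candidates and the Uniform class are hypothetical names, Uniform stands in for a scipy.stats distribution exposing an rvs method, and the parameter name alpha is made up for the example.

```python
import itertools
import random


class Uniform:
    """Hypothetical stand-in for a continuous distribution with an ``rvs`` method."""

    def __init__(self, low, high):
        self.low, self.high = low, high

    def rvs(self, rng):
        # Draw one value uniformly from [low, high].
        return rng.uniform(self.low, self.high)


def sample_candidates(param_distributions, n_iter, seed=None):
    """Illustrative sampler returning a list of candidate parameter dicts."""
    rng = random.Random(seed)
    names = sorted(param_distributions)
    all_lists = all(isinstance(param_distributions[n], list) for n in names)

    if all_lists:
        # Every parameter is a list: draw distinct points from the finite grid
        # (sampling without replacement), capped at the grid size.
        grid = list(itertools.product(*(param_distributions[n] for n in names)))
        picks = rng.sample(grid, min(n_iter, len(grid)))
        return [dict(zip(names, pick)) for pick in picks]

    # At least one distribution: every draw is independent (with replacement).
    candidates = []
    for _ in range(n_iter):
        candidate = {}
        for n in names:
            spec = param_distributions[n]
            candidate[n] = rng.choice(spec) if isinstance(spec, list) else spec.rvs(rng)
        candidates.append(candidate)
    return candidates


# All lists: four distinct combinations out of the six grid points.
grid_only = sample_candidates({"half_life": [20, 40, 60], "alpha": [0.1, 0.5]}, n_iter=4, seed=0)
# One distribution: four independent draws, duplicates are possible in principle.
mixed = sample_candidates({"half_life": Uniform(10, 100), "alpha": [0.1, 0.5]}, n_iter=4, seed=0)
```

With a continuous distribution, repeated draws almost never coincide, which is one reason continuous distributions are recommended for continuous parameters.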
- Parameters:
- estimator : BaseEstimator
Estimator that supports partial_fit.
- param_distributions : dict or list of dicts
Dictionary with parameter names (str) as keys and distributions or lists of parameters to try as values. Distributions must provide an rvs method for sampling (such as those from scipy.stats.distributions). If a list is given, it is sampled uniformly. If a list of dicts is given, first a dict is sampled uniformly, and then a parameter is sampled using that dict as above.
- n_iter : int, default=10
Number of parameter settings that are sampled. n_iter trades off runtime vs quality of the solution.
- scoring : callable, dict, BaseMeasure, or None
Scoring specification. Semantics depend on the estimator type:
Component estimators (e.g. covariance, expected returns): None uses estimator.score; otherwise pass a callable scorer(estimator, X_test) or a dict of such callables.
Portfolio optimization estimators: a BaseMeasure or a dict of measures. None defaults to SHARPE_RATIO.
For portfolio optimization estimators, online evaluation scores the aggregated out-of-sample MultiPeriodPortfolio, rather than scoring each test window independently and averaging as in GridSearchCV. Pass the measure enum directly; make_scorer is not supported.
- warmup_size : int, default=252
Number of initial observations (or periods when freq is set) used for the first partial_fit call.
- test_size : int, default=1
Number of observations (or periods when freq is set) per test window.
- freq : str | pandas.offsets.BaseOffset, optional
Rebalancing frequency. When provided, warmup_size and test_size are interpreted as period counts rather than observation counts, and X must be a DataFrame with a DatetimeIndex. See WalkForward for details and examples.
- freq_offset : pandas.offsets.BaseOffset | datetime.timedelta, optional
Offset applied to the freq boundaries. Only used when freq is provided.
- previous : bool, default=False
Only used when freq is provided. If True, period boundaries that fall between observations snap to the previous observation; otherwise they snap to the next.
- purged_size : int, default=0
Number of observations (or periods) to skip between the last data the model sees and the start of the test window.
- reduce_test : bool, default=False
If True, the last test window is included even when it contains fewer observations than test_size.
- refit : bool, str, or callable, default=True
Controls how the best candidate is selected and whether the selected fitted candidate is exposed as best_estimator_.
This parameter is named for API alignment with scikit-learn. Unlike scikit-learn search estimators, enabling refit does not trigger an additional fit after model selection because each candidate is already evaluated through a full online walk-forward pass and updated through the full sample.
Single-metric scoring: True or False are both supported. If False, best_estimator_ is not stored, but best_index_, best_params_, and best_score_ remain available.
Multi-metric scoring: set to a scorer name to select the best candidate for that metric, or to False to disable best-candidate selection and storage of best_estimator_.
A callable receives cv_results_ and must return the best candidate index.
- random_state : int, RandomState instance or None, default=None
Pseudo random number generator state used for random uniform sampling from lists of possible values instead of scipy.stats distributions. Pass an int for reproducible output across multiple function calls.
- error_score : "raise" or float, default=np.nan
Value to assign to the score if an error occurs during fitting. If set to "raise", the error is raised.
- return_predictions : bool, default=False
If True, store MultiPeriodPortfolio objects per candidate in cv_results_["predictions"]. Only applies to portfolio optimization estimators.
- portfolio_params : dict, optional
Parameters forwarded to MultiPeriodPortfolio when scoring portfolio estimators.
- n_jobs : int or None, default=None
Number of parallel jobs. None means 1.
- verbose : int, default=0
Verbosity level for joblib.Parallel.
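To make the interplay of warmup_size, test_size, purged_size and reduce_test concrete, the following is a minimal sketch of an observation-based walk-forward schedule (freq not set). It is an illustration under assumptions, not skfolio's code: walk_forward_windows is a hypothetical helper, and it assumes the model's data frontier advances by test_size observations after each test window.

```python
def walk_forward_windows(n_observations, warmup_size=252, test_size=1,
                         purged_size=0, reduce_test=False):
    """Yield (seen_end, test_start, test_end) index triples, end-exclusive.

    ``seen_end`` is the number of observations the model has been fitted on
    before the test window ``[test_start, test_end)`` is evaluated.
    """
    seen_end = warmup_size  # the first partial_fit call covers the warmup block
    while True:
        # Skip the purged observations between seen data and the test window.
        test_start = seen_end + purged_size
        test_end = test_start + test_size
        if test_start >= n_observations:
            break
        if test_end > n_observations:
            if not reduce_test:
                break  # drop the incomplete last window
            test_end = n_observations  # keep a shortened last window
        yield seen_end, test_start, test_end
        seen_end += test_size  # assumed: the model is then updated with new data


full_only = list(walk_forward_windows(20, warmup_size=10, test_size=4, purged_size=1))
# → [(10, 11, 15), (14, 15, 19)]
with_partial = list(
    walk_forward_windows(20, warmup_size=10, test_size=4, purged_size=1, reduce_test=True)
)
# → [(10, 11, 15), (14, 15, 19), (18, 19, 20)]
```

In this toy schedule the last window holds a single observation and is only kept when reduce_test is True.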
- Attributes:
- cv_results_ : dict[str, ndarray]
A dict with keys:
params: list of candidate parameter dicts.
mean_score: array of aggregate scores (or mean_score_<name> for multi-metric).
rank: array of ranks where 1 is best (or rank_<name> for multi-metric).
fit_time: array of wall-clock times.
predictions: object array of MultiPeriodPortfolio or None aligned with candidates (only when return_predictions=True and the estimator is portfolio-based).
- best_estimator_ : BaseEstimator
Estimator fitted on the full data with the best parameters. Only available when refit is not False.
- best_score_ : float
Aggregate score of the selected best candidate. Available when best_index_ is defined and refit is not callable.
- best_params_ : dict
Parameter setting that gave the selected best score. Available when best_index_ is defined.
- best_index_ : int
Index into cv_results_ of the best candidate. Available for single-metric scoring and for multi-metric scoring when refit is not False.
- multimetric_ : bool
Whether or not the scorers compute several metrics.
- is_portfolio_estimator_ : bool
Whether or not the estimator is a portfolio optimization estimator.
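As an illustration of a callable refit, the sketch below picks the fastest candidate whose score is close to the best one. It relies only on the documented single-metric keys mean_score and fit_time. The function name fastest_near_best, the tolerance argument, and the mock results are hypothetical, and the sketch assumes that a higher score is better.

```python
import math


def fastest_near_best(cv_results, tolerance=0.05):
    """Return the index of the quickest candidate within ``tolerance`` of the best score."""
    scores = [float(s) for s in cv_results["mean_score"]]
    times = [float(t) for t in cv_results["fit_time"]]
    # Ignore candidates whose score is NaN (e.g. failed fits with error_score=nan).
    valid = [i for i, s in enumerate(scores) if not math.isnan(s)]
    if not valid:
        raise ValueError("All candidates have NaN scores.")
    best = max(scores[i] for i in valid)
    eligible = [i for i in valid if scores[i] >= best - tolerance]
    return min(eligible, key=lambda i: times[i])


# Mock results mimicking the documented keys (values are made up).
mock_results = {
    "params": [{"half_life": 20}, {"half_life": 40}, {"half_life": 60}],
    "mean_score": [0.90, 0.93, float("nan")],
    "fit_time": [1.2, 3.4, 0.8],
}
best_index = fastest_near_best(mock_results)  # → 0
```

Under these assumptions the function could be passed as refit=fastest_near_best. Note that best_score_ is documented as unavailable when refit is a callable.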
Methods
fit(X[, y]): Run the online search over all candidate parameter combinations.
get_metadata_routing(): Get metadata routing of this object.
get_params([deep]): Get parameters for this estimator.
predict(X): Predict using the best estimator found during search.
score(X[, y]): Score using the best estimator found during search.
set_params(**params): Set the parameters of this estimator.
See also
- Online Covariance Hyperparameter Tuning
Randomized online tuning of covariance estimator hyperparameters.
Examples
>>> from scipy.stats import uniform
>>> from skfolio.datasets import load_sp500_dataset
>>> from skfolio.model_selection import OnlineRandomizedSearch
>>> from skfolio.moments import EWCovariance, EWMu
>>> from skfolio.optimization import MeanRisk
>>> from skfolio.preprocessing import prices_to_returns
>>> from skfolio.prior import EmpiricalPrior
>>>
>>> prices = load_sp500_dataset()
>>> X = prices_to_returns(prices)
>>>
>>> model = MeanRisk(
...     prior_estimator=EmpiricalPrior(
...         mu_estimator=EWMu(),
...         covariance_estimator=EWCovariance(),
...     ),
... )
>>> search = OnlineRandomizedSearch(
...     model,
...     param_distributions={
...         "prior_estimator__mu_estimator__half_life": uniform(10, 90),
...         "prior_estimator__covariance_estimator__half_life": uniform(10, 90),
...     },
...     n_iter=20,
...     warmup_size=252,
...     test_size=5,
...     n_jobs=-1,
...     random_state=42,
... )
>>> search.fit(X)
>>> search.best_params_
>>> search.best_estimator_
- fit(X, y=None, **fit_params)
Run the online search over all candidate parameter combinations.
- Parameters:
- X : array-like of shape (n_observations, n_assets)
Price returns.
- y : array-like, optional
Optional target.
- **fit_params
Additional parameters routed via metadata routing.
- Returns:
- self
- get_metadata_routing()
Get metadata routing of this object.
Please check the User Guide on how the routing mechanism works.
- Returns:
- routing : MetadataRequest
A MetadataRequest encapsulating routing information.
- get_params(deep=True)
Get parameters for this estimator.
- Parameters:
- deep : bool, default=True
If True, will return the parameters for this estimator and contained subobjects that are estimators.
- Returns:
- params : dict
Parameter names mapped to their values.
- predict(X)
Predict using the best estimator found during search.
- Parameters:
- X : array-like of shape (n_observations, n_assets)
Price returns.
- Returns:
- prediction : Portfolio | Population
- score(X, y=None)
Score using the best estimator found during search.
- Parameters:
- X : array-like of shape (n_observations, n_assets)
Price returns.
- y : Ignored
Present for scikit-learn API compatibility.
- Returns:
- score : float
- set_params(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it's possible to update each component of a nested object.
- Parameters:
- **params : dict
Estimator parameters.
- Returns:
- self : estimator instance
Estimator instance.
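The <component>__<parameter> convention can be pictured as splitting each key at its first double underscore and forwarding the remainder to the named component. The sketch below only illustrates that naming scheme: split_nested_params is a hypothetical helper, not part of skfolio or scikit-learn, and the example assumes the wrapped model is reachable under the estimator parameter shown in the class signature.

```python
def split_nested_params(params):
    """Group ``<component>__<parameter>`` keys by their leading component."""
    own, nested = {}, {}
    for key, value in params.items():
        # Split at the first "__" only; deeper levels stay in the remainder
        # and would be resolved by the component itself.
        name, delim, sub_key = key.partition("__")
        if delim:
            nested.setdefault(name, {})[sub_key] = value
        else:
            own[key] = value
    return own, nested


own, nested = split_nested_params({
    "n_iter": 20,
    "estimator__prior_estimator__mu_estimator__half_life": 40,
})
# own    → {"n_iter": 20}
# nested → {"estimator": {"prior_estimator__mu_estimator__half_life": 40}}
```

Each level strips one component name, so a deeply nested parameter is reached by chaining component names with double underscores.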