skfolio.model_selection.cross_val_predict

skfolio.model_selection.cross_val_predict(estimator, X, y=None, cv=None, n_jobs=None, method='predict', verbose=0, params=None, pre_dispatch='2*n_jobs', column_indices=None, portfolio_params=None)

Generate cross-validated portfolio estimates.

The data is split according to the cv parameter. The optimization estimator is fitted on the training set and portfolios are predicted on the corresponding test set.

For single-path cross-validation such as KFold or WalkForward, the output is a MultiPeriodPortfolio where each Portfolio corresponds to a train/test split (k portfolios for KFold).

For multi-path cross-validation such as CombinatorialPurgedCV or MultipleRandomizedCV, the output is a Population of multiple MultiPeriodPortfolio objects (each test produces a collection of paths rather than a single path).

If the final estimator in the pipeline (or the estimator itself) declares needs_previous_weights=True, this function automatically propagates previous_weights from one fold to the next for sequential CV strategies (e.g., WalkForward or MultipleRandomizedCV).
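The fold-to-fold propagation described above can be pictured with a small stand-alone sketch. It does not call skfolio: `ToyRebalancer`, `sequential_fit`, and the blending rule are hypothetical names and logic introduced only for illustration, and the way the flag is declared here (a class attribute) is an assumption. The point is that each fold receives the weights produced by the previous fold, so the folds have to be processed in order.

```python
class ToyRebalancer:
    """Hypothetical estimator that moves halfway from the previous weights
    toward equal weights. Not part of skfolio."""

    # Assumption: the flag is modelled here as a plain class attribute.
    needs_previous_weights = True

    def __init__(self, previous_weights=None):
        self.previous_weights = previous_weights

    def fit(self, X):
        n_assets = len(X[0])
        target = [1.0 / n_assets] * n_assets
        # With no history, start fully invested in the first asset.
        prev = self.previous_weights or [1.0] + [0.0] * (n_assets - 1)
        self.weights_ = [0.5 * p + 0.5 * t for p, t in zip(prev, target)]
        return self


def sequential_fit(X, splits):
    """Fit one estimator per fold, handing each fold the weights of the last."""
    previous_weights = None
    fold_weights = []
    for train, _test in splits:
        estimator = ToyRebalancer(previous_weights=previous_weights)
        estimator.fit([X[i] for i in train])
        fold_weights.append(estimator.weights_)
        previous_weights = estimator.weights_  # carried into the next fold
    return fold_weights


# Six observations of two assets, split into two consecutive folds.
X = [[0.01, -0.02], [0.00, 0.01], [0.02, 0.00],
     [-0.01, 0.01], [0.01, 0.01], [0.00, -0.01]]
splits = [([0, 1], [2, 3]), ([2, 3], [4, 5])]
fold_weights = sequential_fit(X, splits)
```

In this toy setup the second fold starts from the first fold's weights rather than from scratch, which is the dependency that makes the folds sequential.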

Parameters:
estimator : BaseEstimator | Pipeline

Estimator or pipeline whose last step is an optimization estimator.

X : array-like of shape (n_observations, n_assets)

Price returns of the assets.

y : array-like of shape (n_observations, n_targets), optional

Target data (optional). For example, the price returns of the factors.

cv : int | cross-validation generator, optional

Determines the cross-validation splitting strategy. Possible inputs for cv are:

  • None, to use the default 5-fold cross validation,

  • int, to specify the number of folds in a (Stratified)KFold,

  • CV splitter,

  • An iterable that generates (train, test) splits as arrays of indices.
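As an illustration of the last option, the following sketch builds an iterable of (train, test) index pairs with an expanding training window. The function name `expanding_window_splits` and its parameters are hypothetical, and plain Python lists stand in for the index arrays that would normally be used.

```python
def expanding_window_splits(n_observations, initial_train, test_size):
    """Yield (train, test) index lists with a growing training window.

    Hypothetical helper for illustration only. Each test block directly
    follows its training block, and test blocks do not overlap.
    """
    start = initial_train
    while start + test_size <= n_observations:
        train = list(range(0, start))
        test = list(range(start, start + test_size))
        yield train, test
        start += test_size


# 10 observations, 4 used for the first fit, then test blocks of 3.
splits = list(expanding_window_splits(n_observations=10, initial_train=4, test_size=3))
```

The last observation index that does not fit into a full test block is simply left out, which is one possible design choice among several.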

n_jobs : int, optional

The number of jobs to run in parallel for fit of all estimators. None means 1 unless in a joblib.parallel_backend context. -1 means using all processors.

method : str, default='predict'

Name of the estimator method to invoke on each test set.

verbose : int, default=0

The verbosity level.

params : dict, optional

Parameters to pass to the underlying estimator’s fit and the CV splitter.

pre_dispatch : int or str, default='2*n_jobs'

Controls the number of jobs that get dispatched during parallel execution. Reducing this number can be useful to avoid an explosion of memory consumption when more jobs get dispatched than CPUs can process. This parameter can be:

  • None, in which case all the jobs are immediately created and spawned. Use this for lightweight and fast-running jobs, to avoid delays due to on-demand spawning of the jobs

  • An int, giving the exact number of total jobs that are spawned

  • A str, giving an expression as a function of n_jobs, as in '2*n_jobs'
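To make the three accepted forms concrete, here is a rough sketch of how such a value could be resolved to a job count. It is not the resolution code used by the parallel backend: `resolve_pre_dispatch` is a hypothetical helper, it assumes `n_jobs` has already been resolved to a positive integer, and it supports only simple integer arithmetic on `n_jobs`.

```python
import ast
import operator

# Arithmetic operators allowed inside the expression string.
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.FloorDiv: operator.floordiv,
}


def _evaluate(node, n_jobs):
    """Evaluate a tiny arithmetic expression tree over the name n_jobs."""
    if isinstance(node, ast.Constant) and isinstance(node.value, int):
        return node.value
    if isinstance(node, ast.Name) and node.id == "n_jobs":
        return n_jobs
    if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
        left = _evaluate(node.left, n_jobs)
        right = _evaluate(node.right, n_jobs)
        return _OPS[type(node.op)](left, right)
    raise ValueError("unsupported pre_dispatch expression")


def resolve_pre_dispatch(pre_dispatch, n_jobs):
    """Return the number of jobs to dispatch up front, or None for all."""
    if pre_dispatch is None:
        return None  # every job is created immediately
    if isinstance(pre_dispatch, int):
        return pre_dispatch  # exact number of jobs
    tree = ast.parse(pre_dispatch, mode="eval")
    return _evaluate(tree.body, n_jobs)


default_for_four_workers = resolve_pre_dispatch("2*n_jobs", n_jobs=4)
```

With four workers, the default expression keeps eight jobs queued at a time instead of materializing every fold at once.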

column_indices : ndarray, optional

Indices of the X columns to cross-validate on.

portfolio_params : dict, optional

Additional portfolio parameters passed to MultiPeriodPortfolio.

Returns:
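predictions : MultiPeriodPortfolio | Population

The result of calling method (predict by default) on each test set: a MultiPeriodPortfolio for single-path cross-validation, or a Population for multi-path cross-validation.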