skfolio.model_selection.cross_val_predict
- skfolio.model_selection.cross_val_predict(estimator, X, y=None, cv=None, n_jobs=None, method='predict', verbose=0, params=None, pre_dispatch='2*n_jobs', column_indices=None, portfolio_params=None)
Generate cross-validated Portfolio estimates.
The data is split according to the cv parameter. The optimization estimator is fitted on the training set and portfolios are predicted on the corresponding test set.
For non-combinatorial cross-validation like KFold, the output is the predicted MultiPeriodPortfolio, where each Portfolio corresponds to the prediction on one train/test pair (k portfolios for KFold).
For combinatorial cross-validation like CombinatorialPurgedCV, the output is the predicted Population of multiple MultiPeriodPortfolio (the test outputs form a collection of multiple paths instead of a single path).
- Parameters:
- estimator : BaseOptimization
Optimization estimator used to fit the data.
- X : array-like of shape (n_observations, n_assets)
Price returns of the assets.
- y : array-like of shape (n_observations, n_targets), optional
Target data (optional). For example, the price returns of the factors.
- cv : int | cross-validation generator, optional
Determines the cross-validation splitting strategy. Possible inputs for cv are:
None, to use the default 5-fold cross-validation,
int, to specify the number of folds in a (Stratified)KFold,
CV splitter,
An iterable that generates (train, test) splits as arrays of indices.
- n_jobs : int, optional
The number of jobs to run in parallel for fit of all estimators. None means 1 unless in a joblib.parallel_backend context. -1 means using all processors.
- method : str
Invokes the passed method name of the passed estimator.
- verbose : int, default=0
The verbosity level.
- params : dict, optional
Parameters to pass to the underlying estimator’s fit and the CV splitter.
- pre_dispatch : int or str, default='2*n_jobs'
Controls the number of jobs that get dispatched during parallel execution. Reducing this number can be useful to avoid an explosion of memory consumption when more jobs get dispatched than CPUs can process. This parameter can be:
None, in which case all the jobs are immediately created and spawned. Use this for lightweight and fast-running jobs, to avoid delays due to on-demand spawning of the jobs.
An int, giving the exact number of total jobs that are spawned.
A str, giving an expression as a function of n_jobs, as in '2*n_jobs'.
- column_indices : ndarray, optional
Indices of the X columns to cross-validate on.
- portfolio_params : dict, optional
Additional portfolio parameters passed to MultiPeriodPortfolio.
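The cv parameter accepts, among other inputs, an iterable that generates (train, test) splits as arrays of indices. As a rough illustration of what such an iterable produces, the sketch below yields unshuffled, contiguous folds. The name contiguous_folds is hypothetical and not part of the library, and plain Python lists stand in for index arrays; whether lists are accepted in place of arrays is not verified here.

```python
from typing import Iterator, List, Tuple


def contiguous_folds(
    n_observations: int, n_folds: int
) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train, test) index lists for unshuffled, contiguous folds.

    Hypothetical helper for illustration only; it is not a library function.
    """
    if not 2 <= n_folds <= n_observations:
        raise ValueError("n_folds must be between 2 and n_observations")
    base, extra = divmod(n_observations, n_folds)
    start = 0
    for fold in range(n_folds):
        # The first `extra` folds receive one additional observation.
        stop = start + base + (1 if fold < extra else 0)
        test = list(range(start, stop))
        # Everything outside the test block is used for training.
        train = list(range(0, start)) + list(range(stop, n_observations))
        yield train, test
        start = stop


splits = list(contiguous_folds(n_observations=10, n_folds=3))
for train, test in splits:
    print("train:", train, "test:", test)
```

Every observation appears in exactly one test set, so the test blocks together cover the whole sample once.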
- Returns:
- predictions : MultiPeriodPortfolio | Population
This is the result of calling predict.
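The procedure described on this page (fit on each training set, predict on the matching test set, collect one result per train/test pair) can be sketched in plain Python. This is a conceptual illustration under stated assumptions, not the library's implementation: inverse_variance_weights is a toy stand-in for an optimization estimator, portfolio_returns is a toy stand-in for predict, and lists of returns stand in for the Portfolio objects that the library gathers into a MultiPeriodPortfolio. All names and data below are hypothetical.

```python
from statistics import pvariance
from typing import List, Sequence


def inverse_variance_weights(returns: Sequence[Sequence[float]]) -> List[float]:
    """Toy 'fit' step: weight each asset by 1 / variance of its training returns."""
    n_assets = len(returns[0])
    inv_var = [1.0 / pvariance([row[j] for row in returns]) for j in range(n_assets)]
    total = sum(inv_var)
    return [v / total for v in inv_var]


def portfolio_returns(
    weights: Sequence[float], returns: Sequence[Sequence[float]]
) -> List[float]:
    """Toy 'predict' step: weighted portfolio return for each test observation."""
    return [sum(w * r for w, r in zip(weights, row)) for row in returns]


def cross_validated_paths(X: List[List[float]], n_folds: int) -> List[List[float]]:
    """Fit on each training set and predict on its test set; one result per fold."""
    n = len(X)
    # Fold boundaries for contiguous, unshuffled test blocks.
    bounds = [i * n // n_folds for i in range(n_folds + 1)]
    results = []
    for start, stop in zip(bounds, bounds[1:]):
        train = X[:start] + X[stop:]
        test = X[start:stop]
        weights = inverse_variance_weights(train)
        results.append(portfolio_returns(weights, test))
    return results


# Made-up returns: 6 observations of 2 assets.
X = [
    [0.01, 0.02],
    [-0.02, 0.01],
    [0.03, -0.01],
    [0.00, 0.03],
    [0.02, -0.02],
    [-0.01, 0.00],
]
paths = cross_validated_paths(X, n_folds=3)
print(len(paths), "out-of-sample segments, one per train/test pair")
```

With k folds the loop produces k out-of-sample segments, mirroring the k portfolios returned for KFold. The combinatorial case, where test folds are recombined into several paths, is not covered by this sketch.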