skfolio.prior.SyntheticData
- class skfolio.prior.SyntheticData(distribution_estimator=None, n_samples=1000, sample_args=None)
Synthetic Data Estimator.
The Synthetic Data model estimates a PriorModel by fitting a distribution_estimator and sampling new returns data from it.
The default distribution_estimator is a Regular Vine Copula model. Other common choices are Generative Adversarial Networks (GANs) or Variational Autoencoders (VAEs).
This class is particularly useful when the historical distribution tail dependencies are sparse and need extrapolation for tail optimizations or when optimizing under conditional or stressed scenarios.
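The fit-then-sample idea described above can be illustrated with a dependency-free toy. This is only a sketch, not skfolio's implementation: `RowBootstrap` and `synthetic_means` are hypothetical names, the row bootstrap stands in for a real model such as a vine copula, and the per-asset mean stands in for the moments a PriorModel would hold.

```python
import random
from statistics import fmean


class RowBootstrap:
    """Toy distribution estimator (hypothetical): resamples observed rows."""

    def __init__(self, seed=0):
        self.seed = seed

    def fit(self, X, y=None):
        # "Fitting" here just stores the observed return rows.
        self.rows_ = [tuple(row) for row in X]
        return self

    def sample(self, n_samples=1):
        # Draw whole rows with replacement, so the co-movement of assets
        # within one observation is preserved.
        rng = random.Random(self.seed)
        return [rng.choice(self.rows_) for _ in range(n_samples)]


def synthetic_means(X, estimator, n_samples=1000):
    """Fit the estimator, sample synthetic returns, and average per asset."""
    synthetic = estimator.fit(X).sample(n_samples=n_samples)
    means = [fmean(column) for column in zip(*synthetic)]
    return synthetic, means


# Three observations of two assets (made-up numbers).
X = [(0.01, -0.02), (0.03, 0.00), (-0.01, 0.02)]
synthetic, means = synthetic_means(X, RowBootstrap(seed=42), n_samples=500)
print(len(synthetic), len(means))
```

A row bootstrap cannot extrapolate beyond observed rows, which is exactly the limitation that motivates a parametric model such as a vine copula for tail scenarios.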
- Parameters:
- distribution_estimator : BaseEstimator, optional
Estimator to model the distribution of asset returns. It must inherit from BaseEstimator and implement a sample method. If None, the default VineCopula() model is used.
- n_samples : int, default=1000
Number of samples to generate from the distribution_estimator.
- sample_args : dict, optional
Additional keyword arguments to pass to the sample method of the distribution_estimator.
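To make the relationship between n_samples and sample_args concrete, here is a small sketch of how such arguments could reach the estimator. It assumes the forwarding pattern `sample(n_samples=..., **sample_args)`, which matches the parameter descriptions above but is not copied from the library source. `RecordingEstimator` and `fit_and_sample` are hypothetical stand-ins, and the stand-in does not inherit from BaseEstimator so that the snippet stays dependency-free.

```python
class RecordingEstimator:
    """Stand-in estimator (hypothetical) that records how sample() is called."""

    def fit(self, X, y=None):
        self.n_assets_ = len(X[0])
        return self

    def sample(self, n_samples=1, **kwargs):
        # Remember the call so the forwarding can be inspected afterwards.
        self.last_call_ = {"n_samples": n_samples, **kwargs}
        return [[0.0] * self.n_assets_ for _ in range(n_samples)]


def fit_and_sample(estimator, X, n_samples=1000, sample_args=None):
    """Assumed forwarding pattern: sample(n_samples=..., **sample_args)."""
    estimator.fit(X)
    extra = sample_args if sample_args is not None else {}
    return estimator.sample(n_samples=n_samples, **extra)


est = RecordingEstimator()
draws = fit_and_sample(
    est,
    X=[[0.01, -0.02], [0.03, 0.00]],
    n_samples=5,
    sample_args={"conditioning": {"QUAL": -0.2}},
)
print(est.last_call_)
```

Leaving sample_args as None in this sketch simply means sample receives only n_samples.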
- Attributes:
- prior_model_ : PriorModel
The assets PriorModel.
- distribution_estimator_ : BaseEstimator
The fitted distribution estimator.
- n_features_in_ : int
Number of assets seen during fit.
- feature_names_in_ : ndarray of shape (n_features_in_,)
Names of features seen during fit. Defined only when X has feature names that are all strings.
Examples
>>> import numpy as np
>>> from skfolio.datasets import load_sp500_dataset, load_factors_dataset
>>> from skfolio.preprocessing import prices_to_returns
>>> from skfolio.distribution import VineCopula
>>> from skfolio.optimization import MeanRisk
>>> from skfolio.prior import FactorModel, SyntheticData
>>> from skfolio import RiskMeasure
>>>
>>> # Load historical prices and convert them to returns
>>> prices = load_sp500_dataset()
>>> factors = load_factors_dataset()
>>> X, y = prices_to_returns(prices, factors)
>>>
>>> # Instantiate the SyntheticData model and fit it
>>> model = SyntheticData()
>>> model.fit(X)
>>> print(model.prior_model_)
>>>
>>> # Minimum CVaR optimization on synthetic returns
>>> model = MeanRisk(
...     risk_measure=RiskMeasure.CVAR,
...     prior_estimator=SyntheticData(
...         distribution_estimator=VineCopula(log_transform=True, n_jobs=-1),
...         n_samples=2000,
...     )
... )
>>> model.fit(X)
>>> print(model.weights_)
>>>
>>> # Minimum CVaR optimization on Stressed Factors
>>> factor_model = FactorModel(
...     factor_prior_estimator=SyntheticData(
...         distribution_estimator=VineCopula(
...             central_assets=["QUAL"],
...             log_transform=True,
...             n_jobs=-1,
...         ),
...         n_samples=5000,
...         sample_args=dict(conditioning={"QUAL": -0.2}),
...     )
... )
>>> model = MeanRisk(risk_measure=RiskMeasure.CVAR, prior_estimator=factor_model)
>>> model.fit(X, y)
>>> print(model.weights_)
>>>
>>> # Stress Test the Portfolio
>>> factor_model.set_params(factor_prior_estimator__sample_args=dict(
...     conditioning={"QUAL": -0.5}
... ))
>>> factor_model.fit(X, y)
>>> stressed_X = factor_model.prior_model_.returns
>>> stressed_ptf = model.predict(stressed_X)
Methods
- fit(X[, y]): Fit the Synthetic Data estimator.
- get_metadata_routing(): Get metadata routing of this object.
- get_params([deep]): Get parameters for this estimator.
- set_params(**params): Set the parameters of this estimator.
- fit(X, y=None, **fit_params)
Fit the Synthetic Data estimator.
- Parameters:
- X : array-like of shape (n_observations, n_assets)
Price returns of the assets.
- y : Ignored
Not used, present for API consistency by convention.
- **fit_params : dict
Parameters to pass to the underlying estimators. Only available if enable_metadata_routing=True, which can be set by using sklearn.set_config(enable_metadata_routing=True). See Metadata Routing User Guide for more details.
- Returns:
- self : SyntheticData
Fitted estimator.
- get_metadata_routing()
Get metadata routing of this object.
See the User Guide for how the routing mechanism works.
- Returns:
- routing : MetadataRequest
A MetadataRequest encapsulating routing information.
- get_params(deep=True)
Get parameters for this estimator.
- Parameters:
- deep : bool, default=True
If True, will return the parameters for this estimator and contained subobjects that are estimators.
- Returns:
- params : dict
Parameter names mapped to their values.
- set_params(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
- Parameters:
- **params : dict
Estimator parameters.
- Returns:
- self : estimator instance
Estimator instance.
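The `<component>__<parameter>` convention used by get_params and set_params can be sketched with a few lines of plain Python. This is a simplified illustration of the naming scheme only, not scikit-learn's actual implementation. `Component`, `Sampler`, and `Wrapper` are hypothetical classes introduced for the example.

```python
class Component:
    """Minimal estimator-like object (hypothetical) with get/set_params."""

    _param_names = ()

    def get_params(self, deep=True):
        params = {name: getattr(self, name) for name in self._param_names}
        if deep:
            # Expose parameters of nested components as <component>__<parameter>.
            for name, value in list(params.items()):
                if hasattr(value, "get_params"):
                    for sub, sub_value in value.get_params(deep=True).items():
                        params[f"{name}__{sub}"] = sub_value
        return params

    def set_params(self, **params):
        for key, value in params.items():
            name, delim, sub_key = key.partition("__")
            if name not in self._param_names:
                raise ValueError(f"Invalid parameter {name!r}")
            if delim:
                # Route the remainder of the key to the nested component.
                getattr(self, name).set_params(**{sub_key: value})
            else:
                setattr(self, name, value)
        return self


class Sampler(Component):
    _param_names = ("n_samples", "sample_args")

    def __init__(self, n_samples=1000, sample_args=None):
        self.n_samples = n_samples
        self.sample_args = sample_args


class Wrapper(Component):
    _param_names = ("inner",)

    def __init__(self, inner=None):
        self.inner = inner


wrapper = Wrapper(inner=Sampler())
wrapper.set_params(
    inner__n_samples=5000,
    inner__sample_args={"conditioning": {"QUAL": -0.5}},
)
print(wrapper.get_params(deep=True)["inner__n_samples"])
```

The same pattern is what lets a call such as `set_params(factor_prior_estimator__sample_args=...)` reach a nested estimator without rebuilding the outer object.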