
The optimization module implements a set of methods for portfolio optimization. They follow the scikit-learn estimator API: the fit method takes the asset returns X and stores the portfolio weights in the weights_ attribute.

X can be any array-like structure (numpy array, pandas DataFrame, etc.).

Naive Allocation#

The naive module implements a set of naive allocations commonly used as benchmarks for comparing different models:


Naive inverse-volatility allocation:

from sklearn.model_selection import train_test_split

from skfolio.datasets import load_sp500_dataset
from skfolio.optimization import InverseVolatility
from skfolio.preprocessing import prices_to_returns

prices = load_sp500_dataset()

X = prices_to_returns(prices)
X_train, X_test = train_test_split(X, test_size=0.33, shuffle=False)

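model = InverseVolatility()
# Fit on the training set; the weights are stored in `weights_`
model.fit(X_train)
print(model.weights_)

# Apply the fitted weights to the test set
portfolio = model.predict(X_test)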
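
To make the inverse-volatility rule concrete, here is a minimal NumPy sketch of the underlying arithmetic: each asset is weighted by the reciprocal of its volatility and the weights are rescaled to sum to one. It runs on synthetic returns and uses the sample standard deviation, both of which are assumptions made for illustration; it is not the InverseVolatility implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic daily returns for 4 hypothetical assets with increasing volatility.
true_vols = np.array([0.005, 0.01, 0.02, 0.04])
returns = rng.normal(0.0, true_vols, size=(1000, 4))

# Sample volatility of each asset (one value per column).
vols = returns.std(axis=0, ddof=1)

# Weight each asset by 1 / volatility, then normalize so the weights sum to 1.
inverse_vols = 1.0 / vols
weights = inverse_vols / inverse_vols.sum()

print(weights)
```

With these weights, the product of weight and volatility is the same for every asset, so the least volatile asset receives the largest allocation.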

Mean-Risk Optimization#

The MeanRisk estimator can solve the following four objective functions:

  • Minimize Risk:

\[\begin{split}\begin{cases} \begin{aligned} &\min_{w} & & risk_{i}(w) \\ &\text{s.t.} & & w^T\mu \ge min\_return \\ & & & A w \ge b \\ & & & risk_{j}(w) \le max\_risk_{j} \quad \forall \; j \ne i \end{aligned} \end{cases}\end{split}\]
  • Maximize Expected Return:

\[\begin{split}\begin{cases} \begin{aligned} &\max_{w} & & w^T\mu \\ &\text{s.t.} & & risk_{i}(w) \le max\_risk_{i} \\ & & & A w \ge b \\ & & & risk_{j}(w) \le max\_risk_{j} \quad \forall \; j \ne i \end{aligned} \end{cases}\end{split}\]
  • Maximize Utility:

\[\begin{split}\begin{cases} \begin{aligned} &\max_{w} & & w^T\mu - \lambda \times risk_{i}(w)\\ &\text{s.t.} & & risk_{i}(w) \le max\_risk_{i} \\ & & & w^T\mu \ge min\_return \\ & & & A w \ge b \\ & & & risk_{j}(w) \le max\_risk_{j} \quad \forall \; j \ne i \end{aligned} \end{cases}\end{split}\]
  • Maximize Ratio:

\[\begin{split}\begin{cases} \begin{aligned} &\max_{w} & & \frac{w^T\mu - r_{f}}{risk_{i}(w)}\\ &\text{s.t.} & & risk_{i}(w) \le max\_risk_{i} \\ & & & w^T\mu \ge min\_return \\ & & & A w \ge b \\ & & & risk_{j}(w) \le max\_risk_{j} \quad \forall \; j \ne i \end{aligned} \end{cases}\end{split}\]

With \(risk_{i}\) a risk measure among:

  • Variance

  • Semi-Variance

  • Standard-Deviation

  • Semi-Deviation

  • Mean Absolute Deviation

  • First Lower Partial Moment

  • CVaR (Conditional Value at Risk)

  • EVaR (Entropic Value at Risk)

  • Worst Realization (worst return)

  • CDaR (Conditional Drawdown at Risk)

  • Maximum Drawdown

  • Average Drawdown

  • EDaR (Entropic Drawdown at Risk)

  • Ulcer Index

  • Gini Mean Difference

It supports the following parameters:

  • Weight Constraints

  • Budget Constraints

  • Group Constraints

  • Transaction Costs

  • Management Fees

  • L1 and L2 Regularization

  • Turnover Constraint

  • Tracking Error Constraint

  • Uncertainty Set on Expected Returns

  • Uncertainty Set on Covariance

  • Expected Return Constraints

  • Risk Measure Constraints

  • Custom Objective

  • Custom Constraints

  • Prior Estimator
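
As a worked instance of the Minimize Risk problem above, take the variance as the risk measure and keep only a budget constraint \(\sum_i w_i = 1\) (no long-only bounds, no minimum return). Under these simplifying assumptions the solution has the closed form \(w = \Sigma^{-1}\mathbf{1} / (\mathbf{1}^T \Sigma^{-1} \mathbf{1})\). The NumPy sketch below evaluates it on a made-up covariance matrix; MeanRisk itself handles the general constrained problem numerically rather than through this formula.

```python
import numpy as np

# Hypothetical covariance matrix of 3 assets (illustrative numbers only).
cov = np.array([
    [0.040, 0.006, 0.004],
    [0.006, 0.090, 0.010],
    [0.004, 0.010, 0.160],
])
ones = np.ones(3)

# Closed-form minimum-variance weights under the budget constraint only.
inv_cov_ones = np.linalg.solve(cov, ones)
w_min_var = inv_cov_ones / (ones @ inv_cov_ones)
min_variance = w_min_var @ cov @ w_min_var

# Equal-weighted portfolio for comparison.
w_equal = ones / 3
equal_variance = w_equal @ cov @ w_equal

print(w_min_var, min_variance, equal_variance)
```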


Maximum Sharpe Ratio portfolio:

from sklearn.model_selection import train_test_split

from skfolio import RiskMeasure
from skfolio.datasets import load_sp500_dataset
from skfolio.optimization import MeanRisk, ObjectiveFunction
from skfolio.preprocessing import prices_to_returns

prices = load_sp500_dataset()

X = prices_to_returns(prices)
X_train, X_test = train_test_split(X, test_size=0.33, shuffle=False)

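model = MeanRisk(
    # Maximizing the return/risk ratio with variance as the risk measure
    # yields the Maximum Sharpe Ratio portfolio
    objective_function=ObjectiveFunction.MAXIMIZE_RATIO,
    risk_measure=RiskMeasure.VARIANCE,
)
model.fit(X_train)
print(model.weights_)

portfolio = model.predict(X_test)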

Prior Estimator#

Every portfolio optimization has a parameter named prior_estimator. The prior estimator fits a PriorModel containing the estimates of the assets' expected returns, covariance matrix, returns and Cholesky decomposition of the covariance. It represents the investor’s prior beliefs about the model used to estimate this distribution.
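
To illustrate the quantities such a prior model holds, the sketch below computes them from synthetic returns with plain NumPy, using sample moments. The variable names and data are illustrative assumptions, and this is not how skfolio's prior estimators are implemented; it only shows how the pieces relate to each other.

```python
import numpy as np

rng = np.random.default_rng(42)

# Synthetic returns: 500 observations of 3 hypothetical assets.
returns = rng.normal(0.0005, 0.01, size=(500, 3))

# Sample estimate of the expected returns (one value per asset).
mu = returns.mean(axis=0)

# Sample covariance matrix (assets are in columns).
covariance = np.cov(returns, rowvar=False)

# Lower-triangular Cholesky factor L such that L @ L.T equals the covariance.
cholesky = np.linalg.cholesky(covariance)

print(mu.shape, covariance.shape, cholesky.shape)
```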

The available prior estimators are:


Minimum Variance portfolio using a Factor Model:

from sklearn.model_selection import train_test_split

from skfolio.datasets import load_factors_dataset, load_sp500_dataset
from skfolio.optimization import MeanRisk
from skfolio.preprocessing import prices_to_returns
from skfolio.prior import FactorModel

prices = load_sp500_dataset()
factor_prices = load_factors_dataset()

X, y = prices_to_returns(prices, factor_prices)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, shuffle=False)

model = MeanRisk(prior_estimator=FactorModel())
model.fit(X_train, y_train)

portfolio = model.predict(X_test)

Combining Prior Estimators#

Prior estimators can be combined together, making it possible to design complex models:


This example is purposely complex to demonstrate how multiple estimators can be combined.

The model below is a Maximum Sharpe Ratio optimization using a Factor Model for the estimation of the assets' expected returns and covariance matrix. A Black & Litterman model is used for the estimation of the factors' expected returns and covariance matrix, incorporating the analyst’s views on the factors. Finally, the Black & Litterman prior expected returns are estimated using an equal-weighted market equilibrium with a risk aversion of 2 and a denoised prior covariance matrix:

from sklearn.model_selection import train_test_split

from skfolio.datasets import load_factors_dataset, load_sp500_dataset
from skfolio.moments import DenoiseCovariance, EquilibriumMu
from skfolio.optimization import MeanRisk, ObjectiveFunction
from skfolio.preprocessing import prices_to_returns
from skfolio.prior import BlackLitterman, EmpiricalPrior, FactorModel

prices = load_sp500_dataset()
factor_prices = load_factors_dataset()

X, y = prices_to_returns(prices, factor_prices)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, shuffle=False)

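factor_views = ["MTUM - QUAL == 0.0003",
                "SIZE - USMV == 0.0004",
                "VLUE == 0.0006"]

model = MeanRisk(
    objective_function=ObjectiveFunction.MAXIMIZE_RATIO,
    # Factor Model on the assets, with a Black & Litterman prior on the factors
    prior_estimator=FactorModel(
        factor_prior_estimator=BlackLitterman(
            prior_estimator=EmpiricalPrior(
                mu_estimator=EquilibriumMu(risk_aversion=2),
                covariance_estimator=DenoiseCovariance(),
            ),
            views=factor_views,
        ),
    ),
)

model.fit(X_train, y_train)

portfolio = model.predict(X_test)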

Custom Estimator#

It is very common to use a custom implementation for the moment estimators. For example, you may want to use an in-house estimation for the covariance or a predictive model for the expected returns.

Below is a simple example of how you would implement a custom covariance estimator. For more complex cases and estimators, check the API Reference.

import numpy as np

from skfolio.datasets import load_sp500_dataset
from skfolio.moments import BaseCovariance
from skfolio.optimization import MeanRisk
from skfolio.preprocessing import prices_to_returns
from skfolio.prior import EmpiricalPrior

prices = load_sp500_dataset()
X = prices_to_returns(prices)

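class MyCustomCovariance(BaseCovariance):
    def __init__(self, my_param=0):
        super().__init__()
        self.my_param = my_param

    def fit(self, X, y=None):
        X = self._validate_data(X)
        # Your custom implementation goes here
        covariance = np.cov(X.T, ddof=self.my_param)
        # Store the estimate so that it is exposed by the fitted estimator
        self._set_covariance(covariance)
        return self


model = MeanRisk(
    prior_estimator=EmpiricalPrior(covariance_estimator=MyCustomCovariance(my_param=1)),
)
model.fit(X)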

Worst-Case Optimization#

With the mu_uncertainty_set_estimator parameter, the expected returns of the assets are modeled with an ellipsoidal uncertainty set. This approach is known as worst-case optimization and falls under the class of robust optimization. It mitigates the instability that arises from estimation errors of the expected returns.
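
To see why an ellipsoidal set leads to a tractable problem, consider the set \(\{\mu : (\mu - \hat{\mu})^T S^{-1} (\mu - \hat{\mu}) \le \kappa^2\}\). The worst-case expected return of a portfolio \(w\) over this set has the closed form \(w^T\hat{\mu} - \kappa \sqrt{w^T S w}\). The sketch below checks this numerically by sampling points on the boundary of the ellipsoid. The symbols \(\hat{\mu}\), \(S\), \(\kappa\) and all numbers are assumptions made for illustration; the exact parameterization used by skfolio's uncertainty set estimators may differ.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical center, shape matrix and size of the ellipsoid.
mu_hat = np.array([0.08, 0.05, 0.03])
S = np.array([
    [4.0e-4, 1.0e-4, 0.0],
    [1.0e-4, 1.0e-4, 0.0],
    [0.0, 0.0, 2.5e-5],
])
kappa = 2.0
w = np.array([0.5, 0.3, 0.2])

# Closed-form worst-case expected return over the ellipsoid.
worst_case = w @ mu_hat - kappa * np.sqrt(w @ S @ w)

# The minimizer lies on the boundary, in the direction of -S @ w.
mu_star = mu_hat - kappa * (S @ w) / np.sqrt(w @ S @ w)

# Sample boundary points mu = mu_hat + kappa * L @ u with ||u|| = 1 and S = L @ L.T.
L = np.linalg.cholesky(S)
u = rng.normal(size=(10_000, 3))
u /= np.linalg.norm(u, axis=1, keepdims=True)
sampled_returns = (mu_hat + kappa * u @ L.T) @ w

print(worst_case, sampled_returns.min())
```

No sampled point gives a lower expected return than the closed-form value, which is what lets the robust problem be written with a single second-order cone term instead of a constraint per scenario.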


Worst-case maximum Mean/CDaR ratio (Conditional Drawdown at Risk) with an ellipsoidal uncertainty set for the expected returns of the assets:

from sklearn.model_selection import train_test_split

from skfolio import RiskMeasure
from skfolio.datasets import load_sp500_dataset
from skfolio.optimization import MeanRisk, ObjectiveFunction
from skfolio.preprocessing import prices_to_returns
from skfolio.uncertainty_set import BootstrapMuUncertaintySet

prices = load_sp500_dataset()

X = prices_to_returns(prices)
X_train, X_test = train_test_split(X, test_size=0.33, shuffle=False)

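model = MeanRisk(
    objective_function=ObjectiveFunction.MAXIMIZE_RATIO,
    risk_measure=RiskMeasure.CDAR,
    # Ellipsoidal uncertainty set on the expected returns
    mu_uncertainty_set_estimator=BootstrapMuUncertaintySet(),
)
model.fit(X_train)
print(model.weights_)

portfolio = model.predict(X_test)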

Going Further#

You can explore the remaining parameters (constraints, L1 and L2 regularization, costs, turnover, tracking error, etc.) with the Mean-Risk examples and the MeanRisk API.

Risk Budgeting#

The RiskBudgeting estimator solves the following convex problem:

\[\begin{split}\begin{cases} \begin{aligned} &\min_{w} & & risk_{i}(w) \\ &\text{s.t.} & & b^T \log(w) \ge c \\ & & & w^T\mu \ge min\_return \\ & & & A w \ge b \\ & & & w \ge 0 \end{aligned} \end{cases}\end{split}\]

with \(b\) the risk budget vector and \(c\) an auxiliary variable of the log barrier.

And \(risk_{i}\) a risk measure among:

  • Variance

  • Semi-Variance

  • Standard-Deviation

  • Semi-Deviation

  • Mean Absolute Deviation

  • First Lower Partial Moment

  • CVaR (Conditional Value at Risk)

  • EVaR (Entropic Value at Risk)

  • Worst Realization (worst return)

  • CDaR (Conditional Drawdown at Risk)

  • Maximum Drawdown

  • Average Drawdown

  • EDaR (Entropic Drawdown at Risk)

  • Ulcer Index

  • Gini Mean Difference

It supports the following parameters:

  • Weight Constraints

  • Budget Constraints

  • Group Constraints

  • Transaction Costs

  • Management Fees

  • Expected Return Constraints

  • Custom Objective

  • Custom Constraints

  • Prior Estimator

Limitations are imposed on certain constraints, such as long-only weights, to ensure the problem remains convex.
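
The notion of a risk budget can be illustrated with the standard deviation as the risk measure. The contribution of asset \(i\) to the portfolio volatility is \(w_i (\Sigma w)_i / \sqrt{w^T \Sigma w}\), and these contributions sum to the portfolio volatility. The sketch below computes them with NumPy and checks a simple special case chosen for illustration: with uncorrelated assets and equal budgets, inverse-volatility weights give equal risk contributions. It does not solve the convex problem above.

```python
import numpy as np


def risk_contributions(weights, cov):
    """Contribution of each asset to the portfolio standard deviation."""
    portfolio_vol = np.sqrt(weights @ cov @ weights)
    return weights * (cov @ weights) / portfolio_vol


# Three hypothetical uncorrelated assets.
vols = np.array([0.10, 0.20, 0.40])
cov = np.diag(vols**2)

# Inverse-volatility weights, normalized to sum to 1.
weights = (1.0 / vols) / (1.0 / vols).sum()

contributions = risk_contributions(weights, cov)
portfolio_vol = np.sqrt(weights @ cov @ weights)

print(contributions, portfolio_vol)
```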


CVaR (Conditional Value at Risk) Risk Parity portfolio:

from sklearn.model_selection import train_test_split

from skfolio import RiskMeasure
from skfolio.datasets import load_sp500_dataset
from skfolio.optimization import RiskBudgeting
from skfolio.preprocessing import prices_to_returns

prices = load_sp500_dataset()

X = prices_to_returns(prices)
X_train, X_test = train_test_split(X, test_size=0.33, shuffle=False)

model = RiskBudgeting(risk_measure=RiskMeasure.CVAR)
model.fit(X_train)
print(model.weights_)

portfolio_train = model.predict(X_train)

portfolio_test = model.predict(X_test)

Maximum Diversification#

The MaximumDiversification estimator maximizes the diversification ratio, defined as the weighted average of the asset volatilities divided by the total portfolio volatility.
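
As a small numerical illustration of this ratio, the sketch below evaluates \(w^T\sigma / \sqrt{w^T \Sigma w}\) for two hypothetical assets with volatilities of 20% and 30% and a correlation of 0.25. The function name and the numbers are assumptions made for the example; it evaluates the ratio for given weights and does not perform the maximization.

```python
import numpy as np


def diversification_ratio(weights, cov):
    """Weighted average of asset volatilities over portfolio volatility."""
    vols = np.sqrt(np.diag(cov))
    return (weights @ vols) / np.sqrt(weights @ cov @ weights)


# Covariance built from vols (0.2, 0.3) and correlation 0.25.
cov = np.array([
    [0.040, 0.015],
    [0.015, 0.090],
])

ratio_balanced = diversification_ratio(np.array([0.5, 0.5]), cov)
ratio_single = diversification_ratio(np.array([1.0, 0.0]), cov)

print(ratio_balanced, ratio_single)
```

A portfolio holding a single asset has a ratio of 1, and the ratio rises above 1 as imperfectly correlated assets are combined.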


from sklearn.model_selection import train_test_split

from skfolio.datasets import load_sp500_dataset
from skfolio.optimization import MaximumDiversification
from skfolio.preprocessing import prices_to_returns

prices = load_sp500_dataset()

X = prices_to_returns(prices)
X_train, X_test = train_test_split(X, test_size=0.33, shuffle=False)

model = MaximumDiversification()
model.fit(X_train)
print(model.weights_)

portfolio = model.predict(X_test)

Distributionally Robust CVaR#

The DistributionallyRobustCVaR constructs a Wasserstein ball in the space of multivariate and non-discrete probability distributions centered at the uniform distribution on the training samples and finds the allocation that minimizes the CVaR of the worst-case distribution within this Wasserstein ball. Esfahani and Kuhn proved that for piecewise linear objective functions, which is the case of CVaR, the distributionally robust optimization problem over a Wasserstein ball can be reformulated as finite convex programs.

A solver like Mosek that can handle a high number of constraints is preferred.
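
For reference, the CVaR of the empirical distribution (the center of the ball) can be computed directly from samples. The sketch below uses the Rockafellar-Uryasev representation \(\min_{\alpha} \{\alpha + \frac{1}{(1-\beta)n} \sum_i \max(L_i - \alpha, 0)\}\), which is piecewise linear in \(\alpha\), and compares it with the average of the worst \((1-\beta)\) fraction of losses. The confidence level \(\beta = 0.95\) and the synthetic data are illustrative assumptions, and the Wasserstein-robust reformulation itself is not implemented here.

```python
import numpy as np

rng = np.random.default_rng(7)

# Synthetic portfolio losses (loss = minus return), 200 observations.
losses = -rng.normal(0.0005, 0.01, size=200)
beta = 0.95
n = losses.size


def ru_objective(alpha):
    """Rockafellar-Uryasev objective for a candidate threshold alpha."""
    return alpha + np.maximum(losses - alpha, 0.0).sum() / ((1.0 - beta) * n)


# The objective is convex and piecewise linear with kinks at the samples,
# so its minimum is attained at one of the sample values.
cvar_ru = min(ru_objective(alpha) for alpha in losses)

# Direct tail average: mean of the k = (1 - beta) * n largest losses.
k = round((1.0 - beta) * n)
cvar_tail = np.sort(losses)[-k:].mean()

print(cvar_ru, cvar_tail)
```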


from sklearn.model_selection import train_test_split

from skfolio.datasets import load_sp500_dataset
from skfolio.optimization import DistributionallyRobustCVaR
from skfolio.preprocessing import prices_to_returns

prices = load_sp500_dataset()

X = prices_to_returns(prices)
X = X["2020":]
X_train, X_test = train_test_split(X, test_size=0.33, shuffle=False)

model = DistributionallyRobustCVaR(wasserstein_ball_radius=0.01)
model.fit(X_train)
print(model.weights_)

portfolio = model.predict(X_test)

Hierarchical Risk Parity#

The HierarchicalRiskParity (HRP) is a portfolio optimization method developed by Marcos Lopez de Prado.

This algorithm uses a distance matrix to compute hierarchical clusters using the Hierarchical Tree Clustering algorithm, then employs seriation to rearrange the assets in the dendrogram, minimizing the distance between leaves.

The final step is the recursive bisection where each cluster is split between two sub-clusters by starting with the topmost cluster and traversing in a top-down manner. For each sub-cluster, we compute the total cluster risk of an inverse-risk allocation. A weighting factor is then computed from these two sub-cluster risks, which is used to update the cluster weight.
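
The recursive bisection step can be sketched in a few lines of NumPy. The version below takes the seriated asset order as given, uses the variance as the risk measure, and splits each cluster in the middle of the ordered list, as in the original paper. The function names are hypothetical, and skfolio's implementation, which supports other risk measures and weight constraints, may differ in its details.

```python
import numpy as np


def inverse_variance_cluster_risk(cov, items):
    """Variance of the inverse-variance allocation restricted to `items`."""
    sub_cov = cov[np.ix_(items, items)]
    w = 1.0 / np.diag(sub_cov)
    w /= w.sum()
    return float(w @ sub_cov @ w)


def recursive_bisection(cov, order):
    """Top-down weight allocation over an ordered list of asset indices."""
    weights = np.ones(cov.shape[0])
    stack = [list(order)]
    while stack:
        cluster = stack.pop()
        if len(cluster) == 1:
            continue
        mid = len(cluster) // 2
        left, right = cluster[:mid], cluster[mid:]
        risk_left = inverse_variance_cluster_risk(cov, left)
        risk_right = inverse_variance_cluster_risk(cov, right)
        # The riskier sub-cluster receives the smaller share.
        alpha = 1.0 - risk_left / (risk_left + risk_right)
        weights[left] *= alpha
        weights[right] *= 1.0 - alpha
        stack.extend([left, right])
    return weights


# Hypothetical covariance with two correlated pairs of assets.
vols = np.array([0.20, 0.25, 0.10, 0.15])
corr = np.array([
    [1.0, 0.8, 0.1, 0.1],
    [0.8, 1.0, 0.1, 0.1],
    [0.1, 0.1, 1.0, 0.7],
    [0.1, 0.1, 0.7, 1.0],
])
cov = corr * np.outer(vols, vols)

# The order is assumed to come from the seriation step.
weights = recursive_bisection(cov, order=[0, 1, 2, 3])
print(weights)
```

For a diagonal covariance matrix this procedure reduces to the inverse-variance allocation, which gives a quick sanity check.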


The original paper uses the variance as the risk measure and the single-linkage method for the Hierarchical Tree Clustering algorithm. Here we generalize it to multiple risk measures and linkage methods. The default linkage method is set to the Ward variance minimization algorithm, which is more stable and has better properties than the single-linkage method.

It supports all prior estimators and risk measures as well as weight constraints.

It also supports all distance estimators through the distance_estimator parameter. It fits a distance model for the estimation of the codependence and the distance matrix used to compute the linkage matrix:


Hierarchical Risk Parity with semi (downside) standard-deviation as the risk measure and mutual information as the distance estimator:

from sklearn.model_selection import train_test_split

from skfolio import RiskMeasure
from skfolio.datasets import load_sp500_dataset
from skfolio.distance import MutualInformation
from skfolio.optimization import HierarchicalRiskParity
from skfolio.preprocessing import prices_to_returns

prices = load_sp500_dataset()

X = prices_to_returns(prices)
X_train, X_test = train_test_split(X, test_size=0.33, shuffle=False)

model = HierarchicalRiskParity(
    risk_measure=RiskMeasure.SEMI_DEVIATION, distance_estimator=MutualInformation()
)
model.fit(X_train)
print(model.weights_)

portfolio = model.predict(X_test)

Hierarchical Equal Risk Contribution#

The HierarchicalEqualRiskContribution (HERC) is a portfolio optimization method developed by Thomas Raffinot.

This algorithm uses a distance matrix to compute hierarchical clusters using the Hierarchical Tree Clustering algorithm. It then computes, for each cluster, the total cluster risk of an inverse-risk allocation.

The final step is the top-down recursive division of the dendrogram, where the asset weights are updated using a naive risk parity within clusters.

It differs from the Hierarchical Risk Parity by exploiting the dendrogram shape during the top-down recursive division instead of bisecting it.


The default linkage method is set to the Ward variance minimization algorithm, which is more stable and has better properties than the single-linkage method.

It supports all prior estimators and risk measures as well as weight constraints.

It also supports all distance estimators through the distance_estimator parameter. It fits a distance model for the estimation of the codependence and the distance matrix used to compute the linkage matrix:


Hierarchical Equal Risk Contribution with CVaR (Conditional Value at Risk) as the risk measure and mutual information as the distance estimator:

from sklearn.model_selection import train_test_split

from skfolio import RiskMeasure
from skfolio.datasets import load_sp500_dataset
from skfolio.distance import MutualInformation
from skfolio.optimization import HierarchicalEqualRiskContribution
from skfolio.preprocessing import prices_to_returns

prices = load_sp500_dataset()

X = prices_to_returns(prices)
X_train, X_test = train_test_split(X, test_size=0.33, shuffle=False)

model = HierarchicalEqualRiskContribution(
    risk_measure=RiskMeasure.CVAR, distance_estimator=MutualInformation()
)
model.fit(X_train)
print(model.weights_)

portfolio = model.predict(X_test)

Nested Clusters Optimization#

The NestedClustersOptimization (NCO) is a portfolio optimization method developed by Marcos Lopez de Prado.

It uses a distance matrix to compute clusters using a clustering algorithm (Hierarchical Tree Clustering, KMeans, etc.). For each cluster, the inner-cluster weights are computed by fitting the inner-estimator on each cluster using the whole training data. Then the outer-cluster weights are computed by training the outer-estimator using out-of-sample estimates of the inner-estimators with cross-validation. Finally, the final asset weights are the dot-product of the inner-weights and outer-weights.
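
The last step, the dot-product of the inner and outer weights, is easy to picture with a small example. The sketch below uses made-up weights for 5 assets split into 2 clusters: each column of the inner-weight matrix holds one cluster's allocation (zero outside the cluster), and the outer weights allocate across clusters. The array layout is an assumption made for illustration, not a description of skfolio's internals.

```python
import numpy as np

# Inner weights: rows are assets, columns are clusters.
# Cluster 0 holds assets 0-2 and cluster 1 holds assets 3-4.
inner_weights = np.array([
    [0.5, 0.0],
    [0.3, 0.0],
    [0.2, 0.0],
    [0.0, 0.6],
    [0.0, 0.4],
])

# Outer weights: allocation between the two clusters.
outer_weights = np.array([0.7, 0.3])

# Final asset weights.
final_weights = inner_weights @ outer_weights
print(final_weights)
```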


The original paper uses KMeans as the clustering algorithm, Minimum Variance for the inner-estimator and equal-weighted for the outer-estimator. Here we generalize it to all sklearn and skfolio clustering algorithms (Hierarchical Tree Clustering, KMeans, etc.), all portfolio optimizations (Mean-Variance, HRP, etc.) and risk measures (variance, CVaR, etc.). To avoid data leakage at the outer-estimator, we use out-of-sample estimates to fit the outer estimator.

It supports all distance estimators and clustering estimators (both skfolio and sklearn).


Nested Clusters Optimization with KMeans as the clustering algorithm, Kendall Distance as the distance estimator, Minimum Semi-Variance as the inner estimator, and CVaR Risk Parity as the outer (meta) estimator trained on the out-of-sample estimates from the KFolds cross-validation and run with parallelization:

from sklearn.cluster import KMeans
from sklearn.model_selection import KFold, train_test_split

from skfolio import RiskMeasure
from skfolio.datasets import load_sp500_dataset
from skfolio.distance import KendallDistance
from skfolio.optimization import MeanRisk, NestedClustersOptimization, RiskBudgeting
from skfolio.preprocessing import prices_to_returns

prices = load_sp500_dataset()

X = prices_to_returns(prices)
X_train, X_test = train_test_split(X, test_size=0.33, shuffle=False)

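model = NestedClustersOptimization(
    inner_estimator=MeanRisk(risk_measure=RiskMeasure.SEMI_VARIANCE),
    outer_estimator=RiskBudgeting(risk_measure=RiskMeasure.CVAR),
    distance_estimator=KendallDistance(),
    clustering_estimator=KMeans(n_init="auto"),
    cv=KFold(),
    n_jobs=-1,  # run in parallel on all available cores
)
model.fit(X_train)
print(model.weights_)

portfolio = model.predict(X_test)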

The cv parameter can also be a combinatorial cross-validation, such as CombinatorialPurgedCV, in which case each cluster’s out-of-sample outputs are a collection of multiple paths instead of one single path. The selected out-of-sample path among this collection of paths is chosen according to the quantile and quantile_measure parameters.

Stacking Optimization#

StackingOptimization is an ensemble method that consists of stacking the output of individual portfolio optimizations with a final portfolio optimization.

The weights are the dot-product of the individual optimizations' weights with the final optimization's weights.

Stacking makes it possible to exploit the strengths of each individual portfolio optimization by using their outputs as the input of a final portfolio optimization.

To avoid data leakage, out-of-sample estimates are used to fit the outer optimization.
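
The way the stacked weights are obtained can be shown with made-up numbers. In the sketch below, two hypothetical estimators each allocate across the same 3 assets, and the final estimator allocates between the two estimators. The array layout and values are assumptions for illustration only.

```python
import numpy as np

# Rows are the individual estimators, columns are the assets.
estimator_weights = np.array([
    [0.50, 0.30, 0.20],
    [0.20, 0.20, 0.60],
])

# Weights assigned by the final (meta) estimator to each individual estimator.
final_estimator_weights = np.array([0.25, 0.75])

# Stacked asset weights.
stacked_weights = final_estimator_weights @ estimator_weights
print(stacked_weights)
```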


Stacking Optimization with Minimum Semi-Variance and CVaR Risk Parity stacked together using Minimum Variance as the final (meta) estimator:

from sklearn.model_selection import KFold, train_test_split

from skfolio import RiskMeasure
from skfolio.datasets import load_sp500_dataset
from skfolio.optimization import MeanRisk, RiskBudgeting, StackingOptimization
from skfolio.preprocessing import prices_to_returns

prices = load_sp500_dataset()

X = prices_to_returns(prices)
X_train, X_test = train_test_split(X, test_size=0.33, shuffle=False)

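estimators = [
    ('model1', MeanRisk(risk_measure=RiskMeasure.SEMI_VARIANCE)),
    ('model2', RiskBudgeting(risk_measure=RiskMeasure.CVAR))
]

model = StackingOptimization(
    estimators=estimators,
    # MeanRisk defaults to Minimum Variance
    final_estimator=MeanRisk(),
    cv=KFold(),
    n_jobs=-1,
)
model.fit(X_train)
print(model.weights_)

portfolio = model.predict(X_test)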

The cv parameter can also be a combinatorial cross-validation, such as CombinatorialPurgedCV, in which case each out-of-sample output is a collection of multiple paths instead of one single path. The out-of-sample path selected from this collection is chosen according to the quantile and quantile_measure parameters.