skfolio.prior.OpinionPooling

class skfolio.prior.OpinionPooling(estimators, opinion_probabilities=None, prior_estimator=None, is_linear_pooling=True, divergence_penalty=0.0, n_jobs=None)

Opinion Pooling estimator.

Opinion Pooling (also called Belief Aggregation or Risk Aggregation) is a process in which different probability distributions (opinions), produced by different experts, are combined to yield a single probability distribution (consensus).

Expert opinions (also called individual prior distributions) can be elicited from domain experts or derived from quantitative analyses.

The OpinionPooling estimator takes a list of prior estimators, each of which produces scenario probabilities (used as sample_weight), and pools them into a single consensus probability distribution.

You can choose between linear (arithmetic) pooling or logarithmic (geometric) pooling, and optionally apply robust pooling using a Kullback-Leibler divergence penalty to down-weight experts whose views deviate strongly from the group consensus.

Parameters:
estimators : list of (str, BasePrior)

A list of prior estimators representing opinions to be pooled into a single consensus. Each element of the list is a tuple of a string (the name) and an estimator instance. Each estimator must expose a sample_weight after fitting, as EntropyPooling does.

opinion_probabilities : array-like of float, optional

Probability mass assigned to each opinion; values lie in [0, 1] and must sum to at most 1. Any leftover mass is assigned to the uniform (uninformative) prior. The default (None) is to assign the same probability to each opinion.

prior_estimator : BasePrior, optional

Common prior for all estimators. If provided, each estimator from estimators will be fitted using this common prior before pooling. Setting prior_estimator inside individual estimators is disabled to avoid mixing different prior scenarios (each estimator must have the same underlying distribution). For example, using prior_estimator = SyntheticData(n_samples=10_000) will generate 10,000 synthetic data points from a Vine Copula before fitting the estimators on this common distribution.
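For instance (an illustrative sketch; the asset names and view values follow the Examples section below):

>>> from skfolio.prior import EntropyPooling, OpinionPooling, SyntheticData
>>>
>>> # Both experts are fitted on the same 10,000 synthetic scenarios drawn
>>> # from the common Vine Copula prior:
>>> model = OpinionPooling(
...     estimators=[
...         ("opinion_1", EntropyPooling(cvar_views=["AMD == 0.10"])),
...         ("opinion_2", EntropyPooling(cvar_views=["GE == 0.12"])),
...     ],
...     prior_estimator=SyntheticData(n_samples=10_000),
... )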

is_linear_pooling : bool, default=True

If True, combine each opinion via Linear Opinion Pooling (arithmetic mean); if False, use Logarithmic Opinion Pooling (geometric mean).

Linear Opinion Pooling:
  • Retains all nonzero support (no “zero-forcing”).

  • Produces a consensus that spreads probability more evenly across all expert opinions.

Logarithmic Opinion Pooling:
  • Zero-Preservation. Any scenario assigned zero probability by any expert remains zero in the aggregate.

  • Information-Theoretic Optimality. Yields the distribution that minimizes the weighted sum of KL-divergences from each expert’s distribution.

  • Robustness to Extremes. Down-weights extreme or contrarian views more severely.
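To illustrate the two rules, here is a minimal NumPy sketch of the pooling math on two hypothetical expert sample-weight vectors (an illustration only, not the estimator's internals):

>>> import numpy as np
>>>
>>> w1 = np.array([0.5, 0.3, 0.2, 0.0])  # expert 1 scenario probabilities
>>> w2 = np.array([0.1, 0.2, 0.3, 0.4])  # expert 2 scenario probabilities
>>> p = np.array([0.5, 0.5])             # opinion probabilities
>>>
>>> # Linear pooling: weighted arithmetic mean, retains all nonzero support
>>> linear = p[0] * w1 + p[1] * w2
>>>
>>> # Logarithmic pooling: weighted geometric mean, renormalized; the last
>>> # scenario stays at zero because expert 1 assigns it zero probability
>>> log_pool = w1 ** p[0] * w2 ** p[1]
>>> log_pool /= log_pool.sum()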

divergence_penalty : float, default=0.0

Non-negative factor (\(\alpha\)) that penalizes each opinion’s divergence from the group consensus, yielding more robust pooling. A higher value more strongly down-weights deviating opinions.

The robust opinion probabilities are given by:

\[\tilde{p}_i = \frac{p_i \exp\bigl(-\alpha D_i\bigr)} {\displaystyle \sum_{k=1}^N p_k \exp\bigl(-\alpha D_k\bigr)} \quad\text{for }i = 1,\dots,N\]

where

  • \(N\) is the number of experts, i.e. len(estimators)

  • \(M\) is the number of scenarios, i.e. len(observations)

  • \(D_i\) is the KL-divergence of expert i’s distribution from consensus:

    \[D_i = \mathrm{KL}\bigl(w_i \,\|\, c\bigr) = \sum_{j=1}^M w_{ij}\,\ln\!\frac{w_{ij}}{c_j} \quad\text{for }i = 1,\dots,N.\]
  • \(w_i\) is the sample-weight vector (scenario probabilities) from expert i, with \(\sum_{j=1}^M w_{ij} = 1\).

  • \(p_i\) is the initial opinion probability of expert i, with \(\sum_{i=1}^N p_i \le 1\) (any leftover mass goes to a uniform prior).

  • \(c_j\) is the consensus of scenario \(j\):

    \[c_j = \sum_{i=1}^N p_i \, w_{ij} \quad\text{for }j = 1,\dots,M.\]
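Concretely, these formulas can be evaluated as follows (a minimal NumPy sketch reusing w1, w2 and p from the pooling sketch above, and assuming the opinion probabilities sum to one):

>>> alpha = 2.0  # divergence_penalty
>>> W = np.vstack([w1, w2])  # shape (N, M): one row of scenario weights per expert
>>>
>>> # Consensus: c_j = sum_i p_i * w_ij
>>> c = p @ W
>>>
>>> # KL-divergence of each expert from the consensus, with 0 * ln(0) taken as 0
>>> ratio = np.where(W > 0, W / c, 1.0)
>>> D = (W * np.log(ratio)).sum(axis=1)
>>>
>>> # Penalized, renormalized opinion probabilities
>>> p_tilde = p * np.exp(-alpha * D)
>>> p_tilde /= p_tilde.sum()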
n_jobs : int, optional

The number of jobs to run in parallel for the fit of all estimators. The value -1 means using all processors. The default (None) means 1 unless in a joblib.parallel_backend context.

Attributes:
return_distribution_ : ReturnDistribution

Fitted ReturnDistribution to be used by the optimization estimators, containing the asset distribution, moment estimates, and the opinion-pooling sample weights.

estimators_ : list[BasePrior]

The elements of the estimators parameter, having been fitted on the training data.

named_estimators_ : dict[str, BasePrior]

Attribute to access any fitted sub-estimators by name.

prior_estimator_ : BasePrior

Fitted prior_estimator if provided.

opinion_probabilities_ : ndarray of shape (n_opinions,)

Final opinion probabilities after applying the KL-divergence penalty. If the initial opinion_probabilities do not sum to one, the last element of opinion_probabilities_ is the probability assigned to the uniform prior.
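For instance, with opinion_probabilities=[0.4, 0.5] as in the Examples section below, the remaining 10% of the mass goes to the uniform prior, so the fitted vector holds three elements:

>>> # Two penalty-adjusted expert probabilities followed by the uniform-prior mass
>>> opinion_pooling.opinion_probabilities_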

n_features_in_ : int

Number of assets seen during fit.

feature_names_in_ : ndarray of shape (n_features_in_,)

Names of assets seen during fit. Defined only when X has asset names that are all strings.

References

[1] “Probabilistic opinion pooling generalized”, Social Choice and Welfare, Dietrich & List (2017)

[2] “Opinion Aggregation and Individual Expertise”, Oxford University Press, Martini & Sprenger (2017)

[3] “Rational Decisions”, Journal of the Royal Statistical Society, Good (1952)

Examples

For a full tutorial on opinion pooling, see Opinion Pooling.

>>> from skfolio import RiskMeasure
>>> from skfolio.datasets import load_sp500_dataset
>>> from skfolio.preprocessing import prices_to_returns
>>> from skfolio.prior import EntropyPooling, OpinionPooling
>>> from skfolio.optimization import RiskBudgeting
>>>
>>> prices = load_sp500_dataset()
>>> X = prices_to_returns(prices)
>>>
>>> # We consider two expert opinions, each generated via Entropy Pooling with
>>> # user-defined views.
>>> # We assign probabilities of 40% to Expert 1 and 50% to Expert 2; the
>>> # remaining 10% is automatically allocated to the uniform prior:
>>> opinion_1 = EntropyPooling(cvar_views=["AMD == 0.10"])
>>> opinion_2 = EntropyPooling(
...     mean_views=["AMD >= BAC", "JPM <= prior(JPM) * 0.8"],
...     cvar_views=["GE == 0.12"],
... )
>>>
>>> opinion_pooling = OpinionPooling(
...     estimators=[("opinion_1", opinion_1), ("opinion_2", opinion_2)],
...     opinion_probabilities=[0.4, 0.5],
... )
>>>
>>> opinion_pooling.fit(X)
>>>
>>> print(opinion_pooling.return_distribution_.sample_weight)
>>>
>>> # CVaR Risk Parity optimization on Opinion Pooling
>>> model = RiskBudgeting(
...     risk_measure=RiskMeasure.CVAR,
...     prior_estimator=opinion_pooling
... )
>>> model.fit(X)
>>> print(model.weights_)
>>>
>>> # Stress Test the Portfolio
>>> opinion_1 = EntropyPooling(cvar_views=["AMD == 0.05"])
>>> opinion_2 = EntropyPooling(cvar_views=["AMD == 0.10"])
>>> opinion_pooling = OpinionPooling(
...     estimators=[("opinion_1", opinion_1), ("opinion_2", opinion_2)],
...     opinion_probabilities=[0.6, 0.4],
... )
>>> opinion_pooling.fit(X)
>>>
>>> stressed_dist = opinion_pooling.return_distribution_
>>>
>>> stressed_ptf = model.predict(stressed_dist)

Methods

fit(X[, y])

Fit the Opinion Pooling estimator.

get_metadata_routing()

Get metadata routing of this object.

get_params([deep])

Get the parameters of an estimator from the ensemble.

set_params(**params)

Set the parameters of an estimator from the ensemble.

fit(X, y=None, **fit_params)

Fit the Opinion Pooling estimator.

Parameters:
X : array-like of shape (n_observations, n_assets)

Price returns of the assets.

y : Ignored

Not used, present for API consistency by convention.

**fit_params : dict

Parameters to pass to the underlying estimators. Only available if enable_metadata_routing=True, which can be set by using sklearn.set_config(enable_metadata_routing=True). See Metadata Routing User Guide for more details.

Returns:
self : OpinionPooling

Fitted estimator.

get_metadata_routing()

Get metadata routing of this object.

Please check User Guide on how the routing mechanism works.

Returns:
routing : MetadataRequest

A MetadataRequest encapsulating routing information.

get_params(deep=True)

Get the parameters of an estimator from the ensemble.

Returns the parameters given in the constructor as well as the estimators contained within the estimators parameter.

Parameters:
deep : bool, default=True

If True, the parameters of the contained estimators are returned as well.

Returns:
params : dict

Parameter names mapped to their values, including the contained estimators and their parameters.

property named_estimators

Dictionary to access any fitted sub-estimators by name.

Returns:
Bunch
set_params(**params)

Set the parameters of an estimator from the ensemble.

Valid parameter keys can be listed with get_params(). Note that you can directly set the parameters of the estimators contained in estimators.

Parameters:
**params : keyword arguments

Specific parameters using e.g. set_params(parameter_name=new_value). In addition to setting the parameters of this estimator, the individual estimators contained in estimators can also be set, or removed by setting them to 'drop'.

Returns:
self : object

Estimator instance.
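For example, nested parameters of the contained estimators can be set with scikit-learn's double-underscore syntax (illustrative; opinion_1 is the estimator name used in the Examples section above):

>>> opinion_pooling.set_params(
...     divergence_penalty=1.0,
...     opinion_1__cvar_views=["AMD == 0.08"],
... )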