skfolio.distance.MutualInformation

class skfolio.distance.MutualInformation(n_bins_method=FREEDMAN, n_bins=None, normalize=True)

Mutual Information estimator.

In information theory, mutual information measures the mutual dependence between two random variables. The associated distance metric is the variation of information.

For two random variables X and Y, the mutual information I(X,Y) is defined as:

\[I(X,Y) = H(X) + H(Y) - H(X,Y)\]

with H(X) and H(Y) the marginal entropies and H(X,Y) the joint entropy.

The related distance metric known as the variation of information is defined as:

\[d(X,Y) = H(X,Y) - I(X,Y) = H(X) + H(Y) - 2 \times I(X,Y)\]

and its normalization as:

\[D(X,Y) = \frac{d(X,Y)}{H(X,Y)} = \frac{H(X) + H(Y) - 2 \times I(X,Y)}{H(X) + H(Y) - I(X,Y)}\]
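
The three formulas above can be checked numerically on a small example. The sketch below is illustrative only and uses nothing from skfolio: the 2x2 contingency matrix of joint bin counts is hypothetical, and entropies are computed in nats with the standard library.

```python
import math


def entropy(probabilities):
    """Shannon entropy in nats; zero-probability cells contribute nothing."""
    return -sum(p * math.log(p) for p in probabilities if p > 0)


# Hypothetical contingency matrix: joint bin counts of X (rows) and Y (columns).
counts = [[30, 10],
          [10, 50]]
total = sum(sum(row) for row in counts)

joint = [[c / total for c in row] for row in counts]
p_x = [sum(row) for row in joint]        # marginal distribution of X
p_y = [sum(col) for col in zip(*joint)]  # marginal distribution of Y

h_x = entropy(p_x)
h_y = entropy(p_y)
h_xy = entropy([p for row in joint for p in row])

mutual_info = h_x + h_y - h_xy  # I(X,Y)
vi = h_xy - mutual_info         # d(X,Y), the variation of information
vi_normalized = vi / h_xy       # D(X,Y), which lies in [0, 1]

print(f"I(X,Y)={mutual_info:.4f}  d(X,Y)={vi:.4f}  D(X,Y)={vi_normalized:.4f}")
```

Both expressions for d(X,Y) agree because H(X,Y) = H(X) + H(Y) - I(X,Y). The normalized distance is 0 when X and Y determine each other and 1 when they are independent.
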
Parameters:
n_bins_method : NBinsMethod, default=NBinsMethod.FREEDMAN

Method used to choose the number of bins when estimating the contingency matrix from which the mutual information is computed. Possible values are:

  • FREEDMAN (default)

  • KNUTH

n_bins : int, optional

Number of bins, specified directly instead of being computed with n_bins_method.

normalize : bool, default=True

If True, the variation of information is normalized.
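
As a rough illustration of what a FREEDMAN-style bin count involves, the sketch below applies the textbook Freedman-Diaconis rule (bin width = 2 × IQR / n^(1/3)) with the standard library only. The function name `freedman_diaconis_bins`, the quantile convention, the single-bin fallback, and the sample returns are all assumptions of this sketch. skfolio's own implementation may differ in these details.

```python
import math
import statistics


def freedman_diaconis_bins(values):
    """Textbook Freedman-Diaconis rule: bin width = 2 * IQR / n^(1/3)."""
    n = len(values)
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    width = 2 * (q3 - q1) / n ** (1 / 3)
    if width == 0:
        # Degenerate sample with zero IQR: fall back to one bin (sketch assumption).
        return 1
    return max(1, math.ceil((max(values) - min(values)) / width))


# Hypothetical daily returns, for illustration only.
returns = [-0.021, -0.013, -0.008, -0.004, -0.001, 0.000,
           0.002, 0.005, 0.009, 0.012, 0.018, 0.025]
n_bins = freedman_diaconis_bins(returns)
print(n_bins)
```

A count obtained this way could be passed as `n_bins` when a fixed number of bins is preferred over an automatic method.
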

Attributes:
codependence_ : ndarray of shape (n_assets, n_assets)

Codependence matrix.

distance_ : ndarray of shape (n_assets, n_assets)

Distance matrix.

n_features_in_ : int

Number of assets seen during fit.

feature_names_in_ : ndarray of shape (n_features_in_,)

Names of features seen during fit. Defined only when X has feature names that are all strings.

Methods

fit(X[, y])

Fit the Mutual Information estimator.

get_metadata_routing()

Get metadata routing of this object.

get_params([deep])

Get parameters for this estimator.

set_params(**params)

Set the parameters of this estimator.

fit(X, y=None)

Fit the Mutual Information estimator.

Parameters:
X : array-like of shape (n_observations, n_assets)

Price returns of the assets.

y : Ignored

Not used, present for API consistency by convention.

Returns:
self : MutualInformation

Fitted estimator.

get_metadata_routing()

Get metadata routing of this object.

See the User Guide for how the routing mechanism works.

Returns:
routing : MetadataRequest

A MetadataRequest encapsulating routing information.

get_params(deep=True)

Get parameters for this estimator.

Parameters:
deep : bool, default=True

If True, return the parameters of this estimator and of any contained subobjects that are estimators.

Returns:
params : dict

Parameter names mapped to their values.

set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Parameters:
**params : dict

Estimator parameters.

Returns:
self : estimator instance

Estimator instance.