skfolio.utils.stats.cov_nearest

skfolio.utils.stats.cov_nearest(cov, higham=False, higham_max_iteration=100, warn=False)

Compute the nearest covariance matrix that is positive definite and whose Cholesky decomposition can be computed. The variances are left unchanged. A covariance matrix that is not positive definite often occurs in high-dimensional problems. It can be caused by multicollinearity, floating-point inaccuracies, or a number of observations smaller than the number of assets.

First, the covariance matrix is converted to a correlation matrix. Then, the nearest correlation matrix is found and converted back to a covariance matrix using the original standard deviations.
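The three steps can be sketched with NumPy alone. This is a minimal illustration under stated assumptions, not the skfolio source: the helper name `nearest_cov_by_clipping` is hypothetical, and eigenvalue clipping with the 1e-13 threshold mentioned under `higham` stands in for the "nearest correlation" step.

```python
import numpy as np

def nearest_cov_by_clipping(cov, threshold=1e-13):
    """Illustrative helper (hypothetical name), not the skfolio source."""
    # Step 1: covariance -> correlation, keeping the standard deviations.
    std = np.sqrt(np.diag(cov))
    corr = cov / np.outer(std, std)

    # Step 2: clip the eigenvalues of the correlation matrix, then rescale
    # so that the diagonal is one again.
    eig_vals, eig_vecs = np.linalg.eigh(corr)
    clipped = np.clip(eig_vals, threshold, None)
    corr = (eig_vecs * clipped) @ eig_vecs.T
    scale = np.sqrt(np.diag(corr))
    corr = corr / np.outer(scale, scale)

    # Step 3: correlation -> covariance with the original standard deviations.
    return corr * np.outer(std, std)

# A symmetric matrix with unit variances and one negative eigenvalue.
cov = np.array([[1.0, 0.9, -0.9],
                [0.9, 1.0, 0.9],
                [-0.9, 0.9, 1.0]])
fixed = nearest_cov_by_clipping(cov)
```

Because the standard deviations are extracted before the repair and reapplied afterwards, the diagonal of `fixed` matches the diagonal of `cov`.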

Cholesky decomposition can fail for a symmetric positive definite (SPD) matrix because of floating-point error and, conversely, it can succeed for a non-SPD matrix. Therefore, both conditions need to be tested. The Cholesky test always runs first because it is significantly faster than checking for positive eigenvalues.
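A possible way to express this two-stage test is shown below. The function names are hypothetical and chosen for illustration only. The sketch assumes a symmetric input and relies on `numpy.linalg.cholesky` raising `LinAlgError` when the factorization fails.

```python
import numpy as np

def is_cholesky_ok(x):
    """Return True if NumPy can compute a Cholesky factor of x."""
    try:
        np.linalg.cholesky(x)
        return True
    except np.linalg.LinAlgError:
        return False

def has_positive_eigenvalues(x):
    """Return True if every eigenvalue of the symmetric matrix x is > 0."""
    return bool(np.all(np.linalg.eigvalsh(x) > 0))

def passes_both_checks(x):
    # The cheap Cholesky test runs first; `and` short-circuits, so the
    # eigenvalue test only runs when the factorization succeeds.
    return is_cholesky_ok(x) and has_positive_eigenvalues(x)

good = np.array([[2.0, 0.5], [0.5, 1.0]])
bad = np.array([[1.0, 2.0], [2.0, 1.0]])  # eigenvalues are 3 and -1
```

Here `passes_both_checks(good)` is true and `passes_both_checks(bad)` is false. For `bad` the Cholesky attempt already fails, so the eigenvalue decomposition is never computed.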

Parameters:
cov : ndarray of shape (n, n)

Covariance matrix.

higham : bool, default=False

If True, the Higham (2002) algorithm [1] is used; otherwise the eigenvalues are clipped to a small positive threshold (1e-13). The default (False) uses the clipping method because the Higham algorithm can be slow for large datasets.

higham_max_iteration : int, default=100

Maximum number of iterations of the Higham (2002) algorithm. The default value is 100.

warn : bool, default=False

If True, a user warning is emitted when the covariance matrix is not positive definite and is replaced by the nearest positive definite one. The default is False.

Returns:
cov : ndarray

The nearest covariance matrix.
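For readers curious about what the `higham=True` option refers to, the alternating-projections idea from [1] can be sketched as follows. This is an assumption-laden illustration rather than the library's code: the function name, the `tol` stopping rule, and the convergence test are all hypothetical. The sketch works on a correlation matrix and returns a positive semidefinite result that may sit on the boundary (smallest eigenvalue near zero).

```python
import numpy as np

def nearest_corr_alternating(corr, max_iteration=100, tol=1e-8):
    """Illustrative alternating projections with Dykstra's correction."""
    y = corr.copy()
    correction = np.zeros_like(corr)
    for _ in range(max_iteration):
        r = y - correction
        # Projection onto the positive semidefinite cone.
        eig_vals, eig_vecs = np.linalg.eigh(r)
        x = (eig_vecs * np.maximum(eig_vals, 0.0)) @ eig_vecs.T
        correction = x - r
        # Projection onto symmetric matrices with unit diagonal.
        y_new = x.copy()
        np.fill_diagonal(y_new, 1.0)
        converged = np.linalg.norm(y_new - y) <= tol * np.linalg.norm(y_new)
        y = y_new
        if converged:
            break
    return y

corr = np.array([[1.0, 0.9, -0.9],
                 [0.9, 1.0, 0.9],
                 [-0.9, 0.9, 1.0]])
nearest = nearest_corr_alternating(corr)
```

Each pass costs a full eigendecomposition, which is consistent with the note under `higham` that this approach can be slow for large inputs, and explains why an iteration cap such as `higham_max_iteration` is useful.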

References

[1] N. J. Higham, “Computing the nearest correlation matrix - a problem from finance”, IMA Journal of Numerical Analysis, 2002.