`skfolio.utils.stats`.cov_nearest

skfolio.utils.stats.cov_nearest(cov, higham=False, higham_max_iteration=100, warn=False)

Compute the nearest covariance matrix that is positive definite and for which a Cholesky decomposition can be computed. The variances are left unchanged. A covariance matrix that is not positive definite often occurs in high-dimensional problems. It can be caused by multicollinearity, floating-point inaccuracies, or a number of observations smaller than the number of assets.

First, the covariance matrix is converted to a correlation matrix. Then the nearest correlation matrix is found and converted back to a covariance matrix using the initial standard deviations.
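The three steps above can be sketched with NumPy for the eigenvalue-clipping variant. This is an illustrative sketch under stated assumptions, not the library's implementation: the function name `nearest_cov_by_clipping`, the `eps` argument, and the rescaling of the diagonal after clipping are all hypothetical choices made for the example.

```python
import numpy as np


def nearest_cov_by_clipping(cov, eps=1e-13):
    """Hypothetical helper: illustrative sketch, not skfolio's code."""
    cov = np.asarray(cov, dtype=float)
    # Standard deviations from the diagonal; they are restored at the end.
    std = np.sqrt(np.diag(cov))
    # Step 1: covariance -> correlation.
    corr = cov / np.outer(std, std)
    # Step 2: clip the eigenvalues of the correlation matrix to a small
    # positive floor, then rebuild the matrix from its eigenvectors.
    eigvals, eigvecs = np.linalg.eigh(corr)
    clipped = np.clip(eigvals, eps, None)
    corr = (eigvecs * clipped) @ eigvecs.T
    # Clipping perturbs the diagonal, so rescale it back to exactly 1.
    d = np.sqrt(np.diag(corr))
    corr = corr / np.outer(d, d)
    # Step 3: correlation -> covariance with the original standard deviations.
    new_cov = corr * np.outer(std, std)
    # Symmetrize to remove floating-point asymmetry.
    return (new_cov + new_cov.T) / 2


# A symmetric matrix with unit diagonal and eigenvalues 1.9, 1.9 and -0.8,
# so it is not positive definite.
bad_cov = np.array([
    [1.0, 0.9, -0.9],
    [0.9, 1.0, 0.9],
    [-0.9, 0.9, 1.0],
])
fixed_cov = nearest_cov_by_clipping(bad_cov)
```

Because the correction is applied to the correlation matrix and the original standard deviations are reapplied afterwards, the diagonal of `fixed_cov` matches the diagonal of `bad_cov`.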

Cholesky decomposition can fail for a symmetric positive definite (SPD) matrix because of floating-point error and, conversely, it can succeed for a non-SPD matrix. Therefore, both conditions need to be tested. The Cholesky test always runs first because it is significantly faster than checking for positive eigenvalues.
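A minimal sketch of this two-stage test, assuming plain NumPy: the helper names below are hypothetical and are not part of the skfolio API. The cheap Cholesky attempt runs first, and the eigenvalue check only runs when it succeeds.

```python
import numpy as np


def is_cholesky_decomposable(x):
    """Hypothetical helper: True if NumPy can compute a Cholesky factor."""
    try:
        np.linalg.cholesky(x)
        return True
    except np.linalg.LinAlgError:
        return False


def has_positive_eigenvalues(x):
    """Hypothetical helper: True if all eigenvalues of symmetric x are > 0."""
    return bool(np.all(np.linalg.eigvalsh(x) > 0))


def passes_both_checks(x):
    # Short-circuit: the slower eigenvalue test is skipped when the
    # Cholesky decomposition already fails.
    return is_cholesky_decomposable(x) and has_positive_eigenvalues(x)


spd = np.eye(3)                               # identity matrix, clearly SPD
indefinite = np.array([[1.0, 2.0],
                       [2.0, 1.0]])           # eigenvalues 3 and -1

spd_ok = passes_both_checks(spd)
indefinite_ok = passes_both_checks(indefinite)
```

Here `spd_ok` is `True` and `indefinite_ok` is `False`, since the second matrix fails at the Cholesky stage.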

Parameters:

cov (ndarray of shape (n, n)): Covariance matrix.
higham (bool, default=False): If True, the Higham (2002) algorithm [1] is used; otherwise the eigenvalues are clipped to a small positive threshold (1e-13). The default (False) uses the clipping method because the Higham algorithm can be slow for large datasets.
higham_max_iteration (int, default=100): Maximum number of iterations of the Higham (2002) algorithm.
warn (bool, default=False): If True, a user warning is emitted when the covariance matrix is not positive definite and is replaced by the nearest one.

Returns:

cov (ndarray): The nearest covariance matrix.

References

[1] “Computing the nearest correlation matrix - a problem from finance”, Nicholas J. Higham, IMA Journal of Numerical Analysis (2002).
