skfolio.utils.stats.cov_nearest
- skfolio.utils.stats.cov_nearest(cov, higham=False, higham_max_iteration=100, warn=False)
Compute the nearest covariance matrix that is positive definite and whose Cholesky decomposition can be computed. The variances are left unchanged. A covariance matrix that is not positive definite often occurs in high-dimensional problems. It can be caused by multicollinearity, floating-point inaccuracies, or a number of observations smaller than the number of assets.
First, it converts the covariance matrix to a correlation matrix. Then, it finds the nearest correlation matrix and converts it back to a covariance matrix using the initial standard deviation.
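The round trip described above can be sketched with NumPy alone. This is a minimal illustration of the eigenvalue-clipping variant, not the library's implementation: the helper name `clip_cov_nearest`, the rescaling step that restores a unit diagonal, and the example matrix are assumptions made for the sketch. Only the 1e-13 threshold comes from the parameter description.

```python
import numpy as np


def clip_cov_nearest(cov, threshold=1e-13):
    """Hypothetical helper: covariance -> correlation -> clip -> covariance."""
    # Standard deviations of the input; they are reused unchanged at the end.
    std = np.sqrt(np.diag(cov))
    corr = cov / np.outer(std, std)

    # Clip the eigenvalues of the correlation matrix to a small positive value.
    eig_vals, eig_vecs = np.linalg.eigh(corr)
    clipped = np.maximum(eig_vals, threshold)
    corr_fixed = (eig_vecs * clipped) @ eig_vecs.T

    # Clipping changes the diagonal, so rescale back to a unit diagonal
    # (an assumption of this sketch) and symmetrize against rounding error.
    scale = np.sqrt(np.diag(corr_fixed))
    corr_fixed = corr_fixed / np.outer(scale, scale)
    corr_fixed = (corr_fixed + corr_fixed.T) / 2

    # Convert back to a covariance matrix with the original standard deviations.
    return corr_fixed * np.outer(std, std)


# Illustrative input: an inconsistent correlation pattern scaled by std = [2, 1, 3].
cov = np.array(
    [
        [4.0, 1.8, -5.4],
        [1.8, 1.0, 2.7],
        [-5.4, 2.7, 9.0],
    ]
)
fixed = clip_cov_nearest(cov)
```

Because the last step multiplies by the original standard deviations, the diagonal of `fixed` matches the diagonal of `cov`, which is what "the variance is left unchanged" refers to.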
Cholesky decomposition can fail for a symmetric positive definite (SPD) matrix because of floating-point error and, conversely, can succeed for a non-SPD matrix. Therefore, both conditions need to be tested. The Cholesky test always runs first because it is significantly faster than checking for positive eigenvalues.
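The two-stage test can be sketched as follows. The function name `passes_both_checks` is hypothetical and the library's internal helpers may be organized differently; the sketch only shows the ordering described above, with the cheap Cholesky attempt first and the eigenvalue check second.

```python
import numpy as np


def passes_both_checks(x):
    """Hypothetical helper: True if Cholesky succeeds and all eigenvalues are positive."""
    # Cheap test first: a failed factorization settles the question immediately.
    try:
        np.linalg.cholesky(x)
    except np.linalg.LinAlgError:
        return False
    # Cholesky can succeed on a matrix that is not SPD, so confirm with eigenvalues.
    return bool(np.all(np.linalg.eigvalsh(x) > 0))


identity_ok = passes_both_checks(np.eye(3))

# Illustrative matrix with one negative eigenvalue.
not_pd = np.array(
    [
        [1.0, 0.9, -0.9],
        [0.9, 1.0, 0.9],
        [-0.9, 0.9, 1.0],
    ]
)
not_pd_ok = passes_both_checks(not_pd)
```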
- Parameters:
- cov : ndarray of shape (n, n)
Covariance matrix.
- higham : bool, default=False
If True, the Higham (2002) algorithm [1] is used; otherwise the eigenvalues are clipped to a small positive threshold (1e-13). The default (False) uses the clipping method because the Higham algorithm can be slow for large datasets.
- higham_max_iteration : int, default=100
Maximum number of iterations of the Higham (2002) algorithm. The default value is 100.
- warn : bool, default=False
If True, a user warning is emitted when the covariance matrix is not positive definite and is replaced by the nearest one. The default is False.
- Returns:
- cov : ndarray
The nearest covariance matrix.
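For readers curious about the `higham=True` option, the alternating-projections idea from [1] can be sketched as below. This is a simplified illustration under stated assumptions, not the library's code: the function name `higham_nearest_corr`, the `tol` stopping rule, and the example matrix are hypothetical, and the sketch works on a correlation matrix rather than a covariance matrix.

```python
import numpy as np


def higham_nearest_corr(corr, max_iteration=100, tol=1e-8):
    """Hypothetical sketch of alternating projections with Dykstra's correction."""
    y = corr.copy()
    delta_s = np.zeros_like(corr)
    for _ in range(max_iteration):
        # Remove the correction carried over from the previous iteration.
        r = y - delta_s
        # Project onto the positive semidefinite cone by zeroing negative eigenvalues.
        eig_vals, eig_vecs = np.linalg.eigh(r)
        x = (eig_vecs * np.maximum(eig_vals, 0.0)) @ eig_vecs.T
        delta_s = x - r
        # Project onto the set of symmetric matrices with unit diagonal.
        y_new = x.copy()
        np.fill_diagonal(y_new, 1.0)
        converged = np.linalg.norm(y_new - y, "fro") <= tol
        y = y_new
        if converged:
            break
    return y


# Illustrative correlation matrix with one negative eigenvalue.
corr = np.array(
    [
        [1.0, 0.9, -0.9],
        [0.9, 1.0, 0.9],
        [-0.9, 0.9, 1.0],
    ]
)
nearest = higham_nearest_corr(corr)
```

Each iteration needs a full eigendecomposition, which is consistent with the note that this method can be slow for large datasets compared with a single clipping pass.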
References
[1] Nicholas J. Higham, "Computing the nearest correlation matrix - a problem from finance", IMA Journal of Numerical Analysis (2002).