skfolio.utils.stats.combination_by_index#

skfolio.utils.stats.combination_by_index(idx, n, k)[source]#

Retrieve the k-combination at a given lexicographic position without enumerating all combinations.

This function implements the unranking algorithm (also known as the combinatorial number system or “combinadic”) to retrieve the specific k-combination corresponding to a given lexicographic idx without generating all C(n, k) possible subsets.

Given a universe of size n, there are M = C(n, k) possible subsets of size k. This function returns the subset corresponding to the idx in lex order.

This approach is crucial when M = C(n, k) is too large to generate or store all combinations, and you need to draw random subsets uniformly (sampling k=5 from n=100 gives M ≈ 7.5e7).

Time complexity: O(k) Space complexity: O(k)

Parameters:
idxint

Index (rank) of the desired combination in lex order. Must satisfy 0 <= idx < C(n, k).

nint

Size of the universe.

kint

Size of each combination (0 <= k <= n).

Returns:
combinationndarray of shape (k,)

1D integer array of length k containing the sorted k-combination.

Raises:
ValueError

If parameters are out of valid range.

References

..[1] “The Art of Computer Programming”, Vol. 4A: Combinatorial Algorithms,

Section 7.2.1.3. Knuth, D. E. (1998).