skfolio.distribution.IndependentCopula

class skfolio.distribution.IndependentCopula(random_state=None)

Bivariate Independent Copula (also called the product copula).

It is defined by:

\[C(u, v) = u \cdot v\]

Parameters:
random_state : int, RandomState instance or None, default=None

Seed or random state to ensure reproducibility.

Attributes:
fitted_repr

String representation of the fitted copula.

lower_tail_dependence

Theoretical lower tail dependence coefficient.

n_params

Number of model parameters.

upper_tail_dependence

Theoretical upper tail dependence coefficient.

References

[1]

“An Introduction to Copulas (2nd ed.)”, Nelsen (2006)

[2]

“Multivariate Models and Dependence Concepts”, Joe, Chapman & Hall (1997)

[3]

“Quantitative Risk Management: Concepts, Techniques and Tools”, McNeil, Frey & Embrechts (2005)

[4]

“The t Copula and Related Copulas”, Demarta & McNeil (2005)

[5]

“Copula Methods in Finance”, Cherubini, Luciano & Vecchiato (2004)

Methods

aic(X)

Compute the Akaike Information Criterion (AIC) for the model given data X.

bic(X)

Compute the Bayesian Information Criterion (BIC) for the model given data X.

cdf(X)

Compute the CDF of the bivariate Independent copula.

fit(X[, y])

Fit the Bivariate Independent Copula.

get_metadata_routing()

Get metadata routing of this object.

get_params([deep])

Get parameters for this estimator.

inverse_partial_derivative(X[, first_margin])

Compute the inverse of the bivariate copula's partial derivative, commonly known as the inverse h-function.

partial_derivative(X[, first_margin])

Compute the h-function (partial derivative) for the bivariate Independent copula.

plot_pdf_2d([title])

Plot a 2D contour of the estimated probability density function (PDF).

plot_pdf_3d([title])

Plot a 3D surface of the estimated probability density function (PDF).

plot_tail_concentration([X, title])

Plot the tail concentration function.

sample([n_samples])

Generate random samples from the bivariate copula using the inverse Rosenblatt transform.

score(X[, y])

Compute the total log-likelihood under the model.

score_samples(X)

Compute the log-likelihood of each sample (log-pdf) under the model.

set_params(**params)

Set the parameters of this estimator.

tail_concentration(quantiles)

Compute the tail concentration function for a set of quantiles.

aic(X)

Compute the Akaike Information Criterion (AIC) for the model given data X.

The AIC is defined as:

\[\mathrm{AIC} = -2 \, \log L \;+\; 2 k,\]

where

  • \(\log L\) is the total log-likelihood

  • \(k\) is the number of parameters in the model

A lower AIC value indicates a better trade-off between model fit and complexity.

Parameters:
X : array-like of shape (n_observations, n_features)

The input data on which to compute the AIC.

Returns:
aic : float

The AIC of the fitted model on the given data.

Notes

In practice, both AIC and BIC measure the trade-off between model fit and complexity, but BIC tends to prefer simpler models for large \(n\) because of the \(\ln(n)\) term.
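
As a minimal sketch of the AIC formula above (plain Python, not the library's implementation; the helper name `aic_from_loglik` is hypothetical), the criterion can be computed from a total log-likelihood and a parameter count:

```python
def aic_from_loglik(log_likelihood: float, n_params: int) -> float:
    """AIC = -2 * log L + 2 * k (hypothetical helper, for illustration only)."""
    return -2.0 * log_likelihood + 2.0 * n_params


# Illustrative numbers only: a one-parameter model with log L = 12.5.
aic_value = aic_from_loglik(log_likelihood=12.5, n_params=1)
print(aic_value)  # -23.0
```

Because the independent copula has density 1 on the unit square, its log-likelihood is 0; if `n_params` is 0, as expected for a parameter-free copula, the formula yields an AIC of 0.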

References

[1]

“A new look at the statistical model identification”, Akaike (1974).

bic(X)

Compute the Bayesian Information Criterion (BIC) for the model given data X.

The BIC is defined as:

\[\mathrm{BIC} = -2 \, \log L \;+\; k \,\ln(n),\]

where

  • \(\log L\) is the (maximized) total log-likelihood

  • \(k\) is the number of parameters in the model

  • \(n\) is the number of observations

A lower BIC value suggests a better fit while imposing a stronger penalty for model complexity than the AIC.

Parameters:
X : array-like of shape (n_observations, n_features)

The input data on which to compute the BIC.

Returns:
bic : float

The BIC of the fitted model on the given data.

Notes

In practice, both AIC and BIC measure the trade-off between model fit and complexity, but BIC tends to prefer simpler models for large \(n\) because of the \(\ln(n)\) term.
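
The following sketch makes the \(\ln(n)\) remark concrete. It is plain Python rather than the library's code, and both helper names are hypothetical. It computes the BIC and the difference between the BIC and AIC penalties, \(k(\ln(n) - 2)\):

```python
import math


def bic_from_loglik(log_likelihood: float, n_params: int, n_observations: int) -> float:
    """BIC = -2 * log L + k * ln(n) (hypothetical helper, for illustration only)."""
    return -2.0 * log_likelihood + n_params * math.log(n_observations)


def penalty_gap(n_params: int, n_observations: int) -> float:
    """BIC penalty minus AIC penalty: k * (ln(n) - 2)."""
    return n_params * (math.log(n_observations) - 2.0)


# ln(n) exceeds 2 once n >= 8 (e^2 is about 7.39), so from that point on
# BIC penalizes each extra parameter more heavily than AIC.
print(penalty_gap(n_params=1, n_observations=7) < 0)  # True
print(penalty_gap(n_params=1, n_observations=8) > 0)  # True
```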

References

[1]

“Estimating the dimension of a model”, Schwarz, G. (1978).

cdf(X)

Compute the CDF of the bivariate Independent copula.

Parameters:
X : array-like of shape (n_observations, 2)

An array of bivariate inputs (u, v) where each row represents a bivariate observation. Both u and v must be in the interval [0, 1], having been transformed to uniform marginals.

Returns:
cdf : ndarray of shape (n_observations,)

CDF values for each observation in X.
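
For intuition, the product formula \(C(u, v) = u \cdot v\) can be evaluated row by row. This is a plain-Python sketch that only mirrors the formula; `independent_cdf` is a hypothetical helper, not the library method:

```python
def independent_cdf(X):
    """Product copula C(u, v) = u * v for each row (u, v) of X.

    Hypothetical helper for illustration; it only mirrors the formula.
    """
    values = []
    for u, v in X:
        if not (0.0 <= u <= 1.0 and 0.0 <= v <= 1.0):
            raise ValueError("u and v must lie in [0, 1]")
        values.append(u * v)
    return values


X = [(0.5, 0.5), (0.25, 1.0), (0.0, 0.75)]
print(independent_cdf(X))  # [0.25, 0.25, 0.0]
```

The last two rows show the copula boundary conditions: \(C(u, 1) = u\) and \(C(0, v) = 0\).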

fit(X, y=None)

Fit the Bivariate Independent Copula.

The independent copula has no parameters to estimate; this method is provided for compatibility with the API.

Parameters:
X : array-like of shape (n_observations, 2)

An array of bivariate inputs (u, v) where each row represents a bivariate observation. Both u and v must be in the interval [0, 1], having been transformed to uniform marginals.

y : None

Ignored. Provided for compatibility with scikit-learn’s API.

Returns:
self : IndependentCopula

Returns the instance itself.

property fitted_repr

String representation of the fitted copula.

get_metadata_routing()

Get metadata routing of this object.

See the User Guide for how the routing mechanism works.

Returns:
routing : MetadataRequest

A MetadataRequest encapsulating routing information.

get_params(deep=True)

Get parameters for this estimator.

Parameters:
deep : bool, default=True

If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns:
params : dict

Parameter names mapped to their values.

inverse_partial_derivative(X, first_margin=False)

Compute the inverse of the bivariate copula’s partial derivative, commonly known as the inverse h-function.

For the independent copula, the h-function with respect to the second margin is

\[h(u\mid v)= u,\]

and the derivative with respect to the first margin is

\[g(u,v)= v.\]

Their inverses are trivial:

  • Given (p,v) for h(u|v)= p, we have u = p.

  • Given (p,u) for g(u,v)= p, we have v = p.

Parameters:
X : array-like of shape (n_observations, 2)

An array of bivariate inputs (p, v), each in the interval [0, 1].

  • The first column p corresponds to the value of the h-function.

  • The second column v is the conditioning variable.

first_margin : bool, default=False

If True, compute the inverse partial derivative with respect to the first margin u; otherwise, compute the inverse partial derivative with respect to the second margin v.

Returns:
u : ndarray of shape (n_observations,)

A 1D-array of length n_observations, where each element is the computed \(u = h^{-1}(p \mid v)\) for the corresponding pair in X.
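
The sketch below illustrates why the inverse is trivial for this copula. It is plain Python with hypothetical helper names (`h`, `h_inverse`), covers only the default case (second margin), and is not the library's implementation:

```python
def h(u, v):
    """h(u | v) = dC/dv = u for the independent copula."""
    return u


def h_inverse(p, v):
    """Solve h(u | v) = p for u; the conditioning value v plays no role."""
    return p


# Each row is (p, v): the h-function value and the conditioning variable.
X = [(0.1, 0.9), (0.5, 0.2), (0.8, 0.8)]
u = [h_inverse(p, v) for p, v in X]
print(u)  # [0.1, 0.5, 0.8]

# Round trip: applying h to the recovered u gives back p.
round_trip_ok = all(h(u_i, v) == p for u_i, (p, v) in zip(u, X))
print(round_trip_ok)  # True
```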

property lower_tail_dependence

Theoretical lower tail dependence coefficient.
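
Under the usual definition of the lower tail dependence coefficient (assumed here, since this page does not state it), the product form \(C(q, q) = q^2\) gives a coefficient of zero:

\[\lambda_L = \lim_{q \to 0^+} \frac{C(q, q)}{q} = \lim_{q \to 0^+} \frac{q^2}{q} = 0.\]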

property n_params

Number of model parameters.

partial_derivative(X, first_margin=False)

Compute the h-function (partial derivative) for the bivariate Independent copula.

The h-function with respect to the second margin represents the conditional distribution function of \(u\) given \(v\):

\[\frac{\partial C(u,v)}{\partial v}=u.\]

Parameters:
X : array-like of shape (n_samples, 2)

Array of pairs \((u,v)\), where each value is in the interval [0,1].

first_margin : bool, default=False

If True, compute the partial derivative with respect to the first margin u; otherwise, compute the partial derivative with respect to the second margin v.

Returns:
np.ndarray

Array of h-function values for each observation in X.
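
As a quick numerical check of the closed form \(\partial C / \partial v = u\), the derivative can be approximated with a central finite difference. This is an illustrative sketch in plain Python; `cdf` and `h_numeric` are hypothetical helpers and the step size is an arbitrary choice:

```python
def cdf(u, v):
    """Independent copula C(u, v) = u * v."""
    return u * v


def h_numeric(u, v, eps=1e-6):
    """Central finite-difference approximation of dC/dv at (u, v)."""
    return (cdf(u, v + eps) - cdf(u, v - eps)) / (2.0 * eps)


for u, v in [(0.2, 0.5), (0.7, 0.3)]:
    # The numerical derivative should match the closed form h(u | v) = u.
    print(abs(h_numeric(u, v) - u) < 1e-8)  # True
```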

plot_pdf_2d(title=None)

Plot a 2D contour of the estimated probability density function (PDF).

This method generates a grid over [0, 1]^2, computes the PDF, and displays a contour plot of the PDF. Contour levels are capped at the 97th percentile so that extreme densities do not dominate the plot.

Parameters:
title : str, optional

The title for the plot. If not provided, a default title based on the fitted copula’s representation is used.

Returns:
fig : go.Figure

A Plotly figure object containing the 2D contour plot of the PDF.

plot_pdf_3d(title=None)

Plot a 3D surface of the estimated probability density function (PDF).

This method generates a grid over [0, 1]^2, computes the PDF, and displays a 3D surface plot of the PDF using Plotly.

Parameters:
title : str, optional

The title for the plot. If not provided, a default title based on the fitted copula’s representation is used.

Returns:
fig : go.Figure

A Plotly figure object containing a 3D surface plot of the PDF.

plot_tail_concentration(X=None, title=None)

Plot the tail concentration function.

This method computes the tail concentration function at 100 evenly spaced quantile levels between 0.005 and 0.995. The plot displays the concentration values on the y-axis and the quantile levels on the x-axis.

The tail concentration is defined as:
  • Lower tail: λ_L(q) = P(U₂ ≤ q | U₁ ≤ q)

  • Upper tail: λ_U(q) = P(U₂ ≥ q | U₁ ≥ q)

where U₁ and U₂ are the pseudo-observations of the first and second variables, respectively.

Parameters:
X : array-like of shape (n_samples, 2), optional

If provided, it is used to plot the empirical tail concentration for comparison versus the model tail concentration.

title : str, optional

The title for the plot. If not provided, a default title based on the fitted copula’s representation is used.

Returns:
fig : go.Figure

A Plotly figure object containing the tail concentration curve.
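
The quantile grid and an empirical tail concentration curve can be sketched without any plotting. The code below is plain Python with hypothetical helper names. It assumes the lower-tail definition applies for q ≤ 0.5 and the upper-tail one otherwise, and it is not the library's implementation:

```python
import random


def quantile_grid(n=100, low=0.005, high=0.995):
    """n evenly spaced quantile levels between low and high (inclusive)."""
    step = (high - low) / (n - 1)
    return [low + i * step for i in range(n)]


def empirical_tail_concentration(X, q):
    """Empirical P(V <= q | U <= q) for q <= 0.5, else P(V >= q | U >= q)."""
    if q <= 0.5:
        cond = [(u, v) for u, v in X if u <= q]
        joint = sum(1 for _, v in cond if v <= q)
    else:
        cond = [(u, v) for u, v in X if u >= q]
        joint = sum(1 for _, v in cond if v >= q)
    # No observation satisfies the conditioning event: the ratio is undefined.
    return joint / len(cond) if cond else float("nan")


# Simulated independent pseudo-observations (fixed seed for reproducibility).
rng = random.Random(0)
X = [(rng.random(), rng.random()) for _ in range(20_000)]

grid = quantile_grid()
curve = [empirical_tail_concentration(X, q) for q in grid]
```

For independent data, the empirical curve should stay close to q on the lower half and to 1 - q on the upper half, which is the shape implied by \(C(q, q) = q^2\).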

References

[1]

“Quantitative Risk Management: Concepts, Techniques and Tools”, McNeil, Frey & Embrechts (2005)

sample(n_samples=1)

Generate random samples from the bivariate copula using the inverse Rosenblatt transform.

Parameters:
n_samples : int, default=1

Number of samples to generate.

Returns:
X : array-like of shape (n_samples, 2)

An array of bivariate samples (u, v) where each row represents a bivariate observation. Both u and v are uniform marginals in the interval [0, 1].
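
A minimal sketch of the inverse Rosenblatt idea, under the assumption that the first margin is drawn uniformly and the second is obtained through the inverse h-function. The helper names and the `seed` argument are hypothetical, and the library may order or vectorize these steps differently:

```python
import random


def h_inverse(p, u):
    """Inverse h-function of the independent copula: returns p unchanged."""
    return p


def sample_independent(n_samples=1, seed=None):
    """Draw (u, v) pairs via the inverse Rosenblatt transform (sketch)."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n_samples):
        u = rng.random()      # first margin: uniform draw
        p = rng.random()      # independent uniform draw, read as a conditional probability
        v = h_inverse(p, u)   # second margin, conditional on u
        samples.append((u, v))
    return samples


X = sample_independent(n_samples=5, seed=42)
```

Since the inverse h-function ignores the conditioning value, the transform reduces to drawing two independent uniforms, as expected under independence.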

score(X, y=None)

Compute the total log-likelihood under the model.

Parameters:
X : array-like of shape (n_observations, n_features)

An array of data points for which the total log-likelihood is computed.

y : None

Ignored. Provided for compatibility with scikit-learn’s API.

Returns:
logprob : float

The total log-likelihood (sum of log-pdf values).

score_samples(X)

Compute the log-likelihood of each sample (log-pdf) under the model.

Parameters:
X : array-like of shape (n_samples, 2)

The input data where each row represents a bivariate observation. The data should be transformed to uniform marginals in [0, 1].

Returns:
density : ndarray of shape (n_samples,)

The log-likelihood of each sample under the fitted copula.
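
For the independent copula the density is \(c(u, v) = \partial^2 C / \partial u \, \partial v = 1\), so every per-sample log-likelihood is zero. The following plain-Python sketch (hypothetical helper names, not the library's code) shows the per-sample values and their sum, which is the total log-likelihood:

```python
import math


def independent_pdf(u, v):
    """Density of the independent copula: c(u, v) = 1 on the unit square."""
    return 1.0


def log_density_samples(X):
    """Log-density of each row (u, v) of X."""
    return [math.log(independent_pdf(u, v)) for u, v in X]


X = [(0.1, 0.9), (0.5, 0.5), (0.7, 0.2)]
log_pdf = log_density_samples(X)
print(log_pdf)       # [0.0, 0.0, 0.0]
print(sum(log_pdf))  # 0.0
```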

set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Parameters:
**params : dict

Estimator parameters.

Returns:
self : estimator instance

Estimator instance.

tail_concentration(quantiles)

Compute the tail concentration function for a set of quantiles.

The tail concentration function is defined as follows:
  • For quantiles q ≤ 0.5:

    C(q) = P(U ≤ q, V ≤ q) / q

  • For quantiles q > 0.5:

    C(q) = (1 - 2q + P(U ≤ q, V ≤ q)) / (1 - q)

where U and V are the pseudo-observations of the first and second variables, respectively. This function returns the concentration values for each q provided.

Parameters:
quantiles : ndarray of shape (n_quantiles,)

A 1D array of quantile levels (values between 0 and 1) at which to compute the tail concentration.

Returns:
concentration : ndarray of shape (n_quantiles,)

The computed tail concentration values corresponding to each quantile.

Raises:
ValueError

If any value in quantiles is not in the interval [0, 1].
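
The piecewise definition above can be sketched for the independent copula, where \(P(U \le q, V \le q) = q^2\). This is plain Python with a hypothetical function name, not the library's implementation. Returning the limiting value 0.0 at the endpoints q = 0 and q = 1, where the formula divides by zero, is an assumption of this sketch:

```python
def model_tail_concentration(quantiles):
    """Tail concentration of the independent copula at each quantile level."""
    result = []
    for q in quantiles:
        if not 0.0 <= q <= 1.0:
            raise ValueError("quantiles must be in the interval [0, 1]")
        if q == 0.0 or q == 1.0:
            # Assumption of this sketch: use the limit, since the ratio is 0/0.
            result.append(0.0)
            continue
        joint = q * q  # P(U <= q, V <= q) = C(q, q) under independence
        if q <= 0.5:
            result.append(joint / q)
        else:
            result.append((1.0 - 2.0 * q + joint) / (1.0 - q))
    return result


print(model_tail_concentration([0.25, 0.5, 0.75]))  # [0.25, 0.5, 0.25]
```

The values simplify to q on the lower half and 1 - q on the upper half, so the curve peaks at 0.5 and falls to zero in both tails.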

References

[1]

“Quantitative Risk Management: Concepts, Techniques and Tools”, McNeil, Frey & Embrechts (2005)

property upper_tail_dependence

Theoretical upper tail dependence coefficient.
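
Under the usual definition of the upper tail dependence coefficient (assumed here, since this page does not state it), the product form \(C(q, q) = q^2\) gives a coefficient of zero:

\[\lambda_U = \lim_{q \to 1^-} \frac{1 - 2q + C(q, q)}{1 - q} = \lim_{q \to 1^-} \frac{(1 - q)^2}{1 - q} = 0.\]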