skfolio.distribution.IndependentCopula#

class skfolio.distribution.IndependentCopula(random_state=None)[source]#

Bivariate Independent Copula (also called the product copula).

It is defined by:

\[C(u, v) = u \cdot v\]
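
As a short worked step that follows from the definition above (a sketch added for illustration), differentiating the product copula once in each argument gives a constant density, so the log-density is zero everywhere on the unit square:

\[c(u, v) = \frac{\partial^2 C(u, v)}{\partial u \, \partial v} = 1, \qquad \log c(u, v) = 0\]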

Parameters:

random_state (int, RandomState instance or None, default=None): Seed or random state to ensure reproducibility.

Attributes:

fitted_repr: String representation of the fitted copula.
lower_tail_dependence: Theoretical lower tail dependence coefficient.
n_params: Number of model parameters.
upper_tail_dependence: Theoretical upper tail dependence coefficient.
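
A short derivation, assuming the standard limit definitions of the tail dependence coefficients, suggests why both coefficients are zero for the product copula (which also has no parameter to estimate):

\[\lambda_L = \lim_{q \to 0^+} \frac{C(q, q)}{q} = \lim_{q \to 0^+} q = 0, \qquad \lambda_U = \lim_{q \to 1^-} \frac{1 - 2q + C(q, q)}{1 - q} = \lim_{q \to 1^-} (1 - q) = 0\]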

Methods

`aic`(X)	Compute the Akaike Information Criterion (AIC) for the model given data X.
`bic`(X)	Compute the Bayesian Information Criterion (BIC) for the model given data X.
`cdf`(X)	Compute the CDF of the bivariate Independent copula.
`fit`(X[, y])	Fit the Bivariate Independent Copula.
`get_metadata_routing`()	Get metadata routing of this object.
`get_params`([deep])	Get parameters for this estimator.
`inverse_partial_derivative`(X[, first_margin])	Compute the inverse of the bivariate copula's partial derivative, commonly known as the inverse h-function.
`partial_derivative`(X[, first_margin])	Compute the h-function (partial derivative) for the bivariate Independent copula.
`plot_pdf_2d`([title])	Plot a 2D contour of the estimated probability density function (PDF).
`plot_pdf_3d`([title])	Plot a 3D surface of the estimated probability density function (PDF).
`plot_tail_concentration`([X, title])	Plot the tail concentration function.
`sample`([n_samples])	Generate random samples from the bivariate copula using the inverse Rosenblatt transform.
`score`(X[, y])	Compute the total log-likelihood under the model.
`score_samples`(X)	Compute the log-likelihood of each sample (log-pdf) under the model.
`set_params`(**params)	Set the parameters of this estimator.
`tail_concentration`(quantiles)	Compute the tail concentration function for a set of quantiles.

References

[1] “An Introduction to Copulas (2nd ed.)”, Nelsen (2006)
[2] “Multivariate Models and Dependence Concepts”, Joe, Chapman & Hall (1997)
[3] “Quantitative Risk Management: Concepts, Techniques and Tools”, McNeil, Frey & Embrechts (2005)
[4] “The t Copula and Related Copulas”, Demarta & McNeil (2005)
[5] “Copula Methods in Finance”, Cherubini, Luciano & Vecchiato (2004)

aic(X)#

Compute the Akaike Information Criterion (AIC) for the model given data X.

The AIC is defined as:

\[\mathrm{AIC} = -2 \, \log L \;+\; 2 k,\]

where

\(\log L\) is the total log-likelihood
\(k\) is the number of parameters in the model

A lower AIC value indicates a better trade-off between model fit and complexity.

Parameters:

X (array-like of shape (n_observations, n_features)): The input data on which to compute the AIC.

Returns:

aic (float): The AIC of the fitted model on the given data.

Notes

In practice, both AIC and BIC measure the trade-off between model fit and complexity, but BIC tends to prefer simpler models for large \(n\) because of the \(\ln(n)\) term.
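
The AIC formula above can be sketched in plain Python. This is an illustrative sketch, not the library's implementation: the helper name `aic` and the comparison model with a log-likelihood of 12.5 are hypothetical. For the independent copula the density is 1, so every log-pdf term is 0, and there are no parameters.

```python
def aic(total_log_likelihood: float, n_params: int) -> float:
    """Return AIC = -2 log L + 2 k (hypothetical helper)."""
    return -2.0 * total_log_likelihood + 2.0 * n_params


# Independent copula: log-pdf is 0 for every observation and k = 0.
n_observations = 250
log_likelihood_independent = 0.0 * n_observations
aic_independent = aic(log_likelihood_independent, n_params=0)

# Hypothetical one-parameter copula with an illustrative log-likelihood.
aic_alternative = aic(12.5, n_params=1)

print(aic_independent)  # 0.0
print(aic_alternative)  # -23.0
```

Under these illustrative numbers, the one-parameter model would be preferred because its AIC is lower.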

References

[1] “A new look at the statistical model identification”, Akaike (1974).

bic(X)#

Compute the Bayesian Information Criterion (BIC) for the model given data X.

The BIC is defined as:

\[\mathrm{BIC} = -2 \, \log L \;+\; k \,\ln(n),\]

where

\(\log L\) is the (maximized) total log-likelihood
\(k\) is the number of parameters in the model
\(n\) is the number of observations

A lower BIC value suggests a better fit. The BIC penalizes model complexity more strongly than the AIC whenever \(\ln(n) > 2\).

Parameters:

X (array-like of shape (n_observations, n_features)): The input data on which to compute the BIC.

Returns:

bic (float): The BIC of the fitted model on the given data.

Notes

In practice, both AIC and BIC measure the trade-off between model fit and complexity, but BIC tends to prefer simpler models for large \(n\) because of the \(\ln(n)\) term.
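
To make the role of the \(\ln(n)\) term concrete, the sketch below compares the per-parameter penalties of the two criteria and finds the smallest sample size at which the BIC penalty exceeds the AIC penalty. The helper names are hypothetical and the code does not call the library.

```python
import math


def bic(total_log_likelihood: float, n_params: int, n_observations: int) -> float:
    """Return BIC = -2 log L + k ln(n) (hypothetical helper)."""
    return -2.0 * total_log_likelihood + n_params * math.log(n_observations)


# Per-parameter penalties: AIC charges 2, BIC charges ln(n).
aic_penalty_per_param = 2.0
crossover = next(n for n in range(1, 100) if math.log(n) > aic_penalty_per_param)
print(crossover)  # 8

# Independent copula: zero log-likelihood and no parameters, so the BIC is 0.
bic_independent = bic(0.0, n_params=0, n_observations=250)
print(bic_independent)  # 0.0
```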

References

[1] “Estimating the dimension of a model”, Schwarz, G. (1978).

cdf(X)[source]#

Compute the CDF of the bivariate Independent copula.

Parameters:

X (array-like of shape (n_observations, 2)): An array of bivariate inputs (u, v) where each row represents a bivariate observation. Both u and v must be in the interval [0, 1], having been transformed to uniform marginals.

Returns:

cdf (ndarray of shape (n_observations,)): CDF values for each observation in X.
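
The computation reduces to a row-wise product. A minimal plain-Python sketch follows; the function name `independent_copula_cdf` is hypothetical, and the range check is an assumption of this sketch rather than a description of how the library validates its input.

```python
def independent_copula_cdf(X):
    """Return C(u, v) = u * v for each row (u, v) of X."""
    values = []
    for u, v in X:
        # Assumption of this sketch: reject inputs outside the unit square.
        if not (0.0 <= u <= 1.0 and 0.0 <= v <= 1.0):
            raise ValueError("u and v must lie in [0, 1]")
        values.append(u * v)
    return values


X = [(0.5, 0.5), (0.25, 0.8), (1.0, 0.3)]
cdf_values = independent_copula_cdf(X)
print(cdf_values)
```

The last row illustrates the boundary property of any copula: with u = 1, the CDF equals v.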

fit(X, y=None)[source]#

Fit the Bivariate Independent Copula.

The independent copula has no parameters to estimate, so this method is provided only for compatibility with the estimator API.

Parameters:

X (array-like of shape (n_observations, 2)): An array of bivariate inputs (u, v) where each row represents a bivariate observation. Both u and v must be in the interval [0, 1], having been transformed to uniform marginals.
y (None): Ignored. Provided for compatibility with scikit-learn’s API.

Returns:

self (IndependentCopula): Returns the instance itself.

property fitted_repr#: String representation of the fitted copula.

get_metadata_routing()#

Get metadata routing of this object.

See the scikit-learn User Guide on metadata routing for how the routing mechanism works.

Returns:

routing (MetadataRequest): A MetadataRequest encapsulating routing information.

get_params(deep=True)#

Get parameters for this estimator.

Parameters:

deep (bool, default=True): If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns:

params (dict): Parameter names mapped to their values.

inverse_partial_derivative(X, first_margin=False)[source]#

Compute the inverse of the bivariate copula’s partial derivative, commonly known as the inverse h-function.

For the independent copula, the h-function with respect to the second margin is

\[h(u\mid v)= u,\]

and the derivative with respect to the first margin is

\[g(u,v)= v.\]

Their inverses are trivial:

Given \((p, v)\) with \(h(u \mid v) = p\), we have \(u = p\).

Given \((p, u)\) with \(g(u, v) = p\), we have \(v = p\).

Parameters:

X (array-like of shape (n_observations, 2)): An array of bivariate inputs (p, v), each in the interval [0, 1].
  - The first column p corresponds to the value of the h-function.
  - The second column v is the conditioning variable.
first_margin (bool, default=False): If True, compute the inverse partial derivative with respect to the first margin u; otherwise, compute the inverse partial derivative with respect to the second margin v.

Returns:

u (ndarray of shape (n_observations,)): A 1D-array of length n_observations, where each element is the computed \(u = h^{-1}(p \mid v)\) for the corresponding pair in X.
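
The default case (`first_margin=False`) can be sketched as a round trip: applying the h-function and then its inverse recovers u. The helper names below are hypothetical, the code is plain Python that does not call the library, and it assumes the column layout described above, with p in the first column.

```python
def h_second_margin(X):
    """h(u | v) = u for each row (u, v): derivative of u * v with respect to v."""
    return [u for u, _v in X]


def inverse_h_second_margin(X):
    """Given rows (p, v) with h(u | v) = p, return u = p."""
    return [p for p, _v in X]


X = [(0.2, 0.7), (0.9, 0.1)]
p_values = h_second_margin(X)

# Pair each p with its conditioning value v, then invert.
P = [(p, v) for p, (_u, v) in zip(p_values, X)]
recovered_u = inverse_h_second_margin(P)
print(recovered_u)  # [0.2, 0.9]
```

Because the conditioning variable carries no information under independence, it is ignored in both directions.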

property lower_tail_dependence#: Theoretical lower tail dependence coefficient.

property n_params#: Number of model parameters.

partial_derivative(X, first_margin=False)[source]#

Compute the h-function (partial derivative) for the bivariate Independent copula.

The h-function with respect to the second margin represents the conditional distribution function of \(u\) given \(v\):

\[\frac{\partial C(u,v)}{\partial v}=u.\]

Parameters:

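X (array-like of shape (n_samples, 2)): Array of pairs \((u,v)\), where each value is in the interval [0,1].
first_margin (bool, default=False): If True, compute the partial derivative with respect to the first margin u; otherwise, compute the partial derivative with respect to the second margin v.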

Returns:

np.ndarray: Array of h-function values for each observation in X.
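
As a sanity check on the formula, the sketch below compares the closed-form h-function with a central finite difference of \(C(u, v) = u \cdot v\). All function names are hypothetical, and the code is a plain-Python illustration rather than the library's implementation.

```python
def product_copula(u, v):
    """C(u, v) = u * v."""
    return u * v


def h_function(X, first_margin=False):
    """Closed form: dC/dv = u by default, dC/du = v if first_margin is True."""
    if first_margin:
        return [v for _u, v in X]
    return [u for u, _v in X]


def finite_difference_dv(u, v, eps=1e-6):
    """Central difference approximation of dC/dv at (u, v)."""
    return (product_copula(u, v + eps) - product_copula(u, v - eps)) / (2.0 * eps)


X = [(0.3, 0.6), (0.75, 0.2)]
h_values = h_function(X)
g_values = h_function(X, first_margin=True)
numeric = [finite_difference_dv(u, v) for u, v in X]

print(h_values)  # [0.3, 0.75]
print(g_values)  # [0.6, 0.2]
```

The numerical derivative in `numeric` agrees with `h_values` up to floating-point error.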

plot_pdf_2d(title=None)#

Plot a 2D contour of the estimated probability density function (PDF).

This method generates a grid over [0, 1]^2, computes the PDF, and displays a contour plot of the PDF. Contour levels are limited to the 97th percentile of the density values to avoid extreme densities.

Parameters:

title (str, optional): The title for the plot. If not provided, a default title based on the fitted copula’s representation is used.

Returns:

fig (go.Figure): A Plotly figure object containing the 2D contour plot of the PDF.

plot_pdf_3d(title=None)#

Plot a 3D surface of the estimated probability density function (PDF).

This method generates a grid over [0, 1]^2, computes the PDF, and displays a 3D surface plot of the PDF using Plotly.

Parameters:

title (str, optional): The title for the plot. If not provided, a default title based on the fitted copula’s representation is used.

Returns:

fig (go.Figure): A Plotly figure object containing a 3D surface plot of the PDF.

plot_tail_concentration(X=None, title=None)#

Plot the tail concentration function.

This method computes the tail concentration function at 100 evenly spaced quantile levels between 0.005 and 0.995. The plot displays the concentration values on the y-axis and the quantile levels on the x-axis.

The tail concentration is defined as:

Lower tail: λ_L(q) = P(U₂ ≤ q | U₁ ≤ q)
Upper tail: λ_U(q) = P(U₂ ≥ q | U₁ ≥ q)

where U₁ and U₂ are the pseudo-observations of the first and second variables, respectively.

Parameters:

X (array-like of shape (n_samples, 2), optional): If provided, it is used to plot the empirical tail concentration for comparison versus the model tail concentration.
title (str, optional): The title for the plot. If not provided, a default title based on the fitted copula’s representation is used.

Returns:

fig (go.Figure): A Plotly figure object containing the tail concentration curve.
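
The quantile grid and the two conditional probabilities defined above can be estimated from data with simple counting. The sketch below is one possible empirical estimator, not necessarily the one used by the library; the helper names are hypothetical, and the grid is assumed to include both endpoints.

```python
def quantile_grid(n_points=100, start=0.005, stop=0.995):
    """Evenly spaced quantile levels, endpoints included (assumption)."""
    step = (stop - start) / (n_points - 1)
    return [start + i * step for i in range(n_points)]


def empirical_lower_tail(X, q):
    """Estimate P(U2 <= q | U1 <= q) by counting rows of pseudo-observations."""
    in_tail = [(u1, u2) for u1, u2 in X if u1 <= q]
    if not in_tail:
        return float("nan")
    return sum(1 for _u1, u2 in in_tail if u2 <= q) / len(in_tail)


def empirical_upper_tail(X, q):
    """Estimate P(U2 >= q | U1 >= q) by counting rows of pseudo-observations."""
    in_tail = [(u1, u2) for u1, u2 in X if u1 >= q]
    if not in_tail:
        return float("nan")
    return sum(1 for _u1, u2 in in_tail if u2 >= q) / len(in_tail)


X = [(0.1, 0.2), (0.2, 0.9), (0.3, 0.1), (0.8, 0.7), (0.9, 0.95)]
grid = quantile_grid()
lower = empirical_lower_tail(X, 0.3)  # 2 of the 3 rows with U1 <= 0.3
upper = empirical_upper_tail(X, 0.7)  # 2 of the 2 rows with U1 >= 0.7
print(len(grid), lower, upper)
```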

References

[1] “Quantitative Risk Management: Concepts, Techniques, and Tools”, McNeil, Frey, Embrechts (2005)

sample(n_samples=1)#

Generate random samples from the bivariate copula using the inverse Rosenblatt transform.

Parameters:

n_samples (int, default=1): Number of samples to generate.

Returns:

X (array-like of shape (n_samples, 2)): An array of bivariate samples (u, v) where each row represents one draw. Both u and v are uniform on the interval [0, 1].
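
For the independent copula the inverse Rosenblatt transform is especially simple, because the inverse h-function is the identity in its first argument. The sketch below illustrates the idea in plain Python; the function name, the `seed` argument, and the choice of which margin is drawn first are assumptions of this sketch, not the library's implementation.

```python
import random


def sample_independent_copula(n_samples=1, seed=None):
    """Draw (u, v) pairs via the inverse Rosenblatt transform."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n_samples):
        w1 = rng.random()  # independent uniform draws
        w2 = rng.random()
        u = w1             # first margin is taken as is
        v = w2             # v = h^{-1}(w2 | u) = w2 under independence
        samples.append((u, v))
    return samples


samples = sample_independent_copula(n_samples=5, seed=42)
print(samples)
```

Using the same seed reproduces the same draws, which mirrors the purpose of the `random_state` parameter.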

score(X, y=None)#

Compute the total log-likelihood under the model.

Parameters:

X (array-like of shape (n_observations, n_features)): An array of data points for which the total log-likelihood is computed.
y (None): Ignored. Provided for compatibility with scikit-learn’s API.

Returns:

logprob (float): The total log-likelihood (sum of log-pdf values).

score_samples(X)[source]#

Compute the log-likelihood of each sample (log-pdf) under the model.

Parameters:

X (array-like of shape (n_samples, 2)): The input data where each row represents a bivariate observation. The data should be transformed to uniform marginals in [0, 1].

Returns:

density (ndarray of shape (n_samples,)): The log-likelihood of each sample under the fitted copula.
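
Since the density of the product copula \(C(u, v) = u \cdot v\) is constant and equal to 1, each log-pdf value is \(\log 1 = 0\). The plain-Python sketch below illustrates this and the corresponding total log-likelihood; the function names are hypothetical, and any input validation the library may perform is omitted.

```python
import math


def independent_log_pdf(X):
    """Return log c(u, v) = log(1) = 0 for each row of X."""
    density = 1.0  # mixed second derivative of u * v
    return [math.log(density) for _row in X]


def total_log_likelihood(X):
    """Sum of the per-sample log-pdf values."""
    return sum(independent_log_pdf(X))


X = [(0.1, 0.9), (0.5, 0.5), (0.8, 0.3)]
log_pdf = independent_log_pdf(X)
total = total_log_likelihood(X)
print(log_pdf)  # [0.0, 0.0, 0.0]
print(total)    # 0.0
```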

set_params(**params)#

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Parameters:

**params (dict): Estimator parameters.

Returns:

self (estimator instance): Estimator instance.

tail_concentration(quantiles)#

Compute the tail concentration function for a set of quantiles.

The tail concentration function is defined as follows:

For quantiles q ≤ 0.5:
C(q) = P(U ≤ q, V ≤ q) / q
For quantiles q > 0.5:
C(q) = (1 - 2q + P(U ≤ q, V ≤ q)) / (1 - q)

where U and V are the pseudo-observations of the first and second variables, respectively. This function returns the concentration values for each q provided.

Parameters:

quantiles (ndarray of shape (n_quantiles,)): A 1D array of quantile levels (values between 0 and 1) at which to compute the tail concentration.

Returns:

concentration (ndarray of shape (n_quantiles,)): The computed tail concentration values corresponding to each quantile.

Raises:

ValueError: If any value in quantiles is not in the interval [0, 1].
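
For the independent copula, \(P(U \le q, V \le q) = q^2\), so the definition above simplifies to \(q\) for \(q \le 0.5\) and to \((1 - q)^2 / (1 - q) = 1 - q\) for \(q > 0.5\). The sketch below evaluates the general formula in plain Python. The function name is hypothetical, and restricting the input to the open interval (0, 1) to avoid division by zero is an assumption of this sketch.

```python
def independent_tail_concentration(quantiles):
    """Evaluate the tail concentration formula with P(U <= q, V <= q) = q * q."""
    values = []
    for q in quantiles:
        # Assumption of this sketch: exclude the endpoints to avoid dividing by zero.
        if not 0.0 < q < 1.0:
            raise ValueError("quantiles must lie strictly between 0 and 1")
        joint = q * q
        if q <= 0.5:
            values.append(joint / q)
        else:
            values.append((1.0 - 2.0 * q + joint) / (1.0 - q))
    return values


concentration = independent_tail_concentration([0.1, 0.5, 0.9])
print(concentration)  # approximately [0.1, 0.5, 0.1]
```

The curve rises linearly to 0.5 at the median and falls back symmetrically, and it tends to zero at both ends.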

References

[1] “Quantitative Risk Management: Concepts, Techniques, and Tools”, McNeil, Frey, Embrechts (2005)

property upper_tail_dependence#: Theoretical upper tail dependence coefficient.