skfolio.linear_model.CSLinearRegressorWrapper

class skfolio.linear_model.CSLinearRegressorWrapper(regressor, n_jobs=1)

Cross-sectional regression based on a scikit-learn regressor.

This estimator wraps a scikit-learn regressor and fits one independent regression across assets for each observation. These independent observation-level regressions can be fitted in parallel by setting n_jobs. The wrapped regressor must define fit_intercept, implement fit, accept a sample_weight argument, and expose fitted coef_ and intercept_ attributes.

Missing-value handling is driven by cs_weights on each (observation, asset) pair:

  • If cs_weights > 0, all features in X and y must be finite.

  • If cs_weights == 0, the pair is excluded from estimation and X and y may be finite or missing.

  • Each observation must retain at least one valid asset after applying cs_weights.
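The three rules above can be sketched as a small NumPy check. This is an illustration only, not skfolio's implementation: `check_cs_inputs` is a hypothetical helper, and it assumes the weights themselves are finite and non-negative.

```python
import numpy as np

def check_cs_inputs(X, y, cs_weights):
    """Illustrative check of the cs_weights rules (hypothetical helper, not skfolio code)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    w = np.asarray(cs_weights, dtype=float)

    active = w > 0  # (observation, asset) pairs that take part in estimation
    finite = np.isfinite(X).all(axis=-1) & np.isfinite(y)

    # Pairs with positive weight need finite features and a finite target.
    if np.any(active & ~finite):
        raise ValueError("non-finite X or y on a pair with positive weight")

    # Every observation must keep at least one asset with positive weight.
    n_valid = active.sum(axis=1)
    if np.any(n_valid == 0):
        raise ValueError("an observation has no asset with positive weight")
    return active, n_valid

X = np.ones((2, 3, 2))
y = np.zeros((2, 3))
w = np.ones((2, 3))
X[0, 1, 0] = np.nan  # a missing feature ...
w[0, 1] = 0.0        # ... is acceptable because the pair has zero weight
active, n_valid = check_cs_inputs(X, y, w)
```

Here `n_valid` plays the role that the documented `n_valid_assets_` attribute describes: the per-observation count of assets with positive weight.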

Parameters:
regressor : BaseEstimator

Scikit-learn regressor used at each observation.

n_jobs : int, default=1

Number of parallel jobs used to fit the observation-level regressions.

Attributes:
coef_ : ndarray of shape (n_observations, n_features)

Estimated coefficients for each observation.

intercept_ : ndarray of shape (n_observations,)

Intercept for each observation. Set to zeros if fit_intercept=False.

n_features_in_ : int

Number of features seen during fit.

n_valid_assets_ : ndarray of shape (n_observations,)

Number of assets that participated in estimation (those with positive weight) for each observation.

Methods

fit(X, y[, cs_weights])

Fit one wrapped regressor per observation.

get_metadata_routing()

Get metadata routing of this object.

get_params([deep])

Get parameters for this estimator.

predict(X)

Predict using the cross-sectional linear model.

score(X, y[, cs_weights])

Return the mean coefficient of determination across observations.

set_fit_request(*[, cs_weights])

Configure whether metadata should be requested to be passed to the fit method.

set_params(**params)

Set the parameters of this estimator.

set_score_request(*[, cs_weights])

Configure whether metadata should be requested to be passed to the score method.

Examples

>>> import numpy as np
>>> from sklearn.linear_model import HuberRegressor
>>> from skfolio.linear_model import CSLinearRegressorWrapper
>>>
>>> rng = np.random.default_rng(42)
>>> X = rng.normal(size=(3, 5, 2))
>>> y = rng.normal(size=(3, 5))
>>> cs_weights = 1.0 + rng.random(size=(3, 5))
>>>
>>> model = CSLinearRegressorWrapper(
...     regressor=HuberRegressor(fit_intercept=True, max_iter=200)
... )
>>> model.fit(X, y, cs_weights=cs_weights)
CSLinearRegressorWrapper(...)
>>>
>>> model.intercept_.shape
(3,)
>>> model.coef_.shape
(3, 2)
>>> model.predict(X).shape
(3, 5)
>>> model.score(X, y)
0.4901...

fit(X, y, cs_weights=None)

Fit one wrapped regressor per observation.

Each observation must contain at least one asset with positive weight and finite X and y values.

Parameters:
X : array-like of shape (n_observations, n_assets, n_features)

Input feature tensor.

y : array-like of shape (n_observations, n_assets)

Target values.

cs_weights : array-like of shape (n_observations, n_assets), optional

Cross-sectional weights passed to the wrapped regressor as sample_weight. If None, all assets receive unit weight.

Returns:
self : CSLinearRegressorWrapper

Fitted estimator.
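To make the "one regression per observation" idea concrete, here is a minimal NumPy sketch. It swaps in plain weighted least squares (`np.linalg.lstsq`) for the wrapped scikit-learn regressor, runs sequentially rather than in parallel, and uses a hypothetical function name; it is not the library's code.

```python
import numpy as np

def fit_cross_sections(X, y, cs_weights=None, fit_intercept=True):
    """Fit one weighted least-squares regression per observation (illustration only)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n_obs, n_assets, n_features = X.shape
    if cs_weights is None:
        w = np.ones((n_obs, n_assets))  # unit weight for every asset
    else:
        w = np.asarray(cs_weights, dtype=float)

    coef = np.zeros((n_obs, n_features))
    intercept = np.zeros(n_obs)
    for t in range(n_obs):
        keep = w[t] > 0  # zero-weight assets are excluded from estimation
        A = X[t, keep]
        if fit_intercept:
            A = np.column_stack([A, np.ones(keep.sum())])
        sw = np.sqrt(w[t, keep])  # scaling rows by sqrt(w) turns WLS into OLS
        beta, *_ = np.linalg.lstsq(A * sw[:, None], y[t, keep] * sw, rcond=None)
        coef[t] = beta[:n_features]
        if fit_intercept:
            intercept[t] = beta[n_features]
    return coef, intercept

# Noise-free data, so the true parameters should be recovered.
rng = np.random.default_rng(0)
X = rng.normal(size=(3, 5, 2))
true_coef = rng.normal(size=(3, 2))
true_intercept = rng.normal(size=3)
y = np.einsum("taf,tf->ta", X, true_coef) + true_intercept[:, None]
w = 1.0 + rng.random(size=(3, 5))
w[0, 0] = 0.0                       # exclude one pair ...
X[0, 0], y[0, 0] = np.nan, np.nan   # ... whose data is missing
coef, intercept = fit_cross_sections(X, y, cs_weights=w)
```

The returned arrays have the same shapes as the documented `coef_` and `intercept_` attributes, `(n_observations, n_features)` and `(n_observations,)`.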

get_metadata_routing()

Get metadata routing of this object.

Please check the User Guide on how the routing mechanism works.

Returns:
routing : MetadataRequest

A MetadataRequest encapsulating routing information.

get_params(deep=True)

Get parameters for this estimator.

Parameters:
deep : bool, default=True

If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns:
params : dict

Parameter names mapped to their values.

predict(X)

Predict using the cross-sectional linear model.

For each observation \(t\) and asset \(i\), the prediction \(\hat{y}_{ti}\) is the systematic part of the model; realized outcomes satisfy \(y_{ti} = \hat{y}_{ti} + \epsilon_{ti}\), where \(\epsilon_{ti}\) is the residual. The prediction is

\[\hat{y}_{ti} = X_{ti}^{T} \beta_t + \beta_{t,0}\]

where \(\beta_t\) is the coefficient vector and \(\beta_{t,0}\) the intercept fitted for observation \(t\).

Parameters:
X : array-like of shape (n_observations, n_assets, n_features)

Feature tensor used for prediction. The observation and feature axes must match those seen during fit. The asset axis may differ.

Returns:
y_pred : ndarray of shape (n_observations, n_assets)

Predicted values.
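The prediction formula maps directly onto a batched dot product. The sketch below assumes `coef` and `intercept` arrays shaped like the documented `coef_` and `intercept_` attributes; the function name is hypothetical and the code is not taken from skfolio.

```python
import numpy as np

def predict_cross_sections(X, coef, intercept):
    """y_hat[t, i] = X[t, i, :] @ coef[t] + intercept[t] (illustration only)."""
    X = np.asarray(X, dtype=float)
    return np.einsum("taf,tf->ta", X, coef) + intercept[:, None]

coef = np.array([[1.0, 2.0], [0.5, -1.0]])  # (n_observations, n_features)
intercept = np.array([0.1, -0.2])           # (n_observations,)
X_new = np.ones((2, 4, 2))                  # the asset axis may differ from fit
y_pred = predict_cross_sections(X_new, coef, intercept)
```

Because each observation has its own coefficient row, only the observation and feature axes of `X_new` need to line up with the fitted arrays.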

score(X, y, cs_weights=None)

Return the mean coefficient of determination across observations.

The coefficient of determination \(R^2\) is computed independently for each observation and then averaged. For observation \(t\):

\[R^2_t = 1 - \frac{\sum_i w_{ti}(y_{ti} - \hat{y}_{ti})^2} {\sum_i w_{ti}(y_{ti} - \bar{y}_t)^2}\]

where \(\bar{y}_t\) is the weighted mean of \(y\) for observation \(t\).

Parameters:
X : array-like of shape (n_observations, n_assets, n_features)

Feature tensor on which to evaluate the model.

y : array-like of shape (n_observations, n_assets)

Target values aligned with X.

cs_weights : array-like of shape (n_observations, n_assets), optional

Asset weights for computing weighted \(R^2\) scores. If None, all assets are given equal weight. Pairs with zero weight are excluded from the score. Pairs with positive weight must have finite X and finite y.

Returns:
score : float

Mean \(R^2\) across all observations with finite values. Returns NaN if no observations have valid \(R^2\) values.
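The weighted \(R^2\) formula for this method can be written out as follows. This is a sketch under stated assumptions: it takes predictions as input instead of calling predict, assumes every observation has at least one positive weight, and treats an observation with zero weighted variance in y as having no valid \(R^2\) (an assumption about what "finite values" means here). The function name is hypothetical.

```python
import numpy as np

def mean_weighted_r2(y, y_pred, cs_weights=None):
    """Average per-observation weighted R^2 (illustration only)."""
    y = np.asarray(y, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    w = np.ones_like(y) if cs_weights is None else np.asarray(cs_weights, dtype=float)

    scores = []
    for t in range(y.shape[0]):
        keep = w[t] > 0  # zero-weight pairs are excluded from the score
        wt, yt, pt = w[t, keep], y[t, keep], y_pred[t, keep]
        y_bar = np.average(yt, weights=wt)      # weighted mean of y
        ss_res = np.sum(wt * (yt - pt) ** 2)    # numerator of the ratio
        ss_tot = np.sum(wt * (yt - y_bar) ** 2) # denominator of the ratio
        if ss_tot > 0:  # assumption: skip observations with undefined R^2
            scores.append(1.0 - ss_res / ss_tot)
    return float(np.mean(scores)) if scores else float("nan")

y = np.array([[1.0, 2.0, 3.0], [2.0, 2.0, 2.0]])
perfect = mean_weighted_r2(y, y.copy())              # only observation 0 contributes
baseline = mean_weighted_r2(y[:1], np.full((1, 3), 2.0))  # predicting the mean
```

Predicting the weighted mean of each observation yields a score of zero, which is the usual reference point for reading \(R^2\).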

set_fit_request(*, cs_weights='$UNCHANGED$')

Configure whether metadata should be requested to be passed to the fit method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config). Please check the User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to fit.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:
cs_weights : str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED

Metadata routing for cs_weights parameter in fit.

Returns:
self : object

The updated object.

set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Parameters:
**params : dict

Estimator parameters.

Returns:
self : estimator instance

Estimator instance.

set_score_request(*, cs_weights='$UNCHANGED$')

Configure whether metadata should be requested to be passed to the score method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config). Please check the User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to score.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:
cs_weights : str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED

Metadata routing for cs_weights parameter in score.

Returns:
self : object

The updated object.