skfolio.linear_model.CSLinearRegression
- class skfolio.linear_model.CSLinearRegression(fit_intercept=False)
Cross-sectional weighted least squares regression.
This estimator fits one weighted least squares regression per observation across the asset cross-section. The implementation is fully vectorized and is designed for panel data whose asset universe may vary over time.
The model solves the weighted least squares problem independently for each observation:
\[\beta_t = \arg\min_{\beta} \sum_{i=1}^{n_{\text{assets}}} w_{ti} (y_{ti} - X_{ti}^T \beta)^2\]where \(t\) denotes the observation, \(i\) denotes the asset, \(w_{ti}\) are the cross-section weights, and \(X_{ti}\) is the feature vector for asset \(i\) at observation \(t\).
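As a worked form of the objective above, assuming no intercept is fitted and \(X_t^T W_t X_t\) is invertible, the minimizer has the standard weighted least squares closed form. The symbols \(X_t\) (the matrix whose rows are \(X_{ti}^T\)), \(y_t\) (the vector of \(y_{ti}\)) and \(W_t\) are notation introduced here for illustration; this page does not state which solver the implementation uses.

\[\beta_t = (X_t^T W_t X_t)^{-1} X_t^T W_t y_t, \qquad W_t = \operatorname{diag}(w_{t1}, \dots, w_{t n_{\text{assets}}})\]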
The cross-sectional weights must be finite and non-negative. A pair with zero weight is excluded from estimation for that observation. This is the intended way to represent inactive pairs such as assets outside the estimation universe, not-yet-listed or delisted assets, or pairs with missing (NaN) data.
For each (observation, asset) pair:
- If cs_weights > 0, all features in X and y must be finite.
- If cs_weights == 0, the pair is excluded from estimation and X and y may be finite or missing (NaNs).
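The rules above can be sketched with plain NumPy. This is a minimal illustration of the per-observation weighted least squares problem without an intercept, not the estimator's actual code: the helper name `cs_wls` is hypothetical, and it loops over observations for readability whereas the estimator is described as fully vectorized.

```python
import numpy as np


def cs_wls(X, y, cs_weights):
    """Solve one weighted least squares problem per observation (no intercept)."""
    n_obs, _, n_features = X.shape
    coef = np.full((n_obs, n_features), np.nan)
    for t in range(n_obs):
        active = cs_weights[t] > 0          # zero-weight pairs are dropped
        sqrt_w = np.sqrt(cs_weights[t, active])
        # Scaling rows by sqrt(w) turns WLS into ordinary least squares.
        A = X[t, active] * sqrt_w[:, None]
        b = y[t, active] * sqrt_w
        coef[t], *_ = np.linalg.lstsq(A, b, rcond=None)
    return coef


rng = np.random.default_rng(0)
X = rng.normal(size=(3, 5, 2))
true_beta = np.array([[1.0, -2.0], [0.5, 0.0], [3.0, 1.0]])
y = np.einsum("tif,tf->ti", X, true_beta)   # noiseless targets

# Asset 4 is inactive at observation 0: NaNs are allowed because its weight is 0.
w = np.ones((3, 5))
w[0, 4] = 0.0
X[0, 4] = np.nan
y[0, 4] = np.nan

coef = cs_wls(X, y, w)                      # shape (3, 2)
```

Because the targets are noiseless, `coef` recovers `true_beta` even though one pair contains NaNs, which shows why a zero weight is a safe way to mark an inactive pair.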
- Parameters:
- fit_intercept : bool, default=False
Whether to calculate the intercept for each observation. If set to False, no intercept will be used in calculations.
- Attributes:
- coef_ : ndarray of shape (n_observations, n_features)
Estimated coefficients for each observation.
- intercept_ : ndarray of shape (n_observations,)
Intercept for each observation. Set to zeros if fit_intercept=False.
- n_features_in_ : int
Number of features seen during fit.
- n_valid_assets_ : ndarray of shape (n_observations,)
Number of assets that participated in estimation (those with positive weight) for each observation.
Methods
- fit(X, y[, cs_weights]): Fit the cross-sectional regression model.
- get_metadata_routing(): Get metadata routing of this object.
- get_params([deep]): Get parameters for this estimator.
- predict(X): Predict using the cross-sectional linear model.
- score(X, y[, cs_weights]): Return the mean coefficient of determination across observations.
- set_fit_request(*[, cs_weights]): Configure whether metadata should be requested to be passed to the fit method.
- set_params(**params): Set the parameters of this estimator.
- set_score_request(*[, cs_weights]): Configure whether metadata should be requested to be passed to the score method.
Examples
>>> import numpy as np
>>> from skfolio.linear_model import CSLinearRegression
>>>
>>> rng = np.random.RandomState(42)
>>> X = rng.randn(3, 5, 2)
>>> y = rng.randn(3, 5)
>>>
>>> model = CSLinearRegression()
>>> model.fit(X, y)
CSLinearRegression()
>>>
>>> model.intercept_.shape
(3,)
>>> model.coef_.shape
(3, 2)
>>> model.predict(X).shape
(3, 5)
>>> model.score(X, y)
0.6353...
- fit(X, y, cs_weights=None)
Fit the cross-sectional regression model.
Estimates regression coefficients independently for each observation by solving weighted least squares problems across assets.
- Parameters:
- X : array-like of shape (n_observations, n_assets, n_features)
Training data. 3D array where the first axis indexes observations, the second axis indexes assets, and the third axis indexes features.
- y : array-like of shape (n_observations, n_assets)
Target values.
- cs_weights : array-like of shape (n_observations, n_assets), optional
Cross-sectional weights for each (observation, asset) pair.
Must be finite and non-negative.
Pairs with zero weight are excluded from estimation.
If None, all pairs receive unit weight.
- Returns:
- self : CSLinearRegression
Fitted estimator.
Notes
Each (observation, asset) pair with positive cs_weights must have finite X and finite y. Pairs with zero weight are excluded from estimation and may contain missing values.
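One way to satisfy the constraint in the Notes is to derive the weights from the data itself. The sketch below uses only NumPy and made-up data; the universe changes it simulates are hypothetical, and equal unit weights are just one possible choice.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(4, 6, 3))   # (n_observations, n_assets, n_features)
y = rng.normal(size=(4, 6))

# Hypothetical universe changes: asset 5 enters at observation 2,
# and asset 0 has one missing feature at observation 3.
X[:2, 5] = np.nan
y[:2, 5] = np.nan
X[3, 0, 1] = np.nan

# A pair is usable only if every feature and the target are finite.
finite_pair = np.isfinite(X).all(axis=-1) & np.isfinite(y)

# Unit weight for usable pairs, zero weight for everything else.
cs_weights = finite_pair.astype(float)

# Sanity checks mirroring the documented constraints.
assert np.isfinite(cs_weights).all() and (cs_weights >= 0).all()
assert np.isfinite(X[cs_weights > 0]).all()
assert np.isfinite(y[cs_weights > 0]).all()

n_active = (cs_weights > 0).sum(axis=1)   # active assets per observation
```

An array built this way has the shape and value constraints that `fit` documents for `cs_weights`, and `n_active` counts the positive-weight assets per observation, the same quantity the `n_valid_assets_` attribute is described as reporting.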
- get_metadata_routing()
Get metadata routing of this object.
Please check the User Guide on how the routing mechanism works.
- Returns:
- routing : MetadataRequest
A MetadataRequest encapsulating routing information.
- get_params(deep=True)
Get parameters for this estimator.
- Parameters:
- deep : bool, default=True
If True, will return the parameters for this estimator and contained subobjects that are estimators.
- Returns:
- params : dict
Parameter names mapped to their values.
- predict(X)
Predict using the cross-sectional linear model.
For each observation \(t\) and asset \(i\), the prediction is the systematic part of the model; realized outcomes satisfy \(y_{ti} = \hat{y}_{ti} + \epsilon_{ti}\) with residual \(\epsilon_{ti}\). The prediction is
\[\hat{y}_{ti} = X_{ti}^{T} \beta_t + \beta_{t,0}\]
where \(\beta_{t,0}\) is the intercept for observation \(t\) (zero if fit_intercept=False).
- Parameters:
- X : array-like of shape (n_observations, n_assets, n_features)
Feature tensor used for prediction. The observation and feature axes must match those seen during fit. The asset axis may differ.
- Returns:
- y_pred : ndarray of shape (n_observations, n_assets)
Predicted values.
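The prediction formula can be written as a single contraction over the feature axis. The following is a NumPy sketch only: `cs_predict` is a hypothetical helper, and the coefficients are random stand-ins for fitted `coef_` and `intercept_` values.

```python
import numpy as np


def cs_predict(X, coef, intercept):
    """Return X[t, i] @ coef[t] + intercept[t] for every (t, i) pair."""
    return np.einsum("tif,tf->ti", X, coef) + intercept[:, None]


rng = np.random.default_rng(2)
coef = rng.normal(size=(3, 2))       # one coefficient vector per observation
intercept = np.zeros(3)              # zeros, as when fit_intercept=False

X_new = rng.normal(size=(3, 7, 2))   # 7 assets: the asset axis is free to differ
y_pred = cs_predict(X_new, coef, intercept)
```

Since each coefficient vector belongs to an observation rather than to an asset, nothing in the contraction depends on how many assets are supplied, which is consistent with the asset axis being allowed to differ from the one seen during `fit`.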
- score(X, y, cs_weights=None)
Return the mean coefficient of determination across observations.
The coefficient of determination \(R^2\) is computed independently for each observation and then averaged. For observation \(t\):
\[R^2_t = 1 - \frac{\sum_i w_{ti}(y_{ti} - \hat{y}_{ti})^2} {\sum_i w_{ti}(y_{ti} - \bar{y}_t)^2}\]
where \(\bar{y}_t\) is the weighted mean of \(y\) for observation \(t\).
- Parameters:
- X : array-like of shape (n_observations, n_assets, n_features)
Feature tensor on which to evaluate the model.
- y : array-like of shape (n_observations, n_assets)
Target values aligned with X.
- cs_weights : array-like of shape (n_observations, n_assets), optional
Asset weights for computing weighted \(R^2\) scores. If None, all assets are given equal weight. Pairs with zero weight are excluded from the score. Pairs with positive weight must have finite X and finite y.
- Returns:
- score : float
Mean \(R^2\) across all observations with finite values. Returns NaN if no observations have valid \(R^2\) values.
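The score definition can be reproduced by hand as follows. This is a NumPy sketch under stated assumptions: `cs_r2` is a hypothetical helper that takes predictions directly, and treating every non-finite per-observation \(R^2\) as invalid is an assumption about edge cases that this page does not spell out.

```python
import numpy as np


def cs_r2(y, y_pred, cs_weights):
    """Mean weighted R^2 across observations, skipping undefined ones."""
    w = np.where(cs_weights > 0, cs_weights, 0.0)
    # Zero-weight pairs may hold NaNs, so neutralise them before summing.
    y0 = np.where(w > 0, y, 0.0)
    p0 = np.where(w > 0, y_pred, 0.0)
    w_sum = w.sum(axis=1)
    with np.errstate(divide="ignore", invalid="ignore"):
        y_bar = (w * y0).sum(axis=1) / w_sum            # weighted mean of y
        ss_res = (w * (y0 - p0) ** 2).sum(axis=1)
        ss_tot = (w * (y0 - y_bar[:, None]) ** 2).sum(axis=1)
        r2 = 1.0 - ss_res / ss_tot
    r2 = r2[np.isfinite(r2)]                            # keep valid observations
    return float(r2.mean()) if r2.size else float("nan")


y = np.array([[1.0, 2.0, 3.0, np.nan], [0.0, 2.0, 4.0, 6.0]])
y_pred = np.array([[1.0, 2.0, 3.0, 0.0], [1.0, 2.0, 4.0, 6.0]])
w = np.array([[1.0, 1.0, 1.0, 0.0], [1.0, 1.0, 1.0, 1.0]])

mean_r2 = cs_r2(y, y_pred, w)
all_excluded = cs_r2(y, y_pred, np.zeros_like(w))
```

In this toy input the first observation is fitted perfectly and the second has a residual sum of squares of 1 against a total of 20, so the two per-observation values are 1 and 0.95 before averaging. With all weights set to zero no observation has a valid \(R^2\), and the helper returns NaN.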
- set_fit_request(*, cs_weights='$UNCHANGED$')
Configure whether metadata should be requested to be passed to the fit method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
- True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
- False: metadata is not requested and the meta-estimator will not pass it to fit.
- None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
- str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- Parameters:
- cs_weights : str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED
Metadata routing for the cs_weights parameter in fit.
- Returns:
- self : object
The updated object.
- set_params(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it's possible to update each component of a nested object.
- Parameters:
- **params : dict
Estimator parameters.
- Returns:
- self : estimator instance
Estimator instance.
- set_score_request(*, cs_weights='$UNCHANGED$')
Configure whether metadata should be requested to be passed to the score method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
- True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
- False: metadata is not requested and the meta-estimator will not pass it to score.
- None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
- str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- Parameters:
- cs_weights : str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED
Metadata routing for the cs_weights parameter in score.
- Returns:
- self : object
The updated object.