BaseEstimator

A class called BaseEstimator is at the core of most of XCast’s functionality. It takes two-dimensional estimator classes, as in those of scikit-learn and other python data science packages, and extends them to be able to accommodate four-dimensional gridded climate data, as in xarray.DataArrays. It broadcasts the methods of the underlying two dimensional estimator across latitude and longitude / gridpoints, working with a separate instance of the estimator at each gridpoint efficiently in parallel. It gives you access to a subset of their methods - namely, .fit, .predict, .transform, and .predict_proba. (or a subset thereof, if your estimator is more of a transformer than an estimator, like in those in sklearn.decomposition).

BaseEstimator and its subclasses also accomodates gridpoint-wise hyperparameter tuning, using a home-brewed stochastic depth-first search tuning algorithm. Before fitting an estimator, use the .tune method to generate a set of optimized parameters, then pass that to your estimator during instantiation. Hyperparameter tuning is VERY slow when you have to broadcast it across gridpoints- so don’t expect results any time soon! For examples, see the XCast github repository.

XCast obviously doesn’t implement every possible estimator in the world - in fact the options are limited to only a small subset of methods so we can better support them. However, it is pretty easy to implement new XCast-style estimators by subclassing BaseEstimator:

import xcast as xc 
from sklearn.linear_model import ElasticNet 

class XCElasticNet(xc.BaseEstimator):
    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        self.model_type = ElasticNet

You’ll then be able to use XCElasticNet.fit, and the other sklearn.linear_model.ElasticNet class methods (out of .fit, .predict, .transform, and .predict_proba) as XCast-style functions. See all of the things implemented with BaseEstimator in the table below:

2D Estimator XCast-Style Estimator Deterministic Forecasts? Tercile Probability Forecasts? Non-Exceedance Forecasts?
xcast.linear_regression xcast.MLR YES YES YES
xcast.multivariate_extended_logistic_regression xcast.MELR NO YES YES
xcast.extended_logistic_regression xcast.ELR NO YES YES
xcast.extreme_learning_machine xcast.ELM YES YES NO
xcast.epoelm xcast.EPOELM YES YES YES
xcast.quantile_regression_forest xcast.QRF YES YES YES

For the sake of completeness, here is a list of all of the other currently implemented XCast estimators (as of v1.0.0) that are not subclasses of BaseEstimator

Method XCast-Style Estimator Deterministic Forecasts? Tercile Probability Forecasts? Non-Exceedance Forecasts?
Ensemble Mean / Member Count xcast.Ensemble YES YES NO
Anomaly Correlation / Probability Anomaly Correlation xcast.ACPAC YES YES NO
Canonical Correlation Analysis xcast.CCA YES YES YES
Principal Components Regression xcast.PCR YES YES YES

NOTE: In estimators where non-exceedance forecasts are available, pass the quantile keyword to indicate the percentile-based non-exceedance threshold you want the non-exceedance probability of.