A High-Performance Data Science Toolkit for the Earth Sciences
Kyle Hall & Nachiketa Acharya
XCast is a High-Performance Data Science toolkit for the Earth Sciences. It performs gridpoint-wise statistical and machine learning analyses efficiently using Dask parallelism. Its API closely mirrors that of Scikit-Learn, except that XCast produces and consumes Xarray DataArrays rather than two-dimensional NumPy arrays.
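To make "gridpoint-wise" concrete, here is a minimal sketch that fits one independent linear regression per grid cell along the time axis. It deliberately uses plain NumPy arrays and `np.polyfit` rather than XCast itself, so the function name `fit_gridpoint_wise`, the array shapes, and the synthetic data are all illustrative assumptions and not part of XCast's API.

```python
import numpy as np


def fit_gridpoint_wise(x, y):
    """Fit y = slope * x + intercept separately at every grid point.

    x, y: arrays of shape (time, lat, lon).
    Returns (slope, intercept), each of shape (lat, lon).
    """
    _, n_lat, n_lon = x.shape
    slope = np.empty((n_lat, n_lon))
    intercept = np.empty((n_lat, n_lon))
    for i in range(n_lat):
        for j in range(n_lon):
            # Each grid cell gets its own model, trained only on
            # that cell's time series.
            slope[i, j], intercept[i, j] = np.polyfit(x[:, i, j], y[:, i, j], 1)
    return slope, intercept


# Synthetic, noise-free data (hypothetical): 30 time steps on a 4 x 5 grid.
rng = np.random.default_rng(0)
x = rng.normal(size=(30, 4, 5))
true_slope = rng.uniform(0.5, 2.0, size=(4, 5))
y = true_slope * x + 1.0

slope, intercept = fit_gridpoint_wise(x, y)
```

Because every grid cell is fitted independently, the inner loop body is exactly the kind of work that can be split into chunks and run in parallel.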
Our goal is to lower the barriers to entry to Earth Science (and, specifically, climate forecasting) by bridging the gap between Python’s Gridded Data utilities (Xarray, NetCDF4, etc.) and its Data Science utilities (Scikit-Learn, SciPy, OpenCV), which are normally incompatible. Through XCast, you can use all your favorite estimators, skill metrics, etc. with NetCDF, GRIB2, Zarr, and other types of gridded data.
XCast also lets you scale your gridpoint-wise Earth Science machine learning approaches to institutional supercomputers and computer clusters with ease. Its compatibility with Dask-Distributed’s client schedulers makes scaling straightforward.
THIS PAGE HAS BEEN MOVED TO XCAST’S NEW HOME, XCAST-LIB.GITHUB.IO!