catboost v0.24.1 release notes (2020-08-27)

Release Date: 2020-08-27

Uncertainty prediction

🚀 The main feature of this release is support for total uncertainty prediction via virtual ensembles.
🖨 The theoretical background is described in the preprint "Uncertainty in Gradient Boosting via Ensembles" from our research team.
We introduced a new training parameter, posterior_sampling, which makes it possible to estimate total uncertainty.
Setting posterior_sampling=True enables Langevin boosting and sets model_shrink_rate to 1/(2*N) and diffusion_temperature to N, where N is the dataset size.
The CatBoost object method virtual_ensembles_predict splits the model into virtual_ensembles_count submodels.
Calling model.virtual_ensembles_predict(.., prediction_type='TotalUncertainty') returns the mean prediction and variance (plus knowledge uncertainty for models trained with the RMSEWithUncertainty loss function).
Calling model.virtual_ensembles_predict(.., prediction_type='VirtEnsembles') returns, for each object, the predictions of all virtual_ensembles_count virtual submodels.

🆕 New functionality
- 👌 Supported non-owning model deserialization for models with categorical feature counters

Speedups
- 📜 We've made many speedups to sparse data loading. For example, preprocessing of the Bosch sparse dataset became 4.5x faster when running with 28 threads.

🛠 Bugfixes:
- 🛠 Fixed target check for PairLogitPairwise on GPU. Issue #1217
- 🔋 Supported n_features_in_ attribute required for using CatBoost in sklearn pipelines. Issue #1363
