catboost v0.21 Release Notes

Release Date: 2020-01-31

    New features:

    • The main feature of this release is the Stochastic Gradient Langevin Boosting (SGLB) mode, which can improve the quality of models trained with non-convex loss functions. To use it, specify the langevin option and tune diffusion_temperature and model_shrink_rate. See the corresponding paper for details.
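
    As a rough illustration of the options named above, the following Python sketch gathers them into one place. The helper names sglb_params and train_sglb are hypothetical, and the numeric values are placeholders rather than tuned recommendations; only langevin, diffusion_temperature and model_shrink_rate come from these notes.

```python
def sglb_params(diffusion_temperature=10000.0, model_shrink_rate=0.1):
    """Collect the SGLB options from this release into one dict.

    The default values here are placeholders for illustration only.
    """
    return {
        "langevin": True,  # switches on the SGLB mode
        "diffusion_temperature": diffusion_temperature,
        "model_shrink_rate": model_shrink_rate,
    }


def train_sglb(X, y, **sglb_options):
    """Fit a classifier in SGLB mode (hypothetical helper).

    catboost is imported lazily so that sglb_params can be used
    without the package installed; training needs catboost >= 0.21.
    """
    from catboost import CatBoostClassifier

    model = CatBoostClassifier(verbose=False, **sglb_params(**sglb_options))
    model.fit(X, y)
    return model
```

    A call such as train_sglb(X, y, diffusion_temperature=5000.0) would then train with a different temperature; both tunable options are best chosen by validation on your own data.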

    Improvements:

    • Automatic learning rate is now applied by default not only for the Logloss objective, but also for RMSE (on CPU and GPU) and MultiClass (on GPU).
    • Class label type information is now stored in the model. Estimators in the Python package now return values of the proper type in the classes_ attribute and from prediction functions with prediction_type=Class. #305, #999, #1017.
      Note: Class labels loaded from datasets in CatBoost dsv format always have string type now.
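
    Because labels loaded from dsv files now come back as strings, code that compares predictions against numeric ground truth may need an explicit cast. The sketch below shows one way to do that; cast_class_labels is a hypothetical helper, not part of the CatBoost API.

```python
def cast_class_labels(labels, caster=int):
    """Convert class labels (e.g. strings from a dsv-loaded dataset)
    to another type, preserving their order."""
    return [caster(label) for label in labels]


# Example with plain string labels, as they would appear after
# training on a dataset loaded from CatBoost dsv format.
string_labels = ["0", "1", "2"]
numeric_labels = cast_class_labels(string_labels)
```

    The same helper could be applied to a fitted estimator's classes_ attribute or to flattened Class predictions, assuming every label is parseable by the chosen caster.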

    Bug fixes:

    • Fixed huge memory consumption for text features. #1107
    • Fixed crash on GPU on big datasets with groups (hundred million+ groups).
    • Fixed class labels consistency check and merging in model sums (now class names in binary classification are properly checked and added to the result as well).
    • Fix for confusion matrix (PR #1152), thanks to @dmsivkov.
    • Fixed shap values calculation when boost_from_average=True. #1125
    • Fixed use-after-free in fstr PredictionValuesChange with specified dataset.
    • Target border and class weights are now taken from model when necessary for feature strength, metrics evaluation, roc_curve, object importances and calc_feature_statistics calculations.
    • Fixed a bug where L2 regularization was not applied to non-symmetric trees for binary classification on GPU.
    • [R-package] Fixed a bug where catboost.get_feature_importance did not work after a model was loaded. #1064
    • [R-package] Fixed a bug where catboost.train did not work when called with only the dataset parameter. #1162
    • Fixed L2 score calculation on CPU.


    • Starting from this release, the Java applier is released simultaneously with the other components and has the same version.


    • Models trained with this release require the applier from this release or later to work correctly.