catboost v0.20 Release Notes

Release Date: 2019-11-28
  • New submodule for text processing!
    It contains two classes to help you make text features ready for training:

    • Tokenizer -- use this class to split text into tokens (with automatic lowercasing and punctuation removal)
    • Dictionary -- use this class to create a dictionary that maps tokens to numeric identifiers. You can then use these identifiers as new features.
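
    The two steps described above (tokenize, then map tokens to numeric identifiers) can be illustrated with a minimal pure-Python sketch. This is not CatBoost's implementation or API: the helper names tokenize, build_dictionary and apply_dictionary are hypothetical, and the choices made here (ASCII punctuation only, ids ordered by frequency, unknown tokens dropped) are assumptions for illustration.

```python
import string
from collections import Counter


def tokenize(text):
    """Lowercase the text, strip ASCII punctuation, and split on whitespace."""
    cleaned = text.lower().translate(str.maketrans("", "", string.punctuation))
    return cleaned.split()


def build_dictionary(tokenized_texts):
    """Assign each distinct token an integer id, most frequent tokens first."""
    counts = Counter(token for tokens in tokenized_texts for token in tokens)
    # Sort by descending count, then alphabetically, so ids are deterministic.
    ordered = sorted(counts, key=lambda tok: (-counts[tok], tok))
    return {token: idx for idx, token in enumerate(ordered)}


def apply_dictionary(dictionary, tokens):
    """Replace known tokens with their ids; unknown tokens are dropped (an assumption)."""
    return [dictionary[tok] for tok in tokens if tok in dictionary]


texts = ["Cats are so cute :)", "Dogs are cute, too!"]
tokenized = [tokenize(t) for t in texts]
token_ids = build_dictionary(tokenized)
encoded = [apply_dictionary(token_ids, tokens) for tokens in tokenized]
```

    Here encoded holds one list of integer identifiers per input text, which is the kind of numeric representation that can then be turned into features. For real training data, the Tokenizer and Dictionary classes from the new submodule should be used instead of these stand-ins.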

    New features:

    • Enabled boost_from_average for MAPE loss function

    Bug fixes:

    • Fixed Pool creation from pandas.DataFrame with discontinuous columns, #1079
    • Fixed standalone_evaluator, PR #1083

    Speedups:

    • Huge speedup of preprocessing in the Python package for datasets with many samples (>10 million)

    We have also released precompiled packages for Python 3.8