RocksDB v6.24.0 Release Notes

Release Date: 2021-08-20
  • Bug Fixes

    • If the primary's CURRENT file is missing or inaccessible, the secondary instance no longer hangs while repeatedly trying to switch to a new MANIFEST. It instead returns the error code encountered while accessing the file.
    • Restoring backups with BackupEngine is now a logically atomic operation, so that if a restore operation is interrupted, DB::Open on the incomplete restore will fail. Using BackupEngineOptions::sync (default) ensures atomicity even in case of power loss or OS crash.
    • Fixed a race related to the destruction of ColumnFamilyData objects. The earlier logic unlocked the DB mutex before destroying the thread-local SuperVersion pointers, which could result in a process crash if another thread managed to get a reference to the ColumnFamilyData object.
    • Removed a call to RenameFile() on a non-existent info log file ("LOG") when opening a new DB. Such a call was guaranteed to fail, though it did not affect applications because the error was swallowed. Errors from renaming the "LOG" file are no longer swallowed.
    • Fixed an issue where OnFlushCompleted was not called for atomic flush.
    • Fixed a bug affecting the batched MultiGet API when used with keys spanning multiple column families and sorted_input == false.
    • Fixed a potential incorrect result in opt mode and assertion failures caused by releasing snapshot(s) during compaction.
    • Fixed the passing of BlobFileCompletionCallback to the compaction job and the atomic flush job, which previously received the default parameter value (nullptr). BlobFileCompletionCallback is an internal callback that manages the addition of blob files to SstFileManager.
    • Fixed MultiGet not updating the block_read_count and block_read_byte PerfContext counters.

    New Features

    • Made the EventListener extend the Customizable class.
    • EventListeners that have a non-empty Name() and that are registered with the ObjectRegistry can now be serialized to/from the OPTIONS file.
    • Insert warm blocks (data blocks, uncompressed dict blocks, index and filter blocks) into the block cache during flush under option BlockBasedTableOptions.prepopulate_block_cache. Previously this was enabled only for data blocks.
    • BlockBasedTableOptions.prepopulate_block_cache can be dynamically configured using DB::SetOptions.
    • Add CompactionOptionsFIFO.age_for_warm, which allows RocksDB to move old files to warm tier in FIFO compactions. Note that file temperature is still an experimental feature.
    • Add a comment suggesting that btrfs users disable file preallocation by setting options.allow_fallocate=false.
    • The fast forward option in trace replay changed to double type to allow replaying at a lower speed, by setting the value between 0 and 1. This option can be set via ReplayOptions in Replayer::Replay(), or via --trace_replay_fast_forward in db_bench.
    • Add property LiveSstFilesSizeAtTemperature to retrieve SST file size at different temperatures.
    • Added a stat rocksdb.secondary.cache.hits.
    • Added a PerfContext counter secondary_cache_hit_count.
    • The integrated BlobDB implementation now supports the tickers BLOB_DB_BLOB_FILE_BYTES_READ, BLOB_DB_GC_NUM_KEYS_RELOCATED, and BLOB_DB_GC_BYTES_RELOCATED, as well as the histograms BLOB_DB_COMPRESSION_MICROS and BLOB_DB_DECOMPRESSION_MICROS.
    • Added hybrid configuration of Ribbon filter and Bloom filter where some LSM levels use Ribbon for memory space efficiency and some use Bloom for speed. See NewRibbonFilterPolicy. This also changes the default behavior of NewRibbonFilterPolicy to use Bloom for flushes under Leveled and Universal compaction and Ribbon otherwise. The C API function rocksdb_filterpolicy_create_ribbon is unchanged, and a new function rocksdb_filterpolicy_create_ribbon_hybrid has been added.
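
The fast forward change above comes down to simple arithmetic. The sketch below is a standalone model, not RocksDB's replayer code: it assumes the recorded gap between traced operations is divided by the fast forward factor, so a value between 0 and 1 stretches the gap and slows the replay. The helper name ScaledReplayDelay is hypothetical.

```cpp
#include <cassert>
#include <chrono>
#include <cmath>

// Hypothetical helper (not part of RocksDB): scales a recorded gap between
// two traced operations by a fast forward factor.
//   factor > 1      -> shorter gap, faster replay
//   0 < factor < 1  -> longer gap, slower replay
std::chrono::microseconds ScaledReplayDelay(std::chrono::microseconds recorded,
                                            double fast_forward) {
  // A zero or negative factor has no meaning in this model.
  assert(fast_forward > 0.0);
  const double scaled = static_cast<double>(recorded.count()) / fast_forward;
  // Round to the nearest whole microsecond.
  return std::chrono::microseconds(std::llround(scaled));
}
```

Under this assumption, a 1000-microsecond gap becomes 500 microseconds at a factor of 2.0 and 2000 microseconds at a factor of 0.5, which an integer-typed option could not express.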

    Public API change

    • Added APIs to decode and replay trace files via the Replayer class. Added DB::NewDefaultReplayer() to create a default Replayer instance. Added TraceReader::Reset() to restart reading a trace file. Created trace_record.h, trace_record_result.h and utilities/replayer.h files to access the decoded Trace records, replay them, and query the actual operation results.
    • Added Configurable::GetOptionsMap to the public API for use in creating new Customizable classes.
    • Generalized bits_per_key parameters in C API from int to double for greater configurability. Although this is a compatible change for existing C source code, anything depending on C API signatures, such as foreign function interfaces, will need to be updated.
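
The source-compatibility point about bits_per_key can be illustrated with a small sketch. The function below is a hypothetical stand-in, not a RocksDB C API function: it only shows that a caller passing an integer still compiles once the parameter is a double, because the argument is converted implicitly. A foreign function binding that still declares the parameter as an integer gets no such conversion, which is why those bindings need updating.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a C API function whose bits_per_key parameter
// was widened from int to double. It returns the total bit budget for
// num_keys keys, purely for illustration.
extern "C" double sketch_filter_total_bits(double bits_per_key,
                                           std::size_t num_keys) {
  assert(bits_per_key >= 0.0);
  return bits_per_key * static_cast<double>(num_keys);
}
```

Calling sketch_filter_total_bits(10, 1000) compiles unchanged, and fractional settings such as 9.5 become expressible.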

    Performance Improvements

    • Try to avoid updating DBOptions if SetDBOptions() does not change any option value.

    Behavior Changes

    • StringAppendOperator additionally accepts a string as the delimiter.
    • BackupEngineOptions::sync (default true) now applies to restoring backups in addition to creating backups. This could slow down restores, but ensures they are fully persisted before returning OK. (Consider increasing max_background_operations to improve performance.)
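
The effect of a string delimiter in StringAppendOperator can be modeled without RocksDB. The sketch below is not the operator's implementation: it assumes append-merge semantics in which the first operand becomes the value and each later operand is appended after the delimiter. The function name JoinOperands is hypothetical.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical model (not RocksDB code): folds merge operands into one
// value, placing the delimiter between consecutive operands. The delimiter
// may be any string, not only a single character.
std::string JoinOperands(const std::vector<std::string>& operands,
                         const std::string& delimiter) {
  std::string result;
  for (std::size_t i = 0; i < operands.size(); ++i) {
    if (i > 0) {
      result += delimiter;  // no delimiter before the first operand
    }
    result += operands[i];
  }
  return result;
}
```

In this model, merging "a", "b" and "c" with the delimiter "::" yields "a::b::c", while a single operand is stored as-is.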