RocksDB v5.9.0 Release Notes

Release Date: 2017-11-01
  • Public API Change

    • BackupableDBOptions::max_valid_backups_to_open == 0 now means no backups will be opened during BackupEngine initialization. Previously, a value of 0 disabled the limit on the number of backups opened.
    • DBOptions::preserve_deletes is a new option that tells the DB not to drop tombstones for regular deletes whose sequence number is larger than the one set by the new API call DB::SetPreserveDeletesSequenceNumber(SequenceNumber seqnum). Disabled by default.
    • API call DB::SetPreserveDeletesSequenceNumber(SequenceNumber seqnum) was added. Users who wish to preserve deletes are expected to call this function periodically to advance the cutoff seqnum (all deletes made before this seqnum can be dropped by the DB). It is the user's responsibility to advance the seqnum so that tombstones are kept for the desired period of time, yet are eventually processed in time and don't consume too much space.
    • ReadOptions::iter_start_seqnum was added. If set to a value > 0, the user will see two changes in iterator behavior: 1) only keys written with a sequence number larger than this parameter are returned, and 2) the Slice returned by iter->key() now points to memory that holds a user-oriented representation of the internal key, rather than the user key. A new struct FullKey was added to represent internal keys, along with a new helper function ParseFullKey(const Slice& internal_key, FullKey* result).
    • Deprecate the trash_dir param in NewSstFileManager. Deleted files are now renamed to <name>.trash instead of being moved to a trash directory.
    • Allow setting a custom trash/DB size ratio limit in the SstFileManager, after which files that are to be scheduled for deletion are deleted immediately, regardless of any delete rate limit.
    • Return an error on write if write_options.sync = true and write_options.disableWAL = true to warn the user of inconsistent options. Previously, such a write skipped the WAL and the sync option was not respected.
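
The retention rule described for preserve_deletes can be sketched with a small stdlib-only model. This is not RocksDB code: PreserveDeletesModel, Tombstone, AdvanceCutoff, MayDrop, and CountRetained are hypothetical names introduced for illustration. The sketch assumes that a tombstone whose sequence number equals the cutoff is still preserved and that the cutoff only moves forward, and it ignores every other condition (snapshots, levels) that affects when a tombstone is really dropped.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical model types; these are NOT RocksDB classes.
using SequenceNumber = std::uint64_t;

struct Tombstone {
  SequenceNumber seq;  // sequence number assigned when the delete was written
};

// Toy stand-in for a DB with preserve_deletes and a user-advanced cutoff.
struct PreserveDeletesModel {
  bool preserve_deletes = false;  // mirrors "disabled by default"
  SequenceNumber cutoff = 0;      // last value passed to AdvanceCutoff()

  // Assumption of this model: the cutoff may only move forward.
  // Returns false (and changes nothing) if seqnum is not larger.
  bool AdvanceCutoff(SequenceNumber seqnum) {
    if (seqnum <= cutoff) return false;
    cutoff = seqnum;
    return true;
  }

  // True if the cutoff rule alone would allow dropping the tombstone.
  // Deletes made before the cutoff may be dropped; the rest are kept.
  bool MayDrop(const Tombstone& t) const {
    if (!preserve_deletes) return true;  // option off: no extra retention
    return t.seq < cutoff;
  }
};

// Counts the tombstones the model would keep under the current cutoff.
std::size_t CountRetained(const PreserveDeletesModel& model,
                          const std::vector<Tombstone>& tombstones) {
  std::size_t retained = 0;
  for (const Tombstone& t : tombstones) {
    if (!model.MayDrop(t)) ++retained;
  }
  return retained;
}
```

In this model, a caller that advances the cutoff to 100 allows a tombstone written at sequence 99 to be dropped while tombstones at 100 and above are kept. Advancing the cutoff too slowly lets retained tombstones accumulate, which is the space trade-off the note above warns about.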

  • New Features

    • CRC32C now uses the 3-way pipelined SSE algorithm crc32c_3way on supported platforms to improve performance. The system chooses this algorithm automatically whenever possible. If PCLMULQDQ is not supported, it falls back to the old Fast_CRC32 algorithm.
    • DBOptions::writable_file_max_buffer_size can now be changed dynamically.
    • DBOptions::bytes_per_sync, DBOptions::compaction_readahead_size, and DBOptions::wal_bytes_per_sync can now be changed dynamically. Changing DBOptions::wal_bytes_per_sync will flush all memtables and switch to a new WAL file.
    • Support dynamic adjustment of rate limit according to demand for background I/O. It can be enabled by passing true to the auto_tuned parameter in NewGenericRateLimiter(). The value passed as rate_bytes_per_sec will still be respected as an upper bound.
    • Support dynamically changing ColumnFamilyOptions::compaction_options_fifo.
    • Introduce EventListener::OnStallConditionsChanged() callback. Users can implement it to be notified when user writes are stalled, stopped, or resumed.
    • Add a new DB property "rocksdb.estimate-oldest-key-time" that returns the oldest data timestamp. The property is available only for FIFO compaction with compaction_options_fifo.allow_compaction = false.
    • Upon snapshot release, recompact bottommost files containing deleted/overwritten keys that previously could not be dropped due to the snapshot. This alleviates space-amp caused by long-held snapshots.
    • Support lower bound on iterators specified via ReadOptions::iterate_lower_bound.
    • Support for differential snapshots (via an iterator emitting the sequence of key-values representing the difference between DB state at two different sequence numbers). Preserving and emitting puts and regular deletes is supported; SingleDeletes, MergeOperator, Blobs, and Range Deletes are not.
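
The idea behind differential snapshots and ReadOptions::iter_start_seqnum can be illustrated with a stdlib-only model. This is a sketch under stated assumptions, not the RocksDB iterator: Entry, EntryType, and DiffBetween are hypothetical names, the upper bound end_seqnum stands in for the state the iterator reads at, and the model emits every matching entry without deciding whether multiple versions of one key would be collapsed.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical model types; these are NOT RocksDB classes.
using SequenceNumber = std::uint64_t;

enum class EntryType { kPut, kDelete };

// One write as the model sees it: user key plus sequence number and type.
struct Entry {
  std::string user_key;
  SequenceNumber seq;
  EntryType type;
  std::string value;  // empty for deletes
};

// Returns the writes with start_seqnum < seq <= end_seqnum, in log order.
// Only writes with a sequence number larger than start_seqnum are emitted,
// matching the "larger than this parameter" rule in the notes. The
// inclusive upper bound is an assumption of this sketch.
std::vector<Entry> DiffBetween(const std::vector<Entry>& log,
                               SequenceNumber start_seqnum,
                               SequenceNumber end_seqnum) {
  assert(start_seqnum <= end_seqnum);  // precondition of the model
  std::vector<Entry> diff;
  for (const Entry& e : log) {
    if (e.seq > start_seqnum && e.seq <= end_seqnum) {
      diff.push_back(e);  // puts and regular deletes are both emitted
    }
  }
  return diff;
}
```

Deletes show up in the result as explicit entries, which is why the feature depends on tombstones being preserved long enough for the consumer to read them.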

  • Bug Fixes

    • Fix a potential data inconsistency issue during point-in-time recovery. DB::Open() will abort if column family inconsistency is found during PIT recovery.
    • Fix possible metadata corruption in databases using DeleteRange().