RocksDB v6.15.0 Release Notes

  • Bug Fixes

    • Fixed a bug in the following combination of features: indexes with user keys (format_version >= 3), indexes are partitioned (index_type == kTwoLevelIndexSearch), and some index partitions are pinned in memory (BlockBasedTableOptions::pin_l0_filter_and_index_blocks_in_cache). The bug could cause keys to be truncated when read from the index, leading to wrong read results or other unexpected behavior.
    • Fixed a bug when indexes are partitioned (index_type == kTwoLevelIndexSearch), some index partitions are pinned in memory (BlockBasedTableOptions::pin_l0_filter_and_index_blocks_in_cache), and partition reads could be mixed between block cache and directly from the file (e.g., with enable_index_compression == 1 and mmap_read == 1, partitions that were stored uncompressed due to poor compression ratio would be read directly from the file via mmap, while partitions that were stored compressed would be read from block cache). The bug could cause index partitions to be mistakenly considered empty during reads, leading to wrong read results.
    • Since 6.12, memtable lookup should report unrecognized value_type as corruption (#7121).
    • Since 6.14, fix false positive flush/compaction Status::Corruption failure when paranoid_file_checks == true and range tombstones were written to the compaction output files.
    • Since 6.14, fix a bug that could cause a stalled write to crash with a mix of slowdown and no_slowdown writes (WriteOptions.no_slowdown=true).
    • Fixed a bug that caused a hang when closing the DB if refit level is set in an opt build. The cause was that ContinueBackgroundWork() was called inside an assert statement, which is a no-op in opt builds. The bug was introduced in 6.14.
    • Fixed a bug that caused Get() to return an incorrect result when a key's merge operand is applied twice. This can occur if the thread performing Get() runs concurrently with a background flush thread and another thread writing to the MANIFEST file (PR6069).
    • Reverted a behavior change silently introduced in 6.14.2, in which the effects of the ignore_unknown_options flag (used in option parsing/loading functions) changed.
    • Reverted a behavior change silently introduced in 6.14, in which options parsing/loading functions began returning NotFound instead of InvalidArgument for option names not available in the present version.
    • Fixed MultiGet bugs where it did not return valid data with user-defined timestamps.
    • Fixed a potential bug caused by evaluating TableBuilder::NeedCompact() before TableBuilder::Finish() in compaction job. For example, the NeedCompact() method of CompactOnDeletionCollector returned by built-in CompactOnDeletionCollectorFactory requires BlockBasedTable::Finish() to return the correct result. The bug can cause a compaction-generated file not to be marked for future compaction based on deletion ratio.
    • Fixed a seek issue with prefix extractor and timestamp.
    • Fixed a bug in which BlockBasedTableOptions::read_amp_bytes_per_bit was encoded and parsed as a 64-bit integer.
    • Fixed a bug in a recovery corner case; details in PR7621.
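
The hang fix above (ContinueBackgroundWork() called inside an assert) is an instance of a general C++ pitfall: an expression wrapped in assert() is never evaluated when NDEBUG is defined, so any side effect it carries disappears in opt builds. The sketch below illustrates the pattern only; it is not RocksDB code, and ResumeBackgroundWork, CloseBuggy, CloseFixed, and OPT_BUILD_ASSERT are hypothetical names chosen for the example.

```cpp
#include <cassert>  // the real assert; OPT_BUILD_ASSERT below mirrors its NDEBUG form

// Hypothetical counter and function standing in for a call whose side effect
// is required, such as resuming background work. Not RocksDB code.
int g_resume_calls = 0;

bool ResumeBackgroundWork() {
  ++g_resume_calls;
  return true;
}

// What assert(expr) expands to when NDEBUG is defined (an opt build):
// the expression is discarded and never evaluated.
#define OPT_BUILD_ASSERT(expr) ((void)0)

// Buggy pattern: the required call lives inside the assertion, so it
// silently vanishes in an opt build.
void CloseBuggy() { OPT_BUILD_ASSERT(ResumeBackgroundWork()); }

// Fixed pattern: make the call unconditionally, then assert on its result.
void CloseFixed() {
  const bool resumed = ResumeBackgroundWork();
  OPT_BUILD_ASSERT(resumed);
  (void)resumed;  // avoid an unused-variable warning when the assert is compiled out
}
```

Calling CloseBuggy() leaves g_resume_calls unchanged, while CloseFixed() increments it, which is the difference between work that stays paused and work that resumes.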

    Public API Change

    • Deprecate BlockBasedTableOptions::pin_l0_filter_and_index_blocks_in_cache and BlockBasedTableOptions::pin_top_level_index_and_filter. These options still take effect until users migrate to the replacement APIs in BlockBasedTableOptions::metadata_cache_options. Migration guidance can be found in the API comments on the deprecated options.
    • Add new API DB::VerifyFileChecksums to verify SST file checksums against the corresponding entries in the MANIFEST, if present. The current implementation requires scanning and recomputing file checksums.
    • Added a new option track_and_verify_wals_in_manifest. If true, the log numbers and sizes of the synced WALs are tracked in MANIFEST, then during DB recovery, if a synced WAL is missing from disk, or the WAL's size does not match the recorded size in MANIFEST, an error will be reported and the recovery will be aborted. Note that this option does not work with secondary instances.
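
The recovery-time check described for track_and_verify_wals_in_manifest can be pictured with a small model. This is a sketch following the wording of the note, not RocksDB's implementation: WalSizes and VerifyWals are hypothetical names, and the model treats any size difference as a mismatch, which the real check may handle more leniently.

```cpp
#include <cstdint>
#include <map>
#include <string>

// Hypothetical model: WAL log number -> size in bytes.
using WalSizes = std::map<std::uint64_t, std::uint64_t>;

// Compares the synced WALs recorded in the MANIFEST against what is on disk.
// Returns an empty string when every recorded WAL is present with the
// recorded size; otherwise returns a description of the first problem found.
std::string VerifyWals(const WalSizes& recorded, const WalSizes& on_disk) {
  for (const auto& entry : recorded) {
    const std::uint64_t number = entry.first;
    const std::uint64_t size = entry.second;
    const auto it = on_disk.find(number);
    if (it == on_disk.end()) {
      return "missing WAL " + std::to_string(number);
    }
    if (it->second != size) {
      return "size mismatch for WAL " + std::to_string(number);
    }
  }
  return "";  // all recorded WALs verified
}
```

In this model, WALs that exist on disk but were never recorded as synced are ignored; only the recorded ones can abort recovery.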

    Behavior Changes

    • The dictionary compression settings specified in ColumnFamilyOptions::compression_opts now additionally affect files generated by flush and compaction to non-bottommost level. Previously those settings at most affected files generated by compaction to bottommost level, depending on whether ColumnFamilyOptions::bottommost_compression_opts overrode them. Users who relied on dictionary compression settings in ColumnFamilyOptions::compression_opts affecting only the bottommost level can keep the behavior by moving their dictionary settings to ColumnFamilyOptions::bottommost_compression_opts and setting its enabled flag.
    • When the enabled flag is set in ColumnFamilyOptions::bottommost_compression_opts, those compression options now take effect regardless of the value in ColumnFamilyOptions::bottommost_compression. Previously, those compression options only took effect when ColumnFamilyOptions::bottommost_compression != kDisableCompressionOption. Now, they additionally take effect when ColumnFamilyOptions::bottommost_compression == kDisableCompressionOption (such a setting causes bottommost compression type to fall back to ColumnFamilyOptions::compression_per_level if configured, and otherwise fall back to ColumnFamilyOptions::compression).
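
The two behavior changes above amount to a resolution rule for which compression type and which compression options apply to an output file. The following is a simplified model under stated assumptions, not RocksDB code: the enum, the structs, and the functions EffectiveType and EffectiveOpts are hypothetical stand-ins, and the per-level lookup simply clamps the level index to the last configured entry.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical, simplified stand-ins for the types named in the notes.
enum class Compression { kNone, kSnappy, kZSTD, kDisableCompressionOption };

struct CompressionOpts {
  int max_dict_bytes = 0;  // dictionary compression setting
  bool enabled = false;    // only meaningful for bottommost_compression_opts
};

struct CfOptions {
  Compression compression = Compression::kSnappy;
  std::vector<Compression> compression_per_level;
  Compression bottommost_compression = Compression::kDisableCompressionOption;
  CompressionOpts compression_opts;
  CompressionOpts bottommost_compression_opts;
};

// Compression type for an output file. With bottommost_compression disabled,
// the bottommost level falls back to compression_per_level if configured,
// and otherwise to compression.
Compression EffectiveType(const CfOptions& o, std::size_t level, bool bottommost) {
  if (bottommost &&
      o.bottommost_compression != Compression::kDisableCompressionOption) {
    return o.bottommost_compression;
  }
  if (!o.compression_per_level.empty()) {
    // Assumption: clamp to the last configured entry.
    const std::size_t idx = std::min(level, o.compression_per_level.size() - 1);
    return o.compression_per_level[idx];
  }
  return o.compression;
}

// Compression options as of 6.15: bottommost_compression_opts win on the
// bottommost level whenever enabled is set, regardless of
// bottommost_compression; compression_opts (dictionary settings included)
// apply everywhere else, including flush and non-bottommost compaction.
const CompressionOpts& EffectiveOpts(const CfOptions& o, bool bottommost) {
  if (bottommost && o.bottommost_compression_opts.enabled) {
    return o.bottommost_compression_opts;
  }
  return o.compression_opts;
}
```

Under this model, keeping dictionary compression on the bottommost level only means leaving compression_opts.max_dict_bytes at 0 and putting the dictionary settings in bottommost_compression_opts with enabled set, as the first note suggests.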

    New Features

    • An EXPERIMENTAL new Bloom alternative is available using NewExperimentalRibbonFilterPolicy. It saves about 30% space compared to Bloom filters, with about 3-4x construction time and similar query times.