RocksDB v5.14.0 Release Notes

Release Date: 2018-05-16
  • Public API Change

    • Add a BlockBasedTableOption to align uncompressed data blocks on the smaller of block size or page size boundary, to reduce flash reads by avoiding reads spanning 4K pages.
    • The background thread naming convention changed (on supporting platforms) to "rocksdb:<thread pool priority><thread number>", e.g., "rocksdb:low0".
    • Add a new ticker stat rocksdb.number.multiget.keys.found to count the number of keys successfully read in MultiGet calls.
    • Touch-up to write-related counters in PerfContext. New counters added: write_scheduling_flushes_compactions_time, write_thread_wait_nanos. Counters whose behavior was fixed or modified: write_memtable_time, write_pre_and_post_process_time, write_delay_time.
    • Posix Env's NewRandomRWFile() will fail if the file doesn't exist.
    • Now, DBOptions::use_direct_io_for_flush_and_compaction only applies to background writes, and DBOptions::use_direct_reads applies to both user reads and background reads. This conforms with Linux's open(2) manpage, which advises against simultaneously reading a file in buffered and direct modes, due to possibly undefined behavior and degraded performance.
    • Iterator::Valid() now always returns false if !status().ok(), so when doing a Seek() followed by several Next() calls there is no need to check status() after every operation.
    • Iterator::Seek()/SeekForPrev()/SeekToFirst()/SeekToLast() always reset status().
    • Introduced CompressionOptions::kDefaultCompressionLevel, which is a generic way to tell RocksDB to use the compression library's default level. It is now the default value for CompressionOptions::level. Previously the level defaulted to -1, which gave poor compression ratios in ZSTD.
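The two iterator items above (Valid() is false whenever status() is not OK, and every Seek variant resets status()) mean a scan loop can check status() once, after the loop ends. The sketch below illustrates that pattern with a hypothetical in-memory ToyIterator and ToyStatus. Both are stand-ins written for this example, not RocksDB classes; only the method names mirror rocksdb::Iterator, and the fail_at parameter exists purely to simulate a read error.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for rocksdb::Status; only ok() is modeled.
struct ToyStatus {
  bool is_ok = true;
  bool ok() const { return is_ok; }
};

// Hypothetical in-memory iterator that simulates a read error at one position.
class ToyIterator {
 public:
  ToyIterator(std::vector<std::string> keys, std::size_t fail_at)
      : keys_(std::move(keys)), fail_at_(fail_at) {}

  // Like the Seek variants in the notes, repositioning resets status().
  void SeekToFirst() {
    status_ = ToyStatus{};
    pos_ = 0;
    CheckForError();
  }

  void Next() {
    ++pos_;
    CheckForError();
  }

  // Valid() is false at end of data and also whenever status() is not OK.
  bool Valid() const { return status_.ok() && pos_ < keys_.size(); }

  const std::string& key() const { return keys_[pos_]; }
  ToyStatus status() const { return status_; }

 private:
  // Marks the iterator as failed when it lands on the simulated bad position.
  void CheckForError() {
    if (pos_ == fail_at_) status_.is_ok = false;
  }

  std::vector<std::string> keys_;
  std::size_t fail_at_;
  std::size_t pos_ = 0;
  ToyStatus status_;
};

// Collects keys until the iterator becomes invalid, then checks status() once.
// Returns true if the scan ended at end of data, false if it ended on an error.
inline bool ScanAll(ToyIterator& it, std::vector<std::string>* out) {
  for (it.SeekToFirst(); it.Valid(); it.Next()) {
    out->push_back(it.key());
  }
  return it.status().ok();
}
```

With a real rocksdb::Iterator the loop has the same shape: iterate while Valid() holds, then inspect status() once to tell "end of data" apart from "stopped on an error".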

  • New Features

    • Introduce TTL for level compaction so that all files older than ttl go through the compaction process to get rid of old data.
    • TransactionDBOptions::write_policy can be configured to enable WritePrepared 2PC transactions. Read more about them in the wiki.
    • Add DB properties "rocksdb.block-cache-capacity", "rocksdb.block-cache-usage", "rocksdb.block-cache-pinned-usage" to show block cache usage.
    • Add Env::LowerThreadPoolCPUPriority(Priority) method, which lowers the CPU priority of background (esp. compaction) threads to minimize interference with foreground tasks.
    • Fsync parent directory after deleting a file in delete scheduler.
    • In level-based compaction, if the bottom-pri thread pool was set up via Env::SetBackgroundThreads(), compactions to the bottom level will be delegated to that thread pool.
    • prefix_extractor has been moved from ImmutableCFOptions to MutableCFOptions, meaning it can be dynamically changed without a DB restart.
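As a rough illustration of the TTL rule in the first item of this list, the sketch below selects the files whose age exceeds a TTL. FileInfo, FilesPastTtl, and the treatment of ttl == 0 as "disabled" are assumptions made for this example; it does not reproduce RocksDB's compaction picker.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical per-file metadata: a file number and a creation time in seconds.
struct FileInfo {
  std::uint64_t number;
  std::uint64_t creation_time;
};

// Returns the numbers of files created more than `ttl` seconds before `now`.
// Assumption for this sketch: ttl == 0 means the TTL rule is disabled.
inline std::vector<std::uint64_t> FilesPastTtl(
    const std::vector<FileInfo>& files, std::uint64_t now, std::uint64_t ttl) {
  std::vector<std::uint64_t> expired;
  // If now < ttl, no file can be older than the TTL; this also avoids
  // unsigned underflow when computing the cutoff below.
  if (ttl == 0 || now < ttl) return expired;
  const std::uint64_t cutoff = now - ttl;
  for (const FileInfo& f : files) {
    if (f.creation_time < cutoff) expired.push_back(f.number);
  }
  return expired;
}
```

For example, with now = 1000 and ttl = 400 the cutoff is 600, so files created at times 100 and 500 are selected and a file created at time 900 is not.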

  • Bug Fixes

    • Fsync after writing global seq number to the ingestion file in ExternalSstFileIngestionJob.
    • Fix WAL corruption caused by a race condition between the user write thread and FlushWAL when two_write_queues is not set.
    • Fix BackupableDBOptions::max_valid_backups_to_open to not delete backup files when refcount cannot be accurately determined.
    • Fix memory leak when pin_l0_filter_and_index_blocks_in_cache is used with partitioned filters.
    • Disable rollback of merge operands in WritePrepared transactions to work around an issue in MyRocks. It can be re-enabled by setting TransactionDBOptions::rollback_merge_operands to true.
    • Fix wrong results returned by ReverseBytewiseComparator::FindShortSuccessor().

  • Java API Changes

    • Add BlockBasedTableConfig.setBlockCache to allow sharing a block cache across DB instances.
    • Add SstFileManager to the Java API to allow managing SST files across DB instances.