RocksDB v6.11 Release Notes

Release Date: 2020-06-12
  • ๐Ÿ› Bug Fixes

    • Fix consistency check errors being swallowed in some cases when options.force_consistency_checks = true.
    • Fix possible false NotFound status from batched MultiGet using index type kHashSearch.
    • Fix corruption caused by enabling delete triggered compaction (NewCompactOnDeletionCollectorFactory) in universal compaction mode, along with parallel compactions. The bug can result in two parallel compactions picking the same input files, resulting in the DB resurrecting older and deleted versions of some keys.
    • Fix a use-after-free bug in best-efforts recovery. column_family_memtables_ needs to point to a valid ColumnFamilySet.
    • Let best-efforts recovery ignore corrupted files during table loading.
    • Fix corrupt key read from ingested file when iterator direction switches from reverse to forward at a key that is a prefix of another key in the same file. It is only possible in files with a non-zero global seqno.
    • Fix abnormally large estimate from GetApproximateSizes when a range starts near the end of one SST file and near the beginning of another. Now GetApproximateSizes consistently and fairly includes the size of SST metadata in addition to data blocks, attributing metadata proportionally among the data blocks based on their size.
    • Fix potential file descriptor leakage in PosixEnv's IsDirectory() and NewRandomAccessFile().
    • Fix false negative from the VerifyChecksum() API when there is a checksum mismatch in an index partition block in a BlockBasedTable format table file (index_type is kTwoLevelIndexSearch).
    • Fix sst_dump to return non-zero exit code if the specified file is not a recognized SST file or fails requested checks.
    • Fix incorrect results from batched MultiGet for duplicate keys, when the duplicate key matches the largest key of an SST file and the value type for the key in the file is a merge value.
    • Fix "bad block type" error from persistent cache on Windows.
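
    The proportional attribution described in the GetApproximateSizes fix above can be illustrated with a little arithmetic. The following is a minimal sketch of the idea, not RocksDB's implementation: the function name ApproximateRangeSize and its parameters are hypothetical, and it assumes that everything in the file other than data blocks counts as metadata.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical model (not RocksDB code): estimate how many bytes of one SST
// file a key range accounts for when the file's metadata is spread across
// its data blocks in proportion to their size.
//
//   range_data_bytes  data-block bytes covered by the range
//   total_data_bytes  all data-block bytes in the file
//   file_size         data blocks plus metadata (index, filter, footer, ...)
uint64_t ApproximateRangeSize(uint64_t range_data_bytes,
                              uint64_t total_data_bytes,
                              uint64_t file_size) {
  assert(range_data_bytes <= total_data_bytes);
  assert(total_data_bytes <= file_size);
  if (total_data_bytes == 0) {
    return 0;  // no data blocks, so nothing to attribute
  }
  // Each data byte carries file_size / total_data_bytes bytes of the file.
  const double estimate = static_cast<double>(range_data_bytes) *
                          static_cast<double>(file_size) /
                          static_cast<double>(total_data_bytes);
  return static_cast<uint64_t>(std::llround(estimate));
}
```

    In this model, a file with 900 bytes of data blocks and 100 bytes of metadata attributes 100 bytes to a range covering 90 bytes of data, and two ranges that together cover all data blocks add up to the full file size.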

    Public API Change

    • Flush(..., column_family) may return Status::ColumnFamilyDropped() instead of Status::InvalidArgument() if column_family is dropped while processing the flush request.
    • BlobDB now explicitly disallows using the default column family's storage directories as blob directory.
    • DeleteRange now returns Status::InvalidArgument if the range's end key comes before its start key according to the user comparator. Previously the behavior was undefined.
    • ldb now uses options.force_consistency_checks = true by default and "--disable_consistency_checks" is added to disable it.
    • DB::OpenForReadOnly no longer creates files or directories if the named DB does not exist, unless create_if_missing is set to true.
    • The consistency checks that validate LSM state changes (table file additions/deletions during flushes and compactions) are now stricter, more efficient, and no longer optional, i.e. they are performed even if force_consistency_checks is false.
    • Disable delete triggered compaction (NewCompactOnDeletionCollectorFactory) in universal compaction mode and num_levels = 1 in order to avoid a corruption bug.
    • pin_l0_filter_and_index_blocks_in_cache no longer applies to L0 files larger than 1.5 * write_buffer_size to give more predictable memory usage. Such L0 files may exist due to intra-L0 compaction, external file ingestion, or user dynamically changing write_buffer_size (note, however, that files that are already pinned will continue being pinned, even after such a dynamic change).
    • In point-in-time WAL recovery mode, fail database recovery in case of IOError while reading the WAL to avoid data loss.
    • A new method Env::LowerThreadPoolCPUPriority(Priority, CpuPriority) is added to Env to be able to lower to a specific priority such as CpuPriority::kIdle.
    • DB::GetDbSessionId(std::string& session_id) is added. session_id stores a unique identifier that gets reset every time the DB is opened. This DB session ID should be unique among all open DB instances on all hosts, and should be unique among re-openings of the same or other DBs. This identifier is recorded in the LOG file on the line starting with DB Session ID:.
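
    The DeleteRange change above depends on the user comparator, not on plain byte order. The sketch below models only that validation rule. It is not RocksDB code: RangeCheck, UserCompare, and CheckDeleteRange are hypothetical names, and the comparator is assumed to follow the usual convention of returning a negative value when its first argument orders before its second.

```cpp
#include <functional>
#include <string>

// Hypothetical stand-ins for Status::OK() and Status::InvalidArgument().
enum class RangeCheck { kOk, kInvalidArgument };

// compare(a, b) < 0 means a orders before b under the user comparator.
using UserCompare =
    std::function<int(const std::string&, const std::string&)>;

// Model of the rule in the release note: a range is rejected only when its
// end key comes strictly before its start key. An empty range (begin == end)
// is still accepted.
RangeCheck CheckDeleteRange(const std::string& begin, const std::string& end,
                            const UserCompare& compare) {
  return compare(end, begin) < 0 ? RangeCheck::kInvalidArgument
                                 : RangeCheck::kOk;
}

// Two example comparators: bytewise order and its reverse.
inline int BytewiseCompare(const std::string& a, const std::string& b) {
  return a.compare(b);
}
inline int ReverseBytewiseCompare(const std::string& a, const std::string& b) {
  return b.compare(a);
}
```

    Under the bytewise comparator the range ["z", "a") is rejected, while under the reverse comparator the same range is valid and ["a", "z") is the one that is rejected.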

    New Features

    • sst_dump adds a new --readahead_size argument that lets users specify the read size when scanning the data. sst_dump also tries to prefetch the tail part of the SST files, so usually some number of I/Os are saved there too.
    • Generate file checksum in SstFileWriter if Options.file_checksum_gen_factory is set. The checksum and checksum function name are stored in ExternalSstFileInfo after the sst file write is finished.
    • Add a value_size_soft_limit in read options which limits the cumulative value size of keys read in batches in MultiGet. Once the cumulative value size of found keys exceeds read_options.value_size_soft_limit, all the remaining keys are returned with status Aborted without looking up their values. By default the value_size_soft_limit is std::numeric_limits<uint64_t>::max().
    • Enable SST file ingestion with file checksum information when calling IngestExternalFiles(const std::vector<IngestExternalFileArg>& args). Added files_checksums and files_checksum_func_names to IngestExternalFileArg so that users can ingest SST files together with their file checksum information. Added verify_file_checksum to IngestExternalFileOptions (default is True). To be backward compatible, if the DB does not enable file checksum or the user does not provide checksum information (vectors of files_checksums and files_checksum_func_names are both empty), verification of file checksum is always successful. If the DB enables file checksum, it will always generate the checksum for each ingested SST file during the Prepare stage of ingestion and store the checksum in the Manifest, unless verify_file_checksum is False and checksum information is provided by the application. In that case, only the checksum function name is verified and the ingested checksum is stored directly in the Manifest. If verify_file_checksum is set to True, the DB will verify the ingested checksum and function name against the generated ones. Any mismatch will fail the ingestion. Note that if IngestExternalFileOptions::write_global_seqno is True, the seqno will be changed in the ingested file, and therefore the checksum of the file will change. In this case, a new checksum will be generated after the seqno is updated and stored in the Manifest.

    ๐ŸŽ Performance Improvements

    • Eliminate redundant key comparisons during random access in block-based tables.