TileDB v2.3.0 Release Notes

    Disk Format

    • Format version incremented to 9. #2108

    Breaking behavior

    • The setting of `sm.read_range_oob` now defaults to `warn`, allowing queries to run with bounded ranges that errored before. #2176
    • Removes TBB as an optional dependency #2181

    New features

    • Support TILEDB_DATETIME_{SEC,MS,US,NS} in arrow_io_impl.h #2228
    • Adds support for filtering query results on attribute values #2141
    • Adds support for time datatype dimensions and attributes #2140
    • Add support for serialization of config objects #2164
    • Add C and C++ examples to the examples/ directory for the tiledb_fragment_info_t APIs. #2160
    • Support building serialization (using Cap'n Proto) on Windows #2100
    • Config option "vfs.s3.sse" for S3 server-side encryption support #2130
    • Name attribute/dimension files by index. This is fragment-specific and updates the format version to version 9. #2107
    • Smoke Test, remove nullable structs from global namespace. #2078

    Improvements

    • Replace ReadFromOffset with ReadRange in GCS::read() to avoid excess GCS egress traffic #2307
    • Hilbert partitioning fixes #2269
    • Stats refactor #2267
    • Improve Cap'n Proto CMake setup for system installations #2263
    • Runtime check for minimum validity buffer size #2261
    • Enable partial vacuuming when vacuuming with timestamps #2251
    • Consolidation: de-dupe FragmentInfo #2250
    • Consolidation: consider non empty domain before start timestamp #2248
    • Add size details to S3 read error #2249
    • Consolidation: do not re-open array for each fragment #2243
    • Support back compat writes #2230
    • Serialization support for query conditions #2240
    • Make SubarrayPartitioner's member functions return Status after calling Subarray::get_range_num. #2235
    • Update bzip2 super build version to 1.0.8 to address CVE-2019-12900 in libbzip2 #2233
    • Timestamp start and end for vacuuming and consolidation #2227
    • Fix memory leaks reported by ASAN when running with leak detection. #2223
    • Use relative paths in consolidated fragment metadata #2215
    • Optimize Subarray::compute_relevant_fragments #2216
    • AWS S3: improve is_dir #2209
    • Add nullable string to nullable attribute example #2212
    • AWS S3: adding option to skip Aws::InitAPI #2204
    • Added additional stats for subarrays and subarray partitioners #2200
    • Introduces config parameter "sm.skip_est_size_partitioning" #2203
    • Add config to query serialization. #2177
    • Consolidation support for nullable attributes #2196
    • Adjust unit tests to reduce memory leaks inside the tests. #2179
    • Reduces memory usage in multi-range reads #2165
    • Add config option `sm.read_range_oob` to toggle bounding read ranges to domain or erroring #2162
    • Windows msys2 build artifacts are no longer uploaded #2159
    • Add internal log functions to log at different log levels #2161
    • Parallelize Writer::filter_tiles #2156
    • Added config option "vfs.gcs.request_timeout_ms" #2148
    • Improve fragment info loading by parallelizing fragment_size requests #2143
    • Allow open array stats to be printed without read query #2131
    • Clean up the GHA CI scripts: put common code into external shell scripts. #2124
    • Reduced memory consumption in the read path for multi-range reads. #2118
    • The latest version of dev was leaving behind a test/empty_string3/ directory. This ensures that the directory is removed when make check is run. #2113
    • Migrating AZP CI to GA #2111
    • Cache non_empty_domain for REST arrays like all other arrays #2105
    • Add additional stats printing to break down read state initialization timings #2095
    • Places the in-memory filesystem under unit test #1961
    • Adds a GitHub Action to automate the HISTORY.md #2075
    • Change printfs in C++ examples to cout, edit C print statements to fix format warnings #2226

    Deprecations

    • The following APIs have been deprecated: tiledb_array_open_at, tiledb_array_open_at_with_key, tiledb_array_reopen_at. #2142

    Bug fixes

    • Fix a segfault on VFS::ls for the in-memory filesystem #2255
    • Fix rare read corruption in S3 #2253
    • Update some union initializers to use strict syntax #2242
    • Fix race within S3::init_client #2247
    • Expand accepted Windows URIs. #2237
    • Fix unordered writes on nullable, fixed-size attributes. #2241
    • Fix tile extent to be reported as domain extent for sparse arrays with Hilbert ordering #2231
    • Do not consider option sm.read_range_oob for set_subarray() on Write queries #2211
    • Avoid generating multiple, concatenated, flattened subarray data. #2190
    • Change mutex from basic to recursive #2180
    • Fixes a memory leak in the S3 read path #2189
    • Fixes a potential memory leak in the filter pipeline #2185
    • Fixes misc memory leaks in the unit tests #2183
    • Fix memory leak of `tiledb_config_t` in error path of `tiledb_config_alloc`. #2178
    • Fix check for null pointer in query deserialization #2163
    • Fixes a potential crash when retrying incomplete reads #2137
    • Fixes a potential crash when opening an array with consolidated fragment metadata #2135
    • Corrected a bug where sparse cells may be incorrectly returned using string dimensions. #2125
    • Fix segfault in serialized queries when partition is unsplittable #2120
    • Always use original buffer size in serialized read queries on the server side. #2115
    • Fix an edge case where a read query may hang on an array with string dimensions #2089

    API additions

    C API

    • Added tiledb_array_set_open_timestamp_start and tiledb_array_get_open_timestamp_start #2285
    • Added tiledb_array_set_open_timestamp_end and tiledb_array_get_open_timestamp_end #2285
    • Addition of tiledb_array_set_config to directly assign a config to an array. #2142
    • tiledb_query_get_array now returns a deep-copy #2184
    • Added `tiledb_serialize_config` and `tiledb_deserialize_config` #2164
    • Add new API, tiledb_query_get_config, to get a query's config. #2167
    • Removes non-default parameter in "tiledb_config_unset". #2099

    C++ API

    • Added Array::set_open_timestamp_start and Array::open_timestamp_start #2285
    • Added Array::set_open_timestamp_end and Array::open_timestamp_end #2285
    • Add Query::result_buffer_elements_nullable support for dimensions #2238
    • Addition of tiledb_array_set_config to directly assign a config to an array. #2142
    • Add new API, Query.config(), to get a query's config. #2167
    • Removes non-default parameter in "Config::unset". #2099
    • Add support for a string-typed, variable-sized, nullable attribute in the C++ API. #2090