ClickHouse v20.8.2.3 Release Notes

Release Date: 2020-09-08
    Backward Incompatible Change

    • ⚡️ OPTIMIZE FINAL no longer recalculates TTL for parts that were added before the TTL was created. Run ALTER TABLE ... MATERIALIZE TTL once to calculate them; after that, OPTIMIZE FINAL will evaluate TTLs properly. This behavior never worked for replicated tables. #14220 (alesapin).
    • Extend parallel_distributed_insert_select setting, adding an option to run INSERT into local table. The setting changes type from Bool to UInt64, so the values false and true are no longer supported. If you have these values in server configuration, the server will not start. Please replace them with 0 and 1, respectively. #14060 (Azat Khuzhin).
    • ✂ Remove support for the ODBCDriver input/output format. This was a deprecated format once used for communication with the ClickHouse ODBC driver, now long superseded by the ODBCDriver2 format. Resolves #13629. #13847 (hexiaoting).
    • ⚡️ When upgrading from versions older than 20.5 with a rolling update, the cluster temporarily contains both nodes running 20.5 or newer and nodes running older versions. If a node with an old version is restarted and starts up in the presence of newer versions, it may lead to Part ... intersects previous part errors. To prevent this error, first install the newer clickhouse-server packages on all cluster nodes and only then do the restarts, so that every restarted clickhouse-server starts up with the new version.
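
The Bool-to-UInt64 change for parallel_distributed_insert_select can be checked before upgrading. Below is a minimal Python sketch, assuming the setting appears as an XML element somewhere in a configuration file. The sample XML layout and the helper name migrate_bool_setting are hypothetical; the only behavior taken from the note above is the replacement of false with 0 and true with 1.

```python
import xml.etree.ElementTree as ET
from typing import Tuple

SETTING = "parallel_distributed_insert_select"
# Replacement described in the release note: false -> 0, true -> 1.
LEGACY_VALUES = {"false": "0", "true": "1"}


def migrate_bool_setting(xml_text: str) -> Tuple[str, int]:
    """Rewrite legacy Bool values of SETTING; return (new XML, number of changes).

    Hypothetical helper. Note that ElementTree drops XML comments when the
    document is serialized again, so treat the output as a preview.
    """
    root = ET.fromstring(xml_text)
    changed = 0
    for elem in root.iter(SETTING):
        value = (elem.text or "").strip().lower()
        if value in LEGACY_VALUES:
            elem.text = LEGACY_VALUES[value]
            changed += 1
    return ET.tostring(root, encoding="unicode"), changed


# Assumed sample layout, for illustration only.
sample = """<yandex><profiles><default>
  <parallel_distributed_insert_select>true</parallel_distributed_insert_select>
</default></profiles></yandex>"""

migrated, count = migrate_bool_setting(sample)
print(count)  # 1
print(migrated)
```

Values that are already numeric are left untouched, so the helper can be run repeatedly.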

    🆕 New Feature

    • ➕ Add the ability to specify the Default compression codec for columns, which corresponds to the compression settings specified in config.xml. Implements: #9074. #14049 (alesapin).
    • 👌 Support Kerberos authentication in Kafka, using krb5 and cyrus-sasl libraries. #12771 (Ilya Golshtein).
    • ➕ Add function normalizeQuery that replaces literals, sequences of literals and complex aliases with placeholders. Add function normalizedQueryHash that returns identical 64-bit hash values for similar queries. This helps when analyzing the query log. This closes #11271. #13816 (alexey-milovidov).
    • ➕ Add time_zones table. #13880 (Bharat Nallan).
    • ➕ Add function defaultValueOfTypeName that returns the default value for a given type. #13877 (hcz).
    • ➕ Add countDigits(x) function that counts the number of decimal digits in an integer or decimal column. Add isDecimalOverflow(d, [p]) function that checks if the value in a Decimal column is out of its (or the specified) precision. #14151 (Artem Zuikov).
    • ➕ Add quantileExactLow and quantileExactHigh implementations with respective aliases for medianExactLow and medianExactHigh. #13818 (Bharat Nallan).
    • ➕ Added date_trunc function that truncates a date/time value to a specified date/time part. #13888 (Vladimir Golovchenko).
    • ➕ Add new optional section <user_directories> to the main config. #13425 (Vitaly Baranov).
    • ➕ Add ALTER SAMPLE BY statement that allows changing the table's sampling clause. #13280 (Amos Bird).
    • 👍 Function position now supports optional start_pos argument. #13237 (vdimir).
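
Several of the functions listed above have semantics that are easy to picture with a small model. The following Python sketch is only an approximation written for illustration, not ClickHouse code. It assumes ASCII strings (so byte and character positions coincide), models quantileExactLow and quantileExactHigh only at the median level, and represents a Decimal by its unscaled integer value (for example, 42.000 with scale 3 is 42000). All Python names are hypothetical.

```python
def position(haystack: str, needle: str, start_pos: int = 1) -> int:
    """1-based position of needle at or after start_pos, or 0 if not found."""
    return haystack.find(needle, start_pos - 1) + 1


def median_exact_low(values):
    """Median that picks the lower of the two middle elements for even counts."""
    ordered = sorted(values)
    return ordered[(len(ordered) - 1) // 2]


def median_exact_high(values):
    """Median that picks the higher of the two middle elements for even counts."""
    ordered = sorted(values)
    return ordered[len(ordered) // 2]


def count_digits(unscaled: int) -> int:
    """Number of decimal digits in the (unscaled) integer representation."""
    return len(str(abs(unscaled)))


def is_decimal_overflow(unscaled: int, precision: int) -> bool:
    """True if the value needs more digits than the given precision allows."""
    return count_digits(unscaled) > precision


print(position("Hello, world!", "o"))        # 5
print(position("Hello, world!", "o", 7))     # 9
print(median_exact_low([1, 2, 3, 4]))        # 2
print(median_exact_high([1, 2, 3, 4]))       # 3
print(count_digits(42000))                   # 5
print(is_decimal_overflow(1000000000, 9))    # True
```

For an odd number of elements both median variants return the same middle element; they differ only when the count is even.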

    🐛 Bug Fix

    👌 Improvement

    • 🛠 Disallows CODEC on ALIAS column type. Fixes #13911. #14263 (Bharat Nallan).
    • When waiting for a dictionary update to complete, use the timeout specified by query_wait_timeout_milliseconds setting instead of a hard-coded value. #14105 (Nikita Mikhaylov).
    • Add setting min_index_granularity_bytes that protects against accidentally creating a table with very low index_granularity_bytes setting. #14139 (Bharat Nallan).
    • Now it's possible to fetch partitions from clusters that use a different ZooKeeper: ALTER TABLE table_name FETCH PARTITION partition_expr FROM 'zk-name:/path-in-zookeeper'. This is useful for shipping data to new clusters. #14155 (Amos Bird).
    • 🐎 Slightly better performance of Memory table if it was constructed from a huge number of very small blocks (that's unlikely). Author of the idea: Mark Papadakis. Closes #14043. #14056 (alexey-milovidov).
    • Conditional aggregate functions (for example: avgIf, sumIf, maxIf) now return NULL when no rows match the condition and the arguments are Nullable. #13964 (Winter Zhang).
    • Increase limit in -Resample combinator to 1M. #13947 (Mikhail f. Shiryaev).
    • Corrected an error in AvroConfluent format that caused the Kafka table engine to stop processing messages when an abnormally small, malformed message was received. #13941 (Gervasio Varela).
    • 🛠 Fix wrong error for long queries. For a correct query that exceeded the size limit, it was possible to get a syntax error other than Max query size exceeded. #13928 (Nikolai Kochetov).
    • 👍 Better error message for null value of TabSeparated format. #13906 (jiang tao).
    • Function arrayCompact will compare NaNs bitwise if the type of array elements is Float32/Float64. In previous versions NaNs were always not equal if the type of array elements is Float32/Float64 and were always equal if the type is more complex, like Nullable(Float64). This closes #13857. #13868 (alexey-milovidov).
    • 🛠 Fix data race in lgamma function. This race was caught only in tsan; no side effects actually happened. #13842 (Nikolai Kochetov).
    • 👻 Avoid too slow queries when arrays are manipulated as fields. Throw exception instead. #13753 (alexey-milovidov).
    • ➕ Added Redis requirepass authorization (for redis dictionary source). #13688 (Ivan Torgashov).
    • ➕ Add MergeTree Write-Ahead-Log (WAL) dump tool. WAL is an experimental feature. #13640 (BohuTANG).
    • 🏗 In previous versions, the lcm function could produce an assertion violation in debug build if called with specifically crafted arguments. This fixes #13368. #13510 (alexey-milovidov).
    • 👍 Provide monotonicity for toDate/toDateTime functions in more cases. Monotonicity information is used for index analysis (more complex queries will be able to use the index). Now the input arguments are saturated more naturally, which provides better monotonicity. #13497 (Amos Bird).
    • 👌 Support compound identifiers for custom settings. Custom settings are an integration point of the ClickHouse codebase with other codebases (no benefits for ClickHouse itself). #13496 (Vitaly Baranov).
    • 🚚 Move parts from DiskLocal to DiskS3 in parallel. DiskS3 is an experimental feature. #13459 (Pavel Kovalenko).
    • 0️⃣ Enable mixed granularity parts by default. #13449 (alesapin).
    • 🔒 Proper remote host checking in S3 redirects (security-related thing). #13404 (Vladimir Chebotarev).
    • ➕ Add QueryTimeMicroseconds, SelectQueryTimeMicroseconds and InsertQueryTimeMicroseconds to system.events. #13336 (ianton-ru).
    • 🛠 Fix debug assertion when Decimal has too large negative exponent. Fixes #13188. #13228 (alexey-milovidov).
    • ➕ Added cache layer for DiskS3 (cache to local disk mark and index files). DiskS3 is an experimental feature. #13076 (Pavel Kovalenko).
    • 🛠 Fix readline so it dumps history to file now. #13600 (Amos Bird).
    • 0️⃣ Create system database with Atomic engine by default (a preparation to enable Atomic database engine by default everywhere). #13680 (tavplubix).
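
The arrayCompact change above can be pictured with a small Python model. This is a sketch under stated assumptions, not the ClickHouse implementation: consecutive duplicates are dropped, two NaNs count as equal only when their bit patterns match, and the nan_bitwise flag (a hypothetical parameter) switches back to the older behavior described for Float32/Float64, where NaNs never compared equal.

```python
import math
import struct


def _same(a: float, b: float, nan_bitwise: bool) -> bool:
    """Equality used for compaction; NaNs are compared by their 64-bit pattern."""
    if math.isnan(a) and math.isnan(b):
        return nan_bitwise and struct.pack("<d", a) == struct.pack("<d", b)
    return a == b


def array_compact(values, nan_bitwise=True):
    """Remove consecutive duplicate elements from a list of floats."""
    out = []
    for v in values:
        if not out or not _same(out[-1], v, nan_bitwise):
            out.append(v)
    return out


nan = float("nan")
data = [nan, nan, 1.0, 1.0, nan]
print(len(array_compact(data)))                     # 3
print(len(array_compact(data, nan_bitwise=False)))  # 4
```

With bitwise comparison the two leading NaNs collapse into one; under the older rule they are kept as separate elements.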

    🐎 Performance Improvement

    • ⚡️ Slightly optimize very short queries with LowCardinality. #14129 (Anton Popov).
    • Enable parallel INSERTs for table engines Null, Memory, Distributed and Buffer when the setting max_insert_threads is set. #14120 (alexey-milovidov).
    • Fail fast if the max_rows_to_read limit is exceeded on parts scan. The motivation behind this change is to skip the ranges scan for all selected parts once it is clear that max_rows_to_read is already exceeded. The change is quite noticeable for queries over a large number of parts. #13677 (Roman Khavronenko).
    • 🐎 Slightly improve performance of aggregation by UInt8/UInt16 keys. #13099 (alexey-milovidov).
    • ⚡️ Optimize has(), indexOf() and countEqual() functions for Array(LowCardinality(T)) and constant right arguments. #12550 (myrrc).
    • When performing trivial INSERT SELECT queries, automatically set max_threads to 1 or max_insert_threads, and set max_block_size to min_insert_block_size_rows. Related to #5907. #12195 (flynn).
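
The fail-fast check for max_rows_to_read described above can be illustrated roughly as follows. This is a simplified Python sketch, not the actual implementation: parts are modeled as hypothetical (name, row_count) pairs, the exception class and function name are made up for the example, and a limit of 0 is treated as "no limit".

```python
class TooManyRows(Exception):
    """Raised as soon as the running row total exceeds the limit."""

    def __init__(self, total_rows, parts_scanned):
        super().__init__(f"limit exceeded after {total_rows} rows")
        self.total_rows = total_rows
        self.parts_scanned = parts_scanned


def scan_parts(parts, max_rows_to_read):
    """Accumulate rows part by part and stop at the first part that breaks the limit."""
    total = 0
    scanned = []
    for name, rows in parts:
        total += rows
        if max_rows_to_read and total > max_rows_to_read:
            # Fail fast: the remaining parts are never scanned.
            raise TooManyRows(total, len(scanned))
        scanned.append(name)
    return scanned, total


parts = [("p1", 100), ("p2", 200), ("p3", 300)]
print(scan_parts(parts, 0))  # (['p1', 'p2', 'p3'], 600)
try:
    scan_parts(parts, 250)
except TooManyRows as e:
    print(e.total_rows, e.parts_scanned)  # 300 1
```

The benefit grows with the number of parts, since everything after the part that crosses the limit is skipped.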

    Experimental Feature

    • ClickHouse can work as MySQL replica - it is implemented by MaterializeMySQL database engine. Implements #4006. #10851 (Winter Zhang).
    • Add types Int128, Int256, UInt256 and related functions for them. Extend Decimals with Decimal256 (precision up to 76 digits). The new types are available under the setting allow_experimental_bigint_types. The feature works extremely slowly and poorly, and the implementation is incomplete. Please don't use this feature. #13097 (Artem Zuikov).
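
The 76-digit precision quoted for Decimal256 follows from simple arithmetic. The Python sketch below assumes that a Decimal value is stored as a signed two's-complement integer of the given width (an assumption of this example) and computes the largest number of decimal digits for which every value fits. The function name is hypothetical.

```python
def max_decimal_precision(bits: int) -> int:
    """Largest p such that every p-digit integer fits in a signed `bits`-bit integer."""
    limit = 2 ** (bits - 1)  # values up to limit - 1 are representable
    precision = 0
    # All (p+1)-digit values fit when 10**(p+1) - 1 <= limit - 1.
    while 10 ** (precision + 1) <= limit:
        precision += 1
    return precision


for width in (32, 64, 128, 256):
    print(width, max_decimal_precision(width))
# 32 9
# 64 18
# 128 38
# 256 76
```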

    🏗 Build/Testing/Packaging Improvement

    • ➕ Added clickhouse install script, which is useful if you only have a single binary. #13528 (alexey-milovidov).
    • 👍 Allow running the clickhouse binary without configuration. #13515 (alexey-milovidov).
    • ✏️ Enable check for typos in code with codespell. #13513 #13511 (alexey-milovidov).
    • 👕 Enable Shellcheck in CI as a linter of .sh tests. This closes #13168. #13530 #13529 (alexey-milovidov).
    • ➕ Add a CMake option to fail configuration instead of auto-reconfiguration, enabled by default. #13687 (Konstantin).
    • 🔖 Expose version of embedded tzdata via TZDATA_VERSION in system.build_options. #13648 (filimonov).
    • 👌 Improve generation of system.time_zones table during build. Closes #14209. #14215 (filimonov).
    • 🏗 Build ClickHouse with the latest tzdata from the package repository. #13623 (alexey-milovidov).
    • ➕ Add the ability to write js-style comments in skip_list.json. #14159 (alesapin).
    • Ensure that there is no copy-pasted GPL code. #13514 (alexey-milovidov).
    • 🐳 Switch tests docker images to use test-base parent. #14167 (Ilya Yatsishin).
    • Adding retry logic when bringing up docker-compose cluster; Increasing COMPOSE_HTTP_TIMEOUT. #14112 (vzakaznikov).
    • ✅ Enabled system.text_log in stress test to find more bugs. #13855 (Nikita Mikhaylov).
    • ✅ Testflows LDAP module: adding missing certificates and dhparam.pem for openldap4. #13780 (vzakaznikov).
    • 👷 ZooKeeper cannot work reliably in unit tests in the CI infrastructure. Using unit tests for interaction with a real ZooKeeper was a bad idea from the start (unit tests are not supposed to verify complex distributed systems). We already use integration tests for this purpose, and they are better suited. #13745 (alexey-milovidov).
    • ➕ Added docker image for style check. Added style check that all docker and docker compose files are located in docker directory. #13724 (Ilya Yatsishin).
    • 🛠 Fix cassandra build on Mac OS. #13708 (Ilya Yatsishin).
    • 🛠 Fix link error in shared build. #13700 (Amos Bird).
    • ⚡️ Updating LDAP user authentication suite to check that it works with RBAC. #13656 (vzakaznikov).
    • Removed -DENABLE_CURL_CLIENT for contrib/aws. #13628 (Vladimir Chebotarev).
    • 🐳 Increasing health-check timeouts for ClickHouse nodes and adding support to dump docker-compose logs if unhealthy containers are found. #13612 (vzakaznikov).
    • 👉 Make sure https://github.com/ClickHouse/ClickHouse/issues/10977 is invalid. #13539 (Amos Bird).
    • Skip PRs from robot-clickhouse. #13489 (Nikita Mikhaylov).
    • 👷 Move Dockerfiles from integration tests to docker/test directory. docker_compose files are available in runner docker container. Docker images are built in CI and not in integration tests. #13448 (Ilya Yatsishin).