ClickHouse v20.11.2.1 Release Notes

Release Date: 2020-11-11
    Backward Incompatible Change

    • 0️⃣ If a profile was specified in the distributed_ddl config section, it could overwrite the settings of the default profile on server startup. This is fixed: settings of distributed DDL queries should no longer affect global server settings. #16635 (tavplubix).
    • Restrict the use of non-comparable data types (like AggregateFunction) in keys (Sorting key, Primary key, Partition key, and so on). #16601 (alesapin).
    • Remove ANALYZE and AST queries, and make the setting enable_debug_queries obsolete, since this functionality is now part of the full-featured EXPLAIN query. #16536 (Ivan).
    • Aggregate functions boundingRatio, rankCorr, retention, timeSeriesGroupSum, timeSeriesGroupRateSum, windowFunnel were erroneously made case-insensitive. Now their names are made case sensitive as designed. Only functions that are specified in SQL standard or made for compatibility with other DBMS or functions similar to those should be case-insensitive. #16407 (alexey-milovidov).
    • 👉 Make rankCorr function return nan on insufficient data https://github.com/ClickHouse/ClickHouse/issues/16124. #16135 (hexiaoting).
    • ⚡️ When upgrading from versions older than 20.5 with a rolling update, the cluster temporarily contains both versions 20.5 or greater and versions older than 20.5. If a ClickHouse node with an old version is restarted and the old version starts up in the presence of newer versions, it may lead to Part ... intersects previous part errors. To prevent this error, first install the newer clickhouse-server packages on all cluster nodes and then do the restarts (so that when clickhouse-server is restarted, it starts up with the new version).

    🆕 New Feature

    • ➕ Added support of LDAP as a user directory for locally non-existent users. #12736 (Denis Glazachev).
    • ➕ Add system.replicated_fetches table which shows currently running background fetches. #16428 (alesapin).
    • Added setting date_time_output_format. #15845 (Maksim Kita).
    • ➕ Added minimal web UI to ClickHouse. #16158 (alexey-milovidov).
    • 👍 Allow reading/writing a single protobuf message at once (without length delimiters). #15199 (filimonov).
    • Added initial OpenTelemetry support. ClickHouse now accepts OpenTelemetry traceparent headers over Native and HTTP protocols, and passes them downstream in some cases. The trace spans for executed queries are saved into the system.opentelemetry_span_log table. #14195 (Alexander Kuzmenkov).
    • 👍 Allow specifying the primary key in the column list of a CREATE TABLE query. This is needed for compatibility with other SQL dialects. #15823 (Maksim Kita).
    • Implement OFFSET offset_row_count {ROW | ROWS} FETCH {FIRST | NEXT} fetch_row_count {ROW | ROWS} {ONLY | WITH TIES} in SELECT query with ORDER BY. This is the SQL-standard way to specify LIMIT. #15855 (hexiaoting).
    • 🌲 Add the errorCodeToName function, which returns the variable name of an error (useful for analyzing query_log and similar), and the system.errors table, which shows how many times errors have happened (respects system_events_show_zero_values). #16438 (Azat Khuzhin).
    • ➕ Added the special function untuple, which can introduce new columns to the SELECT list by expanding a named tuple. #16242 (Nikolai Kochetov, Amos Bird).
    • Identifiers can now be provided via query parameters, and these parameters can be used as table objects or columns. #16594 (Amos Bird).
    • ➕ Added big integers (UInt256, Int128, Int256) and UUID data types support for MergeTree BloomFilter index. Big integers are an experimental feature. #16642 (Maksim Kita).
    • ➕ Add farmFingerprint64 function (non-cryptographic string hashing). #16570 (Jacob Hayes).
    • Add log_queries_min_query_duration_ms: only queries slower than the value of this setting will go to query_log/query_thread_log (i.e. something like slow_query_log in MySQL). #16529 (Azat Khuzhin).
    • 🐳 Ability to create a Docker image on top of Alpine. Uses a precompiled binary and glibc components from Ubuntu 20.04. #16479 (filimonov).
    • ➕ Added toUUIDOrNull, toUUIDOrZero cast functions. #16337 (Maksim Kita).
    • Add max_concurrent_queries_for_all_users setting, see #6636 for use cases. #16154 (nvartolomei).
    • Add a new option print_query_id to clickhouse-client. It helps generate arbitrary strings with the current query id generated by the client. Also print query id in clickhouse-client by default. #15809 (Amos Bird).
    • ➕ Add tid and logTrace functions. This closes #9434. #15803 (flynn).
    • ➕ Add function formatReadableTimeDelta that formats a time delta as a human-readable string ... #15497 (Filipe Caixeta).
    • ➕ Added disable_merges option for volumes in multi-disk configuration. #13956 (Vladimir Chebotarev).
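
    The traceparent header named in the OpenTelemetry item above follows the W3C Trace Context layout: a version, a 32-hex-digit trace id, a 16-hex-digit parent (span) id, and 2-hex-digit flags, joined by dashes. The sketch below builds and validates such a value on the client side. The helper names make_traceparent and parse_traceparent are hypothetical and are not part of ClickHouse; how the server maps these fields into system.opentelemetry_span_log is not shown here.

```python
import re
import secrets

# W3C Trace Context layout: version-traceid-parentid-flags, all lowercase hex.
TRACEPARENT_RE = re.compile(r"([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})")


def make_traceparent(sampled: bool = True) -> str:
    """Build a version-00 traceparent value with random trace and span ids."""
    trace_id = secrets.token_hex(16)  # 16 random bytes -> 32 hex characters
    span_id = secrets.token_hex(8)    # 8 random bytes -> 16 hex characters
    flags = "01" if sampled else "00"
    return f"00-{trace_id}-{span_id}-{flags}"


def parse_traceparent(value: str) -> dict:
    """Split a traceparent value into its fields, rejecting malformed input."""
    match = TRACEPARENT_RE.fullmatch(value)
    if match is None:
        raise ValueError(f"malformed traceparent: {value!r}")
    version, trace_id, span_id, flags = match.groups()
    # The Trace Context spec treats all-zero ids as invalid.
    if trace_id == "0" * 32 or span_id == "0" * 16:
        raise ValueError("trace id and span id must not be all zeros")
    return {
        "version": version,
        "trace_id": trace_id,
        "span_id": span_id,
        "sampled": bool(int(flags, 16) & 0x01),
    }
```

    A client would send the generated string as the value of the traceparent header on an HTTP request to the server; the same trace id can then be used to look up the spans recorded for that query.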

    Experimental Feature

    • New functions encrypt, aes_encrypt_mysql, decrypt, aes_decrypt_mysql. These functions work slowly, so we consider them an experimental feature. #11844 (Vasily Nemkov).

    🐛 Bug Fix

    • Mask password in data_path in the system.distribution_queue. #16727 (Azat Khuzhin).
    • Fix IN operator over several columns and tuples with enabled transform_null_in setting. Fixes #15310. #16722 (Anton Popov).
    • The setting max_parallel_replicas worked incorrectly if the queried table has no sampling. This fixes #5733. #16675 (alexey-milovidov).
    • Fix optimize_read_in_order/optimize_aggregation_in_order with max_threads > 0 and expression in ORDER BY. #16637 (Azat Khuzhin).
    • 0️⃣ Calculation of DEFAULT expressions could involve name collisions (although these were very unlikely to be encountered). This fixes #9359. #16612 (alexey-milovidov).
    • Fix query_thread_log.query_duration_ms unit. #16563 (Azat Khuzhin).
    • 🛠 Fix a bug when using MySQL Master -> MySQL Slave -> ClickHouse MaterializeMySQL Engine. MaterializeMySQL is an experimental feature. #16504 (TCeason).
    • 🛠 A specially crafted argument of the round function with Decimal led to integer division by zero. This fixes #13338. #16451 (alexey-milovidov).
    • 🛠 Fix DROP TABLE for Distributed (racy with INSERT). #16409 (Azat Khuzhin).
    • 🛠 Fix processing of very large entries in replication queue. Very large entries may appear in ALTER queries if table structure is extremely large (near 1 MB). This fixes #16307. #16332 (alexey-milovidov).
    • 🛠 Fixed inconsistent behaviour where part of the returned data could be dropped because the set used to filter it had not been created. #16308 (Nikita Mikhaylov).
    • 🛠 Fix dictGet in sharding_key (and similar places, i.e. when the function context is stored permanently). #16205 (Azat Khuzhin).
    • 🛠 Fix the exception thrown in clickhouse-local when trying to execute OPTIMIZE command. Fixes #16076. #16192 (filimonov).
    • 🛠 Fix the #15780 regression: for example, indexOf([1, 2, 3], toLowCardinality(1)) was prohibited, but it should not be. #16038 (Mike).
    • 🛠 Fix a bug with the MySQL database engine. When the MySQL server used as a database engine is down, some queries raise an exception because they try to get tables from the unavailable server, even though this is unnecessary. For example, the query SELECT ... FROM system.parts should work only with MergeTree tables and should not touch the MySQL database at all. #16032 (Kruglov Pavel).
    • 0️⃣ Now exception will be thrown when ALTER MODIFY COLUMN ... DEFAULT ... has incompatible default with column type. Fixes #15854. #15858 (alesapin).
    • 🛠 Fixed IPv4CIDRToRange/IPv6CIDRToRange functions to accept const IP-column values. #15856 (vladimir-golovchenko).
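
    For readers unfamiliar with the CIDR functions in the last item, the calculation they perform can be illustrated outside ClickHouse: given an address and a prefix length, return the lowest and highest address of the containing subnet. The sketch below uses Python's standard ipaddress module; cidr_to_range is a hypothetical helper written for illustration, not ClickHouse's implementation, and it says nothing about the const-column handling that the fix addresses.

```python
import ipaddress


def cidr_to_range(ip: str, prefix_len: int) -> tuple:
    """Return the (lowest, highest) address of the subnet containing `ip`."""
    # strict=False lets us pass a host address instead of a network address.
    network = ipaddress.ip_network(f"{ip}/{prefix_len}", strict=False)
    return str(network.network_address), str(network.broadcast_address)


print(cidr_to_range("192.168.5.2", 16))  # ('192.168.0.0', '192.168.255.255')
print(cidr_to_range("2001:db8::1", 32))  # ('2001:db8::', '2001:db8:ffff:ffff:ffff:ffff:ffff:ffff')
```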

    👌 Improvement

    • 🛠 Treat INTERVAL '1 hour' as equivalent to INTERVAL 1 HOUR, to be compatible with Postgres and similar. This fixes #15637. #15978 (flynn).
    • 📜 Enable parsing enum values by their numeric ids for CSV, TSV and JSON input formats. #15685 (vivarum).
    • Better read task scheduling for JBOD architecture and MergeTree storage. New setting read_backoff_min_concurrency which serves as the lower limit to the number of reading threads. #16423 (Amos Bird).
    • ➕ Add missing support for LowCardinality in Avro format. #16521 (Mike).
    • ↪ Workaround for using S3 with an nginx server as a proxy. Nginx currently does not accept URLs with an empty path like http://domain.com?delete, but vanilla aws-sdk-cpp produces this kind of URL. This commit uses a patched aws-sdk-cpp version, which produces URLs with "/" as the path in these cases, like http://domain.com/?delete. #16814 (ianton-ru).
    • 👍 Better diagnostics on parse errors in input data. Provide row number on Cannot read all data errors. #16644 (alexey-milovidov).
    • 🛠 Make the behaviour of minMap and maxMap more desirable. It will not skip zero values in the result. Fixes #16087. #16631 (Ildus Kurbangaliev).
    • 👍 Better update of ZooKeeper configuration in runtime. #16630 (sundyli).
    • Apply the SETTINGS clause as early as possible. This allows modifying more settings in the query. This closes #3178. #16619 (alexey-milovidov).
    • The event_time_microseconds field is now stored as Decimal64, not UInt64. #16617 (Nikita Mikhaylov).
    • Now parameterized functions can be used in the APPLY column transformer. #16589 (Amos Bird).
    • 👌 Improve scheduling of background task which removes data of dropped tables in Atomic databases. Atomic databases do not create broken symlink to table data directory if table actually has no data directory. #16584 (tavplubix).
    • Subqueries in WITH section (CTE) can reference previous subqueries in WITH section by their name. #16575 (Amos Bird).
    • Add current_database into system.query_thread_log. #16558 (Azat Khuzhin).
    • 👍 Allow fetching parts that are already committed or outdated in the current instance into the detached directory. It's useful when migrating tables from another cluster and having N to 1 shards mapping. It's also consistent with the current fetchPartition implementation. #16538 (Amos Bird).
    • 🛠 Multiple improvements for RabbitMQ: Fixed bug for #16263. Also minimized event loop lifetime. Added more efficient queues setup. #16426 (Kseniia Sumarokova).
    • 🛠 Fix debug assertion in quantileDeterministic function. In previous versions it could also transfer up to twice as much data over the network, although no bug existed. This fixes #15683. #16410 (alexey-milovidov).
    • ➕ Add TablesToDropQueueSize metric. It's equal to the number of dropped tables that are waiting for background data removal. #16364 (tavplubix).
    • 👍 Better diagnostics when a client has dropped the connection. In previous versions, Attempt to read after EOF and Broken pipe exceptions were logged on the server. In the new version, an information message is logged instead: Client has dropped the connection, cancel the query. #16329 (alexey-milovidov).
    • Add total_rows/total_bytes (from system.tables) support for Set/Join table engines. #16306 (Azat Khuzhin).
    • 🔀 Now it's possible to specify PRIMARY KEY without ORDER BY for MergeTree table engines family. Closes #15591. #16284 (alesapin).
    • If there is no tmp folder in the system (chroot, misconfiguration, etc.), clickhouse-local will create a temporary subfolder in the current directory. #16280 (filimonov).
    • ➕ Add support for nested data types (like named tuple) as sub-types. Fixes #15587. #16262 (Ivan).
    • Support for database_atomic_wait_for_drop_and_detach_synchronously/NO DELAY/SYNC for DROP DATABASE. #16127 (Azat Khuzhin).
    • Add allow_nondeterministic_optimize_skip_unused_shards (to allow non-deterministic functions like rand() or dictGet() in the sharding key). #16105 (Azat Khuzhin).
    • Fix memory_profiler_step/max_untracked_memory for queries via HTTP (test included). Fix the issue that adjusting this value globally in xml config does not help either, since those settings are not applied anyway, only default (4MB) value is used. Fix query_id for the most root ThreadStatus of the http query (by initializing QueryScope after reading query_id). #16101 (Azat Khuzhin).
    • Now it's allowed to execute ALTER ... ON CLUSTER queries regardless of the <internal_replication> setting in cluster config. #16075 (alesapin).
    • 🛠 Fix rare issue when clickhouse-client may abort on exit due to loading of suggestions. This fixes #16035. #16047 (alexey-milovidov).
    • ➕ Add support of cache layout for Redis dictionaries with complex key. #15985 (Anton Popov).
    • Fix query hang (endless loop) in case of misconfiguration (connections_with_failover_max_tries set to 0). #15876 (Azat Khuzhin).
    • 🔄 Change level of some log messages from information to debug, so information messages will not appear for every query. This closes #5293. #15816 (alexey-milovidov).
    • ✂ Remove MemoryTrackingInBackground* metrics to avoid potentially misleading results. This fixes #15684. #15813 (alexey-milovidov).
    • ➕ Add reconnects to zookeeper-dump-tree tool. #15711 (alexey-milovidov).
    • 👍 Allow explicitly specifying the column list in a CREATE TABLE table AS table_function(...) query. Fixes #9249. Fixes #14214. #14295 (tavplubix).
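
    The URL rewrite described in the S3/nginx item above (an empty path replaced by "/") can be sketched with Python's standard urllib.parse module. This is only an illustration of the transformation; the actual change lives in a patched aws-sdk-cpp, which is C++, and ensure_root_path is a hypothetical helper name.

```python
from urllib.parse import urlsplit, urlunsplit


def ensure_root_path(url: str) -> str:
    """Insert "/" as the path when the URL has none, leaving other parts intact."""
    parts = urlsplit(url)
    if parts.path == "":
        # SplitResult is a named tuple, so _replace returns an updated copy.
        parts = parts._replace(path="/")
    return urlunsplit(parts)


print(ensure_root_path("http://domain.com?delete"))  # http://domain.com/?delete
```

    URLs that already carry a path pass through unchanged, so the helper is safe to apply to every request URL.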

    🐎 Performance Improvement

    • 🔀 Do not merge parts across partitions in SELECT FINAL. #15938 (Kruglov Pavel).
    • 👌 Improve performance of -OrNull and -OrDefault aggregate functions. #16661 (alexey-milovidov).
    • 👌 Improve performance of quantileMerge. In previous versions it was obnoxiously slow. This closes #1463. #16643 (alexey-milovidov).
    • 👌 Improve performance of logical functions a little. #16347 (alexey-milovidov).
    • 👌 Improved performance of merges assignment in MergeTree table engines. Shouldn't be visible for the user. #16191 (alesapin).
    • 📜 Speedup hashed/sparse_hashed dictionary loading by preallocating the hash table. #15454 (Azat Khuzhin).
    • Now the trivial count optimization becomes slightly non-trivial. Predicates that contain the exact partition expr can be optimized too. This also fixes #11092, which returned a wrong count when max_parallel_replicas > 1. #15074 (Amos Bird).

    🏗 Build/Testing/Packaging Improvement