Introduction
Databricks has rolled out Runtime 18.1 (Beta), with enhancements across streaming, Delta Lake, SQL, geospatial processing, and performance, plus an upgrade to Apache Spark 4.1.0. The release builds on 18.0 and adds capabilities aimed at making pipelines faster and more reliable. Below is a breakdown of what is new and why it matters.
Key New Features & Improvements
Auto Loader Enhancements
Auto Loader now uses file events by default when available, reducing directory listing costs and improving latency. You can still override behavior using:
useIncrementalListing
useNotifications
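As a minimal sketch of how the two overrides above might be passed, the helper below assembles an Auto Loader options dictionary in plain Python. The helper name `autoloader_options` and its parameters are hypothetical; the keys use the `cloudFiles.` prefix under which these options are documented, and the values shown are assumptions to check against the Auto Loader reference for your runtime.

```python
def autoloader_options(file_format, use_notifications=None, use_incremental_listing=None):
    """Build an options dict for an Auto Loader stream (hypothetical helper)."""
    options = {"cloudFiles.format": file_format}
    # Only set an override when the caller asks for one, so the runtime
    # default (file events when available) stays in effect otherwise.
    if use_notifications is not None:
        options["cloudFiles.useNotifications"] = str(use_notifications).lower()
    if use_incremental_listing is not None:
        options["cloudFiles.useIncrementalListing"] = str(use_incremental_listing).lower()
    return options


opts = autoloader_options("json", use_notifications=True)
# On a cluster this would typically be passed along as (the path is a placeholder):
# spark.readStream.format("cloudFiles").options(**opts).load("/path/to/landing")
```

Leaving both arguments unset returns only the format key, which keeps the new default behavior.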
Or disable file events with:
Delta Lake & Unity Catalog Improvements
Optimized Writes for CRTAS
Partitioned Unity Catalog tables created via CREATE OR REPLACE TABLE AS SELECT (CRTAS) now use optimized writes automatically, producing fewer, larger files.
Schema Evolution with INSERT
The new WITH SCHEMA EVOLUTION clause allows automatic schema evolution during:
INSERT INTO
INSERT OVERWRITE
INSERT INTO … REPLACE
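To make the idea of schema evolution on insert concrete, here is a conceptual model in plain Python. It assumes the simplest case, where incoming rows carry a column the target table lacks. It is not Delta Lake's implementation, and the function name and data layout are hypothetical.

```python
def insert_with_schema_evolution(schema, rows, new_rows):
    """Append new_rows to rows, widening the schema with any unseen columns.

    schema: list of column names; rows and new_rows: lists of dicts.
    Returns the evolved schema and the combined rows.
    """
    evolved = list(schema)
    for row in new_rows:
        for column in row:
            if column not in evolved:
                evolved.append(column)  # new column is added at the end
    # Existing rows get None for columns they never had.
    combined = [{c: r.get(c) for c in evolved} for r in rows + new_rows]
    return evolved, combined


schema, table = insert_with_schema_evolution(
    ["id", "name"],
    [{"id": 1, "name": "a"}],
    [{"id": 2, "name": "b", "country": "NZ"}],
)
```

Without evolution, an insert like this would normally be rejected because the incoming column does not exist in the target.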
It handles:
Delta Sharing
Delta Sharing now supports multi‑statement transactions for shared tables using pre‑signed URLs or cloud tokens.
SQL & Scripting Enhancements
New SQL Functions
parse_timestamp: parses timestamps against multiple patterns and runs in Photon for speed
Approximate top‑k sketch functions:
approx_top_k_accumulate
approx_top_k_combine
approx_top_k_estimate
Tuple sketch functions for distinct counting and key‑summary aggregation
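The accumulate, combine, and estimate naming suggests a three-phase pattern: build a small summary per batch, merge summaries, then read off the top items. The sketch below illustrates that pattern in plain Python with a Misra-Gries style frequent-items summary. This is a stand-in technique chosen for illustration, not the algorithm or signatures used by the Databricks functions, and the `capacity` parameter is an assumption.

```python
from collections import Counter


def accumulate(values, capacity):
    """Summarize one batch, tracking at most `capacity` distinct items."""
    counts = {}
    for value in values:
        if value in counts:
            counts[value] += 1
        elif len(counts) < capacity:
            counts[value] = 1
        else:
            # Summary is full: decrement every counter and drop the zeros.
            for key in list(counts):
                counts[key] -= 1
                if counts[key] == 0:
                    del counts[key]
    return counts


def combine(left, right, capacity):
    """Merge two summaries, trimming back to `capacity` items if needed."""
    merged = Counter(left)
    merged.update(right)
    if len(merged) > capacity:
        # Subtract the (capacity + 1)-th largest count and keep what stays positive.
        threshold = sorted(merged.values(), reverse=True)[capacity]
        return {k: c - threshold for k, c in merged.items() if c > threshold}
    return dict(merged)


def estimate(summary, k):
    """Return the k items with the highest (approximate) counts."""
    return sorted(summary.items(), key=lambda kv: (-kv[1], kv[0]))[:k]


batch_1 = accumulate(["a"] * 5 + ["b"] * 3 + ["c"], capacity=10)
batch_2 = accumulate(["a"] * 4 + ["b"] + ["d"] * 2, capacity=10)
top_2 = estimate(combine(batch_1, batch_2, capacity=10), k=2)
```

Because summaries can be merged, each partition or time window can be summarized independently and combined later, which is the usual motivation for sketch-style aggregates.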
SQL Cursor Support
Compound SQL statements now support:
DECLARE CURSOR
OPEN
FETCH
CLOSE
This enables row‑by‑row processing.
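For readers unfamiliar with cursors, the following sketch shows the same declare, open, fetch, close life cycle using Python's standard `sqlite3` module. It is an analogy only: it does not use Databricks SQL scripting syntax, and the `orders` table and `running_totals` function are hypothetical.

```python
import sqlite3


def running_totals(rows):
    """Compute a running total by fetching one row at a time."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER, amount INTEGER)")
    conn.executemany("INSERT INTO orders VALUES (?, ?)", rows)

    cursor = conn.cursor()  # roughly DECLARE CURSOR
    cursor.execute("SELECT id, amount FROM orders ORDER BY id")  # roughly OPEN

    totals, total = [], 0
    while True:
        row = cursor.fetchone()  # roughly FETCH
        if row is None:
            break  # no more rows
        total += row[1]
        totals.append((row[0], total))

    cursor.close()  # roughly CLOSE
    conn.close()
    return totals


result = running_totals([(1, 10), (2, 5), (3, 7)])
```

Row-by-row logic like this is usually slower than a set-based query, so cursors tend to fit best where each step depends on the outcome of the previous one.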
Behavioral Changes
FILTER clause now works with MEASURE aggregate functions
Timestamp partitions now use Spark session timezone instead of JVM timezone
DESCRIBE FLOW is now a reserved keyword
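The timezone change matters because the same instant can fall on different calendar dates depending on the zone used to render it. The sketch below shows this with Python's standard library. The two fixed offsets stand in for a JVM timezone and a session timezone and are assumptions for illustration; the code does not reproduce how Spark derives partition values.

```python
from datetime import datetime, timedelta, timezone


def partition_date(instant, tz):
    """Render an instant as a date string in the given timezone."""
    return instant.astimezone(tz).date().isoformat()


instant = datetime(2025, 1, 1, 2, 30, tzinfo=timezone.utc)

jvm_tz = timezone.utc  # assumed JVM default
session_tz = timezone(timedelta(hours=-8))  # assumed session setting

date_under_jvm_tz = partition_date(instant, jvm_tz)
date_under_session_tz = partition_date(instant, session_tz)
```

Here the two zones place the same instant on different dates, so jobs that relied on the JVM default may see rows land in a different date partition once the session timezone applies.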
Streaming Improvements
Geospatial Performance Boost
Geospatial Boolean set operations now use a new, faster implementation, with minor precision differences beyond 15 decimal places.
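If regression tests compare geometry coordinates for exact equality, differences that far down in the decimals could make them fail. One hedged way to handle this is to compare with a tolerance, as in the sketch below. The function name and the `1e-12` tolerance are assumptions, not values recommended by Databricks.

```python
import math


def coords_match(expected, actual, abs_tol=1e-12):
    """Compare two lists of (x, y) points within an absolute tolerance."""
    if len(expected) != len(actual):
        return False
    return all(
        math.isclose(ex, ax, rel_tol=0.0, abs_tol=abs_tol)
        and math.isclose(ey, ay, rel_tol=0.0, abs_tol=abs_tol)
        for (ex, ey), (ax, ay) in zip(expected, actual)
    )


old_result = [(0.1234567890123456, 0.5)]
new_result = [(0.1234567890123457, 0.5)]  # differs only in the 16th decimal

exactly_equal = old_result == new_result
within_tolerance = coords_match(old_result, new_result)
```

An absolute tolerance is used here because the stated difference is expressed in decimal places; a relative tolerance may suit data with very large coordinate values better.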
DataFrame & Compute Enhancements
Cloud & External System Improvements
Apache Spark 4.1.0 Included
Databricks Runtime 18.1 ships with Apache Spark 4.1.0, bringing:
Major performance fixes
Improved pandas interoperability
New geospatial type support
Arrow and Pandas UDF improvements
Streaming enhancements
Stability and error‑handling improvements
Summary
Databricks Runtime 18.1 (Beta) builds on 18.0 with improvements across Auto Loader, Delta Lake, Unity Catalog, SQL, streaming, geospatial processing, and compute behavior, while upgrading to Apache Spark 4.1.0. The release focuses on performance optimization, schema flexibility, transaction reliability, and improved interoperability across cloud systems and analytics workloads.