
Welcome to Apache Hudi!

Hudi is the Streaming Data Lake Platform!

Hudi Data Lakes

Hudi brings stream processing to data lakes, providing fresh data while being an order of magnitude more efficient than traditional batch processing.

Hudi Features

Upserts, Deletes with fast, pluggable indexing.
Incremental queries, Record level change streams.
Transactions, Rollbacks, Concurrency Control.
SQL Read/Writes from Spark, Presto, Trino, Hive & more.
Automatic file sizing, data clustering, compactions, cleaning.
Streaming ingestion, Built-in CDC sources & tools.
Built-in metadata tracking for scalable storage access.
Backwards compatible schema evolution and enforcement.