The design and latest progress of the Apache Hudi and Apache Flink integration.
Hudi now supports ingesting multiple tables in a single run. This blog gives a detailed explanation of how to do so using HoodieMultiTableDeltaStreamer.java
Mechanisms for executing compaction jobs in Hudi asynchronously
Migrating a large parquet table to Apache Hudi without having to rewrite the entire dataset.
How Apache Hudi enables incremental data processing.
Introducing support for reporting Hudi metrics via the Datadog HTTP API
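As a rough sketch, the Datadog reporter is enabled through Hudi writer configs along these lines (property names follow the Hudi metrics configuration; the API key value is a placeholder):

```properties
# Turn on metrics reporting and select the Datadog reporter
hoodie.metrics.on=true
hoodie.metrics.reporter.type=DATADOG
# Datadog site and API key (placeholder value)
hoodie.metrics.datadog.api.site=US
hoodie.metrics.datadog.api.key=<your-api-key>
# Optional prefix applied to all reported metric names
hoodie.metrics.datadog.metric.prefix=hudi
```

See the linked post for the full set of options, including supplying the API key via a supplier class instead of plain text.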
Integrating Hudi’s real-time and read-optimized query capabilities into Apache Zeppelin’s notebooks
Learn how to copy or export a Hudi dataset in various formats.
In this blog, we will build an end-to-end solution for capturing changes from a MySQL instance running on AWS RDS to a Hudi table on S3, using capabilities in the Hudi 0.5.1 release.
Record-level deletes are supported in Hudi as of the 0.5.1 release. This blog is a “how to” on deleting records in Hudi.
Learn how to ingest changes from a Hudi dataset using Sqoop/Hudi
How to manually register a Hudi dataset in Hive using beeline
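A minimal sketch of such a manual registration, run from beeline: the table name, columns, and location are hypothetical, while the key piece is pointing Hive at Hudi's input format so queries read the correct file slices:

```sql
-- Hypothetical table; the essential part is HoodieParquetInputFormat
CREATE EXTERNAL TABLE trips (
  _hoodie_commit_time string,
  trip_id string,
  fare double
)
PARTITIONED BY (datestr string)
ROW FORMAT SERDE 'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe'
STORED AS
  INPUTFORMAT 'org.apache.hudi.hadoop.HoodieParquetInputFormat'
  OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat'
LOCATION 'hdfs:///path/to/trips';
```

Partitions then need to be added (e.g. via ALTER TABLE ... ADD PARTITION); the linked post walks through the full flow.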
In the coming weeks, we will be moving into our new home on the Apache Incubator.
We will be presenting Hudi and general concepts around how incremental processing works at Uber. Catch our talk “Incremental Processing on Hadoop At Uber”.