
15 posts tagged with "apache spark"

Apache Hudi, Spark and Minio: Hands-on Lab in Docker

Hands-on with Apache Hudi and Spark

Building Data Lakes on AWS with Kafka Connect, Debezium, Apicurio Registry, and Apache Hudi

Building an Open Source Data Lake House with Hudi, Postgres Hive Metastore, Minio, and StarRocks

Apache Hudi: Managing Partition on a petabyte-scale table

Leverage Partition Paths of your data lake tables to Optimize Data Retrieval Costs on the cloud

Data Engineering: Bootstrapping Data lake with Apache Hudi

Learn How to Move Data From MongoDB to Apache Hudi Using PySpark

In-House Data Lake with CDC Processing, Hudi, Docker

Introduction to Apache Hudi
