Archive

2026

2025

2024

January 1 - Data Lake to Microservices: Apache Hudi's Record Index, FastAPI, Spark Connect with Swagger UI
January 6 - Dynamic Delta Streamer Jobs with JDBC Puller for Postgres | Bring all Tables from particular Schema
January 6 - Dynamic Delta Streamer Jobs with JDBC Puller for Postgres | Bring all Tables from particular Schema - Full Video
January 13 - Setup HUDI with AWS Glue and MINIO locally using Docker Container in Minutes
January 17 - How to Delete Items from Hudi using Delta Streamer operating in UPSERT Mode with Kafka Avro MSG #12
January 21 - Learn How to Move Data From MongoDB to Apache Hudi Using PySpark
February 3 - Apache Hudi Table Services | Offline Compaction | HoodieCompactor | Hands on labs
February 3 - Apache Hudi Table Services | Export Services | HoodieSnapshotExporter | Hands on labs
February 7 - Building an Open Source Data Lake House with Hudi, Postgres Hive Metastore, Minio, and StarRocks
February 10 - Data Ingestion to Visualization: Hudi + MinIO + StarRocks + HiveMetaStore + Apache SuperSet Hands on Guide
February 17 - Learn How to Integrate Hudi Spark job with Airflow and MinIO | Hands on Labs
February 18 - Build Incremental ETL pipeline with Hudi and Airflow and MinIO
February 23 - Getting Started with Open Data lineage | Marquez Project | Apache Hudi Spark jobs
February 27 - Learn How you can run DeltaStreamer Running on AWS Glue with Hudi 0.14 Step by Step Guide
March 1 - How to Query Apache Hudi tables from Glue Interactive Notebook for AdHoc Analysis
March 11 - Getting Started Tutorial: Building a Data Lakehouse With StarRocks, Apache Hudi, and MinIO
March 12 - Managing Updates & Deletes in Glue Hudi Spark Jobs with CDC Data
March 18 - Mastering Incremental ETL with DeltaStreamer and SQL-Based Transformer
March 20 - How to perform Backfilling jobs with Hudi DeltaStreamer and Spark SQL using SqlSource Class
March 29 - Open Lakehouse Evolution: Powering the Future with YugabyteDB & Apache Hudi | Episode 102
March 30 - Building DataLakeHouse: XTable, MinIO, StarRocks, DeltaStreamer - Interoperating Hudi, Iceberg, Delta
April 3 - Reading Data from Hudi INC & Joining with Delta Tables using HudiStreamer & SQL-Based Transformer
April 6 - Build Universal Data lake with Postgres + Debezium + Kafka + DeltaStreamer + Minio + HiveMetastore + Trino
April 10 - Build Universal Data lake with MySQL + Debezium + Kafka + DeltaStreamer + Minio + HiveMetastore + Trino
April 22 - Hudi with Kyuubi, a distributed & multi-tenant gateway, to provide serverless SQL on lakehouses
May 4 - Learn How to Display Data From Hudi Tables to your Frontend with Flask and Daft (NO SPARK NEEDED)
May 8 - How to read Hudi Dataset Using AWS Glue Ray and Glue Notebooks (without Spark)
May 12 - Unleashing the Power of Serverless: Serving Gold Hudi Tables with AWS Lambda
May 18 - Learn How to use Cloudwatch metrics with Hudi AWS Glue Jobs
May 20 - DeltaStreamer with incremental ETL and Broadcast Joins for Faster ETL
May 22 - Hudi Streamer implementing Slowly Changing Dimension Type 2 and Query Real Time Trino | Hands on
May 22 - Demo Video : Hudi Delta Streamer Implementing Slowly Changing Dimension and Query that using Trino
May 23 - Build Hudi Date Dimension in Minutes with Spark SQL Minio and Query with Trino
May 25 - Learn How to Ingest data from pulsar Topic into Hudi with DeltaStreamer | Hands on Labs
June 5 - Multiple Spark Writers to Hudi tables | Hands on Labs
June 12 - Hudi Cleaning Process | hoodie.keep.min.commits and hoodie.keep.max.commits Explained
June 15 - How we Utilized Hudi's Time Travel Query to Investigate Bid and Spend | Going Back in Time with Hudi
June 16 - Hudi with Spark SQL for Beginners | Insert| Updates | Delete | incremental Query | Stored procedures
June 18 - Learn How to Ingest XML files with AWS Glue into Hudi Datalakes | Step by Step guide
June 21 - 4 Different Ways to fetch Apache Hudi Commit time in Python and PySpark
September 1 - How to Consume Apache Hudi Tables in Snowflake, Iceberg, and Athena | Hands-On Labs
September 26 - Create Apache Hudi table using Glue(in catalog) by reading streaming data from AWS Kinesis
October 6 - Learn How to Read Hudi Tables on S3 Locally in Your PySpark Job | Essential Packages You Need to Use
October 22 - Practice of building a lakehouse based on Apache Hudi at Kuaishou Inc
November 17 - Create Data Lake using AWS Glue as a beginner
December 25 - Learn About Secondary Indexes in Apache Hudi 1.0.0 | Hands-On Labs

2023

2022