Develop Incremental Pipeline with CDC from Hudi to Aurora Postgres | Demo Video (March 4, 2023, by Soumil Shah). Tags: guide, amazon s3, aws glue, amazon aurora, postgres, cdc, incremental query, incremental etl, apache hudi
Use Glue 4.0 to take regular save points for your Hudi tables for backup or disaster Recovery (February 22, 2023, by Soumil Shah). Tags: guide, backup, disaster recovery, savepoint, restore, aws glue, apache hudi
How do I Ingest Extremely Small Files into Hudi Data lake with Glue Incremental data processing (February 7, 2023, by Soumil Shah). Tags: guide, small files, incremental-processing, pyspark, aws glue, amazon s3, apache hudi
Writing data quality and validation scripts for a Hudi data lake with AWS Glue and pydeequ | Hands on Lab (January 23, 2023, by Soumil Shah). Tags: guide, data quality, validation, pydeequ, python, aws glue, apache hudi
How to detect and Mask PII data in Apache Hudi Data Lake | Hands on Lab (January 21, 2023, by Soumil Shah). Tags: guide, mask pii, hipaa, gdpr, masking, compliance, amazon s3, aws glue, apache hudi, amazon athena
How do I identify Schema Changes in Hudi Tables and Send Email Alert when New Column added/removed (January 20, 2023, by Soumil Shah). Tags: guide, schema changes, schema evolution, alerting, amazon s3, aws glue, apache hudi, amazon athena
Leverage Apache Hudi incremental query to process new & updated data | Hudi Labs (January 17, 2023, by Soumil Shah). Tags: guide, incremental query, aws glue, apache hudi
Leverage Apache Hudi upsert to remove duplicates on a data lake | Hudi Labs (January 17, 2023, by Soumil Shah). Tags: guide, duplicates, de-duplicate, upsert, aws glue, apache hudi
Streaming ETL using Apache Flink joining multiple Kinesis streams | Demo (January 1, 2023, by Soumil Shah). Tags: guide, streaming ingestion, streaming etl, joins, amazon kinesis, apache flink, aws glue, apache hudi
Transaction Hudi Data Lake with Streaming ETL from Multiple Kinesis Streams & Joining using Flink (January 1, 2023, by Soumil Shah). Tags: guide, streaming ingestion, streaming etl, joins, amazon kinesis, apache flink, aws glue, apache hudi