How do I Ingest Extremely Small Files into Hudi Data lake with Glue Incremental data processing
February 7, 2023 by Soumil Shah
guide, small files, incremental-processing, pyspark, aws glue, amazon s3, apache hudi

Writing data quality and validation scripts for a Hudi data lake with AWS Glue and pydeequ | Hands on Lab
January 23, 2023 by Soumil Shah
guide, data quality, validation, pydeequ, python, aws glue, apache hudi

How to detect and Mask PII data in Apache Hudi Data Lake | Hands on Lab
January 21, 2023 by Soumil Shah
guide, mask pii, hipaa, gdpr, masking, compliance, amazon s3, aws glue, apache hudi, amazon athena

How do I identify Schema Changes in Hudi Tables and Send Email Alert when New Column added/removed
January 20, 2023 by Soumil Shah
guide, schema changes, schema evolution, alerting, amazon s3, aws glue, apache hudi, amazon athena

Leverage Apache Hudi incremental query to process new & updated data | Hudi Labs
January 17, 2023 by Soumil Shah
guide, incremental query, aws glue, apache hudi

Leverage Apache Hudi upsert to remove duplicates on a data lake | Hudi Labs
January 17, 2023 by Soumil Shah
guide, duplicates, de-duplicate, upsert, aws glue, apache hudi

Streaming ETL using Apache Flink joining multiple Kinesis streams | Demo
January 1, 2023 by Soumil Shah
guide, streaming ingestion, streaming etl, joins, amazon kinesis, apache flink, aws glue, apache hudi

Transaction Hudi Data Lake with Streaming ETL from Multiple Kinesis Streams & Joining using Flink
January 1, 2023 by Soumil Shah
guide, streaming ingestion, streaming etl, joins, amazon kinesis, apache flink, aws glue, apache hudi

Bring Data from Source using Debezium with CDC into Kafka&S3Sink &Build Hudi Datalake | Hands on lab
December 27, 2022 by Soumil Shah
guide, postgresql, mysql, debezium, incremental etl, apache kafka, apache hudi, aws glue, amazon athena, postgres

Apache Hudi with DBT Hands on Lab.Transform Raw Hudi tables with DBT and Glue Interactive Session
December 23, 2022 by Soumil Shah
guide, dbt, aws glue, apache hudi