Use Amazon Athena with Spark SQL for your open-source transactional table formats | January 24, 2024 by Pathik Shah, Raj Devnath | beginner, querying, clustering, compaction, apache iceberg, aws, delta lake
Data Engineering: Bootstrapping Data lake with Apache Hudi | January 20, 2024 by Krishna Prasad | beginner, etl, aws, apache spark
Learn How to Move Data From MongoDB to Apache Hudi Using PySpark | January 20, 2024 by Soumil Shah | beginner, mongodb, apache spark
Deleting Items from Apache Hudi using Delta Streamer in UPSERT Mode with Kafka Avro Messages | January 18, 2024 by Soumil Shah | beginner, hudi streamer, apache kafka, apache avro, dml
Enforce fine-grained access control on Open Table Formats via Amazon EMR integrated with AWS Lake Formation | January 17, 2024 by Raymond Lai, Aditya Shah, Bin Wang, and Melody Yang | aws, access control
In-House Data Lake with CDC Processing, Hudi, Docker | January 11, 2024 by Rahul | docker, cdc, apache kafka, debezium, apache spark, aws
Build a federated query solution with Apache Doris, Apache Flink, and Apache Hudi | January 2, 2024 by Apache Doris | beginner, apache doris, apache flink
From Data lake to Microservices: Unleashing the Power of Apache Hudi's Record Level Index with FastAPI and Spark Connect | January 1, 2024 by Soumil Shah | beginner, apache spark, indexing, dml, fastapi