EMR Serverless Made Easy: Submitting Hive SQL Queries for Beginners with NYC Taxi Dataset - May 13, 2023 - by Soumil Shah - guide, apache hudi, apache hive, amazon emr, emr serverless, hive sql, hive metastore
EMR Serverless for Beginners: | Ingest Data incrementally | Submit Spark Job with EMR-CLI | Data lake - May 11, 2023 - by Soumil Shah - guide, apache hudi, amazon emr, emr Serverless, apache spark, data lake, incremental data processing
Maximizing Efficiency DataLake(Hudi) Glue ETL Jobs with Templated Approach & Serverless Architecture - May 7, 2023 - by Soumil Shah - guide, apache hudi, aws glue, etl, templated architecture, serverless
How to Build Your Own Version of AWS Glue Bookmark to get Only New Incremental Files - May 6, 2023 - by Soumil Shah - guide, apache hudi, aws glue, incremental processing, glue bookmarks
Build, deploy, and run Spark jobs on Amazon EMR with the open-source EMR CLI tool - May 3, 2023 - by Soumil Shah - guide, amazon emr cli, apache spark, amazon emr serverless, apache hudi, amazon emr, command line interface
Mastering Slowly Changing Dimension with Hudi: A Step-by-Step Guide to Efficient Data Management - May 3, 2023 - by Soumil Shah - guide, apache hudi, data management, dimension fields, updates, data upsert
Building a Scalable and Resilient Streaming ETL Pipeline with Hudi's Incremental Processing #1 - May 1, 2023 - by Soumil Shah - guide, streaming, streaming etl, incremental processing, joins, near real-time analytics, apache hudi
Efficiently Managing Ride & Late Arriving Tips Data with Incremental ETL using Apache Hudi: Hands On - April 29, 2023 - by Soumil Shah - guide, late arriving data, incremental etl, upsert, apache hudi
From Raw Data to Insights: Building a Lake House with Hudi and Star Schema | Step by Step Guide - April 26, 2023 - by Soumil Shah - guide, lakehouse, star schema, apache hudi
Joining Hudi Raw Tables for Powerful Data Analysis with Spark SQL - April 25, 2023 - by Soumil Shah - guide, joins, spark sql, apache hudi