Hands-On Guide: Reading Data from Hudi Tables Incrementally, Joining with Delta Tables using HudiStreamer and SQL-Based Transformer | April 3, 2024 | by Soumil Shah | blog, apache hudi, deltastreamer, hudi streamer, delta, sql transformer, linkedin
Record Level Indexing in Apache Hudi Delivers 70% Faster Point Lookups | March 30, 2024 | by Soumil Shah | blog, apache hudi, record level index, performance, linkedin
Options on Kafka sink to open table Formats: Apache Iceberg and Apache Hudi | March 23, 2024 | by Albert Wong | blog, apache hudi, apache iceberg, apache Kafka, kafka connect, starrocks, devgenius
Cost Optimization Strategies for scalable Data Lakehouse | March 22, 2024 | by Suresh Hasundi | blog, apache hudi, amazon s3, amazon emr, apache spark, lakehouse, cost optimization, halodoc
Modern Datalakes with Hudi, MinIO, and HMS | March 14, 2024 | by Brenna Buuck | blog, apache hudi, minio, hms, hive metastore, min
Navigating the Future: The Evolutionary Journey of Upstox’s Data Platform | March 10, 2024 | by Manish Gaurav | use-case, apache hudi, upstox-engineering
Apache Hudi: From Zero To One (9/10) | March 5, 2024 | by Shiyan Xu | blog, apache hudi, deltastreamer, hudi streamer, table service, datumagic
Building Data Lakes on AWS with Kafka Connect, Debezium, Apicurio Registry, and Apache Hudi | February 27, 2024 | by Gary A. Stafford | blog, apache hudi, itnext, beginner, apache kafka, kafka connect, debezium, apicurio registry, aws, apache spark, deltastreamer, hudi streamer, amazon rds, amazon msk, amazon eks, aws glue, amazon emr
How a POC became a production-ready Hudi data lakehouse through close team collaboration | February 12, 2024 | by Xiaoxiao Rey and Hussein Awala | use-case, apache hudi, leboncoin-tech-blog, beginner, delete, gdpr deletion, upsert
Building an Open Source Data Lake House with Hudi, Postgres Hive Metastore, Minio, and StarRocks | February 6, 2024 | by Soumil Shah | blog, apache hudi, linkedin, beginner, apache spark, apache hive, hive metastore, minio, starrocks, docker, python, postgres, postgresql
Apache Hudi: Managing Partition on a petabyte-scale table | February 4, 2024 | by Krishna Prasad | blog, apache hudi, medium, intermediate, partition, aws glue, apache spark, aws s3
Leverage Partition Paths of your data lake tables to Optimize Data Retrieval Costs on the cloud | January 30, 2024 | by Krishna Prasad | blog, apache hudi, medium, intermediate, aws glue, cost, apache spark, partition