Deep Dive Into Hudi’s Indexing Subsystem (Part 1 of 2) | October 29, 2025 | by Shiyan Xu | Tags: hudi, indexing, data lakehouse, data skipping
Partition Stats: Enhancing Column Stats in Hudi 1.0 | October 22, 2025 | by Aditya Goenka and Shiyan Xu | Tags: hudi, indexing, data lakehouse, data skipping
Introducing Secondary Index in Apache Hudi Lakehouse Platform | April 2, 2025 | by Dipankar Mazumdar, Aditya Goenka | Tags: Apache Hudi, Indexing, Performance
Record Level Index: Hudi's blazing fast indexing for large-scale datasets | November 1, 2023 | by Shiyan Xu and Sivabalan Narayanan | Tags: design, indexing, metadata, apache hudi, blog
UPSERT Performance Evaluation of Hudi 0.14 and Spark 3.4.1: Record Level Index vs. Global Bloom & Global Simple Indexes | October 29, 2023 | by Soumil Shah | Tags: linkedin, apache hudi, querying, indexing, performance
Get started with Apache Hudi using AWS Glue by implementing key design concepts – Part 1 | October 17, 2023 | by Srinivas Kandi and Ravi Itha | Tags: aws glue, apache hudi, how-to, amazon, design, upserts, bulk insert, indexing
Text-Based Search: From Elastic Search to Vector Search | June 3, 2023 | by Kaushik Muniandi | Tags: blog, vector search, indexing, bloom, medium
Speed up your write latencies using Bucket Index in Apache Hudi | April 7, 2023 | by Sivabalan Narayanan | Tags: how-to, indexing, medium
What, Why and How : Apache Hudi’s Bloom Index | October 8, 2022 | by Sivabalan Narayanan | Tags: how-to, design, bloom, indexing, medium
Hudi’s Column Stats Index and Data Skipping feature help speed up queries by an orders of magnitude! | June 9, 2022 | by Alexey Kudinkin | Tags: design, indexing, data skipping, onehouse