Hudi by Example

· One min read

I was fortunate to have the opportunity to work on the adoption of Apache Hudi at my current workplace.

I spent a great deal of time understanding, evaluating, deploying, and operating an Apache Hudi based data lakehouse platform.

The Apache Hudi project has become feature rich and stable over the past year with great releases (https://hudi.apache.org/releases/release-0.12.0/), but I had a lot of tough debugging experiences initially.

Sometimes I had to dive into the source code to find fixes. The Hudi community members were really helpful, though.

I have a pending Kubernetes based lock implementation that I still need to find time to test and submit upstream.

I tried to build UIs for easy onboarding of tables and blogged about it.

I presented at the community meeting.

I even thought of creating a 'table-service'.

So, to help out other data engineers, I am planning to create a tutorial series on Hudi based on my temporal locality of knowledge.

Based on the responses I got to my inquiry post, it could be useful to a few folks.

Keep checking https://www.ekalavya.dev/how-i-integrated-airbyte-and-apache-hudi-again/ for updates.

For a managed cloud data warehouse offering on top of Apache Hudi, visit https://www.onehouse.ai/