JuiceFS
On this page, we explain how to use Hudi with JuiceFS.
JuiceFS configs
JuiceFS is a high-performance distributed file system. Data stored in JuiceFS is persisted in object storage (e.g. Amazon S3), while the corresponding metadata can be persisted in various database engines such as Redis, MySQL, and TiKV, depending on the use case.
There are three configurations required for Hudi-JuiceFS compatibility:
- Creating JuiceFS file system
- Adding JuiceFS configuration for Hudi
- Adding required JAR to classpath
Creating JuiceFS file system
JuiceFS supports multiple metadata engines such as Redis, MySQL, SQLite, and TiKV. It also supports almost all object storage services as data storage, e.g. Amazon S3, Google Cloud Storage, and Azure Blob Storage.
The following example uses Redis as the "Metadata Engine" and Amazon S3 as the "Data Storage" in a Linux environment.
Download JuiceFS client
$ JFS_LATEST_TAG=$(curl -s https://api.github.com/repos/juicedata/juicefs/releases/latest | grep 'tag_name' | cut -d '"' -f 4 | tr -d 'v')
$ wget "https://github.com/juicedata/juicefs/releases/download/v${JFS_LATEST_TAG}/juicefs-${JFS_LATEST_TAG}-linux-amd64.tar.gz"
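To see what the tag-extraction pipeline in the commands above does without calling the GitHub API, here is a minimal offline sketch. The sample response line and the tag `v1.0.0` are assumptions made up for illustration, not an actual release; the variable names `SAMPLE_RESPONSE`, `JFS_TAG`, and `JFS_URL` are likewise hypothetical.

```shell
# Hypothetical one-line excerpt of the GitHub API JSON response.
# The tag "v1.0.0" is an assumed value, used only for illustration.
SAMPLE_RESPONSE='  "tag_name": "v1.0.0",'

# Same pipeline as the download step: keep the tag_name line,
# take the 4th double-quote-delimited field, then delete the "v" characters.
JFS_TAG=$(printf '%s\n' "$SAMPLE_RESPONSE" | grep 'tag_name' | cut -d '"' -f 4 | tr -d 'v')

# Assemble the download URL the same way the wget command does.
JFS_URL="https://github.com/juicedata/juicefs/releases/download/v${JFS_TAG}/juicefs-${JFS_TAG}-linux-amd64.tar.gz"

echo "$JFS_TAG"
echo "$JFS_URL"
```

Because the URL needs the tag both with and without the `v` prefix, the pipeline strips the prefix once and the URL template adds it back where required.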