IBM Cloud Object Storage Filesystem

In this page, we explain how to get your Hudi spark job to store into IBM Cloud Object Storage.

IBM COS configs

There are two configurations required for Hudi-IBM Cloud Object Storage compatibility:

  • Adding IBM COS Credentials for Hudi
  • Adding required Jars to classpath

IBM Cloud Object Storage Credentials

Simplest way to use Hudi with IBM Cloud Object Storage, is to configure your SparkSession or SparkContext with IBM Cloud Object Storage credentials using Stocator storage connector for Spark. Hudi will automatically pick this up and talk to IBM Cloud Object Storage.

Alternatively, add the required configs in your core-site.xml from where Hudi can fetch them. Replace the fs.defaultFS with your IBM Cloud Object Storage bucket name and Hudi should be able to read/write from the bucket.

For example, using HMAC keys and service name myCOS:

  <property>
      <name>fs.defaultFS</name>
      <value>cos://myBucket.myCOS</value>
  </property>

  <property>
      <name>fs.cos.flat.list</name>
      <value>true</value>
  </property>

  <property>
	  <name>fs.stocator.scheme.list</name>
	  <value>cos</value>
  </property>

  <property>
	  <name>fs.cos.impl</name>
	  <value>com.ibm.stocator.fs.ObjectStoreFileSystem</value>
  </property>

  <property>
	  <name>fs.stocator.cos.impl</name>
	  <value>com.ibm.stocator.fs.cos.COSAPIClient</value>
  </property>

  <property>
	  <name>fs.stocator.cos.scheme</name>
	  <value>cos</value>
  </property>

  <property>
	  <name>fs.cos.myCos.access.key</name>
	  <value>ACCESS KEY</value>
  </property>

  <property>
	  <name>fs.cos.myCos.endpoint</name>
	  <value>http://s3-api.us-geo.objectstorage.softlayer.net</value>
  </property>

  <property>
	  <name>fs.cos.myCos.secret.key</name>
	  <value>SECRET KEY</value>
  </property>

For more options see Stocator documentation.

IBM Cloud Object Storage Libs

IBM Cloud Object Storage hadoop libraries to add to our classpath

  • com.ibm.stocator:stocator:1.1.3
Back to top ↑