This initialization action installs Hive HCatalog on a Google Cloud Dataproc cluster. The script will also configure Pig to use HCatalog.
You can use this initialization action to create a new Cloud Dataproc cluster with HCatalog installed by doing the following.
- Use the gcloud command to create a new cluster with this initialization action.

      REGION=<region>
      CLUSTER_NAME=<cluster_name>
      gcloud dataproc clusters create ${CLUSTER_NAME} \
          --region ${REGION} \
          --initialization-actions gs://goog-dataproc-initialization-actions-${REGION}/hive-hcatalog/hive-hcatalog.sh
- Once the cluster has been created, HCatalog should be installed and configured for use with Pig.
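
One way to check the result is to submit a small Pig job that reads a Hive table through HCatalog. The following is a minimal sketch, not part of the initialization action itself. It assumes a Hive table already exists on the cluster; the table name `default.my_table` and the default region and cluster name are hypothetical placeholders, so replace them with your own values.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Assumed values; replace with your own region and cluster name.
REGION="${REGION:-us-central1}"
CLUSTER_NAME="${CLUSTER_NAME:-my-cluster}"
# Hypothetical Hive table; it must already exist in the metastore.
HIVE_TABLE="${HIVE_TABLE:-default.my_table}"

# Pig statements that load the table through HCatalog's loader
# and print the first ten rows.
PIG_QUERY="rows = LOAD '${HIVE_TABLE}' USING org.apache.hive.hcatalog.pig.HCatLoader(); first_rows = LIMIT rows 10; DUMP first_rows;"

# Submit the statements as a Pig job if gcloud is available on this machine.
if command -v gcloud >/dev/null 2>&1; then
  gcloud dataproc jobs submit pig \
      --cluster "${CLUSTER_NAME}" \
      --region "${REGION}" \
      --execute "${PIG_QUERY}"
else
  echo "gcloud not found; the Pig query would have been:"
  echo "${PIG_QUERY}"
fi
```

If the job prints rows from the table, Pig can reach the Hive metastore through HCatalog. If it fails with a class-not-found error for `HCatLoader`, the initialization action probably did not complete on that cluster.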