Livy is a wrapper around spark-submit that exposes Spark as a REST service for integration with external applications.
As it is still in beta, it is not included in the CDH bundle (as of CDH 5.5.1). The scripts in this repo create both a Livy parcel and a CSD for deploying it with Cloudera Manager.
Cloudera publishes a tool for making CM parcels. Install it on your workstation like so:
mkdir -p ~/github/cloudera; cd ~/github/cloudera
git clone git@github.com:cloudera/cm_ext.git; cd cm_ext
mvn package
- Create the Livy parcel: ./build_parcel.sh <Version> <Distro>. Example: ./build_parcel.sh 1.0 wheezy
- Serve the parcel: ./serve-parcel.sh
- In CM, add your machine to the list of Remote Parcel Repository URLs and click Check for New Parcels.
- Download, Distribute, Activate. No need to restart the cluster, as this parcel is not a dependency for any service.
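Before clicking Check for New Parcels, it can help to confirm that the repository is reachable from the CM host. The sketch below assumes the served directory has a manifest.json at its root, which is what CM looks for in a remote parcel repository. The function names are hypothetical, and the host and port depend on how serve-parcel.sh is run.

```shell
# Build the manifest URL for a parcel repository base URL.
# A single trailing slash on the base URL is tolerated.
parcel_manifest_url() {
    printf '%s/manifest.json\n' "${1%/}"
}

# Fetch the manifest; returns non-zero if the repository is not reachable.
# Requires curl and performs a network request only when called.
check_parcel_repo() {
    curl --fail --silent --show-error "$(parcel_manifest_url "$1")"
}
```

Calling check_parcel_repo with the same URL you entered in Remote Parcel Repository URLs should print the manifest if the repository is being served.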
- Create the CSD: ./build_csd.sh <Version>. Example: ./build_csd.sh 1.0. The CSD version is independent of the parcel version.
- Copy the CSD jar (LIVY-1.0.jar) to the CM host at /opt/cloudera/csd/ and change the ownership of the jar to cloudera-scm:cloudera-scm.
- Restart CM: /etc/init.d/cloudera-scm-server restart
- Restart the Cloudera Management Service from the CM UI.
- Install the Livy service with the Add a Service option in CM.
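The copy, ownership, and restart steps above could be scripted roughly as follows. This is a sketch, not part of the repo: it assumes root SSH access to the CM host and that the jar sits in the current directory, and the function names are hypothetical.

```shell
# Name of the CSD jar for a given version, following the LIVY-1.0.jar example.
csd_jar_name() {
    printf 'LIVY-%s.jar\n' "$1"
}

# Copy the jar to the CM host, fix its ownership, and restart the CM server.
# Assumption: root SSH access to the CM host; jar is in the current directory.
deploy_csd() {
    version=$1
    cm_host=$2
    jar=$(csd_jar_name "$version")
    scp "$jar" "root@${cm_host}:/opt/cloudera/csd/" &&
    ssh "root@${cm_host}" \
        "chown cloudera-scm:cloudera-scm /opt/cloudera/csd/${jar} && /etc/init.d/cloudera-scm-server restart"
}
```

For example, deploy_csd 1.0 cm.example.test would handle LIVY-1.0.jar. Restarting the Cloudera Management Service and adding the Livy service are still done from the CM UI.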
Note: Steps 3 and 4 (restarting CM and the Cloudera Management Service) can be avoided by using the experimental API for installing CSDs without restarting CM: GET https://CM:7180/cmf/csd/install?csdName=LIVY-1.0
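As a sketch of how that request might be issued from a shell: the URL shape, port 7180, and the csdName value come from the example above, while the function names are hypothetical. How the endpoint authenticates is not covered here, so any extra curl options (credentials, certificate flags) are passed through unchanged.

```shell
# Build the experimental CSD install URL.
# $1 = CM host, $2 = CSD name without the .jar suffix (e.g. LIVY-1.0)
csd_install_url() {
    printf 'https://%s:7180/cmf/csd/install?csdName=%s\n' "$1" "$2"
}

# Issue the GET request; arguments after the first two go straight to curl.
# Requires curl and performs a network request only when called.
install_csd() {
    url=$(csd_install_url "$1" "$2")
    shift 2
    curl --fail --silent --show-error "$@" "$url"
}
```

For instance, install_csd CM LIVY-1.0 requests the URL quoted in the note.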
Livy is defined as a Java-based service in the CSD, which enables the basic monitoring and configuration relevant to a Java process through CM. Additional Livy parameters have also been added to the CSD to enable configuration management from CM. We are forcing logback as the logging implementation.
As of now, custom CSDs support declaring custom monitoring metrics but offer no way to publish them from the service.