Does v1.6 support spark on yarn? #1901

Open
@foxgarden

Hi,

In reference.conf I set the following (along with some other minor options):
# The execution modes in Sparta are: local, mesos or marathon
sparta.config.executionMode = yarn

# Yarn cluster name
sparta.yarn.master = yarn

# Cluster or Client. If the user need more than one policy running is necessary use "cluster". Is the same as the variable spark.submit.deployMode
sparta.yarn.deployMode = cluster

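I have a working workflow that runs fine in local mode, but after switching to yarn mode I get the logs below. It looks like Sparta cannot connect to the Resource Manager: the Resource Manager URL and Submission Id stay undefined, and the submission status goes from CONNECTED to LOST within a second. Could anybody help with this issue?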
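One way to test the suspicion that the Resource Manager is unreachable is a plain TCP check from the machine that runs Sparta. The following is only a minimal sketch: the function name `can_connect` is made up for illustration, and the host and port must come from your own `yarn-site.xml` (8032 is the stock default for `yarn.resourcemanager.address`, but a cluster may override it). A successful connection shows only that the port is open, not that the submission itself will work.

```python
import socket
import sys


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and DNS failures.
        return False


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage (hypothetical file name): python check_rm.py <rm-host> <rm-port>
    host, port = sys.argv[1], int(sys.argv[2])
    state = "reachable" if can_connect(host, port) else "NOT reachable"
    print(f"{host}:{port} is {state}")
```

If the check fails from the Sparta host but succeeds from a cluster node, the problem is more likely network or client configuration than Sparta itself.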

02 Jul 2018 15:29:31.053 INFO c.s.s.s.c.a.ClusterLauncherActor Sparta submit options initialized correctly
02 Jul 2018 15:29:31.062 INFO c.s.s.s.c.a.ClusterLauncherActor Updating context d23359d0-de5b-4589-bb5a-236b1bde8eed with name test1:
Status: Failed ---> NotStarted
Status Information: The checker detects that the policy not start/stop correctly ---> Sparta submit options initialized correctly
Submission Id: undefined ---> undefined
Submission Status: LOST ---> LOST
Marathon Id: undefined ---> undefined
Last Error: undefined ---> undefined
Last Execution Mode: yarn-cluster ---> yarn-cluster
Resource Manager URL: undefined ---> undefined
02 Jul 2018 15:29:31.103 INFO c.s.s.s.c.a.ClusterLauncherActor Launching Sparta Job with options ...
Policy name: test1
Main Class: com.stratio.sparta.driver.SparkDriver
Driver file: http://0.0.0.0:9090/sparta/driver/driver-1.6.0-SNAPSHOT.jar
Master: yarn
Spark submit arguments: --deploy-mode -> cluster,--num-executors -> 1,--properties-file -> /etc/spark2/conf/spark-defaults.conf,--proxy-user -> hdfs
Spark configurations: spark.sql.parquet.binaryAsString -> true,spark.app.name -> test1-2018/07/02-03:29:30,spark.driver.memory -> 1G,spark.driver.cores -> 1,spark.mesos.driverEnv.SPARK_USER -> ,spark.executor.memory -> 1G,spark.executor.cores -> 1
Driver arguments: Map(plugins -> ICw=, clusterConfig -> eyJ5YXJuIjp7ImRlcGxveU1vZGUiOiJjbHVzdGVyIiwiZHJpdmVyQ29yZXMiOjEsImRyaXZlck1lbW9yeSI6IjFHIiwiZXhlY3V0b3JDb3JlcyI6MSwiZXhlY3V0b3JNZW1vcnkiOiIxRyIsImtpbGxVcmwiOiIvdjEvc3VibWlzc2lvbnMva2lsbCIsIm1hc3RlciI6Inlhcm4iLCJudW1FeGVjdXRvcnMiOjEsInByb3BlcnRpZXNGaWxlIjoiL2V0Yy9zcGFyazIvY29uZi9zcGFyay1kZWZhdWx0cy5jb25mIiwicHJveHktdXNlciI6ImhkZnMiLCJzcGFyayI6eyJzcWwiOnsicGFycXVldCI6eyJiaW5hcnlBc1N0cmluZyI6dHJ1ZX19fSwic3BhcmtIb21lIjoiL29wdC9jbG91ZGVyYS9wYXJjZWxzL1NQQVJLMi0yLjEuMC5jbG91ZGVyYTItMS5jZGg1LjcuMC5wMC4xNzE2NTgvbGliL3NwYXJrMiJ9fQ==, detailConfig -> eyJjb25maWciOnsiYWRkVGltZVRvQ2hlY2twb2ludFBhdGgiOmZhbHNlLCJhdXRvRGVsZXRlQ2hlY2twb2ludCI6dHJ1ZSwiYXdhaXRQb2xpY3lDaGFuZ2VTdGF0dXMiOiIxODBzIiwiYmFja3Vwc0xvY2F0aW9uIjoiL29wdC9zZHMvc3BhcnRhL2JhY2t1cHMiLCJjaGVja3BvaW50UGF0aCI6Ii90bXAvc3BhcnRhL2NoZWNrcG9pbnQiLCJkcml2ZXJQYWNrYWdlTG9jYXRpb24iOiIvb3B0L3Nkcy9zcGFydGEvZHJpdmVyIiwiZHJpdmVyVVJJIjoiaHR0cDovLzAuMC4wLjA6OTA5MC9zcGFydGEvZHJpdmVyL2RyaXZlci0xLjYuMC1TTkFQU0hPVC5qYXIiLCJleGVjdXRpb25Nb2RlIjoieWFybiIsImZyb250ZW5kIjp7InRpbWVvdXQiOjUwMDB9LCJwbHVnaW5QYWNrYWdlTG9jYXRpb24iOiIvb3B0L3Nkcy9zcGFydGEvcGx1Z2lucyIsInJlbWVtYmVyUGFydGl0aW9uZXIiOnRydWV9fQ==, storageConfig -> IA==, policyId -> d23359d0-de5b-4589-bb5a-236b1bde8eed, zookeeperConfig -> eyJ6b29rZWVwZXIiOnsiY29ubmVjdGlvblN0cmluZyI6IjEwLjAuMTEuMjI6MjE4MSwxMC4wLjExLjMwOjIxODEsMTAuMC4xMS4zMToyMTgxIiwiY29ubmVjdGlvblRpbWVvdXQiOjE1MDAwLCJyZXRyeUF0dGVtcHRzIjo1LCJyZXRyeUludGVydmFsIjoxMDAwMCwic2Vzc2lvblRpbWVvdXQiOjYwMDAwfX0=)
02 Jul 2018 15:29:31.128 INFO c.s.s.s.c.a.ClusterLauncherActor Sparta cluster job launched correctly
02 Jul 2018 15:29:31.131 INFO c.s.s.s.c.a.ClusterLauncherActor Updating context d23359d0-de5b-4589-bb5a-236b1bde8eed with name test1:
Status: NotStarted ---> Launched
Status Information: Sparta submit options initialized correctly ---> Sparta cluster job launched correctly
Submission Id: undefined ---> undefined
Submission Status: LOST ---> UNKNOWN
Marathon Id: undefined ---> undefined
Last Error: undefined ---> undefined
Last Execution Mode: yarn-cluster ---> yarn-cluster
Resource Manager URL: undefined ---> undefined
02 Jul 2018 15:29:31.205 INFO c.s.s.s.c.a.ClusterLauncherActor Cluster context listener added to test1 with id: d23359d0-de5b-4589-bb5a-236b1bde8eed
02 Jul 2018 15:29:31.218 INFO c.s.s.s.c.a.ClusterLauncherActor Starting scheduler task in awaitPolicyChangeStatus with time: 180s
02 Jul 2018 15:29:33.764 INFO c.s.s.s.c.a.ClusterLauncherActor Submission state changed to ... CONNECTED
02 Jul 2018 15:29:33.767 INFO c.s.s.s.c.a.ClusterLauncherActor Updating context d23359d0-de5b-4589-bb5a-236b1bde8eed with name test1:
Status: Launched ---> Launched
Status Information: Sparta cluster job launched correctly ---> Sparta cluster job launched correctly
Submission Id: undefined ---> undefined
Submission Status: UNKNOWN ---> CONNECTED
Marathon Id: undefined ---> undefined
Last Error: undefined ---> undefined
Last Execution Mode: yarn-cluster ---> yarn-cluster
Resource Manager URL: undefined ---> undefined
02 Jul 2018 15:29:34.299 INFO c.s.s.s.c.a.ClusterLauncherActor Submission state changed to ... LOST
02 Jul 2018 15:29:34.301 INFO c.s.s.s.c.a.ClusterLauncherActor Updating context d23359d0-de5b-4589-bb5a-236b1bde8eed with name test1:
Status: Launched ---> Launched
Status Information: Sparta cluster job launched correctly ---> Sparta cluster job launched correctly
Submission Id: undefined ---> undefined
Submission Status: CONNECTED ---> LOST
Marathon Id: undefined ---> undefined
Last Error: undefined ---> undefined
Last Execution Mode: yarn-cluster ---> yarn-cluster
Resource Manager URL: undefined ---> undefined
02 Jul 2018 15:29:51.657 INFO c.s.s.s.core.actor.StatusActor Updating context d23359d0-de5b-4589-bb5a-236b1bde8eed with name test1:
Status: Launched ---> Stopping
Status Information: Sparta cluster job launched correctly ---> Sparta cluster job launched correctly
Submission Id: undefined ---> undefined
Submission Status: LOST ---> LOST
Marathon Id: undefined ---> undefined
Last Error: undefined ---> undefined
Last Execution Mode: yarn-cluster ---> yarn-cluster
Resource Manager URL: undefined ---> undefined
02 Jul 2018 15:29:51.678 INFO c.s.s.s.c.a.ClusterLauncherActor Stopping message received from Zookeeper
02 Jul 2018 15:29:51.678 INFO c.s.s.s.c.a.ClusterLauncherActor The Sparta System don't have submission id associated to policy test1
02 Jul 2018 15:29:51.679 INFO c.s.s.s.c.a.ClusterLauncherActor Node cache to cluster context listener closed correctly
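The values in the "Driver arguments" log line (plugins, clusterConfig, detailConfig, storageConfig, zookeeperConfig) appear to be Base64-encoded, so decoding them shows which settings Sparta actually passed to the driver. The following is a minimal sketch: the helper name `decode_driver_arg` is made up for illustration, and it assumes the values are UTF-8 text that is usually, but not always, JSON.

```python
import base64
import json


def decode_driver_arg(value: str):
    """Decode one Base64 driver argument.

    Returns the parsed JSON object when the decoded text is valid JSON,
    otherwise the decoded text itself.
    """
    text = base64.b64decode(value).decode("utf-8")
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return text


# Two short values copied from the "Driver arguments" log line.
print(repr(decode_driver_arg("ICw=")))  # plugins       -> ' ,'
print(repr(decode_driver_arg("IA==")))  # storageConfig -> ' '
```

The same call can be applied to the longer clusterConfig, detailConfig and zookeeperConfig values to check that the master, deploy mode and Spark home match what the cluster expects.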
