Sono abbastanza nuovo per Spark. Ho provato a cercare ma non ho potuto ottenere una soluzione adeguata. Ho installato hadoop 2.7.2 su due caselle (un nodo principale e l'altro nodo di lavoro). Ho installato il cluster seguendo il link sottostante http://javadev.org/docs/hadoop/centos/6/installation/multi-node-installation-on-centos-6-non-sucure-mode/ Stavo eseguendo hadoop e spark un'applicazione come utente root per testare il cluster.Spark Job in esecuzione su Yarn Cluster java.io.FileNotFoundException: Il file non esce, anche se il file termina sul nodo master
Ho installato la scintilla sul nodo master e la scintilla inizia senza errori. Tuttavia, quando invio il lavoro utilizzando spark submit, ricevo l'eccezione File Not Found anche se il file è presente nel nodo master nella stessa posizione nell'errore. Sto eseguendo il comando Spark Submit sotto il comando Spark Submit e trovo l'output dei registri sotto comando.
/bin/spark-submit --class com.test.Engine --master yarn --deploy-mode cluster /app/spark-test.jar
16/04/21 19:16:13 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable 16/04/21 19:16:13 INFO RMProxy: Connecting to ResourceManager at /0.0.0.0:8032 16/04/21 19:16:14 INFO Client: Requesting a new application from cluster with 1 NodeManagers 16/04/21 19:16:14 INFO Client: Verifying our application has not requested more than the maximum memory capability of the cluster (8192 MB per container) 16/04/21 19:16:14 INFO Client: Will allocate AM container, with 1408 MB memory including 384 MB overhead 16/04/21 19:16:14 INFO Client: Setting up container launch context for our AM 16/04/21 19:16:14 INFO Client: Setting up the launch environment for our AM container 16/04/21 19:16:14 INFO Client: Preparing resources for our AM container 16/04/21 19:16:14 INFO Client: Source and destination file systems are the same. Not copying file:/mi/spark/lib/spark-assembly-1.6.1-hadoop2.6.0.jar 16/04/21 19:16:14 INFO Client: Source and destination file systems are the same. Not copying file:/app/spark-test.jar 16/04/21 19:16:14 INFO Client: Source and destination file systems are the same. Not copying file:/tmp/spark-120aeddc-0f87-4411-9400-22ba01096249/__spark_conf__5619348744221830008.zip 16/04/21 19:16:14 INFO SecurityManager: Changing view acls to: root 16/04/21 19:16:14 INFO SecurityManager: Changing modify acls to: root 16/04/21 19:16:14 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); users with modify permissions: Set(root) 16/04/21 19:16:15 INFO Client: Submitting application 1 to ResourceManager 16/04/21 19:16:15 INFO YarnClientImpl: Submitted application application_1461246306015_0001 16/04/21 19:16:16 INFO Client: Application report for application_1461246306015_0001 (state: ACCEPTED) 16/04/21 19:16:16 INFO Client: client token: N/A diagnostics: N/A ApplicationMaster host: N/A ApplicationMaster RPC port: -1 queue: default start time: 1461246375622 final status: UNDEFINEDsparkcluster01.testing.com tracking URL: http://sparkcluster01.testing.com:8088/proxy/application_1461246306015_0001/ user: root 16/04/21 19:16:17 INFO Client: Application report for application_1461246306015_0001 (state: ACCEPTED) 16/04/21 19:16:18 INFO Client: Application report for application_1461246306015_0001 (state: ACCEPTED) 16/04/21 19:16:19 INFO Client: Application report for application_1461246306015_0001 (state: ACCEPTED) 16/04/21 19:16:20 INFO Client: Application report for application_1461246306015_0001 (state: ACCEPTED) 16/04/21 19:16:21 INFO Client: Application report for application_1461246306015_0001 (state: FAILED) 16/04/21 19:16:21 INFO Client: client token: N/A diagnostics: Application application_1461246306015_0001 failed 2 times due to AM Container for appattempt_1461246306015_0001_000002 exited with exitCode: -1000 For more detailed output, check application tracking page:http://sparkcluster01.testing.com:8088/cluster/app/application_1461246306015_0001Then, click on links to logs of each attempt. Diagnostics: java.io.FileNotFoundException: File file:/app/spark-test.jar does not exist Failing this attempt. Failing the application. ApplicationMaster host: N/A ApplicationMaster RPC port: -1 queue: default start time: 1461246375622 final status: FAILED tracking URL: http://sparkcluster01.testing.com:8088/cluster/app/application_1461246306015_0001 user: root Exception in thread "main" org.ap/app/spark-test.jarache.spark.SparkException: Application application_1461246306015_0001 finished with failed status at org.apache.spark.deploy.yarn.Client.run(Client.scala:1034) at org.apache.spark.deploy.yarn.Client$.main(Client.scala:1081) at org.apache.spark.deploy.yarn.Client.main(Client.scala) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731) at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181) at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206) at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121) at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Ho anche provato a fare funzionare la scintilla nel file system HDFS mettendo la mia applicazione su HDFS e dando il percorso HDFS nel Spark Invia comando. Anche in quel caso l'eccezione File not found di lancio su un file Spark Conf. Sto eseguendo il comando Spark Submit qui sotto e trovo l'output dei registri sotto il comando.
./bin/spark-submit --class com.test.Engine --master yarn --deploy-mode cluster hdfs://sparkcluster01.testing.com:9000/beacon/job/spark-test.jar
16/04/21 18:11:45 INFO RMProxy: Connecting to ResourceManager at /0.0.0.0:8032 16/04/21 18:11:46 INFO Client: Requesting a new application from cluster with 1 NodeManagers 16/04/21 18:11:46 INFO Client: Verifying our application has not requested more than the maximum memory capability of the cluster (8192 MB per container) 16/04/21 18:11:46 INFO Client: Will allocate AM container, with 1408 MB memory including 384 MB overhead 16/04/21 18:11:46 INFO Client: Setting up container launch context for our AM 16/04/21 18:11:46 INFO Client: Setting up the launch environment for our AM container 16/04/21 18:11:46 INFO Client: Preparing resources for our AM container 16/04/21 18:11:46 INFO Client: Source and destination file systems are the same. Not copying file:/mi/spark/lib/spark-assembly-1.6.1-hadoop2.6.0.jar 16/04/21 18:11:47 INFO Client: Uploading resource hdfs://sparkcluster01.testing.com:9000/beacon/job/spark-test.jar -> file:/root/.sparkStaging/application_1461234217994_0017/spark-test.jar 16/04/21 18:11:49 INFO Client: Source and destination file systems are the same. Not copying file:/tmp/spark-f4eef3ac-2add-42f8-a204-be7959c26f21/__spark_conf__6818051470272245610.zip 16/04/21 18:11:50 INFO SecurityManager: Changing view acls to: root 16/04/21 18:11:50 INFO SecurityManager: Changing modify acls to: root 16/04/21 18:11:50 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); users with modify permissions: Set(root) 16/04/21 18:11:50 INFO Client: Submitting application 17 to ResourceManager 16/04/21 18:11:50 INFO YarnClientImpl: Submitted application application_1461234217994_0017 16/04/21 18:11:51 INFO Client: Application report for application_1461234217994_0017 (state: ACCEPTED) 16/04/21 18:11:51 INFO Client: client token: N/A diagnostics: N/A ApplicationMaster host: N/A ApplicationMaster RPC port: -1 queue: default start time: 1461242510849 final status: UNDEFINED tracking URL: http://sparkcluster01.testing.com:8088/proxy/application_1461234217994_0017/ user: root 16/04/21 18:11:52 INFO Client: Application report for application_1461234217994_0017 (state: ACCEPTED) 16/04/21 18:11:53 INFO Client: Application report for application_1461234217994_0017 (state: ACCEPTED) 16/04/21 18:11:54 INFO Client: Application report for application_1461234217994_0017 (state: FAILED) 16/04/21 18:11:54 INFO Client: client token: N/A diagnostics: Application application_1461234217994_0017 failed 2 times due to AM Container for appattempt_1461234217994_0017_000002 exited with exitCode: -1000 For more detailed output, check application tracking page:http://sparkcluster01.testing.com:8088/cluster/app/application_1461234217994_0017Then, click on links to logs of each attempt. Diagnostics: File file:/tmp/spark-f4eef3ac-2add-42f8-a204-be7959c26f21/__spark_conf__6818051470272245610.zip does not exist java.io.FileNotFoundException: File file:/tmp/spark-f4eef3ac-2add-42f8-a204-be7959c26f21/__spark_conf__6818051470272245610.zip does not exist at org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:609) at org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:822) at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:599) at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:421) at org.apache.hadoop.yarn.util.FSDownload.copy(FSDownload.java:253) at org.apache.hadoop.yarn.util.FSDownload.access$000(FSDownload.java:63) at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:361) at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:359) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657) at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:358) at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:62) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) Failing this attempt. Failing the application. ApplicationMaster host: N/A ApplicationMaster RPC port: -1 queue: default start time: 1461242510849 final status: FAILED tracking URL: http://sparkcluster01.testing.com:8088/cluster/app/application_1461234217994_0017 user: root Exception in thread "main" org.apache.spark.SparkException: Application application_1461234217994_0017 finished with failed status at org.apache.spark.deploy.yarn.Client.run(Client.scala:1034) at org.apache.spark.deploy.yarn.Client$.main(Client.scala:1081) at org.apache.spark.deploy.yarn.Client.main(Client.scala) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731) at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181) at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206) at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121) at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala) 16/04/21 18:11:55 INFO ShutdownHookManager: Shutdown hook called 16/04/21 18:11:55 INFO ShutdownHookManager: Deleting directory /tmp/spark-f4eef3ac-2add-42f8-a204-be7959c26f21
È bello che tu abbia incluso tutte queste informazioni, ma il codice scintilla non sarebbe stato utile per diagnosticare il problema? –
@ cricket_007 Il problema non è dovuto al codice spark perché anche se eseguo spark shell o altri esempi di scintille usando Yarn, sto ricevendo lo stesso errore. per esempio: spark-shell --master yarn-client – Ajay
Potrebbe aggiungere i tuoi log di filati per favore. potresti ottenerli facendo '$ logs di filato -applicationId application_1461246306015_0001' – user1314742