Spark Streaming on EC2: Exception in thread "main" java.lang.ExceptionInInitializerError

I am trying to run spark-submit on a jar file that I created. When I run it locally on my machine it works fine, but when it is deployed on Amazon EC2 it returns the following error.

[email protected] bin]$ ./spark-submit --master local[2] --class main.java.Streamer ~/streaming-project-1.0-jar-with-dependencies.jar 
Exception in thread "main" java.lang.ExceptionInInitializerError 
    at org.apache.spark.streaming.StreamingContext$.<init>(StreamingContext.scala:728) 
    at org.apache.spark.streaming.StreamingContext$.<clinit>(StreamingContext.scala) 
    at org.apache.spark.streaming.StreamingContext.<init>(StreamingContext.scala:81) 
    at main.java.Streamer$.main(Streamer.scala:24) 
    at main.java.Streamer.main(Streamer.scala) 
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) 
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) 
    at java.lang.reflect.Method.invoke(Method.java:606) 
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:672) 
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:180) 
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:205) 
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:120) 
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala) 
Caused by: java.lang.NoSuchFieldException: SHUTDOWN_HOOK_PRIORITY 
    at java.lang.Class.getField(Class.java:1592) 
    at org.apache.spark.util.SparkShutdownHookManager.install(ShutdownHookManager.scala:220) 
    at org.apache.spark.util.ShutdownHookManager$.shutdownHooks$lzycompute(ShutdownHookManager.scala:50) 
    at org.apache.spark.util.ShutdownHookManager$.shutdownHooks(ShutdownHookManager.scala:48) 
    at org.apache.spark.util.ShutdownHookManager$.addShutdownHook(ShutdownHookManager.scala:189) 
    at org.apache.spark.util.ShutdownHookManager$.<init>(ShutdownHookManager.scala:58) 
    at org.apache.spark.util.ShutdownHookManager$.<clinit>(ShutdownHookManager.scala) 
... 14 more 

Below is my pom.xml file:

<?xml version="1.0" encoding="UTF-8"?> 
<project> 
    <groupId>astiefel</groupId> 
    <artifactId>streaming-project</artifactId> 
    <modelVersion>4.0.0</modelVersion> 
    <name>Streamer Project</name> 
    <packaging>jar</packaging> 
    <version>1.0</version> 
    <properties> 
     <maven.compiler.source>1.6</maven.compiler.source> 
     <maven.compiler.target>1.6</maven.compiler.target> 
     <encoding>UTF-8</encoding> 
     <scala.tools.version>2.10</scala.tools.version> 
     <!-- Put the Scala version of the cluster --> 
     <scala.version>2.10.4</scala.version> 

    </properties> 
    <dependencies> 
     <dependency> <!-- Spark dependency --> 
      <groupId>org.apache.spark</groupId> 
      <artifactId>spark-core_2.10</artifactId> 
      <version>1.5.1</version> 
     </dependency> 
     <dependency> 
      <groupId>org.apache.spark</groupId> 
      <artifactId>spark-streaming_2.10</artifactId> 
      <version>1.5.1</version> 
     </dependency> 
     <dependency> 
      <groupId>org.scala-lang</groupId> 
      <artifactId>scala-compiler</artifactId> 
      <version>${scala.version}</version> 
      <scope>compile</scope> 
     </dependency> 
     <dependency> 
      <groupId>org.apache.spark</groupId> 
      <artifactId>spark-mllib_2.10</artifactId> 
      <!-- keep in step with spark-core to avoid mixing Spark versions on the classpath --> 
      <version>1.5.1</version> 
     </dependency> 
     <dependency> 
      <groupId>org.scalanlp</groupId> 
      <artifactId>breeze_2.10</artifactId> 
      <version>0.10</version> 
     </dependency> 
     <dependency> 
      <groupId>org.apache.spark</groupId> 
      <artifactId>spark-sql_2.10</artifactId> 
      <version>1.5.1</version> 
     </dependency> 
    </dependencies> 

    <repositories> 
     <repository> 
      <id>cloudera-repo-releases</id> 
      <url>https://repository.cloudera.com/artifactory/repo/</url> 
     </repository> 
    </repositories> 

    <build> 
     <sourceDirectory>src/main/java</sourceDirectory> 
     <plugins> 
      <plugin> 
       <!-- see http://davidb.github.com/scala-maven-plugin --> 
       <groupId>net.alchim31.maven</groupId> 
       <artifactId>scala-maven-plugin</artifactId> 
       <!--<version>3.1.3</version>--> 
       <executions> 
        <execution> 
         <goals> 
          <goal>compile</goal> 
          <goal>testCompile</goal> 
         </goals> 
         <configuration> 
          <args> 
           <arg>-make:transitive</arg> 
           <arg>-dependencyfile</arg> 
           <arg>${project.build.directory}/.scala_dependencies</arg> 
          </args> 
         </configuration> 
        </execution> 
       </executions> 
      </plugin> 
      <plugin> 
       <groupId>org.apache.maven.plugins</groupId> 
       <artifactId>maven-surefire-plugin</artifactId> 
       <!--<version>2.13</version>--> 
       <configuration> 
        <useFile>false</useFile> 
        <disableXmlReport>true</disableXmlReport> 
        <!-- If you have classpath issue like NoDefClassError,... --> 
        <useManifestOnlyJar>false</useManifestOnlyJar> 
        <includes> 
         <include>**/*Test.*</include> 
         <include>**/*Suite.*</include> 
        </includes> 
       </configuration> 
      </plugin> 

      <!-- "package" command plugin --> 
      <plugin> 
       <artifactId>maven-assembly-plugin</artifactId> 
       <!--<version>2.4.1</version>--> 
       <configuration> 
        <descriptorRefs> 
         <descriptorRef>jar-with-dependencies</descriptorRef> 
        </descriptorRefs> 
       </configuration> 
       <executions> 
        <execution> 
         <id>make-assembly</id> 
         <phase>package</phase> 
         <goals> 
          <goal>single</goal> 
         </goals> 
        </execution> 
       </executions> 
      </plugin> 
     </plugins> 
    </build> 
</project> 
Which version of Spark do you have on EC2? – eliasah

I think it's 1.5.0 – astiefel

I have exactly the same error – Neil

Answer

When you launch spark-ec2, the default Hadoop version is 1.2.1. However, recent Spark releases (at least 1.5.1) require the SHUTDOWN_HOOK_PRIORITY field in the hadoop.fs.FileSystem class, which was only introduced in Hadoop 2+.
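
If you want to confirm which Hadoop generation is actually on your cluster's classpath, a minimal sketch like the following reproduces the reflective lookup from the stack trace above (the object name HadoopCheck is just an illustration; it assumes the Hadoop jars are on the classpath, e.g. when run via spark-submit):

    // Sketch: repeat the field lookup that Spark's ShutdownHookManager performs.
    // Hadoop 2+ defines FileSystem.SHUTDOWN_HOOK_PRIORITY; Hadoop 1.x does not,
    // which is what surfaces as NoSuchFieldException / ExceptionInInitializerError.
    object HadoopCheck {
      def main(args: Array[String]): Unit = {
        val fsClass = Class.forName("org.apache.hadoop.fs.FileSystem")
        try {
          val field = fsClass.getField("SHUTDOWN_HOOK_PRIORITY")
          println(s"Hadoop 2+ on classpath: SHUTDOWN_HOOK_PRIORITY = ${field.getInt(null)}")
        } catch {
          case _: NoSuchFieldException =>
            println("Hadoop 1.x on classpath: SHUTDOWN_HOOK_PRIORITY is missing")
        }
      }
    }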

One way to work around this problem is to launch your Spark cluster with Hadoop version 2+. See spark-ec2 --help for the available options. For example: --hadoop-major-version=yarn will install Hadoop version 2.4.
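
A relaunch command might look like the following (a sketch: the key pair, identity file, slave count, and cluster name are placeholders to replace with your own):

    ./spark-ec2 -k my-keypair -i ~/my-keypair.pem -s 2 \
      --hadoop-major-version=yarn launch my-spark-cluster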

Thanks a lot; that fixed it for me. – Neil