5

Running Spark SQL on CDH 5.4.1: NoClassDefFoundError

I have set up my CDH 5.4.1 cluster to run some Spark SQL tests. Spark itself works fine, but Spark SQL has problems.

I start pyspark as follows:

/opt/cloudera/parcels/CDH-5.4.1-1.cdh5.4.1.p0.6/lib/spark/bin/pyspark --master yarn-client

I want to select from a Hive table with Spark SQL: results = sqlCtx.sql("SELECT * FROM my_table").collect()

It prints the error log below (also posted at http://pastebin.com/u98psBG8):

> Welcome to
>       ____              __
>      / __/__  ___ _____/ /__
>     _\ \/ _ \/ _ `/ __/  '_/
>    /__ / .__/\_,_/_/ /_/\_\   version 1.3.0
>       /_/
>
> Using Python version 2.7.6 (default, Mar 22 2014 22:59:56)
> SparkContext available as sc, HiveContext available as sqlCtx.
> >>> results = sqlCtx.sql("SELECT * FROM vt_call_histories").collect()
> 15/05/20 06:57:07 INFO HiveMetaStore: 0: Opening raw store with implemenation class:org.apache.hadoop.hive.metastore.ObjectStore
> 15/05/20 06:57:07 INFO ObjectStore: ObjectStore, initialize called
> 15/05/20 06:57:08 WARN General: Plugin (Bundle) "org.datanucleus" is already registered. Ensure you dont have multiple JAR versions of the same plugin in the classpath. The URL "file:/opt/cloudera/parcels/CDH-5.4.1-1.cdh5.4.1.p0.6/jars/datanucleus-core-3.2.10.jar" is already registered, and you are trying to register an identical plugin located at URL "file:/opt/cloudera/parcels/CDH-5.4.1-1.cdh5.4.1.p0.6/jars/datanucleus-core-3.2.2.jar."
> 15/05/20 06:57:08 WARN General: Plugin (Bundle) "org.datanucleus.api.jdo" is already registered. Ensure you dont have multiple JAR versions of the same plugin in the classpath. The URL "file:/opt/cloudera/parcels/CDH-5.4.1-1.cdh5.4.1.p0.6/jars/datanucleus-api-jdo-3.2.6.jar" is already registered, and you are trying to register an identical plugin located at URL "file:/opt/cloudera/parcels/CDH-5.4.1-1.cdh5.4.1.p0.6/jars/datanucleus-api-jdo-3.2.1.jar."
> 15/05/20 06:57:08 WARN General: Plugin (Bundle) "org.datanucleus.store.rdbms" is already registered. Ensure you dont have multiple JAR versions of the same plugin in the classpath. The URL "file:/opt/cloudera/parcels/CDH-5.4.1-1.cdh5.4.1.p0.6/jars/datanucleus-rdbms-3.2.1.jar" is already registered, and you are trying to register an identical plugin located at URL "file:/opt/cloudera/parcels/CDH-5.4.1-1.cdh5.4.1.p0.6/jars/datanucleus-rdbms-3.2.9.jar."
> 15/05/20 06:57:08 INFO Persistence: Property datanucleus.cache.level2 unknown - will be ignored
> 15/05/20 06:57:08 INFO Persistence: Property hive.metastore.integral.jdo.pushdown unknown - will be ignored
> 15/05/20 06:57:08 WARN HiveMetaStore: Retrying creating default database after error: Error creating transactional connection factory
> javax.jdo.JDOFatalInternalException: Error creating transactional connection factory
>     at org.datanucleus.api.jdo.NucleusJDOHelper.getJDOExceptionForNucleusException(NucleusJDOHelper.java:587)
>     at org.datanucleus.api.jdo.JDOPersistenceManagerFactory.freezeConfiguration(JDOPersistenceManagerFactory.java:788)
>     at org.datanucleus.api.jdo.JDOPersistenceManagerFactory.createPersistenceManagerFactory(JDOPersistenceManagerFactory.java:333)
>     at org.datanucleus.api.jdo.JDOPersistenceManagerFactory.getPersistenceManagerFactory(JDOPersistenceManagerFactory.java:202)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:606)
>     at javax.jdo.JDOHelper$16.run(JDOHelper.java:1965)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.jdo.JDOHelper.invoke(JDOHelper.java:1960)
>     at javax.jdo.JDOHelper.invokeGetPersistenceManagerFactoryOnImplementation(JDOHelper.java:1166)
>     at javax.jdo.JDOHelper.getPersistenceManagerFactory(JDOHelper.java:808)
>     at javax.jdo.JDOHelper.getPersistenceManagerFactory(JDOHelper.java:701)
>     at org.apache.hadoop.hive.metastore.ObjectStore.getPMF(ObjectStore.java:365)
>     at org.apache.hadoop.hive.metastore.ObjectStore.getPersistenceManager(ObjectStore.java:394)
>     at org.apache.hadoop.hive.metastore.ObjectStore.initialize(ObjectStore.java:291)
>     at org.apache.hadoop.hive.metastore.ObjectStore.setConf(ObjectStore.java:258)
>     at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:73)
>     at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
>     at org.apache.hadoop.hive.metastore.RawStoreProxy.<init>(RawStoreProxy.java:56)
>     at org.apache.hadoop.hive.metastore.RawStoreProxy.getProxy(RawStoreProxy.java:65)
>     at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.newRawStore(HiveMetaStore.java:579)
>     at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.getMS(HiveMetaStore.java:557)
>     at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.createDefaultDB(HiveMetaStore.java:606)
>     at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.init(HiveMetaStore.java:448)
>     at org.apache.hadoop.hive.metastore.RetryingHMSHandler.<init>(RetryingHMSHandler.java:66)
>     at org.apache.hadoop.hive.metastore.RetryingHMSHandler.getProxy(RetryingHMSHandler.java:72)
>     at org.apache.hadoop.hive.metastore.HiveMetaStore.newRetryingHMSHandler(HiveMetaStore.java:5601)
>     at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:193)
>     at org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.<init>(SessionHiveMetaStoreClient.java:74)
>     at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>     at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
>     at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>     at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
>     at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1486)
>     at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.<init>(RetryingMetaStoreClient.java:64)
>     at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:74)
>     at org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:2845)
>     at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:2864)
>     at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:453)
>     at org.apache.spark.sql.hive.HiveContext.sessionState$lzycompute(HiveContext.scala:229)
>     at org.apache.spark.sql.hive.HiveContext.sessionState(HiveContext.scala:225)
>     at org.apache.spark.sql.hive.HiveContext.hiveconf$lzycompute(HiveContext.scala:241)
>     at org.apache.spark.sql.hive.HiveContext.hiveconf(HiveContext.scala:240)
>     at org.apache.spark.sql.hive.HiveContext.sql(HiveContext.scala:86)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:606)
>     at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:231)
>     at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:379)
>     at py4j.Gateway.invoke(Gateway.java:259)
>     at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:133)
>     at py4j.commands.CallCommand.execute(CallCommand.java:79)
>     at py4j.GatewayConnection.run(GatewayConnection.java:207)
>     at java.lang.Thread.run(Thread.java:745)
> NestedThrowablesStackTrace:
> java.lang.reflect.InvocationTargetException
>     at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>     at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
>     at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>     at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
>     at org.datanucleus.plugin.NonManagedPluginRegistry.createExecutableExtension(NonManagedPluginRegistry.java:631)
>     at org.datanucleus.plugin.PluginManager.createExecutableExtension(PluginManager.java:325)
>     at org.datanucleus.store.AbstractStoreManager.registerConnectionFactory(AbstractStoreManager.java:282)
>     at org.datanucleus.store.AbstractStoreManager.<init>(AbstractStoreManager.java:240)
>     at org.datanucleus.store.rdbms.RDBMSStoreManager.<init>(RDBMSStoreManager.java:286)
>     at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>     at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
>     at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>     at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
>     at org.datanucleus.plugin.NonManagedPluginRegistry.createExecutableExtension(NonManagedPluginRegistry.java:631)
>     at org.datanucleus.plugin.PluginManager.createExecutableExtension(PluginManager.java:301)
>     at org.datanucleus.NucleusContext.createStoreManagerForProperties(NucleusContext.java:1187)
>     at org.datanucleus.NucleusContext.initialise(NucleusContext.java:356)
>     at org.datanucleus.api.jdo.JDOPersistenceManagerFactory.freezeConfiguration(JDOPersistenceManagerFactory.java:775)
>     at org.datanucleus.api.jdo.JDOPersistenceManagerFactory.createPersistenceManagerFactory(JDOPersistenceManagerFactory.java:333)
>     at org.datanucleus.api.jdo.JDOPersistenceManagerFactory.getPersistenceManagerFactory(JDOPersistenceManagerFactory.java:202)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:606)
>     at javax.jdo.JDOHelper$16.run(JDOHelper.java:1965)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.jdo.JDOHelper.invoke(JDOHelper.java:1960)
>     at javax.jdo.JDOHelper.invokeGetPersistenceManagerFactoryOnImplementation(JDOHelper.java:1166)
>     at javax.jdo.JDOHelper.getPersistenceManagerFactory(JDOHelper.java:808)
>     at javax.jdo.JDOHelper.getPersistenceManagerFactory(JDOHelper.java:701)
>     at org.apache.hadoop.hive.metastore.ObjectStore.getPMF(ObjectStore.java:365)
>     at org.apache.hadoop.hive.metastore.ObjectStore.getPersistenceManager(ObjectStore.java:394)
>     at org.apache.hadoop.hive.metastore.ObjectStore.initialize(ObjectStore.java:291)
>     at org.apache.hadoop.hive.metastore.ObjectStore.setConf(ObjectStore.java:258)
>     at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:73)
>     at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
>     at org.apache.hadoop.hive.metastore.RawStoreProxy.<init>(RawStoreProxy.java:56)
>     at org.apache.hadoop.hive.metastore.RawStoreProxy.getProxy(RawStoreProxy.java:65)
>     at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.newRawStore(HiveMetaStore.java:579)
>     at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.getMS(HiveMetaStore.java:557)
>     at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.createDefaultDB(HiveMetaStore.java:606)
>     at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.init(HiveMetaStore.java:448)
>     at org.apache.hadoop.hive.metastore.RetryingHMSHandler.<init>(RetryingHMSHandler.java:66)
>     at org.apache.hadoop.hive.metastore.RetryingHMSHandler.getProxy(RetryingHMSHandler.java:72)
>     at org.apache.hadoop.hive.metastore.HiveMetaStore.newRetryingHMSHandler(HiveMetaStore.java:5601)
>     at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:193)
>     at org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.<init>(SessionHiveMetaStoreClient.java:74)
>     at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>     at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
>     at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>     at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
>     at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1486)
>     at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.<init>(RetryingMetaStoreClient.java:64)
>     at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:74)
>     at org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:2845)
>     at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:2864)
>     at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:453)
>     at org.apache.spark.sql.hive.HiveContext.sessionState$lzycompute(HiveContext.scala:229)
>     at org.apache.spark.sql.hive.HiveContext.sessionState(HiveContext.scala:225)
>     at org.apache.spark.sql.hive.HiveContext.hiveconf$lzycompute(HiveContext.scala:241)
>     at org.apache.spark.sql.hive.HiveContext.hiveconf(HiveContext.scala:240)
>     at org.apache.spark.sql.hive.HiveContext.sql(HiveContext.scala:86)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:606)
>     at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:231)
>     at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:379)
>     at py4j.Gateway.invoke(Gateway.java:259)
>     at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:133)
>     at py4j.commands.CallCommand.execute(CallCommand.java:79)
>     at py4j.GatewayConnection.run(GatewayConnection.java:207)
>     at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.ExceptionInInitializerError
>     at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>     at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
>     at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>     at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
>     at java.lang.Class.newInstance(Class.java:374)
>     at org.datanucleus.store.rdbms.connectionpool.AbstractConnectionPoolFactory.loadDriver(AbstractConnectionPoolFactory.java:47)
>     at org.datanucleus.store.rdbms.connectionpool.BoneCPConnectionPoolFactory.createConnectionPool(BoneCPConnectionPoolFactory.java:54)
>     at org.datanucleus.store.rdbms.ConnectionFactoryImpl.generateDataSources(ConnectionFactoryImpl.java:238)
>     at org.datanucleus.store.rdbms.ConnectionFactoryImpl.initialiseDataSources(ConnectionFactoryImpl.java:131)
>     at org.datanucleus.store.rdbms.ConnectionFactoryImpl.<init>(ConnectionFactoryImpl.java:85)
>     ... 73 more
> Caused by: java.lang.SecurityException: sealing violation: package org.apache.derby.impl.services.locks is sealed
>     at java.net.URLClassLoader.getAndVerifyPackage(URLClassLoader.java:388)
>     at java.net.URLClassLoader.defineClass(URLClassLoader.java:417)
>     at java.net.URLClassLoader.access$100(URLClassLoader.java:71)
>     at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
>     at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>     at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
>     at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
>     at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
>     at java.lang.ClassLoader.defineClass1(Native Method)
>     at java.lang.ClassLoader.defineClass(ClassLoader.java:800)
>     at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
>     at java.net.URLClassLoader.defineClass(URLClassLoader.java:449)
>     at java.net.URLClassLoader.access$100(URLClassLoader.java:71)
>     at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
>     at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>     at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
>     at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
>     at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
>     at java.lang.Class.forName0(Native Method)
>     at java.lang.Class.forName(Class.java:190)
>     at org.apache.derby.impl.services.monitor.BaseMonitor.getImplementations(Unknown Source)
>     at org.apache.derby.impl.services.monitor.BaseMonitor.getDefaultImplementations(Unknown Source)
>     at org.apache.derby.impl.services.monitor.BaseMonitor.runWithState(Unknown Source)
>     at org.apache.derby.impl.services.monitor.FileMonitor.<init>(Unknown Source)
>     at org.apache.derby.iapi.services.monitor.Monitor.startMonitor(Unknown Source)
>     at org.apache.derby.iapi.jdbc.JDBCBoot.boot(Unknown Source)
>     at org.apache.derby.jdbc.EmbeddedDriver.boot(Unknown Source)
>     at org.apache.derby.jdbc.EmbeddedDriver.<clinit>(Unknown Source)
>     ... 83 more
> 15/05/20 06:57:08 INFO HiveMetaStore: 0: Opening raw store with implemenation class:org.apache.hadoop.hive.metastore.ObjectStore
> 15/05/20 06:57:08 INFO ObjectStore: ObjectStore, initialize called
> 15/05/20 06:57:08 WARN General: Plugin (Bundle) "org.datanucleus" is already registered. Ensure you dont have multiple JAR versions of the same plugin in the classpath. The URL "file:/opt/cloudera/parcels/CDH-5.4.1-1.cdh5.4.1.p0.6/jars/datanucleus-core-3.2.10.jar" is already registered, and you are trying to register an identical plugin located at URL "file:/opt/cloudera/parcels/CDH-5.4.1-1.cdh5.4.1.p0.6/jars/datanucleus-core-3.2.2.jar."
> 15/05/20 06:57:08 WARN General: Plugin (Bundle) "org.datanucleus.api.jdo" is already registered. Ensure you dont have multiple JAR versions of the same plugin in the classpath. The URL "file:/opt/cloudera/parcels/CDH-5.4.1-1.cdh5.4.1.p0.6/jars/datanucleus-api-jdo-3.2.6.jar" is already registered, and you are trying to register an identical plugin located at URL "file:/opt/cloudera/parcels/CDH-5.4.1-1.cdh5.4.1.p0.6/jars/datanucleus-api-jdo-3.2.1.jar."
> 15/05/20 06:57:08 WARN General: Plugin (Bundle) "org.datanucleus.store.rdbms" is already registered. Ensure you dont have multiple JAR versions of the same plugin in the classpath. The URL "file:/opt/cloudera/parcels/CDH-5.4.1-1.cdh5.4.1.p0.6/jars/datanucleus-rdbms-3.2.1.jar" is already registered, and you are trying to register an identical plugin located at URL "file:/opt/cloudera/parcels/CDH-5.4.1-1.cdh5.4.1.p0.6/jars/datanucleus-rdbms-3.2.9.jar."
> 15/05/20 06:57:08 INFO Persistence: Property datanucleus.cache.level2 unknown - will be ignored
> 15/05/20 06:57:08 INFO Persistence: Property hive.metastore.integral.jdo.pushdown unknown - will be ignored
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
>   File "/opt/cloudera/parcels/CDH-5.4.1-1.cdh5.4.1.p0.6/lib/spark/python/pyspark/sql/context.py", line 528, in sql
>     return DataFrame(self._ssql_ctx.sql(sqlQuery), self)
>   File "/opt/cloudera/parcels/CDH-5.4.1-1.cdh5.4.1.p0.6/lib/spark/python/lib/py4j-0.8.2.1-src.zip/py4j/java_gateway.py", line 538, in __call__
>   File "/opt/cloudera/parcels/CDH-5.4.1-1.cdh5.4.1.p0.6/lib/spark/python/lib/py4j-0.8.2.1-src.zip/py4j/protocol.py", line 300, in get_return_value
> py4j.protocol.Py4JJavaError: An error occurred while calling o31.sql.
> : java.lang.RuntimeException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
>     at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:472)
>     at org.apache.spark.sql.hive.HiveContext.sessionState$lzycompute(HiveContext.scala:229)
>     at org.apache.spark.sql.hive.HiveContext.sessionState(HiveContext.scala:225)
>     at org.apache.spark.sql.hive.HiveContext.hiveconf$lzycompute(HiveContext.scala:241)
>     at org.apache.spark.sql.hive.HiveContext.hiveconf(HiveContext.scala:240)
>     at org.apache.spark.sql.hive.HiveContext.sql(HiveContext.scala:86)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:606)
>     at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:231)
>     at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:379)
>     at py4j.Gateway.invoke(Gateway.java:259)
>     at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:133)
>     at py4j.commands.CallCommand.execute(CallCommand.java:79)
>     at py4j.GatewayConnection.run(GatewayConnection.java:207)
>     at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
>     at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1488)
>     at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.<init>(RetryingMetaStoreClient.java:64)
>     at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:74)
>     at org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:2845)
>     at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:2864)
>     at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:453)
>     ... 16 more
> Caused by: java.lang.reflect.InvocationTargetException
>     at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>     at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
>     at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>     at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
>     at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1486)
>     ... 21 more
> Caused by: javax.jdo.JDOFatalInternalException: Error creating transactional connection factory
> NestedThrowables:
> java.lang.reflect.InvocationTargetException
>     at org.datanucleus.api.jdo.NucleusJDOHelper.getJDOExceptionForNucleusException(NucleusJDOHelper.java:587)
>     at org.datanucleus.api.jdo.JDOPersistenceManagerFactory.freezeConfiguration(JDOPersistenceManagerFactory.java:788)
>     at org.datanucleus.api.jdo.JDOPersistenceManagerFactory.createPersistenceManagerFactory(JDOPersistenceManagerFactory.java:333)
>     at org.datanucleus.api.jdo.JDOPersistenceManagerFactory.getPersistenceManagerFactory(JDOPersistenceManagerFactory.java:202)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:606)
>     at javax.jdo.JDOHelper$16.run(JDOHelper.java:1965)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.jdo.JDOHelper.invoke(JDOHelper.java:1960)
>     at javax.jdo.JDOHelper.invokeGetPersistenceManagerFactoryOnImplementation(JDOHelper.java:1166)
>     at javax.jdo.JDOHelper.getPersistenceManagerFactory(JDOHelper.java:808)
>     at javax.jdo.JDOHelper.getPersistenceManagerFactory(JDOHelper.java:701)
>     at org.apache.hadoop.hive.metastore.ObjectStore.getPMF(ObjectStore.java:365)
>     at org.apache.hadoop.hive.metastore.ObjectStore.getPersistenceManager(ObjectStore.java:394)
>     at org.apache.hadoop.hive.metastore.ObjectStore.initialize(ObjectStore.java:291)
>     at org.apache.hadoop.hive.metastore.ObjectStore.setConf(ObjectStore.java:258)
>     at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:73)
>     at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
>     at org.apache.hadoop.hive.metastore.RawStoreProxy.<init>(RawStoreProxy.java:56)
>     at org.apache.hadoop.hive.metastore.RawStoreProxy.getProxy(RawStoreProxy.java:65)
>     at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.newRawStore(HiveMetaStore.java:579)
>     at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.getMS(HiveMetaStore.java:557)
>     at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.createDefaultDB(HiveMetaStore.java:610)
>     at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.init(HiveMetaStore.java:448)
>     at org.apache.hadoop.hive.metastore.RetryingHMSHandler.<init>(RetryingHMSHandler.java:66)
>     at org.apache.hadoop.hive.metastore.RetryingHMSHandler.getProxy(RetryingHMSHandler.java:72)
>     at org.apache.hadoop.hive.metastore.HiveMetaStore.newRetryingHMSHandler(HiveMetaStore.java:5601)
>     at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:193)
>     at org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.<init>(SessionHiveMetaStoreClient.java:74)
>     ... 26 more
> Caused by: java.lang.reflect.InvocationTargetException
>     at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>     at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
>     at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>     at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
>     at org.datanucleus.plugin.NonManagedPluginRegistry.createExecutableExtension(NonManagedPluginRegistry.java:631)
>     at org.datanucleus.plugin.PluginManager.createExecutableExtension(PluginManager.java:325)
>     at org.datanucleus.store.AbstractStoreManager.registerConnectionFactory(AbstractStoreManager.java:282)
>     at org.datanucleus.store.AbstractStoreManager.<init>(AbstractStoreManager.java:240)
>     at org.datanucleus.store.rdbms.RDBMSStoreManager.<init>(RDBMSStoreManager.java:286)
>     at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>     at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
>     at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>     at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
>     at org.datanucleus.plugin.NonManagedPluginRegistry.createExecutableExtension(NonManagedPluginRegistry.java:631)
>     at org.datanucleus.plugin.PluginManager.createExecutableExtension(PluginManager.java:301)
>     at org.datanucleus.NucleusContext.createStoreManagerForProperties(NucleusContext.java:1187)
>     at org.datanucleus.NucleusContext.initialise(NucleusContext.java:356)
>     at org.datanucleus.api.jdo.JDOPersistenceManagerFactory.freezeConfiguration(JDOPersistenceManagerFactory.java:775)
>     ... 55 more
> Caused by: java.lang.NoClassDefFoundError: Could not initialize class org.apache.derby.jdbc.EmbeddedDriver
>     at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)

Answers

10

I found the answer to my problem.

After CDH 5.3 is installed and Spark is enabled through Cloudera Manager, do the following to enable Hive access:

  1. Make sure Hive works from the Hive CLI and over JDBC through HiveServer2 (it should by default).
  2. Copy hive-site.xml into the SPARK_HOME/conf folder.
  3. Add the Hive libraries to the Spark classpath: edit the file SPARK_HOME/bin/compute-classpath.sh and add CLASSPATH="$CLASSPATH:/opt/cloudera/parcels/CDH-5.3.0-1.cdh5.3.0.p0.30/lib/hive/lib/*" (a CDH-specific example; use your own Hive lib location). See the sketch after this list.
  4. Restart the Spark cluster for everything to take effect.
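
Concretely, steps 2 and 3 boil down to something like the sketch below. The /etc/hive/conf source path and the versionless /opt/cloudera/parcels/CDH symlink are assumptions about a typical CDH layout, so adjust both to your install:

# Step 2: make Hive's configuration visible to Spark (repeat on every Spark node)
cp /etc/hive/conf/hive-site.xml /opt/cloudera/parcels/CDH/lib/spark/conf/

# Step 3: add this line to SPARK_HOME/bin/compute-classpath.sh before the
# script's final echo of $CLASSPATH, so the Hive jars get appended:
CLASSPATH="$CLASSPATH:/opt/cloudera/parcels/CDH/lib/hive/lib/*"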

Read the full document here


I had the same problem ... CDH 5.4.1, Spark 1.3.1. Spark SQL wasn't working ... For me, step 2 was the only step needed. – MFARID


Also, I should add that I had to copy the hive-site.xml file to ALL the Spark nodes – MFARID


I had the same problem, and step 2 fixed it! Thanks! CDH 5.4.8, parcels –

1

I ran into this problem, and we found that the real issue was Hive libraries clashing with the Spark libraries. If you look at your logs above:

15/05/20 06:57:08 WARN General: Plugin (Bundle) "org.datanucleus" is already registered. Ensure you dont have multiple JAR versions of the same plugin in the classpath. The URL "file:/opt/cloudera/parcels/CDH-5.4.1-1.cdh5.4.1.p0.6/jars/datanucleus-core-3.2.10.jar" is already registered, and you are trying to register an identical plugin located at URL "file:/opt/cloudera/parcels/CDH-5.4.1-1.cdh5.4.1.p0.6/jars/datanucleus-core-3.2.2.jar."

This is not a harmless warning; it is the crux of the problem. I already had Hive jars on my CLASSPATH. I removed them, started Spark, and everything went fine. So try just that first. See https://issues.apache.org/jira/browse/HIVE-9198
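
Before removing anything, it is worth seeing what is actually there. A quick check of the environment (my own addition, not from the original answer) lists any Hive or DataNucleus jars already on the classpath:

# Show classpath entries that mention hive or datanucleus
echo "$CLASSPATH" | tr ':' '\n' | grep -Ei 'hive|datanucleus'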

The part about copying hive-site.xml is only needed if you get an error like Relative path in absolute URI: ${system:java.io.tmpdir}/${system:user.name}

I ended up with a hive-site.xml like the one below:

<property>
  <name>hive.metastore.schema.verification</name>
  <value>false</value>
  <description>blah</description>
</property>
<property>
  <name>hive.exec.scratchdir</name>
  <value>/tmp/<yourusername></value>
  <description>blah</description>
</property>
<property>
  <name>hive.exec.local.scratchdir</name>
  <value>/tmp/<yourusername></value>
  <description>blah</description>
</property>
<property>
  <name>hive.downloaded.resources.dir</name>
  <value>/tmp/<yourusername></value>
  <description>blah</description>
</property>
<property>
  <name>hive.scratch.dir.permission</name>
  <value>733</value>
  <description>blah</description>
</property>
<property>
  <name>hive.querylog.location</name>
  <value>/tmp/<yourusername></value>
  <description>blah</description>
</property>
<property>
  <name>hive.server2.logging.operation.log.location</name>
  <value>/tmp/<yourusername>/operation_logs</value>
  <description>blah</description>
</property>

I removed everything else from hive-site.xml. See java.net.URISyntaxException when starting HIVE
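
If you keep those properties in a template with the <yourusername> placeholders, a one-liner along these lines fills them in; hive-site.xml.template is a hypothetical file name, and the destination depends on your Spark conf dir:

# Substitute the current user into the placeholders and install the file
sed "s|<yourusername>|$USER|g" hive-site.xml.template > "$SPARK_HOME/conf/hive-site.xml"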

0

You can also set the Hive configuration when running spark-submit; this works for me on CDH 5.4.5:

spark-submit \ 
--class com.xxx.main.TagHive \ 
--master yarn-client \ 
--name HiveTest \ 
--num-executors 3 \ 
--driver-memory 500m \ 
--executor-memory 500m \ 
--executor-cores 1 \ 
--conf "spark.executor.extraClassPath=/etc/hive/conf:/opt/cloudera/parcels/CDH/lib/hive/lib/*.jar" \ 
--files ./log4j-spark.properties \ 
--conf "spark.driver.extraJavaOptions=-Dlog4j.configuration=log4j-spark.properties" \ 
--conf "spark.executor.extraJavaOptions=-Dlog4j.configuration=log4j-spark.properties" \ 
dmp-tag-etl-0.0.1-SNAPSHOT-jar-with-dependencies.jar $1 $2
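
Whichever approach you use, a small smoke test confirms Hive access end to end. This is only a sketch: it assumes a Hive table named my_table exists and that spark-submit is on your PATH.

# Write a minimal Spark 1.3-style script that opens a HiveContext and queries a table
cat > /tmp/hive_smoke_test.py <<'EOF'
from pyspark import SparkContext
from pyspark.sql import HiveContext

sc = SparkContext(appName="HiveSmokeTest")
sqlCtx = HiveContext(sc)
# my_table is a placeholder; use any table your metastore knows about
sqlCtx.sql("SELECT COUNT(*) FROM my_table").show()
sc.stop()
EOF

spark-submit --master yarn-client /tmp/hive_smoke_test.py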