Tuesday, March 10, 2015

R Programming 6: R on Hadoop Hive




Connection:
> library(RHive)
Loading required package: rJava
Loading required package: Rserve
> rhive.init(hiveHome="/usr/hdp/current/hive-client/",hadoopHome="/usr/hdp/current/hadoop-client")
> rhive.connect(host="HS2",port=10000,defaultFS="hdfs://HiveCLI/R server:8020")

Extensions in R:
rhive.connect
rhive.query
rhive.assign
rhive.export
rhive.napply
rhive.sapply
rhive.aggregate
rhive.list.tables
rhive.load.table
rhive.desc.table
Example:
rhive.desc.table("diva.tablename")

Setting hive.execution.engine to tez from R:
rhive.set('hive.execution.engine','tez')

input <- rhive.query("select * from db.tablename limit 10")
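
The steps above (init, connect, set the engine, query) can be put together in one place. This is only a minimal sketch: run_hive_query and its argument names are hypothetical, the "hdfs://namenode:8020" value in the usage comment is a placeholder, and it assumes rhive.close() is available in your RHive version to end the session.

```r
# Sketch of one complete RHive session. run_hive_query and its argument
# names are hypothetical helpers, not part of RHive.
run_hive_query <- function(sql, host, default_fs,
                           hive_home = "/usr/hdp/current/hive-client/",
                           hadoop_home = "/usr/hdp/current/hadoop-client",
                           engine = "tez") {
  library(RHive)  # also loads rJava and Rserve

  rhive.init(hiveHome = hive_home, hadoopHome = hadoop_home)
  rhive.connect(host = host, port = 10000, defaultFS = default_fs)
  on.exit(rhive.close(), add = TRUE)  # close the connection even on error

  rhive.set("hive.execution.engine", engine)
  rhive.query(sql)  # the query result is returned to the caller
}

# Usage (needs a reachable HiveServer2; host, URI and table are placeholders):
# input <- run_hive_query("select * from db.tablename limit 10",
#                         host = "HS2", default_fs = "hdfs://namenode:8020")
# head(input)
```

Nothing touches the cluster until the function is called, so the definition can be sourced on a machine without Hive access.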


Issues:

> hive.query("show tables")
Error: could not find function "hive.query"
> library(RHive)
> rhive.init(hiveHome="/usr/hdp/current/hive-client/",hadoopHome="/usr/hdp/current/hadoop-client")
> rhive.connect(host="HiveServer2",port=10000,defaultFS="hdfs://hiveClient:8020"
+ hive.query("show tables")
Error: unexpected symbol in:
"rhive.connect(host="HiveServer2",port=10000,defaultFS="hdfs://hiveClient:8020"
hive.query"

2015-03-11 20:55:19,572 WARN  [main] util.NativeCodeLoader (NativeCodeLoader.java:<clinit>(62)) - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2015-03-11 20:55:20,281 WARN  [main] shortcircuit.DomainSocketFactory (DomainSocketFactory.java:<init>(116)) - The short-circuit local reads feature cannot be used because libhadoop cannot be loaded.
Warning:
+----------------------------------------------------------+
+ / hiveServer2 argument has not been provided correctly.  +
+ / RHive will use a default value: hiveServer2=TRUE.      +
+----------------------------------------------------------+

2015-03-11 20:55:20,615 INFO  [Thread-4] jdbc.Utils (Utils.java:parseURL(285)) - Supplied authorities: HS2:10000
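
The transcript above contains two separate mistakes: the function is rhive.query, not hive.query, and the rhive.connect call is missing its closing parenthesis, so R reads the next line as a continuation (the "+" prompt). A corrected version might look like the sketch below. list_hive_tables is a hypothetical wrapper, and passing hiveServer2=TRUE explicitly is an assumption based on the warning text; it should suppress that notice, but check it against your RHive version.

```r
# Corrected form of the failing commands above, wrapped in a hypothetical
# helper (list_hive_tables is not an RHive function).
list_hive_tables <- function(host = "HiveServer2",
                             default_fs = "hdfs://hiveClient:8020") {
  library(RHive)
  rhive.init(hiveHome = "/usr/hdp/current/hive-client/",
             hadoopHome = "/usr/hdp/current/hadoop-client")

  # Fix 1: rhive.connect() is closed before the next command starts.
  # hiveServer2 = TRUE is passed explicitly (assumption: this avoids the
  # "hiveServer2 argument has not been provided correctly" warning).
  rhive.connect(host = host, port = 10000, hiveServer2 = TRUE,
                defaultFS = default_fs)
  on.exit(rhive.close(), add = TRUE)  # always release the connection

  # Fix 2: the function is rhive.query(), not hive.query().
  rhive.query("show tables")
}

# Usage (requires a reachable HiveServer2):
# list_hive_tables()
```

The NativeCodeLoader and DomainSocketFactory lines are logged at WARN level and report that libhadoop could not be loaded, in which case Hadoop falls back to its built-in Java classes.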

