The Road to Advanced Big Data: Spark SQL Basic Configuration

Contents

Installing Spark
Build failures
Environment setup
Standalone
Local IDE
HiveContext app
SparkSession
Spark Shell
Spark SQL
Using thriftserver/beeline
JDBC

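Limitations of MapReduce:

1) The code is verbose;

2) It supports only the map and reduce operations;

3) Execution is inefficient;

4) It is a poor fit for iterative, interactive, and streaming workloads.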

A proliferation of frameworks:

1) Batch processing (offline): MapReduce, Hive, Pig

2) Stream processing (real time): Storm, JStorm

3) Interactive computing: Impala

With this many frameworks in play, the cost of learning and operating them all quietly rises a great deal.

===> Spark

Installing Spark

Prerequisites:

1) Building Spark using Maven requires Maven 3.3.9 or newer and Java 7+

2) export MAVEN_OPTS="-Xmx2g -XX:ReservedCodeCacheSize=512m"
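A minimal sketch of checking these prerequisites before starting a build, assuming a Bash shell. The `check_tool` helper is a hypothetical name introduced here for illustration and is not part of Spark; the `MAVEN_OPTS` value follows the Spark 2.1.0 build documentation.

```shell
#!/usr/bin/env bash
# Give Maven enough heap and code cache for the Spark build.
export MAVEN_OPTS="-Xmx2g -XX:ReservedCodeCacheSize=512m"

# check_tool (hypothetical helper): report whether a command is on PATH.
check_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "$1: found"
  else
    echo "$1: not found"
  fi
}

check_tool java
check_tool mvn
echo "MAVEN_OPTS=$MAVEN_OPTS"
```

Note that `./build/mvn` can fetch a suitable Maven on its own, so a missing system-wide `mvn` is not necessarily a blocker; a missing `java` is.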

Maven build command:

./build/mvn -Pyarn -Phadoop-2.4 -Dhadoop.version=2.4.0 -DskipTests clean package

[hadoop@hadoop spark-2.1.0]$ pwd
/home/hadoop/source/spark-2.1.0
[hadoop@hadoop spark-2.1.0]$ cat pom.xml
<properties>
  <hadoop.version>2.2.0</hadoop.version>
  <protobuf.version>2.5.0</protobuf.version>
  <yarn.version>${hadoop.version}</yarn.version>
  ......
</properties>
......
<profile>
  <id>hadoop-2.6</id>
  <properties>
    <hadoop.version>2.6.4</hadoop.version>
    <jets3t.version>0.9.3</jets3t.version>
    <zookeeper.version>3.4.6</zookeeper.version>
    <curator.version>2.6.0</curator.version>
  </properties>
</profile>
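The pom.xml excerpt declares hadoop.version twice: once as a default property and once inside the hadoop-2.6 profile, which is why the build needs both a -P profile flag and, for a vendor release, an explicit -Dhadoop.version. The sketch below shows one way to list those values, assuming a Bash shell with `sed`. It runs against a small stand-in file written to a temporary path; in a real checkout, `POM` would point at the actual pom.xml instead.

```shell
#!/usr/bin/env bash
# Write a small stand-in for the relevant part of pom.xml to a temp file.
POM="$(mktemp)"
cat > "$POM" <<'EOF'
<project>
  <properties>
    <hadoop.version>2.2.0</hadoop.version>
  </properties>
  <profiles>
    <profile>
      <id>hadoop-2.6</id>
      <properties>
        <hadoop.version>2.6.4</hadoop.version>
      </properties>
    </profile>
  </profiles>
</project>
EOF

# Print the text between <hadoop.version> and </hadoop.version> on every matching line.
# The first match is the default; later matches come from profiles.
VERSIONS="$(sed -n 's:.*<hadoop\.version>\(.*\)</hadoop\.version>.*:\1:p' "$POM")"
DEFAULT_VERSION="$(echo "$VERSIONS" | head -n 1)"

echo "default hadoop.version: $DEFAULT_VERSION"
echo "all declared values:"
echo "$VERSIONS"

rm -f "$POM"
```

A line-based `sed` match like this only works when each element sits on a single line, as it does in the excerpt; it is a quick inspection aid, not an XML parser.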

Run the build from the source directory:

[hadoop@hadoop spark-2.1.0]$ pwd
/home/hadoop/source/spark-2.1.0

./build/mvn -Pyarn -Phadoop-2.6 -Phive -Phive-thriftserver -Dhadoop.version=2.6.0-cdh5.7.0 -DskipTests clean package

To build a runnable distribution package:

./dev/make-distribution.sh --name 2.6.0-cdh5.7.0 --tgz -Pyarn -Phadoop-2.6 -Phive -Phive-thriftserver -Dhadoop.version=2.6.0-cdh5.7.0

make-distribution.sh names the resulting archive using the pattern

spark-$VERSION-bin-$NAME.tgz

which in this case gives spark-2.1.0-bin-2.6.0-cdh5.7.0.tgz
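The naming pattern can be sketched in a few lines of shell, which is handy for locating the archive after the build. This assumes a Bash shell run from the directory where the archive was written; the `$HOME/app` install directory is a hypothetical choice made for illustration.

```shell
#!/usr/bin/env bash
VERSION="2.1.0"          # Spark version being built
NAME="2.6.0-cdh5.7.0"    # value passed to --name

# Same pattern as the archive name: spark-$VERSION-bin-$NAME.tgz
TARBALL="spark-${VERSION}-bin-${NAME}.tgz"
echo "$TARBALL"

# Unpack the archive if it exists; otherwise say where we looked.
if [ -f "$TARBALL" ]; then
  mkdir -p "$HOME/app"   # hypothetical install location
  tar -xzf "$TARBALL" -C "$HOME/app"
else
  echo "$TARBALL not found in $(pwd); run make-distribution.sh first"
fi
```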

Build failures

Failed to execute goal on project ...: Could not resolve dependencies for project ...

Add the following to pom.xml:

<repository> <id>cloudera</id> <url>


When reposting, please credit the source: http://www.aierlanlan.com/rzdk/2610.html
