Table of contents
Spark installation · Build failure · Environment setup · Standalone · Local IDE · HiveContext · APP · SparkSession · Spark Shell · Spark SQL · thriftserver/beeline usage · JDBC
Limitations of MapReduce:
1) Verbose code;
2) Only supports the map and reduce methods;
3) Low execution efficiency;
4) Ill-suited to iterative, interactive, and streaming processing.
Framework proliferation:
1) Batch (offline): MapReduce, Hive, Pig
2) Streaming (real-time): Storm, JStorm
3) Interactive computing: Impala
Running this many frameworks quietly drives up both the learning curve and the operations cost.
=== Spark
Spark installation
Prerequisites:
1) Building Spark using Maven requires Maven 3.3.9 or newer and Java 7+
2) export MAVEN_OPTS="-Xmx2g -XX:ReservedCodeCacheSize=512m"
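The memory settings above need to be exported before invoking the build, so that the mvn child process inherits them; a minimal sketch using the values quoted above:

```shell
# Give Maven enough heap and JIT code-cache room for the Spark build
export MAVEN_OPTS="-Xmx2g -XX:ReservedCodeCacheSize=512m"
# Confirm the setting is visible to child processes such as mvn
echo "$MAVEN_OPTS"
```

Without this, large modules can fail the build with OutOfMemoryError during compilation.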
mvn build command:
./build/mvn -Pyarn -Phadoop-2.4 -Dhadoop.version=2.4.0 -DskipTests clean package
[hadoop@hadoop spark-2.1.0]$ pwd
/home/hadoop/source/spark-2.1.0
[hadoop@hadoop spark-2.1.0]$ cat pom.xml
<properties>
    <hadoop.version>2.2.0</hadoop.version>
    <protobuf.version>2.5.0</protobuf.version>
    <yarn.version>${hadoop.version}</yarn.version>
    ......
</properties>
...............
<profile>
    <id>hadoop-2.6</id>
    <properties>
        <hadoop.version>2.6.4</hadoop.version>
        <jets3t.version>0.9.3</jets3t.version>
        <zookeeper.version>3.4.6</zookeeper.version>
        <curator.version>2.6.0</curator.version>
    </properties>
</profile>
Run the build from the source directory:
[hadoop@hadoop spark-2.1.0]$ pwd
/home/hadoop/source/spark-2.1.0
./build/mvn -Pyarn -Phadoop-2.6 -Phive -Phive-thriftserver -Dhadoop.version=2.6.0-cdh5.7.0 -DskipTests clean package
Building a runnable distribution:
./dev/make-distribution.sh --name 2.6.0-cdh5.7.0 --tgz -Pyarn -Phadoop-2.6 -Phive -Phive-thriftserver -Dhadoop.version=2.6.0-cdh5.7.0
make-distribution.sh names the resulting tarball
spark-$VERSION-bin-$NAME.tgz
which here yields spark-2.1.0-bin-2.6.0-cdh5.7.0.tgz
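The naming rule above can be checked with a quick sketch; VERSION is normally read from pom.xml by the script, and NAME is whatever was passed via --name:

```shell
# make-distribution.sh names the tarball spark-$VERSION-bin-$NAME.tgz
VERSION=2.1.0           # Spark version (from pom.xml)
NAME=2.6.0-cdh5.7.0     # value passed via --name
echo "spark-${VERSION}-bin-${NAME}.tgz"
# → spark-2.1.0-bin-2.6.0-cdh5.7.0.tgz
```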
Build failure
Failed to execute goal on project ...: Could not resolve dependencies for project ...
Add the repository to pom.xml:
<repository>
    <id>cloudera</id>
    <url>......</url>
</repository>
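One way to confirm that the CDH-flavoured artifacts are what Maven cannot resolve is to look for them in the local repository; a sketch, assuming Maven's default ~/.m2/repository layout and using hadoop-client as a representative artifact:

```shell
# CDH artifacts (e.g. 2.6.0-cdh5.7.0) are not in Maven Central; if they were
# never downloaded, "Could not resolve dependencies" points at a missing repo.
ARTIFACT_DIR="$HOME/.m2/repository/org/apache/hadoop/hadoop-client/2.6.0-cdh5.7.0"
if [ -d "$ARTIFACT_DIR" ]; then
    echo "artifact present"
else
    echo "artifact missing: add the Cloudera repository to pom.xml"
fi
```

After adding the repository, re-running the same build command lets Maven fetch the CDH artifacts and resolve the failure.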