1. Prerequisite: Hadoop must already be installed before installing Spark (I have three machines, all with Hadoop set up).
2. Download Spark and extract it to a local directory on the master machine.
3. Edit conf/spark-env.sh (rename the original spark-env.sh.template to spark-env.sh).
Configure it as follows:
export HADOOP_HOME=/usr/local/hadoop/hadoop-2.6.5
export HADOOP_CONF_DIR=/usr/local/hadoop/hadoop-2.6.5/etc/hadoop
export JAVA_HOME=/usr/local/jdk/jdk1.8.0_121
export SCALA_HOME=/usr/local/scala/scala-2.12.1
export SPARK_MASTER_IP=master
export SPARK_WORKER_MEMORY=1G
export SPARK_EXECUTOR_MEMORY=1G
export SPARK_DRIVER_MEMORY=1G
export SPARK_WORKER_CORES=6
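Adjust the memory and core values to match your hardware. Note that in Spark 2.x, SPARK_MASTER_HOST is the preferred name for this setting (SPARK_MASTER_IP is the older, deprecated name), so you may optionally add:
export SPARK_MASTER_HOST=master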
4. Edit conf/spark-defaults.conf (rename the original spark-defaults.conf.template to spark-defaults.conf).
Configure it as follows:
spark.eventLog.enabled true
spark.eventLog.dir hdfs://master:9000/historyserverforSpark
spark.executor.extraJavaOptions -XX:+PrintGCDetails -Dkey=value -Dnumbers="one two three"
spark.yarn.historyServer.address master:18080
spark.history.fs.logDirectory hdfs://master:9000/historyserverforSpark
5. Edit conf/slaves (I have two worker nodes).
Configure it as follows:
slave1
slave2
6. Configure /etc/profile
# SPARK_HOME
export SPARK_HOME=/usr/local/spark/spark-2.1.0-bin-hadoop2.6
export PATH=$PATH:$SPARK_HOME/bin:$SPARK_HOME/sbin
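After editing, reload the profile on master so the variables take effect in the current shell; a quick sanity check (assuming the paths above):
source /etc/profile
echo $SPARK_HOME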
7. Copy the Spark directory and /etc/profile from master to the slave machines (slave1 as an example):
scp -r /usr/local/spark root@slave1:/usr/local/
scp /etc/profile root@slave1:/etc/profile
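The copied /etc/profile does not affect shells that are already open, so log in to each slave and reload it there as well, e.g.:
ssh root@slave1
source /etc/profile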
7.2 Create the historyserverforSpark directory on HDFS
Go into the Hadoop bin directory and run:
hdfs dfs -mkdir /historyserverforSpark
(the older form, hadoop dfs -mkdir /historyserverforSpark, still works but is deprecated)
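To verify the directory exists (it must match spark.eventLog.dir and spark.history.fs.logDirectory in spark-defaults.conf above), list the HDFS root, e.g.:
hdfs dfs -ls /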
8. After starting Hadoop, start Spark:
$SPARK_HOME/sbin/start-all.sh
(use the full path so Spark's start-all.sh is run, not Hadoop's script of the same name)
Then start the history server:
$SPARK_HOME/sbin/start-history-server.sh
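A quick way to check the daemons is jps on each machine; the master should show Master and HistoryServer (plus the Hadoop processes), and slave1/slave2 should show Worker:
jps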
9. Check whether everything is running
Open master:8080 (the Spark master web UI) and master:18080 (the history server UI) in a browser.
10. Run an example with bin/spark-submit
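As a sanity check, a minimal SparkPi submission against the standalone master might look like this (the examples jar name depends on your exact download; this assumes the Spark 2.1.0 build for Hadoop 2.6):
bin/spark-submit \
  --class org.apache.spark.examples.SparkPi \
  --master spark://master:7077 \
  examples/jars/spark-examples_2.11-2.1.0.jar \
  100
The "Pi is roughly ..." line appears in the driver output, and the completed application should then be visible on master:18080.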