Environment: Ubuntu 12.04 LTS Desktop 64-bit

0. Prepare the environment

I do everything here without root.

0.1 Set the hostnames

0.2 Configure the hosts file

    10.1.1.107 master
    10.1.1.108 slave1
    10.1.1.109 slave2

After editing, ping the hostnames to check that they resolve:

    ping master

0.3 Disable the firewall

    sudo ufw disable

0.4 Install Java

    tar xvzf jdk-7u75-linux-x64.gz

Set the environment variables (e.g. in /etc/profile):

    export JAVA_HOME=/home/administrator/work/jdk1.7.0_75
    export JRE_HOME=/home/administrator/work/jdk1.7.0_75/jre
    export PATH=$JAVA_HOME/bin:$JAVA_HOME/jre/bin:$PATH
    export CLASSPATH=$CLASSPATH:.:$JAVA_HOME/lib:$JAVA_HOME/jre/lib

Apply them:

    source /etc/profile

Verify that Java is installed:

    java -version

If version information is printed, the installation succeeded.

0.5 Install and configure Scala

    tar xvzf scala-2.10.3.tgz

Set the environment variables (again in /etc/profile; the Scala directory lives under the user's home, consistent with SCALA_HOME in section 2.2, since /usr/java would need root):

    export SCALA_HOME=/home/administrator/work/scala-2.10.3
    export PATH=$PATH:$SCALA_HOME/bin

Apply them:

    source /etc/profile

Verify that Scala is installed:

    scala -version

If version information is printed, the installation succeeded.

0.6 Set up passwordless SSH

    sudo apt-get install openssh-server

My machines were offline, so I built OpenSSL from the source package at http://www.openssl.org/source/openssl-1.0.1e.tar.gz:

    tar xvzf openssl-1.0.1e.tar.gz
    cd openssl-1.0.1e
    ./config shared --prefix=/usr/local
    make && make install

Generate a private/public key pair on every machine (press Enter at every prompt):

    ssh-keygen -t rsa

For all machines to reach each other, every machine's public key has to be appended to authorized_keys on all the others; the keys can be copied around with scp:

    scp .ssh/id_rsa.pub administrator@slave1:/home/administrator/.ssh/id_rsa_master.pub

Note the directory where ssh-keygen reported storing the keys, change into it, and append all the public keys to authorized_keys:

    cat .ssh/id_rsa.pub >> .ssh/authorized_keys

Verify passwordless SSH:

    ssh master
    ssh slave1
    ssh slave2

1. Install Hadoop YARN

1.1 Install Hadoop

    tar xvzf hadoop-2.6.0.tar.gz

Do not extract hadoop-2.6.0 into /usr/java: /usr/java requires root to access, and it is best to avoid root entirely.

1.2 Configure Hadoop

In hadoop-2.6.0/etc/hadoop/hadoop-env.sh, set:

    export JAVA_HOME=/home/administrator/work/jdk1.7.0_75

In hadoop-2.6.0/etc/hadoop, rename mapred-site.xml.template to mapred-site.xml and add:

    <configuration>
      <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
      </property>
    </configuration>

In hadoop-2.6.0/etc/hadoop/, edit core-site.xml:

    <configuration>
      <property>
        <name>fs.default.name</name>
        <value>hdfs://master:8020</value>
        <final>true</final>
      </property>
    </configuration>

In hadoop-2.6.0/etc/hadoop/, edit yarn-site.xml:

    <configuration>
      <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
      </property>
      <property>
        <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
        <value>org.apache.hadoop.mapred.ShuffleHandler</value>
      </property>
      <property>
        <name>yarn.resourcemanager.scheduler.address</name>
        <value>master:8030</value>
      </property>
      <property>
        <name>yarn.resourcemanager.resource-tracker.address</name>
        <value>master:8031</value>
      </property>
      <property>
        <name>yarn.resourcemanager.address</name>
        <value>master:8032</value>
      </property>
      <property>
        <name>yarn.resourcemanager.admin.address</name>
        <value>master:8033</value>
      </property>
    </configuration>

Note: the configuration guides I had found online all used mapreduce.shuffle as the aux-services value, and with it YARN never came up properly. jps showed the YARN process with a PID, but it had in fact already died; the logs showed the error (I forgot to take a screenshot of the exact message). Hadoop 2.6.0 expects mapreduce_shuffle.

Still in hadoop-2.6.0/etc/hadoop/, edit hdfs-site.xml:

    <configuration>
      <property>
        <name>dfs.namenode.name.dir</name>
        <value>/home/administrator/work/mnt/disk1/yarn/dfs/name</value>
      </property>
      <property>
        <name>dfs.datanode.data.dir</name>
        <value>/home/administrator/work/mnt/disk1/yarn/dfs/data</value>
      </property>
      <property>
        <name>dfs.replication</name>
        <value>1</value>
      </property>
      <property>
        <name>dfs.permissions</name>
        <value>false</value>
      </property>
    </configuration>

In the hadoop-2.6.0/etc/hadoop/slaves file, add your nodes' IPs or hostnames:

    slave1
    slave2

Edit ~/.bashrc:

    export HADOOP_COMMON_LIB_NATIVE_DIR=/home/administrator/work/hadoop-2.6.0/lib/native
    export HADOOP_OPTS="-Djava.library.path=/home/administrator/work/hadoop-2.6.0/lib"
    export HADOOP_ROOT_LOGGER=DEBUG,console   # turn on debug output on the console

Apply it:

    source /home/administrator/.bashrc

The same configuration has to be present on every node; a small helper for pushing it to the slaves is sketched below.
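A minimal sketch of that helper (hypothetical, not part of the original walkthrough): it assumes the same install path and the administrator user on every node, plus the passwordless SSH from section 0.6.

    #!/bin/bash
    # Push the edited Hadoop configuration files from master to the slaves.
    HADOOP_CONF=/home/administrator/work/hadoop-2.6.0/etc/hadoop
    for node in slave1 slave2; do
        # scp was already used above to move SSH keys around; here it copies
        # the configuration so all nodes run with identical settings.
        scp "$HADOOP_CONF"/hadoop-env.sh "$HADOOP_CONF"/core-site.xml \
            "$HADOOP_CONF"/hdfs-site.xml "$HADOOP_CONF"/mapred-site.xml \
            "$HADOOP_CONF"/yarn-site.xml "$HADOOP_CONF"/slaves \
            administrator@"$node":"$HADOOP_CONF"/
    done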
Hadoop normally looks for compiled native libraries in lib. If you do not want to compile them yourself, you can use the precompiled libraries shipped in lib/native by copying them into lib:

    cp hadoop-2.6.0/lib/native/* hadoop-2.6.0/lib/

1.3 Verify the Hadoop installation

From the Hadoop directory, format the NameNode and start HDFS:

    bin/hadoop namenode -format
    sbin/start-dfs.sh

Start YARN:

    sbin/start-yarn.sh

Open a browser and go to http://master:8088.

There is a big pitfall here. When the format command is rerun, it asks whether to clear the existing files. If you clear them, the DataNodes will refuse to start the second time: the ID in the slave nodes' files was copied from the master when it was first formatted, and master and slaves must agree for a DataNode to come up. Reformatting regenerates the master's ID, so either do not clear the files, or, if you do clear them, delete the DataNodes' data directories by hand as well.

1.4 Run a MapReduce example under YARN

Run the pi example:

    bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.0.jar pi 20 10

The 20 and 10 are the pi program's arguments: 20 map tasks and 10 samples per map.

2. Install Spark

2.1 Install Spark

    tar xvzf spark-1.2.1-bin-hadoop2.4.tgz

2.2 Configure Spark

In conf/spark-env.sh, set:

    export SPARK_LOCAL_IP=<the local IP of each machine>
    export SCALA_HOME=/home/administrator/work/scala-2.10.3
    export JAVA_HOME=/home/administrator/work/jdk1.7.0_75
    export HADOOP_HOME=/home/administrator/work/hadoop-2.6.0
    export SPARK_LOCAL_DIR=/home/administrator/work/spark-1.2.1-bin-hadoop2.4
    export SPARK_JAVA_OPTS="-Dspark.storage.blockManagerHeartBeatMs=60000 -Dspark.local.dir=$SPARK_LOCAL_DIR -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -Xloggc:$SPARK_HOME/logs/gc.log -XX:+UseConcMarkSweepGC -XX:+UseCMSCompactAtFullCollection -XX:CMSInitiatingOccupancyFraction=60 -Xms256m -Xmx256m -XX:MaxPermSize=128m"
    export SPARK_MASTER_IP=master
    export SPARK_MASTER_PORT=7077
    export SPARK_WORKER_CORES=1
    export SPARK_WORKER_MEMORY=2g
    export SPARK_WORKER_PORT=9090
    export SPARK_WORKER_WEBUI_PORT=9099
    export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop

In the conf/slaves file, list the slave hostnames:

    slave1
    slave2

2.3 Start Spark

    sbin/start-all.sh

2.4 Verify Spark

Running jps shows the daemons, for example:

    8566 SecondaryNameNode
    8955 NodeManager
    8082 NameNode
    17022 Jps
    8733 ResourceManager
    8296 DataNode

Open the Spark web management page: http://master:8080

Run the examples:

    # Run locally with two threads
    ./bin/run-example SparkPi 10 --master local[2]

    # Run the application locally on 6 cores
    ./bin/spark-submit --class org.apache.spark.examples.SparkPi \
      --master local[6] \
      lib/spark-examples-1.2.1-hadoop2.4.0.jar 100

    # Run on a Spark standalone cluster in client deploy mode
    ./bin/spark-submit --class org.apache.spark.examples.SparkPi \
      --master spark://master:7077 \
      --executor-memory 8G \
      --total-executor-cores 4 \
      lib/spark-examples-1.2.1-hadoop2.4.0.jar 100

    # Run on a Spark standalone cluster in cluster deploy mode with supervise
    ./bin/spark-submit --class org.apache.spark.examples.SparkPi \
      --master spark://master:7077 \
      --deploy-mode cluster --supervise \
      --executor-memory 8G \
      --total-executor-cores 8 \
      lib/spark-examples-1.2.1-hadoop2.4.0.jar 100

    # Run on a YARN cluster (use --master yarn-client for client mode)
    ./bin/spark-submit --class org.apache.spark.examples.SparkPi \
      --master yarn-cluster \
      --executor-memory 8G \
      --num-executors 8 \
      /path/to/examples.jar 100
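To check all three machines at once instead of logging into each, a quick loop over SSH works. This is only a convenience sketch: it assumes jps is on each node's non-interactive PATH (it ships with the JDK from section 0.4).

    #!/bin/bash
    # Print the running JVM daemons on every node.
    for node in master slave1 slave2; do
        echo "=== $node ==="
        # master should list NameNode, SecondaryNameNode, ResourceManager and
        # the Spark Master; slaves should list DataNode, NodeManager, Worker.
        ssh "$node" jps
    done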
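As a final end-to-end check (a sketch of my own, not part of the original walkthrough), one option is to put a file into HDFS and count its lines from the Spark shell. The hdfs://master:8020 URI matches fs.default.name from section 1.2; /tmp/smoke is an arbitrary example path.

    # Put a sample file into HDFS.
    cd /home/administrator/work/hadoop-2.6.0
    bin/hadoop fs -mkdir -p /tmp/smoke
    bin/hadoop fs -put etc/hadoop/core-site.xml /tmp/smoke/

    # Count its lines from the Spark shell against the standalone master.
    cd /home/administrator/work/spark-1.2.1-bin-hadoop2.4
    echo 'sc.textFile("hdfs://master:8020/tmp/smoke/core-site.xml").count()' \
        | ./bin/spark-shell --master spark://master:7077

If the count is printed without errors, HDFS, the standalone master, and the workers are all talking to each other.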