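1. An incomplete understanding of the hadoop eclipse plugin
In http://zy19982004./blog/2024467 I described my understanding of what the hadoop eclipse plugin does. In fact I had made a mistake: the MyWordCount program in Eclipse on Win7 had been running locally all along and was never submitted to the cluster (checking 192.168.1.200:50030 showed no such Job). There are two ways to run it, via right-click Run As: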
- Java Application
- Run on Hadoop
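That Run As Java Application runs locally is understandable: it uses the Hadoop jars the project depends on, reads its input from HDFS, steps through the main method of MyWordCount, and writes the output back to HDFS. None of this involves the MapReduce cluster.
But why does Run on Hadoop not work either? Is the plugin's role really as limited as described in http://zy19982004./blog/2024467?
2. Hadoop2.x eclipse-plugin
I downloaded the source again from https://github.com/winghc/hadoop2x-eclipse-plugin and skimmed a few classes, such as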
- Wizard for publishing a job to a Hadoop server
- public class RunOnHadoopWizard extends Wizard {}
- Representation of a Map/Reduce running job on a given location
- public class HadoopJob {}
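The comments alone show that the plugin does support submitting Jobs remotely. So was I using it incorrectly?
3. How the Hadoop2.x eclipse-plugin works
On Run on Hadoop:
- A MapReduce jar and a matching folder (containing core-site.xml) are generated under EclipseWorkspace\.metadata\.plugins\org.apache.hadoop.eclipse\. The Hadoop cluster settings configured in Eclipse are written into that core-site.xml.
- The Job is then submitted locally or to the cluster according to that configuration.
I looked at the core-site.xml generated for the Job: mapreduce.framework.name was local and yarn.resourcemanager.address was 0.0.0.0:8032. Going back to where the cluster is configured in Eclipse, I found the same values, which means the plugin does not copy all of the cluster's configuration into Eclipse. After changing those two entries in Eclipse the Job still ran locally, which puzzled me, so I added the following to the program: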
- conf.set("mapreduce.framework.name", "yarn");
- conf.set("yarn.resourcemanager.address", "192.168.1.200:8032");
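After that the Job was finally submitted to the cluster. I suspect the plugin still picks up local and 0.0.0.0:8032 from somewhere at the last moment and writes them into core-site.xml; I will look at the plugin source when I have time.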
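To make the check on the generated core-site.xml repeatable, here is a minimal sketch that parses a Hadoop-style configuration XML and reports whether the two properties discussed above would keep the Job local. It deliberately uses only the JDK's XML parser rather than Hadoop's own configuration classes. The class name SiteXmlCheck, its method names, and the inline sample XML are assumptions made for illustration; the fallback values simply mirror the local and 0.0.0.0:8032 values observed above.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper, not part of Hadoop or the plugin.
class SiteXmlCheck {

    /** Reads every <property><name/><value/></property> pair from the XML text. */
    static Map<String, String> readProperties(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        Map<String, String> props = new LinkedHashMap<>();
        NodeList nodes = doc.getElementsByTagName("property");
        for (int i = 0; i < nodes.getLength(); i++) {
            Element property = (Element) nodes.item(i);
            String name = childText(property, "name");
            String value = childText(property, "value");
            if (name != null && value != null) {
                props.put(name, value);
            }
        }
        return props;
    }

    /** Returns the trimmed text of the first child element with this tag, or null. */
    private static String childText(Element parent, String tag) {
        NodeList list = parent.getElementsByTagName(tag);
        return list.getLength() == 0 ? null : list.item(0).getTextContent().trim();
    }

    /**
     * True when the framework is not yarn or the ResourceManager address
     * still points at 0.0.0.0 (the values seen in the generated file).
     */
    static boolean staysLocal(Map<String, String> props) {
        String framework = props.getOrDefault("mapreduce.framework.name", "local");
        String rmAddress = props.getOrDefault("yarn.resourcemanager.address", "0.0.0.0:8032");
        return !"yarn".equals(framework) || rmAddress.startsWith("0.0.0.0:");
    }

    public static void main(String[] args) throws Exception {
        // Sample input reproducing the two values found in the generated core-site.xml.
        String xml = "<configuration>"
                + "<property><name>mapreduce.framework.name</name><value>local</value></property>"
                + "<property><name>yarn.resourcemanager.address</name><value>0.0.0.0:8032</value></property>"
                + "</configuration>";
        Map<String, String> props = readProperties(xml);
        System.out.println(props + " staysLocal=" + staysLocal(props));
    }
}
```

In practice the XML text would come from the core-site.xml in the plugin's folder under the Eclipse workspace instead of the inline string.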
4. More problems
Although the Job was now submitted to the cluster, it failed.
The log shows:
- 2014-04-01 19:50:36,731 INFO org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher: Command to launch container container_1396351641800_0005_02_000001 :
- %JAVA_HOME% -Dlog4j.configuration=container-log4j.properties -Dyarn.app.container.log.dir=<LOG_DIR> -Dyarn.app.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA -Xmx1024m org.apache.hadoop.mapreduce.v2.app.MRAppMaster 1><LOG_DIR>/stdout 2><LOG_DIR>/stderr
- 2014-03-13 22:50:41,317 INFO org.apache.hadoop.mapreduce.Job - Job job_1394710790246_0003 failed with state FAILED due to: Application application_1394710790246_0003 failed 2 times due to AM Container for appattempt_1394710790246_0003_000002 exited with exitCode: 1 due to: Exception from container-launch:
- org.apache.hadoop.util.Shell$ExitCodeException: /bin/bash: line 0: fg: no job control
- at org.apache.hadoop.util.Shell.runCommand(Shell.java:464)
- at org.apache.hadoop.util.Shell.run(Shell.java:379)
- at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:589)
- at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195)
- at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:283)
- at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:79)
- at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
- at java.util.concurrent.FutureTask.run(FutureTask.java:166)
- at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
- at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
- at java.lang.Thread.run(Thread.java:722)
Searching online showed that this is a problem in Hadoop itself.
https://issues./jira/browse/YARN-1298
https://issues./jira/browse/MAPREDUCE-5655
5. Building Hadoop2.2 myself
- Download the Hadoop2.2 source from http://apache./apache-mirror/hadoop/common/hadoop-2.2.0/hadoop-2.2.0-src.tar.gz
- Download the two patches from https://issues./jira/i#browse/MAPREDUCE-5655
- Download the patch from https://issues./jira/i#browse/HADOOP-10110
- Apply them with the patch command: patch -p0 < MRApps.patch. The 0 in -p0 is the number of leading path components stripped from the file names inside the patch. If this is unfamiliar, see http://hi.baidu.com/thinkinginlamp/item/0ba1d051319b5ac09e2667f8
- Then build by following http://my.oschina.net/yiyuqiuchi/blog/188510. The built package is hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0.tar.gz.
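The meaning of the number in -p0 is easy to get wrong, so here is a small sketch of how the patch command shortens a file name for a given -pN: it drops the smallest prefix that contains N leading slashes, treating a run of adjacent slashes as one. The class PatchStrip and the sample path are made up for illustration, and throwing an exception when the name has too few slashes is a choice made in this sketch only.

```java
// Hypothetical illustration of the -pN rule; this is not code from patch itself.
class PatchStrip {

    /**
     * Returns fileName with the smallest prefix containing n slashes removed.
     * n = 0 leaves the name unchanged.
     */
    static String strip(String fileName, int n) {
        int pos = 0;
        for (int removed = 0; removed < n; removed++) {
            int slash = fileName.indexOf('/', pos);
            if (slash < 0) {
                throw new IllegalArgumentException(
                        "fewer than " + n + " slashes in " + fileName);
            }
            pos = slash + 1;
            // Adjacent slashes count as a single separator.
            while (pos < fileName.length() && fileName.charAt(pos) == '/') {
                pos++;
            }
        }
        return fileName.substring(pos);
    }

    public static void main(String[] args) {
        String name = "/u/howard/src/blurfl/blurfl.c"; // illustrative path only
        for (int n : new int[] {0, 1, 4}) {
            System.out.println("-p" + n + " -> " + strip(name, n));
        }
    }
}
```

With -p0 the paths in the patch are used exactly as written, so the command has to be run from the directory those paths are relative to.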
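Here are two comparison images from before and after the patch
In the image below, the left side is the Hadoop source before the patch
In the image below, the left side is the Hadoop source after the patch was applied successfully
6. Using the self-built package
- Check whether the patches made it into the package. Inspecting the bytecode of MRApps.class confirms that they did.
The bytecode of YARNRunner.class is correct as well: I added a compile-time constant PATCH_TEST to YARNRunner.java, and the bytes of "zy19982004" are embedded in the class's bytecode.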

- Replace the cluster jars. The three patches touch only two jars; the remaining patch modifies a pom file with scope test and can be ignored. Replace the corresponding jar on the cluster with hadoop-2.2.0\share\hadoop\mapreduce\hadoop-mapreduce-client-common-2.2.0.jar (MRApps.patch), and do the same with hadoop-2.2.0\share\hadoop\mapreduce\hadoop-mapreduce-client-jobclient-2.2.0.jar (YARNRunner.patch).
- Edit mapred-site.xml on the Windows side and add
- <property>
- <name>mapred.remote.os</name>
- <value>Linux</value>
- <description>Remote MapReduce framework's OS, can be either Linux or Windows</description>
- </property>
- Restart the cluster. The previous error is gone, but a different one appears:
- Application application_1396339724108_0014 failed 2 times due to AM Container for appattempt_1396339724108_0014_000002 exited with exitCode: 1 due to: Exception from container-launch:
- org.apache.hadoop.util.Shell$ExitCodeException:
- 2014-04-01 19:50:36,731 INFO org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher: Command to launch container container_1396351641800_0005_02_000001 :
- $JAVA_HOME/bin/java -Dlog4j.configuration=container-log4j.properties -Dyarn.app.container.log.dir=<LOG_DIR> -Dyarn.app.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA -Xmx1024m org.apache.hadoop.mapreduce.v2.app.MRAppMaster 1><LOG_DIR>/stdout 2><LOG_DIR>/stderr
- 1) Add such config property to your mapred-site.xml (client side only):
- <property>
- <name>mapreduce.application.classpath</name>
- <value>
- $HADOOP_CONF_DIR,
- $HADOOP_COMMON_HOME/share/hadoop/common/*,
- $HADOOP_COMMON_HOME/share/hadoop/common/lib/*,
- $HADOOP_HDFS_HOME/share/hadoop/hdfs/*,
- $HADOOP_HDFS_HOME/share/hadoop/hdfs/lib/*,
- $HADOOP_MAPRED_HOME/share/hadoop/mapreduce/*,
- $HADOOP_MAPRED_HOME/share/hadoop/mapreduce/lib/*,
- $HADOOP_YARN_HOME/share/hadoop/yarn/*,
- $HADOOP_YARN_HOME/share/hadoop/yarn/lib/*
- </value>
- </property>
It finally worked.
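The bytecode check from the first step of this section (looking for the "zy19982004" marker in YARNRunner.class) can be automated. The sketch below searches the raw bytes of a class file for an ASCII marker. It relies on the fact that string constants are stored in the class file's constant pool in modified UTF-8, which for plain ASCII text is byte-for-byte the same as ASCII. The class name ClassMarkerCheck and its command-line arguments are assumptions made for illustration, and the .class file is assumed to have been extracted from the jar beforehand.

```java
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper for confirming that a patched class was packaged.
class ClassMarkerCheck {

    /** Returns true if the ASCII marker occurs anywhere in the given bytes. */
    static boolean containsMarker(byte[] classBytes, String marker) {
        byte[] needle = marker.getBytes(StandardCharsets.US_ASCII);
        if (needle.length == 0) {
            return true;
        }
        outer:
        for (int i = 0; i + needle.length <= classBytes.length; i++) {
            for (int j = 0; j < needle.length; j++) {
                if (classBytes[i + j] != needle[j]) {
                    continue outer; // mismatch, try the next start position
                }
            }
            return true;
        }
        return false;
    }

    public static void main(String[] args) throws Exception {
        // args[0]: path to an extracted .class file, args[1]: marker text
        if (args.length != 2) {
            System.err.println("usage: java ClassMarkerCheck <class-file> <marker>");
            return;
        }
        byte[] bytes = Files.readAllBytes(Paths.get(args[0]));
        System.out.println(containsMarker(bytes, args[1]) ? "marker found" : "marker missing");
    }
}
```

A hit only shows that the constant was compiled into that class file; it does not show that the cluster actually loads that jar.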
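7. When the hadoop eclipse plugin is not needed
When debugging the Hadoop source yourself, Debug As Java Application is enough. The errors in the previous two posts were both resolved by debugging the source.
8. Summary
How to submit a job from Windows to Hadoop on Linux
- Configure the hadoop eclipse plugin.
- Set mapreduce.framework.name to yarn in the Job configuration. The other settings must be correct as well.
- Run On Hadoop
Run As Application can in fact also submit the Job. It relies on the jar produced by the previous Run on Hadoop, which gives us one way to debug.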

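Comments
Your two questions:
1. If it still fails with "exited with exitCode: 1 due to: Exception from container-launch", you will need to debug it yourself.
2. For the ClassNotFoundException, consider whether the plugin failed to upload the jar to HDFS; debugging will show that as well.
I am running it with Run on Hadoop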
14/06/26 15:43:51 INFO mapreduce.Job: Job job_1403768617899_0002 failed with state FAILED due to: Application application_1403768617899_0002 failed 2 times due to AM Container for appattempt_1403768617899_0002_000002 exited with exitCode: 1 due to: Exception from container-launch:
org.apache.hadoop.util.Shell$ExitCodeException:
at org.apache.hadoop.util.Shell.runCommand(Shell.java:464)
at org.apache.hadoop.util.Shell.run(Shell.java:379)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:589)
at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:283)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:79)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
1) Add such config property to your mapred-site.xml (client side only):
<property>
<name>mapreduce.application.classpath</name>
<value>
$HADOOP_CONF_DIR,
$HADOOP_COMMON_HOME/share/hadoop/common/*,
$HADOOP_COMMON_HOME/share/hadoop/common/lib/*,
$HADOOP_HDFS_HOME/share/hadoop/hdfs/*,
$HADOOP_HDFS_HOME/share/hadoop/hdfs/lib/*,
$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/*,
$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/lib/*,
$HADOOP_YARN_HOME/share/hadoop/yarn/*,
$HADOOP_YARN_HOME/share/hadoop/yarn/lib/*
</value>
</property> This really has to be added for it to work,
14/06/26 15:43:51 INFO mapreduce.Job: Job job_1403768617899_0002 failed with state FAILED due to: Application application_1403768617899_0002 failed 2 times due to AM Container for appattempt_1403768617899_0002_000002 exited with exitCode: 1 due to: Exception from container-launch:
org.apache.hadoop.util.Shell$ExitCodeException:
at org.apache.hadoop.util.Shell.runCommand(Shell.java:464)
at org.apache.hadoop.util.Shell.run(Shell.java:379)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:589)
at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:283)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:79)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
How are you running the WordCount program?
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging./log4j/1.2/faq.html#noconfig for more info.
Error: java.lang.RuntimeException: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class test.WordCount$Reduce not found
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1752)
at org.apache.hadoop.mapred.JobConf.getCombinerClass(JobConf.java:1139)
at org.apache.hadoop.mapred.Task$CombinerRunner.create(Task.java:1517)
at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.init(MapTask.java:1010)
at org.apache.hadoop.mapred.MapTask.createSortingCollector(MapTask.java:390)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:418)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:162)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:157)
Caused by: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class test.WordCount$Reduce not found
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1720)
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1744)
... 11 more
Caused by: java.lang.ClassNotFoundException: Class test.WordCount$Reduce not found
at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:1626)
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1718)
... 12 more
Exception in thread "main" java.io.IOException: Job failed!
at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:836)
at test.WordCount.main(WordCount.java:72)