Adding Hadoop Jobtracker History retention in BDE

While working on very large datasets tied back to an Isilon array for the HDFS layer, we discovered that the history server functionality was missing from BDE (both 1.1 and 2.0). After talking to a few individuals and getting some direction, but no solution, I realized that the ability to turn the feature on was already there; it just had not been enabled.

To turn on the jobtracker history server functionality, so that you can see the job logs after the jobs complete, add the properties shown below to the template file for your BDE version:

  • BDE 1.1: /opt/serengeti/cookbooks/cookbooks/hadoop_cluster/templates/default/mapred-site.xml.erb
  • BDE 2.0: /opt/serengeti/chef/cookbooks/hadoop_cluster/templates/default/mapred-site.xml.erb

<property>
  <name>mapreduce.jobhistory.webapp.address</name>
  <value><%= @resourcemanager_address %>:19888</value>
</property>

<property>
  <name>mapreduce.jobhistory.address</name>
  <value><%= @resourcemanager_address %>:10020</value>
</property>

As always, be sure to run ‘knife cookbook upload -a’ after editing the file so that the updated cookbooks are uploaded to the Chef server. The change will then be picked up by the clusters you deploy afterwards.
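
If you want to see what the two properties expand to before deploying a cluster, the snippet can be rendered by hand with Ruby's standard ERB library. This is only a rough sketch: the TemplateContext class and the hostname rm.example.com are made up for illustration, and Chef supplies its own context and the real @resourcemanager_address when it renders the template.

```ruby
require "erb"

# The two property blocks from mapred-site.xml.erb, as a standalone ERB string.
TEMPLATE = <<~XML
  <property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value><%= @resourcemanager_address %>:19888</value>
  </property>

  <property>
    <name>mapreduce.jobhistory.address</name>
    <value><%= @resourcemanager_address %>:10020</value>
  </property>
XML

# Hypothetical stand-in for the context Chef provides at render time.
class TemplateContext
  def initialize(resourcemanager_address)
    @resourcemanager_address = resourcemanager_address
  end

  # Render the template with this object's instance variables in scope.
  def render(template)
    ERB.new(template).result(binding)
  end
end

# Assumed hostname, for illustration only.
rendered = TemplateContext.new("rm.example.com").render(TEMPLATE)
puts rendered
```

In the rendered output, both value elements should contain the same host, with port 19888 for the history web UI and port 10020 for the history server itself.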