Thursday, August 11, 2016

[Hadoop] Setting up a Single Node Cluster

The two resources below are enough to set up a single-node Hadoop MapReduce cluster, but I still want to add some comments for my own reference.
http://www.thebigdata.cn/Hadoop/15184.html
http://www.powerxing.com/install-hadoop/

Log in as the user "hadoop"
# sudo su - hadoop

Go to the Hadoop installation directory
# cd /usr/local/hadoop

Add the following variables to ~/.bashrc
export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-amd64
export HADOOP_HOME=/usr/local/hadoop
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export HADOOP_INSTALL=/usr/local/hadoop
export PATH=$PATH:$HADOOP_INSTALL/bin
export PATH=$PATH:$HADOOP_INSTALL/sbin
export HADOOP_MAPRED_HOME=$HADOOP_INSTALL
export HADOOP_COMMON_HOME=$HADOOP_INSTALL
export HADOOP_HDFS_HOME=$HADOOP_INSTALL
export YARN_HOME=$HADOOP_INSTALL
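After reloading the file with "source ~/.bashrc", it helps to confirm that every variable actually reached the current shell. The following is only a minimal sketch: check_hadoop_env is a hypothetical helper name of my own, not part of Hadoop, and it only tests that each variable is non-empty.

```shell
# Hypothetical helper (not part of Hadoop): list the variables exported
# in ~/.bashrc and report any that are empty or unset in this shell.
check_hadoop_env() {
  missing=0
  for name in JAVA_HOME HADOOP_HOME HADOOP_CONF_DIR HADOOP_INSTALL \
              HADOOP_MAPRED_HOME HADOOP_COMMON_HOME HADOOP_HDFS_HOME YARN_HOME; do
    # Look up the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "not set: $name"
      missing=1
    else
      echo "$name=$value"
    fi
  done
  # Exit status 0 only when nothing is missing.
  return "$missing"
}
```

Usage: run "source ~/.bashrc" and then "check_hadoop_env"; a non-zero exit status means at least one variable is missing.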

Set JAVA_HOME explicitly in etc/hadoop/hadoop-env.sh (relative to /usr/local/hadoop)
export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-amd64
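To double-check the edit, the value can be read back from the file and tested for a real java binary. This is a rough sketch under assumptions: both function names are hypothetical, and the sed pattern assumes the line is written exactly as "export JAVA_HOME=..." with no quotes or leading spaces.

```shell
# Hypothetical helper: print the value of the last "export JAVA_HOME="
# line found in the given file.
java_home_from_file() {
  sed -n 's/^export JAVA_HOME=//p' "$1" | tail -n 1
}

# Hypothetical helper: succeed only if that value is a directory that
# contains an executable bin/java.
check_env_file() {
  dir=$(java_home_from_file "$1")
  if [ -n "$dir" ] && [ -x "$dir/bin/java" ]; then
    echo "ok: $dir/bin/java"
  else
    echo "JAVA_HOME in $1 does not point to a JDK: '$dir'"
    return 1
  fi
}
```

Usage: "check_env_file etc/hadoop/hadoop-env.sh" from /usr/local/hadoop. A file that still contains an unexpanded placeholder instead of a real path fails the check.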

Start HDFS and YARN
# sbin/start-dfs.sh
# sbin/start-yarn.sh
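Once both scripts return, the JDK tool "jps" lists the running Java processes. On a single node I would expect NameNode, DataNode, SecondaryNameNode, ResourceManager and NodeManager. The sketch below reads jps-style output ("<pid> <name>" per line) and prints whichever of those are absent; missing_daemons is a hypothetical helper name.

```shell
# Hypothetical helper: read "jps" output on stdin and print every
# expected single-node daemon that does not appear in it.
missing_daemons() {
  listing=$(cat)
  for daemon in NameNode DataNode SecondaryNameNode ResourceManager NodeManager; do
    # Compare the second column exactly, so "SecondaryNameNode" is not
    # mistaken for "NameNode".
    if ! printf '%s\n' "$listing" |
         awk -v d="$daemon" '$2 == d { found = 1 } END { exit !found }'; then
      echo "$daemon"
    fi
  done
}
```

Usage: "jps | missing_daemons". Empty output means all five daemons are up; otherwise the logs of the listed daemons are the place to look.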

Finally, we can try the Hadoop MapReduce example as follows:
# bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.0.jar grep input output 'dfs[a-z.]+'
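The job reads from the HDFS directory "input" and writes to "output", so "input" has to exist in HDFS before the job runs. A minimal sketch of the staging and of reading the result, assuming the hdfs command is on the PATH; stage_input and show_output are hypothetical helper names.

```shell
# Hypothetical helper: create the HDFS directory "input" (relative to the
# current user's HDFS home directory) and copy the given local files into it.
stage_input() {
  hdfs dfs -mkdir -p input &&
  hdfs dfs -put "$@" input
}

# Hypothetical helper: print the result files the job wrote to the HDFS
# directory "output". The pattern is quoted so HDFS expands it, not the
# local shell.
show_output() {
  hdfs dfs -cat 'output/*'
}
```

Usage: "stage_input etc/hadoop/*.xml" before the job, "show_output" after it. The job refuses to start if "output" already exists, so remove it with "hdfs dfs -rm -r output" before a rerun.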

P.S.
To force the NameNode to leave safe mode, run the following command:
# hdfs dfsadmin -safemode leave
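Before forcing it, the current state can be queried with "hdfs dfsadmin -safemode get", which reports "Safe mode is ON" or "Safe mode is OFF". The sketch below leaves safe mode only when it is actually on; leave_safemode_if_on is a hypothetical helper name, and the string match assumes that message format.

```shell
# Hypothetical helper: print the current safe mode state and issue
# "leave" only when the NameNode reports that safe mode is on.
leave_safemode_if_on() {
  state=$(hdfs dfsadmin -safemode get)
  echo "$state"
  case $state in
    *"Safe mode is ON"*) hdfs dfsadmin -safemode leave ;;
  esac
}
```

Usage: "leave_safemode_if_on". The NameNode normally leaves safe mode on its own shortly after startup, so forcing it is mostly useful when it stays stuck.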




