Contributing limited storage as a data node in a Hadoop cluster!

OBJECTIVE:
The objective is to find a way to contribute only a limited amount of storage to the name node from a data node in the Hadoop cluster. And YES.. we can achieve this using the concept of Linux partitions!
For this, we first need to set up a simple Hadoop cluster with one name node and at least one data node.
Let’s move on !
We’ll start off by launching two instances: one name node and one data node.

Then we need Hadoop and the JDK (Java) installed on both instances.
We can do this by using the command
rpm -ivh <software_name>
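For example, assuming the JDK and Hadoop RPM packages have already been downloaded to both instances (the file names and versions below are placeholders and will differ in your setup):
rpm -ivh jdk-8u171-linux-x64.rpm
rpm -ivh hadoop-1.2.1-1.x86_64.rpm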


Configuring Name node and Data node…
Name node:
Before configuring, create a directory using the mkdir command
mkdir /nn
hdfs-site.xml file
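A minimal hdfs-site.xml for the name node could look like this (a sketch assuming Hadoop 1.x; on Hadoop 2+ the property is dfs.namenode.name.dir):
<configuration>
    <property>
        <name>dfs.name.dir</name>
        <value>/nn</value>
    </property>
</configuration>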

core-site.xml file
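And a minimal core-site.xml (a sketch; 0.0.0.0 makes the name node listen on all interfaces, and port 9001 is an arbitrary choice as long as the data node uses the same one; on Hadoop 2+ the property is fs.defaultFS):
<configuration>
    <property>
        <name>fs.default.name</name>
        <value>hdfs://0.0.0.0:9001</value>
    </property>
</configuration>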

Formatting Hadoop Name node:
hadoop namenode -format

Data node:
Before configuring, create a directory using the mkdir command
mkdir /dn
hdfs-site.xml file
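A minimal hdfs-site.xml for the data node could look like this (again a sketch for Hadoop 1.x; on Hadoop 2+ the property is dfs.datanode.data.dir):
<configuration>
    <property>
        <name>dfs.data.dir</name>
        <value>/dn</value>
    </property>
</configuration>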

core-site.xml file
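And the core-site.xml pointing at the name node (a sketch; <namenode_ip> is a placeholder for the name node’s IP, and the port must match the one configured on the name node):
<configuration>
    <property>
        <name>fs.default.name</name>
        <value>hdfs://<namenode_ip>:9001</value>
    </property>
</configuration>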

Starting the services of the Name node and Data node…
Name node:
hadoop-daemon.sh start namenode
Data node:
hadoop-daemon.sh start datanode
So now the cluster is set up. To see the contributed storage, we can use the following command on either instance (data node or name node):
hadoop dfsadmin -report

Here we can see the data node has contributed its entire storage (14.99 GB) to the name node…
Now let’s see how to limit the storage!!
To achieve this we are going to use the concept of Linux partitions.
First, let’s see what partitions my hard disk currently has:
fdisk -l

The hard disk has a single partition of 15 GB.
Steps to limit the contributed storage by creating a partition:
Step 1:
Create and attach a Volume
I already have a 1 GB volume created in the AWS cloud; now I’m going to attach it to the instance that is configured as the data node.


Give the ID of the instance that is configured as the data node.

Successfully Attached!

We can check using the fdisk -l command

Step 2:
Partitioning the attached volume:
Open the attached volume with fdisk
// fdisk <volume_name>
fdisk /dev/xvdf

Enter ‘n’, which stands for new partition

Then ‘p’, which stands for primary partition

Then enter the size of the partition to be created. In my case I’m creating a 512 MB partition, which is done by entering “+512M” (likewise, for 1 GB we would use “+1G”), and then pressing Enter. Finally, enter ‘w’ to save the partition table.
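In short, the fdisk dialogue looks roughly like this (a sketch; the defaults for the partition number and first sector are accepted by pressing Enter):
fdisk /dev/xvdf
    n          (new partition)
    p          (primary partition)
    <Enter>    (default partition number)
    <Enter>    (default first sector)
    +512M      (last sector: make the partition 512 MB)
    w          (write the partition table and exit)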

A partition of size 512 MB has been successfully created!

Step 3:
Formatting the created partition
udevadm settle
mkfs.ext4 /dev/xvdf1
(udevadm settle waits for the new partition’s device file to appear before mkfs.ext4 formats it.)

Step 4:
Create a new directory and mount the partition on it
mkdir /dn2
To mount, use the command
mount /dev/xvdf1 /dn2

Successfully mounted!
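We can verify the mount with df (the reported size will be slightly less than 512 MB because of file-system overhead):
df -h /dn2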
Step 5:
Stop the running data node and name node
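The daemons can be stopped with the same helper script used to start them:
hadoop-daemon.sh stop datanode
hadoop-daemon.sh stop namenode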


Step 6: (The step that actually limits the contributed storage, to 512 MB in this case)
Update the data node’s hdfs-site.xml so that its storage directory points to /dn2, the directory where we mounted the 512 MB partition.
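The updated hdfs-site.xml on the data node would then look like this (same sketch as before, with only the directory changed):
<configuration>
    <property>
        <name>dfs.data.dir</name>
        <value>/dn2</value>
    </property>
</configuration>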

Step 7: (Final step)
Start the services of the name node and data node again, and you’ll see that instead of contributing its entire 15 GB of storage, the data node now contributes only the limited 512 MB to the cluster!!!
hadoop-daemon.sh start namenode

hadoop-daemon.sh start datanode

hadoop dfsadmin -report

Successfully contributed limited storage to the name node from the data node in the Hadoop cluster!!!!!
Finally, terminate the instances.

