Contributing limited storage as a data node in a Hadoop cluster!

OBJECTIVE:

The objective is to limit how much storage a data node contributes to a Hadoop cluster, instead of letting it contribute its entire disk. And yes, we can achieve this by using Linux partitions!

For this, we first need to set up a simple Hadoop cluster with one name node and at least one data node.

Let’s move on !

We’ll start off by launching two instances: one for the name node and one for the data node.

Then we need Hadoop and the JDK (Java) installed on both instances.

We can install each package with the command

rpm -ivh <software_name>

Name node
Data node

Configuring Name node and Data node…

Name node:

Before configuring, create a directory using the mkdir command

mkdir /nn

hdfs-site.xml file

core-site.xml file
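The original screenshots of these two files are not reproduced here, so the following is only a minimal sketch of what a name node configuration could look like. It assumes the Hadoop 1.x property names dfs.name.dir and fs.default.name (newer releases call them dfs.namenode.name.dir and fs.defaultFS), a hypothetical port 9001, and a scratch output directory; point CONF_DIR at your real Hadoop configuration directory once the contents look right.

```shell
# Assumption: CONF_DIR defaults to a scratch directory so nothing real is overwritten.
CONF_DIR="${CONF_DIR:-./hadoop-conf-namenode}"
NN_PORT="${NN_PORT:-9001}"   # hypothetical port, not taken from the article
mkdir -p "$CONF_DIR"

# hdfs-site.xml: tell the name node to keep its metadata in /nn
cat > "$CONF_DIR/hdfs-site.xml" <<EOF
<?xml version="1.0"?>
<configuration>
  <property>
    <name>dfs.name.dir</name>
    <value>/nn</value>
  </property>
</configuration>
EOF

# core-site.xml: the address the name node listens on
cat > "$CONF_DIR/core-site.xml" <<EOF
<?xml version="1.0"?>
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://0.0.0.0:${NN_PORT}</value>
  </property>
</configuration>
EOF
```

Binding to 0.0.0.0 is a convenience for a test cluster; it lets data nodes reach the name node on any of its interfaces.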

Formatting Hadoop Name node:

hadoop namenode -format

Data node:

Before configuring, create a directory using the mkdir command

mkdir /dn

hdfs-site.xml file

core-site.xml file
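As with the name node, the screenshots are missing, so this is a hedged sketch rather than the exact files used. The helper write_datanode_conf is a hypothetical name, the property names dfs.data.dir and fs.default.name are the Hadoop 1.x ones (dfs.datanode.data.dir and fs.defaultFS in newer releases), and namenode.example.internal and port 9001 are placeholders for your name node's address.

```shell
# write_datanode_conf CONF_DIR DATA_DIR NN_HOST [NN_PORT]
# Writes hdfs-site.xml and core-site.xml for a data node into CONF_DIR.
write_datanode_conf() {
    conf_dir=$1
    data_dir=$2
    nn_host=$3
    nn_port=${4:-9001}   # hypothetical default port
    mkdir -p "$conf_dir"

    # hdfs-site.xml: the directory whose filesystem the data node contributes
    cat > "$conf_dir/hdfs-site.xml" <<EOF
<?xml version="1.0"?>
<configuration>
  <property>
    <name>dfs.data.dir</name>
    <value>${data_dir}</value>
  </property>
</configuration>
EOF

    # core-site.xml: where to find the name node
    cat > "$conf_dir/core-site.xml" <<EOF
<?xml version="1.0"?>
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://${nn_host}:${nn_port}</value>
  </property>
</configuration>
EOF
}

# Written to a scratch directory here; use your real Hadoop config directory instead.
write_datanode_conf ./hadoop-conf-datanode /dn namenode.example.internal
```

The data directory is the only value that changes later in this walkthrough: calling the same helper with /dn2 instead of /dn is what limits the contributed storage.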

Starting the service of Name node and Data node…

Name node:

hadoop-daemon.sh start namenode

Data node:

hadoop-daemon.sh start datanode

So now the cluster is set up. To see the contributed storage, we can run the following command on either instance (data node or name node)

hadoop dfsadmin -report

Here we can see that the data node has contributed its entire storage (14.99 GB) to the cluster…

Now let’s see how to limit the storage!

To achieve this, we are going to use Linux partitions.

First, let’s see how many partitions the data node’s hard disk currently has

fdisk -l

The hard disk has a single partition of 15 GB

Steps to limit the contributed storage by creating a partition:

Step 1:

Create and attach a Volume

I already have a 1 GB volume created in the AWS cloud; now I’m going to attach it to the instance that is configured as the data node.

Give the ID of the instance that is configured as the data node

Successfully Attached!

We can check by using the fdisk -l command

Step 2:

Partitioning the attached volume:

Open the attached volume with fdisk

# fdisk <device_name>
fdisk /dev/xvdf

Enter ‘n’, which stands for new partition

Then ‘p’, which stands for primary partition

Press Enter to accept the default partition number and the default first sector. Then, at the last-sector prompt, enter the size of the partition to be created. In my case I’m creating 512 MB, which is done by giving “+512M”; likewise, for 1 GB use “+1G”. Press Enter, then enter ‘w’ to write the partition table and exit.

A partition of size 512 MB has been successfully created!
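The interactive answers above can also be fed to fdisk from a script. The sketch below only prints the keystroke sequence; fdisk_answers is a hypothetical helper name, and it assumes an empty disk where the new partition is number 1 and the default first sector is acceptable. Some fdisk versions ask extra questions (for example about an existing filesystem signature), in which case the sequence would need adjusting.

```shell
# Emit the keystrokes fdisk expects when creating one primary partition:
#   n     new partition
#   p     primary
#   1     partition number (assumed: first partition on an empty disk)
#   ""    empty line, accepts the default first sector
#   SIZE  last sector, given as a relative size such as +512M
#   w     write the partition table and exit
fdisk_answers() {
    printf 'n\np\n1\n\n%s\nw\n' "$1"
}

# Show the sequence for a 512 MB partition.
fdisk_answers +512M

# To apply it for real (destructive; needs root and the correct device name):
#   fdisk_answers +512M | fdisk /dev/xvdf
```

Printing the answers first makes it easy to review them before piping anything into fdisk on a real device.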

Step 3:

Formatting the created partition

udevadm settle
mkfs.ext4 /dev/xvdf1

Step 4:

Create a new directory and mount the partition to the directory

mkdir /dn2

To mount, use the command

mount /dev/xvdf1 /dn2

Successfully mounted!
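Before touching the Hadoop configuration, it can help to confirm that /dn2 really sits on the new partition. This is a small sketch using the standard df tool; fs_size_mib is a hypothetical helper name, and the reported size will be somewhat below 512 MiB because ext4 reserves space for its own metadata.

```shell
# Print the total size, in MiB, of the filesystem that holds a directory.
# df -P prints one stable line per filesystem; -k reports 1024-byte blocks.
fs_size_mib() {
    df -Pk "$1" | awk 'NR == 2 { printf "%d\n", $2 / 1024 }'
}

MOUNT_DIR="${MOUNT_DIR:-/dn2}"   # the mount point used in this walkthrough
if [ -d "$MOUNT_DIR" ]; then
    echo "$MOUNT_DIR is backed by $(fs_size_mib "$MOUNT_DIR") MiB"
fi
```

If the number is close to the size of the root disk instead, the directory exists but the mount did not take effect.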

Step 5:

Stop the running data node and name node (hadoop-daemon.sh stop datanode on the data node, hadoop-daemon.sh stop namenode on the name node)

Step 6: (the key step for contributing limited storage, 512 MB in this case)

Update the hdfs-site.xml file of the data node to use the /dn2 directory, where we mounted the custom 512 MB partition

Step 7: (Final step)

Start the name node and data node services again, and you’ll see that the data node, instead of contributing its entire 15 GB of storage, contributes only about 512 MB to the cluster!

hadoop-daemon.sh start namenode
hadoop-daemon.sh start datanode
hadoop dfsadmin -report

Successfully contributed Limited Storage to Name node from Data node in Hadoop cluster!!!!!

Finally terminate the instances

Sangeeth Sahana D
