Introduction
There is a single process per slave node that manages the communication between the master and its slaves. This process is known as the "heartbeat." The heartbeat consists of a small amount of data exchanged between the nodes at regular intervals, which is used to keep track of any changes made on either node. If one side does not receive the heartbeat within a certain timeframe, it assumes that something has happened to the node that sent the last heartbeat, and an error recovery procedure is initiated.
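The timeout check described above can be sketched as a small shell function. This is a minimal sketch with hypothetical timestamps and a hypothetical 30-second timeout; the actual heartbeat interval and recovery procedure depend on your cluster configuration.

```shell
#!/bin/sh
# Sketch of the heartbeat timeout check. Timestamps are seconds since
# the epoch; all values below are hypothetical examples.

# Succeeds (exit 0) if the last heartbeat arrived within the timeout.
slave_alive() {
  last_heartbeat=$1
  now=$2
  timeout=$3
  [ $((now - last_heartbeat)) -lt "$timeout" ]
}

# Heartbeat 20s ago, 30s timeout: the slave is still considered alive.
slave_alive 100 120 30 && echo "slave healthy"

# Heartbeat 40s ago, 30s timeout: begin error recovery.
slave_alive 100 140 30 || echo "slave missed heartbeat: starting error recovery"
```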
What is a Slave Node?
A slave node is a computer connected to a master node in a cluster. The slave node acts as a backup for the master node, and its resources are used when the master node is unavailable.
Setting up a Slave Node
To set up a slave node, you must first have a master node. The master node is generally the one you run your programs on, while the slave node provides additional storage and processing power. You can set up a slave node by following these steps.
Installing the Master Node
Before you can set up a Slave node, you need to have a Master node already set up and working. If you don’t have a Master node yet, follow these instructions to set one up.
- Choose an instance type for your Master node. We recommend a C4 instance type, such as c4.large.
- Choose an Availability Zone in the same region as your Slave nodes.
- Launch your instance using the AMI for your chosen region and instance type:
  a. For region us-east-1 (Northern Virginia), use ami-0ff8a91507f77f867.
  b. For all other regions, search for and choose the latest Amazon EMR release (EMR release 5.18.0 or later). The AMI ID will start with ami-.
  c. Select Configure Instance Details, complete the Networking Configuration Details, then choose Next: Add Storage… .
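For the us-east-1 case, the launch step could also be performed from the AWS CLI. This is a hedged sketch: the key pair name and subnet ID below are placeholders, not values from this guide, and you would substitute your own.

```shell
# Hypothetical AWS CLI equivalent of the console launch steps above.
# --key-name and --subnet-id are placeholder values.
aws ec2 run-instances \
  --region us-east-1 \
  --image-id ami-0ff8a91507f77f867 \
  --instance-type c4.large \
  --count 1 \
  --key-name my-master-key \
  --subnet-id subnet-0123456789abcdef0
```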
On the slave node, use hadoop-daemon.sh to start or stop its daemons:

/usr/local/hadoop/bin/hadoop-daemon.sh (start|stop) datanode
/usr/local/hadoop/bin/hadoop-daemon.sh (start|stop) tasktracker
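After starting the daemons, you can confirm they are running with jps, which lists the Java processes on the node (this assumes Hadoop is installed at the path above and a JDK is on the PATH).

```shell
# Start the slave's daemons, then confirm both JVMs are up.
/usr/local/hadoop/bin/hadoop-daemon.sh start datanode
/usr/local/hadoop/bin/hadoop-daemon.sh start tasktracker

# jps should list DataNode and TaskTracker among the running processes.
jps
```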
Updating the Slave Node
From the Ambari Home page, click the Services tab.
From the list of services, select HDFS. The Summary page for the HDFS service displays.
Click the Configs tab.
Update the following properties:
- dfs.namenode.secondary.http-address — Set this property to the host and port of the NameNode process on the slave node, for example nn2.example.com:50090. This allows the NameNode on the slave node to be reached by its hostname or IP address during a failover. If you have configured a virtual IP address using heartbeat, you can use that virtual IP address instead of either node's hostname or IP address in this property on both nodes, providing failover capability without configuration changes after a failover occurs. You must update this property on both nodes in your cluster before starting or restarting either NameNode service.
- dfs.namenode.checkpoint.dir — Set all occurrences of this property to point to a quorum journal manager (QJM) data directory on each slave node's local storage device (for example, file:///data1/qjm) instead of the NFS mount previously used for Hadoop 1 and for Hadoop 2 without High Availability (non-HA). You must update this property on all nodes in your cluster before starting or restarting either NameNode service.
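Applied together, the two property updates above would appear in hdfs-site.xml roughly as follows. The hostname and path are the examples from this section; adjust them for your cluster.

```xml
<!-- hdfs-site.xml on both nodes; values are the examples used above. -->
<property>
  <name>dfs.namenode.secondary.http-address</name>
  <value>nn2.example.com:50090</value>
</property>
<property>
  <name>dfs.namenode.checkpoint.dir</name>
  <value>file:///data1/qjm</value>
</property>
```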
Conclusion
To conclude, it is important to note that there is a single heartbeat process per slave node. This can help reduce the overall complexity of your Hadoop system and make it easier to manage.