There is a single process per slave node that manages the communication between the master and the slaves. This process is known as the “heartbeat.” The heartbeat is a small amount of data that the nodes exchange at regular intervals, and it is used to keep track of any changes made on either node. If one side does not receive a heartbeat within a set timeframe, it assumes that something has happened to the node that sent the last heartbeat and starts an error recovery procedure.
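The timeout logic described above can be sketched in a few lines of Python. This is only an illustration, not how Hadoop implements it: the class name HeartbeatMonitor, its methods and the 10-second timeout are hypothetical, and recovery is reduced to listing the nodes that have gone quiet.

```python
import time
from typing import Dict, List, Optional


class HeartbeatMonitor:
    """Tracks the last heartbeat seen from each node (hypothetical helper)."""

    def __init__(self, timeout_seconds: float) -> None:
        self.timeout_seconds = timeout_seconds
        self._last_seen: Dict[str, float] = {}

    def record(self, node: str, now: Optional[float] = None) -> None:
        """Note that a heartbeat from `node` arrived at time `now`."""
        self._last_seen[node] = time.monotonic() if now is None else now

    def expired(self, now: Optional[float] = None) -> List[str]:
        """Return the nodes whose last heartbeat is older than the timeout."""
        current = time.monotonic() if now is None else now
        return sorted(
            node
            for node, seen in self._last_seen.items()
            if current - seen > self.timeout_seconds
        )


# Example with explicit timestamps (seconds); the timeout value is an assumption.
monitor = HeartbeatMonitor(timeout_seconds=10)
monitor.record("slave-1", now=0.0)
monitor.record("slave-2", now=8.0)
late_nodes = monitor.expired(now=12.0)  # only slave-1 has been silent for more than 10 s
```

In a real cluster the master would call something like record() each time a heartbeat arrives and check expired() periodically to decide when to begin recovery.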
What is a Slave Node?
A slave node is a computer that is connected to a master node in a cluster. It carries out the storage and processing work that the master node assigns to it.
Setting up a Slave Node
To set up a slave node, you must first have a master node. The master node is generally the one you run your programs on, while the slave node provides storage and processing power. You can set up a slave node by following these steps.
Installing the Master Node
Before you can set up a Slave node, you need to have a Master node already set up and working. If you don’t have a Master node yet, follow these instructions to set one up.
- Choose an instance type for your Master node. We recommend using a C4 medium instance type.
- Choose an Availability Zone in the same region as your Slave nodes.
- Launch your instance using the AMI for your chosen region and instance type:
  a. For region us-east-1 (Northern Virginia), use ami-0ff8a91507f77f867.
  b. For all other regions, search for and choose the latest Amazon EMR release (EMR release 5.18.0 or later). The AMI ID will start with ami-.
  c. Select Configure Instance Details, Networking Configuration Details, then choose Next: Add Storage.
Configuring the Master Node
In this section, we will configure the Master node. For this tutorial, we will use a single Master node. If you have multiple Master nodes, you need to perform these steps on all of them.
We need to edit the file /etc/hadoop/conf/hdfs-site.xml on the Master node and add the following configuration:
<property>
  <name>dfs.namenode.secondary.http-address</name>
  <value>master-node-ip:9001</value>
</property>
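If you prefer to script this change rather than edit the file by hand, the following is a minimal sketch using Python's standard xml.etree.ElementTree module. The function name set_property is hypothetical, and the sketch works on an XML string so that you can decide how to read and write the actual file.

```python
import xml.etree.ElementTree as ET


def set_property(config_xml: str, name: str, value: str) -> str:
    """Return config_xml with the <property> called `name` set to `value`.

    An existing property with the same name is updated in place;
    otherwise a new <property> element is appended.
    """
    root = ET.fromstring(config_xml)
    for prop in root.findall("property"):
        if prop.findtext("name") == name:
            value_el = prop.find("value")
            if value_el is None:
                value_el = ET.SubElement(prop, "value")
            value_el.text = value
            break
    else:
        # No property with this name yet, so add one.
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


# Example: start from an empty configuration and add the property shown above.
updated = set_property(
    "<configuration></configuration>",
    "dfs.namenode.secondary.http-address",
    "master-node-ip:9001",
)
```

Be aware that ET.fromstring discards comments and the XML declaration, so keep a backup of the original hdfs-site.xml if you write the result back to disk.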
Connecting to the Slave Node
In Hadoop, a slave node is a node that stores and processes data. It can be a physical machine or a virtual machine, and it can be part of a Hadoop cluster or run as a stand-alone node.
If you are using macOS or Linux, you can connect to the slave node using SSH. If you are using Windows, you will need an SSH client such as PuTTY.
Using the Web Interface
To connect to the slave node using the web interface, open a web browser and enter the IP address of the slave node in the address bar. You should then see the login screen for the web interface.
To log in, enter the username and password that you set when you first installed the software. After you enter these credentials, you should see the main page of the web interface.
Managing the Slave Node
Slave nodes store and process data. They are managed by the master node.
Starting and Stopping the Slave Node
A Hadoop slave node is a commodity computer that runs the DataNode and TaskTracker daemons. The slaves are managed by the master node. You can start or stop the daemons on a slave node by running the following commands:
/usr/local/hadoop/bin/hadoop-daemon.sh (start|stop) datanode
/usr/local/hadoop/bin/hadoop-daemon.sh (start|stop) tasktracker
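If you want to drive both daemons from a script, the sketch below builds the two command lines shown above. The helper daemon_commands is hypothetical; it assumes Hadoop is installed under /usr/local/hadoop as in the commands above, and it only builds the argument lists without running anything.

```python
from typing import List

# Daemons named in the commands above.
DAEMONS = ("datanode", "tasktracker")


def daemon_commands(action: str, hadoop_home: str = "/usr/local/hadoop") -> List[List[str]]:
    """Build one hadoop-daemon.sh command per daemon for 'start' or 'stop'."""
    if action not in ("start", "stop"):
        raise ValueError(f"action must be 'start' or 'stop', got {action!r}")
    script = f"{hadoop_home}/bin/hadoop-daemon.sh"
    return [[script, action, daemon] for daemon in DAEMONS]


# Example: the command lines that would stop both daemons.
stop_commands = daemon_commands("stop")
```

Each inner list can be passed to subprocess.run(command, check=True) on the slave node to execute it.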
Updating the Slave Node
From the Ambari Home page, click the Services tab.
From the list of services, select HDFS. The Summary page for the HDFS service displays.
Click the Configs tab.
Update the following properties:
- dfs.namenode.secondary.http-address: Set this property to the host and port of the Secondary NameNode process on the slave node, for example nn2.example.com:50090. This allows the process to be reached by its hostname or IP address during a failover. If you have configured a virtual IP address using heartbeat, you can use that virtual IP address on both nodes instead of the hostname or IP address of either node, so that the configuration does not need to change after a failover. You must update this property on both nodes in your cluster before starting or restarting either NameNode service.
- dfs.namenode.checkpoint.dir: Set all occurrences of this property to point to a Quorum Journal Manager (QJM) data directory on each slave node’s local storage device (for example, file:///data1/qjm) instead of an NFS mount, as was previously done for Hadoop 1 and for Hadoop 2 without High Availability (non-HA). You must update this property on all nodes in your cluster before starting or restarting either NameNode service.
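Because the address property expects a value in host:port form, it can be worth checking the value before you save it in Ambari. The following is a small sketch of such a check; the function split_host_port is a hypothetical helper and is not part of Ambari or Hadoop.

```python
from typing import Tuple


def split_host_port(address: str) -> Tuple[str, int]:
    """Split 'host:port' into (host, port), raising ValueError if malformed."""
    host, sep, port_text = address.strip().rpartition(":")
    if not sep or not host or not port_text.isdecimal():
        raise ValueError(f"expected host:port, got {address!r}")
    port = int(port_text)
    if not 1 <= port <= 65535:
        raise ValueError(f"port out of range in {address!r}")
    return host, port


# Example using the address from the property description above.
host, port = split_host_port("nn2.example.com:50090")
```

A value with no port, such as nn2.example.com on its own, raises ValueError instead of being accepted silently.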
To conclude, it is important to note that there is a single heartbeat process per slave node. This can help reduce the overall complexity of your Hadoop system and make it easier to manage.