Rack Awareness Algorithm
Rack - A rack is a collection of machines connected through the same network switch. If that switch fails, every machine in the rack becomes unreachable.
The Rack Awareness algorithm was introduced to overcome this problem.
With Rack Awareness, the NameNode prefers a DataNode on the same rack as the client, or on a nearby rack.
The NameNode maintains the rack ID of each DataNode and uses this rack information when choosing DataNodes. It also ensures that the replicas of a block are never all stored on a single rack. As a result, the Rack Awareness Algorithm reduces latency and improves fault tolerance.
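The rack lookup described above can be pictured as a simple table from DataNode address to rack ID. The following is only an illustrative sketch in Python, not Hadoop's own code: the addresses, the rack names, and the helper names `resolve_racks` and `same_rack` are all made up for this example. In a real cluster the mapping is typically supplied by the administrator, for example through a topology script.

```python
# Hypothetical host-to-rack table (assumption for illustration only).
RACK_BY_HOST = {
    "10.0.1.11": "/rack1",
    "10.0.1.12": "/rack1",
    "10.0.2.21": "/rack2",
}

# Fallback rack ID for hosts that are not in the table.
DEFAULT_RACK = "/default-rack"


def resolve_racks(hosts):
    """Return one rack ID per host, in the same order as the input."""
    return [RACK_BY_HOST.get(host, DEFAULT_RACK) for host in hosts]


def same_rack(host_a, host_b):
    """True if both hosts resolve to the same rack ID."""
    rack_a, rack_b = resolve_racks([host_a, host_b])
    return rack_a == rack_b
```

With a table like this, deciding whether two DataNodes share a rack is a single comparison, which is all the placement rules need.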
The default replication factor is 3. Therefore, according to the Rack Awareness Algorithm:
- The first replica of the block is stored on the local rack.
- The next replica is stored on another DataNode within the same rack.
- The third replica is stored on a different rack.
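The three placement rules above can be sketched in a few lines of Python. This is a simplified illustration, not Hadoop's actual placement code: the function name `place_replicas`, the cluster layout, and the choice to put the first replica on the writer's own DataNode are assumptions made for the example. It follows the order listed in this post; Hadoop releases may differ in which rack receives the second replica, but the outcome is the same in that two replicas share one rack and the third sits on another.

```python
def place_replicas(writer, nodes_by_rack):
    """Pick three DataNodes for one block, following the order listed above.

    writer        -- name of the DataNode where the write originates
                     (assumed to be part of the cluster)
    nodes_by_rack -- dict mapping a rack ID to the list of DataNodes on it

    Raises StopIteration if the local rack has no second node or if the
    cluster has only one rack.
    """
    # Find the rack that contains the writer.
    local_rack = next(
        rack for rack, nodes in nodes_by_rack.items() if writer in nodes
    )

    # First replica: a node on the local rack (here, the writer itself).
    first = writer

    # Second replica: a different DataNode on the same rack.
    second = next(node for node in nodes_by_rack[local_rack] if node != first)

    # Third replica: a DataNode on any other rack.
    remote_rack = next(rack for rack in nodes_by_rack if rack != local_rack)
    third = nodes_by_rack[remote_rack][0]

    return [first, second, third]


# Hypothetical two-rack cluster.
cluster = {"/rack1": ["dn1", "dn2"], "/rack2": ["dn3", "dn4"]}
print(place_replicas("dn1", cluster))  # ['dn1', 'dn2', 'dn3']
```

If the switch for /rack1 fails in this example, the copy on dn3 is still reachable, which is the fault tolerance the algorithm is meant to provide.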