Question - Explain Rack Awareness in Hadoop.
Answer -
Rack Awareness is one of the popular big data interview questions. Rack awareness is the algorithm the NameNode uses to select DataNodes based on their rack information, preferring nodes that are close to the client reading or writing the data so that traffic between racks stays low. It runs on the NameNode and determines how data blocks and their replicas are placed across the cluster. Unless a rack topology is configured, Hadoop assumes by default that all nodes belong to the same rack.
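To move away from the single-rack default, Hadoop can call an external topology script, set through the `net.topology.script.file.name` property in core-site.xml. Hadoop passes host names or IP addresses as arguments and expects one rack path per argument on standard output. The following is a minimal sketch of such a script; the IP addresses, rack names, and the `RACK_BY_HOST` and `resolve` names are hypothetical and only illustrate the idea.

```python
#!/usr/bin/env python3
"""Sketch of a Hadoop topology script (hypothetical hosts and racks)."""
import sys

# Hypothetical mapping from DataNode address to rack path.
RACK_BY_HOST = {
    "10.0.1.11": "/dc1/rack1",
    "10.0.1.12": "/dc1/rack1",
    "10.0.2.21": "/dc1/rack2",
    "10.0.2.22": "/dc1/rack2",
}

# Hadoop's own name for the rack used when no topology is known.
DEFAULT_RACK = "/default-rack"


def resolve(hosts):
    """Return one rack path per host, in the same order as the input."""
    return [RACK_BY_HOST.get(host, DEFAULT_RACK) for host in hosts]


if __name__ == "__main__":
    # Hadoop passes one or more hosts as command-line arguments.
    print(" ".join(resolve(sys.argv[1:])))
```

Any host that is missing from the mapping falls back to `/default-rack`, which mirrors the single-rack assumption described above.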
Rack awareness helps to:
- Improve data reliability and accessibility.
- Improve cluster performance.
- Make better use of network bandwidth by reducing traffic between racks.
- Keep bulk data flow within a rack whenever possible.
- Prevent data loss in case of a complete rack failure.
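The benefits listed above follow from how replicas are spread over racks. With a replication factor of 3, the default HDFS policy puts the first replica on the writer's node, the second on a node in a different rack, and the third on another node in the same rack as the second. The sketch below is a simplified illustration of that rule, not the actual HDFS implementation, which also weighs factors such as node load and free space. The `place_replicas` function and the topology layout are hypothetical.

```python
import random


def place_replicas(topology, writer, rng=None):
    """Pick three DataNodes for one block.

    topology: dict mapping rack path -> list of DataNode names (hypothetical layout).
    writer:   name of the node writing the block.
    """
    rng = rng or random.Random()
    rack_of = {node: rack for rack, nodes in topology.items() for node in nodes}

    # Replica 1: the writer's own node if it is a DataNode, else a random node.
    first = writer if writer in rack_of else rng.choice(sorted(rack_of))

    # Replica 2: a node on a different rack, so that a complete rack
    # failure cannot destroy every copy of the block.
    remote_racks = sorted(r for r in topology if r != rack_of[first] and topology[r])
    if not remote_racks:
        raise ValueError("need at least two non-empty racks")
    remote_rack = rng.choice(remote_racks)
    second = rng.choice(topology[remote_rack])

    # Replica 3: another node on the same rack as replica 2, so that this
    # copy travels in-rack instead of crossing racks again.
    candidates = [n for n in topology[remote_rack] if n != second]
    if not candidates:
        # Simplifying assumption: fall back to any unused node.
        candidates = [n for n in sorted(rack_of) if n not in (first, second)]
    if not candidates:
        raise ValueError("need at least three DataNodes")
    third = rng.choice(candidates)

    return [first, second, third]
```

For example, `place_replicas({"/rack1": ["dn1", "dn2", "dn3"], "/rack2": ["dn4", "dn5", "dn6"]}, "dn1")` returns `dn1` followed by two distinct nodes from `/rack2`. Only one of the two follow-up transfers crosses racks, and the block survives the loss of either rack.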