Question - How do you recover a NameNode when it is down?
Answer -
The following steps need to be executed to get the Hadoop cluster up and running again:
- Use the FsImage, which is a replica of the file system metadata, to start a new NameNode.
- Configure the DataNodes and the clients so that they acknowledge the newly started NameNode.
- Once the new NameNode has finished loading the last checkpoint FsImage and has received enough block reports from the DataNodes, it starts serving clients.
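The condition in the last step can be illustrated with a small sketch. This is a toy model, not Hadoop code: the class `RecoveringNameNode` and its methods are hypothetical names made up for illustration, and the 0.999 threshold is an assumption modeled on the default of the `dfs.namenode.safemode.threshold-pct` property. It only shows the idea that the NameNode starts serving clients once the FsImage is loaded and a large enough fraction of the known blocks has been reported.

```python
from dataclasses import dataclass, field
from typing import Iterable, Set


@dataclass
class RecoveringNameNode:
    """Toy model of a NameNode deciding when it may start serving clients."""

    known_blocks: Set[int]            # block IDs listed in the loaded metadata
    threshold_pct: float = 0.999      # assumed fraction of blocks that must be reported
    fsimage_loaded: bool = False
    reported_blocks: Set[int] = field(default_factory=set)

    def load_fsimage(self) -> None:
        # Stand-in for reading the last checkpoint FsImage from disk.
        self.fsimage_loaded = True

    def receive_block_report(self, block_ids: Iterable[int]) -> None:
        # Count only blocks the metadata knows about; duplicates are ignored.
        self.reported_blocks.update(b for b in block_ids if b in self.known_blocks)

    def can_serve_clients(self) -> bool:
        # Both conditions must hold: metadata loaded and enough blocks reported.
        if not self.fsimage_loaded:
            return False
        needed = self.threshold_pct * len(self.known_blocks)
        return len(self.reported_blocks) >= needed


nn = RecoveringNameNode(known_blocks=set(range(1000)))
nn.load_fsimage()
nn.receive_block_report(range(0, 900))     # first DataNodes report in
print(nn.can_serve_clients())              # False: only 90% of blocks reported
nn.receive_block_report(range(900, 1000))  # remaining DataNodes report in
print(nn.can_serve_clients())              # True: every known block reported
```

In a real cluster this waiting period corresponds to the NameNode's safe mode, and the actual logic involves more settings than this sketch covers.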
In large Hadoop clusters, the NameNode recovery process takes a lot of time, which becomes an even more significant challenge during routine maintenance.