Recently, while setting up a virtual Cloudera CDH 5 Hadoop cluster, I ran into an issue where the node running Cloudera Manager ran out of system resources while provisioning services on other nodes.

RAM began to swap and everything took a turn for the worse. I killed the VM because it was hopelessly locked up, gave it 8 GB of RAM (Linux services are usually light, but not 2 GB of RAM light 😉) and fired it back up.

The Cloudera Manager suite came back up OK and I was able to log back into the interface, but my cluster was in seriously bad shape. Multiple nodes had multiple services half-configured and everything was out of whack.

Off to Google I went, looking for anyone who had hit the same problem. Luckily, in my top 10 hits was this forum post.

The fix:

1) Stop the cluster
2) Delete the cluster
3) cd /dfs on each node
4) rm -r -f * (this wipes whatever HDFS data is stored under /dfs on that node)
5) Using Cloudera Manager, add a new cluster
6) Select the hosts tab and select all hosts
7) Choose all the defaults and click continue to start First Run (the setup wizard)
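
Steps 3 and 4 are the only destructive part of the fix, so here is a slightly more careful way I might script them. This is only a sketch: the function name clear_dfs_dir and the --force flag are my own invention, not anything from Cloudera or the forum post, and it assumes bash and a find that supports -mindepth and -delete (GNU find does). One deliberate difference from a plain rm -r -f *: this also removes hidden files inside the directory.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Cloudera Manager): empty a DFS data
# directory without removing the directory itself.
# Usage: clear_dfs_dir DIR [--force]
# Without --force it only lists the top-level entries that would be removed.
clear_dfs_dir() {
    local dir="$1" mode="${2:-}"

    # Refuse an empty path, the filesystem root, or a missing directory,
    # so a typo cannot wipe the wrong tree.
    if [ -z "$dir" ] || [ "$dir" = "/" ] || [ ! -d "$dir" ]; then
        echo "refusing to clear '$dir'" >&2
        return 1
    fi

    if [ "$mode" = "--force" ]; then
        # -mindepth 1 skips the directory itself; -delete removes everything
        # below it, children before parents, dotfiles included.
        find "$dir" -mindepth 1 -delete
    else
        # Dry run: show the top-level entries only.
        find "$dir" -mindepth 1 -maxdepth 1 -print
    fi
}
```

On each node, with the cluster already stopped and deleted, I would run clear_dfs_dir /dfs first to see what is there and then clear_dfs_dir /dfs --force to actually empty it. The dry run by default is just a guard against running it on the wrong box.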

After following the above steps, all nodes in my cluster were successfully provisioned with ALL services!

I really like the job that Cloudera has done, not only in making their web UI very robust, but also in ensuring the needed checks are in place to “start over” should something go catastrophically wrong when building a cluster. Cloudera Manager has the intelligence to not require a bunch of manual cleanup; instead it skips things that are already installed and continues where it left off.