Hi Samesh Mitbaa, please install VMware or VirtualBox on Windows 7 first. Once you install Linux (any flavor, such as CentOS or Ubuntu) in a virtual machine, Hadoop can be configured very easily. Is there a special reason to install it directly on Windows 7? If so, please mention it.
While the data that makes up a file is stored in blocks on the DataNodes, the metadata for a file is stored on the NameNode.
The NameNode is also responsible for the file system namespace. Because there is only one NameNode, it should be configured to write a copy of its state information to multiple locations, such as a local disk and an NFS mount.
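To illustrate the idea of keeping the same state in several locations, here is a minimal Python sketch. It is not how the NameNode is implemented. In Hadoop itself this is normally handled through configuration (recent versions accept a comma-separated list of directories in the `dfs.namenode.name.dir` property) rather than through code. The function names, the JSON format and the file name `fsimage.json` below are all hypothetical.

```python
import json
import os
import tempfile


def save_state(state, directories, filename="fsimage.json"):
    """Write the same metadata snapshot to every directory in the list."""
    payload = json.dumps(state, sort_keys=True)
    written = []
    for directory in directories:
        os.makedirs(directory, exist_ok=True)
        path = os.path.join(directory, filename)
        tmp_path = path + ".tmp"
        # Write to a temporary file first so a crash never leaves a
        # half-written copy under the final name.
        with open(tmp_path, "w") as f:
            f.write(payload)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp_path, path)
        written.append(path)
    return written


def load_state(directories, filename="fsimage.json"):
    """Return the first readable copy, skipping missing or corrupt ones."""
    for directory in directories:
        path = os.path.join(directory, filename)
        try:
            with open(path) as f:
                return json.load(f)
        except (OSError, ValueError):
            continue
    raise FileNotFoundError("no usable copy of the state was found")


if __name__ == "__main__":
    base = tempfile.mkdtemp()
    # Stand-ins for a local disk and an NFS mount.
    dirs = [os.path.join(base, "local_disk"), os.path.join(base, "nfs_mount")]
    save_state({"/user/a.txt": ["blk_1", "blk_2"]}, dirs)
    os.remove(os.path.join(dirs[0], "fsimage.json"))  # simulate losing one copy
    print(load_state(dirs))
```

If one location is lost, the state can still be recovered from the other, which is the point of writing to more than one place.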
If there is one node in the cluster that justifies spending on the best enterprise hardware for maximum reliability, it is the NameNode.
The NameNode should also have as much RAM as possible because it keeps the entire file system metadata in memory.
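Because the metadata lives in memory, the NameNode's RAM requirement grows with the number of files, directories and blocks. The sketch below gives a rough estimate. The figure of about 150 bytes per object is a commonly quoted rule of thumb and is an assumption here, not a measured value. The function name is hypothetical.

```python
# Assumed rule of thumb: roughly 150 bytes of NameNode heap per
# namespace object (file, directory or block). Adjust for your cluster.
BYTES_PER_OBJECT = 150


def estimate_namenode_heap_bytes(num_files, blocks_per_file, num_dirs=0):
    """Rough heap estimate for holding the file system metadata in memory."""
    num_blocks = num_files * blocks_per_file
    objects = num_files + num_blocks + num_dirs
    return objects * BYTES_PER_OBJECT


if __name__ == "__main__":
    # One million files, two blocks each, no directories counted.
    print(estimate_namenode_heap_bytes(1_000_000, 2))
```

The estimate also shows why many small files are expensive: each one adds metadata objects no matter how little data it holds.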
DataNodes.
A typical HDFS cluster has many DataNodes.
DataNodes store the blocks of data, and blocks from different files can be stored on the same DataNode.
When a client requests a file, it asks the NameNode which DataNodes store the blocks that make up the file, then reads those blocks directly from the individual DataNodes.
Each DataNode also reports to the NameNode periodically with the list of blocks it stores.
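The read path and the periodic block reports described above can be sketched as a toy in-memory model. This is only an illustration: the class and method names are hypothetical and do not correspond to the real Hadoop API, and there is no networking involved.

```python
class DataNode:
    """Stores block contents; knows nothing about files."""

    def __init__(self, name):
        self.name = name
        self.blocks = {}  # block id -> bytes

    def store(self, block_id, data):
        self.blocks[block_id] = data

    def read(self, block_id):
        return self.blocks[block_id]

    def block_report(self):
        """List of block ids this node currently holds."""
        return sorted(self.blocks)


class NameNode:
    """Holds metadata only: which blocks form a file and where they live."""

    def __init__(self):
        self.file_blocks = {}      # path -> ordered list of block ids
        self.block_locations = {}  # block id -> set of DataNode names

    def add_file(self, path, block_ids):
        self.file_blocks[path] = list(block_ids)

    def receive_block_report(self, datanode_name, block_ids):
        # Forget what this node reported earlier, then record the new list.
        for holders in self.block_locations.values():
            holders.discard(datanode_name)
        for block_id in block_ids:
            self.block_locations.setdefault(block_id, set()).add(datanode_name)

    def locate(self, path):
        """Return (block id, sorted DataNode names) for each block of a file."""
        return [
            (block_id, sorted(self.block_locations.get(block_id, ())))
            for block_id in self.file_blocks[path]
        ]


def read_file(namenode, datanodes, path):
    """Client side: ask the NameNode for locations, read from DataNodes."""
    chunks = []
    for block_id, holders in namenode.locate(path):
        if not holders:
            raise IOError("no DataNode reported block " + block_id)
        chunks.append(datanodes[holders[0]].read(block_id))
    return b"".join(chunks)


if __name__ == "__main__":
    nodes = {"dn1": DataNode("dn1"), "dn2": DataNode("dn2")}
    nodes["dn1"].store("blk_1", b"hello ")
    nodes["dn2"].store("blk_2", b"world")

    nn = NameNode()
    nn.add_file("/greeting.txt", ["blk_1", "blk_2"])
    for name, node in nodes.items():
        nn.receive_block_report(name, node.block_report())

    print(read_file(nn, nodes, "/greeting.txt"))  # prints b'hello world'
```

Note that the file data never passes through the NameNode in this model; it only answers the question of where the blocks are.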
DataNodes do not require expensive enterprise hardware or replication at the hardware layer. They are designed to run on commodity hardware, and replication is provided at the software layer.
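As a rough illustration of replication at the software layer, the sketch below places each block on several distinct DataNodes and checks what survives when one node fails. The replication factor of 3 is an assumed parameter (it matches the usual HDFS default), the placement is simply random, and the function names are hypothetical. Real HDFS placement also takes racks into account, which this sketch ignores.

```python
import random


def place_replicas(block_ids, datanode_names, replication=3, seed=0):
    """Assign each block to `replication` distinct DataNodes."""
    if replication > len(datanode_names):
        raise ValueError("not enough DataNodes for the requested replication")
    rng = random.Random(seed)  # seeded so the example is repeatable
    return {
        block_id: rng.sample(list(datanode_names), replication)
        for block_id in block_ids
    }


def surviving_replicas(placement, failed_nodes):
    """Replica locations that remain after the given nodes fail."""
    failed = set(failed_nodes)
    return {
        block_id: [name for name in nodes if name not in failed]
        for block_id, nodes in placement.items()
    }


if __name__ == "__main__":
    nodes = ["dn1", "dn2", "dn3", "dn4", "dn5"]
    placement = place_replicas(["blk_1", "blk_2", "blk_3"], nodes)
    after_failure = surviving_replicas(placement, ["dn2"])
    for block_id, holders in after_failure.items():
        print(block_id, "still available on", holders)
```

With three copies on different machines, losing any single commodity node still leaves at least two copies of every block, which is why hardware-level redundancy on each DataNode is not needed.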