Hadoop is an open-source Big Data framework from the Apache Software Foundation designed to manage huge amounts of data on clusters of servers. Storage is handled by HDFS (the Hadoop Distributed File System), and data is processed in parallel using the Hadoop MapReduce programming model, an implementation of Google's MapReduce. Hadoop Common provides the required Java libraries, and Hadoop YARN handles cluster resource management.
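The MapReduce model itself can be illustrated without a Hadoop cluster. The following is a minimal sketch (in plain Python, not the Hadoop API) of the three conceptual phases of a word-count job: map emits key-value pairs, shuffle groups them by key, and reduce aggregates each group. Function names here are illustrative, not part of any Hadoop library.

```python
from collections import defaultdict

def map_phase(lines):
    # Mapper: emit a (word, 1) pair for every word in every input line
    for line in lines:
        for word in line.split():
            yield (word.lower(), 1)

def shuffle(pairs):
    # Shuffle: group all values by key, as Hadoop does between map and reduce
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    # Reducer: sum the counts collected for each word
    return {word: sum(counts) for word, counts in groups.items()}

lines = [
    "Hadoop stores data in HDFS",
    "Hadoop processes data with MapReduce",
]
counts = reduce_phase(shuffle(map_phase(lines)))
print(counts["hadoop"])  # "hadoop" appears once in each line, so this prints 2
```

In a real Hadoop job the same logic is split across many machines: mappers run on the nodes holding the HDFS blocks, the framework performs the shuffle over the network, and reducers write the final counts back to HDFS.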