Hadoop has evolved greatly since early 2008, when the Apache Software Foundation recognized its importance and promoted it to a top-level open source project. The ‘big data deluge’ is compounding, with IT teams taking a “grab the data first, and figure out what to do with it later” approach. Because there is no way to predict future big data applications, the choice of infrastructure around an organization’s big data is an important one. Hadoop's constant evolution has made it difficult for IT teams to acquire, deploy, and integrate it properly. Thus, although Hadoop is an open source project, many vendors have begun offering their own distributions of Hadoop, each with its own perks. This is an analysis of what …
Cloudera was founded in 2008 by a group of engineers from Yahoo, Google, and Facebook. The company is currently the frontrunner in the Hadoop market, with 53% of Hadoop platform market share (excluding pay-as-you-go services such as Amazon EMR). Part of this success can be attributed to strong corporate partnerships, with companies such as Oracle and MongoDB pushing its services (Top 6 Hadoop Vendors providing Big Data Solutions in Open Data Platform). Cloudera offers its core system, CDH, in combination with subscription services such as Cloudera Enterprise, and purports CDH to be the most popular distribution of Apache Hadoop. CDH offers the core elements of Hadoop, with all of the integration work already completed for the organization, along with additional components such as Hue, Cloudera’s user interface. Cloudera Manager is also included with the Express and Enterprise subscriptions, allowing for automated deployment, configuration, and cluster …
With a highly specialized version of HBase, which it claims offers higher reliability and security, MapR promotes itself as the only distribution that provides full data protection with no single point of failure (deRoos, 2014). The MapR distribution is also production-ready and can run online analytical processing and applications on a single platform, touting a lower “total cost of ownership” resulting from fewer administrative burdens (Why