In the opening of this series we looked at Hadoop as the most popular Big Data tool on the market today, briefly covered what it is, and explained why small businesses should feel empowered to get on board with this technology, even if they don’t have huge data sets.
A growing number of companies today offer accessible and user-friendly distributions of Apache Hadoop. We’ve already discussed Hortonworks as one of the most popular platforms out there. In this segment we’ll take a look at several more distributions that can help your small business get the most out of Hadoop.
Cloudera is the major competitor of Hortonworks, and the two are running neck and neck in the Hadoop distribution space.
Like Hortonworks, Cloudera offers a number of high-quality solutions for getting Hadoop up and running in your business. The center of Cloudera’s offerings is CDH, its version of the Apache Hadoop distribution. Cloudera offers software, services, and support in three different bundles:
* Cloudera Express includes CDH and a version of Cloudera Manager, which provides robust cluster management capabilities like automated deployment, centralized administration, monitoring, and diagnostic tools.
* Cloudera Enterprise is a series of subscriptions that include everything in Cloudera Express plus enhanced management capabilities and support.
* CDH on its own may be downloaded at no charge, but it comes without technical support or Cloudera Manager. All versions may be downloaded from Cloudera’s website.
Cloudera is also heavily invested in Hadoop training and certification for developers, analysts, and administrators. And like Hortonworks, Cloudera boasts a considerable partner ecosystem to ensure the scalable and efficient rollout of enterprise Hadoop to businesses and emerging markets.
IBM entered the Hadoop race in 2010 with the rollout of its BigInsights enterprise Hadoop platform. BigInsights offers a unique ecosystem of tools and resources to help businesses of all sizes manage structured and unstructured data. The IBM brand assures customers that they will get robust Big Data technology with visualization, exploration, advanced analytics, and security features. InfoSphere BigInsights comes bundled with a variety of enterprise-grade features such as:
* Social Data Analytics Accelerator: Ingest and process large volumes of social media data
* Machine Data Analytics Accelerator: Ingest and process large volumes of machine data
* Big R: Enables the use of R as a query language to explore, transform, and visualize data
* Big SQL: Offers robust security and performance for SQL on Hadoop
* BigSheets: Web-based tool for analyzing and visualizing large data sets
BigInsights comes with a considerable number of additional management, integration, and optimization tools and features that support IBM’s reputation for high-quality products and services.
The best way to get started with BigInsights is through the BigInsights Quick Start Edition, a free, downloadable, non-production version that provides access to the enterprise-level features of BigInsights, along with hands-on tutorials to guide you through your Hadoop experience.
Datameer has emerged in recent years as another major player in the Hadoop space. Based in San Francisco, Datameer’s main product, Datameer Analytics Solution (DAS), is a business integration platform for Hadoop that comes bundled with data source integration, an analytics engine with a spreadsheet interface, and visualization features that include reports, charts, and dashboards. Through its triad of integrate, analyze, and visualize functions, Datameer “simplifies this complex environment into a single application on top of the powerful Hadoop platform.”
Datameer also offers an extensive listing of use cases by industry to help businesses of all sizes get on board with Big Data and Hadoop.
In the last several years Hadoop has grown to become the most popular Big Data processing ecosystem on the market today. And while small businesses might be tempted to overlook Hadoop because of its complexity, we’ve reviewed a number of user-friendly platforms that make enterprise-level Hadoop accessible and scalable for businesses of any size. It’s never too early to get started with this technology. Today’s data sizes will soon seem minuscule compared to what’s next, especially considering the massive growth expected in the Internet of Things market. Small businesses would be well advised to start looking at the Hadoop ecosystem today and exploring relevant use cases. As Big Data grows by orders of magnitude, companies will continue to rely on Hadoop as a primary platform for integrating, analyzing, and visualizing large sets of structured and unstructured data.