
EMC Delivers Industry's First Unified Big Data Analytics Appliance

Last Updated : Jan 21 2013 | 12:40 AM IST

Greenplum’s Scalable, Modular System Combines Shared-Nothing MPP Relational Database with Enterprise-Class Apache Hadoop for Structured and Unstructured Data Co-processing

Continuing to build momentum for its “Big Data” analytics solutions, EMC Corporation today introduced the EMC® Greenplum® Modular Data Computing Appliance (DCA), the industry’s first complete Big Data analytics platform. Available now, the Greenplum DCA features a revolutionary modular architecture that for the first time allows enterprises to combine a shared-nothing MPP relational database with enterprise-class Apache Hadoop – along with Greenplum partner BI and ELT applications – to achieve true structured and unstructured data co-processing, and expand it gracefully as needed in a single, unified platform.

DCA modules enable enterprises to completely change how they think about scaling by providing the ability to start small and expand the appliance in more flexible and cost-effective quarter-rack increments based on processing performance or storage capacity needs. In addition to mixing and matching Greenplum Database and Greenplum HD (Hadoop) modules, enterprises can also bring all their BI applications and ELT tools directly into the cluster, in the same DCA, through the use of new Greenplum Data Integration Accelerator modules. The result is a unified Big Data platform combining structured and unstructured data and applications in a single infrastructure that is also monitored, managed and supported by EMC.

Today, enterprises are seeking to make better use of their data warehouses for advanced analytics, and this trend will accelerate as organizations strive to move from running business intelligence point solutions to deploying comprehensive analytics enterprise-wide. At the same time, enterprises are placing greater importance on integrating and analyzing their accumulated unstructured and semi-structured data. But as data warehouses get bigger, enterprises are facing scalability limits, performance degradation and management complexity, and are seeking ways to enable more concurrent users to access the data for business applications.

Four Greenplum Data Computing Appliance modules are available today:

  • The Greenplum Database Module is a purpose-built, highly scalable data-warehousing appliance module that architecturally integrates database, computing, storage and network into an enterprise-class, easy-to-implement system. It is the industry leader in price and performance.
  • The Greenplum Database High Capacity Module is designed to host multiple petabytes of data without surging power consumption, increasing costs or mushrooming space. For businesses that require detailed analysis of extremely large amounts of data or those looking for a longer-term archive, this high-capacity version offers the lowest cost-per-unit data warehouse.
  • The Greenplum HD Module is the world’s first high-performance data co-processing Hadoop appliance module. It marries Hadoop with the Greenplum Database, allowing true co-processing of both structured and unstructured data within a single, seamless solution.
  • The Greenplum Data Integration Accelerator (DIA) Module hosts partner analytics applications and places them directly on the same high-performance, low-latency interconnect as the other appliance modules. This enables market-leading data loading performance in a parallel and scalable model, to shorten batch loads or implement micro-batch loading.

Enterprises can start with a single primary rack, which includes one standard or high-capacity Greenplum Database quarter-rack module, room for three additional modules, and two master servers. The master servers are responsible for authentication, query optimization, balancing the workload among the segment servers, managing the data fault-tolerance mechanism and other tasks for the cluster. As their demand for processing capacity grows, enterprises may expand the appliance in quarter-rack increments using Greenplum Database, Greenplum HD or Greenplum DIA modules in any order and amount, up to six racks total. All modules are linked via a redundant, high-performance, low-latency interconnect.
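The rack layout described above can be sketched as a small capacity model. This is an illustrative sketch only, not an EMC tool or API: the class name, method names and module labels are hypothetical, and it assumes four quarter-rack slots per rack (one primary module plus room for three) and that the high-capacity database module may also be used for expansion, which the announcement does not state explicitly.

```python
# Hypothetical model of DCA expansion in quarter-rack increments.
MODULES_PER_RACK = 4          # assumption: four quarter-rack slots per rack
MAX_RACKS = 6                 # "up to six racks total"
MAX_MODULES = MODULES_PER_RACK * MAX_RACKS

# Illustrative labels, not product identifiers.
DATABASE_KINDS = {"database", "database_high_capacity"}
EXPANSION_KINDS = DATABASE_KINDS | {"hd", "dia"}


class Appliance:
    """Tracks which quarter-rack modules are installed, in order."""

    def __init__(self, first_module="database"):
        # The primary rack starts with a standard or high-capacity database module.
        if first_module not in DATABASE_KINDS:
            raise ValueError("the first module must be a Greenplum Database module")
        self.modules = [first_module]

    def add_module(self, kind):
        # Expansion modules may be added in any order, up to the six-rack limit.
        if kind not in EXPANSION_KINDS:
            raise ValueError(f"unknown module kind: {kind!r}")
        if len(self.modules) >= MAX_MODULES:
            raise ValueError("appliance is already at the six-rack maximum")
        self.modules.append(kind)

    def racks_used(self):
        # Ceiling division: a partly filled rack still counts as a rack.
        return -(-len(self.modules) // MODULES_PER_RACK)

    def free_slots_in_last_rack(self):
        return self.racks_used() * MODULES_PER_RACK - len(self.modules)


appliance = Appliance("database")
appliance.add_module("hd")
appliance.add_module("dia")
print(appliance.racks_used(), appliance.free_slots_in_last_rack())
```

Under these assumptions, the appliance tops out at 24 quarter-rack modules, and a new rack is only needed once the current one has no free slots.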

This new release of the Greenplum DCA also includes increased high availability and simple integration with EMC’s market-leading solutions for data protection and disaster recovery. High availability is addressed via automated master-node fail-over and the Greenplum Database High Availability Group, enabling each full-rack DCA to sustain up to four server failures, one in each HA Group, nearly doubling the high availability rate. It also integrates with industry-leading EMC Data Domain® deduplication and backup technology for high-speed backup and restore and wide-area disaster recovery. The Greenplum DCA SAN Mirror Solution uses EMC Symmetrix® VMAX™, TimeFinder®/Snap and Symmetrix Remote Data Facility (SRDF®) for advanced storage and data replication between two sites in synchronous mode.
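The failure-tolerance statement above ("up to four server failures, one in each HA Group") can be expressed as a simple check. This is a minimal sketch under stated assumptions: the function name is hypothetical, the figure of four HA Groups per full rack is inferred from the wording rather than given outright, and the announcement does not say what happens when two servers fail in the same group, so the sketch only reports whether a failure pattern falls within the stated guarantee.

```python
from collections import Counter

# Assumption: inferred from "four server failures, one in each HA Group".
HA_GROUPS_PER_RACK = 4


def within_stated_tolerance(failed_server_groups):
    """Return True if the failures fit the stated guarantee for one full rack.

    failed_server_groups: iterable of HA Group indices (0 to 3),
    one entry per failed server.
    """
    counts = Counter(failed_server_groups)
    for group in counts:
        if group not in range(HA_GROUPS_PER_RACK):
            raise ValueError(f"unknown HA Group index: {group!r}")
    # The guarantee covers at most one failed server per HA Group.
    return all(n <= 1 for n in counts.values())


print(within_stated_tolerance([0, 1, 2, 3]))
print(within_stated_tolerance([2, 2]))
```

The first call describes one failure in each of the four groups, which is the case the announcement names; the second describes two failures in one group, which falls outside it.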

Supporting Quotes

“Greenplum’s unique approach to Big Data analytics is resonating with a growing number of large enterprise customers in the financial, healthcare, and media industries who are planning for the future and have adopted our technology to make fundamental changes in their overall data infrastructure in order to tackle these Big Data issues. Today’s announcement reflects our shift to a focus on building the analytics platform of the future, and we have had great success with that as enterprises adopt and validate our vision of data coprocessing.”

- Bill Cook, president and general manager of Greenplum, a division of EMC

“We are the only company enabling data coprocessing, the fast, bidirectional sharing of structured and unstructured data between relational and Hadoop modules within a single appliance, to allow each system to do what it does best and achieve a whole that’s much greater than the sum of its parts. That’s why data coprocessing is so important and so powerful, and it is the new modularity of our DCA that provides this opportunity for enterprises to get the benefits of Big Data analytics from whatever kind of data they’re working with.”

- Scott Yara, vice president of products of Greenplum, a division of EMC

“EMC has unveiled a comprehensive Big Data strategy and continues to demonstrate its ability to execute by listening to what customers need and leveraging resources to put together a complete Big Data Analytics platform. The four Greenplum Data Computing Appliance modules enable companies to start with a configuration that makes sense for their data analytics application, the size of their wallet and their need for scale and agility as their data processing capacities change or grow – all while leveraging a common hardware platform. It is an elegant approach that simplifies how companies incorporate both structured and unstructured data in analytics to gain a competitive advantage.”

- Julie Lockner, Senior Analyst, Vice President Data Management, Enterprise Strategy Group

Availability and Pricing

The modular EMC Greenplum Data Computing Appliance is available now, and introduces new DCA platform pricing for the Greenplum Database Module to meet market demand for more compute per storage. For more details on individual modules, configuration and pricing, please contact EMC.

 


First Published: Oct 05 2011 | 12:38 PM IST
