Combining SQream DB and Hadoop for richer data insights

This is part two of a three-part series. Visit part one or continue reading.

SQream DB combined with Hadoop offers the best of both worlds: the benefits of Hadoop together with the analytics performance of SQream DB.

Hadoop is not just one thing. It is a group of popular open source components and procedures for big data operations. Its four “modules” work together to facilitate cloud-scale analytics. The most important of these are a) the Hadoop Distributed File System (HDFS) and b) MapReduce, the programming model for analytics on data in HDFS. Together these allow data to be stored and offer the basic tools for using it.

The other modules are Hadoop Common, which provides the Java tools to use Hadoop on the user’s computer system (Windows, Linux, etc.), and YARN, which manages resources.
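To make the MapReduce programming model mentioned above more concrete, here is a minimal sketch in plain Python. It does not use Hadoop at all; it only imitates the three stages (map, shuffle, reduce) in a single process with a word count, the customary introductory example. All function names are illustrative assumptions, not part of any Hadoop API.

```python
from collections import defaultdict


def map_phase(line):
    """Map step: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle step: group values by key, as a framework would do
    between the map and reduce stages."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce step: collapse all values for one key into a single total."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle and reduce in sequence over an iterable of lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    # Hypothetical sample input, standing in for files stored in HDFS.
    sample = ["big data on Hadoop", "fast analytics on big data"]
    print(word_count(sample))
```

In a real Hadoop cluster the map and reduce steps run in parallel across many machines and the input lives in HDFS; the sketch only shows the shape of the model.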

However, Hadoop was not designed as a database, so companies deploy a patchwork of applications on top of it, which results in complications and growing costs in hardware, software and management.

Hadoop is very flexible and has become widely used for both unstructured and structured data. In fact, half of the Fortune 500 companies use Hadoop.

However, the Business Intelligence (BI) pipeline built on top of Hadoop is too slow, and there are key use cases where Hadoop is not optimal.

This leads to lost insight and heaps of under-analysed data or silos of data.

BI users prefer the de facto standard of SQL to struggling with a patchwork of applications. However, SQL-on-Hadoop tools bring their own issues: they are delicate and not the best solution.

Hadoop issues

SQream DB was designed to bridge the gap by using GPU acceleration to allow rapid analysis of data in Hadoop and other MPP ecosystems. It handles hundreds of terabytes to petabytes of raw data directly, and is the perfect complementary platform for supercharging the Hadoop system you have already invested in.

You still retain the benefits of Hadoop: distributed, cost-effective storage that is well integrated and successfully deployed. SQream DB, in turn, adds scalability and high performance for ad-hoc queries, while leveraging existing SQL skills.

“To uncover business-critical insights, data experts need fast, flexible, and direct access to their raw data. SQream DB brings mass data analytics as a complement to the Hadoop ecosystem, enabling organizations to generate faster, more accurate, and previously unobtainable insights.”
Ami Gal
CEO and co-founder of SQream

As mentioned in the first part of this series, the SQream method of using GPU acceleration does not require the same careful replication needed with Hadoop systems. This makes data preparation fast, easy and less error-prone, as data is ready to be queried immediately.

Using SQream DB allows you to load and query massive amounts of data while conforming to the ANSI SQL-92 standard, making it the same to work with as other RDBMSs. The system responds dynamically to the workload, tuning system resources as required.

Lumen is a distributor and reseller for SQream in Australia and New Zealand. Our relationship and experience make us the best team to speak with about your big data analytics needs.

For more information on BI Visualisers and SQream, you can watch the webinar “Is Hadoop still Delivering on its promise?”

Our SQream DB consultants are eager to show you how you can save money and increase the quality of your business intelligence.

This is part two of a three-part series. Continue to part three, or visit our SQream DB Product Page.