Thursday, 3 September 2015

Big Data Series – Part 4: Creating a Suitable Technology Stack/Solution


Each of these components brings its own technology features. Companies must wisely assemble an overall solution from them, leveraging their complementary advantages and customizing them to their particular needs.

Four fundamental technology stacks (with their variations) offer possible solutions:

1.       Big data core only or with enhancements (with complex event processing, with in-memory database, with query engine or with complex event processing and query engine)

o   This technology is the de-facto standard for exceptional data movement, processing and interactivity.

o   Data usually enters the cluster through batch or streaming.

o   Events are not processed immediately, but in intervals. This enables parallel processing on large data sets, and thus advanced analytics.

o   Applications and services may access the core directly, which delivers improved performance on large, unstructured data sets.

o   Adding complex event processing (CEP) enhances the big data core's processing capabilities: it detects patterns in data in real time and triggers events. This enables real-time animated dashboards. A machine learning program could also be added to the CEP.

o   An in-memory database (IMDB) can further increase computing power by placing key data in RAM.

o   Query engines can further open interfaces for applications to access big data even faster.
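The difference between interval-based processing in the core and real-time pattern detection in a CEP layer can be sketched in a few lines of Python. This is only a toy illustration under stated assumptions: the names `micro_batches` and `SpikeDetector`, the sample events, and the "three consecutive high values" rule are all hypothetical and do not come from any particular big data product.

```python
from collections import deque


def micro_batches(events, interval):
    """Group (timestamp, value) events into fixed-width time windows.

    Mimics the core processing events in intervals rather than one by one.
    """
    batches = {}
    for ts, value in events:
        window_start = (ts // interval) * interval
        batches.setdefault(window_start, []).append(value)
    return batches


class SpikeDetector:
    """Toy CEP rule: fire when `count` consecutive values exceed `threshold`."""

    def __init__(self, threshold, count):
        self.threshold = threshold
        self.recent = deque(maxlen=count)

    def on_event(self, value):
        # Evaluate the rule immediately, as each event arrives.
        self.recent.append(value > self.threshold)
        return len(self.recent) == self.recent.maxlen and all(self.recent)


# Hypothetical sample data: (timestamp in seconds, sensor value)
events = [(0, 10), (1, 12), (5, 95), (6, 97), (7, 99), (11, 11)]

# Interval path: aggregate once per 5-second window.
batches = micro_batches(events, interval=5)
averages = {start: sum(vals) / len(vals) for start, vals in batches.items()}

# CEP path: check the pattern on every single event.
detector = SpikeDetector(threshold=90, count=3)
alerts = [ts for ts, value in events if detector.on_event(value)]
```

In this sketch the interval path only learns about the spike once the window is aggregated, while the CEP rule fires on the event that completes the pattern, which is the kind of trigger a real-time dashboard would consume.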

2.       In-memory database (IMDB) cluster only or with enhancements (with Big Data Platform, with complex event processing)

o   External data is streamed in or transferred in bulk to the IMDB.

o   Users and applications can directly query the IMDB, usually through SQL-like structures.

o   With a big data platform (BDP), the incoming data is first pre-processed by the BDP before it goes to the IMDB.

o   With CEP, the CEP first ingests the data; the processing is then done in the IMDB and the results are returned to the application for faster interactivity.
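As a rough illustration of querying an in-memory store through SQL, the sketch below uses Python's built-in `sqlite3` module in `:memory:` mode. This is an assumption made for demonstration only: SQLite is a single-process embedded database, not an IMDB cluster, and the `readings` table and sample rows are hypothetical.

```python
import sqlite3

# ":memory:" keeps the whole database in RAM (stand-in for an IMDB).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (sensor TEXT, value REAL)")

# Bulk transfer of external data into the in-memory store.
incoming = [("s1", 20.5), ("s2", 31.0), ("s1", 22.5), ("s2", 29.0)]
conn.executemany("INSERT INTO readings VALUES (?, ?)", incoming)

# The application queries the store directly with SQL.
rows = conn.execute(
    "SELECT sensor, AVG(value) FROM readings GROUP BY sensor ORDER BY sensor"
).fetchall()
conn.close()
```

In the BDP variant, the `incoming` rows would be the output of the platform's pre-processing step rather than raw external data.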

3.       Distributed Cache only or with enhancement (with Application and Big Data platform)

o   A simple caching stack sitting atop the data source repository. The application retrieves the data. The most relevant data subset is placed in the cache.

o   Processing of the data falls to the application (which may result in slower processing speeds).

o   With a BDP, the BDP ingests the data from the source and does the bulk of the processing, then puts a data subset in the cache.
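The simple caching stack can be sketched as a small read-through cache in front of a repository. This is a minimal single-process sketch, not a distributed cache: the class name `CachedRepository`, the least-recently-used eviction policy, and the dictionary standing in for the data source are all assumptions chosen for illustration.

```python
from collections import OrderedDict


class CachedRepository:
    """Read-through cache that keeps the most recently used keys."""

    def __init__(self, source, capacity):
        self.source = source          # stand-in for the data source repository
        self.capacity = capacity      # maximum number of cached entries
        self.cache = OrderedDict()
        self.source_reads = 0         # counts trips to the slow source

    def get(self, key):
        if key in self.cache:
            # Cache hit: mark the key as most recently used.
            self.cache.move_to_end(key)
            return self.cache[key]
        # Cache miss: read from the source and keep the value.
        self.source_reads += 1
        value = self.source[key]
        self.cache[key] = value
        if len(self.cache) > self.capacity:
            # Evict the least recently used entry.
            self.cache.popitem(last=False)
        return value


source = {"a": 1, "b": 2, "c": 3}
repo = CachedRepository(source, capacity=2)
results = [repo.get(k) for k in ["a", "b", "a", "a", "c"]]
```

Here five requests cost only three reads from the source. Any processing of the returned values still happens in the application, which matches the trade-off noted above for the cache-only variant.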

4.       Appliance only or with enhancement (with Big Data platform)

o   Data streams directly into the appliance; the application talks directly to the appliance.

o   With a BDP, the BDP ingests and processes the data. The application can talk directly to the appliance for queries.

Continued in Part 5 of 5
