
An ounce more to the Big Data chaos!

Having built two products that use the Big Data “idea” and “tools”, I feel qualified to add to the existing chaos. What triggered this? Over the weekend, I was trying to explain one of those products to a couple of brilliant architects and thinkers. That’s when I experienced what it is like to be caught in the belly of a tsunami. Every word and every picture was ripped, rolled and mauled for the next two hours. I quickly inferred that the reason for the mayhem was that we did not share a common reference architecture of a typical Big Data solution. We even needed a totally new vocabulary that does not ring any old, conflicting bells.

During the discussion, my brain was racing madly to bring some “order” to those areas. I will present it all here. You tell me if it brings more order or even more chaos. Basics first! Why do we need a reference architecture? Basically, to communicate a complex concept to others in the least conflicting manner. It also helps to stack up the technical capabilities in such a way that things don’t collapse along the way. Presenting a reference architecture is itself an art. If you are well exposed and experienced, you know that you need many perspectives to represent an architecture. A wise architect can be measured by the discipline of not mixing multiple perspectives in a single artefact, unless it’s really essential.

Here is a “processing capabilities” perspective on any Big Data oriented solution. Is it earth-shattering? No! But the point is, understanding my view will enable you to attack my assumptions effectively. Without that, the result will be shattered egos :)
The vocabulary I used above is self-explanatory except, probably, for the “Reduce” part. That is the only aspect that differentiates a Big Data solution from traditional information management solutions. You will read multiple definitions of Big Data (the “4Vs definition” being the most common). What is rare, and in my opinion hits the nail on the head, is that “Big Data is all about effective Data Reduction”. An implementation that does not result in cost-effective and rapid data reduction is not a proper Big Data implementation. And needless to say, the primary purpose of data reduction is to extract and keep the “insights”.
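To make the “Reduce” idea a little more concrete, here is a minimal Python sketch. It is only an illustration under my own assumptions, not how either product was built: the `(sensor_id, reading)` event shape and the function names `reduce_events` and `reduction_ratio` are hypothetical. It collapses a pile of raw events into one small summary per source and reports how much the data shrank.

```python
from collections import defaultdict


def reduce_events(events):
    """Collapse raw (sensor_id, reading) events into one summary per sensor.

    The event shape is a hypothetical example; real raw data will differ.
    """
    totals = defaultdict(lambda: {"count": 0, "total": 0.0, "max": float("-inf")})
    for sensor_id, reading in events:
        entry = totals[sensor_id]
        entry["count"] += 1
        entry["total"] += reading
        entry["max"] = max(entry["max"], reading)

    # Keep only the "insight" per sensor: count, mean and peak reading.
    return {
        sensor_id: {
            "count": entry["count"],
            "mean": entry["total"] / entry["count"],
            "max": entry["max"],
        }
        for sensor_id, entry in totals.items()
    }


def reduction_ratio(raw_count, reduced_count):
    """How many raw records were folded into each reduced record, on average."""
    return raw_count / reduced_count


if __name__ == "__main__":
    raw = [("a", 1.0), ("a", 3.0), ("b", 5.0)]
    insights = reduce_events(raw)
    print(insights)
    print(reduction_ratio(len(raw), len(insights)))
```

The summary is deliberately tiny compared with the input. In a real solution the reduction would be spread across many machines, but the goal is the same: keep the insight, drop the bulk.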

The above can also be presented as a layered cake, as below. Raw data gets ingested at the bottom. As it moves up the layers, it gets converted into insights and actions. Sounds clichéd? :) Now, the “elapsed time” between the acquisition stage and the action stage is a key factor. How rapidly do you want to take a business action on newly arriving data? In Real Time? Near Real Time? Or when all the cows have come home? The technical capabilities that you need will be vastly different depending on the answer.
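The “elapsed time” question can be sketched as a small classifier. This is a rough illustration only: the thresholds of one second and five minutes are my own assumptions (every business will draw these lines differently), and `latency_tier` is a hypothetical helper name.

```python
from datetime import datetime, timedelta

# Assumed thresholds, chosen only for illustration.
REAL_TIME_LIMIT = timedelta(seconds=1)
NEAR_REAL_TIME_LIMIT = timedelta(minutes=5)


def latency_tier(acquired_at, acted_at):
    """Classify the gap between data acquisition and business action."""
    elapsed = acted_at - acquired_at
    if elapsed < timedelta(0):
        raise ValueError("action cannot precede acquisition")
    if elapsed <= REAL_TIME_LIMIT:
        return "real time"
    if elapsed <= NEAR_REAL_TIME_LIMIT:
        return "near real time"
    # Everything slower: "when all the cows have come home".
    return "batch"


if __name__ == "__main__":
    t0 = datetime(2020, 1, 1, 0, 0, 0)
    print(latency_tier(t0, t0 + timedelta(milliseconds=200)))
    print(latency_tier(t0, t0 + timedelta(minutes=2)))
    print(latency_tier(t0, t0 + timedelta(hours=6)))
```

Which tier you are aiming for is worth settling before picking tools, since a stack tuned for the batch tier will rarely stretch to the real time tier.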


