The Emperor Of All Remedies: Big Data

Big Data Studio

05 Jan 2013 — By Fari Payandeh

The thickening line of optimism wrapping around the vector of time is a harbinger of what is to come. As I was browsing through the Big-Data-related articles I had collected in 2012, I couldn’t help noticing the positive trend in the way the world is viewing it. Nonetheless, a chorus of voices conveying a message of admonishment has been reverberating through the corridors of the academic and scientific communities. The gap between the overly optimistic business world and the prudent scientific world is created primarily by forces that have stumbled into a delicious opportunity to make huge profits. They see Big Data as low-hanging fruit, and they are working around the clock to promote it. Some of these “experts” don’t have the foggiest idea of what’s under the hood. “Something true when whispered becomes false when shouted,” and that is part of the problem. At the opposite pole, some scientists wrongly assume that Big Data is just another fad that will collapse under its own weight. I believe that the current state of this technology is the most sensitive barometer of its future potential.

However, the technology has two arms, albeit inextricably intertwined, working on two different requirements. Big Data’s first arm came to life out of operational necessity after some Web 2.0 companies hit an impasse with traditional RDBMS technology. The second arm, analytics, already existed in the IT industry but needed a transplant to give it Business Intelligence capabilities. The operational arm has been successfully implemented by companies like Facebook, which ingests around 500 TB of data per day, an astonishing volume compared to traditional data warehousing limits. The evidence abounds that the operational capabilities of Big Data are indisputable.

Now, let’s examine the point of contention, which is Big Data Analytics. No one questions the value that analytics might bring to businesses. The rift is caused by disagreements over the practical aspects of creating value, not over its merits. Here is a good example of what is under scrutiny: how are we going to derive insight from bad data? Many companies are grappling with making sense of the operational data they have collected, let alone distilling raw data into intelligence. With that in mind, there are companies that have succeeded in increasing revenues and reducing costs by implementing Big Data Analytics. As the world becomes one huge network connecting billions of devices (one entity, one IP address), and as more and more sensors are implanted in “things” (appliances, buildings, offices, homes, vehicles, factories, medical devices, and big machinery), we will witness a societal metamorphosis. Just as breaking the sound barrier unleashed the forces that changed the world in terms of air travel and warfare, breaking the data size barrier will inevitably change the world by making it a SMART place. Does that mean every company that sits on top of large amounts of data will directly benefit from this new technology? The answer is no. Is Big Data going to change the world as we know it? The answer is a resounding yes, but it will take some time for Big Data’s true identity to be revealed.

The Curious Case Of Big Data Definition



15 December 2012 — By Fari Payandeh

Most technical people I have talked to think that Big Data is nothing new. They seem to be proceeding on the premise that Big Data’s sole purpose in life is to serve business intelligence. As someone said to me the other day, “Walmart has been enjoying the fruit of their investment in data warehousing/business intelligence for years; way before there was a Hadoop or NoSQL in existence.” True, but Big Data is not about “what”. It’s about “how”. How long do Walmart’s nightly jobs run to transform the raw data into meaningful data (business data) that can be used by its BI tools? Moreover, is Walmart currently processing its unstructured data to add value to its BI strategy?
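To make the “nightly jobs” concrete, here is a minimal sketch of the kind of batch transform such a job performs: raw transaction records rolled up into business data that a BI tool can query. The field names and figures are purely illustrative assumptions, not anyone’s actual schema.

```python
from collections import defaultdict

# Hypothetical raw transaction records, as a nightly batch job might
# read them from flat files; this schema is an illustrative assumption.
raw_sales = [
    {"store": "S1", "sku": "A", "qty": 2, "price": 3.50},
    {"store": "S1", "sku": "B", "qty": 1, "price": 10.00},
    {"store": "S2", "sku": "A", "qty": 5, "price": 3.50},
]

def nightly_rollup(records):
    """Transform raw transactions into per-store revenue totals --
    the 'meaningful data' a BI tool would actually query."""
    totals = defaultdict(float)
    for r in records:
        totals[r["store"]] += r["qty"] * r["price"]
    return dict(totals)

print(nightly_rollup(raw_sales))  # {'S1': 17.0, 'S2': 17.5}
```

The point of the “how” question is that a transform like this, trivial at three rows, becomes an hours-long window at warehouse scale, which is exactly the pressure that motivated the new processing frameworks.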

I watched Werner Vogels, the CTO of Amazon, elaborate on what is today called “Big Data” back in 2006. He was talking about how Amazon had made a radical shift from relational databases to flat files to store its customer data. He said that relational databases weren’t able to meet Amazon’s requirements. What is interesting is that Werner Vogels was referring to the difficulties they were facing in processing the OLTP portion of their business, not DSS. However, today, Big Data encompasses OLTP, DSS, and real-time BI.
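The essence of that shift can be sketched in a few lines: every read and write goes through a primary key, with no joins or cross-row queries. This is a toy illustration of the access pattern, not Amazon’s actual design; the in-memory dict stands in for disk-backed flat files, and all names here are assumptions.

```python
class KeyValueStore:
    """Toy stand-in for a flat-file/key-value customer store."""

    def __init__(self):
        self._data = {}  # dict stands in for partitioned disk files

    def put(self, key, value):
        # Blind overwrite by key: O(1), and trivially partitionable
        # across machines because each key is independent.
        self._data[key] = value

    def get(self, key, default=None):
        # Single-key lookup is the only query shape supported; giving
        # up joins and ad hoc queries is what buys OLTP scalability.
        return self._data.get(key, default)

store = KeyValueStore()
store.put("customer:42", {"name": "Ada", "cart": ["book", "lamp"]})
print(store.get("customer:42")["cart"])  # ['book', 'lamp']
```

The trade is deliberate: by restricting itself to key-based access, such a store sidesteps the locking and join machinery that made the relational approach a bottleneck for high-volume OLTP.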

Let’s balance the myth against the facts: what is not Big Data? Big Data is not attached to a particular set of technologies, nor is it applicable to every company that sits on top of huge amounts of data. It is true that the IT industry has made great strides in data caching, I/O throughput, scalability, availability, consistency, real-time data processing, and working with unstructured data. However, those enhancements could have come to life organically, guided by the invisible hand of market dynamics, to support the evolution of business intelligence. Where fact and myth diverge is that the myth fails to take into account the likelihood that we would have been where we are today even if there were no likes of Amazon around.

In conclusion, the term “Big Data”, although legitimate in that it refers to new ways of processing large amounts of data, is misleading because “size” is part of the name, yet size categories (small, medium, large) are not constants and change over time. What was considered a large data set twenty years ago may fall into the small category today. I personally would rather refer to it as “Net Data”, alluding to the way data is spread across many servers in disk files as opposed to databases.
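The “Net Data” idea, data spread across many servers rather than held in one database, can be sketched with simple hash-based placement: every client hashes a key to decide which server holds it, so no central database or lookup service is needed. This is a generic illustration of sharding under assumed server names, not a specific product’s scheme.

```python
import hashlib

SERVERS = ["node-0", "node-1", "node-2", "node-3"]  # illustrative names

def server_for(key, servers=SERVERS):
    """Map a record key to one of N servers by hashing it, so the data
    set as a whole lives spread across machines ('Net Data') rather
    than inside a single central database."""
    digest = hashlib.md5(key.encode()).hexdigest()
    return servers[int(digest, 16) % len(servers)]

# Every client computes the same placement independently.
placement = {k: server_for(k) for k in ("user:1", "user:2", "user:3")}
print(placement)
```

One caveat worth noting: naive modulo placement reshuffles most keys when a server is added or removed, which is why production systems typically use consistent hashing instead.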