Wednesday, February 03, 2016

5 Reasons to Augment your Data Warehouse with Hadoop


Here is the webinar I did on modernizing the data warehouse with Hadoop.

Monday, February 01, 2016

A business improvement process

As we gain experience in doing something, we all learn about something called a process. A process can be thought of as a methodology or framework for performing a task or building a solution. Each process can be broken down into building blocks, and these building blocks, when interconnected, form the process. A clear process brings many benefits to any problem: repeatability, scalability, and maintainability.

Now let's apply a process to business improvement, step by step. As discussed above, the first step is to identify the building blocks. The building blocks of a business improvement process are:

1. Monitor: Monitor all aspects of the business with various tools.

2. Learn: Learn from the events that occur in response to your actions.

3. Act: Act on what you have learned.

Here the order is important, and these steps form a closed loop (Monitor -> Learn -> Act -> Monitor). Apply these steps to your business and improve the outcome.

Thursday, October 29, 2015

Data Visualization in the Age of Big Data


The old cliché - 'A picture is worth a thousand words' - still applies to big data. Humans perceive information through various sensory channels: visual, auditory, and tactile. Human brains are wired to understand images better than text or numbers. An often-quoted (though hard to source) figure says the human brain processes images 60,000x faster than text. Whatever the exact number, it points to the importance of data visualization. Take any dataset, small or large, and process it to the desired state. The final step is to use data visualization to communicate trends, gaps, outliers, and insights to decision makers so that they result in a business action or decision.

Data visualization is one of the critical components, if not the most important component, of any big data implementation. It can be used for delivering traditional business intelligence reports, tracking organizational KPIs, and communicating insights gleaned from the data.


A big data implementation is a journey, and it is a linear process: each step depends on the one before it. This journey can be mapped to the simple five-step process shown below. Organizations can use these steps as a blueprint for their big data implementation:


STEP 1: Define your infrastructure strategy clearly
STEP 2: Select the right big data technologies
STEP 3: Integrate the right data from various sources
STEP 4: Process and enrich the data
STEP 5: Perform data analytics and visualization

Big data implementation journey

Typically, data visualization is the last step in any big data implementation. This is because data needs to be integrated, cleansed, transformed, and enriched with other data sources to get more semantic meaning and value out of it. Once the data reaches this final processed stage, visualization can be used to draw good insights from it.

If you look at the idiosyncrasies of big data visualization, you will notice that the term is something of a misnomer. Plotting an entire big dataset is too noisy, and it is slow because of the technology limitations in moving the data to the target device (a mobile phone, a tablet or, in most cases, a browser). Solving this problem requires new approaches and techniques.

With the advent of big data, a few new use cases are evolving for data visualization. These are the new drivers for big data visualization. Two such requirements are:
1. High speed data: visualize high-velocity data in real time.
2. High volume data: visualize huge volumes of data.

Traditional visualization tools don't cut it for these requirements. Efficient big data visualization requires new hardware capabilities (large RAM, multi-core CPUs and, in some cases, GPUs) in addition to the ability to store, organize, and process big data.

Challenges:
Large datasets are hard for the human eye to take in. Even if we do plot every point, the result is too noisy for people to read. It is like being asked to find a needle in a haystack: all you can see is the haystack, not the needle. A related problem is the cost of moving large data to the target device (browser, mobile, or tablet) for rendering, which is very slow.
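
To make both problems concrete, here is a minimal Python sketch of one common mitigation: summarizing the data on the server before it is sent to the device. This is my own illustration rather than a technique prescribed by any particular tool. The function name `downsample`, the bucket layout, and the synthetic readings are all assumptions made for the example.

```python
import math
import random


def downsample(values, max_points):
    """Summarize `values` into at most `max_points` buckets.

    Each bucket is a (min, mean, max) tuple, so a chart can draw a band
    per bucket instead of every raw point. Hypothetical helper written
    for this example.
    """
    if max_points <= 0:
        raise ValueError("max_points must be positive")
    if not values:
        return []
    # Fixed-width buckets; the last bucket may be shorter than the rest.
    bucket_size = math.ceil(len(values) / max_points)
    summary = []
    for start in range(0, len(values), bucket_size):
        bucket = values[start:start + bucket_size]
        summary.append((min(bucket), sum(bucket) / len(bucket), max(bucket)))
    return summary


# Synthetic data (an assumption for illustration): noisy sensor readings
# with a single extreme value hidden in them.
random.seed(0)
readings = [random.gauss(100.0, 5.0) for _ in range(100_000)]
readings[12_345] = 500.0  # the "needle"

summary = downsample(readings, max_points=500)
print(len(readings), "raw points ->", len(summary), "buckets")
print("largest value still visible:", max(b[2] for b in summary))
```

Only the bucket summaries need to travel to the browser, and because each bucket keeps its minimum and maximum, the needle still shows up as a spike instead of being averaged away. The trade-off is that detail inside a bucket is lost until the user zooms in and asks for that range again.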

Opportunities:
The challenges above are driving new opportunities: better algorithms, more efficient hardware, the commoditization of RAM, and new ways to visualize information such as graph, temporal, and hierarchical views. Last but not least, delivery platforms like cloud and mobile play a crucial role. All of these call for a new approach to data visualization called interactive analytics. With interactive analytics, you can ask questions, touch and feel the data, and collaborate and brainstorm with teammates. A lot of innovation is still required in delivering information when and where it is needed.

To conclude, data visualization is hot, and in the age of big data it requires new interactive analytics approaches. Look out for new tools in this space. Some are general-purpose and cover many things, like charts and trends. Others are specialized and do a few specific things, like graph, collaborative, or interactive visualization on large datasets. Pick the right tool based on your requirements and use case.

Friday, March 06, 2015

Man vs Machine

In this post, I want to share my view on a hot subject: Man vs Machine. With the rise of data science, some people, including prominent figures like Elon Musk and Stephen Hawking, worry that AI is a threat to the human race. Sure, there is some truth to that, but we are a long way from that point. If the whole market (academia, research, business, and government) focuses on data science, the end result will be humongous improvements in this field. This is truly collective intelligence!

We are in the early phases of the commoditization of this field, which was once considered to be for elites with big bucks. Eager adoption of data science and research interest in it could lead us to revisit AI. I think this is what people worry about. If you have seen certain sci-fi Hollywood movies, you get what this is about: giving intelligence to machines! On the other hand, I see a positive side. The rise of machine learning and data science brings lots of good things, such as support for the elderly, robotic surgery, and assistance for terminally ill people, to name a few.

This wave may take some time to hit the market, as there are still missing pieces and technology problems to solve in this puzzle. I will leave those pieces to the masterminds.

Coming back to our topic of Man vs Machine, we really need to understand certain aspects of humans that are very hard to embed in a machine's brain. Let's look at some of them:

1. Creativity
2. Intuition
3. Dreams
4. Emotions
5. Thinking

Why are these aspects important in this debate? They are the driving forces for humans. That's what sets us apart!

With machine intelligence, we are not even close to beating an ant's brain at certain functions. The quest to give intelligence to machines is a long, long journey, but definitely an exciting one!

Wednesday, August 20, 2014

The brave new world of software!

Traditionally, software has been proprietary and closed source. Companies used to hire top-notch programmers and write code internally to develop software products. All was good until the open source movement started. Even then, many companies didn't pay attention to Linux until it became popular in the web world in the early 2000s.

Now many companies are taking the open source route, and it has produced some great products like Hadoop and OpenOffice. The Apache Software Foundation has become the de facto standard due to its business-friendly license. The VC community seems excited and is jumping on the bandwagon. Recently, a good chunk of funding went to open source names like MongoDB and Hadoop. With all these trends, companies now need an open source strategy to innovate and align with their vision. This puts us in the brave new world of software, where bits are developed, tested, and certified across the world.

Is the future of software open source and free? It seems like it!



Sunday, August 17, 2014

Real-time Enterprise

There is both truth and hype in the idea of building a true, so-called 'Real-time Enterprise'. Let's first define what a Real-time Enterprise is: an enterprise where events are monitored in real time, or at least near real time, across the organization in order to make faster decisions.

On paper this definition looks fancy and great, but in reality it is very complex to achieve. There are several reasons for this: organizational culture, business processes, existing technology, and the people who make decisions.

With today's technologies, enterprises will keep moving toward becoming a Real-time Enterprise. Reaching that goal is a continuous process and requires good strategic thinking and excellent resources.




Friday, May 30, 2014

From good to great

I have been coaching my kid in soccer for some time now. Quite recently, I started thinking about how one can rise from a good player to a great player. If you think about soccer, anyone who has a ball knows how to kick it, I mean just blindly kicking. The barrier to entry for this sport is very low. With some hard work, one can move from a novice player to a good player, but moving from a good player to a great player is not at all easy. This transition takes relentless practice, dedication, perseverance, planning, and many other skills. It might even take many years to get there.

The same thing applies to the 'Business' world.