The old cliché 'A picture is worth a thousand words' still applies to big data. Humans
perceive information via various sensory channels - visual, auditory, and tactile -
and human brains are wired to understand images better than text or numbers. It is
often claimed that the brain processes images as much as 60,000 times faster than
text; whatever the exact figure, the point stands: visualization matters. Take any
dataset, small or large, process it to a desired state, and the final step is to
communicate trends, gaps, outliers, and insights to decision makers through data
visualization, so that some business action or decision results.
Data visualization is one of the most critical components, if not the most important
component, of any big data implementation. It can be used to deliver traditional
business intelligence reports, track organizational KPIs, and communicate insights
gleaned from the data.
A big data implementation is a journey, and a linear one: each step depends on the
step before it. The journey can be mapped to a simple five-step process, which
organizations can use as a blueprint for their own implementations:
Step 1: Clearly define your infrastructure strategy.
Step 2: Select the right big data technologies.
Step 3: Integrate the right data from various sources.
Step 4: Process and enrich the data.
Step 5: Perform data analytics and visualization.
[Figure: the big data implementation journey]
Typically, data visualization is the last step in any big data implementation,
because the data first needs to be integrated, cleansed, transformed, and enriched
with other data sources to draw more semantic meaning and value out of it. Once the
data reaches this final processed stage, visualization can be applied to extract
good insights from it.
If you look at the idiosyncrasies of big data visualization, you will notice that
the term is something of a misnomer. Plotting an entire big dataset can be too
noisy, too slow, and technically challenging, because the data must be moved to the
target device (a browser, phone, or tablet - in most cases a browser). Solving this
problem requires new approaches and techniques.
With the advent of big data, a few new use cases are evolving for data
visualization, and these are its new drivers. Two such requirements are:
1. High-speed data: visualize high-velocity data in real time.
2. High-volume data: visualize huge volumes of data.
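The first requirement can be approached with reservoir sampling, a classic technique for keeping a fixed-size uniform sample of an unbounded stream in bounded memory. A hedged sketch (the function name and sample size are illustrative, not from the post):

```python
import random

def reservoir_sample(stream, k=1000):
    """Maintain a uniform random sample of size k over a stream of
    unknown length, so a real-time chart can redraw from a bounded
    buffer instead of the full firehose."""
    sample = []
    for i, item in enumerate(stream):
        if i < k:
            sample.append(item)
        else:
            j = random.randint(0, i)
            if j < k:
                sample[j] = item  # replace with decreasing probability
    return sample

# A million-event stream keeps only 1,000 points for plotting.
sample = reservoir_sample(range(1_000_000), k=1000)
print(len(sample))  # 1000
```

Because memory use is fixed at k items regardless of stream length, the dashboard's redraw cost stays constant no matter how fast events arrive.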
Traditional visualization tools don't meet these requirements. These drivers call
for new hardware capabilities (large RAM, multi-core CPUs, and in some cases GPUs),
along with new ways to store, organize, and process big data for efficient
visualization.
Challenges:
Visualizing a very large dataset is hard on the human eye: presented in full, it is
simply too noisy. Think of it as being asked to find a needle in a haystack - all
you can see is the haystack, not the needle. A related problem is the computational
cost of moving large data to the target device (browser, mobile, or tablet) for
rendering, which can be very slow.
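One way around the haystack problem is to find the needles server-side and ship only those to the device. A minimal sketch using a z-score cutoff; the helper name, threshold, and planted values are assumptions for illustration:

```python
import random
import statistics

def find_needles(values, z_threshold=5.0):
    """Return only extreme outliers ('needles') so the client renders
    a handful of points instead of the whole 'haystack'."""
    mu = statistics.fmean(values)
    sigma = statistics.pstdev(values)
    return [v for v in values if abs(v - mu) > z_threshold * sigma]

# 100,000 ordinary points plus two planted anomalies.
haystack = [random.gauss(0, 1) for _ in range(100_000)]
haystack += [25.0, -30.0]
needles = find_needles(haystack)
```

Only the anomalous points cross the wire, so the chart shows the needles directly; a real system would typically pair this with an aggregate view of the haystack for context.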
Opportunities:
The challenges above are driving new opportunities: better algorithms, more
efficient hardware, the commoditization of RAM, and new ways to visualize
information such as graph, temporal, and hierarchical views. Last but not least,
delivery platforms like cloud and mobile play a crucial role. Together, these
approaches enable a new technique for data visualization called interactive
analytics. With interactive analytics, you can ask questions, touch and feel the
data, and collaborate and brainstorm with teammates. Plenty of innovation is still
required in delivering information when and where it is needed.
To conclude, data visualization is a hot area that requires new interactive
analytics approaches in the age of big data. Look out for new tools in this space:
they tend to be either generalized, doing many things like charts and trends, or
specialized, doing a few specific things like graph-based, collaborative, or
interactive visualization on large datasets. Pick the right tool based on your
requirements and use case.