Showing posts with label Big Data HBase NoSQL unstructured data BI DB. Show all posts

Friday, August 3, 2012

What is Big Data?

Big Data is a tech buzzword, and each of us may have our own understanding of what Big Data is or what it contains. It contains data, we know; perhaps very large quantities of data, in formats that are new to us and stored differently from what we have seen?

It was in a faculty meeting discussing BI/analytics courses that I first felt the extent of the diversity of meanings that the phrase "Big Data" conveys, so I decided to write about it.

source:  http://www.bigdatabytes.com/managing-big-data-starts-here/ 
Big Data is mostly associated with Hadoop, an open source framework from the Apache Software Foundation for scalable, reliable, distributed processing and analysis of Big Data. Big Data and Hadoop are well described by Tim Elliott on his blog.




How is it different? I believe understanding HBase is the key. HBase is a non-relational, distributed data store that runs on top of Hadoop. InfoSys educators need to develop course modules (for 1 or 2 class meetings) to illustrate how Big Data is structured, or better said, unstructured in HBase, contrast its features with those of an RDBMS, and discuss NoSQL.
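To give a feel for what such a module might show, here is a minimal sketch in Python of an HBase-style data model: a sparse, sorted map from row key to column ("family:qualifier") to timestamped versions of a value. This is a toy in-memory illustration, not the HBase API; the class name ToyHBaseTable, its methods, and the sample rows are all made up for this example.

```python
from collections import defaultdict


class ToyHBaseTable:
    """Toy in-memory model of an HBase-style table (NOT the real HBase API)."""

    def __init__(self, column_families):
        # Column families are declared up front, when the table is created.
        self.column_families = set(column_families)
        # row key -> "family:qualifier" -> {timestamp: value}
        self.rows = defaultdict(lambda: defaultdict(dict))

    def put(self, row_key, column, value, timestamp):
        """Store one version of one cell; qualifiers need no prior declaration."""
        family = column.partition(":")[0]
        if family not in self.column_families:
            raise KeyError(f"unknown column family: {family}")
        self.rows[row_key][column][timestamp] = value

    def get(self, row_key):
        """Return the newest version of every cell stored for one row."""
        cells = self.rows.get(row_key, {})
        return {col: versions[max(versions)] for col, versions in cells.items()}

    def scan(self):
        """Yield rows in sorted row-key order."""
        for row_key in sorted(self.rows):
            yield row_key, self.get(row_key)


# Hypothetical sample data: rows do not need to share the same columns.
table = ToyHBaseTable(column_families=["info", "orders"])
table.put("user2", "info:name", "Bob", timestamp=1)
table.put("user1", "info:name", "Ann", timestamp=1)
table.put("user1", "info:email", "ann@example.com", timestamp=2)
table.put("user1", "info:name", "Ann Lee", timestamp=3)  # newer version of a cell
table.put("user1", "orders:2012-08-03", "book", timestamp=4)

rows = list(table.scan())
for row_key, cells in rows:
    print(row_key, cells)
```

The contrast with an RDBMS is visible in the sample data: there is no fixed set of columns per row, a row only stores the cells it actually has, and an update adds a new timestamped version instead of overwriting the old one.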

Parallel processing, synchronization, and other technical matters belong in master's-level courses in more technical programs, I would think. But they can be briefly discussed in our DB/BI classes too.

I am working on my course module because I think it is necessary for our students to enter the world of Big Data informed and prepared. I will share it here as soon as it's ready. Stay tuned!