Read this article and focus on the definition, types of data considered big data, and how to analyze it. Then take notes on the tools currently used to analyze big data.
Technology continues to advance an organization's ability to collect data. Next, you will learn how to develop the infrastructure that stores big data and makes it accessible.
What is big data?
There is no hard-and-fast rule about how large a database must be for the data inside it to be considered "big". Instead, what typically defines big data is the need for new techniques and tools to process it. To use big data, you need programs that span multiple physical or virtual machines, working in concert to process all of the data in a reasonable amount of time.
Getting programs on multiple machines to work together efficiently takes special programming techniques: each program must know which part of the data to process, and the results from all the machines must then be combined to make sense of the whole pool of data. Because programs can typically access data stored locally much faster than data reached over a network, the way data is distributed across a cluster and the way those machines are networked together are also important considerations in big data problems.
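The pattern described above (split the data, let each worker process its own share, then combine the results) can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not how any particular big data tool is implemented: threads stand in for separate machines, the word-counting task is just an example workload, and the function names (split_into_chunks, count_words, merge_counts, distributed_word_count) are hypothetical.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(records, num_workers):
    """Partition the records so each worker gets a roughly equal share."""
    return [records[i::num_workers] for i in range(num_workers)]


def count_words(chunk):
    """Per-worker step: process only the chunk assigned to this worker."""
    counts = Counter()
    for line in chunk:
        counts.update(line.lower().split())
    return counts


def merge_counts(partial_counts):
    """Combining step: fold every worker's partial result into one total."""
    total = Counter()
    for partial in partial_counts:
        total.update(partial)
    return total


def distributed_word_count(records, num_workers=4):
    chunks = split_into_chunks(records, num_workers)
    # Assumption: threads stand in for the separate machines of a cluster.
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        partial_counts = list(pool.map(count_words, chunks))
    return merge_counts(partial_counts)


if __name__ == "__main__":
    records = ["big data is big", "data needs tools", "tools process data"]
    print(distributed_word_count(records, num_workers=2))
```

In a real cluster, the chunks would usually already live on the machines that process them, so only the small partial results travel over the network. That is why the placement of data matters as much as the code that processes it.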