• Unit 4: Big Data Processing and Cloud Computing

    What is "big data"? Big data is usually defined as a collection of structured and unstructured data so large that it is difficult to process using traditional methods. The data is too big, arrives too quickly, or exceeds available processing capacity, so organizations need advanced systems to manage it. Big data processing consists of a set of techniques and computing models for working with data at this scale. This process extracts useful information that supports or provides evidence for decision-making. However, storing and processing this much data can require a lot of space, and purchasing hardware can become costly. Cloud computing allows for the delivery of different services through the internet. These services can include tools and applications like data storage, databases, and software. Cloud computing systems store data and grant access to it over the internet instead of on a local server or hard drive. This unit will cover big data and cloud computing.

    Completing this unit should take you approximately 7 hours.

    • 4.1: Big Data

      There are many definitions of big data. For this lesson, think of big data as consisting of extremely large data sets. They are large primarily because of volume, velocity, and variety (sometimes labeled variability). Volume is not only a question of storage. It also concerns how to discover relevant insights from enormous amounts of data.

      Because of technology, organizations can collect and store data faster than ever before. Velocity refers to this accelerated pace, which challenges organizations to find ways to collect, process, and make use of vast amounts of data. Variety refers to the fact that data comes in both structured and unstructured forms. Together, these characteristics mean that organizations require scalable architecture to store, manipulate, and analyze big data.
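
      To make the difference between structured and unstructured data concrete, here is a minimal Python sketch. The sample data, the field names (`customer_id`, `amount`), and both helper functions are hypothetical and exist only for illustration. Structured data has named fields that can be read directly, while unstructured text has to be searched for patterns.

```python
import csv
import io
import re

# Hypothetical sample data, invented for this illustration.
STRUCTURED = "customer_id,amount\n17,25.00\n42,13.50\n"
UNSTRUCTURED = (
    "Customer 42 called to say the delivery was late. "
    "Customer 17 praised the app."
)


def total_by_customer(csv_text):
    """Structured data: every row has the same named fields."""
    totals = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        customer = int(row["customer_id"])
        totals[customer] = totals.get(customer, 0.0) + float(row["amount"])
    return totals


def mentioned_customers(text):
    """Unstructured data: no fields, so we search the text for a pattern."""
    return sorted({int(m) for m in re.findall(r"[Cc]ustomer (\d+)", text)})


if __name__ == "__main__":
    print(total_by_customer(STRUCTURED))
    print(mentioned_customers(UNSTRUCTURED))
```

      Real systems rely on far more capable tools for both kinds of data, but the contrast is the same: structured data can be queried by field, while unstructured data must first be interpreted.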

      • 4.1.1: Big Data Storage

        Big data storage is an infrastructure designed to store, manage, and retrieve massive amounts of data. This allows the storage and sorting of big data. It also makes the data easily accessible for processing by other software and services that work with big data.

      • 4.1.2: Big Data Analytics

        Big data analytics explores large amounts of data to reveal hidden patterns, correlations, and other relevant insights. With today’s technology, organizations can analyze data and get results almost immediately.

        Using technology and big data analytics has benefits. In the past, organizations had to gather information and conduct analyses before they could make decisions about the future. Today, organizations can identify insights quickly enough to make immediate decisions.
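
        One way to analyze data that is too large to hold in memory is to process it as a stream and keep only running totals. The sketch below is a minimal illustration, not a production method. It computes the Pearson correlation between two values in a single pass. The function names and the generated sample data are hypothetical, and the simple sum-based formula used here can lose precision on very large or widely scaled values.

```python
import math
from typing import Iterable, Tuple


def streaming_correlation(pairs: Iterable[Tuple[float, float]]) -> float:
    """Pearson correlation computed in one pass, keeping only running sums."""
    n = 0
    sum_x = sum_y = sum_xx = sum_yy = sum_xy = 0.0
    for x, y in pairs:
        n += 1
        sum_x += x
        sum_y += y
        sum_xx += x * x
        sum_yy += y * y
        sum_xy += x * y
    if n < 2:
        raise ValueError("need at least two pairs")
    covariance = n * sum_xy - sum_x * sum_y
    variance_x = n * sum_xx - sum_x ** 2
    variance_y = n * sum_yy - sum_y ** 2
    if variance_x == 0 or variance_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return covariance / math.sqrt(variance_x * variance_y)


def sample_records(count: int):
    """Hypothetical data source: yields (x, y) pairs where y = 2x + 1."""
    for i in range(count):
        yield float(i), 2.0 * i + 1.0


if __name__ == "__main__":
    # The generator produces records one at a time, so memory use stays flat.
    print(streaming_correlation(sample_records(1000)))
```

        Because the records are consumed one at a time, the same function works whether the source yields a thousand pairs or a billion.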

      • 4.1.3: Challenges of Big Data

        Not many people are trained to work with big data, which creates challenges for organizations that store and use it. This is not the only challenge. Others include (1) handling large amounts of data, (2) performing complex real-time analysis, and (3) securing data.

    • 4.2: Cloud-Based Services

      Cloud-based services provide on-demand services to users via the internet. This is also known as cloud computing. Cloud computing delivers computing services ranging from applications to storage and processing power. For example, instead of owning their own computing infrastructure, companies can rent access to applications and storage from a cloud service provider.

      • 4.2.1: Cloud-Based Organizations

        Cloud-based organizations use cloud computing to deliver computing services. These include servers, databases, networking, software, analytics, and intelligence, all delivered over the cloud (the internet). The cloud offers faster innovation, economic scalability, and flexibility of organizational resources.

      • 4.2.2: Disaster Recovery

        An organization’s entire virtual server can be copied or backed up to an offsite data center. Cloud disaster recovery is a service that allows for backup and recovery of remote systems on a cloud-based platform. This enables data to be retrieved and restored on a virtual host in minutes.
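
        The backup-and-restore cycle can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: local temporary folders stand in for the primary site and the offsite data center, and the function names (`back_up`, `restore`, `demo`) and the sample file are hypothetical. A real cloud disaster recovery service copies whole virtual servers over a network, but the idea of copying, verifying, and restoring is the same.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def checksum(path: Path) -> str:
    """SHA-256 digest used to confirm a restored file matches the original."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def back_up(source: Path, offsite: Path) -> dict:
    """Copy every file under source to offsite; return {relative path: checksum}."""
    manifest = {}
    for file in sorted(source.rglob("*")):
        if file.is_file():
            relative = file.relative_to(source)
            target = offsite / relative
            target.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(file, target)
            manifest[relative.as_posix()] = checksum(file)
    return manifest


def restore(offsite: Path, destination: Path, manifest: dict) -> None:
    """Copy files back from offsite and verify each one against the manifest."""
    for relative, expected in manifest.items():
        target = destination / relative
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(offsite / relative, target)
        if checksum(target) != expected:
            raise IOError(f"checksum mismatch for {relative}")


def demo() -> bool:
    """Back up a folder, simulate losing it, restore it, and compare contents."""
    content = "id,total\n1,9.99\n"
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        primary = root / "primary"
        offsite = root / "offsite"      # stands in for the offsite data center
        recovered = root / "recovered"  # stands in for the new virtual host
        (primary / "db").mkdir(parents=True)
        (primary / "db" / "orders.csv").write_text(content)
        manifest = back_up(primary, offsite)
        shutil.rmtree(primary)  # simulate losing the primary site
        restore(offsite, recovered, manifest)
        return (recovered / "db" / "orders.csv").read_text() == content


if __name__ == "__main__":
    print("restored correctly:", demo())
```

        The checksum comparison is a design choice worth noting: a backup is only useful if the restored data can be shown to match what was originally stored.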

      • 4.2.3: Challenges of Cloud Computing

        You learned that cloud computing provides services such as applications, data servers, and computer networking. These services are normally hosted by a third-party provider in a data center or on a private cloud. Because it allows organizations to become more flexible, scale economically, and analyze data faster, cloud computing is often the primary choice. However, there are also challenges to consider before implementing cloud computing technology.

        Big data includes large amounts of both unstructured and structured data. Recent advances in technology allow organizations to store big data using traditional (non-virtual) and cloud-based (virtual) services. These systems play an important role in big data analytics. However, big data and cloud computing both have challenges. Remember to refer to your notes on the advantages and challenges of cloud computing.

    • Study Guide: Unit 4

      We recommend reviewing this Study Guide before taking the Unit 4 Assessment.

    • Unit 4 Assessment
