Topic outline

  • Unit 5: Data Analytics

    Data analytics is the "thinking" part of BI. Once the information has been mined, organized, and stored, the analyst must access it through structured queries. The analysis process applies rigorous methodologies to study information and interpret the results. Using these methodologies allows the analyst to determine how the information relates to the needs of their management team. Data analysis is often done using dashboards such as Tableau. Analytics is where information becomes intelligence. It is transformed from disparate data points that can be described in terms of data sets into patterns resulting from the analysis. This is where the real brainwork of the analytic process takes place. The methods are myriad and highly dependent upon the available inputs and requirements for your particular project.

    Completing this unit should take you approximately 8 hours.

    • Upon successful completion of this unit, you will be able to:

      • explain the difference between describing and analyzing data;
      • apply various analytic techniques to various datasets to make analytic estimates; and
      • determine what kinds of scenarios and simulations would be most useful for your business case.
    • 5.1: Overview of Data Analysis

      Often people speak authoritatively on issues, such as government policy, using anecdotes and providing value judgments. Anecdotal evidence describes one or maybe a few cases that may apply to a decision or a policy. In legal cases, this kind of evidence is circumstantial or based on specific circumstances in which all other things are held constant. As we say in the analyst biz, "one data point does not make a trend".

      To provide actionable intelligence for an important business decision, or any other decision, the analyst must back it with solid, replicable data and robust analysis that can withstand scrutiny. For example, your relative might tell you that "everyone who has their hip replaced gets MRSA" and point to three friends in one hospital. However, thousands, if not millions, of people have had hip replacements without complications. Your relative has fallen prey to availability bias – the only information they have is about these three cases. They may not realize they know a dozen people with hip replacements without mishaps. They dwell on the difficult cases that stand out to them – this is familiarity bias. Perhaps instead of cautioning people about these negative outcomes, your relative should investigate why this hospital has had so many recent MRSA cases. The results of that investigation would be based on data analysis and could potentially yield actionable results.

      Most people accept the information they have that resonates with them most and use the mental shortcut of descriptive information to make their decisions, not analysis, which takes more mental work. We see this in political discussions, which is why they can become so heated. Each side is hammered with confirmatory evidence of one shortcoming, real or imagined, and they make emotional decisions based on their unreliable access to descriptions that lead to confirmation bias. Analysts do not do this. Once you become accustomed to using analysis for work, you cannot resist using it for all aspects of your life, questioning every slightly odd pronouncement you hear from your friends and family and the news you read or hear. The analytical mind is both a blessing and a curse. It helps to have a naturally curious mind with the tenacity to get to the bottom of the story. In this way, analysts are not unlike the best journalists.

      When have you relied on anecdotal information to make a major decision? How did it work out?

      • Look at this simple flow chart that provides a basic analysis process. These are the preparation phases of analysis. This diagram does not include warehousing, where data is stored, and where much processing occurs.

      • A Fortune 500 insurance company hired an analyst team to assess how well its 2,000 independent agents used online marketing tools. The previous year, the company had provided a one-page web template to all the agents so that, at a minimum, a search for an agency's name would return a single result where people could find its contact information and perhaps a basic idea of the products it offered. A year later, the company hoped to discover that most of its agents had moved on to develop a more robust web presence. To conduct this analysis, the team worked with the company's strategic marketing team to identify what the company wanted its agents to use. For instance, was there a simple description of each of their products? Was there a one-click method to get an insurance quote? Were there links to the agents' Facebook, Instagram, Twitter, or other social media accounts? Had something been posted on these accounts in the past week? Once these "robust web presence" parameters had been established, the analyst team painstakingly reviewed all 2,000 websites and used a spreadsheet to mark an "X" where each agent had each desired item on their website. These were not weighted but only tabulated to give each agent a "web use score". These scores were analyzed to determine how close each agent was to a "perfect" score, meaning they had all the desired items. When the analysis was completed, a one-page snapshot was developed for each agency. It told them their overall score out of the total possible, showing what they were doing well and where they needed to improve.

        This was a classic case of using internal data for a business intelligence process to improve internal processes to increase market share. It is also a case of developing new indicators and a new analytic method to determine an outcome that met the client's requirements. The project could have continued and become a competitive intelligence project by assessing what share of their geographic market each agent had captured the previous year. This would likely have yielded more evidence for the company to persuade agents that more effective use of web-based tools could result in a competitive advantage...but ONLY if that is what the data indicated. One firm, for example, is one of the oldest in this company's agent "family". It is located within one block of the corporate headquarters. This agency only used a single template page and had a massive geographic market share. Its market was older, wealthier, traditional people with generational histories of using this firm. Their marketing was nil. Their business was based on families bringing in new drivers who needed auto insurance, newlyweds who wanted to explore life insurance, and so on. This model worked for them, but it is unclear whether it would remain sustainable as their customers became more web savvy. Thus, it is important to look at outliers, which may tell more of the story, but not to assume they tell the whole story.
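        The unweighted tabulation described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the team's actual spreadsheet: the checklist item names, the function names, and the example agency are all hypothetical.

```python
# Hypothetical checklist items; the case study names these kinds of features
# but does not give an official list.
CHECKLIST = [
    "product_descriptions",
    "one_click_quote",
    "social_links",
    "recent_social_post",
]


def web_use_score(site_features):
    """Unweighted tally: one point for each checklist item the site has."""
    return sum(1 for item in CHECKLIST if site_features.get(item, False))


def snapshot(agent_name, site_features):
    """Build the one-page summary: score, total possible, strengths, gaps."""
    present = [item for item in CHECKLIST if site_features.get(item, False)]
    missing = [item for item in CHECKLIST if not site_features.get(item, False)]
    return {
        "agent": agent_name,
        "score": web_use_score(site_features),
        "possible": len(CHECKLIST),
        "doing_well": present,
        "needs_improvement": missing,
    }


# Made-up agency with two of the four desired items marked with an "X".
example = snapshot(
    "Example Agency",
    {"product_descriptions": True, "social_links": True},
)
print(example)
```

        Because every item counts equally, the score only says how many desired items are present. It says nothing about market share, which is why an outlier like the single-page agency above needs separate attention.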

    • 5.2: Analytic Techniques

      Analytic techniques are methodologies used to find patterns in structured data. These can include common ones such as benchmarking, cost-benefit analysis, win-loss, and other scenarios, decision trees, demographic and psychographic analysis, geospatial analysis, and crime or purchase mapping. The novice analyst will use tested methodologies to identify appropriate processes for adding structure to unstructured data and then apply common analytic techniques, and rightly so. The more seasoned analyst, depending upon the requirements, will find new ways to categorize and highlight specific aspects of data that may already be in common use for operational or rote analysis, applying unusual techniques or even developing new ones to produce unique findings. This subunit will highlight a few common techniques, but an entire course could be devoted to the dozens available for various datasets and requirements.

      • The amount of data collected is staggering. The article was written in the middle of 2019; how much data is now collected daily? The National Security Agency monitors hot spots for terrorist activities using drone feeds. They admitted several years ago that analyzing what they had already collected would take decades, and the collection continues. The key to effective analysis is identifying the most relevant datasets and applying the correct analytic techniques, returning to our mix of art and science. As the article indicates, very little has been studied on removing uncertainty from the value of datasets growing daily. At least with BI, you are typically looking mainly at the data created within your firm, which places some limits on the amounts and type of data, but in a firm as large as, say, Amazon, imagine the amount of data created every day, not only at the point of purchase but in all of its hundreds (maybe thousands) of automated fulfillment centers around the world. Looking at figure 1, the 5Vs of Big Data characteristics, think about the challenges of the kinds and amount of data collected daily by your firm. Is it housed in a common system or different systems depending on the department collecting and using it? How would you characterize its various Vs? Is it manageable? What level and types of uncertainty would you assign to the various datasets you regularly work with?

      • 5.2.1: Decision trees

        Analysts can use decision trees to help decision-makers understand the likely outcomes of several decisions. It is useful to remember that these are typically also the basis for developing most computer algorithms.

        • This article outlines the structure, purpose, and use of decision trees. The nice thing about them is that they allow for chance, which can throw other analytic techniques into a tailspin. Remember our hapless weatherman, who carried out his analysis to a perfect outcome but was thwarted when his boss smelled french fries and ended up at the same fast food restaurant where the weatherman was hiding?

        • This article provides a step-by-step process on how to build a decision tree. Follow the logic and build a decision tree to decide something simple, but to which you can add a lot of variables, like what to have for dinner tonight. Chance occurrences include your child or roommate bringing a friend home unexpectedly. As noted, decision trees are inherently built on the cost-benefit analysis model but carry it out to the furthest degree possible. Decision trees can also include a financial component. Create a decision tree to decide what your next vehicle should be. This may inform your actual purchase or be a fantasy. Remember to stick to facts when you do your analysis exercises and try to remove any biases. There are true impediments to purchasing a Lamborghini when you are a student or have several small children.

        • This decision tree model uses more data types and provides ways to classify data. These graphics clarify how and why decision trees can be used. Follow the examples closely to see how these might be useful in your work. Have you used decision trees in your work? Does your firm have software or other drawing tools to help you create decision trees? Without a program, the analyst has to depend on their bias reduction skills, so it is best to have a team work together to ensure all possibilities are considered and inherent biases do not creep in to rot your tree.
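        As a rough illustration of how a decision tree accounts for chance, the sketch below "rolls back" a small tree for the dinner example: chance nodes are averaged by probability, and decision nodes take the best branch. The satisfaction scores, the 25% chance of an unexpected guest, and all names in the code are made-up assumptions, not values from the articles.

```python
def expected_value(node):
    """Roll back a tree: average chance nodes, take the best option at decisions."""
    kind = node["kind"]
    if kind == "outcome":
        return node["value"]
    if kind == "chance":
        # Branches are (probability, child) pairs whose probabilities sum to 1.
        return sum(p * expected_value(child) for p, child in node["branches"])
    if kind == "decision":
        # Options are (label, child) pairs; a rational decider picks the best.
        return max(expected_value(child) for _, child in node["options"])
    raise ValueError(f"unknown node kind: {kind}")


def best_option(decision_node):
    """Return the label of the option with the highest expected value."""
    label, _ = max(decision_node["options"], key=lambda opt: expected_value(opt[1]))
    return label


def outcome(value):
    return {"kind": "outcome", "value": value}


def guest_chance(no_guest_value, guest_value, p_guest=0.25):
    """Chance node: does someone bring a friend home unexpectedly?"""
    return {
        "kind": "chance",
        "branches": [
            (1 - p_guest, outcome(no_guest_value)),
            (p_guest, outcome(guest_value)),
        ],
    }


# Hypothetical satisfaction scores on a 0-10 scale.
dinner = {
    "kind": "decision",
    "options": [
        ("cook", guest_chance(no_guest_value=8, guest_value=4)),
        ("takeout", guest_chance(no_guest_value=6, guest_value=7)),
    ],
}

print(best_option(dinner), expected_value(dinner))
```

        The same structure works with a financial component, such as the vehicle exercise: replace the satisfaction scores with costs or net benefits.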

      • 5.2.2: Structured Analysis of Competing Hypotheses (SACH)

        One of the best general-purpose methodologies for intelligence analysis is Richards Heuer's Analysis of Competing Hypotheses (ACH). Heuer is often considered the "godfather" of modern intelligence analysis because of the rigor he applied to it. Heuer developed the method between 1978 and 1986 while an analyst at the Central Intelligence Agency (CIA). ACH draws on the scientific method, cognitive psychology, and decision analysis. This method became widely available for the first time when the CIA published online Heuer's now-classic book, The Psychology of Intelligence Analysis.

        • This article provides a nice overview of analytic tools and their value. While SACH is one of the best-known for reducing bias and ensuring all information is included, it is onerous for analysts to learn. Still, once they do, it is one of the best for ensuring a complete audit trail and keeping analysts "honest" by ensuring they include all evidence without bias. Adding automation to these tools makes them much easier to use, and as a result, the number of tools has grown and their use has become far more widespread in the past decade. What structured methods have you used? Which have proven easy and which difficult? Which would you recommend to a new analyst?

        • ACH is one of the best for ensuring a complete audit trail and keeping analysts "honest" by ensuring they include all evidence without bias. A country stability report is one of the most basic studies new intelligence studies students complete. They use ACH to determine whether an assigned country will likely be stable in the next 18 months. Yes or No are the only options. This does not help a decision-maker trying to ensure a region of the world remains stable; they need more information. Structured ACH allows the analyst to repeat the ACH exercise, drilling deeper into the analysis until the available evidence has been exhausted. For instance, the nation of Diania is expected to become unstable in the next 18 months. So what? The analyst can repeat the analysis with hypotheses positing that instability will be caused by H1: Inflation or by H2: Unemployment. H2 is disproven because the country has had high unemployment for the past ten years, and the informal economy now functions well enough that the official numbers no longer matter.

          H1, however, is of concern. Diania is in a tense standoff with neighboring Ruania over their shared main river access, with Diania starting to build a hydroelectric dam for a self-sufficient energy source. The dam will not be operational for five years. In the meantime, Ruania is threatening to double the price of the oil it now sells to Diania, and winter is coming. People are likely to be unable to afford fuel to heat their homes. In this case, the long-term energy independence strategy will have little value if people freeze to death next month due to inflated oil prices. NOW the decision-maker has something to work with beyond "likely to be unstable in 18 months". They can either support the dam building, provide a subsidy for the increased cost of oil, or offer to broker an agreement between the two nations. Adding structure helps ACH to provide decision-makers with far more actionable intelligence.

          Have you used ACH? Can you think of a recent decision at work or home in which using a SACH matrix might have helped you decide? Use an ACH matrix to decide between two open positions you have saved on your favorite job announcement app. Does ACH help you keep your bias out of the process and make it more objective? Or did it only show you that one position is more attractive for reasons you had not articulated before you undertook your analysis? Now you know what those reasons are, and even if your ACH shoots them down, you may still want that position... Even if ACH does not eliminate bias when applied to personal decisions, it can at least reveal what your biases are. Acknowledgment is half the battle against irrational decision-making.
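          A bare-bones ACH matrix can be sketched as a table of evidence rated against each hypothesis, where the hypothesis with the fewest inconsistent items survives. The sketch below uses the Diania example; the "C" (consistent), "I" (inconsistent), and "N" (neutral) ratings are illustrative assumptions rather than part of the original study, and the function names are hypothetical.

```python
HYPOTHESES = ["H1: Inflation", "H2: Unemployment"]

# Each evidence item is rated against the hypotheses in the order above.
# C = consistent, I = inconsistent, N = neutral. Ratings are assumed for
# illustration only.
MATRIX = {
    "Unemployment has been high for ten years": ["N", "I"],
    "Informal economy now functions well": ["N", "I"],
    "Ruania threatens to double the oil price": ["C", "N"],
    "Dam will not be operational for five years": ["C", "N"],
}


def inconsistency_counts(hypotheses, matrix):
    """Count the evidence items that are inconsistent with each hypothesis."""
    counts = {h: 0 for h in hypotheses}
    for ratings in matrix.values():
        for hypothesis, rating in zip(hypotheses, ratings):
            if rating == "I":
                counts[hypothesis] += 1
    return counts


def rank_hypotheses(hypotheses, matrix):
    """ACH works by disproof: least-contradicted hypotheses come first."""
    counts = inconsistency_counts(hypotheses, matrix)
    return sorted(hypotheses, key=lambda h: counts[h])


counts = inconsistency_counts(HYPOTHESES, MATRIX)
ranking = rank_hypotheses(HYPOTHESES, MATRIX)
print(counts)
print(ranking)
```

          Note that the ranking rests on inconsistent evidence only. Consistent evidence does not "prove" a hypothesis, because the same item can be consistent with several hypotheses at once.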

        • One of the most important lessons of effective use of ACH and many other analytic methods is to ensure you are not wedded to proving your brilliant hypotheses. The most effective approach to achieving analytic objectivity and, thus, accuracy is to work hard to disprove them. Having a diverse analytic team will help with this, especially if all members are competent enough to have confidence in their contributions and collaborative enough to invite conflict and disagreement about what the findings may seem to indicate. 

          The relationship between Alice and George illustrates the aggravating, frustrating team that, if they learn to communicate and check their biases and egos at the door, can ensure accuracy and, more importantly, the seeds for effective advocacy that ensures changes and correct decisions are made. The pressure George placed on Alice daily to prove that her findings along the way and in the end were accurate despite his disputation allowed her to have the certainty and confidence to keep challenging the common wisdom about the safety of x-rays on fetuses to ultimately save the lives of millions of children. Our work may not have that consequential or widespread result, but the results will have the same relative value in the context of our teams and our organizations.

          Have you ever tried to disprove your hypothesis or someone else's? Taking this approach as a single member of a team is akin to playing the "Devil's advocate" and can make you wildly unpopular, hence the name. Have you ever resented a team member who played that role? If so, did you ever recognize their value in the project? It has been shown that merely playing at being an oppositional advocate is not a truly effective method: a person who is contrary for contrariness' sake risks team cohesiveness. If the person playing the role is doing so for the right reasons, to ensure a lack of bias and full accuracy, the effect on the overall findings can be enormously positive, as in the case of Alice and George.

      • 5.2.3: Predictive Modeling

        Data science helps us collect, store, sort, filter, retrieve, and manipulate data. Applying predictive models to that data makes it so much more valuable as it can now support decision-makers in understanding what is, what can be, and, conversely, what they should not do.

        • This brief overview is a good introduction to thinking about predictive modeling.

        • This fascinating talk describes big data and how it can be used today to ensure everyone in your family can have their favorite pie. Is this why Big Data exists? Well, sort of... It can help find the tiniest niche products and connect them to the people who want them, so in a way, it helps you get the pie you want instead of always getting stuck with apple. However, it has and will have many other uses, some of which may sound scary but are still worth knowing about. How likely is it that learning machines will eliminate your job? What can you do to prepare yourself to deal with this future scenario? You might be able to apply a decision tree or ACH to help support your predictive analysis.

        • This article is a bit heavy on jargon for data scientists. Still, it makes the interesting case that what we often call prediction is only making inferences, identifying trends in data, and interpreting them, not using them effectively to predict what is likely to happen next. The article also makes the point that prediction may not be the endpoint of machine learning but that providing prescriptions on what to do about likely future outcomes will become the standard soon. Be sure to read carefully through the box office, marketing, and industry trend examples to see how to apply the concepts in the article.

        • Review this presentation and learn from its many examples of preparing data for improved analysis. However, it is optional, only provided for additional information, and is not central to your learning about predictive analytics.
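        To make the distinction between identifying a trend and predicting from it concrete, here is a minimal sketch: fit a straight line to past values (the trend), then extend it one period ahead (the prediction). The monthly figures and function names are made up for illustration, and a straight-line fit is only one of many possible predictive models.

```python
def fit_trend(values):
    """Ordinary least-squares line through (0, v0), (1, v1), ...; returns (slope, intercept)."""
    n = len(values)
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(values) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, values))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def forecast_next(values):
    """Extend the fitted trend one period past the last observation."""
    slope, intercept = fit_trend(values)
    return intercept + slope * len(values)


# Hypothetical monthly sales, in thousands of units.
monthly_sales = [10, 12, 14, 16]
slope, intercept = fit_trend(monthly_sales)
next_month = forecast_next(monthly_sales)
print(slope, intercept, next_month)
```

        The fitted slope describes what has happened. The forecast is an estimate of what may happen next, and it is only as good as the assumption that the trend will continue.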

      • 5.2.4: Other Popular Methods

        There are about as many analytic methods as there are seasoned intel analysts. Sometimes we need to apply tested methods, but sometimes, the project calls for us to create our own methods. This is the most adventurous and artistic side of things. Still, we must always ensure our data are backed by reliable sourcing and use and that our outcomes are replicable to be sure our method has validity.

        • Here is a small list of various analytic methods. How many of these methods have you used? How could these ideas be applied in your work?

        • This is a useful article for ensuring the validation of your statistical analyses. However, much of what a BI analyst does deals with qualitative data that may not as strictly adhere to the validation recommendations and requirements presented here. Within the field of intelligence analysis, much work has been done to identify ways to quantify qualitative assessments of validity, reliability, analytic confidence, and other aspects to ensure validation of intelligence findings, many modeled on statistical validation. Think about your most recent project, whether for work or school. How could you numerically and objectively evaluate the validity of your research?

    • 5.3: Real-World Problem-Solving

      When you work as a BI analyst, or simply as a person making decisions, everything you do relates to "real-world" problem-solving, often for your managers. For BI student analysts, it is important to simulate this real environment as much as possible. As work becomes more globalized, those able to work in remote collaborative teams will have the most versatile skills in the future.

      • This article describes what the human brain is doing when we define the problem (requirements and scope), plan for problem-solving (select datasets and filter or standardize and clean them for relevant information), and engage in the creative thinking process that is analysis. The author differentiates the creative process from the analytical process she terms "insight problem solving", but without creativity, the analyst would not know which methods to apply to the dataset and would have more difficulty expressing their findings in a way that is actionable for the decision-maker. To do this effectively requires a certain amount of empathy to understand what the decision-maker needs and in what format so that they can digest it most thoroughly and see the action steps needed for implementation.

        It is interesting to see the problem-solving process laid out in a neurological sense when it is second nature to seasoned analysts. The author describes Tversky and Kahneman's thinking processes that allow analysts to figure out big problems while driving home in light traffic as if on autopilot. One analyst claims to solve most problems by walking away from her computer, riding her bike, or going rollerblading. Sometimes, like figuring out where we left the remote, the important things are fleeting and can only occur to us when we are doing something else. How might you have solved an important thinking problem in an unlikely place or while doing something non-analytical?

      • This article uses algebraic logic to solve very specific problems. Although intelligence analysis can be a bit messier than algebra, the process is essentially the same. We use our information (datasets) and the questions we need to answer (requirements) to define our real-world problem. We use analytic techniques, rather than linear equations, as our roadmaps, and we find solutions (findings) that we communicate in a standardized language, ensuring our decision-maker understands the reliability of our information, our confidence in our analysis, and the degree to which our estimates are likely to be the future outcomes. We go from A to B, but not always in a straight line.

      • 5.3.1: Scenarios

        Scenarios place analysts in the role of the decision-maker or other figure whose decisions, influences, agendas, and profiles the analyst is attempting to model or forecast. Just as in the military, these games often include "Red Teaming", which means trying to anticipate what your adversary will do given certain conditions. In real military war games, the physical "red team" is challenged, and along the way, key options, needed equipment, sources, or something they expected to rely upon to win is taken away. The value of the exercise is to see how adaptive the military unit, or in this case, the analyst team, can be when environmental challenges present themselves and all requirements, timelines, and other elements of the process remain the same.

        • This blog entry speaks about role-playing in intelligence analysis in the context of a historically-oriented game. Write about a time when you have found yourself unable to rely on a key component of a complex plan, whether it was the weather, a ride, a working computer, an accomplice, etc. Were you able to adapt? If so, you have shown the creative insight needed to be a successful intelligence analyst.

        • This article gives an overview of what BI is and how it can provide actionable information for a business. It provides a brief testimonial on a single case study and a well-done video giving examples of how it can be used. A key point from the article is that the activity of conducting BI can be more broadly dispersed throughout organizations rather than left to a small cadre of analysts with special technical skills who "owned" the data but did not necessarily understand how different departments and managers could best use it. Constantly evolving dashboards and data packages make analysis more accessible. However, it is still important not to let the data be used randomly by anyone with access, as the results may not be sound.

          Consider when you could have used big data to answer a manager's question. Did you have instant access to all the information you needed, or did you have to ask an IT or other area specialist? Was it difficult to obtain? Did you end up getting what you needed? Did you have to spend a lot of time manipulating the data to put it into a usable form? How does your organization collect, organize and disseminate its data?

      • 5.3.2: Simulations

        Simulations are similar to scenarios, although today, simulations are often computational models representing some problem to be solved that might be too expensive or dangerous to attempt in the real world. These computer simulations enable analysts to see what happens in a given situation, like in red teaming, then ask what happens if something is changed. Simulations are often used to experiment with environmental conditions or to predict behavior, such as that of consumers in a marketplace when a new competitor is introduced.

        • This article describes using simulation programs for making IT decisions, but similar simulations are made to determine geopolitical and business outcomes based on specific conditions. These are different from scenarios as they are usually computer-based. In contrast, scenarios are typically role-played, even when they are "table-top exercises", meaning that the entire scenario environment has not been replicated but imagined in a symposium-like setting. Have you ever designed or participated in a simulation? Did it provide full information to inform the resulting decision-making?
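        As a toy example of the computer-based kind of simulation, the sketch below runs a simple Monte Carlo experiment: each customer independently switches to a new competitor with some probability, and the run is repeated many times to see the range of retained market share. The 20% switching probability, the customer count, and the function name are assumptions chosen purely for illustration.

```python
import random


def simulate_retained_share(n_customers, switch_prob, n_trials, seed=0):
    """Return (mean, lowest, highest) share of customers retained across trials."""
    rng = random.Random(seed)  # seeded so that a run can be replicated
    shares = []
    for _ in range(n_trials):
        # A customer stays when their random draw is at or above switch_prob.
        stayed = sum(1 for _ in range(n_customers) if rng.random() >= switch_prob)
        shares.append(stayed / n_customers)
    return sum(shares) / n_trials, min(shares), max(shares)


# "What if" question: a new competitor arrives and each customer has an
# assumed 20% chance of switching.
mean_share, low, high = simulate_retained_share(
    n_customers=1000, switch_prob=0.2, n_trials=200, seed=42
)
print(f"retained share: mean={mean_share:.3f}, range={low:.3f} to {high:.3f}")
```

        Changing `switch_prob` and rerunning is the simulation equivalent of asking what happens if something is changed.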

    • Unit 5 Study Resources

      This review video is an excellent way to review what you've learned so far and is presented by one of the professors who created the course.

      • Watch this as you work through the unit and prepare to take the final exam.

      • We also recommend that you review this Study Guide before taking the Unit 5 Assessment.

    • Unit 5 Assessment

      • Take this assessment to see how well you understood this unit.

        • This assessment does not count towards your grade. It is just for practice!
        • You will see the correct answers when you submit your answers. Use this to help you study for the final exam!
        • You can take this assessment as many times as you want, whenever you want.