Prediction and Inference in Data Science: 4. Applications from Entertainment

4. Applications from Entertainment

In my application domain, entertainment, I have found the duality of prediction and inference to be a useful consideration when developing strategy for a variety of different business challenges. Our group develops and deploys methods to model, understand, and influence consumer behavior and market systems using techniques including natural language processing, Bayesian inference, image recognition, multi-modal deep learning, matrix factorization, and more. Below, I will examine the problems of box office projection and advertising attribution as instructive examples of this duality.

4.1. Box Office Projection

The task of "box office projection" is to model the consumer market that generates revenue via ticket sales for the theatrical exhibitions of a film in one or more territories or worldwide. The most common approach to the task is to construct averages over the historical revenue performance of comparable films identified heuristically based on similarity of film content or production metadata. Model-based (regression) approaches are also frequently applied, with independent variables including production characteristics (such as the production budget of a film), talent characteristics (such as the "starpower" of an actor or director as measured from past box office gross or awards), the marketing support behind a film (such as the advertising expenditure and features describing the ad campaign strategy), measures of audience response (such as digital trailer views or volume of social media conversation), and more. For at least the past three decades, a wealth of literature on this task has been produced by the academic community, and many industry groups, including film producers and distributors as well as independent vendors, have invested in proprietary data collection and models for this task.
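To make the comparable-films approach concrete, the following is a minimal sketch rather than any studio's actual method. It assumes a hypothetical similarity heuristic (same genre, and a budget within a tolerance of the candidate's), and every film record and figure is invented for illustration.

```python
from statistics import mean

# Hypothetical historical films; all titles and figures are invented.
HISTORY = [
    {"title": "Film A", "genre": "action", "budget_musd": 150, "gross_musd": 400},
    {"title": "Film B", "genre": "action", "budget_musd": 120, "gross_musd": 310},
    {"title": "Film C", "genre": "comedy", "budget_musd": 40, "gross_musd": 90},
    {"title": "Film D", "genre": "action", "budget_musd": 30, "gross_musd": 60},
]

def comps_projection(candidate, history, budget_tolerance=0.5):
    """Average the gross of films with the same genre and a similar budget.

    The similarity rule here is an assumed heuristic, not a standard one.
    """
    low = candidate["budget_musd"] * (1 - budget_tolerance)
    high = candidate["budget_musd"] * (1 + budget_tolerance)
    comps = [
        film for film in history
        if film["genre"] == candidate["genre"]
        and low <= film["budget_musd"] <= high
    ]
    if not comps:
        return None, []  # no comparable films found
    return mean(film["gross_musd"] for film in comps), [f["title"] for f in comps]

projection, comps = comps_projection(
    {"genre": "action", "budget_musd": 130}, HISTORY
)
print(projection, comps)
```

Because the estimate is a plain average over a handful of films, it carries no statement of uncertainty on its own, which is one reason the regression approaches described above are often applied alongside it.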

Consider how the perspectives of §2 apply to this task. From the predictive perspective, the goal of box office projection is to predict the revenue generated by the theatrical release. This has value to help studios anticipate the financial outcome of a film, model the expected financial risk and return of their release portfolio, or analyze the strength of their expected competition on a release weekend. From the inferential perspective, the goal of box office projection is to understand the structure and dynamics of the theatrical market. This enables studios to articulate the properties of their film and the marketplace that generates risk for a release and to reason about how to alter production, marketing, and other factors under their control to optimize the return from each product.

Both of these sets of outcomes are of significant interest to studios. Neither perspective's set of outcomes is inherently better than the other's, but the two are different. Yet the predictive orientation has been most prominent in public interest and discussion. Near theatrical release (within a few weeks of a film's debut), predictions of box office models are routinely reported by the industry press. In this near-release regime, progress has been made in engineering and integrating digital signals from social and search platforms. Moreover, online prediction market communities offer non-model-based mechanisms for anticipating performance. Despite these advancements, variance in box office projections near the time of release is notoriously high. Earlier in the production lifecycle, typically years before the film's release, is the critical "greenlighting" stage, when a studio decides whether or not to invest in a film concept. The variance of possible outcomes during that stage is much higher still. Fundamental production and marketing variables may not have been set at that point and the future state of the market is much more difficult to foresee. Predictive modeling during greenlighting is therefore less common.

Given all this context, there is much to recommend inference as a high-leverage goal of box office projection. Inference allows studios to learn generalizable strategies for production that can be relied upon even in regimes where the absolute predictive outputs of the same model have high variance and limited utility for financial applications. Predictive modeling is widespread near theatrical release, but at this stage of the film lifecycle most production decisions have already been executed. The actual dollar value of the gross predicted by a box office projection model near release is not highly actionable. The most important outcome of this modeling, from the studio perspective, is the opportunity to adjust marketing and distribution strategy based on inferences about how predicted gross depends on factors such as audience awareness within different territories and demographics. In the greenlighting phase, predictive precision is highly degraded as described above, but inferences about variation in box office performance by production characteristics such as actor caliber, positioning (the genre framing of the film emphasized to audiences), and sensitivity to audience reception can be highly impactful for product development and release planning. Across all time periods, an understanding of uncertainty, both in the predicted outcome and in its relationships with the independent variables, is critical given the high variance inherent to the market and the portfolio management and risk mitigation goals of studios. While it need not be so, analysis of uncertainty is often absent from prediction-oriented modeling approaches for box office projection, as in many of the examples cited above.
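As one illustration of attaching uncertainty to an inferred relationship rather than only to a point prediction, the sketch below bootstraps the slope of a simple regression. It is a toy under stated assumptions: the data are synthetic (an invented "awareness index" against opening gross, generated with a true slope of 2.0), and the percentile bootstrap is just one of several reasonable ways to express this uncertainty.

```python
import random

def ols_slope(xs, ys):
    """Ordinary least squares slope of ys regressed on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx

def bootstrap_slope_interval(xs, ys, n_boot=2000, alpha=0.10, seed=0):
    """Percentile bootstrap interval for the OLS slope (resampling pairs)."""
    rng = random.Random(seed)
    n = len(xs)
    slopes = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        bx = [xs[i] for i in idx]
        by = [ys[i] for i in idx]
        if len(set(bx)) < 2:
            continue  # slope is undefined if every resampled x is identical
        slopes.append(ols_slope(bx, by))
    slopes.sort()
    lower = slopes[int((alpha / 2) * len(slopes))]
    upper = slopes[min(len(slopes) - 1, int((1 - alpha / 2) * len(slopes)))]
    return lower, upper

# Synthetic data (invented): awareness index vs. opening gross in $M.
data_rng = random.Random(42)
awareness = [10 + 5 * i for i in range(20)]
gross = [2.0 * a + data_rng.gauss(0, 20) for a in awareness]

slope = ols_slope(awareness, gross)
lower, upper = bootstrap_slope_interval(awareness, gross)
print(f"slope = {slope:.2f}, 90% interval = ({lower:.2f}, {upper:.2f})")
```

The interval on the slope is the inferential quantity of interest here: it describes how confidently one can say that gross depends on awareness, independent of how precise any single film's predicted gross may be.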

4.2. Advertising Attribution

In advertising, the multi-channel attribution modeling task is to allocate the value of a consumer conversion (a behavior such as a product purchase or website visit) across the individual "impressions" that causally contributed to that outcome. Impressions are defined as advertisement exposures on different channels, such as television and online social media, or "organic" interactions with a brand such as word of mouth. This modeling enables measurement of the effectiveness of each channel, or "platform," in influencing consumer behavior.

However, rigorous classical causal attribution modeling is not possible in the practical context of most advertising campaigns. It is prevented by incomplete individual-level data on consumer exposure across key online and offline platforms, a lack of consumer conversion data (particularly for offline behaviors), a lack of integration between exposure and conversion datasets when they are available, and an inability to randomize exposure at the individual level. In particular, in the U.S. film industry, the vast majority of tickets are purchased at the brick and mortar box office, and hence not associated with the consumer's identity by digital tracking; there is little or no ability for studios to capture individual ad exposure logs for many major advertising channels, including broadcast and cable television and online social media. In practice, researchers generally need to accept data that are missing by platform (introducing substantial systematic errors associated with non-attributed platforms), data that are missing by person at random (introducing substantial sampling error depending on the number of observations achieved), and/or data that are missing by person not at random (introducing systematic errors based on demographic, platform usage, or other factors that explain the missingness). It is common, for example, to only apply attribution models to a small subset of available marketing channels where data are more readily available or to a "panel" of consumers that have opted in to more detailed tracking, which may have small sample size and may not be representative of the general population.
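The effect of data that are missing not at random can be illustrated with a small simulation. This is a sketch under invented assumptions: heavy platform users are assumed both more likely to convert and more likely to opt in to a tracking panel, and all of the probabilities below are hypothetical.

```python
import random

def simulate_panel_bias(n_people=50000, seed=0):
    """Compare the conversion rate in an opt-in panel with the full population."""
    rng = random.Random(seed)
    population = []
    panel = []
    for _ in range(n_people):
        heavy_user = rng.random() < 0.30  # assumed share of heavy users
        # Assumed: heavy users convert more often...
        converted = rng.random() < (0.10 if heavy_user else 0.02)
        # ...and are also more likely to opt in to detailed tracking.
        opted_in = rng.random() < (0.20 if heavy_user else 0.02)
        population.append(converted)
        if opted_in:
            panel.append(converted)
    population_rate = sum(population) / len(population)
    panel_rate = sum(panel) / len(panel)
    return population_rate, panel_rate, len(panel)

population_rate, panel_rate, panel_size = simulate_panel_bias()
print(f"population: {population_rate:.3f}, panel: {panel_rate:.3f} (n={panel_size})")
```

Under these assumptions the panel overstates the population conversion rate, and collecting a larger panel would not remove the gap because the error is systematic rather than a sampling error.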

Predictions from attribution models for individual consumer behavior, or indeed bulk predictive performance measures for attribution models, should therefore not be taken at face value. They will depend sensitively on the aforementioned systematic sources of error, and hence they may not generalize well to real world scenarios. For example, an attribution model incorporating the effect of web display and television ads may not be a reliable predictor of the actual purchase behavior of a consumer who is also influenced by social media ads, not to mention word of mouth and other organic channels.

Nonetheless, the output of attribution models can provide a critical input to other important models in the marketing domain. Measurements of platform effectiveness can be integrated with or provide comparisons for media mix models, which identify the optimal distribution of a media budget across available advertising platforms, and models for bid optimization, which identify the appropriate value of an individual advertising impression. In this way, attribution models can inform decisions made by advertisers about aspects of campaigns they directly control, although the dependent variable (individual consumer product purchasing choices) and unobservable variables (platform effectiveness measures) of attribution models themselves are not directly controllable. The accuracy of the platform effectiveness measurements from the attribution model may be independently validated by the predictive performance of these dependent models.

One may view attribution modeling as inherently a problem of statistical inference: the intent is to measure an unobservable parameter (platform effectiveness). Indeed, Ji, Wang, and Zhang and Lei, Sanders, and Dawson explicitly formulate attribution modeling as a Bayesian inference task.
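To show what treating platform effectiveness as an unobserved parameter can look like, here is a deliberately simple conjugate sketch. It is not the formulation used by Ji, Wang, and Zhang or by Lei, Sanders, and Dawson. It assumes a single platform, a Beta(1, 1) prior on conversion rates, exposed and unexposed groups that are otherwise comparable, and invented counts.

```python
import random

def beta_posterior(conversions, trials, prior_a=1.0, prior_b=1.0):
    """Posterior Beta(a, b) parameters for a conversion rate under a Beta prior."""
    return prior_a + conversions, prior_b + (trials - conversions)

def prob_positive_lift(exposed, control, n_draws=20000, seed=0):
    """Monte Carlo estimate of P(exposed rate > control rate | data)."""
    rng = random.Random(seed)
    a_e, b_e = beta_posterior(*exposed)
    a_c, b_c = beta_posterior(*control)
    wins = sum(
        rng.betavariate(a_e, b_e) > rng.betavariate(a_c, b_c)
        for _ in range(n_draws)
    )
    return wins / n_draws

# Invented counts: (conversions, consumers observed).
exposed = (30, 1000)   # saw ads on the hypothetical platform
control = (15, 1000)   # did not

a, b = beta_posterior(*exposed)
posterior_mean = a / (a + b)
p_lift = prob_positive_lift(exposed, control)
print(f"posterior mean rate (exposed) = {posterior_mean:.4f}")
print(f"P(lift > 0 | data) = {p_lift:.3f}")
```

The output is a distribution over the effectiveness parameter rather than a single number, which is what allows its uncertainty to be carried into downstream models.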

However, as in all supervised learning tasks, inferences from attribution models must be calibrated on the basis of their predictive performance on observed outcomes. Because platform effectiveness is an unobservable parameter, there is no ground truth against which to directly validate inferences about it, similar to the stellar physical parameters inferred from supernova observations discussed in §2. Therefore, Ji et al. and Lei et al. both assess inferences from their models based on predictive performance on consumer behavioral data, using measures such as the AUC, F1-score, and pointwise predictive density. While, as in the box office projection case, the variance of these individual predictions may be high, a rigorous inference procedure will assess the uncertainty of inferences on quantities such as platform effectiveness measurements, characterize their dependency on other model parameters and assumptions, and test their sensitivity to model mis-specification related to issues like platform coverage. In this way, advertisers can extract meaningful and reliable information about advertising channels despite limitations in predictive precision.
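For readers unfamiliar with the AUC mentioned above, the sketch below computes it directly from its rank interpretation: the probability that a randomly chosen converter receives a higher predicted score than a randomly chosen non-converter, with ties counted as one half. The labels and scores are invented, and the pairwise loop is chosen for clarity rather than efficiency.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of scores."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = sum(
        (p > n) + 0.5 * (p == n)  # ties count as half a win
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))

# Invented example: 1 = converted, 0 = did not convert.
labels = [1, 0, 1, 0, 0, 1]
scores = [0.9, 0.3, 0.6, 0.7, 0.2, 0.4]
print(f"AUC = {auc(labels, scores):.3f}")
```

A value of 0.5 corresponds to scores that rank converters no better than chance, and 1.0 to a perfect ranking.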

4.3. Industry Generalizations

Both examples in this section illustrate applications where neither a prediction-oriented nor an inference-oriented perspective is by itself adequate to extract all the available value from the data and modeling investments made by businesses. A balanced perspective, able to extract information and insights from the modeling process while also using predictive measures to study the reliability and boundaries of those inferences, should be preferred.

The examples in this section also showcase the role of inference and prediction in different regimes of decision power. In some circumstances, companies or other actors will have direct control over an independent variable in a model, and therefore indirect decision power over the outcome from a system (modeled as the dependent variable). An example is casting decisions in film production, which contribute to box office performance. In this regime, inferences about the role of the independent variable in the system are directly actionable as they can provide decision support for choices made about that independent variable. In another regime, the actor may have much more tenuous decision power over the dependent variable (or even none at all). Examples would include models to predict macroeconomic trends or attribution models applied to measure the latent effectiveness of media platforms. In this regime, inferences from models of systems lacking decision power can inform choices made in related contexts. For example, inferences about the role of housing start rates in predicting macroeconomic outcomes can support the use of housing starts as a leading indicator in making investment or product release decisions, and inferences about platform effectiveness are actionable because they inform media mix models used to make decisions about media spending on different platforms. Model design processes for data science in industry should assess the actionability, i.e., the decision support role, of both inferential and predictive aspects of models.
