Narrative Visualization and Storytelling

Narrative structures include events and visualization of characters. An example narrative can be a simple interface that presents trends in keywords over time. Narrative visuals contain the transitions between events. Creating a narrative visualization involves "using a tool to visually analyze data and to generate visualizations via vector graphics or images for presentation" to decide "how to thread the representations into a compelling yet understandable sequence". Plowman et al. report that a narrative specifically refers to the macro-structure of a document, in contrast to the term story, which refers to both structure and content. This structuring of evidence, combined with the choice of appropriate rhetorical strategies, is referred to as "the art of storytelling" among literary scholars. Research in narrative visualization points to visualization features that afford storytelling, including guided emphasis and structures for reader-driven storytelling. It also covers the principles that govern effective structuring of transitions between consecutive visualizations in narrative presentations, and how different tactics for sequencing visualizations are combined into global strategies in formats like slideshow presentations. Because of their importance, we cover transitions separately in Section 5 and Section 6.

A narrative can be seen as a macro-structure which creates global coherence, contributes to local coherence and aids recall through its network of causal links and signposting. The focus of Plowman et al.'s research is how students make sense of their learning with multimedia by constructing their own narratives in conjunction with the narrative guidance. The design elements presented by the software constitute narrative guidance and can be a combination of features specific to interactive media, such as the need for clear navigational procedures, with features associated with traditional media, such as recognizable narrative and a clear relationship between tasks and the macro-narrative.

Plowman et al. have developed three versions of Galapagos as a research tool, based on extended observation of students using commercially available CD-ROMs in schools.

The linear version is designed in such a way that students are led through the eight sections of the CD-ROM in sequence and it is closest to a traditional narrative as presented in educational television. The linear version presents a high degree of narrative guidance and little opportunity for learners to decide their own narrative path so they have relatively little control. The resource-based version offers no guidance through the CD-ROM sections and leaves students to define their own route. There is very little narrative guidance offered and learners have to construct a narrative by making decisions about sequence, so there is a high degree of user control and heavy use of the menu to decide the route. The guided discovery (GDL) version offers guidance in breaking down the task by providing paths through the material, questions to stimulate enquiries, and direction to specific resources. The GDL version was designed to offer a balance between narrative guidance and support for narrative construction and this is reflected in a more even balance between user and system control. Learners are able to determine sequence and their course of action but are offered guidance in doing so. Most sections (seven) are accessed by the GDL users because the guide encourages them to be interactive in their approach and to use the material to support their response.

All papers in this section develop methods or structures for improving narrative storytelling visualization. Viegas et al. present methods for improving data memorability. Fisher et al. present ways of tracking narrative events over time. Segel and Heer investigate the design of narrative visualizations and identify techniques for telling stories with data. Hullman et al. design the structure of a visualization to present storytelling. Figueiras studies how to incorporate narrative elements as storytelling elements. Again, these papers may cover more than one topic in Table 1. The borders between categories are not strictly black and white. We place papers in the category reflecting their main focus.

An overview of the visualization methods used in storytelling for visualization can be found in Figure 2. We include it in the section on narrative visualization since this is where the most research has been done. We can observe that most of the visualization designs used are familiar.


Narrative Visualization for Linear Storytelling

The literature in this sub-section focuses on narrative visualization using linear automatic or semi-automatic approaches (as opposed to interactive approaches). The research here involves tools and techniques with an emphasis on how stories are created.

Hullman et al. describe a system called Contextifier, which automatically produces custom, annotated visualizations from a given article. The system architecture contains four main components: a news corpus consisting of a large set of news articles; a query generator that identifies the most relevant company in the article; an annotation selection engine that integrates selected features into an annotation; and a graph generator that generates line graphs using annotations and series. The flow of information can be seen in Figure 13.

Figure 13. Hullman et al. show the architecture of Contextifier (left) and illustrate parallelism in sequencing in the NYT Copenhagen (right).

Hullman et al. is based on previous work on storytelling in visualization and on Kandogan's automatic annotation analytics. It develops a system that can automatically generate custom, annotated visualizations from a news article about a company. Hullman's work places more emphasis on providing background information or perspective on the data than Kandogan's.

Hullman et al. outline how automatic sequencing (choosing the order in which to present visualizations) can be approached in designing systems that help non-designers navigate structuring decisions when creating narrative visualizations. Their focus is on how linear, slideshow-style presentations can be optimized by incorporating knowledge of sequencing styles and strategies. They argue that analysts using narrated data presentations could be aided by tools for identifying effective sequences of visualizations. They conduct a qualitative analysis of the structural aspects of 42 examples of explicitly guided professional narrative visualizations. One example is shown in Figure 13. Informed by this analysis, they propose a graph-driven approach for finding effective sequences for narrative visualizations, which includes defining data attributes for transitions, labelling, and maintaining consistency. The results suggest a need for more sophisticated global constraints than simply summing local transition costs to determine the best path through a graph of weighted visualization transitions. This paper is based on previous work on narrative sequencing and narrative visualization, and demonstrates that narrative sequencing can be systematically approached in visualization systems.
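
The baseline mentioned above, summing local transition costs along a path through a graph of weighted visualization transitions, can be illustrated with a minimal sketch. The chart names, the cost values, and the exhaustive search are assumptions made for illustration only; they are not taken from Hullman et al.

```python
from itertools import permutations

# Hypothetical transition costs between four charts; a lower cost means the
# reader has less to re-interpret when moving from one chart to the next.
COSTS = {
    ("A", "B"): 1, ("B", "A"): 1,
    ("A", "C"): 4, ("C", "A"): 4,
    ("A", "D"): 3, ("D", "A"): 3,
    ("B", "C"): 1, ("C", "B"): 1,
    ("B", "D"): 5, ("D", "B"): 5,
    ("C", "D"): 1, ("D", "C"): 1,
}


def sequence_cost(order, costs):
    """Sum the local cost of each consecutive transition in `order`."""
    return sum(costs[(a, b)] for a, b in zip(order, order[1:]))


def best_sequence(charts, costs):
    """Exhaustively search all orderings and return the cheapest one."""
    return min(permutations(charts), key=lambda order: sequence_cost(order, costs))


best = best_sequence(["A", "B", "C", "D"], COSTS)
print(best, sequence_cost(best, COSTS))
```

A purely local sum like this has no way to reward global patterns across the whole sequence, which is the kind of limitation that motivates the more sophisticated global constraints noted above.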

Amini et al. identify the high-level narrative structures found in professionally created data videos and identify their key components. They derive broader implications for the design of an authoring tool to enable a wide audience to create data videos. Amini et al. perform two studies to improve the understanding of data videos. First, they conduct a qualitative analysis of 50 data videos from 8 reputable online sources, and observe that data video categories are hierarchical and can be further decomposed into units: sequences that put forward different points contributing to a single category. Second, they design a series of workshops to observe how professional storytellers create data video storyboards. They observe that the creation process is non-linear and iterative. Amini et al. is based on previous work on storytelling and storytelling in information visualization.

Bach et al. develop graph comics for data-driven storytelling to present and explain temporal changes in networks to an audience. Bach et al. present six steps to guide graph comics design. See Figure 14.

Figure 14. Bach et al. present graph comics for data-driven storytelling.

They first collect diagrams, comic literature, and pictures within comics to understand traditional comics structure. The second step is to find possible visual encodings that can represent graph objects, their properties, and the changes they may undergo. They then formulate design principles that define when certain visual marks and their attributes can and cannot be used. They draw on their design challenges and the structural principles to create comics. Finally, they contact two domain experts to collect external feedback and present a qualitative study to check whether graph comics are readable by a wider audience. Bach et al. is based on previous work on network exploration and data-driven storytelling.


Narrative Visualization for User-Directed and Interactive Storytelling

The literature in this subsection focuses on interactive, user-driven narrative visualization (as opposed to automatic or semi-automatic). In other words, the papers focus on techniques that enable users to create narratives interactively. Viegas et al. summarize two methods of visualizing email archives with the aim of improving the memorability of the data. Both focus on the higher-level patterns of the user's email habits. The original goal was for these visualizations to uncover social patterns in the archive, but the resulting visualizations caused users to be reflective about the data rather than analytic. Users look at data points and want to recall the story behind them, and even share the visualization with friends. See Figure 15.

Figure 15. Viegas et al. show the PostHistory visualisation. On the left is the calendar view, showing 365 squares to represent each day of the year (this image only shows data up until May). Size corresponds to the volume of email sent on that day. The colour highlights a specific recipient that has been selected in the contact panel (right). The contact panel shows all the contacts the user has been corresponding with over the year. A contact can be selected to highlight their interaction in the calendar view.

For visualizing email activity, the two axes stand for time and for the dyadic relationship between the account holder and each correspondent. Pattern recognition includes interaction frequency, interaction rhythm, interaction balance, and archive size. The visualization interface includes two main panels: the calendar panel, showing email intensity, and the contacts panel, showing the names of the people being interacted with. When the user clicks on a day square in the calendar panel, the contacts panel highlights the names of the people communicated with that day. A name can be clicked on in the contacts panel, and each day on which that person corresponded will be highlighted in the calendar panel. The contacts panel can be viewed as an animation transitioning through the year of data. The email header data is used to derive the social context of the communication. Five different relationship types are classified. A relationship can be either direct between correspondents or through mutual recipients in group emails. The Social Network visualization looks at each message, evaluates the role of the user (through the email address used, i.e., work, school, or personal), and makes connections regarding the interaction accordingly. This data is visualized as an animation that evolves over time. Each second represents one day in the archive. A clustered word cloud is used to display the data. Previous visualizations of online social interaction data have focused on unravelling the data from the researchers' perspective, whereas these visualizations are for the benefit of the user.
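
The linking between the calendar panel and the contacts panel described above can be sketched with two lookups built from email header data. This is a minimal sketch under stated assumptions: the function name `build_indexes`, the `(day, contact)` input format, and the sample messages are hypothetical and are not taken from the PostHistory implementation.

```python
from collections import defaultdict
from datetime import date


def build_indexes(messages):
    """Build the lookups behind the two linked panels.

    `messages` is an iterable of (day, contact) pairs, one per email.
    """
    volume = defaultdict(int)            # day -> number of emails (square size)
    contacts_by_day = defaultdict(set)   # day -> contacts to highlight
    days_by_contact = defaultdict(set)   # contact -> days to highlight
    for day, contact in messages:
        volume[day] += 1
        contacts_by_day[day].add(contact)
        days_by_contact[contact].add(day)
    return volume, contacts_by_day, days_by_contact


# Hypothetical sample data, for illustration only.
MESSAGES = [
    (date(2002, 1, 3), "alice"),
    (date(2002, 1, 3), "bob"),
    (date(2002, 1, 3), "alice"),
    (date(2002, 1, 9), "alice"),
]

volume, contacts_by_day, days_by_contact = build_indexes(MESSAGES)
print(volume[date(2002, 1, 3)])      # emails on the clicked day
print(days_by_contact["alice"])      # days to highlight for a clicked contact
```

Clicking a day square corresponds to a lookup in `contacts_by_day`, and clicking a name corresponds to a lookup in `days_by_contact`.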

Hullman and Diakopoulos state that narrative information visualizations are a style of visualization that often explores the interplay between aspects of both exploratory and communicative visualization. This work contributes to information visualization design and theory by providing insight into the types and forms of given rhetorical techniques in narrative visualizations, and the interaction between those techniques and the individual and community characteristics of end users. The authors study how rhetorical techniques are used in visualization. They then investigate the resulting effects of these techniques on user interpretation. The authors collect 51 professional narrative visualizations, e.g., from the New York Times and the BBC. Each visualization is "coded" using theory from semiotics, statistics, decision theory, and media and communication studies. The visualizations are categorized according to a selection of rhetorics: information access, provenance rhetoric, mapping rhetoric, procedural rhetoric, and linguistic rhetoric. Their work provides a taxonomy of specific information presentation manipulations used in narrative visualization. See Figure 16.

Figure 16. Hullman and Diakopoulos demonstrate how data can be window dressed to change the viewer's opinion of it. These two images visualize the same data, but each illustrator has a different intended outcome. The top image shows an unstructured, complicated graph of conflicting colors and shapes, clearly intended to confuse and obscure the data, whereas the bottom image lays the data out in a simple fashion using consistent shapes and colors.

In the Mapping America visualization example, the United States Census represents a nationwide attempt to provide an objective view of the demographics of the country. Ethnic group is mapped to color, samples are shown on a map, and a single ellipse represents 200 people. The poll visualization summarizes the accuracy of political poll predictions from several years and polling agencies in a small multiples presentation of vertical line graphs. Colored bars representing the political parties are drawn to connect data points positioned on the y-axis according to the amount of time prior to the election and on the x-axis according to whether the predictions fell over (to the right) or under (to the left) a centred vertical line representing complete accuracy (an error of zero).

Hullman and Diakopoulos is based on the previous work of Segel and Heer which makes an initial step towards highlighting how varying degrees of authorial intention and user interaction are achieved by general design components in narrative visualization. This work examines the design and end-user interpretation of narrative visualizations in order to deepen understanding of how common design techniques represent rhetorical strategies that make certain interpretations more probable.

A visualization with a narrative is set apart from a visualization without through both its structure and its content. A narrative-based visualization attempts to create a natural flow whereby the data has an obvious progression and therefore permits easier understanding and memorability.

Figueiras takes professionally produced visualizations as case studies to analyze how to incorporate narrative elements as storytelling elements. By presenting prototypes of storytelling in selected case studies, Figueiras presents a design study and model for narrative visualization by using storytelling techniques.

In the "How many households are like yours" visualization, users can choose the primary residents and secondary members of a household, then get the number and percentage of matching households. Figueiras introduces short stories describing different kinds of families instead of having only one article about types of families. This technique engages the user with a focus on creating empathy.

"What does China censor online?" is a tag cloud that only has a title and text shaped on a map of China. Figueiras introduces tooltips that pop up when a user clicks on a region and provide more detailed information. See Figure 17. The tooltips provide additional context in the form of text, which helps explain the possible reasons for censorship.

Figure 17. Figueiras shows a visualization of Chinese online censorship enhanced with storytelling. An interactive feature is added so that the user can click on an instance of censorship to learn more about it. This supplies context to the user and also may draw an empathetic response from the user.

The "Death Penalty Statistics, Country by Country" figure is a static visualization in which bubbles of different sizes represent the number of death sentence rulings. Figueiras designs an interaction such that when a user chooses a year, a graph displays the number of death sentences handed out that year, which provides extra temporal information and recasts the visualization as a story.

The following narrative strategies are described:

  1. Context: Providing context to a visualization enables the user to make sense of the data using additional information. Without a sufficient amount of context, less meaning can be derived from the data, whereas the addition of context gives the user more information to explore the data and begin to understand features found within it. This is made easier by the development of interactive visualizations and the ability for users to choose what layers of information they see.
  2. Empathy: Although not often associated with information visualization, it has been found that emotive/empathetic visualizations are more memorable and more enjoyable for the user.
  3. Time Narrative: Utilizing the temporal nature of data in visualization allows users to mentally map the data by adding a sense of story flow. This improves user memorability and aids in the understanding of the data.

Figueiras is based on previous work on storytelling and narrative visualization, and develops a model for adding storytelling to narrative visualization.

Storytelling aims to simplify concepts, create an emotional connection, and help the audience retain information. Figueiras presents the results of a focus group study that collects information on narrative elements. She then suggests strategies for storytelling in visualization.

Sixteen participants are asked to study 11 information visualizations of different types and with different characteristics (interactive, non-interactive, introductory text, accompanying article, and audio narration). They are then asked to rate the visualizations in terms of comprehension, navigation, and likability. See Figure 18. The participants give high scores to all visualizations, particularly to interactive visualizations. The study suggests that a good storytelling visualization is well-structured, interactive, and in line with audience preferences. The results of the user study suggest that interactivity, the option of drilling down, context, and a sense of relatability are important for users to feel engaged.

Figure 18. Figueiras shows the visualizations used in the focus group study and the elements that compose them.

Figueiras is based on previous work on narrative visualization and storytelling visualization. The author uses a focus group to examine storytelling effects in information visualization and storytelling visualization.

Nguyen et al. develop a new timeline visualization, SchemaLine, to gather, represent, and analyze information. They then use a preliminary study to evaluate its effectiveness. See Figure 19.

Figure 19. Nguyen et al. present the SchemaLine system.

The system contributions include: a visual design for an interactive timeline that groups notes into schemas determined by the analyst; an algorithm to automatically generate a compact and aesthetically pleasing visualization of these schemas on the timeline; and a set of fluid interactions with the timeline to support the sensemaking activities defined in the Data-Frame model. Their work is based on previous work on timeline visualization and sensemaking with timelines.


Narrative Visualization for Storytelling in Parallel

In this category of literature, the structure of events is laid out in parallel. The research here focuses on tools and techniques that create multiple narratives simultaneously. These can be useful for groups.

Information visualization systems enable users to find patterns, relationships, and structures in data, which may help users gain knowledge or confirm hypotheses. The most basic element in a narrative is a character. An event occurs through the interaction of a set of characters. In this paradigm, a scene consists of a chunk of events, a story consists of a sequence of scenes, and a world model is made up of a set of stories. Akaishi et al. propose several methods for the visualization of chronological data based on the narrative structure of a document. Akaishi et al. map each narrative component (world model, story, scene, event, character) onto elements of a document (set of stories, sequence of scenes, part of sentence, sets of terms). The system features a decomposition unit and a composition unit. A set of stories is stored in a database by the decomposition unit. In the database, each story is divided into scenes, forming a world model. Appropriate scenes are selected and used by the composition unit to compose a new story. When a user accesses the information, the software provides the results as a story. The story is presented in various ways.
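
The narrative hierarchy described above (character, event, scene, story, world model) and the role of the composition unit can be sketched as nested data structures. This is an illustrative sketch only: the class names, the `compose_story` helper, and its selection rule (keep scenes that mention a given character) are assumptions and do not reproduce the system of Akaishi et al.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Event:
    """An interaction among a set of characters."""
    characters: List[str]


@dataclass
class Scene:
    """A chunk of events."""
    events: List[Event] = field(default_factory=list)


@dataclass
class Story:
    """A sequence of scenes."""
    scenes: List[Scene] = field(default_factory=list)


@dataclass
class WorldModel:
    """A set of stories."""
    stories: List[Story] = field(default_factory=list)


def compose_story(world: WorldModel, wanted: str) -> Story:
    """Select scenes that involve `wanted` and compose them into a new story.

    The selection rule is a hypothetical stand-in for "appropriate scenes".
    """
    selected = [
        scene
        for story in world.stories
        for scene in story.scenes
        if any(wanted in event.characters for event in scene.events)
    ]
    return Story(scenes=selected)
```

For example, calling `compose_story(world, "alice")` on a world model with several stories returns a new story that contains only the scenes in which the character "alice" takes part, in their stored order.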

The dependency relationship among terms forms a directed graph, called a Word Colony. In a Word Colony, interdependent terms are embedded into the same node. The strength of term dependence is mapped onto the distance between the nodes of the terms, and term attractiveness is mapped onto the size of the node. To visualize this relationship, Akaishi et al. use a spring model graph, which gives a visual overview of a document. NANA represents the content of a document as a topic sequence and a topic matrix. The topic sequence is regarded as the graphical plot of a document, and the topic matrix represents the relationships among several topic changes. Akaishi et al. support users' efforts to find desired parts of documents and to guess the context (plot) of the document.
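
The idea of embedding interdependent terms into the same node of a directed dependency graph can be sketched as follows. This sketch assumes that "interdependent" means two terms depend directly on each other; that reading, the function name `merge_interdependent`, and the input format are assumptions for illustration, and the spring model layout itself is not shown.

```python
def merge_interdependent(edges):
    """Group terms linked by mutual dependency into shared nodes.

    `edges` is a set of (term, depends_on) pairs.
    Returns (nodes, merged_edges), where each node is a frozenset of terms.
    """
    parent = {}

    def find(term):
        # Union-find lookup with path halving.
        parent.setdefault(term, term)
        while parent[term] != term:
            parent[term] = parent[parent[term]]
            term = parent[term]
        return term

    for a, b in edges:
        root_a = find(a)
        root_b = find(b)
        if (b, a) in edges:
            # Mutual dependency: both terms share one node.
            parent[root_a] = root_b

    groups = {}
    for term in list(parent):
        groups.setdefault(find(term), set()).add(term)
    node_of = {term: frozenset(group) for group in groups.values() for term in group}

    # Keep only the edges that connect two different merged nodes.
    merged_edges = {
        (node_of[a], node_of[b]) for a, b in edges if node_of[a] != node_of[b]
    }
    return set(node_of.values()), merged_edges
```

Given the dependencies a to b, b to a, and b to c, the terms a and b end up in one node with a single edge to the node containing c.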

Narrative is a simple interface that straightforwardly presents trends in keywords over time. Fisher et al. present narrative as a way of presenting temporally dynamic data. In this case, narratives help the user by tracking concepts found in news stories that change over time. Fisher et al. show how to piece together complex information and examine multiple variables. See Figure 20.

Figure 20. Fisher et al. show daily references to four US presidential candidates from 1 January to 26 March 2008. Time passes along the x-axis for each candidate; number of mentions of the term along the y-axis.

The first step is based on a business analysis task to find trends and public relations. In this case study, the requirement is to find out how a topic has developed over time and to see the evolution of the latest and most interesting stories. The system design includes data acquisition, temporal visualization, the use of other tools for correlation, understanding readership, and adding features to narratives. The narratives project is based on Microsoft's Live Labs, which provides real-time data acquisition. Temporal visualization enables us to see how a small group of words evolves over time relative to one another. By analyzing the form of correspondence and understanding readership, additional features can be added to the narratives project. Fisher et al. is based on previous work in topic detection and tracking and in temporal visualization, and presents narrative as a new technique in visualization.
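
Tracking how a small group of keywords evolves over time, as in Figure 20, amounts to counting mentions per keyword per day. The following is a minimal sketch, assuming articles arrive as `(day, text)` pairs and using naive whitespace tokenization; the function name `keyword_trends` and the sample headlines are hypothetical and are not part of the system of Fisher et al.

```python
from collections import Counter
from datetime import date


def keyword_trends(articles, keywords):
    """Count, for each keyword, how many articles mention it on each day.

    `articles` is an iterable of (day, text) pairs.
    Returns {keyword: Counter({day: count})}.
    """
    trends = {kw: Counter() for kw in keywords}
    for day, text in articles:
        words = set(text.lower().split())  # naive tokenization, no punctuation handling
        for kw in keywords:
            if kw.lower() in words:
                trends[kw][day] += 1
    return trends


# Hypothetical sample data, for illustration only.
ARTICLES = [
    (date(2008, 1, 1), "Obama wins caucus"),
    (date(2008, 1, 1), "Clinton and Obama debate"),
    (date(2008, 1, 2), "Clinton campaign update"),
]

trends = keyword_trends(ARTICLES, ["obama", "clinton"])
for keyword, counts in trends.items():
    print(keyword, sorted(counts.items()))
```

Each keyword's per-day counts can then be plotted as one line, with time on the x-axis and the number of mentions on the y-axis.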


Narrative Visualization Overviews

Segel and Heer state that storytelling is revealing stories with data and using visualization to function in place of a written story. The Oxford English Dictionary defines a narrative as "an account of a series of events, facts, etc., given in order and with the establishing of connections between them". Segel and Heer investigate the design of narrative visualizations and identify techniques for telling stories with data graphics, as well as challenges with the salient dimensions of visual storytelling. They describe seven genres of narrative visualization: magazine style, annotated chart, partitioned poster, flow chart, comic strip, slide show, and video. See Figure 21. They also discuss directions for future reader-centric research.

Figure 21. The figure shows the seven genres of narrative visualization presented by Segel and Heer. These vary in terms of the number of frames and the ordering of their visual elements. A video, for example, has a strict ordering and a high frame number, whereas a 'Magazine Style' poster may have a few frames in one image that are not strictly ordered. These genre elements dictate whether a story is author-driven or reader-driven. Author-driven content uses a linear ordering of scenes and has no interactivity. Reader-driven content has no prescribed order of scenes and a high level of interactivity with the reader.

In the New York Times visualization on steroid usage in sports, one larger image and line chart are combined with small images, line charts, and bar charts to illustrate steroid usage over 30 years. The visualization incorporates visual highlighting and connecting elements that lead the viewing order. The year is mapped to the x-axis, the amount of steroids is mapped to the y-axis, and different colors represent different players.

In the New York Times visualization on budget forecasts, a progress bar is used to describe the accuracy of past White House budget predictions. Time is mapped to the x-axis, and the budget situation is mapped to the y-axis.

The Afghanistan nation-building development project example is an interactive geographic visualization with details-on-demand sliders that present the status of Afghanistan nation-building development projects. Opium cultivation is mapped to color, and countries are shown on the map. Time can be changed from 2005 to 2009 by dragging the control bar.

The Gapminder visualization uses animated bubble charts, which can have detrimental effects on a person's ability to follow trends. Continent is mapped to color, region is mapped to each bubble, size is mapped to bubble size, and position is mapped to average yearly income.

The Minnesota Employment Explorer shows how mouse-hover provides details-on-demand, while double-clicking an industry triggers a drill-down into that sector and an animated transition updates the display to show sub-industry trends. Color represents different industries, the x-axis represents time, and the y-axis represents employment.

Segel and Heer is based on previous work on narrative structure, visual narratives, and storytelling with data visualization. It observes the storytelling potential of data visualization and draws parallels to more traditional media. This paper identifies salient design dimensions and clarifies how narrative visualization differs from other storytelling forms and how these differences introduce both opportunities and pitfalls for its narrative potential.