Dos and Don’ts of Data Visualisation

Whether on purpose or not, a statistician can mislead an audience with a chart. This article explains some chart design principles and common mistakes novice data analysts make. Think about the statistical charts you have seen on billboards, in the news, and in research studies. Using these principles as a guide, would you classify any of those charts as misleading? Be sure to take note of the suggestions for successful dashboards.

Make charts correct

Do be careful about how you treat 'no-data/missing data'

Take the following chart as an example. It shows the results of observations made on the street: you want to know how many people walking by in a specific time frame are wearing glasses (X) and how many are not (Y). When you cannot tell either way, you mark the observation as 'unknown'. After 1 000 observations you stop collecting data.


The left chart says that 33.5 % wore glasses (X) and 28.6 % did not (Y), while 37.9 % were unknown (the missing data). The problem with this chart is that it treats the unknown as a third category, distinct from the other two. In reality the unknown group contains both X and Y, and unless there is a reason to think otherwise, probably in roughly the same proportions as the identified cases. The missing data should therefore be excluded from the comparison and reported separately, which is common practice in statistical surveys. The chart on the right is corrected by leaving out the unknown. In this case, an indication of the margin of error would also help.
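The correction described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the counts 335, 286 and 379 are derived from the reported percentages of 1 000 observations, the function name shares_without_unknown is hypothetical, and the margin of error uses the normal approximation at 95 % confidence, which assumes a simple random sample and that the unknown cases are missing at random.

```python
import math


def shares_without_unknown(glasses, no_glasses, unknown, z=1.96):
    """Recompute X/Y shares over identified cases only.

    Assumption: unknown cases are missing at random, so dropping them
    does not bias the X/Y split. z=1.96 corresponds to ~95 % confidence
    under the normal approximation.
    """
    known = glasses + no_glasses
    total = known + unknown
    share_x = glasses / known
    share_y = no_glasses / known
    # Margin of error for a proportion: z * sqrt(p * (1 - p) / n)
    margin = z * math.sqrt(share_x * share_y / known)
    return {
        "share_x": share_x,
        "share_y": share_y,
        "margin": margin,
        "unknown_rate": unknown / total,  # reported separately
    }


# Counts derived from 33.5 %, 28.6 % and 37.9 % of 1 000 observations
result = shares_without_unknown(glasses=335, no_glasses=286, unknown=379)

print(f"Glasses (X):    {result['share_x']:.1%} +/- {result['margin']:.1%}")
print(f"No glasses (Y): {result['share_y']:.1%} +/- {result['margin']:.1%}")
print(f"Unknown (reported separately): {result['unknown_rate']:.1%}")
```

With the unknown removed, the two remaining shares add up to 100 %, and the share of unknown observations stays visible as a separate figure instead of distorting the comparison. If the unknown cases are suspected to differ systematically from the identified ones, this simple renormalisation is not appropriate.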