# Tag: graph

I spotted a dot plot while watching TV the other day:

It isn't often that one sees a dot plot on TV, so this is a good opportunity to discuss a graph that students might actually have encountered. The commercial could be a worthwhile topic of discussion in a statistics lesson.

They apparently constructed the dot plot by asking 400 people "How old is the oldest person you've known?" A few more details can be gleaned from the Prudential website and a "behind the scenes" video that was shot.

A few things that can be discussed with students come to mind:

• What can we actually conclude from the dot plot?
• The description of the YouTube video describes this as an "experiment" (as does the narrator in the behind the scenes video). Is this really an experiment?
• What do we know about the sample?
• What happens as we get older in terms of the oldest person "[we]'ve known"? (Children and adults with a wide range of ages are asked to place a sticker.)

I saw this graphic reblogged by NPR on Tumblr (originally posted by Luminous Enchiladas, though I can't be sure of the creator), and I must say that it is impressive.

Olympics vs Mars

There are some pretty substantial problems with this impressively bad graphic.

• Pie charts should only be used when comparing parts to a whole. The \$17.5 billion that went to the Olympics and the Curiosity Rover wasn't a priori some whole amount of money. Treating it as "the whole" implies that there was only \$17.5 billion from wherever to be spent, and that it was spent only on the Olympics and Mars.
• The pieces of the pie chart aren't labeled with the dollar amounts. Instead, each piece is labeled with its name, which does address one complaint about pie charts (namely that the reader needs to continually look back and forth from the chart to the key). Because there are only two pieces, there is room to include the dollar figures in the chart area. With more complicated charts, this wouldn't be the case.
• This chart uses an unnecessary "3D" effect which obscures the true areas being compared. A flat pie chart would be less misleading.

Additionally, there are some general problems with pie charts which make them inferior to other charts (specifically bar charts):

• Comparing areas is difficult. Cleveland (1985) writes about how area comparisons are subject to bias, and Schmid (1983) specifically describes how, when comparing two circles (e.g. two pie charts of different size used to indicate change over time), the area of the larger circle is underestimated relative to the smaller.
• Comparing angles is difficult. Citing earlier empirical research, Cleveland (1985) states that ordering the sections of a pie chart is prone to error.

References:

• Cleveland, W. S. (1985). The elements of graphing data. Monterey, Calif: Wadsworth Advanced Books and Software.
• Schmid, C. F. (1983). Statistical graphics: Design principles and practices. New York: Wiley.

I've begun searching for news articles that use statistics incorrectly (misunderstanding what a study means, making bogus claims, etc.), present data exceptionally poorly (misleading graphs), or commit other offenses against numeracy. I wasn't expecting to find much on Day 1. I was wrong.

The Alligator (UF's 'unofficial' school newspaper) has a daily online poll. Monday's question was "Have you eaten pizza this month?" with the results displayed as percentages and a bar chart. For example:

A few problems:

1. The responses are given numerically only as a percentage with no indication of sample size (either by category or total responses). We are left to guess the sample size from the bar chart.
2. The bar chart doesn't show 0 and goes so far as to not display the observations from the "no" respondents! Based on the percentages and guessing that "yes" had 61 respondents (it is hard to tell), it seems like "no" should have 11, which is certainly not what the graph appears to show.
3. It isn't visible in the above chart, but on different pages of The Alligator the poll results showed slightly different numbers (85%/15% on one page and 84%/16% on another). I could click back and forth between the pages and the numbers didn't change.

Overall, the online poll (powered by amCharts.com apparently) that The Alligator hosts does a woefully inadequate job of displaying the results.
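
The guess about the "no" count can be checked with a little arithmetic. Here is a rough sketch in Python (not anything the poll software itself uses). The function name `consistent_totals` is made up for illustration, and the 61 "yes" responses are only the estimate read off the bar chart, not a known count.

```python
def consistent_totals(yes_count, yes_pct, max_total=1000):
    """Return every total sample size n for which yes_count / n
    rounds to the displayed whole-number percentage yes_pct."""
    totals = []
    for n in range(yes_count, max_total + 1):
        exact_pct = 100 * yes_count / n
        # A displayed whole percent lies within half a point of the exact value.
        if abs(exact_pct - yes_pct) < 0.5:
            totals.append(n)
    return totals


# 61 "yes" responses is a guess read off the bar chart, not a known count.
for shown_pct in (85, 84):
    for total in consistent_totals(61, shown_pct):
        print(f"{shown_pct}% yes -> total {total}, no {total - 61}")
```

Under that guess, a displayed 85% is consistent only with 72 total responses (11 "no"), which matches the estimate above, while a displayed 84% would need 73 responses (12 "no").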

There are tons of examples of statistics being misused or misrepresented by the media and politicians (among others). I'm going to try and collect examples as I find them so that I don't have to search for them at the last minute.

I saw this image floating around the internet (not sure who originally took the screenshot):

This graph is misleading because it exaggerates the increase in the tax rate by not having the Y-axis start at zero. Displaying the graph only from 34% and up makes the tax rate after January 1, 2013 appear to be 5.6 times what the rate currently is. In fact, the tax rate after January 1, 2013 would be only about 1.13 times the current rate if the Bush-era tax cuts are allowed to expire. This is what a more accurate graph would look like:

Disturbingly, when I went to make the above in LibreOffice Calc the default scale was 32-40% (same in Microsoft Excel). While there are times when displaying zero is not necessary, not including zero magnifies the relative differences among the categories (Lemon & Tyagi, 2009). Kozak (2011) gives some recommendations about when including and excluding zero is appropriate; essentially, if zero is meaningful in the context of the data it should be included.
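
The 5.6 and 1.13 figures come from comparing the same two bar heights measured from different baselines. Here is a minimal sketch of that arithmetic in Python, assuming the two rates on the graph were 35% and 39.6%. Those are the values that reproduce the ratios quoted above, and the function name `apparent_ratio` is just for illustration.

```python
def apparent_ratio(low, high, baseline=0.0):
    """Ratio of two bar heights when the axis starts at `baseline`."""
    return (high - baseline) / (low - baseline)


# Assumed rates (percent): 35 now, 39.6 after the cuts expire.
current_rate, future_rate = 35.0, 39.6

truncated = apparent_ratio(current_rate, future_rate, baseline=34.0)
from_zero = apparent_ratio(current_rate, future_rate)

print(f"axis from 34%: taller bar is {truncated:.1f} times the shorter")
print(f"axis from 0%:  taller bar is {from_zero:.2f} times the shorter")
```

The closer the baseline sits to the smaller value, the larger the apparent ratio becomes, which is why a default scale of 32-40% still overstates the difference.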

• Kozak, M. (2011). When should zero be included on a scale showing magnitude? Teaching Statistics, 33(2), 53–58.
• Lemon, J., & Tyagi, A. (2009). The fan plot: A technique for displaying relative quantities and differences. Statistical Computing and Graphics Newsletter, 20(1), 8–10. [Note: I don't know if this article is peer-reviewed, but it is a publication of the ASA, so there is some weight to it.]