Bubble Chart
Description
Bubble charts share certain similarities with scatterplots: both are drawn in a Cartesian coordinate system and provide information about the correlation between the quantitative attributes represented by the two coordinate axes. Unlike a scatterplot, however, the raw data of a bubble chart does not consist of an array of anonymous pairs of values that only become meaningful in the context of a larger group of items. Instead, each data record carries a unique label, usually a plain-text name, that identifies the corresponding object in the coordinate grid. In other words, the bubble chart is a method for displaying an array of distinct objects that are all described by the same set of attributes.
The other significant characteristic that distinguishes the bubble chart from the scatterplot is its ability to display more than two quantitative attributes in a two-dimensional coordinate system. Instead of a simple dot, each item is displayed as a circle, or bubble. Two numerical variables are given by the bubble’s x- and y-coordinates, while the remaining data attributes are expressed by its graphic features, such as size, fill color, or brightness. The choice depends on the format of the raw data: quantitative values can be displayed by the position of the bubble within the coordinate grid, by its size, or by its brightness, whereas qualitative (categorical) values are usually distinguished by the fill color. These considerations are crucial to the correct use of the bubble chart and go back to Jacques Bertin’s theory of graphic variables.
Data Basis
Use a bubble chart to display tabular data. Each data record is identifiable by a unique label and consists of an array of variables. At least two of these variables must be quantitative. The other attributes can be of any scale of measurement; it does not matter whether they represent quantitative or qualitative values.
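The requirements above can be checked mechanically before drawing anything. The following Python lines are a minimal sketch, not part of the original pattern: the sample table, its column names, and the helper names `split_columns` and `check_bubble_data` are hypothetical and chosen only for illustration.

```python
from numbers import Real

# Hypothetical sample table: one record per row, each with a unique label.
rows = [
    {"label": "A", "x_value": 1.5, "y_value": 20, "size_value": 300, "group": "north"},
    {"label": "B", "x_value": 2.0, "y_value": 35, "size_value": 120, "group": "south"},
    {"label": "C", "x_value": 3.2, "y_value": 15, "size_value": 980, "group": "north"},
]


def split_columns(rows, label_key="label"):
    """Return (quantitative, qualitative) column names, excluding the label."""
    quantitative, qualitative = [], []
    for key in rows[0]:
        if key == label_key:
            continue
        values = [row[key] for row in rows]
        # bool is a subclass of int, so it is excluded explicitly.
        if all(isinstance(v, Real) and not isinstance(v, bool) for v in values):
            quantitative.append(key)
        else:
            qualitative.append(key)
    return quantitative, qualitative


def check_bubble_data(rows, label_key="label"):
    """Raise ValueError if the table does not fit the data basis described above."""
    labels = [row[label_key] for row in rows]
    if len(set(labels)) != len(labels):
        raise ValueError("labels must be unique")
    quantitative, qualitative = split_columns(rows, label_key)
    if len(quantitative) < 2:
        raise ValueError("at least two quantitative variables are required")
    return quantitative, qualitative


quantitative, qualitative = check_bubble_data(rows)
print("quantitative:", quantitative)
print("qualitative:", qualitative)
```

In this sketch, any two of the quantitative columns could serve as axes, and the remaining columns would be mapped to other graphic variables.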
Usage
Create a two-dimensional Cartesian coordinate system. Choose two of the quantitative variables from your data table and apply them to the axes. For each row in the table, draw a circle with its center at the corresponding coordinates. Determine which and how many graphic variables you need to express the remaining variables from your table, and apply them to each bubble accordingly. The most common graphic attributes here are the bubble’s size or brightness for numeric variables and its fill color for non-numeric ones.
When you use the bubble size to express a value, keep in mind that there are two competing concepts of how to translate the represented attribute into a geometric value. The problem is that a one-dimensional value is represented by a two-dimensional graphical object. The “traditional” way to perform this task is to draw the circle’s diameter in proportion to the numerical value. As a result, an object whose value doubles also doubles in diameter (in width and height, that is) and appears four times as big; if the value triples, the area grows ninefold. This is why the technique has drawn criticism for a long time and is often cited as a prime example of the visual distortion of facts. A more professional way to deal with this problem is to set the data value in proportion to the object’s area. For the bubbles used here, this means that the radius does not increase linearly with the value it represents but is derived from the circle area: it is the square root of the data value divided by pi.
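The difference between the two sizing concepts can be made concrete with a few lines of Python. This is a minimal sketch under stated assumptions: the function names and the scale parameters `area_per_unit` and `diameter_per_unit` are hypothetical, introduced only to convert data units into drawing units.

```python
import math


def radius_area_scaled(value, area_per_unit=1.0):
    """Radius of a bubble whose AREA is proportional to the value."""
    if value < 0:
        raise ValueError("bubble size cannot encode a negative value")
    # area = value * area_per_unit = pi * r**2, solved for r
    return math.sqrt(value * area_per_unit / math.pi)


def radius_diameter_scaled(value, diameter_per_unit=1.0):
    """Radius of a bubble whose DIAMETER is proportional to the value."""
    if value < 0:
        raise ValueError("bubble size cannot encode a negative value")
    return value * diameter_per_unit / 2.0


def circle_area(radius):
    return math.pi * radius ** 2


# Compare how the drawn area grows when the value doubles and triples.
base_area = circle_area(radius_area_scaled(1))
base_diam = circle_area(radius_diameter_scaled(1))
for value in (1, 2, 3):
    by_area = circle_area(radius_area_scaled(value)) / base_area
    by_diameter = circle_area(radius_diameter_scaled(value)) / base_diam
    print(f"value x{value}: area-scaled bubble x{by_area:.1f}, "
          f"diameter-scaled bubble x{by_diameter:.1f}")
```

With area scaling the drawn area grows in step with the value, while diameter scaling inflates it by the square of the value, which is the distortion described above.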
Rationale
The Bubble Chart pattern is a convenient alternative to pseudo-3D diagrams that aim to display data with three variables in two-dimensional environments such as a book page or a computer screen. In some cases, the bubble chart can even display more than three variables. Set in a conventional Cartesian coordinate system, it appears familiar to the user while extending the possibilities of standard scatterplots.
Example
The web-based statistics visualization tool Gapminder makes extensive use of the Bubble Chart pattern. Its main interface consists of a Cartesian coordinate grid that displays a set of countries, each of which is represented by a bubble. The user can choose three of the eighteen statistical values the application offers and display them simultaneously. Additionally, the bubbles carry a fourth variable at the categorical level of measurement: the color of each bubble depends on the continent the country belongs to. Note that the countries’ population figures (the value the bubble size usually represents) are proportional to the area of the bubbles. This enables Gapminder to display countries like China or India (with populations of more than a billion) next to very small states such as Swaziland or Brunei, which are still clearly identifiable. Had the circles’ diameters been set in proportion to the population figures, such a display would have been technically impossible.
The scaling of the bubbles in this software is problematic: it is not proportional to the area but to the area offset by a minimum size, which means that it has a cutoff that distorts the data. This is the same problem as using bar charts with a non-zero baseline. I wrote it, so I should know, sadly.
Jörgen Abrahamsson | 2008-09-29 10:27:23
Bubble charts rock!
qwertz | 2008-08-29 17:59:29

Pattern Properties
Created
2008-07-28 18:30:53
Last Edit
2008-07-28 18:38:21
Category
  1. Display Patterns
  2. Correlations
Related Patterns
Generalization